I have recently proposed that the Indo-Aryans originated in West Asia, roughly in the area between the Caucasus, Armenia and Iran (to the west and south of the Caspian Sea).
An alternative hypothesis of Indo-Aryan origins would derive them from the north of the Caspian Sea and ultimately from Northeastern Europe.
My ADMIXTURE experiments so far have provided substantial evidence for the former hypothesis, suggesting that the main Caucasoid component in South Asia is of West Asian origin.
As a visual test of the two hypotheses, I ran a PCA of a number of South Asian populations, together with a wide assortment of Northern European and West Asian populations, to determine the origin of the Caucasoid component in India. Here are the results:
It is fairly clear to me that the Indian Cline runs between South Indian and West Asian populations. I have added a regression line to the plot, fitted to the South Asian samples excluding the three Balochistan populations (Makrani, Brahui, Balochi), and this line passes almost exactly through the centroids of the Kurdish and Iranian clusters, i.e., the linguistic cousins of the Indo-Aryans and South Asian Iranic speakers.
According to my theory, the route of the migrating Indo-Aryans took them north of Balochistan, across the Punjab and into India, from an ultimate source in the Transcaucasus, via Iran, Turkmenistan, and Afghanistan.
It is difficult to disentangle different genetic strata in this region, or to assess the importance of Indo-Aryan vs. other population movements. Nonetheless, the positions of the South Asian samples on the PCA map are well described by a linear regression with a strong correlation (r = -0.83), so a simple cline between Iranian-like and South Indian-like people seems like a very good approximation.
A recent linguistic model suggests a first-order split in the Indo-European family between Indo-Iranian and the rest of the family. Such a model might be attractive in the context of the best PIE origins model currently available, as it would derive the Indo-Iranians from an eastward migration from Anatolia, the Anatolian speakers from those who stayed behind near the homeland, and the rest of the Indo-Europeans from those who went to Europe.
Personally, I'm not particularly convinced that this is correct, as opposed to the more commonly held model in which the Anatolian-European split is primary. Hopefully, a combination of genetics and linguistics will help resolve these issues.
We should also not forget that the clear vector of West Asian Caucasoid incursions into South Asia detected by both ADMIXTURE and PCA need not have involved a single people or a single time.
It is clear from the figure that the Indo-European (Armenian/Iranian) and Caucasian (Adygei, Georgian, Lezgin) groups of West Asia cluster together relative to both North Europeans and South Asians, and I see no real reason to think that the early Proto-Indo-Europeans were genetically that distinct from their neighbors. So, the Indian Cline was probably formed over thousands of years by dispersals of different kinds of people, speaking different languages, but all sharing the same basic West Asian gene pool.
UPDATE (May 10):
Here is a PCA plot using pretty much all the West Eurasian populations in my dataset. I have allowed no missing values, and the intersection of the various sources has left only 7,687 SNPs on which the analysis was run. It is clear that the cline from South Indians towards West Asia is recreated: