In the previous two parts of my trilogy, I presented the evidence for the clear West Asian origin of the bulk of South Asian Caucasoid ancestry. One thing nagged at me, however: Reich et al. (2009) had presented evidence (based on their 4-population test) that Ancestral North Indians (ANI) formed a clade with CEU White Utahns to the exclusion of the Adygei from the Caucasus. This seemed inconsistent with my theory, and I considered many potential solutions to the problem until I recently realized what was happening.
A good way to determine whether ANI is more similar to CEU or to Adygei is to calculate the first principal component of variation between CEU and Adygei, and then project the Indian Cline samples onto it. Since these samples are composed of an Onge-like South Indian component (outgroup) and a Caucasoid factor X, their position on PC1 of CEU vs. Adygei will be determined by the relationship of X with either CEU or Adygei.
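The projection described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the tooling used for the plots in this post: genotypes are assumed to be coded as 0/1/2 allele counts per SNP, PC1 is found by simple power iteration, and the function names (`pc1_axis`, `project`, `scaled_positions`) are hypothetical. Positions are rescaled so that the mean of the first reference population sits at 0 and the mean of the second at 1.

```python
import random

def pc1_axis(ref, iters=200):
    """Return (per-SNP means, unit PC1 vector) for the reference genotype rows."""
    n, m = len(ref), len(ref[0])
    mean = [sum(row[j] for row in ref) / n for j in range(m)]
    centered = [[row[j] - mean[j] for j in range(m)] for row in ref]
    rng = random.Random(0)  # fixed seed: reproducible starting vector
    v = [rng.gauss(0, 1) for _ in range(m)]
    for _ in range(iters):
        # One power-iteration step: v <- X^T (X v), then normalise to unit length.
        scores = [sum(c[j] * v[j] for j in range(m)) for c in centered]
        v = [sum(scores[i] * centered[i][j] for i in range(n)) for j in range(m)]
        norm = sum(x * x for x in v) ** 0.5
        v = [x / norm for x in v]
    return mean, v

def project(rows, mean, v):
    """Project genotype rows onto the PC1 vector after centering on the reference means."""
    return [sum((row[j] - mean[j]) * v[j] for j in range(len(v))) for row in rows]

def scaled_positions(pop_a, pop_b, targets):
    """Place targets on PC1 of pop_a vs. pop_b, scaled so mean(pop_a)=0 and mean(pop_b)=1."""
    mean, v = pc1_axis(pop_a + pop_b)  # PC1 is computed from the two references only
    a = sum(project(pop_a, mean, v)) / len(pop_a)
    b = sum(project(pop_b, mean, v)) / len(pop_b)
    return [(s - a) / (b - a) for s in project(targets, mean, v)]
```

With Adygei passed as `pop_a` and CEU as `pop_b`, a projected sample near 0 is Adygei-like, one near 1 is CEU-like, and one near 0.5 is equidistant. The target samples play no part in defining the axis, which is the point of projecting them rather than including them in the PCA.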
Here are the results:
Notice that many populations are around the 0.5 mark between Adygei and CEU, i.e., they are not particularly closer to one population than to the other. A few of them, however, notably Pathans and Kashmiri Pandits, are closer to CEU than to Adygei.
Now, consider the following table (Note S3 Table 1) from Reich et al. (2009):
This is the evidence, based on the 4-population test, that CEU and ANI form a clade. Notice that it rests on comparisons with Pathans and Kashmiri Pandits, i.e., with the two groups that seem to deviate towards CEU in the PC1 projection. Indeed, only for the Pathans (the most CEU-like group) is the Z-score more than 3, the condition considered necessary for statistical significance. We can thus conclude that ANI does not, in general, form a clade with CEU; at most, this may hold for the most CEU-like South Asian populations.
Now, we will see that it is not true even for the most CEU-like South Asian populations.
Clearly, the PC1 projections presented above hint at why the evidence for a CEU-ANI clade is stronger for Pathans and Kashmiri Pandits. But they seem to go against all the data I presented in my earlier two posts about the mainly West Asian origin of Ancestral North Indians. If ANI were mainly of West Asian origin, we would expect Ancestral North Indians to be projected closer to Adygei (0) rather than in the middle (0.5) or towards CEU (1).
I puzzled long about why this was the case, considering inter alia: that Adygei were not a good representative of West Asians, that ASI was not a true outgroup, or that I was wrong. All of these explanations failed, until I realized the true culprit: uneven sample sizes of Adygei and CEU.
Let's repeat the PC1 projection, but using a 17-person random sample from CEU, so that Adygei and CEU have equal sample sizes.
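The equal-size resampling step can be sketched as follows. This is a minimal illustration, assuming each population is simply a list of individuals (sample IDs or genotype rows); the helper name `equalize_sample_sizes` is hypothetical, and the seed is fixed only to make the draw repeatable.

```python
import random

def equalize_sample_sizes(pop_a, pop_b, seed=0):
    """Randomly downsample the larger population so both end up the same size."""
    rng = random.Random(seed)  # fixed seed so the same subsample is drawn each run
    k = min(len(pop_a), len(pop_b))
    # Sampling without replacement; the smaller population is kept whole (reordered).
    return rng.sample(pop_a, k), rng.sample(pop_b, k)
```

With 17 Adygei and a larger CEU panel, this returns all 17 Adygei and a 17-person random subset of CEU. Repeating the draw with several different seeds would be a simple check that the result does not hinge on one particular subsample.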
Unexpectedly, now all Indian Cline populations are clearly shifted towards the Adygei side of the CEU-Adygei PC1, and the results are compatible with the idea of the mainly West Asian origin of ANI.
It's not entirely clear to me why this is happening, and I have not dissected the results. Here is my tentative guess: the CEU and Adygei populations both possess low-frequency West Eurasian variants that happen to be absent from the smaller Adygei sample but present in the much larger CEU one. When one of these variants pops up in an Indian Cline sample, it is mistaken for a CEU variant. Once sample sizes are equalized, CEU no longer has an edge over Adygei at including low-frequency variants, hence this bias is removed.
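To get a feel for how much sample size alone could matter under this guess, here is a back-of-the-envelope sketch. It assumes a variant with the same true frequency in both populations and independently sampled chromosomes (two per individual); both function names are hypothetical, and this only illustrates the conjecture, it does not test it.

```python
def prob_variant_seen(freq, n_individuals):
    """Chance that a variant at population frequency `freq` shows up at least once
    among 2 * n_individuals independently sampled chromosomes."""
    return 1.0 - (1.0 - freq) ** (2 * n_individuals)

def detection_ratio(freq, n_small, n_large):
    """How much more likely the larger sample is to contain the variant (freq > 0)."""
    return prob_variant_seen(freq, n_large) / prob_variant_seen(freq, n_small)
```

For any rare variant, `detection_ratio` exceeds 1 whenever `n_large > n_small` and equals 1 when the two samples are the same size, which is the edge that equalizing the samples removes.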
I have also carried out another experiment, substituting 10 Lithuanians and 9 Belorussians from Behar et al. (2010) for CEU, and 19 Iranians from the same study for Adygei.
Any northern European component in Indians is likely to be more similar to eastern Europeans than to CEU, which is mainly of northwestern European origin. Also, I did not use Russians, as their low-level East Eurasian admixture might have altered them somewhat compared to the putative ancestors of the Indo-Aryans who lacked such eastern Asian influences.
I chose Iranians, as they are the linguistic cousins of the Indo-Aryans, and also happened to have a convenient sample size of 19 that was equal to the sum of Belorussians+Lithuanians.
Here are the results:
And another experiment using Hungarians as a Central European sample:
There you have it: clear evidence that the Ancestral North Indian component is more closely related to West Asians than to N/C/E Europeans.
To cut a long story short:
- CEU and ANI do not form a clade: the evidence for this clade is based on the most CEU-like Indian Cline populations, and even in their case it is an artefact of unequal sample sizes
- ANI is most similar to people from West Asia rather than Eastern Europe