Using the same dataset as in a previous experiment, I decided to calculate the extent of East Eurasian-like admixture in Eurasia.
First, using qp3Pop, I identified a set of populations with significantly negative f3(Sardinian, Karitiana, Target) statistics:
This is actually a very helpful figure, as it shows how the f3 signal of admixture becomes weaker for more drifted populations (e.g., Finns) even if they have more of the investigated admixture than others (e.g., French).
It also shows that most West Eurasian populations appear admixed between Sardinians and Karitiana, whereas most East Eurasian ones (see spreadsheet) do not appear to be so, at least on the basis of the f3 test.
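To make the point about drift concrete, here is a minimal Python sketch of a plain f3 statistic on simulated allele frequencies. It is only a toy under stated assumptions: the two source populations, the admixture fractions, and the Gaussian "drift" noise are all invented for illustration, and the function skips the finite-sample correction and the block-jackknife Z-scores that qp3Pop reports.

```python
import numpy as np

def f3(source1, source2, target):
    """Uncorrected f3: mean over SNPs of (target - source1) * (target - source2)."""
    return float(np.mean((target - source1) * (target - source2)))

rng = np.random.default_rng(0)
n_snps = 50_000

# Hypothetical allele frequencies for two unadmixed source populations.
src1 = rng.uniform(0.05, 0.95, n_snps)
src2 = rng.uniform(0.05, 0.95, n_snps)

def admixed(frac_src2, drift_sd):
    """Mix the two sources, then add toy Gaussian 'drift' and clip to [0, 1]."""
    mix = (1 - frac_src2) * src1 + frac_src2 * src2
    return np.clip(mix + rng.normal(0.0, drift_sd, n_snps), 0.0, 1.0)

# 10% admixture with little post-admixture drift vs. 20% admixture with a lot.
less_admixed_low_drift = f3(src1, src2, admixed(0.10, 0.02))
more_admixed_high_drift = f3(src1, src2, admixed(0.20, 0.12))
print(less_admixed_low_drift, more_admixed_high_drift)
```

In this toy setup both statistics come out negative, but the more admixed, more drifted target gives the weaker (less negative) signal, which is the same pattern as in the Finns vs. French example.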
I next used qpF4Ratio to estimate the extent of this admixture. This depends on the following topology (Fig. 4 of Patterson et al. 2012):
I used: A=Papuan, B=Karitiana, C=Sardinian, and O=San, with X= any of the different investigated populations.
Note that this topology does not really hold for all X target populations whose admixture we are investigating. In particular, some populations have African admixture, hence O=San is not really an outgroup for them.
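As a rough sketch of what the estimator does when the topology does hold, the Python below simulates frequencies on a toy tree and computes alpha = f4(A, O; X, C) / f4(A, O; B, C). Everything about the simulation is an assumption made for illustration (the branch lengths, the Gaussian drift model, the `B_src` source population, the 10% admixture fraction); it is not the qpF4Ratio implementation and produces no standard errors.

```python
import numpy as np

def f4(a, b, c, d):
    """Plain f4 estimate: mean over SNPs of (a - b) * (c - d)."""
    return float(np.mean((a - b) * (c - d)))

def f4_ratio(A, O, X, C, B):
    """alpha = f4(A, O; X, C) / f4(A, O; B, C): share of B-like ancestry in X."""
    return f4(A, O, X, C) / f4(A, O, B, C)

rng = np.random.default_rng(1)
n_snps = 200_000

def drift(p, sd):
    """Toy drift: add Gaussian noise to the frequencies (illustrative only)."""
    return p + rng.normal(0.0, sd, p.shape)

root = rng.uniform(0.2, 0.8, n_snps)
O = drift(root, 0.05)        # outgroup, splits first
non_o = drift(root, 0.05)    # common ancestor of A, B and C
C = drift(non_o, 0.03)
east = drift(non_o, 0.05)    # common ancestor of A and B
A = drift(east, 0.03)
B = drift(east, 0.03)
B_src = drift(east, 0.03)    # hypothetical B-related source that mixed into X

true_alpha = 0.10
X = drift(true_alpha * B_src + (1 - true_alpha) * C, 0.02)

estimate = f4_ratio(A, O, X, C, B)
print(round(estimate, 3))
```

With a true outgroup the estimate lands close to the simulated 10%, because the numerator and denominator both measure the drift shared by A and B, scaled by alpha in the numerator.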
In the following, you can see the admixture proportion estimates using the F4 ratio test:
It should now be obvious how admixture estimates from the f4 ratio method depend on an appropriate outgroup. The f3-statistics indicate that all the above-listed populations are admixed between a Sardinian-like and a Karitiana-like population. But the f4 ratio estimate becomes negative for some of them, because its numerator, f4(Papuan, San; X, Sardinian), is pushed below zero in populations where X has African admixture.
So, the estimated Karitiana-like admixture of populations such as Spanish_D (est. 1.2%) understates their actual Karitiana-like admixture, because Spanish_D includes African admixture. For Portuguese_D (est. -3.3%), where African admixture is even more significant, the effect is stronger still, and a nonsensical negative admixture estimate appears.
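The downward bias can be reproduced in a toy simulation. The Python sketch below is an illustration under invented assumptions, not a reanalysis of the data: the tree, the Gaussian drift model, the hypothetical `afr_src` population, and the 5% and 8% admixture fractions are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(2)
n_snps = 200_000

def f4(a, b, c, d):
    """Plain f4 estimate: mean over SNPs of (a - b) * (c - d)."""
    return float(np.mean((a - b) * (c - d)))

def drift(p, sd):
    """Toy drift: add Gaussian noise to the frequencies (illustrative only)."""
    return p + rng.normal(0.0, sd, p.shape)

root = rng.uniform(0.2, 0.8, n_snps)
O = drift(root, 0.05)          # outgroup
afr_src = drift(root, 0.05)    # hypothetical African-related source, distinct from O
non_o = drift(root, 0.05)      # common ancestor of A, B and C
C = drift(non_o, 0.03)
east = drift(non_o, 0.05)      # common ancestor of A and B
A, B, B_src = (drift(east, 0.03) for _ in range(3))

def estimate(alpha, beta):
    """f4 ratio estimate for X with alpha B-like and beta African-like ancestry."""
    X = alpha * B_src + beta * afr_src + (1 - alpha - beta) * C
    return f4(A, O, X, C) / f4(A, O, B, C)

clean = estimate(0.05, 0.00)   # topology holds: estimate stays near 5%
biased = estimate(0.05, 0.08)  # same B-like share plus African-like ancestry
print(round(clean, 3), round(biased, 3))
```

In this setup the African-like component lacks the drift that C shares with A relative to O, so it subtracts from the numerator. The target still carries 5% B-like ancestry, yet the estimate falls below zero once the African-like share is large enough.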
The converse took place when the f4 ratio method was applied by Moorjani et al. (2011). In that case, negative f4 scores with CEU as a parental population were taken as evidence of African admixture. But, since CEU has Amerindian-like admixture, the estimates of African admixture in that paper were higher than the actual values.
It will be interesting to derive corrected African admixture estimates after taking into account that CEU has Amerindian-like admixture, and, conversely, corrected Karitiana-like admixture estimates after taking into account African admixture in some populations.
In any case, the data used for the above plots can be found in the spreadsheet, together with the list of all considered populations.
5 comments:
Formal admixture tests like this are over and over proving that non-formal admixture tests like ADMIXTURE, STRUCTURE, etc. are unreliable in estimating amounts of racial admixtures, as non-formal admixture tests are not good at detecting ancient racial admixtures, especially the ones between the non-Negroid races.
The high admixture of Afghans and Pakistanis is easy enough to understand, given their northern neighbors and Mongol influences (at least in some populations), but why would India be so high? Are the Indians in Indian_D predominantly from the North or Pakistani-populated regions? Or is there an effect due to relatedness of SE Asia with NE Asia?
Finally, where does the effect of African admixture end? Is it possible that Italians actually have a Karitiana-like admixture much more similar to their northern neighbors?
Also interesting to see Iranians so low, since without African admixture their expected Karitiana-like admixture should be about as high as that of Afghans.
ASI is related to East Asians, so when Indians are viewed as a mix of Sardinians and Karitiana they appear "Karitiana-admixed". The two ancestral populations are distantly related to the real populations that mixed to form Indians, but they _are_ related to them, that is why there is a negative f3 score.
"Formal admixture tests like this are over and over proving that non-formal admixture tests like ADMIXTURE, STRUCTURE"
It seems like what you get out of ADMIXTURE and STRUCTURE is a most parsimonious set of X populations that best explain overall genetic distances, when mixed in varying proportions. But these aren't necessarily the real ancient populations and they didn't necessarily really mix like that.
Whereas these formal admixture tests tell you whether population X is formed of a mix of populations Y and Z, which are respectively more like population A than B and more like B than A. But they don't tell you how like A and B the populations Y and Z are (they could be very like them, or very unlike them), nor in what proportion they mixed, nor do they provide a most parsimonious set of X populations to explain a data set (since it's not tractable to run a mass comparison with the algorithm and then attempt to find a parsimonious set of populations to explain the results).
I'm not sure about the hybrid approach (running ADMIXTURE / STRUCTURE first, then using a formal admixture test to see if these or real populations are admixed with the components produced), which seems to have the weaknesses of both methods.
Matt,
Tools like ADMIXTURE and STRUCTURE are not good at detecting ancient racial admixtures, so they are unreliable when you want to learn how much admixture a given population carries from a given race. The current races formed in very ancient times, so estimation of their global influence cannot be done with ADMIXTURE and STRUCTURE. ADMIXTURE and STRUCTURE are only good at detecting relatively recent admixtures, but unfortunately a large proportion of racial admixtures occurred in times too ancient for tools like ADMIXTURE and STRUCTURE to be able to detect. Tools like ADMIXTOOLS and TreeMix are much better at detecting such ancient racial admixtures and hence they are much more reliable in estimating how much admixture a given population carries from a given race.
As for your objection "but doesn't tell you how like A and B Y and Z are (they could be very like them, or very unlike them)", it depends on the populations used. For example, the ASI ancestry of South Asians is largely represented with a Mongoloid population, Karitiana, in this analysis by Dienekes, due to the fact that ASI is closer to Mongoloids than to Caucasoids. So the results of South Asians and South Asian-admixed populations should be interpreted taking this into account. For populations that have no detectable South Asian admixture, we have no such problem and their Karitiana-like admixture can be confidently interpreted as largely, if not totally, Mongoloid admixture. Their exact amounts of Mongoloid admixture can be different (either higher or lower) from their amounts of Karitiana-like admixture, but they should not be expected to be much different. With more analyses like this, we'll soon know better.