In a series of posts, I showed that European populations have east Eurasian-like admixture, an element that appears to be lacking in Sardinians. I did this on the basis of the 3-population test and a number of different comparisons between West Eurasian populations, as well as on the basis of the 4-population test.
The fact that f4(Sardinian, CEU, Asian, African) is negative was interpreted by Moorjani et al. (2011) as evidence that Sardinians have ~2.9% African admixture. As I pointed out at the time, this estimate was predicated on the assumption that CEU did not have Asian admixture, and this assumption now appears not to hold.
Of course, the above-mentioned paper also used an admixture LD-based method (ROLLOFF) to date the African admixture in Sardinians, coming up with an estimate of ~71 generations. But we should remember that ROLLOFF does not quantify the extent of this admixture.
Imagine walking along a Sardinian genome: the negative f4 signal is created both by the occasional African-like segments you meet along the way and by the presence of East Eurasian SNPs in CEU at other locations, where Sardinians may have no African admixture.
The f4 signal is a genomewide average that is influenced by two different processes: punctuation by African segments, whose length distribution can supply information about the time of their introgression; and the background genome, which lacks the East Eurasian-like polymorphism present in CEU.
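Because f4 is linear in allele frequencies, the two contributions can be separated algebraically. What follows is a minimal sketch with made-up allele frequencies, not data from this analysis: the arrays, the mixing fractions `a` and `b`, and the helper `f4` are all illustrative assumptions. If the toy "Sardinian" population is a West Eurasian base plus a fraction `a` of African ancestry, and the toy "CEU" is the same base plus a fraction `b` of East Eurasian ancestry, the statistic splits into one term per process.

```python
import numpy as np

def f4(p_a, p_b, p_c, p_d):
    """f4(A,B;C,D): mean over SNPs of (a - b) * (c - d) on allele frequencies."""
    return float(np.mean((p_a - p_b) * (p_c - p_d)))

rng = np.random.default_rng(0)
n_snps = 10_000
# Hypothetical allele frequencies: random numbers, not real populations.
san, papuan, west, african, east = rng.uniform(0.0, 1.0, size=(5, n_snps))

a = 0.01  # assumed African fraction in the toy "Sardinian" population
b = 0.05  # assumed East Eurasian fraction in the toy "CEU" population
sardinian = (1 - a) * west + a * african
ceu = (1 - b) * west + b * east

total = f4(san, papuan, sardinian, ceu)
african_term = a * f4(san, papuan, african, west)  # African segments in Sardinians
east_term = b * f4(san, papuan, west, east)        # East Eurasian-like alleles in CEU

# The two terms add up to the genomewide statistic (up to rounding error).
print(total, african_term + east_term)
```

The identity holds for any frequencies, so a single genomewide f4 value cannot by itself say how much of the signal comes from each term.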
In this post, I will show that:
- The admixture estimate of 2.9% is not robust, but depends on the choice of Asian population for f4 ancestry estimation, consistent with the idea that it is influenced by east Eurasian-like admixture that has affected northern European populations.
- If Sardinians are "scrubbed" of any trace of African admixture, the negative f4(Sardinian, CEU, Asian, African) signal persists.
Estimates of African admixture in Sardinians depend on choice of Asian/American population
African ancestry in Sardinians was estimated by Moorjani et al. (2011) using the following ratio:
f4(San,Papuan; Sardinian,CEU) / f4(San,Papuan; YRI, CEU)
In Table S6, different ancestral populations were used for f4 ancestry estimation, and all results ranged between 2.9% and 3.4%.
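As a rough illustration of how such a ratio is formed, here is a minimal sketch that computes f4 from per-population allele frequencies and then takes the ratio. The function names and the toy frequencies are assumptions made for illustration; this is not the code used by Moorjani et al. or by fourpop, and it ignores sample-size corrections and standard errors.

```python
import numpy as np

def f4(p_a, p_b, p_c, p_d):
    """f4(A,B;C,D): mean over SNPs of (a - b) * (c - d) on allele frequencies."""
    p_a, p_b, p_c, p_d = map(np.asarray, (p_a, p_b, p_c, p_d))
    return float(np.mean((p_a - p_b) * (p_c - p_d)))

def f4_ratio(san, ref, target, ceu, yri):
    """f4(San,Ref;Target,CEU) / f4(San,Ref;YRI,CEU)."""
    return f4(san, ref, target, ceu) / f4(san, ref, yri, ceu)

# Toy frequencies (hypothetical). YRI is made to share drift with San so that
# the denominator is clearly non-zero.
rng = np.random.default_rng(1)
n_snps = 5_000
san = rng.uniform(0.0, 1.0, n_snps)
ref = rng.uniform(0.0, 1.0, n_snps)   # stands in for Papuan or Karitiana
ceu = rng.uniform(0.0, 1.0, n_snps)
yri = 0.5 * san + 0.5 * rng.uniform(0.0, 1.0, n_snps)

# A target built as an exact 97% CEU / 3% YRI mixture.
target = 0.97 * ceu + 0.03 * yri

estimate = f4_ratio(san, ref, target, ceu, yri)
print(f"estimated YRI-like fraction: {estimate:.4f}")
```

In this toy setup the ratio recovers the mixing fraction only because the target differs from CEU solely by its YRI-like component. That is exactly the assumption questioned in this post.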
The signal of east Eurasian-like admixture in northern Europe is strongest when Karitiana are used as an Asian/American reference. If the level of "African" admixture in Sardinians is driven, as I suspect, by the presence of east Eurasian-like admixture in northern Europe, then I expect the estimate to be higher when Karitiana rather than Papuans are used. And, indeed, this is what I observe:
f4(San,Papuan;Sardinian,CEU) = 0.00118099 (Z=10.6838)
f4(San,Papuan;YRI,CEU) = 0.0379664 (Z=88.2287)
(In all experiments I use a set of 28 Sardinians, vs. 27 in the Moorjani et al. paper, together with a set of 112 CEU, 147 YRI, a set of 166,770 SNPs, and -k 200 for fourpop.)
therefore, African admixture in Sardinians using Papuan reference = 0.00118099/0.0379664 = 3.1%
but
f4(San,Karitiana;Sardinian,CEU) = 0.00272141 (Z=22.7288)
f4(San,Karitiana;YRI,CEU) = 0.04449 (Z=100.19)
therefore, African admixture in Sardinians using Karitiana reference = 0.00272141/0.04449 = 6.1%
A ~2-fold difference in African admixture has resulted from a different choice of outgroup. This is unexpected if West Eurasians did not exchange genes with Papuans and Karitiana since their divergence, but expected if CEU received genes from an Asian population that was more like Karitiana and less like Papuans.
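The arithmetic behind the two estimates can be reproduced directly from the f4 values quoted above. The snippet only re-does the division; the dictionary layout and variable names are my own choices for illustration.

```python
# f4 values as reported in the text, keyed by the Asian/American reference used.
f4_sardinian_ceu = {"Papuan": 0.00118099, "Karitiana": 0.00272141}
f4_yri_ceu = {"Papuan": 0.0379664, "Karitiana": 0.04449}

# f4 ratio estimate of "African" admixture in Sardinians for each reference.
estimates = {ref: f4_sardinian_ceu[ref] / f4_yri_ceu[ref] for ref in f4_sardinian_ceu}

for ref, est in estimates.items():
    print(f"{ref} reference: {est:.1%}")

# How much larger the Karitiana-based estimate is than the Papuan-based one.
fold = estimates["Karitiana"] / estimates["Papuan"]
print(f"fold difference: {fold:.2f}")
```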
Scrubbing Sardinians
Another way to demonstrate that east Eurasian-like admixture in CEU is inflating the perceived level of African-like admixture in Sardinians is to comprehensively "scrub" Sardinians of all traces of African ancestry, replacing segments of their DNA with missing values wherever there is even a hint of such ancestry.
Going back to the mental experiment of walking along the Sardinian genome, we are going to remove every spot with even a remote possibility of African admixture. It will be shown that CEU continues to have evidence of east Eurasian-like admixture using the scrubbed Sardinians, suggesting that it is not only African-like admixture in Sardinians generating this signal, but also East Eurasian-like admixture in CEU.
I used DIYDodecad to do this scrubbing, but one could potentially try any approach that can identify African segments, such as HAPMIX or PCA. I used the dataset assembled for K7b and K12b, and carried out a K=3 ADMIXTURE analysis, which resulted in 3 components centered on West Eurasia, Asia, and Africa. I chose not to use an African component from higher K (e.g., the K7b calculator), because it is conceivable that African ancestry might be lurking in southern Caucasoid components inferred with these tools (e.g., the "Southern" component of K7b or the "Southwest Asian" one of K12b). The average African admixture in Sardinians using the K3b calculator is 0.9%, and for the subset of CEU used it is 0.2%.
Using the byseg mode of DIYDodecad, I created ancestry maps of the 28 HGDP Sardinians, and I only kept windows where the African admixture was exactly 0%. This is a very aggressive scrubbing, designed to remove virtually all African admixture from the population. For example, if a window has 99.9% West Eurasian admixture and 0.01% African, I will nonetheless remove it, even though chances are extremely high that the 0.01% represents only noise. I did not want to leave any doubt that a trace of identifiable African ancestry might remain in my "scrubbed Sardinians".
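The masking step itself can be sketched as follows. This is only an illustration of the idea, assuming the per-window ancestry estimates have already been loaded into arrays. The array layout, the use of NaN for missing genotypes, the per-individual masking, and the function name `scrub` are my assumptions here, not the DIYDodecad output format.

```python
import numpy as np

def scrub(genotypes, snp_window, african_fraction):
    """Set genotypes to missing wherever the local African estimate is non-zero.

    genotypes:        (n_individuals, n_snps) array of 0/1/2 allele counts
    snp_window:       (n_snps,) integer index of the window each SNP falls in
    african_fraction: (n_individuals, n_windows) African estimate per window
    """
    keep_window = african_fraction == 0.0     # keep only windows at exactly 0%
    keep_snp = keep_window[:, snp_window]     # expand window mask to SNP level
    scrubbed = genotypes.astype(float)        # float copy so NaN can mark missing
    scrubbed[~keep_snp] = np.nan
    return scrubbed

# Tiny hypothetical example: 2 individuals, 6 SNPs, 3 windows of 2 SNPs each.
genotypes = np.array([[0, 1, 2, 1, 0, 2],
                      [2, 2, 1, 0, 1, 1]])
snp_window = np.array([0, 0, 1, 1, 2, 2])
african_fraction = np.array([[0.0, 0.0001, 0.0],   # even 0.01% removes the window
                             [0.0, 0.0,    0.2]])

scrubbed = scrub(genotypes, snp_window, african_fraction)
print(scrubbed)
```

Using an exact-zero threshold mirrors the aggressive rule described above: any window with a non-zero African estimate is dropped, however small the estimate.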
I am very confident that my scrubbed Sardinians do not have any hint of African ancestry, but you can decide for yourselves. I base my confidence on (a) the extreme nature of the scrubbing, which threw away much of the Sardinian genome in order to ensure that no hints of local African ancestry remained; (b) re-assessment of the scrubbed Sardinians with K3b, showing that they are now 100% West Eurasian; and (c) ab initio ADMIXTURE analysis of CHB, YRI, CEU, and scrubbed Sardinians, demonstrating that the latter are 100% West Eurasian, while CEU has traces of 0.1% African and 0.3% Asian ancestry.
So, here are the results for the scrubbed Sardinians:
f4(San,Papuan;Sardinian_scrubbed,CEU) = 0.000678108 (Z=4.05225)
f4(San,Papuan;YRI,CEU) = 0.0379664 (Z=88.2287)
so scrubbed Sardinians with Papuan reference appear 0.000678108 / 0.0379664 = 1.8% African
and
f4(San,Karitiana;Sardinian_scrubbed,CEU) = 0.00205526 (Z=11.2848)
f4(San,Karitiana;YRI,CEU) = 0.04449 (Z=100.19)
so scrubbed Sardinians with Karitiana reference appear 0.00205526/0.04449 = 4.6% African
Despite the thorough scrubbing, Sardinians continue to show African admixture using f4 ancestry estimation. This is consistent with the idea that much of the African ancestry inferred using f4 ancestry estimation in Sardinians is an artifact of not taking into account east Eurasian-like admixture in CEU.
Conversely, a significant signal of east Eurasian-like admixture in CEU persists whether one uses regular or scrubbed Sardinians:
With regular Sardinians:
f4(San,Papuan;Sardinian,Karitiana) = 0.0084678 (Z=21.2137)
f4(San,Papuan;Sardinian,CEU) = 0.00118099 (Z=10.6838)
So, CEU appears = 0.00118099/0.0084678 = 13.9% East Eurasian
With scrubbed Sardinians:
f4(San,Papuan;Sardinian_scrubbed,Karitiana) = 0.00774427 (SE=0.00056725, Z=13.6523)
f4(San,Papuan;Sardinian_scrubbed,CEU) = 0.000678108 (SE=0.000167341, Z=4.05225)
So, CEU appears = 0.000678108/0.00774427 = 8.8% East Eurasian
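The numbers given for the scrubbed comparison are consistent with an estimate, its standard error, and a Z-score (Z = estimate / standard error). The short check below reads them that way, which is my interpretation of the values quoted above, and then repeats the ratio for regular and scrubbed Sardinians. The variable names are illustrative.

```python
# (estimate, standard error, Z) as quoted in the text for the scrubbed comparison.
scrubbed_rows = {
    "Karitiana": (0.00774427, 0.00056725, 13.6523),
    "CEU": (0.000678108, 0.000167341, 4.05225),
}

# Z should be approximately estimate / standard error.
for pop, (est, se, z) in scrubbed_rows.items():
    print(f"{pop}: estimate/SE = {est / se:.4f}, reported Z = {z}")

# Apparent East Eurasian-like fraction in CEU: f4(...;Sardinian,CEU) / f4(...;Sardinian,Karitiana)
ceu_east_regular = 0.00118099 / 0.0084678
ceu_east_scrubbed = scrubbed_rows["CEU"][0] / scrubbed_rows["Karitiana"][0]

print(f"regular Sardinians:  {ceu_east_regular:.1%}")
print(f"scrubbed Sardinians: {ceu_east_scrubbed:.1%}")
```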
Conclusion
My "palimpsest" idea seems to be confirmed by the data. A first observation is that the level of African-like admixture in Sardinians depended on whether one used Papuans or Karitiana as an outgroup, suggesting that neither population was a true outgroup, and the signal of African admixture in Sardinians was driven in part by East Eurasian-like admixture in CEU. African admixture in Europe cannot be assessed accurately if one ignores the confounding effect of East Eurasian admixture.
When I aggressively scrubbed Sardinians so as to remove all traces of African ancestry, part of the African admixture fraction disappeared (expected, since African ancestry was removed from Sardinians), but a substantial part of it remained (unexpected if the signal was driven only by African admixture, but expected if it was driven in part by East Eurasian-like admixture in CEU). Conversely, using scrubbed Sardinians reduced, but did not eliminate, the East Eurasian-like admixture estimate for CEU.