Ancient European origins
There was a tip about this paper in the recent study of Native American origins. It was suggested that northern Europeans have an excess of central/east Eurasian-related ancestry relative to Sardinians. I had noticed over a year ago that northern Europeans tended to be Asian-shifted relative to Mediterranean Europeans, and when the same effect was hinted at in the Native American paper, I set out to explore the issue in a series of posts using the f4 and f3 statistics. So it's great to finally see the formal treatment of the same subject.
Figure 9 from the paper shows a scenario envisioned by the authors as consistent with the evidence:
In my earlier post I had suggested a couple of explanations for this pattern, including an Asian-shift of the Mesolithic substratum which contributes more to northern than southern Europeans, as well as a possible influence by a northern stream of Indo-European invasion into Europe. The rolloff age estimate in the paper is:
In Figure 7e we show the rolloff results. The signal is clear enough, though noisy. We estimate an admixture date of 4150 ± 850 B.P. Our standard errors computed using a block jackknife (block size=5cM) are uncomfortably large here.
However this date must be treated with great caution. We obtained a data set from the Illumina iControl database (http://www.illumina.com/science/icontroldb.ilmn) of ‘Caucasians’ and after curation have 1,232 samples of European ancestry genotyped on an Illumina SNP array panel. We merged the data with the HGDP Illumina 650Y genotype data obtaining a data set with 561,268 SNPs. Applying rolloff to this sample with HGDP Karitiana and Sardinians as sources, we get a much more recent date of 2200 ± 762 years B.P.
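The quoted standard errors come from a block jackknife over 5 cM blocks. As a rough illustration of the idea (not the paper's code), here is a minimal delete-one-block sketch in Python. The function name, the synthetic per-SNP values, and the equal weighting of blocks are all my assumptions; the paper's software may weight blocks differently.

```python
import numpy as np

def block_jackknife_se(values, block_ids, statistic=np.mean):
    """Delete-one-block jackknife standard error (unweighted; an assumption)."""
    values = np.asarray(values, dtype=float)
    block_ids = np.asarray(block_ids)
    blocks = np.unique(block_ids)
    g = len(blocks)
    # Recompute the statistic with each block of SNPs left out in turn.
    leave_one_out = np.array(
        [statistic(values[block_ids != b]) for b in blocks]
    )
    spread = np.sum((leave_one_out - leave_one_out.mean()) ** 2)
    return float(np.sqrt((g - 1) / g * spread))

# Hypothetical data: 100 SNPs spaced 0.5 cM apart, grouped into 5 cM blocks.
positions_cm = np.arange(0, 50, 0.5)
block_ids = (positions_cm // 5).astype(int)
per_snp_values = np.random.default_rng(0).normal(size=positions_cm.size)

se = block_jackknife_se(per_snp_values, block_ids)
print(f"jackknife SE of the mean: {se:.4f}")
```

Dropping whole blocks rather than single SNPs is what lets the error estimate respect the correlation between nearby markers.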
If the admixture event was related to admixture between Neolithic and Mesolithic peoples, one might guess that the admixture date would be earlier. On the other hand, the evidence shows that down to 5,000 years ago, there were farmers in Europe who were like modern Sardinians, and hunter-gatherers who were ultra-North European (even more than current north Europeans), so fusion between incoming and resident groups was not a one-time deal when they first met. A recent mtDNA study also suggests that farmers and hunter-gatherers did not completely fuse until 4,000 years BP, after which time their distinctive mtDNA types begin to expand in unison.
In my opinion, the fusion may have been effected post-5ka after the arrival of Indo-Europeans into most of Europe. Before that time, there lived in Europe groups who had either a lot or a little Neolithic ancestry. The IE invasion acted as a shock that broke down old loyalties and brought together different groups whose focus was the new military/trading elite associated primarily with metallurgy. This invasion could have acted as a source (in its northern stream) of East Eurasian-like ancestry, since it spread east-west and passed through territory where evidence of east Eurasian mtDNA has been turning up; but it could also have acted as a blender, creating a new variable mix out of the "apples and oranges" that existed in Europe prior to 5ka.
In any case, the authors of the current paper discuss the Mesolithic vs. Indo-European issue:
Ancient DNA studies have documented a clean break between the genetic structure of the Mesolithic hunter-gatherers of Europe and the Neolithic first farmers who followed them. Mitochondrial analyses have shown that the first farmers in central Europe, belonging to the Linear Pottery culture (LBK), were genetically strongly differentiated from European hunter-gatherers (BRAMANTI et al., 2009), with an ‘affinity’ to present day Near Eastern and Anatolian populations (HAAK et al., 2010). More recently, new insight has come from analysis of ancient nuclear DNA from three hunter-gatherers and one Neolithic farmer who lived roughly contemporaneously at about 5000 years B.P. in what is now Sweden (SKOGLUND et al., 2012). The farmer’s DNA shows a signal of genetic relatedness to Sardinians that is not present in the hunter-gatherers who have much more relatedness to present-day northern Europeans. These findings suggest that the arrival of agriculture in Europe involved massive movements of genes (not just culture) from the Near East to Europe and that people descending from the Near Eastern migrants initially reached as far north as Sweden with little mixing with the hunter-gatherers they encountered. However, the fact that today, northern Europeans have a strong signal of admixture of these two groups, as proven by this study and consistent with the findings of (SKOGLUND et al., 2012), indicates that these two ancestral groups subsequently mixed.
Combining the ancient DNA evidence with our results, we hypothesize that agriculturalists with genetic ancestry close to modern Sardinians immigrated into all parts of Europe along with the spread of agriculture. In Sardinia, the Basque country, and perhaps other parts of southern Europe they largely replaced the indigenous Mesolithic populations, explaining why we observe no signal of admixture in Sardinians today to the limits of our resolution. In contrast, the migrants did not replace the indigenous populations in northern Europe, and instead lived side-by-side with them, admixing over time (perhaps over thousands of years). Such a scenario would explain why northern European populations today are admixed, and also have a rolloff admixture date that is substantially more recent than the initial arrival of agriculture in northern Europe. (An alternative history that could produce the signal of Asian-related admixture in northern Europeans is admixture from steppe herders speaking Indo-European languages, who after domesticating the horse would have had a military and technological advantage over agriculturalists (ANTHONY, 2007). However, this hypothesis cannot explain the ancient DNA result that northern Europeans today appear admixed between populations related to Neolithic and Mesolithic Europeans (SKOGLUND et al., 2012), and so even if the steppe hypothesis has some truth, it can only explain part of the data.)
Another application of the new methodology is to Spain, where many analyses (including some of the Dodecad Project) have shown that the population has both a "Mediterranean" and a "North European" component. The authors date this admixture to 3,600 +/- 400 BP, and they associate it with Bell Beaker-related backflow into Iberia. However, a newer study that probably appeared when this paper was in review showed that Mesolithic Iberians were also North European-like. So, one probably does not need a special explanation for their case: the Neolithic/Mesolithic mix that occurred in Scandinavia probably also occurred in Spain. The 3.6ky signal for North European/Sardinian-like admixture in Spain is similar to the 4.15ky signal of North Eurasian/Sardinian admixture in northern Europe. Both cases may reflect the same event. The authors point out that these dates are inconsistent with Visigoths and the like contributing a major portion of north European ancestry to Spain, consistent with the Ralph and Coop (2012) study. It might even be tempting to ascribe the small ~0.5k difference in the age of the signal to this later migration, or even to Celtic-related migrations, since the Celts, based on phenotypic descriptions by ancient authors, belonged to a substantial degree to the northern Europeoids.
It will certainly be interesting to study the Beaker folk's autosomal DNA in relation to European prehistory, as R1b makes its first appearance with them on the European scene. Were they the people who brought North European/East Eurasian-like ancestry into Iberia, or did the pre-existing I folk already possess it? As more ancient DNA is sampled, so will our ideas about the sequence of events be better informed. (If Iron Age people from Bulgaria were also like Sardinians, then, as they say, the plot thickens.)
Dates of admixture with rolloff
Here are a couple of examples of the rolloff fit of an exponential distribution that is used to estimate dates of admixture:
First, the Uygur (790 ± 60 years ago) show a very good fit, and interesting things were happening in Central Asia in the 13th century.
They migrated southwards over many centuries, with large herds of Nguni cattle, probably entering what is now South Africa around 2,000 years ago in sporadic settlement, followed by larger waves of migration around 1400 AD.
A little early sporadic settlement and a large pulse at 1400 AD may very well average to something very close to the given date.
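To make the exponential fit concrete, here is a minimal Python sketch of how a decay curve over genetic distance translates into a date. It is not the rolloff program itself: the log-linear fit, the function name, the synthetic curve, and the 29 years per generation are all assumptions of mine, and the real software fits a noisier curve with a more careful procedure.

```python
import numpy as np

YEARS_PER_GENERATION = 29  # assumption; other generation times give other dates

def fit_admixture_generations(dist_morgans, weighted_ld):
    """Fit A * exp(-n * d) by least squares on log(y); return n in generations."""
    d = np.asarray(dist_morgans, dtype=float)
    y = np.asarray(weighted_ld, dtype=float)
    keep = y > 0  # the log is only defined for positive values
    slope, _intercept = np.polyfit(d[keep], np.log(y[keep]), 1)
    return -slope

# Synthetic, noiseless curve: admixture LD decaying over 0.5 to 20 cM.
true_generations = 27.0  # hypothetical value chosen for illustration
dist = np.linspace(0.005, 0.2, 40)  # genetic distance in Morgans
curve = 0.05 * np.exp(-true_generations * dist)

n_hat = fit_admixture_generations(dist, curve)
years = n_hat * YEARS_PER_GENERATION
print(f"{n_hat:.1f} generations, about {years:.0f} years")
```

The faster the curve decays with distance, the more generations of recombination have passed since the mixture, and the older the inferred date.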
It will certainly be fun to apply the same method to other data. I had waited for rolloff since it was originally announced, and "good things come to those who wait". One interesting test case might be that of Anatolian Turks, where, presumably, there were two episodes of admixture: an early one in Central Asia between West and East Eurasian people, and a second, recent one in Anatolia, when Central Asian Turkic speakers admixed with some of the pre-Turkic inhabitants. Another will probably be that of ANI-ASI admixture in South Asia; the group behind this paper has presented this research at a conference, so I'm guessing there's another paper on that topic as well. A third might be the perhaps even more mysterious admixture in the case of West Africans.
The authors also announce the Affymetrix Human Origins Array which is based on ascertainments included in the Harvard HGDP set, and which I've been occasionally using in some of my own experiments. This new chip was recently used in the South African study. A new curated version of the HGDP set that removes outliers is also announced:
We successfully genotyped the array in 934 samples from the HGDP, and made the data publicly available on August 12 2011 at ftp://ftp.cephb.fr/hgdp_supp10/. The present study analyzes a curated version of this dataset in which we have used Principal Component Analysis (Patterson 2006) to remove samples that are outliers relative to others from their same populations; 828 samples remained after this procedure. This curated dataset is available for download from the Reich laboratory website (http://genetics.med.harvard.edu/reich/Reich_Lab/Datasets.html).
UPDATE (8 Sep 2012): The following discussion in smallcase is now obsolete. See "f-statistics are robust to differences in sample age" for details.
Addendum on the applicability of tests of admixture to samples of different age
Finally, since the authors study f-statistics with the Tyrolean Iceman, it is worthwhile to link to one of my recent posts on the topic. I've made a small figure to repeat the main argument of that post:
Canc is an ancient individual whose genome has been sampled and who belongs to the lineage leading up to Cmod. As such, he is missing a few thousand years of evolution (shown with the dashed line). On the left figure, Canc is very old (so he is missing a lot of evolution), while on the right, he is fairly recent (so he is missing only a little).
You can mentally slide Canc up and down its branch. As it approaches Cmod (right), Canc will appear unadmixed, because it will have "experienced" almost as many years of evolution as A has, and will be separated from B by exactly the same amount of evolution as A is. But, as Canc becomes older (left) and approaches the Root, it will become much more related to B than A is, only on account of it being older. A test that compares Canc with A and B may conclude that Canc is a mixture of A and B.
Here is another way to explain this:
Let B be the allele found in group B and A be the allele in group A.
The pattern ABB means that Canc has B, and hence matches B. This is consistent with admixture from B-to-Canc if the allele B first appeared on the B branch of the tree.
But, it is also consistent with B being an allele at the Root that went to both sides of the tree: Canc is more likely to match the allele at the root (because he's older, closer to the Root) than A (who's younger, so an allele at the root has had more time to be lost due to drift, or a new one to appear through mutation).
Now, I don't think this effect has played a major role in the analyses presented in this paper for the Iceman, because if A=North European, Canc=Iceman, and Cmod=Sardinians, then the f statistics show that it is A that is admixed with a B=Karitiana-like population. It might play a role in the Neandertal-like excess identified for the Iceman, because in that case A=Europeans, B=Vindija, and it is the Iceman that appears more admixed vis-à-vis living Europeans. I do think, however, on the basis of my ancestry map, that even if some of the signal is due to the proposed effect, not all of it is, since Neandertal-like segments in Oetzi tend to correspond with segments likely to be of European pre-Neolithic ancestry. And if pre-Neolithic Europeans were indeed more Neandertal-like, then that is another thing in which they may have resembled East Asians.
In any case, I do believe that some thought needs to be given to tests of admixture when either (i) one of the samples is an ancient genome, or (ii) there is some reason to think that the per annum rate of evolution has been different in two populations, which would "mimic" a closer/more distant relationship to the root. As we develop the ability to sample near 100ky-old samples, strange effects might appear if a test that does not take account of differences in sample ages is used.
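For readers who have not used them, the f3 and f4 statistics behind these tests are just averages of products of allele frequency differences. Below is a minimal Python sketch under stated assumptions: the frequencies are made-up toy values, the function names are mine, and the finite-sample bias correction that real estimators apply to f3 is omitted.

```python
import numpy as np

def f4(a, b, c, d):
    """f4(A, B; C, D): mean of (a - b) * (c - d) across SNPs."""
    a, b, c, d = (np.asarray(x, dtype=float) for x in (a, b, c, d))
    return float(np.mean((a - b) * (c - d)))

def f3(c, a, b):
    """f3(C; A, B): mean of (c - a) * (c - b); negative values suggest C is admixed."""
    c, a, b = (np.asarray(x, dtype=float) for x in (c, a, b))
    return float(np.mean((c - a) * (c - b)))

# Toy allele frequencies at four SNPs (hypothetical, not real data).
freq_a = np.array([0.1, 0.8, 0.3, 0.6])
freq_b = np.array([0.7, 0.2, 0.9, 0.1])
freq_c = 0.5 * freq_a + 0.5 * freq_b  # C built as an even mix of A and B

f3_value = f3(freq_c, freq_a, freq_b)
f4_value = f4(freq_a, freq_b, freq_a, freq_b)
print(f3_value, f4_value)
```

Because C sits exactly between A and B at every SNP, each product in f3 is negative, which is the signature the admixture test looks for.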
To cap this long post, the new paper represents an exciting combination of new data, software, methods, and interpretation, that will probably give genome bloggers and all those interested in human history a lot to think about and/or play with in the coming months and years. I will certainly be unbundling and trying out the new ADMIXTOOLS suite.
UPDATE: Razib also covers this new paper.
Genetics doi: 10.1534/genetics.112.145037
Ancient Admixture in Human History
Nick Patterson et al.
Population mixture is an important process in biology. We present a suite of methods for learning about population mixtures, implemented in a software package called ADMIXTOOLS, that support formal tests for whether mixture occurred, and make it possible to infer proportions and dates of mixture. We also describe the development of a new single nucleotide polymorphism (SNP) array consisting of 629,433 sites with clearly documented ascertainment that was specifically designed for population genetic analyses, and that we genotyped in 934 individuals from 53 diverse populations. To illustrate the methods, we give a number of examples where they provide new insights about the history of human admixture. The most striking finding is a clear signal of admixture into northern Europe, with one ancestral population related to present day Basques and Sardinians, and the other related to present day populations of northeast Asia and the Americas. This likely reflects a history of admixture between Neolithic migrants and the indigenous Mesolithic population of Europe, consistent with recent analyses of ancient bones from Sweden and the sequencing of the genome of the Tyrolean ‘Iceman’.