September 20, 2012

rolloff analysis of North European admixture in Greeks.

One of the signals of admixture in my recent post on the Greeks on the crossroads of Eurasia was between north European and Near Eastern populations, with several pairs of such populations showing a significant negative f3(Greek_D; North European, Near Eastern) statistic. I used rolloff to estimate the date of this admixture.
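For readers unfamiliar with the statistic, here is a minimal sketch of why a negative f3 points to admixture. It assumes the simplest possible estimator, the mean over SNPs of (c - a)(c - b), and omits the finite-sample correction that real software applies. The function name and the toy allele frequencies are made up for illustration and are not taken from the actual data.

```python
def f3(target, ref_a, ref_b):
    """Naive f3(target; ref_a, ref_b) from per-SNP allele frequencies.

    No finite-sample bias correction is applied (an assumption made
    to keep the sketch short).
    """
    if not target or not (len(target) == len(ref_a) == len(ref_b)):
        raise ValueError("need equal-length, non-empty frequency lists")
    total = sum((c - a) * (c - b) for c, a, b in zip(target, ref_a, ref_b))
    return total / len(target)


# Toy, invented frequencies for two diverged reference populations.
north = [0.10, 0.80, 0.30, 0.60]
south = [0.70, 0.20, 0.90, 0.10]

# A hypothetical 50/50 mixture sits between the references at every SNP,
# so each product (c - a) * (c - b) is negative.
mixed = [0.5 * n + 0.5 * s for n, s in zip(north, south)]

f3_mixed = f3(mixed, north, south)    # negative: consistent with admixture
f3_control = f3(north, mixed, south)  # positive: north is not "between" the others
```

In real data, drift in the target population after admixture pushes the statistic upward, so a positive value does not rule out admixture; only a significantly negative value is informative.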

Note that rolloff assumes a pulse model of admixture, whereby the two populations mix at a single point in time rather than experiencing gene flow over a protracted period. This may not be applicable in the case of Greeks, since gene flow may have occurred repeatedly throughout history. Also, rolloff estimates admixture times in the absence of very accurate ancestral populations, by exploiting allele frequency differences between the reference populations. So, for the first example below, which uses Finnish_D and Yemen_Jews as reference populations (the pair that showed the most negative f3 statistic), this does not mean that Greeks are the product of admixture between Finns and Yemen Jews. Rather, allele frequency differences between these two populations reflect a contrast between North European and Near Eastern populations, which may presumably map to the west Eurasian cline of diminishing Near Eastern "Neolithic" ancestry.

This experiment was performed on a set of 292,223 SNPs, using the Rutgers map for Illumina chips. The first plot uses Finnish_D and Yemen_Jews as references. The fit does not look very convincing visually, perhaps due to a smaller number of SNPs, or to the aforementioned deviation from the pulse admixture model.
(NB: I used the expfit.sh script with default parameters to do the exponential fit/plotting; note that negative weighted correlation points are not visible in the produced output)
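To make the fitting step concrete: under the pulse model, the weighted correlation at genetic distance d (in Morgans) is expected to decay roughly as A * exp(-n * d), where n is the number of generations since admixture. The sketch below is not a reimplementation of expfit.sh; it assumes a plain log-linear least-squares fit, and the function name and the synthetic curve are invented for illustration.

```python
import math


def fit_decay_generations(dist_cm, corr):
    """Estimate n in corr ~ A * exp(-n * d), with d in Morgans.

    Uses ordinary least squares on log(corr). Non-positive correlation
    values cannot be log-transformed and are dropped (a simplification).
    """
    pts = [(d / 100.0, math.log(y)) for d, y in zip(dist_cm, corr) if y > 0]
    if len(pts) < 2:
        raise ValueError("need at least two positive correlation values")
    mean_x = sum(x for x, _ in pts) / len(pts)
    mean_y = sum(y for _, y in pts) / len(pts)
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    return -sxy / sxx  # slope of log(corr) against Morgans is -n


# Synthetic, noise-free curve for a hypothetical pulse 88 generations ago.
true_n = 88
dist_cm = [0.5 * k for k in range(1, 41)]  # 0.5 cM to 20 cM
corr = [0.02 * math.exp(-true_n * d / 100.0) for d in dist_cm]

est_n = fit_decay_generations(dist_cm, corr)
```

With noisy real data, dropping the non-positive points biases a log-linear fit, so a nonlinear least-squares fit on the untransformed values would be the safer choice.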

The jackknife estimate of the date of this admixture is 87.849 +/- 20.254 generations, or, assuming a generation length of 29 years, 2,550 +/- 590 years.
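As a rough illustration of where a "+/-" of this kind comes from, the sketch below assumes an unweighted delete-one jackknife in which the date is re-estimated with each chromosome left out in turn. Whether rolloff weights the blocks differently is not covered here, and the leave-one-out dates are invented, not the values from this run.

```python
import math


def jackknife(leave_one_out):
    """Mean and standard error from delete-one jackknife replicates."""
    g = len(leave_one_out)
    if g < 2:
        raise ValueError("need at least two leave-one-out estimates")
    mean = sum(leave_one_out) / g
    var = (g - 1) / g * sum((x - mean) ** 2 for x in leave_one_out)
    return mean, math.sqrt(var)


# Hypothetical dates (in generations) obtained by dropping one block at a time.
toy_dates = [86.0, 88.0, 90.0]
jk_mean, jk_se = jackknife(toy_dates)
```

The (g - 1) / g factor inflates the spread of the replicates, since each replicate shares most of its data with the others.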

The second plot uses Polish_D and Saudis as references:
The jackknife gives 67.235 +/- 22.148 generations, or 1,950 +/- 640 years. The fit seems to capture the rise of the exponential at smaller genetic distances (in cM) reasonably well.
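The conversion from generations to years for both reference pairs can be checked with a few lines. This is only the arithmetic described in the text (29 years per generation); the function name is made up, and uncertainty in the generation length itself is not propagated.

```python
def generations_to_years(gens, se_gens, years_per_gen=29):
    """Scale a date and its standard error from generations to years."""
    return gens * years_per_gen, se_gens * years_per_gen


# Finnish_D / Yemen_Jews references
finnish_yemen = generations_to_years(87.849, 20.254)
# Polish_D / Saudis references
polish_saudi = generations_to_years(67.235, 22.148)

for label, (years, se) in [("Finnish_D/Yemen_Jews", finnish_yemen),
                           ("Polish_D/Saudis", polish_saudi)]:
    # Round to the nearest decade, as in the text.
    print(f"{label}: {round(years, -1):.0f} +/- {round(se, -1):.0f} years")
```

The two intervals overlap substantially, so the two reference pairs do not give significantly different dates.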

Recently, Ralph and Coop used fastIBD to identify a signal of possible Slavic admixture in the Balkans dating to the medieval period, using a similar generation time of 30 years. That method uses shared IBD segments between populations, so it may be limited to uncovering the most recent signal of admixture. Another piece of evidence comes from an abstract in ASHG 2012, according to which an Iron Age individual from Bulgaria was Sardinian-like. Since the Iron Age starts at the conclusion of the 2nd millennium BC, it might seem that a northern European element (whether present or not) had not yet admixed with the people who lived in the Balkans at the time. This seems to parallel the situation in two other, earlier cases (c. 5ka, in the Tyrolean Iceman and the Gok4 Swedish TRB farmer), in which the North_European component was absent, although we cannot yet exclude its presence in the Iron Age Bulgarian, since a little such admixture might still leave an individual mostly Sardinian-like. Finally, levels of the North_European component in Greek individuals seem fairly variable, and this might indicate that this element of ancestry has not had sufficient time to even out in the population.

(A different possibility is that the admixture signal reflects admixture of a Near Eastern kind. I consider this less likely, since there is evidence of "Southern" and "Southwest Asian" ancestry (in the K7b/K12b sense) already in Neolithic Europe.)

More research on the issue is certainly needed, but a first reading of the evidence suggests that this type of admixture may reflect events that took place during the historical period of Greek history.

Each of these experiments took about 1.5 days to complete. I am currently running another set of experiments with ~2-fold more SNPs, and assuming that finishes in good time, I may re-visit the question addressed in this post, to see if standard errors decrease and/or time estimates change with denser coverage.

10 comments:

kons7 said...

So, in layman's terms, what percentage of the Greek population contains northern European admixture?

Davidski said...

It seems rolloff gets confused by signals from multiple admixture events from similar sources. So if there are different layers of Northern European admixture in Southern Europe, rolloff will focus on the main one, but also use data from the others, and then come up with a skewed age estimate that's too recent for the main event.

I hope they fix this issue before more papers are released, because when they start applying rolloff to West and Central Asian population history, they'll shift everything that's happened forward by a couple thousand years.

Grey said...

kons
If the component called north european is the original layer then won't it be all of them?

Random thought

It seems to me there are four main routes into Europe and regardless of the significance or timing along each route the proportions of admixture are likely to have tapered off as you move further from the origin of each.

I see the four routes as
1) Coastal, med then atlantic
2) Danubian
3) Northern - steppe level
4) Northern - subarctic level

In each case there should be a minimum and a maximum zone based on distance from the origin.

One of the interesting things about this is that Ireland/Scotland are at the minimum of three and a long way from the maximum of the fourth (if we take that as being Finland), whereas the Baltic is similarly at the minimum of three but near the maximum of the fourth, so I'm wondering if those ancient writers may have been right about the red hair.

Onur said...

Rolloff tends to date admixtures to more recent times than the real admixture times. It needs to be fixed ASAP.

Dienekes said...

Rolloff tends to date admixtures to more recent times than the real admixture times. It needs to be fixed ASAP.


And, you know the 'real admixture times' how?

Onur said...

And, you know the 'real admixture times' how?

Admixture datings of rolloff are at odds with the times of several historically known admixture events.

A clear example is the rolloff dating of the Caucasoid-Mongoloid admixture in Uyghurs to the 13th century (the Mongol invasion era) by Patterson et al. (2012). We know from history that the Tarim basin (=the land of the present-day Uyghurs) was Turkicized several centuries before the Mongol invasions. So in the Uyghur example there is a clear discrepancy between the rolloff result of Patterson et al. (2012) and historical facts. Even Patterson et al. (2012) noted the historical implausibility of their rolloff Caucasoid-Mongoloid admixture dating of Uyghurs and tried to explain it with rolloff's tendency to favor recent dates.

Dienekes said...

We know from history that the Tarim basin (=the land of the present-day Uyghurs) was Turkicized several centuries before the Mongol invasions.

"Turkicized" is a linguistic category. rolloff estimates the age of admixture, not the age of arrival of a foreign population.

Onur said...

"Turkicized" is a linguistic category. rolloff estimates the age of admixture, not the age of arrival of a foreign population.

Most of the Tarim basin was Turkicized centuries before the Mongol invasion and the Tocharian languages (the most widespread pre-Turkic languages of the Tarim basin) went extinct (again, centuries before the Mongol invasion) as a result of the Turkicization. So it is clear that the genetic impact of the Turkic peoples on the Tarim basin must be mostly from centuries before the Mongol invasion. Also, we know from ancient mtDNA studies that the Tarim basin has exhibited significant Mongoloid admixture since at least the early Bronze Age (the earliest period of the Tarim basin for which we have DNA samples). Hence, it is certain that rolloff is off the track in the Uyghur case.

Nick Patterson (Broad) said...

"Rolloff ... needs to be fixed asap"

well gee thanks; pray tell me how. Very extensive simulations in the main paper (ancient admixture...) show rolloff to be very well calibrated and robust, at least under simple scenarios. And complex scenarios are hard (for everybody). In the Uyghur the genetic signal and date are especially clear. It's unlikely that the signal is "wrong", but how to interpret it is not so clear.

Onur said...

well gee thanks; pray tell me how. Very extensive simulations in the main paper (ancient admixture...) show rolloff to be very well calibrated and robust, at least under simple scenarios. And complex scenarios are hard (for everybody). In the Uyghur the genetic signal and date are especially clear. It's unlikely that the signal is "wrong", but how to interpret it is not so clear.


Archaeogenetics and history tell us that present-day Uyghurs are the result of a fluctuating Caucasoid-Mongoloid mix in the Tarim basin within the past several millennia. There have been multiple migration events from multiple locations into the Tarim basin within that time that shifted the mix in either a more Caucasoid or a more Mongoloid direction; add to these some South Asian genetic influence. So it is clear that the Uyghur case is far from being a simple one-time admixture event such as the African American and Hispanic cases, and thus its rolloff dating should not be taken at face value and should instead be disregarded in light of all the contrary evidence. Rolloff is insufficient when it comes to complex scenarios such as the Uyghur case. I know that it needs to be fixed. But I don't know whether it can be fixed, or how. That is the job of geneticists and their programmer assistants.