October 18, 2012

ADMIXTURE tracks Amerindian-like admixture in northern Europe

I have recently assembled a new "world" dataset of 4,280 individuals that I am currently incrementally analyzing with ADMIXTURE. But, I noticed an interesting pattern at K=4 that I wanted to share right away.

Four ancestral populations emerge at this level of resolution, which I have named: European, Asian, African, Amerindian. The names aren't important, and you can replace them with whatever you prefer.

The interesting thing about this K=4 analysis is that European populations show evidence of Amerindian admixture, consistent with the pattern inferred using f-statistics, where European populations show admixture between Sardinians and a Karitiana-like population.

This pattern may have emerged in previous ADMIXTURE analyses at this level of resolution, but thanks to the f3 evidence presented in previous posts, it is now clear that it is no quirk of ADMIXTURE, but is indicative of a real (albeit still rather mysterious) pattern of gene flow that differentially affected European populations.

For example, the Irish_D population has 7.6% of the Amerindian component, and so do HGDP Orcadians. HGDP Sardinians have only 1.7% of it, which appears to be the minimum in Europe, with French_Basque having more at 4.6%.

Another interesting observation is that West Eurasian populations that show an excess of East Eurasian-like admixture appear to do so for two separate reasons. For example, HGDP Russians have 11.7% of the Amerindian component, but also 4.5% of the "Asian" one, and 1000 Genomes Finns have 3.3% Asian and 12% Amerindian. Behar et al. (2010) Turks, on the other hand, have 9.9% Asian and 2.2% Amerindian. All these populations are East Eurasian-shifted relative to Sardinians, a pattern which can also be observed by looking at the K=3 analysis, but for apparently different reasons.

The pattern for Near Eastern populations is also interesting. For example, Yunusbayev et al. (2011) Armenians have 0% of the Amerindian component, and 5.7% of the Asian, and all three HGDP Arab populations (Druze, Palestinian, Bedouin) also have 0% of the Amerindian component, with variable levels of the Asian.

It would appear that whatever process contributed Amerindian-like admixture to Europeans minimally affected Near Eastern populations, with Sardinians, who are demonstrably related to Neolithic Europeans (thanks to ancient DNA evidence), tilting towards the Near Eastern pattern. On the other hand, Near Eastern populations show evidence of Asian admixture, which probably involves unresolved East Asian/ASI ancestry, and will be resolved at higher K. Sardinians appear to be at the end of three clines: (i) Amerindian-like cline of Europe-Siberia-Americas, (ii) East Asian-like cline of Europe-Central Asia/Siberia-East Asia, (iii) ASI-like cline of Europe-Near East-South Asia. These are separate, but not independent phenomena.

To confirm that the signal picked up by ADMIXTURE tracks the signal picked up by ADMIXTOOLS formal tests, I calculated the following D-statistic:

D(Sardinian, European, Karitiana, San)

where European is any population with a sample size of at least 10, and which belonged at 99% in the European+Amerindian components:


And, here is a scatterplot:
The correlation is clear, and the Pearson coefficient is -0.96. The coefficient is negative because populations with a higher % Amerindian, as estimated by ADMIXTURE, have lower D values, which with this ordering of populations means stronger D-statistic evidence for admixture.
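
As a rough illustration of why the coefficient comes out negative, here is a minimal Python sketch of a D-statistic computed from per-SNP allele frequencies, using the numerator and denominator form given in Patterson et al. (2012). It is not the ADMIXTOOLS implementation and it does not use the real data: the allele frequencies are simulated, the "European" populations are modelled (as an assumption) as simple mixtures of the Sardinian and Karitiana frequencies, and the admixture fractions in `alphas` are hypothetical.

```python
import random
from math import sqrt

def d_statistic(w, x, y, z):
    """D(W, X; Y, Z) from per-SNP allele frequencies (lists of equal length)."""
    num = sum((wi - xi) * (yi - zi) for wi, xi, yi, zi in zip(w, x, y, z))
    den = sum((wi + xi - 2 * wi * xi) * (yi + zi - 2 * yi * zi)
              for wi, xi, yi, zi in zip(w, x, y, z))
    return num / den

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    var_x = sum((a - mx) ** 2 for a in xs)
    var_y = sum((b - my) ** 2 for b in ys)
    return cov / sqrt(var_x * var_y)

# Simulated (not real) allele frequencies for three reference populations.
random.seed(0)
n_snps = 5000
sardinian = [random.uniform(0.05, 0.95) for _ in range(n_snps)]
karitiana = [random.uniform(0.05, 0.95) for _ in range(n_snps)]
san = [random.uniform(0.05, 0.95) for _ in range(n_snps)]

# Hypothetical Karitiana-like admixture fractions for toy "European" populations.
alphas = [0.0, 0.03, 0.06, 0.09, 0.12]
ds = []
for a in alphas:
    european = [(1 - a) * s + a * k for s, k in zip(sardinian, karitiana)]
    ds.append(d_statistic(sardinian, european, karitiana, san))

r = pearson(alphas, ds)
print(ds)
print(r)
```

In this toy setup, D(Sardinian, European; Karitiana, San) is zero for the unadmixed population and becomes more negative as the Karitiana-like fraction grows, so the correlation between admixture fraction and D is strongly negative, which is the same direction as the -0.96 reported above.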

What of the actual estimates of admixture produced by ADMIXTURE? Using the F4 ratio test, I recently showed that African admixture in Sardinians confounds estimates of Amerindian-like admixture in northern Europeans and vice versa (Amerindian-like admixture in northern Europeans confounds African admixture in Sardinians).

In that experiment, I "scrubbed" Sardinians to remove segments of African ancestry, and showed that estimates of Amerindian-like admixture in the CEU population diminished from 13.9% to 8.8%. The latter seems reasonably close to the 7.1% inferred by ADMIXTURE.

On balance, I would say that ADMIXTURE at K=4 provides a good proxy for the effect described in Patterson et al. (2012). Its results are more difficult to interpret, because its underlying model does not take into account evolutionary relationships between populations. On the other hand, it has the advantage of being able to handle multiple ancestral populations, and has consistently proven able to generate useful data that correlate well with those from other techniques of population genetics.

22 comments:

Onur said...

"What of the actual estimates of admixture produced by ADMIXTURE? Using the F4 ratio test, I recently showed that African admixture in Sardinians confounds estimates of Amerindian-like admixture in northern Europeans and vice versa (Amerindian-like admixture in northern Europeans confounds African admixture in Sardinians).

"In that experiment, I "scrubbed" Sardinians to remove segments of African ancestry, and showed that estimates of Amerindian-like admixture in the CEU population diminished from 13.9% to 8.8%. The latter seems reasonably close to the 7.1% inferred by ADMIXTURE.

"On balance, I would say that ADMIXTURE at K=4 provides a good proxy for the effect described in Patterson et al. (2012). Its results are more difficult to interpret, because its underlying model does not take into account evolutionary relationships between populations. On the other hand, it has the advantage of being able to handle multiple ancestral populations, and has consistently proven able to generate useful data that correlate well with those from other techniques of population genetics."


Still, ADMIXTURE only partially detects the ancient Mongoloid-like admixture in Europeans. The Karitiana-like admixture in CEU is certainly higher than 8.8%, a value obtained under an implausibly extreme and by now falsified scenario for the level of Negroid admixture in Sardinians (it is now clear that Sardinians have Karitiana-like admixture too, apparently at a much higher level than their Negroid admixture). So the lowest possible value for the Karitiana-like admixture in CEU is higher than 8.8%. The real value of the Karitiana-like admixture of CEU is probably much higher than that. It may even be higher than 13.9%, as Sardinians too possess Karitiana-like admixture, which has a negative effect on the Sardinian-referenced Karitiana-like admixture estimates of other European populations.

BTW, we know from previous ADMIXTURE analyses that a high proportion of the "Asian" component of K=4 in West Eurasian populations is ASI-related rather than Mongoloid-related. It would be clear at higher Ks.

AK said...

There's a theory Farley Mowat sets out in The Farfarers about a people he calls the "Albians" who inhabited western Europe prior to the neolithic immigration. A generalized version is that there may have been a North Atlantic "zone of interaction" up until the Nordic conquests (of e.g. Iceland), interacting by means of "canoes"/"knerrir", which might (IMO) be an old loan-word originally referring to "coracles". Could this be part of the explanation of the genetic data?

Lank said...

No K=2? That's disappointing.

Mike Keesey said...

Is it possible that these genes are from an ancient boreal Eurasian population? Or is this more recent admixture?

Lank said...

Also, why are there so few Pagani et al. samples?

Dienekes said...

"Also, why are there so few Pagani et al. samples?"

Initial:

AFAR_Pa 12
AMHARA_Pa 26
ANUAK_Pa 23
ARIBLACKSMITH_Pa 17
ARICULTIVATOR_Pa 24
ESOMALI_Pa 17
GUMUZ_Pa 19
OROMO_Pa 21
SOMALI_Pa 23
SUDANESE_Pa 24
TYGRAY_Pa 21
WOLAYTA_Pa 8


After possible relative removal:

AFAR_Pa 9
AMHARA_Pa 25
ANUAK_Pa 14
ARIBLACKSMITH_Pa 8
ARICULTIVATOR_Pa 21
ESOMALI_Pa 15
GUMUZ_Pa 14
OROMO_Pa 17
SOMALI_Pa 17
SUDANESE_Pa 19
TYGRAY_Pa 17
WOLAYTA_Pa 7


After low genotype removal:

AFAR_Pa 1
AMHARA_Pa 9
ANUAK_Pa 6
ARIBLACKSMITH_Pa 5
ARICULTIVATOR_Pa 6
GUMUZ_Pa 2
OROMO_Pa 8
SOMALI_Pa 3
SUDANESE_Pa 5
TYGRAY_Pa 9
WOLAYTA_Pa 1
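
The attrition across the three lists above can be tallied with a short Python sketch. The counts are copied from the lists; the dictionary names are just labels chosen here, and treating a population that is missing from the last list (ESOMALI_Pa) as having zero retained samples is an assumption.

```python
# Sample counts copied from the three lists above.
initial = {
    "AFAR_Pa": 12, "AMHARA_Pa": 26, "ANUAK_Pa": 23, "ARIBLACKSMITH_Pa": 17,
    "ARICULTIVATOR_Pa": 24, "ESOMALI_Pa": 17, "GUMUZ_Pa": 19, "OROMO_Pa": 21,
    "SOMALI_Pa": 23, "SUDANESE_Pa": 24, "TYGRAY_Pa": 21, "WOLAYTA_Pa": 8,
}
after_relative_removal = {
    "AFAR_Pa": 9, "AMHARA_Pa": 25, "ANUAK_Pa": 14, "ARIBLACKSMITH_Pa": 8,
    "ARICULTIVATOR_Pa": 21, "ESOMALI_Pa": 15, "GUMUZ_Pa": 14, "OROMO_Pa": 17,
    "SOMALI_Pa": 17, "SUDANESE_Pa": 19, "TYGRAY_Pa": 17, "WOLAYTA_Pa": 7,
}
after_low_genotype_removal = {
    "AFAR_Pa": 1, "AMHARA_Pa": 9, "ANUAK_Pa": 6, "ARIBLACKSMITH_Pa": 5,
    "ARICULTIVATOR_Pa": 6, "GUMUZ_Pa": 2, "OROMO_Pa": 8, "SOMALI_Pa": 3,
    "SUDANESE_Pa": 5, "TYGRAY_Pa": 9, "WOLAYTA_Pa": 1,
}

# Per-population attrition; a population absent from a later list counts as 0
# (an assumption, since the list simply omits it).
for pop, n0 in initial.items():
    n1 = after_relative_removal.get(pop, 0)
    n2 = after_low_genotype_removal.get(pop, 0)
    print(f"{pop:18s} {n0:3d} -> {n1:3d} -> {n2:3d}")

totals = [
    sum(initial.values()),
    sum(after_relative_removal.values()),
    sum(after_low_genotype_removal.values()),
]
print("Totals:", totals)
```

The totals make the point directly: most of the loss comes at the low-genotype filtering step rather than at relative removal.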

Onur said...

"Is it possible that these genes are from an ancient boreal Eurasian population? Or is this more recent admixture?"

According to the f-statistics results, they must be from admixture with an Amerindian/Siberian-like population. How recent is open to debate to some degree.

Dale Light said...

It has long been argued that Basque and Irish fishermen had been visiting the Grand Banks for many centuries before Columbus and may have set up drying stations on North American coastal areas. This would have entailed some contact with Amerindian populations.

Davidski said...

Not too shabby. But of course we'd never know this is legit without the formal mixture tests saying so.

Can you post the allele frequencies from the K4? I'd like to test myself.

Onur said...

The "South_Asian" component at K=5 is partially Caucasoid, the rest of it being solely ASI (Ancestral South Indians). ASI are genetically closer to Mongoloids than to Caucasoids. In contrast, the "South_Asian" component, due to its significant Caucasoid element, is genetically closer to Caucasoids than to Mongoloids and relatively high in West Asian populations. West Asians possess minor ASI admixture, but surely lower than their amounts of the "South_Asian" component, which is partially Caucasoid.

There is also the issue of how much of the eastern shift of West Asians in formal tests is ASI-related and how much Mongoloid-related. On that issue, I previously wrote at another thread:

"West Asians (including the non-ethnic Russian peoples of the Caucasus) in general, the Iranic speaking ones and some groups in the northern Caucasus in particular, possess minor ASI (=Ancestral South Indian) admixture. Unlike the situation in Europe, where the eastern shift is entirely or almost entirely of Mongoloid origin, the eastern shift in West Asia is only partially of Mongoloid origin, the rest of it being of ASI origin."

The results of this world analysis confirm my above statements.

Hector said...

I am pretty certain that there is no basis to the assumption that ASI are closer to East Asians than Amerindians are to East Asians.

The grouping is rather very incongruous. "Asians" include almost all East Eurasian components while Amerindians, often regarded as a subset of East and Northeast Asians, are treated as a different component.

I am sure the resolution can be improved with minimal additional efforts.

I personally even doubt Eurasians can be simply divided into East and West. In particular the relationship between East Asians and Oceanic groups(New Guineans and Australian Abo) appears tenuous other than from mere statistical artifacts.

Matt said...

"I am pretty certain that there is no basis to the assumption that ASI are closer to East Asians than Amerindians are to East Asians.
The grouping is rather very incongruous."


Reich's 2009 paper gives Dai-ASI as the least distant pair in a Papuan-Dai-ASI-Onge tree (though ASI and Onge form a clade).

In this context, the distances (Fst?) are Dai-Yoruba=0.195, Dai-Papuan=0.157, Dai-Onge=0.119, Dai-ASI=0.076.

http://oi49.tinypic.com/sc9cnc.jpg

(By comparison, Metspalu et al 2011 gives Fst Dai-Papuan=0.180, Dai-Yoruba=0.186, Dai-HanChinese=0.008. Not sure what explains the difference for Dai-Papuan between these two.)

In Dienekes components, Dai is identical with East Asian at K=4 and is 97.7% identical at K=5.
Distances in Dienekes components at K=4: Amerind-East Asian=0.099, East Asian-African=0.162.

So it doesn't seem to be too implausible from other genetic data that ASI and East Asian are more difficult to separate than East Asian and Amerind.

A theory based on Amerinds being a massively reduced subset of East Asians would help explain the high Fst of that component with the East Asian component, but then it would seem like the distance of the Amerind component from the African and European components should be much greater than it is (since there is no reason for Amerind to be a subset that becomes different from East Asian faster than from European and African).

As to why genetic similarities deviate from expectation based on phenotype, perhaps natural selection has a role?

Michael Boblett said...

To address an earlier point, perhaps it's beating a dead horse to debunk ideas of transatlantic voyages by the Irish or the Basques as explanations for Amerindian-like stuff in Europeans, but here's my take:

The Boreal connection suggested by Onur seems better supported by the different Amerindian-like percentages across Northern Europe. Compare the Basques (4.6%) and even the Irish (7.6%) to Ukrainians (8.5%) and Lithuanians (9.1%).

Dienekes, you mentioned those pesky Lithuanians in another context as perhaps preserving some fairly old stuff in Northern European populations:
"It would seem that the Proto-Indo-Europeans mixed with different substrata in the four directions of their expansion: Sardinian-like people in southern Europe, Lithuanian-like people in northern Europe, South Indian-like people in South Asia, and East Eurasians in Siberia and east central Asia." This was in your article on the IE Invasion of the Baltics: http://dienekes.blogspot.com/2012/10/the-indo-european-invasion-of-baltic.html

Any possibility of seeing if Amerindian-like admixture overlaps with possible pre-IE stuff?

Hector said...

to Matt

I am pretty certain that if you run the same test with Buryats or even Manchu as the typical East Asians you will not get the same result.

The haplogroup-sharing of Y chromosome and mtDNA between modern South Asians and East Asians is almost entirely due to East Asians - > South Asians admixture and this also contributes to the counter-intuitive genetic result as ASI is just a theoretical reconstruction.

Dienekes said...

"Any possibility of seeing if Amerindian-like admixture overlaps with possible pre-IE stuff?"

Well, it's found in hunter-gatherers from Gotland. Indo-European was the language of a Neolithic people or even a Copper Age one. So, I'm pretty sure that those Amerindian-like admixed Gotland hunter-gatherers represent a form of pre-IE substratum.

http://dienekes.blogspot.com/2012/10/ancient-european-dna-assessment-with.html

Matt said...

"The haplogroup-sharing of Y chromosome and mtDNA between modern South Asians and East Asians is almost entirely due to East Asians - > South Asians admixture and this also contributes to the counter-intuitive genetic result as ASI is just a theoretical reconstruction."

Well, whatever the demographic model (directions and admixtures), the present day Far East Asians (Han, Dai, Japanese, Korean, Vietnamese, &c.) seem to show higher relatedness to the pre-West Eurasian components of South Asians than they do to Native Americans (at least those in this sample).

That holds whether or not the Ancestral South Indian population experienced gene flow from an East Eurasian-like population prior to its West Eurasian influx (i.e., whether Ancestral South Indian is a composite of East Eurasian-like and "even more Ancestral South Asian" or it isn't).

terryt said...

"In particular the relationship between East Asians and Oceanic groups(New Guineans and Australian Abo) appears tenuous other than from mere statistical artifacts".

I was under the impression that East Asians and Oz/NG are well and truly separated. Hasn't research shown a very sudden change at Wallace's Line? We tend to find different mt-DNA and Y-DNA on opposite sides of the line, although those east of the line make up a small minority of haplogroups west of the line, especially in South China and SE Asia. Whether these populations east of Wallace's Line are related to ASI is yet to be proved (or disproved).

Ebizur said...

Matt, there is simply no way to support a hypothesis that East Asians are more closely related to ASI than East Asians are related to Native Americans. Neither physical anthropology nor haploid genetics would support such a wacky hypothesis (although it is true that, from the viewpoint of physical anthropology, East Asian populations tend to deviate from the Mongoloid extreme, represented by indigenous Americans and eastern Siberians, in the direction of Caucasoids, which drags East Asian populations toward South Asians in plots of the Caucasoid-Mongoloid continuum).

According to the data of Han-Jun Jin, Chris Tyler-Smith, and Wook Kim as published in "The Peopling of Korea Revealed by Analyses of Mitochondrial DNA and Y-Chromosomal Markers," the majority of modern South Koreans share common maternal ancestors with indigenous Americans to the exclusion of South Asians or any other extant human meta-population (105/185 = 56.8% A+B+C+D total). This is a clear reflection of their common Mongoloid ancestry. Most of the remainder of South Koreans belong to typically "Japanese"/"Jōmon" mtDNA haplogroups (19/185 = 10.3% M7, 14/185 = 7.6% N9, 13/185 = 7.0% G; 46/185 = 24.9% M7+N9+G total), none of which is known to be especially closely related to any haplogroup from South Asia, and which also should probably be considered as Mongoloid (or at least "proto-Mongoloid"). The only other major mtDNA haplogroup in modern Koreans is haplogroup F (18/185 = 9.7% F total), a subclade of macro-haplogroup R that is found throughout eastern Eurasia, from Siberia to the Malay Archipelago, and which is not, to my knowledge, commonly found in South Asians.

On the patrilineal side, Koreans and indigenous Americans share C3-M217 in common. The majority of their Y-chromosomes, however, are derived from O-M175 or Q-M242, two clades that coalesce to a common ancestor at the level of MNOPS-M526, not long before the divergence of the subclades of O-M175 and Q-M242. South Asians do not possess any Y-DNA haplogroup that is more closely related to East Asian O-M175 than the latter is related to indigenous American Q-M242.

Lathdrinor said...

There is a great deal of O in eastern India, though this is largely due to eastern India being the backyard of Austroasiatic and Tibeto-Burman populations. Moreover, Haplotype R and Q are both connected to O via P, so the connection between Amerindian Q and East Asian O has an Indian analogue.

However, Reich et al. never reconstructed haplotypes for ASI, and their split of Indians into ANI and ASI is a statistical exercise rather than a compelling theory of population migration. It is telling that the Onge population, which forms a clade with ASI, but which Reich states 'has not received maternal ancestry from populations outside of the Indian subcontinent for ~48,000 years' and thus stands better for an isolated population, is rather distant from Dai.

terryt said...

"Haplotype R and Q are both connected to O via P, so the connection between Amerindian Q and East Asian O has an Indian analogue".

But by the time we get to Q and O the difference involves several stages. MNOPS split first into four: M, S, NO and P. The first two are New Guinea/Melanesia so under your scenario Papuans would be as closely related to both Indian and American populations as either of the latter are to each other.

Michael Russell said...

Michael wrote, "Any possibility of seeing if Amerindian-like admixture overlaps with possible pre-IE stuff?"
The relationship between Amerindian-like admixture and proto-IE is very interesting.
It seems there is no Amerindian contribution to the proto-IE people group(s). If there were a contribution, it would have shown up in Armenian and Arab populations. Since it doesn't, Amerindian-like admixture is ruled out. The interesting question is why there is no Amerindian-like contribution in these IE groups, Armenian and Arab.
Would you agree, Dienekes?

Michael Russell said...

I wrote my own lengthy perspective on this post because I think it is quite important. It is at http://www.michaelrussell.co/2012/11/human-evolution-european-populations-show-evidence-of-amerindian-admixture.html