May 08, 2011

On the northern/southern Caucasoid contributions to Asia

I project a great number of Siberian, Central Asian, and South Asian populations on the first two principal components created by Han, West Asians, and Northern Europeans.

PC1 captures east-west variation across Eurasia, although the Han are also related to Ancestral South Asians, a major component in the ancestry of South Asians. PC2 captures West Asian-North European variation, so it is quite useful to extract the relative northern vs. southern Caucasoid elements in the populations examined.

Here are the first two PCs with the populations used to create them. Northeastern European (N=49) includes Lithuanians, Belorussians, Russians, Poles, and various non-Balkan Slavs. Northwestern European (N=46) includes Germans, Irish, Norwegians, and various continental Germanics. West Asian (N=93) includes Armenians, Iranians, Adygei, Lezgins, and Georgians.

Population labels are always placed on population averages. Notice that the Han form a tight cluster, halfway (along PC2) between West Asians and Northeast Europeans; this is expected as they are an outgroup that has not been significantly affected by Caucasoids.

We will now project various populations onto the previous 2-D map: their horizontal position (along PC1) depends on the extent of Caucasoid admixture, while their vertical position (along PC2) depends on whether this admixture is more northern or southern Caucasoid.
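For readers who want to experiment with the idea, here is a minimal sketch of this kind of projection. It is not the pipeline used for the plots in this post: the genotype coding (0/1/2 allele counts), the randomly generated allele frequencies, the sample sizes, and the helper names `fit_reference_axes`, `project`, and `sample` are all assumptions made for illustration. The axes are fitted on the reference groups only, and a simulated admixed group is then placed on them without influencing the axes.

```python
import numpy as np

def fit_reference_axes(ref, n_components=2):
    """Fit principal axes on reference individuals only (rows = individuals, columns = SNPs)."""
    mean = ref.mean(axis=0)
    # Rows of vt are the principal axes of the centered reference matrix.
    _, _, vt = np.linalg.svd(ref - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(genotypes, mean, axes):
    """Place individuals on axes that were fitted without them."""
    return (genotypes - mean) @ axes.T

rng = np.random.default_rng(0)
n_snps, n_per_group = 500, 40
group_names = ("Han", "West_Asian", "North_European")

# Hypothetical allele frequencies; real populations are far less differentiated.
freqs = {name: rng.uniform(0.05, 0.95, n_snps) for name in group_names}

def sample(freq, n):
    """Draw n diploid genotypes (0, 1, or 2 copies of an allele) per SNP."""
    return rng.binomial(2, freq, size=(n, freq.size)).astype(float)

reference = np.vstack([sample(freqs[name], n_per_group) for name in group_names])
mean, axes = fit_reference_axes(reference)

ref_coords = project(reference, mean, axes)
centroids = {
    name: ref_coords[i * n_per_group:(i + 1) * n_per_group].mean(axis=0)
    for i, name in enumerate(group_names)
}

# A simulated test population with half Han and half West Asian allele frequencies.
mixed_freq = 0.5 * freqs["Han"] + 0.5 * freqs["West_Asian"]
mixed_coords = project(sample(mixed_freq, n_per_group), mean, axes)
mixed_centroid = mixed_coords.mean(axis=0)

print("reference centroids:", centroids)
print("projected admixed centroid:", mixed_centroid)
```

In this toy setup the projected group should land roughly midway between the Han and West Asian reference averages, which is the behaviour the 2-D map relies on.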

UPDATE (May 9):

I have also carried out supervised ADMIXTURE analysis, using the dataset of this post, adding Onge from the Indian Ocean as a fourth ancestral group together with Han, Northern Europeans, and West Asians.
The results seem consistent with the PCA projection, while the distinctiveness of the East Asian (dark blue) and Ancestral South Indian (light blue) components emerges.

30 comments:

Onur said...

Some analysis results for the South Asian populations (I don't consider Hazaras as South Asian on genetic grounds):

1- they show up on the same horizontal line as the West Asian populations on the PCA plot,

2- their biggest component is the West Asian component in the supervised ADMIXTURE analysis,

3- they lack the Northern European component in the same ADMIXTURE analysis.

Leaving aside the few West Asian individuals with clearly recent Northern European (probably mostly Russian) admixture, Northern Europeans and West Asians clearly form two separate, fairly tight clusters on the PCA. Bearing this in mind, there is no doubt that the above analysis results for the South Asian populations indicate that the Caucasoid portion of South Asian ancestry comes overwhelmingly, if not totally, from West Asia and its environs.

Ms. Jen said...

How is it that the Turks and the Mongols are so separated in Figure 2, when the Turks & Mongols are supposedly from the same people group?

By Turks in Figure 2, do they mean the folks who were in Anatolia before the Turkic invasions?

Dienekes said...

How is it that the Turks and the Mongols are so separated in Figure 2, when the Turks & Mongols are supposedly from the same people group?

Modern Turks and Mongols are only partially descended from their common Altaic ancestors. In the case of the Turks in particular, who travelled a long way west and converted many Caucasoid peoples to their language, it is the genes of these people that make them appear West Asian, although they do show a small amount of Mongoloid admixture that turns up in different analyses that are designed to capture it.

eurologist said...

Seems to be mostly latitude-driven: (very) high-latitude Asians have mostly Northern European admixture, low-latitude people have mostly West Asian admixture.

Of course, as always, one has to also look back deeper in time and acknowledge the possibility that Europe was initially settled from the North, in the first place, while the "two great lakes" and the Caucasus and Himalaya mountains may have been rather effective barriers for a very long time. Within Europe, only Greece, Italy, and the eastern Balkans are easily connected to West Asia.

Onur said...

Seems to be mostly latitude-driven: (very) high-latitude Asians have mostly Northern European admixture, low-latitude people have mostly West Asian admixture.

Eurologist, yes, that is the main pattern. So it is very normal for South Asians to show only West Asian on their Caucasoid side. In fact, I am flabbergasted at how some people could seriously imagine South Asians with more Northern European admixture than West Asian admixture in those southern latitudes.

Dieneke, could you provide the Fst distance values of the supervised ADMIXTURE poles? It would be nice if you also showed their distances on PCA/MDS plots. BTW, what percentages of the variation do PC1 and PC2 explain in the above PCA plots?

Onur said...

It seems Ms. Jen has never seen a Turk (of Turkey) and a Mongol and knows them only from low-level history books. :D

Fanty said...

"I am flabbergasted at how some people could seriously imagine South Asians with more Northern European admixture than West Asian admixture in those southern latitudes."

I guess one reason behind this idea is this:

http://bialczynski.files.wordpress.com/2011/04/5-r1a-m173.jpg

And that quite large portions of South Asians share a common male ancestor with Eastern Europeans, 15,000 or 5,000 years ago.

That, plus older aDNA experiments that actually showed northern european components in Indians seemed to fit together.

Together with 19th century racial ideas about an Indian connection of the "Nordic Race".

Balaji said...

On the PC plot, I estimated the position of ANI and ASI by extrapolating a line from Pathan/Sindhi to Mala/Madiga. For ANI, I get (.03,-.04) and for ASI (-.07,-.04). As noted by Reich et al., the Makrani and Balochi fall outside this line. ANI is closer to the West Asian cluster at (.03,-0.07) than to the North European cluster at (.03,.07). ASI is quite distinct from all the others.
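The comparison can be checked with a few lines of arithmetic. This is only a sketch using the coordinates quoted in the comment above, which were read off the plot by eye; it treats both PCs as equally scaled, which is an assumption, and the helper name `pc_distance` is made up for illustration.

```python
import math

# (PC1, PC2) positions as estimated in the comment above.
points = {
    "ANI": (0.03, -0.04),
    "ASI": (-0.07, -0.04),
    "West Asian": (0.03, -0.07),
    "North European": (0.03, 0.07),
}

def pc_distance(a, b):
    """Euclidean distance between two labelled points on the PC1/PC2 plane."""
    return math.dist(points[a], points[b])

ani_to_west_asian = pc_distance("ANI", "West Asian")
ani_to_north_european = pc_distance("ANI", "North European")

print(f"ANI to West Asian:     {ani_to_west_asian:.2f}")
print(f"ANI to North European: {ani_to_north_european:.2f}")
```

Because the three points share the same PC1 value, the whole difference lies along PC2, the axis that separates West Asians from Northern Europeans.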

Onur said...

Fanty, fortunately the days when uniparental genetics was prominent are over, and autosomal genetics is much more advanced and prominent today than it was just a decade ago. Also, the distribution of one single haplogroup doesn't tell us anything about autosomes, hence the overall genome, or about the distribution of the rest of the haplogroups.

As for the 19th century racial ideas about a "Nordic-Indian connection", I agree that they still subconsciously affect the thinking of some Nordicists and Nordicism-influenced people.

eurologist said...

And that quite large portions of South Asians share a common male ancestor with Eastern Europeans, 15,000 or 5,000 years ago.

Fanty, how do you know it is 5,000 or 15,000 ya - and not 45,000? The latter fits known migration patterns and also the R1b pattern much better. As long as DNA-based time estimates are known to often be a factor of three or more off, I view them basically as useless.

And if all three: most of South European, most of North European, and (at least some of) West Asian originated in the subcontinent anyway, only much, much later migrations leave a discernible signal.

Onur said...

BTW, it is clear from the PCA plot that some of the Siberian and Chuvash samples that are shifted away from their respective population clusters in the direction of Northern Europeans have significant recent Northern European (almost certainly Russian) admixture. The individual-based ADMIXTURE results of the same samples that I saw before confirm this contention (as an extreme case, one "Chuvash" sample that is indistinguishable from the Russian samples on this PCA plot is also indistinguishable from the Russian samples and very different from the rest of the Chuvash samples in the ADMIXTURE analyses).

Fanty said...

I was just explaining WHY there are people who have ever believed in India being connected to Northern/Eastern Europe.

I have not even claimed to believe this myself. ;-)

a) 19th century racial "science"
b) Y-DNA Science 5 years ago
c) aDNA experiments, 1 MONTH ago

So, even up to 1 month ago, even aDNA experiments claimed that there is a connection between India and northern/eastern Europe.

It's not until a few days ago that it was found that this may be caused by too many people in the northern European reference groups. And suddenly everyone jumps in and says: "How can there be ever anyone believing in the old theory?" ;-)

Davidski said...

South Asians, like Pathans, carry large North European segments, which can't date back 15,000 years or so. They also carry large West Asian segments, but that's not all that remarkable, since West Asia is close to South Asia.

Indeed, one of the most interesting things that has come from all the recent data, including high density and ancient DNA, is the strong detection of North European genetic signals deep in Asia.

Nirjhar999 said...

19th-century thinking about race is still present at academic levels, judging races by skin tone and nose; here Dienekes is doing it by component colour.
But when the DNA markers (M458, R2a Y-DNA, R mtDNA, M mtDNA) come in, all the doubts about the truth go away!

terryt said...

"the 'two great lakes' and the Caucasus and Himalaya mountains may have been rather effective barriers for a very long time".

I would think that is extremely likely.

"the distribution of one single haplogroup doesn't tell anything about autosomes, hence overall genome, and about the distribution of the rest of the haplogroups".

Quite. If a male, for example, moves into a new region and has children with a local woman his children will only have half his aDNA (and half the local mother's). If his sons then have children with the local women his original aDNA will be further diluted. In fact it's quite possible for haplogroups to move through a population without disrupting the population's aDNA too significantly.
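The dilution described above can be put in numbers. This is a minimal sketch of the expected value only: it assumes every partner in every generation is fully local, ignores the random variation that recombination adds, and uses a hypothetical helper name, `expected_autosomal_share`. The Y chromosome, by contrast, passes from father to son essentially intact.

```python
def expected_autosomal_share(generations):
    """Expected fraction of the migrant's autosomal DNA in a descendant
    after the given number of generations of mating with locals only."""
    if generations < 0:
        raise ValueError("generations must be non-negative")
    return 0.5 ** generations

for g in range(1, 6):
    # The Y-chromosome haplogroup is unchanged along the direct male line.
    print(f"generation {g}: {expected_autosomal_share(g):.4f} autosomal")
```

After five generations the expected autosomal share is about 3%, while every son in the direct male line still carries the migrant's haplogroup.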

Onur said...

And suddenly everyone jumps in and says: "How can there be ever anyone believing in the old theory?" ;-)

No, it was already quite evident from other genetic studies that South Asians are much closer genetically on their Caucasoid side to West Asians than to Northern Europeans. So I wasn't surprised by Dienekes' findings.

South Asians, like Pathans, carry large North European segments, which can't date back 15,000 years or so. They also carry large West Asian segments, but that's not all that remarkable, since West Asia is close to South Asia.

So what? West Asians too carry Northern European segments and vice versa. What is important is the overall picture.

wagg said...

@ Onur : My reply doesn't appear in the comments. Please check this out :

http://waggg.livejournal.com/388.html

Nirjhar999 said...

Yep. But again no proof of AIT or AMT

lars said...

How do we explain that North Africans, who lack R1a, are lactose tolerant?

It could be explained by regional independent developments.

lars said...

R1a (whose sister R1b, cousin R2, and second cousin Q are Asian as well) seems to be South Asian, not Northern European, so we may perhaps say that there is a rather strong detection of South Asian genetic signals deep in Europe.

Onur said...

Wagg or Waggg (whatever), if you read my above comments carefully, you'll see that when I talked about the absence of the Northern European component in South Asians I was referring to Dienekes' supervised ADMIXTURE analysis. In unsupervised ADMIXTURE analyses, some components that are modal in Northern Europeans do indeed show up in some South Asian populations (especially those from the north), however in much smaller amounts, but they also show up in not so trivial amounts in many West Asian populations in the same unsupervised ADMIXTURE analyses, so there is nothing contrary to what I wrote in my above comments.

As for your lactase persistence allele example, just a single allele (and an allele, as you say, that also exists among West Asians) doesn't say anything about overall population relationships. Besides, I wrote in my above comments that we shouldn't make too many inferences from the distribution of one single haplogroup, and now you are doing the same for a single allele. :D

Lastly, the photos you present don't say anything about South Asians in general. Anyone who has been in South Asia (including its northernmost regions, including Afghanistan) should know better. I can show you many millions of photos of swarthy South Asians (much swarthier than the swarthiest West Asians). I can also show you millions of photos of very lightly pigmented West Asians. Note that most of the photos you show are those of children, who, we know, are lighter pigmented than adults.

lars said...

The map below shows that Anatolia is less lactose intolerant than much of the Balkans, Italy, and Iberia, and less than South Asia.

Yes, Mr. Onur is right in his statement that Western Asia has much more blondism than South and Central Asia.
Please see the map below:

http://en.wikipedia.org/wiki/Talk%3APortuguese_people#Portuguese_hair_and_eye_color

Anyway, blondism amongst Caucasoids is of course rather latitude-dependent and not continent- or ethnolinguistic-dependent.

terryt said...

"anyway of course blondism amongst caucasoids is rather latitude-dependent and not-of course-continent or ethnolinguistic dependent".

To me it seems that blondism is concentrated in a stretch of northern Europe from the eastern Baltic across to the Ural Mountains. Of course it spreads more widely, but thinly elsewhere. Perhaps it originated in that region and spread from there.

wagg said...

@ Onur

"As for your lactase persistence allele example, just a single allele (and an allele, as you say, that also exists among West Asians) doesn't say anything about overall population relationships. Besides, I wrote in my above comments that we shouldn't make too many inferences from the distribution of one single haplogroup, and now you are doing the same for a single allele. :D"

For such a specific characteristic to be this widespread in these specific regions, it has to be meaningful, I think.

The South Asians share a link with European populations that they do not share with West Asians (it's absent or very rare in West Asia, and given its spread and frequency in West Asia, it's not autochthonous; it seems to have arrived from outside), while from the results shown on this page we could expect the contrary.
Looking at the map, it seems that the connection/transmission occurred via the Central Asian steppes and not West Asia either (in Central Asia the frequency is not that high, but we know there was an important East Asian genetic flow during the Iron Age, so the frequency of that allele was obviously higher in the past in this region).

These things are transmitted; they don't magically appear. Doesn't it argue for a certain quantity of R1a1a (not all, in my mind) coming from Central Asia and carrying this specific mutation in the past?

And notice that the presence of this allele in west Asia is lower than in north Cameroon....

"Lastly, the photos you present don't say anything about South Asians in general. Anyone who has been in South Asia (including its northernmost regions, including Afghanistan) should know better."

Yes I know it. I'm not this ignorant :) (it's still more frequent than we usually think, especially in Afghanistan).
Anyway, it was just to say that it would fit well with a north-east European component. I wasn't using it as a decisive proof.

"Note that most of the photos you show are those of children, who, we know, are lighter pigmented than adults."

Three of them are adults... (and there are more to be found on the web so we can't pretend it's a child thing)

wagg said...

@ Lars : "It could be explained by regional independent developments."

It's an allele specific to some populations (obviously clearly linked with Europeans). There are other specific alleles (for lactose tolerance) that are specific to other populations.

"R1a (whose sister R1b, cousin R2, and second cousin Q are Asian as well) seems to be South Asian, not Northern European"

Indeed. Originally it probably was so, but we also know that there were Europoid R1a1a at the end of the Neolithic (Kayser et al. 2009, Derenko et al. 2002, Chunxiang Li et al. 2010, etc., plus the fact that the Chalcolithic Afanasevo culture of south Siberia (about 3,500 BCE) is related to the kurgan cultures of south Russia/Ukraine of that time).

"How do we explain that North Africans, who lack R1a, are lactose tolerant?"

Obviously the presence of this specific allele in the Maghreb is linked with the presence of mtDNA haplogroup H (logically originally associated with R1b1b2 (nowadays R1b1b2 is rare in North Africa, but it's still by far the most logical explanation and it fits well with the spread and frequency of that allele)).

Onur said...

For such a specific characteristic to be this widespread in these specific regions, it has to be meaningful, I think.

The South Asians share a link with European populations that they do not share with West Asians (it's absent or very rare in West Asia, and given its spread and frequency in West Asia, it's not autochthonous; it seems to have arrived from outside), while from the results shown on this page we could expect the contrary.
Looking at the map, it seems that the connection/transmission occurred via the Central Asian steppes and not West Asia either (in Central Asia the frequency is not that high, but we know there was an important East Asian genetic flow during the Iron Age, so the frequency of that allele was obviously higher in the past in this region).

These things are transmitted; they don't magically appear. Doesn't it argue for a certain quantity of R1a1a (not all, in my mind) coming from Central Asia and carrying this specific mutation in the past?

And notice that the presence of this allele in west Asia is lower than in north Cameroon....


Wagg, I didn't deny that they were transmitted; all I have been saying is that the effect of the transmissions you mention on the overall South Asian genome is small.

BTW, I should add that we don't know from where the relevant lactase persistence allele spread. Also we don't how much of R1a1a in South Asia came from Europe (I strongly suspect that a very small percentage of it came from Europe).

Onur said...

Also we don't how much of R1a1a...

Also we don't know how much of R1a1a...