August 26, 2012

Inter-relationships of the Dodecad K12b and world9 components

Pconroy made a most excellent suggestion in the comments of a previous post, so I decided to follow up on it. His idea is to see what Dodecad components look like when they're measured in terms of other components. So, I took the K12b components and carried out the following procedure:

I used each of the 12 different components as "test data" in a supervised ADMIXTURE analysis that used the other 11 components as "reference data". This simple procedure can show what each component appears to be made of, if it is seen in the context of the remaining components. It is a good way to demonstrate relationships between them.
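The leave-one-out idea can be sketched in a few lines. This is only a rough analogue, not the ADMIXTURE run itself: it treats each component as a vector of allele frequencies and finds the non-negative weights (summing to 1) over the other components that best reproduce it in the least-squares sense. The function names and the toy frequencies are hypothetical and are not taken from K12b.

```python
def project_to_simplex(values):
    """Project a vector onto the set of non-negative weights summing to 1."""
    ordered = sorted(values, reverse=True)
    running_sum = 0.0
    theta = 0.0
    for count, value in enumerate(ordered, start=1):
        running_sum += value
        shift = (running_sum - 1.0) / count
        if value - shift > 0:
            theta = shift
    return [max(v - theta, 0.0) for v in values]


def fit_mixture(target, references, iterations=2000):
    """Least-squares mixture weights of `references` that approximate `target`."""
    k, m = len(references), len(target)
    # Step size 1/L, with L bounded above by the sum of squared reference entries.
    step = 1.0 / sum(x * x for ref in references for x in ref)
    weights = [1.0 / k] * k
    for _ in range(iterations):
        mix = [sum(weights[j] * references[j][i] for j in range(k)) for i in range(m)]
        residual = [mix[i] - target[i] for i in range(m)]
        gradient = [sum(residual[i] * references[j][i] for i in range(m)) for j in range(k)]
        # Projected gradient step: move downhill, then return to valid proportions.
        weights = project_to_simplex([weights[j] - step * gradient[j] for j in range(k)])
    return weights


def leave_one_out(components):
    """Express each component as a mixture of all the remaining components."""
    results = {}
    for name, freqs in components.items():
        others = {n: f for n, f in components.items() if n != name}
        weights = fit_mixture(freqs, list(others.values()))
        results[name] = dict(zip(others.keys(), weights))
    return results


# Toy allele frequencies (made up for illustration, NOT real K12b data).
toy = {
    "A": [0.9, 0.1, 0.8, 0.2, 0.5, 0.1],
    "B": [0.1, 0.9, 0.2, 0.7, 0.5, 0.3],
    "C": [0.5, 0.5, 0.1, 0.1, 0.9, 0.9],
}
# "D" is built as an exact 70/30 blend of "A" and "B".
toy["D"] = [0.7 * a + 0.3 * b for a, b in zip(toy["A"], toy["B"])]
results = leave_one_out(toy)
```

In this toy case, `results["D"]` should recover weights close to 0.7 for "A" and 0.3 for "B". Real components are estimated from genotype data, so a likelihood-based supervised run is not expected to behave exactly like this simplification.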

Here are the results:


Some observations:

  • Gedrosia appears to be Caucasus + a slice of Siberian
  • Both Siberian and Southeast Asian appear to be wholly East Asian
  • East Asian on the other hand, appears to be mostly Southeast Asian + minority Siberian
  • Northwest African appears to be Caucasus + a minority Sub Saharan
  • Atlantic Med appears to be Caucasus + a slice of North European
  • North European appears to be Atlantic Med + Gedrosia with a slice of Siberian
  • South Asian appears to be Caucasus + East Asian
  • East African appears to be Sub Saharan + minority Caucasus
  • Southwest Asian appears to be Caucasus
  • Sub Saharan appears to be East African
  • Caucasus appears to be Atlantic Med + Gedrosia + slices of Northwest African and Southwest Asian

The most salient point about this analysis is the central position of the Caucasus component vis-à-vis the others, consistent with my womb of nations theory. Not only do all West Eurasian components (except the North European) appear substantially "Caucasus" in this analysis, but the Caucasus component itself shows links with four others.

It could be argued that these results represent a confluence of peoples from all over West Eurasia into the highlands of West Asia, where the Caucasus component is modal. But the Caucasus region is arguably the most linguistically diverse in West Eurasia, and many of its languages do not appear to have come from elsewhere. Also, the Near East (where the Caucasus component is the most important one in most populations) is the birthplace of agriculture, which has demonstrably affected most of West Eurasia. On balance, this analysis seems consistent with population expansions out of West Asia.

The following graph summarizes the relationship between the 12 components. I used color intensity of the edges to indicate admixture intensity:
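For anyone who wants to draw a similar graph, here is a minimal sketch of one way to turn mixture proportions into Graphviz DOT text, with darker edges for larger proportions. The edge direction (donor to recipient), the gray-scale mapping, the cutoff, and the function names are my assumptions for illustration; the component names and numbers in the example are placeholders, not results.

```python
def edge_color(proportion):
    """Map a proportion in [0, 1] to a gray hex color; larger is darker."""
    clipped = max(0.0, min(1.0, proportion))
    level = round(255 * (1.0 - clipped))
    return "#{0:02x}{0:02x}{0:02x}".format(level)


def to_dot(mixtures, min_proportion=0.01):
    """Build DOT text with one donor -> recipient edge per contribution.

    `mixtures` maps each recipient component to a dict of
    {donor component: proportion}. Contributions below
    `min_proportion` are left out of the graph.
    """
    lines = ["digraph components {"]
    for recipient, donors in mixtures.items():
        for donor, proportion in donors.items():
            if proportion >= min_proportion:
                lines.append('  "{}" -> "{}" [color="{}"];'.format(
                    donor, recipient, edge_color(proportion)))
    lines.append("}")
    return "\n".join(lines)


# Placeholder input (hypothetical names and proportions).
example = {"Component X": {"Component Y": 1.0, "Component Z": 0.0}}
dot_text = to_dot(example)
```

The resulting text can be saved to a file and rendered with any tool that reads the DOT format.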




Finally, a few points to remember: 
  • The South Asian component appears like a mix of Caucasus and East Asian; the latter probably acts as a stand-in for the Ancestral South Indians of Reich et al. (2009)
  • Similarly, the Gedrosia/Siberian influences on the North European component do not necessarily mean direct influences from these two regions; an explanation for these influences may intersect with the issue of East Eurasian-like ancestry in northern Europe
  • It is the Caucasus, rather than the Southwest Asian component, that seems to donate to the Northwest African and East African ones. That seems to flout geography, but probably indicates that the Southwest Asian component, with its strong Semitic associations (see distribution in K12b spreadsheet), represents a more specialized form of the more generalized Caucasus component.
  • Some components appear to be "terminal", affected but not much affecting: Southwest Asian, Northwest African, and Southeast Asian. These tend to appear at high K in admixture analyses, and probably represent either recent mixtures (Northwest African) or specialized forms of more generalized ones (Southwest Asian of Caucasus and Southeast Asian of East Asian)
  • Finally, remember that living populations show admixture proportions of many of these components. So, for example, East African populations often have Southwest Asian admixture, even though the East African component lacks it. And, as mentioned above, this may reflect the more generalized West Asian admixture that has affected East Africa, as well as the more specific Arabian admixture associated, e.g., with the spread of Semitic languages. Please refer to the K12b spreadsheet for admixture proportions of populations for the 12 components.

I have also done the same with the world9 calculator, which includes Amerindian and Australasian components. Here is how the world9 components are seen as mixtures of the remaining ones:



And, here is the graph showing how they seem to contribute to each other.


A few observations:

  • Amerindian appears wholly Siberian
  • East Asian appears Siberian + South Asian + slice of Australasian
  • African appears South Asian. I would attribute this to Africans being related to both West and East Eurasians approximately symmetrically, so in this type of experiment, South Asian (which is an ANI/ASI mix) appears like the best match
  • Atlantic_Baltic appears Caucasus_Gedrosia + Southern + slice of Amerindian
  • Australasian appears South Asian. I would guess that ASI and Australo-Melanesians share deep common ancestry from the earliest settlement of southern parts of Asia.
  • Siberian appears East Asian + slice of Amerindian
  • Caucasus_Gedrosia and Southern appear Atlantic_Baltic
  • South Asian appears to be a roughly equal mix of East Asian and Caucasus_Gedrosia + slice of Australasian

Raw data for these experiments can be found here.

16 comments:

pconroy said...

Excellent!

So it would seem that the K12b results support 2 recent papers.

1. The Bantu expansion started in East Africa

2. The Caucasus component, which is the basis of most of the others - with the exception of the East Asian - could only represent a farming population, and maybe the source of Indo-European languages.

3. The minor East Asian seen in NW and N Europe is Siberian or North Asian in actuality

hamarfox said...

On the World9 portion, if Amerindian is removed as an option for the atlantic_baltic component, is the same percentage then filled in by the Siberian component?

Logic would suggest so, but there does seem to be a special relationship between Europeans and Amerindians that cuts out the middleman, if you will.

Dienekes said...

1. The Bantu expansion started in East Africa

I wouldn't go that far. East African is the only population that even remotely resembles Sub-Saharan Africans. Any Sub-Saharan population (farmer or not) would map onto East African using this procedure.

karl00 said...

How about an inter-relationship analysis for the euro7 calculator or an updated version of that?

pconroy said...

It's interesting that South Asian decomposes to Caucasus_Gedrosia and East Asian mostly, plus a little Australasian.

So it would seem that the Ancestral South Asian component is best represented by East Asian plus a little Australasian. Australasian itself is most like South Asian.

What this means is that there may be a much higher East Asian or South Asian admixture in South Asians than usually supposed - whether that is ancient or more recent is the question?!

princenuadha said...

Wow, South Asian is like the womb of Africa, Australia, East Asia, and Gedrosia components!

Sarcasm. Actually it just looks like South Asia sits near the geographic middle.

Though, it still seems cool to look at. It's also surprising the way "North European" turned out to be. Interestingly, Northwest European has more "Gedrosia" than Northeast Europe, but Northeast Europe has more "North European".

Eduardo Pinto said...

Yes, for Euro7 and also for K10a.

eurologist said...

Even with agriculture and Bronze Age flows, it is still quite surprising to see that many groups fall back to Caucasus, but not much to Gedrosia. You would think that fairly recent admixture would be relatively cleanly separated out, leaving more ancient connections lurking beneath.

North European is the only exception - and this may indeed indicate a connection from paleolithic settlement, when both Caucasian and North European may have originated from the-then Gedrosia but would have diverged from then on.

Also surprising that the non-Asian component in South Asia is affiliated with Caucasus, and not with Gedrosia. Basically, if you take out all Gedrosia, Caucasus and Asian from Indians, what you are left with is closer to Caucasian than Gedrosia. You would not expect that from recent admixture, which should cleanly separate the Caucasian component out and leave more ancient (e.g., Gedrosia) affinity. Perhaps Gedrosia has simply drifted too far away in itself since paleolithic times. It is also possible that the Gedrosia admixture into S/SE Indians does not predate the neolithic one by all that much: as I have argued before, for much of the pre-neolithic past, India was divided by large deserts due to drought.

Dienekes said...

@eurologist,

At K12b South Asian populations show up mainly as Gedrosia+South Asian.

At K7b they show up as West Asian+South Asian.

What happens is that their West Asian gets split into Caucasus and Gedrosia. The Gedrosia part shows up as admixture, and the Caucasus part gets absorbed in the South Asian.

You can think of it as mixing green with red. If you take out the yellow as admixture, it will look like yellow+(blue+red).

aramt said...

It is interesting that Atlantic-Baltic has some Amerindian, but no Siberian component. Could this be related to those mysterious haplogroups like mtDNA X hg in the western end of Europe and Y-DNA Q hg that turns up in northern Germany and western Scandinavia? So, these and other genetic signals would have been better preserved at the extremes, in Europe and America, but replaced in Siberia?

(OTOH, Iran, Pakistan and the Gulf have a lot more X and Q directly related to the Amerindian lineages and yet the signal isn't visible for that area at all, so I'm not sure what to make of the evidence.)

princenuadha said...

"What happens is that their West Asian gets split into Caucasus and Gedrosia. The Gedrosia part shows up as admixture, and the Caucasus part gets absorbed in the South Asian."

Which one are you going with?

Jim said...

"1. The Bantu expansion started in East Africa"

"I wouldn't go that far. East African is the only population that even remotely resembles Sub-Saharan Africans"

Yep. All the linguistic evidence points to southeast Nigeria and western Cameroon. It also makes the direction of the migrations pretty obviously out of the region south and eastward - the patterns in the cattle vocabulary in various branches show replacement from Nilotic languages among the languages that passed through forests and gave up cattle for a time.

pconroy said...

@aramt,

It's also just possible that Amerindians crossed the North Atlantic Ice Sheet during the Last Glacial Maximum (LGM), hunting seals or other sea mammals.

I'll dub this the "Reverse Solutrean Hypothesis". Among my 23andMe relatives are 17 mtDNA X matches. Of these there is:

X - 1
X2 - 2
X2b - 12
X2b4 - 1
X2c1 - 1

That's out of 1015 matches total.

Andrew Oh-Willeke said...

"The Caucasus component, which is the basis of most of the others - with the exception of the East Asian - could only represent a farming population, and maybe the source of Indo-European languages."

The web of connections would be suggestive of Caucasus being old Neolithic, and other components being more recent Neolithic. In that scenario, Indo-European languages seem like a poor fit to the Caucasus component.

Alternately, the Caucasus could be a "spurting fountain" sending multiple successive waves out into the world from a relatively static source population. For example, perhaps it was the Out of Arabian refugium for West Eurasians, and an LGM refugium, and an important source for the early Neolithic, and an important source for metal age migrations, in four distinct waves that are hard to distinguish from each other.

The first place I'd want to look to distinguish the possibilities would be highland West Asian paleoclimate. I'm increasingly comfortable with what is available in terms of paleoclimate for lots of places (the Fertile Crescent, interior Arabia, South Asia, most of Europe, Northern Africa), but highlands are notoriously prone to microclimates.

I'd also make the distinction between a refugium and a relict population and wonder what combination of the two we have in the Caucasus. Dienekes offers a soft lob in favor of relict over refugium, and to put a finer point on it, the phonetic and linguistic structures you see in all of the language families of the Caucasus show an utter lack of the features you associate with mass language learning.

In a relict scenario, the Caucasus might be typical of an early Neolithic population that made up "Old Europe" and flourished far and wide, primarily sourced in the Fertile Crescent perhaps, that was then mowed off the map by later waves of migration that embellished the lowland populations while leaving the Caucasus basically unchanged. From the perspective of narratives that make sense and can explain the modern palimpsest, this kind of scenario seems easier to fit to the historical facts than some sort of repeated mass demographic movement directly out of the mountains.

mikej2 said...

This analysis can still miss something. For example, the North European component may in reality include ancestry proportions specific to Northern Europeans that the original K12b and world9 analyses do not reveal. Once the North European component is removed, this hidden part of the original analysis is lost. Running the original analysis at a higher K value could have isolated it, and it could then have been analysed as in this follow-up.

eurologist said...

"At K12b South Asian populations show up mainly as Gedrosia+South Asian.

At K7b they show up as West Asian+South Asian.

What happens is that their West Asian gets split into Caucasus and Gedrosia. The Gedrosia part shows up as admixture, and the Caucasus part gets absorbed in the South Asian.

You can think of it as mixing green with red. If you take out the yellow as admixture, it will look like yellow+(blue+red)."


Dienekes, I still don't understand. At K=12 you have a South Asian, Gedrosia, and a Caucasus component. Now, if you look at the South Asia component (i.e., anything resembling the other two already taken out) through the glasses of the other 11 components, you find the South Asian component to be made up of about equal parts East Asian and Caucasian. No trace of Gedrosia.

So, the part of Indian that is not Asian and does not neatly fall into Caucasian or Gedrosia is: Caucasian. I did not expect that, since more recent admixture should be more easily separated out by the preceding steps.

Perhaps I am missing something, but to me that indicates that Gedrosia was highly isolated over much of its history, while Caucasian and South Asian had more contacts in the deep past.