I have used my K12b dataset to isolate a set of 537 individuals who had less than 10% membership in the South_Asian, Northwest_African, Southeast_Asian, East_African, Gedrosia, Southwest_Asian, and Sub_Saharan components. Hence, these 537 individuals had 90%+ membership in the remaining Atlantic_Med, North_European, Caucasus, Siberian, and East_Asian components.
- The Atlantic_Med component is frequent in northwestern Europe
- The North_European component is dominant in northeastern Europe and forays into Siberia
- The Caucasus component is dominant in the Caucasus and forays into Central Asia
- The Siberian component is dominant in North Asia and forays into Europe
- The East_Asian component is frequent in East Asia and forays into North Asia
This pruning procedure may not be perfect, but it helps isolate a dataset consisting (mostly) of North Eurasian individuals. Furthermore, I removed all populations that had fewer than 5 individuals left after the first pruning step. Hence, in the end, I had a dataset of 38 populations/452 individuals. The remaining populations were:
Russian_D, Polish_D, German_D, Finnish_D, Swedish_D, Mixed_Slav_D, Norwegian_D, Lithuanian_D, Japanese_D, Daur, French, French_Basque, Hezhen, Japanese, Oroqen, Russian, Sardinian, Yakut, CEU30, JPT30, Belorussian, Chuvashs, Hungarians, Lithuanians, Romanians, Selkup, Evenk, Tuva, Yukagir, Nganassan, Dolgan, Buryat, Mongol, FIN30, Kent_1KG, Bulgarians_Y, Ukranians_Y, Mordovians_Y
Additionally, a sample of 30 Yoruba from the HapMap-3 was used as an outgroup.
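The two pruning steps described above can be sketched roughly as follows. This is only an illustration under assumptions: the record layout (sample id, population label, dictionary of component fractions) and the function name `prune` are hypothetical, and the thresholds are the ones quoted in the text.

```python
from collections import Counter

# Components treated as "North Eurasian" in the text above.
KEPT = ("Atlantic_Med", "North_European", "Caucasus", "Siberian", "East_Asian")


def prune(individuals, min_kept=0.90, min_pop_size=5):
    """Apply both pruning steps to a list of records.

    individuals: list of (sample_id, population, {component: fraction}).
    This record layout is hypothetical, chosen only for the sketch.
    """
    # Step 1: keep individuals with at least 90% combined membership
    # in the five retained components.
    step1 = [
        (sid, pop, props)
        for sid, pop, props in individuals
        if sum(props.get(c, 0.0) for c in KEPT) >= min_kept
    ]
    # Step 2: drop populations left with fewer than 5 individuals.
    sizes = Counter(pop for _, pop, _ in step1)
    return [rec for rec in step1 if sizes[rec[1]] >= min_pop_size]
```

Counting population sizes only after the first step matters: a population can start with plenty of samples and still fall below the cutoff once admixed individuals are removed.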
TreeMix analysis
The TreeMix analysis was performed with default parameters, allowing for a varying number of migration edges (0 through 10).
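A series of runs like this could be set up along the following lines. This is a sketch, not a record of the actual commands: the input and output file names are hypothetical, and passing the outgroup with `-root` is an assumption, since the text only says that Yoruba served as the outgroup. The flags `-i`, `-o`, `-m`, and `-root` are standard TreeMix options.

```python
def treemix_commands(infile="north_eurasia.frq.gz", root="Yoruba", max_edges=10):
    """Build one TreeMix command line per number of migration edges.

    infile and the out_m<N> output stems are hypothetical names.
    """
    cmds = []
    for m in range(max_edges + 1):
        cmd = ["treemix", "-i", infile, "-root", root, "-o", f"out_m{m}"]
        if m > 0:
            # -m sets the number of migration edges; omitted for the plain tree.
            cmd += ["-m", str(m)]
        cmds.append(cmd)
    return cmds
```

Each element is an argument list that could be handed to `subprocess.run`, or joined with spaces and pasted into a shell.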
Nomenclature: The direction of gene flow is best seen in the figure and/or associated treeout files.
In the text, I write the common ancestor of two populations in parentheses, e.g., (French_Basque,Sardinian), and I write (X, *) for the tree rooted at a particular node X, e.g., (Buryat, *).
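For anyone who prefers to read the edges out of the treeout files directly, here is a small parsing sketch. It assumes the layout described in the TreeMix manual: after the first line (the tree itself), each migration line starts with the edge weight and ends with two Newick subtrees, source first and destination second. The function names are my own for this illustration.

```python
import re


def leaves(newick):
    """Return the population names found in a Newick (sub)tree string."""
    # A leaf label starts the string or follows '(' or ',';
    # branch lengths follow ':' and are therefore skipped.
    return re.findall(r"(?:^|[(,])\s*([^(),:;\s]+)", newick)


def parse_edge(line):
    """Parse one migration line into (weight, source_leaves, dest_leaves).

    Assumes whitespace-separated fields with the weight first and the
    source and destination subtrees as the last two fields.
    """
    fields = line.split()
    return float(fields[0]), leaves(fields[-2]), leaves(fields[-1])
```

A source with two leaves, such as French_Basque and Sardinian, corresponds to what I write as (French_Basque,Sardinian) in the text.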
0 migration edges:
The West and East Eurasian clusters are identified, with some populations with likely admixture being placed closer to the Eurasian root.
1 migration edge:
64% from (Sardinians/Basques) to Yoruba; this is difficult to interpret, but there has been evidence in the past that Africans and West Eurasians share more ancestry than Africans and East Asians do. In the linked post, I proposed a major episode of back-migration into Africa, and it is perhaps this that is being captured by this migration edge: Sardinians/Basques are the only two South-West Eurasian populations included, and any back-migration into Africa must have originated in the southern parts of West Eurasia.
Such a high level of back-migration may in fact be plausible, since Yoruba are a predominantly Y-haplogroup E bearing population, and the origin of the DE clade of the human Y-chromosome phylogeny remains unresolved, with both an African and a Eurasian case having been advanced. Personally, I favor the Eurasian case, since within the CT clade, we have two subclades: CF (Eurasian) and DE (Eurasian/African).
Interestingly, John Hawks has recently discovered an unanticipated excess of "Neandertal ancestry" in Yoruba. This may also point to a back-migration into Africa and/or admixture of a group of Africans related to Eurasians (whom I've called Afrasians), with groups of Africans (Palaeoafricans) that split before the H. sapiens/H. neanderthalensis common ancestor.
There is, however, another detail in the figure that may have escaped your notice: there is now about 0.5 worth of drift in the figure (left-to-right) as opposed to only 0.12 in the tree without migration edges. So, perhaps what we are seeing is indeed the first sign of admixture between modern and archaic humans in Africa, which has been made more likely by recent anthropological discoveries.
It's not clear to me whether TreeMix has stumbled onto something important or not, but it is certainly worth keeping in mind that the above model fits the data better than the simple tree model. Moreover, TreeMix attempts to reverse the polarity of migration edges, and -apparently- the (Sardinian, French_Basque)-to-Yoruba edge is preferable to the reverse.
So, we should keep our minds open to the possibility that the greater similarity of West Eurasians to Africans is not the result of multiple Out-of-Africa waves, one of which affected only West Eurasians, but of an Into-Africa back-migration from West Eurasia.
So far, tree-based models have focused on how diverse African groups are, and hence, the reduced diversity of Eurasians has been interpreted as an Out-of-Africa bottleneck that carried a subset of African variation into Eurasia.
But there is an alternative interpretation of the evidence, namely that African groups are diverse because they carry a superset of ancient Into-Africa variation, with the African-specific part of their variation being the result of admixture with pre-existing African hominins. Such a scenario cannot be captured by tree models, but is apparently considered and not rejected by TreeMix, which allows for lateral gene flow. Let's wait and see what new things come from full genome sequencing.
2 migration edges:
The (French_Basque/Sardinian)-to-Yoruba edge persists (64%) and a new edge was added from (Buryat, *)-to-Mongol (85%). The "Mongol" sample consists of Siberian Mongols described by Rasmussen et al. (2010). An inspection of their K12b population portrait indicates that they do, in fact, have West Eurasian admixture, which according to the K12b spreadsheet amounts to about 18% in total.
3 migration edges:
The aforementioned (French_Basque/Sardinian)-to-Yoruba (64%) and (Buryat,*)-to-Mongol (85%) edges persist, and now we have a 68% Nganasan-to-Selkup edge.
These are the two Siberian Uralic populations in the dataset. This seems to parallel the K12b results, as Selkups have a North_European element which the Nganasans (Uralic speakers from the Arctic coast of Central Siberia) lack, so we are seeing the hybridity of the Selkups here, who, like the Mongol sample, are partly of West Eurasian ancestry.
4 migration edges:
The aforementioned (French_Basque,Sardinian)-to-Yoruba (64%), (Buryat,*)-to-Mongol (84%), and Nganasan-to-Selkup (68%) edges persist, and now we have an 89% (Buryat, *)-to-Tuva edge. According to K12b, the Tuva have 13.3% West Eurasian admixture, so again we have reasonably good agreement between TreeMix and ADMIXTURE.
Interestingly, the non-"eastern" component of Selkups and Tuvans now forms a clade. It seems that a Nganasan-like and a (Buryat, *)-like population converged on southern Siberia, absorbed a common local element, and became the Selkup and Tuva, respectively.
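The TreeMix/ADMIXTURE comparison made for the Mongol and Tuva samples boils down to simple arithmetic, sketched here. The reading is an assumption on my part: if a fraction w of a population's ancestry arrives through the East Eurasian migration edge, the remaining 1 - w is taken to come from its position in the West Eurasian part of the tree. The figures are the ones quoted in the text (85% with about 18% for Mongol, 89% with 13.3% for Tuva); the function name is hypothetical.

```python
# (migration edge weight, West Eurasian share in K12b), as quoted in the text.
QUOTED = {"Mongol": (0.85, 0.18), "Tuva": (0.89, 0.133)}


def residual_vs_admixture(edge_weight, k12b_west):
    """Return (ancestry not carried by the edge, K12b share minus that residual)."""
    residual = 1.0 - edge_weight
    return residual, k12b_west - residual


for pop, (w, west) in QUOTED.items():
    residual, diff = residual_vs_admixture(w, west)
    print(f"{pop}: TreeMix residual {residual:.1%}, K12b {west:.1%}, gap {diff:+.1%}")
```

On these numbers the two methods differ by only a few percentage points for both populations.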
5 migration edges:
The aforementioned (French_Basque,Sardinian)-to-Yoruba (64%), (Buryat,*)-to-Mongol (85%), Nganasan-to-Selkup (68%), and (Buryat, *)-to-Tuva (90%) edges persist, and now we have a new 18% Oroqen-to-(Yakut, Evenk) edge. The Oroqen and the Evenk are Tungusic speakers, whereas the Yakut are a Turkic people from northeastern Siberia, having migrated there from the vicinity of Lake Baikal during the last millennium.
6 migration edges:
The aforementioned (French_Basque,Sardinian)-to-Yoruba (64%), (Buryat,*)-to-Mongol (85%), Nganasan-to-Selkup (68%), (Buryat, *)-to-Tuva (90%), and Oroqen-to-(Yakut, Evenk) (18%) edges persist, and a new 16% Nganasan-to-Oroqen edge appears. Interestingly, this has allowed the Oroqen and Hezhen to now form their own clade, which makes sense as these are both Tungusic speakers from northeastern China. The other Tungusic population, the Evenk, groups with the Turkic Yakut: what they have in common is that both have origins close to Lake Baikal in Siberia.
7 migration edges:
The aforementioned (French_Basque,Sardinian)-to-Yoruba (64%), (Buryat,*)-to-Mongol (85%), Nganasan-to-Selkup (68%), (Buryat, *)-to-Tuva (90%), Oroqen-to-(Yakut, Evenk) (18%), and Nganasan-to-Oroqen (16%) edges persist, and there is a new 81% Evenk-to-Yukagir edge. The remainder of the Yukagirs' ancestry is derived from the West Eurasian tree. The Yukagir language is rather mysterious, with some links to Uralic having been postulated. Here it pays off to look at the population portraits, since it is apparent that, unlike the Selkup, their West Eurasian ancestry is limited to a few individuals.
It is fairly interesting that Russian anthropologists placed the Yukagirs in the Baikal group of the Central Asian race, the same as the Evenks, who are their biggest donors. So, Yakuts, Evenks, and Yukagirs all seem to share the same Baikal-type of origin.
8 migration edges:
There is now a 64% Sardinian-to-Yoruba edge, a 16% Oroqen-to-Yukagir edge, a 20% (Buryat, *)-to-(Yakut, Evenk) edge, a 24% Nganasan-to-Chuvash edge, a 29% Oroqen-to-(Yakut,Evenk) edge, an 88% (Buryat, *)-to-Tuva edge, a 62% Nganasan-to-Selkup edge, and an 85% (Buryat, *)-to-Mongol edge.
The tree has been substantially reorganized, with two main Siberian groups identified: an eastern group (Hezhen, Daur, Oroqen, Buryat), and a central group (Yukagir, Dolgan, Nganasan, Yakut, Evenk, Selkup). The Chuvash, predominantly Europeoid Turkic speakers from Russia, show evidence of gene flow from the central group as well, whereas the Selkup, Uralic speakers from Siberia who belong to the central group, show evidence of gene flow from Europe.
9 migration edges:
64% (French_Basque,Sardinian)-to-Yoruba, 85% (Buryat, *)-to-Mongol, 68% Nganasan-to-Selkup, 92% (Buryat,*)-to-Tuva, 14% Oroqen-to-(Yakut,Evenk), 14% Nganasan-to-Oroqen, 82% Yakut-to-Yukagir, 90% Evenk-to-Dolgan, 13% Hezhen-to-(Nganasan, *).
10 migration edges:
64% (French_Basque, Sardinian)-to-Yoruba, 85% (Nganasan, *)-to-Mongol, 68% Nganasan-to-Selkup, 92% (Nganasan,*)-to-Tuva, 15% Oroqen-to-(Yakut,Evenk), 15% Nganasan-to-Oroqen, 82% Yakut-to-Yukagir, 90% Evenk-to-Dolgan, 43% Hezhen-to-Buryat, 14% Sardinian-to-Bulgarian.
I will stop at this point. I may add more migration edges later to this post, but I'm tired of typing this stuff.
You can download all the plots and *.treeout files here.
UPDATE (March 20): I have repeated the experiment with HGDP San, rather than Yoruba, as the outgroup:
There is now a 63% migration edge from (Basque, Sardinian) to San.