October 30, 2013

Visualizing Y-haplogroup distributions in west Eurasia

From the paper:
The database contains distributions representing 90 populations (N = 16,751 males) by the frequencies of the published and unpublished Y-chromosome Hgs. These Hgs were combined into 18 different Hgs (C, E, ABDF*, G, H, I1, I2, J1, J2, K*, L, N, O, Q, R1a, R1b, R2, T), so that published sources could be used for comparisons. 
As shown in Fig. 1, Middle Eastern (Class 7) and Central European Classes (Class 8) form one non-separable cluster in the central part of the figure. All of the others of the 10 classes can be identified in different well separable areas around this central region. The Central Asian (Class 4) and Northwest Caucasian (Class 9) Classes are in neighbouring areas in the upper and upper-left parts, while the Arab-Dagestanian Class (Class 1) occupies the opposite, lower-left part of the map. The North-Central and Western European (Class 3 + Class 6) as well as the Atlantic (Class 10) Classes form a common branch in the lower-left part of the figure. The opposite, upper-right branch contains the East Baltic (Class 5) and North Eurasian (Class 2) Classes.
Forensic Science International: Genetics Supplement Series Available online 26 October 2013

Classification of the Y-haplogroup distributions of Western Eurasian populations using a self-learning algorithm

H. Pamjav et al.

The understanding of historical relationship between populations is a core aspect of human population history studies. We have compared the frequency of 18 different Y-SNP haplogroups in 90 Western Eurasian populations. Classification of haplogroup distribution vectors using a new self-learning classification algorithm so called “self-organizing cloud (SOC)” proved to be an effective tool to identify population groups, which share common paternal genetic features. By means of the algorithm, we have determined 10 different classes of populations based on the similarity of haplogroup composition. The analysis showed that paternal genetic markers tend to reflect geographical proximity of populations better than linguistic relationship, although certain Y-SNP haplogroups have relatively good correlation with specific language families. These observations are based on the comparative analysis of the Hg distributions of contemporary populations may reflect demographic history of them in the past.
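As a rough illustration of what a "haplogroup distribution vector" looks like in practice, here is a minimal Python sketch. It is not the paper's SOC algorithm, whose details the short paper does not give. It only shows how fine-grained haplogroup counts could be collapsed into the 18 combined categories listed above and turned into frequency vectors that can be compared with a plain Euclidean distance. The `COLLAPSE` mapping, the helper names, and all counts are made up for illustration and are not data from the study.

```python
from collections import Counter
import math

# The 18 combined haplogroup categories listed in the paper excerpt.
CATEGORIES = ["C", "E", "ABDF*", "G", "H", "I1", "I2", "J1", "J2",
              "K*", "L", "N", "O", "Q", "R1a", "R1b", "R2", "T"]

# Hypothetical mapping from finer subclade labels to combined categories.
COLLAPSE = {"N1b": "N", "N1c": "N", "R1a1a": "R1a", "R1b1b2": "R1b"}


def frequency_vector(counts):
    """Collapse fine-grained counts and return frequencies in CATEGORIES order."""
    combined = Counter()
    for label, n in counts.items():
        combined[COLLAPSE.get(label, label)] += n
    unknown = set(combined) - set(CATEGORIES)
    if unknown:
        raise ValueError(f"labels with no combined category: {sorted(unknown)}")
    total = sum(combined.values())
    return [combined[c] / total for c in CATEGORIES]


def euclidean(u, v):
    """Plain Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Made-up sample counts for three hypothetical populations.
pop_a = {"N1c": 60, "I1": 25, "R1a1a": 10, "R1b1b2": 5}
pop_b = {"N1b": 20, "N1c": 40, "I1": 25, "R1a1a": 10, "R1b1b2": 5}
pop_c = {"R1b1b2": 70, "I1": 20, "R1a1a": 10}

d_ab = euclidean(frequency_vector(pop_a), frequency_vector(pop_b))
d_ac = euclidean(frequency_vector(pop_a), frequency_vector(pop_c))
print(d_ab, d_ac)
```

In this toy setup `pop_a` and `pop_b` differ only in their N subclades, so once the labels are collapsed their vectors coincide. Any classification built on such combined vectors cannot distinguish populations that differ only below the level of the 18 categories.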



SB said...

Very interesting! But I guess some of these are more or less obvious. The Pashtuns are connected by Indians, Selkup and Kazakh? I guess this makes sense based on their Y diversity for sure.
Brahmins connected with other Indians and Tajik? I wish there had been more detail about the algorithm in this paper (or maybe I cannot see it all? There are only 3 pages!)

mooreisbetter said...

What do the traditional letter Haplogroups mean anymore? NOTHING.

Just where some scientist at the dawn of this field decided to arbitrarily cut off a SNP to distinguish it from other places.

TMRCA means much more, and is more valuable geographically too.

Mauri said...

How is it possible that Finns are close to Komis, Maris and Latvians, taking into account that the yDNA differs by around 30% in total and is shared with Scandinavians? Scandinavians are on the opposite edge. Sounds weird.

Mauri said...

Looking at the statistics, the Finnish yDNA is almost 40% the same as that of Scandinavians.

LivoniaG said...

What do the lines between the locations mean? It seems to confuse the visualization.

POL and CZE are surprisingly distant but there is a long single gray line between them. IRE seems to have quite a few of these lines running to it, other outliers don't.

Also what is NGD, SRD, BLG?

I'd love to see this set against a language phylogenic SOC mapped against this one. But modern borders and modern DNA won't carry us back far enough in time.

AWood said...

I wouldn't call this central Europe. It's more like east Europe and the Balkans clusters with the northern Mid East. From a YDNA perspective there is an ancestral cluster of R1b1b2 which is the ancestor to the clear west/south/central European cluster which is 3, 6, and 10. It looks like the author wanted to spin something which is not there.

Krefter said...

Y DNA is not everything; it is just a direct paternal line. R1 would have been non-existent in almost all of Europe 6,000 years ago. But autosomal DNA of European hunter-gatherers and farmers from before Y DNA R1 spread in Europe shows that modern Europeans' ancestry was already there. Y DNA N1c spread to northeast Europe maybe 6,000-10,000 ybp; today it is dominant in most Uralic speakers, but it migrated originally from eastern Asia. That does not mean Finnish and other northeast Europeans are closely related to heavily N people in Asia or in brother clade O. It doesn't represent that much of their total ancestry.

It is important to figure out when Y DNA haplogroups spread, because there are a lot that are dominant in certain areas but spread very recently. For example, R1b S21 in England and the lowlands of Scotland is about 20-30%, but it came in the Middle Ages with the Anglo-Saxons. I see that so many people are trying to ignore the obvious connection between Y DNA and the spread of Indo-European languages. It is not a surprise that areas of the former Corded Ware culture are dominated by Y DNA R1a1a1b1 Z283 and Indo-Iranians are dominated by brother clade R1a1a1b2 Z93. Both the Corded Ware culture (which brought the ancestor language of Balto-Slavic) and the cultures that spread Indo-Iranian languages descended from the Yamna culture, which has been seen as early Indo-European since the 1950s. In Y DNA from the Corded Ware culture there are already two R1a1's, and there are 16 out of 17 R1a1's from Bronze and Iron Age Indo-Iranians in Asia and eastern Europe. Anyone who thinks R1a is native to India is nuts. Almost all Indian R1a is under R1a1a1b2 Z93. And DNA from the Tarim mummies and the Andronovo culture, which are from before Indo-Iranian languages spread to India, shows they had Y DNA R1a1 and typical European mtDNA haplogroups and were light skinned, with mainly light hair, in percentages only found in Europe. People based their theory that R1a originated in India on modern diversity. But now that the phylogenetic tree of R1a has been figured out, the highest diversity is actually in Ukraine and Russia, the exact areas where the ancient Dnieper-Donets and Yamna cultures existed, who are suspected to be early Indo-Europeans.

Anyone who thinks R1b is native to western Europe is also nuts. R1b is not just western European. Almost all western European R1b is under R1b1a2a1a L11, which is estimated to be 5,000-6,000 years old. There is R1b1a1 M73 in Asia and Russia, and R1b1a2a L23 and R1b1a2a2 Z2103 in the Near East and Russia. R1b at some point came to Europe through the Middle East; R1b1a2a L23 in southeast Europe may be connected with R1b1a2a1a L11 in west Europe. R1b1a2a1a L11's subclades R1b P312 and R1b U106 probably spread in west Europe mainly in the last 4,000 years, and they spread extremely quickly. They definitely can be connected with the spread of Germanic and Italo-Celtic languages.

Even though the vast majority of Y DNA in Europe spread during the Neolithic or after, so all in the last 9,000 years, autosomal DNA shows that Europeans' ancestry is a mixture of pre-Neolithic and Neolithic people, or mostly pre-Neolithic, and the main ancestors of, for example, the Irish 10,000 ybp would not have had R1b. They would probably have had some type of hg I, F, or C.

Truth Prevail said...


"Another non-local component in the Caucasus was Arabian. This was largest in Transcaucasus
and Anatolian populations such as Armenians (41.6%) and Turkey (30.4%), possibly expressing ongoing
links between the Fertile Crescent and highland West Asia."


Truth Prevail said...

Answer to an inquiry via email
In terms of the clade which links the tribal arabs and Daghestanians it should be within the J1 haplogroup
Answer to an inquiry via email
Unfortunately, we lacked detailed data to distinguish between J1 subgroups, but it is still valid that Dagestanians and (tribal) Arabs are closer to each other than to other populations (on the male line).

Visualizing Y-haplogroup distributions in west Eurasia
Tibor Feher

Rob said...

Further to what Mooreisbetter said, I don't see what exercises like this aim to achieve. Firstly, their results are hardly surprising or unheard of. Secondly, they lump all subhaplogroups together into one group, disregarding the fact that, e.g., different subclades of R1b have had very different histories for some thousands of years. Thirdly, global population structure and relatedness is best done with autosomal DNA, which gives the true, non-skewed picture.

Onur Dincer said...

"Another non-local component in the Caucasus was Arabian. This was largest in Transcaucasus
and Anatolian populations such as Armenians (41.6%) and Turkey (30.4%), possibly expressing ongoing
links between the Fertile Crescent and highland West Asia."


TP, that analysis was done excluding most of the major components of Anatolia and the Caucasus. So they used the other components as proxies for most of the major components of Anatolia and the Caucasus. That is why that analysis has such weird-looking results, especially for the populations of Anatolia, the Caucasus and environs. When all of the major components of Anatolia and the Caucasus are included, Anatolian/Caucasian populations such as Turks and Armenians never get such high "Arabian" component results.

See for instance the most recent SNP analysis of DNA Tribes (in which all components are included):


There you will see that populations such as Turks and Armenians have in reality much lower levels of the "Arabian" component.

Va_Highlander said...


"And DNA from tarim mummies and Andronovo culture which are from before Indo Iranian languages spread to India. Show they had Y DNA R1a1 and typical European mtDNA..."

The oldest remains at Xiaohe had some typically European mtDNA. The majority were haplogroup C -- fourteen out of twenty samples -- with a likely origin in South Siberia. They predate the Andronovo horizon by some centuries.

Tunguska said...

"Unknown" noted the surprising distance between POL and CZE. Moreover, POL is just over the line from ABH and OSE. Could this lend credence to legends about a Sarmatian connection to modern Poles?

Truth Prevail said...

Thank you very much Onur

In general, the non-local components expressed here will depend on which local regions are excluded. However, Caucasus Mountains populations are genetically part of a continuum that includes other Middle Eastern populations. To provide a different perspective on genetic relationships in this part of the world, a graphical analysis using MDS is included in this Digest article: http://dnatribes.com/dnatribes-digest-2013-04-02.pdf

And, per DNAtribes and based on available data Caucasus Mountains populations are part of a group of related Middle Eastern regions that also includes Mesopotamian, Arabian Peninsula, and North African populations. Within this group, Caucasus Mountains populations are most closely related to Mesopotamian populations on the basis of autosomal STR and SNP data.

However, this autosomal relationship does not necessarily exclude the possibility of other more specific links (such as Y-DNA links) with the Arabian Peninsula if indicated by other lineage studies (not included in DNA Tribes' autosomal analysis).

For Armenian populations, results do not exclude the possibility of some additional contacts with the Levant or Arabian Peninsula on the basis of autosomal relationships in this region (possibly related to the archaeological gap that precedes the emergence of the Urartian state; see The Peoples of the Hills by C. Burney and D. M. Lang, p. 127.). More information is included in these Digest articles: http://dnatribes.com/dnatribes-digest-2011-12-01.pdf and http://dnatribes.com/dnatribes-digest-2013-08-01.pdf

Mauri said...

He has obviously grouped together N1b and N1c. On a timeline, that means around 15,000 years according to the dating estimates. We can compare this to the bifurcation line between Finnish- and Germanic-speaking I1 men, which is around 2,500 years. My observation is that the conclusion he made is based purely on geography. Languages can change, and migrations of ancient people are hard to find. There has of course been some, even remarkable, geographic continuity, but the rest of the statistics is only goodwill in the readers' eyes.

eurologist said...

I agree with others that, as is, this is a pretty useless study that should never have made its way into a peer-reviewed journal.

A slightly revised work would have been much more useful: take a number of interesting points in time at which new subgroups formed, and then use those (this information is now starting to become available from full Y-DNA analysis). For example, it might turn out (fully made-up examples) that I1a2 and G2 and P are on the same time slice, or J2a1b and G2a1a and R1b1a1b2, etc.

Anonymous said...

I am from Turkey; we are not Arab! I have seen many Anatolian farmers who have bright-colored eyes!

We also found an 8,500-year-old Neolithic Anatolian farmer in Istanbul.

Urartians were described as having light skin, fair hair, etc.