May 18, 2011

The Central Asian element in Turks (part 3)

In a previous post I summarized extensive evidence by myself and Turkish researchers to the effect that modern Turks are about 1/7 descended from Central Asian Turkic speakers, and 6/7 from pre-Turkic West Asians.

Some people have argued that Uzbeks, the best representative of the Central Asian ancestors of the Turks, are inappropriate as a parental population.

Can Turks be modeled as a 1/7-6/7 simple mix of West Eurasians and Central Asians? I refer to my most recent K=11 ADMIXTURE results as useful data that can be used to test this hypothesis once again.

I will use the 4-way average of Greek_D, Armenian_D, Georgians, and Syrians as representative of the "West Eurasian" component in Turks. These 4 populations border Turks from the West, East, North, and South, and their average is expected to be a good stand-in for what pre-Turkish Anatolians were like, and probably more robust than choosing arbitrarily just one of the 4 populations.

I will use Uzbeks as representative of Central Asian Turks, and I will calculate the weighted average of the two (1/7 Uzbek + 6/7 "West Eurasian"). I will then compare this with the average of the Turks (from Behar et al. 2010)+Turkish_D combined sample.

If Turks can be modeled as the simple mix I have claimed, then the empirical Turkish average will be similar to the simulated one (1/7 Uzbek + 6/7 "West Eurasian"). Here are the actual numbers:

As you can see, the simulated average is virtually identical to the empirical one. No component deviates from it by more than 0.4%, except the most important West Asian one, which deviates by a mere 1.8%; in relative terms (divided by the mean of 49.2%), that represents a 3.7% error.

Given the finite sample sizes, the limitations of ADMIXTURE, and the use of a 4-way average as a proxy for pre-Turkish Anatolians, I can easily claim that this not only confirms the validity of my model, but does so to an extraordinary degree.

A different way of testing the model's validity is the correlation between the empirical and simulated admixture proportions which is 0.99956. I don't think I need to point out how remarkable this is.
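For readers who want to repeat the test on their own numbers, here is a minimal sketch of the calculation described above: average the West Eurasian proxies, mix them 6/7 to 1/7 with the Central Asian proxy, and compare the result with the empirical average. The function names and all the proportions in it are made-up placeholders, not the K=11 ADMIXTURE values; only the final relative-error line uses figures quoted in this post.

```python
from math import sqrt

def simulate_mix(central_asian, west_proxies, central_fraction=1 / 7):
    """Mix the Central Asian proxy with the unweighted mean of the
    West Eurasian proxies, component by component."""
    n = len(west_proxies)
    west = [sum(p[i] for p in west_proxies) / n
            for i in range(len(central_asian))]
    return [central_fraction * c + (1 - central_fraction) * w
            for c, w in zip(central_asian, west)]

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) *
                      sum((b - my) ** 2 for b in y))

# PLACEHOLDER proportions (percent) over three invented components.
# These are NOT the actual ADMIXTURE results.
uzbek_like = [20.0, 30.0, 50.0]
west_proxies = [
    [55.0, 40.0, 5.0],
    [65.0, 30.0, 5.0],
    [60.0, 38.0, 2.0],
    [60.0, 32.0, 8.0],
]
empirical_turkish_like = [54.0, 35.0, 11.0]

simulated = simulate_mix(uzbek_like, west_proxies)
deviations = [abs(e - s) for e, s in zip(empirical_turkish_like, simulated)]
correlation = pearson(empirical_turkish_like, simulated)
print([round(s, 2) for s in simulated])
print([round(d, 2) for d in deviations], round(correlation, 4))

# Relative error quoted above: a 1.8-point deviation on a mean of 49.2%.
print(round(100 * 1.8 / 49.2, 1))  # 3.7
```

Changing `central_fraction` is an easy way to see how sensitive the fit is to the assumed 1/7 share.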

Conclusion

The empirical data are consistent with the idea that Anatolian Turks are a simple mix of a West Eurasian population element equivalent to the average of their immediate neighbors, and a Central Asian population element similar to Uzbeks, in a 6:1 ratio. These results confirm and extend the extensive evidence of the previous post.

UPDATE (May 21): In a new experiment, I demonstrate that all available Turkic samples fall almost perfectly on a cline between West and East Eurasians. That experiment also shows that Uzbeks are the most West Eurasian out of the available Central Asian Turkic populations.

It is still unclear what the ratio of West/East Eurasian elements in Turkic people who entered Anatolia was, but these results certainly point out that the Uzbeks are not unusually Mongoloid in their makeup among Turkic peoples, rather the opposite.

May 17, 2011

The death of "acculturation" as a model for European farming dispersal

A very good open access article that shows how migrationism has returned to fashion, and that both the laborious wave of advance and the processual "acculturation" model, in which Mesolithic people slowly adopt elements of farming culture and transform themselves into farmers, are wrong. My only beef with the article is its treatment of genetics, which largely rehashes pre-ancient DNA studies and ignores the latest literature.
From the (open access) paper:
In this article I will consider the spread of agriculture from central Europe to the Atlantic (fig. 1). This involves four major “spread events”: the Cardial of the western Mediterranean, the Linienbandkeramik (LBK) of the interior, the Trægtbægerkultur (TRB) of southern Scandinavia, and the Neolithic of Britain and Ireland.
...
Above all, the migrationist scenarios suggested here may account for one thing: why we so rarely see long-term “transitional” stages between foraging and farming. Now we see foragers, now we see farmers; but in Europe we have singularly failed to catch foragers in the act of becoming farmers. The long-term developmental processes we have expected for decades have not materialized. Farmers can evidently trade axes with foragers for centuries or longer without destabilizing them or leading them to adopt farming. “Processes” there undoubtedly are, but we need to look inside the standard deviation of a radiocarbon date to see them in action.

Current Anthropology http://www.jstor.org/stable/10.1086/658368

Westward Ho!
The Spread of Agriculture from Central Europe to the Atlantic

Peter Rowley-Conwy

Recent work on the four major areas of the spread of agriculture in Neolithic western Europe has revealed that they are both chronologically and economically much more abrupt than has hitherto been envisaged. Most claims of a little agriculture in Late Mesolithic communities are shown to be incorrect. In most places, full sedentary agriculture was introduced very rapidly at the start of the Neolithic. “Transitional” economies are virtually absent. Consequently, the long-term processes of internal development from forager to farmer, so often discussed in Mesolithic-Neolithic Europe, are increasingly hard to sustain. The spread of agriculture by immigration is thus an increasingly viable explanation. The crucial role of boats for transport and of dairying for the survival of new farming settlements are both highlighted. Farming migrations were punctuated and sporadic, not a single wave of advance. Consequently, there was much genetic mixing as farming spread, so that agricultural immigrants into any region carried a majority of native European Mesolithic genes, not Near Eastern ones.


The Neolithic founder crops

This is a truly invaluable resource on the origins of the eight founder crops of the West Eurasian Neolithic. It is also open access, so read or peruse the maps at your leisure.

Current Anthropology http://www.jstor.org/stable/10.1086/658367

The Neolithic Southwest Asian Founder Crops
Their Biology and Archaeobotany


Ehud Weiss and Daniel Zohary

This article reviews the available information on the founder grain crops (einkorn wheat, emmer wheat, barley, lentil, pea, chickpea, and flax) that started agriculture in Southwest Asia during the Pre-Pottery Neolithic period, some 11,000–10,000 years ago. It provides a critical assessment for recognizing domestication traits by focusing on two fields of study: biology and archaeobotany. The data in these fields have increased considerably during the past decade, and new research techniques have added much to our knowledge of progenitor plants and their domesticated derivatives. This article presents the current and accumulated knowledge regarding each plant and illustrates the new picture that emerged on the origin of agriculture.


The spread of Austronesian farmers across the Pacific

This is the first of three interesting papers that appear ahead of print in Current Anthropology. Peter Bellwood reviews the spread of farming across the Pacific from its two sources (China including Taiwan, and the New Guinea highlands).

Current Anthropology http://www.jstor.org/stable/full/10.1086/658181

Holocene Population History in the Pacific Region as a Model for Worldwide Food Producer Dispersals

Peter Bellwood

Pacific prehistory (excluding Australia) since 3000 BC reflects the impacts of two source regions for food production: China from the Yangzi southward (including Taiwan) and the western Pacific (especially the New Guinea Highlands). The linguistic (Austronesian, Trans–New Guinea), bioanthropological/human genetic, and Neolithic archaeological records each carry signals of expansion from these two source regions. A combined consideration of the multiregional results within all three disciplines (archaeology, linguistics, and biology) offers a historical perspective that will never be obtained from one discipline or one region alone. The fundamental process of human behavior involved in such expansion—population dispersal linked to increases in human population size—is significant for explaining the early spreads of food production and language families in many parts of the world. This article is concerned mainly with the archaeological record for the expansion of early food producers, Austronesian languages, and Neolithic technologies through Taiwan into the northern Philippines as an early stage in what was to become the greatest dispersal of an ethnolinguistic population in world history before AD 1500.


May 16, 2011

Before Silk: Unsolved Mysteries of the Silk Road by Colin Renfrew



A great talk by Colin Renfrew at the Penn Museum, a must view for anyone interested in Eurasian prehistory. It's good to know that Renfrew holds that the Tocharians originated in West Asia and spread along the Silk Road (south of the Caspian) and not along the steppes. That is also my opinion, and I would perhaps associate them with the J2a/R1b-rich population of eastern Anatolia, Transcaucasia and North Iran, and, perhaps, even relate them to the Gutians from the Zagros (after Gamkrelidze and Ivanov). That would explain quite well, I believe, the anomaly that is the high J2a/R1b frequency in the region.

Also, I would disagree with long-term persistence of Tocharian in Xinjiang; it's more likely that this was due to a late migration from somewhere to the west that was eventually swamped by the much more successful wave of the Indo-Iranians that must've involved a J2a/G2a/R1a1 combination. I've made the point before that the vast majority of the territory of Europe and West Asia has shifted language (if not language family) over a span of 2-3 thousand years, so I see no real reason to imagine 2-3 thousand years of linguistic continuity in Xinjiang until the 8th c. AD when Tocharian is first attested.

UPDATE:

Here are various abstracts from the Symposium Reconfiguring the Silk Road: New Research on East-West Exchange in Antiquity; hopefully more videos will be uploaded.


UPDATE II:

James Mallory's talk is also uploaded:



He spends a great deal of his time establishing the (not controversial) idea that Tripolye cultures cannot explain the Yamna phenomenon east of the Dnieper. I will not argue with this, but it is rather a defensive stance of arguing against the spread of Indo-Europeans from the Balkans to the steppe-lands. Positive evidence for the spread of Yamna in the opposite direction is what is needed for the steppe model, and that is what is lacking: with the exception of some clearly Yamna-derived sites in the northern Balkans and Hungary, the links to the rest of the European Indo-Europeans are weak to non-existent.

Moreover, I would argue that the assumption that 3-4,000 years BC there were Indo-European speakers east of the Dnieper is itself suspect. There is no great necessity of explaining how Yamna was Indo-Europeanized from the Balkans, because there is no evidence that it was Indo-European. The first clearly attested Indo-European groups in the European steppe are the Scythians, and they appear in the 1st millennium BC, with both craniology and ancient sources agreeing that they were of eastern origin. So, the inability of the Balkan Tripolye Indo-Europeans to Indo-Europeanize the Pontic steppe east of the Dnieper is really a non-sequitur, since one must first demonstrate that populations east of the Dnieper spoke Indo-European languages 3,000 years before a single branch of IE (the Iranic) appears in the region from the east.

Mallory also highlights some other substantial problems of the steppe model. Tocharian has IE words for cereals and pigs, and the evidence such as it is suggests that east of the Dnieper there were no domesticated cereals and no evidence of pigs. All this, of course, disappears once we accept that the Tocharians did not move across the steppe lands north of the Caspian, but south of it, from Iran and ultimately the Near East.

Mallory also points out that while he suspects Afanasievo to be linked to the Proto-Tocharians, the sum total of the evidence linking Xinjiang with Afanasievo is meagre. I would also add that, as I've explained above, if one were to establish links between Bronze Age Tarim and Afanasievo, that still leaves a couple of millennia until the first attestation of the Tocharian languages.

Another point of interest is the claim that Indo-Iranians and Tocharians must've been separated in space and time to evolve into such distinctive languages. But, that too is a non-sequitur. We only have to look at the Near East or the Caucasus to witness the co-existence of a handful of language families and dozens of distinct languages. You don't need a large separation to create a new language subfamily, but only a few rivers or high mountains, and Transcaucasia provides both in abundance. I would see Tocharians as the last remnant of the eastern Indo-Europeans, perhaps refugees from further west into Xinjiang itself, with Indo-Iranians pushing them eastward. As for the later steppe cultures that were unquestionably Iranic, these were probably formed after the collapse of the BMAC which sent off Indo-Aryan offshoots south in the 2nd millennium BC, and Iranic ones north, east, and west soon thereafter.

May 15, 2011

Genes and Languages in the Caucasus

If there was ever a paper that was the equivalent of a box of candy, this is probably it. I will update this post with my comments.

UPDATE I (Genealogical rate, Gene-language concordance, Ossetes): I seriously don't know where to begin with this paper. So, given the serendipitous appearance of an abstract on Y-chromosome mutation rates, here is a major new pro-genealogical rate quote from the new paper:
We found that “evolutionary” estimates of most clusters fall far outside the range of the respective linguistic dates, while “genealogical” estimates gave a good fit with the linguistic dates. At least two population events in the Caucasus are documented archaeologically, which allows additional comparison with these “historical” dates. In both cases, the historical (archaeological) date is similar to a genetic estimate based on the “genealogical” mutation rate (Supplementary Note 2).
And, here's a comparison of the linguistic and genetic (based on Y-chromosomes) trees from the paper:
The correspondence seems remarkable; the only major discrepancy is for Iranic (Indo-European) Ossetes who group with NW Caucasians genetically, which makes sense as the Ossetes are probably to a large extent NW Caucasians that underwent a language shift under the influence of the Alans.

Speaking of the Ossetes, their negligible R1a1-M198 frequency (0.4-0.8%) should be a warning that Iranic steppe nomads _do not equal_ R1a1. While a limited contribution of Alans to the Ossetes is expected, it is not expected that Ossetes will have two of the lowest M198 frequencies in the Caucasus: in all probability R1a1 was not particularly important among Alans, and, by implication (?) Sarmatians.

UPDATE II (4 haplogroups for 4 language families):

The most interesting discovery in this paper is, of course, the correspondence between Y-chromosome haplogroups and language groups, thanks to the very large number of individuals tested and the deep phylogenetic resolution of the haplogroups:
Overall, the most frequent haplogroups in the Caucasus were G2a3b1-P303 (12%), G2a1a-P18 (8%), J1*-M267(xP58) (34%), and J2a4b*-M67(xM92) (21%), which together encompassed 73% of the Y chromosomes, while the other 24 haplogroups identified in our study comprise the remaining 27% (Table 2). ... haplogroup G2a3b1-P303 comprised at least 21% (and up to 86%) of the Y chromosomes in the Shapsug, Abkhaz and Circassians ... haplogroup G2a1a-P18 comprised at least 56% (and up to 73%) of the Digorians and Ironians (both from the Central Caucasus Iranic linguistic group), while not being found at more than 12% (average 3%) in other populations... haplogroup J2a4b*-M67(xM92) comprised 51-79% of the Y chromosomes in the Ingush and three Chechen populations (North-East Caucasus, Nakh linguistic group), while, in the rest of the Caucasus, its frequency was not higher than 9% (average 3%) ... haplogroup J1*-M267(xP58) comprised 44-99% of the Avar, Dargins, Kaitak, Kubachi, and Lezghins (South-East Caucasus, Dagestan linguistic group) but was less than 25% in Nakh populations and less than 5% in the rest of Caucasus.

Interestingly, G2a3 is one of the lineages of early Central European farmers, and 2 medieval German knights. G2 is also, curiously, one of the West Eurasian lineages that are found in very small quantities in India, especially among upper caste Hindus. We are beginning to make connections across space and time, even though the patterns are far from clear yet.

The prevalence of J1*-M267(xP58) in Dagestan is well known (or suspected) from previous studies. Notice that J-P58, if we use the genealogical rate, has an age of ~5.4ky in Semitic groups, and this is in concordance with the origin of Semitic languages 5,750 years ago, as estimated by Bayesian phylogenetics. So, it is clear that one part of haplogroup J1 was prevalent in ancient Semitic groups, and another, disjoint part in ancient Dagestani groups.

To make things more interesting, the Nakh groups (Ingush and Chechens) have J2a4b*-M67(xM92) as their modal haplogroup. Nakh is also a Northeast Caucasian language subfamily, like Dagestani, and indeed NE Caucasian is also called Nakho-Daghestanian. What did the early speakers of this family look like?

It would be tempting to think that Proto-Nakho-Dagestanians were J1-dominated, as J1 exists in both Nakh (16-25%) and Dagestani (58-99%) groups, whereas J2a4b-M67 (the Nakh modal haplogroup) is nearly completely absent in Dagestanians.

UPDATE III (No European influence):

Another interesting discovery of this study is the lack of European influence in the populations of the North Caucasus.
It seems that both R1a1a-M198 and I2a-P37 have a major barrier eastward in the Don river. Please note that the former is not strictly a European haplogroup, but it nonetheless experiences a massive drop in frequency, and is negligible everywhere except in Abkhaz-Circassians (NW Caucasus; 10.3-19.7%), with an outlier in Dargins (22%).

This seems to put a limit on the origin of any hypothetical movements across the Eurasian steppe east of the Don river, as haplogroup I2a-P37 is largely absent in Central Asia, and occurs 3 times in 1,525 individuals in this sample. So, while there have been proposals of a Central European origin of some steppe pastoralist groups, these are hard to reconcile with this picture.

UPDATE IV (Haplogroup G):

Two of the modal haplogroups in this paper are G2a1a-P18 (Iranic, 56-73%) and G2a3b1-P303 (NW Caucasians, 21-86%). Battaglia et al. (2008) also found a high frequency of G2a* in Georgians and Balkars (~30%, also modal in both populations). It appears that G2a is a mainly West (both NW and SW) Caucasian phenomenon within the context of this region.

UPDATE V (Starostin and Language depth)

The authors applied the methodology of the late Sergei Starostin to the problem of language time depth:
The present work employs Starostin’s methodology, and we made special efforts to create the high-quality linguistic databases required for this analysis. Thus, based on significantly extended and revised linguistic databases, we have applied a glotto-chronological approach to the North Caucasian languages. As a result, our study provides a unique opportunity to make direct comparisons of linguistic and genetic data from the same populations. Lexico-statistical methods have also been applied to a number of language families using a Bayesian approach to increase the statistical robustness of language classification (Gray and Atkinson, 2003; Kitchen et al., 2009; Greenhill et al., 2010). Using these methods with the Caucasus languages under study here will be the focus of future work.
It will certainly be interesting to see Bayesian phylogenetic methods applied to the Caucasus languages in the future, using the linguistic datasets developed here. The concordance of genetic-linguistic results in this paper, in addition to the many successes of the G&A approach, is making it increasingly difficult for those who doubt our ability to estimate the age of language families in a manner similar to that with which biologists estimate the age of genetic variation.

See also Tower of Babel project and the Evolution of Human Languages project at the Santa Fe Institute.

UPDATE VI (Haplogroup J2a)

I have recently speculated about a possible link between the Caucasus region and India based on the appearance of a "Dagestan" component in India, the clear West Asian origin of Ancestral North Indians, as well as a possible linguistic link between Northeast Caucasian, Hurrian, and Indo-European.

A problem with that theory is that the high J1*(xP58) frequency in Dagestan has no counterpart in South Asia. The current study, however, adds data on the Nakh part of the Nakho-Dagestanian (Northeast Caucasian) family, showing this to be J2a4b-M67 dominated. So, while I think that J1*(xP58) may have been present among Proto-Northeast Caucasians, these must have interacted with J2a folk.

J-M67 is clearly intrusive into the Central Caucasus, from the South where a much greater variety of J2a-related lineages is observed among Armenians, North Iranians, and Anatolian Turks.

We now have good coverage of J2a in the entirety of the West Asian region, with the exception of Azerbaijan, and a few patterns are beginning to emerge:
  1. The center of the J2a world is somewhere between eastern Turkey, Armenia, Azerbaijan, Iran, and Syria
  2. The Caucasus is a northern extension of this world, just as Greece and Italy are its main western extensions, with a strong extension into Central Asia as far as Xinjiang, and well into South Asia all the way to upper caste South Indian Hindus.
  3. In the Caucasus itself J-M67 is dominating Nakh speakers, but with little other J2a related variation.
  4. In comparison to Nakhs, J2a seems more varied in Georgians, among Ossetes, and among NW Caucasian speakers
It is hard to make any pronouncements on how J2a spread northwards from its Transcaucasian cradle, but I would think that the Kura-Araxes and Maikop cultures are fairly good candidates for that spread, with the former being J2a dominated, and the latter being more G2a dominated. I would not, however, dismiss a more recent spread of J2a into the region.

UPDATE VII (Absence of E1b1b1):

This haplogroup has a more Mediterranean distribution and is conspicuously absent in the North Caucasus. Unfortunately no downstream markers were typed, but (a) its presence in small amounts in NW Caucasians (1-1.7%) together with a similar low frequency (1.5%) in Georgians, and (b) its absolute absence among Nakho-Dagestanians, except for one Lezghin, suggest to me that it arrived in the region from the west, and is probably a low-frequency trace of Ancient Greek colonies of the Black Sea, just as it is associated with Greek colonists in the West Mediterranean and Sicily.

UPDATE VIII (Haplogroups L and T):

There is a little haplogroup L in the North Caucasus. L-M27 and L-M317 seem concentrated in the Northwest, while L-M357 is found only in Nakh speakers. The detection of L-M357 in North but not South Iran may be related to this population, and also to the L-rich population of Syria, especially from the eastern inland area.

Haplogroup T has been the subject of a major recent paper. In this region, it is found in 2 NW Caucasians, 1 Ossete and a couple of Lezgins, but unfortunately with no fine phylogenetic resolution.

Mol Biol Evol (2011) doi: 10.1093/molbev/msr126

Parallel Evolution of Genes and Languages in the Caucasus Region

Oleg Balanovsky, Khadizhat Dibirova, Anna Dybo, Oleg Mudrak, Svetlana Frolova, Elvira Pocheshkhova, Marc Haber, Daniel Platt, Theodore Schurr, Wolfgang Haak, Marina Kuznetsova, Magomed Radzhabov, Olga Balaganskaya, Alexey Romanov, Tatiana Zakharova, David F. Soria Hernanz, Pierre Zalloua, Sergey Koshel, Merritt Ruhlen, Colin Renfrew, R. Spencer Wells, Chris Tyler-Smith, Elena Balanovska and The Genographic Consortium

We analyzed 40 SNP and 19 STR Y-chromosomal markers in a large sample of 1,525 indigenous individuals from 14 populations in the Caucasus and 254 additional individuals representing potential source populations. We also employed a lexicostatistical approach to reconstruct the history of the languages of the North Caucasian family spoken by the Caucasus populations. We found a different major haplogroup to be prevalent in each of four sets of populations that occupy distinct geographic regions and belong to different linguistic branches. The haplogroup frequencies correlated with geography and, even more strongly, with language. Within haplogroups, a number of haplotype clusters were shown to be specific to individual populations and languages. The data suggested a direct origin of Caucasus male lineages from the Near East, followed by high levels of isolation, differentiation and genetic drift in situ. Comparison of genetic and linguistic reconstructions covering the last few millennia showed striking correspondences between the topology and dates of the respective gene and language trees, and with documented historical events. Overall, in the Caucasus region, unmatched levels of gene-language co-evolution occurred within geographically isolated populations, probably due to its mountainous terrain.


May 14, 2011

Let the Y-STR mutation wars begin!

This should strictly go to the new ESHG abstracts post, but I am sure it will spark a lot of interest, so I am posting it separately. I recently noticed how Y-STR age estimates are dependent on the choice of Y-STRs used, so it will be very interesting to see what Busby and Capelli have come up with.

It is certainly a very good thing to reignite the debate, even though I do believe that Y-SNP based dating in the age of whole genome sequencing will solve many dating problems, especially for old clades of the tree. I have argued at length why the evolutionary mutation rate is wrong, but the more serious problem is the fact that different sets of Y-STRs lead to different age estimates (with slower-mutating ones producing much older ages than fast-mutating ones).
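To make the dependence on the assumed rate concrete, here is a minimal sketch of one common simplified estimator. Under a stepwise mutation model, the average squared difference (ASD) in repeat count between sampled haplotypes and an assumed founder haplotype grows roughly as mutation rate times number of generations, so age is approximately ASD divided by the rate. The haplotypes, the founder, the 25-year generation time and both rates below are hypothetical values chosen for illustration (the two rates only roughly mimic the gap between a faster "genealogical" and a slower "evolutionary" rate); this is not the method of Busby and Capelli.

```python
def asd_to_founder(haplotypes, founder):
    """Average squared difference in repeat count from the founder
    haplotype, averaged over all samples and all loci."""
    n_loci = len(founder)
    total = sum((h[i] - founder[i]) ** 2
                for h in haplotypes for i in range(n_loci))
    return total / (len(haplotypes) * n_loci)

def age_in_years(asd, rate_per_locus_per_generation, years_per_generation=25):
    """Convert ASD to an age: generations = ASD / rate (stepwise model)."""
    return asd / rate_per_locus_per_generation * years_per_generation

# HYPOTHETICAL two-locus haplotypes (repeat counts) and assumed founder.
founder = [15, 12]
haplotypes = [[14, 12], [16, 12], [15, 13], [15, 11]]

# ASSUMED illustrative rates, not values taken from this post.
FASTER_RATE = 0.002   # per locus per generation
SLOWER_RATE = 0.0007  # per locus per generation

asd = asd_to_founder(haplotypes, founder)
age_fast = age_in_years(asd, FASTER_RATE)
age_slow = age_in_years(asd, SLOWER_RATE)
print(asd, round(age_fast), round(age_slow))
```

The same observed diversity converts to an age almost three times older when the slower rate is plugged in, which is why the choice of rate, and of markers with very different rates, matters so much.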


Microsatellite choice and Y chromosome variation: attempting to select the best STRs to date human Y chromosome lineages
G. B. J. Busby, C. Capelli
Recently the debate on the origins of the major European Y chromosome haplogroup R-M269 has reignited, and opinion has moved away from Paleolithic origins to the notion of a younger Neolithic spread of these chromosomes from the Near East. We investigate the young, STR-based Time to the Most Recent Common Ancestor estimates proposed so far for R-M269 related lineages and find evidence for an appreciable effect of microsatellite choice on age estimates. We further expand our analysis to include a worldwide dataset of over 60 STRs which differ in their molecular attributes. This analysis shows that by taking into account the intrinsic molecular characteristics of Y chromosome STRs, one can arrive at a more reliable estimate for the age of Y chromosome lineages. Subsequently, we suggest that most STR-based Y chromosome dates are likely to be underestimates due to the molecular characteristics of the markers commonly used, such as their mutation rate and the range of potential alleles that STR can take, which potentially leads to a loss of time-linearity. As a consequence, we update the STR-based age of important nodes in the Y chromosome tree, showing that credible estimates for the age of lineages can be made once these STR characteristics are taken into consideration. Finally we show that the STRs that are most commonly used to explore deep ancestry are not able to uncover ancient relationships, and we propose a set of STRs that should be used in these cases.

ESHG 2011 abstracts are online

From here. I didn't find much of interest this year, except a long-overdue look at Bulgarian Y-chromosomes but with not a very informative abstract.

Y-Chromosome genetic variation of modern Bulgarians
S. Karachanak et al.
To date, Bulgarian Y chromosomes have been studied only in macrogeographic context or in the lineage-based approach. Therefore, in order to comprehensively characterize Bulgarian Y-chromosome variation, we have performed high-resolution phylogenetic analysis of 812 healthy,unrelated Bulgarian males and compared the results with Y-chromosome data from other Eurasian populations.
The genotyping of 60 biallelic markers was performed in hierarchical order by RFLP and DHPLC analyses. The position of Bulgarians among other populations was visualized by Principal Component (PC) analysis.
About 80% of the total genetic variation in Bulgarians falls within haplogroups E-M35, I-M170, J-M172, R-M17 and R-M269. This finding shows that the Bulgarian haplogroup profile is congruent with those described for most European populations.
Among the prehistoric events marked by the observed haplogroups, the greatest contribution comes from the range expansion of local Mesolithic foragers triggered by adoption of agriculture introduced by a cadre of Near Eastern farmers. The Bulgarian Y chromosome gene pool also bears signals of the recolonization from different glacial refugia, the spread of agriculture from the Near East and the expansion of early farmers along the Central and East European river basins.
As for the interpopulation analysis, similarly to mtDNA, Bulgarians belong to the cluster of European populations, still being slightly distant from them. Bulgarians are distant from Turks (despite geographical proximity), Arabic and Caucasus populations and Indians. These trends in the PCA graph likely reflect not only prehistoric, but also more recent demographic events that have shaped the Y chromosome structure of modern Bulgarians.


An abstract on Yakuts seems to report the link between the Altaic-Turkic Yakut and the Altaic-Tungusic Evenk that I also discovered recently.

Autosomal and uniparental genetic diversity of the populations of Sakha (Yakutia): Implications for the peopling of Northeast Eurasia
S. A. Fedorov et al.
Sakha Autonomous Republic occupies a quarter of Siberian total land area in its northeastern part, is an important region for understanding the colonization of the Northern Eurasia by anatomically modern humans. To characterize the genetic variation in Sakha both the haploid mitochondrial DNA (mtDNA) and Y chromosomal as well as diploid autosomal loci (650 000 SNPs) of genome were analyzed in five native populations of Sakha (Yakuts, Evenks, Evens, Dolgans and Yukaghirs).
While striking prevalence of Y chromosome haplogroup N1c in gene pool differentiates Yakuts from other populations, the mtDNA and autosomal analyses demonstrate genetic similarity of all native populations of Sakha, in particular Yakuts and Evenks. The results also demonstrate closest genetic proximity of the populations of Sakha with southern Siberians. Both mtDNA and autosomal analyses reveal deep genetic discontinuity between Siberian and Beringian populations. MtDNA haplogroups A2 and G1b, prevalent in Beringian populations, are either minor or even absent in Sakha, where haplogroups C and D dominate. Autosomal analysis also differentiates Beringian populations from those of Sakha. Our results support the scenario that the territory of Sakha was colonized from the regions west and eastward of Lake Baikal with only minor gene flow from Lower Amur/Southern Okhotsk region and/or Kamchatka.
An abstract on Lithuanian Y-chromosomes.

The place of the population of Lithuania between Northern and Eastern Europe: Y chromosome analysis
I. Uktverytė et al.

The population of Lithuania is constituted of 6 dialectal groups which form two major ethno-linguistic groups known as Aukštaitish and Žemaitish, both speaking Baltic languages of Indo-European family. Neighbouring Finno-Ugric (Northern and Eastern Europe), Slavonic (Eastern Europe) and Germanic (Northern Europe) populations surrounding the Baltic sea region influenced historical formation of Lithuanian ethno-linguistic groups. Analysis of the Lithuanian population genetic composition helps to understand the origin, history and place among other populations.
Y chromosome analysis was performed for 301 individuals from 6 dialectal groups. 25 SNPs were genotyped (TaqMan) to determine Y haplogroup and 17 STR were analysed to determine haplotype for each individual. Most frequent haplogroups in the population of Lithuania are R1a1a (42.2%, R1a1a1g compose 8.97% in studied population) and N1c1 (40.5%) and less frequent haplogroups are R1b1b1, I1, I2a, E1b1b1 (<5% each). AMOVA showed no statistically significant differences between two major ethno-linguistic groups Aukštaitish and Žemaitish (among groups p-value=0.897, among population within groups p-value=0.194, within populations p-value=0.282 based on 10100 permutations). MDS of genetic distances based on Y-biallelic markers showed that Lithuanians are closer to Latvian and Estonian populations than to Slavic populations (European part of Russia, Poland, Ukraine, Belorussia, stress=0.029). According to the frequencies of haplogroups, no statistically significant differences between ethno-linguistic groups were detected (p>0.05), moreover, MDS analysis sets the population of Lithuania between Northern and Eastern European populations.

An abstract on Sardinian population structure.


A genome-wide analysis of Sardinian population structure
M. Steri et al.

Sardinia is particular attractive for human genetic studies, being one of the larger isolated populations and thus suitable for large-scale studies. Several attempts have been made to explore its genetic structure, but they either analyzed a large set of markers in very few samples or thousands of individuals at specific loci. Here we genotyped 2,615 individuals with the Affymetrix 6.0 array. Samples were recruited from the north, south and central east areas of the Island, and initially considered as 3 distinct populations. Genotype calling was performed with Birdseed-v2, considering all samples as a unique cluster to avoid batch effects. Subsequently, we applied standard filters for samples and SNP quality, and used IBD sharing to detect, and discard, hidden relatives. Using principal component analysis, we identified outliers and reassigned each individual accordingly. An analysis of molecular variance indicated that only 0.21% of the variability could be attributable to inter-population variation (Fst=0.002), confirming a lack of large-scale substructure. We thus considered the Sardinians as a unique sample. Compared to HapMap3 populations, as expected, higher similarity was observed with Tuscany and CEPH samples (Fst=0.005 and 0.010, respectively). A genome-wide search for SNPs highly differentiated between Sardinians and these European populations confirmed the specialness of HLA and LCT regions, and also showed elevated Fst values (>0.27) at the CR1 gene, known to be related to malaria severity. We are now integrating sequencing data of many individuals to provide a more comprehensive analysis of variants in addition to the common SNPs in current genotyping platforms.


A major new study on Arabian mtDNA

Phylogeographic analyses; mitochondrial DNA; Arabian Peninsula
V. Fernandes et al.

Phylogeographic analyses of mitochondrial DNA (mtDNA) provide insights into modern human evolution. In recent years, worldwide studies of contemporary mtDNAs have indicated that modern humans left Africa ~60,000-70,000 years ago along the “southern coastal route”, across the Red Sea and via the Arabian Peninsula. Yet no obvious signs of the passage though Arabia have been found in genetics and archaeology fields. The aims of this work are to seek for possible mtDNA relicts of the initial dispersal from Africa in Arabia and to investigate the origins of lineages that arrived later. We are doing this by sequencing the complete mtDNA molecule (~16,568 bp) from unclassified lineages (referred to as the paraphyletic clusters L3*, N* and R*) and poorly studied haplogroups within the Eurasian macrohaplogroup N, which is predominant in Arabian populations today (86% in Saudi Arabia, 66% in Yemen and 79% in Dubai), in 90 samples from Dubai, Yemen, North/East Africa, the Near East and Europe. Our results will allow to test hypotheses about the settlement of the Arabian Peninsula.

The strangeness of the human genome

Here is a little experiment:

Calculate the first principal component of variation between Papuans and Karitiana from Brazil. These are some of the populations most distant from Africa that one can find genetic data for. (One Papuan, HGDP00544, is substantially different from the rest and is shifted towards East Asians, so he was removed; all analyses are on 613,630 SNPs with all no-calls removed.)

Project 18 Mbuti Pygmies+San (henceforth Palaeoafricans) and 21 Yoruba from the HGDP-CEPH onto this component.
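The two steps above (build PC1 from the two reference groups only, then project other samples onto it) can be sketched in a few lines. This is only an illustration under stated assumptions: it uses a plain SVD-based PCA in NumPy rather than whatever software was actually used for this post, and the genotypes, allele frequencies, and group sizes are simulated placeholders, not HGDP data. The names fit_pc1 and project are hypothetical helpers.

```python
import numpy as np

def fit_pc1(reference):
    """Return (per-SNP means, unit-length PC1 axis) for a samples x SNPs matrix."""
    mean = reference.mean(axis=0)
    centered = reference - mean
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[0]

def project(samples, mean, axis):
    """Project samples onto an axis they did not help to define."""
    return (samples - mean) @ axis

rng = np.random.default_rng(0)
n_snps = 2000

# Simulated allele frequencies for two reference groups (placeholders only).
freq_a = rng.uniform(0.05, 0.95, n_snps)
freq_b = np.clip(freq_a + rng.normal(0.0, 0.25, n_snps), 0.01, 0.99)

# Genotypes coded as 0/1/2 copies of an allele, with no missing calls.
ref_a = rng.binomial(2, freq_a, size=(16, n_snps)).astype(float)
ref_b = rng.binomial(2, freq_b, size=(13, n_snps)).astype(float)

# PC1 is computed from the two reference groups only.
mean, pc1 = fit_pc1(np.vstack([ref_a, ref_b]))
scores_a = project(ref_a, mean, pc1)
scores_b = project(ref_b, mean, pc1)

# A third simulated group is projected without influencing the axis.
freq_c = 0.5 * (freq_a + freq_b)
test_group = rng.binomial(2, freq_c, size=(20, n_snps)).astype(float)
scores_c = project(test_group, mean, pc1)

print("reference A mean:", scores_a.mean())
print("reference B mean:", scores_b.mean())
print("projected group mean:", scores_c.mean())
```

The point of the design is that the projected samples never enter the SVD, so any difference between two projected groups reflects how they relate to the axis defined by the reference groups alone.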

What do we expect? According to the standard Out-of-Africa model, you expect that Palaeoafricans and Yoruba will not differ from each other along the axis in which Amerindians differ from Papuans. If you take into account the Denisovan admixture in Australo-Melanesians, you might expect Africans (who lack this admixture) to be more Amerindian-like (since Amerindians also lack that archaic component). But, you certainly don't expect in either scenario Palaeoafricans to differ from Yoruba.

What the data say. Here are the data points along PC1:

green = Papuans
magenta = Karitiana

Here is a blowup of the middle part, showing the African populations:

red = Palaeoafrican
blue = Yoruba

A t-test supports (p < 0.000001) the obvious visual conclusion that Palaeoafricans differ from Yoruba in the same direction that Papuans differ from Karitiana. This is quite remarkable: why would Yoruba differ from San/Pygmies in the same way that Amazonians differ from Australo-Melanesians?
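For readers who want to check a difference like this without relying on the t distribution, a label-permutation test is one alternative. Note that this is a different technique from the t-test used above, and the coordinates below are simulated: the group means are taken from this post, but the spread (0.001) is an assumption made purely for illustration, as is the helper name permutation_p_value.

```python
import random

def permutation_p_value(group_a, group_b, n_permutations=10000, seed=0):
    """Two-sided permutation p-value for a difference in group means."""
    rng = random.Random(seed)
    n_a = len(group_a)
    pooled = list(group_a) + list(group_b)

    def mean_gap(values):
        first, second = values[:n_a], values[n_a:]
        return abs(sum(first) / len(first) - sum(second) / len(second))

    observed = mean_gap(pooled)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign group labels at random
        if mean_gap(pooled) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)

# Simulated PC1 coordinates: 18 "Palaeoafrican" and 21 "Yoruba" values.
data_rng = random.Random(1)
palaeoafrican = [data_rng.gauss(-0.022, 0.001) for _ in range(18)]
yoruba = [data_rng.gauss(-0.020, 0.001) for _ in range(21)]

p_value = permutation_p_value(palaeoafrican, yoruba)
print("permutation p-value:", p_value)
```

With 10,000 permutations the smallest p-value this can report is about 0.0001, so it cannot resolve something as small as the figure quoted above; it only shows whether the two groups are distinguishable along the axis.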

The difference is not that great: the Palaeoafrican/Yoruba means are -0.022 and -0.020 and the Papuan/Karitiana ones are -0.167 and 0.206 respectively. Hence, the difference between Palaeoafricans and Yoruba is only 0.54% or so of the Papuan-Karitiana distance in this projection. But it is there, and it points to events in human prehistory not covered by the "standard model".

Discussion

I have long argued that Africa should not be viewed only as a source, but also as a destination of population movements. If Africa was only a source, then there would be absolutely no reason for two different African groups to differ from each other in the same way that two of the most distant (from Africa) groups do. No one can reasonably argue, I think, that Africans had the opportunity for any amount of gene flow with either Papuans or Amerindians.

I conjecture that the signal detected here is a legacy of a prehistoric episode of migration of Eurasians into Africa, which affected Yoruba more than it did Palaeoafricans. This population of Eurasians was slightly more similar to Karitiana than to Papuans. We will try to trace its origins next.

Using She instead of Karitiana

I repeat the previous experiment, but I use She, a far eastern ethnic group of China, instead of the Karitiana. Here are the PC1 co-ordinates in this projection:

Yoruba: 0.111, Palaeoafrican: 0.108, Papuan: -0.155, She: 0.248

Hence, Yoruba are shifted by 0.74% on the Papuan-She axis relative to Palaeoafricans.

Using Tuscans instead of She

Using Tuscans the PC1 co-ordinates are:

Yoruba: 0.169, Palaeoafrican: 0.164, Papuan: -0.144, Tuscan: 0.289

Hence, Yoruba are shifted by 1.15% on the Papuan-Tuscan axis relative to Palaeoafricans.

Using Onge instead of Tuscans

Finally, I substituted Onge from the Indian Ocean for Tuscans. This analysis is based on 112,041 SNPs, so it's not directly comparable with the previous ones. Nonetheless:

Yoruba: 0.075, Palaeoafrican: 0.07, Papuan: -0.15, Onge: 0.267

Hence, Yoruba are shifted by 1.2% on the Papuan-Onge axis relative to Palaeoafricans.
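The shift percentages quoted in the four comparisons can be reproduced from the PC1 means given in the text: each is the Yoruba-Palaeoafrican difference divided by the distance between the two populations that define the axis. A minimal sketch, using only the numbers reported above (the function name relative_shift is just a label for this illustration):

```python
def relative_shift(yoruba, palaeoafrican, papuan, other_pole):
    """Yoruba-Palaeoafrican gap as a percentage of the Papuan-to-other-pole distance."""
    return 100.0 * (yoruba - palaeoafrican) / (other_pole - papuan)

# (Yoruba, Palaeoafrican, Papuan, other pole) PC1 means as reported in the post.
comparisons = {
    "Karitiana": (-0.020, -0.022, -0.167, 0.206),
    "She": (0.111, 0.108, -0.155, 0.248),
    "Tuscan": (0.169, 0.164, -0.144, 0.289),
    "Onge": (0.075, 0.070, -0.150, 0.267),
}

for name, means in comparisons.items():
    print(f"Papuan-{name} axis: {relative_shift(*means):.2f}%")
```

In all four cases the sign is positive, i.e. Yoruba sit slightly further from the Papuan end of the axis than Palaeoafricans do.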

Conclusion

In a previous post on McEvoy et al. (2011) I speculated about a possible West Eurasian back-migration into Africa. The results presented here are compatible with that theory, but they are also compatible with a second Out-of-Africa movement; the latter, however, if it happened, did not only affect West Eurasians, but also East Asians and even Amerindians, at least relative to Papuans who may have been more isolated than the rest.

On balance, I prefer a scenario with back-migration:
  1. It is difficult to envision a second Out-of-Africa that reached Brazil in its spread but avoided Papua; moreover, there are no diagnostic uniparental markers of such an event.
  2. It is simpler to think of a movement of Y-haplogroup DE-bearing men over a short distance, from somewhere between the Indian Ocean (where the Andamanese live) and East Africa, which would introduce Eurasian-like genes into Sub-Saharan Africa.
UPDATE (May 16): See the interesting discussion on the problem of potential ascertainment bias in the comments. In short: the signal seems to persist in the She and Tuscan comparisons, but not in the Karitiana one. Perhaps this means that whatever event took place postdates the migration of Amerindians into the New World?

East Asian- and African-shift of West Asian/European populations

In my critique of Moorjani et al. (2011), I noticed how the authors projected West Eurasian samples on an African/East Eurasian axis and assumed that African-shift along that axis was due to the presence of African admixture.

I showed that, while some populations are shifted towards Africans, others are shifted towards Asians, so a projection along an African-East Asian axis is in reality a palimpsest of the two phenomena.

I argued that assessment of African admixture using a simple 2-population model that does not account for the "East Asian factor" leads to erroneous results. I then compared five methods of admixture estimation, showing that the Sub-Saharan admixture results of Moorjani et al. (2011) were higher than all the other methods, consistent with my hypothesis.

In the present post, I show how many different European/West Asian populations are shifted towards Africans/East Asians.

Principal Components Analysis

First, a PCA plot of just the West Eurasian populations; labels are mapped onto each population's average position:
Second, a PCA plot together with 25 Chinese and 25 Yoruba; Eurasians are separated from Africans along eigenvector 1, and East Eurasians from Eurafricans along eigenvector 2.
Third, a blowup of the West Eurasian portion of the above plot:
Fourth, a further blowup of the plot, excluding Chuvash to make it even clearer:
Finally, here is a spreadsheet with the average PC co-ordinates of the studied populations. The first two eigenvalues are 25.92 and 11.91.

With respect to the Asian- and African- shift of West Eurasian populations, I note that northern Europeans (and Basques) are less African-shifted than southern Europeans, and, at the same time they are more Asian-shifted: the 16 least Asian-shifted populations have a coastline in the Mediterranean (excluding the Portuguese), while the 16 least African-shifted populations do not (excluding the French).

Discussion

This analysis suggests the importance of choosing appropriate populations to represent Caucasoids in the global context. One often sees CEU used for that purpose. I see no major problem with that in general, as it is good for different studies to have a similar reference point, and CEU have been used for years for that purpose.

However, when dealing with the problem of admixture, this becomes an issue. CEU emerge as one of several populations with minimal African-shift, but are intermediate in terms of their Asian-shift. Sardinians, on the other hand, have minimal Asian-shift (by far), and are intermediate in terms of their African-shift.



If one were to choose a single population to serve as a Caucasoid pole according to a criterion of maximal differentiation, then Basques are the obvious candidate, as they are tied for 1st place in having the least African-shift, and are 2nd in having the least Asian-shift. Indeed, a K=3 ADMIXTURE analysis of this dataset demonstrates that they are in fact the population showing the maximal contribution of the Caucasoid-specific component.

The analysis presented here also demonstrates the relative value of different population isolates in ancestry analysis. The Chuvash, for example, are clearly not part of the genetic continuum of Europe, and neither are French Basques and Sardinians: all of these populations form very distinct clusters within the West Eurasian-specific context (first plot of this post). Nonetheless, their analysis within a global context demonstrates that they are distinct in different ways: Chuvash because of their substantial Asian-shift, Sardinians because of the substantial lack thereof.

In conclusion, it is a good idea not to employ a simple 2-way population mixture model to assess either African or Asian admixture in West Eurasians. Such a model may lead to erroneous results if it employs West Eurasians' shift on the African-East Asian axis.

May 13, 2011

Last Neandertals of Russia's North


I am reposting this, as I don't know whether my original post has survived the recent epic #BloggerFail. I'm not going to re-write my whole text, and, besides, John Hawks already has plenty on the subject.

In short: the evidence in this paper is not incompatible with the other recent paper that pushed the last Neandertals back in time due to advances in radiocarbon dating. It could very well be that Neandertals had been wiped out in most of Europe by the late 30,000s before present, but had managed to survive near the Arctic before modern humans got there.

That would not be very surprising, given that the periphery is where the last unassimilated survivors of demographic expansions are expected to be found. Today, for example, it is in northern Eurasia that one can find the few unassimilated tribes that have resisted the twin spreads out of the Fertile Crescent and the Yangtze River region that have largely shaped Eurasian demography over the last 10,000 years.


Science 13 May 2011:
Vol. 332 no. 6031 pp. 841-845
DOI: 10.1126/science.1203866

Late Mousterian Persistence near the Arctic Circle

Ludovic Slimak et al.

ABSTRACT

Palaeolithic sites in Russian high latitudes have been considered as Upper Palaeolithic and thus representing an Arctic expansion of modern humans. Here we show that at Byzovaya, in the western foothills of the Polar Urals, the technological structure of the lithic assemblage makes it directly comparable with Mousterian Middle Palaeolithic industries that so far have been exclusively attributed to the Neandertal populations in Europe. Radiocarbon and optical-stimulated luminescence dates on bones and sand grains indicate that the site was occupied during a short period around 28,500 carbon-14 years before the present (about 31,000 to 34,000 calendar years ago), at the time when only Upper Palaeolithic cultures occupied lower latitudes of Eurasia. Byzovaya may thus represent a late northern refuge for Neandertals, about 1000 km north of earlier known Mousterian sites.

Link

May 12, 2011

The origin of Sorbs

An important caveat from the paper:
One caution regarding our results is that the geographical origins of our reference populations are crudely characterized only by country and thus may not be random samples. If many of the Germans in the POPRES data are western German samples, this may inflate the apparent differences we observe between Germans and Sorbs. The LPZ Germans contained two individuals from Eastern Germany who do appear closer to the Sorbs, suggesting that population structure within countries is a valid concern. Certainly, a tighter and denser sampling of German, Polish and Czech individuals from regions surrounding the Sorbian territories would be ideal for confirming or refuting the results found in this study.
Another important point:
The MAF spectra (Supplementary Figure 6), although highly distorted because of SNP ascertainment, also show the Sardinians and Basques to have a noticeable excess of monomorphic SNPs. This excess suggests that some SNPs that are polymorphic in Europe may have been driven to extinction/fixation at a higher rate or never existed at all in these populations, consistent with genetic isolation.
It is clear that at least part of this phenomenon is explained by the fact that some polymorphic SNPs in non-Basque and non-Sardinian Europeans are due to Eurasian influences that these two populations lack. In small isolated populations, alleles may be driven to fixation, while larger populations may continue to exhibit polymorphism at the same loci. However, that phenomenon would not preferentially shift either Sardinians or French Basques away from East Asians. Hence, rather than think that polymorphism was lost in them, we can think that polymorphism was added to non-isolated Europeans due to phenomena along the east-west Eurasian axis.

European Journal of Human Genetics
advance online publication 11 May 2011; doi: 10.1038/ejhg.2011.65

Genetic variation in the Sorbs of eastern Germany in the context of broader European genetic diversity

Krishna R Veeramah et al.

Abstract

Population isolates have long been of interest to genetic epidemiologists because of their potential to increase power to detect disease-causing genetic variants. The Sorbs of Germany are considered as cultural and linguistic isolates and have recently been the focus of disease association mapping efforts. They are thought to have settled in their present location in eastern Germany after a westward migration from a largely Slavic-speaking territory during the Middle Ages. To examine Sorbian genetic diversity within the context of other European populations, we analyzed genotype data for over 30 000 autosomal single-nucleotide polymorphisms from over 200 Sorbs individuals. We compare the Sorbs with other European individuals, including samples from population isolates. Despite their geographical proximity to German speakers, the Sorbs showed greatest genetic similarity to Polish and Czech individuals, consistent with the linguistic proximity of Sorbian to other West Slavic languages. The Sorbs also showed evidence of subtle levels of genetic isolation in comparison with samples from non-isolated European populations. The level of genetic isolation was less than that observed for the Sardinians and French Basque, who were clear outliers on multiple measures of isolation. The finding of the Sorbs as only a minor genetic isolate demonstrates the need to genetically characterize putative population isolates, as they possess a wide range of levels of isolation because of their different demographic histories.

Link

May 10, 2011

The case for Euphratic

Gordon Whittaker presents an interesting theory about the presence of a pre-Sumerian Indo-European language, Euphratic, in Mesopotamia. As always, I don't comment on linguistic theories, but I've often thought that the early Indo-Europeans of the Near East may have been swamped by people from their periphery (Arabia, the Caucasus, and Central Asia), so the theory is not that outlandish to my ears. One usually thinks of the Indo-Europeans as doing the language replacement, and so forgets that they were often the victims of language replacement as well (e.g., to Turkic in large parts of Eurasia, and to Semitic languages in the Near East).

Sumerian appears seemingly out of thin air in Mesopotamia; that is peculiar if it was the aboriginal language of a highly productive (and populous) Neolithic group. The same can be said about Semitic languages, which make their appearance in Mesopotamia soon after in the form of Akkadian. There is also the example of the Kassites, speakers of another language isolate, who attacked Mesopotamia (from Iran?) but were not as successful, nor were the Hurrians who spread their influence into northern Mesopotamia. I also like the author's suggestion that the picture of a pristine Sumerian-speaking aboriginal population contradicts all our historical knowledge about the region, where different groups have lived side by side throughout recorded history.

It is difficult to detect a written substratum in Sumerian, as Sumerians were the first literate civilization in the world, and so we are unlikely to ever find written records of the putative Euphratic language. If such a language did exist, we can only search for it, much as Whittaker has done, as a substratum in the languages that followed it in Mesopotamia.

Bulletin of the Georgian National Academy of Sciences, vol. 2, no. 3, 2008

The Case for Euphratic

Gordon Whittaker
University of Göttingen, Germany

ABSTRACT. It will be argued that the cuneiform writing system, the Sumerian and Akkadian lexicon, and the place names of Southern Mesopotamia preserve traces of an early Indo-European language, indeed the earliest by more than a millennium. Furthermore, this evidence is detailed and consistent enough to reconstruct a number of features of the proposed Indo-European language, Euphratic, and to sketch an outline of Euphratean cultural patterns.

Link (pdf)

Neandertals just got older (redating of Mezmaiskaya cave)

Nicholas Wade explains:
Reviewing other Neanderthal dates ascertained with the new ultrafiltration method, Dr. Higham sees an emerging pattern that no European Neanderthal site can reliably be dated to less than 39,000 years ago. “It’s only with reliable techniques that we can interpret the archaeological past,” he said.

He is re-dating Neanderthal sites across Europe and so far sees no evidence for any extensive overlap between Neanderthals and modern humans. “There was a degree of contemporaneity, but it may not have been very long,” he said. A short period of contact would point to the extinction of the Neanderthals at the hands of modern humans.

“It’s very unlikely for Neanderthals to go extinct without some agency from modern humans,” Dr. Higham said.
That "agency" could be, of course, that they killed 'em all, or that they bred with them all, absorbing their genes, which would have been a drop in the much larger expanding sapiens population. The latter possibility is difficult to reconcile with the fact that Europeans are not more Neandertal-like genomically than east Eurasians, so they do not seem to have acquired a dose of Neandertal genes all their own.

PNAS doi: 10.1073/pnas.1018938108

Revised age of late Neanderthal occupation and the end of the Middle Paleolithic in the northern Caucasus

Ron Pinhasi et al.

Advances in direct radiocarbon dating of Neanderthal and anatomically modern human (AMH) fossils and the development of archaeostratigraphic chronologies now allow refined regional models for Neanderthal–AMH coexistence. In addition, they allow us to explore the issue of late Neanderthal survival in regions of Western Eurasia located within early routes of AMH expansion such as the Caucasus. Here we report the direct radiocarbon (14C) dating of a late Neanderthal specimen from a Late Middle Paleolithic (LMP) layer in Mezmaiskaya Cave, northern Caucasus. Additionally, we provide a more accurate chronology for the timing of Neanderthal extinction in the region through a robust series of 16 ultrafiltered bone collagen radiocarbon dates from LMP layers and using Bayesian modeling to produce a boundary probability distribution function corresponding to the end of the LMP at Mezmaiskaya. The direct date of the fossil (39,700 ± 1,100 14C BP) is in good agreement with the probability distribution function, indicating at a high level of probability that Neanderthals did not survive at Mezmaiskaya Cave after 39 ka cal BP ("calendrical" age in kiloannum before present, based on IntCal09 calibration curve). This challenges previous claims for late Neanderthal survival in the northern Caucasus. We see striking and largely synchronous chronometric similarities between the Bayesian age modeling for the end of the LMP at Mezmaiskaya and chronometric data from Ortvale Klde for the end of the LMP in the southern Caucasus. Our results confirm the lack of reliably dated Neanderthal fossils younger than ∼40 ka cal BP in any other region of Western Eurasia, including the Caucasus.

Link

May 09, 2011

Pygmy and non-Pygmy height

Stature is a highly heritable trait, yet association studies searching for its causative genes have generally come up short on results. A couple of years ago, I posted an article that showed that Victorian 19th century height prediction methods trumped modern genomic ones: estimating one's height by taking one's parents' average explains an order of magnitude more variation in height than the genomic method.

This article shows that admixture between Pygmies and non-Pygmies is the major contributor of height variation in Pygmy groups. Note that this is a clear case where group differences in randomly chosen loci are correlated with a phenotypic trait. It was once hoped that the notion of group differences was superfluous: we would learn everything there was to know about individual-level phenotypic variation by examining individual-level genetic and environmental variation directly. According to this paper, the correlation between admixture estimates and height for the entire sample of males/females was 0.44/0.52 respectively, so roughly a fifth to a quarter of the variance in the trait (r squared of about 0.19 and 0.27) can be explained by admixture.
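The step from a correlation to a share of variance explained is just squaring the correlation coefficient. A small sketch using the two correlations quoted from the paper (the dictionary keys are only labels for this illustration):

```python
# Correlations between admixture estimate and height, as quoted above.
correlations = {"males": 0.44, "females": 0.52}

# For a simple linear relationship, the share of variance explained is r squared.
explained = {sex: r ** 2 for sex, r in correlations.items()}

for sex, r_squared in explained.items():
    print(f"{sex}: r = {correlations[sex]:.2f}, variance explained = {100 * r_squared:.0f}%")
```

This assumes a simple linear relationship between the admixture estimate and height, which is the usual reading of a Pearson correlation; the paper itself may model the relationship differently.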

And, here's the interesting point: this was all done with 28 microsatellites. Why do 28 microsatellites trump hundreds of thousands of SNPs? Because different groups of mankind are not the same in their genotypic propensity to manifest specific phenotypes, and admixture proportions (the pretty colors in structure runs) do correlate with measurable physical properties.

American Journal of Physical Anthropology
DOI: 10.1002/ajpa.21512

Indirect evidence for the genetic determination of short stature in African Pygmies

Noémie S.A. Becker

Abstract

Central African Pygmy populations are known to be the shortest human populations worldwide. Many evolutionary hypotheses have been proposed to explain this short stature: adaptation to food limitations, climate, forest density, or high mortality rates. However, such hypotheses are difficult to test given the lack of long-term surveys and demographic data. Whether the short stature observed nowadays in African Pygmy populations as compared to their Non-Pygmy neighbors is determined by genetic factors remains widely unknown. Here, we study a uniquely large new anthropometrical dataset comprising more than 1,000 individuals from 10 Central African Pygmy and neighboring Non-Pygmy populations, categorized as such based on cultural criteria rather than height. We show that climate, or forest density may not play a major role in the difference in adult stature between existing Pygmies and Non-Pygmies, without ruling out the hypothesis that such factors played an important evolutionary role in the past. Furthermore, we analyzed the relationship between stature and neutral genetic variation in a subset of 213 individuals and found that the Pygmy individuals' stature was significantly positively correlated with levels of genetic similarity with the Non-Pygmy gene-pool for both men and women. Overall, we show that a Pygmy individual exhibiting a high level of genetic admixture with the neighboring Non-Pygmies is likely to be taller. These results show for the first time that the major morphological difference in stature found between Central African Pygmy and Non-Pygmy populations is likely determined by genetic factors.

Link

May 08, 2011

On the northern/southern Caucasoid contributions to Asia

I project a great number of Siberian, Central Asian, and South Asian populations on the first two principal components created by Han, West Asians, and Northern Europeans.

PC1 captures east-west variation across Eurasia, although the Han are also related to Ancestral South Asians, a major component in the ancestry of South Asians. PC2 captures West Asian-North European variation, so it is quite useful to extract the relative northern vs. southern Caucasoid elements in the populations examined.

Here are the first two PCs with the populations used to create them. Northeastern European (N=49) includes Lithuanians, Belorussians, Russians, Poles, and various non-Balkan Slavs. Northwestern European (N=46) includes Germans, Irish, Norwegians, and various continental Germanics. West Asian (N=93) includes Armenians, Iranians, Adygei, Lezgins, and Georgians.

Population labels are always placed on population averages. Notice that the Han form a tight cluster, halfway (along PC2) between West Asians and Northeast Europeans; this is expected as they are an outgroup that has not been significantly affected by Caucasoids.

We will now project various populations onto the previous 2-D map: their horizontal position (along PC1) depends on the extent of Caucasoid admixture, while their vertical position (along PC2) depends on whether this admixture is more northern or southern Caucasoid.

UPDATE (May 9):

I have also carried out supervised ADMIXTURE analysis, using the dataset of this post, adding Onge from the Indian Ocean as a fourth ancestral group together with Han, Northern Europeans, and West Asians.
The results seem consistent with the PCA projection, while the distinctiveness of the East Asian (dark blue) and Ancestral South Indian (light blue) components emerges.