Dienekes’ Anthropology Blog: Tibeto-Burman

January 26, 2016

History of extant populations of India

The five components they speak of are ANI, ASI, AAA (Ancestral Austro-Asiatic), ATB (Ancestral Tibeto-Burman), and a distinct fifth ancestry in the Andaman archipelago.

The differentiation of the four main components seems clear enough on the figure (left). The big question is how and in what order the different components got into India. I would wager that ASI was first and I modify my New Year's wish to ask for some ancient DNA from India too.

An interesting bit from the paper:

...that the practice of endogamy was established almost simultaneously, possibly by decree of the rulers, in upper-caste populations of all geographical regions, about 70 generations before present, probably during the reign (319–550 CE) of the ardent Hindu Gupta rulers

How plausible is that to anyone familiar with Indian history?
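
As a rough check on the dating, the conversion from generations to calendar years can be sketched as below. The generation lengths (22.5, 25 and 28 years) and the choice of 2016 as "present" are my own assumptions for illustration, not values taken from the paper.

```python
# Convert "generations before present" into an approximate calendar year.
# ASSUMPTIONS (not from the paper): the generation lengths tried below
# and taking "present" to be 2016 CE.

GUPTA_REIGN = (319, 550)  # CE, the period quoted from the paper


def generations_to_year_ce(generations, years_per_generation, present_year=2016):
    """Return the approximate calendar year (CE) that many generations ago."""
    return present_year - generations * years_per_generation


def falls_in_gupta_reign(year_ce):
    """True if the year lies inside the quoted Gupta period."""
    start, end = GUPTA_REIGN
    return start <= year_ce <= end


if __name__ == "__main__":
    for gen_len in (22.5, 25.0, 28.0):
        year = generations_to_year_ce(70, gen_len)
        print(f"{gen_len} years/generation -> about {year:.0f} CE, "
              f"within Gupta reign: {falls_in_gupta_reign(year)}")
```

Under these assumptions, 70 generations lands inside the Gupta period only for the shortest generation length, so the proposed match is sensitive to the generation time one assumes.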

PNAS doi: 10.1073/pnas.1513197113

Genomic reconstruction of the history of extant populations of India reveals five distinct ancestral components and a complex structure

Analabha Basu, Neeta Sarkar-Roy, and Partha P. Majumder

India, occupying the center stage of Paleolithic and Neolithic migrations, has been underrepresented in genome-wide studies of variation. Systematic analysis of genome-wide data, using multiple robust statistical methods, on (i) 367 unrelated individuals drawn from 18 mainland and 2 island (Andaman and Nicobar Islands) populations selected to represent geographic, linguistic, and ethnic diversities, and (ii) individuals from populations represented in the Human Genome Diversity Panel (HGDP), reveal four major ancestries in mainland India. This contrasts with an earlier inference of two ancestries based on limited population sampling. A distinct ancestry of the populations of Andaman archipelago was identified and found to be coancestral to Oceanic populations. Analysis of ancestral haplotype blocks revealed that extant mainland populations (i) admixed widely irrespective of ancestry, although admixtures between populations was not always symmetric, and (ii) this practice was rapidly replaced by endogamy about 70 generations ago, among upper castes and Indo-European speakers predominantly. This estimated time coincides with the historical period of formulation and adoption of sociocultural norms restricting intermarriage in large social strata. A similar replacement observed among tribal populations was temporally less uniform.

September 12, 2011

Y chromosomes from south to north in East Eurasia

I am continuing my Y-STR boycott, but I have to note that the paper uses the "evolutionary rate", and hence its age estimates are wrong. A great range of ages can be supported when one uses Y-STRs, because of their poor qualities, but, as a first step the 19ky estimate should be downsized to about 6ky.
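
The downsizing is just a rescaling: for a fixed amount of observed Y-STR variance, the inferred age is inversely proportional to the mutation rate one assumes. A minimal sketch, assuming an "evolutionary" rate of about 6.9e-4 per locus per generation and a germline (pedigree) rate of about 2.1e-3 per locus per generation, with the same generation length in both cases; these figures are my assumptions here, not values taken from the paper.

```python
# Rescale a Y-STR age estimate from one assumed mutation rate to another.
# ASSUMPTIONS (not from the paper): the two rates below, and an identical
# generation length for both, so that only the rate ratio matters.

EVOLUTIONARY_RATE = 6.9e-4  # per locus per generation (assumed)
GERMLINE_RATE = 2.1e-3      # per locus per generation (assumed)


def rescale_age(age_ky, rate_used, rate_preferred):
    """Age is proportional to variance / rate, so scale by the rate ratio."""
    return age_ky * rate_used / rate_preferred


if __name__ == "__main__":
    rescaled = rescale_age(19.0, EVOLUTIONARY_RATE, GERMLINE_RATE)
    print(f"19 ky at the evolutionary rate is roughly {rescaled:.1f} ky "
          f"at the germline rate")
```

With these assumed rates the ratio is close to one third, which is where the figure of about 6ky comes from.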

I know next to nothing about Tibeto-Burman languages, but apparently the age estimate for Proto-Han and Proto-Tibeto-Burman unity is... 6ky. I would wager that we have here one more piece of evidence for Y-chromosome-language correlation, rather than for events of glacial antiquity.

Table 1 contains the haplogroup frequencies:

The clustering analysis is instructive:

Indo-Aryan/Dravidian are the clear outgroup, due to their possessing a set of West Eurasian/South Asian haplogroups largely lacking in Southeast Asia. Altaic and Austronesian are slightly closer to the main group, probably due to a different set of haplogroups. Not surprisingly, Austroasiatic speakers who are native to Southeast Asia are closer to Sino-Tibetans, while the latter category emerges naturally, joining Han with Tibeto-Burman.

PLoS ONE 6(8): e24282. doi:10.1371/journal.pone.0024282

Human Migration through Bottlenecks from Southeast Asia into East Asia during Last Glacial Maximum Revealed by Y Chromosomes

Xiaoyun Cai et al.

Molecular anthropological studies of the populations in and around East Asia have resulted in the discovery that most of the Y-chromosome lineages of East Asians came from Southeast Asia. However, very few Southeast Asian populations had been investigated, and therefore, little was known about the purported migrations from Southeast Asia into East Asia and their roles in shaping the genetic structure of East Asian populations. Here, we present the Y-chromosome data from 1,652 individuals belonging to 47 Mon-Khmer (MK) and Hmong-Mien (HM) speaking populations that are distributed primarily across Southeast Asia and extend into East Asia. Haplogroup O3a3b-M7, which appears mainly in MK and HM, indicates a strong tie between the two groups. The short tandem repeat network of O3a3b-M7 displayed a hierarchical expansion structure (annual ring shape), with MK haplotypes being located at the original point, and the HM and the Tibeto-Burman haplotypes distributed further away from core of the network. Moreover, the East Asian dominant haplogroup O3a3c1-M117 shows a network structure similar to that of O3a3b-M7. These patterns indicate an early unidirectional diffusion from Southeast Asia into East Asia, which might have resulted from the genetic drift of East Asian ancestors carrying these two haplogroups through many small bottle-necks formed by the complicated landscape between Southeast Asia and East Asia. The ages of O3a3b-M7 and O3a3c1-M117 were estimated to be approximately 19 thousand years, followed by the emergence of the ancestors of HM lineages out of MK and the unidirectional northward migrations into East Asia.

March 02, 2011

Origin of Tibetans (Wang et al. 2011)

The correspondence of language-ethnic affiliation with genomic data is quite striking as can be seen in the neighbor-joining tree (bottom). From the paper:

The migration routes of the Chinese population as a single group have been outlined based on Y chromosome haplotype distributions. After the ancestors of Sino-Tibetans reached the upper and middle Yellow River basin, they divided into two subgroups: Proto-Tibeto-Burman and Proto-Chinese [2]. These two subgroups were similar to the two ancestral components of EA populations at K = 2 (Figure S1B). The ancestral component which was dominant in Tibetan and Yi arose from the Proto-Tibeto-Burman subgroup, which marched on to south-west China and later, through one of its branches, became the ancestor of modern Tibetans. Proto-Tibeto-Burmans also spread over the Hengduan Mountains where the Yi have lived for hundreds of generations [28]. Taking the optimal living condition and the easiest migration route into account, we favor the single-route hypothesis; it is more likely that their migration into the Tibetan Plateau through the Hengduan Mountain valleys occurred after Tibetan ancestors separated from the other Proto-Tibeto-Burman groups and diverged to form the modern Tibetan population.

I recently uncovered a genetic component specific to Altaic populations, and this paper shows that within the Sino-Tibetan group, a component centered on Tibeto-Burmans can be uncovered as well. Genomics is already contributing greatly to our understanding of ancient languages, and it is perhaps time for geneticists to use their tools in order to date the breakup of these language groups, providing important new data on which to base linguistic theories.

PLoS ONE 6(2): e17002. doi:10.1371/journal.pone.0017002

On the Origin of Tibetans and Their Genetic Basis in Adapting High-Altitude Environments

Binbin Wang et al.

Since their arrival in the Tibetan Plateau during the Neolithic Age, Tibetans have been well-adapted to extreme environmental conditions and possess genetic variation that reflect their living environment and migratory history. To investigate the origin of Tibetans and the genetic basis of adaptation in a rigorous environment, we genotyped 30 Tibetan individuals with more than one million SNP markers. Our findings suggested that Tibetans, together with the Yi people, were descendants of Tibeto-Burmans who diverged from ancient settlers of East Asia. The valleys of the Hengduan Mountain range may be a major migration route. We also identified a set of positively-selected genes that belong to functional classes of the embryonic, female gonad, and blood vessel developments, as well as response to hypoxia. Most of these genes were highly correlated with population-specific and beneficial phenotypes, such as high infant survival rate and the absence of chronic mountain sickness.

July 11, 2010

mtDNA of Tibet

American Journal of Physical Anthropology doi:10.1002/ajpa.21350

A mitochondrial revelation of early human migrations to the Tibetan Plateau before and after the last glacial maximum

Zhendong Qin et al.

ABSTRACT

As the highest plateau surrounded by towering mountain ranges, the Tibetan Plateau was once considered to be one of the last populated areas of modern humans. However, this view has been tremendously changed by archeological, linguistic, and genetic findings in the past 60 years. Nevertheless, the timing and routes of entry of modern humans into the Tibetan Plateau is still unclear. To make these problems clear, we carried out high-resolution mitochondrial-DNA (mtDNA) analyses on 562 Tibeto-Burman inhabitants from nine different regions across the plateau. By examining the mtDNA haplogroup distributions and their principal components, we demonstrated that maternal diversity on the plateau reflects mostly a northern East Asian ancestry. Furthermore, phylogeographic analysis of plateau-specific sublineages based on 31 complete mtDNA sequences revealed two primary components: pre-last glacial maximum (LGM) inhabitants and post-LGM immigrants. Also, the analysis of one major pre-LGM sublineage A10 showed a strong signal of post-LGM population expansion (about 15,000 years ago) and greater diversity in the southern part of the Tibetan Plateau, indicating the southern plateau as a refuge place when climate dramatically changed during LGM.

April 05, 2007

Y chromosomes across the Himalayas

American Journal of Human Genetics (online early)

The Himalayas as a Directional Barrier to Gene Flow

Tenzin Gayden et al.

High-resolution Y-chromosome haplogroup analyses coupled with Y–short tandem repeat (STR) haplotypes were used to (1) investigate the genetic affinities of three populations from Nepal—including Newar, Tamang, and people from cosmopolitan Kathmandu (referred to as "Kathmandu" subsequently)—as well as a collection from Tibet and (2) evaluate whether the Himalayan mountain range represents a geographic barrier for gene flow between the Tibetan plateau and the South Asian subcontinent. The results suggest that the Tibetans and Nepalese are in part descendants of Tibeto-Burman–speaking groups originating from Northeast Asia. All four populations are represented predominantly by haplogroup O3a5-M134–derived chromosomes, whose Y-STR–based age (±SE) was estimated at 8.1 ± 2.9 thousand years ago (KYA), more recent than its Southeast Asian counterpart. The most pronounced difference between the two regions is reflected in the opposing high-frequency distributions of haplogroups D in Tibet and R in Nepal. With the exception of Tamang, both Newar and Kathmandu exhibit considerable similarities to the Indian Y-haplogroup distribution, particularly in their haplogroup R and H composition. These results indicate gene flow from the Indian subcontinent and, in the case of haplogroup R, from Eurasia as well, a conclusion that is also supported by the admixture analysis. In contrast, whereas haplogroup D is completely absent in Nepal, it accounts for 50.6% of the Tibetan Y-chromosome gene pool. Coalescent analyses suggest that the expansion of haplogroup D derivatives—namely, D1-M15 and D3-P47 in Tibet—involved two different demographic events (5.1 ± 1.8 and 11.3 ± 3.7 KYA, respectively) that are more recent than those of D2-M55 representatives common in Japan. Low frequencies, relative to Nepal, of haplogroup J and R lineages in Tibet are also consistent with restricted gene flow from the subcontinent. Yet the presence of haplogroup O3a5-M134 representatives in Nepal indicates that the Himalayas have been permeable to dispersals from the east. These genetic patterns suggest that this cordillera has been a biased bidirectional barrier.

January 13, 2006

Sahoo et al. (2006) online (Indian Y chromosome variation)

The new India Y-chromosome paper that I talked about in my previous blog entry is now online at the PNAS site.

Interestingly Sanghamitra Sahoo seems to have published a paper on the same topic only two months after Sanghamitra Sengupta did.

UPDATE

It is unfortunate that this paper uses a limited number of UEP markers. Hopefully, future studies will start to seek and test more recently derived markers, which are the only ones that can really address recent events authoritatively.

Moreover, no STR markers were typed, thus further limiting any possible inferences about the time depth of the various Indian lineages.

A real problem with the study is that it performed an "admixture analysis" which considered the modern Central Asians as representative of the prehistoric ones. As is well known, Central Asians of today have substantial Mongoloid admixture from the proto-historical and historical period and are not representative of the ancient Indo-Iranian groups of the steppe.

In any case, the observations of the authors about the distribution of haplogroups in India are broadly similar to those of the other recent study, and we have to agree that the wholesale assignment of J/R/L Y chromosomes to a recent invasion cannot really be sustained.

From the paper's conclusions:

It is not necessary, based on the current evidence, to look beyond South Asia for the origins of the paternal heritage of the majority of Indians at the time of the onset of settled agriculture. The perennial concept of people, language, and agriculture arriving to India together through the northwest corridor does not hold up to close scrutiny. Recent claims for a linkage of haplogroups J2, L, R1a, and R2 with a contemporaneous origin for the majority of the Indian castes’ paternal lineages from outside the subcontinent are rejected, although our findings do support a local origin of haplogroups F* and H. Of the others, only J2 indicates an unambiguous recent external contribution, from West Asia rather than Central Asia. The current distributions of haplogroup frequencies are, with the exception of the O lineages, predominantly driven by geographical, rather than cultural determinants. Ironically, it is in the northeast of India, among the TB groups that there is clear-cut evidence for large-scale demic diffusion traceable by genes, culture, and language, but apparently not by agriculture.

This certainly seems reasonable. J2 is largely restricted to the upper castes in India, and its young age is very suggestive of an external arrival in Neolithic and later times. It is certainly beginning to stand out as the most important exogenous genetic component in the Indian population.

The conclusion that J2 arrived in India from West and not Central Asia is not well-founded, because in the Middle East J*(xJ2) is frequent, but completely lacking in India. But, perhaps, most of Middle Eastern J*(xJ2) expanded recently, with the growth of the Semitic groups and was not present in the parental population from which Indian J2 is derived. Central Asia cannot be rejected so easily though as a source for Indian J2, because the present-day Central Asians have components (within N/C/O/Q) which were probably added by Mongoloid groups recently, and are not representative of the prehistoric populations.

The next step should be to develop informative recent markers in haplogroups J2a and R1a1 to finally establish whether some of these can be unambiguously related to particular Western Eurasian populations. This might be the decisive step to conclude whether Renfrew's hypothesis A (arrival of IE languages to India with early farmers) or hypothesis B (arrival of IE languages to India with IE-ized pastoral nomads) is the correct one.

PNAS (online early)

A prehistory of Indian Y chromosomes: Evaluating demic diffusion scenarios

Sanghamitra Sahoo et al.

Understanding the genetic origins and demographic history of Indian populations is important both for questions concerning the early settlement of Eurasia and more recent events, including the appearance of Indo-Aryan languages and settled agriculture in the subcontinent. Although there is general agreement that Indian caste and tribal populations share a common late Pleistocene maternal ancestry in India, some studies of the Y-chromosome markers have suggested a recent, substantial incursion from Central or West Eurasia. To investigate the origin of paternal lineages of Indian populations, 936 Y chromosomes, representing 32 tribal and 45 caste groups from all four major linguistic groups of India, were analyzed for 38 single-nucleotide polymorphic markers. Phylogeography of the major Y-chromosomal haplogroups in India, genetic distance, and admixture analyses all indicate that the recent external contribution to Dravidian- and Hindi-speaking caste groups has been low. The sharing of some Y-chromosomal haplogroups between Indian and Central Asian populations is most parsimoniously explained by a deep, common ancestry between the two regions, with diffusion of some Indian-specific lineages northward. The Y-chromosomal data consistently suggest a largely South Asian origin for Indian caste communities and therefore argue against any major influx, from regions north and west of India, of people associated either with the development of agriculture or the spread of the Indo-Aryan language family. The dyadic Y-chromosome composition of Tibeto-Burman speakers of India, however, can be attributed to a recent demographic process, which appears to have absorbed and overlain populations who previously spoke Austro-Asiatic languages.

November 24, 2005

New paper on Indian Y-chromosome variation

A new paper on Y-chromosome variation in India has become available as an unedited preprint on the AJHG site. This is a huge study which covered linguistic/caste groups from the entire country and used 69 binary markers and 10 microsatellites to create a very thorough sampling of Indian Y-chromosomal variation. It will take some time to digest all the new information, plus the supplemental materials of the paper that remain to be put online. I will blog more about this soon. In bullet form, some findings of the paper which caught my attention:

R1a1's molecular variance is highest in NW India and its age is substantial
R1a1's variance is high in tribals
The phylogeny of J2 has been refined and it is now split into two newly discovered clades, called J2a and J2b.
J2 is almost entirely absent from tribals, and its frequency declines from upper castes to middle castes to lower castes.

UPDATE:

The samples:

High-resolution assessment of Y-chromosome binary haplogroup composition was conducted on 728 Indian samples representing 36 populations, including 17 tribal populations, from six geographic regions and different social and linguistic categories. They comprise (Austro-Asiatic) Ho, Lodha, Santal, (Tibeto-Burman) Chakma, Jamatia, Mog, Mizo, Tripuri, (Dravidian) Irula, Koya Dora, Kamar, Kota, Konda Reddy, Kurumba, Muria, Toda (Indo-European) Halba. The 18 castes include (Dravidian) Iyer, Iyengar, Ambalakarar, Vanniyar, Vellalar, Pallan and (Indo-European) Koknasth Brahmin, Uttar Pradhesh Brahmin, West Bengal Brahmin, Rajput, Agharia, Gaud, Mahishya, Maratha, Bagdi, Chamar, Nav Buddha, Tanti. With exception of the Koya Dora and Konda Reddy groups, these samples have been previously described (Basu et al. 2003).

J2 is divided into two main clades: J2a*-M410 and J2b*-M12:

New phylogenetic resolution has been achieved within the J2-M172 clade with the discovery of the M410 nucleotide A to G substitution (Table 2). Now all J2-M172 derived lineages can be assigned to one of two sister clades, namely J2a*-M410 and J2b*-M12, necessitating an updated revision of the previous “haplogroup by lineage” YCC nomenclature for J2 (Jobling et al. 2003). The J2*-M172 phylogenetic revisions are presented in supplemental data A5. We include the DYS413≤18 allele repeat node in the phylogeny as suggested by Di Giacomo et al. (2004). It is notable that no J2*-M172 haplogroup lacking both M410 and M12 derived alleles has yet been observed. The DYS413 locus was typed in M410 derived samples from India, Pakistan and Turkey. The vast majority displayed the ≤18 allele repeat, although 16/118 in Turkey had alleles ≥19, as did 5/17 in Pakistan and 5/28 in India, 4 of which were restricted to the Dravidian-speaking Iyengar and Iyer upper castes.

5 New Clades in haplogroups C, L, Q, and I:

We report 5 new clades that improve the haplogroup topology within the Y-chromosome genealogy. The new subclade C5-M356, accounts for 85% of the former C* haplogroups. While its overall frequency is only 1.4% in the Indian sample, it occurs in all linguistic groups, and in both tribes and castes. It also occurs in 1 Dravidian Brahui in Pakistan (Table 3). The new L3-M357 subclade which accounts for 86% of L-M20(xL1xL2) chromosomes in Pakistan; but occurs sporadically (3/728) in India. All Indian haplogroup Q representatives belong to the new M346-subclade. This new Q clade will aid in future studies attempting to narrow the candidate Asian/Siberian precursors of Native American chromosomes. The G5-M377 substitution is independent of G1-M285 and G2-P15 subclades (Cinnioglu et al. 2004) and occurs in Pakistan. The M379 polymorphism defines the I1c2 subclade, that occurs only in our Pakistani data.

Indigenous Indian haplogroups:

On the basis of the combined phylogeographic distributions of haplotypes observed among populations defined by social and linguistic criteria, candidate haplogroups that most plausibly arose in situ within the boundaries of present day India include C5-M356, F*-M89, H*-M69 (and its sub-clades H1-M52 and H2-APT), R2-M124 and L1-M76. The congruent geographic distribution of H*-M69 and potentially paraphyletic F*-M89 Y-chromosomes in India suggests that they might share a common demographic history.

R1a1 and R2:

The widespread geographic distribution of haplogroup R1a1-M17 across Eurasia and the current absence of informative subdivisions defined by binary markers leave its geographic origin uncertain. However the contour map of R1a1-M17 variance shows the highest variance in the northwest region of India (Figure 3).

...

In haplogroups R1a1 and R2 the associated mean microsatellite variance is highest in tribes (Table 8), not castes. This is a clear contradiction to what would be expected from an explanation involving a model of recent occasional admixture.

...

Specifically, they could have actually arrived in southern India from southwest Asian source region multiple times with some episodes being considerably earlier than others. Considerable archeological evidence exists regarding the presence of Mesolithic peoples in India (Kennedy 2000), some of whom could have entered the subcontinent from the northwest during the late Pleistocene period. The high variance of R1a1 in India (Table 8), the spatial frequency distribution of R1a1 microsatellite variance (Figure 3) clines and expansion time (Table 7) support this view.

Clustering of R1a1 haplotypes:

The ages of the Y-microsatellite variation (Table 7) for R1a1 and R2 in India suggest that the pre-historical context of these haplogroups will likely be complex. A PC plot of R1a1-M17 Y-microsatellite data (Figure 4) shows several interesting features: (a) one tight population cluster comprising S. Pakistan, Turkey, Greece, Oman and West Europe, (b) one loose cluster comprising all the Indian tribal and caste populations, with the tribal populations occupying an edge of this cluster, and (c) Central Asia and Turkey occupy intermediate positions. The upper and lower bounds of the divergence time between the two clusters is 12 kya and 8 kya, respectively. The pattern of clustering does not support the model that the primary source of the R1a1-M17 chromosomes in India was Central Asia or the Indus valley via Indo-European speakers.

The spread of J2a:

Figure 2 demonstrates the eastward expansion of J2a-M410 to Iraq, Iran and Central Asia coincident with painted pottery and ceramic figurines, well documented in the Neolithic archeological record (Cauvin 2000). Near the Indus valley, the Neolithic site of Mehrgarh beginning around 5000 BCE (Kenoyer 1998) displays the presence of these types of material culture correlated with the spread J2a-M410 in Pakistan. While the association of agriculture with J2a-M410 is recognized, it is not necessarily the only explanation for its history. Despite an apparent exogenous frequency spread pattern of hg J2a towards North and Central India from the west (Figure 2), it is premature to attribute it to a simplistic demic expansion of early agriculturalists and pastoralists from the Middle East. It reflects the overall net process of spread that may contain numerous as yet unrevealed movements embedded within the general pattern. It may also reflect a combination of elements of earlier prehistoric Holocene epi-paleolithic peoples from the Middle East, subsequent Bronze Age Harappans of uncertain provenance and succeeding Iron Age Indo-Aryans from Central Asia (Kennedy 2000). Further, the relative position of the Indian tribals (Fig. 4), the high microsatellite variance among them (Table 8), the estimated age (14 kya) of microsatellite variation within R1a1 (Table 7) and the variance peak in the west (Fig. 3) are entirely inconsistent with a model of recent gene flow from castes to tribes and a large genetic impact of the Indo-Europeans on the autochthonous gene pool of India. Instead, our overall inference is that an early Holocene expansion in NW India (including the Indus) contributed R1a1-M17 chromosomes both to the Central Asian and S Asian tribes prior to the arrival of the Indo-Europeans.

J2a in upper caste Indians:

The J2 clade is nearly absent among Indian tribals, except among Austro-Asiatic speaking tribals (11%). Among the Austro-Asiatic tribals, the predominant J2b2 hg occurs only in the Lodha.

...

Haplogroup J2a-M410 is confined to upper caste Dravidian and Indo-European speakers, with little occurrence in the middle and lower castes. This absence of even modest admixture of J2a in south Indian tribes and middle and lower castes is inconsistent with the L1 data. Overall, therefore, our data provide overwhelming support to an Indian origin of Dravidian speakers.

Haplogroup frequencies:

American Journal of Human Genetics (in press)

Polarity and Temporality of High Resolution Y-chromosome Distributions in India Identify Both Indigenous and Exogenous Expansions and Reveal Minor Genetic Influence of Central Asian Pastoralists

Sanghamitra Sengupta, Lev A. Zhivotovsky, Roy King, S. Q. Mehdi, Christopher A. Edmonds, Cheryl-Emiliane T. Chow, Alice A. Lin, Mitashree Mitra, Samir K. Sil, A. Ramesh, M.V. Usha Rani, Chitra M. Thakur, L. Luca Cavalli-Sforza, Partha P. Majumder and Peter A. Underhill

Abstract

While considerable cultural impact on social hierarchy and language in south Asia is attributable to the arrival of nomadic Central Asian pastoralists, genetic data (mitochondrial and Y chromosomal) have yielded dramatically conflicting inferences on the genetic origins of tribes and castes of south Asia. We sought to resolve this conflict using high-resolution data on 69 informative Y-chromosome binary markers and 10 microsatellite markers from a large set of geographically, socially and linguistically representative ethnic groups of south Asia. We have found that the influence of Central Asia on the pre-existing gene pool was minor. The ages of accumulated microsatellite variation in the majority of Indian haplogroups exceed 10-15 kya, attesting to the antiquity of regional differentiation. Therefore, our data do not support models that invoke a pronounced recent genetic input from central Asia to explain the observed genetic variation in south Asia. R1a1 and R2 haplogroups indicate demographic complexity that is inconsistent with a recent single history. Associated microsatellite analyses of the high frequency R1a1 haplogroup chromosomes indicate independent recent histories of the Indus valley and the peninsular Indian region. Our data are also more consistent with a peninsular origin of Dravidian speakers than a source with proximity to the Indus and significant genetic input resulting from demic diffusion associated with agriculture. Our results underscore the importance of marker ascertainment towards distinguishing phylogenetic terminal branches from basal nodes when attributing ancestral composition and temporality to either indigenous or exogenous sources. Our reappraisal indicates that pre-Holocene and Holocene era – not Indo-European – expansions have shaped the distinctive south Asian Y-chromosome landscape.

August 20, 2005

ESHG abstracts

I had previously posted some titles from this year's European Society of Human Genetics conference. There is now a PDF volume on the ESHG site which contains all the abstracts of the conference. Some of them have already been published, and doubtless more of them will be published next year. I will discuss below some of the more intriguing entries:

F. Cruciani et al., Molecular dissection of the Y chromosome haplogroups A, E and R1b

The male-specific region of the human Y chromosome (MSY) is characterized by a low amount of sequence diversity compared to the mtDNA, the autosomes and the X chromosome. Recently, the use of DHPLC and direct sequencing of DNA has permitted to identify more than 300 new single nucleotide polymorphisms (SNPs) on the MSY. The analysis of the geographic distribution of the haplogroups identified by these markers has provided new insights in the history of human populations, at the same time, it came out that undetected Y chromosome SNPs still contain useful information. In this study we have analyzed the sequence variation of 60 kb of the TBL1Y gene. While previous studies have analyzed the sequence variation of the Y chromosome in a random sample of individuals, we here focus on 22 chromosomes belonging to three specific haplogroups (A, R1b and E), whose geographic distribution is relevant for the human evolutionary history of Africa and/or western Eurasia. We discovered 32 new SNPs, and placed them in the known Y chromosome phylogenetic tree: about half of the new mutations identify new branches of the tree. The geographic distribution of five new E-M78 sub-haplogroups, analyzed in more than 6,000 subjects from Eurasia and Africa, has led to the identification of interesting evolutionary patterns.

The discovery of new subclades, especially for E-M78 and R1b, will be especially welcome for those interested in finer distinctions in these widely prevalent haplogroups. R1b for example occurs throughout the Caucasoid world, and so far very few meaningful sub-haplogroups of it were known. E-M78 is the main sublineage of haplogroup E3b and until now there was evidence of haplotype clusters that differentiated E-M78 chromosomes; the discovery of new sub-haplogroups will probably reflect to some degree these previously known haplotype clusters.

People interested in their own personal anthropology may be advised to wait until the publication of the R1b and E-M78 sub-haplogroups and their incorporation into commercial "fine-resolution" SNP tests, if they are considering undertaking such a test.

I. Kutuev et al., Phylogeographic analysis of mtDNA and Y chromosome lineages in Caucasus populations

The Greater Caucasus marks a traditional boundary between Europe and Asia. Linguistically, it is one of the most diverse areas of the continental Eurasia, while genetics of the people living there is poorly understood. Mitochondrial DNA and NRY variability was studied in 23 Caucasus populations speaking Caucasus, Turkic, and Indo-European languages. Total sample comprised more than 1700 individuals on Y chromosome and more than 2100 individuals on mtDNA. Genetic outliers among the studied populations are relatively recently arrived Turkic speaking Nogays. The indigenous Caucasus populations possess generally less than 5% of eastern Eurasian mtDNA and Y-chromosomal haplotypes - in a profound contrast to the Turkic-speaking people at the other side of the Caspian, but not so dissimilar compared to the Volga-Turkic Tatars and Chuvashis or to the Anatolian Turks. Haplogroup frequency variation within the Caucasus populations, in some instances significant, appears to be caused primarily by specific aspects of the demographic history of populations. Phylogeographically, a particularly intriguing finding is the presence, though at low frequencies, of a predominantly northeastern African haplogroup M1 in many North Caucasus populations, though they lack sub-Saharan L lineages, relatively frequent in the Arab-speaking Levant. Results obtained help to place the Caucasus populations into the scenario of the peopling of Eurasia with anatomically modern humans. Possible migration routes, peopling of steppe and mountain parts of the Caucasus and causes of high linguistic diversity presence in this region is analyzed in this study.

The finding of M1 lineages in the Caucasus not associated with Sub-Saharan L lineages is important, because it can be explained in only one of two ways:

M1 originated in Asia, so its presence in east Africa can be explained by back-migration from Asia. We know that macrohaplogroup M originated in Asia, but it is not clear whether M1 itself originated in Asia or Africa; the "trail" of M lineages between South Asia and Eastern Africa is still flimsy, so we cannot draw any conclusions on this matter yet.
M1 originated in eastern Africa, but during a time when there was a much smaller level of penetration of sub-Saharan L lineages into the region.

V. Stepanov et al., Genetic diversity and differentiation of Y-chromosomal lineages in North Eurasia

Composition and frequency of Y-chromosomal haplogroups, defined by the genotyping of 36 biallelic loci in non-recombining part of Y chromosome, was revealed for native population of Siberia, Central Asia and Eastern Europe. Slavonic ethnic groups, which geographically represent Eastern Europe, are characterized by the high frequency of R1a1, I*, I1b, and N3a clades and by the presence of R1b3, J2, E, and G. Most frequent haplogroup is R1a1, which comprises 44-51% of Y-chromosomes. The distinguishing peculiarity of Central Asian Caucasoids is the high frequency of Caucasoid clades R1a1, J*, J2, and the presence of R1b3 and G. Twenty-five haplogroups were found in gene pool of native Siberian populations. Only 7 of them have the frequency higher than 3%. In sum these 7 clades comprise 86% of Siberian samples. In populations of Southern Siberia the most frequent haplogroup is R1a1. The high frequency of N3a is characteristic for Eastern Siberians, and in Yakuts its frequency is almost 90%. Koryaks, Buryats and Nivkhs have the highest frequency of C3* lineage among investigated populations. Haplogroup O* revealed with variable frequency in most of Siberian. Highest frequency of Q* was found in Kets and Northern Altayans (85% and 32%, respectively). The high level of genetic differentiation of North Eurasian population on Y-chromosomal lineages was revealed. The proportion of inter-population differences in the total genetic variability of region’s population according to the analysis of molecular variance is 19.04%. Genetic differences between territorial groups took 6.9% of total genetic variability, whereas 12.8% is the inter-population differences within groups.

This study seems to confirm what we already knew about the distribution of haplogroups in northern Eurasia, but it seems like a comprehensive survey of the area, which will be very useful when it appears in print.

S. Sengupta et al., Genescape of India, as Reconstructed from Polymorphic DNA Variation in the Y chromosome

The contemporary male gene pool of ethnic India largely comprises haplogroups that originated indigenously, in southeast Asia, and in west and central Asia. The indigenous haplogroup is predominant among the tribal group. The southeast Asian influence is largely on the male gene pools of Tibeto-Burman speaking tribals and Austro-Asiatic and Dravidian. The west and central Asian influence is primarily on caste groups - both Indo-European and Dravidian. The haplogroup diversity within the various tribal groups is lower than that within the caste groups. Analyses of molecular variance showed higher genetic variability among populations within linguistic clusters of tribals compared to castes. Moreover, the between group variability in the Indo-European caste cluster is higher than that in the Dravidian caste cluster. This may be a reflection of diverse ancestries, antiquities and isolation of the tribals, coupled with subsequent cultural (linguistic) homogenization. Lesser between group genetic variability in caste groups may be a reflection of their recent founding history. The complete congruence of the patterns of Y-chromosomal and mitochondrial DNA differentiation may be indicative of inflow of both male and female genes from similar source populations. The rank order of FST values showed that tribes and castes are most differentiated, followed by upper and middle caste, upper and lower caste and middle and lower caste.

Again, this study seems to confirm the indigenous component in Indians, and the higher prevalence of western and central Asian Caucasoid haplogroups in castes compared to tribals. Also of interest is the finding that the main difference in the Indian population is between castes and tribals: within the castes, differentiation decreases towards the lower castes, the most differentiated ones being the upper castes.

E. Bogácsi-Szabó et al., Maternal and paternal lineages in ancient and modern Hungarians

Hungarian language represents the westernmost group of the Finno-Ugric language phylum, surrounded entirely by Indo-European speaking populations. Their linguistic isolation in the Carpathian basin suggests the possibility that they might also show a significant genetic isolation. According to historical data at the end of the 9th century Hungarian conquerors from the west side of the Ural Mountains settled down into the Carpathian Basin and took the hegemony. To determine the genetic background of Hungarians we examined mitochondrial and Y chromosomal DNA from ancient "conquerors" from Hungary, originated from the 10th century and from modern Hungarian-speaking adults from today's Hungary and Transylvanian Seklers (Romania). DNA was extracted from 35 excavated ancient bones and hair samples of 125 and 80 modern Hungarians and Seklers, respectively. Mitochondrial haplogroups were determined with HVS I sequencing and RFLP typing. The mtDNA HVS I sequences were compared with 2615 samples from 34 Eurasian populations retrieved from published data. ARLEQUIN 2.001 Software was used to estimate genetic distances between populations. The resulting matrix was summarized in two dimensions by use of Multidimensional Scaling. The M46 biallelic Y chromosomal marker (TAT, often called Uralic migration marker) was also investigated from 2 ancient, 34 modern Hungarian and 60 Sekler samples. Our results suggest that the modern Hungarian gene pool is very similar to other central European ones concerning the mitochondrial and Y chromosomal markers, while the ancient population contains more Asian type elements.

This is a very exciting study comparing ancient Magyar mtDNA and Y chromosomes (at least the Tat-C marker) with those of modern Hungarian speakers. Physical anthropologists have long identified a Mongoloid and mixed Mongoloid component in the Magyars, and this is now confirmed with the finding of Tat-C and Mongoloid mtDNA in the ancient Magyars at a higher frequency than in the modern population. Today, Hungarians are predominantly Caucasoid, and this is supported by the molecular data and reflects the assimilation of the indigenous Caucasoid population by the more "Asian" original Magyar population.

F. di Giacomo et al., Y chromosomal variation in the Czech Republic

In order to analyse the contribution of the Czech Republic to the genetic landscape of Europe, we typed 257 male subjects from 5 locations for 17 Unique Event Polymorphisms of the Y chromosome. Sixteen haplogroups or sub-haplogroups were identified, with only 5 chromosomes uncharacterized. Overall, the degree of population structuring was low. The three commonest haplogroups were R1a (0.344), P*(xR1a) (0.281) and I (0.184). M157, M56 and M87 showed no variation within haplogroup R1a. Haplogroup I was mostly represented as I1b* and I1b2 was also detected in this population. Thus, the majority of the Czech male gene pool is accounted for by the three main haplogroups found in western and central Europe, the Balkans and the Carpathians. Haplogroup J was found at low frequency, in agreement with a low gene flow with the Mediterranean. In order to draw inferences on the dynamics of the Czech population, we typed 141 carriers of the 3 most common haplogroups for 10 microsatellites, and applied coalescent analyses. While the age of the I clade agreed with that reported in the vast study of Rootsi et al (2004), the ages of its sub-haplogroups differed considerably, showing that the I chromosomes sampled in the Czech Republic are a subset of those found throughout Europe. Haplogroup R1a turned out to be the youngest with an estimated age well after the Last Glacial Maximum. For all three major haplogroups the results indicate a fast population growth, beginning at approximately 60-80 generations ago.

The young age of R1a1 in Czechs, combined with its high frequency, makes it a likely candidate for reflecting historical or recent prehistoric events, and less likely to reflect the post-LGM recolonization of Europe.
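
To get a rough feel for what "60-80 generations ago" means in calendar terms, here is a minimal Python sketch. The 60-80 generation figure comes from the abstract; the generation lengths of 25 and 30 years are my own assumption, not the authors', and the function name is purely illustrative.

```python
# Figure quoted in the abstract: population growth began about 60-80 generations ago.
GROWTH_ONSET_GENERATIONS = (60, 80)

# ASSUMPTION (not from the abstract): a male generation lasts 25 to 30 years.
ASSUMED_YEARS_PER_GENERATION = (25, 30)


def onset_range_years(generations=GROWTH_ONSET_GENERATIONS,
                      years_per_generation=ASSUMED_YEARS_PER_GENERATION):
    """Return the (youngest, oldest) onset in years before present.

    The youngest bound pairs the fewest generations with the shortest
    generation length; the oldest bound pairs the most with the longest.
    """
    youngest = min(generations) * min(years_per_generation)
    oldest = max(generations) * max(years_per_generation)
    return youngest, oldest


if __name__ == "__main__":
    youngest, oldest = onset_range_years()
    print(f"Growth onset: roughly {youngest}-{oldest} years before present")
```

Under these assumed generation lengths the onset falls roughly 1,500 to 2,400 years before present, which is far more recent than the Last Glacial Maximum. A different generation length would shift the range, but not by enough to change that comparison.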