- J16.05 - Checking the hypothesis of a Balkan origin of the Armenians
- P17.5 - Y chromosome haplogroup analysis to estimate genetic origin of Balts
- J16.27 - Armenian Highland as a transition corridor for the spread of Neolithic agriculturists
- J16.14 - In Search of the Origin of Haplogroup J1-P58
- J16.03 - Y-chromosome haplogroup analysis in the Besermyan ethnic group
- P17.4 - Mitochondrial DNA Analysis of the Southeast European Genetic Variation Reveals a New, Local Subbranching in Hg X2
- P16.085 - Population diversity and history of the Indian subcontinent: Uncovering the deeper mosaic of sub-structuring and the intricate network of dispersals
- J16.43 - Ancient mtDNA diversity in Bulgaria
- P16.072 - Mitochondrial DNA diversity in medieval and modern Romanian population
- P16.012 - Frequency analysis of the CCR5-Delta32 allele in medieval and modern Romanian population
- J16.78 - The gene pool of Argyn in the context of generic structure of Kazakhs according to data on SNP-Y-Chromosome markers.
May 30, 2013
ESHG 2013 abstracts
July 19, 2012
Huge study on Y-chromosome variation in Iran (Grugni et al. 2012)
UPDATE I: Here is the table of haplogroup frequencies for easy reference:
One of the most interesting finds is the presence of a few IJ-M429* chromosomes in the sample. Haplogroup IJ encompasses the major European I subclade, and the major West Asian J subclade. The discovery of IJ* chromosomes is consistent with the origin of this haplogroup in West Asia; it is widely believed that haplogroup I represents a pre-Neolithic lineage in Europe, although at present there are no Y chromosome-tested pre-Neolithic remains.
There is also a wide assortment of Q and R in Iran. While some of these may be intrusive (e.g., the 42.6% of Q1a2 in Turkmen, likely a legacy of their Central Asian origins), the overall picture appears consistent with a deep presence of these lineages in Iran. This is especially true for haplogroup R where pretty much every paragroup and derived group is present, excepting those likely to have originated recently elsewhere.
UPDATE II: From the paper:
Although accounting only for 25% of the total variance, the first two components (Figure 3) separate populations according to their geographic and ethnic origin and define five main clusters: East-African, North-African and Near Eastern Arab, European, Near Eastern and South Asian. The 1st PC clearly distinguishes the East African groups (showing a high frequency of haplogroup E) from all the others which distribute longitudinally along the axis with a wide overlapping between European and Arab peoples and between Near Eastern and South Asian groups. The 2nd PC separates the North-African and Near Eastern Arabs (characterized by the highest frequency of haplogroup J1) from Europeans (characterized by haplogroups I, R1a and R1b) and the Near Easterners from the South Asians (due to the distribution of haplogroups G, R2 and L). Iranian groups do not cluster all together, occupying intermediate positions among Arab, Near Eastern and Asian clusters. In this scenario, it is worth of noticing the position of three Iranian groups: (i) Khuzestan Arabs (KHU-Ar) who, despite their Arabic origin, are close to the Iranian samples; (ii) Armenians from Tehran (THE-Ar), whose position, in the upper part of the Iranian distribution, indicates a close affinity with the Near Eastern cluster, while their position near Turkey and Caucasus groups, due to the high frequency R1b-M269 and other European markers (eg: I-M170), is in agreement with their Armenia origin; (iii) Sistan Baluchestan (SB-Ba) that clusters with its neighbouring Pakistan.
UPDATE III: There are lots of little details in the haplogroup distribution that make historical sense. For example, C3 exists in Assyrians from Azarbaijan, and C*, C3, and O all exist in Zoroastrians from Yazd. It is often forgotten that before the spread of Islam, and for quite some time thereafter, Inner Asia was teeming with Zoroastrians and Nestorian Christians. It seems quite likely that these outliers represent a legacy of these communities.
UPDATE IV: I have a feeling that Razib will take exception to this statement: "Ancient Persian people were firstly characterized by the Zoroastrianism. After the Islamization, Shi'a became the main doctrine of all Iranian people."
UPDATE V: This confirms my observation from the recent studies in Afghanistan, that there is an inverse relationship of J2a and R1a in Iranian-speaking groups, with an excess of the latter among the eastern Iranians, and of the former among the Persians. From the paper:
Among the different J2a haplogroups, J2a-M530 [46] is the most informative as for ancient dispersal events from the Iranian region. This lineage probably originated in Iran where it displays its highest frequency and variance in Yazd and Mazandaran (Figure 2). Taking into account its microsatellite variation and age estimates along its distribution area (Tables S3 and S7), it is likely that its diffusion could have been triggered by the Euroasiatic climatic amelioration after the Last Glacial Maximum and later increased by agriculture spread from Turkey and Caucasus towards southern Europe. The high variance observed in the Italian Peninsula is probably the result of stratifications of subsequent migrations and/or of the presence of sub-lineages not yet identified. Of interest in the M530 network (Figures 2 and S3) is the presence of a lateral branch that is characterized by a DYS391 repeat number equal to 9. Differently from previous observations [46], this branch is not restricted to Anatolian Greek samples being shared with different eastern Mediterranean coastal populations. The M530 diffusion pattern seems to be also shared by the paragroups J2a-M410* and J2a-PAGE55*. In addition, the variance distribution of the rare R1b-M269* Y chromosomes, displaying decreasing values from Iran, Anatolia and the western Black Sea coastal region, is also suggestive of a westward diffusion from the Iranian plateau, although more complex scenarios can be still envisioned because of its non-star like structure.
Of course, the idea that the diffusion of J2a related lineages ties in with early agricultural expansions has been with us for a long time, but it is time to abandon it. First of all, as we have seen, J2a diminishes greatly as we head towards South Asia; it certainly doesn't look like the lineage of the multitude of agricultural settlements that sprang up along the southeastern vector soon after the invention of agriculture.
Second, it is lacking so far in all ancient Y chromosome data from Europe down to 5,000 years ago. It seems much more probable that J2 related lineages spread from the highlands of West Asia much later.
The "age estimates" are the result of using the inappropriate "evolutionary mutation rate", and become even older because of the inclusion of the DYS388 marker that is very stable in many haplogroups but very mutable within haplogroup J. On the left you can see frequency, Y-STR variance, and haplotype network structures for various J-related groups.
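To make the mutation-rate point concrete, here is a minimal sketch of a variance-based age estimate. It assumes a simple stepwise mutation model in which the average squared distance (ASD) of Y-STR repeat counts from a founder haplotype grows as mutation rate times elapsed generations. The haplotypes are invented toy data, the modal haplotype is assumed to stand in for the founder, and the two rates are commonly cited ballpark values that I am assuming for illustration, not necessarily the ones used in the paper.

```python
from collections import Counter

# Toy Y-STR haplotypes (repeat counts at 5 loci); invented for illustration only.
HAPLOTYPES = [
    (14, 12, 23, 10, 13),
    (14, 12, 23, 10, 13),
    (14, 12, 24, 10, 13),
    (15, 12, 23, 10, 13),
    (14, 12, 23, 11, 13),
    (14, 12, 23, 10, 13),
    (14, 13, 23, 10, 13),
    (14, 12, 22, 10, 14),
]

# Assumed rates, per locus per 25-year generation (illustrative, not from the paper).
GENEALOGICAL_RATE = 2.1e-3   # pedigree-type ("genealogical") rate
EVOLUTIONARY_RATE = 6.9e-4   # "evolutionary effective" rate
YEARS_PER_GENERATION = 25


def modal_haplotype(haplotypes):
    """Most common repeat count at each locus, used here as the assumed founder."""
    return tuple(Counter(locus).most_common(1)[0][0] for locus in zip(*haplotypes))


def mean_asd(haplotypes, founder):
    """Average squared distance from the founder, averaged over samples and loci."""
    total = sum((h[i] - founder[i]) ** 2 for h in haplotypes for i in range(len(founder)))
    return total / (len(haplotypes) * len(founder))


def age_in_years(asd, rate_per_generation):
    """Under the stepwise model ASD ~ rate * generations, so generations ~ ASD / rate."""
    return asd / rate_per_generation * YEARS_PER_GENERATION


founder = modal_haplotype(HAPLOTYPES)
asd = mean_asd(HAPLOTYPES, founder)
age_genealogical = age_in_years(asd, GENEALOGICAL_RATE)
age_evolutionary = age_in_years(asd, EVOLUTIONARY_RATE)

if __name__ == "__main__":
    print(f"ASD = {asd:.3f}")
    print(f"genealogical rate: {age_genealogical:,.0f} years")
    print(f"evolutionary rate: {age_evolutionary:,.0f} years")
```

With these assumed rates the same data yield an age about three times older under the evolutionary rate. A locus that mutates unusually fast within one haplogroup would raise the ASD, and with it the age, even further.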
It is unfortunate that there is no progress in the phylogeographic assessment of R1a in this paper. There have been substantial discoveries of SNPs within this haplogroup as a result of commercial testing; however there is clearly an ascertainment bias in the newer discoveries, as almost all these SNPs have been detected in Europeans. The new paper confirms the high levels of Y-STR variance in India, Pakistan, and Iran. Together with the cornucopia of related paragroups in Iran, there is little doubt that this haplogroup originated in the general area of Central/South Asia.
Personally, as I have stated before, I would relate this R1a with Neolithic peoples living east of the Caspian, in contrast to the R1b bearers who lived west and south of it. These two populations came under the influence of the Indo-Europeans and spread in different directions. The Indo-Iranians were then initially the mixed descendants of the Indo-Europeans and the R1a old agricultural population, and were formed in the territory of the Bactria-Margiana Archaeological Complex.
This also explains the contrast between Iranian and Armenian groups: the latter mostly lack the R1a lineage, contrasting with all Iranian groups (even their Kurdish neighbors) who possess it. Conversely, Iranian groups, and especially eastern Iranians and Indo-Aryans, lack the R1b lineage. This is due to the fact that neither R1a nor R1b were originally part of the Indo-European community, but their geographical position was such that they came under the influence of the Indo-Europeans when the latter began their expansion.
UPDATE VI: I have created my own dendrogram using the Y-haplogroup frequencies and the hclust function of R (default parameters):
From top to bottom, one can identify some clusters:
- Eastern Europe, further broken down into Balkans and Slavic+Hungary
- West Asian/Caucasus
- Iranian Proper
- Arab
These correspond largely to the clusters identified by the authors, with India and the Turkmen sample emerging as the clear outliers. I omitted the Ethiopian samples, since E-M78 was not resolved phylogenetically, causing the Ethiopians to group with the likely E-V13 from the Balkans.
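For readers who want to try this kind of clustering themselves, here is a minimal sketch of agglomerative clustering on haplogroup frequency vectors. It uses Euclidean distance and complete linkage, which are the defaults of R's dist and hclust. The population names and frequencies below are hypothetical placeholders, not the data from the paper, and the pure-Python implementation is an illustration rather than the code behind the dendrogram.

```python
import math

# Hypothetical frequency vectors over four haplogroups (J2a, R1a, R1b, J1).
# These numbers are invented for illustration; they are not from the paper.
FREQS = {
    "PopA": (0.30, 0.15, 0.10, 0.05),
    "PopB": (0.28, 0.18, 0.08, 0.06),
    "PopC": (0.05, 0.02, 0.03, 0.60),
    "PopD": (0.08, 0.03, 0.02, 0.55),
    "PopE": (0.10, 0.45, 0.15, 0.01),
}


def euclidean(a, b):
    """Euclidean distance between two frequency vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def complete_linkage(cluster1, cluster2, freqs):
    """Distance between clusters = largest distance between any two members."""
    return max(euclidean(freqs[p], freqs[q]) for p in cluster1 for q in cluster2)


def hierarchical_clustering(freqs):
    """Repeatedly merge the two closest clusters; return the merge history."""
    clusters = [(name,) for name in freqs]
    merges = []
    while len(clusters) > 1:
        # Find the closest pair of clusters under complete linkage.
        height, i, j = min(
            (complete_linkage(clusters[i], clusters[j], freqs), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        merges.append((clusters[i], clusters[j], height))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


merges = hierarchical_clustering(FREQS)

if __name__ == "__main__":
    for left, right, height in merges:
        print(f"{'+'.join(left)}  <->  {'+'.join(right)}  at height {height:.3f}")
```

Reading the merge list from first to last is equivalent to reading a dendrogram from the leaves upward: populations that merge early at low heights form the tight clusters, and a population that joins only at the very end is an outlier.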
UPDATE VII: I have also run MCLUST on the MDS representation of the distance matrix computed from the haplogroup frequency data. The maximum number of 10 clusters occurred with 5 MDS dimensions retained. Population assignments in the 10 clusters can be found in the table below:
| Population | Cluster |
| --- | --- |
| Iran/Azerbaijan_Gharbi+Tehran_(Assyrian) | 1 |
| Iran/Lorestan_(Lur) | 1 |
| Iran/Tehran_(Armenian) | 1 |
| Iran/Azerbaijan_Gharbi_(Azeri) | 2 |
| Iran/Hormozgan_(Bandari+Afro-Iranian) | 2 |
| Iran/Hormozgan/Qeshmi | 2 |
| Iran/Khorasan_(Persian) | 2 |
| Iran/Kurdistan_(Kurd) | 2 |
| Iran/Sistan_Baluchestan_(Baluch) | 2 |
| Pakistan | 2 |
| Iran/Fars+Isfahan_(Persian) | 3 |
| Iran/Gilan_(Gilak) | 3 |
| Iran/Yazd+Tehran_(Zoroastrian) | 3 |
| Turkey/Central | 3 |
| Turkey/East | 3 |
| Turkey/West | 3 |
| Iran/Golestan_(Turkmen) | 4 |
| India | 4 |
| Iran/Khuzestan_(Arab) | 5 |
| Egypt_(Arab) | 5 |
| Iraq/Baghdad | 5 |
| Oman | 5 |
| Saudi_Arabia | 5 |
| Tunisia | 5 |
| United_Arab_Emirates | 5 |
| Iran/Mazandaran_(Mazandarani) | 6 |
| Iran/Yazd_(Persian) | 6 |
| Balkarian | 6 |
| Georgia | 6 |
| Albania | 7 |
| Greece | 7 |
| Bosnia | 8 |
| Croatia | 8 |
| Slovenia | 8 |
| Czech_Republic | 9 |
| Hungary | 9 |
| Poland | 9 |
| Ukraine | 9 |
| Iraq_(Marsh_Arab) | 10 |
| Qatar | 10 |
| Yemen | 10 |
We can ignore cluster #4 which consists of the two outliers (India + Turkmen). The rest of the clusters seem relatively coherent. Notice, for example, the Arabian cluster #10, Balkan cluster #8, Eastern European cluster #9, Greek-Albanian cluster #7, Mixed Arab cluster #5.
PLoS ONE 7(7): e41252. doi:10.1371/journal.pone.0041252
Ancient Migratory Events in the Middle East: New Clues from the Y-Chromosome Variation of Modern Iranians
Viola Grugni et al.
Knowledge of high resolution Y-chromosome haplogroup diversification within Iran provides important geographic context regarding the spread and compartmentalization of male lineages in the Middle East and southwestern Asia. At present, the Iranian population is characterized by an extraordinary mix of different ethnic groups speaking a variety of Indo-Iranian, Semitic and Turkic languages. Despite these features, only few studies have investigated the multiethnic components of the Iranian gene pool. In this survey 938 Iranian male DNAs belonging to 15 ethnic groups from 14 Iranian provinces were analyzed for 84 Y-chromosome biallelic markers and 10 STRs. The results show an autochthonous but non-homogeneous ancient background mainly composed by J2a sub-clades with different external contributions. The phylogeography of the main haplogroups allowed identifying post-glacial and Neolithic expansions toward western Eurasia but also recent movements towards the Iranian region from western Eurasia (R1b-L23), Central Asia (Q-M25), Asia Minor (J2a-M92) and southern Mesopotamia (J1-Page08). In spite of the presence of important geographic barriers (Zagros and Alborz mountain ranges, and the Dasht-e Kavir and Dasht-e Lut deserts) which may have limited gene flow, AMOVA analysis revealed that language, in addition to geography, has played an important role in shaping the nowadays Iranian gene pool. Overall, this study provides a portrait of the Y-chromosomal variation in Iran, useful for depicting a more comprehensive history of the peoples of this area as well as for reconstructing ancient migration routes. In addition, our results evidence the important role of the Iranian plateau as source and recipient of gene flow between culturally and genetically distinct populations.
Link
February 29, 2012
Serbian Y-chromosomes
High levels of Paleolithic Y-chromosome lineages characterize Serbia.
Regueiro M, Rivera L, Damnjanovic T, Lukovic L, Milasin J, Herrera RJ.
Abstract
Whether present-day European genetic variation and its distribution patterns can be attributed primarily to the initial peopling of Europe by anatomically modern humans during the Paleolithic, or to later Near Eastern Neolithic input is still the subject of debate. Southeastern Europe has been a crossroads for several cultures since Paleolithic times and the Balkans, specifically, would have been part of the route used by Neolithic farmers to enter Europe. Given its geographic location in the heart of the Balkan Peninsula at the intersection of Central and Southeastern Europe, Serbia represents a key geographical location that may provide insight to elucidate the interactions between indigenous Paleolithic people and agricultural colonists from the Fertile Crescent. In this study, we examine, for the first time, the Y-chromosome constitution of the general Serbian population. A total of 103 individuals were sampled and their DNA analyzed for 104 Y-chromosome bi-allelic markers and 17 associated STR loci. Our results indicate that approximately 58% of Serbian Y-chromosomes (I1-M253, I2a-P37.2, R1a1a-M198) belong to lineages believed to be pre-Neolithic. On the other hand, the signature of putative Near Eastern Neolithic lineages, including E1b1b1a1-M78, G2a-P15, J1-M267 and J2-M172 and R1b1a2-M269 accounts for 39% of the Y-chromosome. Furthermore, an examination of the distribution of Y-chromosome filiations in Europe indicates extreme levels of Paleolithic lineages in a region encompassing Serbia, Bosnia-Herzegovina and Croatia, possibly the result of Neolithic migrations encroaching on Paleolithic populations against the Adriatic Sea.
Link
November 16, 2011
Armenian Y-chromosomes revisited (Herrera et al. 2011)
Armenian Y-chromosomes have been largely ignored since the publication of the classic Weale et al. (2001) paper a decade ago. The Armenian DNA Project has largely covered the void during the intervening years, but it is nice that the topic is revisited by academics.
However, owing to the contentions associated with the current calibrations of the Y-STR mutation rates,32,34,35,41 as well as the limitations of the assumptions utilized by the methodologies for time estimations, the absolute dates generated in this study should only be taken as rough estimates of upper bounds.

UPDATE II: I will have some additional thoughts on Y-chromosome distribution in the third update, but, for the time being, the two most important "nuggets" of information are: (i) the unusual haplogroup frequencies in Sasun (high R2 and T), which may be due to a founder effect, but it would be interesting if Armenian historians could find some explanation for their occurrence there, and (ii) the occurrence of R-M269*(xL23) in Ararat Valley. I invite more knowledgeable readers to comment on the issue; the haplotypes are in Table 2 of the supplement.
The relative expansion times for haplogroup J2-M172 (Table 4) generally correspond with those yielded for R1b-M343, with the exception of Greece and Crete, which, unlike haplogroup R1b-M343, are slightly older than the dates yielded for several of the Near Eastern groups as well as the four Armenian populations.
Neolithic patrilineal signals indicate that the Armenian plateau was repopulated by agriculturalists
Kristian J Herrera, Robert K Lowery, Laura Hadden, Silvia Calderon, Carolina Chiou, Levon Yepiskoposyan, Maria Regueiro, Peter A Underhill and Rene J Herrera
Abstract
Armenia, situated between the Black and Caspian Seas, lies at the junction of Turkey, Iran, Georgia, Azerbaijan and former Mesopotamia. This geographic position made it a potential contact zone between Eastern and Western civilizations. In this investigation, we assess Y-chromosomal diversity in four geographically distinct populations that represent the extent of historical Armenia. We find a striking prominence of haplogroups previously implicated with the Agricultural Revolution in the Near East, including the J2a-M410-, R1b1b1*-L23-, G2a-P15- and J1-M267-derived lineages. Given that the Last Glacial Maximum event in the Armenian plateau occurred a few millennia before the Neolithic era, we envision a scenario in which its repopulation was achieved mainly by the arrival of farmers from the Fertile Crescent temporally coincident with the initial inception of farming in Greece. However, we detect very restricted genetic affinities with Europe that suggest any later cultural diffusions from Armenia to Europe were not associated with substantial amounts of paternal gene flow, despite the presence of closely related Indo-European languages in both Armenia and Southeast Europe.
Link
October 05, 2011
Y-chromosomes of Marsh Arabs
Different from the Iraqi control sample, the Marsh Arab gene pool displays a very scarce input from the northern Middle East (Hgs J2-M172 and derivatives, G-M201 and E-M123), virtually lacks western Eurasian (Hgs R1-M17, R1-M412 and R1-L23) and sub-Saharan African (Hg E-M2) contributions.
BMC Evolutionary Biology 2011, 11:288 doi:10.1186/1471-2148-11-288
In search of the genetic footprints of Sumerians: a survey of Y-chromosome and mtDNA variation in the Marsh Arabs of Iraq.
Nadia Al-Zahery et al.
Abstract (provisional)
Background
For millennia, the southern part of the Mesopotamia has been a wetland region generated by the Tigris and Euphrates rivers before flowing into the Gulf. This area has been occupied by human communities since ancient times and the present-day inhabitants, the Marsh Arabs, are considered the population with the strongest link to ancient Sumerians. Popular tradition, however, considers the Marsh Arabs as a foreign group, of unknown origin, which arrived in the marshlands when the rearing of water buffalo was introduced to the region.
Results
To shed some light on the paternal and maternal origin of this population, Y chromosome and mitochondrial DNA (mtDNA) variation was surveyed in 143 Marsh Arabs and in a large sample of Iraqi controls. Analyses of the haplogroups and sub-haplogroups observed in the Marsh Arabs revealed a prevalent autochthonous Middle Eastern component for both male and female gene pools, with weak South-West Asian and African contributions, more evident in mtDNA. A higher male than female homogeneity is characteristic of the Marsh Arab gene pool, likely due to a strong male genetic drift determined by socio-cultural factors (patrilocality, polygamy, unequal male and female migration rates).
Conclusions
Evidence of genetic stratification ascribable to the Sumerian development was provided by the Y-chromosome data where the J1-Page08 branch reveals a local expansion, almost contemporary with the Sumerian City State period that characterized Southern Mesopotamia. On the other hand, a more ancient background shared with Northern Mesopotamia is revealed by the less represented Y-chromosome lineage J1-M267*. Overall our results indicate that the introduction of water buffalo breeding and rice farming, most likely from the Indian sub-continent, only marginally affected the gene pool of autochthonous people of the region. Furthermore, a prevalent Middle Eastern ancestry of the modern population of the marshes of southern Iraq implies that if the Marsh Arabs are descendants of the ancient Sumerians, also the Sumerians were most likely autochthonous and not of Indian or South Asian ancestry.
Link
September 14, 2011
The Caucasus revisited (Yunusbayev et al. 2011)
This is another treasure trove of a paper, and together with Balanovsky et al. (2011) we now have a very clear picture of genetic variation in this most interesting of world regions.
The authors also post results up to K=10 in the supplementary material, which show Druze/Bedouin/Basque-centered components. It is actually possible to push the analysis higher than K=7 without such problem components appearing, by retaining only individuals who are not closely related (using --genome in PLINK and then iteratively removing individuals from pairs with PI_HAT greater than some value).
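The relatedness filtering just described can be sketched as follows. The sketch assumes the pairwise PI_HAT values have already been read from PLINK's --genome output into (id1, id2, pi_hat) tuples. The sample IDs, the 0.2 threshold, and the greedy rule of dropping the individual involved in the most related pairs first are all my own illustrative assumptions; other removal orders are equally defensible.

```python
from collections import defaultdict

# Hypothetical pairwise relatedness values (id1, id2, PI_HAT); invented for illustration.
PAIRS = [
    ("S1", "S2", 0.50),
    ("S1", "S3", 0.26),
    ("S2", "S3", 0.05),
    ("S4", "S5", 0.30),
    ("S5", "S6", 0.02),
]


def prune_related(pairs, threshold):
    """Greedily remove individuals until no remaining pair exceeds the threshold.

    Returns the set of removed individual IDs.
    """
    # Build a graph linking individuals whose PI_HAT exceeds the threshold.
    related = defaultdict(set)
    for id1, id2, pi_hat in pairs:
        if pi_hat > threshold:
            related[id1].add(id2)
            related[id2].add(id1)

    removed = set()
    while True:
        # Individuals that still have at least one close relative in the data.
        candidates = sorted(ind for ind, partners in related.items() if partners)
        if not candidates:
            break
        # Drop the individual with the most relatives (ties broken alphabetically).
        worst = max(candidates, key=lambda ind: len(related[ind]))
        for partner in related[worst]:
            related[partner].discard(worst)
        related[worst] = set()
        removed.add(worst)
    return removed


removed = prune_related(PAIRS, threshold=0.2)  # 0.2 is an arbitrary example cutoff

if __name__ == "__main__":
    print("remove:", sorted(removed))
```

Removing the most connected individual first tends to keep more samples than removing one member of each pair arbitrarily, which matters when some populations have only a handful of individuals.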
- light yellow "North East Asian"
- orange "South East Asian"
- brown "Neo African" or "Sub_Saharan", as there are no African hunter-gatherers
- dark blue "North European", as there is no split of east/west Europe at this level
- middle blue "West Asian"
- light blue "Southwest Asian"
- green "South Asian", but anchored on Sindhi, a population from Pakistan, due to the lack of more southern populations from India
- C has a concentration in the Turkic Nogays
- The presence of D this far west is very surprising, again in the Nogays. This haplogroup has a relic distribution, with particular concentrations in Tibet, Mongolia, Japan, and Andaman Islanders. In all likelihood its presence here is linked to the Nogays' eastern origin
- E and its subclades occur at a very low frequency here
- G2a has a clear West Caucasus (both north and south) concentration
- I seems to have a mainly West Caucasus distribution as well; this is a common European haplogroup; it has quite elevated frequencies among the Andis and Kara Nogays. It would be interesting to discover some historical correlate for the presence of I in Kara Nogays but not Kuban Nogays and in Andis but not in most of the NE Caucasus
- J1 has the expected Northeast Caucasus nexus. This haplogroup is bimodal, with a mode in Arabians and a secondary mode in NE Caucasus. Note the paucity of J1e-P58, the reverse of the situation of Arabians; I've noted before the likely association of the P58 clade with Semitic languages.
- The extreme concentration of J2 in Chechens and Ingush is probably associated with low variance. Apart from these atypical populations, a substantial presence of this haplogroup can be found in the NW/S Caucasus in different populations and in the form of different subclades.
- The new LT mystery clade has its usual low-frequency wide distribution
- N occurs in Nogays as expected, and, like C, also in the NW Caucasus. This probably also represents an eastern influence, probably associated not only with the Nogays but also with various Tatar influences on the Caucasus.
- Q occurs widely in the NW Caucasus but only in 1 Nogay. Perhaps this is more of a Tatar marker, although a finer-scale resolution of this haplogroup is really necessary.
- R1a-related lineages occur less frequently here than among eastern Slavs, a main reason for the disconnect between the Eastern European plain and the Caucasus. There does, however, appear to be good diversity here, with the presence of R1a* and R1a1-M198*. Note again how the Iranic Ossetians (both North and South) have almost no R1a1 compared to both their NW Caucasian and S Caucasian neighbors, again suggesting that this may not have been an important Alan or steppe Iranian lineage, at least during the late antique time horizon. The occurrence of R1a1f-M458 may represent Slavic influence in the NW Caucasus.
- R1b-related lineages seem ubiquitous in the Caucasus. R-M73 occurs substantially in Kara Nogays and Balkars, an apparent link with Central Asia where this haplogroup occurs frequently.
Mol Biol Evol (2011) doi: 10.1093/molbev/msr221
The Caucasus as an asymmetric semipermeable barrier to ancient human migrations
Bayazit Yunusbayev et al.
Abstract
The Caucasus, inhabited by modern humans since the Early Upper Paleolithic and known for its linguistic diversity, is considered to be important for understanding human dispersals and genetic diversity in Eurasia. We report a synthesis of autosomal, Y chromosome and mitochondrial DNA (mtDNA) variation in populations from all major subregions and linguistic phyla of the area. Autosomal genome variation in the Caucasus reveals significant genetic uniformity among its ethnically and linguistically diverse populations, and is consistent with predominantly Near/Middle Eastern origin of the Caucasians, with minor external impacts. In contrast to autosomal and mtDNA variation, signals of regional Y chromosome founder effects distinguish the eastern from western North Caucasians. Genetic discontinuity between the North Caucasus and the East European Plain contrasts with continuity through Anatolia and the Balkans, suggesting major routes of ancient gene flows and admixture.
Link
May 15, 2011
Genes and Languages in the Caucasus
We found that “evolutionary” estimates of most clusters fall far outside the range of the respective linguistic dates, while “genealogical” estimates gave a good fit with the linguistic dates. At least two population events in the Caucasus are documented archaeologically, which allows additional comparison with these “historical” dates. In both cases, the historical (archaeological) date is similar to a genetic estimate based on the “genealogical” mutation rate (Supplementary Note 2).
Overall, the most frequent haplogroups in the Caucasus were G2a3b1-P303 (12%), G2a1a-P18 (8%), J1*-M267(xP58) (34%), and J2a4b*-M67(xM92) (21%), which together encompassed 73% of the Y chromosomes, while the other 24 haplogroups identified in our study comprise the remaining 27% (Table 2). ... haplogroup G2a3b1-P303 comprised at least 21% (and up to 86%) of the Y chromosomes in the Shapsug, Abkhaz and Circassians ... haplogroup G2a1a-P18 comprised at least 56% (and up to 73%) of the Digorians and Ironians (both from the Central Caucasus Iranic linguistic group), while not being found at more than 12% (average 3%) in other populations... haplogroup J2a4b*-M67(xM92) comprised 51-79% of the Y chromosomes in the Ingush and three Chechen populations (North-East Caucasus, Nakh linguistic group), while, in the rest of the Caucasus, its frequency was not higher than 9% (average 3%) ... haplogroup J1*-M267(xP58) comprised 44-99% of the Avar, Dargins, Kaitak, Kubachi, and Lezghins (South-East Caucasus, Dagestan linguistic group) but was less than 25% in Nakh populations and less than 5% in the rest of Caucasus.
The present work employs Starostin’s methodology, and we made special efforts to create the high-quality linguistic databases required for this analysis. Thus, based on significantly extended and revised linguistic databases, we have applied a glotto-chronological approach to the North Caucasian languages. As a result, our study provides a unique opportunity to make direct comparisons of linguistic and genetic data from the same populations. Lexico-statistical methods have also been applied to a number of language families using a Bayesian approach to increase the statistical robustness of language classification (Gray and Atkinson, 2003; Kitchen et al., 2009; Greenhill et al., 2010). Using these methods with the Caucasus languages under study here will be the focus of future work.
- The center of the J2a world is somewhere between eastern Turkey, Armenia, Azerbaijan, Iran, and Syria
- The Caucasus is a northern extension of this world, just as Greece and Italy are its main western extensions, with a strong extension into Central Asia as far as Xinjiang, and well into South Asia all the way to upper caste South Indian Hindus.
- In the Caucasus itself, J-M67 dominates among Nakh speakers, but with little other J2a-related variation.
- In comparison to Nakhs, J2a seems more varied in Georgians, among Ossetes, and among NW Caucasian speakers
Parallel Evolution of Genes and Languages in the Caucasus Region
Oleg Balanovsky, Khadizhat Dibirova, Anna Dybo, Oleg Mudrak, Svetlana Frolova, Elvira Pocheshkhova, Marc Haber, Daniel Platt, Theodore Schurr, Wolfgang Haak, Marina Kuznetsova, Magomed Radzhabov, Olga Balaganskaya, Alexey Romanov, Tatiana Zakharova, David F. Soria Hernanz, Pierre Zalloua, Sergey Koshel, Merritt Ruhlen, Colin Renfrew, R. Spencer Wells, Chris Tyler-Smith, Elena Balanovska and The Genographic Consortium
We analyzed 40 SNP and 19 STR Y-chromosomal markers in a large sample of 1,525 indigenous individuals from 14 populations in the Caucasus and 254 additional individuals representing potential source populations. We also employed a lexicostatistical approach to reconstruct the history of the languages of the North Caucasian family spoken by the Caucasus populations. We found a different major haplogroup to be prevalent in each of four sets of populations that occupy distinct geographic regions and belong to different linguistic branches. The haplogroup frequencies correlated with geography and, even more strongly, with language. Within haplogroups, a number of haplotype clusters were shown to be specific to individual populations and languages. The data suggested a direct origin of Caucasus male lineages from the Near East, followed by high levels of isolation, differentiation and genetic drift in situ. Comparison of genetic and linguistic reconstructions covering the last few millennia showed striking correspondences between the topology and dates of the respective gene and language trees, and with documented historical events. Overall, in the Caucasus region, unmatched levels of gene-language co-evolution occurred within geographically isolated populations, probably due to its mountainous terrain.
Link
October 14, 2010
African admixture in the Near East: where from?
At this level of detail, Africans are divided into three clusters which can be labeled Sub-Saharan (red), East African (blue), and "Mozabite" or North African (purple). Europeans and West Asians form the green cluster, while the Arab samples have a substantial contribution of the yellow cluster.
June 03, 2010
Two major groups of living Jews (Atzmon et al. 2010)
Next, each of 2407 European subjects was assigned into one of 10 groups based on geographic region: South: Italy, Swiss-Italian; Southeast: Albania, Bosnia-Herzegovina, Bulgaria, Croatia, Greece, Kosovo, Macedonia, Romania, Serbia, Slovenia, Yugoslavia; Southwest: Portugal, Spain; East: Czech Republic, Hungary; East-Southeast: Cyprus, Turkey; Central: Austria, Germany, Netherlands, Swiss-German; West: Belgium, France, Swiss-French, Switzerland; North: Denmark, Norway, Sweden; Northeast: Finland, Latvia, Poland, Russia, Ukraine; Northwest: Ireland, Scotland, UK.


Admixture with local populations, including Khazars and Slavs, may have occurred subsequently during the 1000 year (2nd millennium) history of the European Jews. Based on analysis of Y chromosomal polymorphisms, Hammer estimated that the rate might have been as high as 0.5% per generation or 12.5% cumulatively (a figure derived from Motulsky), although this calculation might have underestimated the influx of European Y chromosomes during the initial formation of European Jewry. Notably, up to 50% of Ashkenazi Jewish Y chromosomal haplogroups (E3b, G, J1, and Q) are of Middle Eastern origin,15 whereas the other prevalent haplogroups (J2, R1a1, R1b) may be representative of the early European admixture. The 7.5% prevalence of the R1a1 haplogroup among Ashkenazi Jews has been interpreted as a possible marker for Slavic or Khazar admixture because this haplogroup is very common among Ukrainians (where it was thought to have originated), Russians, and Sorbs, as well as among Central Asian populations, although the admixture may have occurred with Ukrainians, Poles, or Russians, rather than Khazars. In support of the ancestry observations reported in the current study, the major distinguishing feature between Ashkenazi and Middle Eastern Jewish Y chromosomes was the absence of European haplogroups in Middle Eastern Jewish populations.
AJHG doi:10.1016/j.ajhg.2010.04.015
Abraham's Children in the Genome Era: Major Jewish Diaspora Populations Comprise Distinct Genetic Clusters with Shared Middle Eastern Ancestry
Gil Atzmon et al.
Abstract
For more than a century, Jews and non-Jews alike have tried to define the relatedness of contemporary Jewish people. Previous genetic studies of blood group and serum markers suggested that Jewish groups had Middle Eastern origin with greater genetic similarity between paired Jewish populations. However, these and successor studies of monoallelic Y chromosomal and mitochondrial genetic markers did not resolve the issues of within and between-group Jewish genetic identity. Here, genome-wide analysis of seven Jewish groups (Iranian, Iraqi, Syrian, Italian, Turkish, Greek, and Ashkenazi) and comparison with non-Jewish groups demonstrated distinctive Jewish population clusters, each with shared Middle Eastern ancestry, proximity to contemporary Middle Eastern populations, and variable degrees of European and North African admixture. Two major groups were identified by principal component, phylogenetic, and identity by descent (IBD) analysis: Middle Eastern Jews and European/Syrian Jews. The IBD segment sharing and the proximity of European Jews to each other and to southern European populations suggested similar origins for European Jewry and refuted large-scale genetic contributions of Central and Eastern European and Slavic populations to the formation of Ashkenazi Jewry. Rapid decay of IBD in Ashkenazi Jewish genomes was consistent with a severe bottleneck followed by large expansion, such as occurred with the so-called demographic miracle of population expansion from 50,000 people at the beginning of the 15th century to 5,000,000 people at the beginning of the 19th century. Thus, this study demonstrates that European/Syrian and Middle Eastern Jews represent a series of geographical isolates or clusters woven together by shared IBD genetic threads.
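To put the "demographic miracle" in the abstract into perspective, the growth rate it implies can be worked out directly. This is only a back-of-the-envelope sketch: it assumes steady exponential growth over roughly 400 years (beginning of the 15th to beginning of the 19th century) and a 25-year generation, neither of which is stated in the abstract.

```python
import math

# Figures quoted in the abstract.
START_POPULATION = 50_000
END_POPULATION = 5_000_000

# ASSUMPTIONS (not from the abstract): a 400-year span and 25 years per generation.
YEARS = 400
GENERATION_YEARS = 25

fold_increase = END_POPULATION / START_POPULATION

# Continuous exponential growth: N(t) = N0 * exp(r * t)
annual_rate = math.log(fold_increase) / YEARS
doubling_time_years = math.log(2) / annual_rate

# Multiplicative growth per generation over the same span.
growth_per_generation = fold_increase ** (GENERATION_YEARS / YEARS)

print(f"fold increase:         {fold_increase:.0f}x")
print(f"annual growth rate:    {annual_rate:.2%}")
print(f"doubling time:         {doubling_time_years:.0f} years")
print(f"growth per generation: {growth_per_generation:.2f}x")
```

On these assumptions a hundredfold expansion corresponds to growth of a little over 1% per year, sustained for four centuries.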
December 28, 2009
Y chromosomes of Dagestan highlanders
Journal of Human Genetics 54, 689–694 (1 December 2009) | doi:10.1038/jhg.2009.94
The key role of patrilineal inheritance in shaping the genetic variation of Dagestan highlanders
Laura Caciagli
Abstract
The Caucasus region is a complex cultural and ethnic mosaic, comprising populations that speak Caucasian, Indo-European and Altaic languages. Isolated mountain villages (auls) in Dagestan still preserve high level of genetic and cultural diversity and have patriarchal societies with a long history of isolation. The aim of this study was to understand the genetic history of five Dagestan highland auls with distinct ethnic affiliation (Avars, Chechens-Akkins, Kubachians, Laks, Tabasarans) using markers on the male-specific region of the Y chromosome. The groups analyzed here are all Muslims but speak different languages all belonging to the Nakh-Dagestanian linguistic family. The results show that the Dagestan ethnic groups share a common Y-genetic background, with deep-rooted genealogies and rare alleles, dating back to an early phase in the post-glacial recolonization of Europe. Geography and stochastic factors, such as founder effect and long-term genetic drift, driven by the rigid structuring of societies in groups of patrilineal descent, most likely acted as mutually reinforcing key factors in determining the high degree of Y-genetic divergence among these ethnic groups.
Link
November 18, 2009
Y chromosomes of NE Portuguese Jews
Phylogeographic analysis of paternal lineages in NE Portuguese Jewish communities
Inês Nogueiro et al.
Abstract
The establishment of Jewish communities in the territory of contemporary Portugal is archaeologically documented since the 3rd century CE, but their settlement in Trás-os-Montes (NE Portugal) has not been proved before the 12th century. The Decree of Expulsion followed by the establishment of the Inquisition, both around the beginning of the 16th century, accounted for a significant exodus, as well as the establishment of crypto-Jewish communities. Previous Y chromosome studies have shown that different Jewish communities share a common origin in the Near East, although they can be quite heterogeneous as a consequence of genetic drift and different levels of admixture with their respective host populations. To characterize the genetic composition of the Portuguese Jewish communities from Trás-os-Montes, we have examined 57 unrelated Jewish males, with a high-resolution Y-chromosome typing strategy, comprising 16 STRs and 23 SNPs. A high lineage diversity was found, at both haplotype and haplogroup levels (98.74 and 82.83%, respectively), demonstrating the absence of either strong drift or founder effects. A deeper and more detailed investigation is required to clarify how these communities avoided the expected inbreeding caused by over four centuries of religious repression. Concerning haplogroup lineages, we detected some admixture with the Western European non-Jewish populations (R1b1b2-M269, 28%), along with a strong ancestral component reflecting their origin in the Middle East [J1(xJ1a-M267), 12%; J2-M172, 25%; T-M70, 16%] and in consequence Trás-os-Montes Jews were found to be more closely related with other Jewish groups, rather than with the Portuguese non-Jewish population.
Link
October 16, 2009
The emergence and dispersal of haplogroup J-P58 (aka J1e)
To make things concrete, according to the model of drift-induced variance reduction proposed by Zhivotovsky, Underhill, and Feldman (2006), in 10,000 years (or 400 generations), J-P58 should have grown to the grand number of 200 men, or at least five orders of magnitude lower than the actual present-day haplogroup size. To account for the observed J-P58 size of millions of men, strong growth over time is needed, and with either the Z.U.F. (2006) analysis or my own, strong growth results in an accumulation of variance at close to the germline mutation rate.
With that said, all ages in this paper should be divided by a factor of 3. This is not only theoretically sound, but harmonizes better with other lines of evidence.
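To see what dividing by 3 does in practice, the short sketch below applies it to the two expansion times reported in the paper's abstract (10,000 years for J1e and 9,000 years for the ancestral J1 with DYS388=13). The 25-year generation length is the one implied by equating 10,000 years with 400 generations; the dictionary labels are mine.

```python
# Rescale the paper's published expansion times by the proposed factor of 3.
CORRECTION_FACTOR = 3
GENERATION_YEARS = 25  # implied by 10,000 years = 400 generations

# Expansion times (years) as reported in the paper's abstract.
published_ages = {
    "J1e (J-P58)": 10_000,
    "J1 (DYS388=13)": 9_000,
}

adjusted_ages = {
    lineage: age / CORRECTION_FACTOR for lineage, age in published_ages.items()
}

for lineage, age in adjusted_ages.items():
    generations = age / GENERATION_YEARS
    print(f"{lineage}: about {age:.0f} years ({generations:.0f} generations)")
```

The adjusted figures are only as good as the factor of 3 itself, which is the position argued above and not a result of the paper.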
The paper studies Y-STR variance in several Middle Eastern populations. The lack of samples from the Caucasus does not allow us to infer the levels of Y-STR variance in that region. Arabian J-P58 from Saudi Arabia, Qatar, and UAE are pooled, resulting in low mean Y-STR variance of 0.16. This low value stems primarily from Qatar and UAE as the Saudi Arabian J-P58 makes a very small contribution (4 examples) in the pooled sample.
The timing and geographical distribution of J1e is representative of a demic expansion of agriculturalists and herder–hunters from the Pre-Pottery Neolithic B to the late Neolithic era.24,26 The higher variances observed in Oman, Yemen and Ethiopia suggest either sampling variability and/or demographic complexity associated with multiple founders and multiple migrations.
A recent Bayesian analysis of Semitic languages supports an origin in the Levant 5750 years ago and subsequent arrival in the Horn of Africa from Arabia 2800 years ago,11 thus providing an indirect support of our phylogenetic clock estimates. It is important to note that the glottochronological dates yield estimates for the break-up and expansion of the Proto-Semitic language. Proto-Semitic, itself, may have been spoken in a localized linguistic community for millennia before its bifurcation into the East and West Semitic branches.
The presence of a large frequency of undifferentiated J*(xJ1, J2) chromosomes in Soqotra suggests that the Arabian peninsula possessed such chromosomes, which now have a marginal status throughout the Middle East. I propose that the early steppe desert herders of 6000-7000 BC possessed J* chromosomes, that J1 arose in the Middle East, and that its subclade J-P58 experienced rapid growth associated with the breakup and expansion of Semitic languages in the 4th millennium BC.
archeological studies have shown an early presence (ca. 6000–7000 BCE) of domesticated herding in the arid steppe desert regions
European Journal of Human Genetics doi: 10.1038/ejhg.2009.166
The emergence of Y-chromosome haplogroup J1e among Arabic-speaking populations
Jacques Chiaroni et al.
Abstract
Haplogroup J1 is a prevalent Y-chromosome lineage within the Near East. We report the frequency and YSTR diversity data for its major sub-clade (J1e). The overall expansion time estimated from 453 chromosomes is 10 000 years. Moreover, the previously described J1 (DYS388=13) chromosomes, frequently found in the Caucasus and eastern Anatolian populations, were ancestral to J1e and displayed an expansion time of 9000 years. For J1e, the Zagros/Taurus mountain region displays the highest haplotype diversity, although the J1e frequency increases toward the peripheral Arabian Peninsula. The southerly pattern of decreasing expansion time estimates is consistent with the serial drift and founder effect processes. The first such migration is predicted to have occurred at the onset of the Neolithic, and accordingly J1e parallels the establishment of rain-fed agriculture and semi-nomadic herders throughout the Fertile Crescent. Subsequently, J1e lineages might have been involved in episodes of the expansion of pastoralists into arid habitats coinciding with the spread of Arabic and other Semitic-speaking populations.
Link









