Showing posts with label Uralic. Show all posts

July 11, 2016

Y-chromosome haplogroup N phylogeny resolved

AJHG Volume 99, Issue 1, p163–173, 7 July 2016

Human Y Chromosome Haplogroup N: A Non-trivial Time-Resolved Phylogeography that Cuts across Language Families

Anne-Mai Ilumäe et al.

The paternal haplogroup (hg) N is distributed from southeast Asia to eastern Europe. The demographic processes that have shaped the vast extent of this major Y chromosome lineage across numerous linguistically and autosomally divergent populations have previously been unresolved. On the basis of 94 high-coverage re-sequenced Y chromosomes, we establish and date a detailed hg N phylogeny. We evaluate geographic structure by using 16 distinguishing binary markers in 1,631 hg N Y chromosomes from a collection of 6,521 samples from 56 populations. The more southerly distributed sub-clade N4 emerged before N2a1 and N3, found mostly in the north, but the latter two display more elaborate branching patterns, indicative of regional contrasts in recent expansions. In particular, a number of prominent and well-defined clades with common N3a3’6 ancestry occur in regionally dissimilar northern Eurasian populations, indicating almost simultaneous regional diversification and expansion within the last 5,000 years. This patrilineal genetic affinity is decoupled from the associated higher degree of language diversity.

Link

August 17, 2014

Indo-Europeans preceded Finno-Ugrians in Finland and Estonia

This is according to the abstract of a Ph.D. thesis (below), and it would appear to work well with the dating of the signature Y-chromosome haplogroup of Finno-Ugrians.

Bidrag till Fennoskandiens språkliga förhistoria i tid och rum (Heikkilä, Mikko)
My academic dissertation "Bidrag till Fennoskandiens språkliga förhistoria i tid och rum" ("Spatiotemporal Contributions to the Linguistic Prehistory of Fennoscandia") is an interdisciplinary study of the linguistic prehistory of Northern Europe chiefly in the Iron Age (ca. 700 BC―AD 1200), but also to some extent in the Bronze Age (ca. 1700―700 BC) and the Early Finnish Middle Ages (ca. AD 1200―1323). The disciplines represented in this study are Germanistics, Nordistics, Finnougristics, history and archaeology. The language-forms studied are Proto-Germanic, Proto-Scandinavian, Proto-Finnic and Proto-Sami. This dissertation uses historical-comparative linguistics and especially loanword study to examine the relative and absolute chronology of the sound changes that have taken place in the proto-forms of the Germanic, Finnic and Samic languages. Phonetic history is the basis of historical linguistics studying the diachronic development of languages. To my knowledge, this study is the first in the history of the disciplines mentioned above to examine the systematic dating of the phonetic development of these proto-languages in relation to each other. In addition to the dating and relating of the phonetic development of the proto-languages, I study Fennoscandian toponyms. The oldest datable and etymologizable place-names throw new light on the ethnic history and history of settlement of Fennoscandia. For instance, I deal with the etymology of the following place-names: Ahvenanmaa/Åland, Eura(joki), Inari(järvi), Kemi(joki), Kvenland, Kymi(joki), Sarsa, Satakunta, Vanaja, Vantaa and Ähtäri. 
My dissertation shows that Proto-Germanic, Proto-Scandinavian, Proto-Finnic and Proto-Sami all date to different periods of the Iron Age. I argue that the present study along with my earlier published research also proves that a (West-)Uralic language – the pre-form of the Finnic and Samic languages – was spoken in the region of the present-day Finland in the Bronze Age, but not earlier than that. In the centuries before the Common Era, Proto-Sami was spoken in the whole region of what is now called Finland, excluding Lapland. At the beginning of the Common Era, Proto-Sami was spoken in the whole region of Finland, including Southern Finland, from where the Sami idiom first began to recede. An archaic (Northwest-)Indo-European language and a subsequently extinct Paleo-European language were likely spoken in what is now called Finland and Estonia, when the linguistic ancestors of the Finns and the Sami arrived in the eastern and northern Baltic Sea region from the Volga-Kama region probably at the beginning of the Bronze Age. For example, the names Suomi ʻFinlandʼ and Viro ʻEstoniaʼ are likely to have been borrowed from the Indo-European idiom in question. (Proto-)Germanic waves of influence have come from Scandinavia to Finland since the Bronze Age. A considerable part of the Finnic and Samic vocabulary is indeed Germanic loanwords of different ages which form strata in these languages. Besides mere etymological research, these numerous Germanic loanwords make it possible to relate to each other the temporal development of the language-forms that have been in contact with each other. That is what I have done in my extensive dissertation, which attempts to be both a detailed and a holistic treatise.

September 03, 2013

ISABS 2013 abstracts

From the book of abstracts (pdf):

MITOCHONDRIAL DNA AND PHYLOGENETIC ANALYSIS OF PREHISTORIC NORTH AFRICAN POPULATIONS
North Africa is located at a crossroads between Europe, Africa and Asia and has been inhabited since prehistoric times. In the Epipaleolithic period (23,000 years to 10,000 years BP), Western North Africa was occupied by Mechta-Afalou Men, authors of the Iberomaurusian industry. The origin of the Iberomaurusians is unresolved; several hypotheses have been put forward. With the aim of contributing to a better knowledge of the Iberomaurusian settlement, we analysed the mitochondrial DNA (mtDNA) of skeletons exhumed from the prehistoric sites of Taforalt in Morocco (23,000-10,800 years BP) and Afalou in Algeria (11,000 to 15,000 BP). Hypervariable segment 1 of mtDNA from 38 individuals was amplified by Real-Time PCR and directly sequenced. Sequences were aligned with the reference sequence to perform the mtDNA classification within haplogroups. Phylogenetic analysis based on mitochondrial sequences from Mediterranean populations was performed using the Neighbor-Joining algorithm implemented in the MEGA program. mtDNA sequences from Afalou and Taforalt were classified in Eurasiatic and North African haplogroups. We noted the absence of Sub-Saharan haplotypes. The phylogenetic tree clustered Taforalt with European populations. Our results excluded the hypothesis of a sub-Saharan origin of Iberomaurusian populations and highlighted the genetic flow between the northern and southern coasts of the Mediterranean since the Epipaleolithic period.

DISCONTINUITY SCREENING OF THE EARLY FARMERS’ MT-DNA LINEAGES IN THE CARPATHIAN BASIN
Discontinuous mitochondrial (mt) haplotype data between Central-Europe’s first farmers and contemporary Europeans have been described before. Hungary was a key-area of the Neolithisation, in the route of Neolithisation following the River Danube, and that was also the birthplace of the Linear Pottery Culture, which later colonised Western and Northern Europe. Neolithic and post-Neolithic human remains as well as the contemporary population of Hungary are involved in our project to gain information on their mt-haplotype pattern and especially on the frequency of Asian haplotypes in the Carpathian Basin. HVS-I sequences from nt15977 to nt16430 of Neolithic specimens with sufficient mtDNA preservation among an extended Neolithic collection were analysed for polymorphisms, identifying 23 different ones. A novel, N9a, N1a, C5, D1/G1a, M/R24 haplogroups were determined among the pre-industrial Hungarians. The presence of Asian haplotypes in the ancient populations must be taken into consideration when reconstructing the population history of Europe and Asia, so a survey of the recent Asian haplotype frequency in Europe is unavoidable. The ancient and recent haplotype pattern of Hungary is definitely worth further investigation to test a theory on the continuous population history of Europe, whether genetic gaps between ancient and recent human populations of Europe were more likely to be detected.

ANTHROPOLOGIC AND MITOCHONDRIAL DNA ANALYSIS OF A MEDIEVAL GRAVEYARD FROM SOPOT (CROATIA)
Anthropologic and DNA analysis of human remains recovered from a graveyard in Šopot near Benkovac (Croatia) dating to the 14th/15th century was conducted in order to reconstruct the origin and life conditions of the people populating the region at that time. The dynamics of the population represented in this graveyard are important for understanding Croatian history because the deceased individuals were buried according to pagan ritual which was uncommon in a post Christianization period. Human remains from a total of 31 graves were analyzed, in which 47 individuals were found (9 female, 23 male and 15 children). Average age at death for adults was lower than expected (for female 28.9, male 32.4 years), suggesting that the living conditions of these individuals were poor. In addition, 10 antemortem traumas were visible on 6 adults, which is a higher rate than expected, and indicates potential violence within the population group. Finally, mitochondrial DNA (mtDNA) analysis was performed on hypervariable regions one and two for 46 of the individuals. Due to the age and condition of the remains, only 19 of the samples yielded full sequence profiles. Haplogroup analysis was performed for these 19 individuals, with the majority of the results falling within the most common groups in present-day Croatia. However, examination of the less common haplogroups suggested a possible migration of individuals from Asia. Collectively, the physical and molecular results from this study provide evidence to suggest that individuals recovered from this gravesite are not from the current indigenous population.

MATERNAL GENETIC PROFILE OF A NORTHWEST ALGERIAN POPULATION
The North African population gene pool based on mitochondrial DNA (mtDNA) polymorphisms has been shaped by the back-migration of several Eurasian lineages in Paleolithic and Neolithic times. Recent influences from sub-Saharan Africa and Mediterranean Europe are also evident. The presence of East-West and North-South haplogroup frequency gradients strongly reinforces the genetic complexity of this region. However, this genetic scenario is beset with a notable gap, which is the lack of consistent information for Algeria, the largest country in the continent. To fill this gap, we analyzed a sample of 240 unrelated subjects from a northwest Algeria cosmopolitan population. mtDNA sequence analysis was performed on the regulatory hypervariable segment I region (HVSI). Haplogroup diagnostic mutations were analyzed using PCR-RFLPs and/or SNaPshot multiplex reactions. Of all North African populations, Eurasian lineages are the most frequent in Algeria (80%) while sub-Saharan Africa origin accounts for the remaining (20%). Within them, the North African genetic components U6 and M1 account for 20%. Indeed, the U6 haplogroup, highly distributed in Northwestern African populations, shows a high frequency in Algeria (11.83%), while the M1 frequency (7.1%) raises an anomalous peak in its decreasing Northeast-Northwest gradient. Moreover, the high frequency of HV subgroups (38.33%) points to direct maritime contacts between the European and North African western sides of the Mediterranean. Besides, the most common western H subgroups, H1 (47.8%) and H3 (10.1%), represent 60% of H lineages. These frequencies and HV0 (7.5%) lie well within the observed Northwestern to Northeastern African decreasing gradients.

MATERNAL GENETIC VARIATION OF THE SLOVENIAN POPULATION IN A BROADER EUROPEAN CONTEXT AND COMPARED TO ITS PATERNAL COUNTERPART
Slovenia is a European country situated at the crossroads of main European cultural and trade routes. It is geographically more linked to Central Europe, but history draws it closer to its ex-Yugoslavian, Southeast European (SEE) neighbors. Slovenian maternal heritage has not been analyzed since 2003 and our aim was to analyze SNP markers of 97 Slovenian mtDNAs in high resolution to see where this population fits according to its maternal genetic variation. We compared the Slovenian sample with the neighboring SEE populations, as well as with other published European population datasets. Also, we compared the obtained mtDNA variation results with the available Slovenian Y chromosome data to see how these two uniparental marker systems correspond to each other. In the PC plot based on mtDNA haplogroup frequencies, the Slovenian population has an outlying position mostly due to the increased prevalence of the J (14.4%) and T (15.4%) clades and especially because of the abundance and diversity of J1c samples in Slovenia, represented with 8 haplotypes and in a percentage of >11%. Although in an outlying position, Slovenian mtDNA variation still shows a certain degree of affinity to SEE. On the contrary, Slovenia’s paternal genetic heritage yielded results that correspond to the population’s geographic location and group the Slovenian population considerably closer to Central European countries, based on increased prevalence of Northern/Central European R1a-M198 and decreased frequency of Balkan-specific I2a2-M423. Such differences in maternal and paternal marker systems could indicate that Slovenian genetic variation was influenced by sex-biased demographic events.

AN ASIAN TRACE IN THE GENETIC HERITAGE OF THE EASTERN ADRIATIC ISLAND OF HVAR
The Island of Hvar is situated in the central eastern Adriatic, and its relatively small rural population has been reproductively isolated throughout history. Therefore, founder effects, genetic drift and inbreeding have had a significant role in the shaping of the current genetic diversity of Hvar Islanders. We analyzed Y-chromosome SNP markers of 412 Hvar islanders in high resolution, with the aim to investigate the current paternal genetic diversity. We found a relatively high frequency (6.1%) of unrelated male samples belonging to the Q*-M424 haplogroup, which is unusual for European populations. Interestingly, a previous study showed 9 individuals from Hvar with mitochondrial haplogroup F, which is almost absent in Europe. Both findings could indicate a certain connection with Asian populations, where these haplogroups are most common. This might be a result of several migratory events in history, one of which could be linked to the ancient Silk Road, the other a consequence of the arrival of the Slavs, following the Avars, to the eastern Adriatic in the 6th century or due to the expansion of the Ottoman Empire in the 16th to 18th century. The presence of these rare mitochondrial and Y-chromosome lineages is an example of founder effect and random genetic drift which, in this small island with a high degree of isolation and endogamy, had a strong impact on shaping the genetic diversity of the population.

GENETIC PORTRAIT OF THE BESERMYAN ETHNIC GROUP BASED ON MTDNA HAPLOGROUP STUDY
Besermyan are a small ethnic group living in the Volga-Ural region of Russia. They belong to the Finno-Ugric language group, but speak a special dialect. There are some Bulgar-Chuvash borrowings in their adverb vocabulary that are absent in other dialects of the Udmurt language. Besermyan live in the northwestern part of modern Udmurtia in the Cheptsa basin. In 2002 their number was about three thousand. The Besermyan origin is a very interesting issue. There is a view that the endonym Besermyan (beserman) is derived from the Turkic word which means “Muslim” in Arabic. This hypothesis, along with their language, hints at the origin of this ethnic group; however, the genetic portrait of Besermyan has not been described yet. In our study we used the data of mitochondrial DNA (mtDNA) HVSI sequencing from 98 Besermyans representing 10 villages in the Udmurtia Republic of Russia. The prevalence of Western Eurasian mtDNA lineages (91.7%) over Eastern Eurasian ones (9.2%) was shown in the studied population, which is consistent with the structure of the mtDNA pool of Finno-Ugric ethnic groups of the Volga-Ural region. Some Eastern Eurasian lineages in Besermyan are represented by haplogroups D4b, A4b and Z1a which are also common in Udmurts. It is important to note though that the share of the Western Eurasian component in Udmurts according to a previous study by Bermisheva et al. (2002) is about 74.5%, so the mtDNA haplogroup distribution in Besermyans is closer to other Finno-Ugric people of the Volga-Ural region: Mordvins and Maris.

COSMOPOLITAN CENTRAL ASIA: TAJIK MTDNA TRACES THE EAST-WEST MOVEMENT OF ANCIENT NOMADS
Tajikistan is a country in the mountains of southeast Central Asia. Due to its isolation, mtDNA variation in the Tajiks has been studied only fragmentarily on a limited number of samples. In 1997 saliva samples were collected from unrelated Tajiks across Tajikistan. After long-term preservation DNA was extracted from 2 mm FTA discs. Due to degradation mtDNA was amplified using the primary and secondary PCRs with nested primers in the multiplex format. The origin of 91 mitochondrial genomes from Tajikistan was traced to western Eurasia (62.6%), eastern Eurasia (25.3%), south Asia (11.0%), and North Africa (1.1%). Significant population structure in the distribution of these mtDNA lineages was revealed within the regional groups in Tajikistan. The mtDNA variation was compared between the Tajiks and 45 populations of Eurasia. Pairwise Fst comparisons and the correspondence analysis revealed non-significant differences between the Tajik and Uzbek populations. Although both nations speak languages belonging to different linguistic groups, this result corresponds to their cultural and economic proximity. Surprisingly, after the Uzbeks, the Tajik mtDNA pool most closely resembles that of the Ossetians, an Indo-Iranian people from the North Caucasus. The Tajiks also display intensive gene flow and admixture with some other populations of Central Asia and the Iranian Plateau living along the centers and crossroads of the earliest civilizations and belonging to different linguistic groups including the Uyghur, Kazakh, Karakalpak, Turkmen, Pathans, Iranian Arabs, and Gilaki. This study demonstrates an impact of ancient nomad migrations and invasions on the distribution of mtDNA variation in Eurasia.

June 21, 2013

Origins and dispersals of Y-chromosome haplogroup N

I will simply note that the authors use the effective mutation rate that is ~1/3 the genealogical mutation rate and hence their age estimates are inflated by ~3x. I have expressed reservations about using Y-STR based age estimates in general, but these concerns become more important for older lineages.
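
To make the ~3x point concrete: in simple variance-based Y-STR estimators the inferred age scales inversely with the assumed mutation rate, so switching rates amounts to a rescaling. The sketch below assumes that linear relationship holds and takes the ~1/3 ratio at face value; the function name and the 8-10 ky input range are illustrative only, not taken from the paper's methods.

```python
def rescale_age(age_kya, rate_used, rate_new):
    """Rescale an STR-based age estimate to a different mutation rate.

    Assumes age is inversely proportional to the mutation rate, as in
    simple variance-based estimators; real estimates may be less linear.
    """
    return age_kya * rate_used / rate_new

# Only the ratio matters: effective rate taken as 1/3 of the genealogical one.
EFFECTIVE = 1.0
GENEALOGICAL = 3.0

# Illustrative input: an 8-10 kya range obtained with the effective rate.
low, high = (rescale_age(a, EFFECTIVE, GENEALOGICAL) for a in (8.0, 10.0))
print(f"{low:.1f}-{high:.1f} kya")  # prints "2.7-3.3 kya"
```

Under this crude rescaling an 8-10 ky estimate shrinks to roughly 3 ky, which is why the choice of rate matters so much for the archaeological interpretation.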

In particular, I would be very surprised if Y-haplogroup N turns up in Europe 8-10 thousand years ago, and I expect to see it make its first appearance in the 3rd millennium BC or thereabouts, perhaps together with the Seima-Turbino expansion across northern Eurasia. Thanks to the ancient-DNA-preserving boreal cold, it may be possible to find out.

Irrespective of my disagreement on the mutation rate issue, I have to applaud the comprehensive survey carried out by these Chinese scientists: numbers invariably pay off.

PLoS ONE 8(6): e66102. doi:10.1371/journal.pone.0066102

Genetic Evidence of an East Asian Origin and Paleolithic Northward Migration of Y-chromosome Haplogroup N

Hong Shi et al.

The Y-chromosome haplogroup N-M231 (Hg N) is distributed widely in eastern and central Asia, Siberia, as well as in eastern and northern Europe. Previous studies suggested a counterclockwise prehistoric migration of Hg N from eastern Asia to eastern and northern Europe. However, the root of this Y chromosome lineage and its detailed dispersal pattern across eastern Asia are still unclear. We analyzed haplogroup profiles and phylogeographic patterns of 1,570 Hg N individuals from 20,826 males in 359 populations across Eurasia. We first genotyped 6,371 males from 169 populations in China and Cambodia, and generated data of 360 Hg N individuals, and then combined published data on 1,210 Hg N individuals from Japanese, Southeast Asian, Siberian, European and Central Asian populations. The results showed that the sub-haplogroups of Hg N have a distinct geographical distribution. The highest Y-STR diversity of the ancestral Hg N sub-haplogroups was observed in the southern part of mainland East Asia, and further phylogeographic analyses supports an origin of Hg N in southern China. Combined with previous data, we propose that the early northward dispersal of Hg N started from southern China about 21 thousand years ago (kya), expanding into northern China 12–18 kya, and reaching further north to Siberia about 12–14 kya before a population expansion and westward migration into Central Asia and eastern/northern Europe around 8.0–10.0 kya. This northward migration of Hg N likewise coincides with retreating ice sheets after the Last Glacial Maximum (22–18 kya) in mainland East Asia.

Link

May 16, 2013

Evolutionary history of Uralic languages (Honkola et al. 2013)

Journal of Evolutionary Biology DOI: 10.1111/jeb.12107

Cultural and climatic changes shape the evolutionary history of the Uralic languages

T Honkola et al.

Quantitative phylogenetic methods have been used to study the evolutionary relationships and divergence times of biological species, and recently, these have also been applied to linguistic data to elucidate the evolutionary history of language families. In biology, the factors driving macroevolutionary processes are assumed to be either mainly biotic (the Red Queen model) or mainly abiotic (the Court Jester model) or a combination of both. The applicability of these models is assumed to depend on the temporal and spatial scale observed as biotic factors act on species divergence faster and in smaller spatial scale than the abiotic factors. Here, we used the Uralic language family to investigate whether both ‘biotic’ interactions (i.e. cultural interactions) and abiotic changes (i.e. climatic fluctuations) are also connected to language diversification. We estimated the times of divergence using Bayesian phylogenetics with a relaxed-clock method and related our results to climatic, historical and archaeological information. Our timing results paralleled the previous linguistic studies but suggested a later divergence of Finno-Ugric, Finnic and Saami languages. Some of the divergences co-occurred with climatic fluctuation and some with cultural interaction and migrations of populations. Thus, we suggest that both ‘biotic’ and abiotic factors contribute either directly or indirectly to the diversification of languages and that both models can be applied when studying language evolution.

Link

March 11, 2013

Genomewide structure of populations from European Russia (Khrunin et al. 2013)

Notice:

  1. The intermediate position of Estonians between Balts and Finns
  2. The intermediate position of some Russian groups between Komi and the main body of Europeans.

PLoS ONE 8(3): e58552. doi:10.1371/journal.pone.0058552

A Genome-Wide Analysis of Populations from European Russia Reveals a New Pole of Genetic Diversity in Northern Europe

Andrey V. Khrunin et al.

Several studies examined the fine-scale structure of human genetic variation in Europe. However, the European sets analyzed represent mainly northern, western, central, and southern Europe. Here, we report an analysis of approximately 166,000 single nucleotide polymorphisms in populations from eastern (northeastern) Europe: four Russian populations from European Russia, and three populations from the northernmost Finno-Ugric ethnicities (Veps and two contrast groups of Komi people). These were compared with several reference European samples, including Finns, Estonians, Latvians, Poles, Czechs, Germans, and Italians. The results obtained demonstrated genetic heterogeneity of populations living in the region studied. Russians from the central part of European Russia (Tver, Murom, and Kursk) exhibited similarities with populations from central–eastern Europe, and were distant from Russian sample from the northern Russia (Mezen district, Archangelsk region). Komi samples, especially Izhemski Komi, were significantly different from all other populations studied. These can be considered as a second pole of genetic diversity in northern Europe (in addition to the pole, occupied by Finns), as they had a distinct ancestry component. Russians from Mezen and the Finnic-speaking Veps were positioned between the two poles, but differed from each other in the proportions of Komi and Finnic ancestries. In general, our data provides a more complete genetic map of Europe accounting for the diversity in its most eastern (northeastern) populations.

Link

November 02, 2012

ALDER estimates of East Eurasian admixture in Europe

I used the 1-reference method of ALDER to infer lower bounds of East Eurasian admixture in a few European populations. This method does not include a statistical test of admixture (as does the 2-reference one or the f3 test), but we can reasonably suppose that some such admixture did take place on the combined evidence of the f3 test and ADMIXTURE.

In any case, I took the East Asian populations of Loh et al. (2012) which had no evidence of admixture with either ALDER or the f3 test, and also a few populations from Rasmussen et al. (2010) that included representatives of Siberian Uralic speakers, as well as the three main branches of narrow-sense Altaic (Turkic, Mongolian, Tungusic), and estimated lower bounds of admixture for a set of European populations. Results can be seen below:


The evidence for admixture appears most convincing in the 1000 Genomes Finns and HGDP Russians where the +/- interval does not intersect or approach zero irrespective of the Asian population chosen. For these populations, the percentages vary from ~4-5% for the "pure" East Eurasians to ~10% for some Siberian groups such as Selkup and Altai. These latter carry some West Eurasian admixture, so it makes sense that a greater deal of admixture with them is necessary to account for the observed "East Eurasian" influence. And, indeed, it is probably via such "intermediate" Siberian populations that some East Eurasian ancestry flowed into Europe, rather than via the relatively untouched populations of the Far East.

PS: Note that this probably represents the most recent signal of admixture, and not the older and more general "North Eurasian"/Amerindian-like admixture that, as Loh et al. mention in their paper, cannot be captured with ALDER.

October 13, 2012

An estimate of the admixture time for Finns

Using a similar procedure as in my recent post on the Baltic (Update II), I used 15 FIN individuals from the 1000 Genomes together with 12 Nganasans from Rasmussen et al. (2010) as reference populations, and 15 other FIN individuals to estimate admixture LD in a rolloff analysis. Three outlier Nganasan individuals (GSM558800, GSM558802, GSM558807) were removed.
The estimated time of admixture is 86.095 +/- 10.187 generations, or 2500 +/- 300 years. It corresponds rather well to the beginning of the Iron Age in northern Europe.
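
The conversion from generations to years is a single multiplication. A quick sketch, assuming 29 years per generation (a value not stated above, but one that reproduces the rounded figures):

```python
YEARS_PER_GENERATION = 29  # assumed generation time

def generations_to_years(generations, std_err, years_per_gen=YEARS_PER_GENERATION):
    """Convert a rolloff date and its standard error from generations to years."""
    return generations * years_per_gen, std_err * years_per_gen

years, err = generations_to_years(86.095, 10.187)
print(f"{years:.0f} +/- {err:.0f} years")  # prints "2497 +/- 295 years"
```

A different generation time shifts the date proportionally, so the Iron Age reading depends on that assumption as well as on the rolloff fit itself.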

As I mention in my previous post, there is evidence for intrusive cultures (Battle Axe and Seima Turbino) converging on the area from different directions during the preceding Bronze Age. If the above date is accurate, it would suggest a rather late admixture event between the Europeoid and Siberian elements of Finns. The former may have included both the descendants of Mesolithic European hunter-gatherers and intruders from Central Europe (Corded Ware/Battle Axe); the latter may have included both Comb Ceramic and the descendants of the Seima Turbino metallurgists.

September 05, 2012

Words denoting pulse crops in European languages

From the paper:
The attested Proto-Indo-European root-words directly linked to pulse crops are further testimony that Proto-Indo-European society was well-acquainted with agriculture (47), and was not predominantly nomadic and pastoral, as initially thought by the proposers of the Kurgan hypothesis (48).

PLoS ONE 7(9): e44512. doi:10.1371/journal.pone.0044512

Origin of the Words Denoting Some of the Most Ancient Old World Pulse Crops and Their Diversity in Modern European Languages

Aleksandar Mikic

This preliminary research was aimed at finding the roots in various Eurasian proto-languages directly related to pulses and giving the words denoting the same in modern European languages. Six Proto-Indo-European roots were identified, namely *arnk(')- (‘a leguminous plant’), *bhabh- (‘field bean’), * (‘a kernel of leguminous plant’, ‘pea’), *ghArs- (‘a leguminous plant’), *kek- (‘pea’) and *lent- (‘lentil’). No Proto-Uralic root was attested save hypothetically *kaca (‘pea’), while there were two Proto-Altaic roots, *bukrV (‘pea’) and * (‘lentil’). The Proto-Caucasian root * denoted pea, while another one, *howl(a) (‘bean’, ‘lentil’) and the Proto-Basque root *ilha-r (‘pea’, ‘bean’, ‘vetch’) could have a common Proto-Sino-Caucasian ancestor, *hVwlV (‘bean’) within the hypothetic Dene-Caucasian language superfamily. The Modern Maltese preserved the memory of two Proto-Semitic roots, *'adas- (‘lentil’) and *pul- (‘field bean’). The presented results prove that the most ancient Eurasian pulse crops were well-known and extensively cultivated by the ancestors of all modern European nations. The attested lexicological continuum witnesses the existence of millennia-long links between the peoples of Eurasia to their mutual benefit. This research is meant to encourage interdisciplinary concerted actions between plant scientists dealing with crop evolution and biodiversity, archaeobotanists and language historians.

Link

East to West across Eurasia

A couple more interesting abstracts from the DNA in Forensics 2012.


Genetic journey of the N1c haplogroup
Pamjav H, Nemeth E, Feher T, Volgyi A
Binary and Y-STR polymorphisms associated with the NRY region of the human Y chromosome preserve the paternal genetic legacy that has persisted to the present, permitting inference of human evolution, population migration and demographic history. The NRY region of the Y chromosome acts much like mtDNA to reveal the structure among human populations and possibly to infer the order and timing of their descents. In the present study, we have investigated the origin of haplogroup N1c-Tat phylogeographic structure and the genetic relationship of Eurasian populations by examining STR variation in a large number of individuals. We have identified 54 samples as the haplogroup N1c-Tat from 5 population groups (N=632). To place the results into a wider geographic context, we included 209 samples from published sources and 296 samples from the FTDNA public database into the phylogenetic analysis. According to previous studies haplogroup N-M231 is of East Asian ancestry. Our results suggest that N1c-Tat mutation probably originated in South Siberia 8-9 thousand years ago and had spread through the Urals into the European part of present-day Russia. Its distribution is not fully correlated with the spread of Uralic languages. Turkic-speaking ethnic groups in South Siberia have high N1c-Tat presence and STR variance, while the N1c-L550 subgroup largely occurs among non-Uralic-speaking European populations. Only the European N1c-Tat (xL550) subgroup can be linked to the spread of Finno-Ugric languages from the Kama-Urals area ~6,000 years ago. The subgroup N1c-L550 cannot be considered of Finno-Ugric origin and its carriers might have been assimilated by Indo-European groups, resulting in their spread across Europe in historical times with Vikings and Balto-Slavs.
Based on the present study Buryats were dominated by a young, about 800-year-old N1c-Tat cluster, which suggests that this ethnic group could be a relatively recent admixture of Mongolian conquerors with Paleo-Siberian population groups.
Of course, these ages should be taken with a grain of salt because it is unclear how they were derived (i.e., whether the "evolutionary mutation rate" was used). Hopefully, someone will treat the subject of N1c ages with Y-SNPs that do not have the problem of saturation that affects microsatellites. This is an interesting test case, because a ~3-fold change in ages will have important consequences for our understanding of the spread of Finno-Ugric languages into Europe: an earlier date would associate them with the Comb Ceramic, while a later, Bronze Age date would associate them with the Seima-Turbino phenomenon.
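To make the ~3-fold point concrete, here is a minimal sketch of how a variance-based Y-STR age scales with the assumed mutation rate. It assumes a simple stepwise mutation model in which repeat-count variance grows by roughly one mutation rate per generation. The two rates are commonly cited illustrative figures, not values taken from the abstract, and the variance is a made-up number.

```python
def str_age_years(mean_variance, rate_per_generation, years_per_generation=25.0):
    """Rough age under a stepwise mutation model: variance / rate, in years."""
    if rate_per_generation <= 0:
        raise ValueError("mutation rate must be positive")
    return mean_variance / rate_per_generation * years_per_generation

# Assumed, illustrative rates (per locus per generation); not from the abstract.
EVOLUTIONARY_RATE = 0.00069
PEDIGREE_RATE = 0.0021

hypothetical_variance = 0.25  # made-up mean variance of repeat counts across loci

age_evolutionary = str_age_years(hypothetical_variance, EVOLUTIONARY_RATE)
age_pedigree = str_age_years(hypothetical_variance, PEDIGREE_RATE)

# The ratio depends only on the two rates, not on the variance chosen.
ratio = age_evolutionary / age_pedigree
print(f"evolutionary-rate age is {ratio:.1f}x the pedigree-rate age")
```

Whatever variance is observed, switching between the two rates rescales the age by the same factor, which is why knowing which rate was used matters so much here.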


Huns in Bavaria? Genetic analyses of an artificially deformed skull from an early medieval cemetery in Burgweinting (Regensburg, Germany)

Schleuder R, Wilde S, Burger J, Grupe G, Forster P, Harbeck M
The morphological examination of an early medieval burial site in Burgweinting, which is dated to the end of the 5th century, revealed one female with an artificially, circularly deformed skull, a practice that is thought to be associated with the arrival of Nomads of the Eurasian steppe, particularly the Huns.    

Individuals with such artificial cranial deformations also can be found in other Late Roman and Early Medieval cemeteries in Europe mostly in the Carpathian basin but only as few isolated cases in Western Europe, where mostly women show such deformations.  
Regarding the artificial cranial deformations it is unclear whether a foreign custom was taken over by Germanic tribes or whether the individuals were members or descendants of Eurasian nomads.  
With the help of the find of Burgweinting, we exemplarily investigated this question. To identify the possible foreign origin of this female with alleged “Asian” skull deformation we sequenced the HVRI and HVRII region of the mitochondrial DNA.  
Our results show that the ancestry of a woman with artificially deformed skull can be linked to an at least partly Asian origin. So this indicates that at least some of the few individuals with skull deformation had not adopted the custom but can be seen as former members or descendants of the Hunnish tribal community.   
It would be worthwhile for geneticists to co-operate more broadly with physical anthropologists and/or archaeologists in cases where morphology or burial customs indicate that a site may hold a heterogeneous population. The above is a good example of that synergy in action.

March 16, 2012

TreeMix analysis of North Eurasians (and an African surprise)

I have used my K12b dataset to isolate a set of 537 individuals who had less than 10% total membership in the South_Asian, Northwest_African, Southeast_Asian, East_African, Gedrosia, Southwest_Asian, and Sub_Saharan components. Hence, these 537 individuals had 90%+ membership in the remaining Atlantic_Med, North_European, Caucasus, Siberian, and East_Asian components.
  • The Atlantic_Med component is frequent in northwestern Europe
  • The North_European component is dominant in northeastern Europe and forays into Siberia
  • The Caucasus component is dominant in the Caucasus and forays into Central Asia
  • The Siberian component is dominant in North Asia and forays into Europe
  • The East_Asian component is frequent in East Asia and forays into North Asia
This pruning procedure may not be perfect, but it helps isolate a dataset consisting (mostly) of North Eurasian individuals. Furthermore, I removed all populations that had fewer than 5 remaining individuals after the first pruning step. Hence, in the end, I had a dataset of 38 populations/452 individuals. The remaining populations were:
Russian_D, Polish_D, German_D, Finnish_D, Swedish_D, Mixed_Slav_D, Norwegian_D, Lithuanian_D, Japanese_D, Daur, French, French_Basque, Hezhen, Japanese, Oroqen, Russian, Sardinian, Yakut, CEU30, JPT30, Belorussian, Chuvashs, Hungarians, Lithuanians, Romanians, Selkup, Evenk, Tuva, Yukagir, Nganassan, Dolgan, Buryat, Mongol, FIN30, Kent_1KG, Bulgarians_Y, Ukranians_Y, Mordovians_Y
Additionally, a sample of 30 Yoruba from the HapMap-3 was used as an outgroup.
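The two pruning steps above can be sketched as follows. This is a minimal illustration, not the script actually used. It assumes the 10% threshold applies to the summed membership in the excluded components, and the record layout (`population`, `components`) and toy individuals are hypothetical.

```python
from collections import Counter

# Components treated as "not North Eurasian" in the first pruning step.
EXCLUDED = (
    "Gedrosia", "Northwest_African", "Southeast_Asian", "South_Asian",
    "East_African", "Southwest_Asian", "Sub_Saharan",
)

def prune(individuals, max_excluded=0.10, min_pop_size=5):
    """Keep individuals with low excluded-component membership,
    then drop populations left with too few individuals."""
    kept = [
        ind for ind in individuals
        if sum(ind["components"].get(c, 0.0) for c in EXCLUDED) < max_excluded
    ]
    sizes = Counter(ind["population"] for ind in kept)
    return [ind for ind in kept if sizes[ind["population"]] >= min_pop_size]

# Hypothetical toy data: 6 clean "A", 1 admixed "A", 4 clean "B".
toy = (
    [{"population": "A", "components": {"North_European": 0.95, "Gedrosia": 0.05}}] * 6
    + [{"population": "A", "components": {"North_European": 0.70, "South_Asian": 0.30}}]
    + [{"population": "B", "components": {"Siberian": 1.0}}] * 4
)
pruned = prune(toy)
print(Counter(ind["population"] for ind in pruned))
```

In the toy data, the admixed "A" individual fails the first step, and population "B" is dropped in the second because only four individuals remain.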

TreeMix analysis

The TreeMix analysis was performed with default parameters, allowing for between 0 and 10 migration edges.
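For readers who want to try something similar, here is a hedged sketch that builds one TreeMix command line per number of migration edges. The flags `-i`, `-o`, `-m`, and `-root` are, to my understanding, standard TreeMix options. The input file name and output prefix are hypothetical, and passing `-root Yoruba` is an assumption about how the outgroup was specified.

```python
def treemix_commands(input_file, out_prefix, max_edges=10, root="Yoruba"):
    """Return one argument list per run, for 0..max_edges migration edges."""
    commands = []
    for m in range(max_edges + 1):
        cmd = ["treemix", "-i", input_file, "-o", f"{out_prefix}_m{m}", "-root", root]
        if m > 0:
            # With no -m flag TreeMix fits a plain tree (0 migration edges).
            cmd += ["-m", str(m)]
        commands.append(cmd)
    return commands

# Hypothetical file names; nothing is executed here.
runs = treemix_commands("north_eurasia.treemix.gz", "north_eurasia")
for cmd in runs:
    print(" ".join(cmd))
```

Each list could be handed to `subprocess.run` once TreeMix is installed. Keeping one output stem per edge count makes it easy to compare the resulting treeout files side by side.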

Nomenclature: The direction of gene flow is best seen in the figure and/or associated treeout files.
For the text, I will write the common ancestor of two populations in parentheses, e.g., (French_Basque,Sardinian), and write (X, *) for the tree rooted at a particular node X, e.g., (Buryat, *).

0 migration edges:

The West and East Eurasian clusters are identified, with some populations with likely admixture being placed closer to the Eurasian root.

1 migration edge:
64% from (Sardinians/Basques) to Yoruba; this is difficult to interpret, but there has been evidence in the past that Africans and West Eurasians share more ancestry than Africans and East Asians do. In the linked post, I proposed a major episode of back-migration into Africa, and it is perhaps this that is being captured by this migration edge: Sardinians/Basques are the only two South-West Eurasian populations included, and any back-migration into Africa must have originated in the southern parts of West Eurasia.

Such a high level of back-migration may in fact be plausible, since Yoruba are a predominantly Y-haplogroup E bearing population, and the origin of the DE clade of the human Y-chromosome phylogeny is up in the air with both an African and Eurasian case having been advanced. Personally, I favor the Eurasian case, since within the CT clade, we have two subclades: CF (Eurasian) and DE (Eurasian/African).

Interestingly, John Hawks has recently discovered an unanticipated excess of "Neandertal ancestry" in Yoruba. This may also point to a back-migration into Africa and/or admixture of a group of Africans related to Eurasians (whom I've called Afrasians), with groups of Africans (Palaeoafricans) that split before the H. sapiens/H. neanderthalensis common ancestor.

There is, however, another detail in the figure that may have escaped your notice: there is now about 0.5 worth of drift in the figure (left-to-right) as opposed to only 0.12 in the tree without migration edges. So, perhaps what we are seeing is indeed the first sign of admixture between modern and archaic humans in Africa, which has been made more likely by recent anthropological discoveries.

It's not clear to me whether TreeMix has stumbled onto something important or not, but it is certainly worth keeping in mind that the above model fits the data better than the simple tree model. Moreover, TreeMix attempts to reverse the polarity of migration edges, and -apparently- the (Sardinian, French_Basque)-to-Yoruba edge is preferable to the reverse.

So, we should keep our minds open to the possibility that the greater similarity of West Eurasians to Africans is not the result of multiple Out-of-Africa waves, one of which affected only West Eurasians, but of an Into-Africa back-migration from West Eurasia.

So far, tree-based models have focused on how diverse African groups are, and hence, the reduced diversity of Eurasians has been interpreted as an Out-of-Africa bottleneck that carried a subset of African variation into Eurasia.

But, there is an alternative interpretation of the evidence, namely that African groups are diverse because they carry a superset of ancient Into-Africa variation, with the African-specific part of their variation being the result of admixture with pre-existing African hominins. Such a scenario cannot be captured by tree models, but is apparently considered and not rejected by TreeMix which allows for lateral gene flow. Let's wait and see what new things come from full genome sequencing.

2 migration edges:

The (French_Basque/Sardinian)-to-Yoruba edge persists (64%) and a new (Buryat, *)-to-Mongol edge (85%) was added. The "Mongol" sample consists of Siberian Mongols described by Rasmussen et al. (2010). An inspection of their K12b population portrait indicates that they do, in fact, have West Eurasian admixture, which according to the K12b spreadsheet amounts to about 18% in total.

3 migration edges:
The aforementioned (French_Basque/Sardinian)-to-Yoruba (64%) and (Buryat,*)-to-Mongol (85%) edges persist, and now we have a 68% Nganasan-to-Selkup edge.

These are the two Siberian Uralic populations in the dataset. This seems to parallel the K12b results, as Selkups have a North_European element which the Nganasans (Uralic speakers from the Arctic coast of Central Siberia) lack, so we are seeing the hybridity of the Selkups here, who, like the Mongol sample, are partly of West Eurasian ancestry.

4 migration edges:

The aforementioned (French_Basque,Sardinian)-to-Yoruba (64%), (Buryat,*)-to-Mongol (84%), and Nganasan-to-Selkup (68%) edges persist, and now we have an 89% (Buryat, *)-to-Tuva edge. According to the K12b the Tuva have 13.3% West Eurasian admixture, so again we have reasonably good agreement between TreeMix and ADMIXTURE.
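The comparison between the two methods is simple arithmetic, sketched below. It assumes that whatever share does not come through the migration edge can be read as the West Eurasian share, which is a simplification. The edge weights are the 85% (Mongol, 2-edge run) and 89% (Tuva, 4-edge run) figures quoted above, and the ADMIXTURE figures are the K12b values quoted in the text.

```python
# (TreeMix edge weight from the eastern source, K12b West Eurasian %)
comparisons = {
    "Mongol": (0.85, 18.0),
    "Tuva": (0.89, 13.3),
}

residuals = {}
for population, (edge_weight, k12b_west_pct) in comparisons.items():
    # Share of ancestry NOT arriving through the migration edge, in percent.
    residual_pct = round((1.0 - edge_weight) * 100.0, 1)
    residuals[population] = residual_pct
    gap = round(abs(residual_pct - k12b_west_pct), 1)
    print(f"{population}: TreeMix residual {residual_pct}% vs K12b {k12b_west_pct}% (gap {gap})")
```

Both gaps come out at about three percentage points or less, which is the sense in which the two methods agree "reasonably well".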

Interestingly, the non-"eastern" component of Selkups and Tuvans now forms a clade. It seems that a Nganasan-like and a (Buryat, *)-like population have converged into southern Siberia, absorbed a common local element and became the Selkup and Tuva respectively.

5 migration edges:

The aforementioned (French_Basque,Sardinian)-to-Yoruba (64%), (Buryat,*)-to-Mongol (85%), Nganasan-to-Selkup (68%), and (Buryat, *)-to-Tuva (90%) edges persist, and now we have a new 18% Oroqen-to-(Yakut, Evenk) edge. The Oroqen and the Evenk are Tungusic speakers, whereas the Yakut are Turkic people from northeastern Siberia, having migrated there from the vicinity of Lake Baikal during the last millennium.

6 migration edges:

The aforementioned (French_Basque,Sardinian)-to-Yoruba (64%), (Buryat,*)-to-Mongol (85%), Nganasan-to-Selkup (68%), (Buryat, *)-to-Tuva (90%), and Oroqen-to-(Yakut, Evenk) (18%) edges persist, and a new 16% Nganasan-to-Oroqen edge appears. Interestingly, this has allowed the Oroqen and Hezhen to now form their own clade, which makes sense as these are both Tungusic speakers from northeastern China. The other Tungusic population, the Evenk, groups with the Turkic Yakut; what they have in common is an origin close to Lake Baikal in Siberia.

7 migration edges:

The aforementioned (French_Basque,Sardinian)-to-Yoruba (64%), (Buryat,*)-to-Mongol (85%), Nganasan-to-Selkup (68%), (Buryat, *)-to-Tuva (90%), Oroqen-to-(Yakut, Evenk) (18%), and Nganasan-to-Oroqen (16%) edges persist, and there is a new 81% Evenk-to-Yukagir edge. The remainder of the Yukagirs' ancestry is derived from the West Eurasian tree. The Yukagir language is rather mysterious, with some links to Uralic having been postulated. Here it pays off to look at the population portraits, since it is apparent that, unlike the Selkup, their West Eurasian ancestry is limited to a few individuals.

It is fairly interesting that Russian anthropologists placed the Yukagirs in the Baikal group of the Central Asian race, the same as the Evenks, who are their biggest donors. So, Yakuts, Evenks, and Yukagirs all seem to share the same Baikal-type of origin.


8 migration edges:
There are now eight edges: 64% Sardinian-to-Yoruba, 16% Oroqen-to-Yukagir, 20% (Buryat, *)-to-(Yakut, Evenk), 24% Nganasan-to-Chuvash, 29% Oroqen-to-(Yakut, Evenk), 88% (Buryat, *)-to-Tuva, 62% Nganasan-to-Selkup, and 85% (Buryat, *)-to-Mongol.

The tree has been substantially re-organized, with two main Siberian groups identified: an eastern group (Hezhen, Daur, Oroqen, Buryat), and a central group (Yukagir, Dolgan, Nganasan, Yakut, Evenk, Selkup). The Chuvash, predominantly Europeoid Turkic speakers from Russia, show evidence of gene flow from the central group as well, whereas the Selkup, Uralic speakers from Siberia who belong to the central group, show evidence of gene flow from Europe.

9 migration edges:

64% (French_Basque,Sardinian)-to-Yoruba, 85% (Buryat, *)-to-Mongol, 68% Nganasan-to-Selkup, 92% (Buryat,*)-to-Tuva, 14% Oroqen-to-(Yakut,Evenk), 14% Nganasan-to-Oroqen, 82% Yakut-to-Yukagir, 90% Evenk-to-Dolgan, 13% Hezhen-to-(Nganasan, *).

10 migration edges:

64% (French_Basque, Sardinian)-to-Yoruba, 85% (Nganasan, *)-to-Mongol, 68% Nganasan-to-Selkup, 92% (Nganasan,*)-to-Tuva, 15% Oroqen-to-(Yakut,Evenk), 15% Nganasan-to-Oroqen, 82% Yakut-to-Yukagir, 90% Evenk-to-Dolgan, 43% Hezhen-to-Buryat, 14% Sardinian-to-Bulgarian.

I will stop at this point. I may add more migration edges later to this post, but I'm tired of typing this stuff.

You can download all the plots and *.treeout files here.


UPDATE (March 20): I have repeated the experiment with HGDP San, rather than Yoruba, as the outgroup:

There is now a 63% migration edge from (Basque, Sardinian) to San.

February 08, 2012

Links between Native Americans and southern Altaians

AJHG doi:10.1016/j.ajhg.2011.12.014


Mitochondrial DNA and Y Chromosome Variation Provides Evidence for a Recent Common Ancestry between Native Americans and Indigenous Altaians

Matthew C. Dulik et al.

The Altai region of southern Siberia has played a critical role in the peopling of northern Asia as an entry point into Siberia and a possible homeland for ancestral Native Americans. It has an old and rich history because humans have inhabited this area since the Paleolithic. Today, the Altai region is home to numerous Turkic-speaking ethnic groups, which have been divided into northern and southern clusters based on linguistic, cultural, and anthropological traits. To untangle Altaian genetic histories, we analyzed mtDNA and Y chromosome variation in northern and southern Altaian populations. All mtDNAs were assayed by PCR-RFLP analysis and control region sequencing, and the nonrecombining portion of the Y chromosome was scored for more than 100 biallelic markers and 17 Y-STRs. Based on these data, we noted differences in the origin and population history of Altaian ethnic groups, with northern Altaians appearing more like Yeniseian, Ugric, and Samoyedic speakers to the north, and southern Altaians having greater affinities to other Turkic speaking populations of southern Siberia and Central Asia. Moreover, high-resolution analysis of Y chromosome haplogroup Q has allowed us to reshape the phylogeny of this branch, making connections between populations of the New World and Old World more apparent and demonstrating that southern Altaians and Native Americans share a recent common ancestor. These results greatly enhance our understanding of the peopling of Siberia and the Americas.

Link

December 08, 2010

Genome-wide analysis of population structure in the Finnish Saami

The K=6 ADMIXTURE results from the supplementary material can be seen below:

This is based on ~38k SNPs.

It is unfortunate that they included Native American HGDP populations, but did not include the most relevant published data on Siberians that I first used to study population structure across north Eurasia here and here and here.

Hence, they discover a "Native American"-like component in Saami, which in all likelihood can be further resolved into Siberian-specific components utilizing the Rasmussen et al. dataset.

The "closest approximation" to the East Eurasian component in Saami in the HGDP panel are the Yakuts, but finer-scale analysis (see my previous posts) reveals that the Yakuts are made up almost entirely of an Altaic-specific component tying them to Turkic, Mongol, and Tungusic populations, while the eastern component in European Finns, Vologda Russians and Chuvashs has relationships with Central Siberians such as Kets, Selkups, and Nganasans, all of which are missing in this paper.

Hopefully this data will become publicly available online for re-analysis with the relevant populations included.

European Journal of Human Genetics advance online publication 8 December 2010; doi: 10.1038/ejhg.2010.179

A genome-wide analysis of population structure in the Finnish Saami with implications for genetic association studies

Jeroen R Huyghe et al.

The understanding of patterns of genetic variation within and among human populations is a prerequisite for successful genetic association mapping studies of complex diseases and traits. Some populations are more favorable for association mapping studies than others. The Saami from northern Scandinavia and the Kola Peninsula represent a population isolate that, among European populations, has been less extensively sampled, despite some early interest for association mapping studies. In this paper, we report the results of a first genome-wide SNP-based study of genetic population structure in the Finnish Saami. Using data from the HapMap and the human genome diversity project (HGDP-CEPH) and recently developed statistical methods, we studied individual genetic ancestry. We quantified genetic differentiation between the Saami population and the HGDP-CEPH populations by calculating pair-wise FST statistics and by characterizing identity-by-state sharing for pair-wise population comparisons. This study affirms an east Asian contribution to the predominantly European-derived Saami gene pool. Using model-based individual ancestry analysis, the median estimated percentage of the genome with east Asian ancestry was 6% (first and third quartiles: 5 and 8%, respectively). We found that genetic similarity between population pairs roughly correlated with geographic distance. Among the European HGDP-CEPH populations, FST was smallest for the comparison with the Russians (FST=0.0098), and estimates for the other population comparisons ranged from 0.0129 to 0.0263. Our analysis also revealed fine-scale substructure within the Finnish Saami and warns against the confounding effects of both hidden population structure and undocumented relatedness in genetic association studies of isolated populations.

Link

November 22, 2010

Ancient mtDNA from Sargat culture

From the paper:
The Sargat culture was located in the forest-steppe region of southwestern Siberia, near what is now the border of Russia and northern Kazakhstan, from around the 5th century BC until the 5th century AD. It is associated with a number of similar archaeological cultures in the region from the same period or slightly preceding it, for example, the Gorokhovo, Iktul, and Baitovo. The Sargat culture is also known for containing a number of kurgan burials (Koryakova and Daire 2000; Matveeva 2000; Andrey Shpitonkov et al., personal communication, 2004), and roughly half of all graves contain the remains of horse harnesses (Koryakova 2000). On the basis of archaeological evidence, the Sargat culture has been ascribed to a zone of intermixture between the Iranian steppe peoples to the south, such as the Saka or Sarmatians, and native Ugrian and/or Siberian populations (Koryakova and Daire 2000; Matveeva 2000; Andrey Shpitonkov et al., personal communication, 2004). Previous craniological research has also suggested some intrusion of Iranian peoples from the south (Matveeva 2000).


I've written before about the intrusion of Iranian speakers into Uralic territory, so this is a nice confirmation of the fact:
The southern sites were both successful in all phases to varying degrees. The results can be seen in Table 2. The three Kurtuguz individuals belonged to haplogroups A, C, and Z.

...

The four Sopininsky samples represent two different graves, corresponding to two individuals. The kurgan burial included a tooth and a rib sample, which resulted in a sequence belonging to haplogroup T (more specifically, T1). This sequence is a relatively uncommon variant of T/T1 having the mutation 16243C. The flat grave included a tooth and a metatarsal sample, both of which yielded a sequence belonging to haplogroup Z, with one amplification showing a double peak (C/T) at position 16224.
From the paper:
Furthermore, the specific subtype T1 tends to be found farther east and is common in Central Asian and modern Turkic populations (Lalueza-Fox et al. 2004), who inhabit much of the same territory as the ancient Saka, Sarmatian, Andronovo, and other putative Iranian peoples of the 2nd and 1st millennia BC.
...
The haplogroups of the other samples—A, C, and the two variants of Z—are typical of Siberian populations. Haplogroups A, C, and Z are common in northern Asia, particularly north of the Altai Mountains and the Amur River (i.e.,Siberia), and they decrease in frequency as one moves south, with haplogroup Z being rare at best (as one might expect, there is one individual of haplogroup Z present in the Iranian sample discussed earlier, three members of haplogroup C, and only three individuals with a variant of haplogroup A). In fact, haplogroups A, C, and Z along with haplogroups D, G, and Y constitute approximately 75% of the haplogroups of Siberia (Derenko et al. 2007; Mishmar et al. 2003).
The authors make a good point that this T in Siberians cannot be the result of Slavic expansion, as that postdates these ancient DNA samples. So, the picture seems reasonably consistent with what I know about Siberian prehistory, namely the presence of a Paleolithic substratum of east Eurasian origin, that was modified at its fringes by movements of Scytho-Sarmatian type of people of the steppes, and, more recently, by the expansion of the Russian Empire.

Human Biology, Volume 82, Number 2, April 2010

Investigation of Ancient DNA from Western Siberia and the Sargat Culture

Casey C. Bennett, Frederika A. Kaestle

Mitochondrial DNA from 14 archaeological samples at the Ural State University in Yekaterinburg, Russia, was extracted to test the feasibility of ancient DNA work on their collection. These samples come from a number of sites that fall into two groupings. Seven samples are from three sites, dating to the 8th-12th century AD, that belong to a northern group of what are thought to be Ugrians, who lived along the Ural Mountains in northwestern Siberia. The remaining seven samples are from two sites that belong to a southern group representing the Sargat culture, dating between roughly the 5th century BC and the 5th century AD, from southwestern Siberia near the Ural Mountains and the present-day Kazakhstan border. The samples are derived from several burial types, including kurgan burials. They also represent a number of different skeletal elements and a range of observed preservation. The northern sites repeatedly failed to amplify after multiple extraction and amplification attempts, but the samples from the southern sites were successfully extracted and amplified. The sequences obtained from the southern sites support the hypothesis that the Sargat culture was a potential zone of intermixture between native Ugrian and/or Siberian populations and steppe peoples from the south, possibly early Iranian or Indo-Iranian, which has been previously suggested by archaeological analysis.

Link

November 07, 2010

Multidimensional scaling and ADMIXTURE across Northern Eurasia corresponds to geography and language

Here is a multi-dimensional scaling plot of a number of North Eurasian populations. In comparison to my previous post, I have excluded Americans and Greenlanders, and added several other populations from Central Asia and West Eurasia.

Population labels have been printed in the co-ordinates of the population averages; these largely correspond with identifiable blobs of colored points, but note that some populations have several outliers, so labels appear in white space. Most notable in that respect are the Koryak, Chukchi, and the Nganasan, all of whom have some apparently European-admixed individuals.
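The label placement described above amounts to taking a per-population mean of the plot co-ordinates. A minimal sketch follows, with hypothetical co-ordinates chosen to show how a single admixed outlier drags a label away from the main blob of points.

```python
from collections import defaultdict

def population_centroids(points):
    """points: iterable of (population, x, y), one tuple per individual."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for population, x, y in points:
        acc = sums[population]
        acc[0] += x
        acc[1] += y
        acc[2] += 1
    return {pop: (sx / n, sy / n) for pop, (sx, sy, n) in sums.items()}

# Hypothetical MDS co-ordinates: three clustered individuals and one outlier.
toy_points = [
    ("Nganasan", 0.50, 0.40),
    ("Nganasan", 0.52, 0.42),
    ("Nganasan", 0.48, 0.38),
    ("Nganasan", -0.30, 0.02),  # hypothetical European-admixed individual
]
centroids = population_centroids(toy_points)
print(centroids["Nganasan"])
```

The centroid lands well to the left of the three clustered individuals, which is why some labels end up in white space. A median would be more robust to such outliers, at the cost of no longer being a true average.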


"Mongol" corresponds to the Rasmussen et al. (2010) Mongol sample, while "Mongola" corresponds to the HGDP-CEPH one. The population codes on the left may not be clearly visible as they overlap with each other; they are CEU, LT, HU (relatively unadmixed Caucasoids), FI/RU (Uralian-admixed northern Caucasoids), and IR/TR (Altaic-admixed southern Caucasoids). The West Eurasian part of the plot can be seen blown up on the right.

The correspondence with geography and language is striking. Siberian isolates from the extreme north and east, Koryak and Chukchi, are on top; HapMap Chinese are at the bottom. Between them are Uralians (Selkup, Yukagir, Nganasan) and Altaics (Mongol-Tungus-Turkic people).

Below is ADMIXTURE analysis for the same set of populations, for K=7:


Finns and Russians seem to have an excess of the "Nganasan" component over the Altaic, while Turks have the opposite. Below is a table of Fst distances between components:


The close relationship between the two Caucasoid components is apparent (Fst=0.033), but note fairly large Fst divergences between the morphologically Mongoloid groups. I attribute this mostly to the very low population sizes of these groups, which have probably affected them by drift. For the less demographically constrained Altaic and East Asian components, Fst=0.044.
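For readers unfamiliar with Fst, here is a minimal sketch of how such a distance can be computed from allele frequencies. It uses the textbook definition Fst = (Ht - Hs) / Ht for two equally weighted groups, pooled across loci. This is not necessarily the estimator ADMIXTURE uses internally, and the frequencies are hypothetical.

```python
def fst_two_groups(freqs_a, freqs_b):
    """Wright's Fst for two equally weighted groups at biallelic loci.

    freqs_a, freqs_b: allele frequencies (0..1) at the same loci.
    Returns sum(Ht - Hs) / sum(Ht) over loci.
    """
    if len(freqs_a) != len(freqs_b):
        raise ValueError("frequency lists must have the same length")
    total_h = 0.0
    within_h = 0.0
    for pa, pb in zip(freqs_a, freqs_b):
        p_bar = (pa + pb) / 2.0
        total_h += 2.0 * p_bar * (1.0 - p_bar)                       # Ht
        within_h += (2.0 * pa * (1.0 - pa) + 2.0 * pb * (1.0 - pb)) / 2.0  # Hs
    if total_h == 0.0:
        return 0.0  # no variation at any locus
    return (total_h - within_h) / total_h

# Hypothetical frequencies at two loci.
fst_example = fst_two_groups([0.1, 0.5], [0.3, 0.5])
fst_identical = fst_two_groups([0.2, 0.7], [0.2, 0.7])
print(round(fst_example, 4), fst_identical)
```

Small, isolated groups drift quickly, so their allele frequencies diverge and Fst rises even without much time having passed, which is consistent with the interpretation given above.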

If you are not familiar with these ethnic groups, the Red Book of the Peoples of the Russian Empire and the Ethnologue indexes on Altaic and Uralic are invaluable, as are the portraits of ethnic groups of China. On the right a picture of a Nganasan.

UPDATE: Also, a past post from the blog, collating Y-haplogroup N frequencies with anthropological descriptions. Nganasans apparently belong to haplogroup N at a frequency of 92.1%!

November 06, 2010

ADMIXTURE in Siberia, Greenland, and Alaska

I have discovered a great dataset from Rasmussen et al. (2010). The data had been used before in conjunction with an ancient DNA sequence, but for me it is invaluable, as it fills up one of the major holes in Eurasia, namely Siberia, and includes a number of Altaic, Uralic, and other North Eurasian people. I suspect that this will be invaluable in fine-tuning the Northeast Asian ancestry of Dodecad Project members.

To begin with, after I processed the data, I ran ADMIXTURE on it up to K=7. Below you can see the results for K=7:

I'm no expert in linguistics, but it's clear to me that the light blue component corresponds to Altaic speakers. It will be extremely interesting to see what the analysis including other Altaic speakers from my other datasets as well as West Eurasians of Uralic/Altaic language or with "Northeast Asian" admixture will show.

The table below has sample sizes and admixture proportions.

Stay tuned. More to come.

September 08, 2010

ASHG 2010 abstracts

The 2010 meeting of the American Society of Human Genetics is in November. Here are some interesting abstracts that caught my eye:

It's nice to finally see a genomic study on the Greek population.
P. Paschou et al. Evaluation of the HapMap dataset as reference for the Greek population.
The HapMap project has provided a unique tool for the analysis of human genetic variation, providing reference information for allele frequency and genotype distributions as well as linkage disequilibrium patterns of Single Nucleotide Polymorphisms (SNPs) across the entire genome. The latest release of HapMap phase 3 data provides genotypes for millions of SNPs in 11 populations from around the world, with Europe being represented by the CEU (originating from Northwestern Europe) and the TSI populations (Tuscan Italians from Southern Europe). Although initial studies support the fact that the CEU can be used as reference for the selection of tagging SNPs in other European populations, a critical step in the design of genetic association studies, this hypothesis has not been extensively studied across Europe and in particular in Southern Europe. We set out to explore the extent to which the HapMap populations can be used as reference for a previously unstudied population of South-Eastern Europe, the Greek population. To do so we studied genomic variation in 1,813 SNPs, genotyped by our group in 56 individuals of Greek origin, and compared them to the CEU and TSI genotypes (1,813 SNPs from the CEU HapMap dataset and 1,205 from the TSI dataset). The studied SNPs are spread over 13 autosomal chromosomes and 26 regions, ranging in size from 120Kb to more than 4Mb. Genotype, allele frequency, and pairwise LD measures were compared across all three populations. PCA was used in order to identify those markers that are responsible for the observed inter-sample variance. Tagging SNPs were selected in the CEU and TSI samples and their transferability to the Greek population was tested, using both the r2 metric as well as the efficiency of genotype imputation of the non-selected SNPs. 
Our results demonstrate that, although the CEU population can to some extent be used as reference for the Greek population, it is preferable to use as reference a European population of closer genetic ancestry, like the TSI. These results are applicable in medical genetics, in order to inform the design of genetic association studies, as well as in studies of evolutionary relationships of Southern European populations.
One of the great problems of Eurasian anthropology is whether the Uralic populations are simply variable admixtures of Caucasoids and Mongoloids or they contain a tertium quid in the form of a Proto-Uralic element. The latter need not be distinct from the other two, as it can also be an old or stabilized blend of the two major Eurasian races that later admixed with more recent groups on either side. The abstract does not seem promising in this respect, i.e., in identifying a common core of ancestry among Uralic speakers in addition to their variable east-west admixture, but it would be nice to see if anything like that exists in the paper.

K. Tambets et al. Haploid and autosomal variation within a linguistic continuum of the Uralic-speaking people of Eurasia.
For about last two decades the examination of uniparentally inherited genetic marker systems revealing the variation embedded in mtDNA and Y chromosome has been the main tool in the studies of human genetic origins. Within few recent years the analysis of the genome-wide SNP data of individuals from different populations has started to give promising new insights in the field of human population genetics. The uniparentally inherited markers have shown slightly different demographic scenarios for the maternal and paternal lineages of North Eurasian, particularly of European Uralic-speaking populations. The geographical location of a population has evidently been the most important component that dictates the proportion of western and eastern mtDNA types in the gene pool of Uralic-speakers. Thus, the palette of maternal lineages of the Uralic-speakers resembles that of their geographically close European or Western Siberian Indo-European and/or Altaic-speaking neighbours, respectively. At the same time, the most frequent North Eurasian Y chromosome type N1c, that is also a common link between almost all Uralic-speakers, is with few exceptions rare, if present at all, among Indo-European-speakers of Western and Southern Europe. Here we combine genome-wide high density SNP data (650 000 SNPs, Illumina) with uniparentally inherited mtDNA and Y-chromosome variation of 16 Uralic-speaking populations to assess their place on the genetic landscape of North Eurasia. By the use of principal component and structure-like analysis on the autosomal data we show that the proportions of western and eastern ancestry components among the Uralic-speakers are determined mostly by geographical factors. The westernmost populations from Europe, both Uralic- and Indo-European speakers, are similar in their pattern of ancestry components and show low levels (less than 10%) of the eastern component. 
Conversely, the eastern ancestry component is dominant (60-70%) in the gene pool of the Siberian Uralic-speakers. In general, the genome-wide analyses corroborate the results of mtDNA analysis and do not reflect the common genetic characteristics between western and eastern Uralic-speakers at the level seen in case of N1c. Interestingly, among Saami from North Europe, who are often considered as „outliers“ in genetic studies, the dominant western component is accompanied by 30% of eastern component making them more similar to Volga-Uralic populations than to their closest neighbours.



This seems to validate my thoughts on relics and their importance in age estimation.

U. A. Perego et al. The Initial Peopling Of The Americas: An Ever-Growing Number Of Founding Mitochondrial Genomes From Beringia
Genetic evidence based on mitochondrial DNA (mtDNA) has recently revealed the existence of additional founding lineages that have contributed to the first peopling of America’s double-continent in addition to the more popular five Native American haplogroups (A2, B2, C1, D1 and X2a), and has demonstrated as well the need for additional sampling and analysis to be performed for some of the already known but poorly characterized lineages. One paradigmatic example is represented by the pan-American haplogroup C1. Two of its sub-branches (C1b and C1c) harbor ages and geographical distributions that are indicative of an early arrival from Beringia about 15-17,000 years ago, concomitantly with the other currently accepted Paleo-Indian founders. However, the estimated age of C1d - the third Native American subset of C1 - is only 8-10,000 years, which is suggestive of a much later entry and spread in the Americas. In this study, we shed light on the origin of this enigmatic Native American branch of C1 by completely sequencing a large number of C1d mitochondrial genomes from a wide range of geographically diverse, mixed and indigenous American populations. The revised phylogeny shows that the age previously reported for C1d was heavily underestimated and indicate that C1d is ancient enough to be among the founding Paleo-Indian mtDNA lineages. Moreover, our results reveal that there were two C1d founder genomes for Paleo-Indians that most likely arose early (~16kya), either in the dynamic Beringian gene pool, or at a very initial stage of the Paleo-Indian southward migration. This brings the recognized maternal founding lineages of Native Americans to the unexpected number of 15, and indicates that the overall number of Beringian or Asian founder mitochondrial genomes will probably continue to increase as more Native American haplogroups reach the same level of phylogenetic resolution as we obtained here for C1d. 
Additionally, we have confirmed a nearly identical geographic distribution pattern for haplogroup C1d when comparing samples collected in the general mixed population with those from native tribal groups, as it was also reported previously for haplogroups X2a and D4h3. This substantiates the validity of searching large public mtDNA databases (such as the one available through the Sorenson Molecular Genealogy Foundation, www.SMGF.org) for novel founder candidates able to reveal unknown details concerning the ancient human history of the Americas.

Another interesting abstract. I've written before about the association of Y-chromosome haplogroups with the spread of Semitic speakers and the agreement with language phylogenetics.

N. Al-Zahery et al. The male gene pool of the contemporary Mesopotamia marsh population supports their Semitic origin.
The origin of the modern Mesopotamia marsh people, which are locally called “Ma’dan” or “Marsh’s Arabs”, is a question of great interest. Based on their life-style (living in reed houses, grazing of water buffalo and other aspects) and local archaeological sites, many historians and archaeologists believe they may have Sumerian ancestry. Although little is known about the origin of Sumerians themselves, two main hypotheses have been advanced in this regard. According to the first, Sumerians were a group of populations which migrated from the “South East” following a seashore route through the Arabian Gulf, and settled down in the southern marshes of Iraq. According to the second, the advancement of the Sumerian civilization is the result of migration from the mountainous area of Anatolia to the southern marshes of Iraq where they settled, adsorbing previous populations. In order to shed some light on the genetic origin of the Mesopotamia marsh population, we investigated the male gene pool of 145 DNA samples of modern Mesopotamia people, still living in marshes in the south of Iraq. The analyses of Single Nucleotide Polymorphisms (SNPs) and Short Tandem Repeats (STRs) of the paternally transmitted Male Specific region of the Y chromosome (MSY) revealed that more than 80% of marsh Y chromosomes belong to (Hg) J1-M267, the autochthonous haplogroup of Middle Eastern/Semitic speakers with possible recent expansion and/or founder effect reflected by the reduced STRs variability. In particular, 90% of them were assigned to the J1e-M267-PAGE08 sub-haplogroup, which is the predominant Y chromosome lineage among Middle Eastern Arab populations (Yemen, Qatar, UAE, and Levant). Thus, these findings testify, at least from the paternal side, a strong Semitic Arabian component in the contemporary Mesopotamia marshes population, whereas no clear Anatolian and/or South Asian genetic evidence has been detected.
The finding of haplogroup I in China is surprising, as I is not generally found that far away from Europe. It would be interesting to see what the actual haplotypes are.
Y. Lu et al. Western Eurasian Y chromosomes found in the Chinese Salar ethnic group
Salar is a small Western-Turkish-speaking population living mostly in Qinghai province of China. The most similar languages to Salar are all far in Turkmenistan. Historical records suggested that they may be descendants of the Turkic nomadic tribes in Central Asia. In this study, 141 Salar Y chromosomes were analyzed for 39 SNP and 14 STR markers to investigate the potential imprints of their western ancestors. The most frequent haplogroup (hg) in this population sample is Hg R, comprising 40% of all Y chromosomes. Most of these Hg R samples belong to R1a1 (M17), which distributes in a wide geographic region including South Asia, East Europe, Central Asia, and South Siberia. Other four Western Eurasian haplogroups (G-2%, H-5%, I-3%, J-3%) were also found in Salar Y chromosome gene pool. These paternal lineages of Salar are absent in their East Asian neighbors but frequent in Central Asia. Y-STR-based analyses also grouped Salar to Central Asians. On the other side, Salar also has low frequencies of the East Asian specific Hg D and Hg O, suggesting possible gene flow from their neighboring populations. This Y chromosome study demonstrated that Salar well keeps the Western Eurasian paternal lineages of their Central Asian ancestors although they may have migrated to Central China for about 800 years.

I wish that more "people pairs" would be studied this way, as it would give us some good insight of how migration affects gene pools (allele frequency changes, founder effects, possible social selection etc.)

M. Davis et al. Ancient and recent demographic events influence mitochondrial DNA diversity in an immigrant Basque population
The Basques are an ancient people, considered by many anthropologists to represent the oldest extant European population. Because of this, they have been the subject of numerous sociological and biological investigations. The Basque Diaspora, a relatively recent demographic expansion of the Basque population, has until now been overlooked in genetic studies. Samples were taken from 53 individuals with Basque ancestry in Boise, Idaho, and the mitochondrial DNA (mtDNA) sequence variation of the first and second hypervariable regions were determined. Thirty-six mtDNA haplotypes were detected in the sample. Comparing the genetic diversity in the Idaho sample with other Basque populations, signatures of founder effects were observed, consistent with both the recent and ancient history of Basque mitochondrial lineages. There has been a marked alteration of haplogroup frequency and diversity, and there is a slight reduction in other measures of diversity in the NW Basque population compared to the native Basque population. We have found a relatively high percentage of the Cambridge Reference Sequence (rCRS) haplotype for hypervariable regions I and II, which is absent in previous studies of Basque mtDNA, and rare in other Spanish populations. The amount of nucleotide diversity is consistent with a sample that is predominantly haplogroup H, which is especially common in the Basque regions of Europe, due to ancient migrations and expansions out of glacial refugia. This is the first report of mtDNA diversity in an immigrant Basque population, and we find that the diversity in NW Basques can be explained by the recent history of migration, as well as the phylogeography and diversity of the major European haplogroups.


W. S. Watkins et al. Admixture in New World populations: an analysis of Y-chromosome, mtDNA, and genome-wide microarray data
The first major interaction between Native Americans and Europeans is documented historically and occurred less than 550 years ago. This recent time frame provides an excellent opportunity to investigate the effects of admixture between two populations that were previously separated for hundreds of generations. To characterize European admixture in Native American populations, we sampled and analyzed a group of isolated Totonac agriculturists from tropical Mexico near Veracruz and a group of native Bolivians predominantly from the mountainous region near La Paz, Bolivia. Mitochondrial sequencing of HVS1 showed that all samples had pre-Columbian mtDNA haplogroups (A, B, C, and D). Using a panel of 48 STRs or 12 Y-chromosome SNPs, Totonac Y-chromosome lineages were all assigned to the pre-Columbian haplogroup Q1a3a, and Bolivian Y-chromosome lineages were assigned to haplogroups Q1a3a, R1, and J2. Haplogroups R1 and J2 are common in European populations. Principal components analysis (PCA) using >800K autosomal SNPs typed in 24 Totonacs and 23 Bolivians showed that all Totonacs and 14 Bolivians clustered distinctly from Eurasian individuals. Nine Bolivians, however, were positioned between the New World and European PCA clusters. Admixture analysis showed that these nine samples had 21-33% European admixture using a European reference population. All three observed Y-chromosome haplogroups, including the well-studied pre-Columbian haplogroup Q1a3a, occurred in the admixed individuals. Two of the nine admixed individuals had pre-Columbian mtDNA and Y-chromosome haplogroups but 21-23% European ancestry. This result demonstrates that Y-chromosome and mtDNA haplogroups are only partial indicators of an individual’s complete ancestry.

Readers of the blog know that I don't agree with the scenario presented in the following abstract. The serial founder effect idea is used by geneticists to explain the overall reduced genetic diversity of our species (that we appear to be young, in evolutionary terms). Personally, I don't see how a smart, expanding species that all of a sudden had access to the resources of the landmass of Eurasia went through these extreme bottlenecks.
I think that the alternative has merit: a larger human population, genetic diversity reduced across the species by ongoing climate- and culture-mediated selection, and admixture within Africa itself, where a particular expanding H. sapiens group must have co-existed with pre-existing hominids, anatomically modern or not.
J. Long et al. Evidence for archaic admixture in contemporary non-African human populations
Analyses of large-scale genetic data sets show evidence for a series of founder effects that occurred as modern humans left Africa and settled the rest of the world. Nonetheless, research on modern humans has not ruled out the possibility that other processes, such as local gene flow, or mixing between archaic and modern humans, have also contributed to modern human diversity. Recent analyses of the Neanderthal genome make archaic admixture a salient issue because they show evidence for mixing between Neanderthals and out-of-Africa migrants. The present study examines evidence for archaic admixture in genotypes for 619 microsatellite loci collected from over 2,000 individuals from 100 human populations. We obtained these data from the Marshfield Clinic collection. The populations analyzed represent all inhabited continents of the world. In our analysis, we formulate the serial founder effects (SFE) model as a special case of a phylogenetic model promoted by Cavalli-Sforza and his associates. In this light, the SFE process makes four predictions: 1) A tree of descent according to the pattern of fissions. 2) The root of the tree lies in Africa. 3) The length of each branch is proportional to the ratio of evolutionary time to effective population size. 4) The gene identity between all pairs of populations that share the same most recent common ancestor is equal in expectation. Using hypothesis tests based on generalized hierarchical statistical models, we find good agreement between the SFE predictions and diversity within and between African populations, and we find good agreement between the SFE predictions and diversity between non-African populations. However, there is more diversity within the non-African populations than the SFE model can account for. This makes for greater genetic distance between Africans and non-Africans than otherwise expected. How and where did the non-Africans obtain this diversity?
A simple explanation for the finding is that the earliest migrants out-of-Africa mixed with an archaic population such as Neanderthals prior to their expansion throughout Europe and Asia. Coalescent based computer simulations of the SFE model with mixing support our interpretation. The time and place that we detect mixing coincides perfectly with that detected in a recent examination of Neanderthal genome sequences. Our study shows that genomic diversity in modern humans still reflects ancient events and processes.

C. Flores et al. Using EuroAIMs to measure admixture proportions in atypical European populations: the case of Canary Islanders
Using ancestry informative markers (AIMs) allows reducing the number of markers needed for population stratification adjustments in association studies. As few as 100 AIMs are sufficient to adjust for the largest European axis of differentiation (i.e. EuroAIMs). However, their use for ancestry inference and adjustment in association studies in atypical European populations such as the Canary Islanders, a recently African-admixed population from Spain, needs to be addressed. We aimed to explore whether EuroAIMs were suitable both for the inference of Spanish and Northwest African admixture proportions and for ancestry adjustments in association studies including samples from Canary Islanders. We analyzed samples from Canary Islanders, mainland Spanish (IBE) and Northwest Africans (NWA) for 93 EuroAIMs and compared the data with CEU and YRI from HapMap, Basques and Mozabite from HGDP, as well as from previously analyzed European samples. The major genetic difference was observed between NWA and all European populations, preserving the northwest-to-southeast differentiation of European populations in the second axis. Analyses revealed that Canary Islanders were intermediate between IBE and NWA, and that direct sub-Saharan African influences were negligible. Assessment of individual admixtures without prior population information clearly identified two subpopulations corresponding to NWA and IBE, while Canary Islanders were admixed with an average of 17.4% Northwest African contribution varying largely among individuals (range 0-95.7%). As few as 23 EuroAIMs correctly estimated population membership to IBE and NWA, while 69 EuroAIMs were required to accurately estimate individual admixture proportions in Canary Islanders. Ancestry estimates based on a subset of 69 EuroAIMs also controlled significant allele frequency differences between IBE and Canary Islanders.
These data suggest that a handful of EuroAIMs would be useful to control false-positives in association studies performed in Spanish populations. Supported by FUNCIS 23/07 and grants from the Spanish Ministry of Science and Innovation PI081383 and EMER07/001 to CF.
As I have mentioned before, the Maasai (and many other east Africans in various degrees) are intermediate between Negroids and Caucasoids, and hence admixture estimates considering Yoruba Nigerians would tend to underestimate the African element. It's important to remember that extant Africans are not uniform, ranging from Caucasoids to Negroids, Pygmies, and Khoi-San, with multiple identifiable clusters within the major Negroid group itself, and all sorts of between-group gene flow on a regional basis. It is always useful (as is the case e.g., with African Americans) to both use historical knowledge about population sources, and also to validate historical narratives with the genetic evidence.
R. L. Raaum et al. Autosomal African admixture in Yemeni populations.
Approximately 30% of mtDNA lineages in South Arabian samples are African L haplotypes, whose origin has usually been attributed to migration and assimilation of African females into the Arabian population over approximately the last 2,500 years. In contrast, few Y chromosome lineages of clear recent sub-Saharan African origin have been found in Southern Arabian populations. This bias in maternal and paternal lineages is in accord with historical accounts of the female bias in the Middle Eastern slave trade. In order to evaluate autosomal African ancestry, we collected high-resolution SNP genotype data from a geographically representative set of 62 Yemenis selected from a collection of 552 samples acquired in the Spring of 2007. The ancestry of chromosomal segments in the Yemeni population was estimated using a haplotype-based local ancestry estimation method, HAPMIX. The HAPMIX method is based on a two-way admixture model that requires two phased reference populations; we used the HapMap Yoruba in Ibadan, Nigeria (YRI), Luhya in Webuye, Kenya (LWK), Maasai in Kinyawa, Kenya (MKK), and CEPH US residents with ancestry from northern and western Europe (CEU) samples. The three African reference populations include two Bantu-speaking groups (YRI and LWK) and one Nilotic-speaking group (MKK). We estimated local ancestry in the Yemeni sample with all three European-African reference population combinations (CEU-YRI, CEU-LWK, CEU-MKK). The correlations among African ancestry calculated using all three reference population combinations are high (r > 0.98 in all pairwise correlations). Furthermore, there is no significant difference between the average proportion of African ancestry in Yemenis calculated using either of the two Bantu-speaking reference populations: CEU-YRI (mean 0.062, sd 0.044) and CEU-LWK (mean 0.076, sd 0.049) (p=0.13, two-tailed Welch two sample t-test).
However, the average African ancestry calculated using the Maasai reference population (CEU-MKK, mean 0.148, sd 0.060) is significantly greater than that calculated using either the Yoruba or Luhya reference populations (p less than 0.0001 in both comparisons, two-tailed Welch two sample t-test). These data suggest that the source population for the African ancestry of the Yemeni population is more similar to the contemporary Maasai population than either the Luhya or Yoruba.
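
For readers who want to see what a Welch two-sample t-test does with summary numbers like these, here is a minimal sketch. It is my own illustration, not the authors' analysis: the means and standard deviations are the ones quoted in the abstract, but the per-group sample size of 62 is an assumption (the abstract gives the number of Yemenis, not the n used in each test), and the function name `welch_t` is hypothetical. Because the inputs are rounded, the result should not be expected to reproduce the reported p-values exactly.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom
    computed from summary statistics (unequal variances allowed)."""
    v1 = sd1 ** 2 / n1  # squared standard error of group 1
    v2 = sd2 ** 2 / n2  # squared standard error of group 2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

N = 62  # ASSUMPTION: every estimate is based on all 62 genotyped Yemenis

# Luhya-based vs Yoruba-based estimates (the two Bantu-speaking references)
t_bantu, df_bantu = welch_t(0.076, 0.049, N, 0.062, 0.044, N)

# Maasai-based vs Yoruba-based estimates
t_maasai, df_maasai = welch_t(0.148, 0.060, N, 0.062, 0.044, N)

print(f"LWK vs YRI: t = {t_bantu:.2f}, df = {df_bantu:.1f}")
print(f"MKK vs YRI: t = {t_maasai:.2f}, df = {df_maasai:.1f}")
```

Under these assumptions the Maasai comparison gives a much larger t statistic than the Luhya comparison, which is the qualitative pattern the abstract reports.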
The next abstract seems fun; it's always nice to see something that isn't like everything that came before it.
T. Rzeszutek et al. Music as a novel marker in the study of prehistoric human migrations.
The study of prehistoric human population history is often fraught with controversy owing to incongruent evidence among various markers of present-day genetic and cultural diversity. While archaeological evidence can be used to calibrate the conclusions drawn from present-day diversity, the fickle nature of the fossil record leaves some migration histories unresolved. Our work analyzes the potential of music - in particular, vocal music - to serve as a novel migration marker, bolstering established migration work and shedding light on regions of the world whose settlement history is contested. One such migration is the recent expansion of Austronesian-speaking peoples across the Pacific within the last 6000 years. The dominant hypothesis posits a recent origin in Taiwan, with a rapid movement southwards and eastwards to populate Polynesia during the following 3500 years. While this model is strongly supported by both archaeological evidence and the present-day distribution of linguistic diversity, our goal was to analyze whether music could serve as a novel line of evidence in the study of Pacific prehistory. A critical concern regarding any migration marker is its time depth. In order to examine this for music, we analyzed correlations between musical diversity and mitochondrial-DNA diversity in 9 Taiwanese aboriginal tribes for which both types of data were available. A sample of 226 choral songs was analyzed using 39 binary characters representing significant structural features of music (e.g., rhythm, interval size, melodic contour, etc.). The musical samples were restricted to ritual musics, which constitute the most conservative (i.e., slowly changing) component of a culture’s repertoire. Mantel tests showed a significant correlation between musical distance and genetic distance among these 9 tribes, suggesting that music may have a time depth comparable to widely-used genetic markers like mitochondrial DNA.
This work demonstrates that music has the potential to enrich the conclusions drawn from other markers, and establishes methods for employing it as a tool in the study of prehistoric human movements throughout the world. At the same time, we want to capitalize on music’s own unique dynamics of change over time and place, particularly its capacity for admixture. In other words, music might not only be able to support the narratives told by other migration markers but shed new light on the histories of population movement and cultural contact.
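
The Mantel test mentioned in the abstract correlates two distance matrices over the same set of populations and judges significance by permuting the population labels of one matrix. The sketch below is only an illustration of that general idea: the five toy "populations" and their coordinates are invented, the function names are hypothetical, and nothing here comes from the study's data or software.

```python
import itertools
import math
import random

def upper_triangle(matrix, order):
    """Pairwise distances above the diagonal, with rows/columns relabelled by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]]
            for i, j in itertools.combinations(range(n), 2)]

def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def mantel(dist_a, dist_b, permutations=999, seed=0):
    """One-sided Mantel test: observed correlation and permutation p-value."""
    identity = list(range(len(dist_a)))
    b_vec = upper_triangle(dist_b, identity)
    observed = pearson(upper_triangle(dist_a, identity), b_vec)
    rng = random.Random(seed)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        perm = identity[:]
        rng.shuffle(perm)  # relabel the populations of matrix A
        if pearson(upper_triangle(dist_a, perm), b_vec) >= observed:
            hits += 1
    return observed, hits / (permutations + 1)

# Toy data (INVENTED): positions of 5 populations on two hypothetical scales
musical = [0.0, 1.0, 2.0, 3.0, 4.0]
genetic = [0.0, 1.1, 1.9, 3.2, 4.0]
dist_music = [[abs(a - b) for b in musical] for a in musical]
dist_gene = [[abs(a - b) for b in genetic] for a in genetic]

r, p = mantel(dist_music, dist_gene)
print(f"Mantel r = {r:.3f}, permutation p = {p:.3f}")
```

Permuting whole rows and columns together, rather than shuffling individual distances, is what keeps the test valid, because the entries of a distance matrix are not independent of each other.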


The bolded part in the following abstract makes sense, as it indicates (i) the distinctiveness of Ashkenazi Jews compared to CEU Europeans, and (ii) the fairly recent and widespread admixture (in the last couple of generations) that produced individuals who are genomically 1/4, 1/2, and 3/4 AJ.

V. Vacic et al., Admixture in Ashkenazi Jewish cohorts and implications for association studies.
Studies of complex genetic disorders may benefit from focusing on population isolates, such as Ashkenazi Jews (AJ). However, in order to truly exploit the advantages of reduced genetic diversity the self-declared AJ ancestry of study participants should be independently confirmed with available genetic data. We investigate whether the AJ cohorts display genetic heterogeneity, such as e.g. different rate of admixing in cases and controls, which could potentially confound disease association studies. We applied principal component analysis (PCA) to AJ cohorts ascertained in Israel and the US East Coast with the goal of characterizing population structure. As described previously, when compared to the HapMap samples with CEU, YRI and CHB/JPT ancestry, virtually all AJ samples cluster with the CEU. Similar analysis done on CEU and Jewish HapMap samples from Ashkenazi, Sephardic and Middle Eastern Jewish communities revealed that 97.8% of AJ samples cluster along the AJ-CEU axis, with modes at AJ and CEU cluster centers and at approximately quartile distances between them. We postulate that these groups correspond to 100-0, 75-25, 50-50, 25-75, and 0-100% AJ-CEU admixtures. Notably, only 91.7% of self-reported AJ individuals fall into the reference JHapMap panel AJ cluster, with 1.6, 3.3, 0.5 and 0.7% in the admixed modes ordered by decreasing fraction of AJ ancestry. We also observe admixing with the non-AJ Jewish communities: 0.7% of samples fall within the non-AJ clusters and 1.4% at a subgroup approximately halfway between the AJ and non-AJ cluster centers. In our dataset we found that when compared to the sample as a whole or only to controls, individuals with Crohn’s disease (CD) show significantly more admixing: 78.1, 3.1, 8.5, 2.0 and 0.9% in the 100, 75, 50, 25 and 0% AJ subgroups respectively. Also, CD samples show more admixing with non-AJ groups (2.8 and 1.0% in the 50-50 and 0-100 AJ-non-AJ subgroups). 
Isolates typically exhibit a greater amount of cryptic relatedness compared to outbred populations, which motivates an orthogonal method for verifying AJ ancestry based on identity-by-descent (IBD). The high background level of IBD within the Ashkenazi Jewish community can be used to estimate degree of AJ ancestry by averaging the IBD between a sample under study and the AJ individuals in the JHapMap panel. Our preliminary results show that this method recapitulates the high-level results from the PCA analysis and provides better resolution.
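
The quartile modes along the AJ-CEU axis can be pictured with a small sketch: project a sample's PCA coordinates onto the line joining the two cluster centers, then snap the result to the nearest of the five modes (100, 75, 50, 25, 0% AJ). This is my own toy illustration, not the authors' procedure; the cluster centers, sample points, and function names are all made up.

```python
def axis_fraction(point, aj_center, ceu_center):
    """Position of `point` projected onto the AJ-CEU axis:
    0.0 at the AJ cluster center, 1.0 at the CEU cluster center."""
    axis = [c - a for a, c in zip(aj_center, ceu_center)]
    rel = [p - a for p, a in zip(point, aj_center)]
    return sum(r * d for r, d in zip(rel, axis)) / sum(d * d for d in axis)

def nearest_mode_percent_aj(fraction):
    """Snap a projected position to the nearest quartile mode, as percent AJ."""
    quartile = min(range(5), key=lambda k: abs(fraction - k / 4))
    return 100 - 25 * quartile

# HYPOTHETICAL cluster centers in the space of the first two principal components
AJ_CENTER = (0.0, 0.0)
CEU_CENTER = (4.0, 0.0)

# HYPOTHETICAL samples: (PC1, PC2)
samples = [(1.1, 0.3), (2.0, 1.0), (3.9, -0.2)]
modes = [nearest_mode_percent_aj(axis_fraction(s, AJ_CENTER, CEU_CENTER))
         for s in samples]
print(modes)
```

A real analysis would also have to decide what to do with samples that fall far off the axis, which this sketch ignores.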