Showing posts with label Daghestan. Show all posts

June 07, 2012

Languages of the Caucasus map


Quite an interesting effort. Re-Mapping Languages of the Caucasus:
Drawing on previously available ethnic and linguistic maps, supplemented by demographic data from other sources, we were able to create two linguistic maps: one representing the whole Caucasus area and the other zooming in on the particularly linguistically diverse region of Dagestan. Our first task was an accurate representation of the spatial distribution of various groups, unlike what is found in previously available maps, which often over-represent or under-represent the extent of linguistic groups. We have used the most recent census data available to capture the wholesale migrations, episodes of ethnic cleansing, and population exchanges that have changed the situation on the ground. Careful mapping of smaller linguistic groups, especially in Dagestan, has proved particularly instructive, as it allowed us to represent visually the correlation of language and topography, something that has not been done before. Having Jake Coolidge on board for this project was especially valuable, as he has employed modern cartography techniques to overlay the linguistic map on a detailed topographic representation. Finally, a careful use of the color scheme allowed us to demonstrate the family relatedness of the various languages spoken in this region, known justifiably as “the mountain of tongues”.
The general map is reproduced on top left. Here is the link for the Dagestan-specific map.

The GeoCurrents site itself seems quite interesting.

April 23, 2011

Genetic structure of West Eurasians

I have decided to generate a new major data dump of ADMIXTURE results. In comparison to previous such experiments:
  1. The focus is entirely on West Eurasians (Caucasoids).
  2. I have excluded all potential relatives from the source datasets, as well as several populations that tend to create uninformative clusters of their own (e.g., Druze or Ashkenazi Jews); exceptions are populations of great anthropological interest (e.g., Basques).
  3. I have included all relevant Dodecad Ancestry Project populations with 5+ participants.
  4. I have developed a new way of "framing" the region of interest by choosing appropriate sets of individuals from outside of it.
"Framing" populations

I have, since the beginning of my ADMIXTURE experiments, emphasized the importance of including appropriate population controls designed to squeeze out minor distant admixture in populations of interest, so that it does not confound the inference of region-specific components.

This leads to a problem: there are many possible sources of admixture. For example, we do not know a priori which set of African populations may have contributed to Caucasoid populations, or which set of East Asian ones. We could choose, e.g., the Yoruba and the Chinese to represent Sub-Saharans and East Asians, but that might exclude possible sources of variation, and lead to Yoruba- and Chinese-specific clusters rather than more general Sub-Saharan and East Asian ones. If we included more population controls, we would cover more possible sources of variation, but ADMIXTURE would infer components of little interest (e.g., Pygmies vs. Bushmen or Mongols vs. Chinese).

To avoid this, I propose to create meta-populations consisting of a single individual from each of many populations, e.g., a Yoruba, a Mandenka, a San, a Mbuti Pygmy, etc. for Sub-Saharan Africa, or a Miaozu, a Han, a Mongol, a She, a Hezhen, etc. for East Asia. That way we help ADMIXTURE infer general components while at the same time preventing it from inferring non-region-specific ones.
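As a rough illustration of the idea, here is a minimal Python sketch of assembling such a meta-population. It is not the code used for this analysis: the function name `build_framing_population`, the `(individual_id, population)` input format, and the sample identifiers are all hypothetical, and picking the individual at random is an assumption (the post does not say how the individual is chosen).

```python
import random


def build_framing_population(samples, populations, seed=0):
    """Pick one individual from each listed population.

    samples: list of (individual_id, population_label) pairs (hypothetical format).
    populations: labels that should each contribute a single individual.
    """
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    by_population = {}
    for individual_id, population in samples:
        by_population.setdefault(population, []).append(individual_id)

    chosen = []
    for population in populations:
        candidates = by_population.get(population, [])
        if candidates:  # skip populations with no available samples
            # Random choice is an assumption; any single individual would do.
            chosen.append(rng.choice(sorted(candidates)))
    return chosen


# Hypothetical sample identifiers; the population names come from the post.
samples = [
    ("yor1", "Yoruba"), ("yor2", "Yoruba"),
    ("man1", "Mandenka"),
    ("san1", "San"), ("san2", "San"),
    ("mbu1", "Mbuti Pygmy"),
]
sub_saharan_frame = build_framing_population(
    samples, ["Yoruba", "Mandenka", "San", "Mbuti Pygmy"]
)
print(sub_saharan_frame)
```

The resulting list has exactly one individual per source population, so no single population contributes enough samples to form a cluster of its own.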

Results

The entirety of the results presented here can be downloaded. They include:
  1. Population sources
  2. ADMIXTURE proportions for populations
  3. Fst divergences between components
  4. Population portraits showing individual level variation
See spreadsheet and associated bundle (or here).

At K=3, we observe the emergence of West Eurasian, Sub-Saharan, and East/South Asian components.

The impact of the Sub-Saharan component is felt most distinctly in North Africa and the Near East, especially among Arabs; the impact of the East/South Asian one in West Asia and Northeastern Europe, especially among Finnic and Turkic speakers.

It is interesting to note that 39.8% of the Indian_D sample is assigned to the E/S Asian component. I had previously estimated, in a roundabout way and in a slightly smaller sample, that the Ancestral South Indian component in Project participants was 33.3%, so ADMIXTURE has roughly managed to infer correctly that about 1/3 of this Indian sample's ancestry is more closely related to East Asians than to West Eurasians.

At K=4, the first split within the Caucasoid group appears: a component centered on Europe, and one on West/South Asia.

Many populations possess both these components in clinal proportions.

The European component shrinks to insignificance in Arabians, such as Saudis and Yemenis.

The West/South Asian component shrinks to insignificance in Northeast Europeans, such as Finns, Lithuanians, north Russians, and Chuvash.


At K=5, a new Mediterranean component emerges. This is highly represented in populations to the North, South, and East of the Mediterranean Sea.

This component is noteworthy for its absence in India and Northeastern Europe.

In Northeastern Europe, the Mediterranean component is hardly represented at all, whereas the West/South Asian component, freed of its K=4 Mediterranean associations, now makes its appearance.

Conversely, in the West Mediterranean, among Basques, Sardinians, Moroccans, and Mozabites, the West/South Asian component vanishes.


At K=6, a North African component emerges.

Notice its presence in the Near East and parts of Southern Europe.

The two regions can be contrasted in terms of their African components, with a very high North African/Sub-Saharan ratio in Europe and a much lower one in the Near East.

The explanation for this seems straightforward: Europe was affected by North Africa in prehistoric and historic times, whereas the Near East also borders more southern parts of the African continent and may additionally have been influenced by the medieval slave trade, which seems to have affected Muslim Near Eastern populations disproportionately.


At K=7, a Southwest Asian component emerges which is highest in Arabia and East Africa. I could've called this Red Sea, but I've reserved this name for a similar component that emerges at higher K.

It is clear that this is the main Caucasoid component present in East Africa.

It vanishes in the northern fringe of Europe: in the British Isles, Scandinavia, and among the Finns and Lithuanians.

Another interesting aspect of its distribution is its presence in Pakistan but not India. Perhaps, in this case, it reflects historical contacts between the Islamic Near East and parts of South Asia.


At K=8, we observe most of the familiar components from the K=10 analysis of the Dodecad Project. However, the use of the framing populations has meant that these components emerge before either Africans or East Eurasians split.

Now, the South Asian component appears, which swallows up most of the E/S Asian component that previously linked South with East Asians. This component extends a great way to the Near East and eastern parts of the Caucasus.

Quite interestingly, the remainder of the Caucasoid component in South Asia that is not absorbed by the new South Asian component seems to be split between the West Asian and North/Central European components, with an absence of the South European component.

Such a combination occurs among the Lezgins of the Caucasus, on the western shore of the Caspian Sea. The same combination of Caucasoid components also occurs in Uzbeks and Chuvash.

I conclude from this that the Caucasoids who entered South and Central Asia were probably derived from the eastern fringes of the Caucasoid world, where only the West Asian (in the south) and North/Central European (in the north) components are present. The area around the Caspian Sea seems like an excellent candidate for their origin, as I have speculated before, as that region has two important properties:
  1. It is transitional between predominantly N/C European populations to the north and predominantly W Asian populations to the south
  2. It is the border of the influence of the S European element, with Georgians possessing some of it, while Lezgins do not.

At K=9, we see the emergence of specific Sardinian and Basque components. Normally this is undesirable, but I believe this breakup serves to divide the previously inferred South European component meaningfully.

What was South European in lower K seems to have an Atlantic vs. Mediterranean dimension, with the Basque/Sardinian ratio being particularly high in the Atlantic facade of Europe. Conversely, this ratio is low in the Mediterranean as we move eastwards: it is already low in Italy and the Balkans and becomes virtually zero in Cypriots, Armenians, and Levantine Arabs.

North Africa is also particularly interesting in having a low Basque/Sardinian ratio, even in Morocco. It appears that Sardinians are a much better proxy of European influences in the region than Basques are.

K=10 is particularly exciting because, for the first time, there is clear evidence of structure in the North/Central European component, which can now be split into Northwestern and Northeastern ones.

The NW European component is maximized in Orcadians, and people from the British Isles in general, as well as in Scandinavia. These populations have a low NE/NW ratio, as do the French, Iberians, and Italians.

Conversely, Balto-Slavs have a high NE/NW ratio.

Interestingly, Greeks have a balanced NE/NW ratio (1.2), intermediate between Italians and Balto-Slavs. Similar balanced ratios are also found among Lezgins (1.08), Turks, and Iranians. I conclude that Slavic or other Eastern European admixture cannot account for the totality of this component's presence in Greeks.

Indians have a 1.8 NE/NW ratio. In Pakistan this is 6.5, in Uzbeks it is 2.9, and in the North Eurasian_Ra it is 14.2. My conclusion is that a single migration of steppe people from eastern Europe cannot account for the presence of North European-like genes in Asia.
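The NE/NW ratios quoted above are simply one component's proportion divided by the other's. A minimal Python sketch of that arithmetic follows; the function name `component_ratio`, the component labels, and the proportions are hypothetical and for illustration only (they are not values from the spreadsheet).

```python
def component_ratio(proportions, numerator, denominator):
    """Return proportions[numerator] / proportions[denominator].

    Returns None when the denominator component is absent or zero,
    since the ratio is undefined in that case.
    """
    denom = proportions.get(denominator, 0.0)
    if denom == 0:
        return None
    return proportions.get(numerator, 0.0) / denom


# Hypothetical admixture proportions for one population (illustration only).
population_x = {"NE_European": 0.30, "NW_European": 0.20}
ratio = component_ratio(population_x, "NE_European", "NW_European")
print(round(ratio, 2))
```

A ratio near 1 corresponds to what the text calls "balanced", while values well above 1 indicate a surplus of the NE European component. Note that the ratio becomes unstable when the denominator component is very small.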

I propose that a palimpsest of population movements has brought such elements into the interior of Asia: the migration of the early Indo-Iranians from West Asia or the Balkans, with a balanced NE/NW ratio, and the migration of steppe people from Eastern Europe, with a high NE/NW ratio. The latter did affect much of Asia, but it is in India, where Iranian groups did not penetrate in great numbers, that the lower ratio of the Indo-Aryans has been best preserved.

The case of the Finns is also interesting, as there is a surplus of NE over NW European elements. Their position is intermediate between Scandinavians and Lithuanians/Russians but toward the latter. So, Finns appear to (i) have a substratum similar to Balto-Slavs, (ii) be influenced by Scandinavians, and (iii) retain a balance of East Eurasian elements (5.8% in this analysis), preserving the legacy of their linguistic ancestors from the east. At present it is difficult to determine how much of the NE European component in Finns is due to their eastern ancestors, who were presumably mixed Caucasoid/Mongoloid long before they arrived in the Baltic, and how much was absorbed in situ.


At K=11 the Ethiopian/East African component emerges, absorbing some of the Red Sea and Sub-Saharan components from the previous K=10 run.

In comparison to the East African component of the Dodecad Project analysis, this component is closer to West Eurasians than to Sub-Saharan Africans, and a residual Sub-Saharan element remains in the two East African (Ethiopian and East_African_D) population samples. Presumably this is due to the more complete sampling of Sub-Saharan genetic diversity using the Sub_Saharan_H "framing" population.

Outside Africa, both E African/Sub-Saharan components are present in the Near East and North Africa with higher E African/Sub-Saharan ratios in the Near East and lower ones in North Africa.

In Europe, such ratios are low in the few populations where African admixture is present, together with some N African. We can probably conclude that African admixture is mostly due to North Africans and African-influenced Near Eastern populations, rather than coming directly from Sub-Saharan Africa.

At K=12 the first uninformative cluster emerged, centered on Iraqi Jews, hence I decided to stop the analysis at this point.

Population Portraits

There is a plethora of population portraits in the download bundle, showing how admixture proportions vary in individuals within populations, and how they vary between successive K.

Here is, for example, the K=11 portrait of Cypriots. A picture of overall homogeneity of this sample emerges, but notice how the NW European and NE European have disjoint presence in the Cypriot individuals, with 5 having some of the former, 6 having some of the latter, and only 1 of these having both.

Compare with Lezgins (right), where these two components occur in all individuals. Whatever this admixture represents, it must be fairly old to be so uniformly distributed in the population.
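Readers who want to repeat this kind of tally on the individual-level data could do something like the following Python sketch. It is an illustration only: the function name `tally_components`, the dictionary layout, the individuals, and their proportions are hypothetical, and treating any proportion above a `threshold` of 0 as "present" is an assumption.

```python
def tally_components(individuals, comp_a, comp_b, threshold=0.0):
    """Count individuals carrying comp_a, comp_b, and both.

    individuals: dict mapping an individual ID to a dict of component proportions.
    threshold: minimum proportion counted as "present" (an assumption).
    """
    has_a = {i for i, q in individuals.items() if q.get(comp_a, 0.0) > threshold}
    has_b = {i for i, q in individuals.items() if q.get(comp_b, 0.0) > threshold}
    return len(has_a), len(has_b), len(has_a & has_b)


# Hypothetical individuals and proportions, for illustration only.
portrait = {
    "ind1": {"NW_European": 0.03},
    "ind2": {"NE_European": 0.02},
    "ind3": {"NW_European": 0.01, "NE_European": 0.04},
    "ind4": {},
}
n_nw, n_ne, n_both = tally_components(portrait, "NW_European", "NE_European")
print(n_nw, n_ne, n_both)
```

A small overlap relative to the two individual counts corresponds to the "disjoint presence" pattern, while an overlap equal to the sample size corresponds to the uniform pattern.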



Here are the Georgians at K=10. Notice that their NE European component is unevenly distributed, and in every case where it occurs it is accompanied by a thin slice of East Asian. This may well indicate partial Russian or other Eastern European ancestry in these individuals.



Side-by-side comparisons are also quite useful. Consider Armenians vs. Lezgins vs. Iranians at K=7.







Notice how Lezgins, who live north of the Caucasus mountains, possess some of the N/C European component, which the Armenians, who live to the south of them, lack. This should come as no surprise, as the Lezgins inhabit parts of the ancient Sarmatia Asiatica. Compare with Iranians, who are differentiated from their Indo-European Armenian neighbors by the presence of a "S Asian" component, which, in turn, ties them to their Indo-Aryan linguistic relatives.

Much more can be said, but I'll let readers explore the data on their own, and draw their own conclusions from them.

December 28, 2009

Y chromosomes of Dagestan highlanders

Journal of Human Genetics 54, 689–694 (1 December 2009) | doi:10.1038/jhg.2009.94

The key role of patrilineal inheritance in shaping the genetic variation of Dagestan highlanders

Laura Caciagli

Abstract

The Caucasus region is a complex cultural and ethnic mosaic, comprising populations that speak Caucasian, Indo-European and Altaic languages. Isolated mountain villages (auls) in Dagestan still preserve high level of genetic and cultural diversity and have patriarchal societies with a long history of isolation. The aim of this study was to understand the genetic history of five Dagestan highland auls with distinct ethnic affiliation (Avars, Chechens-Akkins, Kubachians, Laks, Tabasarans) using markers on the male-specific region of the Y chromosome. The groups analyzed here are all Muslims but speak different languages all belonging to the Nakh-Dagestanian linguistic family. The results show that the Dagestan ethnic groups share a common Y-genetic background, with deep-rooted genealogies and rare alleles, dating back to an early phase in the post-glacial recolonization of Europe. Geography and stochastic factors, such as founder effect and long-term genetic drift, driven by the rigid structuring of societies in groups of patrilineal descent, most likely acted as mutually reinforcing key factors in determining the high degree of Y-genetic divergence among these ethnic groups.

Link

May 02, 2009

Fine-scaled human genetic structure (Xing et al. 2009)


Our picture of global genetic variation becomes ever clearer. In this study (the supplementary material of which is online), the researchers studied 240K loci in 554 individuals from 27 populations.

From the paper:

The African, East/Southeast Asian, European, and Indian individuals are correctly assigned to their self-identified continental groups without exception.

Some individuals show evidence of membership in multiple groups. South Indian upper- and lower-caste populations have ∼30% and 10% membership in the inferred European group, respectively. South Indian tribal Irula have a relatively high probability of membership in the inferred Indian cluster. Southeast Asians (Iban, Cambodians, and Vietnamese) have ∼10% membership in the inferred Indian cluster, and the African Hema cluster shares ∼15% membership with the inferred European cluster.


The Hema are Nilotic-speaking pastoralists from the Congo. The Alur, from the same region, were also studied.

Social stratification based on "European" (more properly, extra-Indian Caucasoid) ancestry in South Indian populations is not surprising; see my post on the Origin of Hindu Brahmins. Differential admixture with an exogenous element along caste lines is not really compatible with an indigenous creation of the caste system, and is more in accord with the traditional theory of an exogenous origin of the upper-caste populations.

The study also includes populations from Daghestan in the Caucasus (Stalskoe and Urkarah), which group with HGDP Adygei (see Figure S3C). In the FRAPPE analysis (Figure 4, reproduced top left) they are clearly transitional between European and Indian Caucasoids, as mentioned in my previous post, although quite clearly more on the European side.

Genome Research doi:10.1101/gr.085589.108

Fine-scaled human genetic structure revealed by SNP microarrays

Jinchuan Xing et al.

We report an analysis of more than 240,000 loci genotyped using the Affymetrix SNP microarray in 554 individuals from 27 worldwide populations in Africa, Asia, and Europe. To provide a more extensive and complete sampling of human genetic variation, we have included caste and tribal samples from two states in South India, Daghestanis from eastern Europe, and the Iban from Malaysia. Consistent with observations made by Charles Darwin, our results highlight shared variation among human populations and demonstrate that much genetic variation is geographically continuous. At the same time, principal components analyses reveal discernible genetic differentiation among almost all identified populations in our sample, and in most cases, individuals can be clearly assigned to defined populations on the basis of SNP genotypes. All individuals are accurately classified into continental groups using a model-based clustering algorithm, but between closely related populations, genetic and self-classifications conflict for some individuals. The 250K data permitted high-level resolution of genetic variation among Indian caste and tribal populations and between highland and lowland Daghestani populations. In particular, upper-caste individuals from Tamil Nadu and Andhra Pradesh form one defined group, lower-caste individuals from these two states form another, and the tribal Irula samples form a third. Our results emphasize the correlation of genetic and geographic distances and highlight other elements, including social factors that have contributed to population structure.

Link

December 02, 2007

ASHG 2007 abstracts


You can get the 656-page volume of abstracts (pdf) of this year's American Society of Human Genetics meeting. Here are some titles and abstracts that caught my eye.

A. Rosa et al.

Mitochondrial haplogroup H1 is protective for stroke.

S. Sharma et al.

The Autochthonous Origin and a Tribal Link of Indian Brahmins: Evaluation Through Molecular Genetic Markers


The co-existence and associated genetic evidences for the major rival models: i) recent Central Asian introduction of Indian caste system, ii) rank related west Eurasian admixture, iii) South Asian origin for Indian caste communities, and iv) late Pleistocene heritage of tribal and caste populations, leave the question of the origin of caste system in India hazy and obscure. To resolve the issue, we screened 621 Y-chromosomes (of Brahmins, occupying upper most caste position and Dalits and Tribals with the lower most positions in the Indian caste hierarchical system) with fifty-five Y-chromosomal binary markers and Y-microsatellite markers and compiled a data set of 2809 Y-chromosomes (681 Brahmins, 2128 Tribals and Dalits) for conclusions. Overall, no consistent difference was observed in Y-haplogroups distribution between Brahmins, Dalits and Tribals, except for some differences confined to a given geographical region. A peculiar observation of highest frequency (upto 72.22%) of Y-haplogroups R1a1* in Brahmins, hinted at its presence as a founder lineage for this caste group. The widespread distribution and high frequency across Eurasia and Central Asia of R1a1* as well as scanty representation of its ancestral (R*, R1* and R1a*) and derived lineages across the region has kept the origin of this haplogroup unresolved. The analyses of a pooled dataset of 530 Indians, 224 Pakistanis and 276 Central Asians and Eurasians, bearing R1a1* haplogroup resolved the controversy of origin of R1a1*. The conclusion was drawn on the basis of: i) presence of this haplogroup in many of the tribal populations such as, Saharia (present study) and Chenchu tribe in high frequency, ii) the highest ever reported presence of R1a* (ancestral haplogroup of R1a1*) in Kashmiri Pandits (Brahmins) and Saharia tribe, and iii) associated averaged phylogenetic ages of R1a* (~18,478 years) and R1a1* (~13,768 years) in India. 
The study supported the autochthonous origin of R1a1 lineage and a tribal link to Indian Brahmins.

Population structure in Sweden - A Y-chromosomal and mitochondrial DNA analysis.

T. Lappalainen et al.

A population sample representing the current Swedish population was analyzed for both maternally and paternally inherited markers with the aim of characterizing the genetic variation and structure of a modern North European population. We genotyped 12 Y-chromosomal and 27 mitochondrial DNA SNPs from DNA extracted and amplified from Guthrie cards of all the children born in Sweden during one week in 2003. The sample set consisted of 1914 samples (960 males) grouped according to place of birth. The ancient migration patterns are reflected in the clear north-south gradients in several palaeolithic and neolithic haplogroups in the mtDNA (U5, I, K, T, X) and the Y chromosome (R1b, N3). The haplogroup frequencies of the counties closest to Finland and Norway showed clear associations to the neighboring populations, resulting from the formation of the nations during the past millennium. Moreover, the recent immigration waves of the 20th century are visible both maternally and paternally, and have led to increased diversity and divergence from the main population in the major cities. Unfavorable population development in the ancient or recent past can be detected in several remote counties with low diversities and other signs of low population size and/or population crises. In conclusion, our study yielded valuable information about the various factors affecting the structure of the modern Swedish population that is vital for the use of the population in large population-based studies. Our sampling strategy, nonselective on the current population rather than stratified according to ancestry, represents the future of genetic studies in the increasingly panmictic populations of the world.

Relic Distribution of Y-Chromosome Haplogroup D Suggests Ancient Paleolithic Migration of Modern Humans in Eastern Asia.


H. Shi et al.

The Y chromosome haplogroup D is East Asian specific and prevalent in Tibetan and Japanese populations (30%-40%), but rare in other East Asian populations (<5%). We analyzed 5,174 Y chromosomes from 74 East Asian populations by typing haplogroup D related SNPs and eight Y chromosome microsatellite loci. We identified six sublineages under haplogroup D, and their distribution across East Asia suggested an ancient Paleolithic south-to-north migration, which likely predates the previously proposed northward diaspora of modern humans (reflected by the dominant occurrence of O3-M122 in East Asians) resulting in current relic distribution of haplogroup D in East Asia.

E. Marchani et al. Culture creates genetic structure in Daghestan.

M. Coelho et al. On the edge of the Bantu expansions: patterns of mtDNA and Y-chromosome variation in southwestern Angola.

J.S. Friedlaender et al. The genetic structure of Pacific Islanders.