February 09, 2007

European population stratification based on 10,000 markers

This is a very important preprint from the AJHG. The following statement from the paper confirms my previous observations:
Furthermore, PC3 and PC4, respectively, emphasize the separation of the Basques and Finns from other Europeans (Figure 5). The Basques are known to have unusual allele frequencies for several marker systems [24] and speak a unique non-Indo-European language. In line with their non-Indo-European Uralic language and previous Y-chromosome work, [25] the Finns show evidence of an increased affinity to the Central Asian populations when placed in an inter-continental context (Figure 1A and 1B).
For the Asiatic origins of the main Finnish Y-chromosome haplogroup N, mentioned in the paragraph, see here.
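For readers wondering how axes such as PC3 and PC4 arise from SNP data in the first place, here is a minimal sketch of principal component analysis on a genotype matrix. It is not the paper's pipeline: the data are simulated, the two populations are made up, and the function name genotype_pca and all parameters are my own assumptions for illustration.

```python
import numpy as np


def genotype_pca(genotypes, n_components=4):
    """Return per-individual scores on the top principal components.

    genotypes: (individuals x SNPs) array of allele counts coded 0/1/2.
    """
    g = np.asarray(genotypes, dtype=float)
    g = g - g.mean(axis=0)                 # centre each SNP
    sd = g.std(axis=0)
    keep = sd > 0                          # drop SNPs with no variation
    g = g[:, keep] / sd[keep]              # scale each SNP to unit variance
    u, s, _ = np.linalg.svd(g, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


# Simulated example (hypothetical): two populations whose allele
# frequencies drift in opposite directions from a shared baseline.
rng = np.random.default_rng(0)
n_per_pop, n_snps = 20, 1000
base = rng.uniform(0.2, 0.8, n_snps)
shift = rng.normal(0.0, 0.2, n_snps)
p_a = np.clip(base + shift, 0.05, 0.95)
p_b = np.clip(base - shift, 0.05, 0.95)

geno = np.vstack([
    rng.binomial(2, p_a, (n_per_pop, n_snps)),   # population A
    rng.binomial(2, p_b, (n_per_pop, n_snps)),   # population B
])

scores = genotype_pca(geno)
pc1_a, pc1_b = scores[:n_per_pop, 0], scores[n_per_pop:, 0]
```

In this toy setting the first component alone separates the two groups. With real European samples the differences are far subtler, which is why the separation of Basques and Finns only shows up on the later components.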

The results of the STRUCTURE analysis are fairly informative:

Note that for K=2, the main separation is between Sub-Saharans and the rest. Observe also that the east African Burunge, unlike the Mende, exhibit participation in the Eurasian cluster. As I have argued before, east Africa is a transitional zone between Sub-Saharan Africa and Eurasia.

For K=3 the Negroids are separated from the Caucasoids, and the blue cluster encompasses Mongoloids, Brahmins, and the Indian Mala caste. This is again expected, since Mongoloids (from the Altai) and South Asians share deep ancestry, as evidenced, e.g., by the mtDNA superhaplogroup M. Note also that Finns participate in this Asian cluster.

For K=4 the distinction between Mongoloids and South Asians becomes apparent. The Finns are now aligned partly with the Altaians, a relationship which persists for higher K.

For K=6 an interesting cluster encompassing mainly populations from the Mediterranean (plus Armenians) emerges.

As for the Greeks, they exhibit no substantial participation in the non-Caucasoid clusters. Note, however, the small Sub-Saharan contributions in the North African and Middle Eastern populations, in accordance with the previous evidence for elevated Sub-Saharan mtDNA in Arabic-speaking populations.

Also from the paper:
Within the two broad Northern (Polish, Irish, English, Germans and some Italians) and Southeastern (Greeks, Armenians, Jews and some Italians) clusters further reliable structure is less obvious as individuals from different population samples are often interspersed with each other. Thus in some cases, geographic distance or physical barriers are not well reflected. For instance, despite their insular origin, Irish and English individuals cluster with the continental Germans and Poles. Similarly large geographical gaps, such as between Greece and Armenia, are much less obvious at the genetic level. Conversely, Italy appears to be a zone of sharp differentiation over small distances. Some Italians cluster with the Northern Europeans while others fall into the Southeastern grouping (Figure 2A).
The similarity between Greece and Armenia is interesting in light of the historical evidence of the Balkan origins of the Phrygo-Armenians that I have blogged about before. The differentiation of Italians is probably due both to the Hellenic-Anatolian origins of some Italians from antiquity and to the Northern European origins of others, especially since the fall of the Western Roman Empire.

Update (Feb 9). It escaped my notice that a light blue cluster centered on the Burunge emerges at K=5. This element in their ancestry, which differentiates them from the Mende, seems to be specific to them among the studied populations, and is assigned to the yellow (Caucasoid) cluster for lower K. This confirms my suggestions for an indigenous (non-Negroid) east African element which is related to the Caucasoids.

American Journal of Human Genetics (in press)

Measuring European Population Stratification using Microarray Genotype Data

Marc Bauchet et al.

A proper understanding of population genetic stratification—differences in individual ancestry within a population—is crucial in attempts to find genes for complex traits through association mapping. We report on genome-wide typing of ca. 10,000 single nucleotide polymorphisms (SNPs) in 297 individuals to explore population structure in Europeans of known and unknown ancestry. The results reveal the presence of several significant axes of stratification, most prominently in a North-Southeastern trend, but also along an East-West axis. We also demonstrate the selection and application of EuroAIMs (European Ancestry Informative Markers) for ancestry estimation and correction. The Coriell “Caucasian” and CEPH Utah sample panels, often used as proxies for European populations, are found to reflect different subsets of the continent’s ancestry.
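The abstract does not say how the EuroAIMs were chosen, so the following is only a sketch of one simple and widely used idea: rank markers by the absolute difference in allele frequency (often called delta) between two reference groups. The marker names, the "north" and "south" groupings, the genotype values, and the function names are all hypothetical and are not taken from the paper.

```python
def allele_frequency(genotypes):
    """Frequency of the counted allele, from genotype calls coded 0/1/2."""
    return sum(genotypes) / (2 * len(genotypes))


def rank_markers_by_delta(pop_a, pop_b, top_n):
    """Return the top_n (marker, delta) pairs, largest delta first.

    pop_a, pop_b: dicts mapping marker name -> list of 0/1/2 genotypes.
    delta is the absolute allele-frequency difference between the groups.
    """
    deltas = {
        marker: abs(allele_frequency(pop_a[marker]) - allele_frequency(pop_b[marker]))
        for marker in pop_a
        if marker in pop_b
    }
    ranked = sorted(deltas, key=deltas.get, reverse=True)
    return [(marker, deltas[marker]) for marker in ranked[:top_n]]


# Toy data (hypothetical): four individuals per group, three markers.
north = {"snp1": [0, 0, 1, 0], "snp2": [1, 1, 2, 1], "snp3": [2, 2, 1, 2]}
south = {"snp1": [2, 1, 2, 2], "snp2": [1, 1, 1, 2], "snp3": [1, 1, 2, 1]}

top = rank_markers_by_delta(north, south, top_n=2)
# top == [("snp1", 0.75), ("snp3", 0.25)]
```

Markers with a large delta carry the most information about which group an individual resembles, so a small panel of them can stand in for thousands of random SNPs when estimating ancestry.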


