The newer study included a wider sampling of populations, including Cypriots, Turks, and Eastern Slavs among others. Hence, the correspondence with the map of Europe is even stronger than before.
While in the previous study PC2 separated the Finns from the rest, the wider sampling, especially of Eastern Europeans, now causes a clearer separation along the east-west axis. Note that PC2 in this study is not the same as PC2 in the previous one, as can easily be seen from, e.g., (i) the fact that Portugal and Spain are correctly placed to the west of Great Britain, and (ii) the fact that the Finnish score is about the same as that of Greeks and Yugoslavs in this study. In any case, Finland is represented by a single individual.
This underscores the often-forgotten fact that PCs are calculated from the available data and thus depend on the included populations. The inclusion of Eastern Europeans (esp. Eastern Slavs and Balts) in this study has now made the strong east-west differentiation in Europe the most salient feature of the second PC. Unfortunately, no higher PCs are presented.
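To make the dependence on sampling concrete, here is a minimal pure-Python sketch. It is not the paper's analysis: the three markers, the genotype values, and the helper names `top_pc` and `dominant_marker` are all made up for illustration. It only shows that the leading PC of a toy data set can switch from one marker to another once a new group is added to the sample.

```python
import math

def top_pc(rows, iters=200):
    """Leading principal component of `rows`, found by power iteration."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def dominant_marker(pc):
    """Index of the marker with the largest absolute loading on `pc`."""
    return max(range(len(pc)), key=lambda j: abs(pc[j]))

# Hypothetical genotypes (0/1/2 copies of an allele) at three made-up markers:
# marker 0 varies north-south, marker 1 varies east-west, marker 2 is noise.
west_only = [
    [2, 0, 1], [2, 0, 0], [2, 1, 1],   # "northern" individuals
    [0, 0, 1], [0, 1, 0], [0, 0, 0],   # "southern" individuals
]
east = [[1, 2, 1], [1, 2, 0], [1, 2, 1], [1, 2, 0], [1, 2, 1], [1, 2, 0]]

before = dominant_marker(top_pc(west_only))
after = dominant_marker(top_pc(west_only + east))
print("dominant marker without eastern samples:", before)
print("dominant marker with eastern samples:", after)
```

The same individuals get different PC scores in the two runs, which is why a "PC2" from one study cannot be compared directly with a "PC2" from another.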
From the paper:
The direction of the PC1 axis and its relative strength may reflect a special role for this geographic axis in the demographic history of Europeans (as first suggested in ref. 10). PC1 aligns north-northwest/south-southeast (NNW/SSE, -16 degrees) and accounts for approximately twice the amount of variation as PC2 (0.30% versus 0.15%, first eigenvalue = 4.09, second eigenvalue = 2.04).
More robust evidence for the importance of a roughly NNW/SSE axis in Europe is that, in these same data, haplotype diversity decreases from south to north (A.A. et al., submitted).
The only deviation from geography seems to be the Slovakian individual:
There is only one obvious outlier, which is Slovakia; however Slovakia is represented in our data set by only one individual, and based on the individual’s position in PC1-PC2 space it’s possible this outlier may actually have had Italian, rather than Slovakian ancestry.
and the Russians:
The Russian Federation is less-striking as an outlier, and appears to lie too far “west” genetically, which may be a result of small sample size (n = 6) or simply that the Russians sampled here have ancestry from a location further west than the proxy location for Russia (Moscow) would suggest.
As for the Greeks (GR), once again, they are placed between Italians and their northern neighbors, with Albanians (AL), a pre-Slavic Balkan population, especially close, followed by Slavomacedonians ("MK") and Bulgarians (BG). This is especially impressive given the small sample sizes (8 for Greece to 2 for Bulgaria). It appears that even individual members of ethnic groups "find their way" to the appropriate place on the map.
The Way Ahead
Both this and the previous paper have made it abundantly clear that, even in Europe, where genetic differentiation is very limited and populations are arrayed in a cline, it is possible to determine the rough geographical or ethnic origin of an unknown individual. Even for closely related groups where the precise origin can't be determined (e.g. Spanish vs. Portuguese), we can at least exclude with high probability most other European groups.
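As a rough illustration of that idea, the sketch below ranks candidate populations by their distance from an unknown individual in PC1-PC2 space. The population labels and coordinates are invented placeholders, not values from the paper, and nearest-centroid ranking is just one simple way to do this, not necessarily the method the authors used.

```python
import math

def rank_populations(point, centroids):
    """Population names sorted by Euclidean distance from `point`, nearest first."""
    return sorted(centroids, key=lambda name: math.dist(point, centroids[name]))

# Hypothetical (PC1, PC2) centroids; A and B stand for two close neighbours.
centroids = {
    "population_A": (0.0, 0.0),
    "population_B": (0.2, 0.1),
    "population_C": (3.0, 1.0),
    "population_D": (-2.0, 4.0),
}

unknown = (0.1, 0.1)  # made-up PC scores of an unknown individual
ranking = rank_populations(unknown, centroids)
for name in ranking:
    print(name, round(math.dist(unknown, centroids[name]), 3))
```

In this toy case the two neighbouring populations are nearly tied, so the precise origin stays uncertain, while the two distant ones can be set aside with much more confidence.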
The applications of this are obvious: criminal or victim DNA can be pinpointed on the map. The ethnic origin of undocumented persons (e.g. illegal immigrants) can be ascertained with some confidence. Adoptees or persons of unknown ultimate origins (such as many inhabitants of the New World) may be able to trace their ancestry to something more than they could guess by looking in the mirror.
Indeed, while 500K markers are used in this analysis, it will turn out that a much smaller number of markers carries most of the power to distinguish between ethnic and geographical groups: you only have to examine the 500K in order to harvest the useful ones.
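One simple way to "harvest" informative markers, sketched here as an assumption rather than as anything done in the paper, is to rank markers by the absolute difference in allele frequency between two groups and keep the top few. The genotypes and the function names `allele_freq` and `rank_markers` are hypothetical.

```python
def allele_freq(genotypes, marker):
    """Frequency of the counted allele at `marker` (genotypes coded 0, 1, 2)."""
    return sum(g[marker] for g in genotypes) / (2 * len(genotypes))

def rank_markers(group_a, group_b):
    """Marker indices sorted by absolute allele-frequency difference, largest first."""
    n_markers = len(group_a[0])
    delta = [abs(allele_freq(group_a, m) - allele_freq(group_b, m))
             for m in range(n_markers)]
    return sorted(range(n_markers), key=lambda m: delta[m], reverse=True)

# Made-up genotypes: four individuals per group, five markers each.
group_a = [[2, 1, 0, 1, 2], [2, 1, 0, 1, 1], [2, 0, 0, 1, 2], [1, 1, 0, 1, 2]]
group_b = [[0, 1, 0, 1, 2], [0, 1, 1, 0, 2], [1, 1, 0, 0, 2], [0, 0, 0, 1, 1]]

panel = rank_markers(group_a, group_b)[:2]  # keep the two most informative markers
print("selected marker indices:", panel)
```

A real panel would need far more individuals, more than two groups, and validation on samples not used for the selection; this only shows the shape of the idea.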
This opens wide possibilities for ancestry testing; such testing previously gave one fairly obvious information ("you're 99% European") and cost a substantial amount. It will soon be possible, using a specialized panel of the most informative markers, to create ancestry testing that is both affordable and informative.
The identification of ethnic differences in autosomal DNA also allows us to look at history from a new perspective, as the ethnic origins of skeletal material from the past will be ascertained with a precision unmatched by biological anthropology.
Of course, for the distant past, problems with obtaining authentic ancient DNA will remain, as well as the fact that accelerating human evolution may have changed allele frequencies or introduced new alleles into populations. Nonetheless, there is hope for real progress.
The real impediment will not be, in my opinion, technical, but rather psychological/political. The realization that not only major continental races, but also ethnic groups are biological entities goes against the prevailing politically correct orthodoxy. According to this orthodoxy, European nations are artificial cultural constructions whose members share a "myth" of common origins; they are "constructed" products of the last few centuries; ethnic identification is a subjective notion of self-identity, rather than an objective notion of ancestry and homeland.
It now appears that while European nations are not races, they are, nonetheless, biological populations, occupying specific positions along the Caucasoid genetic continuum, and distinguishable from most other European nations, if not always their immediate neighbors.
Nature advance online publication 31 August 2008 | doi:10.1038/nature07331
Genes mirror geography within Europe
John Novembre et al.
Understanding the genetic structure of human populations is of fundamental interest to medical, forensic and anthropological sciences. Advances in high-throughput genotyping technology have markedly improved our understanding of global patterns of human genetic variation and suggest the potential to use large samples to uncover variation among closely spaced populations1, 2, 3, 4, 5. Here we characterize genetic variation in a sample of 3,000 European individuals genotyped at over half a million variable DNA sites in the human genome. Despite low average levels of genetic differentiation among Europeans, we find a close correspondence between genetic and geographic distances; indeed, a geographical map of Europe arises naturally as an efficient two-dimensional summary of genetic variation in Europeans. The results emphasize that when mapping the genetic basis of a disease phenotype, spurious associations can arise if genetic structure is not properly accounted for. In addition, the results are relevant to the prospects of genetic ancestry testing6; an individual's DNA can be used to infer their geographic origin with surprising accuracy—often to within a few hundred kilometres.