These correlations arise from the fact that humans tend to intermarry with their neighbors, so the probability that an allele carried by a person at location X is transmitted to future generations decreases the further we go from X. The more interesting cases, however, are those that violate the overall pattern. These usually arise from genetic isolation or long-distance migration. An example is that of the African hunter-gatherer groups:
When hunter-gatherer populations (!Kung, San, Biaka Pygmy, and Mbuti Pygmy) and Mbororo Fulani were included in the analysis, they appeared as isolated clusters on the PCA plots and greatly reduced the similarity between PCA maps and geographic maps (Figure S3, Table S7). The similarity score decreased from 0.790 to 0.548 after including all five of these populations in the analysis. This value, however, is still statistically significant; further, if we disregard the hunter-gatherer populations and Mbororo Fulani in Figure S3B and only examine the relative locations of the original 23 populations, we can still find a clear resemblance between genetic and geographic coordinates. Compared to the other 23 populations, the four hunter-gatherer populations appear as isolated groups at the south, and Mbororo Fulani appears at the north. These observations are clearer in plots with only one among the five outlier populations included at a time (Figure S3C–S3G), each of which also produces significant similarity scores between genetic and geographic coordinates (Figure S4, Table S7).
Figure S3 is very informative:
Observe that in Figure S3C, the Mbororo Fulani appear in the Balkans (!) relative to Sub-Saharan Africans. That is, of course, due to their partial West Eurasian ancestry, but the magnitude of the displacement is such that one suspects this is not the only factor; if it were, the Fulani would be placed somewhere between Europe and Central Africa.
The remaining panels (D-G) supply the explanation: the four hunter-gatherer groups appear well south of their actual locations. The Pygmy groups are placed not in W/C Africa but in S Africa, and the Khoisan ones not in S Africa but in the ocean well south of it.
Why does the gene-geography correlation suffer such a violation in Africa? Figure S3 shows how different groups relate to W/C Africans. But one could also use the hunter-gatherers as an anchor point (i.e., place them where they actually live): in that case the W/C Africans would be the ones pushed north towards the Mediterranean.
And, indeed, that is a good argument for the idea I've floated a few times, of substantial Eurasian back-migration into Africa: the genetic difference between African farmers and African hunter-gatherers is far larger than the geographic distance between them would predict. This can easily be explained if we assume that back-migration from Eurasia affected the former much more than the latter. So, African farmers can be shown to be the outcome of mixture between two divergent elements: one Eurasian-like, one African hunter-gatherer-like. The latter could include groups like the existing African H-Gs, but might also include other groups who had the misfortune of being completely absorbed before the Eye of Science set its sights on the African continent.
PLoS Genet 8(8): e1002886. doi:10.1371/journal.pgen.1002886
A Quantitative Comparison of the Similarity between Genes and Geography in Worldwide Human Populations
Chaolong Wang et al.
The spatial pattern of human genetic variation provides a basis for investigating the history of human migrations. Statistical techniques such as principal components analysis (PCA) and multidimensional scaling (MDS) have been used to summarize spatial patterns of genetic variation, typically by placing individuals on a two-dimensional map in such a way that pairwise Euclidean distances between individuals on the map approximately reflect corresponding genetic relationships. Although similarity between these statistical maps of genetic variation and the geographic maps of sampling locations is often observed, it has not been assessed systematically across different parts of the world. In this study, we combine genome-wide SNP data from more than 100 populations worldwide to perform a formal comparison between genes and geography in different regions. By examining a worldwide sample and samples from Europe, Sub-Saharan Africa, Asia, East Asia, and Central/South Asia, we find that significant similarity between genes and geography exists in general in different geographic regions and at different geographic levels. Surprisingly, the highest similarity is found in Asia, even though the geographic barrier of the Himalaya Mountains has created a discontinuity on the PCA map of genetic variation.
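For readers who want to see how a single outlier can drag down a gene-geography similarity score, here is a minimal sketch in Python. It assumes that the score is of the Procrustes type: the genetic map is shifted, scaled, rotated and, if needed, mirrored to fit the geographic map as closely as possible, and the score is sqrt(1 - D), where D is the normalized leftover misfit. The coordinates, the function names and the simple point-shuffling permutation test are all made up for illustration; this is not the authors' code or data.

```python
import random

def center(points):
    """Represent (x, y) pairs as complex numbers with the centroid at 0."""
    zs = [complex(x, y) for x, y in points]
    mean = sum(zs) / len(zs)
    return [z - mean for z in zs]

def procrustes_similarity(genetic, geographic):
    """Similarity in [0, 1] between two 2-D maps of the same points.

    Translation, scaling, rotation and reflection are allowed. In 2-D
    the best rotation has a closed form with complex numbers; the
    reflection case is handled by conjugating one of the maps.
    Assumes neither map collapses to a single point.
    """
    g = center(genetic)
    m = center(geographic)
    norm = (sum(abs(z) ** 2 for z in g) * sum(abs(z) ** 2 for z in m)) ** 0.5
    rotation = abs(sum(a.conjugate() * b for a, b in zip(g, m)))
    reflection = abs(sum(a * b for a, b in zip(g, m)))
    return max(rotation, reflection) / norm

def permutation_p_value(genetic, geographic, n_perm=1000, seed=0):
    """Fraction of random re-pairings that score at least as high."""
    rng = random.Random(seed)
    observed = procrustes_similarity(genetic, geographic)
    shuffled = list(geographic)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if procrustes_similarity(genetic, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Hypothetical sampling locations (arbitrary units).
geo = [(0, 0), (1, 0), (2, 1), (0, 2), (3, 3), (1, 3)]

# A "genetic map" that is just the geography rotated by 90 degrees,
# shrunk and shifted: a perfect isolation-by-distance pattern.
gen = [(0.01 * -y + 5, 0.01 * x - 2) for x, y in geo]

# The same map with one population pushed far away from where its
# location predicts, mimicking an isolated or admixed outlier.
gen_outlier = list(gen)
gen_outlier[0] = (gen[0][0] + 0.2, gen[0][1])

print(procrustes_similarity(gen, geo))
print(procrustes_similarity(gen_outlier, geo))
print(permutation_p_value(gen_outlier, geo))
```

The first score comes out at essentially 1, because the toy genetic map is an exact rotated copy of the geography. Moving one population away from its expected position lowers the score for the whole map, which is the same kind of effect as the drop reported when the hunter-gatherers and the Fulani are added.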