June 12, 2013

Analysis of multi-merge dataset of human autosomal microsatellite variation

Microsatellites may be a little "retro" in the age of million-SNP arrays and whole genome sequencing, but one has to admit that the following figure, resulting from a merge of multiple microsatellite datasets, is pretty impressive.

G3: Genes|Genomes|Genetics doi: 10.1534/g3.113.005728

Population Structure in a Comprehensive Genomic Data Set on Human Microsatellite Variation

Trevor J. Pemberton et al.

Over the past two decades, microsatellite genotypes have provided the data for landmark studies of human population-genetic variation. However, the various microsatellite data sets have been prepared with different procedures and sets of markers, so that it has been difficult to synthesize available data for a comprehensive analysis. Here, we combine eight human population-genetic data sets at the 645 microsatellite loci they share in common, accounting for procedural differences in the production of the different data sets, to assemble a single data set containing 5,795 individuals from 267 worldwide populations. We perform a systematic analysis of genetic relatedness, detecting 240 intra-population and 92 inter-population pairs of previously unidentified close relatives and proposing standardized subsets of unrelated individuals for use in future studies. We then augment the human data with a data set of 84 chimpanzees at the 246 loci they share in common with the human samples. Multidimensional scaling and neighbor-joining analyses of these data sets offer new insights into the structure of human populations and enable a comparison of genetic variation patterns in chimpanzees with those in humans. Our combined data sets are the largest of their kind reported to date and provide a resource for use in human population-genetic studies.



eurologist said...

Would be interesting to see more complicated reconstructions (rather than a simple tree) from these data. For example, the Uygur and Hazara, but also the SA Bantu surely are placed where they are due to significant admixture (interestingly large admixture for the SA Bantu, it seems).

The Adygei/Italian group is interesting and again seems to summarize that the West Asian influence is most pronounced in SE Europe and ~ comes (originally came) to a stop in Italy.

Jim said...

I see that Orcadians and Russians are closer to each other than to western and other Europeans. And genetiker demanded to know how "Nordics" could be "mixed".

Unknown said...

This spiral plot is saying (IMO):

(1) Out of Africa looks like out of Kenya. The last tribes unambiguously African are all Kenyan.

(2)The Mozabite and Beja people (both nomads) connect Africa to Out of Africa. The authors label these Middle Eastern but they are both very strongly associated with North Africa. The modern Beja travel between Kenya's neighbour Sudan and Egypt.

(3) The Out of Africa population (wherever it is) gives rise to all out of Africa as expected.

(4) The out of Africa population diverges into two groups: Oceania/Asia and Europe/India/Mesopotamia/Middle East. Seems reasonable.

(5) UNEXPECTEDLY, Europe is NOT a subset of a Mesopotamian, Indian or Middle Eastern population. The second group diverges into India/Mesopotamia and Europe/Middle East. If anything the Druze/Palestinians look like a subset of a European population. The Tuscans appear to diverge off BEFORE the Druze/Palestinians. I think there is a shortage of data on Middle Eastern groups in this paper. But this is thought-provoking nevertheless.

Tuscans presumably represent the Etruscan culture. Another mystery.

(6) The Kalash lie close to the separation of India/Mesopotamia from Europe/Middle East, on the Indian/Mesopotamian side. So they look like an ancient representative of a parental Indian/Mesopotamian population rather than a European stray.

To me, overall this looks like a Beja-like folk travelled from Kenya to the Sudan and into North Africa. One group then headed east and kept going, becoming the Asians/Oceanians and eventually the Americans. One group stayed in North Africa awhile and later spread out. The folk who went east in this second wave became the Indian/Mesopotamians.

The folk who sailed along the shores of the Mediterranean became the Tuscans (a hop north) and the Other-European/Middle East folk (a few skips east).

Lots of complicated stuff after that.

Nathan Paul said...

Sane voices on this blog other than the resident sage: Annie and Eurologist. This time Annie Mouse's summary is one of the best in a long time.

eurologist said...


Remember that the data do not imply a tree model - instead, a tree model is imposed. Thus, admixed populations will appear to branch away higher up on the tree. So, while there definitely is a case for NE African populations being closest to ooA populations, they are also the ones with the strongest W and SW Asian and European admixture.

Unfortunately, W Asians are really underrepresented. So, one cannot say much about the European/ W Asian/ S Asian split. However, I am afraid this would be difficult even with better W Asian representation, due to millennia of admixture. Even the Kalash (and Parsi) are surely placed branching away that early due to W Asian / S Asian (and C Asian) admixture.

It is interesting to see how closely Caucasians, Europeans, and SW Asians cluster - they would probably do so even more without African admixture in SW Asians. It also appears again that some people around the Caucasus (such as Armenians and also Georgians, IIRC) have much less W Asian admixture than expected from proximity (where W Asian would be Iranian to Pakistani, minus their C Asian admixture).

Time to create zombie populations: subtract C Asian elements from W Asians, African from SW Asians, W Asian from the Caucasus and Europeans, and then make a tree model on that.
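
The two ideas in this comment (an admixed population gets forced onto a single branch between its sources, and an admixture component could in principle be subtracted back out) can be illustrated with a toy calculation. This is only a sketch under loud assumptions: the allele frequencies are random made-up numbers, each locus is reduced to a single frequency, the mixing fraction `w` is assumed to be known exactly, and the function names `admix` and `unmix` are hypothetical. It is not the method used in the paper.

```python
import math
import random


def distance(p, q):
    """Euclidean distance between two allele-frequency vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def admix(p, q, w):
    """Frequencies of a population drawing fraction w from p and 1 - w from q."""
    return [w * a + (1 - w) * b for a, b in zip(p, q)]


def unmix(mixed, q, w):
    """Invert admix(): recover p, assuming q and w are known exactly."""
    return [(m - (1 - w) * b) / w for m, b in zip(mixed, q)]


random.seed(0)
n_loci = 645  # loci count taken from the abstract; the frequencies are invented

# Two hypothetical source populations and a 70/30 mixture of them.
source_a = [random.random() for _ in range(n_loci)]
source_b = [random.random() for _ in range(n_loci)]
mixed = admix(source_a, source_b, 0.7)

d_ab = distance(source_a, source_b)
d_am = distance(source_a, mixed)
d_bm = distance(source_b, mixed)

# The mixture lies on the straight line between its sources, so the two
# partial distances add up to the full source-to-source distance.
print(f"A-B {d_ab:.3f}  A-mixed {d_am:.3f}  B-mixed {d_bm:.3f}")

# "Zombie" population: subtract the B component from the mixture again.
zombie_a = unmix(mixed, source_b, 0.7)
max_error = max(abs(z - a) for z, a in zip(zombie_a, source_a))
print(f"largest recovery error: {max_error:.2e}")
```

Because a tree allows only one parent branch per population, a distance-based method has to attach such an intermediate population somewhere along the path between its sources, which is one way to read the "branching away higher up" effect. With real data the mixing fraction and the source frequencies are themselves estimates, so the subtraction step would be far noisier than in this idealized case.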

Rob said...

@ A.M.
"UNEXPECTEDLY, Europe is NOT a subset of a Mesopotamian, Indian or Middle Eastern population. The second group diverges into India/Mesopotamia and Europe/MiddleEast"

Not a subset of any one independently, but a subset of all 3 variously, depending on which exact Euro groups you look at (e.g. SEE, WE, or northern Europe). And I'd break it down into south-central Asian, Mesopotamian, and North-east African/"Middle East" components. However, such complexity cannot be evident in a tree representation such as this one.

Slumbery said...

Annie Mouse

"(1) Out of Africa looks like out of Kenya. The last tribes unambiguously African are all Kenyan."

Or the populations north of there are too heavily mixed by back-migrations from Eurasia. You can't really pinpoint the exact locations of population sources and splitting points this easily, because later population movements may have confused the tracks.

Hazara are a very interesting combination. Dodecad results show a strong Siberian connection. (They also have a high frequency of the same Y Hg as mine, but this is just a personal side interest.)