December 08, 2011

Population structure in South Asia (Metspalu et al. 2011)

I haven't read the paper fully yet (it's open access), but the abstract seems to agree with what I've written both here and over at the Dodecad blog: that South Asians are primarily a variable mix of West Asian and South Asian components. I will try to get and analyze the new data from the paper; strangely, every time I am just about ready to release the new version of Dodecad v4, I discover a new source of data!

The American Journal of Human Genetics, Volume 89, Issue 6, 731-744, 9 December 2011

Shared and Unique Components of Human Population Structure and Genome-Wide Signals of Positive Selection in South Asia

Mait Metspalu et al.


South Asia harbors one of the highest levels of genetic diversity in Eurasia, which could be interpreted as a result of its long-term large effective population size and of admixture during its complex demographic history. In contrast to Pakistani populations, populations of Indian origin have been underrepresented in previous genomic scans of positive selection and population structure. Here we report data for more than 600,000 SNP markers genotyped in 142 samples from 30 ethnic groups in India. Combining our results with other available genome-wide data, we show that Indian populations are characterized by two major ancestry components, one of which is spread at comparable frequency and haplotype diversity in populations of South and West Asia and the Caucasus. The second component is more restricted to South Asia and accounts for more than 50% of the ancestry in Indian populations. Haplotype diversity associated with these South Asian ancestry components is significantly higher than that of the components dominating the West Eurasian ancestry palette. Modeling of the observed haplotype diversities suggests that both Indian ancestry components are older than the purported Indo-Aryan invasion 3,500 YBP. Consistent with the results of pairwise genetic distances among world regions, Indians share more ancestry signals with West than with East Eurasians. However, compared to Pakistani populations, a higher proportion of their genes show regionally specific signals of high haplotype homozygosity. Among such candidates of positive selection in India are MSTN and DOK5, both of which have potential implications in lipid metabolism and the etiology of type 2 diabetes.

Link

7 comments:

Andrew Oh-Willeke said...

Their attempt to date the k5 component isn't very impressive and I don't think that they draw the right conclusions from their data, but the rest of the study makes quite a bit of sense.

The European component (dark blue, k4) looks like a minority component piggybacking on the ANI (k5) component. The light blue Near Eastern component (k3) could plausibly be Harappan.

The Dravidian speaking Brahui look very much like their Indo-Aryan language speaking neighbors, which suggests to me, given the makeup of both groups, an elite driven language shift by the Brahui that the other groups in the region either did not share in, or reversed later.

n/a said...

"Their attempt to date the k5 component isn't very impressive and I don't think that they draw the right conclusions from their data, but the rest of the study makes quite a bit of sense."

Their dating effort may be imperfect. But I'm pretty sure the main problem is that "k5" is not a meaningful or coherent slice of the broader western Eurasian gene pool.

Dienekes said...

"Their attempt to date the k5 component isn't very impressive and I don't think that they draw the right conclusions from their data, but the rest of the study makes quite a bit of sense."

They are not actually dating the admixture event between the "West Asian" and "South Asian" elements. It is clear that such admixture took place, and fairly recently, because of the clear cline in South Asian populations. If the admixture took place a very long time ago, the very clear clinal assortment of populations would not have been preserved.

Moorjani et al. estimate the actual admixture event to be fairly recent.

http://dienekes.blogspot.com/2011/08/ichg-2011-abstracts-are-onlineic.html

Even though there are still uncertainties about the methodology, it is clear that this is not a very ancient (e.g., early- or pre-Neolithic) admixture event.

What Metspalu et al. seem to be saying is that it does not seem that the "Caucasus-Baluchistan" component with its twin peaks came very recently. But, that pretty much depends on the number of people involved. If we looked at the genomes of New World Caucasoids, for example, we wouldn't guess that they came post-1492, since in both their overall diversity levels and haplotype structure they're not that different from Europeans.

n/a said...

the "Caucasus-Baluchistan" component

Is not real. At the very least, it's not meaningful on the time scale we're interested in here; but more likely there never existed at any time a "k5" people who spread anywhere. See the behavior of the intra-Eurasian components as K increases in supplementary figure 4.

Jim said...

"If we looked at the genomes of New World Caucasoids, for example, we wouldn't guess that they came post-1492, since in both their overall diversity levels and haplotype structure they're not that different from Europeans."

By analogy on the language side, someone proposed that Burushaski came south with the Indo-Aryans, à la the Kiowa Apache, and that the community in the Hunza Valley may be a remnant of that.

Anonymous said...

Mix is bad?

"Did genetic variation in West Eurasia and South Asia accumulate separately after the out-of-Africa migration; do the observed instances of shared ancestry component and selection signals reflect secondary gene flow between two regions, or do the populations living in these two regions have a common population history, in which case it is likely that West Eurasian diversity is derived from the more diverse South Asian gene pool"

and

"Summing up, our results confirm both ancestry and temporal complexity shaping the still on-going process of genetic structuring of South Asian populations. This intricacy cannot be readily explained by the putative recent influx of Indo-Aryans alone but suggests multiple gene flows to the South Asian gene pool, both from the west and east, over a much longer time span."

German Dziebel said...

"If we looked at the genomes of New World Caucasoids, for example, we wouldn't guess that they came post-1492, since in both their overall diversity levels and haplotype structure they're not that different from Europeans."

Dienekes, if you know this, how come you support Out of Africa, a theory that's rooted in an untested or, better to say, counterfactual assumption that a daughter population is always very different from the source population, and always in the direction of diversity reduction in the daughter population?