March 20, 2011

D-statistic paper (Durand et al. 2011)

The D-statistic was introduced in Green et al. (2010) (Neandertal admixture paper) and used in Reich et al. (2010) (Denisovan admixture paper). In basic terms, it tests whether one of a pair of populations, P1 or P2, is genetically closer to a third population, P3, using P4 as an outgroup.


In the aforementioned papers it was usually used like this:

D(Eurasian, African, Archaic, Chimpanzee)

and its positive values were interpreted as evidence of archaic admixture of different kinds in subsets of modern humans (non-Africans and Melanesians).
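To make the idea concrete, here is a minimal sketch of how such a statistic can be computed from site patterns. It is not the code used in any of the papers: the function name d_statistic, the input format (one allele per population per site), and the toy data are all made up for illustration. Sign conventions also differ between papers; this sketch assumes the one that matches the usage above, where a positive value means P1 shares more derived alleles with P3 than P2 does.

```python
def d_statistic(sites):
    """Toy D-statistic from (p1, p2, p3, outgroup) alleles, one tuple per site.

    Assumed convention (hypothetical, chosen to match the post's usage):
    positive D means P1 shares more derived alleles with P3 than P2 does.
    """
    p1_shares = 0  # sites where only P1 and P3 carry the derived allele
    p2_shares = 0  # sites where only P2 and P3 carry the derived allele
    for p1, p2, p3, out in sites:
        if p3 == out:
            continue  # P3 matches the outgroup: site is uninformative
        if p1 == p3 and p2 == out:
            p1_shares += 1
        elif p2 == p3 and p1 == out:
            p2_shares += 1
        # any other pattern (e.g. P1 and P2 both match P3) is ignored
    total = p1_shares + p2_shares
    if total == 0:
        raise ValueError("no informative sites")
    return (p1_shares - p2_shares) / total


# Made-up data: 3 sites where P1 matches P3, 1 where P2 matches P3,
# and 1 site where both match P3 (ignored).
toy_sites = (
    [("A", "C", "A", "C")] * 3
    + [("C", "A", "A", "C")]
    + [("A", "A", "A", "C")]
)
print(d_statistic(toy_sites))  # 0.5
```

With no admixture, the two discordant patterns are expected to be equally frequent, so D should be close to zero; the sketch says nothing about how to assess whether a non-zero value is statistically significant.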

The new paper is highly technical, but suggests that one can use this statistic to infer archaic admixture even in the absence of an ancient specimen. I'm not entirely clear on the nuts and bolts of how this works, but the gist seems to be the detection of levels of genetic divergence that can be explained either by archaic admixture or by thousands of generations of population structure that later broke down.

It might be interesting to see if new types of archaic admixture can be predicted from the genomes of modern populations, while we wait for the next archaic hominin to be sequenced. As I've mentioned in the past, hot and humid climates may make DNA preservation impossible, and hence there may never be an ancient sequence to compare against. In a sense, the Denisovan paper got lucky because the Denisova group (in the Altai) may have been related to the people that Melanesians admixed with much further south -- unless they took a massive detour.

So, hopefully, as full genomic data on diverse human populations become widely available over the next few years, new traces of archaic admixture and/or deep population structure may be inferred. The admixture record stands 2 for 2 with the only two actual archaic hominin groups that were tested so far, so I'd bet that this isn't the end of the story. Anthropological theory is bound to move away from naive recent Out-of-Africa and towards a more nuanced view of human origins, in which more diverse ancestors have their own place.

Mol Biol Evol (2011) doi: 10.1093/molbev/msr048

Testing for ancient admixture between closely related populations

Eric Y. Durand et al.

One enduring question in evolutionary biology is the extent of archaic admixture in the genomes of present-day populations. In this paper, we present a test for ancient admixture that exploits the asymmetry in the frequencies of the two non-concordant gene trees in a three-population species tree. This test was first developed to detect interbreeding between Neandertals and modern humans. We derive the analytic expectation of a test statistic, called the D-statistic, which is sensitive to asymmetry under alternative demographic scenarios. We show that the D-statistic is insensitive to some demographic assumptions such as ancestral population sizes, and requires only the assumption that the ancestral populations were randomly mating. An important aspect of D-statistics is that they can be used to detect archaic admixture even when no archaic sample is available. We explore the effect of sequencing error on the false positive rate of the test for admixture, and we show how to estimate the proportion of archaic ancestry in the genomes of present-day populations. We also investigate a model of subdivision in ancestral populations that can result in D-statistics that indicate recent admixture.


1 comment:

caldararo said...

One problem we face with this kind of work is the assumption that we are dealing with authentic sequences. As Guthrie and I noted in a paper (2010) on the Denisova sequences, there are serious deviations between the reported sequences (as published) and the GenBank reported sequences (assumed consensus). We could not account for this, but in addition Longo et al. reported in PLoS ONE in February 2011 that most reported non-primate sequences contain human contamination. Contamination would therefore appear to be underestimated in all present phylogenetic analyses.