
December 12, 2012

Efficient moment-based inference of admixture parameters and sources of gene flow (Lipson et al. 2012)

My reading list keeps getting longer as another paper referenced by Loh et al. has now appeared on the arXiv, a day after the new Moorjani et al. paper on Romani origins. A number of papers from many of the same co-authors have appeared over the span of a couple of months, all of them containing interesting technical discussion on admixture parameter estimation, so perhaps this is a good place to make a list of them for easy reference:
This series of papers builds on earlier work, which can be found in the following:
The software introduced in the current paper (Lipson et al. 2012) can be found on the MixMapper page, and according to its description it is similar in spirit to the TreeMix software. Hopefully I'll be able to try it out for myself.

The most interesting thing about the current paper is, of course, its detection in Sardinians and Basques, albeit to a lesser extent, of the same "North Eurasian" ancestry that Patterson et al. found in northern Europeans.

Sardinians did not appear to have such ancestry on the basis of the f3-statistic, but this might have been a consequence of the fact that they were the least admixed of the Europeans. In that case no application of f3(Sardinian; X, Amerindian) would have given a negative result, because there does not exist any X less mixed with this Amerindian-like "North Eurasian" element than Sardinians.
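The f3 argument above can be illustrated with a toy simulation. This is only a sketch under assumptions of my own choosing, not the estimator used in any of the papers: the population labels, the mixture proportions (10% and 30%), the drift sizes and the SNP count are all hypothetical, and the statistic is the plain mean of (c - a)(c - b) over SNPs, with none of the finite-sample corrections a real f3 implementation applies.

```python
import random

def clip(p):
    """Keep a simulated allele frequency inside [0, 1]."""
    return min(1.0, max(0.0, p))

def simulate(n_snps=20000, seed=1, alpha_sard=0.1, alpha_north=0.3):
    """Toy allele frequencies. Every parameter here is an illustrative assumption."""
    rng = random.Random(seed)
    pops = {"Sardinian": [], "NorthEuropean": [], "Amerindian": []}
    for _ in range(n_snps):
        p0 = rng.uniform(0.2, 0.8)             # ancestral frequency
        west = p0 + rng.gauss(0, 0.05)         # western source, drifted from the ancestor
        north = p0 + rng.gauss(0, 0.05)        # northern source, drifted independently
        amer = north + rng.gauss(0, 0.05)      # Amerindians: drifted relatives of the northern source
        # Both European groups mix the same two sources, then drift a little.
        sard = (1 - alpha_sard) * west + alpha_sard * north + rng.gauss(0, 0.01)
        nor = (1 - alpha_north) * west + alpha_north * north + rng.gauss(0, 0.01)
        pops["Sardinian"].append(clip(sard))
        pops["NorthEuropean"].append(clip(nor))
        pops["Amerindian"].append(clip(amer))
    return pops

def f3(target, ref1, ref2):
    """Mean of (c - a)(c - b) over SNPs, with no finite-sample correction."""
    return sum((c - a) * (c - b) for c, a, b in zip(target, ref1, ref2)) / len(target)

freqs = simulate()
# The more admixed group as target, the less admixed one as reference X.
print("f3(NorthEuropean; Sardinian, Amerindian) =",
      f3(freqs["NorthEuropean"], freqs["Sardinian"], freqs["Amerindian"]))
# The less admixed group as target: the only available X is more admixed than it.
print("f3(Sardinian; NorthEuropean, Amerindian) =",
      f3(freqs["Sardinian"], freqs["NorthEuropean"], freqs["Amerindian"]))
```

In this toy setup the first statistic comes out negative and the second positive, even though the simulated Sardinians do carry 10% of the northern element. The test is blind to admixture in whichever population sits at the unadmixed end of the available references.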

Also, the ALDER paper seems not to have been able to date this type of admixture because of its antiquity. I have tried it myself, using a 1-ref approach on Sardinians (with Sardinians and various other "eastern" populations as possible contributors), but without success. So, it will be interesting to read how this type of ancestry was detected in the current paper. Any further comments will be posted in this space as updates.
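As a rough illustration of why antiquity is a problem for LD-based dating: admixture LD is expected to decay approximately as exp(-n d), where n is the number of generations since admixture and d is the genetic distance in Morgans. The sketch below simply evaluates that curve. The 0.5 cM distance and the generation counts are hypothetical values picked for illustration, not settings taken from ALDER or from any of the papers.

```python
import math

def remaining_ld_fraction(generations, distance_cm):
    """Fraction of admixture LD left at a genetic distance, assuming exp(-n * d) decay."""
    return math.exp(-generations * distance_cm / 100.0)  # convert cM to Morgans

# Hypothetical ages in generations, evaluated at an illustrative distance of 0.5 cM.
for n in (70, 400, 1000):
    print(n, "generations:", remaining_ld_fraction(n, 0.5))
```

The older the event, the less of the curve is left at the distances where it can be separated from background LD, so there is little to fit an exponential to.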

UPDATE I: On the left you can see the model proposed for Europe. A first observation is the absence of a primate outgroup, or indeed of any representatives of African hunter-gatherers. This makes sense in the context of this paper: all African hunter-gatherers have now been shown to have admixture from African farmers, so, not being unadmixed, they cannot be used for the "scaffold" tree.

However, their type of admixture differs from the admixture found in all other populations. For example, Europeans are a mixture of "Ancient Western Eurasians" and a group related to "Ancient Northern Eurasians". African hunter-gatherers, on the other hand, are a mixture between a group related to the Mandenka-Yoruba clade and (potentially diverse) sets of "Palaeoafricans". The latter are an outgroup to the rest of mankind, and as such admixture with them cannot be represented in this model. Consequently the Yoruba assume by default the position of an unadmixed outgroup to the rest of mankind, a position which, for reasons mentioned before in this blog, I believe is not correct. What effect this might have on the rest of the tree is not yet clear to me.

arXiv:1212.2555 [q-bio.PE]

Efficient moment-based inference of admixture parameters and sources of gene flow

Mark Lipson, Po-Ru Loh, Alex Levin, David Reich, Nick Patterson, Bonnie Berger

(Submitted on 11 Dec 2012)

The recent explosion in available genetic data has led to significant advances in understanding the demographic histories of and relationships among human populations. It is still a challenge, however, to infer reliable parameter values for complicated models involving many populations. Here we present MixMapper, an efficient, interactive method for constructing phylogenetic trees including admixture events using single nucleotide polymorphism (SNP) genotype data. MixMapper implements a novel two-phase approach to admixture inference using moment statistics, first building an unadmixed scaffold tree and then adding admixed populations by solving systems of equations that express allele frequency divergences in terms of mixture parameters. Importantly, all features of the tree, including topology, sources of gene flow, branch lengths, and mixture proportions, are optimized automatically from the data and include estimates of statistical uncertainty. MixMapper also uses a new method to express branch lengths in easily interpretable drift units. We apply MixMapper to recently published data for HGDP individuals genotyped on a SNP array designed especially for use in population genetics studies, obtaining confident results for 30 populations, 20 of them admixed. Notably, we confirm a signal of ancient admixture in European populations---including previously undetected admixture in Sardinians and Basques---involving a proportion of 20-40% ancient northern Eurasian ancestry.
