May 05, 2014

SPAMIX for spatial localization of admixed individuals

A new preprint on bioRxiv suggests that it is possible to geographically localize a person's four grandparents. Localization is often a problem for persons of mixed ancestry, who tend to plot in PCAs at some average location between their ancestors (so someone who is Swedish+Italian+Spanish+Russian might end up somewhere in central Europe even though none of his ancestors are central European).
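To make the averaging problem concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that an admixed person's PCA position is the simple average of his ancestors' positions and that PCA positions roughly mirror geography. The capital-city coordinates are approximate stand-ins for the four grandparents' origins, and the names `grandparents` and `naive_average` are hypothetical. None of this comes from the preprint or from SPAMIX itself.

```python
# Approximate (latitude, longitude) of four capitals, used as stand-ins
# for the grandparents' origins. Illustrative values, not data from the paper.
grandparents = {
    "Swedish (Stockholm)": (59.33, 18.07),
    "Italian (Rome)": (41.90, 12.50),
    "Spanish (Madrid)": (40.42, -3.70),
    "Russian (Moscow)": (55.76, 37.62),
}


def naive_average(points):
    """Arithmetic mean of (lat, lon) pairs: a crude flat-map centroid,
    not a proper spherical one, which is enough for this illustration."""
    lats, lons = zip(*points)
    return sum(lats) / len(lats), sum(lons) / len(lons)


avg_lat, avg_lon = naive_average(grandparents.values())
print(f"{avg_lat:.2f}, {avg_lon:.2f}")  # prints "49.35, 16.12"
```

The averaged point, about 49.4°N 16.1°E, falls roughly in the Czech Republic, far from all four of the assumed origins. A single placement on a map cannot recover them.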

This has appeared shortly after the GPS method of Elhaik et al. (2014), which presents evidence of being more accurate than SPA, so it will be interesting to see a comparison between SPAMIX and GPS. My experience in the Dodecad Project suggests that this is a useful feature: the Dodecad Oracle could sometimes be used for this purpose and could infer, e.g., that a person who had one Ashkenazi Jewish grandparent and three English ones was a ~3/4 British + ~1/4 Jewish mix, but it is limited to mixtures of two populations, so it could not cope with the case of 3-4 grandparents with different origins. There is an under-appreciated pool of adoptees who would love a tool like that, and there are also obvious forensic implications if something like this really works.

bioRxiv doi: 10.1101/004713

Spatial localization of recent ancestors for admixed individuals

Wen-Yun Yang et al.

Ancestry analysis from genetic data plays a critical role in studies of human disease and evolution. Recent work has introduced explicit models for the geographic distribution of genetic variation and has shown that such explicit models yield superior accuracy in ancestry inference over non-model-based methods. Here we extend such work to introduce a method that models admixture between ancestors from multiple sources across a geographic continuum. We devise efficient algorithms based on hidden Markov models to localize on a map the recent ancestors (e.g. grandparents) of admixed individuals, joint with assigning ancestry at each locus in the genome. We validate our methods using empirical data from individuals with mixed European ancestry from the POPRES study and show that our approach is able to localize their recent ancestors within an average of 470Km of the reported locations of their grandparents. Furthermore, simulations from real POPRES genotype data show that our method attains high accuracy in localizing recent ancestors of admixed individuals in Europe (an average of 550Km from their true location for localization of 2 ancestries in Europe, 4 generations ago). We explore the limits of ancestry localization under our approach and find that performance decreases as the number of distinct ancestries and generations since admixture increases. Finally, we build a map of expected localization accuracy across admixed individuals according to the location of origin within Europe of their ancestors.

Link

3 comments:

יוסי בן הרוש said...

I have had a very good experience with Dodecad, which I used through http://www.gedmatch.com/. After I uploaded my DNA results from 23andMe and analyzed them, the Oracle found that I am 50% Iraqi Jew and 50% Moroccan Jew, which is 100% accurate. That is more accurate than 23andMe, which suggested I am a Druze. In fact, it pretty much amazed me to find out that there is a tool that can pinpoint with such accuracy the origin of Middle Easterners like me.

mooreisbetter said...

Dienekes, could you and others kindly post where one can participate in any and all of these projects?

My family is lucky to have 100% certainty in our family tree: that is, we have no adoptions, no holes in the record, we know what village everyone came from, and we have tested multiple males on most lines to confirm a low (no) likelihood of bastardy.

In particular, the fact that my family knows where each major line was c. 1600 AD makes me an ideal candidate to test these hypotheses and services.

Please let me know, all

Simon_W said...

Mooreisbetter, Dodecad is no longer updated and submissions are closed, but you can use Do-it-yourself-Dodecad:
http://dodecad.blogspot.com/2011/08/do-it-yourself-dodecad-v-20.html

Or you can use Gedmatch, which is easier to handle, but the downside is that it doesn't include the latest Globe calculators.