A new preprint on bioRxiv suggests that it is possible to geographically localize a person's four grandparents. Localization is often a problem for people of mixed ancestry, who tend to plot in PCAs at some average location between their ancestors (so someone who is Swedish+Italian+Spanish+Russian might end up somewhere in central Europe even though none of his ancestors are central European).
This has appeared shortly after the GPS method of Elhaik et al. (2014), which presents evidence of being more accurate than SPA, so it will be interesting to see a comparison between SPAMIX (the new method) and GPS. My experience in the Dodecad Project suggests that this is a useful feature. The Dodecad Oracle could sometimes be used for this purpose: it could infer, for example, that a person who had one Ashkenazi Jewish grandparent and three English ones was a ~3/4 British+~1/4 Jewish mix. But it is limited to mixtures of two populations, so it could not cope with the case of 3-4 grandparents with different origins. There is an under-appreciated pool of adoptees who would love a tool like that, and there are also obvious forensic implications if something like this really works.
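To make the averaging problem mentioned above concrete, here is a toy sketch. It assumes that a PCA position roughly mirrors geography and that a mixed individual lands at the plain average of his grandparents' locations. Both are simplifications, and the capital-city coordinates are only stand-ins I picked for the four countries of the example; none of this comes from the preprint.

```python
# Toy illustration only: national capitals stand in for the grandparents'
# origins (an assumption), with approximate (latitude, longitude) in degrees.
grandparents = {
    "Stockholm": (59.33, 18.07),
    "Rome": (41.90, 12.50),
    "Madrid": (40.42, -3.70),
    "Moscow": (55.76, 37.62),
}


def naive_centroid(points):
    """Arithmetic mean of (lat, lon) pairs; a rough sketch, not a geodesic mean."""
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    return sum(lats) / len(lats), sum(lons) / len(lons)


lat, lon = naive_centroid(list(grandparents.values()))
print(f"{lat:.2f}N, {lon:.2f}E")
```

The average comes out at about 49.35N, 16.12E, which is in the Czech Republic, a place where none of the four hypothetical grandparents came from.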
bioRxiv doi: 10.1101/004713
Spatial localization of recent ancestors for admixed individuals
Wen-Yun Yang et al.
Ancestry analysis from genetic data plays a critical role in studies of human disease and evolution. Recent work has introduced explicit models for the geographic distribution of genetic variation and has shown that such explicit models yield superior accuracy in ancestry inference over non-model-based methods. Here we extend such work to introduce a method that models admixture between ancestors from multiple sources across a geographic continuum. We devise efficient algorithms based on hidden Markov models to localize on a map the recent ancestors (e.g. grandparents) of admixed individuals, joint with assigning ancestry at each locus in the genome. We validate our methods using empirical data from individuals with mixed European ancestry from the POPRES study and show that our approach is able to localize their recent ancestors within an average of 470Km of the reported locations of their grandparents. Furthermore, simulations from real POPRES genotype data show that our method attains high accuracy in localizing recent ancestors of admixed individuals in Europe (an average of 550Km from their true location for localization of 2 ancestries in Europe, 4 generations ago). We explore the limits of ancestry localization under our approach and find that performance decreases as the number of distinct ancestries and generations since admixture increases. Finally, we build a map of expected localization accuracy across admixed individuals according to the location of origin within Europe of their ancestors.
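For readers wondering how a figure like "an average of 470Km" might be computed, the sketch below averages great-circle (haversine) distances between predicted and reported locations. This is my assumption about a reasonable error measure, not necessarily the one used in the preprint; the function names and the spherical Earth radius of 6371 km are mine as well.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def mean_localization_error_km(predicted, reported):
    """Average distance between paired predicted and reported (lat, lon) locations."""
    if len(predicted) != len(reported) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    distances = [haversine_km(*p, *r) for p, r in zip(predicted, reported)]
    return sum(distances) / len(distances)
```

Each pair would be one ancestor: the location inferred by the method and the location reported for that grandparent.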