bioRxiv doi: http://dx.doi.org/10.1101/005462
Human genomic regions with exceptionally high or low levels of population differentiation identified from 911 whole-genome sequences
Vincenza Colonna et al.
Background: Population differentiation has proved to be effective for identifying loci under geographically-localized positive selection, and has the potential to identify loci subject to balancing selection. We have previously investigated the pattern of genetic differentiation among human populations at 36.8 million genomic variants to identify sites in the genome showing high frequency differences. Here, we extend this dataset to include additional variants, survey sites with low levels of differentiation, and evaluate the extent to which highly differentiated sites are likely to result from selective or other processes. Results: We demonstrate that while sites of low differentiation represent sampling effects rather than balancing selection, sites showing extremely high population differentiation are enriched for positive selection events and that one half may be the result of classic selective sweeps. Among these, we rediscover known examples, where we actually identify the established functional SNP, and discover novel examples including the genes ABCA12, CALD1 and ZNF804, which we speculate may be linked to adaptations in skin, calcium metabolism and defense, respectively. Conclusions: We have identified known and many novel candidate regions for geographically restricted positive selection, and suggest several directions for further research.
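To make "sites in the genome showing high frequency differences" concrete, here is a minimal sketch of one simple way to score a single site: take the derived-allele frequency in each population and report the largest absolute difference between any pair. This is an illustration only and is not claimed to be the statistic or thresholds the authors used. The function names and the allele counts are hypothetical, and the population labels are borrowed from the abbreviations that appear in the paper.

```python
from itertools import combinations


def allele_frequency(derived_count: int, total_alleles: int) -> float:
    """Fraction of sampled chromosomes that carry the derived allele."""
    if total_alleles <= 0:
        raise ValueError("total_alleles must be positive")
    return derived_count / total_alleles


def max_frequency_difference(freq_by_pop: dict) -> float:
    """Largest absolute frequency difference over all pairs of populations."""
    return max(
        abs(freq_by_pop[a] - freq_by_pop[b])
        for a, b in combinations(freq_by_pop, 2)
    )


# Hypothetical counts for one site: (derived alleles, chromosomes sampled).
# These numbers are made up for illustration and do not come from the paper.
site_counts = {"AFR": (10, 400), "EUR": (380, 400), "ASN": (200, 400)}

freqs = {pop: allele_frequency(d, n) for pop, (d, n) in site_counts.items()}
diff = max_frequency_difference(freqs)

print(freqs)
print(f"largest pairwise frequency difference: {diff:.3f}")
```

A score like this describes the present-day pattern only. It says nothing by itself about which process (selection, drift, a founder event) produced the difference.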
Yeah, the quote below, from the full PDF, points to the deep problem with this kind of analysis. Larger populations logically don’t make selection more efficient. The efficiency should go the other way, especially for some of the traits mentioned (like blue eyes), where founder effects and the like are possible. Identifying “high differentiation” is more about effect than about cause, so it doesn’t tell us how the current situation came about. It certainly can’t tell us whether the selection was functional or an accidental outcome that bloomed because of a founder effect or local circumstances. The African result also points to a problem in the data.
“SNP ascertainment in African populations has been less thorough than in
European populations, and so analyses based on known SNPs have been biased against
discovering highly differentiated sites in Africans. The identification of HighD sites from full
sequence data, however, is unaffected by recombination or ascertainment, and the lower
number of HighD sites in Africa (25 vs 110 each in EUR and ASN) supports the hypothesis of
less positive selection of this type, despite the larger effective population size which should
make selection more efficient.”
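For anyone who wants to see what the last clause of the quote appears to be appealing to, here is a small sketch of the textbook diffusion approximation (Kimura) for the fixation probability of a new additive mutation in a diploid population. It is not an analysis from the paper and it does not settle the founder-effect question raised above. The function name, the selection coefficient, and the two population sizes are assumptions chosen for illustration, and the census size is assumed equal to the effective size unless given.

```python
import math


def fixation_probability(n_e: float, s: float, census_n: float = None) -> float:
    """Kimura's diffusion approximation for one new additive mutation.

    n_e      : effective population size (diploid individuals)
    s        : selection coefficient of the heterozygote (assumed additive)
    census_n : census size; assumed equal to n_e when not given
    """
    n = n_e if census_n is None else census_n
    p0 = 1.0 / (2.0 * n)  # starting frequency of a single new copy
    if s == 0:
        return p0  # neutral case: fixation probability equals p0
    # -expm1(-x) equals 1 - exp(-x) and is accurate for small x
    return -math.expm1(-4.0 * n_e * s * p0) / -math.expm1(-4.0 * n_e * s)


s = 0.001  # hypothetical weak advantage
for n_e in (100, 100_000):  # hypothetical small and large populations
    selected = fixation_probability(n_e, s)
    neutral = fixation_probability(n_e, 0.0)
    print(f"Ne={n_e}: selected/neutral fixation ratio = {selected / neutral:.1f}")
```

Under these assumptions the favoured allele is barely distinguishable from a neutral one in the small population and fixes far more often than a neutral one in the large population. What the model leaves out (founder events, changing population size, local circumstances) is exactly what the comment above is concerned with.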