November 26, 2008

Allele surfing vs. Positive Selection

Continental human populations show very large allele frequency differences at several loci. One explanation for this phenomenon is that, after arriving in new lands, humans underwent selection for alleles suited to the new environments. An alternative explanation is that the differences are due to allele surfing, a process in which a small subset of individuals at the frontier of an expansion multiplies into previously unsettled territory, causing their particular alleles to increase in frequency there.
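To make the surfing idea concrete, here is a minimal Python sketch of repeated founder events along a one-dimensional colonisation route. It is not the simulation method used in the paper. The function names (`colonise`, `fraction_diverged`) and all parameter values are assumptions chosen for illustration, and the model ignores migration, population growth, and mutation. The point is only that neutral sampling at the frontier, with no selection at all, can produce large frequency differences between the source and the far end of the route when founding groups are small.

```python
import random

def binomial(rng, n, p):
    """Draw from Binomial(n, p) by summing n Bernoulli trials (stdlib only)."""
    return sum(1 for _ in range(n) if rng.random() < p)

def colonise(p0, n_demes, founders, rng):
    """Allele frequency in each deme along a one-dimensional colonisation route.

    Each new deme is founded by `founders` diploid individuals sampled from
    the current frontier deme. No selection acts at any step.
    """
    copies = 2 * founders  # gene copies carried by the founding group
    freqs = [p0]
    for _ in range(n_demes - 1):
        freqs.append(binomial(rng, copies, freqs[-1]) / copies)
    return freqs

def fraction_diverged(p0, n_demes, founders, n_loci, threshold, seed):
    """Fraction of independent neutral loci whose frequency at the far end
    of the route differs from the source frequency by at least `threshold`."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_loci):
        front = colonise(p0, n_demes, founders, rng)[-1]
        # small tolerance so floating-point rounding does not hide exact ties
        if abs(front - p0) >= threshold - 1e-12:
            hits += 1
    return hits / n_loci

if __name__ == "__main__":
    # Hypothetical settings: 20 demes, source frequency 0.2, 100 loci.
    for founders in (5, 200):
        frac = fraction_diverged(0.2, 20, founders, 100, 0.4, seed=1)
        print(f"founders={founders}: fraction of loci diverged = {frac:.2f}")
```

Comparing a small founding group with a large one shows how strongly the outcome depends on the size of the frontier population, which is the demographic ingredient the surfing hypothesis relies on.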

From the paper:
The survey of the HGDP database on human polymorphisms reveals that large allele frequency differences between continental regions are extremely common. Indeed as much as 30% of loci show very large allele frequency differences between continents. These differences are unlikely to have been created by positive selection, but are more likely the result of neutral demographic processes such as the surfing phenomenon. Because the erosion of large allele frequency differences by mutation is slow, even for large mutation rates, the surprisingly large number of strongly differentiated STR alleles also do not need to be explained by the action of positive selection. Africa and the Americas show a much larger extent of differentiation than Eurasia or East Asia, which is certainly due to changes in allele frequencies during the colonisation of the Eurasian and the American continents. Disentangling the effects of selection and neutral demographic processes on genome diversity remains an important challenge of future human evolution studies.

This is a serious challenge to the selectionist paradigm and should be answered by its proponents. I would say that, from now on, the "gold standard" of positive selection should be concrete evidence that the proposed selected alleles actually do something that could have been selected, e.g., lactase persistence, where allele frequency differences are combined with a specific trait, which in turn is correlated with a particular selective influence (milk consumption after weaning). Statistical inference of selection without a comprehensive explanation is no longer intellectually convincing.

For example, ASPM and MCPH1 are loci that generated a lot of excitement as selection targets due to their large inter-group frequency differences. However, follow-up work has not found any substantial associations between them and anything of value: "Has ASPM been the target of recent selection?" and "ASPM, MCPH1, CDK5RAP and BRCA1 and general cognition, reading or language". Were they really selected, or did they ride the wave of human advance?

Annals of Human Genetics doi: 10.1111/j.1469-1809.2008.00489.x

Large Allele Frequency Differences between Human Continental Groups are more Likely to have Occurred by Drift During Range Expansions than by Selection

T. Hofer et al.

Several studies have found strikingly different allele frequencies between continents. This has been mainly interpreted as being due to local adaptation. However, demographic factors can generate similar patterns. Namely, allelic surfing during a population range expansion may increase the frequency of alleles in newly colonised areas. In this study, we examined 772 STRs, 210 diallelic indels, and 2834 SNPs typed in 53 human populations worldwide under the HGDP-CEPH Diversity Panel to determine to which extent allele frequency differs among four regions (Africa, Eurasia, East Asia, and America). We find that large allele frequency differences between continents are surprisingly common, and that Africa and America show the largest number of loci with extreme frequency differences. Moreover, more STR alleles have increased rather than decreased in frequency outside Africa, as expected under allelic surfing. Finally, there is no relationship between the extent of allele frequency differences and proximity to genes, as would be expected under selection. We therefore conclude that most of the observed large allele frequency differences between continents result from demography rather than from positive selection.


5 comments:

Kosmo said...

Wow.

This is an important study.

I completely agree: "Statistical inference of selection without a comprehensive explanation is no longer intellectually convincing."

Average Joe said...

Why is it that only the advocates of selection have to prove their case? Why don't the advocates of allele surfing have to prove anything? Also what is the difference between allele surfing and genetic drift?

Kosmo said...

"Why is it that only the advocates of selection have to prove their case? Why don't the advocates of allele surfing have to prove anything?"

--The allele surfing hypothesis requires less internal structure to function, and therefore would be the null.

"Also what is the difference between allele surfing and genetic drift?"

Allele surfing is like genetic drift with a single extra layer of organization (that layer being the preferential expansion of alleles present on the outskirts of a population).

Tuuli Lappalainen said...

"I would say that, from now on, the "gold standard" of positive selection should be concrete evidence that the proposed selected alleles actually do something that could have been selected, e.g., lactase persistence, where allele frequency differences are combined with a specific trait, which in turn is correlated with a particular selective influence (milk consumption after weaning). Statistical inference of selection without a comprehensive explanation is no longer intellectually convincing."

This is an interesting and important study, but I disagree with the above.

People haven't been relying blindly on allele frequency differences to infer natural selection, since there are several other statistics that look for specific signatures in the genome left by natural selection (like the EHH-based statistics, or recombination rate, or lots of others). It's quite well-known that the overlap between geographically differentiated loci and those that show other signs of selection is far from perfect, and that drift (or surfing) may cause large frequency differences just by chance.

But requiring functional evidence to call positive selection would be overcautious: genome-wide scans for selection do and will find lots of loci where there is strong evidence of selection from multiple lines of evidence, but still no clue about the function. These findings should be published, and hopefully someone will find the function later on. It's like genome-wide association studies: sometimes you find genes that are unexpected and totally unknown, and it will take years to figure out the functional pathways. It still doesn't mean that the finding isn't real.

Dienekes said...

I did not imply that "statistical inference of selection" boils down to just observing allele frequency differences. And certainly, putative signals of selection should be published. What I'm saying is that what is interpreted as a signal of selection often admits of other explanations (as this article shows). If drift can be excluded only on the basis of a statistical argument, then so much the better. But ultimately, the case for selection will be solidified (hence "the gold standard") if the statistical argument from the genome is part of a broader argument based on the functional significance of the selected alleles.