September 16, 2008

Why genome-wide association studies don't really work (and how human evolution really happens)

In case you are skeptical about my gloomy assessment of the power of genome-wide association studies, here is Nicholas Wade in today's New York Times, profiling David Goldstein (via john hawks):
This idea, called the common disease/common variant hypothesis, drove major developments in biology over the last five years. Washington financed the HapMap, a catalog of common genetic variation in the human population. Companies like Affymetrix and Illumina developed powerful gene chips for scanning the human genome. Medical statisticians designed the genomewide association study, a robust methodology for discovering true disease genes and sidestepping the many false positives that have plagued the field.

But David B. Goldstein of Duke University, a leading young population geneticist known partly for his research into the genetic roots of Jewish ancestry, says the effort to nail down the genetics of most common diseases is not working. “There is absolutely no question,” he said, “that for the whole hope of personalized medicine, the news has been just about as bleak as it could be.”


The reason for this disappointing outcome, in his view, is that natural selection has been far more efficient than many researchers expected at screening out disease-causing variants. The common disease/common variant idea is largely wrong. What has happened is that a multitude of rare variants lie at the root of most common diseases, being rigorously pruned away as soon as any starts to become widespread.

I would only add that the common variant idea is probably wrong for neutral or positive traits as well. While negative traits are culled from the gene pool by purifying selection, advantageous traits (muscularity, beauty, intelligence, etc.) are positively selected.

But, what is selected? Surely, "a multitude of rare variants" can't exist for positive traits: most mutation is deleterious; the good stuff doesn't appear de novo very often. In my opinion, by and large, it is not common variants behind these traits. Rather, it is fortuitous combinations of unexceptional alleles.

This Lego-block paradigm is based on the notion that most of our alleles are commodity "building blocks"; if they are brought together harmoniously, they produce positive results. The occasional allele may have a large effect, and some alleles fit better together than others. Yet, most of the success or failure of a construction depends on how the components fit together, and not what they are.

I had previously made the point that evolution doesn't require mutation, selection, or drift but can be effected by the self-segregation of individuals into geographical or social niches for which they are better adapted. Differential reproduction of the semi-segregated geographical or social groups (i.e. group selection) then ensues.

There has doubtless been recent selection in humans, particularly because of feedback from the changed environments we have created for ourselves.

But, individuals don't differ from each other primarily because of genes that have undergone population-wide selection. Rather, we differ from each other first because we belong to a specific hierarchy of groups (race, subrace, ethnic group, etc.) and foremost because we have inherited a particular combination of alleles, our "family Lego shape," from our parents.


McG said...

Great piece. David Goldstein (google him) wrote my two favorite papers on pop gen, review papers with Pollack and later Stumpf. He was/is a strong advocate that mutations are constrained. I have mentioned before that my greatest concern about the type of analysis we are trying to do is that we have no channel model. I think maybe I should express it more strongly: there is an awful lot of intelligence built into this whole system, call it natural selection or what you will, but there is more than randomness, H/T coin flipping, going on as to when and to whom mutations occur.

In all my work I have tried to analyze results, histograms of mutations, without predicating how or why they occurred. I think the initial paper of ZUL did the same thing, along with establishing that STR microsatellite rates were the same as autosomal microsatellite rates.

His subsequent analyses, which try to explain why evolutionary rates are not the same as germ-line rates, are somewhat flawed by the forward-looking methodology he used. He presumed he could model how and when mutations occurred as a simple Poisson process. I'm not at all sure that is correct.

I wish he was still "into" popgen.

Jason Malloy said...

"Rather, it is fortuitous combinations of unexceptional alleles."

This is epistasis, and I don't think it agrees with what behavior genetic studies show for traits like height, beauty, intelligence, etc. For instance, estimates of heritability from identical twins, who share all their gene combinations, are not significantly higher than estimates of heritability from fraternal twins, who do not share all their gene combinations.
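To make the twin argument concrete, here is a minimal sketch using Falconer's classical formulas, which assume purely additive gene action. The function names and the correlations passed in are hypothetical, chosen only for illustration; they are not estimates from any study. Under additivity, identical-twin (MZ) similarity should be at most about twice fraternal-twin (DZ) similarity, so an MZ correlation well above twice the DZ correlation is the usual hint that non-additive effects such as epistasis or dominance are at work.

```python
def falconer(r_mz, r_dz):
    """Falconer's estimates from twin correlations, assuming additive effects.

    r_mz: trait correlation between identical (MZ) twins
    r_dz: trait correlation between fraternal (DZ) twins
    Returns (h2, c2, e2): heritability, shared environment, unique environment.
    """
    h2 = 2.0 * (r_mz - r_dz)   # additive genetic share
    c2 = 2.0 * r_dz - r_mz     # shared-environment share
    e2 = 1.0 - r_mz            # everything MZ twins do not share
    return h2, c2, e2


def suggests_nonadditive(r_mz, r_dz):
    """True when MZ similarity exceeds twice DZ similarity.

    In that case the additive model yields a negative c2, which is
    commonly read as a sign of epistasis or dominance.
    """
    return r_mz > 2.0 * r_dz


# Hypothetical correlations, for illustration only.
h2, c2, e2 = falconer(0.8, 0.5)
print(h2, c2, e2)
print(suggests_nonadditive(0.8, 0.5))  # additive pattern
print(suggests_nonadditive(0.6, 0.2))  # MZ more than twice DZ
```

If the Lego-block view were the main story, one would expect the second pattern to be common, since only identical twins inherit the same full combination of alleles.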

I think the genetic variation for many valued phenotypic traits is simply spread across a large number of common variants of very small effect.
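A rough sketch of what "many common variants of very small effect" means in variance terms, assuming independent biallelic loci with purely additive effects, so that each locus contributes 2p(1-p)b^2 to the genetic variance (p is the allele frequency, b the per-allele effect). The frequencies and effect sizes below are made up for illustration and the function name is hypothetical.

```python
def additive_variance(freqs, effects):
    """Total additive genetic variance over independent biallelic loci.

    Each locus with allele frequency p and per-allele effect b
    contributes 2 * p * (1 - p) * b**2.
    """
    return sum(2.0 * p * (1.0 - p) * b * b for p, b in zip(freqs, effects))


# Hypothetical scenario: 1000 common variants, each with a tiny effect.
n = 1000
common_total = additive_variance([0.5] * n, [0.02] * n)
one_common = additive_variance([0.5], [0.02])

# Hypothetical rare variant with a tenfold larger per-allele effect.
one_rare = additive_variance([0.001], [0.2])

print(common_total)               # combined variance of all common variants
print(one_common / common_total)  # share contributed by any single one
print(one_rare)                   # a single rare variant's contribution
```

In this toy setup no single common variant accounts for more than a sliver of the total, which is why each one is hard to detect individually, and a rare variant contributes little population-wide variance even when its per-allele effect is much larger.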

I suppose this predicts a fair amount of pleiotropy. There seems to be more evidence of this in populations where there appears to be rapid, recent selection. For instance, consider all the disease-intelligence associations Cochran-Harpending identified for Ashkenazi Jews. Using the General Social Survey, I also found that mental illness and personality disorder traits were associated with intelligence for Jews, but not for gentiles.