September 01, 2007

Structure of genetic variation in US populations

A new AJHG preprint looks at the genetic variation in different groups of the US population.

From the paper:
A frequent claim about human population structure is that most common variation is shared among all populations [11-13]. This, of course, depends on how population boundaries are defined, but often cited to support such comments are the comparisons of SNP frequencies in pairs of populations in the HapMap data and the Perlegen data. Analyses of these data indicated that common SNPs were frequently both shared and common among populations of predominately African, Asian, and European ancestry. However, population genetic analysis was not the intended goal of either the HapMap or the Perlegen projects, and common, shared SNPs were over sampled by the ascertainment strategies used for each project.
The paper also offers an interesting view of the genetic structure of the main US population groups.
C. Stacked bar chart inferred from results of model-based cluster analysis using STRUCTURE 2.0. Each bar represents an individual, and each bar is divided according to the fraction of cluster membership. D. Triangle plot illustrating the proportion of African, Asian, and European American ancestry of each individual (dots) estimated from STRUCTURE 2.0. (PC=principal component; AfA=African American; AsA=Asian American; EA=European Americans; HA=Latino/Hispanic Americans; MAF=minor allele frequency.)


The structure of common genetic variation in U.S. populations

Stephen L. Guthery et al.

ABSTRACT

The common variant/common disease model predicts that most risk alleles underlying complex health-related traits are common and therefore old and found in multiple populations, rather than rare or population-specific. Accordingly, there is widespread interest in assessing the population structure of common alleles. However, such assessments have been confounded by analysis of datasets with bias toward ascertainment of common alleles (e.g., HapMap, Perlegen) or in which a relatively small number of genes and/or populations were sampled. The aim of this study was to examine the structure of common variation ascertained in major U.S. populations by resequencing the exons and flanking regions of 3,873 genes in 154 chromosomes from European, Latino/Hispanic, Asian, and African Americans generated by the Genaissance Resequencing Project. The frequency distributions of private and common single nucleotide polymorphisms (SNPs) were measured, and the extent to which common SNPs were shared across populations was analyzed using several different estimators of population structure. Most SNPs that were common in one population were present in multiple populations, but SNPs common in one population were frequently not common in other populations. Moreover, SNPs that were common in two or more populations often differed significantly in frequency from one another, particularly in comparisons of African Americans versus other U.S. populations. These findings indicate that even if the bulk of alleles underlying complex health-related traits are common SNPs, geographic ancestry might well be an important predictor of whether a person carries a risk allele.
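To make the abstract's distinction between "present in multiple populations" and "common in multiple populations" concrete, here is a minimal Python sketch. It is not the authors' method: the 5% cutoff for "common", the function names, and the allele counts are all assumptions chosen for illustration, and the per-population sample sizes are hypothetical.

```python
from typing import Dict, List

# Assumed cutoff for calling a SNP "common"; not taken from the paper.
COMMON_MAF = 0.05


def minor_allele_frequency(alt_count: int, n_chromosomes: int) -> float:
    """Fold an alternate-allele count into a minor allele frequency (MAF)."""
    freq = alt_count / n_chromosomes
    return min(freq, 1.0 - freq)


def classify_snp(counts: Dict[str, int], sizes: Dict[str, int]) -> Dict[str, List[str]]:
    """List the populations where a SNP is polymorphic and where it is common.

    counts: alternate-allele count per population (hypothetical data).
    sizes:  number of chromosomes sampled per population (hypothetical data).
    Note: a site fixed for different alleles in different populations has
    MAF 0 everywhere and is not counted as present by this simple rule.
    """
    mafs = {pop: minor_allele_frequency(counts[pop], sizes[pop]) for pop in counts}
    present = sorted(pop for pop, maf in mafs.items() if maf > 0)
    common = sorted(pop for pop, maf in mafs.items() if maf >= COMMON_MAF)
    return {"present": present, "common": common}


def is_private(classification: Dict[str, List[str]]) -> bool:
    """A SNP is treated as private if it is polymorphic in exactly one population."""
    return len(classification["present"]) == 1


if __name__ == "__main__":
    # Made-up counts using the population labels from the figure caption.
    sizes = {"AfA": 40, "AsA": 40, "EA": 40, "HA": 40}
    counts = {"AfA": 12, "AsA": 1, "EA": 0, "HA": 3}
    result = classify_snp(counts, sizes)
    print(result, "private:", is_private(result))
```

In this made-up example the SNP is present in three populations but common in only two of them, which is the kind of pattern the abstract describes: shared across populations, yet not common in all of them.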

Link

