May 28, 2010

Genomic ancestry of ethnic Americans (Wang et al. 2010)

The paper also has quite extensive supplementary material. On the left, part of Figure S6: Clustering of the MEC Hawaiians (MEC_H) using the top 2 PCs from PCA on the HapMap European and East Asian samples plus MEC_H using 2509 SNPs.

As you can see, Hawaiians can be viewed as having triple ancestry from Europeans, East Asians, and a third population, representing the natives prior to European American and Japanese immigration to the islands of Hawaii.

Quite interesting also is the paper's depiction of outliers in various populations. For example, in Japanese Americans:

The above (from Figure S2) depicts Japanese Americans on the PC axes determined by all populations. Outliers deviate towards Europeans. In some cases this makes sense, as the individuals report a White parent (1J 1W plus unlabeled dots), but there are a couple of dots reporting two Japanese parents that still deviate towards Whites. Assuming no clerical error, this is probably a case of cryptic White ancestry, or, alternatively, a case of two parents who considered themselves Japanese even though at least one of them had substantial Caucasoid ancestry.

On the flip side, here is the PCA map of White outliers.

Once again, we see a main pink cluster dominated by persons reporting 100% W ancestry, with deviations in two directions: top-right, towards East Asians, and top-left, towards Sub-Saharan Africans. There are clear cases where self-reported ancestry does not match reality, e.g., (i) the group of 4 pink dots on the top right showing clear evidence of East Asian admixture despite reporting 100% W ancestry, or (ii) a set of non-pink dots right in the middle of the pink cluster, which appear (at least in the first two PCs) to be indistinguishable from other Whites, but report mixed ancestry.
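To make the idea of "outliers on the first two PCs" concrete, here is a minimal sketch in Python. It is not the paper's pipeline: the genotypes are simulated, the labels pop_a and pop_b are hypothetical stand-ins for self-reported groups, and the robust z-score rule with its threshold of 6 is an assumption chosen purely for illustration. Three simulated individuals carry the pop_a label but are drawn from allele frequencies halfway between the two populations, mimicking people whose self-report hides admixture.

```python
import numpy as np

def top_pcs(genotypes, n_components=2):
    """Scores on the top PCs; rows are individuals, columns are SNPs coded 0/1/2."""
    centered = genotypes - genotypes.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

def flag_outliers(scores, labels, group, threshold=6.0):
    """Indices of `group` members lying far from the group's median on any PC.

    Distance is measured in robust units (median absolute deviation scaled
    to match a standard deviation). The threshold is an illustrative assumption.
    """
    idx = np.flatnonzero(labels == group)
    pts = scores[idx]
    center = np.median(pts, axis=0)
    spread = 1.4826 * np.median(np.abs(pts - center), axis=0)
    z = np.abs(pts - center) / spread
    return idx[(z > threshold).any(axis=1)]

# Simulated data only: two populations with independent allele frequencies.
rng = np.random.default_rng(0)
n_snps = 500
freq_a = rng.uniform(0.1, 0.9, n_snps)
freq_b = rng.uniform(0.1, 0.9, n_snps)
freq_mix = (freq_a + freq_b) / 2  # hypothetical 50/50 admixed individuals

genotypes = np.vstack([
    rng.binomial(2, freq_a, size=(100, n_snps)),
    rng.binomial(2, freq_b, size=(100, n_snps)),
    rng.binomial(2, freq_mix, size=(3, n_snps)),
]).astype(float)

# The last three individuals self-report as pop_a despite mixed ancestry.
labels = np.array(["pop_a"] * 100 + ["pop_b"] * 100 + ["pop_a"] * 3)
admixed_idx = [200, 201, 202]

scores = top_pcs(genotypes)
flagged = flag_outliers(scores, labels, "pop_a")
print(flagged)
```

The converse case, people who report mixed ancestry but sit inside the main cluster, would not be flagged by a rule like this; it shows up only by comparing labels against cluster membership.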

Human Genetics doi:10.1007/s00439-010-0841-4

Self-reported ethnicity, genetic structure and the impact of population stratification in a multiethnic study

Hansong Wang et al.

It is well-known that population substructure may lead to confounding in case–control association studies. Here, we examined genetic structure in a large racially and ethnically diverse sample consisting of five ethnic groups of the Multiethnic Cohort study (African Americans, Japanese Americans, Latinos, European Americans and Native Hawaiians) using 2,509 SNPs distributed across the genome. Principal component analysis on 6,213 study participants, 18 Native Americans and 11 HapMap III populations revealed four important principal components (PCs): the first two separated Asians, Europeans and Africans, and the third and fourth corresponded to Native American and Native Hawaiian (Polynesian) ancestry, respectively. Individual ethnic composition derived from self-reported parental information matched well to genetic ancestry for Japanese and European Americans. STRUCTURE-estimated individual ancestral proportions for African Americans and Latinos are consistent with previous reports. We quantified the East Asian (mean 27%), European (mean 27%) and Polynesian (mean 46%) ancestral proportions for the first time, to our knowledge, for Native Hawaiians. Simulations based on realistic settings of case–control studies nested in the Multiethnic Cohort found that the effect of population stratification was modest and readily corrected by adjusting for race/ethnicity or by adjusting for top PCs derived from all SNPs or from ancestry informative markers; the power of these approaches was similar when averaged across causal variants simulated based on allele frequencies of the 2,509 genotyped markers. The bias may be large in case-only analysis of gene by gene interactions but it can be corrected by top PCs derived from all SNPs.


1 comment:

Andrew Oh-Willeke said...

* "this is probably a case of cryptic White ancestry"

What a remarkably polite way to put that. I can also imagine an East Asian adoptee identifying as "white."

* "a set of non-pink dots right in the middle of the pink cluster, which appear (at least in the first two PCs) to be indistinguishable from other Whites, but report mixed ancestry."

Someone with a Spanish-speaking ancestor who was entirely European in descent might report a mixed race identity, as might someone who had an apocryphal Native American ancestor, or someone who had an African-American ancestor with a very high percentage of European descent who was still considered mixed race under the "one drop rule."