January 02, 2012

Relatives in HUGO Pan-Asian SNP data

I haven't used the Pan-Asian SNP data much since discovering them, because they share few SNPs with my main Illumina-based datasets, but this paper should be useful to anyone working with this important data resource.

PLoS ONE 6(12): e29502. doi:10.1371/journal.pone.0029502

Identification of Close Relatives in the HUGO Pan-Asian SNP Database

Xiong Yang et al.

The HUGO Pan-Asian SNP Consortium has recently released a genome-wide dataset, which consists of 1,719 DNA samples collected from 71 Asian populations. For studies of human population genetics such as genetic structure and migration history, this provided the most comprehensive large-scale survey of genetic variation to date in East and Southeast Asia. However, although considered in the analysis, close relatives were not clearly reported in the original paper. Here we performed a systematic analysis of genetic relationships among individuals from the Pan-Asian SNP (PASNP) database and identified 3 pairs of monozygotic twins or duplicate samples, 100 pairs of first-degree and 161 second-degree of relationships. Three standardized subsets with different levels of unrelated individuals were suggested here for future applications of the samples in most types of population-genetics studies (denoted by PASNP1716, PASNP1640 and PASNP1583 respectively) based on the relationships inferred in this study. In addition, we provided gender information for PASNP samples, which were not included in the original dataset, based on analysis of X chromosome data.
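For readers who want to build similar filtered subsets from their own relatedness estimates, here is a minimal Python sketch. It is not the authors' pipeline: the cutoffs are the kinship-coefficient ranges commonly quoted for KING-style inference, and the function names `classify` and `unrelated_subset`, along with the greedy pruning rule, are my own illustrative assumptions. The abstract does not say which criteria or pruning strategy the paper used.

```python
from collections import defaultdict

# Assumed cutoffs on the kinship coefficient (KING-style ranges),
# not necessarily the criteria used in the paper.
CUTOFFS = [
    (0.354, "duplicate/MZ twin"),
    (0.177, "first-degree"),
    (0.0884, "second-degree"),
]


def classify(kinship):
    """Map a pairwise kinship coefficient to a relationship class."""
    for cutoff, label in CUTOFFS:
        if kinship > cutoff:
            return label
    return "unrelated or more distant"


def unrelated_subset(samples, related_pairs):
    """Greedily drop the sample with the most remaining relatives
    until no related pair is left among the kept samples."""
    neighbours = defaultdict(set)
    for a, b in related_pairs:
        neighbours[a].add(b)
        neighbours[b].add(a)

    kept = set(samples)
    while kept:
        # Most-connected sample still kept; ties broken by sample name.
        worst = max(kept, key=lambda s: (len(neighbours[s] & kept), s))
        if not neighbours[worst] & kept:
            break  # no related pairs remain
        kept.remove(worst)
    return sorted(kept)


if __name__ == "__main__":
    # Hypothetical sample IDs and kinship estimates, for illustration only.
    estimates = {("A", "B"): 0.25, ("B", "C"): 0.12, ("C", "D"): 0.01}
    close = [pair for pair, k in estimates.items()
             if classify(k) != "unrelated or more distant"]
    print(unrelated_subset(["A", "B", "C", "D"], close))
```

Moving the cutoff at which a pair counts as "related" (duplicates only, then first-degree, then second-degree) yields progressively smaller subsets, which is the general idea behind offering several standardized sample lists. The greedy rule keeps many samples but is not guaranteed to keep the maximum possible number.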


