Haplotype clusters of rare variants in Korean genomes
A haplotype cluster corresponds to a DNA segment that is descended from a single founder and, therefore, some haplotypes are similar to each other in this segment. The advent of new sequencing technologies facilitates the identification of rare variants and therefore haplotype clusters containing rare variants (“rare haplotype clusters”). However, LD-related methods fail to extract rare haplotype clusters because of the high variance of LD measures for rare variants. IBD detection methods require sufficiently long IBD regions to avoid high false positive rates. We propose identifying rare haplotype clusters by HapFABIA which uses biclustering to combine LD information across individuals and IBD information along the chromosome. To identify rare haplotype clusters in the Korean population, we applied HapFABIA to data from the Korean Personal Genome Project (KPGP). Genotyping data from the KPGP was combined with those from the 1000-Genomes-Project leading to 1,131 individuals and 3.1 million single nucleotide variants (SNVs) on chromosome 1. HapFABIA identified 113,963 different rare haplotype clusters marked by tagSNVs that have a minor allele frequency of 5% or less. The rare haplotype clusters comprise 680,904 SNVs; that is 36.1% of the rare variants and 21.5% of all variants. The vast majority of 107,473 haplotype clusters contains Africans, while only 9,554 and 6,933 contain Europeans and Asians, respectively. We characterized haplotype clusters by matching with archaic genomes. Haplotype clusters that match the Denisova or the Neandertal genome are significantly enriched by Asians and Europeans. Interestingly, haplotype clusters matching the Denisova or the Neandertal genome contain, in some cases exclusively, Africans. Our findings indicate that the majority of rare haplotype clusters from chromosome 1 are based on ancient founder segments from times before humans migrated out of Africa. 
The enrichment of Koreans in Neandertal haplotype clusters (odds ratio 10.6 of Fisher’s exact test) is not as high as for Han Chinese from Beijing, Han Chinese from South, and Japanese (odds ratios 23.9, 19.1, 22.7 of Fisher’s exact test). In contrast to these results, the enrichment of Koreans in Denisova haplotype clusters (odds ratio 36.7 of Fisher’s exact test) is higher than for Han Chinese from Beijing, Han Chinese from South, and Japanese (odds ratios 7.6, 6.9, 7.0 of Fisher’s exact test).
This sounds very interesting, and I missed it the first time around because it will appear in the "Statistical Genetics and Genetic Epidemiology" session. I would wager that some of the African haplotype clusters may correspond to archaic Africans. I don't hold high hopes of ever seeing ancient DNA from sub-Saharan Africa, but the field has proceeded with such leaps and bounds that you never know.
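For readers wondering what the odds ratios quoted in the abstract measure, here is a minimal sketch of an enrichment test on a 2x2 table. The abstract does not say how its tables were built, so the layout (rows: cluster matches the archaic genome or not; columns: cluster contains individuals from the population or not), the counts, and the function names are all hypothetical, and the plain sample odds ratio used here may differ from the estimator the authors report.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Sample odds ratio of the 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact p-value for enrichment.

    Probability, with all margins fixed, of a top-left cell at least
    as large as the observed a (hypergeometric upper tail).
    """
    row1 = a + b          # e.g. clusters matching the archaic genome
    col1 = a + c          # e.g. clusters containing the population
    n = a + b + c + d     # all clusters
    upper = min(row1, col1)
    tail = sum(
        comb(row1, k) * comb(n - row1, col1 - k)
        for k in range(a, upper + 1)
    )
    return tail / comb(n, col1)


# Hypothetical counts, NOT taken from the abstract:
#                     contains population   does not
# matches archaic              20               80
# does not match               20              880
table = (20, 80, 20, 880)
print("odds ratio:", odds_ratio(*table))
print("one-sided p-value:", fisher_exact_greater(*table))
```

An odds ratio well above 1 means that clusters matching the archaic genome contain the population more often than the other clusters do, which is the sense in which the abstract speaks of enrichment.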
The other interesting part is that some haplotype clusters matching Neandertals and Denisovans match exclusively Africans. Does that scream back-migration? We are only scratching the surface of the complexity of the deepest origins of our species, it seems.
An evaluation of genetic characteristics of two population isolates from Greece: the HELIC-Pomak and MANOLIS studies presents Illumina OmniExpress data for hundreds of individuals of Pomak origin (a Muslim Slavic-speaking minority of Greek Thrace) and Anogia (a remote mountainous village in Crete) as well as a comparative sample of hundreds of general Greeks. Sounds very exciting.
Inferring Y Chromosome Phylogeny by Sequencing Diverse Population is another effort to work out the Y-chromosome phylogeny using sequencing (see here for some other efforts). The abstract is a bit vague, but:
Further, we resolve a major long-standing polytomy by identifying a variant for which one haplogroup retains the ancestral allele, whereas its brother clades share the derived allele, thus indicating common ancestry and uniting the latter two branches. This finding has been confirmed by genotyping a larger panel. Finally, we estimate the MSY rate of mutation recurrence and the time to the most recent common ancestor of the sampled chromosomes.
The structure within haplogroup F and within haplogroup K needs some working out, and I'm sure that when all is said and done we'll have a nice clean bifurcating phylogeny which will doubtlessly be very informative re: the immediate post-UP revolution settlement of Eurasia.