May 27, 2009

The first Korean genome

An interesting bit:
In addition, the comparison of indels between SJK and YH (Table 4) showed that the two genomes shared the same type of indels by 99.5% on the same genomic loci (SJK and HuRef shared 86.2%, SJK and Watson shared 87.8%, SJK and NA18507 shared 93.6%).

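So, based on indels, the Korean and Chinese individuals are ~24 times less distant from each other than the Korean is from James Watson (of European descent), and ~13 times less distant from each other than the Korean is from NA18507 (a Nigerian). These ratios compare the fraction of same-locus indels that differ: 0.5% for SJK and YH, versus 12.2% for SJK and Watson and 6.4% for SJK and NA18507. Table 4 in the paper has all the detailed numbers.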
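The arithmetic behind those ratios can be sketched as follows. This is my own back-of-the-envelope calculation, not something from the paper; it assumes that "distance" simply means the percentage of same-locus indels that do not match, and the variable names are mine.

```python
# Percent of same-locus indels that SJK shares with each genome,
# as quoted from the paper above.
shared_pct = {"YH": 99.5, "HuRef": 86.2, "Watson": 87.8, "NA18507": 93.6}

# Assumption: "distance" = percent of same-locus indels that differ.
distance = {name: 100.0 - pct for name, pct in shared_pct.items()}

# Each genome's distance from SJK, relative to the SJK-YH distance.
relative = {name: d / distance["YH"] for name, d in distance.items()}

for name, ratio in relative.items():
    print(f"{name}: {distance[name]:.1f}% differing, "
          f"{ratio:.1f}x the SJK-YH distance")
```

On these numbers Watson comes out at about 24.4 times and NA18507 at about 12.8 times the SJK-YH distance, which is where the ~24 and ~13 come from.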

Figure 2 shows the overlap, in number of SNPs, between the various full genomes available.


Consider (E): 1.2 million SNPs are shared by the Korean and Venter/Watson; ~0.5 million are shared by the Korean and Venter (but not Watson) and the Korean and Watson (but not Venter), i.e., they transcend racial lines.

But another ~0.5 million are shared by Venter and Watson, but not by the Korean. A subset of these may be shared by accident among these three individuals (i.e., another Korean might also possess some of them). Another subset may be shared by Venter, Watson, and most other Caucasoids; another subset may be shared by Venter and Watson alone, presumably due to their common Western European ancestry (or other shared minor ancestry), and so on.
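The regions of the Venn diagram discussed above correspond to simple set operations. Here is a minimal sketch with toy data: the SNP identifiers are invented for illustration and have nothing to do with the real genomes.

```python
# Toy SNP identifiers, invented for illustration only.
korean = {"snp1", "snp2", "snp3", "snp4"}
venter = {"snp1", "snp2", "snp5", "snp6"}
watson = {"snp1", "snp3", "snp5", "snp7"}

# SNPs carried by all three individuals.
shared_by_all = korean & venter & watson

# SNPs shared by exactly two individuals and absent from the third.
korean_venter_only = (korean & venter) - watson
korean_watson_only = (korean & watson) - venter
venter_watson_only = (venter & watson) - korean

print(len(shared_by_all), len(korean_venter_only),
      len(korean_watson_only), len(venter_watson_only))
```

With real data each set would hold millions of SNP positions, and adding more genomes would mean intersecting more sets in the same way.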

As we sample more full genomes, we will be able to zero in on the pan-human SNPs, which represent shared human genetic diversity, as well as SNPs limited to races, subraces, ethnic groups, regions, ..., individuals.

This is an open access paper, so you can read it for yourselves.

Genome Research doi:10.1101/gr.092197.109

The first Korean genome sequence and analysis: Full genome sequencing for a socio-ethnic group

Sung-Min Ahn et al.

Abstract

We present the first Korean individual genome sequence (SJK) and analysis results. The diploid genome of a Korean male was sequenced to 28.95-fold redundancy using the Illumina paired-end sequencing method. SJK covered 99.9% of the NCBI human reference genome. We identified 420,083 novel SNPs that are not in the dbSNP database. Despite a close similarity, significant differences were observed between the Chinese genome (YH), the only other Asian genome available, and SJK: 1) 39.87% (1,371,239 out of 3,439,107) SNPs were SJK-specific (49.51% against Venter's, 46.94% against Watson's, and 44.17% against the Yoruba genomes), 2) 99.5% (22,495 out of 22,605) of short indels (less than 4 bp) discovered on the same loci had the same size and type as YH, and 3) 11.3% (331 out of 2920) deletion structural variants were SJK-specific. Even after attempting to map unmapped reads of SJK to unanchored NCBI scaffolds, HGSV, and available personal genomes, there were still 5.77% SJK reads that could not be mapped. All these findings indicate that the overall genetic differences among individuals from closely related ethnic groups may be significant. Hence, constructing reference genomes for minor socio-ethnic groups will be useful for massive individual genome sequencing.


1 comment:

argiedude said...

From Steve Sailer's website:

"An analysis of James Watson's genome shows that 16% of his genes are likely to have come from a black ancestor of African descent. By contrast, most people of European descent would have no more than 1%."

"The analysis by deCODE Genetics, an Icelandic company, also shows a further 9% of Watson’s genes are likely to have come from an ancestor of Asian descent."

“This level is what you would expect in someone who had a great-grandparent who was African,” said Kari Stefansson of deCODE Genetics, whose company carried out the analysis. “It was very surprising to get this result for Jim."

......................................

Here's the share price of deCODE Genetics. The false revelation about James Watson happened in December, 2007. It has lost exactly 90% of its share price since then. I think it's clear that the jack-ass Kari was trying to desperately create some free publicity for his dying company.

http://img211.imageshack.us/img211/5508/83145245.png

Latest annual revenues: 50 million dollars. Current total debt: 255 million dollars. Its current share price is still too high...