December 16, 2011
1000 Genomes at 2,100+ and counting
The latest 1000 Genomes working data include 2,123 individuals. I had already included some Kinh Vietnamese (KHV) from the previous working data for use with my K12a calculator. The populations in the datafile currently are:

GBR FIN CHS PUR CDX CLM IBS PEL KHV ACB CEU CHD YRI CHB JPT LWK ASW MXL TSI GIH MKK

I will probably take the time to re-extract the population data from the newest file, and to split some populations (such as IBS) for which I have more regional information. By my last count, I now have about 10,600 individuals to work with (some are duplicates, e.g., between HapMap and the 1000 Genomes Project).

In other news, I see some Y-chromosome SNP data dated 23/11/2011. I haven't worked on those myself, but I know that many hobbyists are interested in the Y-chromosome aspect of the project, so those might be useful.

Finally, there are slides from the ICHG seminar on the 1000 Genomes Project, which should be interesting reading.
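
For anyone who wants to do a similar tally, here is a minimal Python sketch of the two bookkeeping steps mentioned above: counting individuals per population and merging sample lists without counting duplicates twice. It is not the procedure used for this post. The input layout (one sample per line, sample ID followed by population code), the function names, and the sample IDs are all hypothetical, and it assumes that duplicated individuals carry the same ID in both sources, which may not hold for real data.

```python
from collections import Counter


def count_by_population(panel_lines):
    """Count samples per population from 'sample_id population' lines.

    Assumed (hypothetical) layout: whitespace-separated fields, with the
    sample ID first and the population code second.
    """
    counts = Counter()
    for line in panel_lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        counts[fields[1]] += 1
    return counts


def merge_without_duplicates(*sources):
    """Merge {sample_id: population} dicts, keeping the first entry per ID.

    Assumes a duplicated individual has the same ID in every source.
    """
    merged = {}
    for source in sources:
        for sample_id, population in source.items():
            merged.setdefault(sample_id, population)
    return merged


# Made-up sample IDs, for illustration only.
panel = ["ID001\tKHV", "ID002\tKHV", "ID003\tCEU", ""]
genomes = {f[0]: f[1] for f in (line.split() for line in panel) if len(f) >= 2}
hapmap = {"ID003": "CEU", "ID004": "YRI"}

per_population = count_by_population(panel)
combined = merge_without_duplicates(genomes, hapmap)
print(dict(per_population))
print(len(combined), "unique individuals")
```

Splitting a population such as IBS into regions would then amount to rewriting the population code for the relevant sample IDs before counting.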