July 25, 2012

R1b1a2 variants in 1000G data

It's nice to see a group of independent researchers documenting their work using 1000 Genomes data. I've been following developments in this field on and off, and I have to say that it requires deep commitment from the people involved to keep a mental picture of the ever-deepening phylogeny.

But, in a sense, that is what's great about the efforts of citizen scientists tackling a scientific problem: they are deeply invested in understanding their little part of the human Y-chromosome phylogeny, because it's their part and every SNP discovery in it represents a small victory. Thus, they can expend the time and effort to push the discovery process to its technological limits.

It's a pity that this can at present be done only using 1000 Genomes data, a.k.a. the global collection of full genome data that completely ignores the part of the world between Italy and China/India. Hopefully, sometime in the future, the ever-better-understood twig of R1b1a2 will be placed within its wider Eurasian context.

PLoS ONE 7(7): e41634. doi:10.1371/journal.pone.0041634

Discovery of Western European R1b1a2 Y Chromosome Variants in 1000 Genomes Project Data: An Online Community Approach

Richard A. Rocca, Gregory Magoon, David F. Reynolds, Thomas Krahn, Vincent O. Tilroe, Peter M. Op den Velde Boots, Andrew J. Grierson

The authors have used an online community approach, and tools that were readily available via the Internet, to discover genealogically and therefore phylogenetically relevant Y-chromosome polymorphisms within core haplogroup R1b1a2-L11/S127 (rs9786076). Presented here is the analysis of 135 unrelated L11 derived samples from the 1000 Genomes Project. We were able to discover new variants and build a much more complex phylogenetic relationship for L11 sub-clades. Many of the variants were further validated using PCR amplification and Sanger sequencing. The identification of these new variants will help further the understanding of population history including patrilineal migrations in Western and Central Europe where R1b1a2 is the most frequent haplogroup. The fine-grained phylogenetic tree we present here will also help to refine historical genetic dating studies. Our findings demonstrate the power of citizen science for analysis of whole genome sequence data.



  1. Yes, thank you to the citizen scientists! 1000 Genomes has good coverage of Western Europe and the Americas and G. Magoon and others have made lots of discoveries in haplogroup I, including I-M26 which I am interested in since it's my part of the tree. I am still waiting for some more promised samples from Peru, Spain, Puerto Rico etc. Maybe we will get a 1000 Genomes of Russia etc. some day.

  2. Following this paper, we have initiated a more substantial follow-up study.

    We have been granted access to 1000 male genomes from anonymous individuals collected through the Wellcome Trust UK10K project.

    This research is much more labour-intensive and we are requesting financial support from the community to assist with essential reagents and sequencing costs.

    Full details may be found here:

    Many thanks

