May 24, 2013

Haplofind: mtDNA haplogroup assignment tool

Hum Mutat. 2013 May 20. doi: 10.1002/humu.22356. [Epub ahead of print]

HAPLOFIND: A New Method for High-Throughput mtDNA Haplogroup Assignment.

Vianello D, Sevini F, Castellani G, Laura L, Capri M, Franceschi C.


Deep sequencing technologies are completely revolutionizing the approach to DNA analysis. Mitochondrial DNA (mtDNA) studies have entered the "post-genomic era": the burst in sequenced samples observed in nuclear genomics is also expected in mitochondria, a trend that can already be detected by checking the submission rate of complete mtDNA sequences to databases. Tools for the analysis of these data are available, but they fall short in throughput or in ease of use. We present here a new pipeline based on previous algorithms, inherited from the "nuclear genomic toolbox", combined with a newly developed algorithm capable of efficiently and easily classifying new mtDNA sequences according to PhyloTree nomenclature. Detected mutations are also annotated using data collected from publicly available databases. Thanks to the analysis of all freely available sequences with known haplogroup obtained from GenBank, we were able to produce a PhyloTree-based weighted tree that takes into account the pattern conservation of each haplogroup. The combination of a highly efficient aligner, coupled with our algorithm and a massive usage of asynchronous parallel processing, allowed us to build a high-throughput pipeline for the analysis of mitochondrial DNA sequences that can quickly be updated to follow the ever-changing nomenclature. HaploFind is freely accessible at the web address



Family Tree said...

I was just looking at the Phylo Tree at PhyloTree.Org and I was just wondering why the mutations 309.1C(C), 315.1C, AC indels at 515-522, 16182C, 16183C, 16193.1C(C) and 16519 were not considered for phylogenetic reconstruction and are therefore excluded from the tree?
I have HVR1 - 16183C
If you could explain this to me in simple terms I would appreciate it. Thank you...

jenny l p said...

Family Tree, these SNPs are considered "highly volatile", meaning they mutate more frequently than other mtDNA SNPs. Because of this, you can completely match a haplogroup except for that one SNP and still be considered an exact match. As a result, a lot of algorithms tend to leave these SNPs out. Does that help?
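
To make the idea concrete, here is a minimal Python sketch of how a matching routine might ignore those volatile positions before comparing two mtDNA variant lists. The position list comes from the question above; the function names and the sample variants are hypothetical, and filtering by position alone is a simplification (PhyloTree excludes specific changes such as 309.1C, not every change at position 309). This is not HaploFind's or PhyloTree's actual code.

```python
import re

# Positions named in the question above as excluded from PhyloTree.
# Assumption: we filter by position only, which is coarser than PhyloTree's rule.
HOTSPOT_POSITIONS = {309, 315, 16182, 16183, 16193, 16519} | set(range(515, 523))


def position(variant: str) -> int:
    """Return the numeric position of a variant such as '16183C', 'A263G' or '309.1C'."""
    match = re.match(r"[A-Za-z]?(\d+)", variant)
    if match is None:
        raise ValueError(f"cannot read a position from {variant!r}")
    return int(match.group(1))


def strip_hotspots(variants):
    """Drop every variant that falls on a volatile position."""
    return {v for v in variants if position(v) not in HOTSPOT_POSITIONS}


def same_haplotype(variants_a, variants_b):
    """Compare two variant sets while ignoring the volatile positions."""
    return strip_hotspots(variants_a) == strip_hotspots(variants_b)


# Hypothetical example: the two lists differ only at 16183C.
my_results = {"263G", "16183C", "16223T"}
other_results = {"263G", "16223T"}
print(same_haplotype(my_results, other_results))
```

With the volatile positions stripped, the two hypothetical lists above compare as equal, which is why a result such as 16183C does not change the haplogroup call.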

Assignment Help said...

Well, initially HaploFind was "just" a high-throughput algorithm for classifying complete mtDNA sequences according to haplogroup nomenclature. It is now a complete web application focused on annotating complete mtDNA sequences. It is based on the PhyloTree phylogenetic tree and uses Mitomap as a source for annotation data.