HAPLOFIND: a new method for high-throughput mtDNA haplogroup assignment

Hum Mutat. 2013 Sep;34(9):1189-94. doi: 10.1002/humu.22356. Epub 2013 Jun 12.

Abstract

Deep sequencing technologies are revolutionizing the approach to DNA analysis. Mitochondrial DNA (mtDNA) studies have entered the "postgenomic era": the burst in sequenced samples observed in nuclear genomics is also expected for mitochondria, a trend that can already be detected in the submission rate of complete mtDNA sequences to databases. Tools for the analysis of these data are available, but they fall short in throughput or in ease of use. We present here a new pipeline based on previous algorithms, inherited from the "nuclear genomic toolbox," combined with a newly developed algorithm capable of efficiently and easily classifying new mtDNA sequences according to PhyloTree nomenclature. Detected mutations are also annotated using data collected from publicly available databases. By analyzing all freely available sequences with known haplogroup obtained from GenBank, we were able to produce a PhyloTree-based weighted tree that takes into account the conservation of each haplogroup's pattern. The combination of a highly efficient aligner with our algorithm and massive usage of asynchronous parallel processing allowed us to build a high-throughput pipeline for the analysis of mtDNA sequences that can be quickly updated to follow the ever-changing nomenclature. HaploFind is freely accessible at the following Web address: https://haplofind.unibo.it.
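
The abstract does not spell out the classification algorithm, so the following Python sketch only illustrates the general idea of scoring a sample against a weighted haplogroup tree. It is not HaploFind's implementation. The `Node` class, the `assign` function, the haplogroup names (`A`, `A1`, `B`), the mutation labels (`m1` to `m4`), the weights, and the scoring rule (add a weight when a defining mutation is observed, subtract it when it is missing) are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One haplogroup in a toy tree (hypothetical structure)."""
    name: str
    # Defining mutation -> weight; a higher weight stands for a more
    # conserved mutation. All values here are invented.
    mutations: dict[str, float]
    children: list["Node"] = field(default_factory=list)


def assign(root: Node, observed: set[str]) -> tuple[str, float]:
    """Return the (haplogroup name, score) with the highest cumulative score.

    Assumed scoring rule: along the path from the root, each defining
    mutation adds its weight if observed and subtracts it if missing.
    """
    best = (root.name, 0.0)
    stack = [(root, 0.0)]  # (node, score accumulated by its ancestors)
    while stack:
        node, parent_score = stack.pop()
        score = parent_score
        for mutation, weight in node.mutations.items():
            score += weight if mutation in observed else -weight
        if score > best[1]:
            best = (node.name, score)
        for child in node.children:
            stack.append((child, score))
    return best


# Toy tree with invented haplogroups and mutations.
tree = Node("root", {}, [
    Node("A", {"m1": 1.0, "m2": 0.5}, [Node("A1", {"m3": 0.75})]),
    Node("B", {"m4": 1.0}),
])

print(assign(tree, {"m1", "m2", "m3"}))  # ('A1', 2.25)
```

In this sketch a sample carrying only `m1` and `m2` would be assigned to `A` rather than `A1`, because the missing `m3` lowers the score of the deeper node. A real pipeline would first align each sequence to a reference to obtain the observed mutations, and would take the tree and weights from PhyloTree and GenBank-derived data as the abstract describes.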

Keywords: NGS; Web service; haplogroup; haplogroup assignment; mtDNA.

MeSH terms

  • Algorithms
  • DNA, Mitochondrial / classification*
  • DNA, Mitochondrial / genetics*
  • Databases, Nucleic Acid*
  • Genetic Variation
  • Haplotypes*
  • High-Throughput Nucleotide Sequencing
  • Humans
  • Molecular Sequence Annotation
  • Phylogeny
  • Polymorphism, Single Nucleotide
  • Sequence Alignment
  • Sequence Analysis, DNA / methods*
  • Software

Substances

  • DNA, Mitochondrial