Numt identification and removal with RtN!

Bioinformatics. 2020 Dec 22;36(20):5115-5116. doi: 10.1093/bioinformatics/btaa642.

Abstract

Motivation: Assays in mitochondrial genomics rely on accurate read mapping and variant calling. However, there are known and unknown nuclear paralogs whose genetic properties differ fundamentally from those of the mitochondrial genome. Such paralogs complicate the interpretation of mitochondrial genome data and confound variant calling.

Results: Remove the Numts! (RtN!) was developed to categorize reads from massively parallel sequencing data based not on the expected properties and sequence identities of paralogous nuclear-encoded mitochondrial sequences, but on sequence similarity to a large database of publicly available mitochondrial genomes. RtN! removes low-level sequencing noise and mitochondrial paralogs without impacting variant calling, whereas competing methods were shown to remove true variants from mitochondrial mixtures.
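
Illustration: the general idea of categorizing reads by similarity to a panel of known mitochondrial genomes can be sketched in a few lines of Python. This is a toy sketch and not the RtN! implementation: the k-mer matching scheme, the function names (kmers, build_index, fraction_shared, classify_read), the parameters k and min_fraction, and the short example sequences are all assumptions made for illustration. A read is kept only if enough of its k-mers occur in at least one genome of the panel, so a read carrying a variant that is present in a panel haplotype is retained, while a read carrying a difference seen in no panel genome is flagged.

```python
from typing import Iterable, Set


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of all length-k substrings of seq (uppercased)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_index(mito_genomes: Iterable[str], k: int) -> Set[str]:
    """Pool the k-mers of every genome in the (hypothetical) reference panel."""
    index: Set[str] = set()
    for genome in mito_genomes:
        index |= kmers(genome, k)
    return index


def fraction_shared(read: str, index: Set[str], k: int) -> float:
    """Fraction of the read's distinct k-mers that occur in the panel index."""
    read_kmers = kmers(read, k)
    if not read_kmers:
        return 0.0  # read shorter than k: nothing to compare
    return len(read_kmers & index) / len(read_kmers)


def classify_read(read: str, index: Set[str], k: int = 8,
                  min_fraction: float = 1.0) -> str:
    """Label a read 'keep' if it resembles the panel closely enough, else 'remove'.

    min_fraction is an illustrative threshold, not a value taken from RtN!.
    """
    return "keep" if fraction_shared(read, index, k) >= min_fraction else "remove"


if __name__ == "__main__":
    # Two made-up panel haplotypes that differ at a single position.
    hap_a = "GATCACAGGTCTATCACCCTATTAACCACTCACGGGAGCTCTCCATGC"
    hap_b = hap_a[:15] + "G" + hap_a[16:]
    index = build_index([hap_a, hap_b], k=8)

    read_from_panel = hap_b[5:30]                         # carries hap_b's variant
    read_divergent = hap_a[5:20] + "T" + hap_a[21:30]     # difference in no haplotype
    print(classify_read(read_from_panel, index))
    print(classify_read(read_divergent, index))
```

An exact k-mer lookup is used here only because it is short and self-contained; a real tool would work on aligned reads and would need to tolerate sequencing error, which this sketch does not attempt.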

Availability and implementation: https://github.com/Ahhgust/RtN.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Cell Nucleus
  • Genome, Mitochondrial*
  • High-Throughput Nucleotide Sequencing*
  • Sequence Analysis, DNA
  • Software