mTAGs: taxonomic profiling using degenerate consensus reference sequences of ribosomal RNA genes

Bioinformatics. 2021 Dec 22;38(1):270-272. doi: 10.1093/bioinformatics/btab465.

Abstract

Profiling the taxonomic composition of microbial communities commonly involves the classification of ribosomal RNA gene fragments. As a trade-off to maintain high classification accuracy, existing tools are typically limited to the genus level. Here, we present mTAGs, a taxonomic profiling tool that aligns metagenomic sequencing reads to degenerate consensus reference sequences of small subunit ribosomal RNA genes. It uses DNA fragments, that is, paired-end sequencing reads, as count units and provides relative abundance profiles at multiple taxonomic ranks, including operational taxonomic units based on a 97% sequence identity cutoff. At the genus rank, mTAGs outperformed other tools across several metrics on data from different environments, for example by >11% in F1 score, and achieved competitive (F1 score) or better (Bray-Curtis dissimilarity) results at the sub-genus level.
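The two evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the mTAGs code or the authors' benchmarking pipeline. It assumes that the F1 score is computed on presence/absence of taxa in an expected versus an observed profile, and that Bray-Curtis dissimilarity is computed on relative abundances derived from fragment counts. The function names and the toy genus labels (`genus_A` to `genus_D`) are hypothetical.

```python
from typing import Dict


def relative_abundance(counts: Dict[str, int]) -> Dict[str, float]:
    """Convert per-taxon fragment counts into relative abundances."""
    total = sum(counts.values())
    if total == 0:
        return {taxon: 0.0 for taxon in counts}
    return {taxon: n / total for taxon, n in counts.items()}


def f1_score(expected: Dict[str, float], observed: Dict[str, float]) -> float:
    """F1 on presence/absence (assumption: a taxon is 'present' if abundance > 0)."""
    expected_taxa = {t for t, a in expected.items() if a > 0}
    observed_taxa = {t for t, a in observed.items() if a > 0}
    true_positives = len(expected_taxa & observed_taxa)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(observed_taxa)
    recall = true_positives / len(expected_taxa)
    return 2 * precision * recall / (precision + recall)


def bray_curtis(p: Dict[str, float], q: Dict[str, float]) -> float:
    """Bray-Curtis dissimilarity: sum|p_i - q_i| / sum(p_i + q_i) over all taxa."""
    taxa = set(p) | set(q)
    numerator = sum(abs(p.get(t, 0.0) - q.get(t, 0.0)) for t in taxa)
    denominator = sum(p.get(t, 0.0) + q.get(t, 0.0) for t in taxa)
    return numerator / denominator if denominator else 0.0


# Toy data (hypothetical): fragment counts per genus, not results from the paper.
expected_counts = {"genus_A": 60, "genus_B": 30, "genus_C": 10}
observed_counts = {"genus_A": 50, "genus_B": 40, "genus_D": 10}

expected_profile = relative_abundance(expected_counts)
observed_profile = relative_abundance(observed_counts)

print(f"F1 score: {f1_score(expected_profile, observed_profile):.3f}")
print(f"Bray-Curtis dissimilarity: {bray_curtis(expected_profile, observed_profile):.3f}")
```

In this toy case, two of the three expected genera are recovered and one spurious genus is reported, so precision and recall are both 2/3 and the F1 score is about 0.67. The abundance differences sum to 0.4 over a total of 2.0, giving a Bray-Curtis dissimilarity of about 0.2. A higher F1 score and a lower Bray-Curtis dissimilarity both indicate a profile closer to the expected one.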

Availability and implementation: The software tool mTAGs is implemented in Python. The source code and binaries are freely available (https://github.com/SushiLab/mTAGs). The data underlying this article are available in Zenodo, at https://doi.org/10.5281/zenodo.4352762.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Consensus
  • Genes, rRNA
  • Microbiota* / genetics
  • Sequence Analysis, DNA / methods
  • Software*