Maximal viral information recovery from sequence data using VirMAP

Nat Commun. 2018 Aug 10;9(1):3205. doi: 10.1038/s41467-018-05658-8.

Abstract

Accurate classification of the human virome is critical to a full understanding of the role viruses play in health and disease. This implies the need for sensitive, specific, and practical pipelines that return precise outputs while still enabling case-specific post hoc analysis. Viral taxonomic characterization from metagenomic data suffers from high background noise and signal crosstalk that confound current methods. Here we develop VirMAP, which overcomes these limitations using techniques that merge nucleotide and protein information to taxonomically classify viral reconstructions independent of genome coverage or read overlap. We validate VirMAP using published data sets and viral mock communities containing RNA and DNA viruses and bacteriophages. VirMAP offers opportunities to enhance metagenomic studies seeking to define virome-host interactions, improve biosurveillance capabilities, and strengthen molecular epidemiology reporting.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Base Sequence
  • DNA Viruses / genetics*
  • Databases, Genetic
  • Genome, Viral
  • Humans
  • Information Storage and Retrieval*
  • Metagenomics
  • Sequence Analysis, DNA*
  • Software*