Platanus_B: an accurate de novo assembler for bacterial genomes using an iterative error-removal process

DNA Res. 2020 Jun 1;27(3):dsaa014. doi: 10.1093/dnares/dsaa014.

Abstract

De novo assembly of short DNA reads remains an essential technology, especially for large-scale projects and high-resolution variant analyses in epidemiology. However, existing tools often lack the accuracy required to compare closely related strains. To facilitate such studies on bacterial genomes, we developed Platanus_B, a de novo assembler that employs iterations of multiple error-removal algorithms. Benchmarks demonstrated the superior accuracy and high contiguity of Platanus_B, as well as its ability to enhance hybrid assembly of short reads and nanopore long reads. Although hybrid strategies combining short and long reads were effective in achieving near full-length genomes, we found that short-read-only assemblies generated with Platanus_B were sufficient to obtain ≥90% of exact coding sequences in most cases. In addition, while nanopore long-read-only assemblies lacked fine-scale accuracy, including short reads improved it. Platanus_B can, therefore, be used for comprehensive genomic surveillance of bacterial pathogens and high-resolution phylogenomic analyses of a wide range of bacteria.
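The "≥90% of exact coding sequences" figure can be read as the fraction of reference coding sequences that appear without any mismatch in the assembled contigs. The abstract does not give the exact procedure, so the following is only a minimal sketch under that assumed definition; the function names `reverse_complement` and `exact_cds_fraction` are hypothetical, and the sequences are toy data, not results from the paper.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    table = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(table)[::-1]


def exact_cds_fraction(cds_list, contigs):
    """Fraction of reference CDSs found as exact substrings of any contig.

    Assumed definition: a CDS counts as "exact" if its full sequence, or
    its reverse complement, occurs verbatim in at least one contig.
    """
    if not cds_list:
        return 0.0
    contigs_upper = [c.upper() for c in contigs]
    hits = 0
    for cds in cds_list:
        fwd = cds.upper()
        rev = reverse_complement(fwd)
        if any(fwd in c or rev in c for c in contigs_upper):
            hits += 1
    return hits / len(cds_list)


# Toy data: two of the three reference CDSs are present exactly,
# one on the forward strand and one on the reverse strand.
reference_cds = ["ATGGCCTAA", "ATGAAATAG", "ATGCCCTGA"]
assembly_contigs = ["GGATGGCCTAACC", "TTCTATTTCATGG"]
fraction = exact_cds_fraction(reference_cds, assembly_contigs)
```

Plain substring search is adequate for a sketch at bacterial scale, but an actual evaluation would more likely rely on an aligner and annotated gene coordinates.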

Keywords: de novo assembly; bacterial genome; high-resolution phylogenomics; large-scale genomic surveillance.

MeSH terms

  • Algorithms*
  • Bacteria / classification
  • Bacteria / genetics*
  • Bacteria / pathogenicity
  • DNA, Bacterial
  • Escherichia coli / genetics
  • Genome, Bacterial*
  • Genomics
  • Phylogeny

Substances

  • DNA, Bacterial