The hidden perils of read mapping as a quality assessment tool in genome sequencing

Sci Rep. 2017 Feb 22;7:43149. doi: 10.1038/srep43149.

Abstract

This article presents a comparative analysis of genome sequencing methods, focusing on verification of assembly quality. Various de novo assembly tools and sequencing technologies are compared using the recently completed genome sequence of Lactobacillus fermentum 3872. In particular, the quality of assemblies is assessed using read mapping in CLC Genomics Workbench and optical mapping, a technology developed by OpGen. Over-extension of contigs without prior knowledge of contig location can lead to misassembled contigs, even when commonly used quality indicators such as read mapping suggest that a contig is well assembled. Precautions must also be taken when using long-read sequencing technology, which may likewise produce misassembled contigs.

MeSH terms

  • Computational Biology / methods*
  • Limosilactobacillus fermentum / genetics*
  • Sequence Analysis, DNA / methods*
  • Whole Genome Sequencing / methods*