Assembling short reads from jumping libraries with large insert sizes

Bioinformatics. 2015 Oct 15;31(20):3262-8. doi: 10.1093/bioinformatics/btv337. Epub 2015 Jun 3.

Abstract

Motivation: Advances in Next-Generation Sequencing technologies and sample preparation have recently enabled the generation of high-quality jumping libraries that have the potential to significantly improve short-read assemblies. However, assembly algorithms must catch up with these experimental innovations to benefit from them and to produce high-quality assemblies.

Results: We present a new algorithm that extends the recently described exSPAnder universal repeat resolution approach to enable its application to several challenging data types, including jumping libraries generated by the recently developed Illumina Nextera Mate Pair protocol. We demonstrate that, with these improvements, bacterial genomes can often be assembled into a few contigs using only a single Nextera Mate Pair library of short reads.

Availability and implementation: The described algorithms are implemented in C++ as part of the SPAdes genome assembler, which is freely available at bioinf.spbau.ru/en/spades.

Contact: [email protected]

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Gene Library*
  • Genome, Bacterial
  • Genomics / methods*
  • High-Throughput Nucleotide Sequencing / methods
  • Sequence Analysis, DNA / methods