Featherweight long read alignment using partitioned reference indexes

Sci Rep. 2019 Mar 13;9(1):4318. doi: 10.1038/s41598-019-40739-8.

Abstract

The advent of Nanopore sequencing has realised portable genomic research and applications. However, state-of-the-art long read aligners and large reference genomes are not compatible with most mobile computing devices because of their high memory requirements. We show how memory requirements can be reduced through parameter optimisation and reference genome partitioning, and highlight the limitations and caveats of these approaches. We then demonstrate how these issues can be overcome through an appropriate merging technique. We incorporated multi-index merging into the Minimap2 aligner and demonstrate that long read alignment to the human genome can be performed on a system with 2 GB RAM with negligible impact on accuracy.
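
The idea of merging results from a partitioned reference can be illustrated with a minimal sketch. It is not the authors' implementation or Minimap2's code: the `Hit` record, the `merge_partition_hits` function, and the single integer `score` are hypothetical names and simplifications chosen for illustration. The sketch assumes each partition of the reference has been aligned against separately, so a read may receive a locally "best" hit in every partition; pooling the hits and re-ranking them per read restores one global primary alignment.

```python
from collections import defaultdict
from typing import NamedTuple


class Hit(NamedTuple):
    """Hypothetical, simplified alignment record (illustration only)."""
    read: str     # read identifier
    target: str   # reference sequence the read aligned to
    score: int    # alignment score; higher is better


def merge_partition_hits(per_partition_hits):
    """Pool hits from all reference partitions and re-rank them per read.

    Returns a dict mapping each read to (primary_hit, [other_hits]),
    where the primary hit is the highest-scoring hit across partitions.
    """
    by_read = defaultdict(list)
    for hits in per_partition_hits:
        for hit in hits:
            by_read[hit.read].append(hit)

    merged = {}
    for read, hits in by_read.items():
        ranked = sorted(hits, key=lambda h: h.score, reverse=True)
        merged[read] = (ranked[0], ranked[1:])
    return merged


# Toy example: read1 has a weak hit in partition A and a strong hit in partition B.
part_a = [Hit("read1", "chr1", 120), Hit("read2", "chr2", 300)]
part_b = [Hit("read1", "chr9", 450)]
merged = merge_partition_hits([part_a, part_b])
```

Without the merge step, each partition would report its own primary alignment for `read1`; after merging, only the `chr9` hit is treated as primary and the `chr1` hit is demoted. A real aligner would also need to revisit quantities that depend on the full set of candidates, such as mapping quality, which this sketch leaves out.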

MeSH terms

  • Algorithms
  • Computer Storage Devices
  • Genome, Human / genetics*
  • Genomics / methods*
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • Nanopore Sequencing / methods
  • Sequence Analysis, DNA / methods
  • Software

Associated data

  • figshare/10.6084/m9.figshare.6964805.v1