HiLive: real-time mapping of Illumina reads while sequencing

Bioinformatics. 2017 Mar 15;33(6):917-919. doi: 10.1093/bioinformatics/btw659.

Abstract

Motivation: Next-generation sequencing is increasingly used in time-critical clinical applications. Although read mapping algorithms have always been optimized for speed, they follow a sequential paradigm: mapping starts only after the sequencing run has finished and the output files have been converted. Because Illumina machines write intermediate results during the run, HiLive can map reads while sequencing is still in progress, which drastically reduces the overall sample analysis time that is crucial in applications such as precision medicine.

Methods: We present HiLive, a novel real-time read mapper that implements a k-mer-based alignment strategy. HiLive continuously reads the intermediate BCL files produced by Illumina sequencers and extends initial k-mer matches with the additional data the sequencer produces as the run progresses.
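The seed-and-extend idea described above can be illustrated with a minimal Python sketch. It is not HiLive's implementation: the names `build_kmer_index`, `StreamingRead` and `map_cycles`, the mismatch-only extension rule and the `max_mismatches` threshold are all assumptions made for illustration. Each element of `cycles` is a string holding one base per read, loosely mirroring how base calls arrive cycle by cycle during a run.

```python
from collections import defaultdict


def build_kmer_index(reference, k):
    """Map every k-mer in the reference to the list of its start positions."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


class StreamingRead:
    """One read that receives a single base per sequencing cycle (hypothetical helper)."""

    def __init__(self, index, reference, k, max_mismatches=1):
        self.index = index
        self.reference = reference
        self.k = k
        self.max_mismatches = max_mismatches
        self.bases = []
        # None until the first k bases have arrived; afterwards a dict that maps
        # a candidate start position in the reference to its mismatch count.
        self.candidates = None

    def add_base(self, base):
        self.bases.append(base)
        n = len(self.bases)
        if n < self.k:
            return  # not enough bases yet to form the initial seed
        if n == self.k:
            # Seed step: look up the first k-mer of the read in the index.
            seed = "".join(self.bases)
            self.candidates = {pos: 0 for pos in self.index.get(seed, [])}
            return
        # Extension step: compare the new base against the next reference
        # position of every surviving candidate.
        surviving = {}
        for start, mismatches in self.candidates.items():
            ref_pos = start + n - 1
            if ref_pos >= len(self.reference):
                continue  # candidate runs off the end of the reference
            if self.reference[ref_pos] != base:
                mismatches += 1
            if mismatches <= self.max_mismatches:
                surviving[start] = mismatches
        self.candidates = surviving


def map_cycles(reference, cycles, k, max_mismatches=1):
    """Feed per-cycle base calls to all reads and return their candidate positions."""
    index = build_kmer_index(reference, k)
    reads = [StreamingRead(index, reference, k, max_mismatches) for _ in cycles[0]]
    for cycle in cycles:
        for read, base in zip(reads, cycle):
            read.add_base(base)
    return [read.candidates or {} for read in reads]


if __name__ == "__main__":
    reference = "ACGTACGTTAGC"
    # Two reads, six cycles: read 0 is "ACGTTA", read 1 is "GTACGA".
    cycles = ["AG", "CT", "GA", "TC", "TG", "AA"]
    print(map_cycles(reference, cycles, k=4))
```

Because the candidate set is updated as each cycle arrives, the mapping result for a read is available immediately after its last base, without a separate pass over the complete read. A real mapper would additionally have to handle indels, reverse-complement strands, quality values and much larger indexes, none of which this sketch attempts.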

Results: We applied HiLive to real human transcriptome data and show that final read alignments are reported within a few minutes of the end of a full Illumina HiSeq 1500 run. By comparison, the conversion to FASTQ files alone, which is required as the standard input to current read mapping methods, takes roughly five times as long. We further show on simulated and real data that HiLive's accuracy is comparable to that of recent read mappers.

Availability and implementation: HiLive and its source code are freely available from https://gitlab.com/SimonHTausch/HiLive .

Contact: [email protected].

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Genome, Human
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • Sequence Analysis, RNA / methods*
  • Software*
  • Transcriptome