AlienTrimmer: a tool to quickly and accurately trim off multiple short contaminant sequences from high-throughput sequencing reads

Genomics. 2013 Nov-Dec;102(5-6):500-6. doi: 10.1016/j.ygeno.2013.07.011. Epub 2013 Aug 1.

Abstract

Contaminant oligonucleotide sequences such as primers and adapters can occur at both ends of high-throughput sequencing (HTS) reads. AlienTrimmer was developed to detect and remove such contaminants. Based on the decomposition of specified alien nucleotide sequences into k-mers, AlienTrimmer determines whether such alien k-mers occur at one or both read ends by using a simple polynomial algorithm. As a result, AlienTrimmer can process typical HTS single- or paired-end files containing millions of reads in several minutes while using very few computing resources. Based on the analysis of both simulated and real-case Illumina®, 454™ and Ion Torrent™ read data, we show that AlienTrimmer performs with excellent accuracy and speed in comparison with other trimming tools. The program is freely available at ftp://ftp.pasteur.fr/pub/gensoft/projects/AlienTrimmer/.
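To make the k-mer idea concrete, the following is a minimal sketch of how alien sequences might be decomposed into k-mers and used to trim read ends. It is an illustration only and not AlienTrimmer's published algorithm: the function names (`kmers`, `build_alien_kmers`, `trim_read`), the choice of k = 5, the exact-match lookup, and the rule of removing only the covered run that touches a read end are all assumptions made for this example.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings (k-mers) of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_alien_kmers(aliens, k):
    """Collect the k-mers of every contaminant (alien) sequence in one set."""
    table = set()
    for alien in aliens:
        table |= kmers(alien.upper(), k)
    return table


def trim_read(read, alien_kmers, k):
    """Remove alien-covered bases from the 5' and 3' ends of a read.

    Hypothetical simplification: a base is 'covered' when it lies inside
    any exact alien k-mer match; only covered runs that reach a read end
    are trimmed, and internal matches are left untouched.
    """
    n = len(read)
    covered = [False] * n
    # Mark every base that falls inside an alien k-mer occurrence.
    for i in range(n - k + 1):
        if read[i:i + k] in alien_kmers:
            for j in range(i, i + k):
                covered[j] = True
    # Skip the covered run at the 5' end.
    start = 0
    while start < n and covered[start]:
        start += 1
    # Skip the covered run at the 3' end.
    end = n
    while end > start and covered[end - 1]:
        end -= 1
    return read[start:end]


# Example adapter fragment, used here purely as test input.
ALIEN_KMERS = build_alien_kmers(["AGATCGGAAGAGC"], k=5)
trimmed = trim_read("ACGTTGCAATAGATCGGAAGAGC", ALIEN_KMERS, k=5)
```

Each read is scanned once against a hash set, so the cost grows linearly with read length for a fixed k. A practical trimmer would also need to handle sequencing errors, reverse complements, and quality scores, none of which this sketch attempts.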

Keywords: Adapter oligonucleotides; High-throughput sequencing; Polynomial algorithm; Raw read trimming; Short contaminant sequence; k-mer decomposition.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Base Sequence
  • Computational Biology
  • DNA / genetics*
  • DNA Contamination
  • DNA Primers
  • High-Throughput Nucleotide Sequencing / methods*
  • Quality Control
  • Sequence Analysis, DNA / methods*
  • Software*

Substances

  • DNA Primers
  • DNA