Biases in read coverage demonstrated by interlaboratory and interplatform comparison of 117 mRNA and genome sequencing experiments

BMC Bioinformatics. 2012 Apr 19;13 Suppl 6(Suppl 6):S4. doi: 10.1186/1471-2105-13-S6-S4.

Abstract

High-throughput sequencing of whole genomes and transcriptomes generates large amounts of sequence data rapidly and at low cost. The goal of most mRNA sequencing studies is to compare expression levels between samples. However, given the broad variety of modern sequencing protocols, platforms, and versions thereof, it is unclear to what extent the results are consistent across platforms and laboratories. A comparison of 117 human mRNA and genome high-throughput sequencing experiments performed on the Illumina and SOLiD platforms at 26 institutions worldwide showed that gene coverage profiles depend strongly on the producing laboratory. The profiles showed laboratory-specific non-uniformity that persisted after 3'-bias correction and mappability normalization, suggesting that there are other, as yet unknown, mRNA-associated biases.
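
To make the notion of a gene coverage profile concrete, the following is a minimal sketch, not the authors' pipeline. It assumes per-base read depth along a transcript is already available as a list ordered 5' to 3' (for example, extracted with a tool such as samtools depth). The function names `coverage_profile` and `profile_distance`, the bin count, and the choice of total variation distance are illustrative assumptions and do not come from the paper.

```python
from typing import List, Sequence


def coverage_profile(depths: Sequence[float], n_bins: int = 10) -> List[float]:
    """Collapse per-base depth (ordered 5' to 3') into n_bins bins.

    Each value is the fraction of the transcript's total depth that falls
    in that bin, so the profile sums to 1 and is comparable across
    experiments with different sequencing depth. Hypothetical helper.
    """
    length = len(depths)
    if length < n_bins:
        raise ValueError("transcript is shorter than the number of bins")
    total = float(sum(depths))
    if total == 0:
        raise ValueError("transcript has no coverage")
    profile = []
    for b in range(n_bins):
        # Integer bin edges; bins differ by at most one base in width.
        start = b * length // n_bins
        end = (b + 1) * length // n_bins
        profile.append(sum(depths[start:end]) / total)
    return profile


def profile_distance(p: Sequence[float], q: Sequence[float]) -> float:
    """Total variation distance between two profiles (0 means identical)."""
    if len(p) != len(q):
        raise ValueError("profiles must have the same number of bins")
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


# Toy data: one experiment with uniform coverage, one covering only the 3' half.
uniform = coverage_profile([10] * 20, n_bins=4)
three_prime = coverage_profile([0] * 10 + [10] * 10, n_bins=4)
distance = profile_distance(uniform, three_prime)
```

Under these assumptions, profiles of the same gene from different laboratories could be compared pairwise with `profile_distance`; a laboratory-specific bias would appear as small distances within a laboratory and larger distances between laboratories.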

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Genome, Human
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • RNA, Messenger / genetics*
  • Sequence Analysis, DNA / methods*
  • Sequence Analysis, RNA / methods*
  • Transcriptome

Substances

  • RNA, Messenger