deGPS is a powerful tool for detecting differential expression in RNA-sequencing studies

BMC Genomics. 2015 Jun 13;16(1):455. doi: 10.1186/s12864-015-1676-0.

Abstract

Background: The advent of the NGS technologies has permitted profiling of whole-genome transcriptomes (i.e., RNA-Seq) at unprecedented speed and very low cost. RNA-Seq provides a far more precise measurement of transcript levels and their isoforms compared to other methods such as microarrays. A fundamental goal of RNA-Seq is to better identify expression changes between different biological or disease conditions. However, existing methods for detecting differential expression from RNA-Seq count data have not been comprehensively evaluated in large-scale RNA-Seq datasets. Many of them suffer from inflation of type I error and failure in controlling false discovery rate especially in the presence of abnormal high sequence read counts in RNA-Seq experiments.

Results: To address these challenges, we propose a powerful and robust tool, termed deGPS, for detecting differential expression in RNA-Seq data. This framework contains new normalization methods based on generalized Poisson distribution modeling sequence count data, followed by permutation-based differential expression tests. We systematically evaluated our new tool in simulated datasets from several large-scale TCGA RNA-Seq projects, unbiased benchmark data from compcodeR package, and real RNA-Seq data from the development transcriptome of Drosophila. deGPS can precisely control type I error and false discovery rate for the detection of differential expression and is robust in the presence of abnormal high sequence read counts in RNA-Seq experiments.

Conclusions: Software implementing our deGPS was released within an R package with parallel computations ( https://github.com/LL-LAB-MCW/deGPS ). deGPS is a powerful and robust tool for data normalization and detecting different expression in RNA-Seq experiments. Beyond RNA-Seq, deGPS has the potential to significantly enhance future data analysis efforts from many other high-throughput platforms such as ChIP-Seq, MBD-Seq and RIP-Seq.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computational Biology / methods
  • Databases, Genetic
  • Gene Expression Profiling / methods*
  • Gene Expression Regulation
  • Humans
  • Poisson Distribution
  • Sequence Analysis, RNA / methods*
  • Software*
  • Transcriptome