Using Poisson mixed-effects model to quantify transcript-level gene expression in RNA-Seq

Ming Hu; Yu Zhu; Jeremy M G Taylor; Jun S Liu; Zhaohui S Qin

doi:10.1093/bioinformatics/btr616

Bioinformatics. 2012 Jan 1;28(1):63-8. doi: 10.1093/bioinformatics/btr616. Epub 2011 Nov 8.

Authors

Ming Hu¹, Yu Zhu, Jeremy M G Taylor, Jun S Liu, Zhaohui S Qin

Affiliation

¹ Department of Statistics, Harvard University, Cambridge, MA 02138, USA.

Abstract

Motivation: RNA sequencing (RNA-Seq) is a powerful new technology for mapping and quantifying transcriptomes using ultra high-throughput next-generation sequencing technologies. Using deep sequencing, the expression levels of all transcripts, including novel ones, can be quantified digitally. Although extremely promising, RNA-Seq poses challenges for data analysis: the massive amounts of data generated, substantial biases and uncertainty in short-read alignment. In particular, large base-specific variation and between-base dependence make simple approaches, such as those that use averaging to normalize RNA-Seq data and quantify gene expression, ineffective.

Results: In this study, we propose a Poisson mixed-effects (POME) model to characterize base-level read coverage within each transcript. The underlying expression level is included as a key parameter in this model. Because the proposed model can incorporate both the base-specific variation and the between-base dependence that affect the read coverage profile throughout the transcript, it can lead to improved quantification of the true underlying expression level.
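
As a rough illustration of the kind of data such a model targets, the sketch below simulates base-level read counts for a single transcript. It is not the published POME specification: the log link, the AR(1) Gaussian random effect, and every name and parameter value (`simulate_coverage`, `log_expr`, `sigma`, `rho`) are assumptions chosen for illustration only.

```python
import numpy as np


def simulate_coverage(n_bases=2000, log_expr=3.0, sigma=0.5, rho=0.8, seed=0):
    """Simulate base-level read counts for one transcript.

    Illustrative model (an assumption, not the published POME model):
        b_j is a stationary AR(1) Gaussian random effect with standard
        deviation `sigma` and lag-1 correlation `rho`;
        y_j ~ Poisson(exp(log_expr + b_j)).
    """
    rng = np.random.default_rng(seed)
    effects = np.empty(n_bases)
    effects[0] = rng.normal(0.0, sigma)
    # Innovation sd chosen so that every b_j has marginal sd `sigma`.
    innov_sd = sigma * np.sqrt(1.0 - rho ** 2)
    for j in range(1, n_bases):
        effects[j] = rho * effects[j - 1] + rng.normal(0.0, innov_sd)
    counts = rng.poisson(np.exp(log_expr + effects))
    return counts, effects


counts, effects = simulate_coverage()

mean_cov = counts.mean()                    # plain per-base average
dispersion = counts.var() / mean_cov        # equals 1 in expectation for pure Poisson
lag1 = np.corrcoef(counts[:-1], counts[1:])[0, 1]  # neighbouring-base correlation

print(f"mean coverage: {mean_cov:.2f}")
print(f"variance / mean: {dispersion:.2f}")
print(f"lag-1 correlation: {lag1:.2f}")
```

Under these assumed settings the simulated counts are overdispersed relative to a plain Poisson model and correlated between neighbouring bases, the two features the abstract identifies as problems for averaging-based quantification.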

Availability and implementation: POME can be freely downloaded at http://www.stat.purdue.edu/~yuzhu/pome.html.

Contact: [email protected]; [email protected]

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

Research Support, N.I.H., Extramural
Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

Cell Line, Tumor
Gene Expression Profiling
High-Throughput Nucleotide Sequencing*
Humans
Male
Microarray Analysis
Models, Statistical*
Prostatic Neoplasms / genetics
Sequence Analysis, RNA / methods*
Transcriptome
