Empirical Bayes analysis of sequencing-based transcriptional profiling without replicates

BMC Bioinformatics. 2010 Nov 16:11:564. doi: 10.1186/1471-2105-11-564.

Abstract

Background: Recent technological advancements have made high-throughput sequencing an increasingly popular approach for transcriptome analysis. Advantages of sequencing-based transcriptional profiling over microarrays have been reported, including lower technical variability. However, advances in technology do not remove biological variation between replicates, and this variation is often neglected in analyses.

Results: We propose an empirical Bayes method, titled Analysis of Sequence Counts (ASC), to detect differential expression from sequencing data. ASC borrows information across sequences to establish a prior distribution of sample variation, so that biological variation can be accounted for even when replicates are not available. Compared with current approaches that simply test for equality of proportions in two samples, ASC is less biased towards highly expressed sequences and can identify more genes with a greater log fold change at lower overall abundance.
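
To illustrate the bias attributed above to tests for equality of proportions, the following is a minimal sketch of a pooled two-proportion z-test in Python. It is not taken from the paper: the function name, the counts, and the library sizes are all hypothetical, and the z-test is only one of several tests of this kind. The point it shows is that a small fold change at high abundance can come out as more significant than a large fold change at low abundance.

```python
import math


def two_proportion_z_test(count_a, total_a, count_b, total_b):
    """Two-sided pooled z-test for equal proportions in two samples.

    Hypothetical helper for illustration; not part of ASC.
    """
    p_a = count_a / total_a
    p_b = count_b / total_b
    pooled = (count_a + count_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (p_a - p_b) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Assumed library sizes: one million reads per sample (made-up numbers).
TOTAL = 1_000_000

# Low-abundance sequence with a fourfold change.
z_low, p_low = two_proportion_z_test(10, TOTAL, 40, TOTAL)

# High-abundance sequence with only a 1.25-fold change.
z_high, p_high = two_proportion_z_test(1000, TOTAL, 1250, TOTAL)

print(f"low abundance,  4.00-fold: z = {z_low:.2f}, p = {p_low:.2e}")
print(f"high abundance, 1.25-fold: z = {z_high:.2f}, p = {p_high:.2e}")
```

With these made-up counts the high-abundance sequence receives the smaller p-value even though its fold change is much smaller, so a ranking by p-value alone favors highly expressed sequences.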

Conclusions: ASC unifies the biological and statistical significance of differential expression by estimating the posterior mean of log fold change and estimating false discovery rates based on the posterior mean. The implementation in R is available at http://www.stat.brown.edu/Zwu/research.aspx.
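
The idea of a posterior mean of log fold change can be sketched with a generic normal-normal empirical Bayes shrinkage. This is a simplified stand-in and not the ASC model itself: the normal prior centered at zero, the method-of-moments estimate of the prior variance, the 0.5 pseudocount, the approximation of the sampling variance by the sum of reciprocal counts, and all names and counts below are assumptions made for illustration only.

```python
import math


def shrink_log_fold_changes(counts_a, counts_b, total_a, total_b):
    """Shrink observed log fold changes towards zero (illustrative only).

    Assumes delta_i ~ N(0, prior_var) and observed_i ~ N(delta_i, var_i),
    with prior_var estimated from all sequences together.
    """
    observed, variances = [], []
    for a, b in zip(counts_a, counts_b):
        # Natural-log fold change of sample B over sample A, with a 0.5
        # pseudocount so that zero counts do not break the logarithm.
        lfc = math.log((b + 0.5) / total_b) - math.log((a + 0.5) / total_a)
        observed.append(lfc)
        # Delta-method approximation: Var(log count) is roughly 1 / count.
        variances.append(1.0 / (a + 0.5) + 1.0 / (b + 0.5))

    n = len(observed)
    # Method of moments: E[observed^2] = prior_var + E[sampling variance].
    prior_var = max(sum(d * d for d in observed) / n - sum(variances) / n, 0.0)

    # Posterior mean under the normal-normal model.
    posterior = [
        d * prior_var / (prior_var + v) for d, v in zip(observed, variances)
    ]
    return observed, posterior, prior_var


# Made-up counts for six sequences in two unreplicated samples.
counts_a = [10, 1000, 5, 200, 50, 3000]
counts_b = [40, 1250, 5, 100, 55, 2900]

observed, posterior, prior_var = shrink_log_fold_changes(
    counts_a, counts_b, 1_000_000, 1_000_000
)
for a, b, d, m in zip(counts_a, counts_b, observed, posterior):
    print(f"counts {a:>4} vs {b:>4}: observed {d:+.3f}, posterior mean {m:+.3f}")
```

In this sketch the prior variance is estimated from all sequences at once, which is the sense in which information is borrowed across sequences; estimates from low counts are pulled towards zero more strongly than estimates from high counts.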

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Bayes Theorem
  • Databases, Genetic
  • Gene Expression Profiling / methods*
  • Genomics
  • Oligonucleotide Array Sequence Analysis / methods*
  • Sequence Analysis, DNA
  • Sequence Analysis, RNA