Sample size calculation for differential expression analysis of RNA-seq data under Poisson distribution

Int J Comput Biol Drug Des. 2013;6(4):358-75. doi: 10.1504/IJCBDD.2013.056830. Epub 2013 Sep 30.

Abstract

Sample size determination is an important issue in the experimental design of biomedical research. Because of the complexity of RNA-seq experiments, however, the field currently lacks a sample size method widely applicable to differential expression studies utilising RNA-seq technology. In this report, we propose several methods for sample size calculation for single-gene differential expression analysis of RNA-seq data under a Poisson distribution. These methods are then extended to multiple genes, addressing the multiple testing problem by controlling the false discovery rate. Moreover, most of the proposed methods yield closed-form sample size formulas once the desired minimum fold change and minimum average read count are specified, and thus are not computationally intensive. Simulation studies evaluating the performance of the proposed sample size formulas are presented; the results indicate that our methods work well, achieving the desired power. Finally, our sample size calculation methods are applied to three real RNA-seq data sets.
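
To illustrate what a closed-form sample size formula driven by a minimum fold change and a minimum average read count can look like, here is a minimal sketch. It is not taken from the paper. It assumes a generic Wald test on the log rate ratio of two Poisson means, equal group sizes and equal sequencing depth in both groups. For the multiple-gene case it assumes one commonly used approximation for the per-gene significance level under false discovery rate control, alpha* = r1 f / (m0 (1 - f)), where r1 is the expected number of true rejections, m0 the number of non-differential genes and f the target false discovery rate. The function names `poisson_sample_size` and `fdr_adjusted_alpha` and all parameter names are hypothetical.

```python
import math
from statistics import NormalDist


def poisson_sample_size(fold_change, mean_count, alpha=0.05, power=0.8):
    """Per-group sample size for a two-sided Wald test on log(fold change).

    Assumption (not from the paper): counts are Poisson with mean
    `mean_count` in the control group and `fold_change * mean_count`
    in the treatment group, with equal group sizes and depth.
    """
    if fold_change <= 0 or fold_change == 1:
        raise ValueError("fold_change must be positive and different from 1")
    if mean_count <= 0:
        raise ValueError("mean_count must be positive")
    if not (0 < alpha < 1 and 0 < power < 1):
        raise ValueError("alpha and power must lie strictly between 0 and 1")

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    # Delta-method variance of the log rate ratio for one sample per group.
    unit_variance = (1 + 1 / fold_change) / mean_count
    n = (z_alpha + z_beta) ** 2 * unit_variance / math.log(fold_change) ** 2
    return math.ceil(n)


def fdr_adjusted_alpha(n_genes, n_de_genes, fdr=0.05, power=0.8):
    """Approximate per-gene significance level for a target FDR.

    Assumption (not from the paper): alpha* = r1 * f / (m0 * (1 - f)).
    """
    if not 0 < n_de_genes < n_genes:
        raise ValueError("need 0 < n_de_genes < n_genes")
    expected_true_rejections = n_de_genes * power  # r1
    n_null_genes = n_genes - n_de_genes            # m0
    return expected_true_rejections * fdr / (n_null_genes * (1 - fdr))


if __name__ == "__main__":
    # Single gene: detect a 2-fold change at an average read count of 5.
    print(poisson_sample_size(fold_change=2.0, mean_count=5.0))

    # Multiple genes: 10,000 genes, 100 assumed differentially expressed.
    alpha_star = fdr_adjusted_alpha(n_genes=10_000, n_de_genes=100)
    print(poisson_sample_size(2.0, 5.0, alpha=alpha_star))
```

In this sketch the required sample size shrinks as the average read count or the fold change (on the log scale) grows, which is why specifying their minimum values gives a conservative estimate for the whole experiment.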

MeSH terms

  • Humans
  • Kidney / metabolism
  • Liver / metabolism
  • Oligonucleotide Array Sequence Analysis*
  • Poisson Distribution
  • Sample Size*
  • Sequence Analysis, RNA*