A weighted sample size for microarray datasets that considers the variability of variance and multiplicity

J Biosci Bioeng. 2009 Sep;108(3):252-8. doi: 10.1016/j.jbiosc.2009.03.017.

Abstract

Microarray experiments are often performed to detect differentially expressed genes among different clinical phenotypes. The method used to calculate the appropriate sample size for this purpose differs from the sample size calculation used for general clinical experiments, because microarrays include tens of thousands of genes. We proposed a sample size calculation method that considers the variance across the entire gene set and uses the Bonferroni correction to address the multiplicity problem. Specifically, the existing equation for sample size calculation was modified with the Bonferroni correction to adjust for multiplicity. Using k-means cluster analysis, the variances across all genes were divided into several groups with similar values; a sample size was then calculated for each group, and these sizes were combined as a weighted average. The results of this study show that the sample size was related to the number of genes on a chip. The weighted sample size, calculated by the proposed method, preserved the Type I error for selection of significant genes within a microarray data set.
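The procedure described above can be illustrated with a minimal Python sketch. The abstract does not give the equation, the number of clusters, or the weights, so everything specific here is an assumption: the textbook two-sample, two-sided normal-approximation formula n = 2σ²(z_{1-α'/2} + z_{1-β})²/δ² with α' = α/G for G genes, a plain one-dimensional k-means on the per-gene variances, the cluster mean as each cluster's representative variance, and weights equal to the fraction of genes in each cluster. The function names (`per_group_n`, `kmeans_1d`, `weighted_sample_size`) and parameters are hypothetical and are not taken from the paper.

```python
import math
from statistics import NormalDist


def per_group_n(variance, delta, alpha, power, n_genes):
    """Per-group sample size from the assumed textbook two-sample formula,
    with a Bonferroni-adjusted significance level alpha / n_genes."""
    z = NormalDist()
    alpha_adj = alpha / n_genes              # Bonferroni correction
    z_alpha = z.inv_cdf(1 - alpha_adj / 2)   # two-sided critical value
    z_beta = z.inv_cdf(power)                # quantile for the target power
    return 2 * variance * (z_alpha + z_beta) ** 2 / delta ** 2


def kmeans_1d(values, k, iters=100):
    """Plain Lloyd's k-means on scalars; centers start at evenly spaced
    quantiles of the data (an assumed initialisation). Returns the clusters."""
    s = sorted(values)
    centers = [s[int((i + 0.5) * len(s) / k)] for i in range(k)]
    clusters = [list(values)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centers[c]))
            clusters[nearest].append(v)
        new_centers = [sum(c) / len(c) if c else centers[i]
                       for i, c in enumerate(clusters)]
        if new_centers == centers:           # assignments have stabilised
            break
        centers = new_centers
    return clusters


def weighted_sample_size(variances, delta, alpha=0.05, power=0.8, k=3):
    """Cluster the gene variances, size each cluster separately, then
    average the sizes weighted by the share of genes in each cluster."""
    total = len(variances)
    weighted = 0.0
    for cluster in kmeans_1d(variances, k):
        if not cluster:
            continue
        mean_var = sum(cluster) / len(cluster)
        # Round each cluster's size up to a whole number of samples.
        n_cluster = math.ceil(per_group_n(mean_var, delta, alpha, power, total))
        weighted += (len(cluster) / total) * n_cluster
    return math.ceil(weighted)


if __name__ == "__main__":
    # Made-up variances for 100 hypothetical genes, for illustration only.
    demo_variances = [0.1] * 50 + [0.5] * 30 + [2.0] * 20
    print(weighted_sample_size(demo_variances, delta=1.0))
```

Two properties of this sketch are worth noting. Because the Bonferroni-adjusted level α/G shrinks as G grows, the critical value and therefore the sample size increase with the number of genes, which is consistent with the relationship reported above. Also, the assumed formula is linear in the variance, so without the per-cluster rounding the weighted average would coincide with the size obtained from the overall mean variance; the published method may define the per-cluster sizes or weights differently.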

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Cluster Analysis
  • Databases, Genetic
  • Gene Expression Profiling / methods*
  • Models, Statistical
  • Oligonucleotide Array Sequence Analysis / methods*
  • Phenotype
  • Reproducibility of Results
  • Research Design
  • Sample Size*