Normal uniform mixture differential gene expression detection for cDNA microarrays

BMC Bioinformatics. 2005 Jul 12;6:173. doi: 10.1186/1471-2105-6-173.

Abstract

Background: One of the primary tasks in analysing gene expression data is finding genes that are differentially expressed in different samples. Multiple testing issues due to the thousands of tests run make some of the more popular methods for doing this problematic.

Results: We propose a simple method, Normal Uniform Differential Gene Expression (NUDGE) detection, for finding differentially expressed genes in cDNA microarrays. The method uses a simple univariate normal-uniform mixture model, in combination with new normalization methods for spread as well as mean that extend the lowess normalization of Dudoit, Yang, Callow and Speed (2002) [1]. It takes account of multiple testing, and gives probabilities of differential expression as part of its output. It can be applied to either single-slide or replicated experiments, and it is very fast. Three datasets are analyzed using NUDGE, and the results are compared to those given by other popular methods: unadjusted and Bonferroni-adjusted t tests, Significance Analysis of Microarrays (SAM), and Empirical Bayes for microarrays (EBarrays) with both Gamma-Gamma and Lognormal-Normal models.
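The idea of a univariate normal-uniform mixture that outputs probabilities of differential expression can be illustrated with a short sketch. This is not the nudge package or the authors' implementation; it is a minimal EM fit under stated assumptions: the normal component is taken to describe genes that are not differentially expressed, the uniform component (spanning the observed range) is taken to describe differentially expressed genes, and the input is assumed to be already normalized log ratios. The function names, starting values and stopping rule are all hypothetical choices made for illustration.

```python
import numpy as np


def posterior_uniform(x, p, mu, sigma, a, b):
    """Posterior probability that each x came from the uniform component."""
    normal = np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))
    uniform = np.full_like(x, 1.0 / (b - a))
    mix = p * uniform + (1.0 - p) * normal
    return p * uniform / mix, mix


def fit_normal_uniform(x, n_iter=500, tol=1e-8):
    """EM fit of (1 - p) * N(mu, sigma^2) + p * U(a, b) to a 1-D sample.

    Assumption for this sketch: a and b are fixed at the sample minimum
    and maximum, and only p, mu and sigma are estimated.
    """
    x = np.asarray(x, dtype=float)
    a, b = x.min(), x.max()

    # Illustrative starting values: robust centre and spread, 10% uniform weight.
    mu = np.median(x)
    sigma = max(1.4826 * np.median(np.abs(x - mu)), 1e-6)
    p = 0.1

    prev_loglik = -np.inf
    for _ in range(n_iter):
        # E-step: posterior probability of the uniform component.
        z, mix = posterior_uniform(x, p, mu, sigma, a, b)
        loglik = np.sum(np.log(mix))
        if loglik - prev_loglik < tol:
            break
        prev_loglik = loglik

        # M-step: update the mixing weight and the weighted normal parameters.
        w = 1.0 - z
        p = z.mean()
        mu = np.sum(w * x) / np.sum(w)
        sigma = max(np.sqrt(np.sum(w * (x - mu) ** 2) / np.sum(w)), 1e-6)

    # Posterior probabilities under the final parameter estimates.
    z, _ = posterior_uniform(x, p, mu, sigma, a, b)
    return z, {"p": p, "mu": mu, "sigma": sigma, "a": a, "b": b}


# Simulated example (not real microarray data): 950 null genes, 50 changed genes.
rng = np.random.default_rng(0)
log_ratios = np.concatenate([rng.normal(0.0, 0.3, 950), rng.uniform(-4.0, 4.0, 50)])
prob_de, params = fit_normal_uniform(log_ratios)
```

In this sketch, `prob_de[i]` plays the role of a per-gene probability of differential expression, so genes can be ranked or thresholded on it directly rather than on a p-value.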

Conclusion: The method gives a high probability of differential expression to genes known or suspected a priori to be differentially expressed and a low probability to the others. In terms of known false positives and false negatives, the method outperforms all multiple-replicate methods except for the Gamma-Gamma EBarrays method, to which it offers comparable results with the added advantages of greater simplicity, greater speed, fewer assumptions and applicability to the single-replicate case. An R package called nudge to implement the methods in this paper will be made available soon at http://www.bioconductor.org.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Algorithms
  • CD4-Positive T-Lymphocytes / virology
  • Computational Biology / methods*
  • DNA, Complementary / metabolism
  • Data Interpretation, Statistical
  • False Negative Reactions
  • False Positive Reactions
  • Gene Expression Profiling / methods*
  • Gene Expression Regulation*
  • HIV / genetics
  • Humans
  • Internet
  • Models, Genetic
  • Models, Statistical
  • Nucleic Acid Hybridization
  • Oligonucleotide Array Sequence Analysis / methods*
  • Reproducibility of Results
  • Sensitivity and Specificity
  • Sequence Analysis, DNA
  • Software

Substances

  • DNA, Complementary