Hierarchical inverse Gaussian models and multiple testing: application to gene expression data

Stat Appl Genet Mol Biol. 2005:4:Article23. doi: 10.2202/1544-6115.1151. Epub 2005 Sep 6.

Abstract

Detecting differentially expressed genes in microarray experiments is a well-studied topic. Many of the proposed hypothesis testing methods rely on strong distributional assumptions for the gene intensities. However, the shape of microarray data may vary substantially from one experiment to another, and model assumptions may be seriously violated in many cases. The literature on microarray data is mainly based on two distributions, the log-normal and the gamma, which often appear to be effective when used in a Bayesian hierarchical framework. However, although a model that fits the data well globally is attractive, two points deserve attention: the ability of the model to fit the tail of the observed distribution, and the robustness of the hypothesis tests' error rates to model misspecification. To focus on these aspects, we propose Bayesian models involving the inverse Gaussian distribution to describe gene expression data. We show that in some situations these models can be good competitors to the traditional Bayesian or random-effect gamma and log-normal models. A multiple testing procedure is then proposed, based on an asymptotic property of the posterior probability of the one-sided alternative hypothesis. We show that this asymptotic property is well approximated for inverse Gaussian models, even when the number of observations available for each test is very small.
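
As a rough illustration of the kind of comparison described above, the following minimal sketch fits an inverse Gaussian and a log-normal distribution to a set of positive intensities by maximum likelihood and compares their log-likelihoods. It does not reproduce the hierarchical Bayesian models or the multiple testing procedure of the paper. The function names, the simulated data, and the parameter values (mu = 1, lambda = 2) are assumptions chosen for illustration only. The sampler uses the standard transformation method of Michael, Schucany and Haas.

```python
import math
import random


def sample_inverse_gaussian(mu, lam, rng):
    """Draw one IG(mu, lam) variate (Michael-Schucany-Haas transformation)."""
    y = rng.gauss(0.0, 1.0) ** 2
    x = (mu + mu * mu * y / (2.0 * lam)
         - mu / (2.0 * lam) * math.sqrt(4.0 * mu * lam * y + mu * mu * y * y))
    # Keep the smaller root with probability mu / (mu + x), else the larger one.
    if rng.random() <= mu / (mu + x):
        return x
    return mu * mu / x


def fit_inverse_gaussian(x):
    """Closed-form maximum likelihood estimates (mu, lam) for IG data."""
    n = len(x)
    mu = sum(x) / n
    lam = n / sum(1.0 / xi - 1.0 / mu for xi in x)
    return mu, lam


def loglik_inverse_gaussian(x, mu, lam):
    """Log-likelihood of the sample x under IG(mu, lam)."""
    return sum(
        0.5 * math.log(lam / (2.0 * math.pi * xi ** 3))
        - lam * (xi - mu) ** 2 / (2.0 * mu ** 2 * xi)
        for xi in x
    )


def fit_lognormal(x):
    """Maximum likelihood estimates (mean, variance) of log(x)."""
    logs = [math.log(xi) for xi in x]
    n = len(logs)
    m = sum(logs) / n
    s2 = sum((v - m) ** 2 for v in logs) / n
    return m, s2


def loglik_lognormal(x, m, s2):
    """Log-likelihood of the sample x under a log-normal with log-mean m, log-variance s2."""
    return sum(
        -math.log(xi) - 0.5 * math.log(2.0 * math.pi * s2)
        - (math.log(xi) - m) ** 2 / (2.0 * s2)
        for xi in x
    )


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical "intensities": simulated here, not real microarray data.
    data = [sample_inverse_gaussian(1.0, 2.0, rng) for _ in range(200)]
    mu_hat, lam_hat = fit_inverse_gaussian(data)
    m_hat, s2_hat = fit_lognormal(data)
    print("IG fit: mu=%.3f lam=%.3f" % (mu_hat, lam_hat))
    print("IG log-likelihood:        ", loglik_inverse_gaussian(data, mu_hat, lam_hat))
    print("log-normal log-likelihood:", loglik_lognormal(data, m_hat, s2_hat))
```

Both models have two parameters, so their maximized log-likelihoods can be compared directly. A global comparison of this kind says nothing about tail fit or error-rate robustness, which are the two aspects the abstract singles out.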