Multivariate Bayesian variable selection exploiting dependence structure among outcomes: Application to air pollution effects on DNA methylation

Biometrics. 2017 Mar;73(1):232-241. doi: 10.1111/biom.12557. Epub 2016 Jul 5.

Abstract

The analysis of multiple outcomes is becoming increasingly common in modern biomedical studies. Joint statistical models for multiple outcomes are well known to be more flexible than separate models fitted to each outcome, and they yield more powerful tests of exposure or treatment effects by taking into account the dependence among outcomes and pooling evidence across them. It is, however, unlikely that all outcomes are related to the same subset of covariates. Therefore, there is interest in identifying exposures or treatments associated with particular outcomes, which we term outcome-specific variable selection. In this work, we propose a variable selection approach for multivariate normal responses that incorporates not only information on the mean model, but also information on the variance-covariance structure of the outcomes. The approach leverages evidence from all correlated outcomes to estimate the effect of a particular covariate on a given outcome. To implement this strategy, we develop a Bayesian method that builds a multivariate prior for the variable selection indicators based on the variance-covariance of the outcomes. We show via simulation that the proposed variable selection strategy can boost power to detect subtle effects without increasing the probability of false discoveries. We apply the approach to the Normative Aging Study (NAS) epigenetic data and identify a subset of five genes in the asthma pathway for which gene-specific DNA methylation is associated with exposure to either black carbon, a marker of traffic pollution, or sulfate, a marker of particles generated by power plants.
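As a rough illustration of what a multivariate prior on selection indicators can look like, the Python sketch below couples the inclusion indicators of one covariate across outcomes through the absolute outcome correlations. This is not the authors' prior: the abstract states only that the prior is built from the variance-covariance of the outcomes. The Ising-type form, the function names, and the parameters `a` (baseline sparsity) and `b` (coupling strength) are all assumptions made for illustration.

```python
import itertools
import math

import numpy as np


def _coupling_weights(R):
    """Hypothetical coupling: absolute outcome correlations, zero diagonal."""
    W = np.abs(np.asarray(R, dtype=float))
    np.fill_diagonal(W, 0.0)
    return W


def indicator_prior(R, a=-2.0, b=2.0):
    """Exact prior over all 2^q indicator vectors for one covariate.

    Assumed form (illustrative only):
        p(gamma) proportional to
        exp(a * sum_k gamma_k + b * sum_{k<l} |R_kl| * gamma_k * gamma_l)
    Enumeration is only feasible for a small number of outcomes q.
    """
    W = _coupling_weights(R)
    q = W.shape[0]
    configs = list(itertools.product([0, 1], repeat=q))
    log_w = []
    for g in configs:
        g = np.asarray(g, dtype=float)
        # 0.5 * g' W g equals the sum over pairs k<l because W is
        # symmetric with a zero diagonal.
        log_w.append(a * g.sum() + b * 0.5 * (g @ W @ g))
    log_w = np.array(log_w)
    probs = np.exp(log_w - log_w.max())  # subtract max for stability
    return configs, probs / probs.sum()


def conditional_inclusion(k, gamma, R, a=-2.0, b=2.0):
    """Prior P(gamma_k = 1 | other indicators) under the assumed form."""
    W = _coupling_weights(R)
    gamma = np.asarray(gamma, dtype=float)
    # W[k, k] is zero, so gamma[k] itself does not enter the field.
    field = a + b * (W[k] @ gamma)
    return 1.0 / (1.0 + math.exp(-field))
```

With two outcomes correlated at 0.8 and the default parameters, the prior inclusion probability for the first outcome rises from sigmoid(-2) when the second outcome is excluded to sigmoid(-0.4) when it is included. That is the sense in which evidence for an effect on one outcome can raise the prior plausibility of an effect on a correlated outcome.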

Keywords: Bayesian variable selection; Markov chain Monte Carlo method; Multivariate regression analysis; Phase transition; Structured spike-and-slab prior.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Air Pollution / adverse effects*
  • Analysis of Variance
  • Asthma / etiology
  • Asthma / genetics
  • Bayes Theorem
  • Biometry / methods*
  • DNA Methylation* / genetics
  • Data Interpretation, Statistical*
  • Environmental Exposure / adverse effects
  • Humans
  • Models, Statistical*
  • Particulate Matter / adverse effects
  • Soot / adverse effects
  • Sulfates / adverse effects

Substances

  • Particulate Matter
  • Soot
  • Sulfates