Bayesian Variable Selection for Gaussian Copula Regression Models

J Comput Graph Stat. 2020 Dec 10;30(3):578-593. doi: 10.1080/10618600.2020.1840997.

Abstract

We develop a novel Bayesian method to select important predictors in regression models with multiple responses of diverse types. A sparse Gaussian copula regression model is used to account for the multivariate dependencies between any combination of discrete and/or continuous responses and their association with a set of predictors. We use the parameter expansion for data augmentation strategy to construct a Markov chain Monte Carlo algorithm for estimating the parameters and the latent variables of the model. Based on a centered parametrization of the Gaussian latent variables, we design a fixed-dimensional proposal distribution to jointly update the latent binary vectors of important predictors and the corresponding non-zero regression coefficients. For Gaussian responses and for outcomes that can be modeled as a dependent version of a Gaussian response, this proposal leads to a Metropolis-Hastings step that allows an efficient exploration of the predictors' model space. The proposed strategy is tested on simulated data and applied to real data sets in which the responses consist of low-intensity counts, binary, ordinal and continuous variables.

Keywords: Gaussian copula; Mixed data; Multiple-response regression models; Sparse covariance matrix; Variable selection.
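
As a rough illustration of the kind of data the abstract describes, the sketch below simulates one continuous, one binary and one count response that share Gaussian-copula dependence and depend on a sparse set of predictors. It is not the authors' sampler: the function names (`norm_cdf`, `poisson_quantile`, `cholesky`, `simulate`), the logistic and Poisson margins, the coefficient matrix `BETA` and the correlation matrix `CORR` are all assumptions chosen for illustration, and no MCMC or variable selection is performed.

```python
import math
import random


def norm_cdf(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def poisson_quantile(u, rate, max_count=1000):
    """Smallest k with P(Y <= k) >= u for Y ~ Poisson(rate), capped at max_count."""
    k, pmf = 0, math.exp(-rate)
    cdf = pmf
    while cdf < u and k < max_count:
        k += 1
        pmf *= rate / k
        cdf += pmf
    return k


def cholesky(a):
    """Lower-triangular L with L L^T = a, for a positive definite matrix a."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


# Hypothetical settings: rows are responses, columns are predictors.
# Zeros mark predictors that are irrelevant for a given response.
BETA = [
    [1.0, 0.0, 0.0, -0.5],  # continuous response
    [0.0, 0.8, 0.0, 0.0],   # binary response
    [0.0, 0.0, 0.0, 0.3],   # count response
]
# Hypothetical latent correlation matrix; the zero entry makes it sparse.
CORR = [
    [1.0, 0.5, 0.0],
    [0.5, 1.0, 0.3],
    [0.0, 0.3, 1.0],
]


def simulate(n, beta, corr, seed=0):
    """Draw n observations (x, (y_cont, y_bin, y_count)) from the assumed model."""
    rng = random.Random(seed)
    low = cholesky(corr)
    m, p = len(beta), len(beta[0])
    rows = []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(p)]
        eps = [rng.gauss(0.0, 1.0) for _ in range(m)]
        # Correlated latent Gaussian variables z ~ N(0, corr).
        z = [sum(low[i][k] * eps[k] for k in range(i + 1)) for i in range(m)]
        eta = [sum(b * xj for b, xj in zip(row, x)) for row in beta]
        # Gaussian margin with unit variance: the latent variable is used directly.
        y_cont = eta[0] + z[0]
        # Bernoulli margin: inverse CDF applied to u = Phi(z).
        prob = 1.0 / (1.0 + math.exp(-eta[1]))
        y_bin = 1 if norm_cdf(z[1]) > 1.0 - prob else 0
        # Poisson margin with log link, again through the inverse CDF.
        y_count = poisson_quantile(norm_cdf(z[2]), math.exp(eta[2]))
        rows.append((x, (y_cont, y_bin, y_count)))
    return rows
```

Calling `simulate(200, BETA, CORR, seed=1)` returns 200 pairs of predictors and mixed-type responses. Each discrete margin is obtained by pushing a latent Gaussian variable through the standard normal CDF and then through the inverse CDF of the target distribution, which is the usual way a Gaussian copula couples responses of different types.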