Parameter-expanded data augmentation for analyzing correlated binary data using multivariate probit models

Stat Med. 2020 Nov 10;39(25):3637-3652. doi: 10.1002/sim.8685. Epub 2020 Jul 24.

Abstract

Data augmentation is commonly used in Bayesian analysis of correlated binary data with multivariate probit models. However, the identification issue in multivariate probit models necessitates a rigorous Metropolis-Hastings algorithm for sampling the correlation matrix, which may cause slow convergence and inefficiency of the Markov chains. It is well known that parameter-expanded data augmentation, by introducing a working (artificial) parameter or parameter vector, renders an identifiable model non-identifiable and improves the mixing and convergence of the data augmentation components. This motivates us to develop efficient parameter-expanded data augmentation algorithms for analyzing correlated binary data with multivariate probit models. We investigate both identifiable and non-identifiable multivariate probit models and develop the corresponding parameter-expanded data augmentation algorithms. We point out that the approaches based on one non-identifiable model circumvent the Metropolis-Hastings algorithm for sampling the correlation matrix and improve the convergence and mixing of the correlation parameters, whereas the identifiable model may yield regression parameter estimates with smaller standard errors than the non-identifiable model. We illustrate the proposed approaches through simulation studies and an application to a longitudinal dataset from the Six Cities study.
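The identification issue referred to in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it only simulates the latent-variable form of a multivariate probit model, Y = 1(Z > 0) with Z normal, and checks that rescaling each latent coordinate by a positive factor leaves the observed binary data unchanged. The function names (simulate_mvp, cov_to_corr), the dimensions, and the parameter values are hypothetical and chosen purely for illustration; NumPy is assumed.

```python
import numpy as np

def simulate_mvp(X, beta, R, rng):
    """Draw latent Z ~ N(X beta, R) row by row and return (Z, Y) with Y = 1(Z > 0).

    X has shape (n, T, p), beta has shape (p,), R is a T x T correlation matrix.
    """
    n, T = X.shape[0], R.shape[0]
    L = np.linalg.cholesky(R)                         # R = L L^T
    Z = X @ beta + rng.standard_normal((n, T)) @ L.T  # rows have covariance R
    Y = (Z > 0).astype(int)
    return Z, Y

def cov_to_corr(Sigma):
    """Rescale a covariance matrix to the corresponding correlation matrix."""
    s = np.sqrt(np.diag(Sigma))
    return Sigma / np.outer(s, s)

# Illustrative (hypothetical) dimensions and parameter values.
rng = np.random.default_rng(0)
n, T, p = 200, 3, 2
X = rng.standard_normal((n, T, p))
beta = np.array([0.5, -1.0])
R = np.array([[1.0, 0.4, 0.2],
              [0.4, 1.0, 0.3],
              [0.2, 0.3, 1.0]])
Z, Y = simulate_mvp(X, beta, R, rng)

# Rescale each latent coordinate by a positive factor d_t. The scaled latent
# vector has covariance Sigma = D R D, yet the signs, and hence Y, are unchanged.
d = np.array([0.5, 2.0, 3.0])
Z_scaled = Z * d
Y_scaled = (Z_scaled > 0).astype(int)
Sigma = np.outer(d, d) * R

print(np.array_equal(Y, Y_scaled))          # same binary data under both scalings
print(np.allclose(cov_to_corr(Sigma), R))   # only the correlation matrix is recoverable
```

Because the binary outcomes carry no information about the scale factors, only the correlation matrix is identified from the data. A positive scale factor of this kind is the sort of quantity that can serve as a working parameter in a parameter-expanded scheme; how the paper's algorithms actually use it is not specified in the abstract.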

Keywords: correlated binary data; data augmentation; multivariate probit model; parameter-expanded data augmentation.

MeSH terms

  • Algorithms*
  • Bayes Theorem
  • Computer Simulation
  • Humans
  • Markov Chains