Data augmentation is commonly used in Bayesian analysis of correlated binary data with multivariate probit models. However, the identification issue in multivariate probit models necessitates a Metropolis-Hastings algorithm for sampling the correlation matrix, which may cause slow convergence and inefficient Markov chains. It is well known that parameter-expanded data augmentation, which introduces a working (artificial) parameter or parameter vector, renders an identifiable model non-identifiable and improves the mixing and convergence of the data augmentation components. This motivates us to develop efficient parameter-expanded data augmentation algorithms for analyzing correlated binary data with multivariate probit models. We investigate both identifiable and non-identifiable multivariate probit models and develop the corresponding parameter-expanded data augmentation algorithms. We point out that the approaches based on one non-identifiable model circumvent the Metropolis-Hastings algorithm for sampling the correlation matrix and improve the convergence and mixing of the correlation parameters, whereas the identifiable model may yield regression parameter estimates with smaller standard errors than the non-identifiable model. We illustrate the proposed approaches through simulation studies and an application to a longitudinal dataset from the Six Cities study.
Keywords: correlated binary data; data augmentation; multivariate probit model; parameter-expanded data augmentation.
© 2020 John Wiley & Sons, Ltd.