Simultaneous modeling of multivariate heterogeneous responses and heteroskedasticity via a two-stage composite likelihood

Biom J. 2023 Aug;65(6):e2200029. doi: 10.1002/bimj.202200029. Epub 2023 May 22.

Abstract

Multivariate heterogeneous responses and heteroskedasticity have attracted increasing attention in recent years. In genome-wide association studies, effective simultaneous modeling of multiple phenotypes would improve statistical power and interpretability. However, a flexible common modeling system for heterogeneous data types can pose computational difficulties. Here we build upon a previous method for multivariate probit estimation using a two-stage composite likelihood that exhibits favorable computational time while retaining attractive parameter estimation properties. We extend this approach to incorporate multivariate responses of heterogeneous data types (binary and continuous), and possible heteroskedasticity. Although the approach has wide applications, it would be particularly useful for genomics, precision medicine, or individual biomedical prediction. Using a genomics example, we explore statistical power and confirm that the approach performs well for hypothesis testing and coverage percentages under a wide variety of settings. The approach has the potential to better leverage genomics data and provide interpretable inference for pleiotropy, in which a locus is associated with multiple traits.
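The two-stage idea can be illustrated with a toy example. The sketch below is not the authors' implementation; it is a minimal illustration under stated assumptions: one continuous and one binary response, a single hypothetical genotype covariate x coded 0/1/2, a log-linear model for the standard deviation of the continuous response, and a probit model for the binary response that shares a latent Gaussian correlation rho with the continuous one. Stage 1 fits each margin separately; stage 2 holds the marginal estimates fixed and maximizes the pairwise likelihood over rho alone. All function names, parameter values, and the simulated data are hypothetical, and the grid search in stage 2 is a simple stand-in for a proper optimizer.

```python
import math
import random

def Phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def wls(x, y, w):
    """Weighted least squares for y ~ c0 + c1 * x; returns (c0, c1)."""
    sw = sum(w)
    xb = sum(wi * xi for wi, xi in zip(w, x)) / sw
    yb = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xb) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xb) * (yi - yb) for wi, xi, yi in zip(w, x, y))
    c1 = sxy / sxx
    return yb - c1 * xb, c1

def fit_hetero_normal(x, y, iters=30):
    """Stage 1a: y ~ N(b0 + b1*x, exp(g0 + g1*x)^2), fitted by alternating
    WLS for the mean with Fisher scoring for the log-variance."""
    ones = [1.0] * len(x)
    b0, b1 = wls(x, y, ones)
    r2 = [(yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y)]
    t0, t1 = math.log(sum(r2) / len(r2)), 0.0      # log-variance coefficients
    for _ in range(iters):
        r2 = [(yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y)]
        eta = [t0 + t1 * xi for xi in x]
        z = [e + r / math.exp(e) - 1.0 for e, r in zip(eta, r2)]  # working response
        t0, t1 = wls(x, z, ones)
        b0, b1 = wls(x, y, [math.exp(-(t0 + t1 * xi)) for xi in x])
    return b0, b1, t0 / 2.0, t1 / 2.0              # halve to get log-sd scale

def fit_probit(x, y, iters=30):
    """Stage 1b: P(y = 1) = Phi(a0 + a1*x), fitted by Fisher scoring (IRLS)."""
    a0, a1 = 0.0, 0.0
    for _ in range(iters):
        z, w = [], []
        for xi, yi in zip(x, y):
            eta = a0 + a1 * xi
            p = min(max(Phi(eta), 1e-10), 1.0 - 1e-10)
            d = math.exp(-0.5 * eta * eta) / math.sqrt(2.0 * math.pi)
            z.append(eta + (yi - p) / d)
            w.append(d * d / (p * (1.0 - p)))
        a0, a1 = wls(x, z, w)
    return a0, a1

def fit_rho(x, y1, y2, margins):
    """Stage 2: maximize the pairwise log-likelihood over rho, margins fixed.
    Only the conditional term P(y2 | y1) depends on rho."""
    b0, b1, g0, g1, a0, a1 = margins
    e = [(yc - b0 - b1 * xi) / math.exp(g0 + g1 * xi) for xi, yc in zip(x, y1)]
    m = [a0 + a1 * xi for xi in x]
    s = [1.0 if yb == 1 else -1.0 for yb in y2]

    def loglik(rho):
        c = math.sqrt(1.0 - rho * rho)
        return sum(math.log(max(Phi(si * (mi + rho * ei) / c), 1e-300))
                   for si, mi, ei in zip(s, m, e))

    grid = [k / 100.0 for k in range(-95, 96)]     # coarse grid search
    return max(grid, key=loglik)

# Simulated data (all values are illustrative assumptions).
rng = random.Random(1)
n, rho_true = 3000, 0.5
x, y1, y2 = [], [], []
for _ in range(n):
    xi = float((rng.random() < 0.3) + (rng.random() < 0.3))  # allele count 0/1/2
    e1 = rng.gauss(0.0, 1.0)
    e2 = rho_true * e1 + math.sqrt(1.0 - rho_true ** 2) * rng.gauss(0.0, 1.0)
    x.append(xi)
    y1.append(1.0 + 0.3 * xi + math.exp(-0.2 + 0.3 * xi) * e1)  # heteroskedastic
    y2.append(1 if -0.3 + 0.4 * xi + e2 > 0 else 0)             # latent probit

b0, b1, g0, g1 = fit_hetero_normal(x, y1)
a0, a1 = fit_probit(x, y2)
rho_hat = fit_rho(x, y1, y2, (b0, b1, g0, g1, a0, a1))
print("mean slope:", b1, "log-sd slope:", g1, "probit slope:", a1, "rho:", rho_hat)
```

With more than two responses, stage 2 would sum such pairwise terms over all response pairs; because each term involves at most a bivariate distribution, the full multivariate likelihood never has to be evaluated, which is where the computational savings of a composite likelihood come from.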

Keywords: heterogeneity; heteroskedasticity; multivariate statistics; precision medicine; prediction.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Genome-Wide Association Study* / methods
  • Genomics* / methods
  • Phenotype
  • Probability