Biclustering of medical monitoring data using a nonparametric hierarchical Bayesian model

Stat (Int Stat Inst). 2020;9(1):e279. doi: 10.1002/sta4.279. Epub 2020 Mar 15.

Abstract

In longitudinal studies in which a medical device is used to monitor an outcome repeatedly and frequently on the same patients over a prespecified duration, two clustering goals can arise. One goal is to assess the degree of heterogeneity among patient profiles. A second, equally important goal unique to such studies is to determine the frequency and duration of monitoring sufficient to identify longitudinal changes. Considering these goals jointly identifies clusters of patients who share similar patterns over time and characterizes temporal stability within each cluster. We use a biclustering approach based on a nonparametric hierarchical Bayesian model, which allows simultaneous clustering of observations at both the patient and time levels. Because clustering units at the time level (i.e., time points) are ordered and hence not exchangeable, we utilize a multivariate Dirichlet process mixture model, specifying a Dirichlet process prior at the patient level whose base measure employs change points at the time level to achieve the desired joint clustering. We consider structured covariance between consecutive time points and assess model performance through simulation studies. We apply the model to data on 24-hr ambulatory blood pressure monitoring and examine the relationship between diastolic blood pressure and pediatric obstructive sleep apnoea.
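
The joint clustering structure described above can be illustrated with a small generative sketch. This is not the authors' model or code: it assumes a Chinese restaurant process as the partition induced by the Dirichlet process prior, a piecewise-constant mean with independent Bernoulli change points as the base measure, and independent Gaussian noise in place of the structured covariance mentioned in the abstract. All function names, parameters, and numeric values (for example the mean level of 70 and the change probability of 0.15) are hypothetical and chosen only for illustration.

```python
import random


def crp_assignments(n_patients, alpha, rng):
    """Draw patient cluster labels from a Chinese restaurant process."""
    labels, counts = [], []
    for _ in range(n_patients):
        # An existing cluster k has weight counts[k]; a new cluster has weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)  # open a new cluster
        else:
            counts[k] += 1
        labels.append(k)
    return labels


def draw_cluster_profile(n_times, p_change, rng):
    """Draw one cluster's mean profile: constant between change points."""
    # Each boundary between consecutive time points is a change point
    # with probability p_change (an illustrative assumption).
    change_points = [t for t in range(1, n_times) if rng.random() < p_change]
    means = []
    level = rng.gauss(70.0, 10.0)  # hypothetical starting level
    for t in range(n_times):
        if t in change_points:
            level = rng.gauss(70.0, 10.0)  # new level for the next segment
        means.append(level)
    return change_points, means


def simulate(n_patients=30, n_times=24, alpha=1.0, p_change=0.15,
             noise_sd=3.0, seed=0):
    """Simulate a patients-by-time matrix with patient and time clusters."""
    rng = random.Random(seed)
    labels = crp_assignments(n_patients, alpha, rng)
    n_clusters = max(labels) + 1
    profiles = [draw_cluster_profile(n_times, p_change, rng)
                for _ in range(n_clusters)]
    # Independent noise is a simplification of the structured covariance.
    data = [[rng.gauss(profiles[z][1][t], noise_sd) for t in range(n_times)]
            for z in labels]
    return labels, profiles, data


if __name__ == "__main__":
    labels, profiles, data = simulate()
    print("patient clusters:", max(labels) + 1)
    for k, (cps, _) in enumerate(profiles):
        print(f"cluster {k}: change points at {cps}")
```

In this sketch, patients sharing a label share one mean profile, and the change points of that profile partition the ordered time points into contiguous segments, which is the sense in which both dimensions are clustered at once. Fitting such a model to data would require posterior inference (for example by Markov chain Monte Carlo), which the sketch does not attempt.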

Keywords: Markov chain Monte Carlo; biostatistics; clustering; longitudinal data; multivariate analysis; nonparametric methods.