BiMM forest: A random forest method for modeling clustered and longitudinal binary outcomes

Chemometr Intell Lab Syst. 2019 Feb 15;185:122-134. doi: 10.1016/j.chemolab.2019.01.002. Epub 2019 Jan 11.

Abstract

Clustered binary outcomes and datasets with many predictor variables are frequently encountered in clinical research (e.g., longitudinal studies). Generalized linear mixed models (GLMMs), which are typically employed for clustered endpoints, face challenges in some scenarios, particularly for complex datasets that contain many interactions among predictors and predictors that are nonlinearly related to the outcome. We propose a new method called Binary Mixed Model (BiMM) forest, which combines random forest and GLMM methodology. BiMM forest is a flexible and stable method that naturally models interactions among predictors and can be applied to clustered data. Simulation studies show that BiMM forest achieves similar or superior prediction accuracy compared to standard random forest, GLMMs, and its tree counterpart (BiMM tree) for clustered binary outcomes. The method is applied to a real dataset from the Acute Liver Failure Study Group. BiMM forest thus offers an alternative method for modeling clustered binary outcomes that may be applied in many research settings.
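
To make the idea of combining a random forest with a GLMM concrete, the contrast can be sketched in model form. The notation below is an illustrative assumption and not the paper's exact specification: y_{ij} is the binary outcome for observation j in cluster i, x_{ij} and z_{ij} are fixed- and random-effect covariates, b_i is the cluster-level random effect with covariance D, and f stands for a generic forest-based function of the predictors.

```latex
% Illustrative notation only (assumed, not taken from the paper).
\begin{align*}
\text{GLMM:}\quad
  & \operatorname{logit} P(y_{ij} = 1 \mid b_i)
    = x_{ij}^{\top}\beta + z_{ij}^{\top} b_i,
  \qquad b_i \sim N(0, D) \\
\text{Forest-based mixed model:}\quad
  & \operatorname{logit} P(y_{ij} = 1 \mid b_i)
    = f(x_{ij}) + z_{ij}^{\top} b_i,
  \qquad b_i \sim N(0, D)
\end{align*}
```

In this sketch, the linear term x_{ij}^{\top}\beta of the GLMM is replaced by a forest-based component f(x_{ij}), which can capture interactions and nonlinear effects without specifying them in advance, while the random-effect term z_{ij}^{\top} b_i continues to account for clustering.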

Keywords: clustered data; longitudinal data; mixed effects; random forest.