Random survival forests for dynamic predictions of a time-to-event outcome using a longitudinal biomarker

Kaci L Pickett; Krithika Suresh; Kristen R Campbell; Scott Davis; Elizabeth Juarez-Colunga

doi:10.1186/s12874-021-01375-x

BMC Med Res Methodol. 2021 Oct 17;21(1):216. doi: 10.1186/s12874-021-01375-x.

Authors

Kaci L Pickett¹, Krithika Suresh²,³, Kristen R Campbell¹, Scott Davis⁴, Elizabeth Juarez-Colunga¹,⁵

Affiliations

¹ Department of Pediatrics, University of Colorado Anschutz Medical Campus, Aurora, 80045, Colorado, USA.
² Department of Biostatistics and Informatics, University of Colorado Anschutz Medical Campus, Aurora, 80045, Colorado, USA.
³ Adult and Child Consortium for Health Outcomes and Delivery Science, University of Colorado Anschutz Medical Campus, Aurora, 80045, Colorado, USA.
⁴ Division of Renal Diseases and Hypertension, University of Colorado Anschutz Medical Campus, Aurora, 80045, Colorado, USA.
⁵ Department of Biostatistics and Informatics, University of Colorado Anschutz Medical Campus, Aurora, 80045, Colorado, USA.

Abstract

Background: Risk prediction models for time-to-event outcomes play a vital role in personalized decision-making. A patient's biomarker values, such as medical lab results, are often measured over time but traditional prediction models ignore their longitudinal nature, using only baseline information. Dynamic prediction incorporates longitudinal information to produce updated survival predictions during follow-up. Existing methods for dynamic prediction include joint modeling, which often suffers from computational complexity and poor performance under misspecification, and landmarking, which has a straightforward implementation but typically relies on a proportional hazards model. Random survival forests (RSF), a machine learning algorithm for time-to-event outcomes, can capture complex relationships between the predictors and survival without requiring prior specification and has been shown to have superior predictive performance.

Methods: We propose an alternative approach for dynamic prediction using random survival forests in a landmarking framework. With a simulation study, we compared the predictive performance of our proposed method with Cox landmarking and joint modeling in situations where the proportional hazards assumption does not hold and the longitudinal marker(s) have a complex relationship with the survival outcome. We illustrated the use of the RSF landmark approach in two clinical applications to assess the performance of various RSF model building decisions and to demonstrate its use in obtaining dynamic predictions.
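The abstract does not spell out how the landmark data are assembled, so the following is only a minimal sketch of one common landmarking construction, not the authors' implementation. It assumes that the most recent biomarker value at or before the landmark time is carried forward, that only subjects still at risk at the landmark are kept, and that follow-up is administratively censored at a fixed prediction horizon. The function name, the record layout, and the `landmark` and `horizon` parameters are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical record layout (an assumption, not taken from the paper):
#   measurements: subject id -> list of (measurement_time, biomarker_value)
#   outcomes:     subject id -> (follow_up_time, event_indicator),
#                 where 1 = event observed and 0 = censored
Measurements = Dict[str, List[Tuple[float, float]]]
Outcomes = Dict[str, Tuple[float, int]]


def build_landmark_dataset(
    measurements: Measurements,
    outcomes: Outcomes,
    landmark: float,
    horizon: float,
) -> List[dict]:
    """Return one row per subject still at risk at the landmark time."""
    rows = []
    for subject, (time, event) in outcomes.items():
        # Keep only subjects who are still event-free and under follow-up.
        if time <= landmark:
            continue

        # Use only biomarker values measured at or before the landmark,
        # so that no future information leaks into the predictors.
        history = [(t, v) for t, v in measurements.get(subject, []) if t <= landmark]
        if not history:
            continue  # no usable measurement yet (assumed: drop the subject)

        # Assumed summary: last observation carried forward.
        _, last_value = max(history, key=lambda tv: tv[0])

        # Reset the clock at the landmark and censor at the horizon.
        if time > landmark + horizon:
            time_from_landmark, event_in_window = horizon, 0
        else:
            time_from_landmark, event_in_window = time - landmark, event

        rows.append(
            {
                "subject": subject,
                "biomarker": last_value,
                "time": time_from_landmark,
                "event": event_in_window,
            }
        )
    return rows
```

Under these assumptions, the rows for a given landmark time could be passed to any random survival forest implementation, with the same construction repeated at each landmark time of interest.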

Results: In simulation studies, RSF landmarking outperformed joint modeling and Cox landmarking when a complex relationship between the survival and longitudinal marker processes was present. It was also useful in application when there were several predictors for which the clinical relevance was unknown and multiple longitudinal biomarkers were present. Individualized dynamic predictions can be obtained from this method, and the variable importance metric is useful for examining the changing predictive power of variables over time. In addition, RSF landmarking is easily implementable in standard software and, using suggested specifications, requires less computation time than joint modeling.

Conclusions: RSF landmarking is a nonparametric, machine learning alternative to current methods for obtaining dynamic predictions when there are complex or unknown relationships present. It requires little upfront decision-making, has comparable predictive performance, and offers preferable computational speed.

Keywords: Area under the curve; Joint modeling; Landmarking; Prediction accuracy; Variable importance.

Publication types

Research Support, N.I.H., Extramural

MeSH terms

Algorithms*
Biomarkers
Computer Simulation
Humans
Machine Learning*
Proportional Hazards Models

Substances

Biomarkers

Grants and funding

UL1 TR002535/TR/NCATS NIH HHS/United States