High dimensional predictions of suicide risk in 4.2 million US Veterans using ensemble transfer learning

Sci Rep. 2024 Jan 20;14(1):1793. doi: 10.1038/s41598-024-51762-9.

Abstract

We present an ensemble transfer learning method to predict suicide from Veterans Affairs (VA) electronic medical records (EMR). A diverse set of base models was trained to predict a binary outcome constructed from reported suicide, suicide attempt, and overdose diagnoses with varying choices of study design and prediction methodology. Each model used twenty cross-sectional and 190 longitudinal variables observed in eight time intervals covering 7.5 years prior to the time of prediction. Ensembles of seven base models were created and fine-tuned with ten variables expected to change with study design and outcome definition in order to predict suicide and the combined outcome in a prospective cohort. The ensemble models achieved c-statistics of 0.73 on 2-year suicide risk and 0.83 on the combined outcome when predicting on a prospective cohort of 4.2 M veterans. The ensembles rely on nonlinear base models trained using a matched retrospective nested case-control (Rcc) study cohort and show good calibration across a diversity of subgroups, including risk strata, age, sex, race, and level of healthcare utilization. In addition, a linear Rcc base model provided a rich set of biological predictors, including indicators of suicide, substance use disorder, mental health diagnoses and treatments, hypoxia and vascular damage, and demographics.
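
For readers unfamiliar with the c-statistic reported above: for a binary outcome it is the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen non-case, with ties counted as one half. The sketch below is only an illustration of that definition on made-up toy data; the function name `c_statistic` and the example values are assumptions introduced here and are not taken from the study's code or data.

```python
from typing import Sequence


def c_statistic(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Concordance for a binary outcome (hypothetical helper, not the authors' code).

    Returns the fraction of (case, non-case) pairs in which the case has the
    higher predicted score; tied pairs contribute one half.
    """
    cases = [s for y, s in zip(y_true, y_score) if y == 1]
    non_cases = [s for y, s in zip(y_true, y_score) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")

    concordant = 0.0
    for case_score in cases:
        for non_case_score in non_cases:
            if case_score > non_case_score:
                concordant += 1.0   # pair ranked correctly
            elif case_score == non_case_score:
                concordant += 0.5   # tie counts as half
    return concordant / (len(cases) * len(non_cases))


# Toy data, invented purely for illustration.
labels = [0, 0, 1, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.20, 0.80, 0.70]
toy_c = c_statistic(labels, scores)  # 8 of the 9 pairs are concordant
```

This pairwise loop is quadratic in cohort size, so it only suits small examples; on a cohort of millions one would presumably use a rank-based computation instead.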

MeSH terms

  • Carcinoma, Renal Cell*
  • Cross-Sectional Studies
  • Humans
  • Kidney Neoplasms*
  • Machine Learning
  • Prospective Studies
  • Retrospective Studies
  • Suicide, Attempted
  • Veterans* / psychology