Objective: To construct practically applicable models for predicting the risk of suicidal behavior among individuals with depression, focusing on the progression from no history of suicidal behavior to suicide attempts and from suicidal ideation to suicide attempts.
Methods: Based on a prospective cohort from the UK Biobank, a total of 55,139 individuals aged 50 and above with depression were enrolled in the study, among whom 29,528 exhibited suicidal behavior. Participants were divided into control (25,611), suicidal ideation (24,361), and suicide attempt (5,167) groups. Least absolute shrinkage and selection operator (LASSO) regression was used to identify a subset of important features for distinguishing suicidal ideation and suicide attempts. We used the Gradient Boosting Decision Tree (GBDT) algorithm with stratified 10-fold cross-validation and grid search to construct the prediction models for suicidal ideation and suicide attempts. To address the class imbalance in classifying suicide attempts, we used random under-sampling. SHapley Additive exPlanations (SHAP) values were used to estimate the importance of each variable in the GBDT model.
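As a rough illustration of two of the steps named above, the following sketch shows random under-sampling followed by a stratified 10-fold split. It is a minimal, standard-library-only sketch and not the study's code: the function names `random_undersample` and `stratified_folds`, the seeds, and the toy data are all assumptions made for illustration, and the abstract does not say whether under-sampling was applied before or within cross-validation.

```python
import random
from collections import defaultdict


def random_undersample(rows, labels, seed=0):
    """Randomly drop samples so every class is as large as the rarest class.

    Hypothetical helper for illustration; not taken from the study.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    n_min = min(len(idxs) for idxs in by_class.values())
    kept = []
    for idxs in by_class.values():
        # Sample without replacement from each class.
        kept.extend(rng.sample(idxs, n_min))
    kept.sort()
    return [rows[i] for i in kept], [labels[i] for i in kept]


def stratified_folds(labels, k=10, seed=0):
    """Assign sample indices to k folds with similar class proportions.

    Hypothetical helper for illustration; not taken from the study.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        # Deal each class out round-robin so every fold gets its share.
        for position, idx in enumerate(idxs):
            folds[position % k].append(idx)
    return folds


if __name__ == "__main__":
    # Synthetic toy labels (0 = majority class, 1 = minority class);
    # these are NOT the cohort's data.
    toy_labels = [0] * 250 + [1] * 50
    toy_rows = list(range(len(toy_labels)))
    X, y = random_undersample(toy_rows, toy_labels, seed=1)
    folds = stratified_folds(y, k=10, seed=1)
    print(len(X), [len(f) for f in folds])
```

In practice, under-sampling is often applied only to the training portion of each fold so that the held-out fold keeps the original class distribution; whether the study did so is not stated here.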
Results: Significant differences in sociodemographic, economic, lifestyle, and psychological factors were observed across the three groups. Each classifier performed best with 8 to 11 features. Overall, the algorithms predicting suicide attempts demonstrated slightly higher performance than those predicting suicidal ideation. The GBDT classifier achieved the best performance, with area under the receiver operating characteristic curve (AUROC) scores of 0.914 for suicide attempts and 0.803 for suicidal ideation. Distinctive predictive factors were identified for each group: while the inherent characteristics of depression were crucial in distinguishing the suicidal ideation group from controls, key predictors identified for suicide attempts included the age of depression onset and childhood trauma events.
Conclusions: We established applicable machine learning-based models for predicting suicidal behavior, particularly suicide attempts, in individuals with depression, and clarified the differences in predictors between suicidal ideation and suicide attempts.
Keywords: Depression; machine learning; prevention; risk factors; suicide.
© The Author(s) 2024.