Objective: To evaluate whether a machine learning approach can accurately predict antidepressant treatment outcome using electronic health records (EHRs) from patients with depression.
Method: This study examined 808 patients with depression at a New York City-based outpatient mental health clinic between June 13, 2016 and June 22, 2020. Antidepressant treatment outcome was defined by the trend in depression symptom severity over time, measured as the slope of each patient's Patient Health Questionnaire-9 (PHQ-9) score trajectory over the 6 months following treatment initiation, and was categorized as either "Recovering" or "Worsening" (i.e., non-Recovering). A patient was designated as "Recovering" if the slope was less than 0 and as "Worsening" if the slope was greater than or equal to 0. Multiple machine learning (ML) models, including L2 norm regularized Logistic Regression, Naive Bayes, Random Forest, and Gradient Boosting Decision Tree (GBDT), were used to predict treatment outcome based on additional data from EHRs, including demographics and diagnoses. Shapley Additive Explanations were applied to identify the most important predictors.
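The labeling rule above can be illustrated with a minimal sketch. It assumes the slope is estimated by an ordinary least-squares line through one patient's PHQ-9 scores, and that the inputs have already been restricted to the 6 months following treatment initiation; the abstract does not state the fitting procedure, and the function name `label_outcome` and its arguments are hypothetical.

```python
import numpy as np


def label_outcome(days, phq9_scores):
    """Label one patient's PHQ-9 trajectory by the sign of its fitted slope.

    Assumption: the slope is the ordinary least-squares slope of score on
    time (in days since treatment initiation). The source does not specify
    the fitting method.
    """
    t = np.asarray(days, dtype=float)
    y = np.asarray(phq9_scores, dtype=float)
    if t.shape != y.shape:
        raise ValueError("days and phq9_scores must have the same length")

    t_centered = t - t.mean()
    denom = np.sum(t_centered ** 2)
    if t.size < 2 or denom == 0:
        raise ValueError("need at least two scores at distinct time points")

    # Closed-form least-squares slope: cov(t, y) / var(t).
    slope = np.sum(t_centered * (y - y.mean())) / denom

    # Rule from the text: slope < 0 is "Recovering", slope >= 0 is "Worsening".
    return "Recovering" if slope < 0 else "Worsening"
```

Under this rule a perfectly flat trajectory (slope of exactly 0) falls into "Worsening", consistent with the definition of "Worsening" as non-Recovering.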
Results: The GBDT achieved the best performance in predicting "Recovering" (AUC: 0.7654 ± 0.0227; precision: 0.6002 ± 0.0215; recall: 0.5131 ± 0.0336). When patients with low PHQ-9 scores (<10) at baseline were excluded, performance in predicting "Recovering" was lower (AUC: 0.7254 ± 0.0218; precision: 0.5392 ± 0.0437; recall: 0.4431 ± 0.0513). Prior diagnosis of anxiety, psychotherapy, recurrent depression, and baseline depression symptom severity were strong predictors.
Conclusions: The results demonstrate the potential utility of applying ML to longitudinal EHRs to predict antidepressant treatment outcome. Our predictive tool holds promise for accelerating personalized medical management in patients with psychiatric illnesses.
© 2023 The Authors. Psychiatric Research and Clinical Practice published by Wiley Periodicals LLC on behalf of American Psychiatric Association.