Development and Evaluation of Machine Learning Models for the Detection of Emergency Department Patients with Opioid Misuse from Clinical Notes

medRxiv [Preprint]. 2024 Dec 12:2024.12.11.24318875. doi: 10.1101/2024.12.11.24318875.

Abstract

Objectives: The accurate identification of Emergency Department (ED) encounters involving opioid misuse is critical for health services, research, and surveillance. We sought to develop natural language processing (NLP)-based models for the detection of ED encounters involving opioid misuse.

Methods: A sample of ED encounters enriched for opioid misuse was manually annotated, and the associated clinical notes were extracted. We evaluated classic machine learning (ML) methods, fine-tuning of publicly available pretrained language models, and a previously developed convolutional neural network opioid classifier designed for hospitalized patients (SMART-AI). Performance was compared with that of ICD-10-CM codes. Both raw text and text transformed to the Unified Medical Language System (UMLS) were evaluated. Face validity was assessed by term feature importance.
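As an illustration of the diagnosis-code comparator, the following is a minimal sketch of a code-based flag. The abstract does not list the ICD-10-CM codes that were used, so the prefix set below (F11 for opioid related disorders; T40.0 to T40.4 and T40.6 for opioid and narcotic poisoning) and the function names are assumptions made for this example only.

```python
# Hypothetical opioid-related ICD-10-CM prefixes, written without dots.
# Assumption: the study's actual code set is not given in the abstract.
OPIOID_CODE_PREFIXES = ("F11", "T400", "T401", "T402", "T403", "T404", "T406")


def normalize_code(code: str) -> str:
    """Uppercase, trim, and drop the dot so 'f11.20 ' becomes 'F1120'."""
    return code.strip().upper().replace(".", "")


def icd_flags_opioid_misuse(codes) -> bool:
    """Return True if any diagnosis code starts with an opioid-related prefix."""
    return any(normalize_code(c).startswith(OPIOID_CODE_PREFIXES) for c in codes)


if __name__ == "__main__":
    # Each list holds the diagnosis codes attached to one illustrative encounter.
    print(icd_flags_opioid_misuse(["J45.909", "F11.20"]))
    print(icd_flags_opioid_misuse(["F10.129"]))
```

A flag of this kind only sees what was coded at discharge, which is the limitation that motivates comparing it against models that read the clinical notes.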

Results: A total of 1123 encounters were used for training, validation, and testing. Of the classic ML methods, XGBoost had the highest area under the precision-recall curve (AU_PRC; 0.936), accuracy (0.887), and F1 score (0.863), outperforming ICD-10-CM codes [accuracy 0.870; F1 0.830]. Logistic regression, support vector machine, and XGBoost models had higher AU_PRC using transformed text, while decision trees performed better using raw text. Excluding XGBoost, fine-tuned pretrained language models outperformed classic ML methods. The best-performing model by AU_PRC was the fine-tuned SMART-AI-based model with domain adaptation [AU_PRC 0.948; accuracy 0.882; F1 0.851]. Explainability analyses showed that the most predictive terms were 'heroin', 'opioids', 'alcoholic intoxication, chronic', 'cocaine', 'opiates', and 'suboxone'.
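The three reported metrics can be computed from per-encounter labels and model scores roughly as follows. This is a sketch under stated assumptions: the abstract does not say how the area under the precision-recall curve was estimated, so the step-wise average-precision estimate is used here, tied scores are not specially handled, and the labels and scores in the usage section are invented for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of encounters whose predicted label matches the annotation."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def average_precision(y_true, scores):
    """Step-wise estimate of the area under the precision-recall curve.

    Encounters are ranked by descending score; precision is averaged over
    the ranks at which a true positive appears. Assumes no tied scores.
    """
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("average precision is undefined without positives")
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    tp = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            tp += 1
            total += tp / rank
    return total / n_pos


if __name__ == "__main__":
    # Invented toy data: 1 = opioid misuse, 0 = no opioid misuse.
    y_true = [1, 0, 1, 1, 0, 0]
    scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
    y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 is an arbitrary cutoff
    print(round(accuracy(y_true, y_pred), 3))
    print(round(f1_score(y_true, y_pred), 3))
    print(round(average_precision(y_true, scores), 3))
```

Accuracy and F1 depend on the chosen decision threshold, whereas the precision-recall area summarizes ranking quality across all thresholds, which is why a model can lead on one metric and trail on the others.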

Conclusions: NLP-based models outperformed ICD-10-CM diagnosis codes for the detection of ED encounters involving opioid misuse. Fine-tuning pretrained language models with domain adaptation improved performance.

Publication types

  • Preprint