Background: Stroke misdiagnosis, associated with poor outcomes, is estimated to occur in 9% of all stroke patients.
Objectives: We hypothesized that machine learning (ML) could assist in the diagnosis of ischemic stroke in emergency departments (EDs).
Design: The study was conducted and reported according to the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) guidelines. We performed model development and prospective temporal validation using data from pre- and post-COVID periods. We also performed a case study on a small cohort of previously misdiagnosed stroke patients.
Methods: We used structured and unstructured electronic health records (EHRs) of 56,452 patient encounters from 13 hospitals in Pennsylvania, from September 2003 to January 2021. ML pipelines, including natural language processing, were created using pre-event clinical data and provider notes from the EDs.
Results: Using pre-event information, our model's area under the receiver operating characteristic curve (AUROC) ranged from 0.88 to 0.92, with accuracy in a similar range (0.87-0.90). Using provider notes, we identified five models that reached a balanced performance in terms of AUROC, sensitivity, and specificity. Model AUROC ranged from 0.93 to 0.99. Model sensitivity and specificity reached 0.90 and 0.99, respectively. Four of the top five performing models were based on the post-COVID provider notes; however, no performance difference was observed between models tested on pre- and post-COVID data.
Conclusion: This study leveraged pre-event and encounter-level EHR data for stroke prediction. The results indicate that available clinical information can be used to build EHR-based stroke prediction models and ED stroke alert systems.
Keywords: artificial intelligence; emergency department; ischemic stroke; machine learning; natural language processing.
© The Author(s), 2024.