Trustworthiness of a machine learning early warning model in medical and surgical inpatients

JAMIA Open. 2025 Jan 6;8(1):ooae156. doi: 10.1093/jamiaopen/ooae156. eCollection 2025 Feb.

Abstract

Objectives: In general hospital wards, machine learning (ML)-based early warning systems (EWSs) can identify patients at risk of deterioration to facilitate rescue interventions. We assess subpopulation performance of an ML-based EWS on medical and surgical adult patients admitted to general hospital wards.

Materials and methods: We assessed the scores of an EWS integrated into the electronic health record and calculated every 15 minutes to predict a composite adverse event (AE): all-cause mortality, transfer to intensive care, cardiac arrest, or rapid response team evaluation. We calculated the distributions of the First Score 3 hours after admission, the Highest Score at any time during the hospitalization, and the Last Score just before an AE or dismissal without an AE. The Last Score was used to calculate the area under the receiver operating characteristic curve (ROC-AUC) and the area under the precision-recall curve (PRC-AUC).
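
As an illustration only, the score summaries and the subgroup ROC-AUC described above could be sketched as follows. This is a minimal sketch, not the study's code: the function names, the rule that the First Score is the first score at or after the 3-hour mark, and the toy data are all assumptions, and the PRC-AUC is left out for brevity.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Sequence


@dataclass
class ScoreSummary:
    first: Optional[float]  # first score at or after 3 hours (assumed rule)
    highest: float          # maximum score during the hospitalization
    last: float             # final score before an AE or dismissal


def summarize_scores(hours: Sequence[float], scores: Sequence[float],
                     first_at_hours: float = 3.0) -> ScoreSummary:
    """Summarize one admission's chronologically ordered scores.

    `hours` is the time since admission of each score. The series is
    assumed to be truncated at the AE or at dismissal, so the final
    entry is the Last Score.
    """
    if len(scores) == 0 or len(hours) != len(scores):
        raise ValueError("hours and scores must be non-empty and equal length")
    first = next((s for h, s in zip(hours, scores) if h >= first_at_hours), None)
    return ScoreSummary(first=first, highest=max(scores), last=scores[-1])


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the probability that a random admission with an AE
    scores higher than a random admission without one (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def roc_auc_by_group(groups: Sequence[str], labels: Sequence[int],
                     scores: Sequence[float]) -> Dict[str, float]:
    """Compute ROC-AUC separately for each subpopulation label."""
    result = {}
    for g in sorted(set(groups)):
        idx = [i for i, x in enumerate(groups) if x == g]
        result[g] = roc_auc([labels[i] for i in idx], [scores[i] for i in idx])
    return result


# Toy data (hypothetical, not from the study).
summary = summarize_scores([0.25, 3.0, 3.25, 10.0], [0.10, 0.30, 0.80, 0.55])
toy_groups = ["medical"] * 4 + ["surgical"] * 4
toy_labels = [1, 0, 1, 0, 1, 0, 1, 0]
toy_last_scores = [0.9, 0.2, 0.8, 0.1, 0.9, 0.2, 0.4, 0.5]
aucs = roc_auc_by_group(toy_groups, toy_labels, toy_last_scores)
```

The pairwise formulation is quadratic in the number of admissions; it is used here because it makes the definition explicit, and a rank-based implementation would be preferable at the scale of the cohort.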

Results: From August 23, 2021, to March 31, 2022, 35 937 medical admissions had 2173 (6.05%) AEs compared to 25 214 surgical admissions with 4984 (19.77%) AEs. Medical and surgical admissions had significantly different (P < .001) distributions of the First Score, Highest Score, and Last Score among those with an AE and without an AE. The model performed better in the medical group than in the surgical group: ROC-AUC 0.869 versus 0.677, and PRC-AUC 0.988 versus 0.878, respectively.
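
The reported AE percentages follow directly from the counts above, as the short check below shows. The variable and function names are illustrative assumptions; only the counts come from the text.

```python
# Admission and AE counts as reported in the Results.
cohort = {
    "medical": {"admissions": 35937, "events": 2173},
    "surgical": {"admissions": 25214, "events": 4984},
}


def ae_rate_percent(admissions: int, events: int) -> float:
    """Return the AE rate as a percentage rounded to two decimals."""
    return round(100 * events / admissions, 2)


rates = {name: ae_rate_percent(c["admissions"], c["events"])
         for name, c in cohort.items()}
# rates == {"medical": 6.05, "surgical": 19.77}
```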

Discussion: Heterogeneity between medical and surgical patients can significantly impact the performance of an ML-based EWS, changing the model validity and clinical discernment.

Conclusions: Characterization of the target patient subpopulations has clinical implications and should be considered when developing models to be used in general hospital wards.

Keywords: clinical decision support systems; early warning scores; hospital medicine; hospital surgery; machine learning.