Machine-learning based prediction of appendicitis for patients presenting with acute abdominal pain at the emergency department

Anoeska Schipper; Peter Belgers; Rory O'Connor; Kim Ellis Jie; Robin Dooijes; Joeran Sander Bosma; Steef Kurstjens; Ron Kusters; Bram van Ginneken; Matthieu Rutten

doi:10.1186/s13017-024-00570-7

World J Emerg Surg. 2024 Dec 23;19(1):40. doi: 10.1186/s13017-024-00570-7.

Authors

Anoeska Schipper¹ ² ³, Peter Belgers¹ ², Rory O'Connor⁴, Kim Ellis Jie⁴, Robin Dooijes⁴, Joeran Sander Bosma¹, Steef Kurstjens³ ⁵, Ron Kusters³ ⁶, Bram van Ginneken¹, Matthieu Rutten⁷ ⁸

Affiliations

¹ Diagnostic Image Analysis Group, Department of Medical Imaging, Radboud University Medical Center, Nijmegen, the Netherlands.
² Department of Radiology, Jeroen Bosch Hospital, 's Hertogenbosch, the Netherlands.
³ Laboratory of Clinical Chemistry and Hematology, Jeroen Bosch Hospital, 's Hertogenbosch, the Netherlands.
⁴ Emergency Department, Jeroen Bosch Hospital, 's Hertogenbosch, the Netherlands.
⁵ Laboratory of Clinical Chemistry and Laboratory Medicine, Dicoon BV, location Canisius Wilhelmina Hospital, Nijmegen, the Netherlands.
⁶ Department of Health Technology and Services Research, Technical Medical Centre, University of Twente, Enschede, the Netherlands.
⁷ Diagnostic Image Analysis Group, Department of Medical Imaging, Radboud University Medical Center, Nijmegen, the Netherlands.
⁸ Department of Radiology, Jeroen Bosch Hospital, 's Hertogenbosch, the Netherlands.

Abstract

Background: Acute abdominal pain (AAP) constitutes 5-10% of all emergency department (ED) visits, with appendicitis being a prevalent AAP etiology often necessitating surgical intervention. The variability in AAP symptoms and causes, combined with the challenge of identifying appendicitis, complicates timely intervention. To estimate the risk of appendicitis, scoring systems such as the Alvarado score have been developed. However, diagnostic errors and delays remain common. Although various machine learning (ML) models have been proposed to enhance appendicitis detection, none have been seamlessly integrated into ED workflows for AAP or specifically designed to diagnose appendicitis as early as possible within the clinical decision-making process. To mimic daily clinical practice, this proof-of-concept study aims to develop ML models that support decision-making by using the comprehensive clinical data available up to key decision points in the ED workflow to detect appendicitis in patients presenting with AAP.

Methods: Data from the Dutch triage system at the ED, vital signs, complete medical history, physical examination findings, and routine laboratory test results were retrospectively extracted from 350 AAP patients presenting to the ED of a Dutch teaching hospital from 2016 to 2023. Two eXtreme Gradient Boosting ML models were developed to differentiate cases with appendicitis from other AAP causes: one model used all data up to and including physical examination, and the other was extended with routine laboratory test results. The performance of both models was evaluated on a validation set (n = 68) and compared to the Alvarado scoring system as well as three ED physicians in a reader study.
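
The two-model design described above amounts to restricting each model to the variables available at a given decision point in the ED workflow. The following is a minimal sketch of that idea only. All feature names, the stage grouping, and the helper functions are hypothetical illustrations, since the abstract does not list the study's actual variables, and the gradient boosting step itself is not shown.

```python
# Hypothetical stage -> feature mapping, in the order the data become
# available in the ED workflow. The names are illustrative assumptions,
# not the variables used in the study.
STAGES = [
    ("triage", ["triage_urgency", "age", "sex"]),
    ("vital_signs", ["heart_rate", "temperature"]),
    ("history", ["pain_migration", "nausea"]),
    ("physical_exam", ["rlq_tenderness", "rebound_tenderness"]),
    ("laboratory", ["crp", "leukocytes"]),
]


def features_up_to(last_stage):
    """Return the feature names available up to and including last_stage."""
    names = []
    for stage, stage_features in STAGES:
        names.extend(stage_features)
        if stage == last_stage:
            return names
    raise ValueError(f"unknown stage: {last_stage!r}")


def select_features(record, last_stage):
    """Keep only the features a model at this decision point may see.

    Missing values are kept as None so that a downstream model
    can decide how to handle them.
    """
    return {name: record.get(name) for name in features_up_to(last_stage)}


# One hypothetical patient record.
patient = {
    "triage_urgency": 2, "age": 27, "sex": "F",
    "heart_rate": 96, "temperature": 37.9,
    "pain_migration": True, "nausea": True,
    "rlq_tenderness": True, "rebound_tenderness": False,
    "crp": 48.0, "leukocytes": 14.2,
}

without_lab = select_features(patient, "physical_exam")  # input for the first model
with_lab = select_features(patient, "laboratory")        # input for the second model
```

Under this sketch, the first model would be trained on inputs shaped like `without_lab` and the second on inputs shaped like `with_lab`, so the earlier model never sees laboratory values.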

Results: The ML models achieved AUROCs of 0.919 without laboratory test results and 0.923 with the addition of laboratory test results. The Alvarado scoring system attained an AUROC of 0.824. ED physicians achieved AUROCs of 0.894, 0.826, and 0.791 without laboratory test results, increasing to AUROCs of 0.923, 0.892, and 0.859 with laboratory test results.
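
For readers less familiar with the metric, the AUROC is the probability that a randomly chosen appendicitis case receives a higher predicted score than a randomly chosen non-appendicitis case, with ties counted as one half. The sketch below computes it by direct pairwise comparison. The labels and scores are made-up illustrative values, not study data, and the function is not the evaluation code used by the authors.

```python
def auroc(labels, scores):
    """AUROC by pairwise comparison of positive and negative cases.

    labels: iterable of 0/1 (1 = appendicitis)
    scores: iterable of predicted risks or ratings, higher = more suspicious
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(positives) * len(negatives))


# Made-up example: 3 appendicitis cases and 3 other AAP causes.
example_labels = [1, 1, 0, 0, 1, 0]
example_scores = [0.9, 0.7, 0.4, 0.8, 0.6, 0.2]
example_auroc = auroc(example_labels, example_scores)  # 7 of 9 pairs ranked correctly
```

The same function applies to model probabilities, Alvarado scores, or physician ratings, since only the ranking of the scores matters.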

Conclusions: Both ML models demonstrated comparably high accuracy in predicting appendicitis in patients with AAP, outperforming the Alvarado scoring system. The ML models matched or surpassed ED physician performance in detecting appendicitis, with the largest potential performance gain observed in the absence of laboratory test results. Integrating these models into the ED workflow could assist ED physicians in the early and accurate diagnosis of appendicitis.

Keywords: Acute abdominal pain; Appendicitis; Artificial intelligence; Clinical decision support; Diagnostic follow-up; Emergency department; Machine learning.

MeSH terms

Abdominal Pain / etiology
Adult
Appendicitis* / diagnosis
Emergency Service, Hospital*
Female
Humans
Machine Learning*
Male
Middle Aged
Netherlands
Retrospective Studies
Triage / methods

Grants and funding

LSHM20103/Health~Holland, the Netherlands