Validation of a machine learning algorithm for early severe sepsis prediction: a retrospective study predicting severe sepsis up to 48 h in advance using a diverse dataset from 461 US hospitals

BMC Med Inform Decis Mak. 2020 Oct 27;20(1):276. doi: 10.1186/s12911-020-01284-x.

Abstract

Background: Severe sepsis and septic shock are among the leading causes of death in the United States, and sepsis remains one of the most expensive conditions to diagnose and treat. Accurate early diagnosis and treatment can reduce the risk of adverse patient outcomes, but the efficacy of traditional rule-based screening methods is limited. The purpose of this study was to develop and validate a machine learning algorithm (MLA) for severe sepsis prediction up to 48 h before onset using a diverse patient dataset.

Methods: Retrospective analysis was performed on de-identified electronic health records collected between 2001 and 2017: 510,497 inpatient and emergency encounters from 461 health centers (2001 to 2015) and 20,647 inpatient and emergency encounters from a community hospital (2017). MLA performance was compared to commonly used disease severity scoring systems and was evaluated at 0, 4, 6, 12, 24, and 48 h prior to severe sepsis onset.
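
One way to read "evaluated at 0, 4, 6, 12, 24, and 48 h prior to onset" is that, for each lookahead window, the model is shown only the measurements recorded up to that many hours before onset. The abstract does not describe the actual data pipeline, so the following is a minimal sketch under that assumption; the function name `observations_available` and the `(hour, value)` record layout are hypothetical.

```python
# Lookahead windows (hours before onset) listed in the Methods.
LOOKAHEAD_HOURS = (0, 4, 6, 12, 24, 48)


def observations_available(observations, onset_hour, lookahead_h):
    """Return the observations recorded at or before (onset - lookahead).

    observations: list of (hour, value) tuples (hypothetical layout).
    onset_hour:   hour at which severe sepsis onset was labeled.
    lookahead_h:  how many hours before onset the prediction is made.
    """
    cutoff = onset_hour - lookahead_h
    return [(hour, value) for hour, value in observations if hour <= cutoff]


# Toy heart-rate series for one encounter with onset labeled at hour 60.
heart_rate = [(0, 80), (10, 95), (50, 120)]
windows = {
    h: observations_available(heart_rate, onset_hour=60, lookahead_h=h)
    for h in LOOKAHEAD_HOURS
}
```

Under this reading, longer lookahead windows leave the model with less data, which is consistent with the lower AUROC reported at 48 h than at onset.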

Results: 270,438 patients were included in the analysis. At time of onset, the MLA demonstrated an AUROC of 0.931 (95% CI 0.914, 0.948) and a diagnostic odds ratio (DOR) of 53.105 on a testing dataset, exceeding MEWS (0.725; P < .001; DOR 4.358), SOFA (0.716; P < .001; DOR 3.720), and SIRS (0.655; P < .001; DOR 3.290). For prediction 48 h prior to onset, the MLA achieved an AUROC of 0.827 (95% CI 0.806, 0.848) on a testing dataset. On an external validation dataset, the MLA achieved an AUROC of 0.948 (95% CI 0.942, 0.954) at the time of onset, and 0.752 at 48 h prior to onset.
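
For readers unfamiliar with the two metrics reported above, the sketch below shows their standard definitions: the DOR is (TP × TN) / (FP × FN) from a confusion matrix at a chosen threshold, and the AUROC equals the probability that a randomly chosen positive case scores higher than a randomly chosen negative one. This is an illustration with made-up toy numbers, not the study's evaluation code; the function names are assumptions.

```python
def diagnostic_odds_ratio(tp, fp, fn, tn):
    """DOR = (TP / FN) / (FP / TN) = (TP * TN) / (FP * FN)."""
    if fp == 0 or fn == 0:
        raise ValueError("DOR is undefined when FP or FN is zero")
    return (tp * tn) / (fp * fn)


def auroc(labels, scores):
    """AUROC via pairwise comparison (Mann-Whitney U); ties count as half.

    labels: 1 for a positive (e.g. severe sepsis) case, 0 otherwise.
    scores: model risk scores, higher meaning more likely positive.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy inputs, not study data.
toy_dor = diagnostic_odds_ratio(tp=90, fp=10, fn=10, tn=90)
toy_auc = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The pairwise loop is quadratic in the number of cases, which is fine for illustration; a rank-based computation would be preferable on datasets of the size reported here.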

Conclusions: The MLA accurately predicts severe sepsis onset up to 48 h in advance using only readily available vital signs extracted from existing patient electronic health records. Relevant implications for clinical practice include improved patient outcomes from early severe sepsis detection and treatment.

Keywords: Diagnostic; Machine learning algorithm; Sepsis prediction; Severe sepsis.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms
  • Datasets as Topic
  • Decision Support Systems, Clinical*
  • Female
  • Forecasting
  • Hospital Mortality
  • Humans
  • Intensive Care Units
  • Machine Learning / standards*
  • Male
  • Predictive Value of Tests
  • Reproducibility of Results
  • Retrospective Studies
  • Sepsis / diagnosis*
  • Sepsis / mortality
  • Severity of Illness Index
  • Time Factors
  • Time-to-Treatment