Utilizing Machine Learning Methods for Preoperative Prediction of Postsurgical Mortality and Intensive Care Unit Admission

Ann Surg. 2020 Dec;272(6):1133-1139. doi: 10.1097/SLA.0000000000003297.

Abstract

Objective: To compare the performance of machine learning models against the traditionally derived Combined Assessment of Risk Encountered in Surgery (CARES) model and the American Society of Anaesthesiologists-Physical Status (ASA-PS) classification in predicting 30-day postsurgical mortality and the need for an intensive care unit (ICU) stay of more than 24 hours.

Background: Preoperative prediction of surgical risk is important for shared clinical decision-making and for planning health resources such as ICU beds. The current growth of electronic medical records, coupled with machine learning, presents an opportunity to improve the performance of established risk models.

Methods: All patients aged 18 years and above who underwent noncardiac and nonneurological surgery at Singapore General Hospital (SGH) between 1 January 2012 and 31 October 2016 were included. Patient demographics, comorbidities, preoperative laboratory results, and surgery details were obtained from their electronic medical records. Seventy percent of the observations were randomly selected for training, leaving 30% for testing. Baseline models were CARES and ASA-PS. Candidate models were trained using random forest, adaptive boosting, gradient boosting, and support vector machine. Models were evaluated on area under the receiver operating characteristic curve (AUROC) and area under the precision-recall curve (AUPRC).
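The 70/30 random split described in the Methods can be sketched in a few lines. This is a minimal illustration only, not the authors' code: the function name, the seed, and the use of integer row indices are assumptions, and the abstract does not state whether the split was stratified by outcome.

```python
import random


def split_70_30(n_rows, seed=0):
    """Randomly assign 70% of row indices to training and the rest to testing.

    Hypothetical helper: the seed and the index-based interface are
    illustrative assumptions, not details taken from the study.
    """
    rng = random.Random(seed)
    indices = list(range(n_rows))
    rng.shuffle(indices)
    cut = n_rows * 7 // 10  # integer arithmetic avoids float rounding surprises
    return sorted(indices[:cut]), sorted(indices[cut:])


# Cohort size as reported in the Results (90,785 patients).
train_idx, test_idx = split_70_30(90785)
print(len(train_idx), len(test_idx))
```

With outcomes this rare, a purely random split can leave the test set with slightly different event rates from the training set, which is one reason a stratified split is often considered in practice.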

Results: A total of 90,785 patients were included, of whom 539 (0.6%) died within 30 days and 1264 (1.4%) required ICU admission for more than 24 hours postoperatively. Baseline models achieved high AUROCs despite poor sensitivity, because they predicted all cases as negative in a predominantly negative dataset. Gradient boosting was the best-performing model, with AUPRCs of 0.23 and 0.38 for the mortality and ICU admission outcomes, respectively.
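To put the reported AUPRCs in context, it helps to compare them with the outcome prevalence, since a classifier that ranks patients at random has an expected AUPRC roughly equal to the proportion of positive cases. The arithmetic below is a sketch using the counts from the whole cohort; the abstract does not report this comparison, and the prevalence in the 30% test set may differ slightly.

```python
# Counts reported in the abstract (whole cohort, not the test set alone).
n_patients = 90785
n_deaths = 539
n_icu = 1264

mortality_prevalence = n_deaths / n_patients  # about 0.6%
icu_prevalence = n_icu / n_patients           # about 1.4%

# AUPRCs reported for the gradient boosting model.
auprc_mortality = 0.23
auprc_icu = 0.38

# Ratio of reported AUPRC to the approximate random-ranking baseline.
# Assumption: the baseline AUPRC is taken to equal the prevalence.
lift_mortality = auprc_mortality / mortality_prevalence
lift_icu = auprc_icu / icu_prevalence

print(f"mortality: prevalence {mortality_prevalence:.4f}, lift {lift_mortality:.1f}x")
print(f"ICU >24h:  prevalence {icu_prevalence:.4f}, lift {lift_icu:.1f}x")
```

Under these assumptions, AUPRC values that look low in absolute terms still sit well above what random ranking would give for outcomes this rare.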

Conclusions: Machine learning can be used to improve surgical risk prediction compared to traditional risk calculators. AUPRC should be used to evaluate model predictive performance instead of AUROC when the dataset is imbalanced.
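The contrast between AUROC and AUPRC on imbalanced data can be illustrated with a small synthetic example. This is a sketch under stated assumptions, not a reproduction of the study: the scores are artificial (evenly spaced normal quantiles, with positives shifted by 2 standard deviations), the class sizes are chosen to give a prevalence of about 0.6%, and average precision is used as one common estimator of AUPRC, which may not be the method the authors used.

```python
from statistics import NormalDist


def auroc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values at the rank of each positive case.

    Simplification: tied scores are not grouped into a single threshold.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    true_pos, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            true_pos += 1
            total += true_pos / rank
    return total / true_pos


# Synthetic, deterministic scores (illustrative assumption, not study data).
std_normal = NormalDist()
n_neg, n_pos = 10000, 60
neg_scores = [std_normal.inv_cdf((i + 0.5) / n_neg) for i in range(n_neg)]
pos_scores = [2.0 + std_normal.inv_cdf((j + 0.5) / n_pos) for j in range(n_pos)]

labels = [0] * n_neg + [1] * n_pos
scores = neg_scores + pos_scores

prevalence = n_pos / (n_neg + n_pos)
roc = auroc(labels, scores)
ap = average_precision(labels, scores)
print(f"prevalence={prevalence:.4f}  AUROC={roc:.3f}  average precision={ap:.3f}")
```

In this setup the AUROC is above 0.9 while average precision is far lower, because even a small false-positive rate among 10,000 negatives produces many more false alarms than there are true positives to find.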

Publication types

  • Comparative Study

MeSH terms

  • Adult
  • Aged
  • Female
  • Hospitalization / statistics & numerical data*
  • Humans
  • Intensive Care Units*
  • Machine Learning*
  • Male
  • Middle Aged
  • Postoperative Complications / mortality*
  • Preoperative Period
  • Prognosis
  • Retrospective Studies
  • Risk Assessment
  • Time Factors