Clinical utility of a deep-learning mortality prediction model for cardiac surgery decision making

J Thorac Cardiovasc Surg. 2023 Dec;166(6):e567-e578. doi: 10.1016/j.jtcvs.2023.01.022. Epub 2023 Feb 2.

Abstract

Objectives: This study used decision curve analysis (DCA) to evaluate the clinical utility of a deep-learning mortality prediction model for cardiac surgery decision making, compared with the European System for Cardiac Operative Risk Evaluation (EuroSCORE) II and 2 machine-learning models.

Methods: Using data from a French prospective database, this retrospective study evaluated all patients who underwent cardiac surgery in 43 hospital centers between January 2012 and December 2020. A receiver operating characteristic analysis was performed to compare the accuracy of the EuroSCORE II, machine-learning models, and an adapted Tabular Bidirectional Encoder Representations from Transformers deep-learning model in predicting postoperative in-hospital mortality. The clinical utility of these models for cardiac surgery decision making was compared using DCA.
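
As a rough illustration of the receiver operating characteristic comparison described in the Methods, the area under the curve can be read as the probability that a randomly chosen patient who died receives a higher predicted risk than a randomly chosen survivor. The sketch below computes it that way on toy data; the function name `roc_auc` and the example values are hypothetical and are not taken from the study.

```python
def roc_auc(y_true, y_score):
    """Area under the ROC curve via pairwise comparison (ties count as half).

    y_true  : 1 for in-hospital death, 0 for survival (toy labels, assumed).
    y_score : predicted probability of death from any model.
    """
    deaths = [s for y, s in zip(y_true, y_score) if y == 1]
    survivors = [s for y, s in zip(y_true, y_score) if y == 0]
    if not deaths or not survivors:
        raise ValueError("need at least one death and one survivor")

    concordant = 0.0
    for d in deaths:
        for s in survivors:
            if d > s:
                concordant += 1.0   # death ranked above survivor
            elif d == s:
                concordant += 0.5   # tie counts as half
    return concordant / (len(deaths) * len(survivors))


# Toy example only: four hypothetical patients.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80]
print(roc_auc(toy_labels, toy_scores))  # 0.75
```

The pairwise loop is quadratic in the number of patients, so it is only meant for explanation; a cohort of the size reported here would call for a rank-based implementation.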

Results: Over the study period, 165,640 patients underwent cardiac surgery, with a mean EuroSCORE II of 3.99 ± 6.67%. In the receiver operating characteristic analysis, the area under the curve was significantly greater for the deep-learning model (0.834; 95% confidence interval, 0.831-0.838) than the EuroSCORE II (P < .001), the random forest model (P = .03), and the Extreme Gradient Boosting model (P = .03). In the DCA, the clinical utility of the 3 artificial intelligence models was superior to that of the EuroSCORE II, especially when the threshold probability of death was high (>45%). The deep-learning model showed the greatest advantage over the EuroSCORE II.
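
To make the DCA comparison concrete, the following is a minimal sketch of the standard net-benefit calculation that decision curves plot against the threshold probability. It assumes a patient is flagged as high risk when the predicted probability of death is at or above the threshold; the function names and toy data are hypothetical, and this is not the authors' code.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of flagging patients whose predicted risk >= threshold.

    Standard decision-curve formula:
        TP/N - (FP/N) * threshold / (1 - threshold)
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    odds = threshold / (1.0 - threshold)  # weight given to a false positive
    return tp / n - (fp / n) * odds


def net_benefit_flag_all(y_true, threshold):
    """Reference strategy: flag every patient as high risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy example only: five hypothetical patients, threshold of 50%.
toy_labels = [1, 1, 0, 0, 0]
toy_probs = [0.9, 0.6, 0.7, 0.2, 0.1]
model_nb = net_benefit(toy_labels, toy_probs, 0.5)
flag_all_nb = net_benefit_flag_all(toy_labels, 0.5)
print(model_nb > flag_all_nb)  # True
```

A decision curve repeats this calculation over a range of thresholds for each model. Because the false-positive weight grows with the threshold, differences between models can become more visible at high thresholds such as those above 45%.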

Conclusions: The deep-learning model had better predictive accuracy and greater clinical utility than the EuroSCORE II and the 2 machine-learning models. These findings suggest that a deep-learning prediction model based on Tabular Bidirectional Encoder Representations from Transformers could be used in the future as the gold standard for cardiac surgery decision making.

Keywords: artificial intelligence; cardiac surgery; decision curve analysis; deep learning; machine learning.

MeSH terms

  • Artificial Intelligence
  • Cardiac Surgical Procedures* / adverse effects
  • Decision Making
  • Deep Learning*
  • Hospital Mortality
  • Humans
  • ROC Curve
  • Retrospective Studies
  • Risk Assessment