Five steps in performing machine learning for binary outcomes

J Thorac Cardiovasc Surg. 2024 Sep 5:S0022-5223(24)00782-7. doi: 10.1016/j.jtcvs.2024.08.048. Online ahead of print.

Abstract

Background: The use of machine learning (ML) in cardiovascular and thoracic surgery is evolving rapidly. Used to full effect, ML can improve patient risk stratification, clinical decision making, prediction accuracy, and resource utilization in cardiac surgery. Implementing these technologies appropriately in clinical research requires an understanding of the many nuances of ML modeling. This primer provides an educational framework for using ML to generate predicted probabilities in clinical research and illustrates it with a real-world clinical example.

Methods: We focus on modeling binary outcomes with imbalanced classes, a common scenario in cardiothoracic surgery research. We present a 5-step strategy for performing such analyses and demonstrate it using a real-world example based on data from the National Surgical Quality Improvement Program pediatric database.
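
To make the imbalanced-class setting concrete, the following is a minimal Python sketch and not the authors' code. The cohort size (1,000 patients) and event rate (5%) are invented for illustration, and the helper names are hypothetical. It shows why raw accuracy can mislead when events are rare, and it computes inverse-frequency class weights, one common (but not the only) way to counter imbalance during model fitting.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}

def accuracy(y_true, y_pred):
    """Fraction of all predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def sensitivity(y_true, y_pred):
    """Fraction of true events (label 1) that were predicted as events."""
    preds_for_events = [p for t, p in zip(y_true, y_pred) if t == 1]
    return sum(p == 1 for p in preds_for_events) / len(preds_for_events)

# Hypothetical cohort: 50 events among 1,000 patients (5% event rate).
y_true = [1] * 50 + [0] * 950

# A trivial "model" that never predicts the event.
y_pred = [0] * 1000

weights = balanced_class_weights(y_true)

print("accuracy:   ", accuracy(y_true, y_pred))
print("sensitivity:", sensitivity(y_true, y_pred))
print("weights:    ", weights)
```

In this invented cohort, the trivial model is right for 950 of 1,000 patients yet identifies none of the 50 events, which is why metrics sensitive to the minority class matter in this setting. The weights scale each class inversely to its frequency, so the rare event class counts for more during fitting.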

Conclusions: Collaboration among surgeons, care providers, statisticians, data scientists, and information technology professionals can help to maximize the impact of ML as a powerful tool in cardiac surgery.

Keywords: imbalanced classes; machine learning; model development; model validation.