Machine Learning for Predicting Primary Graft Dysfunction After Lung Transplantation: An Interpretable Model Study

Transplantation. 2025 Jan 10. doi: 10.1097/TP.0000000000005326. Online ahead of print.

Abstract

Background: Primary graft dysfunction (PGD) develops within 72 h after lung transplantation (Lung Tx) and strongly affects patient prognosis. This study aimed to establish an accurate machine learning (ML) model for predicting grade 3 PGD (PGD3) after Lung Tx.

Methods: This retrospective study included 802 patients who underwent Lung Tx between July 2018 and October 2023: 640 in the derivation cohort and 162 in the external validation cohort. The 640 derivation patients were randomly assigned to training and internal validation cohorts in a 7:3 ratio. Independent risk factors for PGD3 were determined by combining univariate logistic regression with least absolute shrinkage and selection operator (LASSO) regression. Nine ML models were then built on the selected variables, and their predictive performance was evaluated. Model stratification performance was also assessed with 3 posttransplant metrics. Finally, the SHapley Additive exPlanations (SHAP) algorithm was used to quantify the predictive importance of the selected variables.
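As an illustration of the random 7:3 assignment of the derivation cohort, a minimal sketch follows. It is not the authors' code: the function name `split_cohort`, the seed, and the placeholder IDs are assumptions, and the resulting cohort sizes (448 and 192) are simply what a 7:3 ratio implies for 640 patients under rounding, since the abstract does not report them.

```python
import random


def split_cohort(patient_ids, train_fraction=0.7, seed=0):
    """Randomly partition IDs into training and internal validation sets.

    The seed is a hypothetical choice made only so the split is reproducible.
    """
    ids = list(patient_ids)
    rng = random.Random(seed)
    rng.shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]


# Placeholder IDs standing in for the 640 derivation-cohort patients.
derivation_ids = range(640)
train_ids, internal_val_ids = split_cohort(derivation_ids)
print(len(train_ids), len(internal_val_ids))
```

The external validation cohort (162 patients) is kept entirely separate and never enters this split.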

Results: We identified 9 independent clinical risk factors as selected variables. Among the 9 ML models, the random forest (RF) model displayed the best performance (area under the curve [AUC] = 0.9415, sensitivity [Se] = 0.8972, specificity [Sp] = 0.8795 in the training cohort; AUC = 0.7975, Se = 0.7520, Sp = 0.7313 in the internal validation cohort; and AUC = 0.8214, Se = 0.8235, Sp = 0.6667 in the external validation cohort). Further assessment of calibration and clinical usefulness indicated the promising applicability of the RF model in PGD3 prediction. The RF model also performed best in risk stratification for postoperative support (extracorporeal membrane oxygenation time: P < 0.001, mechanical ventilation time: P = 0.006, intensive care unit time: P < 0.001).
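For readers less familiar with the reported metrics, the sketch below shows how AUC, Se, and Sp can be computed from binary labels and predicted probabilities. The labels, scores, and the 0.5 threshold are made-up toy values, not study data, and the abstract does not state which threshold the authors used.

```python
def sensitivity_specificity(labels, scores, threshold=0.5):
    """Se = TP / (TP + FN); Sp = TN / (TN + FP) at a given cutoff."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def auc(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: 1 = PGD3, 0 = no PGD3 (values are illustrative only).
labels = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]

se, sp = sensitivity_specificity(labels, scores)
print(auc(labels, scores), se, sp)
```

AUC is threshold-independent, whereas Se and Sp depend on the chosen cutoff, which is why a model can have a similar AUC across cohorts while its Se and Sp shift.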

Conclusions: The RF model showed the best performance in PGD3 prediction and postoperative risk stratification for patients after Lung Tx.