Machine learning-based model for predicting the occurrence and mortality of nonpulmonary sepsis-associated ARDS

Sci Rep. 2024 Nov 15;14(1):28240. doi: 10.1038/s41598-024-79899-7.

Abstract

Objective: The objective was to establish a machine learning-based model for predicting the occurrence and mortality of nonpulmonary sepsis-associated ARDS.

Methods: Sepsis patients with nonpulmonary infection sites and no prior pulmonary conditions were identified in the MIMIC-IV database. A randomly selected 80% of these patients were used to construct prediction models with machine learning techniques (K-nearest neighbour, extreme gradient boosting, support vector machine, deep neural network, and decision tree methods). The remaining 20% were used to validate the models' accuracy (internal validation). Local data were used for further, external validation.
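
The random 80%/20% split and hold-out accuracy check described in the Methods can be illustrated with a minimal sketch. Everything below is an assumption for illustration only: the abstract does not state the features, hyperparameters, or software used, so the cohort is synthetic, the helper names (`train_test_split`, `knn_predict`, `make_synthetic_cohort`) are hypothetical, and K-nearest neighbour is used only because it is one of the listed methods and needs nothing beyond the Python standard library. It is not the extreme gradient boosting model reported in the Results.

```python
import random

def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle rows and hold out `test_fraction` of them for validation."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def knn_predict(train, features, k=5):
    """Majority vote among the k training rows nearest in squared Euclidean distance."""
    nearest = sorted(
        train,
        key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], features)),
    )[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > len(nearest) else 0

def accuracy(train, test, k=5):
    """Fraction of held-out rows whose predicted label matches the true label."""
    correct = sum(knn_predict(train, x, k) == y for x, y in test)
    return correct / len(test)

def make_synthetic_cohort(n=500, seed=42):
    """Hypothetical stand-in data: two numeric features and a binary outcome.

    The outcome rate of roughly 67% loosely mirrors the reported NPS-ARDS
    incidence; the features themselves are invented.
    """
    rng = random.Random(seed)
    cohort = []
    for _ in range(n):
        label = 1 if rng.random() < 0.67 else 0
        centre = 1.0 if label else -1.0
        features = (rng.gauss(centre, 1.0), rng.gauss(centre, 1.0))
        cohort.append((features, label))
    return cohort

cohort = make_synthetic_cohort()
train_rows, test_rows = train_test_split(cohort)  # 80% train, 20% hold-out
holdout_accuracy = accuracy(train_rows, test_rows)
print(len(train_rows), len(test_rows))
print(f"hold-out accuracy: {holdout_accuracy:.3f}")
```

The accuracy printed here reflects only the synthetic data and says nothing about the study's results. External validation would follow the same pattern, with the hold-out rows replaced by data from a separate source.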

Results: A total of 11,409 patients were included, with the most common type of infection being bloodstream infection. A total of 7,632 (66.9%) patients developed nonpulmonary sepsis-associated ARDS (NPS-ARDS). Patients with NPS-ARDS had significantly longer ICU stays (6.2 ± 5.2 days vs. 4.4 ± 3.7 days, p < 0.01) and higher 28-day mortality rates (19.5% vs. 14.9%, p < 0.01). Both internal and external validation demonstrated that the model constructed with the extreme gradient boosting method had high accuracy. In the internal validation, the model predicted NPS-ARDS and mortality in such patients with accuracies of 77.5% and 71.8%, respectively. In the external validation, the model predicted NPS-ARDS and mortality in these patients with accuracies of 78.0% and 81.4%, respectively.

Conclusion: The model established via the extreme gradient boosting method can predict the occurrence and mortality of nonpulmonary sepsis-associated ARDS with moderate accuracy (71.8% to 81.4% across internal and external validation).

Keywords: Acute respiratory distress syndrome; External validation; Internal validation; Machine learning; Mortality; Predictive model; Sepsis.