Background: Unlike well-established diseases that base clinical care on randomized trials, past experiences, and training, prognosis in COVID19 relies on a weaker foundation. Knowledge from other respiratory failure diseases may inform clinical decisions in this novel disease. The objective was to predict 48-hour invasive mechanical ventilation (IMV) within 48 h in patients hospitalized with COVID-19 using COVID-like diseases (CLD).
Methods: This retrospective multicenter study trained machine learning (ML) models on patients hospitalized with CLD to predict IMV within 48 h in COVID-19 patients. CLD patients were identified using diagnosis codes for bacterial pneumonia, viral pneumonia, influenza, unspecified pneumonia and acute respiratory distress syndrome (ARDS), 2008-2019. A total of 16 cohorts were constructed, including any combinations of the four diseases plus an exploratory ARDS cohort, to determine the most appropriate cohort to use. Candidate predictors included demographic and clinical parameters that were previously associated with poor COVID-19 outcomes. Model development included the implementation of logistic regression and three ensemble tree-based algorithms: decision tree, AdaBoost, and XGBoost. Models were validated in hospitalized COVID-19 patients at two healthcare systems, March 2020-July 2020. ML models were trained on CLD patients at Stanford Hospital Alliance (SHA). Models were validated on hospitalized COVID-19 patients at both SHA and Intermountain Healthcare.
Results: CLD training data were obtained from SHA (n = 14,030), and validation data included 444 adult COVID-19 hospitalized patients from SHA (n = 185) and Intermountain (n = 259). XGBoost was the top-performing ML model, and among the 16 CLD training cohorts, the best model achieved an area under curve (AUC) of 0.883 in the validation set. In COVID-19 patients, the prediction models exhibited moderate discrimination performance, with the best models achieving an AUC of 0.77 at SHA and 0.65 at Intermountain. The model trained on all pneumonia and influenza cohorts had the best overall performance (SHA: positive predictive value (PPV) 0.29, negative predictive value (NPV) 0.97, positive likelihood ratio (PLR) 10.7; Intermountain: PPV, 0.23, NPV 0.97, PLR 10.3). We identified important factors associated with IMV that are not traditionally considered for respiratory diseases.
Conclusions: The performance of prediction models derived from CLD for 48-hour IMV in patients hospitalized with COVID-19 demonstrate high specificity and can be used as a triage tool at point of care. Novel predictors of IMV identified in COVID-19 are often overlooked in clinical practice. Lessons learned from our approach may assist other research institutes seeking to build artificial intelligence technologies for novel or rare diseases with limited data for training and validation.
Keywords: COVID-19; Invasive mechanical ventilation; Machine learning; Respiratory failure; Triage tool.
Copyright © 2021 Elsevier Inc. All rights reserved.