External Validation of Prognostic Models in Critical Care: A Cautionary Tale From COVID-19 Pneumonitis

Crit Care Explor. 2024 Mar 27;6(4):e1067. doi: 10.1097/CCE.0000000000001067. eCollection 2024 Apr.

Abstract

Objectives: To externally validate clinical prediction models that aim to predict progression to invasive ventilation or death on the ICU in patients admitted with confirmed COVID-19 pneumonitis.

Design: Single-center retrospective external validation study.

Data sources: Routinely collected healthcare data in the ICU electronic patient record. Curated data recorded for each ICU admission for the purposes of the U.K. Intensive Care National Audit and Research Centre (ICNARC).

Setting: The ICU at Manchester Royal Infirmary, Manchester, United Kingdom.

Patients: Three hundred forty-nine patients older than 18 years admitted to the ICU with confirmed COVID-19 pneumonitis from March 1, 2020, to February 28, 2022. Of these, 302 met the inclusion criteria for at least one model. Fifty-five of the 349 patients were admitted before the widespread adoption of dexamethasone for the treatment of severe COVID-19 (pre-dexamethasone patients).

Outcomes: Whether each model could be externally validated, and its discrimination and calibration in our cohort.

Methods: Articles meeting the inclusion criteria were identified. Models from articles that gave sufficient detail on the predictors used and the methods for generating predictions were tested in those patients from our cohort who matched the original publication's inclusion/exclusion criteria and endpoint.

Results: Thirteen clinical prediction articles were identified. Five articles gave insufficient information to validate their models. A further three contained predictors that were not routinely measured in our ICU cohort and were not validated. Three had performance that was substantially lower than previously published (C-statistic range 0.483-0.605 in pre-dexamethasone patients and 0.494-0.564 among all patients). One model retained its discriminative ability in our cohort compared with previously published results (C = 0.672 and 0.686), and one retained performance among pre-dexamethasone patients but performed poorly in all patients (C = 0.793 and 0.596, respectively). One model could be calibrated but with poor performance.
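For readers less familiar with the metrics reported above, the following is a minimal sketch of how a C-statistic (discrimination) and a simple observed-to-expected ratio (one crude view of calibration) can be computed from predicted risks and observed outcomes. It is illustrative only and is not the analysis code used in the study; the function names `c_statistic` and `observed_expected_ratio` and the example data are hypothetical.

```python
from itertools import product


def c_statistic(y_true, y_prob):
    """Proportion of (event, non-event) pairs in which the patient with the
    event received the higher predicted risk. Ties count as half.
    A value of 0.5 is no better than chance; 1.0 is perfect discrimination."""
    events = [p for y, p in zip(y_true, y_prob) if y == 1]
    non_events = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for p_event, p_non_event in product(events, non_events):
        if p_event > p_non_event:
            concordant += 1.0
        elif p_event == p_non_event:
            concordant += 0.5
    return concordant / (len(events) * len(non_events))


def observed_expected_ratio(y_true, y_prob):
    """Observed number of events divided by the sum of predicted risks.
    Values near 1 suggest the model neither over- nor under-predicts on average."""
    return sum(y_true) / sum(y_prob)


# Hypothetical data: 1 = invasive ventilation or death, 0 = neither.
outcomes = [1, 1, 0, 0]
predicted_risk = [0.8, 0.3, 0.6, 0.1]
print(c_statistic(outcomes, predicted_risk))
print(observed_expected_ratio(outcomes, predicted_risk))
```

On this reading, C-statistics close to 0.5, such as several of those reported above, indicate that a model ranks patients little better than chance in the validation cohort.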

Conclusions: Our findings, albeit from a single center, suggest that the published performance of COVID-19 prediction models may not be replicated when translated to other institutions. In light of this, we would encourage bedside intensivists to reflect on the role of clinical prediction models in their own clinical decision-making.

Keywords: COVID-19 pneumonitis; acute respiratory distress syndrome; clinical prediction modeling; external validation; intensive care.