A Common Longitudinal Intensive Care Unit data Format (CLIF) to enable multi-institutional federated critical illness research

medRxiv [Preprint]. 2024 Sep 4:2024.09.04.24313058. doi: 10.1101/2024.09.04.24313058.

Abstract

Background: Critical illness, or acute organ failure requiring life support, threatens over five million American lives annually. Electronic health record (EHR) data are a source of granular information that could generate crucial insights into the nature and optimal treatment of critical illness. However, data management, security, and standardization are barriers to large-scale critical illness EHR studies.

Methods: A consortium of critical care physicians and data scientists from eight US healthcare systems developed the Common Longitudinal Intensive Care Unit (ICU) data Format (CLIF), an open-source database format that harmonizes a minimum set of ICU Data Elements for use in critical illness research. We created a pipeline to process adult ICU EHR data at each site. After development and iteration, we conducted two proof-of-concept studies with a federated research architecture: 1) an external validation of an in-hospital mortality prediction model for critically ill patients and 2) an assessment of 72-hour temperature trajectories and their association with mechanical ventilation and in-hospital mortality using group-based trajectory models.

Results: We converted longitudinal data from 94,356 critically ill patients treated in 2020-2021 (mean age 60.6 years [standard deviation 17.2], 30% Black, 7% Hispanic, 45% female) across 8 health systems and 33 hospitals into the CLIF format. The in-hospital mortality prediction model performed well in the health system where it was derived (0.81 AUC, 0.06 Brier score). Performance across CLIF consortium sites varied (AUCs: 0.74-0.83, Brier scores: 0.06-0.01) and demonstrated some degradation in predictive capability. Temperature trajectories were similar across health systems. Hypothermic and hyperthermic-slow-resolver patients consistently had the highest mortality.

Conclusions: CLIF facilitates efficient, rigorous, and reproducible critical care research. Our federated case studies showcase CLIF's potential for disease sub-phenotyping and clinical decision-support evaluation. Future applications include pragmatic EHR-based trials, target trial emulations, foundational multi-modal AI models of critical illness, and real-time critical care quality dashboards.

Keywords: Critical Care Data; Machine Learning; Temperature Trajectory Modeling.

Publication types

  • Preprint