Background: The Coronavirus disease 2019 (COVID-19) pandemic has underlined the urgent need for reliable, multicenter, and full-admission intensive care data to advance our understanding of the course of the disease and investigate potential treatment strategies. In this study, we present the Dutch Data Warehouse (DDW), the first multicenter electronic health record (EHR) database with full-admission data from critically ill COVID-19 patients.
Methods: A nationwide data sharing collaboration was launched at the beginning of the pandemic in March 2020. All hospitals in the Netherlands were asked to participate and share pseudonymized EHR data from adult critically ill COVID-19 patients. Data included patient demographics, clinical observations, administered medication, laboratory determinations, and data from vital sign monitors and life support devices. Data sharing agreements were signed with participating hospitals before any data transfers took place. Data were extracted from the local EHRs with prespecified queries and combined into a staging dataset through an extract-transform-load (ETL) pipeline. In the subsequent processing pipeline, data were mapped to a common concept vocabulary and enriched with derived concepts. Data validation was a continuous process throughout the project. All participating hospitals have access to the DDW. Within legal and ethical boundaries, data are available to clinicians and researchers.
Results: Out of the 81 intensive care units in the Netherlands, 66 participated in the collaboration, 47 have signed the data sharing agreement, and 35 have shared their data. Data from 25 hospitals have passed through the ETL and processing pipeline. Currently, 3464 patients are included in the DDW, from both wave 1 and wave 2 in the Netherlands. More than 200 million clinical data points are available. Overall ICU mortality was 24.4%. Respiratory and hemodynamic parameters were the most frequently measured parameters throughout a patient's stay. For each patient, all administered medication and the daily fluid balance were available. Missing data are reported for each descriptive statistic.
Conclusions: In this study, we show that EHR data from critically ill COVID-19 patients may be lawfully collected and can be combined into a data warehouse. Such initiatives are indispensable to advance medical data science in the field of intensive care medicine.
Keywords: Big data; COVID-19; Data sharing; Database.
© 2021. The Author(s).