Background: Admissions are generally classified as COVID-19 hospitalizations if the patient has a positive SARS-CoV-2 polymerase chain reaction (PCR) test. However, because 35% of SARS-CoV-2 infections are asymptomatic, patients admitted for unrelated indications with an incidentally positive test could be misclassified as a COVID-19 hospitalization. Electronic health record (EHR)-based studies have been unable to distinguish between a hospitalization specifically for COVID-19 and an incidental SARS-CoV-2 hospitalization. Although the need to improve classification of COVID-19 versus incidental SARS-CoV-2 is well understood, the magnitude of the problem has only been characterized in small, single-center studies. Furthermore, there have been no peer-reviewed studies evaluating methods for improving classification.
Objective: The aims of this study are, first, to quantify the frequency of incidental hospitalizations over the first 15 months of the pandemic in multiple hospital systems in the United States and, second, to apply electronic phenotyping techniques to automatically improve COVID-19 hospitalization classification.
Methods: From a retrospective EHR-based cohort in 4 US health care systems in Massachusetts, Pennsylvania, and Illinois, a random sample of 1123 SARS-CoV-2 PCR-positive patients hospitalized from March 2020 to August 2021 was manually chart-reviewed, and each admission was classified as either incidental ("admitted with COVID-19") or specifically for COVID-19 ("for COVID-19"). EHR-based phenotyping was used to find feature sets to filter out incidental admissions.
Results: EHR-based phenotyped feature sets filtered out incidental admissions, which occurred in an average of 26% of hospitalizations (although this varied widely over time, from 0% to 75%). The top site-specific feature sets had 79%-99% specificity with 62%-75% sensitivity, while the best-performing across-site feature sets had 71%-94% specificity with 69%-81% sensitivity.
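As an illustration of how sensitivity and specificity figures like those above are obtained, the following minimal Python sketch compares a feature-set filter's output against chart-review labels. It is not the study's code: the function name, the toy labels, and the choice of "for COVID-19" as the positive class are all assumptions made for the example.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    predicted: Sequence[bool], reviewed: Sequence[bool]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) of a filter against chart review.

    Assumption for this sketch: True means "for COVID-19" and
    False means incidental ("admitted with COVID-19").
    """
    if len(predicted) != len(reviewed):
        raise ValueError("predicted and reviewed must have the same length")

    pairs = list(zip(predicted, reviewed))
    tp = sum(1 for p, r in pairs if p and r)          # flagged, truly for COVID-19
    tn = sum(1 for p, r in pairs if not p and not r)  # filtered out, truly incidental
    fp = sum(1 for p, r in pairs if p and not r)      # flagged, but incidental
    fn = sum(1 for p, r in pairs if not p and r)      # filtered out, but for COVID-19

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in the reviewed labels")

    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: 4 admissions for COVID-19, 4 incidental.
reviewed = [True, True, True, True, False, False, False, False]
predicted = [True, True, True, False, False, False, False, True]

sens, spec = sensitivity_specificity(predicted, reviewed)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Under this assumed labeling, high specificity means that few incidental admissions pass the filter, and high sensitivity means that few true COVID-19 admissions are discarded.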
Conclusions: A large proportion of SARS-CoV-2 PCR-positive admissions were incidental. Straightforward EHR-based phenotypes differentiated admissions for COVID-19 from incidental ones, which is important to ensure accurate public health reporting and research.
Keywords: COVID-19; SARS-CoV-2; clinical research informatics; electronic health records; health care; health data; medical informatics; patient data; phenotype; public health.
©Jeffrey G Klann, Zachary H Strasser, Meghan R Hutch, Chris J Kennedy, Jayson S Marwaha, Michele Morris, Malarkodi Jebathilagam Samayamuthu, Ashley C Pfaff, Hossein Estiri, Andrew M South, Griffin M Weber, William Yuan, Paul Avillach, Kavishwar B Wagholikar, Yuan Luo, The Consortium for Clinical Characterization of COVID-19 by EHR (4CE), Gilbert S Omenn, Shyam Visweswaran, John H Holmes, Zongqi Xia, Gabriel A Brat, Shawn N Murphy. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 18.05.2022.