Objectives: To obtain timely and detailed data on COVID-19 cases in the United States, the Centers for Disease Control and Prevention (CDC) uses 2 data sources: (1) aggregate counts for daily situational awareness and (2) person-level data for each case (case surveillance). The objective of this study was to describe the sensitivity of case ascertainment and the completeness of person-level data received by CDC through national COVID-19 case surveillance.
Methods: We compared case and death counts from case surveillance data with aggregate counts received by CDC during April 5-September 30, 2020. We analyzed case surveillance data to describe geographic and temporal trends in data completeness for selected variables, including demographic characteristics, underlying medical conditions, and outcomes.
Results: As of November 18, 2020, national COVID-19 case surveillance data received by CDC during April 5-September 30, 2020, included 4 990 629 cases and 141 935 deaths, representing 72.7% of the volume of cases (n = 6 863 251) and 71.8% of the volume of deaths (n = 197 756) in aggregate counts. Nationally, completeness in case surveillance records was highest for age (99.9%) and sex (98.8%). Data on race/ethnicity were complete for 56.9% of cases; completeness varied by region. Data completeness for each underlying medical condition assessed was <25% and generally declined during the study period. About half of case records had complete data on hospitalization and death status.
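The coverage percentages reported above follow directly from the two pairs of counts. As a minimal sketch, assuming coverage is simply the case surveillance count divided by the aggregate count (the function name `percent_of_aggregate` is hypothetical and not from the study), the arithmetic can be reproduced as:

```python
def percent_of_aggregate(case_surveillance_count: int, aggregate_count: int) -> float:
    """Return the case surveillance count as a percentage of the aggregate count,
    rounded to 1 decimal place. Hypothetical helper for illustration only."""
    if aggregate_count <= 0:
        raise ValueError("aggregate_count must be positive")
    return round(100 * case_surveillance_count / aggregate_count, 1)


# Counts as reported in the Results (April 5-September 30, 2020).
cases_pct = percent_of_aggregate(4_990_629, 6_863_251)
deaths_pct = percent_of_aggregate(141_935, 197_756)

print(f"Cases:  {cases_pct}% of aggregate count")
print(f"Deaths: {deaths_pct}% of aggregate count")
```

These values describe the volume of person-level records relative to aggregate counts; they do not by themselves indicate how complete the individual variables within each record are.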
Conclusions: Incompleteness in national COVID-19 case surveillance data might limit their usefulness. Streamlining and automating surveillance processes would decrease reporting burdens on jurisdictions and likely improve completeness of national COVID-19 case surveillance data.
Keywords: COVID-19; SARS-CoV-2; case surveillance; data completeness; race/ethnicity.