Purpose: This study aimed to ascertain breast cancer cases in the ongoing WISDOM screening trial using real-world data sources that may be faster and more complete than self-reported data alone and timelier than cancer registries.
Methods: We developed a data warehouse procedural process (DWPP) that searches electronic health records (EHRs) in the University of California Data Warehouse (UCDW) to identify breast cancer cases among a subgroup of WISDOM participants (n = 11,314) who received breast-related care at a University of California Health Center between 2012 and 2021. Incident breast cancer diagnoses identified by the DWPP were compared with those identified by self-report through annual online follow-up questionnaires.
Results: We identified 172 participants with confirmed breast cancer diagnoses in the period 2016-2021, by source as follows: 129 (75%) by both self-report and DWPP, 23 (13%) by DWPP alone, and 20 (12%) by self-report alone. Among participants with International Classification of Diseases, 10th Revision cancer diagnostic codes, no diagnosis was confirmed in 18%.
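The source breakdown above can be checked with a short calculation. The following is a minimal sketch in Python that uses only the counts reported in the Results; the variable names are illustrative, and the per-source totals it derives (cases found by each source overall) are simple sums that are not stated in the abstract itself.

```python
# Counts of confirmed cases by ascertainment source, as reported in the Results.
# The dictionary keys are illustrative labels, not terms from the study.
counts = {"both": 129, "dwpp_only": 23, "self_report_only": 20}

total = sum(counts.values())  # all confirmed cases

# Share of confirmed cases per source category, rounded to whole percentages.
shares = {source: round(100 * n / total) for source, n in counts.items()}

# Derived totals (not reported in the abstract): cases each source found overall.
dwpp_total = counts["both"] + counts["dwpp_only"]
self_report_total = counts["both"] + counts["self_report_only"]

print(total, shares)
print(dwpp_total, self_report_total)
```

Under these counts, each source on its own would miss the cases found only by the other, which is the arithmetic behind combining the two.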
Conclusion: For diagnoses that occurred ≥20 months before the January 1, 2022, UCDW data pull, WISDOM self-reported data from the annual questionnaire achieved high accuracy (96%), as confirmed by the cancer registry. More rapid cancer ascertainment can be achieved by combining self-reported data with EHR data from a health system data warehouse registry, particularly to address limitations of self-reported questionnaires such as timing delays (ie, the lag between a participant's diagnosis and submission of the self-reported questionnaire typically ranges from a month to a year) and lack of response. Although cancer registry reporting is often less timely, it does not require the verification that the DWPP and self-report from annual questionnaires do.
Trial registration: ClinicalTrials.gov NCT02620852.