Purpose: We developed algorithms to identify patients with newly diagnosed cancer from a Japanese claims database and assessed their validity by comparing the resulting cancer incidence in the sample population with the nationwide cancer incidence in Japan.
Methods: We developed two algorithms to identify patients with stomach, lung, colorectal, breast, and cervical cancers: diagnosis only (algorithm 1), and a combination of diagnosis, treatments, and medicines (algorithm 2). Patients with newly diagnosed cancer in 2017 were identified from an anonymized commercial claims database (JMDC Claims Database) under two inclusion/exclusion criteria: including all patients with cancer (extract 1) and excluding patients who had received cancer treatments in 2015 or 2016 (extract 2). We estimated the incidence of cancer at the five sites and compared it with the incidence in the Japan National Cancer Registry by calculating standardized incidence ratios with 95% confidence intervals (CIs).
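The two algorithms and two extracts described in the Methods can be sketched as simple filters over per-patient claims. This is a minimal illustration, not the study's implementation: the `PatientClaims` record, its field names, and the rule that algorithm 2 requires a diagnosis plus at least one treatment or medicine claim in 2017 are all assumptions, since the abstract does not specify how the three sources are combined.

```python
from dataclasses import dataclass, field

INDEX_YEAR = 2017             # year in which new diagnoses are counted
LOOKBACK_YEARS = {2015, 2016}  # treatment history checked by extract 2


@dataclass
class PatientClaims:
    """Hypothetical per-patient summary: years with each kind of cancer claim."""
    patient_id: str
    diagnosis_years: set = field(default_factory=set)
    treatment_years: set = field(default_factory=set)
    medicine_years: set = field(default_factory=set)


def meets_algorithm(p: PatientClaims, algorithm: int) -> bool:
    has_diagnosis = INDEX_YEAR in p.diagnosis_years
    if algorithm == 1:
        # Algorithm 1: diagnosis only.
        return has_diagnosis
    if algorithm == 2:
        # Algorithm 2 (assumed rule): diagnosis AND a treatment or medicine claim.
        has_care = (INDEX_YEAR in p.treatment_years
                    or INDEX_YEAR in p.medicine_years)
        return has_diagnosis and has_care
    raise ValueError(f"unknown algorithm: {algorithm}")


def passes_extract(p: PatientClaims, extract: int) -> bool:
    if extract == 1:
        # Extract 1: keep all patients with cancer.
        return True
    if extract == 2:
        # Extract 2: drop patients with cancer treatment in 2015 or 2016.
        return not (p.treatment_years & LOOKBACK_YEARS)
    raise ValueError(f"unknown extract: {extract}")


def identify_new_cases(patients, algorithm: int, extract: int) -> list:
    """Return IDs of patients counted as newly diagnosed."""
    return [p.patient_id for p in patients
            if meets_algorithm(p, algorithm) and passes_extract(p, extract)]
```

Under these assumptions, a patient with a 2017 diagnosis and treatment claims in both 2016 and 2017 is counted by extract 1 but dropped by extract 2, which is the case the 2-year history is meant to catch.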
Results: The number of patients with newly diagnosed cancer ranged from 219 to 17,840, depending on the cancer site, algorithm, and exclusion criterion. Standardized incidence ratios were significantly higher in the JMDC Claims Database than in the national registry data for extract 1 and algorithm 1, extract 1 and algorithm 2, and extract 2 and algorithm 1. In extract 2 and algorithm 2, colorectal cancer in males and stomach, lung, and cervical cancers in females showed similar cancer incidence in the JMDC and national registry data.
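To make the comparison with the registry concrete, the following sketch computes a standardized incidence ratio by indirect standardization and judges it "significantly higher" when the lower 95% confidence limit exceeds 1. The abstract does not state which interval method was used; Byar's approximation to the exact Poisson interval is assumed here, and the function names, the age-stratum keys, and the per-100,000 rate convention are illustrative choices.

```python
import math

Z_95 = 1.96  # standard normal quantile for a two-sided 95% interval


def expected_cases(registry_rates_per_100k: dict, claims_population: dict) -> float:
    """Expected cases if the claims population had the registry's stratum rates."""
    return sum(registry_rates_per_100k[stratum] * claims_population[stratum] / 100_000
               for stratum in claims_population)


def sir_with_ci(observed: int, expected: float):
    """Return (SIR, lower, upper) using Byar's approximation (assumed method)."""
    if observed <= 0 or expected <= 0:
        raise ValueError("observed and expected must be positive")
    sir = observed / expected
    # Byar's limits for the Poisson-distributed observed count.
    lower_count = observed * (
        1 - 1 / (9 * observed) - Z_95 / (3 * math.sqrt(observed))) ** 3
    upper_count = (observed + 1) * (
        1 - 1 / (9 * (observed + 1)) + Z_95 / (3 * math.sqrt(observed + 1))) ** 3
    return sir, lower_count / expected, upper_count / expected


def is_significantly_higher(observed: int, expected: float) -> bool:
    """True when the whole 95% interval lies above 1."""
    _, lower, _ = sir_with_ci(observed, expected)
    return lower > 1.0
```

For example, 100 observed cases against 80 expected gives a ratio of 1.25 whose interval lies entirely above 1, whereas 85 observed against 80 expected gives an interval that includes 1, which would read as incidence similar to the registry.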
Conclusion: The novel algorithms are effective for extracting information about patients with cancer from claims data when they combine information on diagnosis, procedures, and medicines (algorithm 2) and use a 2-year cancer-treatment history as an exclusion criterion (extract 2).