Background: We aimed to assess the performance of various coding algorithms for identifying people with hepatitis B virus (HBV) and hepatitis C virus (HCV) in claims data under different reference standards (RSs) and study periods (SPs).
Methods: A proportional random sample of 10,000 patients aged ≥ 20 years in a health care system in Southern Taiwan was enrolled. We used three hierarchical RSs (RS1: positive laboratory test results; RS2: RS1 or prescriptions of anti-HBV or anti-HCV medications; RS3: RS1 or RS2 or a textual diagnosis recorded in electronic medical records) with three SPs (4, 8, and 12 years) to calculate the positive predictive value (PPV) and sensitivity (Sen) of six coding algorithms based on HBV- and HCV-related International Classification of Diseases, Tenth Revision, Clinical Modification (ICD-10-CM) codes in Taiwan National Health Insurance claims data for 2016-2019.
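As an illustration of the calculation described in the Methods, the following is a minimal sketch of how PPV and Sen could be computed for one coding algorithm against each hierarchical RS. The `Patient` fields, the function names, and the toy cohort are hypothetical and are not taken from the study; they only mirror the RS definitions given above.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    # Hypothetical per-patient flags, assumed for illustration only.
    lab_positive: bool      # evidence used by RS1
    on_antiviral: bool      # additional evidence admitted by RS2
    text_diagnosis: bool    # additional evidence admitted by RS3
    flagged_by_codes: bool  # outcome of a claims-based coding algorithm


def reference_positive(p: Patient, level: int) -> bool:
    """Return True if the patient is a case under RS1, RS2, or RS3."""
    if level == 1:
        return p.lab_positive
    if level == 2:
        return p.lab_positive or p.on_antiviral
    if level == 3:
        return p.lab_positive or p.on_antiviral or p.text_diagnosis
    raise ValueError("level must be 1, 2, or 3")


def ppv_and_sensitivity(patients, level):
    """Compute (PPV, Sen) of the coding algorithm against one RS level."""
    tp = sum(p.flagged_by_codes and reference_positive(p, level) for p in patients)
    fp = sum(p.flagged_by_codes and not reference_positive(p, level) for p in patients)
    fn = sum(not p.flagged_by_codes and reference_positive(p, level) for p in patients)
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    sen = tp / (tp + fn) if (tp + fn) else float("nan")
    return ppv, sen


# Toy cohort (invented data, not study results).
cohort = [
    Patient(True, False, False, True),
    Patient(False, True, False, True),
    Patient(False, False, True, True),
    Patient(False, False, False, True),
    Patient(True, False, False, False),
]

for level in (1, 2, 3):
    ppv, sen = ppv_and_sensitivity(cohort, level)
    print(f"RS{level}: PPV={ppv:.2f}, Sen={sen:.2f}")
```

Because each RS level admits every case of the level below it, patients flagged by codes but lacking laboratory confirmation move from false positives to true positives as the RS broadens, which is one way the PPV can rise from RS1 to RS3.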
Results: Of the 10,000 enrolled participants, 146 had confirmed HBV and 165 had confirmed HCV according to RS1 with a 4-year SP; these numbers increased to 729 and 525, respectively, according to RS3 with a 12-year SP. For both HBV and HCV, the PPV was lowest according to RS1 and highest according to RS3, and the longer the SP, the higher the PPV. However, the Sen was highest according to RS2 with a 4-year SP. For both HBV and HCV, the coding algorithm with the highest PPV was "≥ 3 outpatient codes" and the one with the highest Sen was "≥ 2 outpatient or ≥ 1 inpatient codes."
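The two coding algorithms named in the Results can be read as simple threshold rules on per-patient code counts. The sketch below is one possible reading; the function names are hypothetical, and it assumes that codes are counted per patient within the SP, a detail the abstract does not specify.

```python
def meets_outpatient_or_inpatient(outpatient_codes: int, inpatient_codes: int,
                                  min_outpatient: int = 2,
                                  min_inpatient: int = 1) -> bool:
    """Rule of the form '>= 2 outpatient or >= 1 inpatient codes'."""
    return outpatient_codes >= min_outpatient or inpatient_codes >= min_inpatient


def meets_outpatient_only(outpatient_codes: int, min_outpatient: int = 3) -> bool:
    """Rule of the form '>= 3 outpatient codes'."""
    return outpatient_codes >= min_outpatient


# A patient with two outpatient codes and no inpatient code satisfies
# the broader rule but not the stricter outpatient-only rule.
print(meets_outpatient_or_inpatient(2, 0))  # True
print(meets_outpatient_only(2))             # False
```

The stricter rule flags fewer patients, which tends to favor PPV, while the broader rule flags more patients, which tends to favor Sen.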
Conclusions: Using different RSs with different SPs yields different estimates of PPV and Sen. To achieve the best combined yield of PPV and Sen, the optimal coding algorithm for identifying people with HBV or HCV is "≥ 2 outpatient or ≥ 1 inpatient codes."
Keywords: Algorithms; Claims data; Hepatitis B virus; Hepatitis C virus; International Classification of Diseases (ICD); Validation study.
© 2022. The Author(s).