Background: Administrative health data have been used extensively to examine congenital heart disease (CHD). However, the accuracy and completeness of these data must be assessed.
Objectives: To use data linkage of multiple administrative data sources to examine the validity of identifying CHD cases recorded in hospital discharge data.
Methods: We identified all liveborn infants born in 2013-2017 in New South Wales, Australia, who had a CHD diagnosis recorded in hospital discharge data up to one year of age. Using record linkage to multiple data sources, we compared the CHD diagnosis with five reference standards: (i) multiple hospital admissions containing a CHD diagnosis; (ii) receipt of a cardiac procedure; (iii) a CHD diagnosis in the Register of Congenital Conditions; (iv) a recorded cardiac-related outpatient health service; and/or (v) a cardiac-related cause of death. Positive predictive values (PPV) comparing the CHD diagnosis with the reference standards were estimated by CHD severity and for specific phenotypes.
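The comparison against the five reference standards can be illustrated with a minimal sketch. It assumes each infant with a hospital-discharge CHD diagnosis carries one boolean flag per reference standard, and that a diagnosis counts as confirmed when at least one flag is set (the "and/or" combination). The class, field names, and example records are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class InfantRecord:
    # Hypothetical field names mirroring reference standards (i)-(v).
    multiple_chd_admissions: bool
    cardiac_procedure: bool
    chd_in_register: bool
    cardiac_outpatient_service: bool
    cardiac_cause_of_death: bool

    def confirmed(self) -> bool:
        # Confirmed if at least one of the five reference standards is met.
        return any((
            self.multiple_chd_admissions,
            self.cardiac_procedure,
            self.chd_in_register,
            self.cardiac_outpatient_service,
            self.cardiac_cause_of_death,
        ))


def ppv(records: list[InfantRecord]) -> float:
    # PPV = confirmed diagnoses / all diagnoses in hospital discharge data.
    if not records:
        raise ValueError("at least one record is required")
    return sum(r.confirmed() for r in records) / len(records)


# Made-up records for illustration only.
records = [
    InfantRecord(True, False, False, False, False),
    InfantRecord(False, False, False, False, False),
    InfantRecord(False, True, True, False, False),
    InfantRecord(False, False, False, False, False),
]
print(ppv(records))  # 0.5
```

Restricting the flags passed to `any` would give the PPV against a single reference standard rather than the combined one.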
Results: Of 485,239 liveborn infants, 4043 had a CHD diagnosis identified in hospital discharge data (8.3 per 1000 live births). The PPV for any CHD confirmed by any of the five reference standards was 62.8% (95% confidence interval [CI] 60.9, 64.8), with a higher PPV for severe CHD at 94.1% (95% CI 88.2, 100). Infant characteristics associated with higher PPVs included lower birthweight, presence of a syndrome or non-cardiac congenital anomaly, birth to a mother aged <20 years, and residence in a disadvantaged area.
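The arithmetic behind these figures can be sketched as follows. The prevalence calculation uses the counts reported above. The interval function uses the Wilson score method as an assumption, since the abstract does not state how the confidence intervals were derived, and the counts passed to it are hypothetical.

```python
import math


def prevalence_per_1000(cases: int, live_births: int) -> float:
    # Cases per 1000 live births.
    return 1000 * cases / live_births


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    # Wilson score interval for a proportion (assumed method, not the study's).
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# Counts reported in the Results.
print(round(prevalence_per_1000(4043, 485_239), 1))  # 8.3

# Hypothetical counts: 50 confirmed out of 100 diagnoses.
lower, upper = wilson_interval(50, 100)
print(lower, upper)
```

The Wilson method is used here because it stays within [0, 1] when a proportion is close to 100%, as with the severe CHD estimate; other interval methods would give slightly different bounds.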
Conclusion: Using data linkage of multiple datasets is a novel and cost-effective method to examine the validity of CHD diagnoses recorded in one dataset. These results can be incorporated into bias analyses in future studies of CHD.
Keywords: accuracy; capture-recapture; congenital heart disease; prevalence; validation.
© 2023 The Authors. Paediatric and Perinatal Epidemiology published by John Wiley & Sons Ltd.