Record linkage to enhance consented cohort and routinely collected health data from a UK birth cohort

Int J Popul Data Sci. 2019 Apr 2;4(1):579. doi: 10.23889/ijpds.v4i1.579.


Background: In longitudinal health research, combining the richness of cohort data with the breadth of routine data opens up new possibilities, providing information not available from either data source alone. In this study, we set out to extend information from a longitudinal birth cohort study by linking it to the cohort children's routine primary and secondary health care data. The resulting linked datasets will be used to examine health outcomes and patterns of health service utilisation for a set of common childhood health problems. We describe the experiences and challenges of acquiring and linking electronic health records for participants in a national longitudinal study, the UK Millennium Cohort Study (MCS).

Method: Written parental consent to link routine health data to the survey responses of the MCS cohort member, the mother and her partner was obtained for 90.7% of respondents at the MCS interviews conducted when cohort members were aged seven years. Probabilistic and deterministic linkage methods were used to link MCS cohort members to multiple routinely collected health data sources in Wales and Scotland.
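
The difference between deterministic and probabilistic linkage can be illustrated with a minimal sketch. This is not the study's linkage procedure: the abstract does not state which identifiers, weights or thresholds were used, so the field names (`health_id`, `surname`, `date_of_birth`, `postcode`), the m- and u-probabilities and the threshold below are all hypothetical. The probabilistic step assumes Fellegi-Sunter style agreement weights, a common choice for this kind of linkage.

```python
import math

# Hypothetical per-field probabilities (illustrative values, not from the study):
# m = P(field agrees | records belong to the same person)
# u = P(field agrees | records belong to different people)
FIELD_PARAMS = {
    "surname": (0.95, 0.01),
    "date_of_birth": (0.98, 0.003),
    "postcode": (0.85, 0.001),
}


def deterministic_match(a, b, key="health_id"):
    """Exact match on a shared identifier, when both records carry one."""
    return a.get(key) is not None and a.get(key) == b.get(key)


def match_weight(a, b, params=FIELD_PARAMS):
    """Sum the log2 agreement and disagreement weights over all fields."""
    total = 0.0
    for field, (m, u) in params.items():
        if a.get(field) is None or b.get(field) is None:
            continue  # a missing value contributes no evidence either way
        if a[field] == b[field]:
            total += math.log2(m / u)              # agreement weight (positive)
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement weight (negative)
    return total


def link(a, b, threshold=10.0):
    """Try the deterministic rule first, then fall back to the weighted score."""
    if deterministic_match(a, b):
        return "deterministic"
    if match_weight(a, b) >= threshold:
        return "probabilistic"
    return "no link"


cohort = {"health_id": None, "surname": "JONES",
          "date_of_birth": "2001-03-14", "postcode": "CF10 3AT"}
routine = {"health_id": "A123", "surname": "JONES",
           "date_of_birth": "2001-03-14", "postcode": "CF10 3AT"}
print(link(cohort, routine))
```

In this toy pair the cohort record lacks the identifier, so the deterministic rule cannot fire and the decision rests on the combined field weights. In practice the threshold would be calibrated against clerically reviewed record pairs rather than fixed in advance.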

Results: Overall linkage rates for the consented population using country-specific health service data sources were 97.6% for Scotland and 99.9% for Wales. Linkage rates between different health data sources ranged from 65.3% to 99.6%. Issues relating to acquisition and linkage of data sources are discussed.

Conclusions: Linking longitudinal cohort participants to routine data sources is increasingly common in population data research. Our results suggest that this is a valid method for enhancing the information held in both sources of data.

Grants and funding

This work was supported by the Wellcome Trust (grant number 087389/B/08/Z). For this project, KST was supported by awards establishing the Administrative Data Research Centre Wales from the Economic and Social Research Council (ESRC). CD, RAL and AA are supported by awards establishing the Farr Institute of Health Informatics Research from the Medical Research Council (MRC), in partnership with Arthritis Research UK, the British Heart Foundation, Cancer Research UK, the ESRC, the Engineering and Physical Sciences Research Council, the National Institute for Social Care and Health Research (Welsh Assembly Government), the Chief Scientist Office (Scottish Government Health Directorates) and the Wellcome Trust (MRC grants MR/K006584/1 and MR/K006525/1, respectively). RAL is also funded by the Asthma UK Centre for Applied Research (AUK-AC-2012-01). The Millennium Cohort Study is funded by grants to the Centre for Longitudinal Studies at the Institute of Education from the Economic and Social Research Council and a consortium of government departments. The study sponsors played no part in the design, data analysis or interpretation of this study, the writing of the article, or the decision to submit the paper for publication; the authors’ work was independent of their funders.