The Impact of the Choice of Data Source in Record Linkage Studies Estimating Mortality in Venous Thromboembolism

PLoS One. 2016 Feb 10;11(2):e0148349. doi: 10.1371/journal.pone.0148349. eCollection 2016.

Abstract

Linked electronic healthcare databases are increasingly being used in observational research. The objective of this study was to investigate the impact of the choice of data source in estimating mortality following venous thromboembolism (VTE), with a secondary aim to investigate the influence of the denominator definition. We used the UK Clinical Practice Research Datalink (CPRD) to identify patients aged 18+ with VTE. Multiple cohorts were identified in order to assess how mortality rates differed across a range of data sources. For each cohort, incidence rates per 1,000 person-years (/1000py) and relative rates (RRs) of all-cause mortality were calculated. The lowest mortality rate was found when only primary care data were used for both the exposure (VTE) and the outcome (death) (108.4/1000py). The highest mortality rate was found for patients diagnosed in secondary care (237.2/1000py). When linked primary and secondary care data were included for eligible patients and for the overlapping period of data collection, a mortality rate of 173.2/1000py was found. Sensitivity analyses varying the denominator definition provided a range of results (140.6-164.3/1000py). The relative rates of mortality by gender and age were comparable across all cohorts. Depending on the choice of data source, the population studied may differ. This may have a substantial impact on the main findings, in particular on incidence rates of mortality following VTE.
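
The two measures reported above can be illustrated with a minimal Python sketch. It assumes the usual definitions: a rate is events divided by person-time, scaled to 1,000 person-years, and a relative rate is the ratio of two such rates. The function names and the example counts are hypothetical and are not taken from the study. The final ratio is a crude comparison of two rates quoted in the abstract; it is not one of the RRs by gender and age that the authors report.

```python
def rate_per_1000py(events: int, person_years: float) -> float:
    """Incidence rate per 1,000 person-years: 1000 * events / person-time."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return 1000.0 * events / person_years


def relative_rate(rate: float, reference_rate: float) -> float:
    """Ratio of two rates expressed in the same units."""
    if reference_rate <= 0:
        raise ValueError("reference_rate must be positive")
    return rate / reference_rate


# Hypothetical counts, for illustration only (not from the study):
# 50 deaths observed over 400 person-years of follow-up.
example_rate = rate_per_1000py(events=50, person_years=400.0)

# Crude ratio of two rates quoted in the abstract (secondary care diagnosis
# vs. primary-care-only data). This is an assumption-laden illustration,
# not a figure reported by the authors.
secondary_vs_primary = relative_rate(237.2, 108.4)

print(f"{example_rate:.1f} per 1000py")
print(f"crude rate ratio: {secondary_vs_primary:.2f}")
```

Because the denominator is person-time, changing which follow-up periods count toward `person_years` changes the rate even when the number of deaths is unchanged, which is consistent with the sensitivity analyses on the denominator definition giving a range of results.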

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adolescent
  • Adult
  • Aged
  • Aged, 80 and over
  • Cohort Studies
  • Databases, Factual
  • Electronic Health Records / statistics & numerical data*
  • Female
  • Humans
  • Information Storage and Retrieval*
  • Male
  • Middle Aged
  • Primary Health Care / statistics & numerical data*
  • Risk Factors
  • Secondary Care / statistics & numerical data*
  • Survival Analysis
  • United Kingdom / epidemiology
  • Venous Thromboembolism / epidemiology
  • Venous Thromboembolism / mortality*
  • Venous Thromboembolism / pathology

Grants and funding

The authors received no specific funding for this work. Bert Leufkens receives no direct funding or donations from private parties, including the pharmaceutical industry. Research funding from public-private partnerships, i.e. IMI and TI Pharma (www.tipharma.nl), has been accepted under the condition that no company-specific product or company-related study is conducted. He has received unrestricted research funding from public sources, i.e. the Netherlands Organisation for Health Research and Development (ZonMW), the EU 7th Framework Program (FP7), the Dutch Medicines Evaluation Board (MEB), the National Health Care Institute (ZIN) and the Dutch Ministry of Health.