Building Linked Big Data for Stroke in Korea: Linkage of Stroke Registry and National Health Insurance Claims Data

J Korean Med Sci. 2018 Dec 13;33(53):e343. doi: 10.3346/jkms.2018.33.e343. eCollection 2018 Dec 31.

Abstract

Background: Linkage of public healthcare data is useful in stroke research because patients may visit different sectors of the health system before, during, and after stroke. We therefore aimed to establish high-quality big data on stroke in Korea by linking an acute stroke registry with national health claim databases.

Methods: Acute stroke patients (n = 65,311) who were enrolled in the Clinical Research Center for Stroke (CRCS) registry during 2006-2014 and had claim data suitable for linkage were included. We linked the CRCS registry with the national health claim databases held by the Health Insurance Review and Assessment Service (HIRA). Linkage was performed using 6 variables common to both sources: birth date, gender, provider identification, receiving year, receiving number, and statement serial number in the benefit claim statement. For matched records, linkage accuracy was evaluated using the difference between the hospital visit date in the CRCS registry and the commencement date of health insurance care in HIRA.
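The linkage and accuracy check described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation. The field names, the `link_records` helper, and the `max_day_gap` tolerance are all hypothetical: the abstract names the 6 linkage variables but does not give column names or the date-difference cutoff used to call a match true.

```python
from datetime import date

# Hypothetical field names for the 6 linkage variables named in the Methods.
LINK_KEYS = (
    "birth_date", "gender", "provider_id",
    "receiving_year", "receiving_number", "statement_serial",
)

def link_key(record):
    """Build the composite key used for exact (deterministic) matching."""
    return tuple(record[k] for k in LINK_KEYS)

def link_records(registry, claims, max_day_gap=1):
    """Match registry rows to claim rows on all 6 keys, then flag each pair.

    max_day_gap is an assumed tolerance (in days) between the registry
    visit date and the claim's care commencement date; the abstract does
    not state the cutoff that was actually applied.
    """
    # Assumes each composite key appears at most once in the claims data.
    claims_by_key = {link_key(c): c for c in claims}
    linked = []
    for r in registry:
        c = claims_by_key.get(link_key(r))
        if c is None:
            continue  # no claim record shares all 6 key values
        gap = abs((r["visit_date"] - c["care_start_date"]).days)
        linked.append({
            "registry": r,
            "claim": c,
            "day_gap": gap,
            "plausible": gap <= max_day_gap,
        })
    return linked
```

Records that share all 6 key values but whose dates differ by more than the tolerance are kept and flagged rather than dropped, which mirrors the distinction between matched records and true matches.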

Results: Of 65,311 CRCS cases, 64,634 were matched to HIRA cases (match rate, 99.0%). Among the matched records, 94.4% (n = 61,017) were true matches. Among true matches (mean age 66.4 years; men 58.4%), the median National Institutes of Health Stroke Scale score was 3 (interquartile range 1-7). Baseline characteristics did not differ substantially between true matches and false matches for any variable.
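The two percentages in the Results follow directly from the reported counts. The short Python check below uses only those counts; the variable names are illustrative.

```python
# Counts as reported in the Results.
crcs_cases = 65_311    # registry cases eligible for linkage
matched = 64_634       # cases matched to a HIRA record
true_matches = 61_017  # matched records judged to be true matches

# Match rate is taken over all CRCS cases.
match_rate = 100 * matched / crcs_cases
# The true-match proportion is taken over matched records only.
true_match_share = 100 * true_matches / matched

unmatched = crcs_cases - matched            # cases with no HIRA match
not_true_matches = matched - true_matches   # matched but not judged true

print(f"match rate: {match_rate:.1f}%")               # 99.0%
print(f"true-match share: {true_match_share:.1f}%")   # 94.4%
```

Note that the two percentages use different denominators: 99.0% is relative to all 65,311 registry cases, while 94.4% is relative to the 64,634 matched records.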

Conclusion: We established big data on stroke by linking CRCS registry and HIRA records, using claims data without personal identifiers. We plan to use the linked database to conduct national stroke research and improve stroke care.

Keywords: Big Data; Data Linkage; National Health Claim Data; Stroke Registry.

MeSH terms

  • Acute Disease
  • Aged
  • Big Data
  • Databases, Factual*
  • Female
  • Humans
  • Information Storage and Retrieval*
  • Insurance Claim Review
  • Male
  • Middle Aged
  • Registries
  • Stroke / pathology*