Sampling strategy, characteristics and representativeness of the InGef research database

Public Health. 2022 May:206:57-62. doi: 10.1016/j.puhe.2022.02.013. Epub 2022 Apr 1.

Abstract

Objectives: The aim of this study was to describe the sampling strategy, characteristics and external validity of a representative sample database drawn from the German InGef research database.

Study design: This is a retrospective cohort study using anonymized claims data for the year 2019.

Methods: The InGef research database is an anonymized healthcare database with longitudinal claims data from approximately 8.8 million insurees. A sample of four million insurees was drawn that was intended to be representative of the German population with respect to age, sex and region. In addition to demographic information, data on hospitalization rates, mortality rates and drug prescription rates from the InGef sample database were analysed for the year 2019 to demonstrate validity and representativeness. Corresponding national reference data were obtained from official sources.
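The abstract does not spell out the sampling procedure, so the following is only a minimal sketch of one common way to draw a sample that matches a reference population on age, sex and region: proportional stratified sampling. The function name `stratified_sample`, the record fields `age_group`, `sex` and `region`, and the `target_shares` mapping are all hypothetical and are not taken from the study.

```python
import random
from collections import defaultdict


def stratified_sample(records, target_shares, n, seed=0):
    """Draw n records so that each stratum matches its target share.

    records       -- list of dicts with keys "age_group", "sex", "region"
                     (hypothetical field names)
    target_shares -- dict mapping (age_group, sex, region) to the share of
                     that stratum in the reference population; must sum to 1
    """
    if abs(sum(target_shares.values()) - 1.0) > 1e-9:
        raise ValueError("target shares must sum to 1")

    rng = random.Random(seed)

    # Group the source records by stratum key.
    pools = defaultdict(list)
    for rec in records:
        pools[(rec["age_group"], rec["sex"], rec["region"])].append(rec)

    # Allocate sample sizes: take the floor of each exact quota, then give
    # the remaining units to the strata with the largest fractional parts.
    quotas = {key: share * n for key, share in target_shares.items()}
    alloc = {key: int(quota) for key, quota in quotas.items()}
    leftover = n - sum(alloc.values())
    by_remainder = sorted(quotas, key=lambda k: quotas[k] - alloc[k], reverse=True)
    for key in by_remainder[:leftover]:
        alloc[key] += 1

    # Sample without replacement inside each stratum.
    sample = []
    for key, size in alloc.items():
        if size > len(pools[key]):
            raise ValueError(f"stratum {key} has too few records for quota {size}")
        sample.extend(rng.sample(pools[key], size))
    return sample
```

In this sketch the source database only needs enough insurees in every stratum to fill its quota, which is why a sample smaller than the full database can match the reference distribution even when the database itself does not.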

Results: The distributions of sex and age were similar in the InGef sample database and the German population (proportion of women: 50.8% vs 50.7%; mean age: 44.1 vs 43.9 years). The proportion of insurees living in the eastern part of Germany was lower in the InGef sample database (16.5% vs 19.5%). Hospitalization rates and overall mortality rates agreed well with German reference data. Prescription rates for the 20 most frequently reimbursed drug classes were slightly higher in the InGef sample database.

Conclusions: The InGef sample database shows good overall agreement with the German population on measures of morbidity, mortality and drug usage.

Keywords: Claims data; Data sources; External validity; Healthcare databases; Pharmacoepidemiology.

MeSH terms

  • Adult
  • Databases, Factual
  • Drug Prescriptions*
  • Female
  • Germany / epidemiology
  • Hospitalization*
  • Humans
  • Retrospective Studies