Aim: To evaluate the extent to which balance in unmeasured characteristics of patients with type 2 diabetes (T2DM) was achieved in claims data, by comparing against more detailed information from linked electronic health records (EHR) data.
Methods: Within a large US commercial insurance database and using a cohort design, we identified patients with T2DM initiating linagliptin or a comparator agent within class (ie, another dipeptidyl peptidase-4 inhibitor) or outside class (ie, pioglitazone or a sulphonylurea) between May 2011 and December 2012. We focused on comparators used at a similar stage of diabetes to linagliptin. For each comparison, 1:1 propensity score (PS) matching was used to balance >100 baseline claims-based characteristics, including proxies of diabetes severity and duration. Additional clinical data from EHR were available for a subset of patients. We assessed representativeness of the claims-EHR-linked subset, evaluated the balance of claims- and EHR-based covariates before and after PS-matching via standardized differences (SDs), and quantified the potential bias associated with observed imbalances.
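Illustration: the balance metric named in the Methods can be sketched as follows. The abstract does not state which formula was used, so this sketch assumes the commonly used definition of the standardized difference (difference in means or proportions divided by the pooled standard deviation of the two groups). The function names, the example values, and the use of 0.1 as a default threshold are hypothetical choices for illustration, not the authors' implementation.

```python
import math
from statistics import mean, variance


def std_diff_continuous(treated, comparator):
    """Standardized difference for a continuous covariate.

    Assumed formula: (mean_t - mean_c) / sqrt((var_t + var_c) / 2),
    using sample variances.
    """
    pooled_sd = math.sqrt((variance(treated) + variance(comparator)) / 2)
    if pooled_sd == 0:
        return 0.0  # no variation in either group: treat as balanced
    return (mean(treated) - mean(comparator)) / pooled_sd


def std_diff_binary(p_treated, p_comparator):
    """Standardized difference for a binary covariate given two proportions.

    Assumed formula: (p_t - p_c) / sqrt((p_t(1 - p_t) + p_c(1 - p_c)) / 2).
    """
    pooled_sd = math.sqrt(
        (p_treated * (1 - p_treated) + p_comparator * (1 - p_comparator)) / 2
    )
    if pooled_sd == 0:
        return 0.0
    return (p_treated - p_comparator) / pooled_sd


def is_balanced(std_diff, threshold=0.1):
    """Flag a covariate as balanced when |standardized difference| < threshold."""
    return abs(std_diff) < threshold


if __name__ == "__main__":
    # Hypothetical values, not study data.
    hba1c_treated = [7.9, 8.4, 7.2, 9.1, 8.0]
    hba1c_comparator = [8.1, 8.3, 7.5, 8.8, 7.9]
    d = std_diff_continuous(hba1c_treated, hba1c_comparator)
    print(f"HbA1c standardized difference: {d:.3f}, balanced: {is_balanced(d)}")
```

Because the standardized difference does not depend on sample size, the same threshold can be applied before and after matching, which is what makes it convenient for comparing balance across the full and matched populations.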
Results: From a claims-based study population of 166 613 patients with T2DM, 7219 (4.3%) patients were linked to their EHR data. Claims-based characteristics in the EHR-linked and EHR-unlinked patients were similar (SD < 0.1), confirming the representativeness of the EHR-linked subset. The balance of claims-based and EHR-based patient characteristics appeared reasonable before PS-matching and generally improved in the PS-matched population, reaching SD < 0.1 for most patient characteristics and SD < 0.2 for select laboratory results and body mass index categories; imbalances of this size were not large enough to cause meaningful confounding.
Conclusion: In the context of pharmacoepidemiological research on diabetes therapy, choosing appropriate comparison groups paired with a new-user design and 1:1 PS matching on many proxies of diabetes severity and duration improves balance in covariates typically unmeasured in administrative claims datasets, to the extent that residual confounding is unlikely.
Keywords: administrative data; electronic medical records; glucose-lowering medications; linkage; type 2 diabetes.
© 2017 John Wiley & Sons Ltd.