Methods for using clinical laboratory test results as baseline confounders in multi-site observational database studies when missing data are expected

Pharmacoepidemiol Drug Saf. 2016 Jul;25(7):798-814. doi: 10.1002/pds.4015. Epub 2016 May 4.

Abstract

Purpose: Our purpose was to quantify missing baseline laboratory results, assess predictors of missingness, and examine performance of missing data methods.

Methods: Using the Mini-Sentinel Distributed Database from three sites, we selected three exposure-outcome scenarios with laboratory results as baseline confounders. We compared hazard ratios (HRs) or risk differences (RDs) and 95% confidence intervals (CIs) from models that omitted laboratory results, included only available results (complete cases), and included results after applying missing data methods (multiple imputation [MI] regression, MI predictive mean matching [PMM], and the missing indicator method).
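To make two of the named missing data methods concrete, the following is a minimal sketch of the missing indicator method and of a simplified predictive mean matching step. It is not the study's implementation: the function names (missing_indicator, fit_line, pmm_impute), the single covariate, the donor pool size k, and the toy age and glucose values are all hypothetical, and the PMM step shown performs one imputation from a single fitted regression, whereas full MI would repeat it over several imputed datasets with parameter draws and pool the estimates.

```python
import random
from statistics import mean


def missing_indicator(values, fill=0.0):
    """Missing indicator method: replace each missing value with a constant
    and return a 0/1 flag marking which entries were missing."""
    filled = [fill if v is None else v for v in values]
    flags = [1 if v is None else 0 for v in values]
    return filled, flags


def fit_line(x, y):
    """Ordinary least squares for y = a + b * x with a single covariate."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def pmm_impute(x, y, k=3, rng=None):
    """Simplified predictive mean matching (one imputation, no parameter draw).

    Each missing y is replaced by the observed y of a donor chosen at random
    from the k observed records whose predicted values are closest to the
    predicted value of the record with the missing result.
    """
    rng = rng or random.Random(0)
    observed = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    a, b = fit_line([o[0] for o in observed], [o[1] for o in observed])
    donors = [(a + b * xi, yi) for xi, yi in observed]
    imputed = []
    for xi, yi in zip(x, y):
        if yi is not None:
            imputed.append(yi)  # observed results are kept unchanged
            continue
        pred = a + b * xi
        nearest = sorted(donors, key=lambda d: abs(d[0] - pred))[:k]
        imputed.append(rng.choice(nearest)[1])
    return imputed


# Hypothetical toy data: age as the only predictor, glucose partly missing.
age = [30, 42, 50, 60, 72, 80]
glucose = [90.0, None, 100.0, 105.0, None, 120.0]

filled, flags = missing_indicator(glucose)
imputed = pmm_impute(age, glucose, k=2)
```

Because PMM borrows values from observed donors, every imputed result is a value that actually occurred in the data, which keeps imputed laboratory values within a plausible range.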

Results: Scenario 1 considered glucose among second-generation antipsychotic users and diabetes. Across sites, glucose was available for 27.7-58.9%. Results differed between complete case and missing data models (e.g., olanzapine: HR 0.92 [CI 0.73, 1.12] vs 1.02 [0.90, 1.16]). Across-site models employing different MI approaches provided similar HRs and CIs; site-specific models provided differing estimates. Scenario 2 evaluated creatinine among individuals starting high versus low dose lisinopril and hyperkalemia. Creatinine was available for 44.5-79.0%. Results differed between complete case and missing data models (e.g., HR 0.84 [CI 0.77, 0.92] vs 0.88 [0.83, 0.94]). HRs and CIs were identical across MI methods. Scenario 3 examined international normalized ratio (INR) among warfarin users starting interacting versus noninteracting antimicrobials and bleeding. INR was available for 20.0-92.9%. Results differed between ignoring INR versus including INR using missing data methods (e.g., RD 0.05 [CI -0.03, 0.13] vs 0.09 [0.00, 0.18]). Indicator and PMM methods gave similar estimates.

Conclusion: Multi-site studies must consider site variability in missing data. Different missing data methods performed similarly. Copyright © 2016 John Wiley & Sons, Ltd.

Keywords: Mini-Sentinel; baseline confounders; database; laboratory test results; missing data methods; observational data; pharmacoepidemiology.

Publication types

  • Multicenter Study

MeSH terms

  • Antipsychotic Agents / administration & dosage
  • Antipsychotic Agents / adverse effects
  • Clinical Laboratory Techniques*
  • Confounding Factors, Epidemiologic
  • Creatinine / analysis
  • Data Interpretation, Statistical*
  • Databases, Factual / statistics & numerical data*
  • Drug-Related Side Effects and Adverse Reactions / diagnosis
  • Drug-Related Side Effects and Adverse Reactions / epidemiology*
  • Female
  • Glucose / analysis
  • Humans
  • International Normalized Ratio / methods
  • Lisinopril / administration & dosage
  • Lisinopril / adverse effects
  • Male
  • Proportional Hazards Models
  • Regression Analysis
  • Warfarin / administration & dosage
  • Warfarin / adverse effects

Substances

  • Antipsychotic Agents
  • Warfarin
  • Creatinine
  • Lisinopril
  • Glucose