Comparing lagged linear correlation, lagged regression, Granger causality, and vector autoregression for uncovering associations in EHR data

AMIA Annu Symp Proc. 2017 Feb 10;2016:779-788. eCollection 2016.

Abstract

Time series analysis methods have been shown to reveal clinical and biological associations in data collected in the electronic health record. We wish to develop reliable, high-throughput methods that identify adverse drug effects, are easy to implement, and produce readily interpretable results. To move toward this goal, we used univariate and multivariate lagged regression models to investigate associations between 20 pairs of drug orders and laboratory measurements. Across these 20 pairs, multivariate lagged regression models exhibited higher sensitivity and specificity than univariate lagged regression, and incorporating autoregressive terms for labs and drugs produced more robust signals in cases of known associations. Moreover, including inpatient admission terms in the model attenuated the signals for some cases of unlikely associations, demonstrating how the explicit handling of context-based variables in multivariate lagged regression models can provide a simple way to probe for health-care processes that confound analyses of EHR data.
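
To make the idea of a multivariate lagged regression with autoregressive and context terms concrete, the following is a minimal sketch and not the authors' implementation. It regresses a lab value at time t on its own past values (autoregressive terms), past drug orders, and past inpatient admission indicators, using ordinary least squares. The function names, the choice of three lags, the evenly spaced time steps, and the synthetic data are all assumptions made for illustration only.

```python
import numpy as np


def build_lagged_design(lab, drug, admit, n_lags):
    """Stack lags 1..n_lags of lab, drug, and admission into a design matrix.

    Column order: intercept, lab lags, drug lags, admission lags.
    (Hypothetical helper; names and layout are illustrative assumptions.)
    """
    lab = np.asarray(lab, dtype=float)
    drug = np.asarray(drug, dtype=float)
    admit = np.asarray(admit, dtype=float)
    n = len(lab)
    rows = n - n_lags
    cols = [np.ones(rows)]
    for series in (lab, drug, admit):
        for k in range(1, n_lags + 1):
            # Value at time t-k, aligned with the target at time t,
            # for t = n_lags, ..., n-1.
            cols.append(series[n_lags - k : n - k])
    X = np.column_stack(cols)
    y = lab[n_lags:]
    return X, y


def fit_lagged_regression(lab, drug, admit, n_lags=3):
    """Fit the lagged model by ordinary least squares and group coefficients."""
    X, y = build_lagged_design(lab, drug, admit, n_lags)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return {
        "intercept": coef[0],
        "lab": coef[1 : 1 + n_lags],                 # autoregressive terms
        "drug": coef[1 + n_lags : 1 + 2 * n_lags],   # lagged drug effect
        "admit": coef[1 + 2 * n_lags :],             # context (admission) terms
    }


# Synthetic data (an assumption for illustration, not EHR data):
# the lab depends on its previous value and on the drug given two steps earlier;
# admission has no effect by construction.
rng = np.random.default_rng(0)
n = 2000
drug = rng.integers(0, 2, size=n).astype(float)
admit = rng.integers(0, 2, size=n).astype(float)
noise = rng.normal(scale=0.1, size=n)
lab = np.zeros(n)
for t in range(2, n):
    lab[t] = 0.5 * lab[t - 1] + 2.0 * drug[t - 2] + noise[t]

coefs = fit_lagged_regression(lab, drug, admit, n_lags=3)
print("drug lag coefficients:", np.round(coefs["drug"], 2))
print("lab lag coefficients:", np.round(coefs["lab"], 2))
print("admission lag coefficients:", np.round(coefs["admit"], 2))
```

In this sketch the drug signal is read off the lagged drug coefficients. Dropping the lab and admission columns from the design matrix would give a univariate lagged regression for comparison, which is the kind of contrast the abstract describes.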

Publication types

  • Comparative Study

MeSH terms

  • Drug-Related Side Effects and Adverse Reactions*
  • Electronic Health Records*
  • Humans
  • Models, Statistical*
  • Regression Analysis