Imputing missing covariates in time-to-event analysis within distributed research networks: A simulation study

Pharmacoepidemiol Drug Saf. 2023 Mar;32(3):330-340. doi: 10.1002/pds.5563. Epub 2022 Nov 30.

Abstract

Purpose: In distributed research network (DRN) settings, multiple imputation cannot be directly implemented because pooling individual-level data is often not feasible. The performance of multiple imputation in combination with meta-analysis is not well understood within DRNs.

Methods: To evaluate the performance of imputation for missing baseline covariate data in combination with meta-analysis for time-to-event analysis within DRNs, we compared four imputation algorithms through simulation studies motivated by a real-world data set: two parametric algorithms, an approximated linear imputation model (Approx) and a nonlinear substantive model compatible imputation model (SMC); and two non-parametric machine learning algorithms, random forest (RF) and classification and regression trees (CART).
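As an illustration of how per-site results from multiply imputed data sets might be summarized before meta-analysis, the sketch below pools logHR estimates with Rubin's rules. The abstract does not state which pooling rule was used, so Rubin's rules are an assumption here, and the function name `rubin_pool` and the example numbers are hypothetical.

```python
from statistics import mean, variance


def rubin_pool(estimates, variances):
    """Pool m imputed-data estimates of one parameter using Rubin's rules.

    estimates: point estimates (e.g., logHRs), one per imputed data set
    variances: squared standard errors, one per imputed data set
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    pooled = mean(estimates)        # average of the m point estimates
    within = mean(variances)        # within-imputation variance
    between = variance(estimates)   # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical logHR estimates from three imputed data sets at one site
pooled_loghr, total_var = rubin_pool([0.10, 0.20, 0.30], [0.04, 0.04, 0.04])
```

The total variance adds the between-imputation component, inflated by (1 + 1/m), to the average within-imputation variance, so the pooled standard error reflects the uncertainty due to the missing values.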

Results: Under settings with small effect sizes (i.e., log-hazard ratios [logHR]) and homogeneous missingness mechanisms across sites, all imputation methods produced unbiased and more efficient estimates, whereas the complete-case analysis could be biased and inefficient. Under heterogeneous missingness mechanisms, estimates from the RF method could have higher efficiency. Estimates from the distributed imputation combined by meta-analysis were similar to those from the imputation using pooled data. When logHRs were large, the SMC imputation algorithm generally performed better than the others.
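To make the idea of combining site-level estimates by meta-analysis concrete, the sketch below applies a fixed-effect, inverse-variance weighted average to per-site logHRs. The abstract does not specify the meta-analysis model, so the fixed-effect choice is an assumption, and the function name `fixed_effect_meta` and the example values are hypothetical.

```python
import math


def fixed_effect_meta(site_estimates, site_variances):
    """Combine site-level estimates by inverse-variance weighting.

    site_estimates: one logHR estimate per site
    site_variances: the variance (squared standard error) of each estimate
    Returns (pooled_estimate, pooled_variance).
    """
    if not site_estimates or len(site_estimates) != len(site_variances):
        raise ValueError("need one positive variance per site estimate")
    if any(v <= 0 for v in site_variances):
        raise ValueError("variances must be positive")
    weights = [1.0 / v for v in site_variances]  # more precise sites weigh more
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, site_estimates)) / total_weight
    return pooled, 1.0 / total_weight


# Hypothetical logHRs and variances from two sites
meta_loghr, meta_var = fixed_effect_meta([0.2, 0.4], [0.01, 0.04])
meta_hr = math.exp(meta_loghr)  # back-transform to the hazard ratio scale
```

Only the site-level estimates and their variances cross site boundaries in this scheme, which is why it is compatible with a DRN in which individual-level data cannot be pooled.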

Conclusions: These findings suggest the validity and feasibility of imputation within DRNs in the presence of missing covariate data in time-to-event analysis under various settings. The performance of the four imputation algorithms varies with the effect sizes and level of missingness.

Keywords: Cox model; distributed research networks; missing covariates; multiple imputation; simulation study.

Publication types

  • Meta-Analysis
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Algorithms*
  • Computer Simulation
  • Humans
  • Linear Models
  • Proportional Hazards Models