A stochastic multiple imputation algorithm for missing covariate data in tree-structured survival analysis

Stat Med. 2010 Dec 20;29(29):3004-16. doi: 10.1002/sim.4079. Epub 2010 Oct 20.

Abstract

Missing covariate data present a challenge to tree-structured methodology because a single tree model, as opposed to an estimated parameter value, may be desired for use in a clinical setting. To address this problem, we suggest a multiple imputation algorithm that adds draws of stochastic error to a tree-based single imputation method presented by Conversano and Siciliano (Technical Report, University of Naples, 2003). Unlike previously proposed techniques for accommodating missing covariate data in tree-structured analyses, our methodology allows the modeling of complex and nonlinear covariate structures while still resulting in a single tree model. We perform a simulation study to evaluate our stochastic multiple imputation algorithm when covariate data are missing at random and compare it with other currently used methods. Our algorithm is advantageous for identifying the true underlying covariate structure when the data are complex and larger percentages of covariate observations are missing. It is competitive with other current methods with respect to prediction accuracy. To illustrate our algorithm, we create a tree-structured survival model for predicting time to treatment response in older, depressed adults.
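
The core idea of adding draws of stochastic error to a tree-based single imputation can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how the imputation tree is grown, what error distribution is used, or how the imputed data sets are combined into a single tree. The sketch assumes a one-split regression tree (a stump) as the imputation model and assumes the error is drawn by resampling observed residuals from the same terminal node. The function names `fit_stump` and `stochastic_impute` and the toy data are hypothetical.

```python
import random
from statistics import mean


def fit_stump(x, y):
    """Fit a one-split regression tree predicting y from x (both fully observed).

    Returns (threshold, left_mean, right_mean, left_residuals, right_residuals),
    where observations with x <= threshold fall in the left node.
    """
    best = None
    for t in sorted(set(x))[:-1]:  # every split that leaves both nodes non-empty
        left = [yi for xi, yi in zip(x, y) if xi <= t]
        right = [yi for xi, yi in zip(x, y) if xi > t]
        ml, mr = mean(left), mean(right)
        sse = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
        if best is None or sse < best[0]:
            best = (sse, t, ml, mr, [v - ml for v in left], [v - mr for v in right])
    if best is None:
        raise ValueError("need at least two distinct x values to split")
    return best[1:]


def stochastic_impute(x, y, n_draws=5, seed=0):
    """Return n_draws completed copies of y, where None marks a missing value.

    Assumption: each missing value is imputed as its node mean plus a residual
    resampled from the observed cases in the same node.
    """
    rng = random.Random(seed)
    observed = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    t, ml, mr, res_l, res_r = fit_stump(
        [o[0] for o in observed], [o[1] for o in observed]
    )
    completed = []
    for _ in range(n_draws):
        draw = []
        for xi, yi in zip(x, y):
            if yi is None:
                node_mean, residuals = (ml, res_l) if xi <= t else (mr, res_r)
                yi = node_mean + rng.choice(residuals)  # single imputation + error draw
            draw.append(yi)
        completed.append(draw)
    return completed


# Toy data: x is fully observed, y has two missing entries.
x = [1, 2, 3, 4, 10, 11, 12, 13]
y = [1.0, 1.2, None, 0.8, 5.1, None, 4.9, 5.3]
draws = stochastic_impute(x, y, n_draws=3)
```

Each element of `draws` is one completed version of the covariate. Observed values are left untouched, and the imputed values vary from draw to draw around the mean of their terminal node, which is what distinguishes this from a deterministic single imputation.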

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Aged
  • Algorithms*
  • Computer Simulation
  • Depressive Disorder / therapy
  • Humans
  • Models, Statistical*
  • Nonlinear Dynamics
  • Randomized Controlled Trials as Topic / statistics & numerical data
  • Regression Analysis
  • Remission Induction
  • Stochastic Processes*
  • Survival Analysis*
  • Treatment Outcome