Survival analysis with high-dimensional covariates

Stat Methods Med Res. 2010 Feb;19(1):29-51. doi: 10.1177/0962280209105024. Epub 2009 Aug 4.

Abstract

In recent years, breakthroughs in biomedical technology have led to a wealth of data in which the number of features (for instance, genes on which expression measurements are available) exceeds the number of observations (e.g. patients). Sometimes survival outcomes are also available for those same observations. In this case, one might be interested in (a) identifying features that are associated with survival (in a univariate sense), and (b) developing a multivariate model for the relationship between the features and survival that can be used to predict survival in a new observation. Because these data are high-dimensional, most classical statistical methods for survival analysis cannot be applied directly. Here, we review a number of methods from the literature that address these two problems.
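
As an illustration of problem (a), the following is a minimal sketch of one common way to screen features one at a time: the univariate Cox score statistic, computed per feature under the null hypothesis of no association. The abstract does not say which screening methods the review covers, so this choice is an assumption. The function name `cox_score_statistics`, its arguments `X`, `time` and `event`, and the Breslow-style handling of tied event times are likewise illustrative choices, not taken from the paper.

```python
import numpy as np


def cox_score_statistics(X, time, event):
    """Univariate Cox score test statistic for each column of X.

    X     : (n, p) array of features; p may exceed n.
    time  : (n,) array of follow-up times.
    event : (n,) array, truthy where the event was observed, falsy if censored.

    Returns a length-p array of standardized score statistics. Large
    absolute values suggest a univariate association with survival;
    a positive value means higher feature values go with higher hazard.
    """
    X = np.asarray(X, dtype=float)
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)

    p = X.shape[1]
    score = np.zeros(p)     # sum over events of (x_i - risk-set mean)
    variance = np.zeros(p)  # sum over events of the risk-set variance

    for i in np.flatnonzero(event):
        # Risk set: everyone still under observation at this event time.
        at_risk = time >= time[i]
        X_risk = X[at_risk]
        risk_mean = X_risk.mean(axis=0)
        risk_var = (X_risk ** 2).mean(axis=0) - risk_mean ** 2
        score += X[i] - risk_mean
        variance += risk_var

    # Features with zero accumulated variance get a statistic of 0.
    return np.divide(score, np.sqrt(variance),
                     out=np.zeros(p), where=variance > 0)
```

Because each feature is tested separately, the calculation stays well defined when features outnumber observations. In practice the resulting statistics would still need a multiple-testing adjustment, which this sketch does not attempt.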

Publication types

  • Review

MeSH terms

  • Analysis of Variance
  • Cluster Analysis
  • Gene Expression Profiling / statistics & numerical data
  • Genome-Wide Association Study / statistics & numerical data
  • Genomics / statistics & numerical data*
  • Humans
  • Nonlinear Dynamics
  • Proportional Hazards Models
  • Survival Analysis*
  • Survivors / statistics & numerical data*
  • Time Factors