Survival analysis with high-dimensional covariates

Daniela M Witten; Robert Tibshirani

doi:10.1177/0962280209105024

Stat Methods Med Res. 2010 Feb;19(1):29-51. doi: 10.1177/0962280209105024. Epub 2009 Aug 4.

Authors

Daniela M Witten¹, Robert Tibshirani

Affiliation

¹ Department of Statistics, Stanford University, Stanford, CA 94305, USA.

Abstract

In recent years, breakthroughs in biomedical technology have led to a wealth of data in which the number of features (e.g. genes on which expression measurements are available) exceeds the number of observations (e.g. patients). Sometimes survival outcomes are also available for those same observations. In this case, one might be interested in (a) identifying features that are associated with survival (in a univariate sense), and (b) developing a multivariate model for the relationship between the features and survival that can be used to predict survival in a new observation. Because of the high dimensionality of these data, most classical statistical methods for survival analysis cannot be applied directly. Here, we review a number of methods from the literature that address these two problems.
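
As an illustration of problem (a), the following is a minimal sketch of one common univariate screen: computing a Cox score test statistic separately for each feature and ranking features by its absolute value. It is not taken from the review itself. The function names (cox_score_statistic, screen_features), the simulated data, and the Breslow-style handling of ties are all assumptions made for this example, and it uses only the Python standard library.

```python
import math
import random


def cox_score_statistic(x, time, event):
    """Score test statistic (at beta = 0) for one feature in a Cox model.

    x: feature values; time: follow-up times; event: 1 if the event was
    observed, 0 if censored. Ties are handled in the Breslow style.
    """
    u = 0.0  # score: feature value minus risk-set mean, summed over events
    v = 0.0  # information: risk-set variance of the feature, summed over events
    for i, (t_i, d_i) in enumerate(zip(time, event)):
        if not d_i:
            continue  # censored observations contribute only through risk sets
        risk = [x[j] for j, t_j in enumerate(time) if t_j >= t_i]
        mean = sum(risk) / len(risk)
        u += x[i] - mean
        v += sum((r - mean) ** 2 for r in risk) / len(risk)
    return u / math.sqrt(v) if v > 0 else 0.0


def screen_features(X, time, event):
    """Return one score statistic per column of X (given as a list of rows)."""
    p = len(X[0])
    return [
        cox_score_statistic([row[k] for row in X], time, event)
        for k in range(p)
    ]


# Hypothetical toy data: 40 "patients", 200 "genes" (more features than
# observations); only gene 0 influences the hazard.
random.seed(0)
n, p = 40, 200
X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
true_time = [random.expovariate(math.exp(X[i][0])) for i in range(n)]
cens_time = [random.expovariate(0.3) for _ in range(n)]
time = [min(t, c) for t, c in zip(true_time, cens_time)]
event = [1 if t <= c else 0 for t, c in zip(true_time, cens_time)]

z = screen_features(X, time, event)
ranked = sorted(range(p), key=lambda k: abs(z[k]), reverse=True)
print("features with largest |z|:", ranked[:5])
```

Because each feature is tested on its own, this screen remains well defined when features outnumber observations. With many features, the resulting statistics would still need a multiple-testing adjustment before any feature is declared associated with survival.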

Publication types

Review

MeSH terms

Analysis of Variance
Cluster Analysis
Gene Expression Profiling / statistics & numerical data
Genome-Wide Association Study / statistics & numerical data
Genomics / statistics & numerical data*
Humans
Nonlinear Dynamics
Proportional Hazards Models
Survival Analysis*
Survivors / statistics & numerical data*
Time Factors

Grants and funding

P01 CA034233/CA/NCI NIH HHS/United States