How to assess prognostic models for survival data: a case study in oncology

M Schumacher; E Graf; T Gerds

Methods Inf Med. 2003;42(5):564-71.

Authors

M Schumacher¹, E Graf, T Gerds

Affiliation

¹ Institute of Medical Biometry and Medical Informatics, University Hospital Freiburg, Freiburg, Germany.

PMID: 14654892

Abstract

Objectives: Generally applicable tools for assessing predictions for survival data are lacking. Prediction error curves based on the Brier score, which have been suggested as a sensible approach, are illustrated by means of a case study.

Methods: Predictions are expressed as conditional survival probabilities given the patient's covariates. Such predictions are derived from various statistical models for survival data, including artificial neural networks. How the prediction error of a prognostic classification scheme can be followed over time is illustrated with data from two studies on the prognosis of node-positive breast cancer patients, one of which serves as an independent test data set.

Results and conclusions: The Brier score as a function of time is shown to be a valuable tool for assessing the predictive performance of prognostic classification schemes for survival data that include censored observations. Comparison with the prediction based on the pooled Kaplan-Meier estimator yields a benchmark value for any classification scheme incorporating patients' covariate measurements. The overoptimistic assessment of prediction error caused by data-driven modelling, as done for example with artificial neural nets, can be circumvented by assessment in an independent test data set.
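Illustration: the time-dependent Brier score described in the abstract can be sketched as follows. This is a minimal sketch, not the authors' implementation. It assumes that censoring is handled by inverse-probability-of-censoring weights taken from a Kaplan-Meier estimate of the censoring distribution, which the abstract does not specify. The function names (km_survival, brier_score) and the toy data in the usage note are hypothetical.

```python
import numpy as np


def km_survival(times, events):
    """Kaplan-Meier estimate; returns a step function S(t) (hypothetical helper)."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    uniq = np.unique(times[events])  # sorted distinct event times
    at_risk = np.array([(times >= u).sum() for u in uniq])
    n_events = np.array([((times == u) & events).sum() for u in uniq])
    # Prepend 1.0 so that index 0 means "before the first event time".
    steps = np.concatenate(([1.0], np.cumprod(1.0 - n_events / at_risk)))

    def S(t, left_limit=False):
        # side="right" counts event times <= t, side="left" counts those < t,
        # which gives S(t) and the left limit S(t-) respectively.
        side = "left" if left_limit else "right"
        return steps[np.searchsorted(uniq, t, side=side)]

    return S


def brier_score(t, times, events, pred_surv):
    """Brier score at time t for predicted survival probabilities S(t | x_i).

    Assumption: inverse-probability-of-censoring weighting. Patients censored
    before t receive weight 0; the others are up-weighted to compensate.
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    pred_surv = np.asarray(pred_surv, dtype=float)

    # Censoring distribution G: Kaplan-Meier with the event indicator flipped.
    G = km_survival(times, ~events)

    died = (times <= t) & events  # event observed by t: true status is 0
    alive = times > t             # still under observation at t: true status is 1

    weights = np.zeros_like(times)
    weights[died] = 1.0 / G(times[died], left_limit=True)
    weights[alive] = 1.0 / G(t)

    # Squared distance between the true status at t and the predicted probability.
    sq_err = np.where(died, pred_surv ** 2, (1.0 - pred_surv) ** 2)
    return float(np.mean(weights * sq_err))
```

Evaluating brier_score over a grid of time points gives a prediction error curve. The benchmark mentioned above would correspond to passing the pooled Kaplan-Meier value km_survival(times, events)(t) as the prediction for every patient. For the toy data times = [1, 2, 3, 4], events = [1, 0, 1, 1] and a constant prediction of 0.5 at t = 2.5, the sketch returns 0.25.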

Publication types

Evaluation Study

MeSH terms

Breast Neoplasms / diagnosis
Breast Neoplasms / mortality
Female
Humans
Neoplasms / diagnosis*
Neoplasms / mortality
Neural Networks, Computer
Prognosis
Proportional Hazards Models
Reproducibility of Results
Survival Analysis*