Confidence scores for prediction models

Biom J. 2011 Mar;53(2):259-74. doi: 10.1002/bimj.201000157. Epub 2011 Feb 17.

Abstract

In medical statistics, many alternative strategies are available for building a prediction model from training data. Prediction models are routinely compared by their prediction performance in independent validation data. If only one data set is available for both training and validation, rival strategies can still be compared using repeated bootstrap samples of the same data. Often, however, the overall performance of rival strategies is similar, which makes it difficult to choose one model. Here, we investigate the variability of the prediction models that results when the same modelling strategy is applied to different training sets. For each modelling strategy we estimate a confidence score based on the same repeated bootstraps. A new decomposition of the expected Brier score is obtained, as well as estimates of population average confidence scores. The latter can be used to distinguish rival prediction models with similar prediction performance. Furthermore, at the subject level a confidence score may provide useful supplementary information for new patients who want to base a medical decision on predicted risk. The ideas are illustrated and discussed using data from cancer studies, including settings with a high-dimensional predictor space.
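
The bootstrap idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the formula for the confidence score, so the code below is not the authors' method: it only shows how refitting one modelling strategy on repeated bootstrap samples yields, for each new subject, a set of predicted risks whose spread reflects how much the model depends on the training set. The names `fit_logistic`, `bootstrap_predictions`, and `prediction_spread`, the simulated data, and the use of the standard deviation as a stability summary are all assumptions made for illustration.

```python
import numpy as np


def fit_logistic(X, y, n_iter=200, lr=0.1, l2=1e-2):
    """Hypothetical modelling strategy: ridge-penalised logistic regression
    fitted by plain gradient descent. Returns a function that predicts risk.
    For brevity the penalty is also applied to the intercept."""
    Xb = np.column_stack([np.ones(len(X)), X])
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        grad = Xb.T @ (p - y) / len(y) + l2 * w
        w -= lr * grad

    def predict(X_new):
        Xn = np.column_stack([np.ones(len(X_new)), X_new])
        return 1.0 / (1.0 + np.exp(-Xn @ w))

    return predict


def bootstrap_predictions(fit, X, y, X_new, n_boot=100, seed=0):
    """Refit the same strategy on n_boot bootstrap samples of (X, y) and
    collect the predicted risks for the subjects in X_new.
    Returns an array of shape (n_boot, len(X_new))."""
    rng = np.random.default_rng(seed)
    n = len(y)
    preds = np.empty((n_boot, len(X_new)))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # draw n indices with replacement
        preds[b] = fit(X[idx], y[idx])(X_new)
    return preds


def prediction_spread(preds):
    """Illustrative per-subject stability summary (an assumption, not the
    paper's confidence score): standard deviation across bootstrap models."""
    return preds.std(axis=0)


# Simulated data, purely for demonstration.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
true_risk = 1.0 / (1.0 + np.exp(-(0.8 * X[:, 0] - 0.5 * X[:, 1])))
y = (rng.random(200) < true_risk).astype(float)
X_new = rng.normal(size=(3, 3))  # three hypothetical new patients

preds = bootstrap_predictions(fit_logistic, X, y, X_new)
spread = prediction_spread(preds)
print("mean predicted risk:", preds.mean(axis=0))
print("bootstrap spread:   ", spread)
```

A small spread for a given patient means the bootstrap models largely agree on that patient's risk, while a large spread means the prediction is sensitive to the training sample. Running the same functions with a second `fit` function would allow two rival strategies to be compared on this kind of stability in addition to their prediction performance.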

MeSH terms

  • Algorithms
  • Breast Neoplasms / therapy
  • Clinical Trials as Topic
  • Confidence Intervals
  • Data Interpretation, Statistical
  • Female
  • Humans
  • Male
  • Models, Statistical
  • Models, Theoretical
  • Neoplasms / diagnosis
  • Neoplasms / therapy*
  • Predictive Value of Tests
  • Prospective Studies
  • Prostatic Neoplasms / therapy
  • Regression Analysis
  • Reproducibility of Results
  • Statistics as Topic*