Practical experiences on the necessity of external validation

Stat Med. 2007 Dec 30;26(30):5499-511. doi: 10.1002/sim.3069.

Abstract

The validity of prognostic models is an important prerequisite for their applicability in practical clinical settings. Here, we report on a specific prognostic study on stroke patients and describe how we explored the prediction performance of our model. We considered two generalization aspects of high practical relevance, namely, the model's performance in patients recruited at a later time point (temporal transportability) and in medical centers different from those used for model building (geographic transportability). To estimate the accuracy of the model, we investigated classical internal validation techniques and leave-one-center-out cross-validation (CV). Prognostic models predicting functional independence of stroke patients were developed in a training set using logistic regression, support vector machines, and random forests (RFs). Tenfold CV and leave-one-center-out CV were employed to estimate temporal and geographic transportability of the models. For temporal and external validation, the resulting models were used to classify patients from a later time point and from different clinics. When applying the regression model or the RFs, accuracy in the temporal validation data was well predicted from classical internal validation. However, when predicting geographic transportability, all approaches had difficulties. We observed that the leave-one-center-out CV yielded better estimates than classical CV. On the basis of our results, we conclude that external validation in patients from different clinics is required before a prognostic model can be applied in practice. Even validating the model in patients recruited merely at a later time point does not suffice to predict how it may fare with regard to another clinic.
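
The difference between classical tenfold CV and leave-one-center-out CV lies in how patients are assigned to folds. The following is a minimal sketch of the two splitting schemes only, not the authors' implementation: the function names and the toy center labels are hypothetical, and model fitting is left out. In leave-one-center-out CV each fold holds out all patients of one center, so the model is always tested on a clinic it has not seen; in classical k-fold CV the folds ignore center membership.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterator, List, Sequence, Tuple


def leave_one_center_out_splits(
    centers: Sequence[Hashable],
) -> Iterator[Tuple[Hashable, List[int], List[int]]]:
    """Yield (held_out_center, train_indices, test_indices), one fold per center."""
    by_center: Dict[Hashable, List[int]] = defaultdict(list)
    for i, center in enumerate(centers):
        by_center[center].append(i)
    if len(by_center) < 2:
        raise ValueError("need at least two centers")
    for held_out, test_idx in by_center.items():
        # Train on every patient who was NOT treated at the held-out center.
        train_idx = [i for i, c in enumerate(centers) if c != held_out]
        yield held_out, train_idx, test_idx


def k_fold_splits(n_samples: int, k: int = 10) -> Iterator[Tuple[List[int], List[int]]]:
    """Classical k-fold: contiguous folds that ignore centers.

    Assumes the rows were shuffled beforehand, as is usual in practice.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    # The first (n_samples % k) folds receive one extra sample.
    sizes = [n_samples // k + (1 if f < n_samples % k else 0) for f in range(k)]
    start = 0
    for size in sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


# Hypothetical toy data: one center label per patient.
centers = ["A", "A", "B", "B", "B", "C", "C"]
loco_folds = list(leave_one_center_out_splits(centers))
kfold_folds = list(k_fold_splits(len(centers), k=3))

for held_out, train_idx, test_idx in loco_folds:
    print(held_out, "train:", train_idx, "test:", test_idx)
```

In a full analysis, a model would be fitted on each set of training indices and scored on the matching test indices; averaging the per-center scores then gives an internal estimate of geographic transportability, which can be compared with the accuracy actually observed in external clinics.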

MeSH terms

  • Artificial Intelligence
  • Brain Ischemia / mortality
  • Brain Ischemia / rehabilitation
  • Confidence Intervals
  • Decision Trees
  • Epidemiologic Research Design
  • Germany / epidemiology
  • Humans
  • Logistic Models
  • Multicenter Studies as Topic / methods
  • Prognosis*
  • Recovery of Function
  • Validation Studies as Topic*