A comparison between different prediction models for invasive breast cancer occurrence in the French E3N cohort

Breast Cancer Res Treat. 2015 Apr;150(2):415-26. doi: 10.1007/s10549-015-3321-7. Epub 2015 Mar 6.

Abstract

Breast cancer remains a global health concern, and highly discriminating prediction models are lacking. The k-nearest-neighbor algorithm (kNN) offers an intuitive way to estimate individual risks. This study compares the performance of this approach with that of the Cox and Gail models for predicting 5-year breast cancer risk. The study included 64,995 women from the French E3N prospective cohort. The sample was divided into a learning series (N = 51,821), used to fit the models with fivefold cross-validation, and a validation series (N = 13,174), used to evaluate them. The area under the receiver operating characteristic curve (AUC) and the ratio of expected to observed numbers of cases (E/O) were estimated. In the two series, respectively, 393 and 78 premenopausal and 537 and 98 postmenopausal breast cancers were diagnosed. The discrimination values of the best combinations of predictors obtained from cross-validation ranged from 0.59 to 0.60. In the validation series, the AUC values in premenopausal and postmenopausal women were 0.583 [0.520; 0.646] and 0.621 [0.563; 0.679] using the kNN and 0.565 [0.500; 0.631] and 0.617 [0.561; 0.673] using the Cox model. The E/O ratios were 1.26 and 1.28 in premenopausal women and 1.44 and 1.40 in postmenopausal women. The applied Gail model provided AUC values of 0.614 [0.554; 0.675] and 0.549 [0.495; 0.604] and E/O ratios of 0.78 and 1.12. This study shows that prediction performance differed according to menopausal status when using parametric statistical tools. The k-nearest-neighbor approach performed well, and discrimination was improved in postmenopausal women compared with the Gail model.
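
The two evaluation measures named in the abstract, together with a bare-bones kNN risk estimate, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the Euclidean distance, the choice of k = 3, the single made-up predictor, and the function names `knn_risk`, `auc`, and `eo_ratio` are all assumptions for illustration, and the sketch ignores follow-up time and censoring, which a real 5-year risk analysis would need to handle.

```python
from math import sqrt

def knn_risk(query, train_x, train_y, k):
    """Risk estimate: fraction of cases among the k nearest learning-series points.
    Euclidean distance is an assumption; the abstract does not state the metric."""
    dists = sorted(
        (sqrt(sum((a - b) ** 2 for a, b in zip(query, x))), y)
        for x, y in zip(train_x, train_y)
    )
    return sum(y for _, y in dists[:k]) / k

def auc(scores, labels):
    """AUC as the probability that a case scores higher than a non-case
    (ties count as one half)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if c > n else 0.5 if c == n else 0.0
        for c in cases for n in controls
    )
    return wins / (len(cases) * len(controls))

def eo_ratio(scores, labels):
    """Expected cases (sum of predicted risks) over observed cases."""
    return sum(scores) / sum(labels)

# Made-up toy data: one predictor per woman, 1 = case, 0 = non-case.
train_x = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,), (5.0,)]
train_y = [0, 0, 0, 1, 0, 1]
valid_x = [(0.4,), (2.4,), (4.6,), (3.6,)]
valid_y = [0, 1, 0, 1]

risks = [knn_risk(x, train_x, train_y, k=3) for x in valid_x]
print("AUC:", auc(risks, valid_y))        # 0.625 on this toy data
print("E/O:", eo_ratio(risks, valid_y))   # about 0.83 on this toy data
```

An E/O ratio above 1 means the model predicts more cases than were observed (overestimation), and a ratio below 1 means it predicts fewer; an AUC of 0.5 corresponds to no discrimination between cases and non-cases.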

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • Aged
  • Breast Neoplasms / epidemiology*
  • Carcinoma, Ductal, Breast / epidemiology*
  • Female
  • France
  • Humans
  • Middle Aged
  • Predictive Value of Tests
  • Proportional Hazards Models
  • Prospective Studies
  • ROC Curve
  • Risk Assessment
  • Risk Factors