Predictions of the pathological response to neoadjuvant chemotherapy in patients with primary breast cancer using a data mining technique

Breast Cancer Res Treat. 2012 Jul;134(2):661-70. doi: 10.1007/s10549-012-2109-2. Epub 2012 Jun 12.

Abstract

The nomogram, a standard technique that uses multiple characteristics to predict the efficacy of treatment and the likelihood of a specific status in an individual patient, has been used to predict the response to neoadjuvant chemotherapy (NAC) in breast cancer patients. The aim of this study was to develop a novel computational technique to predict the pathological complete response (pCR) to NAC in primary breast cancer patients. A mathematical model using alternating decision trees, a derivative of the decision tree, was developed using 28 clinicopathological variables that were retrospectively collected from patients treated with NAC (n = 150), and validated using an independent dataset from a randomized controlled trial (n = 173). The model selected 15 variables to predict the pCR, yielding area under the receiver operating characteristic curve (AUC) values of 0.766 [95 % confidence interval (CI) 0.671-0.861, P value < 0.0001] in cross-validation using the training dataset and 0.787 (95 % CI 0.716-0.858, P value < 0.0001) in the validation dataset. Among three subtypes of breast cancer, the luminal subgroup showed the best discrimination (AUC = 0.779, 95 % CI 0.641-0.917, P value = 0.0059). On the validation data without missing values (n = 127), the developed model (AUC = 0.805, 95 % CI 0.716-0.894, P value < 0.0001) outperformed multivariate logistic regression (AUC = 0.754, 95 % CI 0.651-0.858, P value = 0.00019). Several analyses, e.g. bootstrap analysis, revealed that the developed model was insensitive to missing values and tolerant of distribution bias among the datasets. Our model based on clinicopathological variables showed high predictive ability for pCR. This model might improve the prediction of the response to NAC in primary breast cancer patients.
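For readers unfamiliar with how an AUC and its 95 % CI are obtained from predicted scores, the following is a minimal sketch in Python. The abstract does not state which interval method the authors used; the Hanley-McNeil approximation is assumed here purely for illustration. The function names, the scores, and the labels are all hypothetical toy values and are not data from the study.

```python
import math


def auc_mann_whitney(scores, labels):
    """AUC as the probability that a random positive case (e.g. pCR)
    receives a higher score than a random negative case; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def hanley_mcneil_ci(auc, n_pos, n_neg, z=1.96):
    """Approximate CI for an AUC (Hanley-McNeil standard error).
    ASSUMPTION: chosen for illustration; not confirmed as the study's method."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    var = (
        auc * (1.0 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    se = math.sqrt(var)
    # Clip to [0, 1] because the normal approximation can overshoot.
    return max(0.0, auc - z * se), min(1.0, auc + z * se)


# Hypothetical toy data: model scores and observed outcomes (1 = pCR, 0 = no pCR).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
toy_labels = [1, 1, 0, 1, 0, 0, 1, 0]

auc = auc_mann_whitney(toy_scores, toy_labels)
lo, hi = hanley_mcneil_ci(auc, n_pos=sum(toy_labels),
                          n_neg=len(toy_labels) - sum(toy_labels))
print(f"AUC = {auc:.3f}, 95% CI {lo:.3f}-{hi:.3f}")
```

With only eight toy cases the interval is very wide, which illustrates why the intervals reported above narrow as the number of evaluated patients grows.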

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • Aged
  • Area Under Curve
  • Breast Neoplasms / drug therapy*
  • Carcinoma, Ductal, Breast / drug therapy*
  • Chemotherapy, Adjuvant
  • Computer Simulation
  • Data Interpretation, Statistical
  • Data Mining*
  • Decision Trees
  • Female
  • Humans
  • Logistic Models
  • Middle Aged
  • Models, Biological
  • Multivariate Analysis
  • Neoadjuvant Therapy
  • Nomograms
  • ROC Curve
  • Retrospective Studies
  • Treatment Outcome