Purpose: Toxicity from systemic cancer treatment is a major source of anxiety for patients and a challenge to treatment plans. We aimed to develop machine learning algorithms that predict, before treatment begins, an individual's risk of experiencing treatment-relevant toxicity during the course of treatment.
Methods: Clinical records were retrieved from a single-center, consecutive cohort of patients who underwent neoadjuvant treatment for early breast cancer. We developed and validated machine learning algorithms to predict grade 3 or 4 toxicity (anemia, neutropenia, deviation of liver enzymes, nephrotoxicity, thrombopenia, electrolyte disturbance, or neuropathy). Two algorithms were developed using 10-fold cross-validation: logistic regression with elastic net penalty (GLM) and support vector machines (SVMs). Algorithm predictions were compared with documented toxicity events, and diagnostic performance was evaluated via the area under the receiver operating characteristic curve (AUROC).
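The modeling workflow described in the Methods can be illustrated with a minimal sketch, assuming scikit-learn and synthetic stand-in data. The feature matrix, hyperparameter grids, RBF kernel, and random seeds below are illustrative assumptions and are not taken from the study; only the split sizes (432 development, 158 validation), the 10-fold cross-validation, the two model families, and the AUROC metric follow the text.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the clinical data (assumption): 590 rows,
# binary label = grade 3 or 4 toxicity, roughly 56% positive.
X, y = make_classification(
    n_samples=590, n_features=20, n_informative=6,
    weights=[0.44, 0.56], random_state=0,
)

# Development and validation sets with the sizes reported in the abstract.
X_dev, X_val, y_dev, y_val = train_test_split(
    X, y, test_size=158, stratify=y, random_state=0
)

# 10-fold cross-validation on the development set for hyperparameter tuning.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

# Hypothetical hyperparameter grids; the study's actual settings are not stated.
models = {
    "GLM": (
        make_pipeline(
            StandardScaler(),
            LogisticRegression(penalty="elasticnet", solver="saga",
                               l1_ratio=0.5, max_iter=5000),
        ),
        {"logisticregression__C": [0.01, 0.1, 1.0],
         "logisticregression__l1_ratio": [0.2, 0.5, 0.8]},
    ),
    "SVM": (
        make_pipeline(StandardScaler(), SVC(kernel="rbf")),
        {"svc__C": [0.1, 1.0, 10.0]},
    ),
}

results = {}
for name, (model, grid) in models.items():
    search = GridSearchCV(model, grid, scoring="roc_auc", cv=cv)
    search.fit(X_dev, y_dev)
    # Continuous decision scores on the held-out validation set.
    scores = search.decision_function(X_val)
    results[name] = roc_auc_score(y_val, scores)

for name, auc in results.items():
    print(f"{name} validation AUROC: {auc:.2f}")
```

Because the data here are synthetic, the printed AUROC values carry no clinical meaning; the sketch only shows how tuning on the development set and scoring on a held-out validation set fit together.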
Results: A total of 590 patients were identified: 432 in the development set and 158 in the validation set. The median age was 51 years, and 55.8% (329 of 590) experienced grade 3 or 4 toxicity. Performance improved significantly when referenced treatment information (referenced regimen, referenced summation dose intensity product) was added to patient and tumor variables: GLM AUROC 0.59 versus 0.75, P = .02; SVM AUROC 0.64 versus 0.75, P = .01.
Conclusion: The individual risk of treatment-relevant toxicity can be predicted using machine learning algorithms. We demonstrate a promising way to improve the efficacy of systemic cancer treatment and to facilitate proactive toxicity management.