Machine Learning for Predicting Clinician Evaluation of Treatment Plans for Left-Sided Whole Breast Radiation Therapy

Adv Radiat Oncol. 2023 Apr 29;8(5):101228. doi: 10.1016/j.adro.2023.101228. eCollection 2023 Sep-Oct.

Abstract

Purpose: The objective of this work was to investigate the ability of machine learning models to use treatment plan dosimetry for prediction of clinician approval of treatment plans (no further planning needed) for left-sided whole breast radiation therapy with boost.

Methods and materials: Investigated plans were generated to deliver a dose of 40.05 Gy to the whole breast in 15 fractions over 3 weeks, with the tumor bed simultaneously boosted to 48 Gy. In addition to the manually generated clinical plan of each of the 120 patients from a single institution, an automatically generated plan was included for each patient to increase the number of study plans to 240. In random order, the treating clinician retrospectively scored all 240 plans as (1) approved without further planning to seek improvement or (2) further planning needed, while blinded to the type of plan generation (manual or automated). In total, 2 × 5 classifiers were trained and evaluated for their ability to correctly predict the clinician's plan evaluations: random forest (RF) and constrained logistic regression (LR) classifiers, each trained on 5 different sets of dosimetric plan parameters (feature sets [FS]). The importance of the included features for the predictions was investigated to better understand the clinician's choices.
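As an illustration of the kind of dosimetric plan parameter a feature set might contain, the sketch below computes Dx%, the minimum dose received by the hottest x% of a structure's volume, from a list of voxel doses. It is a minimal sketch rather than the study's method: the function name `dose_at_volume` and the toy dose values are hypothetical, all voxels are assumed to have equal volume, and no interpolation along the dose-volume histogram is performed, which a clinical treatment planning system would typically do.

```python
import math


def dose_at_volume(voxel_doses, volume_percent):
    """Return Dx%: the minimum dose received by the hottest x% of the volume.

    Assumption: every voxel represents the same volume, so a volume
    percentage maps directly to a voxel count.
    """
    if not voxel_doses:
        raise ValueError("voxel_doses must not be empty")
    if not 0 < volume_percent <= 100:
        raise ValueError("volume_percent must be in (0, 100]")
    # Sort from hottest to coldest voxel.
    ordered = sorted(voxel_doses, reverse=True)
    # Number of voxels needed to cover the requested volume fraction.
    k = math.ceil(len(ordered) * volume_percent / 100)
    # The coldest voxel among those k is the dose that x% of the volume reaches.
    return ordered[k - 1]


# Hypothetical structure of 20 equal-volume voxels (doses in Gy).
toy_doses = [40.1] * 18 + [38.0, 36.5]
d95 = dose_at_volume(toy_doses, 95)    # dose covering 95% of the volume
d100 = dose_at_volume(toy_doses, 100)  # minimum dose in the structure
```

With these toy values, 19 of the 20 voxels receive at least 38.0 Gy, so `d95` is 38.0, while `d100` equals the coldest voxel at 36.5 Gy.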

Results: Although all 240 plans were in principle clinically acceptable to the clinician, no further planning was required for only 71.5% of them. For the most extensive FS, accuracy, area under the receiver operating characteristic curve, and Cohen's κ of the generated RF/LR models for prediction of approval without further planning were 87.2% ± 2.0%/86.7% ± 2.2%, 0.80 ± 0.03/0.86 ± 0.02, and 0.63 ± 0.05/0.69 ± 0.04, respectively. In contrast to LR, RF performance was independent of the applied FS. For both RF and LR, the whole breast excluding the boost PTV (PTV40.05Gy) was the most important structure for predictions, with importance factors of 44.6% and 43%, respectively. In most cases, the dose received by 95% of the PTV40.05Gy volume (D95%) was the most important parameter.
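The three reported performance measures can be made concrete with a small worked example. The sketch below computes accuracy, Cohen's κ, and the area under the receiver operating characteristic curve for a binary approve/replan task in plain Python. The labels, scores, the 0.5 decision threshold, and the function names are hypothetical toy choices for illustration only; they are not the study's data or code.

```python
def accuracy(y_true, y_pred):
    """Fraction of plans whose predicted label matches the clinician's label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Agreement corrected for chance: (p_o - p_e) / (1 - p_e).

    Undefined (division by zero) when chance agreement p_e equals 1.
    """
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)
    labels = set(y_true) | set(y_pred)
    # Chance agreement from the marginal label frequencies of both raters.
    p_e = sum((y_true.count(c) / n) * (y_pred.count(c) / n) for c in labels)
    return (p_o - p_e) / (1 - p_e)


def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Toy example: 1 = approved without further planning, 0 = further planning needed.
y_true = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.75, 0.7, 0.65, 0.6, 0.4, 0.55, 0.3, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

acc = accuracy(y_true, y_pred)
kappa = cohen_kappa(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

In this toy case 8 of 10 labels match (accuracy 0.8), but because 70% of the plans belong to the approved class, chance agreement is already 0.58, so κ is only about 0.52. This is why κ is a useful complement to accuracy when the classes are imbalanced, as they are when most plans are approved.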

Conclusions: The investigated use of machine learning to predict clinician approval of treatment plans is highly promising. Including nondosimetric parameters could further increase classifiers' performances. The tool could become useful for aiding treatment planners in generating plans with a high probability of being directly approved by the treating clinician.