Objective: To develop machine-learning models to predict recurrence and time-to-recurrence in high-grade endometrial cancer (HGEC) following surgery and tailored adjuvant treatment.
Methods: Data were retrospectively collected from 1237 patients across eight Canadian centers. Four models were trained to predict recurrence: random forests, boosted trees, and two neural networks. Receiver operating characteristic curves were used to select the best model, defined as the one with the highest area under the curve (AUC). For time to recurrence, we compared random forests and a Least Absolute Shrinkage and Selection Operator (LASSO) model with the Cox proportional hazards model.
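As an illustration of the AUC-based selection step, the following minimal Python sketch computes the AUC as the probability that a randomly chosen patient with recurrence receives a higher predicted score than a randomly chosen patient without recurrence, and then picks the model with the highest AUC on a held-out set. The function names (`auc_score`, `select_best_model`), the model labels, and all numbers are hypothetical and are not taken from the study; the abstract does not describe the implementation actually used.

```python
from typing import Dict, Sequence, Tuple


def auc_score(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (recurrence, no-recurrence) pairs ranked correctly.

    Tied scores count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def select_best_model(
    labels: Sequence[int], model_scores: Dict[str, Sequence[float]]
) -> Tuple[str, float]:
    """Return the name and AUC of the model with the highest AUC."""
    aucs = {name: auc_score(labels, s) for name, s in model_scores.items()}
    best = max(aucs, key=lambda name: aucs[name])
    return best, aucs[best]


# Hypothetical validation labels (1 = recurrence) and predicted scores.
val_labels = [0, 0, 1, 1, 0, 1]
val_scores = {
    "random_forest": [0.2, 0.3, 0.7, 0.9, 0.4, 0.6],
    "boosted_trees": [0.2, 0.6, 0.5, 0.9, 0.4, 0.3],
}
best_name, best_auc = select_best_model(val_labels, val_scores)
print(best_name, round(best_auc, 3))
```

The pairwise loop is quadratic in the number of patients, which is acceptable for a sketch; a rank-based computation would be the usual choice for larger cohorts.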
Results: The random forest was the best model to predict recurrence in HGEC; the AUCs were 85.2%, 74.1%, and 71.8% in the training, validation, and test sets, respectively. The top five predictors were stage, uterus height, specimen weight, adjuvant chemotherapy, and preoperative histology. Performance increased to 77% and 80% when stratified by Stage III and IV, respectively. For time to recurrence, there was no difference between the LASSO and Cox proportional hazards models (c-index 71%). The random forest had a c-index of 60.5%.
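For readers unfamiliar with the c-index reported for the time-to-recurrence models, the sketch below shows one common definition (Harrell's concordance index) in plain Python. It is an illustrative simplification, not the computation used in the study: the function name `concordance_index` and the toy data are hypothetical, and pairs with tied follow-up times are simply skipped.

```python
from typing import Sequence


def concordance_index(
    times: Sequence[float], events: Sequence[int], risks: Sequence[float]
) -> float:
    """Harrell's c-index for right-censored data (simplified sketch).

    A pair (i, j) is comparable when patient i has an observed recurrence
    (events[i] == 1) strictly before patient j's follow-up time. The pair is
    concordant when the model assigns patient i the higher risk; tied risks
    count as half. Pairs with tied times are ignored here for simplicity.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical follow-up times (months), event flags (1 = recurrence
# observed, 0 = censored), and model-predicted risk scores.
times = [2.0, 4.0, 6.0, 8.0]
events = [1, 1, 0, 1]
risks = [0.9, 0.1, 0.3, 0.7]
print(concordance_index(times, events, risks))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of recurrence times by predicted risk, which is the scale on which the reported c-indices of 71% and 60.5% should be read.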
Conclusions: A bootstrap random forest model may be a more accurate technique for predicting recurrence in HGEC using multiple clinicopathologic factors. For time to recurrence, machine-learning methods performed similarly to the Cox proportional hazards model.
Keywords: high-grade endometrial cancer; machine learning; recurrence.
© 2022 Wiley Periodicals LLC.