Purpose: Surgical oncologists are frequently confronted with the question of expected long-term prognosis. The aim of this study was to apply machine learning algorithms to optimize survival prediction after oncological resection of gastroesophageal cancers.
Methods: Eligible patients underwent oncological resection of gastric or distal esophageal cancer between 2001 and 2020 at the Department of General Surgery, Heidelberg University Hospital. Machine learning methods such as multi-task logistic regression and survival forests were compared with conventional algorithms to establish individualized survival estimates.
Results: The study included 1360 patients and 117 variables. Overall missingness was 1.3%. Of eight machine learning algorithms, the random survival forest (RSF) performed best, with a concordance index of 0.736 and an integrated Brier score of 0.166. The RSF demonstrated a mean area under the curve (AUC) of 0.814 over a period of 10 years after diagnosis. The most important predictor of long-term outcome was lymph node ratio, with a mean AUC of 0.730. A numeric risk score was calculated by the RSF for each patient, and three risk groups were defined accordingly. Median survival time was 18.8 months in the high-risk group, 44.6 months in the medium-risk group and above 10 years in the low-risk group.
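To illustrate how a concordance index such as the one reported above is typically obtained, the following is a minimal sketch of Harrell's C for right-censored survival data. It is not the implementation used in this study; the function name `concordance_index`, its arguments, and the toy data are hypothetical, and pairs with tied survival times are simply skipped as a simplifying assumption.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C: fraction of comparable patient pairs in which the
    patient with the shorter observed survival has the higher risk score.

    times       -- observed follow-up times (e.g. months)
    events      -- 1 if death was observed, 0 if the patient was censored
    risk_scores -- model output, higher meaning worse predicted prognosis
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # assumption: tied times are ignored in this sketch
        # a is the patient with the shorter follow-up time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four patients, the third one censored.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.4, 0.6, 0.1]
print(concordance_index(toy_times, toy_events, toy_risk))  # 0.8
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ranking of patients by risk, which places the reported 0.736 in context.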
Conclusion: The results of this study suggest that, of the algorithms compared, RSF is the most appropriate for answering the question of long-term prognosis. Furthermore, we established a compact risk score model with 20 input parameters, thus providing a clinical tool to improve prediction of oncological outcome after upper gastrointestinal surgery.
Keywords: Esophageal cancer; Gastric cancer; Machine learning; Oncological outcome; Survival analysis.
© 2022. The Author(s).