Background: The clinical prognosis assessment of renal cell carcinoma (RCC) still relies on nuclear grading and nuclear scoring by visual inspection under the microscope, which is time-consuming, inefficient, and subject to inconsistent evaluation criteria. Few machine learning (ML) studies in the RCC literature have investigated prognosis, although such models could quantify the risk of postoperative recurrence in RCC patients and guide individualized postoperative clinical management. This study evaluated the suitability of ML algorithms for survival prediction in patients with RCC.
Methods: Data on a total of 192,912 RCC patients from 2004 to 2015 were obtained from the Surveillance, Epidemiology, and End Results (SEER) database. Six ML algorithms, namely support vector machine (SVM), Bayesian method, decision tree, random forest, neural network, and Extreme Gradient Boosting (XGBoost), were applied to predict overall survival (OS) in RCC.
Results: The patients collected from SEER had a median age of 62 years, and the pathological types were clear cell RCC (47.6%), papillary RCC (9.5%), chromophobe RCC (4.0%), and others (4.1%). In the dataset from which patients with missing data were deleted, the most accurate model was XGBoost [area under the curve (AUC): 67.0%]. In the dataset from which patients with missing data and survival time <5 years were deleted, random forest, neural network, and XGBoost had high accuracy, with AUCs of 80.8%, 81.5%, and 81.8%, respectively. In the dataset from which only patients with missing tumor diameter were deleted and the remaining missing values were filled with missForest, the most accurate model was random forest (AUC: 71.9%). The overall accuracy of the SVM model was not high, except in the dataset from which patients with missing tumor diameter and survival time <5 years were deleted and the remaining missing values were filled with missForest. In that dataset, random forest, neural network, and XGBoost had high accuracy, with AUCs of 84.1%, 84.7%, and 84.8%, respectively.
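For readers less familiar with the AUC values reported above, the following is a minimal sketch of how an AUC can be computed from predicted risk scores. It is not the study's code: the function name `roc_auc` and the toy labels and scores are hypothetical, and the calculation uses the standard pairwise interpretation of AUC (the probability that a randomly chosen positive case is scored higher than a randomly chosen negative case, with ties counted as one half).

```python
def roc_auc(labels, scores):
    """Return the AUC as the fraction of (positive, negative) pairs
    in which the positive case has the higher score; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = event occurred, 0 = event did not occur.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")  # 8 of the 9 pairs are ranked correctly
```

An AUC of 50% corresponds to random ranking and 100% to perfect ranking, which gives a scale for reading the values reported for each model and dataset.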
Conclusions: ML algorithms could be used to predict the prognosis of RCC. They could quantify the likelihood of recurrence in patients and support more individualized postoperative clinical management. Given the limitations and complexity of datasets, ML may be used as an auxiliary tool to analyze and process larger datasets and complex data.
Keywords: Renal cell carcinoma (RCC); algorithm; machine learning (ML); prediction; prognosis.
2024 Translational Andrology and Urology. All rights reserved.