Background: Distant metastasis strongly influences the treatment strategy for patients with gastric cancer, so it is essential to identify patients at high risk of distant metastasis as early as possible.
Method: In this study, we retrospectively collected data on 18,472 gastric cancer patients from the SEER database. We applied six machine learning algorithms to construct models for predicting distant metastasis of gastric cancer, and trained them using 10-fold cross-validation. We evaluated the models using the area under the receiver operating characteristic curve (AUC), the area under the precision-recall curve (AUPRC), decision curve analysis, and calibration curves. In addition, we used SHapley Additive exPlanations (SHAP) to interpret the machine learning models, and we plotted correlation heat maps of the predictor variables. We used data from 1,595 gastric cancer patients at the First Hospital of Jilin University for external validation. Finally, we selected the optimal model and built a web-based online calculator for predicting the risk of distant metastasis of gastric cancer.
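As a rough illustration of the 10-fold cross-validation described above, the following sketch shows how patient indices can be partitioned into ten folds so that each fold serves once as the validation set. The function name `k_fold_indices`, the shuffling seed, and the toy sample size are assumptions made for illustration; the abstract does not state how the folds were actually drawn (for example, whether they were stratified by outcome).

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, validation) index lists for k-fold cross-validation.

    Hypothetical helper for illustration only; not the authors' code.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once, reproducibly
    # Deal the shuffled indices round-robin into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


# Toy usage: 25 samples split into 10 folds.
splits = list(k_fold_indices(25, k=10))
for train, validation in splits:
    # A model would be fitted on `train` and scored on `validation` here.
    assert not set(train) & set(validation)
```

Averaging a metric such as AUC over the ten validation folds gives the kind of cross-validated estimate reported for the training set.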
Result: The study included 18,472 patients with gastric cancer from the SEER database, of whom 4,202 (22.75%) had distant metastases. Multivariate logistic regression analysis showed that age, race, grade of differentiation, tumor size, T stage, radiotherapy, and chemotherapy were independent risk factors for distant metastasis of gastric cancer. In the 10-fold cross-validation of the training set, the average AUC of the random forest (RF) model was 0.80. The RF model performed best in both the internal test set and the external validation set. In the internal test set, the RF model had an AUC of 0.80, an AUPRC of 0.555, an accuracy of 0.81, and a precision of 0.78. In the external validation set, the RF model had an AUC of 0.76, an AUPRC of 0.496, an accuracy of 0.82, and a precision of 0.81. Finally, we built a web-based calculator for distant metastasis of gastric cancer using the RF model.
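To make the AUC and AUPRC figures above easier to interpret, the sketch below computes both metrics from scratch on a small invented example. The labels and scores are toy values, not study data, and the helper names `roc_auc` and `average_precision` are hypothetical. The AUPRC is computed here as step-wise average precision and assumes distinct scores; the abstract does not say which estimator the authors used. As a reference point, a classifier that ranks patients at random has an expected AUPRC close to the prevalence of the positive class, which would be about 0.23 if the test set shares the SEER cohort's 22.75% metastasis rate (an assumption).

```python
def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(y_true, scores):
    """Step-wise AUPRC: mean precision at the rank of each positive.

    Assumes scores are distinct, so the ranking is unambiguous.
    """
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive label")
    return precision_sum / hits


# Toy data (invented): 3 positives among 8 cases.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]

auc = roc_auc(y_true, scores)
ap = average_precision(y_true, scores)
prevalence = sum(y_true) / len(y_true)  # chance-level reference for AUPRC
print(f"AUC={auc:.3f}  AUPRC={ap:.3f}  prevalence={prevalence:.3f}")
```

Because AUPRC depends on prevalence while AUC does not, AUPRC values from cohorts with different metastasis rates are not directly comparable.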
Conclusion: With the help of pathological and clinical indicators, we constructed a well-performing RF model for predicting the risk of distant metastasis in gastric cancer patients to help clinicians make clinical decisions.
Keywords: distant metastasis; external validation; gastric cancer; machine learning; web calculator.
Copyright © 2024 Qin, Qiu, Ge, Wu, Ma and Li.