Background: The individualized prediction and discrimination of precancerous lesions of gastric cancer (PLGC) is critical for the early prevention of gastric cancer (GC). However, accurate non-invasive methods for distinguishing between PLGC and GC are currently lacking. This study therefore aimed to develop a risk prediction model by machine learning and deep learning techniques to aid the early diagnosis of GC.
Methods: In this study, a total of 2229 subjects were recruited from nine tertiary hospitals between October 2022 and November 2023. We designed a comprehensive questionnaire, identified statistically significant factors, and created a web-based column chart. Then, a risk prediction model was subsequently developed by machine learning techniques. In addition, a tongue image-based risk prediction model was established by deep learning algorithms.
Results: Based on logistic regression analysis, a dynamic web-based nomogram was developed and it is freely accessible at: https://yz6677.shinyapps.io/GC67/ . Then, the prediction model was established using ten different machine learning algorithms and the Random Forest (RF) model achieved the highest accuracy at 85.65%. According with the predictive results, the top 10 key risk factors were age, traditional Chinese medicine (TCM) constitution type, tongue coating color, tongue color, irregular meals, pickled food, greasy fur, over-hot eating habit, anxiety and sleep onset latency. These factors are all significant risk indicators for the progression of PLGC patients to GC patients. Subsequently, the Swin Transformer architecture was used to develop a tongue image-based model for predicting the risk for progression of PLGC. The verification set showed an accuracy of 73.33% and an area under curve (AUC) greater than 0.8 across all models.
Conclusions: Our study developed machine learning and deep learning-based models for predicting the risk for progression of PLGC to GC, which will offer the assistance to determine the high-risk patients from PLGC and improve the early diagnosis of GC.
Keywords: Deep learning; Gastric cancer; Machine learning; Precancerous lesions of gastric cancer; Traditional Chinese medicine.
© 2025. The Author(s).