This study combined traditional Chinese medicine (TCM) and western medicine with machine learning to identify risk factors for diabetic retinopathy (DR) and to build a better risk prediction model, providing a basis for DR screening and treatment. In a retrospective, real-world study of DR cases, the electronic medical records of patients who met the screening criteria were collected. Recursive Feature Elimination with Cross-Validation (RFECV) was used for feature selection. A prediction model was then built with a Gradient Boosting Machine (GBM) and compared with four other popular machine learning techniques: Logistic Regression (LR), K-Nearest Neighbors (KNN), Random Forest, and Support Vector Machine (SVM). The models were evaluated by accuracy, precision, recall, F1 score, and area under the curve (AUC). Grid search was used to optimize the model, and the SHapley Additive exPlanations (SHAP) method was used to explain its results more intuitively. A total of 9034 patients with type 2 diabetes mellitus (T2DM) who met the screening criteria were included, of whom 1118 had DR. Nineteen features were selected by RFECV for model construction. Among the five models, GBM had the highest accuracy (0.85) and AUC (0.934) and was therefore the best prediction model. After hyperparameter optimization by grid search, its accuracy reached 0.88 and its AUC increased to 0.958. SHAP analysis showed that TCM syndrome type, albumin, low-density lipoprotein, triglycerides, total protein, and glycosylated hemoglobin were closely related to an increased risk of DR.
It can be concluded that TCM syndrome type is a risk factor for DR. The GBM classifier optimized by grid search, with relevant TCM and western medicine risk factors as variables, can better predict the risk of DR.
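The workflow summarized above (RFECV feature selection, a GBM classifier, grid search, and evaluation by accuracy, precision, recall, F1 score, and AUC) can be illustrated with a minimal sketch. It assumes scikit-learn and uses synthetic data in place of the study's medical records; the sample size, class balance, parameter grid, and fold count are illustrative assumptions, not the authors' settings, and the SHAP step is omitted.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import RFECV
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split

# Synthetic stand-in for the patient records (NOT the study's data).
# The roughly 12% positive rate loosely mirrors 1118 DR cases out of 9034.
X, y = make_classification(n_samples=600, n_features=25, n_informative=8,
                           weights=[0.88, 0.12], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)

# Step 1: recursive feature elimination with cross-validation.
selector = RFECV(
    GradientBoostingClassifier(n_estimators=50, random_state=0),
    step=2, cv=cv, scoring="roc_auc", min_features_to_select=5)
selector.fit(X_train, y_train)
X_train_sel = selector.transform(X_train)
X_test_sel = selector.transform(X_test)

# Step 2: grid search over a small, assumed hyperparameter grid.
param_grid = {
    "n_estimators": [50, 100],
    "max_depth": [2, 3],
    "learning_rate": [0.1],
}
search = GridSearchCV(GradientBoostingClassifier(random_state=0),
                      param_grid, cv=cv, scoring="roc_auc")
search.fit(X_train_sel, y_train)

# Step 3: evaluate the tuned model on the held-out split.
y_pred = search.predict(X_test_sel)
y_prob = search.predict_proba(X_test_sel)[:, 1]
metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred, zero_division=0),
    "recall": recall_score(y_test, y_pred, zero_division=0),
    "f1": f1_score(y_test, y_pred, zero_division=0),
    "auc": roc_auc_score(y_test, y_prob),
}
print("selected features:", selector.n_features_)
print("best params:", search.best_params_)
print(metrics)
```

Because the data here are synthetic, the printed scores say nothing about the reported results (accuracy 0.88, AUC 0.958); the sketch only shows how the steps fit together.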
Copyright © 2024 the Author(s). Published by Wolters Kluwer Health, Inc.