Development and Validation of Machine Learning Models for Predicting Tumor Progression in OSCC

Oral Dis. 2024 Oct 27. doi: 10.1111/odi.15159. Online ahead of print.

Abstract

Objectives: To develop a machine learning (ML) model that predicts tumor progression in patients with oral squamous cell carcinoma (OSCC) and thereby provides individual risk estimates of patient outcomes.

Patients and methods: This predictive modeling study included 1163 patients with OSCC from Hospital of Stomatology, SYSU and SYSU Cancer Center between March 2009 and October 2021. Clinical, pathological, and hematological features of the patients were collected. Six ML algorithms were explored, and model performance was assessed by accuracy, sensitivity, specificity, F1 score, and AUC. SHAP values were used to identify the variables contributing most to the model.
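
For readers unfamiliar with the metrics named above, the following is a minimal sketch of how accuracy, sensitivity, specificity, F1 score, and AUC can be computed for a binary progression label. It is not the authors' pipeline: the function names `classification_metrics` and `auc`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration and are not study data.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, and F1 for binary labels (1 = progression)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # recall on progressors
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # recall on non-progressors
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


def auc(y_true, scores):
    """AUC as the probability that a random positive case receives a higher
    score than a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 8 patients, 1 = progression, with model risk scores.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
auc_value = auc(y_true, scores)
```

Sensitivity, specificity, accuracy, and F1 depend on the chosen threshold, whereas AUC is computed from the ranking of the scores alone, which is one reason a model can show high sensitivity and modest specificity at the same AUC.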

Results: Among the 1163 patients (mean [SD] age, 55.36 [12.91] years), 563 were in the development cohort and 600 were in the validation cohort. The Logistic Regression algorithm outperformed all other models, with a sensitivity of 94.7% (68.2%), a specificity of 55.3% (63.7%), and an AUC of 0.76 ± 0.09 (0.723) in the development (validation) cohort. The most predictive feature was neutrophil count.

Conclusion: This study demonstrated that ML models can improve clinical prediction of OSCC progression using basic patient information. These tools could be used to provide individual risk estimation and may help direct intervention.

Keywords: hematological indicator; machine learning; oral squamous cell carcinoma; progression; validation.