Importance: Prediction modeling has been underused for estimating survival in oral squamous cell carcinoma, and the development of prediction models would augment clinicians' ability to provide absolute risk estimates for individual patients.
Objectives: To develop a prediction model using machine learning for 5-year overall survival among patients with oral squamous cell carcinoma and compare this model with a prediction model created from the TNM (Tumor, Node, Metastasis) clinical and pathologic stage.
Design, setting, and participants: A retrospective cohort study was conducted of 33 065 patients with oral squamous cell carcinoma from the National Cancer Data Base between January 1, 2004, and December 31, 2011. Patients were excluded if the treatment was considered palliative, staging demonstrated T0 or Tis, or survival or staging data were missing. Patient, tumor, treatment, and outcome information was obtained from the National Cancer Data Base. The data were split into 80% for training and 20% for testing. The model was created using a 2-class decision forest architecture. Permutation feature importance scores were used to identify the variables that the model relied on for its predictions and to rank them by importance. Statistical analysis was conducted from August 1, 2018, to January 10, 2019.
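The split and the permutation feature importance step described above can be illustrated with a minimal sketch. It rests on several assumptions that are not from the study: the data are synthetic, a fixed threshold rule stands in for the trained 2-class decision forest, accuracy is used as the scoring metric, and all function names are hypothetical. It shows only the general technique: shuffle one feature at a time in the held-out data and record how much the score drops.

```python
import random

def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle the rows and hold out test_fraction of them for testing."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def accuracy(predict, rows):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(predict(x) == y for x, y in rows) / len(rows)

def permutation_importance(predict, rows, n_features, seed=0):
    """Drop in accuracy when each feature column is shuffled across rows."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows)
    drops = []
    for j in range(n_features):
        column = [x[j] for x, _ in rows]
        rng.shuffle(column)  # break the link between feature j and the label
        permuted = [(x[:j] + (v,) + x[j + 1:], y)
                    for (x, y), v in zip(rows, column)]
        drops.append(baseline - accuracy(predict, permuted))
    return drops

# Synthetic data (assumption): feature 0 determines the label, feature 1 is noise.
data_rng = random.Random(42)
rows = []
for _ in range(1000):
    x = (data_rng.random(), data_rng.random())
    rows.append((x, int(x[0] > 0.5)))

train, test = train_test_split(rows, test_fraction=0.2)

def predict(x):
    """Stand-in for a trained classifier (assumption, not a decision forest)."""
    return int(x[0] > 0.5)

scores = permutation_importance(predict, test, n_features=2)
ranking = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
print(len(train), len(test), scores, ranking)
```

In this toy setup, shuffling the informative feature lowers accuracy substantially, while shuffling the noise feature leaves it unchanged, so the ranking places the informative feature first.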
Main outcomes and measures: Ability to predict 5-year overall survival assessed through area under the curve, accuracy, precision, and recall.
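For readers less familiar with these four measures, the following sketch computes them for a binary classifier. The labels, scores, and the 0.5 decision threshold are made-up illustrative values, and the function names are hypothetical; which outcome the study treated as the positive class is not stated here, so class 1 is an arbitrary choice. Area under the curve is computed as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, and recall from binary labels (1 = positive class)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": tp / (tp + fp),  # of predicted positives, fraction correct
        "recall": tp / (tp + fn),     # of actual positives, fraction found
    }

def area_under_curve(y_true, y_score):
    """Pairwise (rank-based) estimate of the area under the ROC curve."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up example values (assumption), not study data.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_score = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.7, 0.5]
y_pred = [int(s >= 0.5) for s in y_score]  # 0.5 threshold is an assumption

metrics = classification_metrics(y_true, y_pred)
auc = area_under_curve(y_true, y_score)
print(metrics, auc)
# accuracy 0.625, precision 0.6, recall 0.75, AUC 0.875
```

Accuracy, precision, and recall depend on the chosen threshold, whereas the area under the curve summarizes ranking quality across all thresholds.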
Results: Among the 33 065 patients in the study, the mean (SD) age was 64.6 (14.0) years, 19 791 (59.9%) were men, 13 274 (40.1%) were women, and 29 783 (90.1%) were white. At 60 months, there were 16 745 deaths (50.6%). The median follow-up time was 56.8 months (range, 0-155.6 months). Age, pathologic T stage, positive margins at the time of surgery, lymph node size, and institutional identification were among the most significant variables. For the machine learning model, the calculated area under the curve was 0.80 (95% CI, 0.79-0.81), accuracy was 71%, precision was 71%, and recall was 68%. In comparison, for the TNM staging system, the calculated area under the curve was 0.68 (95% CI, 0.67-0.70), accuracy was 65%, precision was 69%, and recall was 52%.
Conclusions and relevance: Using machine learning algorithms, a prediction model was created based on patient social, demographic, clinical, and pathologic features. The developed model outperformed a prediction model that used only TNM pathologic and clinical stage on all performance metrics. This study highlights the role that machine learning may play in individual patient risk estimation in the era of big data.