Development and validation of a novel artificial intelligence algorithm for precise prediction the postoperative prognosis of esophageal squamous cell carcinoma

BMC Cancer. 2025 Jan 23;25(1):134. doi: 10.1186/s12885-025-13520-6.

Abstract

Background: Esophageal squamous cell carcinoma (ESCC) is a highly aggressive malignancy, and current methods for postoperative prognostic assessment remain unsatisfactory, underscoring the urgent need for a reliable approach to precision medicine. Because oncogenesis shares similarities with gametogenesis, cancer/testis genes (CTGs) are recognized as regulators of unrestrained proliferation and of the immune microenvironment during oncogenic processes. These processes are associated with advanced disease and poorer prognosis, indicating that CTGs could serve as ideal prognostic biomarkers in ESCC. The purpose of this study is to develop a novel clinical prognostic prediction system to facilitate individualized postoperative care.

Methods: We conducted LASSO regression analysis of protein-coding CTGs and clinical characteristics from 119 patients with pathologically confirmed ESCC to identify strongly predictive variables. We trained nine supervised machine learning classifiers and integrated the best-performing ones by weighted voting to construct an ensemble model called PPMESCC. Additionally, a functional assay was conducted to examine the potential effect of the top-ranking CTG, HENMT1, in ESCC.
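
The weighted voting step can be illustrated with a minimal sketch. The abstract does not state how the weights were derived or whether voting was applied to class labels or to probabilities, so the soft-voting scheme, the function names (weighted_vote, classify), the weights, and all numbers below are assumptions for illustration only, not the PPMESCC implementation.

```python
from typing import List, Sequence


def weighted_vote(probabilities: Sequence[Sequence[float]],
                  weights: Sequence[float]) -> List[float]:
    """Combine per-classifier probabilities by a weighted average.

    probabilities[k][i] is classifier k's predicted probability of the
    positive class (e.g. poor prognosis) for patient i.
    """
    if len(probabilities) != len(weights):
        raise ValueError("one weight is required per classifier")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_patients = len(probabilities[0])
    combined = []
    for i in range(n_patients):
        # Weighted mean of the classifiers' opinions about patient i.
        score = sum(w * p[i] for w, p in zip(weights, probabilities)) / total
        combined.append(score)
    return combined


def classify(scores: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Turn ensemble scores into 0/1 labels at an assumed threshold."""
    return [1 if s >= threshold else 0 for s in scores]


# Hypothetical outputs of three classifiers for three patients.
clf_probs = [
    [0.9, 0.2, 0.6],
    [0.8, 0.4, 0.4],
    [0.7, 0.1, 0.3],
]
# Hypothetical weights, e.g. proportional to each classifier's validation score.
clf_weights = [0.5, 0.3, 0.2]

ensemble_scores = weighted_vote(clf_probs, clf_weights)
ensemble_labels = classify(ensemble_scores)
print(ensemble_scores, ensemble_labels)
```

In this toy case the third patient is called positive by the most heavily weighted classifier alone, yet the ensemble score stays just below the threshold because the other two classifiers disagree.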

Results: LASSO regression identified five CTGs and TNM stage as the optimized prognostic features. Six machine learning classifiers were integrated to construct the ensemble model, PPMESCC, which exhibited outstanding performance in ESCC prognostic prediction. In the discovery cohort, PPMESCC achieved an AUC of 0.9828 (95% confidence interval [CI]: 0.9608 to 0.9926) and an accuracy of 98.32% (95% CI: 96.64-99.16%); in the validation cohort, it achieved an AUC of 0.9057 (95% CI: 0.8897 to 0.9583) and an accuracy of 90% (95% CI: 89.08-93.28%). In addition, the top-ranking CTG, HENMT1, encodes a 2'-O-methyltransferase of piRNAs, and HENMT1 was confirmed to be positively correlated with the proliferation capacity of ESCC cells. We then systematically screened piRNAs associated with esophageal carcinoma using the GWAS, eQTL-piRNA, and i2OM databases and identified 8 piRNAs potentially regulated by HENMT1.
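
For readers unfamiliar with how an AUC and its confidence interval can be obtained, the following is a minimal sketch. The abstract does not say how the reported intervals were computed; the percentile bootstrap used here is an assumption, and the labels, scores, and function names (auc, bootstrap_ci) are hypothetical toy values, not study data.

```python
import random
from typing import List, Sequence, Tuple


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive case outscores a negative one.

    Ties count as half a win (the Mann-Whitney formulation).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels: Sequence[int], scores: Sequence[float],
                 n_boot: int = 2000, alpha: float = 0.05,
                 seed: int = 0) -> Tuple[float, float]:
    """Percentile bootstrap interval for the AUC (assumed method)."""
    rng = random.Random(seed)
    n = len(labels)
    stats: List[float] = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples that contain only one class
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Hypothetical outcomes (1 = event) and model scores for eight patients.
toy_labels = [1, 1, 1, 0, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.5, 0.2, 0.7, 0.1]

point = auc(toy_labels, toy_scores)  # 15 of 16 positive/negative pairs are ordered correctly
lo, hi = bootstrap_ci(toy_labels, toy_scores)
print(point, lo, hi)
```

With only eight toy patients the interval is very wide; the sketch shows the mechanics only and says nothing about the intervals reported for PPMESCC.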

Conclusion: The study highlights the clinical utility of the PPMESCC algorithm in prognostic prediction, which may facilitate the establishment of personalized screening and management strategies for postoperative ESCC patients.

Keywords: Artificial intelligence; Cancer/testis gene; Esophageal squamous cell carcinoma; Machine learning; Postoperative prognosis.

Publication types

  • Validation Study

MeSH terms

  • Aged
  • Algorithms
  • Artificial Intelligence
  • Biomarkers, Tumor* / genetics
  • Esophageal Neoplasms* / genetics
  • Esophageal Neoplasms* / pathology
  • Esophageal Neoplasms* / surgery
  • Esophageal Squamous Cell Carcinoma* / genetics
  • Esophageal Squamous Cell Carcinoma* / pathology
  • Esophageal Squamous Cell Carcinoma* / surgery
  • Female
  • Humans
  • Machine Learning
  • Male
  • Middle Aged
  • Prognosis

Substances

  • Biomarkers, Tumor