Preparing Well for Esophageal Endoscopic Detection Using a Hybrid Model and Transfer Learning

Cancers (Basel). 2023 Jul 26;15(15):3783. doi: 10.3390/cancers15153783.

Abstract

Early detection of esophageal cancer through endoscopic imaging is pivotal for effective treatment. However, endoscopic diagnosis depends heavily on the physician's expertise, which poses challenges. Esophageal cancer features often manifest ambiguously and can be confused with other inflammatory esophageal conditions, which complicates accurate diagnosis. Recently, computer-aided diagnosis has emerged as a promising solution in medical imaging, particularly in endoscopy. Nonetheless, contemporary AI-based diagnostic models rely heavily on large volumes of data, which limits their applicability when datasets are scarce. To address this limitation, our study introduces novel data training strategies based on transfer learning, tailored to optimize performance with limited data. Additionally, we propose a hybrid model integrating EfficientNet and Vision Transformer networks to enhance prediction accuracy. We evaluated the model on a carefully curated dataset of 1002 endoscopic images (650 white-light images and 352 narrow-band images). The combined model achieved an accuracy of 96.32%, precision of 96.44%, recall of 95.70%, and F1-score of 96.04%, surpassing state-of-the-art models and its own individual components, which substantiates its potential for precise medical image classification. The proposed AI-based medical image prediction platform offers several advantages: high prediction accuracy, a compact model size, and adaptability to low-data scenarios. This research represents a significant step in the advancement of computer-aided endoscopic imaging for improved esophageal cancer diagnosis.
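
For readers who want to reproduce this style of reporting, the sketch below shows one common way to derive accuracy, precision, recall, and F1-score from a confusion matrix. It assumes macro averaging over classes, which the abstract does not specify. The reported F1-score (96.04%) is slightly below the harmonic mean of the reported precision and recall (about 96.07%), which would be consistent with averaging per-class F1 values, but that is only an inference. The function name `macro_metrics` and the toy matrix are hypothetical and are not data from the study.

```python
def macro_metrics(confusion):
    """Compute accuracy and macro-averaged precision, recall, and F1.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j. Macro averaging is an assumption
    here; the abstract does not state the averaging scheme used.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n))

    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # column sum
        actual_k = sum(confusion[k])                          # row sum
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    return {
        "accuracy": correct / total,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,  # mean of per-class F1, not F1 of the means
    }


# Hypothetical two-class confusion matrix, for illustration only.
toy = [
    [45, 5],   # true class 0: 45 correct, 5 predicted as class 1
    [10, 40],  # true class 1: 10 predicted as class 0, 40 correct
]
m = macro_metrics(toy)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

With macro averaging, the F1-score is the mean of per-class F1 values, so it generally differs slightly from the harmonic mean of the averaged precision and recall.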

Keywords: Vision Transformers; artificial intelligence; deep learning; endoscopy anatomy; transfer learning.