Image detection method for multi-category lesions in wireless capsule endoscopy based on deep learning models

World J Gastroenterol. 2024 Dec 28;30(48):5111-5129. doi: 10.3748/wjg.v30.i48.5111.

Abstract

Background: Wireless capsule endoscopy (WCE) has become an important noninvasive and portable tool for diagnosing digestive tract diseases, propelled by advancements in medical imaging technology. However, because the digestive tract is structurally complex and lesion types are diverse, lesions at different sites and of different types vary markedly in appearance, which makes accurate identification of digestive tract diseases challenging.

Aim: To propose a deep learning-based lesion detection model that automatically identifies and accurately labels digestive tract lesions, thereby improving the diagnostic efficiency of doctors and offering significant clinical application value.

Methods: We propose a neural network model, WCE_Detection, for the accurate detection and classification of 23 classes of digestive tract lesion images. First, because multicategory lesion images exhibit various shapes and scales, a multidetection head strategy is adopted in the object detection network to increase the model's robustness for multiscale lesion detection. Second, a bidirectional feature pyramid network (BiFPN) is introduced, which effectively fuses shallow semantic features by adding skip connections, significantly reducing the detection error rate. Building on these components, we combine the Swin Transformer, with its self-attention mechanism and hierarchical structure, with the BiFPN feature fusion technique to enhance the feature representation of multicategory lesion images.
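The BiFPN fusion step mentioned in the Methods can be illustrated with a minimal sketch. It assumes the "fast normalized fusion" weighting commonly associated with BiFPN (non-negative learnable weights, normalized by their sum); the abstract does not state whether WCE_Detection uses exactly this form. The function name `fuse_features` and the plain-list feature representation are hypothetical, chosen only for illustration, and all inputs are assumed to have already been resized to a common shape.

```python
def fuse_features(features, weights, eps=1e-4):
    """Fuse same-shaped feature vectors with fast normalized fusion.

    features: list of equal-length lists of floats (hypothetical stand-ins
              for feature maps already resized to one resolution).
    weights:  one scalar per input; in a real network these are learned.
    """
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per input feature")
    # Clip weights at zero (a ReLU) so every input contributes non-negatively.
    clipped = [max(0.0, w) for w in weights]
    # eps keeps the division stable when all weights are clipped to zero.
    total = sum(clipped) + eps
    length = len(features[0])
    return [
        sum(w * f[i] for w, f in zip(clipped, features)) / total
        for i in range(length)
    ]


# Illustrative values only: a skip connection from the input at the same
# level, fused with a top-down feature and a bottom-up feature.
p_in = [2.0, 4.0]        # original input at this level (skip connection)
p_top_down = [4.0, 8.0]  # intermediate top-down feature
p_bottom_up = [6.0, 6.0] # output passed up from the level below
fused = fuse_features([p_in, p_top_down, p_bottom_up], [1.0, 1.0, 1.0])
```

In this sketch the skip connection is simply an extra input to the weighted sum, which is how shallow features can reach the output of a level without passing through the intermediate nodes.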

Results: The model constructed in this study achieved an mAP50 of 91.5% across the 23 lesion classes. More than eleven individual lesion categories achieved an mAP50 of over 99.4%, and more than twenty had an mAP50 of over 80%. These results indicate that the model outperforms other state-of-the-art models in the end-to-end integrated detection of human digestive tract lesion images.
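For readers unfamiliar with the metric, mAP50 is the mean over classes of the average precision (AP) obtained when a detection counts as correct only if its intersection over union (IoU) with a ground-truth box is at least 0.5. The following is a minimal sketch under stated assumptions: boxes are `(x1, y1, x2, y2)` tuples, each detection is matched greedily to the best unmatched ground-truth box, and AP is the area under the precision envelope. Evaluation toolkits differ in these details, and the abstract does not say which protocol the authors used; all function names here are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision_50(detections, ground_truths):
    """AP at IoU >= 0.5 for one class.

    detections: list of (score, box); ground_truths: list of boxes.
    """
    if not ground_truths:
        return 0.0
    matched = [False] * len(ground_truths)
    hits = []  # True for a true positive, False for a false positive
    # Process detections from most to least confident.
    for _score, box in sorted(detections, key=lambda d: d[0], reverse=True):
        best_iou, best_idx = 0.0, -1
        for idx, gt in enumerate(ground_truths):
            if matched[idx]:
                continue
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_iou >= 0.5:
            matched[best_idx] = True
            hits.append(True)
        else:
            hits.append(False)
    # Precision and recall after each ranked detection.
    precisions, recalls, tp = [], [], 0
    for rank, hit in enumerate(hits, start=1):
        tp += 1 if hit else 0
        precisions.append(tp / rank)
        recalls.append(tp / len(ground_truths))
    # Replace each precision by the maximum precision at any higher recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum the area under the resulting step curve.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_ap_50(per_class):
    """Mean of per-class AP50; per_class is a list of (detections, ground_truths)."""
    aps = [average_precision_50(d, g) for d, g in per_class]
    return sum(aps) / len(aps)
```

With this definition, a per-class value above 99.4% means that nearly every ground-truth lesion of that class was matched by a detection ranked ahead of the false positives.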

Conclusion: The deep learning-based object detection network detects multiple digestive tract lesions in WCE images with high accuracy, improving the diagnostic efficiency of doctors and demonstrating significant clinical application value.

Keywords: Artificial intelligence; Deep learning; Human digestive tract; Object detection; Wireless capsule endoscopy.

MeSH terms

  • Capsule Endoscopy* / methods
  • Deep Learning*
  • Digestive System Diseases / diagnosis
  • Digestive System Diseases / diagnostic imaging
  • Digestive System Diseases / pathology
  • Humans
  • Image Interpretation, Computer-Assisted / methods
  • Neural Networks, Computer