Interpretable Computer Vision to Detect and Classify Structural Laryngeal Lesions in Digital Flexible Laryngoscopic Images

Otolaryngol Head Neck Surg. 2023 Dec;169(6):1564-1572. doi: 10.1002/ohn.411. Epub 2023 Jun 23.

Abstract

Objective: To localize structural laryngeal lesions within digital flexible laryngoscopic images and to classify them as benign or suspicious for malignancy using state-of-the-art computer vision detection models.

Study design: Cross-sectional diagnostic study.

Setting: Tertiary care voice clinic.

Methods: Digital stroboscopic videos, demographic and clinical data were collected from patients evaluated for a structural laryngeal lesion. Laryngoscopic images were extracted from videos and manually labeled with bounding boxes encompassing the lesion. Four detection models were employed to simultaneously localize and classify structural laryngeal lesions in laryngoscopic images. Classification accuracy, intersection over union (IoU) and mean average precision (mAP) were evaluated as measures of classification, localization, and overall performance, respectively.
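
For readers unfamiliar with the localization metric, the following is a minimal sketch of how IoU between a predicted and a manually labeled bounding box could be computed. It is not the authors' code: the (x_min, y_min, x_max, y_max) box format, the function names, and the use of "greater than or equal to" at the threshold are assumptions made for illustration.

```python
from typing import Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_localized(pred: Box, truth: Box, threshold: float = 0.5) -> bool:
    """Hypothetical helper: count a prediction as a successful
    localization when its IoU with the labeled box meets the threshold.
    Whether the comparison is >= or > is an assumption."""
    return iou(pred, truth) >= threshold
```

mAP builds on this check: each predicted box is scored as a true or false positive according to such an IoU test, precision is averaged over recall levels for each class, and the result is averaged across classes.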

Results: In total, 8,172 images from 147 patients were included in the laryngeal image dataset. Classification accuracy was 88.5% for individual laryngeal images and increased to 92.0% when all images belonging to the same sequence (video) were considered. Mean average precision across all four detection models was 50.1 using an IoU threshold of 0.5 to determine successful localization.
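
The abstract does not state how per-image predictions were combined into a sequence-level result. One plausible approach is a majority vote over the images of a video, sketched below. The voting rule, the label strings, the tie-breaking choice, and the function name are all assumptions for illustration, not the authors' method.

```python
from collections import Counter
from typing import Iterable


def sequence_label(image_labels: Iterable[str],
                   tie_label: str = "suspicious") -> str:
    """Hypothetical aggregation: majority vote over per-image labels
    from one video. Ties fall back to tie_label (an assumed choice)."""
    counts = Counter(image_labels)
    if not counts:
        raise ValueError("sequence contains no image predictions")

    top = max(counts.values())
    winners = [label for label, n in counts.items() if n == top]
    return winners[0] if len(winners) == 1 else tie_label


# Example: three frames from one video, one of them misclassified.
print(sequence_label(["benign", "suspicious", "benign"]))
```

Under such a rule, isolated misclassified frames are outvoted by the rest of the sequence, which is consistent with sequence-level accuracy exceeding image-level accuracy.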

Conclusion: Results of this study showed that deep neural network-based detection models trained using a labeled dataset of digital laryngeal images have the potential to classify structural laryngeal lesions as benign or suspicious for malignancy and to localize them within an image. This approach provides valuable insight into which part of the image was used by the model to determine a diagnosis, allowing clinicians to independently evaluate models' predictions.

Keywords: artificial intelligence; detection; laryngeal cancer; laryngoscopy; neural networks.

MeSH terms

  • Computers
  • Cross-Sectional Studies
  • Humans
  • Laryngeal Neoplasms* / diagnostic imaging
  • Laryngeal Neoplasms* / pathology
  • Laryngoscopy / methods
  • Larynx* / diagnostic imaging
  • Larynx* / pathology