Applying Deep Learning with Convolutional Neural Networks to Laryngoscopic Imaging for Automated Segmentation and Classification of Vocal Cord Leukoplakia

Ear Nose Throat J. 2024 Sep 20:1455613241275341. doi: 10.1177/01455613241275341. Online ahead of print.

Abstract

Objectives: Vocal cord leukoplakia is a clinical term for a white plaque or patch on the vocal cords seen on macroscopic examination; the term does not take histological features or prognosis into account. A key clinical challenge in managing vocal cord leukoplakia is assessing the lesion's potential for malignant transformation. This study investigates the potential of deep learning (DL) for the simultaneous segmentation and classification of vocal cord leukoplakia using narrow band imaging (NBI) and white light imaging (WLI). The primary objective is to assess the model's accuracy in detecting and classifying lesions and to compare its performance on WLI and NBI.

Methods: We applied DL to segment and classify NBI and WLI images of vocal cord leukoplakia, using pathological diagnosis as the gold standard.

Results: The DL model autonomously detected lesions with an average intersection-over-union (IoU) >70%. In classification tasks, the model differentiated between lesions in the surgical group with a sensitivity of 93% and a specificity of 94% for WLI, and a sensitivity of 99% and a specificity of 97% for NBI. In addition, the model achieved a mean average precision of 81% in WLI and 92% in NBI, with an IoU threshold >0.5.

Conclusions: The proposed model can assist in the accurate diagnosis of vocal cord leukoplakia from NBI and WLI images.
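
The abstract does not state how the reported metrics were computed. As a rough illustration only, the sketch below uses the standard definitions of IoU for a binary segmentation mask and of sensitivity and specificity for binary labels. The function names (`mask_iou`, `sensitivity_specificity`) and the toy arrays are hypothetical and are not taken from the study.

```python
import numpy as np


def mask_iou(pred, truth):
    """Intersection-over-union of two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    union = np.logical_or(pred, truth).sum()
    if union == 0:
        # Both masks empty: treated as perfect agreement (a convention assumed here).
        return 1.0
    return float(np.logical_and(pred, truth).sum() / union)


def sensitivity_specificity(y_true, y_pred):
    """Sensitivity (true positive rate) and specificity (true negative rate)."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = int(np.sum(y_true & y_pred))
    tn = int(np.sum(~y_true & ~y_pred))
    fp = int(np.sum(~y_true & y_pred))
    fn = int(np.sum(y_true & ~y_pred))
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


# Toy 4x4 masks (made-up data): the prediction covers 4 pixels, the reference 6.
pred_mask = np.zeros((4, 4), dtype=bool)
pred_mask[0:2, 0:2] = True
true_mask = np.zeros((4, 4), dtype=bool)
true_mask[0:2, 0:3] = True
iou = mask_iou(pred_mask, true_mask)  # 4 shared pixels / 6 in the union

# Toy labels (made-up data): 1 = positive class per the reference diagnosis.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)

print(f"IoU = {iou:.3f}, sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

In object-detection practice, mean average precision "at an IoU threshold" usually counts a predicted lesion as a true positive only when its IoU with the reference annotation exceeds that threshold; whether the study followed exactly this convention is not specified in the abstract.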

Keywords: convolutional neural networks; image classification; image segmentation; laryngoscopic imaging; vocal cord leukoplakia.