Multi-modal deep learning for joint prediction of otitis media and diagnostic difficulty

Laryngoscope Investig Otolaryngol. 2024 Feb 8;9(1):e1199. doi: 10.1002/lio2.1199. eCollection 2024 Feb.

Abstract

Objectives: In this study, we propose a diagnostic model for automatic detection of otitis media based on combined input of otoscopy images and wideband tympanometry measurements.

Methods: We present a neural network-based model for the joint prediction of otitis media and diagnostic difficulty. Otitis media is divided into two subclasses: acute otitis media and otitis media with effusion. The proposed approach is based on deep metric learning, and we compare its performance with that of a standard multi-task network.

Results: The proposed deep metric approach performs well on both tasks, and multi-modal input improves both classification and difficulty estimation compared with models trained on each modality separately. The model achieves an accuracy of 86.5% on the classification task and a Kendall rank correlation coefficient of 0.45 for difficulty estimation, corresponding to a correct ranking of 72.6% of the cases.
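
The link between the Kendall coefficient and the share of correctly ranked cases can be illustrated with a short sketch. It assumes the tie-free form of the coefficient, in which the fraction of concordant pairs equals (tau + 1) / 2; the abstract does not state which variant was used, and the function names below are hypothetical. Under that assumption, tau = 0.45 gives 72.5%, so the reported 72.6% is consistent with a coefficient that was rounded to two decimals.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """Return (tau, concordant fraction) for two rankings without ties.

    Assumption: no tied values in x or y, so every pair is either
    concordant or discordant (Kendall tau-a).
    """
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    tau = (concordant - discordant) / n_pairs
    return tau, concordant / n_pairs


def concordant_fraction(tau):
    """Fraction of correctly ordered pairs implied by a tie-free tau."""
    return (tau + 1) / 2


# Toy example: one swapped pair out of six possible pairs.
tau, fraction = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
```

In the toy example, five of the six pairs are ordered consistently, and `concordant_fraction(tau)` recovers the same fraction from the coefficient alone.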

Conclusion: This study demonstrates the strengths of a multi-modal diagnostic tool using both otoscopy images and wideband tympanometry measurements for the diagnosis of otitis media. Furthermore, we show that deep metric learning improves the performance of the models.

Keywords: computer‐aided diagnosis; deep learning; diagnostic difficulty; otitis media.