Automatic Detection System for Velopharyngeal Insufficiency Based on Acoustic Signals from Nasal and Oral Channels

Diagnostics (Basel). 2023 Aug 21;13(16):2714. doi: 10.3390/diagnostics13162714.

Abstract

Velopharyngeal insufficiency (VPI) is a type of pharyngeal function dysfunction that causes speech impairment and swallowing disorder. Speech therapists play a key role on the diagnosis and treatment of speech disorders. However, there is a worldwide shortage of experienced speech therapists. Artificial intelligence-based computer-aided diagnosing technology could be a solution for this. This paper proposes an automatic system for VPI detection at the subject level. It is a non-invasive and convenient approach for VPI diagnosis. Based on the principle of impaired articulation of VPI patients, nasal- and oral-channel acoustic signals are collected as raw data. The system integrates the symptom discriminant results at the phoneme level. For consonants, relative prominent frequency description and relative frequency distribution features are proposed to discriminate nasal air emission caused by VPI. For hypernasality-sensitive vowels, a cross-attention residual Siamese network (CARS-Net) is proposed to perform automatic VPI/non-VPI classification at the phoneme level. CARS-Net embeds a cross-attention module between the two branches to improve the VPI/non-VPI classification model for vowels. We validate the proposed system on a self-built dataset, and the accuracy reaches 98.52%. This provides possibilities for implementing automatic VPI diagnosis.

Keywords: automatic diagnosis; deep learning; speech disorder; velopharyngeal insufficiency.