Objective: Ambulatory phonation monitoring (APM) has a long and evolving history. Current devices mostly use a contact microphone or accelerometer placed over the anterior neck, which limits their general acceptance outside of academic settings. This study used wireless Bluetooth earphones to capture voice signals. We also designed a mobile app with a personalized AI model to identify phonation segments.
Study design: Proof of concept study.
Setting: Acoustic laboratory.
Methods: The materials comprised 1-hour audio files from seven teachers recorded in the classroom. The first 5 minutes were used to train the personalized SpeechDetection models using deep neural networks. Another six segments (30 seconds each) were selected to assess the accuracy of this APM system on two parameters: (1) speech intensity, which was compared with the gold standard measured by CLIO 12, a professional system for voice recording, and (2) phonation segments, which were compared with manual labeling.
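As an illustration of how detected phonation segments might be compared with manual labeling, the sketch below computes frame-level accuracy and Cohen's kappa for two binary label sequences. It is not the authors' implementation: the function name `accuracy_and_kappa`, the frame-based representation, and the example labels are all assumptions made for this sketch.

```python
def accuracy_and_kappa(predicted, reference):
    """Return (accuracy, Cohen's kappa) for two equal-length binary label lists.

    Labels are assumed to be 1 for a phonation frame and 0 otherwise.
    """
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("label sequences must be non-empty and equal in length")
    n = len(predicted)

    # Observed agreement: fraction of frames where both labelings match.
    observed = sum(p == r for p, r in zip(predicted, reference)) / n

    # Chance agreement from each labeling's marginal rate of phonation frames.
    p_pred = sum(predicted) / n
    p_ref = sum(reference) / n
    expected = p_pred * p_ref + (1 - p_pred) * (1 - p_ref)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")

    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical labels for ten frames (not data from the study).
manual = [1, 1, 1, 0, 0, 1, 1, 0, 0, 0]
detected = [1, 1, 0, 0, 0, 1, 1, 1, 0, 0]
acc, kappa = accuracy_and_kappa(detected, manual)
print(f"accuracy={acc:.3f}, kappa={kappa:.3f}")
```

Kappa corrects the raw agreement for the agreement expected by chance, which is why it can be noticeably lower than the accuracy when phonation and non-phonation frames are unevenly distributed.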
Results: The training accuracy of the SpeechDetection model ranged from 91.2% to 98.5%, with a mean of 95.4%. The testing accuracy for detecting phonation segments ranged from 88.4% to 97.0% (mean: 91.5%). The kappa value for agreement ranged from 0.710 to 0.931 (mean: 0.813, P < 0.001 for all seven participants). After linear calibration, the Pearson correlation coefficient between the measured speech intensity and the gold standard ranged from 0.846 to 0.927 (mean: 0.885, P < 0.001).
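The linear calibration and Pearson correlation reported above can be sketched as follows. This is a minimal example under stated assumptions, not the study's procedure: it assumes paired intensity readings in dB from the earphone system and the reference system, fits an ordinary least-squares line, and uses hypothetical function names and made-up values.

```python
from math import sqrt


def fit_linear_calibration(device, reference):
    """Least-squares slope and intercept mapping device readings to reference."""
    n = len(device)
    if n != len(reference) or n < 2:
        raise ValueError("need at least two paired readings")
    mean_x = sum(device) / n
    mean_y = sum(reference) / n
    sxx = sum((x - mean_x) ** 2 for x in device)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(device, reference))
    if sxx == 0:
        raise ValueError("device readings must not all be identical")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


# Hypothetical paired readings in dB (not data from the study).
device_db = [50.0, 55.0, 60.0, 65.0, 70.0]
reference_db = [62.0, 66.0, 72.0, 76.0, 82.0]

slope, intercept = fit_linear_calibration(device_db, reference_db)
calibrated_db = [slope * x + intercept for x in device_db]
r = pearson_r(calibrated_db, reference_db)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, r={r:.4f}")
```

In this sketch the calibration removes scale and offset differences between the two systems. The Pearson coefficient itself is unchanged by a linear map with positive slope, so it describes how closely the two sets of readings track each other rather than their absolute error.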
Conclusions: The study results demonstrated that a novel APM system using wireless earphones with a mobile app can accurately measure phonation segments and speech intensity for teachers in the classroom. Further experiments in different environments and with more participants are needed before extrapolating this system to real-world use cases.
Level of evidence: N/A.
Keywords: Artificial intelligence; Dosimetry; Machine learning.
Copyright © 2024 The Voice Foundation. Published by Elsevier Inc. All rights reserved.