Effect of simulated hearing loss on automatic speech recognition for an android robot-patient

Front Robot AI. 2024 Sep 2;11:1391818. doi: 10.3389/frobt.2024.1391818. eCollection 2024.

Abstract

The importance of simulating patient behavior for medical assessment training has grown in recent decades due to the increasing variety of simulation tools, including standardized/simulated patients and humanoid and android robot-patients. Yet current android robot-patients still need improvement to simulate patient behavior accurately, and accounting for hearing loss is particularly important. This paper is the first to consider hearing loss simulation in an android robot-patient, and its results provide valuable insights for future developments. For this purpose, an open-source dataset of audio data and audiograms from human listeners was used to simulate the effect of hearing loss on an automatic speech recognition (ASR) system. The performance of the system was evaluated in terms of both word error rate (WER) and word information preserved (WIP). A comparison of ASR models commonly used in robotics shows that model size alone is insufficient to predict ASR performance in the presence of simulated hearing loss. However, although the absolute values of WER and WIP do not predict intelligibility for human listeners, they correlate highly with it and thus could be used, for example, to compare the performance of hearing aid algorithms.
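The two evaluation metrics named in the abstract have standard definitions: WER = (S + D + I) / N and WIP = (H / N) · (H / P), where H, S, D, and I are the hits, substitutions, deletions, and insertions in a word-level alignment, N is the reference length, and P is the hypothesis length. As a minimal sketch (not code from the paper; the definitions follow common ASR toolkits such as jiwer), both metrics can be computed from a dynamic-programming alignment:

```python
def align_counts(ref, hyp):
    """Return (hits, substitutions, deletions, insertions) from a
    minimum-edit-distance alignment of two word lists."""
    n, p = len(ref), len(hyp)
    # cost[i][j] = minimum edits to align ref[:i] with hyp[:j]
    cost = [[0] * (p + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, p + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, p + 1):
            diag = cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            cost[i][j] = min(diag, cost[i - 1][j] + 1, cost[i][j - 1] + 1)
    # Backtrack one optimal path, counting operation types.
    h = s = d = ins = 0
    i, j = n, p
    while i > 0 or j > 0:
        if (i > 0 and j > 0
                and cost[i][j] == cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])):
            if ref[i - 1] == hyp[j - 1]:
                h += 1
            else:
                s += 1
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            d += 1
            i -= 1
        else:
            ins += 1
            j -= 1
    return h, s, d, ins

def wer_wip(reference, hypothesis):
    """Word error rate and word information preserved for two transcripts."""
    ref, hyp = reference.split(), hypothesis.split()
    h, s, d, ins = align_counts(ref, hyp)
    wer = (s + d + ins) / len(ref)
    wip = (h / len(ref)) * (h / len(hyp)) if hyp else 0.0
    return wer, wip

# Example sentences are illustrative, not taken from the dataset used in the paper.
wer, wip = wer_wip("the patient cannot hear well", "the patient can hear")
```

Note that WER can exceed 1.0 when the hypothesis contains many insertions, whereas WIP is bounded between 0 and 1, which is one reason the two metrics can rank systems differently.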

Keywords: android robot-patient; automatic speech recognition; hearing loss simulation; patient simulation; simulated patient.

Grants and funding

The author(s) declare that financial support was received for the research, authorship, and/or publication of this article. This work was supported by the projects Augmented Auditory Intelligence (A2I) and Prof werden und Prof sein in Bremerhaven (BeProf@BHV), funded by the German Federal Ministry of Education and Research (BMBF) under grant numbers 16SV8594 and 03FHP184A, respectively. The simulations were performed at the HPC Cluster CARL, located at the University of Oldenburg (Germany) and funded by the DFG through its Major Research Instrumentation Programme (INST 184/157-1 FUGG) and by the Ministry of Science and Culture (MWK) of the State of Lower Saxony.