An ideal compressed mask for increasing speech intelligibility without sacrificing environmental sound recognition

Eric M Johnson; Eric W Healy

doi:10.1121/10.0034599

J Acoust Soc Am. 2024 Dec 1;156(6):3958-3969. doi: 10.1121/10.0034599.

Authors

Eric M Johnson¹, Eric W Healy¹

Affiliation

¹ Department of Speech and Hearing Science, and Center for Cognitive and Brain Sciences, The Ohio State University, Columbus, Ohio 43210, USA.

PMID: 39666959
PMCID: PMC11646135 (available on 2025-12-01)
DOI: 10.1121/10.0034599

Abstract

Hearing impairment is often characterized by poor speech-in-noise recognition. State-of-the-art laboratory-based noise-reduction technology can eliminate background sounds from a corrupted speech signal and improve intelligibility, but it can also hinder environmental sound recognition (ESR), which is essential for personal independence and safety. This paper presents a time-frequency mask, the ideal compressed mask (ICM), that aims to provide listeners with improved speech intelligibility without substantially reducing ESR. This is accomplished by limiting the maximum attenuation that the mask performs. Speech intelligibility and ESR for hearing-impaired and normal-hearing listeners were measured using stimuli that had been processed by ICMs with various levels of maximum attenuation. This processing resulted in significantly improved intelligibility while retaining high ESR performance for both types of listeners. It was also found that the same level of maximum attenuation provided the optimal balance of intelligibility and ESR for both listener types. It is argued that future deep-learning-based noise reduction algorithms may provide better outcomes by balancing the levels of the target speech and the background environmental sounds, rather than eliminating all signals except for the target speech. The ICM provides one such simple solution for frequency-domain models.
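
The abstract states only that the ICM limits the maximum attenuation the mask performs; it does not give the mask's formula. As a minimal sketch of that one idea, and not the authors' definition, the Python below computes a conventional ratio mask from known speech and noise magnitude spectrograms and then floors its gain at a chosen attenuation limit. The function names, the square-root ratio mask, the hard floor, and the 20 dB value are all assumptions made for illustration.

```python
import numpy as np


def ratio_mask(speech_mag, noise_mag, eps=1e-12):
    """Ratio mask in [0, 1] per time-frequency unit (assumed form, not from the paper)."""
    s2 = np.square(speech_mag)
    n2 = np.square(noise_mag)
    return np.sqrt(s2 / (s2 + n2 + eps))


def limit_attenuation(mask, max_atten_db):
    """Floor the gain so no unit is attenuated by more than max_atten_db (hypothetical helper)."""
    floor_gain = 10.0 ** (-max_atten_db / 20.0)
    return np.maximum(mask, floor_gain)


# Toy 2x2 magnitude "spectrograms": rows are frequency bins, columns are frames.
speech_mag = np.array([[1.0, 0.0], [0.5, 0.1]])
noise_mag = np.array([[0.0, 1.0], [0.5, 1.0]])

unlimited = ratio_mask(speech_mag, noise_mag)
limited = limit_attenuation(unlimited, max_atten_db=20.0)  # 20 dB is an arbitrary example

# Applying the limited mask to the mixture magnitude keeps some background energy
# in noise-dominated units instead of removing it entirely.
mixture_mag = speech_mag + noise_mag  # crude stand-in for the true mixture magnitude
processed_mag = limited * mixture_mag
```

In this sketch a noise-only unit, which the unlimited mask would set to zero, is instead passed at the floor gain, so background sounds stay audible at a reduced level. Varying `max_atten_db` mirrors, in spirit, the study's comparison of several levels of maximum attenuation.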

MeSH terms

Acoustic Stimulation / methods
Adult
Aged
Female
Hearing Loss / physiopathology
Hearing Loss / rehabilitation
Humans
Male
Middle Aged
Noise* / adverse effects
Perceptual Masking*
Sound
Speech Intelligibility*
Speech Perception*
Young Adult
