Decision-directed speech power spectral density matrix estimation for multichannel speech enhancement

Yu Gwang Jin; Jong Won Shin; Nam Soo Kim

doi:10.1121/1.4977098

J Acoust Soc Am. 2017 Mar;141(3):EL228. doi: 10.1121/1.4977098.

Authors

Yu Gwang Jin¹, Jong Won Shin², Nam Soo Kim³

Affiliations

¹ Corporate R&D Center, SK Telecom Co., Ltd., 65 Eulji-ro, Jung-gu, Seoul 04539, Korea.
² School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology, 123 Cheomdan-gwagiro, Buk-gu, Gwangju 61005, Korea.
³ School of Electrical and Computer Engineering and Institute of New Media and Communications, Seoul National University, 1 Gwanak-ro, Gwanak-gu, Seoul 08826, Korea.

PMID: 28372120
DOI: 10.1121/1.4977098

Abstract

This letter proposes a multichannel decision-directed approach to estimating the speech power spectral density (PSD) matrix for multichannel speech enhancement. There have been attempts to build multichannel speech enhancement filters that depend only on the speech and noise PSD matrices; for such filters, an accurate estimate of the clean speech PSD matrix is crucial to successful noise reduction. In contrast to the maximum likelihood estimator that has conventionally been applied, the proposed decision-directed method tracks time-varying speech characteristics more robustly and improves noise reduction performance under various noise environments.
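
As a rough illustration of the contrast drawn in the abstract, the Python (NumPy) sketch below places an instantaneous maximum-likelihood-style estimate of the speech PSD matrix next to a decision-directed-style recursion for a single frequency bin. The abstract does not state the estimator's equations, so everything here is an assumption: the recursion is modeled on the classic single-channel decision-directed rule, and the function names, the smoothing factor `alpha = 0.98`, the eigenvalue clipping, and the multichannel Wiener filter are hypothetical choices for illustration, not the authors' formulation.

```python
import numpy as np

def ml_speech_psd(y, phi_v):
    """Instantaneous ML-style estimate: noisy outer product minus noise PSD matrix."""
    return np.outer(y, y.conj()) - phi_v

def project_psd(m):
    """Force Hermitian symmetry and clip negative eigenvalues to zero (assumed safeguard)."""
    m = 0.5 * (m + m.conj().T)
    w, v = np.linalg.eigh(m)
    return (v * np.maximum(w, 0.0)) @ v.conj().T

def dd_speech_psd(y, phi_v, x_prev, alpha=0.98):
    """Hypothetical decision-directed update: blend the previous frame's
    enhanced-speech outer product with the clipped instantaneous estimate."""
    prev_term = np.outer(x_prev, x_prev.conj())
    inst_term = project_psd(ml_speech_psd(y, phi_v))
    return alpha * prev_term + (1.0 - alpha) * inst_term

def mwf(phi_x, phi_v):
    """Multichannel Wiener filter matrix W, so that x_hat = W @ y."""
    return phi_x @ np.linalg.inv(phi_x + phi_v)

def enhance_bin(frames, phi_v, alpha=0.98):
    """Run the recursion over frames (shape: n_frames x n_mics) of one frequency bin."""
    frames = np.asarray(frames, dtype=complex)
    x_prev = np.zeros(frames.shape[1], dtype=complex)
    out = np.empty_like(frames)
    for l, y in enumerate(frames):
        phi_x = dd_speech_psd(y, phi_v, x_prev, alpha)
        x_prev = mwf(phi_x, phi_v) @ y  # enhanced speech feeds the next frame
        out[l] = x_prev
    return out
```

In this sketch the filter depends only on the speech and noise PSD matrices, matching the class of filters the abstract describes. The noise PSD matrix `phi_v` is taken as known and positive definite; in practice it would have to be estimated separately.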

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Acoustics*
Fourier Analysis
Humans
Models, Theoretical*
Motion
Noise / adverse effects*
Signal Processing, Computer-Assisted*
Sound Spectrography
Speech Production Measurement / methods*
Speech*
Vibration