Automatic detection of the second subglottal resonance and its application to speaker normalization

Shizhen Wang; Steven M Lulich; Abeer Alwan

doi:10.1121/1.3257185

J Acoust Soc Am. 2009 Dec;126(6):3268-77. doi: 10.1121/1.3257185.

Authors

Shizhen Wang¹, Steven M Lulich, Abeer Alwan

Affiliation

¹ Department of Electrical Engineering, University of California, Los Angeles, California 90095, USA.

PMID: 20000940

Abstract

Speaker normalization typically focuses on inter-speaker variability of the supraglottal (vocal tract) resonances, which constitutes a major cause of spectral mismatch. Recent studies have shown that the subglottal airways also affect spectral properties of speech sounds, and promising results were reported using the subglottal resonances for speaker normalization. This paper proposes a reliable algorithm to automatically estimate the second subglottal resonance (Sg2) from speech signals. The algorithm is calibrated on children's speech data with simultaneous accelerometer recordings from which Sg2 frequencies can be directly measured. A cross-language study with bilingual Spanish-English children is performed to investigate whether Sg2 frequencies are independent of speech content and language. The study verifies that Sg2 is approximately constant for a given speaker and thus can be a good candidate for limited-data speaker normalization and cross-language adaptation. A speaker normalization method using Sg2 is then presented. This method is computationally more efficient than maximum-likelihood-based vocal tract length normalization (VTLN), with performance better than VTLN for limited adaptation data and cross-language adaptation. Experimental results confirm that this method performs well in a variety of testing conditions and tasks.
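
The abstract does not spell out how an Sg2 estimate is turned into a normalization. As a rough illustration only, the sketch below assumes a simple linear frequency warping in which the warping factor is the ratio of a reference Sg2 to the speaker's Sg2. The function names, the reference value, and the linear-interpolation resampling are all hypothetical choices made for this example and are not taken from the paper.

```python
def sg2_warp_factor(sg2_speaker_hz, sg2_reference_hz):
    """Hypothetical warping factor: ratio of reference Sg2 to speaker Sg2."""
    if sg2_speaker_hz <= 0 or sg2_reference_hz <= 0:
        raise ValueError("Sg2 frequencies must be positive")
    return sg2_reference_hz / sg2_speaker_hz


def warp_spectrum(magnitudes, alpha):
    """Linearly warp a magnitude spectrum so that frequency f maps to alpha * f.

    Output bin k takes the (linearly interpolated) input value at bin k / alpha.
    Positions past the last input bin are clamped to the last input value.
    """
    n = len(magnitudes)
    warped = []
    for k in range(n):
        src = k / alpha              # position in the unwarped spectrum
        lo = int(src)                # lower neighbouring bin
        if lo >= n - 1:
            warped.append(magnitudes[-1])
            continue
        frac = src - lo              # fractional distance to the upper bin
        warped.append((1 - frac) * magnitudes[lo] + frac * magnitudes[lo + 1])
    return warped


# Toy example with made-up values: a speaker whose Sg2 is 1600 Hz,
# normalized toward an assumed reference Sg2 of 1400 Hz.
spectrum = [0.0] * 101
spectrum[80] = 1.0                   # a single spectral peak at bin 80
alpha = sg2_warp_factor(1600.0, 1400.0)
warped = warp_spectrum(spectrum, alpha)
peak_bin = warped.index(max(warped))
```

Because the factor comes from a single per-speaker measurement rather than a search over candidate warping factors, a scheme of this kind avoids the repeated likelihood evaluations that maximum-likelihood VTLN needs, which is consistent with the efficiency claim in the abstract.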

Publication types

Comparative Study
Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

Adolescent
Adult
Algorithms*
Automation*
Calibration
Child
Child, Preschool
Female
Humans
Language
Larynx / physiology*
Male
Models, Biological
Multilingualism
Phonetics
Sound Spectrography
Speech / physiology*
Speech Acoustics*