Estimating speech spectra for copy synthesis by linear prediction and by hand

Robert E Remez; Kathryn R Dubowski; Morgana L Davids; Emily F Thomas; Nina U Paddu; Yael S Grossman; Marina Moskalenko

doi:10.1121/1.3631667

J Acoust Soc Am. 2011 Oct;130(4):2173-8. doi: 10.1121/1.3631667.

Authors

Robert E Remez¹, Kathryn R Dubowski, Morgana L Davids, Emily F Thomas, Nina U Paddu, Yael S Grossman, Marina Moskalenko

Affiliation

¹ Department of Psychology, Barnard College, Columbia University, New York, New York 10027, USA.

Abstract

Linear prediction is a widely available technique for analyzing acoustic properties of speech, although this method is known to be error-prone. New tests assessed the adequacy of linear prediction estimates by using this method to derive synthesis parameters and testing the intelligibility of the synthetic speech that results. Matched sets of sine-wave sentences were created, one set using uncorrected linear prediction estimates of natural sentences, the other using estimates made by hand. Phoneme restrictions imposed on linguistic properties allowed comparisons between continuous and intermittent voicing, oral or nasal and fricative manner, and unrestricted phonemic variation. Intelligibility tests revealed uniformly good performance with sentences created by hand-estimation and a minimal decrease in intelligibility with estimation by linear prediction due to manner variation with continuous voicing. Poorer performance was observed when linear prediction estimates were used to produce synthetic versions of phonemically unrestricted sentences, but no similar decline was observed with synthetic sentences produced by hand estimation. The results show a substantial intelligibility cost of reliance on uncorrected linear prediction estimates when phonemic variation approaches natural incidence.
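As an illustration of the kind of uncorrected linear prediction estimate the abstract refers to, the following is a minimal sketch of formant estimation for a single frame, assuming the autocorrelation method with Levinson-Durbin recursion. The abstract does not say which linear prediction variant, analysis order, or window the authors used, so those choices, and the function names `lpc_autocorr` and `estimate_formants`, are hypothetical.

```python
import numpy as np


def lpc_autocorr(frame, order):
    """Return prediction-error filter coefficients [1, a1, ..., ap].

    Assumption: autocorrelation method on a Hamming-windowed frame,
    solved with the Levinson-Durbin recursion.
    """
    windowed = np.asarray(frame, dtype=float) * np.hamming(len(frame))
    n = len(windowed)
    # Autocorrelation at lags 0..order.
    r = np.array([np.dot(windowed[: n - k], windowed[k:]) for k in range(order + 1)])

    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            # Silent or degenerate frame: stop and keep the lower-order model.
            break
        # Reflection coefficient for model order i.
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err
        prev = a.copy()
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a


def estimate_formants(frame, fs, order=10):
    """Estimate resonance frequencies and bandwidths (Hz) for one frame.

    Each complex pole of the all-pole model with positive imaginary part
    is read as one candidate resonance. No correction is applied, so
    spurious or merged peaks are returned as-is.
    """
    a = lpc_autocorr(frame, order)
    poles = np.roots(a)
    poles = poles[np.imag(poles) > 0.0]
    freqs = np.angle(poles) * fs / (2.0 * np.pi)
    bandwidths = -np.log(np.abs(poles)) * fs / np.pi
    idx = np.argsort(freqs)
    return freqs[idx], bandwidths[idx]
```

A call such as `estimate_formants(frame, fs=10000.0, order=10)` returns candidate frequencies in ascending order. Treating the lowest few as formant tracks without inspection corresponds to the uncorrected condition described in the abstract, whereas hand estimation would involve checking each candidate against the spectrum.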

Publication types

Comparative Study
Research Support, N.I.H., Extramural

MeSH terms

Acoustic Stimulation
Adult
Analysis of Variance
Audiometry, Speech
Humans
Linear Models*
Male
Phonetics*
Signal Processing, Computer-Assisted*
Sound Spectrography
Speech Acoustics*
Speech Intelligibility*
Speech Recognition Software*
Time Factors
