Acoustic-to-phonetic mapping using recurrent neural networks

IEEE Trans Neural Netw. 1994;5(4):659-62. doi: 10.1109/72.298235.

Abstract

This paper describes the application of artificial neural networks to acoustic-to-phonetic mapping. The experiments described are typical of problems in speech recognition in which the temporal nature of the input sequence is critical. The specific task considered is that of mapping formant contours to the corresponding CVC' syllable. We performed experiments on formant data extracted from the acoustic speech signal spoken at two different tempos (slow and normal) using networks based on the Elman simple recurrent network model. Our results show that the Elman networks used in these experiments were successful in performing the acoustic-to-phonetic mapping from formant contours. Consequently, we demonstrate that relatively simple networks, readily trained using standard backpropagation techniques, are capable of initial and final consonant discrimination and vowel identification for variable speech rates.
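
The abstract does not give the network's dimensions, activation functions, or output encoding. As a rough illustration of the Elman simple recurrent network idea only, the sketch below runs a forward pass over formant contours of two different lengths, standing in for the slow and normal tempos. Everything specific here is an assumption made for illustration: three formant inputs per frame (F1 to F3), eight hidden units, four output classes, logistic hidden units, a softmax output, random untrained weights, and synthetic contours produced by the hypothetical helper `synthetic_contour`. It is not the authors' implementation, and training by backpropagation is omitted.

```python
import numpy as np


def init_elman(n_in, n_hidden, n_out, seed=0):
    """Create small random weights for an Elman-style network (sizes are assumptions)."""
    rng = np.random.default_rng(seed)
    return {
        "W_xh": rng.normal(0.0, 0.1, (n_hidden, n_in)),      # input -> hidden
        "W_ch": rng.normal(0.0, 0.1, (n_hidden, n_hidden)),  # context -> hidden
        "b_h": np.zeros(n_hidden),
        "W_hy": rng.normal(0.0, 0.1, (n_out, n_hidden)),     # hidden -> output
        "b_y": np.zeros(n_out),
    }


def elman_forward(params, frames):
    """Run the network over a sequence of frames; return one output row per frame."""
    context = np.zeros(params["b_h"].shape[0])  # context units start at zero
    outputs = []
    for x in frames:
        # Hidden units see the current frame plus the previous hidden state.
        z = params["W_xh"] @ x + params["W_ch"] @ context + params["b_h"]
        hidden = 1.0 / (1.0 + np.exp(-z))  # logistic activation (assumed)
        logits = params["W_hy"] @ hidden + params["b_y"]
        exp = np.exp(logits - logits.max())  # numerically stable softmax
        outputs.append(exp / exp.sum())
        context = hidden  # context units hold a copy of the hidden state
    return np.array(outputs)


def synthetic_contour(n_frames):
    """Hypothetical stand-in for F1-F3 contours (made-up values, in kHz)."""
    t = np.linspace(0.0, 1.0, n_frames)
    f1 = 0.5 + 0.2 * np.sin(np.pi * t)
    f2 = 1.5 - 0.3 * t
    f3 = 2.5 + 0.1 * t
    return np.stack([f1, f2, f3], axis=1)


params = init_elman(n_in=3, n_hidden=8, n_out=4)
slow = elman_forward(params, synthetic_contour(30))    # more frames: slower tempo
normal = elman_forward(params, synthetic_contour(20))  # fewer frames: normal tempo
print(slow.shape, normal.shape)
```

Because the context units carry the hidden state from one frame to the next, the same weights can process sequences of any length, which is what makes this family of networks a natural fit for speech spoken at different rates.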