A modular architecture for articulatory synthesis from gestural specification

Rachel Alexander; Tanner Sorensen; Asterios Toutios; Shrikanth Narayanan

doi:10.1121/1.5139413

J Acoust Soc Am. 2019 Dec;146(6):4458. doi: 10.1121/1.5139413.

Authors

Rachel Alexander¹, Tanner Sorensen¹, Asterios Toutios¹, Shrikanth Narayanan¹

Affiliation

¹ Signal Analysis & Interpretation Laboratory (SAIL), University of Southern California, Los Angeles, California 90007, USA.

Abstract

This paper proposes a modular architecture for articulatory synthesis from a gestural specification. The architecture comprises relatively simple models of the vocal tract, the glottis, aero-acoustics, and articulatory control. The vocal tract module combines a statistical articulatory model of the midsagittal plane, derived by factor analysis of air-tissue boundaries in real-time magnetic resonance imaging data, with an αβ model that converts midsagittal sections to area functions. The aero-acoustics and glottis models are based on a software implementation of classic work by Maeda. Inspired by the task dynamics model, the articulatory control module uses dynamical systems that implement articulatory gestures to animate the statistical articulatory model. Results are presented for the synthesis of vowel-consonant-vowel sequences with plosive consonants, using models that were built on data from two different speakers and that simulate their behavior.
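
The two ideas named in the abstract, a gesture implemented as a dynamical system and an αβ conversion from midsagittal distance to area, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the common textbook forms: a critically damped second-order system for the gesture, as in task dynamics, and the power law A = α·d^β for the area. The function names, the single scalar constriction variable, and the values of α, β, stiffness, and timing are all hypothetical choices for illustration. In the paper the dynamical systems animate a factor-based articulatory model rather than one scalar.

```python
import math


def alpha_beta_area(d_cm, alpha=1.5, beta=1.4):
    """Map a midsagittal distance d (cm) to an area (cm^2) via A = alpha * d**beta.

    alpha and beta are illustrative constants (assumed); in practice such
    coefficients typically vary along the vocal tract and across speakers.
    Negative distances are clamped to zero, i.e. full closure.
    """
    return alpha * max(d_cm, 0.0) ** beta


def simulate_gesture(z0, target, omega=50.0, dt=0.001, duration=0.3):
    """Integrate a critically damped gesture: z'' = -2*omega*z' - omega**2 * (z - target).

    z0 is the initial constriction degree, target is the gestural goal, and
    omega (rad/s) is the natural frequency (assumed value). Integration uses
    semi-implicit Euler; the returned list holds z at every time step.
    """
    z, v = z0, 0.0
    trajectory = [z]
    for _ in range(int(round(duration / dt))):
        accel = -2.0 * omega * v - omega ** 2 * (z - target)
        v += accel * dt  # update velocity first
        z += v * dt      # then position, using the new velocity
        trajectory.append(z)
    return trajectory


# Closing gesture of a plosive: constriction degree moves from 1 cm to closure.
closing = simulate_gesture(z0=1.0, target=0.0)
areas = [alpha_beta_area(d) for d in closing]
print(f"initial area: {areas[0]:.3f} cm^2, final area: {areas[-1]:.6f} cm^2")
```

In a modular synthesizer of the kind the abstract describes, the area values produced this way would be passed on to the aero-acoustic module. That stage is outside the scope of this sketch.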

Publication types

Research Support, N.I.H., Extramural
Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

Acoustics
Gestures
Glottis / physiology*
Humans
Phonetics*
Speech / physiology*
Speech Acoustics*

Grants and funding

R01 DC007124/DC/NIDCD NIH HHS/United States