Automatic acoustic synthesis of human-like laughter

J Acoust Soc Am. 2007 Jan;121(1):527-35. doi: 10.1121/1.2390679.

Abstract

A technique to synthesize laughter based on the time-domain behavior of real instances of human laughter is presented. In the speech synthesis community, interest in improving the expressive quality of synthetic speech has grown considerably. While the focus has been on the linguistic aspects, such as precise control of speech intonation to achieve desired expressiveness, inclusion of nonlinguistic cues could further enhance the expressive quality of synthetic speech. Laughter is one such cue, used for communicating, say, a happy or amusing context. It can be generated in many varieties and qualities: from a short exhalation to a long full-blown episode. Laughter is modeled at two levels: the overall episode level and the local call level. The first attempts to capture the overall temporal behavior in a parametric model based on the equations that govern the simple harmonic motion of a mass-spring system. By changing a set of easily available parameters, the authors are able to synthesize a variety of laughter. At the call level, the authors relied on a standard linear-prediction-based analysis-synthesis model. Results of subjective tests to assess the acceptability and naturalness of the synthetic laughter relative to real human laughter samples are presented.
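To make the two-level structure concrete, the following is a minimal sketch in Python (NumPy only). It is not the authors' implementation: the damped mass-spring trajectory stands in for their episode-level parametric model, a fixed two-pole resonator stands in for their linear-prediction call model, and every numeric parameter value is an illustrative assumption.

```python
# Sketch of a two-level laughter synthesizer, under the assumptions above.
# Episode level: a damped mass-spring (simple harmonic motion) trajectory
# sets the amplitude of each successive laugh call.
# Call level: each "ha" is a voiced burst shaped by a crude all-pole
# resonator, a stand-in for an LPC analysis-synthesis filter.

import numpy as np

FS = 16000  # sampling rate, Hz

def call_amplitudes(n_calls, omega=4.0, zeta=0.25, dt=0.35):
    """Sample a damped-oscillator trajectory at each call onset:
    exp(-zeta*omega*t) * (0.7 + 0.3*cos(omega_d*t)). The cosine ripple
    is kept partial so no call decays to complete silence."""
    t = np.arange(n_calls) * dt
    omega_d = omega * np.sqrt(1.0 - zeta ** 2)
    return np.exp(-zeta * omega * t) * (0.7 + 0.3 * np.cos(omega_d * t))

def synth_call(duration=0.15, f0=220.0, amp=1.0):
    """One call: a pulse train plus breath noise passed through a
    two-pole resonator near 700 Hz (rough first formant of /a/)."""
    n = int(duration * FS)
    t = np.arange(n) / FS
    source = np.sin(np.pi * f0 * t) ** 8 + 0.05 * np.random.randn(n)
    r = np.exp(-np.pi * 150.0 / FS)           # pole radius, ~150 Hz bandwidth
    a1 = -2.0 * r * np.cos(2 * np.pi * 700.0 / FS)
    a2 = r * r
    y = np.zeros(n)
    for i in range(2, n):                     # all-pole difference equation
        y[i] = source[i] - a1 * y[i - 1] - a2 * y[i - 2]
    y *= np.hanning(n)                        # rise-fall energy within the call
    return amp * y / (np.abs(y).max() + 1e-9)

def synth_episode(n_calls=8, gap=0.20):
    """Concatenate calls and inter-call pauses into one laugh episode."""
    amps = call_amplitudes(n_calls)
    pause = np.zeros(int(gap * FS))
    return np.concatenate([np.concatenate([synth_call(amp=a), pause])
                           for a in amps])

laugh = synth_episode()  # write to WAV with e.g. scipy.io.wavfile if desired
```

Varying omega, zeta, dt, and n_calls plays the role the abstract describes for the "easily available parameters": the same machinery yields anything from a short single exhalation (n_calls=1) to a long, slowly decaying episode.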

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Communication Devices for People with Disabilities*
  • Computers
  • Humans
  • Laughter*
  • Models, Biological*
  • Software
  • Speech Acoustics*
  • User-Computer Interface*