Learning diphone-based segmentation

Robert Daland; Janet B Pierrehumbert

doi:10.1111/j.1551-6709.2010.01160.x

Cogn Sci. 2011 Jan-Feb;35(1):119-55. doi: 10.1111/j.1551-6709.2010.01160.x. Epub 2010 Dec 9.

Authors

Robert Daland¹, Janet B Pierrehumbert

Affiliation

¹ Department of Linguistics, UCLA, Los Angeles, CA 90095-1543, USA.

PMID: 21428994

Abstract

This paper reconsiders the diphone-based word segmentation model of Cairns, Shillcock, Chater, and Levy (1997) and Hockema (2006), previously thought to be unlearnable. A statistically principled learning model is developed using Bayes' theorem and reasonable assumptions about infants' implicit knowledge. The ability to recover phrase-medial word boundaries is tested using phonetic corpora derived from spontaneous interactions with children and adults. The (unsupervised and semi-supervised) learning models are shown to exhibit several crucial properties. First, only a small amount of language exposure is required to achieve the model's ceiling performance, equivalent to between 1 day and 1 month of caregiver input. Second, the models are robust to variation, both in the free parameter and the input representation. Finally, both the learning and baseline models exhibit undersegmentation, argued to have significant ramifications for speech processing as a whole.
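To make the idea of diphone-based segmentation concrete, the following is a minimal sketch of a supervised relative-frequency baseline. It is not the authors' learning model. By Bayes' theorem, P(boundary | xy) = P(xy | boundary) P(boundary) / P(xy), which reduces to a ratio of counts when word boundaries are observed in training. The function names, the one-character-per-phone encoding, and the 0.5 decision threshold are all assumptions made for illustration.

```python
from collections import Counter

def train(phrases):
    """Count diphones overall and diphones that span a word boundary.

    Each phrase is a list of words; each word is a string in which every
    character stands for one phone (a toy assumption).
    """
    boundary, total = Counter(), Counter()
    for words in phrases:
        phones, starts = [], set()
        for word in words:
            if phones:
                starts.add(len(phones))  # index where a new word begins
            phones.extend(word)
        for i in range(1, len(phones)):
            diphone = (phones[i - 1], phones[i])
            total[diphone] += 1
            if i in starts:
                boundary[diphone] += 1
    return boundary, total

def p_boundary(diphone, boundary, total):
    """Relative-frequency estimate of P(boundary | diphone); 0.0 if unseen."""
    if total[diphone] == 0:
        return 0.0
    return boundary[diphone] / total[diphone]

def segment(phones, boundary, total, threshold=0.5):
    """Insert a boundary inside each diphone whose estimate exceeds threshold."""
    if not phones:
        return []
    words, current = [], [phones[0]]
    for prev, nxt in zip(phones, phones[1:]):
        if p_boundary((prev, nxt), boundary, total) > threshold:
            words.append("".join(current))
            current = []
        current.append(nxt)
    words.append("".join(current))
    return words

# Toy corpus (hypothetical data, letters standing in for phones).
corpus = [["the", "dog"], ["the", "cat"], ["a", "dog"]]
boundary_counts, total_counts = train(corpus)
```

With this toy corpus, `segment("thedog", boundary_counts, total_counts)` recovers the two words, while a sequence made of diphones never seen spanning a boundary, such as `"acat"`, is left as one unit. That is a small-scale instance of the undersegmentation the abstract describes.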

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Adult
Bayes Theorem
Child
Computer Simulation
Humans
Infant
Language Development*
Learning*
Models, Psychological*
Models, Statistical
Phonetics*
Speech*