Objective: The purpose of this study was to measure auditory evoked potentials elicited by speech and non-speech sounds with similar spectral distributions.
Methods: We developed two types of stimuli: naturally spoken vowels (natural speech sounds) and complex synthesized sounds (synthesized sounds). The natural speech sounds consisted of the five Japanese vowels. Each synthesized sound consisted of a fundamental frequency and its second to fifteenth harmonics, matched to those of the corresponding natural speech sound, and was filtered so that its spectral distribution resembled that of the natural speech sound. All sounds were low-pass filtered at 2000 Hz. Auditory evoked potentials elicited by the natural speech sound /o/ and its synthesized counterpart were measured in 10 right-handed healthy adults with normal hearing.
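For context, a minimal sketch of how such a harmonic complex could be generated and low-pass filtered is shown below (Python with NumPy/SciPy). This is not the authors' stimulus-generation code; the sampling rate, fundamental frequency, duration, and per-harmonic gains are placeholder assumptions for illustration only.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

# Illustrative sketch: build a harmonic complex from a fundamental and
# its 2nd-15th harmonics, scale each component by an assumed spectral
# envelope, then low-pass filter at 2000 Hz as described in the Methods.

fs = 44100      # sampling rate (Hz); assumed, not stated in the abstract
f0 = 125.0      # fundamental frequency (Hz); placeholder value
dur = 0.5       # stimulus duration (s); placeholder value
t = np.arange(int(fs * dur)) / fs

# Hypothetical per-harmonic gains (harmonics 1-15), e.g. derived from the
# spectrum of the natural vowel; uniform here purely for illustration.
envelope_gain = np.ones(15)

# Sum the fundamental (k = 1) and harmonics 2-15.
signal = np.zeros_like(t)
for k in range(1, 16):
    signal += envelope_gain[k - 1] * np.sin(2 * np.pi * k * f0 * t)

# Low-pass filter at 2000 Hz (4th-order Butterworth, zero-phase).
sos = butter(4, 2000, btype="low", fs=fs, output="sos")
filtered = sosfiltfilt(sos, signal)

# Normalize amplitude before presentation to avoid clipping.
filtered /= np.abs(filtered).max()
```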
Results: The natural speech sounds were recognized as speech significantly more often than the synthesized sounds (74.4% vs. 13.8%, p < 0.01). The natural speech and synthesized sounds for the vowel /o/ contrasted strongly in speech perception (96.9% vs. 9.4%, p < 0.01), whereas the vowel /i/ and its counterpart were barely recognized as speech (4.7% vs. 3.1%, p = 1.00). The N1 peak amplitudes and latencies evoked by the natural speech sound /o/ did not differ from those evoked by the synthesized sound (p = 0.58 and p = 0.28, respectively), nor did the P2 amplitudes (p = 0.51). However, the P2 latencies evoked by the natural speech sound /o/ were significantly shorter than those evoked by the synthesized sound (p < 0.01). This modulation was not observed in a control condition using the vowel /i/ and its counterpart (p = 0.29).
Conclusion: The earlier P2 observed for the natural speech sound may reflect central auditory processing of the 'speechness' of complex sounds.
Keywords: Auditory evoked potential; P2; Spectrum; Speech; Vowel.
Copyright © 2020. Published by Elsevier B.V.