Acoustic scene complexity affects motion behavior during speech perception in audio-visual multi-talker virtual environments

Sci Rep. 2024 Aug 16;14(1):19028. doi: 10.1038/s41598-024-70026-0.

Abstract

In real-world listening situations, individuals typically utilize head and eye movements to receive and enhance sensory information while exploring acoustic scenes. However, the specific patterns of such movements have not yet been fully characterized. Here, we studied how movement behavior is influenced by scene complexity, varied in terms of reverberation and the number of concurrent talkers. Thirteen normal-hearing participants engaged in a speech comprehension and localization task, requiring them to indicate the spatial location of a spoken story in the presence of other stories in virtual audio-visual scenes. We observed delayed initial head movements when more simultaneous talkers were present in the scene. Both reverberation and a higher number of talkers extended the search period, increased the number of fixated source locations, and resulted in more gaze jumps. The period preceding the participants' responses was prolonged when more concurrent talkers were present, and listeners continued to move their eyes in the proximity of the target talker. In scenes with more reverberation, the final head position when making the decision was farther away from the target. These findings demonstrate that the complexity of the acoustic scene influences listener behavior during speech comprehension and localization in audio-visual scenes.

MeSH terms

  • Acoustic Stimulation / methods
  • Adult
  • Comprehension / physiology
  • Eye Movements* / physiology
  • Female
  • Head Movements / physiology
  • Humans
  • Male
  • Sound Localization / physiology
  • Speech Perception* / physiology
  • Virtual Reality
  • Visual Perception / physiology
  • Young Adult