Dimensionality and Ramping: Signatures of Sentence Integration in the Dynamics of Brains and Deep Language Models

Théo Desbordes; Yair Lakretz; Valérie Chanoine; Maxime Oquab; Jean-Michel Badier; Agnès Trébuchon; Romain Carron; Christian-G Bénar; Stanislas Dehaene; Jean-Rémi King

doi:10.1523/JNEUROSCI.1163-22.2023

J Neurosci. 2023 Jul 19;43(29):5350-5364. doi: 10.1523/JNEUROSCI.1163-22.2023. Epub 2023 May 22.

Affiliations

¹ Meta AI Research, Paris 75002, France; and Cognitive Neuroimaging Unit NeuroSpin center, 91191, Gif-sur-Yvette, France.
² Cognitive Neuroimaging Unit NeuroSpin center, Gif-sur-Yvette, 91191, France.
³ Institute of Language, Communication and the Brain, Aix-en-Provence, 13100, France; and Aix-Marseille Université, Centre National de la Recherche Scientifique, LPL, Aix-en-Provence, 13100, France.
⁴ Meta AI Research, Paris 75002, France.
⁵ Aix Marseille Université, Institut National de la Santé et de la Recherche Médicale, CNRS, LPL, Aix-en-Provence 13100; and Inst Neurosci Syst, Marseille, 13005, France.
⁶ Aix Marseille Université, Institut National de la Santé et de la Recherche Médicale, CNRS, LPL, Aix-en-Provence 13100, France; and Inst Neurosci Syst, Marseille, 13005, France; and Assistance Publique Hopitaux de Marseille, Timone hospital, Epileptology and Cerebral Rythmology, Marseille, 13385, France.
⁷ Aix Marseille Université, Institut National de la Santé et de la Recherche Médicale, CNRS, LPL, Aix-en-Provence 13100, France; and Inst Neurosci Syst, Marseille, 13005, France; and Assistance Publique Hopitaux de Marseille, Timone hospital, Functional and Stereotactic Neurosurgery, Marseille, 13385, France.
⁸ Aix Marseille Université, Institut National de la Santé et de la Recherche Médicale, CNRS, LPL, Aix-en-Provence 13100, France; and Inst Neurosci Syst, Marseille, 13005, France.
⁹ Université Paris Saclay, Institut National de la Santé et de la Recherche Médicale, Commissariat à l'Energie Atomique, Cognitive Neuroimaging Unit, NeuroSpin center, Saclay, 91191, France; and Collège de France, PSL University, Paris, 75231, France.
¹⁰ LSP, École normale supérieure, PSL (Paris Sciences & Lettres) University, CNRS, 75005 Paris, France.

Abstract

A sentence is more than the sum of its words: its meaning depends on how they combine with one another. The brain mechanisms underlying such semantic composition remain poorly understood. To shed light on the neural vector code underlying semantic composition, we introduce two hypotheses: (1) the intrinsic dimensionality of the space of neural representations should increase as a sentence unfolds, paralleling the growing complexity of its semantic representation; and (2) this progressive integration should be reflected in ramping and sentence-final signals. To test these predictions, we designed a dataset of closely matched normal and jabberwocky sentences (composed of meaningless pseudo words) and displayed them to deep language models and to 11 human participants (5 men and 6 women) monitored with simultaneous MEG and intracranial EEG. In both deep language models and electrophysiological data, we found that representational dimensionality was higher for meaningful sentences than jabberwocky. Furthermore, multivariate decoding of normal versus jabberwocky confirmed three dynamic patterns: (1) a phasic pattern following each word, peaking in temporal and parietal areas; (2) a ramping pattern, characteristic of bilateral inferior and middle frontal gyri; and (3) a sentence-final pattern in left superior frontal gyrus and right orbitofrontal cortex. These results provide a first glimpse into the neural geometry of semantic integration and constrain the search for a neural code of linguistic composition.

Significance Statement

Starting from general linguistic concepts, we make two sets of predictions in neural signals evoked by reading multiword sentences. First, the intrinsic dimensionality of the representation should grow with additional meaningful words. Second, the neural dynamics should exhibit signatures of encoding, maintaining, and resolving semantic composition. We successfully validated these hypotheses in deep neural language models, artificial neural networks trained on text and performing very well on many natural language processing tasks. Then, using a unique combination of MEG and intracranial electrodes, we recorded high-resolution brain data from human participants while they read a controlled set of sentences. Time-resolved dimensionality analysis showed increasing dimensionality with meaning, and multivariate decoding allowed us to isolate the three dynamical patterns we had hypothesized.
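
To make the notion of representational dimensionality concrete, the sketch below computes one common proxy, the participation ratio (tr C)^2 / tr(C^2) of the covariance matrix C of a set of response vectors. This is an illustration only: the abstract does not say which dimensionality estimator the authors used, so the choice of the participation ratio is an assumption, and the function names and the toy data generator are hypothetical. Only the Python standard library is used, to keep the example self-contained.

```python
import random


def participation_ratio(samples):
    """Participation ratio (tr C)^2 / tr(C^2) of the sample covariance C.

    `samples` is a list of equal-length feature vectors (trials x features).
    The value ranges from 1 (variance along one axis) to the number of
    features (variance spread evenly over all axes).
    """
    n = len(samples)
    d = len(samples[0])
    means = [sum(row[j] for row in samples) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in samples]
    # Sample covariance matrix C (d x d).
    cov = [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    trace = sum(cov[i][i] for i in range(d))
    # C is symmetric, so tr(C^2) equals the sum of its squared entries.
    trace_sq = sum(c * c for row in cov for c in row)
    return trace * trace / trace_sq


def toy_responses(n_trials, n_features, n_latent, rng):
    """Hypothetical data: the first `n_latent` features carry unit-variance
    signal; the remaining features carry only low-amplitude noise."""
    return [
        [rng.gauss(0, 1.0) if j < n_latent else rng.gauss(0, 0.05)
         for j in range(n_features)]
        for _ in range(n_trials)
    ]


rng = random.Random(0)
# Stand-ins for a low-dimensional and a higher-dimensional condition.
low = participation_ratio(toy_responses(500, 10, 2, rng))
high = participation_ratio(toy_responses(500, 10, 6, rng))
print(f"participation ratio, 2 latent dimensions: {low:.2f}")
print(f"participation ratio, 6 latent dimensions: {high:.2f}")
```

In a time-resolved analysis, an estimator of this kind would be applied separately to the response vectors at each time point or word position, and the resulting curves compared between normal and jabberwocky sentences. With real recordings a numerical library would normally replace the nested loops.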

Keywords: dimensionality; integration; intracranial; language; ramping; semantic composition.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Brain Mapping / methods
Brain* / physiology
Female
Humans
Language*
Linguistics
Magnetic Resonance Imaging / methods
Male
Reading
Semantics