A brain-to-text framework for decoding natural tonal sentences

Cell Rep. 2024 Oct 30;43(11):114924. doi: 10.1016/j.celrep.2024.114924. Online ahead of print.

Abstract

Speech brain-computer interfaces (BCIs) directly translate brain activity into speech sound and text. Despite successful applications in non-tonal languages, the distinct syllabic structures and pivotal lexical information conveyed through tonal nuances present challenges in BCI decoding for tonal languages like Mandarin Chinese. Here, we designed a brain-to-text framework to decode Mandarin sentences from invasive neural recordings. Our framework dissects speech onset, base syllables, and lexical tones, integrating them with contextual information through Bayesian likelihood and a Viterbi decoder. The results demonstrate accurate tone and syllable decoding during naturalistic speech production. The overall word error rate (WER) for 10 offline-decoded tonal sentences with a vocabulary of 40 high-frequency Chinese characters is 21% (chance: 95.3%) averaged across five participants, and tone decoding accuracy reaches 93% (chance: 25%), surpassing previous intracranial Mandarin tonal syllable decoders. This study provides a robust and generalizable approach for brain-to-text decoding of continuous tonal speech sentences.
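The abstract does not give the details of how syllable, tone, and contextual information are combined, so the following is only a minimal sketch of one generic way such a combination could work, not the authors' implementation. It assumes that per-slot syllable and tone classifier outputs are treated as independent likelihoods, that context comes from a bigram transition matrix over characters, and that speech onset detection has already segmented the recording into slots. All names (`viterbi_decode`, `character_log_emissions`, the toy vocabulary and probabilities) are hypothetical.

```python
import numpy as np

def character_log_emissions(p_syllable, p_tone, char_syllable, char_tone):
    """Score each character per slot from syllable and tone probabilities.

    Assumes the two classifiers are independent, so their log-probabilities add.
    p_syllable: (T, S) syllable probabilities, p_tone: (T, K) tone probabilities.
    char_syllable, char_tone: (N,) index of each character's syllable and tone.
    """
    eps = 1e-12  # avoids log(0)
    return np.log(p_syllable[:, char_syllable] + eps) + np.log(p_tone[:, char_tone] + eps)

def viterbi_decode(log_emit, log_trans, log_prior):
    """Return the most probable character sequence as a list of indices.

    log_emit:  (T, N) log-likelihood of each character in each slot
    log_trans: (N, N) log P(next = j | previous = i), e.g. a bigram model
    log_prior: (N,)   log P(first character)
    """
    T, N = log_emit.shape
    score = log_prior + log_emit[0]
    backptr = np.zeros((T, N), dtype=int)
    for t in range(1, T):
        # cand[i, j]: best score of a path ending in i, followed by i -> j
        cand = score[:, None] + log_trans
        backptr[t] = np.argmax(cand, axis=0)
        score = cand[backptr[t], np.arange(N)] + log_emit[t]
    # Trace back from the best final character
    path = [int(np.argmax(score))]
    for t in range(T - 1, 0, -1):
        path.append(int(backptr[t, path[-1]]))
    return path[::-1]

# Toy example: 3 characters over 2 syllables and 2 tones (made-up numbers).
char_syllable = np.array([0, 0, 1])
char_tone = np.array([0, 1, 0])
p_syllable = np.array([[0.9, 0.1], [0.8, 0.2], [0.3, 0.7]])
p_tone = np.array([[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]])

log_emit = character_log_emissions(p_syllable, p_tone, char_syllable, char_tone)
uniform_trans = np.log(np.full((3, 3), 1.0 / 3.0))
uniform_prior = np.log(np.full(3, 1.0 / 3.0))
path = viterbi_decode(log_emit, uniform_trans, uniform_prior)
print(path)
```

With uniform transitions the decoder simply picks the best character in each slot. Replacing `uniform_trans` with a bigram matrix lets contextual information override a locally preferred but contextually unlikely character, which is the role the abstract assigns to the Viterbi decoder.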

Keywords: BCI; CP: Neuroscience; ECoG; Mandarin; artificial intelligence; brain-computer interface; deep neural networks; electrocorticography; natural speech; neural decoding; tonal language.