A brain-to-text framework for decoding natural tonal sentences

Daohan Zhang; Zhenjie Wang; Youkun Qian; Zehao Zhao; Yan Liu; Xiaotao Hao; Wanxin Li; Shuo Lu; Honglin Zhu; Luyao Chen; Kunyu Xu; Yuanning Li; Junfeng Lu

doi:10.1016/j.celrep.2024.114924

Cell Rep. 2024 Oct 30;43(11):114924. doi: 10.1016/j.celrep.2024.114924. Online ahead of print.

Authors

Daohan Zhang¹, Zhenjie Wang², Youkun Qian¹, Zehao Zhao¹, Yan Liu¹, Xiaotao Hao¹, Wanxin Li¹, Shuo Lu³, Honglin Zhu⁴, Luyao Chen⁵, Kunyu Xu⁶, Yuanning Li⁷, Junfeng Lu⁸

Affiliations

¹ Department of Neurosurgery, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai 200040, China; Shanghai Key Laboratory of Brain Function Restoration and Neural Regeneration, Shanghai 200040, China; National Center for Neurological Disorders, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai 200040, China.
² School of Biomedical Engineering, ShanghaiTech University, Shanghai 201210, China; State Key Laboratory of Advanced Medical Materials and Devices, ShanghaiTech University, Shanghai 201210, China.
³ Department of Chinese Language and Literature, Sun Yat-sen University, Guangzhou 510080, China.
⁴ Faculty of Life Sciences and Medicine, King's College London, London SE1 1UL, UK.
⁵ School of International Chinese Language Education, Beijing Normal University, Beijing 100875, China.
⁶ Institute of Modern Languages and Linguistics, Fudan University, Shanghai 200433, China.
⁷ School of Biomedical Engineering, ShanghaiTech University, Shanghai 201210, China; State Key Laboratory of Advanced Medical Materials and Devices, ShanghaiTech University, Shanghai 201210, China; Shanghai Clinical Research and Trial Center, Shanghai 201210, China.
⁸ Department of Neurosurgery, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai 200040, China; Shanghai Key Laboratory of Brain Function Restoration and Neural Regeneration, Shanghai 200040, China; National Center for Neurological Disorders, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai 200040, China; Institute of Modern Languages and Linguistics, Fudan University, Shanghai 200433, China; MOE Frontiers Center for Brain Science, Huashan Hospital, Fudan University, Shanghai 200040, China.

PMID: 39485790

Abstract

Speech brain-computer interfaces (BCIs) directly translate brain activity into speech sound and text. Despite successful applications in non-tonal languages, the distinct syllabic structures and pivotal lexical information conveyed through tonal nuances present challenges in BCI decoding for tonal languages like Mandarin Chinese. Here, we designed a brain-to-text framework to decode Mandarin sentences from invasive neural recordings. Our framework dissects speech onset, base syllables, and lexical tones, integrating them with contextual information through Bayesian likelihood and a Viterbi decoder. The results demonstrate accurate tone and syllable decoding during naturalistic speech production. The overall word error rate (WER) for 10 offline-decoded tonal sentences with a vocabulary of 40 high-frequency Chinese characters is 21% (chance: 95.3%) averaged across five participants, and tone decoding accuracy reaches 93% (chance: 25%), surpassing previous intracranial Mandarin tonal syllable decoders. This study provides a robust and generalizable approach for brain-to-text decoding of continuous tonal speech sentences.
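
The abstract names the ingredients of the decoder (base-syllable and tone likelihoods, contextual information, a Viterbi search) and the evaluation metric (WER), but not how they are wired together. The following is a minimal sketch of one plausible arrangement, not the authors' implementation. It assumes that speech onset detection has already segmented the recording into one time step per character, that syllable and tone evidence are conditionally independent so their log-likelihoods can be summed, and that the contextual information is a bigram transition table over characters. All function and parameter names are hypothetical.

```python
import numpy as np


def viterbi_decode(syl_logp, tone_logp, char_syl, char_tone, trans_logp, init_logp):
    """Most probable character sequence under a bigram model (illustrative only).

    syl_logp   : (T, n_syllables) log-likelihoods from an assumed syllable classifier
    tone_logp  : (T, n_tones) log-likelihoods from an assumed tone classifier
    char_syl   : (V,) base-syllable index of each vocabulary character
    char_tone  : (V,) tone index of each vocabulary character
    trans_logp : (V, V) log P(current char | previous char), rows = previous
    init_logp  : (V,) log P(first char)
    """
    # Assumption: syllable and tone evidence are independent given the
    # character, so the per-character emission score is their sum.
    emit = syl_logp[:, char_syl] + tone_logp[:, char_tone]  # shape (T, V)
    T, V = emit.shape

    score = init_logp + emit[0]
    back = np.zeros((T, V), dtype=int)
    for t in range(1, T):
        cand = score[:, None] + trans_logp  # cand[i, j]: previous i, current j
        back[t] = np.argmax(cand, axis=0)   # best predecessor for each current j
        score = np.max(cand, axis=0) + emit[t]

    # Trace the best path backwards from the highest-scoring final state.
    path = [int(np.argmax(score))]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]


def word_error_rate(ref, hyp):
    """Token-level edit distance divided by the reference length."""
    d = list(range(len(hyp) + 1))  # distances for the empty reference prefix
    for i, r in enumerate(ref, 1):
        prev, d[0] = d[0], i       # prev holds the diagonal cell
        for j, h in enumerate(hyp, 1):
            cur = d[j]
            d[j] = min(d[j] + 1,          # deletion
                       d[j - 1] + 1,      # insertion
                       prev + (r != h))   # substitution or match
            prev = cur
    return d[-1] / len(ref)
```

In this arrangement, two characters that share a base syllable but differ in tone receive the same syllable score and are separated only by the tone term and the transition table, which is one way to see why tone decoding matters for the sentence-level error rate.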

Keywords: BCI; CP: Neuroscience; ECoG; Mandarin; artificial intelligence; brain-computer interface; deep neural networks; electrocorticography; natural speech; neural decoding; tonal language.