Decoding visual and auditory stimuli from brain activity, such as electroencephalography (EEG), offers promising advancements for enhancing machine-to-human interaction. However, effectively representing EEG signals remains a significant challenge. In this paper, we introduce a novel Delayed Knowledge Transfer (DKT) framework that employs spiking neurons for attention detection, evaluated on our experimental EEG dataset. This framework extracts patterns from audiovisual stimuli to model brain responses in EEG signals while accounting for inherent response delays. By aligning audiovisual features with EEG signals through a shared embedding space, our approach improves the performance of brain-computer interface (BCI) systems. We also present WithMeAttention, a multimodal dataset designed to facilitate research in continuously distinguishing between target and distractor responses. Our method achieves a 3% improvement in accuracy on the WithMeAttention dataset over a baseline model that decodes EEG signals from scratch, highlighting the effectiveness of our approach. Comprehensive analysis across four distinct conditions shows that rhythmic enhancement of visual information can optimize multisensory information processing. Notably, the two conditions featuring rhythmic target presentation (with and without accompanying beeps) achieved significantly better performance than the other scenarios. Furthermore, the delay distributions observed under the different conditions indicate that our delay layer effectively emulates neural processing delays in response to stimuli.
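The two mechanisms named in the abstract, a learnable delay layer and a shared embedding space aligning audiovisual and EEG representations, can be illustrated with a short sketch. The PyTorch code below is a minimal illustration under our own assumptions, not the authors' implementation; the names DelayLayer, align_loss, max_delay, and all dimensions are hypothetical, and the delay is realized by differentiable fractional shifting rather than whatever mechanism the paper uses.

import torch
import torch.nn as nn
import torch.nn.functional as F

class DelayLayer(nn.Module):
    # Applies a learnable per-channel time delay via differentiable
    # fractional shifting (linear interpolation between integer shifts),
    # so the delays themselves receive gradients. Illustrative only.
    def __init__(self, n_channels, max_delay=50):
        super().__init__()
        self.max_delay = max_delay
        self.delays = nn.Parameter(torch.rand(n_channels) * max_delay)

    def forward(self, x):
        # x: (batch, channels, time) spike trains or continuous features
        T = x.shape[-1]
        d = self.delays.clamp(0.0, self.max_delay - 1)
        out = torch.zeros_like(x)
        for c in range(x.shape[1]):
            s = int(d[c].floor())
            frac = d[c] - d[c].floor()          # gradient path to self.delays
            lo = F.pad(x[:, c], (s, 0))[:, :T]  # channel shifted by s samples
            hi = F.pad(x[:, c], (s + 1, 0))[:, :T]
            out[:, c] = (1 - frac) * lo + frac * hi
        return out

def align_loss(eeg_emb, av_emb):
    # Distillation-style alignment: pull EEG (student) embeddings toward
    # audiovisual (teacher) embeddings in the shared space.
    return 1.0 - F.cosine_similarity(eeg_emb, av_emb, dim=-1).mean()

# Toy usage with stand-in data and a linear projection in place of the
# full spiking network described in the paper.
torch.manual_seed(0)
x = (torch.rand(8, 64, 256) > 0.95).float()   # batch of sparse spike trains
delay = DelayLayer(n_channels=64)
eeg_proj = nn.Linear(256, 128)                # maps trials into shared space
eeg_emb = eeg_proj(delay(x).mean(dim=1))      # (8, 128) EEG-side embeddings
av_emb = torch.randn(8, 128)                  # stand-in audiovisual features
loss = align_loss(eeg_emb, av_emb)
loss.backward()  # gradients reach both the projection and the delays

In the paper's setting, the teacher embedding would come from an encoder of the audiovisual stimuli rather than random noise, and the learned per-channel delays are what the abstract's closing claim about delay distributions refers to.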
Keywords: Brain–computer interface; Delay learning; Electroencephalography (EEG); Knowledge distillation; Spiking neural network.
Copyright © 2024 Elsevier Ltd. All rights reserved.