Search | arXiv e-print repository

EmT: A Novel Transformer for Generalized Cross-subject EEG Emotion Recognition

Authors: Yi Ding, Chengxuan Tong, Shuailei Zhang, Muyun Jiang, Yong Li, Kevin Lim Jun Liang, Cuntai Guan

Abstract: Integrating prior knowledge of neurophysiology into neural network architecture enhances the performance of emotion decoding. While numerous techniques emphasize learning spatial and short-term temporal patterns, there has been limited emphasis on capturing the vital long-term contextual information associated with emotional cognitive processes. In order to address this discrepancy, we introduce a novel transformer model called emotion transformer (EmT). EmT is designed to excel in both generalized cross-subject EEG emotion classification and regression tasks. In EmT, EEG signals are transformed into a temporal graph format, creating a sequence of EEG feature graphs using a temporal graph construction module (TGC). A novel residual multi-view pyramid GCN module (RMPG) is then proposed to learn dynamic graph representations for each EEG feature graph within the series, and the learned representations of each graph are fused into one token. Furthermore, we design a temporal contextual transformer module (TCT) with two types of token mixers to learn the temporal contextual information. Finally, the task-specific output module (TSO) generates the desired outputs. Experiments on four publicly available datasets show that EmT achieves higher results than the baseline methods for both EEG emotion classification and regression tasks. The code is available at https://github.com/yi-ding-cs/EmT.

Submitted 26 June, 2024; originally announced June 2024.

Comments: 11 pages, 5 figures. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2405.00719 [pdf, other]

EEG-Deformer: A Dense Convolutional Transformer for Brain-computer Interfaces

Authors: Yi Ding, Yong Li, Hao Sun, Rui Liu, Chengxuan Tong, Cuntai Guan

Abstract: Effectively learning the temporal dynamics in electroencephalogram (EEG) signals is challenging yet essential for decoding brain activities using brain-computer interfaces (BCIs). Although Transformers are popular for their long-term sequential learning ability in the BCI field, most methods combining Transformers with convolutional neural networks (CNNs) fail to capture the coarse-to-fine temporal dynamics of EEG signals. To overcome this limitation, we introduce EEG-Deformer, which incorporates two main novel components into a CNN-Transformer: (1) a Hierarchical Coarse-to-Fine Transformer (HCT) block that integrates a Fine-grained Temporal Learning (FTL) branch into Transformers, effectively discerning coarse-to-fine temporal patterns; and (2) a Dense Information Purification (DIP) module, which utilizes multi-level, purified temporal information to enhance decoding accuracy. Comprehensive experiments on three representative cognitive tasks consistently verify the generalizability of our proposed EEG-Deformer, demonstrating that it either outperforms existing state-of-the-art methods or is comparable to them. Visualization results show that EEG-Deformer learns from neurophysiologically meaningful brain regions for the corresponding cognitive tasks. The source code can be found at https://github.com/yi-ding-cs/EEG-Deformer.

Submitted 25 April, 2024; originally announced May 2024.

Comments: 10 pages, 9 figures. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2308.11636 [pdf, other]

Aggregating Intrinsic Information to Enhance BCI Performance through Federated Learning

Authors: Rui Liu, Yuanyuan Chen, Anran Li, Yi Ding, Han Yu, Cuntai Guan

Abstract: Insufficient data is a long-standing challenge for Brain-Computer Interface (BCI) to build a high-performance deep learning model. Though numerous research groups and institutes collect a multitude of EEG datasets for the same BCI task, sharing EEG data from multiple sites is still challenging due to the heterogeneity of devices. The significance of this challenge cannot be overstated, given the critical role of data diversity in fostering model robustness. However, existing works rarely discuss this issue, predominantly centering their attention on model training within a single dataset, often in the context of inter-subject or inter-session settings. In this work, we propose a hierarchical personalized Federated Learning EEG decoding (FLEEG) framework to surmount this challenge. This innovative framework heralds a new learning paradigm for BCI, enabling datasets with disparate data formats to collaborate in the model training process. Each client is assigned a specific dataset and trains a hierarchical personalized model to manage diverse data formats and facilitate information exchange. Meanwhile, the server coordinates the training procedure to harness knowledge gleaned from all datasets, thus elevating overall performance. The framework has been evaluated in Motor Imagery (MI) classification with nine EEG datasets collected by different devices but implementing the same MI task. Results demonstrate that the proposed framework can boost classification performance by up to 16.7% by enabling knowledge sharing between multiple datasets, especially for smaller datasets. Visualization results also indicate that the proposed framework can empower the local models to put a stable focus on task-related areas, yielding better performance. To the best of our knowledge, this is the first end-to-end solution to address this important challenge.

Submitted 14 August, 2023; originally announced August 2023.

arXiv:2305.06978 [pdf, other]

doi 10.1007/978-3-031-16443-9_13

Meta-hallucinator: Towards Few-Shot Cross-Modality Cardiac Image Segmentation

Authors: Ziyuan Zhao, Fangcheng Zhou, Zeng Zeng, Cuntai Guan, S. Kevin Zhou

Abstract: Domain shift and label scarcity heavily limit deep learning applications to various medical image analysis tasks. Unsupervised domain adaptation (UDA) techniques have recently achieved promising cross-modality medical image segmentation by transferring knowledge from a label-rich source domain to an unlabeled target domain. However, it is also difficult to collect annotations from the source domain in many clinical applications, rendering most prior works suboptimal with the label-scarce source domain, particularly for few-shot scenarios, where only a few source labels are accessible. To achieve efficient few-shot cross-modality segmentation, we propose a novel transformation-consistent meta-hallucination framework, meta-hallucinator, with the goal of learning to diversify data distributions and generate useful examples for enhancing cross-modality performance. In our framework, hallucination and segmentation models are jointly trained with the gradient-based meta-learning strategy to synthesize examples that lead to good segmentation performance on the target domain. To further facilitate data hallucination and cross-domain knowledge transfer, we develop a self-ensembling model with a hallucination-consistent property. Our meta-hallucinator can seamlessly collaborate with the meta-segmenter for learning to hallucinate with mutual benefits from a combined view of meta-learning and self-ensembling learning. Extensive studies on the MM-WHS 2017 dataset for cross-modality cardiac segmentation demonstrate that our method performs favorably against various approaches by a large margin in the few-shot UDA scenario.

Submitted 11 May, 2023; originally announced May 2023.

Comments: Accepted by MICCAI 2022 (top 13% paper; early accept)

Journal ref: Medical Image Computing and Computer Assisted Intervention, MICCAI 2022. Lecture Notes in Computer Science, vol 13435. Springer, Cham

arXiv:2304.10755 [pdf, other]

Interpretable and Robust AI in EEG Systems: A Survey

Authors: Xinliang Zhou, Chenyu Liu, Zhongruo Wang, Liming Zhai, Ziyu Jia, Cuntai Guan, Yang Liu

Abstract: The close coupling of artificial intelligence (AI) and electroencephalography (EEG) has substantially advanced human-computer interaction (HCI) technologies in the AI era. Different from traditional EEG systems, the interpretability and robustness of AI-based EEG systems are becoming particularly crucial. The interpretability clarifies the inner working mechanisms of AI models and thus can gain the trust of users. The robustness reflects the AI's reliability against attacks and perturbations, which is essential for sensitive and fragile EEG signals. Thus the interpretability and robustness of AI in EEG systems have attracted increasing attention, and their research has achieved great progress recently. However, there is still no survey covering recent advances in this field. In this paper, we present the first comprehensive survey and summarize the interpretable and robust AI techniques for EEG systems. Specifically, we first propose a taxonomy of interpretability by characterizing it into three types: backpropagation, perturbation, and inherently interpretable methods. Then we classify the robustness mechanisms into four classes: noise and artifacts, human variability, data acquisition instability, and adversarial attacks. Finally, we identify several critical and unresolved challenges for interpretable and robust AI in EEG systems and further discuss their future directions.

Submitted 25 August, 2024; v1 submitted 21 April, 2023; originally announced April 2023.

arXiv:2303.15826 [pdf, other]

MS-MT: Multi-Scale Mean Teacher with Contrastive Unpaired Translation for Cross-Modality Vestibular Schwannoma and Cochlea Segmentation

Authors: Ziyuan Zhao, Kaixin Xu, Huai Zhe Yeo, Xulei Yang, Cuntai Guan

Abstract: Domain shift has been a long-standing issue for medical image segmentation. Recently, unsupervised domain adaptation (UDA) methods have achieved promising cross-modality segmentation performance by distilling knowledge from a label-rich source domain to a target domain without labels. In this work, we propose a multi-scale self-ensembling based UDA framework for automatic segmentation of two key brain structures, i.e., Vestibular Schwannoma (VS) and Cochlea, on high-resolution T2 images. First, a segmentation-enhanced contrastive unpaired image translation module is designed for image-level domain adaptation from source T1 to target T2. Next, multi-scale deep supervision and consistency regularization are introduced to a mean teacher network for self-ensemble learning to further close the domain gap. Furthermore, self-training and intensity augmentation techniques are utilized to mitigate label scarcity and boost cross-modality segmentation performance. Our method demonstrates promising segmentation performance with a mean Dice score of 83.8% and 81.4% and an average asymmetric surface distance (ASSD) of 0.55 mm and 0.26 mm for the VS and Cochlea, respectively, in the validation phase of the crossMoDA 2022 challenge.

Submitted 28 March, 2023; originally announced March 2023.

Comments: Accepted by BrainLes MICCAI proceedings (5th solution for MICCAI 2022 Cross-Modality Domain Adaptation (crossMoDA) Challenge)

arXiv:2303.10335 [pdf, other]

Multimodal Continuous Emotion Recognition: A Technical Report for ABAW5

Authors: Su Zhang, Ziyuan Zhao, Cuntai Guan

Abstract: We used two multimodal models for continuous valence-arousal recognition using visual, audio, and linguistic information. The first model is the same as we used in ABAW2 and ABAW3, which employs the leader-follower attention. The second model has the same architecture for spatial and temporal encoding. As for the fusion block, it employs a compact and straightforward channel attention, borrowed from the End2You toolkit. Unlike our previous attempts that use Vggish feature directly as the audio feature, this time we feed the pre-trained VGG model using logmel-spectrogram and finetune it during the training. To make full use of the data and alleviate over-fitting, cross-validation is carried out. The code is available at https://github.com/sucv/ABAW3.

Submitted 13 April, 2023; v1 submitted 18 March, 2023; originally announced March 2023.

Comments: 6 pages. 1 figure. arXiv admin note: substantial text overlap with arXiv:2203.13031

arXiv:2302.11410 [pdf, other]

doi 10.1109/EMBC40787.2023.10340899

Score-Based Data Generation for EEG Spatial Covariance Matrices: Towards Boosting BCI Performance

Authors: Ce Ju, Reinmar Josef Kobler, Cuntai Guan

Abstract: The efficacy of Electroencephalogram (EEG) classifiers can be augmented by increasing the quantity of available data. In the case of geometric deep learning classifiers, the input consists of spatial covariance matrices derived from EEGs. In order to synthesize these spatial covariance matrices and facilitate future improvements of geometric deep learning classifiers, we propose a generative modeling technique based on state-of-the-art score-based models. The quality of generated samples is evaluated through visual and quantitative assessments using a left/right-hand-movement motor imagery dataset. The exceptional pixel-level resolution of these generative samples highlights the formidable capacity of score-based generative modeling. Additionally, the center (Frechet mean) of the generated samples aligns with neurophysiological evidence that event-related desynchronization and synchronization occur on electrodes C3 and C4 within the Mu and Beta frequency bands during motor imagery processing. The quantitative evaluation revealed that 84.3% of the generated samples could be accurately predicted by a pre-trained classifier and an improvement of up to 8.7% in the average accuracy over ten runs for a specific test subject in a holdout experiment.

Submitted 15 December, 2023; v1 submitted 22 February, 2023; originally announced February 2023.

Comments: 7 pages, 4 figures; This work has been accepted by the 2023 45th Annual International Conference of the IEEE Engineering in Medicine & Biology Conference (IEEE EMBC 2023'). Copyright will be transferred without notice, after which this version may no longer be accessible

ACM Class: I.2.0

arXiv:2212.02078 [pdf, other]

doi 10.1109/TMI.2022.3214766

LE-UDA: Label-efficient unsupervised domain adaptation for medical image segmentation

Authors: Ziyuan Zhao, Fangcheng Zhou, Kaixin Xu, Zeng Zeng, Cuntai Guan, S. Kevin Zhou

Abstract: While deep learning methods hitherto have achieved considerable success in medical image segmentation, they are still hampered by two limitations: (i) reliance on large-scale well-labeled datasets, which are difficult to curate due to the expert-driven and time-consuming nature of pixel-level annotations in clinical practices, and (ii) failure to generalize from one domain to another, especially when the target domain is a different modality with severe domain shifts. Recent unsupervised domain adaptation (UDA) techniques leverage abundant labeled source data together with unlabeled target data to reduce the domain gap, but these methods degrade significantly with limited source annotations. In this study, we address this underexplored UDA problem, investigating a challenging but valuable realistic scenario, where the source domain not only exhibits domain shift w.r.t. the target domain but also suffers from label scarcity. In this regard, we propose a novel and generic framework called "Label-Efficient Unsupervised Domain Adaptation" (LE-UDA). In LE-UDA, we construct self-ensembling consistency for knowledge transfer between both domains, as well as a self-ensembling adversarial learning module to achieve better feature alignment for UDA. To assess the effectiveness of our method, we conduct extensive experiments on two different tasks for cross-modality segmentation between MRI and CT images. Experimental results demonstrate that the proposed LE-UDA can efficiently leverage limited source labels to improve cross-domain segmentation performance, outperforming state-of-the-art UDA approaches in the literature. Code is available at: https://github.com/jacobzhaoziyuan/LE-UDA.

Submitted 5 December, 2022; originally announced December 2022.

Comments: Accepted by IEEE Transactions on Medical Imaging, 2022

arXiv:2211.11557 [pdf]

Decomposing 3D Neuroimaging into 2+1D Processing for Schizophrenia Recognition

Authors: Mengjiao Hu, Xudong Jiang, Kang Sim, Juan Helen Zhou, Cuntai Guan

Abstract: Deep learning has been successfully applied to recognizing both natural images and medical images. However, there remains a gap in recognizing 3D neuroimaging data, especially for psychiatric diseases such as schizophrenia and depression that have no visible alteration in specific slices. In this study, we propose to process the 3D data by a 2+1D framework so that we can exploit the powerful deep 2D Convolutional Neural Networks (CNNs) pre-trained on the huge ImageNet dataset for 3D neuroimaging recognition. Specifically, 3D volumes of Magnetic Resonance Imaging (MRI) metrics (grey matter, white matter, and cerebrospinal fluid) are decomposed into 2D slices according to neighboring voxel positions and fed to 2D CNN models pre-trained on ImageNet to extract feature maps from three views (axial, coronal, and sagittal). Global pooling is applied to remove redundant information, as the activation patterns are sparsely distributed over feature maps. Channel-wise and slice-wise convolutions are proposed to aggregate the contextual information in the third view dimension unprocessed by the 2D CNN model. Multi-metric and multi-view information are fused for final prediction. Our approach outperforms handcrafted feature-based machine learning, a deep feature approach with a support vector machine (SVM) classifier, and 3D CNN models trained from scratch, with better cross-validation results on the publicly available Northwestern University Schizophrenia Dataset, and the results are replicated on another independent dataset.

Submitted 21 November, 2022; v1 submitted 21 November, 2022; originally announced November 2022.

arXiv:2211.02641 [pdf, ps, other]

doi 10.1109/TNNLS.2023.3307470

Graph Neural Networks on SPD Manifolds for Motor Imagery Classification: A Perspective from the Time-Frequency Analysis

Authors: Ce Ju, Cuntai Guan

Abstract: The motor imagery (MI) classification has been a prominent research topic in brain-computer interfaces based on electroencephalography (EEG). Over the past few decades, the performance of MI-EEG classifiers has seen gradual enhancement. In this study, we amplify the geometric deep learning-based MI-EEG classifiers from the perspective of time-frequency analysis, introducing a new architecture called Graph-CSPNet. We refer to this category of classifiers as Geometric Classifiers, highlighting their foundation in differential geometry stemming from EEG spatial covariance matrices. Graph-CSPNet utilizes novel manifold-valued graph convolutional techniques to capture the EEG features in the time-frequency domain, offering heightened flexibility in signal segmentation for capturing localized fluctuations. To evaluate the effectiveness of Graph-CSPNet, we employ five commonly-used publicly available MI-EEG datasets, achieving near-optimal classification accuracies in nine out of eleven scenarios. The Python repository can be found at https://github.com/GeometricBCI/Tensor-CSPNet-and-Graph-CSPNet.

Submitted 20 August, 2023; v1 submitted 25 October, 2022; originally announced November 2022.

Comments: 15 pages, 5 figures, 6 Tables; This work has been accepted by the IEEE Transactions on Neural Networks and Learning Systems, 2023. Copyright will be transferred without notice, after which this version may no longer be accessible

ACM Class: I.2.0

arXiv:2207.12238 [pdf, other]

doi 10.1109/TBME.2022.3232102

OCTAve: 2D en face Optical Coherence Tomography Angiography Vessel Segmentation in Weakly-Supervised Learning with Locality Augmentation

Authors: Amrest Chinkamol, Vetit Kanjaras, Phattarapong Sawangjai, Yitian Zhao, Thapanun Sudhawiyangkul, Chantana Chantrapornchai, Cuntai Guan, Theerawit Wilaiprasitporn

Abstract: While there has been increasing research using deep learning techniques for the extraction of vascular structure from 2D en face OCTA, it is known that, for such approaches, the data annotation process on curvilinear structures like the retinal vasculature is very costly and time consuming, and few have tried to address the annotation problem. In this work, we propose the application of the scribble-based weakly-supervised learning method to automate the pixel-level annotation. The proposed method, called OCTAve, combines weakly-supervised learning using scribble-annotated ground truth with adversarial and novel self-supervised deep supervision. Our novel mechanism is designed to utilize the discriminative outputs from the discrimination layer of a UNet-like architecture, where the Kullback-Leibler divergence between the aggregate discriminative outputs and the segmentation map prediction is minimized during training. This combined method leads to better localization of the vascular structure, as shown in our experiments. We validate our proposed method on large public datasets, i.e., ROSE and OCTA-500. The segmentation performance is compared against both state-of-the-art fully-supervised and scribble-based weakly-supervised approaches. The implementation of our work used in the experiments is located at [LINK].

Submitted 25 July, 2022; originally announced July 2022.

arXiv:2207.01900 [pdf, other]

doi 10.1109/ICIP46576.2022.9897494

ACT-Net: Asymmetric Co-Teacher Network for Semi-supervised Memory-efficient Medical Image Segmentation

Authors: Ziyuan Zhao, Andong Zhu, Zeng Zeng, Bharadwaj Veeravalli, Cuntai Guan

Abstract: While deep models have shown promising performance in medical image segmentation, they heavily rely on a large amount of well-annotated data, which is difficult to access, especially in clinical practice. On the other hand, high-accuracy deep models usually come in large model sizes, limiting their employment in real scenarios. In this work, we propose a novel asymmetric co-teacher framework, ACT-Net, to alleviate the burden on both expensive annotations and computational costs for semi-supervised knowledge distillation. We advance teacher-student learning with a co-teacher network to facilitate asymmetric knowledge distillation from large models to small ones by alternating student and teacher roles, obtaining tiny but accurate models for clinical employment. To verify the effectiveness of our ACT-Net, we employ the ACDC dataset for cardiac substructure segmentation in our experiments. Extensive experimental results demonstrate that ACT-Net outperforms other knowledge distillation methods and achieves lossless segmentation performance with 250x fewer parameters.

Submitted 5 July, 2022; originally announced July 2022.

Journal ref: 2022 IEEE International Conference on Image Processing (ICIP)

arXiv:2207.01883 [pdf, other]

doi 10.1109/ICIP46576.2022.9897591

MMGL: Multi-Scale Multi-View Global-Local Contrastive learning for Semi-supervised Cardiac Image Segmentation

Authors: Ziyuan Zhao, Jinxuan Hu, Zeng Zeng, Xulei Yang, Peisheng Qian, Bharadwaj Veeravalli, Cuntai Guan

Abstract: With large-scale well-labeled datasets, deep learning has shown significant success in medical image segmentation. However, it is challenging to acquire abundant annotations in clinical practice due to extensive expertise requirements and costly labeling efforts. Recently, contrastive learning has shown a strong capacity for visual representation learning on unlabeled data, achieving impressive performance rivaling supervised learning in many domains. In this work, we propose a novel multi-scale multi-view global-local contrastive learning (MMGL) framework to thoroughly explore global and local features from different scales and views for robust contrastive learning performance, thereby improving segmentation performance with limited annotations. Extensive experiments on the MM-WHS dataset demonstrate the effectiveness of the MMGL framework on semi-supervised cardiac image segmentation, outperforming the state-of-the-art contrastive learning methods by a large margin.

Submitted 5 July, 2022; originally announced July 2022.

Comments: Accepted by IEEE International Conference on Image Processing (ICIP 2022)

Journal ref: 2022 IEEE International Conference on Image Processing (ICIP)

arXiv:2205.07021 [pdf, other]

doi 10.1109/EMBC48229.2022.9871734

Self-supervised Assisted Active Learning for Skin Lesion Segmentation

Authors: Ziyuan Zhao, Wenjing Lu, Zeng Zeng, Kaixin Xu, Bharadwaj Veeravalli, Cuntai Guan

Abstract: Label scarcity has been a long-standing issue for biomedical image segmentation, due to high annotation costs and professional requirements. Recently, active learning (AL) strategies strive to reduce annotation costs by querying a small portion of data for annotation, receiving much traction in the field of medical imaging. However, most of the existing AL methods have to initialize models with some randomly selected samples followed by active selection based on various criteria, such as uncertainty and diversity. Such random-start initialization methods inevitably introduce under-value redundant samples and unnecessary annotation costs. For the purpose of addressing the issue, we propose a novel self-supervised assisted active learning framework in the cold-start setting, in which the segmentation model is first warmed up with self-supervised learning (SSL), and then SSL features are used for sample selection via latent feature clustering without accessing labels. We assess our proposed methodology on a skin lesion segmentation task. Extensive experiments demonstrate that our approach is capable of achieving promising performance with substantial improvements over existing baselines.

Submitted 14 May, 2022; originally announced May 2022.

Comments: Accepted by the 44th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC 2022)

Journal ref: 2022 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC)

arXiv:2203.12454 [pdf, other]

doi 10.1007/978-3-030-87193-2_28

MT-UDA: Towards Unsupervised Cross-modality Medical Image Segmentation with Limited Source Labels

Authors: Ziyuan Zhao, Kaixin Xu, Shumeng Li, Zeng Zeng, Cuntai Guan

Abstract: The success of deep convolutional neural networks (DCNNs) benefits from high volumes of annotated data. However, annotating medical images is laborious, expensive, and requires human expertise, which induces the label scarcity problem. The problem becomes more serious when domain shift is encountered. Although deep unsupervised domain adaptation (UDA) can leverage well-established source domain annotations and abundant target domain data to facilitate cross-modality image segmentation and also mitigate the label paucity problem on the target domain, the conventional UDA methods suffer from severe performance degradation when source domain annotations are scarce. In this paper, we explore a challenging UDA setting - limited source domain annotations. We aim to investigate how to efficiently leverage unlabeled data from the source and target domains with limited source annotations for cross-modality image segmentation. To achieve this, we propose a new label-efficient UDA framework, termed MT-UDA, in which the student model trained with limited source labels learns from unlabeled data of both domains by two teacher models respectively in a semi-supervised manner. More specifically, the student model not only distills the intra-domain semantic knowledge by encouraging prediction consistency but also exploits the inter-domain anatomical information by enforcing structural consistency. Consequently, the student model can effectively integrate the underlying knowledge beneath available data resources to mitigate the impact of source label scarcity and yield improved cross-modality segmentation performance. We evaluate our method on the MM-WHS 2017 dataset and demonstrate that our approach outperforms the state-of-the-art methods by a large margin under the source-label scarcity scenario.

Submitted 23 March, 2022; originally announced March 2022.

Comments: Accepted by MICCAI 2021, code at: https://github.com/jacobzhaoziyuan/MT-UDA

Journal ref: Medical Image Computing and Computer Assisted Intervention, MICCAI 2021. Lecture Notes in Computer Science, vol 12901. Springer, Cham

arXiv:2202.02472 [pdf, ps, other]

doi 10.1109/TNNLS.2022.3172108

Tensor-CSPNet: A Novel Geometric Deep Learning Framework for Motor Imagery Classification

Authors: Ce Ju, Cuntai Guan

Abstract: Deep learning (DL) has been widely investigated in a vast majority of applications in electroencephalography (EEG)-based brain-computer interfaces (BCIs), especially for motor imagery (MI) classification in the past five years. The mainstream DL methodology for the MI-EEG classification exploits the temporospatial patterns of EEG signals using convolutional neural networks (CNNs), which have remarkably succeeded in visual images. However, since the statistical characteristics of visual images depart radically from EEG signals, a natural question arises whether an alternative network architecture exists apart from CNNs. To address this question, we propose a novel geometric deep learning (GDL) framework called Tensor-CSPNet, which characterizes spatial covariance matrices derived from EEG signals on symmetric positive definite (SPD) manifolds and fully captures the temporospatiofrequency patterns using existing deep neural networks on SPD manifolds, integrating with experiences from many successful MI-EEG classifiers to optimize the framework. In the experiments, Tensor-CSPNet attains or slightly outperforms the current state-of-the-art performance on the cross-validation and holdout scenarios in two commonly-used MI-EEG datasets. Moreover, the visualization and interpretability analyses also exhibit the validity of Tensor-CSPNet for the MI-EEG classification. To conclude, in this study, we provide a feasible answer to the question by generalizing the DL methodologies on SPD manifolds, which indicates the start of a specific GDL methodology for the MI-EEG classification.

Submitted 23 September, 2022; v1 submitted 4 February, 2022; originally announced February 2022.

Comments: 15 pages, 10 figures, 12 tables; This work has been accepted by the IEEE Transactions on Neural Networks and Learning Systems. Copyright will be transferred without notice, after which this version may no longer be accessible

ACM Class: I.2.0

arXiv:2201.05745 [pdf, other]

Deep Optimal Transport for Domain Adaptation on SPD Manifolds

Authors: Ce Ju, Cuntai Guan

Abstract: The machine learning community has shown increasing interest in addressing the domain adaptation problem on symmetric positive definite (SPD) manifolds. This interest is primarily driven by the complexities of neuroimaging data generated from brain signals, which often exhibit shifts in data distribution across recording sessions. These neuroimaging data, represented by signal covariance matrices, possess the mathematical properties of symmetry and positive definiteness. However, applying conventional domain adaptation methods is challenging because these mathematical properties can be disrupted when operating on covariance matrices. In this study, we introduce a novel geometric deep learning-based approach utilizing optimal transport on SPD manifolds to manage discrepancies in both marginal and conditional distributions between the source and target domains. We evaluate the effectiveness of this approach in three cross-session brain-computer interface scenarios and provide visualized results for further insights. The GitHub repository of this study can be accessed at https://github.com/GeometricBCI/Deep-Optimal-Transport-for-Domain-Adaptation-on-SPD-Manifolds.

Submitted 3 June, 2024; v1 submitted 14 January, 2022; originally announced January 2022.

Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

ACM Class: I.2.0

arXiv:2105.10369 [pdf, other]

doi 10.1109/EMBC46164.2021.9629941

Hierarchical Consistency Regularized Mean Teacher for Semi-supervised 3D Left Atrium Segmentation

Authors: Shumeng Li, Ziyuan Zhao, Kaixin Xu, Zeng Zeng, Cuntai Guan

Abstract: Deep learning has achieved promising segmentation performance on 3D left atrium MR images. However, annotations for segmentation tasks are costly and difficult to obtain. In this paper, we introduce a novel hierarchical consistency regularized mean teacher framework for 3D left atrium segmentation. In each iteration, the student model is optimized concurrently by multi-scale deep supervision and hierarchical consistency regularization. Extensive experiments have shown that our method achieves competitive performance as compared with full annotation, outperforming other state-of-the-art semi-supervised segmentation methods.

Submitted 15 August, 2021; v1 submitted 21 May, 2021; originally announced May 2021.

Comments: Accepted in 43rd Annual International Conference of the IEEE Engineering in Medicine and Biology Society, IEEE EMBC 2021

Journal ref: 2021 43rd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC)

arXiv:2105.02786 [pdf, other]

LGGNet: Learning from Local-Global-Graph Representations for Brain-Computer Interface

Authors: Yi Ding, Neethu Robinson, Chengxuan Tong, Qiuhao Zeng, Cuntai Guan

Abstract: Neuropsychological studies suggest that co-operative activities among different brain functional areas drive high-level cognitive processes. To learn the brain activities within and among different functional areas of the brain, we propose LGGNet, a novel neurologically inspired graph neural network, to learn local-global-graph representations of electroencephalography (EEG) for Brain-Computer Interface (BCI). The input layer of LGGNet comprises a series of temporal convolutions with multi-scale 1D convolutional kernels and kernel-level attentive fusion. It captures temporal dynamics of EEG which then serves as input to the proposed local and global graph-filtering layers. Using a defined neurophysiologically meaningful set of local and global graphs, LGGNet models the complex relations within and among functional areas of the brain. Under the robust nested cross-validation settings, the proposed method is evaluated on three publicly available datasets for four types of cognitive classification tasks, namely, the attention, fatigue, emotion, and preference classification tasks. LGGNet is compared with state-of-the-art methods, such as DeepConvNet, EEGNet, R2G-STNN, TSception, RGNN, AMCNN-DGCN, HRNN and GraphNet. The results show that LGGNet outperforms these methods, and the improvements are statistically significant (p<0.05) in most cases. The results show that bringing neuroscience prior knowledge into neural network design yields an improvement of classification performance. The source code can be found at https://github.com/yi-ding-cs/LGG

Submitted 5 December, 2022; v1 submitted 5 May, 2021; originally announced May 2021.

Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2104.01233 [pdf, other]

FBCNet: A Multi-view Convolutional Neural Network for Brain-Computer Interface

Authors: Ravikiran Mane, Effie Chew, Karen Chua, Kai Keng Ang, Neethu Robinson, A. P. Vinod, Seong-Whan Lee, Cuntai Guan

Abstract: Lack of adequate training samples and noisy high-dimensional features are key challenges faced by Motor Imagery (MI) decoding algorithms for electroencephalogram (EEG) based Brain-Computer Interface (BCI). To address these challenges, inspired by neuro-physiological signatures of MI, this paper proposes a novel Filter-Bank Convolutional Network (FBCNet) for MI classification. FBCNet employs a multi-view data representation followed by spatial filtering to extract spectro-spatially discriminative features. This multistage approach enables efficient training of the network even when limited training data is available. More significantly, in FBCNet, we propose a novel Variance layer that effectively aggregates the EEG time-domain information. With this design, we compare FBCNet with state-of-the-art (SOTA) BCI algorithms on four MI datasets: the BCI competition IV dataset 2a (BCIC-IV-2a), the OpenBMI dataset, and two large datasets from chronic stroke patients. The results show that, by achieving 76.20% 4-class classification accuracy, FBCNet sets a new SOTA for the BCIC-IV-2a dataset. On the other three datasets, FBCNet yields up to 8% higher binary classification accuracies. Additionally, using explainable AI techniques we present one of the first reports about the differences in discriminative EEG features between healthy subjects and stroke patients. Also, the FBCNet source code is available at https://github.com/ravikiran-mane/FBCNet.

Submitted 17 March, 2021; originally announced April 2021.

Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2102.03814 [pdf, other]

doi 10.1109/TBME.2021.3137184

MIN2Net: End-to-End Multi-Task Learning for Subject-Independent Motor Imagery EEG Classification

Authors: Phairot Autthasan, Rattanaphon Chaisaen, Thapanun Sudhawiyangkul, Phurin Rangpong, Suktipol Kiatthaveephong, Nat Dilokthanakul, Gun Bhakdisongkhram, Huy Phan, Cuntai Guan, Theerawit Wilaiprasitporn

Abstract: Advances in the motor imagery (MI)-based brain-computer interfaces (BCIs) allow control of several applications by decoding neurophysiological phenomena, which are usually recorded by electroencephalography (EEG) using a non-invasive technique. Despite great advances in MI-based BCI, EEG rhythms are specific to a subject and vary over time. These issues pose significant challenges to enhancing the classification performance, especially in a subject-independent manner. To overcome these challenges, we propose MIN2Net, a novel end-to-end multi-task learning approach to tackle this task. We integrate deep metric learning into a multi-task autoencoder to learn a compact and discriminative latent representation from EEG and perform classification simultaneously. This approach reduces the complexity in pre-processing and results in significant performance improvement on EEG classification. Experimental results in a subject-independent manner show that MIN2Net outperforms the state-of-the-art techniques, achieving an F1-score improvement of 6.72% and 2.23% on the SMR-BCI and OpenBMI datasets, respectively. We demonstrate that MIN2Net improves discriminative information in the latent representation. This study indicates the possibility and practicality of using this model to develop MI-based BCI applications for new users without the need for calibration.

Submitted 7 January, 2022; v1 submitted 7 February, 2021; originally announced February 2021.

Journal ref: IEEE Transactions on Biomedical Engineering 2021

arXiv:2101.09057 [pdf, other]

doi 10.1109/JBHI.2021.3052320

DSAL: Deeply Supervised Active Learning from Strong and Weak Labelers for Biomedical Image Segmentation

Authors: Ziyuan Zhao, Zeng Zeng, Kaixin Xu, Cen Chen, Cuntai Guan

Abstract: Image segmentation is one of the most essential biomedical image processing problems for different imaging modalities, including microscopy and X-ray in the Internet-of-Medical-Things (IoMT) domain. However, annotating biomedical images is knowledge-driven, time-consuming, and labor-intensive, making it difficult to obtain abundant labels with limited costs. Active learning strategies ease the burden of human annotation by querying only a subset of training data for annotation. Despite receiving attention, most active learning methods still require huge computational costs and utilize unlabeled data inefficiently. They also tend to ignore the intermediate knowledge within networks. In this work, we propose a deep active semi-supervised learning framework, DSAL, combining active learning and semi-supervised learning strategies. In DSAL, a new criterion based on deep supervision mechanism is proposed to select informative samples with high uncertainties and low uncertainties for strong labelers and weak labelers respectively. The internal criterion leverages the disagreement of intermediate features within the deep learning network for active sample selection, which subsequently reduces the computational costs. We use the proposed criteria to select samples for strong and weak labelers to produce oracle labels and pseudo labels simultaneously at each active learning iteration in an ensemble learning manner, which can be examined with IoMT Platform. Extensive experiments on multiple medical image datasets demonstrate the superiority of the proposed method over state-of-the-art active learning methods.

Submitted 22 January, 2021; originally announced January 2021.

Comments: Published as a journal paper at IEEE J-BHI

arXiv:2004.12321 [pdf, other]

doi 10.1109/EMBC44109.2020.9175344

Federated Transfer Learning for EEG Signal Classification

Authors: Ce Ju, Dashan Gao, Ravikiran Mane, Ben Tan, Yang Liu, Cuntai Guan

Abstract: The success of deep learning (DL) methods in the Brain-Computer Interfaces (BCI) field for classification of electroencephalographic (EEG) recordings has been restricted by the lack of large datasets. Privacy concerns associated with EEG signals limit the possibility of constructing a large EEG-BCI dataset by the conglomeration of multiple small ones for jointly training machine learning models. Hence, in this paper, we propose a novel privacy-preserving DL architecture named federated transfer learning (FTL) for EEG classification that is based on the federated learning framework. Working with the single-trial covariance matrix, the proposed architecture extracts common discriminative information from multi-subject EEG data with the help of domain adaptation techniques. We evaluate the performance of the proposed architecture on the PhysioNet dataset for 2-class motor imagery classification. While avoiding the actual data sharing, our FTL approach achieves 2% higher classification accuracy in a subject-adaptive analysis. Also, in the absence of multi-subject data, our architecture provides 6% better accuracy compared to other state-of-the-art DL architectures.

Submitted 25 January, 2021; v1 submitted 26 April, 2020; originally announced April 2020.

Comments: 6 pages, 2 figures, Accepted for IEEE Engineering in Medicine and Biology Society (EMBC) 2020 GitHub: https://github.com/DashanGao/Federated-Transfer-Leraning-for-EEG

ACM Class: I.5.4

Journal ref: 2020 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), Montreal, QC, Canada, 2020, pp. 3040-3045

arXiv:2004.02965 [pdf, other]

TSception: A Deep Learning Framework for Emotion Detection Using EEG

Authors: Yi Ding, Neethu Robinson, Qiuhao Zeng, Duo Chen, Aung Aung Phyo Wai, Tih-Shih Lee, Cuntai Guan

Abstract: In this paper, we propose a deep learning framework, TSception, for emotion detection from electroencephalogram (EEG). TSception consists of temporal and spatial convolutional layers, which learn discriminative representations in the time and channel domains simultaneously. The temporal learner consists of multi-scale 1D convolutional kernels whose lengths are related to the sampling rate of the EEG signal, which learns multiple temporal and frequency representations. The spatial learner takes advantage of the asymmetry property of emotion responses at the frontal brain area to learn the discriminative representations from the left and right hemispheres of the brain. In our study, a system is designed to study the emotional arousal in an immersive virtual reality (VR) environment. EEG data were collected from 18 healthy subjects using this system to evaluate the performance of the proposed deep learning network for the classification of low and high emotional arousal states. The proposed method is compared with SVM, EEGNet, and LSTM. TSception achieves a high classification accuracy of 86.03%, which outperforms the prior methods significantly (p<0.05). The code is available at https://github.com/deepBrains/TSception

Submitted 7 April, 2020; v1 submitted 1 April, 2020; originally announced April 2020.

Comments: Authors information updated only. Accepted to be published in: 2020 International Joint Conference on Neural Networks (IJCNN), Glasgow, July 19--24, 2020, part of 2020 IEEE World Congress on Computational Intelligence (IEEE WCCI 2020)

arXiv:2003.08818 [pdf]

Brain MRI-based 3D Convolutional Neural Networks for Classification of Schizophrenia and Controls

Authors: Mengjiao Hu, Kang Sim, Juan Helen Zhou, Xudong Jiang, Cuntai Guan

Abstract: Convolutional Neural Networks (CNNs) have been successfully applied to the classification of both natural images and medical images, but have not yet been applied to differentiating patients with schizophrenia from healthy controls. Given the subtle, mixed, and sparsely distributed brain atrophy patterns of schizophrenia, the capability of automatic feature learning makes CNN a powerful tool for classifying schizophrenia from controls as it removes the subjectivity in selecting relevant spatial features. To examine the feasibility of applying CNN to classification of schizophrenia and controls based on structural Magnetic Resonance Imaging (MRI), we built 3D CNN models with different architectures and compared their performance with a handcrafted feature-based machine learning approach. Support vector machine (SVM) was used as the classifier and Voxel-based Morphometry (VBM) was used as the feature for handcrafted feature-based machine learning. 3D CNN models with sequential architecture, inception module and residual module were trained from scratch. CNN models achieved higher cross-validation accuracy than handcrafted feature-based machine learning. Moreover, when tested on an independent dataset, 3D CNN models greatly outperformed handcrafted feature-based machine learning. This study underscored the potential of CNN for identifying patients with schizophrenia using 3D brain MR images and paved the way for imaging-based individual-level diagnosis and prognosis in psychiatric disorders.

Submitted 14 March, 2020; originally announced March 2020.

Comments: 4 pages

arXiv:2002.01171

Towards a Fast Steady-State Visual Evoked Potentials (SSVEP) Brain-Computer Interface (BCI)

Authors: Aung Aung Phyo Wai, Yangsong Zhang, Heng Guo, Ying Chi, Lei Zhang, Xian-Sheng Hua, Seong Whan Lee, Cuntai Guan

Abstract: Steady-state visual evoked potentials (SSVEP) brain-computer interface (BCI) provides reliable responses leading to high accuracy and information throughput. But achieving high accuracy typically requires a relatively long time window of one second or more. Various methods were proposed to improve sub-second response accuracy through subject-specific training and calibration. Substantial performance improvements were achieved with tedious calibration and subject-specific training, resulting in user discomfort. So, we propose a training-free method by combining spatial-filtering and temporal alignment (CSTA) to recognize SSVEP responses in sub-second response time. CSTA exploits linear correlation and non-linear similarity between steady-state responses and stimulus templates with complementary fusion to achieve desirable performance improvements. We evaluated the performance of CSTA in terms of accuracy and Information Transfer Rate (ITR) in comparison with both training-based and training-free methods using two SSVEP data-sets. We observed that CSTA achieves the maximum mean accuracy of 97.43$\pm$2.26 % and 85.71$\pm$13.41 % with four-class and forty-class SSVEP data-sets respectively in sub-second response time in offline analysis. CSTA yields significantly higher mean performance (p<0.001) than the training-free method on both data-sets. Compared with training-based methods, CSTA shows 29.33$\pm$19.65 % higher mean accuracy with statistically significant differences in time windows shorter than 0.5 s. In longer time windows, CSTA exhibits either better or comparable performance though not statistically significantly better than training-based methods. We show that the proposed method brings advantages of subject-independent SSVEP classification without requiring training while enabling high target recognition performance in sub-second response time.

Submitted 12 May, 2020; v1 submitted 4 February, 2020; originally announced February 2020.

Comments: Further improvements or modifications required to algorithm design

arXiv:1911.08136 [pdf, other]

Enhancing the Extraction of Interpretable Information for Ischemic Stroke Imaging from Deep Neural Networks

Authors: Erico Tjoa, Guo Heng, Lu Yuhao, Cuntai Guan

Abstract: We implement a visual interpretability method, Layer-wise Relevance Propagation (LRP), on top of a 3D U-Net trained to perform lesion segmentation on the small dataset of multi-modal images provided by the ISLES 2017 competition. We demonstrate that LRP modifications could provide more sensible visual explanations to an otherwise highly noise-skewed saliency map. We also link the amplitude of modified signals to useful information content. High amplitude localized signals appear to constitute the noise that undermines the interpretability capacity of LRP. Furthermore, a mathematical framework for possible analysis of function approximation is developed by analogy.

Submitted 13 January, 2020; v1 submitted 19 November, 2019; originally announced November 2019.

Showing 1–28 of 28 results for author: Guan, C