
ECLIPSE: Expunging Clean-label Indiscriminate Poisons via Sparse Diffusion Purification

Authors: Xianlong Wang, Shengshan Hu, Yechao Zhang, Ziqi Zhou, Leo Yu Zhang, Peng Xu, Wei Wan, Hai Jin

Abstract: Clean-label indiscriminate poisoning attacks add invisible perturbations to correctly labeled training images, thus dramatically reducing the generalization capability of the victim models. Recently, several defense mechanisms have been proposed, such as adversarial training, image transformation techniques, and image purification. However, these schemes are either susceptible to adaptive attacks, built on unrealistic assumptions, or only effective against specific poison types, limiting their universal applicability. In this research, we propose a more universally effective, practical, and robust defense scheme called ECLIPSE. We first investigate the impact of Gaussian noise on the poisons and theoretically prove that any kind of poison will be largely assimilated when imposing sufficient random noise. In light of this, we assume the victim has access to an extremely limited number of clean images (a more practical scene) and subsequently enlarge this sparse set for training a denoising probabilistic model (a universal denoising tool). We then introduce Gaussian noise to absorb the poisons and apply the model for denoising, resulting in a roughly purified dataset. Finally, to address the trade-off of the inconsistency in the assimilation sensitivity of different poisons by Gaussian noise, we propose a lightweight corruption compensation module to effectively eliminate residual poisons, providing a more universal defense approach. Extensive experiments demonstrate that our defense approach outperforms 10 state-of-the-art defenses. We also propose an adaptive attack against ECLIPSE and verify the robustness of our defense scheme. Our code is available at https://github.com/CGCL-codes/ECLIPSE.

Submitted 24 June, 2024; v1 submitted 21 June, 2024; originally announced June 2024.

Comments: Accepted by ESORICS 2024
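
The noising step described in the ECLIPSE abstract above can be illustrated with a minimal sketch. It is not the authors' code: the function names, the 8/255 perturbation bound, and the noise level sigma = 0.25 are assumptions chosen for illustration, and the denoising model and the corruption compensation module are omitted entirely. The sketch only shows that a small bounded perturbation becomes small relative to sufficiently strong Gaussian noise.

```python
import math
import random


def add_gaussian_noise(pixels, sigma, rng):
    """Return a copy of `pixels` with independent N(0, sigma^2) noise added."""
    return [p + rng.gauss(0.0, sigma) for p in pixels]


def perturbation_to_noise_ratio(delta, sigma):
    """RMS magnitude of the perturbation divided by the noise standard deviation."""
    rms = math.sqrt(sum(d * d for d in delta) / len(delta))
    return rms / sigma


rng = random.Random(0)                      # fixed seed for repeatability
clean = [0.5] * 16                          # hypothetical flattened image in [0, 1]
# Hypothetical poison: every pixel shifted by +/- 8/255 (an assumed bound).
delta = [8 / 255 if i % 2 == 0 else -8 / 255 for i in range(16)]
poisoned = [c + d for c, d in zip(clean, delta)]

sigma = 0.25                                # assumed noise level
noisy = add_gaussian_noise(poisoned, sigma, rng)
ratio = perturbation_to_noise_ratio(delta, sigma)
print(f"perturbation-to-noise ratio: {ratio:.3f}")
```

With these assumed values the perturbation's RMS magnitude is roughly an eighth of the noise standard deviation. A real pipeline would follow this step with a trained denoiser, which this sketch does not attempt.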

arXiv:2405.09548 [pdf, other]

Efficient Bilevel Source Mask Optimization

Authors: Guojin Chen, Hongquan He, Peng Xu, Hao Geng, Bei Yu

Abstract: Resolution Enhancement Techniques (RETs) are critical to meet the demands of advanced technology nodes. Among RETs, Source Mask Optimization (SMO) is pivotal, concurrently optimizing both the source and the mask to expand the process window. Traditional SMO methods, however, are limited by sequential and alternating optimizations, leading to extended runtimes without performance guarantees. This paper introduces a unified SMO framework utilizing the accelerated Abbe forward imaging to enhance precision and efficiency. Further, we propose the innovative BiSMO framework, which reformulates SMO through a bilevel optimization approach, and present three gradient-based methods to tackle the challenges of bilevel SMO. Our experimental results demonstrate that BiSMO achieves a remarkable 40% reduction in error metrics and 8× increase in runtime efficiency, signifying a major leap forward in SMO.

Submitted 7 March, 2024; originally announced May 2024.

Comments: Accepted by Design Automation Conference (DAC) 2024

arXiv:2403.19633 [pdf, other]

doi 10.1109/TCST.2022.3193923

Lane-Change in Dense Traffic with Model Predictive Control and Neural Networks

Authors: Sangjae Bae, David Isele, Alireza Nakhaei, Peng Xu, Alexandre Miranda Anon, Chiho Choi, Kikuo Fujimura, Scott Moura

Abstract: This paper presents an online smooth-path lane-change control framework. We focus on dense traffic where inter-vehicle space gaps are narrow, and cooperation with surrounding drivers is essential to achieve the lane-change maneuver. We propose a two-stage control framework that harmonizes Model Predictive Control (MPC) with Generative Adversarial Networks (GAN) by utilizing driving intentions to generate smooth lane-change maneuvers. To improve performance in practice, the system is augmented with an adaptive safety boundary and a Kalman Filter to mitigate sensor noise. Simulation studies are conducted at different levels of traffic density and cooperativeness of other drivers. The simulation results support the effectiveness, driving comfort, and safety of the proposed method.

Submitted 28 March, 2024; originally announced March 2024.

Journal ref: IEEE Transactions on Control Systems Technology ( Volume: 31, Issue: 2, March 2023)
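
The abstract above mentions a Kalman Filter used to mitigate sensor noise. As a generic illustration only, the following sketch implements a scalar Kalman filter for a constant-state model. It is not the filter from the paper: the state model, the noise variances `q` and `r`, and the function name `kalman_1d` are all assumptions.

```python
def kalman_1d(measurements, q, r, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a state assumed constant between steps.

    q: process noise variance, r: measurement noise variance,
    x0, p0: initial state estimate and its variance.
    Returns the list of filtered estimates, one per measurement.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        p = p + q                # predict: uncertainty grows by process noise
        k = p / (p + r)          # Kalman gain in (0, 1)
        x = x + k * (z - x)      # update: move estimate toward the measurement
        p = (1.0 - k) * p        # posterior variance shrinks after the update
        estimates.append(x)
    return estimates


# Hypothetical usage: a sensor that keeps reporting 5.0, starting from x0 = 0.
filtered = kalman_1d([5.0] * 50, q=0.01, r=1.0)
```

A larger `r` relative to `q` makes the filter trust its prediction more and smooth the measurements more heavily, at the cost of slower response.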

arXiv:2403.17106 [pdf, other]

Efficient generation of realistic guided wave signals for reliability estimation

Authors: Panpan Xu, Robin Jones, Georgios Sarris, Peter Huthwaite

Abstract: Across non-destructive testing (NDT) and structural health monitoring (SHM), accurate knowledge of the systems' reliability for detecting defects, such as Probability of Detection (POD) analysis, is essential to enabling widespread adoption. Traditionally this relies on access to extensive experimental data to cover all critical areas of the parametric space, which becomes expensive and heavily undermines the benefit such systems bring. In response to these challenges, reliability estimation based on numerical simulation emerges as a practical solution, offering enhanced efficiency and cost-effectiveness. Nevertheless, precise reliability estimation demands that the simulated data faithfully represent the real-world performance. In this context, a numerical framework tailored to generate realistic signals for reliability estimation purposes is presented here, focusing on the application of guided wave SHM for pipe monitoring. It specifically incorporates key characteristics of real signals: random noise and coherent noise caused by the imbalance in transducer performance within guided wave monitoring systems. The effectiveness of our proposed methodology is demonstrated through a comprehensive comparative analysis between simulation-generated signals and experimental signals both individually and statistically. Furthermore, to assess the reliability of a guided wave system in terms of the inspection range for pipe monitoring, a series of POD analyses using simulation-generated data were conducted. The comparison of POD curves derived from ideal and realistic simulation data underscores the necessity of considering coherent noise for accurate POD curve calculations. Moreover, the POD analysis based on realistic simulation-generated data provides a quantitative estimation of the inspection range with more details compared to the current industry practice.

Submitted 25 March, 2024; originally announced March 2024.

arXiv:2403.11081 [pdf, other]

Enhanced Index Modulation Aided Non-Orthogonal Multiple Access via Constellation Rotation

Authors: Ronglan Huang, Fei ji, Zeng Hu, Dehuan Wan, Pengcheng Xu, Yun Liu

Abstract: Non-orthogonal multiple access (NOMA) has been widely nominated as an emerging multiple access technique with high spectral efficiency (SE) for the next generation of wireless communication networks. To meet the growing demands of massive connectivity and large data volumes, a novel index modulation aided NOMA with rotation of the signal constellation of low-power users (IM-NOMA-RC) is developed for downlink transmission. In the proposed IM-NOMA-RC system, the users are classified into a far-user group and a near-user group according to their channel conditions, and the rotation-constellation-based IM operation is performed only on the users in the near-user group, which are allocated lower power than the far users, to transmit extra information. In the proposed IM-NOMA-RC, all the subcarriers are activated to transmit information to multiple users to achieve higher SE. With the aid of the multiple-dimension modulation in IM-NOMA-RC, more users can be supported over an orthogonal resource block. Then, both the maximum likelihood (ML) detector and the successive interference cancellation (SIC) detector are studied for all users. Numerical simulation results of the proposed IM-NOMA-RC scheme are investigated for the ML detector and the SIC detector for each user, showing that the proposed scheme can outperform conventional NOMA.

Submitted 17 March, 2024; originally announced March 2024.

arXiv:2403.01868 [pdf, other]

doi 10.1109/IV55152.2023.10186774

Map-aided annotation for pole base detection

Authors: Benjamin Missaoui, Maxime Noizet, Philippe Xu

Abstract: For autonomous navigation, high definition maps are a widely used source of information. Pole-like features encoded in HD maps such as traffic signs, traffic lights or street lights can be used as landmarks for localization. For this purpose, they first need to be detected by the vehicle using its embedded sensors. While geometric models can be used to process 3D point clouds retrieved by lidar sensors, modern image-based approaches rely on deep neural networks and therefore heavily depend on annotated training data. In this paper, a 2D HD map is used to automatically annotate pole-like features in images. In the absence of height information, the map features are represented as pole bases at the ground level. We show how an additional lidar sensor can be used to filter out occluded features and refine the ground projection. We also demonstrate how an object detector can be trained to detect a pole base. To evaluate our methodology, it is first validated with data manually annotated from semantic segmentation and then compared to our own automatically generated annotated data recorded in the city of Compiègne, France. Erratum: In the original version [1], an error occurred in the accuracy evaluation of the different models studied and the evaluation method applied on the detection results was not clearly defined. In this revision, we offer a rectification to this segment, presenting updated results, especially in terms of Mean Absolute Errors (MAE).

Submitted 4 March, 2024; originally announced March 2024.

Journal ref: 35th IEEE Intelligent Vehicles Symposium (IV 2023), Jun 2023, Anchorage, AK, United States

arXiv:2402.05373 [pdf, other]

Unleashing the Infinity Power of Geometry: A Novel Geometry-Aware Transformer (GOAT) for Whole Slide Histopathology Image Analysis

Authors: Mingxin Liu, Yunzan Liu, Pengbo Xu, Jiquan Ma

Abstract: Histopathology analysis is of great significance for the diagnosis and prognosis of cancers; however, it faces great challenges due to the enormous heterogeneity of gigapixel whole slide images (WSIs) and the intricate representation of pathological features. Moreover, recent methods have not adequately exploited geometrical representation in WSIs, which is significant in disease diagnosis. Therefore, we propose a novel weakly-supervised framework, Geometry-Aware Transformer (GOAT), in which we urge the model to pay attention to the geometric characteristics within the tumor microenvironment, which often serve as potent indicators. In addition, a context-aware attention mechanism is designed to extract and enhance the morphological features within WSIs.

Submitted 7 February, 2024; originally announced February 2024.

Comments: 5 pages, 3 figures. Accepted by 21st IEEE International Symposium on Biomedical Imaging (ISBI 2024)

arXiv:2309.03905 [pdf, other]

ImageBind-LLM: Multi-modality Instruction Tuning

Authors: Jiaming Han, Renrui Zhang, Wenqi Shao, Peng Gao, Peng Xu, Han Xiao, Kaipeng Zhang, Chris Liu, Song Wen, Ziyu Guo, Xudong Lu, Shuai Ren, Yafei Wen, Xiaoxin Chen, Xiangyu Yue, Hongsheng Li, Yu Qiao

Abstract: We present ImageBind-LLM, a multi-modality instruction tuning method of large language models (LLMs) via ImageBind. Existing works mainly focus on language and image instruction tuning, different from which, our ImageBind-LLM can respond to multi-modality conditions, including audio, 3D point clouds, video, and their embedding-space arithmetic by only image-text alignment training. During training, we adopt a learnable bind network to align the embedding space between LLaMA and ImageBind's image encoder. Then, the image features transformed by the bind network are added to word tokens of all layers in LLaMA, which progressively injects visual instructions via an attention-free and zero-initialized gating mechanism. Aided by the joint embedding of ImageBind, the simple image-text training enables our model to exhibit superior multi-modality instruction-following capabilities. During inference, the multi-modality inputs are fed into the corresponding ImageBind encoders, and processed by a proposed visual cache model for further cross-modal embedding enhancement. The training-free cache model retrieves from three million image features extracted by ImageBind, which effectively mitigates the training-inference modality discrepancy. Notably, with our approach, ImageBind-LLM can respond to instructions of diverse modalities and demonstrate significant language generation quality. Code is released at https://github.com/OpenGVLab/LLaMA-Adapter.

Submitted 11 September, 2023; v1 submitted 7 September, 2023; originally announced September 2023.

Comments: Code is available at https://github.com/OpenGVLab/LLaMA-Adapter

arXiv:2309.03686 [pdf, other]

MS-UNet-v2: Adaptive Denoising Method and Training Strategy for Medical Image Segmentation with Small Training Data

Authors: Haoyuan Chen, Yufei Han, Pin Xu, Yanyi Li, Kuan Li, Jianping Yin

Abstract: Models based on U-like structures have improved the performance of medical image segmentation. However, the single-layer decoder structure of U-Net is too "thin" to exploit enough information, resulting in large semantic differences between the encoder and decoder parts. Things get worse if the amount of training data is not sufficiently large, which is common in medical image processing tasks where annotated data are more difficult to obtain than in other tasks. Based on this observation, we propose a novel U-Net model named MS-UNet for the medical image segmentation task in this study. Instead of the single-layer U-Net decoder structure used in Swin-UNet and TransUnet, we specifically design a multi-scale nested decoder based on the Swin Transformer for U-Net. The proposed multi-scale nested decoder structure allows the feature mapping between the decoder and encoder to be semantically closer, thus enabling the network to learn more detailed features. In addition, we propose a novel edge loss and a plug-and-play fine-tuning Denoising module, which not only effectively improve the segmentation performance of MS-UNet, but could also be applied to other models individually. Experimental results show that MS-UNet could effectively improve the network performance with more efficient feature learning capability and exhibit more advanced performance, especially in the extreme case with a small amount of training data, and the proposed Edge loss and Denoising module could significantly enhance the segmentation performance of MS-UNet.

Submitted 7 September, 2023; originally announced September 2023.

arXiv:2308.11635 [pdf, other]

Semi-Supervised Dual-Stream Self-Attentive Adversarial Graph Contrastive Learning for Cross-Subject EEG-based Emotion Recognition

Authors: Weishan Ye, Zhiguo Zhang, Fei Teng, Min Zhang, Jianhong Wang, Dong Ni, Fali Li, Peng Xu, Zhen Liang

Abstract: Electroencephalography (EEG) is an objective tool for emotion recognition with promising applications. However, the scarcity of labeled data remains a major challenge in this field, limiting the widespread use of EEG-based emotion recognition. In this paper, a semi-supervised Dual-stream Self-Attentive Adversarial Graph Contrastive learning framework (termed as DS-AGC) is proposed to tackle the challenge of limited labeled data in cross-subject EEG-based emotion recognition. The DS-AGC framework includes two parallel streams for extracting non-structural and structural EEG features. The non-structural stream incorporates a semi-supervised multi-domain adaptation method to alleviate distribution discrepancy among labeled source domain, unlabeled source domain, and unknown target domain. The structural stream develops a graph contrastive learning method to extract effective graph-based feature representation from multiple EEG channels in a semi-supervised manner. Further, a self-attentive fusion module is developed for feature fusion, sample selection, and emotion recognition, which highlights EEG features more relevant to emotions and data samples in the labeled source domain that are closer to the target domain. Extensive experiments conducted on two benchmark databases (SEED and SEED-IV) using a semi-supervised cross-subject leave-one-subject-out cross-validation evaluation scheme show that the proposed model outperforms existing methods under different incomplete label conditions (with an average improvement of 5.83% on SEED and 6.99% on SEED-IV), demonstrating its effectiveness in addressing the label scarcity problem in cross-subject EEG-based emotion recognition.

Submitted 2 August, 2024; v1 submitted 13 August, 2023; originally announced August 2023.

Comments: arXiv admin note: text overlap with arXiv:2304.06496

arXiv:2305.17054 [pdf, other]

Extremely weakly-supervised blood vessel segmentation with physiologically based synthesis and domain adaptation

Authors: Peidi Xu, Olga Sosnovtseva, Charlotte Mehlin Sørensen, Kenny Erleben, Sune Darkner

Abstract: Accurate analysis and modeling of renal functions require a precise segmentation of the renal blood vessels. Micro-CT scans provide image data at higher resolutions, making more small vessels near the renal cortex visible. Although deep-learning-based methods have shown state-of-the-art performance in automatic blood vessel segmentations, they require a large amount of labeled training data. However, voxel-wise labeling in micro-CT scans is extremely time-consuming given the huge volume sizes. To mitigate the problem, we simulate synthetic renal vascular trees physiologically while generating corresponding scans of the simulated trees by training a generative model on unlabeled scans. This enables the generative model to learn the mapping implicitly without the need for explicit functions to emulate the image acquisition process. We further propose an additional segmentation branch over the generative model trained on the generated scans. We demonstrate that the model can directly segment blood vessels on real scans and validate our method on both 3D micro-CT scans of rat kidneys and a proof-of-concept experiment on 2D retinal images. Code and 3D results are available at https://github.com/miccai2023anony/RenalVesselSeg

Submitted 26 May, 2023; originally announced May 2023.

arXiv:2303.04595 [pdf]

Structure-aware registration network for liver DCE-CT images

Authors: Peng Xue, Jingyang Zhang, Lei Ma, Mianxin Liu, Yuning Gu, Jiawei Huang, Feihong Liua, Yongsheng Pan, Xiaohuan Cao, Dinggang Shen

Abstract: Image registration of liver dynamic contrast-enhanced computed tomography (DCE-CT) is crucial for diagnosis and image-guided surgical planning of liver cancer. However, intensity variations due to the flow of contrast agents, combined with complex spatial motion induced by respiration, bring great challenges to existing intensity-based registration methods. To address these problems, we propose a novel structure-aware registration method by incorporating structural information of related organs with a segmentation-guided deep registration network. Existing segmentation-guided registration methods only focus on volumetric registration inside the paired organ segmentations, ignoring the inherent attributes of their anatomical structures. In addition, such paired organ segmentations are not always available in DCE-CT images due to the flow of contrast agents. Different from existing segmentation-guided registration methods, our proposed method extracts structural information in hierarchical geometric perspectives of line and surface. Then, according to the extracted structural information, structure-aware constraints are constructed and imposed on the forward and backward deformation fields simultaneously. In this way, all available organ segmentations, including unpaired ones, can be fully utilized to avoid the side effect of contrast agent and preserve the topology of organs during registration. Extensive experiments on an in-house liver DCE-CT dataset and a public LiTS dataset show that our proposed method can achieve higher registration accuracy and preserve anatomical structure more effectively than state-of-the-art methods.

Submitted 8 March, 2023; originally announced March 2023.

arXiv:2302.14751 [pdf]

High speed free-space optical communication using standard fiber communication component without optical amplification

Authors: Yao Zhang, Hua-Ying Liu, Xiaoyi Liu, Peng Xu, Xiang Dong, Pengfei Fan, Xiaohui Tian, Hua Yu, Dong Pan, Zhijun Yin, Guilu Long, Shi-Ning Zhu, Zhenda Xie

Abstract: Free-space optical communication (FSO) can achieve fast, secure and license-free communication without the need for physical cables, making it a cost-effective, energy-efficient and flexible solution when a fiber connection is unavailable. To establish FSO connections on demand, it is essential to build portable FSO devices with compact structure and light weight. Here, we develop a miniaturized FSO system and realize 9.16 Gbps FSO between two nodes that are 1 km apart, using a commercial single-mode-fiber-coupled optical transceiver module without optical amplification. Using our 4-stage acquisition, pointing and tracking (APT) systems, the tracking error is within 3 μrad, resulting in an average link loss of 13.7 dB, which is the key for this high-bandwidth FSO demonstration without optical amplification. Our FSO link has been tested up to 4 km, with a link loss of 18 dB that is limited by the foggy weather during the test. Longer FSO distances can be expected with better weather conditions and optical amplification. With a single FSO device weighing only 9.5 kg, this result points to broad applications of field-deployable high-speed wireless communication.

Submitted 16 April, 2023; v1 submitted 27 February, 2023; originally announced February 2023.

Comments: 7 pages, 5 figures

arXiv:2301.05599 [pdf, other]

Short-length SSVEP data extension by a novel generative adversarial networks based framework

Authors: Yudong Pan, Ning Li, Yangsong Zhang, Peng Xu, Dezhong Yao

Abstract: Steady-state visual evoked potentials (SSVEPs) based brain-computer interface (BCI) has received considerable attention due to its high information transfer rate (ITR) and available quantity of targets. However, the performance of frequency identification methods heavily hinges on the amount of user calibration data and data length, which hinders the deployment in real-world applications. Recently, generative adversarial networks (GANs)-based data generation methods have been widely adopted to create synthetic electroencephalography (EEG) data, which holds promise to address these issues. In this paper, we propose a GAN-based end-to-end signal transformation network for Time-window length Extension, termed as TEGAN. TEGAN transforms short-length SSVEP signals into long-length artificial SSVEP signals. By incorporating a novel U-Net generator architecture and an auxiliary classifier into the network architecture, TEGAN can produce conditioned features in the synthetic data. Additionally, we introduce a two-stage training strategy and the LeCam-divergence regularization term to regularize the training process of the GAN during the network implementation. The proposed TEGAN was evaluated on two public SSVEP datasets (a 4-class dataset and a 12-class dataset). With the assistance of TEGAN, the performance of traditional frequency recognition methods and deep learning-based methods has been significantly improved under limited calibration data, and the classification performance gap among various frequency recognition methods has been narrowed. This study substantiates the feasibility of the proposed method to extend the data length for short-time SSVEP signals for developing a high-performance BCI system. The proposed GAN-based methods have great potential for shortening the calibration time and cutting down the budget for various real-world BCI-based applications.

Submitted 2 October, 2023; v1 submitted 13 January, 2023; originally announced January 2023.

Comments: 16 pages, 9 figures, 4 tables

arXiv:2210.16682 [pdf, ps, other]

Robust Distributed Learning Against Both Distributional Shifts and Byzantine Attacks

Authors: Guanqiang Zhou, Ping Xu, Yue Wang, Zhi Tian

Abstract: In distributed learning systems, robustness issues may arise from two sources. On one hand, due to distributional shifts between training data and test data, the trained model could exhibit poor out-of-sample performance. On the other hand, a portion of working nodes might be subject to byzantine attacks which could invalidate the learning result. Existing works mostly deal with these two issues separately. In this paper, we propose a new algorithm that equips distributed learning with robustness measures against both distributional shifts and byzantine attacks. Our algorithm is built on recent advances in distributionally robust optimization as well as norm-based screening (NBS), a robust aggregation scheme against byzantine attacks. We provide convergence proofs in three cases of the learning model being nonconvex, convex, and strongly convex for the proposed algorithm, shedding light on its convergence behaviors and endurability against byzantine attacks. In particular, we deduce that any algorithm employing NBS (including ours) cannot converge when the percentage of byzantine nodes is 1/3 or higher, instead of 1/2, which is the common belief in current literature. The experimental results demonstrate the effectiveness of our algorithm against both robustness issues. To the best of our knowledge, this is the first work to address distributional shifts and byzantine attacks simultaneously.

Submitted 29 October, 2022; originally announced October 2022.
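
Illustrative note: the abstract above names norm-based screening (NBS) as its robust aggregation scheme but does not spell out the rule. The sketch below assumes the common form of the idea, in which the server ranks the workers' gradients by Euclidean norm, discards a fixed fraction with the largest norms, and averages the rest. The function name `norm_based_screening` and the parameter `screen_fraction` are hypothetical and are not taken from the paper.

```python
import math


def norm_based_screening(gradients, screen_fraction):
    """Average the gradients left after dropping the largest-norm ones.

    Assumed form of NBS (not confirmed by the abstract): rank by L2 norm,
    drop floor(screen_fraction * n) gradients from the top, average the rest.
    """
    if not gradients:
        raise ValueError("need at least one gradient")
    if not 0.0 <= screen_fraction < 1.0:
        raise ValueError("screen_fraction must be in [0, 1)")

    n_drop = int(math.floor(screen_fraction * len(gradients)))
    # Sort ascending by Euclidean norm so suspicious (large) vectors come last.
    ranked = sorted(gradients, key=lambda g: math.sqrt(sum(x * x for x in g)))
    kept = ranked[: len(ranked) - n_drop]

    dim = len(kept[0])
    return [sum(g[i] for g in kept) / len(kept) for i in range(dim)]


# Three honest workers agree; one worker sends an inflated gradient.
worker_gradients = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0], [100.0, -100.0]]
aggregated = norm_based_screening(worker_gradients, screen_fraction=0.25)
```

With a quarter of the four gradients screened out, the inflated vector is dropped and the aggregate equals the honest workers' common gradient. A screening rule of this kind only inspects norms, so it cannot detect malicious gradients whose norms look ordinary.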

arXiv:2210.10865 [pdf, other]

Robotic Table Wiping via Reinforcement Learning and Whole-body Trajectory Optimization

Authors: Thomas Lew, Sumeet Singh, Mario Prats, Jeffrey Bingham, Jonathan Weisz, Benjie Holson, Xiaohan Zhang, Vikas Sindhwani, Yao Lu, Fei Xia, Peng Xu, Tingnan Zhang, Jie Tan, Montserrat Gonzalez

Abstract: We propose a framework to enable multipurpose assistive mobile robots to autonomously wipe tables to clean spills and crumbs. This problem is challenging, as it requires planning wiping actions while reasoning over uncertain latent dynamics of crumbs and spills captured via high-dimensional visual observations. Simultaneously, we must guarantee constraints satisfaction to enable safe deployment in unstructured cluttered environments. To tackle this problem, we first propose a stochastic differential equation to model crumbs and spill dynamics and absorption with a robot wiper. Using this model, we train a vision-based policy for planning wiping actions in simulation using reinforcement learning (RL). To enable zero-shot sim-to-real deployment, we dovetail the RL policy with a whole-body trajectory optimization framework to compute base and arm joint trajectories that execute the desired wiping motions while guaranteeing constraints satisfaction. We extensively validate our approach in simulation and on hardware. Video: https://youtu.be/inORKP4F3EI

Submitted 19 October, 2022; originally announced October 2022.

arXiv:2208.08226 [pdf, other]

Auto-segmentation of Hip Joints using MultiPlanar UNet with Transfer learning

Authors: Peidi Xu, Faezeh Moshfeghifar, Torkan Gholamalizadeh, Michael Bachmann Nielsen, Kenny Erleben, Sune Darkner

Abstract: Accurate geometry representation is essential in developing finite element models. Although generally good, deep-learning segmentation approaches trained on only a few data samples have difficulties in accurately segmenting fine features, e.g., gaps and thin structures. Subsequently, segmented geometries need labor-intensive manual modifications to reach a quality where they can be used for simulation purposes. We propose a strategy that uses transfer learning to reuse datasets with poor segmentation combined with an interactive learning step where fine-tuning of the data results in anatomically accurate segmentations suitable for simulations. We use a modified MultiPlanar UNet that is pre-trained using inferior hip joint segmentation combined with a dedicated loss function to learn the gap regions and post-processing to correct tiny inaccuracies on symmetric classes due to rotational invariance. We demonstrate this robust yet conceptually simple approach applied with clinically validated results on publicly available computed tomography scans of hip joints. Code and resulting 3D models are available at: https://github.com/MICCAI2022-155/AuToSeg

Submitted 18 August, 2022; v1 submitted 17 August, 2022; originally announced August 2022.

Comments: Accepted at Medical Image Learning with Limited & Noisy Data (MILLanD), a workshop hosted with the conference on Medical Image Computing and Computer Assisted Interventions (MICCAI) 2022

arXiv:2208.06876 [pdf, other]

Conformal Navigation Transformations with Application to Robot Navigation in Complex Workspaces

Authors: Li Fan, Jianchang Liu, Wenle Zhang, Peng Xu

Abstract: Navigation functions provide both path and motion planning, which can be used to ensure obstacle avoidance and convergence in the sphere world. When dealing with complex and realistic scenarios, constructing a transformation to the sphere world is essential and, at the same time, challenging. This work proposes a novel transformation termed the conformal navigation transformation to achieve collision-free navigation of a robot in a workspace populated with obstacles of arbitrary shapes. The properties of the conformal navigation transformation, including uniqueness, invariance of navigation properties, and no angular deformation, are investigated, which contribute to the solution of the robot navigation problem in complex environments. Based on navigation functions and the proposed transformation, feedback controllers are derived for the automatic guidance and motion control of kinematic and dynamic mobile robots. Moreover, an iterative method is proposed to construct the conformal navigation transformation in a multiply-connected workspace, which transforms the multiply-connected problem into multiple simply-connected problems to achieve fast convergence. In addition to the analytic guarantees, simulation studies verify the effectiveness of the proposed methodology in workspaces with non-trivial obstacles.

Submitted 2 October, 2022; v1 submitted 14 August, 2022; originally announced August 2022.

arXiv:2207.03687 [pdf]

Predicting Li-ion Battery Cycle Life with LSTM RNN

Authors: Pengcheng Xu, Yunfeng Lu

Abstract: Efficient and accurate remaining useful life prediction is a key factor for reliable and safe usage of lithium-ion batteries. This work trains a long short-term memory recurrent neural network model to learn from sequential data of discharge capacities at various cycles and voltages and to work as a cycle life predictor for battery cells cycled under different conditions. Using experimental data from the first 60-80 cycles, our model achieves promising prediction accuracy on test sets of around 80 samples.

Submitted 8 July, 2022; originally announced July 2022.

arXiv:2207.03430 [pdf, ps, other]

A Novel Unified Conditional Score-based Generative Framework for Multi-modal Medical Image Completion

Authors: Xiangxi Meng, Yuning Gu, Yongsheng Pan, Nizhuan Wang, Peng Xue, Mengkang Lu, Xuming He, Yiqiang Zhan, Dinggang Shen

Abstract: Multi-modal medical image completion has been extensively applied to alleviate the missing modality issue in a wealth of multi-modal diagnostic tasks. However, for most existing synthesis methods, their inferences of missing modalities can collapse into a deterministic mapping from the available ones, ignoring the uncertainties inherent in the cross-modal relationships. Here, we propose the Unified Multi-Modal Conditional Score-based Generative Model (UMM-CSGM) to take advantage of Score-based Generative Model (SGM) in modeling and stochastically sampling a target probability distribution, and further extend SGM to cross-modal conditional synthesis for various missing-modality configurations in a unified framework. Specifically, UMM-CSGM employs a novel multi-in multi-out Conditional Score Network (mm-CSN) to learn a comprehensive set of cross-modal conditional distributions via conditional diffusion and reverse generation in the complete modality space. In this way, the generation process can be accurately conditioned by all available information, and can fit all possible configurations of missing modalities in a single network. Experiments on BraTS19 dataset show that the UMM-CSGM can more reliably synthesize the heterogeneous enhancement and irregular area in tumor-induced lesions for any missing modalities.

Submitted 7 July, 2022; originally announced July 2022.

arXiv:2206.06813 [pdf, other]

Learning towards Synchronous Network Memorizability and Generalizability for Continual Segmentation across Multiple Sites

Authors: Jingyang Zhang, Peng Xue, Ran Gu, Yuning Gu, Mianxin Liu, Yongsheng Pan, Zhiming Cui, Jiawei Huang, Lei Ma, Dinggang Shen

Abstract: In clinical practice, a segmentation network is often required to continually learn on a sequential data stream from multiple sites rather than a consolidated set, due to the storage cost and privacy restriction. However, during the continual learning process, existing methods are usually restricted in either network memorizability on previous sites or generalizability on unseen sites. This paper aims to tackle the challenging problem of Synchronous Memorizability and Generalizability (SMG) and to simultaneously improve performance on both previous and unseen sites, with a novel proposed SMG-learning framework. First, we propose a Synchronous Gradient Alignment (SGA) objective, which not only promotes the network memorizability by enforcing coordinated optimization for a small exemplar set from previous sites (called replay buffer), but also enhances the generalizability by facilitating site-invariance under simulated domain shift. Second, to simplify the optimization of SGA objective, we design a Dual-Meta algorithm that approximates the SGA objective as dual meta-objectives for optimization without expensive computation overhead. Third, for efficient rehearsal, we configure the replay buffer comprehensively considering additional inter-site diversity to reduce redundancy. Experiments on prostate MRI data sequentially acquired from six institutes demonstrate that our method can simultaneously achieve higher memorizability and generalizability over state-of-the-art methods. Code is available at https://github.com/jingyzhang/SMG-Learning.

Submitted 27 June, 2022; v1 submitted 14 June, 2022; originally announced June 2022.

Comments: Early accepted in MICCAI2022

arXiv:2204.00925 [pdf, ps, other]

A UCB-based Tree Search Approach to Joint Verification-Correction Strategy for Large Scale Systems

Authors: Peng Xu, Xinwei Deng, Alejandro Salado

Abstract: Verification planning is a sequential decision-making problem that specifies a set of verification activities (VA) and correction activities (CA) at different phases of system development. While VAs are used to identify errors and defects, CAs also play important roles in system verification as they correct the identified errors and defects. However, current planning methods only consider VAs as decision choices. Because VAs and CAs have different activity spaces, planning a joint verification-correction strategy (JVCS) is still challenging, especially for large-size systems. Here we introduce a UCB-based tree search approach to search for near-optimal JVCSs. First, verification planning is simplified as repeatable bandit problems and an upper confidence bound rule for repeatable bandits (UCBRB) is presented with the optimal regret bound. Next, a tree search algorithm is proposed to search for feasible JVCSs. A tree-based ensemble learning model is also used to extend the tree search algorithm to handle local optimality issues. The proposed approach is evaluated on the notional case of a communication system.

Submitted 2 April, 2022; originally announced April 2022.

Comments: 23 pages, 10 figures
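
Illustrative note: the abstract above builds on an upper confidence bound rule (UCBRB) whose exact form is not given in this listing. For orientation only, the sketch below shows the classical UCB1 selection rule, which is a different and simpler rule than the paper's: each arm is scored by its empirical mean plus an exploration bonus of sqrt(2 ln t / n). The function name `ucb1_select` and its arguments are hypothetical.

```python
import math


def ucb1_select(counts, means, total_pulls):
    """Return the index of the arm chosen by the classical UCB1 rule.

    counts[i]   -- number of times arm i has been pulled so far
    means[i]    -- empirical mean reward of arm i
    total_pulls -- total number of pulls so far (t in the UCB1 formula)
    """
    # Any arm that has never been tried is pulled first.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm

    scores = [
        mean + math.sqrt(2.0 * math.log(total_pulls) / n)
        for mean, n in zip(means, counts)
    ]
    return max(range(len(scores)), key=scores.__getitem__)


# Arm 0 looks better on average, but arm 1 has barely been explored.
chosen = ucb1_select(counts=[100, 1], means=[0.6, 0.5], total_pulls=101)
```

In this example the bonus for the rarely pulled arm outweighs the small gap in empirical means, so UCB1 explores arm 1. In a tree search, a rule of this kind is typically applied at each node to decide which child to expand next.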

arXiv:2201.02419 [pdf, other]

Automatic Speech Recognition Datasets in Cantonese: A Survey and New Dataset

Authors: Tiezheng Yu, Rita Frieske, Peng Xu, Samuel Cahyawijaya, Cheuk Tung Shadow Yiu, Holy Lovenia, Wenliang Dai, Elham J. Barezi, Qifeng Chen, Xiaojuan Ma, Bertram E. Shi, Pascale Fung

Abstract: Automatic speech recognition (ASR) on low resource languages improves the access of linguistic minorities to technological advantages provided by artificial intelligence (AI). In this paper, we address the problem of data scarcity for the Hong Kong Cantonese language by creating a new Cantonese dataset. Our dataset, Multi-Domain Cantonese Corpus (MDCC), consists of 73.6 hours of clean read speech paired with transcripts, collected from Cantonese audiobooks from Hong Kong. It comprises philosophy, politics, education, culture, lifestyle and family domains, covering a wide range of topics. We also review all existing Cantonese datasets and analyze them according to their speech type, data source, total size and availability. We further conduct experiments with Fairseq S2T Transformer, a state-of-the-art ASR model, on the biggest existing dataset, Common Voice zh-HK, and our proposed MDCC, and the results show the effectiveness of our dataset. In addition, we create a powerful and robust Cantonese ASR model by applying multi-dataset learning on MDCC and Common Voice zh-HK.

Submitted 17 January, 2022; v1 submitted 7 January, 2022; originally announced January 2022.

arXiv:2109.11704 [pdf, ps, other]

A Parallel Tempering Approach for Efficient Exploration of the Verification Tradespace in Engineered Systems

Authors: Peng Xu, Alejandro Salado, Xinwei Deng

Abstract: Verification is a critical process in the development of engineered systems. Through verification, engineers gain confidence in the correct functionality of the system before it is deployed into operation. Traditionally, verification strategies are fixed at the beginning of the system's development and verification activities are executed as the development progresses. Such an approach appears to give inferior results as the selection of the verification activities does not leverage information gained through the system's development process. In contrast, a set-based design approach to verification, where verification activities are dynamically selected as the system's development progresses, has been shown to provide superior results. However, its application under realistic engineering scenarios remains unproven due to the large size of the verification tradespace. In this work, we propose a parallel tempering approach (PTA) to efficiently explore the verification tradespace. First, we formulate exploration of the verification tradespace as a tree search problem. Second, we design a parallel tempering (PT) algorithm by simulating several replicas of the verification process at different temperatures to obtain a near-optimal result. Third, we apply the PT algorithm to all possible verification states to dynamically identify near-optimal results. The effectiveness of the proposed PTA is evaluated on a partial model of a notional satellite optical instrument.

Submitted 23 September, 2021; originally announced September 2021.
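
Illustrative note: the abstract above runs several replicas at different temperatures, which is the defining feature of parallel tempering. The sketch below shows only the standard replica-exchange acceptance probability, min(1, exp((1/T_i - 1/T_j)(E_i - E_j))), for swapping the states of two replicas. Treating the cost of a verification strategy as the "energy" is an assumption made here for illustration, and the function name `swap_acceptance` is hypothetical.

```python
import math


def swap_acceptance(energy_i, energy_j, temp_i, temp_j):
    """Probability of swapping the states of replicas i and j.

    Standard replica-exchange (Metropolis) criterion:
        min(1, exp((1/temp_i - 1/temp_j) * (energy_i - energy_j)))
    """
    if temp_i <= 0.0 or temp_j <= 0.0:
        raise ValueError("temperatures must be positive")
    exponent = (1.0 / temp_i - 1.0 / temp_j) * (energy_i - energy_j)
    # A non-negative exponent means the swap is always accepted.
    return 1.0 if exponent >= 0.0 else math.exp(exponent)


# The cold replica (T = 1) holds the worse state, so the swap is certain.
p_certain = swap_acceptance(energy_i=5.0, energy_j=1.0, temp_i=1.0, temp_j=2.0)
# The cold replica already holds the better state, so the swap is unlikely.
p_unlikely = swap_acceptance(energy_i=1.0, energy_j=5.0, temp_i=1.0, temp_j=2.0)
```

Swaps of this kind let low-cost states found by hot, exploratory replicas migrate to cold replicas, which then refine them.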

arXiv:2103.00133 [pdf]

doi 10.3389/fenrg.2021.666130

Coordinated Cyber-Attack Detection Model of Cyber-Physical Power System Based on the Operating State Data Link

Authors: Lei Wang, Pengcheng Xu, Zhaoyang Qu, Xiaoyong Bo, Yunchang Dong, Zhenming Zhang, Yang Li

Abstract: Existing coordinated cyber-attack detection methods have low detection accuracy and efficiency and poor generalization ability due to difficulties dealing with unbalanced attack data samples, high data dimensionality, and noisy data sets. This paper proposes a model for cyber and physical data fusion using a data link for detecting attacks on a Cyber-Physical Power System (CPPS). Two-step principal component analysis (PCA) is used for classifying the system's operating status. An adaptive synthetic sampling algorithm is used to reduce the imbalance in the categories' samples. The loss function is improved according to the feature intensity difference of the attack event, and an integrated classifier is established using a classification algorithm based on the cost-sensitive gradient boosting decision tree (CS-GBDT). The simulation results show that the proposed method provides higher accuracy, recall, and F-Score than comparable algorithms.

Submitted 27 February, 2021; originally announced March 2021.

Comments: Accepted by Frontiers in Energy Research

Journal ref: Frontiers in Energy Research 9 (2021) 666130

arXiv:2101.01139 [pdf, other]

doi 10.1007/978-3-030-71151-1_52

High-bandwidth nonlinear control for soft actuators with recursive network models

Authors: Sarah Aguasvivas Manzano, Patricia Xu, Khoi Ly, Robert Shepherd, Nikolaus Correll

Abstract: We present a high-bandwidth, lightweight, and nonlinear output tracking technique for soft actuators that combines parsimonious recursive layers for forward output predictions and online optimization using Newton-Raphson. This technique allows for reduced model sizes and increased control loop frequencies when compared with conventional RNN models. Experimental results of this controller prototype on a single soft actuator with soft positional sensors indicate effective tracking of referenced spatial trajectories and rejection of mechanical and electromagnetic disturbances. These are evidenced by root mean squared path tracking errors (RMSE) of 1.8mm using a fully connected (FC) substructure, 1.62mm using a gated recurrent unit (GRU) and 2.11mm using a long short term memory (LSTM) unit, all averaged over three tasks. Among these models, the highest flash memory requirement is 2.22kB enabling co-location of controller and actuator.

Submitted 4 January, 2021; originally announced January 2021.

Comments: International Symposium on Experimental Robotics (ISER) 2020, Malta

arXiv:2012.11161 [pdf, ps, other]

Communication and Localization with Extremely Large Lens Antenna Array

Authors: Jie Yang, Yong Zeng, Shi Jin, Chao-Kai Wen, Pingping Xu

Abstract: Achieving high-rate communication with accurate localization and wireless environment sensing has emerged as an important trend of beyond-fifth and sixth generation cellular systems. Extension of the antenna array to an extremely large scale is a potential technology for achieving such goals. However, the super massive operating antennas significantly increase the computational complexity of the system. Motivated by the inherent advantages of lens antenna arrays in reducing system complexity, we consider communication and localization problems with an extremely large lens antenna array, which we call "ExLens". Since radiative near-field property emerges in the setting, we derive the closed-form array response of the lens antenna array with spherical wave, which includes the array response obtained on the basis of uniform plane wave as a special case. Our derivation result reveals a window effect for energy focusing property of ExLens, which indicates that ExLens has great potential in position sensing and multi-user communication. We also propose an effective method for location and channel parameters estimation, which is able to achieve the localization performance close to the Cramér-Rao lower bound. Finally, we examine the multi-user communication performance of ExLens that serves coexisting near-field and far-field users. Numerical results demonstrate the effectiveness of the proposed channel estimation method and show that ExLens with a minimum mean square error receiver achieves significant spectral efficiency gains and complexity-and-cost reductions compared with a uniform linear array.

Submitted 21 December, 2020; originally announced December 2020.

Comments: Paper accepted for publication in IEEE Transactions on Wireless Communications

arXiv:2009.05103 [pdf, other]

Emotion-Based End-to-End Matching Between Image and Music in Valence-Arousal Space

Authors: Sicheng Zhao, Yaxian Li, Xingxu Yao, Weizhi Nie, Pengfei Xu, Jufeng Yang, Kurt Keutzer

Abstract: Both images and music can convey rich semantics and are widely used to induce specific emotions. Matching images and music with similar emotions might help to make emotion perceptions more vivid and stronger. Existing emotion-based image and music matching methods either employ limited categorical emotion states which cannot well reflect the complexity and subtlety of emotions, or train the matching model using an impractical multi-stage pipeline. In this paper, we study end-to-end matching between image and music based on emotions in the continuous valence-arousal (VA) space. First, we construct a large-scale dataset, termed Image-Music-Emotion-Matching-Net (IMEMNet), with over 140K image-music pairs. Second, we propose cross-modal deep continuous metric learning (CDCML) to learn a shared latent embedding space which preserves the cross-modal similarity relationship in the continuous matching space. Finally, we refine the embedding space by further preserving the single-modal emotion relationship in the VA spaces of both images and music. The metric learning in the embedding space and task regression in the label space are jointly optimized for both cross-modal matching and single-modal VA prediction. The extensive experiments conducted on IMEMNet demonstrate the superiority of CDCML for emotion-based image and music matching as compared to the state-of-the-art approaches.

Submitted 22 August, 2020; originally announced September 2020.

Comments: Accepted by ACM Multimedia 2020

arXiv:2008.07135 [pdf]

Extension of causal decomposition in the mutual complex dynamic process

Authors: Yi Zhang, Qin Yang, Lifu Zhang, Branko Celler, Steven Su, Peng Xu, Dezhong Yao

Abstract: Causal decomposition depicts a cause-effect relationship that is not based on the concept of prediction, but based on the phase dependence of time series. It has been validated in both stochastic and deterministic systems and is now anticipated for its application in the complex dynamic process. Here, we present an extension of causal decomposition in the mutual complex dynamic process: cause and effect of time series are inherited in the decomposition of intrinsic components in a similar time scale. Furthermore, we illustrate comparative studies with predominant methods used in neuroscience, and show the applicability of the method particularly to physiological time series in brain-muscle interactions, implying the potential to the causality analysis in the complex physiological process.

Submitted 17 August, 2020; originally announced August 2020.

Comments: 18 pages, 4 figures

arXiv:2005.12686 [pdf, other]

Physical Layer Authentication for Non-Coherent Massive SIMO-Enabled Industrial IoT Communications

Authors: Zhifang Gu, He Chen, Pingping Xu, Yonghui Li, Branka Vucetic

Abstract: Achieving ultra-reliable, low-latency and secure communications is essential for realizing the industrial Internet of Things (IIoT). Non-coherent massive multiple-input multiple-output (MIMO) is one of the promising techniques to fulfill ultra-reliable and low-latency requirements. In addition, physical layer authentication (PLA) technology is particularly suitable for secure IIoT communications thanks to its low-latency attribute. A PLA method for non-coherent massive single-input multiple-output (SIMO) IIoT communication systems is proposed in this paper. This method realizes PLA by embedding an authentication signal (tag) into a message signal, referred to as "message-based tag embedding". It is different from traditional PLA methods utilizing uniform power tags. We design the optimal tag embedding and optimize the power allocation between the message and tag signals to characterize the trade-off between the message and tag error performance. Numerical results show that the proposed message-based tag embedding PLA method is more accurate than the traditional uniform tag embedding method which has an unavoidable tag error floor close to 10%.

Submitted 23 May, 2020; originally announced May 2020.

Comments: arXiv admin note: substantial text overlap with arXiv:2001.07315

arXiv:2005.06986 [pdf]

doi 10.1109/ACCESS.2020.2991075

Power Cyber-Physical System Risk Area Prediction Using Dependent Markov Chain and Improved Grey Wolf Optimization

Authors: Zhaoyang Qu, Qianhui Xie, Yuqing Liu, Yang Li, Lei Wang, Pengcheng Xu, Yuguang Zhou, Jian Sun, Kai Xue, Mingshi Cui

Abstract: Existing power cyber-physical system (CPS) risk prediction results are inaccurate as they fail to reflect the actual physical characteristics of the components and the specific operational status. A new method based on dependent Markov chain for power CPS risk area prediction is proposed in this paper. The load and constraints of the non-uniform power CPS coupling network are first characterized, and can be utilized as a node state judgment standard. Considering the component node isomerism and interdependence between the coupled networks, a power CPS risk regional prediction model based on dependent Markov chain is then constructed. A cross-adaptive gray wolf optimization algorithm improved by adaptive position adjustment strategy and cross-optimal solution strategy is subsequently developed to optimize the prediction model. Simulation results using the IEEE 39-BA 110 test system verify the effectiveness and superiority of the proposed method.

Submitted 29 April, 2020; originally announced May 2020.

Comments: Accepted by IEEE Access

Journal ref: IEEE Access 8 (2020) 82844-82854

arXiv:2005.01206 [pdf, other]

TIMELY: Pushing Data Movements and Interfaces in PIM Accelerators Towards Local and in Time Domain

Authors: Weitao Li, Pengfei Xu, Yang Zhao, Haitong Li, Yuan Xie, Yingyan Lin

Abstract: Resistive-random-access-memory (ReRAM) based processing-in-memory (R$^2$PIM) accelerators show promise in bridging the gap between Internet of Things devices' constrained resources and Convolutional/Deep Neural Networks' (CNNs/DNNs') prohibitive energy cost. Specifically, R$^2$PIM accelerators enhance energy efficiency by eliminating the cost of weight movements and improving the computational density through ReRAM's high density. However, the energy efficiency is still limited by the dominant energy cost of input and partial sum (Psum) movements and the cost of digital-to-analog (D/A) and analog-to-digital (A/D) interfaces. In this work, we identify three energy-saving opportunities in R$^2$PIM accelerators: analog data locality, time-domain interfacing, and input access reduction, and propose an innovative R$^2$PIM accelerator called TIMELY, with three key contributions: (1) TIMELY adopts analog local buffers (ALBs) within ReRAM crossbars to greatly enhance the data locality, minimizing the energy overheads of both input and Psum movements; (2) TIMELY largely reduces the energy of each single D/A (and A/D) conversion and the total number of conversions by using time-domain interfaces (TDIs) and the employed ALBs, respectively; (3) we develop an only-once input read (O$^2$IR) mapping method to further decrease the energy of input accesses and the number of D/A conversions. The evaluation with more than 10 CNN/DNN models and various chip configurations shows that TIMELY outperforms the baseline R$^2$PIM accelerator, PRIME, by one order of magnitude in energy efficiency while maintaining better computational density (up to 31.2$\times$) and throughput (up to 736.6$\times$). Furthermore, comprehensive studies are performed to evaluate the effectiveness of the proposed ALB, TDI, and O$^2$IR innovations in terms of energy savings and area reduction.

Submitted 3 May, 2020; originally announced May 2020.

Comments: Accepted by 47th International Symposium on Computer Architecture (ISCA'2020)

arXiv:2004.14228 [pdf, other]

Meta-Transfer Learning for Code-Switched Speech Recognition

Authors: Genta Indra Winata, Samuel Cahyawijaya, Zhaojiang Lin, Zihan Liu, Peng Xu, Pascale Fung

Abstract: An increasing number of people in the world today speak a mixed-language as a result of being multilingual. However, building a speech recognition system for code-switching remains difficult due to the availability of limited resources and the expense and significant effort required to collect mixed-language data. We therefore propose a new learning method, meta-transfer learning, to transfer learn on a code-switched speech recognition system in a low-resource setting by judiciously extracting information from high-resource monolingual datasets. Our model learns to recognize individual languages, and transfer them so as to better recognize mixed-language speech by conditioning the optimization on the code-switching data. Based on experimental results, our model outperforms existing baselines on speech recognition and language modeling tasks, and is faster to converge.

Submitted 29 April, 2020; originally announced April 2020.

Comments: Accepted in ACL 2020. The first two authors contributed equally to this work

arXiv:2003.07000 [pdf, other]

TRANS-BLSTM: Transformer with Bidirectional LSTM for Language Understanding

Authors: Zhiheng Huang, Peng Xu, Davis Liang, Ajay Mishra, Bing Xiang

Abstract: Bidirectional Encoder Representations from Transformers (BERT) has recently achieved state-of-the-art performance on a broad range of NLP tasks including sentence classification, machine translation, and question answering. The BERT model architecture is derived primarily from the transformer. Prior to the transformer era, bidirectional Long Short-Term Memory (BLSTM) was the dominant modeling architecture for neural machine translation and question answering. In this paper, we investigate how these two modeling techniques can be combined to create a more powerful model architecture. We propose a new architecture denoted as Transformer with BLSTM (TRANS-BLSTM) which has a BLSTM layer integrated into each transformer block, leading to a joint modeling framework for transformer and BLSTM. We show that TRANS-BLSTM models consistently lead to improvements in accuracy compared to BERT baselines in GLUE and SQuAD 1.1 experiments. Our TRANS-BLSTM model obtains an F1 score of 94.01% on the SQuAD 1.1 development dataset, which is comparable to the state-of-the-art result.

Submitted 15 March, 2020; originally announced March 2020.

arXiv:2003.01901 [pdf, other]

Learning Fast Adaptation on Cross-Accented Speech Recognition

Authors: Genta Indra Winata, Samuel Cahyawijaya, Zihan Liu, Zhaojiang Lin, Andrea Madotto, Peng Xu, Pascale Fung

Abstract: Local dialects influence people to pronounce words of the same language differently from each other. The great variability and complex characteristics of accents create a major challenge for training a robust and accent-agnostic automatic speech recognition (ASR) system. In this paper, we introduce a cross-accented English speech recognition task as a benchmark for measuring the ability of the model to adapt to unseen accents using the existing CommonVoice corpus. We also propose an accent-agnostic approach that extends the model-agnostic meta-learning (MAML) algorithm for fast adaptation to unseen accents. Our approach significantly outperforms joint training in zero-shot, few-shot, and all-shot evaluation in the mixed-region and cross-region settings in terms of word error rate.

Submitted 4 March, 2020; originally announced March 2020.

Comments: The first three authors contributed equally to this work

arXiv:2002.11270 [pdf, other]

DNN-Chip Predictor: An Analytical Performance Predictor for DNN Accelerators with Various Dataflows and Hardware Architectures

Authors: Yang Zhao, Chaojian Li, Yue Wang, Pengfei Xu, Yongan Zhang, Yingyan Lin

Abstract: The recent breakthroughs in deep neural networks (DNNs) have spurred a tremendously increased demand for DNN accelerators. However, designing DNN accelerators is non-trivial as it often takes months/years and requires cross-disciplinary knowledge. To enable fast and effective DNN accelerator development, we propose DNN-Chip Predictor, an analytical performance predictor which can accurately predict DNN accelerators' energy, throughput, and latency prior to their actual implementation. Our Predictor features two highlights: (1) its analytical performance formulation of DNN ASIC/FPGA accelerators facilitates fast design space exploration and optimization; and (2) it supports DNN accelerators with different algorithm-to-hardware mapping methods (i.e., dataflows) and hardware architectures. Experimental results based on 2 DNN models and 3 different ASIC/FPGA implementations show that our DNN-Chip Predictor's predicted performance differs from chip measurements of FPGA/ASIC implementations by no more than 17.66% when using different DNN models, hardware architectures, and dataflows. We will release code upon acceptance.

Submitted 15 April, 2021; v1 submitted 25 February, 2020; originally announced February 2020.

Comments: Accepted by 45th International Conference on Acoustics, Speech, and Signal Processing (ICASSP'2020)

arXiv:2002.10804 [pdf]

Hierarchical emotion-recognition framework based on discriminative brain neural network topology and ensemble co-decision strategy

Authors: Cunbo Li, Peiyang Li, Yangsong Zhang, Ning Li, Yajing Si, Fali Li, Dezhong Yao, Peng Xu

Abstract: Brain neural networks characterize various information propagation patterns for different emotional states. However, the statistical features based on traditional graph theory may ignore the spatial network difference. To reveal these inherent spatial features and increase the stability of emotion recognition, we proposed a hierarchical framework that can perform multiple emotion recognition with the multiple emotion-related spatial network topology patterns (MESNP) by combining supervised learning with an ensemble co-decision strategy. To evaluate the performance of our proposed MESNP approach, we conduct both off-line and simulated on-line experiments with two public datasets, i.e., MAHNOB and DEAP. The experimental results demonstrated that MESNP can significantly enhance the classification performance for the multiple emotions. The highest accuracies of off-line experiments for MAHNOB-HCI and DEAP reached 99.93% (3 classes) and 83.66% (4 classes), respectively. For simulated on-line experiments, we also obtained the best classification accuracies with 100% (3 classes) for MAHNOB and 99.22% (4 classes) for DEAP by the proposed MESNP. These results further proved the efficiency of MESNP for structured feature extraction in multi-classification emotional tasks.

Submitted 25 February, 2020; originally announced February 2020.

arXiv:2001.11337 [pdf, other]

EEG-based Brain-Computer Interfaces (BCIs): A Survey of Recent Studies on Signal Sensing Technologies and Computational Intelligence Approaches and their Applications

Authors: Xiaotong Gu, Zehong Cao, Alireza Jolfaei, Peng Xu, Dongrui Wu, Tzyy-Ping Jung, Chin-Teng Lin

Abstract: Brain-Computer Interface (BCI) is a powerful communication tool between users and systems, which enhances the capability of the human brain in communicating and interacting with the environment directly. Advances in neuroscience and computer science in the past decades have led to exciting developments in BCI, thereby making BCI a top interdisciplinary research area in computational neuroscience and intelligence. Recent technological advances such as wearable sensing devices, real-time data streaming, machine learning, and deep learning approaches have increased interest in electroencephalographic (EEG) based BCI for translational and healthcare applications. Many people benefit from EEG-based BCIs, which facilitate continuous monitoring of fluctuations in cognitive states under monotonous tasks in the workplace or at home. In this study, we survey the recent literature on EEG signal sensing technologies and computational intelligence approaches in BCI applications, compensating for the gaps in the systematic summaries of the past five years (2015-2019). Specifically, we first review the current status of BCI and its significant obstacles. Then, we present advanced signal sensing and enhancement technologies to collect and clean EEG signals, respectively. Furthermore, we demonstrate state-of-the-art computational intelligence techniques, including interpretable fuzzy models, transfer learning, deep learning, and combinations, to monitor, maintain, or track human cognitive states and operating performance in prevalent applications.
Finally, we deliver a couple of innovative BCI-inspired healthcare applications and discuss some future research directions in EEG-based BCIs.

Submitted 28 January, 2020; originally announced January 2020.

Comments: Submitting to IEEE/ACM Transactions on Computational Biology and Bioinformatics

arXiv:2001.07315 [pdf, ps, other]

Physical Layer Authentication for Non-coherent Massive SIMO-Based Industrial IoT Communications

Authors: Zhifang Gu, He Chen, Pingping Xu, Yonghui Li, Branka Vucetic

Abstract: Achieving ultra-reliable, low-latency and secure communications is essential for realizing the industrial Internet of Things (IIoT). Non-coherent massive multiple-input multiple-output (MIMO) has recently been proposed as a promising methodology to fulfill ultra-reliable and low-latency requirements. In addition, physical layer authentication (PLA) technology is particularly suitable for IIoT communications thanks to its low-latency attribute. A PLA method for non-coherent massive single-input multiple-output (SIMO) IIoT communication systems is proposed in this paper. Specifically, we first determine the optimal embedding of the authentication information (tag) in the message information. We then optimize the power allocation between message and tag signal to characterize the trade-off between message and tag error performance. Numerical results show that the proposed PLA is more accurate than traditional methods adopting the uniform tag when the communication reliability remains at the same level. The proposed PLA method can be effectively applied to the non-coherent system.

Submitted 20 January, 2020; originally announced January 2020.

arXiv:2001.03535 [pdf, other]

doi 10.1145/3373087.3375306

AutoDNNchip: An Automated DNN Chip Predictor and Builder for Both FPGAs and ASICs

Authors: Pengfei Xu, Xiaofan Zhang, Cong Hao, Yang Zhao, Yongan Zhang, Yue Wang, Chaojian Li, Zetong Guan, Deming Chen, Yingyan Lin

Abstract: Recent breakthroughs in Deep Neural Networks (DNNs) have fueled a growing demand for DNN chips. However, designing DNN chips is non-trivial because: (1) mainstream DNNs have millions of parameters and operations; (2) the design space is large due to the numerous design choices of dataflows, processing elements, memory hierarchy, etc.; and (3) an algorithm/hardware co-design is needed to allow the same DNN functionality to have a different decomposition, which would require different hardware IPs to meet the application specifications. Therefore, DNN chips take a long time to design and require cross-disciplinary experts. To enable fast and effective DNN chip design, we propose AutoDNNchip - a DNN chip generator that can automatically generate both FPGA- and ASIC-based DNN chip implementations given DNNs from machine learning frameworks (e.g., PyTorch) for a designated application and dataset. Specifically, AutoDNNchip consists of two integrated enablers: (1) a Chip Predictor, built on top of a graph-based accelerator representation, which can accurately and efficiently predict a DNN accelerator's energy, throughput, and area based on the DNN model parameters, hardware configuration, technology-based IPs, and platform constraints; and (2) a Chip Builder, which can automatically explore the design space of DNN chips (including IP selection, block configuration, resource balancing, etc.), optimize chip design via the Chip Predictor, and then generate optimized synthesizable RTL to achieve the target design metrics.
Experimental results show that our Chip Predictor's predicted performance differs from real-measured ones by < 10% when validated using 15 DNN models and 4 platforms (edge-FPGA/TPU/GPU and ASIC). Furthermore, accelerators generated by our AutoDNNchip can achieve better (up to 3.86X improvement) performance than that of expert-crafted state-of-the-art accelerators.

Submitted 10 June, 2020; v1 submitted 6 January, 2020; originally announced January 2020.

Comments: Accepted by 28th ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA'2020)

MSC Class: 68T45 (Primary); 68M20 (Secondary) ACM Class: C.5.0; C.3

arXiv:1911.04386 [pdf, other]

doi 10.1016/j.compchemeng.2020.106991

Fault Detection and Identification using Bayesian Recurrent Neural Networks

Authors: Weike Sun, Antonio R. C. Paiva, Peng Xu, Anantha Sundaram, Richard D. Braatz

Abstract: In processing and manufacturing industries, there has been a large push to produce higher quality products and ensure maximum efficiency of processes. This requires approaches to effectively detect and resolve disturbances to ensure optimal operations. While the control system can compensate for many types of disturbances, there are changes to the process which it still cannot handle adequately. It is therefore important to further develop monitoring systems to effectively detect and identify those faults such that they can be quickly resolved by operators. In this paper, a novel probabilistic fault detection and identification method is proposed which adopts a newly developed deep learning approach using Bayesian recurrent neural networks (BRNNs) with variational dropout. The BRNN model is general and can model complex nonlinear dynamics. Moreover, compared to traditional statistic-based data-driven fault detection and identification methods, the proposed BRNN-based method yields uncertainty estimates which allow for simultaneous fault detection of chemical processes, direct fault identification, and fault propagation analysis. The outstanding performance of this method is demonstrated and contrasted to (dynamic) principal component analysis, which is widely applied in the industry, in the benchmark Tennessee Eastman process (TEP) and a real chemical manufacturing dataset.

Submitted 26 June, 2020; v1 submitted 11 November, 2019; originally announced November 2019.

Comments: 43 pages, 23 figures. Accepted for publication in Computers & Chemical Engineering

MSC Class: 68T05

arXiv:1910.12181 [pdf, other]

Multi-source Domain Adaptation for Semantic Segmentation

Authors: Sicheng Zhao, Bo Li, Xiangyu Yue, Yang Gu, Pengfei Xu, Runbo Hu, Hua Chai, Kurt Keutzer

Abstract: Simulation-to-real domain adaptation for semantic segmentation has been actively studied for various applications such as autonomous driving. Existing methods mainly focus on a single-source setting, which cannot easily handle a more practical scenario of multiple sources with different distributions. In this paper, we propose to investigate multi-source domain adaptation for semantic segmentation. Specifically, we design a novel framework, termed Multi-source Adversarial Domain Aggregation Network (MADAN), which can be trained in an end-to-end manner. First, we generate an adapted domain for each source with dynamic semantic consistency while aligning at the pixel-level cycle-consistently towards the target. Second, we propose sub-domain aggregation discriminator and cross-domain cycle discriminator to make different adapted domains more closely aggregated. Finally, feature-level alignment is performed between the aggregated domain and target domain while training the segmentation network. Extensive experiments from synthetic GTA and SYNTHIA to real Cityscapes and BDDS datasets demonstrate that the proposed MADAN model outperforms state-of-the-art approaches. Our source code is released at: https://github.com/Luodian/MADAN.

Submitted 27 October, 2019; originally announced October 2019.

Comments: Accepted by NeurIPS 2019

arXiv:1907.03432 [pdf]

Blind source separation using Fast-ICA with a novel nonlinear function

Authors: Pengfei Xu, Yinjie Jia, Zhijian Wang

Abstract: Blind source separation (BSS) is a hotspot in signal processing, and independent component analysis (ICA) is a very effective tool for solving the BSS problem. In order to improve the performance of the separation, a new nonlinear function, sin, is introduced. It can replace the commonly used classical functions (tanh, gauss and pow3) and does not need to select different nonlinear functions according to the Gauss property of signals. Two Matlab simulation results show that the improved Fast-ICA algorithm with the proposed nonlinearity can not only improve the separation accuracy but also speed up the convergence of blind source separation.

Submitted 8 July, 2019; originally announced July 2019.

Comments: 5 pages, 2 figures

arXiv:1906.02649 [pdf, ps, other]

A Class of Distributed Event-Triggered Average Consensus Algorithms for Multi-Agent Systems

Authors: Ping Xu, Cameron Nowzari, Zhi Tian

Abstract: This paper proposes a class of distributed event-triggered algorithms that solve the average consensus problem in multi-agent systems. By designing events such that a specifically chosen Lyapunov function is monotonically decreasing, event-triggered algorithms succeed in reducing communications among agents while still ensuring that the entire system converges to the desired state. However, depending on the chosen Lyapunov function the transient behaviors can be very different. Moreover, performance requirements also vary from application to application. Consequently, we are instead interested in considering a class of Lyapunov functions such that each Lyapunov function produces a different event-triggered coordination algorithm to solve the multi-agent average consensus problem. The proposed class of algorithms all guarantee exponential convergence of the resulting system and exclusion of Zeno behaviors. This allows us to easily implement different algorithms that all guarantee correctness to meet varying performance needs. We show that our findings can be applied to the practical clock synchronization problem in wireless sensor networks (WSNs) and further corroborate their effectiveness with simulation results.

Submitted 23 November, 2019; v1 submitted 6 June, 2019; originally announced June 2019.

Comments: 23 pages, 11 figures

arXiv:1905.03032 [pdf]

Blind separation of rotor vibration signals in high-noise environments

Authors: Pengfei Xu, Yinjie Jia, Zhijian Wang

Abstract: During the operation of the engine rotor, the vibration signal measured by the sensor is a mixture of the signals from each vibration source, and contains strong noise at the same time. In this paper, a new separation method for mixed vibration signals in a strong-noise environment (SNR=-5) is proposed. Firstly, the time-delay auto-correlation de-noising method is used to de-noise the mixed signals, and then a common blind separation algorithm (the MSNR algorithm is used here) is used to separate the mixed vibration signals, which improves the separation performance. The simulation results verify the validity of the method. The proposed method provides a new idea for health monitoring and fault diagnosis of engine rotor vibration signals.

Submitted 8 May, 2019; originally announced May 2019.

Comments: 8 pages, 4 figures

arXiv:1903.02664 [pdf]

Blind Source Separation of Optical Wireless Communications in Noisy Environments

Authors: Pengfei Xu, Yinjie Jia, Zhijian Wang

Abstract: Blind source separation is a research hotspot in the field of signal processing because it aims to separate unknown source signals from observed mixtures through an unknown transmission channel. A low-computational-complexity blind separation algorithm for instantaneous linear mixtures was introduced and improved. The algorithm has only one variable parameter, the length of the moving average, which has a significant impact on the separation effect. This paper gives some suggestions on a reasonable value through experiments. The algorithm is extended to the separation of visible light communication signals in different noise environments, and has achieved certain results, which further expands the applicability of the algorithm.

Submitted 6 March, 2019; originally announced March 2019.

Comments: 7 pages, 4 figures

arXiv:1812.10227 [pdf, ps, other]

Hierarchical feature fusion framework for frequency recognition in SSVEP-based BCIs

Authors: Yangsong Zhang, Erwei Yin, Fali Li, Yu Zhang, Daqing Guo, Dezhong Yao, Peng Xu

Abstract: Effective frequency recognition algorithms are critical in steady-state visual evoked potential (SSVEP) based brain-computer interfaces (BCIs). In this study, we present a hierarchical feature fusion framework which can be used to design high-performance frequency recognition methods. The proposed framework includes two primary techniques for fusing features: spatial dimension fusion (SD) and frequency dimension fusion (FD). Both SD and FD fusions are obtained using a weighted strategy with a nonlinear function. To assess our novel methods, we used the correlated component analysis (CORRCA) method to investigate the efficiency and effectiveness of the proposed framework. Experimental results were obtained from a benchmark dataset of thirty-five subjects and indicate that the extended CORRCA method used within the framework significantly outperforms the original CORRCA method. Accordingly, the proposed framework holds promise to enhance the performance of frequency recognition methods in SSVEP-based BCIs.

Submitted 21 March, 2019; v1 submitted 26 December, 2018; originally announced December 2018.

Comments: 25 pages, 9 figures

arXiv:1809.06676 [pdf]

doi 10.1109/TCDS.2020.2965135

Reconfiguration of Brain Network between Resting-state and Oddball Paradigm

Authors: Fali Li, Chanlin Yi, Yuanyuan Liao, Yuanling Jiang, Yajing Si, Limeng Song, Tao Zhang, Dezhong Yao, Yangsong Zhang, Zehong Cao, Peng Xu

Abstract: The oddball paradigm is widely applied to the investigation of multiple cognitive functions. Prior studies have explored how the cortical oscillation and power spectrum differ from the resting-state condition to the oddball paradigm, but whether the brain networks differ significantly is still unclear. Our study addressed how the brain reconfigures its architecture from a resting-state condition (i.e., baseline) to the P300 stimulus task in the visual oddball paradigm. In this study, electroencephalogram (EEG) datasets were collected from 24 postgraduate students, who were required only to mentally count the number of target stimuli; afterwards, the functional EEG networks constructed in different frequency bands were compared between baseline and oddball task conditions to evaluate the reconfiguration of the functional network in the brain. Compared to the baseline, our results showed significantly (p < 0.05) enhanced delta/theta EEG connectivity and a decreased alpha default mode network in the process of brain reconfiguration to the P300 task. Furthermore, the reconfigured coupling strengths were demonstrated to relate to P300 amplitudes, and were then used as input features to train a classifier to differentiate the high and low P300 amplitude groups with an accuracy of 77.78%.
The findings of our study help us to understand the changes of functional brain connectivity from resting-state to oddball stimulus task, and the reconfigured network pattern has the potential for the selection of good subjects for P300-based brain-computer interfaces.

Submitted 18 September, 2018; originally announced September 2018.

Comments: This manuscript is submitting to IEEE Transactions on Cognitive and Developmental Systems

arXiv:1808.06553 [pdf]

Sliding Z Transform: Applications to convolutive blind source separation

Authors: Peng-fei Xu, Yin-jie Jia, Zhi-jian Wang

Abstract: The Z Transform is a mathematical operation in signal processing, which gives a tractable way to solve linear, constant-coefficient difference equations. Based on the classical Z transform and inspired by the idea of the sliding DFT, a new definition of the Sliding Z Transform (SZT) is introduced and derived. This method is then applied to blind source separation; four simulation results are presented to demonstrate its performance when the sliding window WIN is set. It can directly recover time-domain sources from the convolutive mixtures with the help of robust linear mixed blind separation algorithms (such as JADE). It has a simple principle and good portability and can be widely applied in various fields of digital signal processing.

Submitted 8 August, 2018; originally announced August 2018.

Comments: 4 pages, 8 figures

Showing 1–49 of 49 results for author: Xu, P