Showing 1–43 of 43 results for author: Gong, C

Searching in archive eess. Search in all archives.
  1. Prototype Learning Guided Hybrid Network for Breast Tumor Segmentation in DCE-MRI

    Authors: Lei Zhou, Yuzhong Zhang, Jiadong Zhang, Xuejun Qian, Chen Gong, Kun Sun, Zhongxiang Ding, Xing Wang, Zhenhui Li, Zaiyi Liu, Dinggang Shen

    Abstract: Automated breast tumor segmentation on the basis of dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) has shown great promise in clinical practice, particularly for identifying the presence of breast disease. However, accurate segmentation of breast tumors is a challenging task, often necessitating the development of complex networks. To strike an optimal trade-off between computati…

    Submitted 11 August, 2024; originally announced August 2024.

    Journal ref: IEEE Transactions on Medical Imaging, 2024

  2. arXiv:2408.05758  [pdf, other]

    eess.AS cs.AI cs.CL cs.SD

    VQ-CTAP: Cross-Modal Fine-Grained Sequence Representation Learning for Speech Processing

    Authors: Chunyu Qiang, Wang Geng, Yi Zhao, Ruibo Fu, Tao Wang, Cheng Gong, Tianrui Wang, Qiuyu Liu, Jiangyan Yi, Zhengqi Wen, Chen Zhang, Hao Che, Longbiao Wang, Jianwu Dang, Jianhua Tao

    Abstract: Deep learning has brought significant improvements to the field of cross-modal representation learning. For tasks such as text-to-speech (TTS), voice conversion (VC), and automatic speech recognition (ASR), a cross-modal fine-grained (frame-level) sequence representation is desired, emphasizing the semantic content of the text modality while de-emphasizing the paralinguistic information of the spe…

    Submitted 11 August, 2024; originally announced August 2024.

  3. arXiv:2406.08911  [pdf, other]

    cs.CL eess.AS

    An Initial Investigation of Language Adaptation for TTS Systems under Low-resource Scenarios

    Authors: Cheng Gong, Erica Cooper, Xin Wang, Chunyu Qiang, Mengzhe Geng, Dan Wells, Longbiao Wang, Jianwu Dang, Marc Tessier, Aidan Pine, Korin Richmond, Junichi Yamagishi

    Abstract: Self-supervised learning (SSL) representations from massively multilingual models offer a promising solution for low-resource language speech tasks. Despite advancements, language adaptation in TTS systems remains an open problem. This paper explores the language adaptation capability of ZMM-TTS, a recent SSL-based multilingual TTS system proposed in our previous work. We conducted experiments on…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: Accepted to Interspeech 2024

  4. arXiv:2405.18739  [pdf, other]

    cs.NI eess.SP

    FlocOff: Data Heterogeneity Resilient Federated Learning with Communication-Efficient Edge Offloading

    Authors: Mulei Ma, Chenyu Gong, Liekang Zeng, Yang Yang, Liantao Wu

    Abstract: Federated Learning (FL) has emerged as a fundamental learning paradigm to harness massive data scattered at geo-distributed edge devices in a privacy-preserving way. Given the heterogeneous deployment of edge devices, however, their data are usually Non-IID, introducing significant challenges to FL including degraded training accuracy, intensive communication costs, and high computing complexity.…

    Submitted 28 May, 2024; originally announced May 2024.

  5. arXiv:2402.19013  [pdf, other]

    eess.SY

    Ultraviolet Positioning via TDOA: Error Analysis and System Prototype

    Authors: Shihui Yu, Chubing Lv, Yueke Yang, Yuchen Pan, Lei Sun, Juliang Cao, Ruihang Yu, Chen Gong, Wenqi Wu, Zhengyuan Xu

    Abstract: This work performs the design, real-time hardware realization, and experimental evaluation of a positioning system by ultraviolet (UV) communication under photon-level signal detection. The positioning is based on the time-difference-of-arrival (TDOA) principle. Time division-based transmission of a synchronization sequence from three transmitters with known positions is applied. We investigate the pos…

    Submitted 14 April, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

  6. arXiv:2312.15195  [pdf, other]

    cs.AI cs.LG eess.SY

    Mutual Information as Intrinsic Reward of Reinforcement Learning Agents for On-demand Ride Pooling

    Authors: Xianjie Zhang, Jiahao Sun, Chen Gong, Kai Wang, Yifei Cao, Hao Chen, Hao Chen, Yu Liu

    Abstract: The emergence of on-demand ride pooling services allows each vehicle to serve multiple passengers at a time, thus increasing drivers' income and enabling passengers to travel at lower prices than taxi/car on-demand services (only one passenger can be assigned to a car at a time like UberX and Lyft). Although on-demand ride pooling services can bring so many benefits, ride pooling services need a w…

    Submitted 7 January, 2024; v1 submitted 23 December, 2023; originally announced December 2023.

    Comments: Accepted by AAMAS 2024

  7. arXiv:2312.14398  [pdf, other]

    cs.SD eess.AS

    ZMM-TTS: Zero-shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-supervised Discrete Speech Representations

    Authors: Cheng Gong, Xin Wang, Erica Cooper, Dan Wells, Longbiao Wang, Jianwu Dang, Korin Richmond, Junichi Yamagishi

    Abstract: Neural text-to-speech (TTS) has achieved human-like synthetic speech for single-speaker, single-language synthesis. Multilingual TTS systems are limited to resource-rich languages due to the lack of large paired text and studio-quality audio data. TTS systems are typically built using a single speaker's voices, but there is growing interest in developing systems that can synthesize voices for new…

    Submitted 26 August, 2024; v1 submitted 21 December, 2023; originally announced December 2023.

    Comments: Accepted by IEEE/ACM TASLP, 16 pages plus 1 page of bio and photos

  8. arXiv:2312.03376  [pdf, other]

    eess.SY

    Beacon-enabled TDMA Ultraviolet Communication Network System Design and Realization

    Authors: Yuchen Pan, Fei Long, Ping Li, Haotian Shi, Jiazhao Shi, Hanlin Xiao, Chen Gong, Zhengyuan Xu

    Abstract: Non-line-of-sight (NLOS) ultraviolet (UV) scattering communication can serve as a good candidate for outdoor optical wireless communication (OWC) in the cases of non-perfect transmitter-receiver alignment and radio silence. We design and demonstrate an NLOS UV scattering communication network system in this paper, where a beacon-enabled time division multiple access (TDMA) scheme is adopted. In our…

    Submitted 15 April, 2024; v1 submitted 6 December, 2023; originally announced December 2023.

  9. arXiv:2306.17790  [pdf, other]

    eess.SP physics.atom-ph

    Theoretical Analysis of Heterodyne Rydberg Atomic Receiver Sensitivity Based on Transit Relaxation Effect and Frequency Detuning

    Authors: Shanchi Wu, Chen Gong, Shangbin Li, Rui Ni, Jinkang Zhu

    Abstract: We conduct a theoretical investigation into the impacts of local microwave electric field frequency detuning, laser frequency detuning, and transit relaxation rate on enhancing heterodyne Rydberg atomic receiver sensitivity. To optimize the output signal amplitude given the input microwave signal, we derive the steady-state solutions of the atomic density matrix. Numerical results show that laser…

    Submitted 30 June, 2023; originally announced June 2023.

    Comments: 9 pages, 9 figures, 19 references

  10. arXiv:2306.08823  [pdf, other]

    eess.SY

    Plug-in Hybrid Electric Vehicle Energy Management with Clutch Engagement Control via Continuous-Discrete Reinforcement Learning

    Authors: Changfu Gong, Jinming Xu, Yuan Lin

    Abstract: Energy management strategy (EMS) is a key technology for plug-in hybrid electric vehicles (PHEVs). The energy management of certain series-parallel PHEVs involves the control of continuous variables, such as engine torque, and discrete variables, such as clutch engagement/disengagement. We establish a control-oriented model for a series-parallel plug-in hybrid system with clutch engagement control…

    Submitted 2 March, 2024; v1 submitted 14 June, 2023; originally announced June 2023.

  11. arXiv:2304.12804  [pdf, other]

    cs.IT eess.SP

    Channel Estimation and Signal Detection for NLOS Ultraviolet Scattering Communication with Space Division Multiple Access

    Authors: Yubo Zhang, Yuchen Pan, Chen Gong, Beiyuan Liu, Zhengyuan Xu

    Abstract: We design a receiver assembling several photomultipliers (PMTs) as an array to increase the field of view (FOV) of the receiver and adapt to multiuser situations over non-line-of-sight (NLOS) ultraviolet (UV) channels. Channel estimation and signal detection have been investigated according to the space division characteristics of the structure. Firstly, we adopt the balanced structure on the pilo…

    Submitted 25 April, 2023; originally announced April 2023.

  12. arXiv:2303.01218  [pdf, ps, other]

    eess.SY

    Co-Optimization of Adaptive Cruise Control and Hybrid Electric Vehicle Energy Management via Model Predictive Mixed Integer Control

    Authors: Qitao Li, Changfu Gong, Yuan Lin

    Abstract: In this paper, a model predictive mixed integer control method for BYD Qin Plus DM-i (Dual Model intelligent) plug-in hybrid electric vehicle (PHEV) is proposed for co-optimization to reduce fuel consumption during car following. First, the adaptive cruise control (ACC) model for energy-saving driving is established. Then, a control-oriented energy management strategy (EMS) model considering the c…

    Submitted 24 April, 2023; v1 submitted 2 March, 2023; originally announced March 2023.

  13. arXiv:2302.03227  [pdf, other]

    cs.LG eess.SP q-bio.NC

    Automatic Sleep Stage Classification with Cross-modal Self-supervised Features from Deep Brain Signals

    Authors: Chen Gong, Yue Chen, Yanan Sui, Luming Li

    Abstract: The detection of human sleep stages is widely used in the diagnosis and intervention of neurological and psychiatric diseases. Some patients with implanted deep brain stimulators could have their neural activities recorded from the deep brain. Sleep stage classification based on deep brain recording has great potential to provide more precise treatment for patients. The accuracy and generalizabilit…

    Submitted 6 February, 2023; originally announced February 2023.

    Comments: 4 pages, 5 figures, 11th International IEEE EMBS Conference on Neural Engineering (NER)

  14. arXiv:2211.02903  [pdf, other]

    cs.SD eess.AS

    VISinger 2: High-Fidelity End-to-End Singing Voice Synthesis Enhanced by Digital Signal Processing Synthesizer

    Authors: Yongmao Zhang, Heyang Xue, Hanzhao Li, Lei Xie, Tingwei Guo, Ruixiong Zhang, Caixia Gong

    Abstract: The end-to-end singing voice synthesis (SVS) model VISinger can achieve better performance than the typical two-stage model with fewer parameters. However, VISinger has several problems: the text-to-phase problem, where the end-to-end model learns the meaningless mapping of text-to-phase; the glitches problem, where the harmonic components corresponding to the periodic signal of the voiced segment undergo a sudden change…

    Submitted 5 November, 2022; originally announced November 2022.

    Comments: Submitted to ICASSP 2023

  15. arXiv:2208.01559  [pdf, other]

    eess.SP

    The design and optimization of synchronization sequence for Ultraviolet communication

    Authors: Shihui Yu, Chen Gong, Zhengyuan Xu

    Abstract: In ultraviolet (UV) scattering communication, the received signals exhibit the characteristics of discrete photoelectrons due to path loss. The synchronization is based on the maximum Pulse Number-Sequence correlation problem. First of all, the accuracy of synchronization is vital to channel estimation and decoding. This article focuses on improving synchronization accuracy by designing and optimi…

    Submitted 2 August, 2022; originally announced August 2022.

  16. arXiv:2207.00583  [pdf, other]

    eess.IV cs.CV q-bio.NC

    Feature-selected Graph Spatial Attention Network for Addictive Brain-Networks Identification

    Authors: Changwei Gong, Changhong Jing, Junren Pan, Shuqiang Wang

    Abstract: Functional alterations in the relevant neural circuits occur from drug addiction over a certain period. And these significant alterations are also revealed by analyzing fMRI. However, because of fMRI's high dimensionality and poor signal-to-noise ratio, it is challenging to encode efficient and robust brain regional embeddings for both graph-level identification and region-level biomarkers detecti…

    Submitted 5 July, 2022; v1 submitted 29 June, 2022; originally announced July 2022.

  17. arXiv:2204.13863  [pdf, other]

    eess.SY

    Indoor 3-Dimensional Visible Light Positioning: Error Metric and LED Layout Optimization

    Authors: Jiaojiao Xu, Nuo Huang, Chen Gong

    Abstract: We consider 3-dimensional (3D) visible light positioning (VLP) based on smartphone camera in an indoor scenario. Based on the positioning model in the quantized pixel-domain, we characterize the 3D normalized positioning error metric (NPEM) through the partial derivative of the positioning function, and evaluate the NPEM for horizontal and non-horizontal receiver camera positions. Moreover, under…

    Submitted 28 April, 2022; originally announced April 2022.

  18. arXiv:2204.06086   

    eess.AS cs.SD

    A Post Auto-regressive GAN Vocoder Focused on Spectrum Fracture

    Authors: Zhenxing Lu, Mengnan He, Ruixiong Zhang, Caixia Gong

    Abstract: Generative adversarial networks (GANs) have shown their superiority in real-time speech synthesis. Nevertheless, most of them make use of deep convolutional layers as their backbone, which may cause the absence of previous signal information. However, the generation of speech signals invariably requires preceding waveform samples in its reconstruction, as the lack of this can…

    Submitted 16 February, 2023; v1 submitted 12 April, 2022; originally announced April 2022.

    Comments: Experimental parts should be improved

  19. arXiv:2111.07549  [pdf, other]

    cs.CL cs.SD eess.AS

    Improving Prosody for Unseen Texts in Speech Synthesis by Utilizing Linguistic Information and Noisy Data

    Authors: Zhu Li, Yuqing Zhang, Mengxi Nie, Ming Yan, Mengnan He, Ruixiong Zhang, Caixia Gong

    Abstract: Recent advancements in end-to-end speech synthesis have made it possible to generate highly natural speech. However, training these models typically requires a large amount of high-fidelity speech data, and for unseen texts, the prosody of synthesized speech is relatively unnatural. To address these issues, we propose to combine a fine-tuned BERT-based front-end with a pre-trained FastSpeech2-base…

    Submitted 15 November, 2021; originally announced November 2021.

  20. arXiv:2111.02921  [pdf, ps, other]

    cs.IT eess.SP

    Map-Assisted Constellation Design for mmWave WDM with OAM in Short-Range LOS Environment

    Authors: Yuan Wang, Chen Gong, Nuo Huang, Zhengyuan Xu

    Abstract: We consider a system that integrates positioning and single-user millimeter wave (mmWave) communication, where the communication part adopts wavelength division multiplexing (WDM) and orbital angular momentum (OAM). This paper addresses the multi-dimensional constellation design in short-range line-of-sight (LOS) environment, with stable communication links. We propose a map-assisted method to quan…

    Submitted 11 October, 2022; v1 submitted 4 November, 2021; originally announced November 2021.

  21. arXiv:2110.04451  [pdf, other]

    cs.SD cs.AI eess.AS

    Using multiple reference audios and style embedding constraints for speech synthesis

    Authors: Cheng Gong, Longbiao Wang, Zhenhua Ling, Ju Zhang, Jianwu Dang

    Abstract: The end-to-end speech synthesis model can directly take an utterance as reference audio, and generate speech from the text with prosody and speaker characteristics similar to the reference audio. However, an appropriate acoustic embedding must be manually selected during inference. Due to the fact that only the matched text and speech are used in the training process, using unmatched text and spee…

    Submitted 9 October, 2021; originally announced October 2021.

    Comments: 5 pages, 3 figures, submitted to ICASSP 2022

  22. arXiv:2109.07210  [pdf]

    cs.RO eess.SY

    Life-Long Multi-Task Learning of Adaptive Path Tracking Policy for Autonomous Vehicle

    Authors: Cheng Gong, Jianwei Gong, Chao Lu, Zhe Liu, Zirui Li

    Abstract: This paper proposes a life-long adaptive path tracking policy learning method for autonomous vehicles that can self-evolve and self-adapt with multi-task knowledge. Firstly, the proposed method can learn a model-free control policy for path tracking directly from the historical driving experience, where the property of vehicle dynamics and corresponding control strategy can be learned simultaneous…

    Submitted 15 September, 2021; originally announced September 2021.

  23. Elastic Significant Bit Quantization and Acceleration for Deep Neural Networks

    Authors: Cheng Gong, Ye Lu, Kunpeng Xie, Zongming Jin, Tao Li, Yanzhi Wang

    Abstract: Quantization has been proven to be a vital method for improving the inference efficiency of deep neural networks (DNNs). However, it is still challenging to strike a good balance between accuracy and efficiency while quantizing DNN weights or activation values from high-precision formats to their quantized counterparts. We propose a new method called elastic significant bit quantization (ESB) that…

    Submitted 17 November, 2021; v1 submitted 8 September, 2021; originally announced September 2021.

    Comments: 15 pages, 14 figures

    ACM Class: B.2.4.a; I.2.6.g; I.5.1.d; I.5.4.b

    Journal ref: IEEE Transactions on Parallel and Distributed Systems, 2021

  24. arXiv:2108.01831  [pdf, other]

    cs.SD eess.AS

    Information Sieve: Content Leakage Reduction in End-to-End Prosody For Expressive Speech Synthesis

    Authors: Xudong Dai, Cheng Gong, Longbiao Wang, Kaili Zhang

    Abstract: Expressive neural text-to-speech (TTS) systems incorporate a style encoder to learn a latent embedding as the style information. However, this embedding process may encode redundant textual information. This phenomenon is called content leakage. Researchers have attempted to resolve this problem by adding an ASR or other auxiliary supervision loss functions. In this study, we propose an unsupervis…

    Submitted 3 August, 2021; originally announced August 2021.

    Comments: Accepted by Interspeech 2021

  25. arXiv:2107.09889  [pdf, other]

    cs.SD cs.MM eess.AS

    Fine-Grained Music Plagiarism Detection: Revealing Plagiarists through Bipartite Graph Matching and a Comprehensive Large-Scale Dataset

    Authors: Wenxuan Liu, Tianyao He, Chen Gong, Ning Zhang, Hua Yang, Junchi Yan

    Abstract: Music plagiarism detection is gaining more and more attention due to the popularity of music production and society's emphasis on intellectual property. We aim to find fine-grained plagiarism in music pairs since conventional methods are coarse-grained and cannot match real-life scenarios. Considering that there is no sizeable dataset designed for the music plagiarism task, we establish a large-sc…

    Submitted 2 July, 2023; v1 submitted 21 July, 2021; originally announced July 2021.

  26. arXiv:2106.06237  [pdf, other]

    eess.IV cs.CV cs.LG

    KRADA: Known-region-aware Domain Alignment for Open-set Domain Adaptation in Semantic Segmentation

    Authors: Chenhong Zhou, Feng Liu, Chen Gong, Rongfei Zeng, Tongliang Liu, William K. Cheung, Bo Han

    Abstract: In semantic segmentation, we aim to train a pixel-level classifier to assign category labels to all pixels in an image, where labeled training images and unlabeled test images are from the same distribution and share the same label set. However, in an open world, the unlabeled test images probably contain unknown categories and have different distributions from the labeled images. Hence, in this p…

    Submitted 19 February, 2023; v1 submitted 11 June, 2021; originally announced June 2021.

    Comments: 18 pages

    Journal ref: Transactions on Machine Learning Research, 2023

  27. arXiv:2105.14704  [pdf, other]

    eess.AS cs.CL cs.SD

    Parkinsonian Chinese Speech Analysis towards Automatic Classification of Parkinson's Disease

    Authors: Hao Fang, Chen Gong, Chen Zhang, Yanan Sui, Luming Li

    Abstract: Speech disorders often occur at the early stage of Parkinson's disease (PD). The speech impairments could be indicators of the disorder for early diagnosis, while motor symptoms are not obvious. In this study, we constructed a new speech corpus of Mandarin Chinese and addressed classification of patients with PD. We implemented classical machine learning methods with ranking algorithms for feature…

    Submitted 31 May, 2021; originally announced May 2021.

    Comments: 12 pages, 5 figures, proceedings of the Machine Learning for Health NeurIPS Workshop, PMLR 136:114-125, 2020

  28. arXiv:2101.03548  [pdf, ps, other]

    eess.SP

    Channel Modeling and Signal Processing for Array-based Visible Light Communication System in Misalignment

    Authors: Jiaqi Wei, Chen Gong, Nuo Huang, Zhengyuan Xu

    Abstract: This paper proposes an indoor visible light communication (VLC) system with multiple transmitters and receivers. Due to the diffusivity of LED light beams, photodiodes receive signals from many directions. We use one concave and one convex lens as optical antenna, and obtain the optimal lens structure by optimizing which corresponds to the minimum condition number of channel gain matrix. In this way th…

    Submitted 10 January, 2021; originally announced January 2021.

  29. arXiv:2010.09275  [pdf, other]

    eess.AS

    DiDiSpeech: A Large Scale Mandarin Speech Corpus

    Authors: Tingwei Guo, Cheng Wen, Dongwei Jiang, Ne Luo, Ruixiong Zhang, Shuaijiang Zhao, Wubo Li, Cheng Gong, Wei Zou, Kun Han, Xiangang Li

    Abstract: This paper introduces a new open-sourced Mandarin speech corpus, called DiDiSpeech. It consists of about 800 hours of speech data at 48kHz sampling rate from 6000 speakers and the corresponding texts. All speech data in the corpus is recorded in a quiet environment and is suitable for various speech processing tasks, such as voice conversion, multi-speaker text-to-speech and automatic speech recogni…

    Submitted 8 February, 2021; v1 submitted 19 October, 2020; originally announced October 2020.

    Comments: 5 pages, 2 figures, 11 tables

  30. arXiv:2006.14497  [pdf, other]

    eess.SP

    Quantumized Microwave Detection Based on $Λ$-Type Three-level Superconducting System: HMM Modeling and Performance Prediction

    Authors: Junyu Zhang, Chen Gong, Shangbin Li, Shanchi Wu, Rui Ni, Chengjie Zuo, Jinkang Zhu, Ming Zhao, Zhengyuan Xu

    Abstract: We adopt an artificial $Λ$-type three-level system with superconducting devices for microwave signal detection, where the signal intensity reaches the level of discrete photons instead of continuous waveform. Based on the state transition principles of the three-level system, we propose a statistical model for microwave signal detection. Moreover, we investigate the achievable transmission rate and s…

    Submitted 27 August, 2021; v1 submitted 25 June, 2020; originally announced June 2020.

    Comments: 12 pages, 18 figures

  31. arXiv:2006.14471  [pdf, other]

    eess.SP

    Wireless Communication Based on Microwave Photon-Level Detection With Superconducting Devices: Achievable Rate Prediction

    Authors: Junyu Zhang, Chen Gong, Shangbin Li, Rui Ni, Chengjie Zuo, Jinkang Zhu, Ming Zhao, Zhengyuan Xu

    Abstract: Future wireless communication systems will embrace physical-layer signal detection with high sensitivity, especially at the microwave photon level. Currently, receivers primarily adopt signal detection based on semiconductor devices, while this paper introduces high-sensitivity photon-level microwave detection based on superconducting structure. We first overview existing…

    Submitted 25 June, 2020; originally announced June 2020.

    Comments: 9 pages, 13 figures

  32. arXiv:2003.12933  [pdf, other]

    eess.SP

    Weak Radio Frequency Signal Detection Based on Piezo-Opto-Electro-Mechanical System: Architecture Design and Sensitivity Prediction

    Authors: Shanchi Wu, Chen Gong, Chengjie Zuo, Shangbin Li, Junyu Zhang, Zhongbin Dai, Kai Yang, Ming Zhao, Rui Ni, Zhengyuan Xu, Jinkang Zhu

    Abstract: We propose a novel radio-frequency (RF) receiving architecture based on micro-electro-mechanical system (MEMS) and optical coherent detection module. The architecture converts the received electrical signal into mechanical vibration through the piezoelectric effect and adopts an optical detection module to detect the mechanical vibration. We analyze the response function of piezoelectric film to a…

    Submitted 8 October, 2020; v1 submitted 28 March, 2020; originally announced March 2020.

    Comments: 15 pages, 16 figures, 6 tables

  33. arXiv:1909.11953  [pdf, other]

    cs.LG cs.CV eess.IV stat.ML

    Hyperspectral Image Classification With Context-Aware Dynamic Graph Convolutional Network

    Authors: Sheng Wan, Chen Gong, Ping Zhong, Shirui Pan, Guangyu Li, Jian Yang

    Abstract: In hyperspectral image (HSI) classification, spatial context has demonstrated its significance in achieving promising performance. However, conventional spatial context-based methods simply assume that spatially neighboring pixels should correspond to the same land-cover class, so they often fail to correctly discover the contextual relations among pixels in complex situations, and thus leading to…

    Submitted 26 September, 2019; originally announced September 2019.

  34. arXiv:1907.11458  [pdf, other]

    cs.CV eess.IV

    Multiple Human Association between Top and Horizontal Views by Matching Subjects' Spatial Distributions

    Authors: Ruize Han, Yujun Zhang, Wei Feng, Chenxing Gong, Xiaoyu Zhang, Jiewen Zhao, Liang Wan, Song Wang

    Abstract: Video surveillance can be significantly enhanced by using both top-view data, e.g., those from drone-mounted cameras in the air, and horizontal-view data, e.g., those from wearable cameras on the ground. Collaborative analysis of different-view data can facilitate various kinds of applications, such as human tracking, person identification, and human activity recognition. However, for such collabo…

    Submitted 26 July, 2019; originally announced July 2019.

  35. arXiv:1905.06133  [pdf, other]

    eess.IV cs.LG stat.ML

    Multi-scale Dynamic Graph Convolutional Network for Hyperspectral Image Classification

    Authors: Sheng Wan, Chen Gong, Ping Zhong, Bo Du, Lefei Zhang, Jian Yang

    Abstract: Convolutional Neural Network (CNN) has demonstrated impressive ability to represent hyperspectral images and to achieve promising results in hyperspectral image classification. However, traditional CNN models can only operate convolution on regular square image regions with fixed size and weights, so they cannot universally adapt to the distinct local regions with various object distributions and…

    Submitted 14 May, 2019; originally announced May 2019.

  36. arXiv:1904.03575  [pdf, other]

    eess.SP

    Two Dimension Intensity Distribution of Ultraviolet Scattering Communication

    Authors: Difan Zou, Zhengyuan Xu, Chen Gong

    Abstract: Consider an ultraviolet (UV) scattering communication system where the position of the transmitter is fixed and the receiver can move around on the ground. To obtain the link gain effectively and economically, we propose an algorithm based on one-dimensional (1D) numerical integration and an off-line data library. Moreover, we analyze the 2D scattering intensity distributions for both LED and laser…

    Submitted 6 April, 2019; originally announced April 2019.

    Comments: Work was done when Difan Zou was in USTC

  37. arXiv:1811.11874  [pdf, other]

    eess.IV cs.CV

    RetinaMatch: Efficient Template Matching of Retina Images for Teleophthalmology

    Authors: Chen Gong, N. Benjamin Erichson, John P. Kelly, Laura Trutoiu, Brian T. Schowengerdt, Steven L. Brunton, Eric J. Seibel

    Abstract: Retinal template matching and registration is an important challenge in teleophthalmology with low-cost imaging devices. However, the images from such devices generally have a small field of view (FOV) and image quality degradations, making matching difficult. In this work, we develop an efficient and accurate retinal matching technique that combines dimension reduction and mutual information (MI)…

    Submitted 28 November, 2018; originally announced November 2018.

  38. arXiv:1810.13091  [pdf, other]

    cs.CL eess.AS

    Towards End-to-End Code-Switching Speech Recognition

    Authors: Ne Luo, Dongwei Jiang, Shuaijiang Zhao, Caixia Gong, Wei Zou, Xiangang Li

    Abstract: Code-switching speech recognition has attracted an increasing interest recently, but the need for expert linguistic knowledge has always been a big issue. End-to-end automatic speech recognition (ASR) simplifies the building of ASR systems considerably by predicting graphemes or characters directly from acoustic input. In the meantime, the need for expert linguistic knowledge is also eliminated, w…

    Submitted 1 November, 2018; v1 submitted 30 October, 2018; originally announced October 2018.

    Comments: 5 pages, submitted to ICASSP 2019

  39. arXiv:1808.03486  [pdf, other]

    eess.SP

    Pulse-laser Based Long-range Non-line-of-sight Ultraviolet Communication with Pulse Response Position Estimation

    Authors: Ruixiong Xu, Chen Gong, Zhengyuan Xu

    Abstract: We propose pulse laser-based ultra-violet communication over long distance, such that the pulse response signals can be detected at the receiver at the cost of low data transmission rate. We characterize the signal and achievable performance for the pulse laser-based communication. Since the detection performance critically depends on the pulse response position estimation, we also propose two app…

    Submitted 10 August, 2018; originally announced August 2018.

  40. arXiv:1805.07766  [pdf, other]

    eess.SP

    Constrained Partial Group Decoding with Max-Min Fairness for Multi-color Multi-user Visible Light Communication

    Authors: Guangtao Zheng, Chen Gong, Zhengyuan Xu

    Abstract: A visible light communication (VLC) system can adopt multi-color light emitting diode (LED) arrays to support multiple users. In this paper, a multi-layer coding and constrained partial group decoding (CPGD) method is proposed to tackle strong color interference and increase the system throughput. After channel model formulation, user information rates are allocated and decoding order for all the…

    Submitted 20 May, 2018; originally announced May 2018.

    Comments: 28 pages, 12 figures, submitted to TCOM

  41. arXiv:1805.02199  [pdf, other]

    eess.SP

    Asynchronous Multiple Access in Optical Wireless Scattering Communication: Achievable Transmission Rates and Receiver Design

    Authors: Guanchu Wang, Chen Gong, Zhimeng Jiang, Zhengyuan Xu

    Abstract: We investigate the asynchronous multiple user access communication in optical wireless scattering communication, where different users transmit signals without perfect alignment in the time domain. Firstly, we characterize the received signal based on hidden Markov model (HMM) such that the misalignment among different users can be characterized by the state transition. Then, we investigate the ac…

    Submitted 21 January, 2019; v1 submitted 6 May, 2018; originally announced May 2018.

  42. arXiv:1802.03944  [pdf, other]

    eess.SP

    A 1Mbps Real-time NLOS UV Scattering Communication System with Receiver Diversity over 1km

    Authors: Guanchu Wang, Kun Wang, Chen Gong, Difan Zou, Zhimeng Jiang, Zhengyuan Xu

    Abstract: In the non-line of sight (NLOS) ultraviolet (UV) scattering communication, the received signals exhibit the characteristics of discrete photoelectrons due to the extremely large path loss. We design and demonstrate an NLOS UV scattering communication system in this work, where the receiver-side signal detection is designed based on a discrete-time Poisson channel model. In our system, a laser and…

    Submitted 12 February, 2018; originally announced February 2018.

  43. arXiv:1710.10976  [pdf, ps, other]

    eess.SP

    SCMA with Low Complexity Symmetric Codebook Design for Visible Light Communication

    Authors: Shun Lou, Chen Gong, Qian Gao, Zhengyuan Xu

    Abstract: Sparse code multiple access (SCMA) is currently attracting significant research interest and is considered a promising multiple access technique for 5G systems. It serves as a good candidate for the future communication network with massive nodes due to its capability of handling user overloading. Introducing SCMA to visible light communication (VLC) can provide another opportunity on desig…

    Submitted 30 October, 2017; originally announced October 2017.