Showing 1–19 of 19 results for author: Gao, T

Searching in archive eess.
  1. arXiv:2404.19412  [pdf]

    cs.RO eess.SY

    Enhancing Robotic Adaptability: Integrating Unsupervised Trajectory Segmentation and Conditional ProMPs for Dynamic Learning Environments

    Authors: Tianci Gao

    Abstract: We propose a novel framework for enhancing robotic adaptability and learning efficiency, which integrates unsupervised trajectory segmentation with adaptive probabilistic movement primitives (ProMPs). By employing a cutting-edge deep learning architecture that combines autoencoders and Recurrent Neural Networks (RNNs), our approach autonomously pinpoints critical transitional points in continuous,…

    Submitted 30 April, 2024; originally announced April 2024.

  2. arXiv:2403.11091  [pdf, other]

    cs.SD cs.CV eess.AS

    Multitask frame-level learning for few-shot sound event detection

    Authors: Liang Zou, Genwei Yan, Ruoyu Wang, Jun Du, Meng Lei, Tian Gao, Xin Fang

    Abstract: This paper focuses on few-shot Sound Event Detection (SED), which aims to automatically recognize and classify sound events with limited samples. However, prevailing methods in few-shot SED predominantly rely on segment-level predictions, which often fall short of providing detailed, fine-grained predictions, particularly for events of brief duration. Although frame-level prediction strategies have been…

    Submitted 17 March, 2024; originally announced March 2024.

    Comments: 6 pages, 4 figures, conference

  3. arXiv:2312.10593  [pdf, other]

    cs.CR eess.SP

    A Novel RFID Authentication Protocol Based on A Block-Order-Modulus Variable Matrix Encryption Algorithm

    Authors: Yan Wang, Ruiqi Liu, Tong Gao, Feng Shu, Xuemei Lei, Guan Gui, Jiangzhou Wang

    Abstract: In this paper, authentication for mobile radio frequency identification (RFID) systems with low-cost tags is studied. Firstly, an adaptive modulus (AM) encryption algorithm is proposed. Subsequently, in order to enhance the security without additional storage of new key matrices, a self-updating encryption order (SUEO) algorithm is designed. Furthermore, a diagonal block local transpose key matrix…

    Submitted 9 May, 2024; v1 submitted 16 December, 2023; originally announced December 2023.

  4. arXiv:2309.11745  [pdf, other]

    eess.IV cs.CV cs.LG

    PIE: Simulating Disease Progression via Progressive Image Editing

    Authors: Kaizhao Liang, Xu Cao, Kuei-Da Liao, Tianren Gao, Wenqian Ye, Zhengyu Chen, Jianguo Cao, Tejas Nama, Jimeng Sun

    Abstract: Disease progression simulation is a crucial area of research that has significant implications for clinical diagnosis, prognosis, and treatment. One major challenge in this field is the lack of continuous medical imaging monitoring of individual patients over time. To address this issue, we develop a novel framework termed Progressive Image Editing (PIE) that enables controlled manipulation of dis…

    Submitted 5 October, 2023; v1 submitted 20 September, 2023; originally announced September 2023.

    Comments: Code and checkpoints for replicating our results can be found at https://github.com/IrohXu/PIE and https://huggingface.co/IrohXu/stable-diffusion-mimic-cxr-v0.1

  5. arXiv:2308.14638  [pdf, other]

    eess.AS cs.SD

    The USTC-NERCSLIP Systems for the CHiME-7 DASR Challenge

    Authors: Ruoyu Wang, Maokui He, Jun Du, Hengshun Zhou, Shutong Niu, Hang Chen, Yanyan Yue, Gaobin Yang, Shilong Wu, Lei Sun, Yanhui Tu, Haitao Tang, Shuangqing Qian, Tian Gao, Mengzhi Wang, Genshun Wan, Jia Pan, Jianqing Gao, Chin-Hui Lee

    Abstract: This technical report details our submission system to the CHiME-7 DASR Challenge, which focuses on speaker diarization and speech recognition under complex multi-speaker scenarios. It also evaluates the efficiency of systems in handling diverse array devices. To address these issues, we implemented an end-to-end speaker diarization system and introduced a rectification strategy base…

    Submitted 10 October, 2023; v1 submitted 28 August, 2023; originally announced August 2023.

    Comments: Accepted by 2023 CHiME Workshop, Oral

  6. arXiv:2305.16616  [pdf, other]

    eess.SP

    Channel Measurement, Modeling, and Simulation for 6G: A Survey and Tutorial

    Authors: Jianhua Zhang, Jiaxin Lin, Pan Tang, Yuxiang Zhang, Huixin Xu, Tianyang Gao, Haiyang Miao, Zeyong Chai, Zhengfu Zhou, Yi Li, Huiwen Gong, Yameng Liu, Zhiqiang Yuan, Lei Tian, Shaoshi Yang, Liang Xia, Guangyi Liu, Ping Zhang

    Abstract: The sixth generation (6G) mobile communications have attracted substantial attention in the global research community of information and communication technologies (ICT). 6G systems are expected to support not only extended 5G usage scenarios, but also new usage scenarios, such as integrated sensing and communication (ISAC), integrated artificial intelligence (AI) and communication, and communicat…

    Submitted 28 March, 2024; v1 submitted 26 May, 2023; originally announced May 2023.

    Comments: 41 pages, 52 figures

  7. arXiv:2304.12615  [pdf]

    eess.IV cs.CV

    STM-UNet: An Efficient U-shaped Architecture Based on Swin Transformer and Multi-scale MLP for Medical Image Segmentation

    Authors: Lei Shi, Tianyu Gao, Zheng Zhang, Junxing Zhang

    Abstract: Automated medical image segmentation can help doctors diagnose faster and more accurately. Deep learning based models for medical image segmentation have made great progress in recent years. However, the existing models fail to effectively leverage Transformer and MLP for improving U-shaped architecture efficiently. In addition, the multi-scale features of the MLP have not been fully extracted…

    Submitted 25 April, 2023; originally announced April 2023.

    Comments: 6 pages, 5 figures, 2 tables

  8. arXiv:2207.08902  [pdf, other]

    cs.RO cs.MA eess.SY

    Layered Cost-Map-Based Traffic Management for Multiple Automated Mobile Robots via a Data Distribution Service

    Authors: Seungwoo Jeong, Taekwon Ga, Inhwan Jeong, Jongkyu Oh, Jongeun Choi

    Abstract: This letter proposes traffic management for multiple automated mobile robots (AMRs) based on a layered cost map. Multiple AMRs communicate via a data distribution service (DDS), which is shared by topics in the same DDS domain. The cost of each layer is manipulated by topics. The traffic management server in the domain sends topics to or receives topics from each of the AMRs. Using the layered cost map, the new c…

    Submitted 18 July, 2022; originally announced July 2022.

    Comments: 8 pages, 13 figures

  9. From heavy rain removal to detail restoration: A faster and better network

    Authors: Yuanbo Wen, Tao Gao, Jing Zhang, Kaihao Zhang, Ting Chen

    Abstract: The profound accumulation of precipitation during intense rainfall events can markedly degrade the quality of images, leading to the erosion of textural details. Despite the improvements observed in existing learning-based methods specialized for heavy rain removal, it is discerned that a significant proportion of these methods tend to overlook the precise reconstruction of the intricate details.…

    Submitted 18 December, 2023; v1 submitted 7 May, 2022; originally announced May 2022.

    Comments: Accepted by Pattern Recognition

  10. arXiv:2112.14555  [pdf, other]

    eess.IV cs.CV

    Onsite Non-Line-of-Sight Imaging via Online Calibrations

    Authors: Zhengqing Pan, Ruiqian Li, Tian Gao, Zi Wang, Ping Liu, Siyuan Shen, Tao Wu, Jingyi Yu, Shiying Li

    Abstract: There has been an increasing interest in deploying non-line-of-sight (NLOS) imaging systems for recovering objects behind an obstacle. Existing solutions generally pre-calibrate the system before scanning the hidden objects. Onsite adjustments of the occluder, object and scanning pattern require re-calibration. We present an online calibration technique that directly decouples the acquired transie…

    Submitted 29 December, 2021; originally announced December 2021.

  11. arXiv:2111.01430  [pdf, other]

    cs.SD eess.AS

    CycleGAN with Dual Adversarial Loss for Bone-Conducted Speech Enhancement

    Authors: Qing Pan, Teng Gao, Jian Zhou, Huabin Wang, Liang Tao, Hon Keung Kwan

    Abstract: Compared with air-conducted speech, bone-conducted speech has the unique advantage of shielding background noise. Enhancement of bone-conducted speech helps to improve its quality and intelligibility. In this paper, a novel CycleGAN with dual adversarial loss (CycleGAN-DAL) is proposed for bone-conducted speech enhancement. The proposed method uses an adversarial loss and a cycle-consistent loss s…

    Submitted 2 November, 2021; originally announced November 2021.

  12. arXiv:2111.01342  [pdf, other]

    cs.SD cs.HC eess.AS

    Attention-Guided Generative Adversarial Network for Whisper to Normal Speech Conversion

    Authors: Teng Gao, Jian Zhou, Huabin Wang, Liang Tao, Hon Keung Kwan

    Abstract: Whispered speech is a special mode of pronunciation that does not use vocal cord vibration. Whispered speech does not contain a fundamental frequency, and its energy is about 20 dB lower than that of normal speech. Converting whispered speech into normal speech can improve speech quality and intelligibility. In this paper, a novel attention-guided generative adversarial network model incorporati…

    Submitted 1 November, 2021; originally announced November 2021.

  13. arXiv:2103.16827  [pdf, other]

    eess.AS cs.CL cs.SD

    Integer-only Zero-shot Quantization for Efficient Speech Recognition

    Authors: Sehoon Kim, Amir Gholami, Zhewei Yao, Nicholas Lee, Patrick Wang, Aniruddha Nrusimha, Bohan Zhai, Tianren Gao, Michael W. Mahoney, Kurt Keutzer

    Abstract: End-to-end neural network models achieve improved performance on various automatic speech recognition (ASR) tasks. However, these models perform poorly on edge hardware due to large memory and computation requirements. While quantizing model weights and/or activations to low-precision can be a promising solution, previous research on quantizing ASR models is limited. In particular, the previous ap…

    Submitted 30 January, 2022; v1 submitted 31 March, 2021; originally announced March 2021.

    Journal ref: ICASSP 2022

  14. arXiv:2103.10661  [pdf, other]

    cs.SD cs.LG eess.AS

    USTC-NELSLIP System Description for DIHARD-III Challenge

    Authors: Yuxuan Wang, Maokui He, Shutong Niu, Lei Sun, Tian Gao, Xin Fang, Jia Pan, Jun Du, Chin-Hui Lee

    Abstract: This document describes our submission to the Third DIHARD Speech Diarization Challenge. Besides the traditional clustering based system, the innovation of our system lies in the combination of various front-end techniques to solve the diarization problem, including speech separation and target-speaker based voice activity detection (TS-VAD), combined with iterative data purificat…

    Submitted 19 March, 2021; originally announced March 2021.

  15. Non-line-of-Sight Imaging via Neural Transient Fields

    Authors: Siyuan Shen, Zi Wang, Ping Liu, Zhengqing Pan, Ruiqian Li, Tian Gao, Shiying Li, Jingyi Yu

    Abstract: We present a neural modeling framework for Non-Line-of-Sight (NLOS) imaging. Previous solutions have sought to explicitly recover the 3D geometry (e.g., as point clouds) or voxel density (e.g., within a pre-defined volume) of the hidden scene. In contrast, inspired by the recent Neural Radiance Field (NeRF) approach, we use a multi-layer perceptron (MLP) to represent the neural transient field or…

    Submitted 13 September, 2021; v1 submitted 2 January, 2021; originally announced January 2021.

  16. arXiv:2010.04611  [pdf, other]

    cs.CV eess.IV

    Hyperspectral Unmixing via Nonnegative Matrix Factorization with Handcrafted and Learnt Priors

    Authors: Min Zhao, Tiande Gao, Jie Chen, Wei Chen

    Abstract: Nowadays, nonnegative matrix factorization (NMF) based methods have been widely applied to blind spectral unmixing. Introducing proper regularizers to NMF is crucial for mathematically constraining the solutions and physically exploiting spectral and spatial properties of images. Generally, properly handcrafting regularizers and solving the associated complex optimization problem are non-trivial t…

    Submitted 9 October, 2020; originally announced October 2020.

  17. arXiv:2001.05685  [pdf, other]

    cs.SD eess.AS

    SqueezeWave: Extremely Lightweight Vocoders for On-device Speech Synthesis

    Authors: Bohan Zhai, Tianren Gao, Flora Xue, Daniel Rothchild, Bichen Wu, Joseph E. Gonzalez, Kurt Keutzer

    Abstract: Automatic speech synthesis is a challenging task that is becoming increasingly important as edge devices begin to interact with users through speech. Typical text-to-speech pipelines include a vocoder, which translates intermediate audio representations into an audio waveform. Most existing vocoders are difficult to parallelize since each generated sample is conditioned on previous samples. WaveGl…

    Submitted 16 January, 2020; originally announced January 2020.

  18. arXiv:1906.01082  [pdf, other]

    eess.IV cs.CV math.FA

    Representation Theoretic Patterns in Multi-Frequency Class Averaging for Three-Dimensional Cryo-Electron Microscopy

    Authors: Yifeng Fan, Tingran Gao, Zhizhen Zhao

    Abstract: We develop in this paper a novel intrinsic classification algorithm -- multi-frequency class averaging (MFCA) -- for classifying noisy projection images obtained from three-dimensional cryo-electron microscopy (cryo-EM) by the similarity among their viewing directions. This new algorithm leverages multiple irreducible representations of the unitary group to introduce additional redundancy into the…

    Submitted 5 July, 2021; v1 submitted 31 May, 2019; originally announced June 2019.

    Comments: 38 pages, 17 figures

    MSC Class: 20G05; 33C45; 33C55; 55R25

    ACM Class: I.4.5; I.4.10

  19. arXiv:1901.07951  [pdf, other]

    eess.SY

    Modeling and Simulation of UAV Carrier Landings

    Authors: Gaurav Misra, Tianyu Gao, Xiaoli Bai

    Abstract: With UAVs promising capabilities to increase operation flexibility and reduce mission cost, we are exploiting the automated carrier-landing performance advancement that can be achieved by fixed-wing UAVs. To demonstrate such potentials, in this paper, we investigate two key metrics, namely, flight path control performance, and reduced approach speeds for UAVs based on the F/A-18 High Angle of Atta…

    Submitted 23 January, 2019; originally announced January 2019.