
Showing 1–50 of 121 results for author: Zhang, A

Searching in archive eess.
  1. arXiv:2408.17397  [pdf, other]

    cs.IT eess.SP

    End-to-End Learning for Task-Oriented Semantic Communications Over MIMO Channels: An Information-Theoretic Framework

    Authors: Chang Cai, Xiaojun Yuan, Ying-Jun Angela Zhang

    Abstract: This paper addresses the problem of end-to-end (E2E) design of learning and communication in a task-oriented semantic communication system. In particular, we consider a multi-device cooperative edge inference system over a wireless multiple-input multiple-output (MIMO) multiple access channel, where multiple devices transmit extracted features to a server to perform a classification task. We formu…

    Submitted 30 August, 2024; originally announced August 2024.

    Comments: major revision in IEEE JSAC

  2. arXiv:2408.15481  [pdf, ps, other]

    eess.SP

    Joint Offloading and Beamforming Design in Integrating Sensing, Communication, and Computing Systems: A Distributed Approach

    Authors: Peng Liu, Zesong Fei, Xinyi Wang, Jingxuan Huang, Jie Hu, J. Andrew Zhang

    Abstract: When applying integrated sensing and communications (ISAC) in future mobile networks, many sensing tasks have low latency requirements, preferably being implemented at terminals. However, terminals often have limited computing capabilities and energy supply. In this paper, we investigate the effectiveness of leveraging the advanced computing capabilities of mobile edge computing (MEC) servers and…

    Submitted 27 August, 2024; originally announced August 2024.

    Comments: 15 pages, 12 figures, submitted to IEEE journals for possible publication

  3. arXiv:2408.13593  [pdf, ps, other]

    eess.SP eess.IV

    Learning Multi-Rate Task-Oriented Communications Over Symmetric Discrete Memoryless Channels

    Authors: Anbang Zhang, Shuaishuai Guo

    Abstract: This letter introduces a multi-rate task-oriented communication (MR-ToC) framework. This framework dynamically adapts to variations in affordable data rate within the communication pipeline. It conceptualizes communication pipelines as symmetric, discrete, memoryless channels. We employ a progressive learning strategy to train the system, comprising a nested codebook for encoding and task inferenc…

    Submitted 24 August, 2024; originally announced August 2024.

  4. arXiv:2408.11982  [pdf, other]

    eess.IV cs.CV cs.MM

    AIM 2024 Challenge on Compressed Video Quality Assessment: Methods and Results

    Authors: Maksim Smirnov, Aleksandr Gushchin, Anastasia Antsiferova, Dmitry Vatolin, Radu Timofte, Ziheng Jia, Zicheng Zhang, Wei Sun, Jiaying Qian, Yuqin Cao, Yinan Sun, Yuxin Zhu, Xiongkuo Min, Guangtao Zhai, Kanjar De, Qing Luo, Ao-Xiang Zhang, Peng Zhang, Haibo Lei, Linyan Jiang, Yaqing Li, Wenhui Meng, Xiaoheng Tan, Haiqiang Wang, Xiaozhong Xu, et al. (11 additional authors not shown)

    Abstract: Video quality assessment (VQA) is a crucial task in the development of video compression standards, as it directly impacts the viewer experience. This paper presents the results of the Compressed Video Quality Assessment challenge, held in conjunction with the Advances in Image Manipulation (AIM) workshop at ECCV 2024. The challenge aimed to evaluate the performance of VQA methods on a diverse dat…

    Submitted 28 August, 2024; v1 submitted 21 August, 2024; originally announced August 2024.

  5. arXiv:2408.04188  [pdf, ps, other]

    eess.SP eess.SY

    Trustworthy Semantic-Enabled 6G Communication: A Task-oriented and Privacy-preserving Perspective

    Authors: Shuaishuai Guo, Anbang Zhang, Yanhu Wang, Chenyuan Feng, Tony Q. S. Quek

    Abstract: Trustworthy task-oriented semantic communication (ToSC) emerges as an innovative approach in the 6G landscape, characterized by the transmission of only vital information that is directly pertinent to a specific task. While ToSC offers an efficient mode of communication, it concurrently raises concerns regarding privacy, as sophisticated adversaries might possess the capability to reconstruct the…

    Submitted 7 August, 2024; originally announced August 2024.

  6. arXiv:2408.02859  [pdf, other]

    eess.IV cs.AI cs.CV

    Multistain Pretraining for Slide Representation Learning in Pathology

    Authors: Guillaume Jaume, Anurag Vaidya, Andrew Zhang, Andrew H. Song, Richard J. Chen, Sharifa Sahai, Dandan Mo, Emilio Madrigal, Long Phi Le, Faisal Mahmood

    Abstract: Developing self-supervised learning (SSL) models that can learn universal and transferable representations of H&E gigapixel whole-slide images (WSIs) is becoming increasingly valuable in computational pathology. These models hold the potential to advance critical tasks such as few-shot classification, slide retrieval, and patient stratification. Existing approaches for slide representation learnin…

    Submitted 5 August, 2024; originally announced August 2024.

    Comments: ECCV'24

  7. arXiv:2407.21514  [pdf]

    eess.SP

    Wireless Communications in Doubly Selective Channels with Domain Adaptivity

    Authors: J. Andrew Zhang, Hongyang Zhang, Kai Wu, Xiaojing Huang, Jinhong Yuan, Y. Jay Guo

    Abstract: Wireless communications are significantly impacted by the propagation environment, particularly in doubly selective channels with variations in both time and frequency domains. Orthogonal Time Frequency Space (OTFS) modulation has emerged as a promising solution; however, its high equalization complexity, if performed in the delay-Doppler domain, limits its universal application. This article expl…

    Submitted 31 July, 2024; v1 submitted 31 July, 2024; originally announced July 2024.

    Comments: Magazine article, 7 pages, 4 figures, 2 tables

  8. arXiv:2407.17057  [pdf, other]

    eess.SP

    Efficient Sensing Parameter Estimation with Direct Clutter Mitigation in Perceptive Mobile Networks

    Authors: Hang Li, Hongming Yang, Qinghua Guo, J. Andrew Zhang, Yang Xiang, Yashan Pang

    Abstract: In this work, we investigate sensing parameter estimation in the presence of clutter in perceptive mobile networks (PMNs) that integrate radar sensing into mobile communications. Performing clutter suppression before sensing parameter estimation is generally desirable as the number of sensing parameters can be significantly reduced. However, existing methods require high-complexity clutter mitigat…

    Submitted 24 July, 2024; originally announced July 2024.

  9. arXiv:2407.13491  [pdf, other]

    eess.SP cs.IT

    Performance Analysis and Low-Complexity Beamforming Design for Near-Field Physical Layer Security

    Authors: Yunpu Zhang, Yuan Fang, Xianghao Yu, Changsheng You, Ying-Jun Angela Zhang

    Abstract: Extremely large-scale arrays (XL-arrays) have emerged as a key enabler in achieving the unprecedented performance requirements of future wireless networks, leading to a significant increase in the range of the near-field region. This transition necessitates the spherical wavefront model for characterizing the wireless propagation rather than the far-field planar counterpart, thereby introducing ex…

    Submitted 18 July, 2024; originally announced July 2024.

    Comments: 13 pages, 13 figures

  10. arXiv:2406.12426  [pdf, other]

    cs.IT eess.SP

    Multi-Active-IRS-Assisted Cooperative Sensing: Cramér-Rao Bound and Joint Beamforming Design

    Authors: Yuan Fang, Xianghao Yu, Jie Xu, Ying-Jun Angela Zhang

    Abstract: This paper studies the multi-intelligent reflecting surface (IRS)-assisted cooperative sensing, in which multiple active IRSs are deployed in a distributed manner to facilitate multi-view target sensing at the non-line-of-sight (NLoS) area of the base station (BS). Different from prior works employing passive IRSs, we leverage active IRSs with the capability of amplifying the reflected signals to…

    Submitted 18 July, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2404.13536

  11. arXiv:2406.09190  [pdf, other]

    eess.SP

    Rethinking Waveform for 6G: Harnessing Delay-Doppler Alignment Modulation

    Authors: Zhiqiang Xiao, Xianda Liu, Yong Zeng, J. Andrew Zhang, Shi Jin, Rui Zhang

    Abstract: Waveform design has served as a cornerstone for each generation of mobile communication systems. The future sixth-generation (6G) mobile communication networks are expected to employ larger-scale antenna arrays and exploit higher-frequency bands for further boosting data transmission rate and providing ubiquitous wireless sensing. This brings new opportunities and challenges for 6G waveform design…

    Submitted 13 June, 2024; originally announced June 2024.

  12. arXiv:2406.05700  [pdf, other]

    cs.CV eess.IV

    HDMba: Hyperspectral Remote Sensing Imagery Dehazing with State Space Model

    Authors: Hang Fu, Genyun Sun, Yinhe Li, Jinchang Ren, Aizhu Zhang, Cheng Jing, Pedram Ghamisi

    Abstract: Haze contamination in hyperspectral remote sensing images (HSI) can lead to spatial visibility degradation and spectral distortion. Haze in HSI exhibits spatial irregularity and inhomogeneous spectral distribution, with few dehazing networks available. Current CNN and Transformer-based dehazing methods fail to balance global scene recovery, local detail retention, and computational efficiency. Ins…

    Submitted 9 June, 2024; originally announced June 2024.

  13. arXiv:2405.16011  [pdf, ps, other]

    eess.SP

    Semantic Importance-Aware Communications with Semantic Correction Using Large Language Models

    Authors: Shuaishuai Guo, Yanhu Wang, Jia Ye, Anbang Zhang, Kun Xu

    Abstract: Semantic communications, a promising approach for agent-human and agent-agent interactions, typically operate at a feature level, lacking true semantic understanding. This paper explores understanding-level semantic communications (ULSC), transforming visual data into human-intelligible semantic content. We employ an image caption neural network (ICNN) to derive semantic representations from visua…

    Submitted 24 May, 2024; originally announced May 2024.

  14. arXiv:2405.10553  [pdf, other]

    eess.SP

    Revealing the Trade-off in ISAC Systems: The KL Divergence Perspective

    Authors: Zesong Fei, Shuntian Tang, Xinyi Wang, Fanghao Xia, Fan Liu, J. Andrew Zhang

    Abstract: Integrated sensing and communication (ISAC) is regarded as a promising technique for 6G communication network. In this letter, we investigate the Pareto bound of the ISAC system in terms of a unified Kullback-Leibler (KL) divergence performance metric. We firstly present the relationship between KL divergence and explicit ISAC performance metric, i.e., demodulation error and probability of detecti…

    Submitted 17 May, 2024; originally announced May 2024.

    Comments: 5 pages, 5 figures; submitted to IEEE journals for possible publication

  15. arXiv:2404.09149  [pdf, other]

    eess.SY cs.NE math.NA

    Heuristic Solution to Joint Deployment and Beamforming Design for STAR-RIS Aided Networks

    Authors: Bai Yan, Qi Zhao, Jin Zhang, J. Andrew Zhang

    Abstract: This paper tackles the deployment challenges of Simultaneous Transmitting and Reflecting Reconfigurable Intelligent Surface (STAR-RIS) in communication systems. Unlike existing works that use fixed deployment setups or solely optimize the location, this paper emphasizes the joint optimization of the location and orientation of STAR-RIS. This enables searching across all user grouping possibilities…

    Submitted 14 April, 2024; originally announced April 2024.

    Comments: 30 pages

  16. arXiv:2404.05984  [pdf, ps, other]

    eess.SP

    Interference Management for Full-Duplex ISAC in B5G/6G Networks: Architectures, Challenges, and Solutions

    Authors: Aimin Tang, Xudong Wang, J. Andrew Zhang

    Abstract: Integrated sensing and communications (ISAC) has been visioned as a key technique for B5G/6G networks. To support monostatic sensing, a full-duplex radio is indispensable to extract echo signals from targets. Such a radio can also greatly improve network capacity via full-duplex communications. However, full-duplex radios in existing ISAC designs are mainly focused on wireless sensing, while the a…

    Submitted 8 April, 2024; originally announced April 2024.

    Comments: Accepted by IEEE Communications Magazine

  17. arXiv:2403.12630  [pdf, other]

    eess.AS cs.SD

    Reproducing the Acoustic Velocity Vectors in a Circular Listening Area

    Authors: Jiarui Wang, Thushara Abhayapala, Jihui Aimee Zhang, Prasanga Samarasinghe

    Abstract: Acoustic velocity vectors are important for human's localization of sound at low frequencies. This paper proposes a sound field reproduction algorithm, which matches the acoustic velocity vectors in a circular listening area. In previous work, acoustic velocity vectors are matched either at sweet spots or on the boundary of the listening area. Sweet spots restrict listener's movement, whereas meas…

    Submitted 19 March, 2024; originally announced March 2024.

    Comments: Submitted to EUSIPCO 2024

  18. arXiv:2403.11940  [pdf, other]

    cs.LG eess.SY

    Multistep Inverse Is Not All You Need

    Authors: Alexander Levine, Peter Stone, Amy Zhang

    Abstract: In real-world control settings, the observation space is often unnecessarily high-dimensional and subject to time-correlated noise. However, the controllable dynamics of the system are often far simpler than the dynamics of the raw observations. It is therefore desirable to learn an encoder to map the observation space to a simpler space of control-relevant variables. In this work, we consider the…

    Submitted 18 March, 2024; originally announced March 2024.

  19. arXiv:2403.05793  [pdf, ps, other]

    eess.SP

    Performance Bounds for Passive Sensing in Asynchronous ISAC Systems -- Appendices

    Authors: Jingbo Zhao, Zhaoming Lu, J. Andrew Zhang, Weicai Li, Yifeng Xiong, Zijun Han, Xiangming Wen, Tao Gu

    Abstract: This document contains the appendices for our paper titled "Performance Bounds for Passive Sensing in Asynchronous ISAC Systems." The appendices include rigorous derivations of key formulas, detailed proofs of the theorems and propositions introduced in the paper, and details of the algorithm tested in the numerical simulation for validation. These appendices aim to support and elaborate on the f…

    Submitted 29 March, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

    Comments: 5 pages

  20. arXiv:2402.17533  [pdf, other]

    cs.CV eess.IV

    Black-box Adversarial Attacks Against Image Quality Assessment Models

    Authors: Yu Ran, Ao-Xiang Zhang, Mingjie Li, Weixuan Tang, Yuan-Gen Wang

    Abstract: The goal of No-Reference Image Quality Assessment (NR-IQA) is to predict the perceptual quality of an image in line with its subjective evaluation. To put the NR-IQA models into practice, it is essential to study their potential loopholes for model refinement. This paper makes the first attempt to explore the black-box adversarial attacks on NR-IQA models. Specifically, we first formulate the atta…

    Submitted 28 February, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

  21. arXiv:2402.09048  [pdf, other]

    eess.SP

    Sensing in Bi-Static ISAC Systems with Clock Asynchronism: A Signal Processing Perspective

    Authors: Kai Wu, Jacopo Pegoraro, Francesca Meneghello, J. Andrew Zhang, Jesus O. Lacruz, Joerg Widmer, Francesco Restuccia, Michele Rossi, Xiaojing Huang, Daqing Zhang, Giuseppe Caire, Y. Jay Guo

    Abstract: Integrated Sensing and Communication (ISAC) has been identified as a pillar usage scenario for the impending 6G era. Bi-static sensing, a major type of sensing in ISAC, is promising to expedite ISAC in the near future, as it requires minimal changes to the existing network infrastructure. However, a critical challenge for bi-static sensing is clock asynchronism due to the use of different clocks a…

    Submitted 24 June, 2024; v1 submitted 14 February, 2024; originally announced February 2024.

    Comments: 20 pages, 6 figures, 1 table

  22. arXiv:2401.15183  [pdf, other]

    q-bio.BM eess.IV

    Moment-based metrics for molecules computable from cryo-EM images

    Authors: Andy Zhang, Oscar Mickelin, Joe Kileel, Eric J. Verbeke, Nicholas F. Marshall, Marc Aurèle Gilles, Amit Singer

    Abstract: Single particle cryogenic electron microscopy (cryo-EM) is an imaging technique capable of recovering the high-resolution 3-D structure of biological macromolecules from many noisy and randomly oriented projection images. One notable approach to 3-D reconstruction, known as Kam's method, relies on the moments of the 2-D images. Inspired by Kam's method, we introduce a rotationally invariant metric…

    Submitted 26 January, 2024; originally announced January 2024.

    Comments: 21 Pages, 9 Figures, 2 Algorithms, and 3 Tables

  23. arXiv:2401.09119  [pdf, other]

    eess.SP

    Anchor-points Assisted Uplink Sensing in Perceptive Mobile Networks

    Authors: Yanmo Hu, J. Andrew Zhang, Weibo Deng, Y. Jay Guo

    Abstract: Uplink sensing in integrated sensing and communications (ISAC) systems, such as Perceptive Mobile Networks, is challenging due to the clock asynchronism between transmitter and receiver. Existing solutions typically require the presence of a dominating line-of-sight path and the knowledge of transmitter location at the receiver. In this paper, relaxing these requirements, we propose a novel and ef…

    Submitted 17 January, 2024; originally announced January 2024.

    Comments: 14 pages, 12 figures, journal paper

  24. arXiv:2401.09064  [pdf, other]

    cs.IT eess.SP

    Performance Bounds and Optimization for CSI-Ratio based Bi-static Doppler Sensing in ISAC Systems

    Authors: Yanmo Hu, Kai Wu, J. Andrew Zhang, Weibo Deng, Y. Jay Guo

    Abstract: Bi-static sensing is crucial for exploring the potential of networked sensing capabilities in integrated sensing and communications (ISAC). However, it suffers from the challenging clock asynchronism issue. CSI ratio-based sensing is an effective means to address the issue. Its performance bounds, particular for Doppler sensing, have not been fully understood yet. This work endeavors to fill the r…

    Submitted 17 January, 2024; originally announced January 2024.

    Comments: 14 pages, 15 figures, journal paper

  25. arXiv:2401.03473  [pdf, ps, other]

    cs.SD cs.AI eess.AS

    ICMC-ASR: The ICASSP 2024 In-Car Multi-Channel Automatic Speech Recognition Challenge

    Authors: He Wang, Pengcheng Guo, Yue Li, Ao Zhang, Jiayao Sun, Lei Xie, Wei Chen, Pan Zhou, Hui Bu, Xin Xu, Binbin Zhang, Zhuo Chen, Jian Wu, Longbiao Wang, Eng Siong Chng, Sun Li

    Abstract: To promote speech processing and recognition research in driving scenarios, we build on the success of the Intelligent Cockpit Speech Recognition Challenge (ICSRC) held at ISCSLP 2022 and launch the ICASSP 2024 In-Car Multi-Channel Automatic Speech Recognition (ICMC-ASR) Challenge. This challenge collects over 100 hours of multi-channel speech data recorded inside a new energy vehicle and 40 hours…

    Submitted 20 February, 2024; v1 submitted 7 January, 2024; originally announced January 2024.

    Comments: Accepted at ICASSP 2024

  26. arXiv:2312.09760  [pdf, other]

    eess.AS cs.SD

    U2-KWS: Unified Two-pass Open-vocabulary Keyword Spotting with Keyword Bias

    Authors: Ao Zhang, Pan Zhou, Kaixun Huang, Yong Zou, Ming Liu, Lei Xie

    Abstract: Open-vocabulary keyword spotting (KWS), which allows users to customize keywords, has attracted increasingly more interest. However, existing methods based on acoustic models and post-processing train the acoustic model with ASR training criteria to model all phonemes, making the acoustic model under-optimized for the KWS task. To solve this problem, we propose a novel unified two-pass open-vocabu…

    Submitted 15 December, 2023; originally announced December 2023.

    Comments: Accepted by ASRU2023

  27. Densifying MIMO: Channel Modeling, Physical Constraints, and Performance Evaluation for Holographic Communications

    Authors: Y. Liu, M. Zhang, T. Wang, A. Zhang, M. Debbah

    Abstract: As the backbone of the fifth-generation (5G) cellular network, massive multiple-input multiple-output (MIMO) encounters a significant challenge in practical applications: how to deploy a large number of antenna elements within limited spaces. Recently, holographic communication has emerged as a potential solution to this issue. It employs dense antenna arrays and provides a tractable model. Nevert…

    Submitted 5 December, 2023; originally announced December 2023.

    Comments: 14 pages, 20 figures, accepted by JSAC-SI-ESIT

  28. arXiv:2310.07141  [pdf, ps, other]

    cs.IT eess.SP

    Time and Frequency Offset Estimation and Intercarrier Interference Cancellation for AFDM Systems

    Authors: Yuankun Tang, Anjie Zhang, Miaowen Wen, Yu Huang, Fei Ji, Jinming Wen

    Abstract: Affine frequency division multiplexing (AFDM) is an emerging multicarrier waveform that offers a potential solution for achieving reliable communications over time-varying channels. This paper proposes two maximum-likelihood (ML) estimators of symbol time offset and carrier frequency offset for AFDM systems. One is called joint ML estimator, which evaluates the arrival time and carrier frequency o…

    Submitted 28 December, 2023; v1 submitted 10 October, 2023; originally announced October 2023.

    Comments: accepted by IEEE Wireless Communications and Networking Conference (WCNC) 2024

  29. arXiv:2310.05444  [pdf, other]

    cs.IT eess.SP

    Waveform Design for MIMO-OFDM Integrated Sensing and Communication System: An Information Theoretical Approach

    Authors: Zhiqing Wei, Jinghui Piao, Xin Yuan, Huici Wu, J. Andrew Zhang, Zhiyong Feng, Lin Wang, Ping Zhang

    Abstract: Integrated sensing and communication (ISAC) is regarded as the enabling technology in the future 5th-Generation-Advanced (5G-A) and 6th-Generation (6G) mobile communication system. ISAC waveform design is critical in ISAC system. However, the difference of the performance metrics between sensing and communication brings challenges for the ISAC waveform design. This paper applies the unified perfor…

    Submitted 9 October, 2023; originally announced October 2023.

  30. arXiv:2310.04657  [pdf, other]

    eess.AS cs.SD

    Spike-Triggered Contextual Biasing for End-to-End Mandarin Speech Recognition

    Authors: Kaixun Huang, Ao Zhang, Binbin Zhang, Tianyi Xu, Xingchen Song, Lei Xie

    Abstract: The attention-based deep contextual biasing method has been demonstrated to effectively improve the recognition performance of end-to-end automatic speech recognition (ASR) systems on given contextual phrases. However, unlike shallow fusion methods that directly bias the posterior of the ASR model, deep biasing methods implicitly integrate contextual information, making it challenging to control t…

    Submitted 6 October, 2023; originally announced October 2023.

    Comments: Accepted by ASRU2023

  31. arXiv:2310.03265  [pdf, other]

    cs.NI eess.SP

    Integrated Communication, Sensing, and Computation Framework for 6G Networks

    Authors: Xu Chen, Zhiyong Feng, J. Andrew Zhang, Zhaohui Yang, Xin Yuan, Xinxin He, Ping Zhang

    Abstract: In the sixth generation (6G) era, intelligent machine network (IMN) applications, such as intelligent transportation, require collaborative machines with communication, sensing, and computation (CSC) capabilities. This article proposes an integrated communication, sensing, and computation (ICSAC) framework for 6G to achieve the reciprocity among CSC functions to enhance the reliability and latency…

    Submitted 4 October, 2023; originally announced October 2023.

    Comments: 8 pages, 5 figures, submitted to IEEE VTM

  32. arXiv:2309.13609  [pdf, other]

    cs.CV eess.IV

    Vulnerabilities in Video Quality Assessment Models: The Challenge of Adversarial Attacks

    Authors: Ao-Xiang Zhang, Yu Ran, Weixuan Tang, Yuan-Gen Wang

    Abstract: No-Reference Video Quality Assessment (NR-VQA) plays an essential role in improving the viewing experience of end-users. Driven by deep learning, recent NR-VQA models based on Convolutional Neural Networks (CNNs) and Transformers have achieved outstanding performance. To build a reliable and practical assessment system, it is of great necessity to evaluate their robustness. However, such issue has…

    Submitted 20 October, 2023; v1 submitted 24 September, 2023; originally announced September 2023.

  33. arXiv:2309.10605  [pdf, other]

    eess.AS cs.SD

    An Active Noise Control System Based on Soundfield Interpolation Using a Physics-informed Neural Network

    Authors: Yile Angela Zhang, Fei Ma, Thushara Abhayapala, Prasanga Samarasinghe, Amy Bastine

    Abstract: Conventional multiple-point active noise control (ANC) systems require placing error microphones within the region of interest (ROI), inconveniencing users. This paper designs a feasible monitoring microphone arrangement placed outside the ROI, providing a user with more freedom of movement. The soundfield within the ROI is interpolated from the microphone signals using a physics-informed neural…

    Submitted 19 September, 2023; originally announced September 2023.

  34. arXiv:2309.02888  [pdf, other]

    eess.SP

    Multi-Device Task-Oriented Communication via Maximal Coding Rate Reduction

    Authors: Chang Cai, Xiaojun Yuan, Ying-Jun Angela Zhang

    Abstract: In task-oriented communications, most existing work designed the physical-layer communication modules and learning based codecs with distinct objectives: learning is targeted at accurate execution of specific tasks, while communication aims at optimizing conventional communication metrics, such as throughput maximization, delay minimization, or bit error rate minimization. The inconsistency betwee…

    Submitted 28 May, 2024; v1 submitted 6 September, 2023; originally announced September 2023.

    Comments: under minor revision in IEEE Transactions on Wireless Communications

  35. arXiv:2308.02915  [pdf, other]

    cs.GR cs.CV cs.SD eess.AS

    DiffDance: Cascaded Human Motion Diffusion Model for Dance Generation

    Authors: Qiaosong Qi, Le Zhuo, Aixi Zhang, Yue Liao, Fei Fang, Si Liu, Shuicheng Yan

    Abstract: When hearing music, it is natural for people to dance to its rhythm. Automatic dance generation, however, is a challenging task due to the physical constraints of human motion and rhythmic alignment with target music. Conventional autoregressive methods introduce compounding errors during sampling and struggle to capture the long-term structure of dance sequences. To address these limitations, we…

    Submitted 5 August, 2023; originally announced August 2023.

    Comments: Accepted at ACM MM 2023

  36. arXiv:2307.14907  [pdf, other]

    eess.IV cs.CV q-bio.QM

    Weakly Supervised AI for Efficient Analysis of 3D Pathology Samples

    Authors: Andrew H. Song, Mane Williams, Drew F. K. Williamson, Guillaume Jaume, Andrew Zhang, Bowen Chen, Robert Serafin, Jonathan T. C. Liu, Alex Baras, Anil V. Parwani, Faisal Mahmood

    Abstract: Human tissue and its constituent cells form a microenvironment that is fundamentally three-dimensional (3D). However, the standard-of-care in pathologic diagnosis involves selecting a few two-dimensional (2D) sections for microscopic evaluation, risking sampling bias and misdiagnosis. Diverse methods for capturing 3D tissue morphologies have been developed, but they have yet had little translation…

    Submitted 27 July, 2023; originally announced July 2023.

  37. arXiv:2307.11345  [pdf, other]

    cs.IT eess.SP

    Sensing Aided Covert Communications: Turning Interference into Allies

    Authors: Xinyi Wang, Zesong Fei, Peng Liu, J. Andrew Zhang, Qingqing Wu, Nan Wu

    Abstract: In this paper, we investigate the realization of covert communication in a general radar-communication cooperation system, which includes integrated sensing and communications as a special example. We explore the possibility of utilizing the sensing ability of radar to track and jam the aerial adversary target attempting to detect the transmission. Based on the echoes from the target, the extended…

    Submitted 3 January, 2024; v1 submitted 21 July, 2023; originally announced July 2023.

    Comments: 13 pages, 12 figures, submitted to IEEE journals for potential publication

  38. arXiv:2307.07200  [pdf, other]

    eess.AS

    Reproducing the Acoustic Velocity Vectors in a Spherical Listening Region

    Authors: Jiarui Wang, Thushara Abhayapala, Jihui Aimee Zhang, Prasanga Samarasinghe

    Abstract: Acoustic velocity vectors (AVVs) are related to the human's perception of sound at low frequencies and are widely used in Ambisonics. This paper proposes a spatial sound field reproduction algorithm called velocity matching, which reproduces the AVVs in the spherical listening region by matching the AVVs' spherical harmonic coefficients. Using the sound field translation formula, the spherical har…

    Submitted 6 June, 2024; v1 submitted 14 July, 2023; originally announced July 2023.

    Comments: Submitted to IEEE Signal Processing Letters

  39. arXiv:2306.10982  [pdf, other]

    cs.IT cs.CR cs.LG eess.SP

    Differentially Private Over-the-Air Federated Learning Over MIMO Fading Channels

    Authors: Hang Liu, Jia Yan, Ying-Jun Angela Zhang

    Abstract: Federated learning (FL) enables edge devices to collaboratively train machine learning models, with model communication replacing direct data uploading. While over-the-air model aggregation improves communication efficiency, uploading models to an edge server over wireless networks can pose privacy risks. Differential privacy (DP) is a widely used quantitative technique to measure statistical data…

    Submitted 25 December, 2023; v1 submitted 19 June, 2023; originally announced June 2023.

    Comments: This work has been accepted by the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  40. arXiv:2306.09135  [pdf, other]

    eess.AS cs.SD

    Time-Domain Wideband Image Source Method for Spherical Microphone Arrays

    Authors: Jiarui Wang, Prasanga Samarasinghe, Thushara Abhayapala, Jihui Aimee Zhang

    Abstract: This paper presents the time-domain wideband spherical microphone array impulse response generator (TDW-SMIR generator), which is a time-domain wideband image source method (ISM) for generating the room impulse responses captured by an open spherical microphone array. To incorporate loudspeaker directivity, the TDW-SMIR generator considers a source that emits a sequence of spherical wave fronts wh…

    Submitted 9 August, 2023; v1 submitted 15 June, 2023; originally announced June 2023.

    Comments: Accepted for publication in the IEEE 25th International Workshop on Multimedia Signal Processing (IEEE MMSP 2023)

  41. arXiv:2306.04512  [pdf, other]

    eess.IV cs.CV physics.med-ph

    Cross-attention learning enables real-time nonuniform rotational distortion correction in OCT

    Authors: Haoran Zhang, Jianlong Yang, Jingqian Zhang, Shiqing Zhao, Aili Zhang

    Abstract: Nonuniform rotational distortion (NURD) correction is vital for endoscopic optical coherence tomography (OCT) imaging and its functional extensions, such as angiography and elastography. Current NURD correction methods require time-consuming feature tracking or cross-correlation calculations and thus sacrifice temporal resolution. Here we propose a cross-attention learning method for the NURD corr…

    Submitted 5 January, 2024; v1 submitted 7 June, 2023; originally announced June 2023.

    Journal ref: Biomedical Optics Express 15.1 (2024): 319-335

  42. arXiv:2306.00804  [pdf, other]

    cs.SD cs.CL eess.AS

    Adaptive Contextual Biasing for Transducer Based Streaming Speech Recognition

    Authors: Tianyi Xu, Zhanheng Yang, Kaixun Huang, Pengcheng Guo, Ao Zhang, Biao Li, Changru Chen, Chao Li, Lei Xie

    Abstract: By incorporating additional contextual information, deep biasing methods have emerged as a promising solution for speech recognition of personalized words. However, for real-world voice assistants, always biasing on such personalized words with high prediction scores can significantly degrade the performance of recognizing common words. To address this issue, we propose an adaptive contextual bias…

    Submitted 15 August, 2023; v1 submitted 1 June, 2023; originally announced June 2023.

  43. arXiv:2305.17938  [pdf, other]

    cs.IT eess.SP

    Complex CNN CSI Enhancer for Integrated Sensing and Communications

    Authors: Xu Chen, Zhiyong Feng, J. Andrew Zhang, Feifei Gao, Xin Yuan, Zhaohui Yang, Ping Zhang

    Abstract: In this paper, we propose a novel complex convolutional neural network (CNN) CSI enhancer for integrated sensing and communications (ISAC), which exploits the correlation between the sensing parameters (such as angle-of-arrival and range) and the channel state information (CSI) to significantly improve the CSI estimation accuracy and further enhance the sensing accuracy. Within the CNN CSI enhance…

    Submitted 19 June, 2023; v1 submitted 29 May, 2023; originally announced May 2023.

    Comments: 13 pages, 15 figures, submitted to IEEE Journal of Selected Topics in Signal Processing

  44. arXiv:2305.12493  [pdf, other]

    eess.AS cs.CL cs.SD

    Contextualized End-to-End Speech Recognition with Contextual Phrase Prediction Network

    Authors: Kaixun Huang, Ao Zhang, Zhanheng Yang, Pengcheng Guo, Bingshen Mu, Tianyi Xu, Lei Xie

    Abstract: Contextual information plays a crucial role in speech recognition technologies and incorporating it into the end-to-end speech recognition models has drawn immense interest recently. However, previous deep bias methods lacked explicit supervision for bias tasks. In this study, we introduce a contextual phrase prediction network for an attention-based deep bias method. This network predicts context…

    Submitted 12 July, 2023; v1 submitted 21 May, 2023; originally announced May 2023.

    Comments: Accepted by Interspeech 2023

  45. arXiv:2305.11548  [pdf, ps, other]

    eess.SP

    Sensing Aided Uplink Transmission in OTFS ISAC with Joint Parameter Association, Channel Estimation and Signal Detection

    Authors: Xi Yang, Hang Li, Qinghua Guo, J. Andrew Zhang, Xiaojing Huang, Zhiqun Cheng

    Abstract: In this work, we study sensing-aided uplink transmission in an integrated sensing and communication (ISAC) vehicular network with the use of orthogonal time frequency space (OTFS) modulation. To exploit sensing parameters for improving uplink communications, the parameters must be first associated with the transmitters, which is a challenging task. We propose a scheme that jointly conducts paramet…

    Submitted 19 May, 2023; originally announced May 2023.

  46. arXiv:2304.11057  [pdf, other]

    eess.SP

    Vital Sign Monitoring in Dynamic Environment via mmWave Radar and Camera Fusion

    Authors: Yingqi Wang, Zhongqin Wang, J. Andrew Zhang, Haimin Zhang, Min Xu

    Abstract: Contact-free vital sign monitoring, which uses wireless signals for recognizing human vital signs (i.e., breath and heartbeat), is an attractive solution to health and security. However, the subject's body movement and the change in actual environments can result in inaccurate frequency estimation of heartbeat and respiration. In this paper, we propose a robust mmWave radar and camera fusion system…

    Submitted 20 April, 2023; originally announced April 2023.

  47. arXiv:2303.06341  [pdf, other]

    eess.AS

    The NPU-ASLP System for Audio-Visual Speech Recognition in MISP 2022 Challenge

    Authors: Pengcheng Guo, He Wang, Bingshen Mu, Ao Zhang, Peikun Chen

    Abstract: This paper describes our NPU-ASLP system for the Audio-Visual Diarization and Recognition (AVDR) task in the Multi-modal Information based Speech Processing (MISP) 2022 Challenge. Specifically, the weighted prediction error (WPE) and guided source separation (GSS) techniques are used to reduce reverberation and generate clean signals for each single speaker first. Then, we explore the effectivenes…

    Submitted 11 March, 2023; originally announced March 2023.

    Comments: 2 pages, accepted by ICASSP 2023

  48. arXiv:2303.04696  [pdf, other]

    eess.IV cs.CV cs.LG q-bio.QM

    VOLTA: an Environment-Aware Contrastive Cell Representation Learning for Histopathology

    Authors: Ramin Nakhli, Allen Zhang, Hossein Farahani, Amirali Darbandsari, Elahe Shenasa, Sidney Thiessen, Katy Milne, Jessica McAlpine, Brad Nelson, C Blake Gilks, Ali Bashashati

    Abstract: In clinical practice, many diagnosis tasks rely on the identification of cells in histopathology images. While supervised machine learning techniques require labels, providing manual cell annotations is time-consuming due to the large number of cells. In this paper, we propose a self-supervised framework (VOLTA) for cell representation learning in histopathology images using a novel technique that…

    Submitted 8 March, 2023; originally announced March 2023.

  49. Joint Beamforming for RIS-Assisted Integrated Sensing and Communication Systems

    Authors: Yongqing Xu, Yong Li, J. Andrew Zhang, Marco Di Renzo, Tony Q. S. Quek

    Abstract: Integrated sensing and communications (ISAC) is an emerging critical technique for the next generation of communication systems. However, due to multiple performance metrics used for communication and sensing, the limited degrees-of-freedom (DoF) in optimizing ISAC systems poses a challenge. Reconfigurable intelligent surfaces (RIS) can introduce new DoF for beamforming in ISAC systems, thereby en…

    Submitted 24 January, 2024; v1 submitted 3 March, 2023; originally announced March 2023.

    Comments: 30 pages, 8 figures. This paper has been accepted by IEEE Transactions on Communications

  50. arXiv:2302.13523  [pdf, other]

    cs.SD eess.AS

    VE-KWS: Visual Modality Enhanced End-to-End Keyword Spotting

    Authors: Ao Zhang, He Wang, Pengcheng Guo, Yihui Fu, Lei Xie, Yingying Gao, Shilei Zhang, Junlan Feng

    Abstract: The performance of the keyword spotting (KWS) system based on audio modality, commonly measured in false alarms and false rejects, degrades significantly under the far field and noisy conditions. Therefore, audio-visual keyword spotting, which leverages complementary relationships over multiple modalities, has recently gained much attention. However, current studies mainly focus on combining the e…

    Submitted 14 March, 2023; v1 submitted 27 February, 2023; originally announced February 2023.

    Comments: 5 pages. Accepted at ICASSP2023