Showing 1–50 of 100 results for author: Zhou, C

Searching in archive eess.
  1. arXiv:2408.17014  [pdf, ps, other]

    eess.SP

    Channel Estimation for XL-IRS Assisted Wireless Systems with Double-sided Visibility Regions

    Authors: Chao Zhou, Changsheng You, Shiqi Gong, Bin Lyu, Beixiong Zheng, Yi Gong

    Abstract: In this paper, we study efficient channel estimation design for extremely large-scale intelligent reflecting surface (XL-IRS) assisted multi-user communication systems, where both the base station (BS) and users are located in the near-field region of the XL-IRS. Two unique channel characteristics of XL-IRS are considered, namely, the near-field spherical wavefronts and double-sided visibility…

    Submitted 30 August, 2024; originally announced August 2024.

    Comments: 6 pages, 5 figures

  2. arXiv:2408.15771  [pdf, other]

    eess.AS cs.LG cs.SD

    wav2pos: Sound Source Localization using Masked Autoencoders

    Authors: Axel Berg, Jens Gulin, Mark O'Connor, Chuteng Zhou, Karl Åström, Magnus Oskarsson

    Abstract: We present a novel approach to the 3D sound source localization task for distributed ad-hoc microphone arrays by formulating it as a set-to-set regression problem. By training a multi-modal masked autoencoder model that operates on audio recordings and microphone coordinates, we show that such a formulation allows for accurate localization of the sound source, by reconstructing coordinates masked…

    Submitted 28 August, 2024; originally announced August 2024.

    Comments: IPIN 2024

  3. arXiv:2408.11480  [pdf, other]

    eess.IV cs.CV

    OAPT: Offset-Aware Partition Transformer for Double JPEG Artifacts Removal

    Authors: Qiao Mo, Yukang Ding, Jinhua Hao, Qiang Zhu, Ming Sun, Chao Zhou, Feiyu Chen, Shuyuan Zhu

    Abstract: Deep learning-based methods have shown remarkable performance in the single JPEG artifacts removal task. However, existing methods tend to degrade on double JPEG images, which are prevalent in real-world scenarios. To address this issue, we propose Offset-Aware Partition Transformer for double JPEG artifacts removal, termed as OAPT. We conduct an analysis of double JPEG compression that results in up…

    Submitted 21 August, 2024; originally announced August 2024.

    Comments: 14 pages, 9 figures. Codes and models are available at https://github.com/QMoQ/OAPT.git

  4. arXiv:2408.05785  [pdf, ps, other]

    eess.SP

    Movable Antenna Enabled Symbiotic Radio Systems: An Opportunity for Mutualism

    Authors: Chao Zhou, Bin Lyu, Changsheng You, Ziwei Liu

    Abstract: In this letter, we propose a new movable antenna (MA) enabled symbiotic radio (SR) system that leverages the movement of MAs to maximize both the primary and secondary rates, thereby promoting their mutualism. Specifically, the primary transmitter (PT) equipped with MAs utilizes a maximum ratio transmission (MRT) beamforming scheme to ensure the highest primary rate at the primary user (PU). Concu…

    Submitted 11 August, 2024; originally announced August 2024.

    Comments: 5 pages, 5 figures. Accepted to IEEE Wireless Communications Letters

  5. arXiv:2407.19753  [pdf, other]

    cs.CV eess.SP

    PredIN: Towards Open-Set Gesture Recognition via Prediction Inconsistency

    Authors: Chen Liu, Can Han, Chengfeng Zhou, Crystal Cai, Dahong Qian

    Abstract: Gesture recognition based on surface electromyography (sEMG) has achieved significant progress in human-machine interaction (HMI). However, accurately recognizing predefined gestures within a closed set is still inadequate in practice; a robust open-set system needs to effectively reject unknown gestures while correctly classifying known ones. To handle this challenge, we first report prediction i…

    Submitted 29 July, 2024; originally announced July 2024.

    Comments: Under review

  6. arXiv:2407.19130  [pdf]

    physics.optics eess.IV

    Panoramic single-pixel imaging with megapixel resolution based on rotational subdivision

    Authors: Huan Cui, Jie Cao, Haoyu Zhang, Chang Zhou, Haifeng Yao, Yingbo Wang, Qun Hao

    Abstract: Single-pixel imaging (SPI) using a single-pixel detector is an unconventional imaging method, which has great application prospects in many fields to realize high-performance imaging. In particular, the recently proposed catadioptric panoramic ghost imaging (CPGI) extends the application potential of SPI to high-performance imaging at a wide field of view (FOV) with recent growing demands. However, th…

    Submitted 26 July, 2024; originally announced July 2024.

  7. arXiv:2407.10759  [pdf, other]

    eess.AS cs.CL cs.LG

    Qwen2-Audio Technical Report

    Authors: Yunfei Chu, Jin Xu, Qian Yang, Haojie Wei, Xipin Wei, Zhifang Guo, Yichong Leng, Yuanjun Lv, Jinzheng He, Junyang Lin, Chang Zhou, Jingren Zhou

    Abstract: We introduce the latest progress of Qwen-Audio, a large-scale audio-language model called Qwen2-Audio, which is capable of accepting various audio signal inputs and performing audio analysis or direct textual responses with regard to speech instructions. In contrast to complex hierarchical tags, we have simplified the pre-training process by utilizing natural language prompts for different data an…

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: https://github.com/QwenLM/Qwen2-Audio. Checkpoints, code, and scripts will be open-sourced soon

  8. arXiv:2406.14931  [pdf, other]

    eess.SP

    Multi-beam Training for Near-field Communications in High-frequency Bands

    Authors: Cong Zhou, Changsheng You, Zixuan Huang, Shuo Shi, Yi Gong, Chan-Byoung Chae, Kaibin Huang

    Abstract: In this paper, we study efficient multi-beam training design for near-field communications to reduce the beam training overhead of conventional single-beam training methods. In particular, the array-division based multi-beam training method, which is widely used in far-field communications, cannot be directly applied to the near-field scenario, since different sub-arrays may observe different user…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: In this paper, a novel near-field multi-beam training scheme is proposed by sparsely activating a portion of antennas to form a sparse linear array

  9. arXiv:2406.04262  [pdf, other]

    eess.SP

    Near-field Beam Training with Sparse DFT Codebook

    Authors: Cong Zhou, Chenyu Wu, Changsheng You, Shuo Shi

    Abstract: Extremely large-scale array (XL-array) has emerged as one promising technology to improve the spectral efficiency and spatial resolution of future sixth generation (6G) wireless systems. The upsurge in the number of antennas renders communication users more likely to be located in the near-field region, which requires a more accurate spherical (instead of planar) wavefront propagation modeling…

    Submitted 18 June, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

    Comments: In this paper, we propose a novel sparse DFT codebook to reduce near-field beam training overhead, which is equivalent to sparsely activating the dense array

  10. arXiv:2406.03875  [pdf, other]

    eess.SY

    Energy-storing analysis and fishtail stiffness optimization for a wire-driven elastic robotic fish

    Authors: Xiaocun Liao, Chao Zhou, Junfeng Fan, Zhuoliang Zhang, Zhaoran Yin, Liangwei Deng

    Abstract: The robotic fish with high propulsion efficiency and good maneuverability achieves underwater fishlike propulsion by commonly adopting the motor to drive the fishtail, causing the significant fluctuations of the motor power due to the uneven swing speed of the fishtail in one swing cycle. Hence, we propose a wire-driven robotic fish with a spring-steel-based active-segment elastic spine. This bion…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: 14 pages, 19 figures

  11. arXiv:2405.06297  [pdf, ps, other]

    eess.SY

    Joint Uplink and Downlink Rate Splitting for Fog Computing-Enabled Internet of Medical Things

    Authors: Jiasi Zhou, Yan Chen, Cong Zhou, Yanjing Sun

    Abstract: The Internet of Medical Things (IoMT) facilitates in-home electronic healthcare, transforming traditional hospital-based medical examination approaches. This paper proposes a novel transmit scheme for fog computing-enabled IoMT that leverages uplink and downlink rate splitting (RS). Fog computing allows offloading partial computation tasks to the edge server and processing the remainder of the tas…

    Submitted 10 May, 2024; originally announced May 2024.

    Comments: submitted to IEEE Transactions on Cognitive Communications and Networking

  12. arXiv:2404.11313  [pdf, other]

    eess.IV cs.AI

    NTIRE 2024 Challenge on Short-form UGC Video Quality Assessment: Methods and Results

    Authors: Xin Li, Kun Yuan, Yajing Pei, Yiting Lu, Ming Sun, Chao Zhou, Zhibo Chen, Radu Timofte, Wei Sun, Haoning Wu, Zicheng Zhang, Jun Jia, Zhichao Zhang, Linhan Cao, Qiubo Chen, Xiongkuo Min, Weisi Lin, Guangtao Zhai, Jianhui Sun, Tianyi Wang, Lei Li, Han Kong, Wenxuan Wang, Bing Li, Cheng Luo, et al. (43 additional authors not shown)

    Abstract: This paper reviews the NTIRE 2024 Challenge on Shortform UGC Video Quality Assessment (S-UGC VQA), where various excellent solutions are submitted and evaluated on the collected dataset KVQ from popular short-form video platform, i.e., Kuaishou/Kwai Platform. The KVQ database is divided into three parts, including 2926 videos for training, 420 videos for validation, and 854 videos for testing. The…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: Accepted by CVPR2024 Workshop. The challenge report for CVPR NTIRE2024 Short-form UGC Video Quality Assessment Challenge

  13. arXiv:2404.08224   

    cs.LG cs.AI cs.CR cs.IT eess.SY

    HCL-MTSAD: Hierarchical Contrastive Consistency Learning for Accurate Detection of Industrial Multivariate Time Series Anomalies

    Authors: Haili Sun, Yan Huang, Lansheng Han, Cai Fu, Chunjie Zhou

    Abstract: Multivariate Time Series (MTS) anomaly detection focuses on pinpointing samples that diverge from standard operational patterns, which is crucial for ensuring the safety and security of industrial applications. The primary challenge in this domain is to develop representations capable of discerning anomalies effectively. The prevalent methods for anomaly detection in the literature are predominant…

    Submitted 18 April, 2024; v1 submitted 11 April, 2024; originally announced April 2024.

    Comments: This paper is a manuscript that is still in the process of revision, including Table 1, Figure 2, problem definition in section III.B and method description proposed in section IV. In addition, the submitter has not been authorized by the first author and other co-authors to post the paper to arXiv

  14. arXiv:2403.10362  [pdf, other]

    eess.IV cs.CV

    CPGA: Coding Priors-Guided Aggregation Network for Compressed Video Quality Enhancement

    Authors: Qiang Zhu, Jinhua Hao, Yukang Ding, Yu Liu, Qiao Mo, Ming Sun, Chao Zhou, Shuyuan Zhu

    Abstract: Recently, numerous approaches have achieved notable success in compressed video quality enhancement (VQE). However, these methods usually ignore the utilization of valuable coding priors inherently embedded in compressed videos, such as motion vectors and residual frames, which carry abundant temporal and spatial information. To remedy this problem, we propose the Coding Priors-Guided Aggregation…

    Submitted 15 March, 2024; originally announced March 2024.

  15. arXiv:2403.03426  [pdf, other]

    physics.optics eess.IV

    Combined optimization ghost imaging based on random speckle field

    Authors: Zhiqing Yang, Cheng Zhou, Gangcheng Wang, Lijun Song

    Abstract: Ghost imaging is a non-local imaging technology, which can obtain target information by measuring the second-order intensity correlation between the reference light field and the target detection light field. However, the current imaging environment requires a large amount of measurement data, and the imaging results also have the problems of low image resolution and long reconstruction time. Ther…

    Submitted 5 March, 2024; originally announced March 2024.

    Comments: 6 pages, 5 figures

  16. arXiv:2403.02616  [pdf]

    cs.LG cs.AI cs.CR cs.NI eess.SY

    Unsupervised Spatio-Temporal State Estimation for Fine-grained Adaptive Anomaly Diagnosis of Industrial Cyber-physical Systems

    Authors: Haili Sun, Yan Huang, Lansheng Han, Cai Fu, Chunjie Zhou

    Abstract: Accurate detection and diagnosis of abnormal behaviors such as network attacks from multivariate time series (MTS) are crucial for ensuring the stable and effective operation of industrial cyber-physical systems (CPS). However, existing research pays little attention to the logical dependencies among system working states, and has difficulties in explaining the evolution mechanisms of abnormal s…

    Submitted 4 March, 2024; originally announced March 2024.

    Comments: 23 pages, 7 figures

  17. arXiv:2402.17043  [pdf, other]

    eess.SY

    Traffic Control via Connected and Automated Vehicles: An Open-Road Field Experiment with 100 CAVs

    Authors: Jonathan W. Lee, Han Wang, Kathy Jang, Amaury Hayat, Matthew Bunting, Arwa Alanqary, William Barbour, Zhe Fu, Xiaoqian Gong, George Gunter, Sharon Hornstein, Abdul Rahman Kreidieh, Nathan Lichtlé, Matthew W. Nice, William A. Richardson, Adit Shah, Eugene Vinitsky, Fangyu Wu, Shengquan Xiang, Sulaiman Almatrudi, Fahd Althukair, Rahul Bhadani, Joy Carpio, Raphael Chekroun, Eric Cheng, et al. (39 additional authors not shown)

    Abstract: The CIRCLES project aims to reduce instabilities in traffic flow, which are naturally occurring phenomena due to human driving behavior. These "phantom jams" or "stop-and-go waves," are a significant source of wasted energy. Toward this goal, the CIRCLES project designed a control system referred to as the MegaController by the CIRCLES team, that could be deployed in real traffic. Our field experim…

    Submitted 26 February, 2024; originally announced February 2024.

  18. arXiv:2402.07729  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension

    Authors: Qian Yang, Jin Xu, Wenrui Liu, Yunfei Chu, Ziyue Jiang, Xiaohuan Zhou, Yichong Leng, Yuanjun Lv, Zhou Zhao, Chang Zhou, Jingren Zhou

    Abstract: Recently, instruction-following audio-language models have received broad attention for human-audio interaction. However, the absence of benchmarks capable of evaluating audio-centric interaction capabilities has impeded advancements in this field. Previous models primarily focus on assessing different fundamental tasks, such as Automatic Speech Recognition (ASR), and lack an assessment of the ope…

    Submitted 26 July, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

    Comments: Code and Data: https://github.com/OFA-Sys/AIR-Bench. Accepted by ACL 2024

  19. arXiv:2402.07220  [pdf, other]

    eess.IV cs.CV

    KVQ: Kwai Video Quality Assessment for Short-form Videos

    Authors: Yiting Lu, Xin Li, Yajing Pei, Kun Yuan, Qizhi Xie, Yunpeng Qu, Ming Sun, Chao Zhou, Zhibo Chen

    Abstract: Short-form UGC video platforms, like Kwai and TikTok, have been an emerging and irreplaceable mainstream media form, thriving on user-friendly engagement, and kaleidoscope creation, etc. However, the advancing content-generation modes, e.g., special effects, and sophisticated processing workflows, e.g., de-artifacts, have introduced significant challenges to recent UGC video quality assessment: (i…

    Submitted 20 February, 2024; v1 submitted 11 February, 2024; originally announced February 2024.

    Comments: 19 pages

  20. arXiv:2402.01808  [pdf, other]

    cs.SD eess.AS

    KS-Net: Multi-band joint speech restoration and enhancement network for 2024 ICASSP SSI Challenge

    Authors: Guochen Yu, Runqiang Han, Chenglin Xu, Haoran Zhao, Nan Li, Chen Zhang, Xiguang Zheng, Chao Zhou, Qi Huang, Bing Yu

    Abstract: This paper presents the speech restoration and enhancement system created by the 1024K team for the ICASSP 2024 Speech Signal Improvement (SSI) Challenge. Our system consists of a generative adversarial network (GAN) in complex-domain for speech restoration and a fine-grained multi-band fusion module for speech enhancement. In the blind test set of SSI, the proposed system achieves an overall mean…

    Submitted 2 February, 2024; originally announced February 2024.

    Comments: Accepted to ICASSP 2024; Rank 1st in ICASSP 2024 Speech Signal Improvement (SSI) Challenge

  21. arXiv:2401.15235  [pdf, other]

    eess.IV cs.CV cs.LG

    CascadedGaze: Efficiency in Global Context Extraction for Image Restoration

    Authors: Amirhosein Ghasemabadi, Muhammad Kamran Janjua, Mohammad Salameh, Chunhua Zhou, Fengyu Sun, Di Niu

    Abstract: Image restoration tasks traditionally rely on convolutional neural networks. However, given the local nature of the convolutional operator, they struggle to capture global information. The promise of attention mechanisms in Transformers is to circumvent this problem, but it comes at the cost of intensive computational overhead. Many recent studies in image restoration have focused on solving the c…

    Submitted 7 May, 2024; v1 submitted 26 January, 2024; originally announced January 2024.

    Comments: Published in Transactions on Machine Learning Research (TMLR), 2024. 20 pages

  22. arXiv:2401.05690  [pdf, other]

    cs.IT eess.SP

    Sparse Array Enabled Near-Field Communications: Beam Pattern Analysis and Hybrid Beamforming Design

    Authors: Cong Zhou, Changsheng You, Haodong Zhang, Li Chen, Shuo Shi

    Abstract: Extremely large-scale array (XL-array) has emerged as a promising technology to enable near-field communications for achieving enhanced spectrum efficiency and spatial resolution, by drastically increasing the number of antennas. However, this also inevitably incurs higher hardware and energy cost, which may not be affordable in future wireless systems. To address this issue, we propose in this pa…

    Submitted 11 January, 2024; originally announced January 2024.

    Comments: In this paper, we propose to exploit sparse arrays for enabling near-field communications and characterize its unique beam pattern for facilitating its hybrid beamforming design

  23. arXiv:2312.13722  [pdf, other]

    cs.SD eess.AS

    BAE-Net: A Low complexity and high fidelity Bandwidth-Adaptive neural network for speech super-resolution

    Authors: Guochen Yu, Xiguang Zheng, Nan Li, Runqiang Han, Chengshi Zheng, Chen Zhang, Chao Zhou, Qi Huang, Bing Yu

    Abstract: Speech bandwidth extension (BWE) has demonstrated promising performance in enhancing the perceptual speech quality in real communication systems. Most existing BWE research primarily focuses on fixed upsampling ratios, disregarding the fact that the effective bandwidth of captured audio may fluctuate frequently due to various capturing devices and transmission conditions. In this paper, we propose…

    Submitted 21 December, 2023; originally announced December 2023.

    Comments: Accepted to ICASSP 2024

  24. arXiv:2312.06050  [pdf, other]

    cs.LG eess.IV stat.ML

    Federated Multilinear Principal Component Analysis with Applications in Prognostics

    Authors: Chengyu Zhou, Yuqi Su, Tangbin Xia, Xiaolei Fang

    Abstract: Multilinear Principal Component Analysis (MPCA) is a widely utilized method for the dimension reduction of tensor data. However, the integration of MPCA into federated learning remains unexplored in existing research. To tackle this gap, this article proposes a Federated Multilinear Principal Component Analysis (FMPCA) method, which enables multiple users to collaboratively reduce the dimension of…

    Submitted 28 April, 2024; v1 submitted 10 December, 2023; originally announced December 2023.

  25. arXiv:2311.12223  [pdf, other]

    cs.NI cs.AI eess.SP

    Digital Twin-Based User-Centric Edge Continual Learning in Integrated Sensing and Communication

    Authors: Shisheng Hu, Jie Gao, Xinyu Huang, Mushu Li, Kaige Qu, Conghao Zhou, Xuemin Shen

    Abstract: In this paper, we propose a digital twin (DT)-based user-centric approach for processing sensing data in an integrated sensing and communication (ISAC) system with high accuracy and efficient resource utilization. The considered scenario involves an ISAC device with a lightweight deep neural network (DNN) and a mobile edge computing (MEC) server with a large DNN. After collecting sensing data, the…

    Submitted 20 November, 2023; originally announced November 2023.

    Comments: submitted to IEEE ICC 2024

  26. arXiv:2311.10645  [pdf, other]

    eess.SP cs.MM eess.SY

    User Dynamics-Aware Edge Caching and Computing for Mobile Virtual Reality

    Authors: Mushu Li, Jie Gao, Conghao Zhou, Xuemin Shen, Weihua Zhuang

    Abstract: In this paper, we present a novel content caching and delivery approach for mobile virtual reality (VR) video streaming. The proposed approach aims to maximize VR video streaming performance, i.e., minimizing video frame missing rate, by proactively caching popular VR video chunks and adaptively scheduling computing resources at an edge server based on user and network dynamics. First, we design a…

    Submitted 17 November, 2023; originally announced November 2023.

    Comments: 38 pages, 13 figures, single column double spaced, published in IEEE Journal of Selected Topics in Signal Processing

    Journal ref: in IEEE Journal of Selected Topics in Signal Processing, vol. 17, no. 5, pp. 1131-1146, Sept. 2023

  27. arXiv:2311.07919  [pdf, other]

    eess.AS cs.CL cs.LG

    Qwen-Audio: Advancing Universal Audio Understanding via Unified Large-Scale Audio-Language Models

    Authors: Yunfei Chu, Jin Xu, Xiaohuan Zhou, Qian Yang, Shiliang Zhang, Zhijie Yan, Chang Zhou, Jingren Zhou

    Abstract: Recently, instruction-following audio-language models have received broad attention for audio interaction with humans. However, the absence of pre-trained audio models capable of handling diverse audio types and tasks has hindered progress in this field. Consequently, most existing works have only been able to support a limited range of interaction capabilities. In this paper, we develop the Qwen-…

    Submitted 21 December, 2023; v1 submitted 14 November, 2023; originally announced November 2023.

    Comments: The code, checkpoints and demo are released at https://github.com/QwenLM/Qwen-Audio

  28. arXiv:2311.02646  [pdf]

    eess.IV physics.optics

    Flexible uniform-sampling foveated Fourier single-pixel imaging

    Authors: Huan Cui, Jie Cao, Qun Hao, Haoyu Zhang, Chang Zhou

    Abstract: Fourier single-pixel imaging (FSI) is a data-efficient single-pixel imaging (SPI) method. However, there is still a serious challenge to obtain higher imaging quality using fewer measurements, which limits the development of real-time SPI. In this work, a uniform-sampling foveated FSI (UFFSI) is proposed with three features, uniform sampling, effective sampling and flexible fovea, to achieve under-sampli…

    Submitted 5 November, 2023; originally announced November 2023.

    Comments: 7 pages, 5 figures

  29. arXiv:2311.00246  [pdf, ps, other]

    cs.CV eess.IV

    RAUNE-Net: A Residual and Attention-Driven Underwater Image Enhancement Method

    Authors: Wangzhen Peng, Chenghao Zhou, Runze Hu, Jingchao Cao, Yutao Liu

    Abstract: Underwater image enhancement (UIE) poses challenges due to distinctive properties of the underwater environment, including low contrast, high turbidity, visual blurriness, and color distortion. In recent years, the application of deep learning has quietly revolutionized various areas of scientific research, including UIE. However, existing deep learning-based UIE methods generally suffer from issu…

    Submitted 31 October, 2023; originally announced November 2023.

  30. arXiv:2310.04673  [pdf, other]

    cs.SD cs.AI cs.LG cs.MM eess.AS

    LauraGPT: Listen, Attend, Understand, and Regenerate Audio with GPT

    Authors: Zhihao Du, Jiaming Wang, Qian Chen, Yunfei Chu, Zhifu Gao, Zerui Li, Kai Hu, Xiaohuan Zhou, Jin Xu, Ziyang Ma, Wen Wang, Siqi Zheng, Chang Zhou, Zhijie Yan, Shiliang Zhang

    Abstract: Generative Pre-trained Transformer (GPT) models have achieved remarkable performance on various natural language processing tasks, and have shown great potential as backbones for audio-and-text large language models (LLMs). Previous mainstream audio-and-text LLMs use discrete audio tokens to represent both input and output audio; however, they suffer from performance degradation on tasks such as a…

    Submitted 2 July, 2024; v1 submitted 6 October, 2023; originally announced October 2023.

    Comments: 10 pages, work in progress

  31. arXiv:2308.13849  [pdf, other]

    cs.LG cs.AI eess.SY

    Effectively Heterogeneous Federated Learning: A Pairing and Split Learning Based Approach

    Authors: Jinglong Shen, Xiucheng Wang, Nan Cheng, Longfei Ma, Conghao Zhou, Yuan Zhang

    Abstract: As a promising paradigm, federated learning (FL) is widely used in privacy-preserving machine learning, which allows distributed devices to collaboratively train a model while avoiding data transmission among clients. Despite its immense potential, FL suffers from bottlenecks in training speed due to client heterogeneity, leading to escalated training latency and straggling server aggregation.…

    Submitted 26 August, 2023; originally announced August 2023.

  32. AI-Assisted Slicing-Based Resource Management for Two-Tier Radio Access Networks

    Authors: Conghao Zhou, Jie Gao, Mushu Li, Xuemin Shen, Weihua Zhuang, Xu Li, Weisen Shi

    Abstract: While network slicing has become a prevalent approach to service differentiation, radio access network (RAN) slicing remains challenging due to the need for substantial adaptivity and flexibility to cope with the highly dynamic network environment in RANs. In this paper, we develop a slicing-based resource management framework for a two-tier RAN to support multiple services with different quality o…

    Submitted 21 August, 2023; originally announced August 2023.

    Comments: Accepted by IEEE Transactions on Cognitive Communications and Networking

  33. arXiv:2308.00214  [pdf]

    cs.CV cs.LG eess.IV

    The Impact of Loss Functions and Scene Representations for 3D/2D Registration on Single-view Fluoroscopic X-ray Pose Estimation

    Authors: Chaochao Zhou, Syed Hasib Akhter Faruqui, Abhinav Patel, Ramez N. Abdalla, Michael C. Hurley, Ali Shaibani, Matthew B. Potts, Babak S. Jahromi, Sameer A. Ansari, Donald R. Cantrell

    Abstract: Many tasks performed in image-guided procedures can be cast as pose estimation problems, where specific projections are chosen to reach a target in 3D space. In this study, we first develop a differentiable projection (DiffProj) rendering framework for the efficient computation of Digitally Reconstructed Radiographs (DRRs) with automatic differentiability from either Cone-Beam Computerized Tomogra…

    Submitted 27 February, 2024; v1 submitted 31 July, 2023; originally announced August 2023.

  34. arXiv:2307.08703  [pdf]

    cs.HC cs.AI cs.CV eess.SY

    SSVEP-Based BCI Wheelchair Control System

    Authors: Ce Zhou

    Abstract: A brain-computer interface (BCI) is a system that allows a person to communicate or control the surroundings without depending on the brain's normal output pathways of peripheral nerves and muscles. A lot of successful applications have arisen utilizing the advantages of BCI to assist disabled people with so-called assistive technology. Considering using BCI has fewer limitations and huge potentia…

    Submitted 12 July, 2023; originally announced July 2023.

    Comments: 108 pages

  35. arXiv:2306.03407  [pdf, other]

    eess.IV cs.CV

    LESS: Label-efficient Multi-scale Learning for Cytological Whole Slide Image Screening

    Authors: Beidi Zhao, Wenlong Deng, Zi Han Li, Chen Zhou, Zuhua Gao, Gang Wang, Xiaoxiao Li

    Abstract: In computational pathology, multiple instance learning (MIL) is widely used to circumvent the computational impasse in giga-pixel whole slide image (WSI) analysis. It usually consists of two stages: patch-level feature extraction and slide-level aggregation. Recently, pretrained models or self-supervised learning have been used to extract patch features, but they suffer from low effectiveness or i…

    Submitted 20 September, 2023; v1 submitted 6 June, 2023; originally announced June 2023.

    Comments: This paper was submitted to Medical Image Analysis. It is under review

  36. arXiv:2306.02634  [pdf, other]

    physics.optics cs.CV cs.LG eess.IV

    Computational 3D topographic microscopy from terabytes of data per sample

    Authors: Kevin C. Zhou, Mark Harfouche, Maxwell Zheng, Joakim Jönsson, Kyung Chul Lee, Ron Appel, Paul Reamey, Thomas Doman, Veton Saliu, Gregor Horstmeyer, Roarke Horstmeyer

    Abstract: We present a large-scale computational 3D topographic microscope that enables 6-gigapixel profilometric 3D imaging at micron-scale resolution across $>$110 cm$^2$ areas over multi-millimeter axial ranges. Our computational microscope, termed STARCAM (Scanning Topographic All-in-focus Reconstruction with a Computational Array Microscope), features a parallelized, 54-camera architecture with 3-axis…

    Submitted 5 June, 2023; originally announced June 2023.

  37. arXiv:2305.11172  [pdf, other]

    cs.CV cs.CL cs.SD eess.AS

    ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities

    Authors: Peng Wang, Shijie Wang, Junyang Lin, Shuai Bai, Xiaohuan Zhou, Jingren Zhou, Xinggang Wang, Chang Zhou

    Abstract: In this work, we explore a scalable way for building a general representation model toward unlimited modalities. We release ONE-PEACE, a highly extensible model with 4B parameters that can seamlessly align and integrate representations across vision, audio, and language modalities. The architecture of ONE-PEACE comprises modality adapters, shared self-attention layers, and modality FFNs. This desi…

    Submitted 18 May, 2023; originally announced May 2023.

    Comments: 30 pages, 9 figures, 18 tables

  38. arXiv:2305.05085  [pdf, other]

    physics.optics eess.IV

    Tensorial tomographic Fourier Ptychography with applications to muscle tissue imaging

    Authors: Shiqi Xu, Xiang Dai, Paul Ritter, Kyung Chul Lee, Xi Yang, Lucas Kreiss, Kevin C. Zhou, Kanghyun Kim, Amey Chaware, Jadee Neff, Carolyn Glass, Seung Ah Lee, Oliver Friedrich, Roarke Horstmeyer

    Abstract: We report Tensorial tomographic Fourier Ptychography (ToFu), a new non-scanning label-free tomographic microscopy method for simultaneous imaging of quantitative phase and anisotropic specimen information in 3D. Built upon Fourier Ptychography, a quantitative phase imaging technique, ToFu additionally highlights the vectorial nature of light. The imaging setup consists of a standard microscope equ…

    Submitted 13 May, 2023; v1 submitted 8 May, 2023; originally announced May 2023.

    Journal ref: Tensorial tomographic Fourier Ptychography with applications to muscle tissue imaging, Adv. Photon. 6(2), 026004 (2024)

  39. arXiv:2304.14503  [pdf]

    cs.CV eess.IV physics.ins-det physics.optics

    UHRNet: A Deep Learning-Based Method for Accurate 3D Reconstruction from a Single Fringe-Pattern

    Authors: Yixiao Wang, Canlin Zhou, Xingyang Qi, Hui Li

    Abstract: The quick and accurate retrieval of an object height from a single fringe pattern in Fringe Projection Profilometry has been a topic of ongoing research. While a single shot fringe to depth CNN based method can restore height map directly from a single pattern, its accuracy is currently inferior to the traditional phase shifting technique. To improve this method's accuracy, we propose using a U sh…

    Submitted 23 April, 2023; originally announced April 2023.

  40. arXiv:2304.10095  [pdf, ps, other]

    cs.IT eess.SP

    Transmit Power Minimization for STAR-RIS Empowered Symbiotic Radio Communications

    Authors: Chao Zhou, Bin Lyu, Youhong Feng, Dinh Thai Hoang

    Abstract: In this paper, we propose a simultaneously transmitting and reflecting reconfigurable intelligent surface (STAR-RIS) empowered transmission scheme for symbiotic radio (SR) systems to make more flexibility for network deployment and enhance system performance. The STAR-RIS is utilized to not only beam the primary signals from the base station (BS) towards multiple primary users on the same side of…

    Submitted 20 April, 2023; originally announced April 2023.

    Comments: 32 pages, 12 figures

  41. arXiv:2304.02398  [pdf, ps, other]

    cs.IT eess.SP

    Robust Secure Transmission for Active RIS Enabled Symbiotic Radio Multicast Communications

    Authors: Bin Lyu, Chao Zhou, Shimin Gong, Dinh Thai Hoang, Ying-chang Liang

    Abstract: In this paper, we propose a robust secure transmission scheme for an active reconfigurable intelligent surface (RIS) enabled symbiotic radio (SR) system in the presence of multiple eavesdroppers (Eves). In the considered system, the active RIS is adopted to enable the secure transmission of primary signals from the primary transmitter to multiple primary users in a multicasting manner, and simulta…

    Submitted 13 April, 2023; v1 submitted 5 April, 2023; originally announced April 2023.

    Comments: 32 Pages, 12 figures, accepted to IEEE Transactions on Wireless Communications

  42. arXiv:2303.12801  [pdf, ps, other]

    eess.IV cs.CV cs.LG

    A Data Augmentation Method and the Embedding Mechanism for Detection and Classification of Pulmonary Nodules on Small Samples

    Authors: Yang Liu, Yue-Jie Hou, Chen-Xin Qin, Xin-Hui Li, Si-Jing Li, Bin Wang, Chi-Chun Zhou

    Abstract: Detection of pulmonary nodules by CT is used for screening lung cancer in early stages. Computer aided diagnosis (CAD) based on deep-learning method can identify the suspected areas of pulmonary nodules in CT images, thus improving the accuracy and efficiency of CT diagnosis. The accuracy and robustness of deep learning models. Method: In this paper, we explore (1) the data augmentation method based…

    Submitted 2 March, 2023; originally announced March 2023.

  43. arXiv:2303.08140  [pdf, other]

    eess.IV cs.LG physics.bio-ph

    Digital staining in optical microscopy using deep learning -- a review

    Authors: Lucas Kreiss, Shaowei Jiang, Xiang Li, Shiqi Xu, Kevin C. Zhou, Alexander Mühlberg, Kyung Chul Lee, Kanghyun Kim, Amey Chaware, Michael Ando, Laura Barisoni, Seung Ah Lee, Guoan Zheng, Kyle Lafata, Oliver Friedrich, Roarke Horstmeyer

    Abstract: Until recently, conventional biochemical staining had the undisputed status as well-established benchmark for most biomedical problems related to clinical diagnostics, fundamental research and biotechnology. Despite this role as gold-standard, staining protocols face several challenges, such as a need for extensive, manual processing of samples, substantial time delays, altered tissue homeostasis,…

    Submitted 14 March, 2023; originally announced March 2023.

    Comments: Review article, 4 main Figures, 3 Tables, 2 supplementary figures

  44. Toward Immersive Communications in 6G

    Authors: Xuemin Shen, Jie Gao, Mushu Li, Conghao Zhou, Shisheng Hu, Mingcheng He, Weihua Zhuang

    Abstract: The sixth generation (6G) networks are expected to enable immersive communications and bridge the physical and the virtual worlds. Integrating extended reality, holography, and haptics, immersive communications will revolutionize how people work, entertain, and communicate by enabling lifelike interactions. However, the unprecedented demand for data transmission rate and the stringent requirements…

    Submitted 7 March, 2023; originally announced March 2023.

    Comments: 29 pages, 8 Figures, published by Frontiers of Computer Science

    Journal ref: Front. Comput. Sci. 4:1068478 (2023)

  45. arXiv:2302.10601  [pdf, other]

    cs.CR cs.AI eess.SY

    Few-shot Detection of Anomalies in Industrial Cyber-Physical System via Prototypical Network and Contrastive Learning

    Authors: Haili Sun, Yan Huang, Lansheng Han, Chunjie Zhou

    Abstract: The rapid development of Industry 4.0 has amplified the scope and destructiveness of industrial Cyber-Physical System (CPS) by network attacks. Anomaly detection techniques are employed to identify these attacks and guarantee the normal operation of industrial CPS. However, it is still a challenging problem to cope with scenarios with few labeled samples. In this paper, we propose a few-shot anoma…

    Submitted 21 February, 2023; originally announced February 2023.

    Comments: 10 pages, 7 figures, under review

  46. arXiv:2301.08351  [pdf, other]

    physics.optics eess.IV physics.bio-ph

    Parallelized computational 3D video microscopy of freely moving organisms at multiple gigapixels per second

    Authors: Kevin C. Zhou, Mark Harfouche, Colin L. Cooke, Jaehee Park, Pavan C. Konda, Lucas Kreiss, Kanghyun Kim, Joakim Jönsson, Jed Doman, Paul Reamey, Veton Saliu, Clare B. Cook, Maxwell Zheng, Jack P. Bechtel, Aurélien Bègue, Matthew McCarroll, Jennifer Bagwell, Gregor Horstmeyer, Michel Bagnat, Roarke Horstmeyer

    Abstract: To study the behavior of freely moving model organisms such as zebrafish (Danio rerio) and fruit flies (Drosophila) across multiple spatial scales, it would be ideal to use a light microscope that can resolve 3D information over a wide field of view (FOV) at high speed and high spatial resolution. However, it is challenging to design an optical instrument to achieve all of these properties simulta…

    Submitted 19 January, 2023; originally announced January 2023.

  47. arXiv:2301.00130  [pdf, other]

    eess.SY cs.AI cs.LG

    Accuracy-Guaranteed Collaborative DNN Inference in Industrial IoT via Deep Reinforcement Learning

    Authors: Wen Wu, Peng Yang, Weiting Zhang, Conghao Zhou, Xuemin Shen

    Abstract: Collaboration among industrial Internet of Things (IoT) devices and edge networks is essential to support computation-intensive deep neural network (DNN) inference services which require low delay and high accuracy. Sampling rate adaption which dynamically configures the sampling rates of industrial IoT devices according to network conditions, is the key in minimizing the service delay. In this pa…

    Submitted 31 December, 2022; originally announced January 2023.

    Comments: Accepted by Transactions on Industrial Informatics (TII)

  48. arXiv:2212.04583  [pdf]

    eess.AS cs.SD

    High Quality Audio Coding with MDCTNet

    Authors: Grant Davidson, Mark Vinton, Per Ekstrand, Cong Zhou, Lars Villemoes, Lie Lu

    Abstract: We propose a neural audio generative model, MDCTNet, operating in the perceptually weighted domain of an adaptive modified discrete cosine transform (MDCT). The architecture of the model captures correlations in both time and frequency directions with recurrent layers (RNNs). An audio coding system is obtained by training MDCTNet on a diverse set of fullband monophonic audio signals at 48 kHz samp…

    Submitted 8 December, 2022; originally announced December 2022.

    Comments: Five pages, five figures

  49. arXiv:2212.00500  [pdf, other]

    cs.MM cs.CL cs.LG cs.SD eess.AS

    MMSpeech: Multi-modal Multi-task Encoder-Decoder Pre-training for Speech Recognition

    Authors: Xiaohuan Zhou, Jiaming Wang, Zeyu Cui, Shiliang Zhang, Zhijie Yan, Jingren Zhou, Chang Zhou

    Abstract: In this paper, we propose a novel multi-modal multi-task encoder-decoder pre-training framework (MMSpeech) for Mandarin automatic speech recognition (ASR), which employs both unlabeled speech and text data. The main difficulty in speech-text joint pre-training comes from the significant difference between speech and text modalities, especially for Mandarin speech and text. Unlike English and other…

    Submitted 29 November, 2022; originally announced December 2022.

    Comments: Submitted to ICASSP 2023

  50. arXiv:2212.00027  [pdf, other]

    eess.IV physics.optics

    Imaging across multiple spatial scales with the multi-camera array microscope

    Authors: Mark Harfouche, Kanghyun Kim, Kevin C. Zhou, Pavan Chandra Konda, Sunanda Sharma, Eric E. Thomson, Colin Cooke, Shiqi Xu, Lucas Kreiss, Amey Chaware, Xi Yang, Xing Yao, Vinayak Pathak, Martin Bohlen, Ron Appel, Aurélien Bègue, Clare Cook, Jed Doman, John Efromson, Gregor Horstmeyer, Jaehee Park, Paul Reamey, Veton Saliu, Eva Naumann, Roarke Horstmeyer

    Abstract: This article experimentally examines different configurations of a novel multi-camera array microscope (MCAM) imaging technology. The MCAM is based upon a densely packed array of "micro-cameras" to jointly image across a large field-of-view at high resolution. Each micro-camera within the array images a unique area of a sample of interest, and then all acquired data with 54 micro-cameras are digit…

    Submitted 28 February, 2023; v1 submitted 30 November, 2022; originally announced December 2022.