Showing 1–50 of 230 results for author: Zheng, Y

Searching in archive eess.
  1. arXiv:2408.15217  [pdf, other]

    eess.IV cs.AI cs.CV

    Fundus2Video: Cross-Modal Angiography Video Generation from Static Fundus Photography with Clinical Knowledge Guidance

    Authors: Weiyi Zhang, Siyu Huang, Jiancheng Yang, Ruoyu Chen, Zongyuan Ge, Yingfeng Zheng, Danli Shi, Mingguang He

    Abstract: Fundus Fluorescein Angiography (FFA) is a critical tool for assessing retinal vascular dynamics and aiding in the diagnosis of eye diseases. However, its invasive nature and less accessibility compared to Color Fundus (CF) images pose significant challenges. Current CF to FFA translation methods are limited to static generation. In this work, we pioneer dynamic FFA video generation from static CF…

    Submitted 27 August, 2024; originally announced August 2024.

    Comments: The paper has been accepted by Medical Image Computing and Computer Assisted Intervention Society (MICCAI) 2024

  2. arXiv:2408.14067  [pdf, other]

    eess.SY

    Active Search for Low-altitude UAV Sensing and Communication for Users at Unknown Locations

    Authors: Yuanshuai Zheng, Junting Chen

    Abstract: This paper studies optimal unmanned aerial vehicle (UAV) placement to ensure line-of-sight (LOS) communication and sensing for a cluster of ground users possibly in deep shadow, while the UAV maintains backhaul connectivity with a base station (BS). The key challenges include unknown user locations, uncertain channel model parameters, and unavailable urban structure. Addressing these challenges, t…

    Submitted 29 August, 2024; v1 submitted 26 August, 2024; originally announced August 2024.

  3. arXiv:2408.06049  [pdf]

    eess.IV

    Hardware Architecture Design of Model-Based Image Reconstruction Towards Palm-size Photoacoustic Tomography

    Authors: Yuwei Zheng, Zijian Gao, Yuting Shen, Jiadong Zhang, Daohuai Jiang, Fengyu Liu, Feng Gao, Fei Gao

    Abstract: Photoacoustic (PA) imaging technology combines the advantages of optical imaging and ultrasound imaging, showing great potential in biomedical applications. Many preclinical studies and clinical applications urgently require fast, high-quality, low-cost and portable imaging system. Translating advanced image reconstruction algorithms into hardware implementations is highly desired. However, existi…

    Submitted 12 August, 2024; originally announced August 2024.

    Comments: 11 pages, 13 figures

  4. arXiv:2408.05596  [pdf, other]

    eess.SP

    Semantic Communications with Explicit Semantic Bases: Model, Architecture, and Open Problems

    Authors: Fengyu Wang, Yuan Zheng, Wenjun Xu, Junxiao Liang, Ping Zhang

    Abstract: The increasing demands for massive data transmission pose great challenges to communication systems. Compared to traditional communication systems that focus on the accurate reconstruction of bit sequences, semantic communications (SemComs), which aim to successfully deliver information connotation, have been regarded as the key technology for next-generation communication systems. Most current Se…

    Submitted 10 August, 2024; originally announced August 2024.

  5. arXiv:2407.20530  [pdf, other]

    cs.SD eess.AS

    SuperCodec: A Neural Speech Codec with Selective Back-Projection Network

    Authors: Youqiang Zheng, Weiping Tu, Li Xiao, Xinmeng Xu

    Abstract: Neural speech coding is a rapidly developing topic, where state-of-the-art approaches now exhibit superior compression performance than conventional methods. Despite significant progress, existing methods still have limitations in preserving and reconstructing fine details for optimal reconstruction, especially at low bitrates. In this study, we introduce SuperCodec, a neural speech codec that ach…

    Submitted 30 July, 2024; originally announced July 2024.

    Comments: Accepted by ICASSP 2024

  6. arXiv:2407.19859  [pdf]

    eess.SY

    ProRuka: A highly efficient HMI algorithm for controlling a novel prosthetic hand with 6-DOF using sonomyography

    Authors: Vaheh Nazari, Yong-Ping Zheng

    Abstract: Sonomyography (SMG) is a novel human-machine interface that controls upper-limb prostheses by monitoring forearm muscle activity using ultrasonic imaging. SMG has been investigated for controlling upper-limb prostheses during the last two decades. The results show that this method, in combination with artificial intelligence, can classify different hand gestures with an accuracy of more than 90%,…

    Submitted 29 July, 2024; originally announced July 2024.

  7. arXiv:2407.14894  [pdf, other]

    eess.SY

    A Holistic Optimization Framework for Energy Efficient UAV-assisted Fog Computing: Attitude Control, Trajectory Planning and Task Assignment

    Authors: Shuaijun Liu, Jinqiu Du, Yaxin Zheng, Jiaying Yin, Yuhui Deng, Jingjin Wu

    Abstract: Unmanned Aerial Vehicles (UAVs) have significantly enhanced fog computing by acting as both flexible computation platforms and communication mobile relays. In this paper, we propose a holistic framework that jointly optimizes the total latency and energy consumption for UAV-assisted fog computing in a three-dimensional spatial domain with varying terrain elevations and dynamic task generations. Ou…

    Submitted 5 August, 2024; v1 submitted 20 July, 2024; originally announced July 2024.

    Comments: 14 pages, 10 figures

  8. arXiv:2407.07052  [pdf, other]

    eess.IV cs.CV

    Latent Space Imaging

    Authors: Matheus Souza, Yidan Zheng, Kaizhang Kang, Yogeshwar Nath Mishra, Qiang Fu, Wolfgang Heidrich

    Abstract: Digital imaging systems have classically been based on brute-force measuring and processing of pixels organized on regular grids. The human visual system, on the other hand, performs a massive data reduction from the number of photo-receptors to the optic nerve, essentially encoding the image information into a low bandwidth latent space representation suitable for processing by the human brain. I…

    Submitted 9 July, 2024; originally announced July 2024.

  9. arXiv:2407.04675  [pdf, other]

    eess.AS cs.SD

    Seed-ASR: Understanding Diverse Speech and Contexts with LLM-based Speech Recognition

    Authors: Ye Bai, Jingping Chen, Jitong Chen, Wei Chen, Zhuo Chen, Chuang Ding, Linhao Dong, Qianqian Dong, Yujiao Du, Kepan Gao, Lu Gao, Yi Guo, Minglun Han, Ting Han, Wenchao Hu, Xinying Hu, Yuxiang Hu, Deyu Hua, Lu Huang, Mingkun Huang, Youjia Huang, Jishuo Jin, Fanliu Kong, Zongwei Lan, Tianyu Li, et al. (30 additional authors not shown)

    Abstract: Modern automatic speech recognition (ASR) model is required to accurately transcribe diverse speech signals (from different domains, languages, accents, etc) given the specific contextual information in various application scenarios. Classic end-to-end models fused with extra language models perform well, but mainly in data matching scenarios and are gradually approaching a bottleneck. In this wor…

    Submitted 10 July, 2024; v1 submitted 5 July, 2024; originally announced July 2024.

  10. arXiv:2407.03026  [pdf, other]

    cs.SD cs.AI eess.AS

    Qifusion-Net: Layer-adapted Stream/Non-stream Model for End-to-End Multi-Accent Speech Recognition

    Authors: Jinming Chen, Jingyi Fang, Yuanzhong Zheng, Yaoxuan Wang, Haojun Fei

    Abstract: Currently, end-to-end (E2E) speech recognition methods have achieved promising performance. However, auto speech recognition (ASR) models still face challenges in recognizing multi-accent speech accurately. We propose a layer-adapted fusion (LAF) model, called Qifusion-Net, which does not require any prior knowledge about the target accent. Based on dynamic chunk strategy, our approach enables str…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: accepted by Interspeech 2014, 5 pages, 1 figure

  11. arXiv:2407.00529  [pdf, other]

    cs.LG cs.SD eess.AS math.ST stat.ML

    Detecting and Identifying Selection Structure in Sequential Data

    Authors: Yujia Zheng, Zeyu Tang, Yiwen Qiu, Bernhard Schölkopf, Kun Zhang

    Abstract: We argue that the selective inclusion of data points based on latent objectives is common in practical situations, such as music sequences. Since this selection process often distorts statistical analysis, previous work primarily views it as a bias to be corrected and proposes various methods to mitigate its effect. However, while controlling this bias is crucial, selection also offers an opportun…

    Submitted 29 June, 2024; originally announced July 2024.

    Comments: ICML 2024

  12. arXiv:2406.17784  [pdf, other]

    eess.SP

    Scalable Near-Field Localization Based on Partitioned Large-Scale Antenna Array

    Authors: Xiaojun Yuan, Yuqing Zheng, Mingchen Zhang, Boyu Teng, Wenjun Jiang

    Abstract: This paper studies a passive localization system, where an extremely large-scale antenna array (ELAA) is deployed at the base station (BS) to locate a user equipment (UE) residing in its near-field (Fresnel) region. We propose a novel algorithm, named array partitioning-based location estimation (APLE), for scalable near-field localization. The APLE algorithm is developed based on the basic assump…

    Submitted 13 May, 2024; originally announced June 2024.

    Comments: arXiv admin note: text overlap with arXiv:2312.12342

  13. arXiv:2406.09696  [pdf, other]

    eess.IV cs.CV

    MoME: Mixture of Multimodal Experts for Cancer Survival Prediction

    Authors: Conghao Xiong, Hao Chen, Hao Zheng, Dong Wei, Yefeng Zheng, Joseph J. Y. Sung, Irwin King

    Abstract: Survival analysis, as a challenging task, requires integrating Whole Slide Images (WSIs) and genomic data for comprehensive decision-making. There are two main challenges in this task: significant heterogeneity and complex inter- and intra-modal interactions between the two modalities. Previous approaches utilize co-attention methods, which fuse features from both modalities only once after separa…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: 8 + 1/2 pages, early accepted to MICCAI2024

  14. arXiv:2406.04633  [pdf, ps, other]

    eess.AS

    Boosting Diffusion Model for Spectrogram Up-sampling in Text-to-speech: An Empirical Study

    Authors: Chong Zhang, Yanqing Liu, Yang Zheng, Sheng Zhao

    Abstract: Scaling text-to-speech (TTS) with autoregressive language model (LM) to large-scale datasets by quantizing waveform into discrete speech tokens is making great progress to capture the diversity and expressiveness in human speech, but the speech reconstruction quality from discrete speech token is far from satisfaction depending on the compressed speech token compression ratio. Generative diffusion…

    Submitted 7 June, 2024; originally announced June 2024.

  15. arXiv:2406.04243  [pdf, other]

    math.OC eess.SY math.DG

    Policy Optimization in Control: Geometry and Algorithmic Implications

    Authors: Shahriar Talebi, Yang Zheng, Spencer Kraisler, Na Li, Mehran Mesbahi

    Abstract: This survey explores the geometric perspective on policy optimization within the realm of feedback control systems, emphasizing the intrinsic relationship between control design and optimization. By adopting a geometric viewpoint, we aim to provide a nuanced understanding of how various "complete parameterization" -- referring to the policy parameters together with its Riemannian geometry -- of…

    Submitted 6 June, 2024; originally announced June 2024.

  16. arXiv:2406.04001  [pdf, other]

    math.OC eess.SY math.DS

    Benign Nonconvex Landscapes in Optimal and Robust Control, Part II: Extended Convex Lifting

    Authors: Yang Zheng, Chih-Fan Pai, Yujie Tang

    Abstract: Many optimal and robust control problems are nonconvex and potentially nonsmooth in their policy optimization forms. In Part II of this paper, we introduce a new and unified Extended Convex Lifting (ECL) framework to reveal hidden convexity in classical optimal and robust control problems from a modern optimization perspective. Our ECL offers a bridge between nonconvex policy optimization and conv…

    Submitted 6 June, 2024; originally announced June 2024.

  17. arXiv:2405.18435  [pdf, other]

    eess.IV cs.CV

    QUBIQ: Uncertainty Quantification for Biomedical Image Segmentation Challenge

    Authors: Hongwei Bran Li, Fernando Navarro, Ivan Ezhov, Amirhossein Bayat, Dhritiman Das, Florian Kofler, Suprosanna Shit, Diana Waldmannstetter, Johannes C. Paetzold, Xiaobin Hu, Benedikt Wiestler, Lucas Zimmer, Tamaz Amiranashvili, Chinmay Prabhakar, Christoph Berger, Jonas Weidner, Michelle Alonso-Basant, Arif Rashid, Ujjwal Baid, Wesam Adel, Deniz Ali, Bhakti Baheti, Yingbin Bai, Ishaan Bhatt, Sabri Can Cetindag, et al. (55 additional authors not shown)

    Abstract: Uncertainty in medical image segmentation tasks, especially inter-rater variability, arising from differences in interpretations and annotations by various experts, presents a significant challenge in achieving consistent and reliable image segmentation. This variability not only reflects the inherent complexity and subjective nature of medical image interpretation but also directly impacts the de…

    Submitted 24 June, 2024; v1 submitted 19 March, 2024; originally announced May 2024.

    Comments: initial technical report

  18. arXiv:2405.03141  [pdf, other]

    eess.IV cs.AI cs.CV physics.med-ph

    Automatic Ultrasound Curve Angle Measurement via Affinity Clustering for Adolescent Idiopathic Scoliosis Evaluation

    Authors: Yihao Zhou, Timothy Tin-Yan Lee, Kelly Ka-Lee Lai, Chonglin Wu, Hin Ting Lau, De Yang, Chui-Yi Chan, Winnie Chiu-Wing Chu, Jack Chun-Yiu Cheng, Tsz-Ping Lam, Yong-Ping Zheng

    Abstract: The current clinical gold standard for evaluating adolescent idiopathic scoliosis (AIS) is X-ray radiography, using Cobb angle measurement. However, the frequent monitoring of the AIS progression using X-rays poses a challenge due to the cumulative radiation exposure. Although 3D ultrasound has been validated as a reliable and radiation-free alternative for scoliosis assessment, the process of mea…

    Submitted 6 May, 2024; v1 submitted 5 May, 2024; originally announced May 2024.

  19. arXiv:2404.00625  [pdf, other]

    eess.SY

    Scalable second-order consensus of hierarchical groups

    Authors: Jiamin Wang, Jian Liu, Feng Xiao, Ning Xi, Yuanshi Zheng

    Abstract: Motivated by widespread dominance hierarchy, growth of group sizes, and feedback mechanisms in social species, we are devoted to exploring the scalable second-order consensus of hierarchical groups. More specifically, a hierarchical group consists of a collection of agents with double-integrator dynamics on a directed acyclic graph with additional reverse edges, which characterize feedback mechani…

    Submitted 31 March, 2024; originally announced April 2024.

    Comments: 9 pages, 1 figure

  20. arXiv:2403.17992  [pdf, other]

    q-bio.QM cs.AI cs.LG eess.IV eess.SP

    Interpretable cancer cell detection with phonon microscopy using multi-task conditional neural networks for inter-batch calibration

    Authors: Yijie Zheng, Rafael Fuentes-Dominguez, Matt Clark, George S. D. Gordon, Fernando Perez-Cota

    Abstract: Advances in artificial intelligence (AI) show great potential in revealing underlying information from phonon microscopy (high-frequency ultrasound) data to identify cancerous cells. However, this technology suffers from the 'batch effect' that comes from unavoidable technical variations between each experiment, creating confounding variables that the AI model may inadvertently learn. We therefore…

    Submitted 26 March, 2024; originally announced March 2024.

  21. arXiv:2403.12695  [pdf, other]

    eess.IV cs.CV cs.LG

    Federated Semi-supervised Learning for Medical Image Segmentation with intra-client and inter-client Consistency

    Authors: Yubin Zheng, Peng Tang, Tianjie Ju, Weidong Qiu, Bo Yan

    Abstract: Medical image segmentation plays a vital role in clinic disease diagnosis and medical image analysis. However, labeling medical images for segmentation task is tough due to the indispensable domain expertise of radiologists. Furthermore, considering the privacy and sensitivity of medical images, it is impractical to build a centralized segmentation dataset from different medical institutions. Fede…

    Submitted 19 March, 2024; originally announced March 2024.

    Comments: Work in progress

  22. arXiv:2403.12425  [pdf, other]

    cs.CV cs.SD eess.AS

    Multimodal Fusion Method with Spatiotemporal Sequences and Relationship Learning for Valence-Arousal Estimation

    Authors: Jun Yu, Gongpeng Zhao, Yongqi Wang, Zhihong Wei, Yang Zheng, Zerui Zhang, Zhongpeng Cai, Guochen Xie, Jichao Zhu, Wangyuan Zhu

    Abstract: This paper presents our approach for the VA (Valence-Arousal) estimation task in the ABAW6 competition. We devised a comprehensive model by preprocessing video frames and audio segments to extract visual and audio features. Through the utilization of Temporal Convolutional Network (TCN) modules, we effectively captured the temporal and spatial correlations between these features. Subsequently, we…

    Submitted 20 March, 2024; v1 submitted 19 March, 2024; originally announced March 2024.

    Comments: 8 pages, 3 figures

  23. arXiv:2403.10481  [pdf, other]

    eess.IV eess.SP

    Tensor Star Decomposition

    Authors: Wuyang Zhou, Yu-Bang Zheng, Qibin Zhao, Danilo Mandic

    Abstract: A novel tensor decomposition framework, termed Tensor Star (TS) decomposition, is proposed which represents a new type of tensor network decomposition based on tensor contractions. This is achieved by connecting the core tensors in a ring shape, whereby the core tensors act as skip connections between the factor tensors and allow for direct correlation characterisation between any two arbitrary di…

    Submitted 15 March, 2024; originally announced March 2024.

  24. arXiv:2403.03145  [pdf, other]

    cs.CV cs.LG cs.MM cs.SD eess.AS

    Dual Mean-Teacher: An Unbiased Semi-Supervised Framework for Audio-Visual Source Localization

    Authors: Yuxin Guo, Shijie Ma, Hu Su, Zhiqing Wang, Yuhao Zhao, Wei Zou, Siyang Sun, Yun Zheng

    Abstract: Audio-Visual Source Localization (AVSL) aims to locate sounding objects within video frames given the paired audio clips. Existing methods predominantly rely on self-supervised contrastive learning of audio-visual correspondence. Without any bounding-box annotations, they struggle to achieve precise localization, especially for small objects, and suffer from blurry boundaries and false positives.…

    Submitted 5 March, 2024; originally announced March 2024.

    Comments: Accepted to NeurIPS2023

  25. Novel Low-Complexity Model Development for Li-ion Cells Using Online Impedance Measurement

    Authors: Abhijit Kulkarni, Ahsan Nadeem, Roberta Di Fonso, Yusheng Zheng, Remus Teodorescu

    Abstract: Modeling of Li-ion cells is used in battery management systems (BMS) to determine key states such as state-of-charge (SoC), state-of-health (SoH), etc. Accurate models are also useful in developing a cell-level digital-twin that can be used for protection and diagnostics in the BMS. In this paper, a low-complexity model development is proposed based on the equivalent circuit model (ECM) of the Li-…

    Submitted 12 February, 2024; originally announced February 2024.

    Journal ref: 2024

  26. arXiv:2401.17721  [pdf, other]

    cs.NI eess.SY

    Time Synchronization for 5G and TSN Integrated Networking

    Authors: Zixiao Wang, Zonghui Li, Xuan Qiao, Yiming Zheng, Bo Ai, Xiaoyu Song

    Abstract: Emerging industrial applications involving robotic collaborative operations and mobile robots require a more reliable and precise wireless network for deterministic data transmission. To meet this demand, the 3rd Generation Partnership Project (3GPP) is promoting the integration of 5th Generation Mobile Communication Technology (5G) and Time-Sensitive Networking (TSN). Time synchronization is esse…

    Submitted 31 January, 2024; originally announced January 2024.

  27. arXiv:2401.15826  [pdf, other]

    math.OC eess.SY

    Decentralized Robust Data-driven Predictive Control for Smoothing Mixed Traffic Flow

    Authors: Xu Shang, Jiawei Wang, Yang Zheng

    Abstract: In a mixed traffic with connected automated vehicles (CAVs) and human-driven vehicles (HDVs) coexisting, data-driven predictive control of CAVs promises system-wide traffic performance improvements. Yet, most existing approaches focus on a centralized setup, which is not computationally scalable while failing to protect data privacy. The robustness against unknown disturbances has not been well ad…

    Submitted 28 January, 2024; originally announced January 2024.

  28. arXiv:2401.05663  [pdf, other]

    eess.SP

    End-to-End Learning for SLP-Based ISAC Systems

    Authors: Yixian Zheng, Rang Liu, Ming Li, Qian Liu

    Abstract: Integrated sensing and communication (ISAC) is an encouraging wireless technology which can simultaneously perform both radar and communication functionalities by sharing the same transmit waveform, spectral resource, and hardware platform. Recently emerged symbol-level precoding (SLP) technique exhibits advancement in ISAC systems by leveraging the waveform design degrees of freedom (DoFs) in bot…

    Submitted 10 January, 2024; originally announced January 2024.

    Comments: 6 pages, 7 figures, accepted by WCNC 2024

  29. arXiv:2312.17516  [pdf, other]

    cs.NI eess.SP

    Robust TOA-based Localization with Inaccurate Anchors for MANET

    Authors: Xinkai Yu, Yang Zheng, Min Sheng, Yan Shi, Jiandong Li

    Abstract: Accurate node localization is vital for mobile ad hoc networks (MANETs). Current methods like Time of Arrival (TOA) can estimate node positions using imprecise baseplates and achieve the Cramér-Rao lower bound (CRLB) accuracy. In multi-hop MANETs, some nodes lack direct links to base anchors, depending on neighbor nodes as dynamic anchors for chain localization. However, the dynamic nature of MANE…

    Submitted 29 December, 2023; originally announced December 2023.

  30. arXiv:2312.16775  [pdf, ps, other]

    math.OC eess.SY

    Error bounds, PL condition, and quadratic growth for weakly convex functions, and linear convergences of proximal point methods

    Authors: Feng-Yi Liao, Lijun Ding, Yang Zheng

    Abstract: Many practical optimization problems lack strong convexity. Fortunately, recent studies have revealed that first-order algorithms also enjoy linear convergences under various weaker regularity conditions. While the relationship among different conditions for convex and smooth functions is well-understood, it is not the case for the nonsmooth setting. In this paper, we go beyond convexity and smoot…

    Submitted 13 August, 2024; v1 submitted 27 December, 2023; originally announced December 2023.

    Comments: 29 pages, 3 figures, and 1 table

  31. arXiv:2312.15993  [pdf]

    cs.AI cs.RO eess.SY

    Adaptive Kalman-based hybrid car following strategy using TD3 and CACC

    Authors: Yuqi Zheng, Ruidong Yan, Bin Jia, Rui Jiang, Adriana TAPUS, Xiaojing Chen, Shiteng Zheng, Ying Shang

    Abstract: In autonomous driving, the hybrid strategy of deep reinforcement learning and cooperative adaptive cruise control (CACC) can fully utilize the advantages of the two algorithms and significantly improve the performance of car following. However, it is challenging for the traditional hybrid strategy based on fixed coefficients to adapt to mixed traffic flow scenarios, which may decrease the performa…

    Submitted 26 December, 2023; originally announced December 2023.

    Comments: 32 pages, 13 figures

  32. arXiv:2312.15575  [pdf, other]

    eess.IV cs.CV cs.LG

    Neural Born Series Operator for Biomedical Ultrasound Computed Tomography

    Authors: Zhijun Zeng, Yihang Zheng, Youjia Zheng, Yubing Li, Zuoqiang Shi, He Sun

    Abstract: Ultrasound Computed Tomography (USCT) provides a radiation-free option for high-resolution clinical imaging. Despite its potential, the computationally intensive Full Waveform Inversion (FWI) required for tissue property reconstruction limits its clinical utility. This paper introduces the Neural Born Series Operator (NBSO), a novel technique designed to speed up wave simulations, thereby facilita…

    Submitted 24 December, 2023; originally announced December 2023.

    ACM Class: I.4.5; J.3

  33. Construct 3D Hand Skeleton with Commercial WiFi

    Authors: Sijie Ji, Xuanye Zhang, Yuanqing Zheng, Mo Li

    Abstract: This paper presents HandFi, which constructs hand skeletons with practical WiFi devices. Unlike previous WiFi hand sensing systems that primarily employ predefined gestures for pattern matching, by constructing the hand skeleton, HandFi can enable a variety of downstream WiFi-based hand sensing applications in gaming, healthcare, and smart homes. Deriving the skeleton from WiFi signals is challeng…

    Submitted 24 December, 2023; originally announced December 2023.

    Journal ref: ACM SenSys 2023

  34. arXiv:2312.15431  [pdf, other]

    math.OC eess.SY

    Convex Approximations for a Bi-level Formulation of Data-Enabled Predictive Control

    Authors: Xu Shang, Yang Zheng

    Abstract: The Willems' fundamental lemma, which characterizes linear time-invariant (LTI) systems using input and output trajectories, has found many successful applications. Combining this with receding horizon control leads to a popular Data-EnablEd Predictive Control (DeePC) scheme. DeePC is first established for LTI systems and has been extended and applied for practical systems beyond LTI settings. How…

    Submitted 24 December, 2023; originally announced December 2023.

  35. arXiv:2312.15332  [pdf, other]

    math.OC eess.SY math.DS

    Benign Nonconvex Landscapes in Optimal and Robust Control, Part I: Global Optimality

    Authors: Yang Zheng, Chih-fan Pai, Yujie Tang

    Abstract: Direct policy search has achieved great empirical success in reinforcement learning. Many recent studies have revisited its theoretical foundation for continuous control, which reveals elegant nonconvex geometry in various benchmark problems, especially in fully observable state-feedback cases. This paper considers two fundamental optimal and robust control problems with partial observability: the…

    Submitted 23 December, 2023; originally announced December 2023.

    Comments: 79 pages, 12 figures

  36. arXiv:2312.12342  [pdf, other]

    eess.SP

    Scalable Near-Field Localization Based on Partitioned Large-Scale Antenna Array

    Authors: Xiaojun Yuan, Yuqing Zheng, Mingchen Zhang, Boyu Teng, Wenjun Jiang

    Abstract: This paper studies a passive localization system, where an extremely large-scale antenna array (ELAA) is deployed at the base station (BS) to locate a user equipment (UE) residing in its near-field (Fresnel) region. We propose a novel algorithm, named array partitioning-based location estimation (APLE), for scalable near-field localization. The APLE algorithm is developed based on the basic assump…

    Submitted 24 May, 2024; v1 submitted 19 December, 2023; originally announced December 2023.

  37. arXiv:2312.11974  [pdf, other]

    cs.SD cs.HC eess.AS

    Ms-senet: Enhancing Speech Emotion Recognition Through Multi-scale Feature Fusion With Squeeze-and-excitation Blocks

    Authors: Mengbo Li, Yuanzhong Zheng, Dichucheng Li, Yulun Wu, Yaoxuan Wang, Haojun Fei

    Abstract: Speech Emotion Recognition (SER) has become a growing focus of research in human-computer interaction. Spatiotemporal features play a crucial role in SER, yet current research lacks comprehensive spatiotemporal feature learning. This paper focuses on addressing this gap by proposing a novel approach. In this paper, we employ Convolutional Neural Network (CNN) with varying kernel sizes for spatial…

    Submitted 24 December, 2023; v1 submitted 19 December, 2023; originally announced December 2023.

  38. arXiv:2312.01726  [pdf, other]

    eess.IV cs.CV

    Simultaneous Alignment and Surface Regression Using Hybrid 2D-3D Networks for 3D Coherent Layer Segmentation of Retinal OCT Images with Full and Sparse Annotations

    Authors: Hong Liu, Dong Wei, Donghuan Lu, Xiaoying Tang, Liansheng Wang, Yefeng Zheng

    Abstract: Layer segmentation is important to quantitative analysis of retinal optical coherence tomography (OCT). Recently, deep learning based methods have been developed to automate this task and yield remarkable performance. However, due to the large spatial gap and potential mismatch between the B-scans of an OCT volume, all of them were based on 2D segmentation of individual B-scans, which may lose the…

    Submitted 4 December, 2023; originally announced December 2023.

    Comments: Accepted by MIA. arXiv admin note: text overlap with arXiv:2203.02390

  39. arXiv:2312.01679  [pdf, other]

    eess.IV cs.CV cs.LG

    Adversarial Medical Image with Hierarchical Feature Hiding

    Authors: Qingsong Yao, Zecheng He, Yuexiang Li, Yi Lin, Kai Ma, Yefeng Zheng, S. Kevin Zhou

    Abstract: Deep learning based methods for medical images can be easily compromised by adversarial examples (AEs), posing a great security flaw in clinical decision-making. It has been discovered that conventional adversarial attacks like PGD which optimize the classification logits, are easy to distinguish in the feature space, resulting in accurate reactive defenses. To better understand this phenomenon an…

    Submitted 4 December, 2023; originally announced December 2023.

    Comments: Our code is available at https://github.com/qsyao/Hierarchical_Feature_Constraint. arXiv admin note: text overlap with arXiv:2012.09501

  40. arXiv:2311.15061  [pdf, other]

    eess.SP

    SenseAI: Real-Time Inpainting for Electron Microscopy

    Authors: Jack Wells, Amirafshar Moshtaghpour, Daniel Nicholls, Alex W. Robinson, Yalin Zheng, Jony Castagna, Nigel D. Browning

    Abstract: Despite their proven success and broad applicability to Electron Microscopy (EM) data, joint dictionary-learning and sparse-coding based inpainting algorithms have so far remained impractical for real-time usage with an Electron Microscope. For many EM applications, the reconstruction time for a single frame is orders of magnitude longer than the data acquisition time, making it impossible to perf…

    Submitted 25 November, 2023; originally announced November 2023.

    Comments: Presented in ISCS23

    Report number: ISCS23-35

  41. Polar-Net: A Clinical-Friendly Model for Alzheimer's Disease Detection in OCTA Images

    Authors: Shouyue Liu, Jinkui Hao, Yanwu Xu, Huazhu Fu, Xinyu Guo, Jiang Liu, Yalin Zheng, Yonghuai Liu, Jiong Zhang, Yitian Zhao

    Abstract: Optical Coherence Tomography Angiography (OCTA) is a promising tool for detecting Alzheimer's disease (AD) by imaging the retinal microvasculature. Ophthalmologists commonly use region-based analysis, such as the ETDRS grid, to study OCTA image biomarkers and understand the correlation with AD. However, existing studies have used general deep computer vision methods, which present challenges in pr…

    Submitted 10 November, 2023; originally announced November 2023.

    Comments: Accepted by MICCAI2023

  42. arXiv:2311.00866  [pdf, other]

    cs.LG eess.SP stat.ML

    Generalizing Nonlinear ICA Beyond Structural Sparsity

    Authors: Yujia Zheng, Kun Zhang

    Abstract: Nonlinear independent component analysis (ICA) aims to uncover the true latent sources from their observable nonlinear mixtures. Despite its significance, the identifiability of nonlinear ICA is known to be impossible without additional assumptions. Recent advances have proposed conditions on the connective structure from sources to observed variables, known as Structural Sparsity, to achieve iden…

    Submitted 1 November, 2023; originally announced November 2023.

  43. arXiv:2310.00509  [pdf, other]

    math.OC eess.SY

    Smoothing Mixed Traffic with Robust Data-driven Predictive Control for Connected and Autonomous Vehicles

    Authors: Xu Shang, Jiawei Wang, Yang Zheng

    Abstract: The recently developed DeeP-LCC (Data-EnablEd Predictive Leading Cruise Control) method has shown promising performance for data-driven predictive control of Connected and Autonomous Vehicles (CAVs) in mixed traffic. However, its simplistic zero assumption of the future velocity errors for the head vehicle may pose safety concerns and limit its performance of smoothing traffic flow. In this paper,…

    Submitted 30 September, 2023; originally announced October 2023.

  44. arXiv:2309.12805  [pdf, other]

    eess.IV cs.CV

    Automatic view plane prescription for cardiac magnetic resonance imaging via supervision by spatial relationship between views

    Authors: Dong Wei, Yawen Huang, Donghuan Lu, Yuexiang Li, Yefeng Zheng

    Abstract: Background: View planning for the acquisition of cardiac magnetic resonance (CMR) imaging remains a demanding task in clinical practice. Purpose: Existing approaches to its automation relied either on an additional volumetric image not typically acquired in clinic routine, or on laborious manual annotations of cardiac structural landmarks. This work presents a clinic-compatible, annotation-free sy…

    Submitted 22 September, 2023; originally announced September 2023.

    Comments: Medical Physics. arXiv admin note: text overlap with arXiv:2109.11715

  45. arXiv:2308.12526  [pdf, other]

    eess.AS cs.LG cs.SD

    UNISOUND System for VoxCeleb Speaker Recognition Challenge 2023

    Authors: Yu Zheng, Yajun Zhang, Chuanying Niu, Yibin Zhan, Yanhua Long, Dongxing Xu

    Abstract: This report describes the UNISOUND submission for Track 1 and Track 2 of the VoxCeleb Speaker Recognition Challenge 2023 (VoxSRC 2023). We submit the same system on Track 1 and Track 2, which is trained with only VoxCeleb2-dev. Large-scale ResNet and RepVGG architectures are developed for the challenge. We propose a consistency-aware score calibration method, which leverages the stability of audio voice…

    Submitted 23 August, 2023; originally announced August 2023.

  46. arXiv:2308.06599  [pdf, other]

    eess.IV

    Semantic Communications with Explicit Semantic Base for Image Transmission

    Authors: Yuan Zheng, Fengyu Wang, Wenjun Xu, Miao Pan, Ping Zhang

    Abstract: Semantic communications, aiming at ensuring the successful delivery of the meaning of information, are expected to be one of the potential techniques for the next generation communications. However, the knowledge forming and synchronizing mechanism that enables semantic communication systems to extract and interpret the semantics of information according to the communication intents is still immat…

    Submitted 14 January, 2024; v1 submitted 12 August, 2023; originally announced August 2023.

  47. arXiv:2308.04922  [pdf]

    eess.IV physics.med-ph physics.optics

    HSD-PAM: High Speed Super Resolution Deep Penetration Photoacoustic Microscopy Imaging Boosted by Dual Branch Fusion Network

    Authors: Zhengyuan Zhang, Haoran Jin, Zesheng Zheng, Wenwen Zhang, Wenhao Lu, Feng Qin, Arunima Sharma, Manojit Pramanik, Yuanjin Zheng

    Abstract: Photoacoustic microscopy (PAM) is a novel implementation of photoacoustic imaging (PAI) for visualizing the 3D bio-structure, which is realized by raster scanning of the tissue. However, the three critical imaging parameters involved (imaging speed, lateral resolution, and penetration depth) affect one another. The improvement of one parameter results in the degradation of the other two…

    Submitted 9 August, 2023; originally announced August 2023.

  48. arXiv:2308.03018  [pdf, other]

    cs.CV eess.IV

    Recurrent Spike-based Image Restoration under General Illumination

    Authors: Lin Zhu, Yunlong Zheng, Mengyue Geng, Lizhi Wang, Hua Huang

    Abstract: Spike camera is a new type of bio-inspired vision sensor that records light intensity in the form of a spike array with high temporal resolution (20,000 Hz). This new paradigm of vision sensor offers significant advantages for many vision tasks such as high speed image reconstruction. However, existing spike-based approaches typically assume that the scenes are with sufficient light intensity, whi…

    Submitted 6 August, 2023; originally announced August 2023.

    Comments: Accepted by ACM MM 2023

  49. arXiv:2307.13295  [pdf, other]

    cs.SD eess.AS

    CQNV: A combination of coarsely quantized bitstream and neural vocoder for low rate speech coding

    Authors: Youqiang Zheng, Li Xiao, Weiping Tu, Yuhong Yang, Xinmeng Xu

    Abstract: Recently, speech codecs based on neural networks have proven to perform better than traditional methods. However, redundancy in traditional parameter quantization is visible within the codec architecture of combining the traditional codec with the neural vocoder. In this paper, we propose a novel framework named CQNV, which combines the coarsely quantized parameters of a traditional parametric cod…

    Submitted 25 July, 2023; originally announced July 2023.

    Comments: Accepted by INTERSPEECH 2023

  50. arXiv:2307.07651  [pdf, ps, other]

    math.OC eess.SY

    An Overview and Comparison of Spectral Bundle Methods for Primal and Dual Semidefinite Programs

    Authors: Feng-Yi Liao, Lijun Ding, Yang Zheng

    Abstract: The spectral bundle method developed by Helmberg and Rendl is well-established for solving large-scale semidefinite programs (SDPs) in the dual form, especially when the SDPs admit $\textit{low-rank primal solutions}$. Under mild regularity conditions, a recent result by Ding and Grimmer has established fast linear convergence rates when the bundle method captures…

    Submitted 14 July, 2023; originally announced July 2023.

    Comments: 57 pages, 4 figures, and 4 tables