Zum Hauptinhalt springen

Showing 1–50 of 361 results for author: Yang, X

Searching in archive eess. Search in all archives.
.
  1. arXiv:2408.16564  [pdf, other

    cs.MM cs.SD eess.AS

    Human-Inspired Audio-Visual Speech Recognition: Spike Activity, Cueing Interaction and Causal Processing

    Authors: Qianhui Liu, Jiadong Wang, Yang Wang, Xin Yang, Gang Pan, Haizhou Li

    Abstract: Humans naturally perform audiovisual speech recognition (AVSR), enhancing the accuracy and robustness by integrating auditory and visual information. Spiking neural networks (SNNs), which mimic the brain's information-processing mechanisms, are well-suited for emulating the human capability of AVSR. Despite their potential, research on SNNs for AVSR is scarce, with most existing audio-visual multi… ▽ More

    Submitted 29 August, 2024; originally announced August 2024.

  2. arXiv:2408.16532  [pdf, other

    eess.AS cs.LG cs.MM cs.SD eess.SP

    WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling

    Authors: Shengpeng Ji, Ziyue Jiang, Xize Cheng, Yifu Chen, Minghui Fang, Jialong Zuo, Qian Yang, Ruiqi Li, Ziang Zhang, Xiaoda Yang, Rongjie Huang, Yidi Jiang, Qian Chen, Siqi Zheng, Wen Wang, Zhou Zhao

    Abstract: Language models have been effectively applied to modeling natural signals, such as images, video, speech, and audio. A crucial component of these models is the codec tokenizer, which compresses high-dimensional natural signals into lower-dimensional discrete tokens. In this paper, we introduce WavTokenizer, which offers several advantages over previous SOTA acoustic codec models in the audio domai… ▽ More

    Submitted 29 August, 2024; originally announced August 2024.

    Comments: Working in progress. arXiv admin note: text overlap with arXiv:2402.12208

  3. arXiv:2408.12077  [pdf, other

    eess.SP cs.CV cs.LG

    Through-the-Wall Radar Human Activity Micro-Doppler Signature Representation Method Based on Joint Boulic-Sinusoidal Pendulum Model

    Authors: Xiaopeng Yang, Weicheng Gao, Xiaodong Qu, Zeyu Ma, Hao Zhang

    Abstract: With the help of micro-Doppler signature, ultra-wideband (UWB) through-the-wall radar (TWR) enables the reconstruction of range and velocity information of limb nodes to accurately identify indoor human activities. However, existing methods are usually trained and validated directly using range-time maps (RTM) and Doppler-time maps (DTM), which have high feature redundancy and poor generalization… ▽ More

    Submitted 21 August, 2024; originally announced August 2024.

    Comments: 17 pages, 14 figures, 7 tables, in IEEE Transactions on Microwave Theory and Techniques, 2024

    MSC Class: 94 ACM Class: I.5.1

  4. arXiv:2408.12069  [pdf, other

    cs.IT eess.SP

    Rotatable Block-Controlled RIS: Bridging the Performance Gap to Element-Controlled Systems

    Authors: Weicong Chen, Xinyi Yang, Chao-Kai Wen, Wankai Tang, Jinghe Wang, Yifei Yuan, Xiao Li, Shi Jin

    Abstract: The passive reconfigurable intelligent surface (RIS) requires numerous elements to achieve adequate array gain, which linearly increases power consumption (PC) with the number of reflection phases. To address this, this letter introduces a rotatable block-controlled RIS (BC-RIS) that preserves spectral efficiency (SE) while reducing power costs. Unlike the element-controlled RIS (EC-RIS), which ne… ▽ More

    Submitted 21 August, 2024; originally announced August 2024.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  5. arXiv:2408.05923  [pdf, other

    eess.IV cs.CV

    Image Denoising Using Green Channel Prior

    Authors: Zhaoming Kong, Fangxi Deng, Xiaowei Yang

    Abstract: Image denoising is an appealing and challenging task, in that noise statistics of real-world observations may vary with local image contents and different image channels. Specifically, the green channel usually has twice the sampling rate in raw data. To handle noise variances and leverage such channel-wise prior information, we propose a simple and effective green channel prior-based image denois… ▽ More

    Submitted 12 August, 2024; originally announced August 2024.

    Comments: arXiv admin note: text overlap with arXiv:2402.08235

  6. arXiv:2408.05756  [pdf, other

    eess.SP

    Joint SIM Configuration and Power Allocation for Stacked Intelligent Metasurface-assisted MU-MISO Systems with TD3

    Authors: Xiaolei Yang, Jiayi Zhang, Enyu Shi, Ziheng Liu, Jun Liu, Kang Zheng, Bo Ai

    Abstract: The stacked intelligent metasurface (SIM) emerges as an innovative technology with the ability to directly manipulate electromagnetic (EM) wave signals, drawing parallels to the operational principles of artificial neural networks (ANN). Leveraging its structure for direct EM signal processing alongside its low-power consumption, SIM holds promise for enhancing system performance within wireless c… ▽ More

    Submitted 11 August, 2024; originally announced August 2024.

    Comments: accepted by IEEE GLOBECOM 2024

  7. arXiv:2407.21490  [pdf, other

    eess.IV cs.AI cs.CV cs.LG

    Explainable and Controllable Motion Curve Guided Cardiac Ultrasound Video Generation

    Authors: Junxuan Yu, Rusi Chen, Yongsong Zhou, Yanlin Chen, Yaofei Duan, Yuhao Huang, Han Zhou, Tan Tao, Xin Yang, Dong Ni

    Abstract: Echocardiography video is a primary modality for diagnosing heart diseases, but the limited data poses challenges for both clinical teaching and machine learning training. Recently, video generative models have emerged as a promising strategy to alleviate this issue. However, previous methods often relied on holistic conditions during generation, hindering the flexible movement control over specif… ▽ More

    Submitted 31 July, 2024; originally announced July 2024.

    Comments: Accepted by MICCAI MLMI 2024

  8. arXiv:2407.19663  [pdf, other

    cs.LG eess.SP

    Short-Term Forecasting of Photovoltaic Power Generation Based on Entropy during the Foggy Winter

    Authors: Xuan Yang, Yunxuan Dong, Thomas Wu

    Abstract: Solar energy is one of the most promising renewable energy resources. Forecasting photovoltaic power generation is an important way to increase photovoltaic penetration. However, the task of photovoltaic forecasting is complicated due to its property of uncertainty, especially in specific regions during the foggy winter. This paper proposes a novel model to accomplish the problem. A developed entr… ▽ More

    Submitted 28 July, 2024; originally announced July 2024.

    Comments: The manuscript was submitted to Applied Energy on June 3, 2024

  9. arXiv:2407.16165  [pdf, other

    eess.IV cs.CV cs.LG

    Advanced AI Framework for Enhanced Detection and Assessment of Abdominal Trauma: Integrating 3D Segmentation with 2D CNN and RNN Models

    Authors: Liheng Jiang, Xuechun yang, Chang Yu, Zhizhong Wu, Yuting Wang

    Abstract: Trauma is a significant cause of mortality and disability, particularly among individuals under forty. Traditional diagnostic methods for traumatic injuries, such as X-rays, CT scans, and MRI, are often time-consuming and dependent on medical expertise, which can delay critical interventions. This study explores the application of artificial intelligence (AI) and machine learning (ML) to improve t… ▽ More

    Submitted 23 July, 2024; originally announced July 2024.

    Comments: 6 Pages

  10. arXiv:2407.14823  [pdf, other

    cs.CV cs.AI cs.LG cs.MM eess.IV

    CrossDehaze: Scaling Up Image Dehazing with Cross-Data Vision Alignment and Augmentation

    Authors: Yukai Shi, Zhipeng Weng, Yupei Lin, Cidan Shi, Xiaojun Yang, Liang Lin

    Abstract: In recent years, as computer vision tasks have increasingly relied on high-quality image inputs, the task of image dehazing has received significant attention. Previously, many methods based on priors and deep learning have been proposed to address the task of image dehazing. Ignoring the domain gap between different data, former de-hazing methods usually adopt multiple datasets for explicit train… ▽ More

    Submitted 20 July, 2024; originally announced July 2024.

    Comments: A cross-dataset vision alignment and augmentation technology is proposed to boost generalizable feature learning in the de-hazing task

  11. arXiv:2407.10462  [pdf, other

    cs.SD cs.AI cs.MM eess.AS

    BandControlNet: Parallel Transformers-based Steerable Popular Music Generation with Fine-Grained Spatiotemporal Features

    Authors: Jing Luo, Xinyu Yang, Dorien Herremans

    Abstract: Controllable music generation promotes the interaction between humans and composition systems by projecting the users' intent on their desired music. The challenge of introducing controllability is an increasingly important issue in the symbolic music generation field. When building controllable generative popular multi-instrument music systems, two main challenges typically present themselves, na… ▽ More

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: Demo page: https://chinglohsiu.github.io/files/bandcontrolnet.html

  12. arXiv:2407.07397  [pdf, other

    cs.SD eess.AS

    SimuSOE: A Simulated Snoring Dataset for Obstructive Sleep Apnea-Hypopnea Syndrome Evaluation during Wakefulness

    Authors: Jie Lin, Xiuping Yang, Li Xiao, Xinhong Li, Weiyan Yi, Yuhong Yang, Weiping Tu, Xiong Chen

    Abstract: Obstructive Sleep Apnea-Hypopnea Syndrome (OSAHS) is a prevalent chronic breathing disorder caused by upper airway obstruction. Previous studies advanced OSAHS evaluation through machine learning-based systems trained on sleep snoring or speech signal datasets. However, constructing datasets for training a precise and rapid OSAHS evaluation system poses a challenge, since 1) it is time-consuming t… ▽ More

    Submitted 10 July, 2024; originally announced July 2024.

  13. arXiv:2407.05391  [pdf, other

    eess.SP

    Interference Management in MIMO-ISAC Systems: A Transceiver Design Approach

    Authors: Yangyang Niu, Zhiqing Wei, Dingyou Ma, Xiaoyu Yang, Huici Wu, Zhiyong Feng, Jianhua Yuan

    Abstract: The integrated sensing and communication (ISAC) system under multi-input multi-output (MIMO) architecture achieves dual functionalities of sensing and communication on the same platform by utilizing spatial gain, which provides a feasible paradigm facing spectrum congestion. However, the dual functionalities of sensing and communication operating simultaneously in the same platform bring severe in… ▽ More

    Submitted 7 July, 2024; originally announced July 2024.

  14. arXiv:2407.04746  [pdf

    eess.SP

    Moving Target Detection Method Based on Range? Doppler Domain Compensation and Cancellation for UAV-Mounted Radar

    Authors: Xiaodong Qu, Xiaolong Sun, Feiyang Liu, Hao Zhang, Shichao Zhong, Xiaopeng Yang

    Abstract: Combining unmanned aerial vehicle (UAV) with through-the-wall radar can realize moving targets detection in complex building scenes. However, clutters generated by obstacles and static objects are always stronger and non-stationary, which results in heavy impacts on moving targets detection. To address this issue, this paper proposes a moving target detection method based on Range-Doppler domain c… ▽ More

    Submitted 4 July, 2024; originally announced July 2024.

  15. arXiv:2407.02616  [pdf

    eess.IV cs.CV

    Deep Learning Based Apparent Diffusion Coefficient Map Generation from Multi-parametric MR Images for Patients with Diffuse Gliomas

    Authors: Zach Eidex, Mojtaba Safari, Jacob Wynne, Richard L. J. Qiu, Tonghe Wang, David Viar Hernandez, Hui-Kuo Shu, Hui Mao, Xiaofeng Yang

    Abstract: Purpose: Apparent diffusion coefficient (ADC) maps derived from diffusion weighted (DWI) MRI provides functional measurements about the water molecules in tissues. However, DWI is time consuming and very susceptible to image artifacts, leading to inaccurate ADC measurements. This study aims to develop a deep learning framework to synthesize ADC maps from multi-parametric MR images. Methods: We pro… ▽ More

    Submitted 4 July, 2024; v1 submitted 2 July, 2024; originally announced July 2024.

    Comments: arXiv admin note: text overlap with arXiv:2311.15044

  16. arXiv:2407.00467  [pdf, other

    cs.LG cs.DC eess.IV

    VcLLM: Video Codecs are Secretly Tensor Codecs

    Authors: Ceyu Xu, Yongji Wu, Xinyu Yang, Beidi Chen, Matthew Lentz, Danyang Zhuo, Lisa Wu Wills

    Abstract: As the parameter size of large language models (LLMs) continues to expand, the need for a large memory footprint and high communication bandwidth have become significant bottlenecks for the training and inference of LLMs. To mitigate these bottlenecks, various tensor compression techniques have been proposed to reduce the data size, thereby alleviating memory requirements and communication pressur… ▽ More

    Submitted 29 June, 2024; originally announced July 2024.

  17. arXiv:2406.18935  [pdf

    eess.SY

    Generalized Averaging Method for Power Electronics Modeling from DC to above Half the Switching Frequency

    Authors: Hongchang Li, Kangping Wang, Jingyang Fang, Wenjie Chen, Xu Yang

    Abstract: Modeling power electronic converters at frequencies close to or above half the switching frequency has been difficult due to the time-variant and discontinuous switching actions. This paper uses the properties of moving Fourier coefficients to develop the generalized averaging method, breaking though the limit of half the switching frequency. The paper also proposes the generalized average model f… ▽ More

    Submitted 27 June, 2024; originally announced June 2024.

  18. arXiv:2406.15656  [pdf, other

    eess.IV cs.CV

    Adaptive Self-Supervised Consistency-Guided Diffusion Model for Accelerated MRI Reconstruction

    Authors: Mojtaba Safari, Zach Eidex, Shaoyan Pan, Richard L. J. Qiu, Xiaofeng Yang

    Abstract: Purpose: To propose a self-supervised deep learning-based compressed sensing MRI (DL-based CS-MRI) method named "Adaptive Self-Supervised Consistency Guided Diffusion Model (ASSCGD)" to accelerate data acquisition without requiring fully sampled datasets. Materials and Methods: We used the fastMRI multi-coil brain axial T2-weighted (T2-w) dataset from 1,376 cases and single-coil brain quantitative… ▽ More

    Submitted 21 June, 2024; originally announced June 2024.

  19. arXiv:2406.12186  [pdf, ps, other

    eess.IV cs.CV

    Unlocking the Potential of Early Epochs: Uncertainty-aware CT Metal Artifact Reduction

    Authors: Xinquan Yang, Guanqun Zhou, Wei Sun, Youjian Zhang, Zhongya Wang, Jiahui He, Zhicheng Zhang

    Abstract: In computed tomography (CT), the presence of metallic implants in patients often leads to disruptive artifacts in the reconstructed images, hindering accurate diagnosis. Recently, a large amount of supervised deep learning-based approaches have been proposed for metal artifact reduction (MAR). However, these methods neglect the influence of initial training weights. In this paper, we have discover… ▽ More

    Submitted 20 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

  20. arXiv:2406.08887  [pdf, other

    eess.SP

    Low-Overhead Channel Estimation via 3D Extrapolation for TDD mmWave Massive MIMO Systems Under High-Mobility Scenarios

    Authors: Binggui Zhou, Xi Yang, Shaodan Ma, Feifei Gao, Guanghua Yang

    Abstract: In TDD mmWave massive MIMO systems, the downlink CSI can be attained through uplink channel estimation thanks to the uplink-downlink channel reciprocity. However, the channel aging issue is significant under high-mobility scenarios and thus necessitates frequent uplink channel estimation. In addition, large amounts of antennas and subcarriers lead to high-dimensional CSI matrices, aggravating the… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: 13 pages, 11 figures, 3 tables. This paper has been submitted to IEEE journal for possible publication

  21. arXiv:2406.02247  [pdf, other

    physics.ins-det eess.SY

    A Study of the Latest Updates of the Readout System for the Hybird-Pixel Detector at HEPS

    Authors: Hangxu Li, Jie Zhang, Wei Wei, Zhenjie Li, Xiaolu Ji, Yan Zhang, Xuanzheng Yang, Shuihan Zhang, Xueke Ma, Peng Liu, Zheng Wang, Yuanbai Chen

    Abstract: The High Energy Photon Source (HEPS) represents a fourth-generation light source. This facility has made unprecedented advancements in accelerator technology, necessitating the development of new detectors to satisfy physical requirements such as single-photon resolution, large dynamic range, and high frame rates. Since 2016, the Institute of High Energy Physics has introduced the first user-exper… ▽ More

    Submitted 4 June, 2024; originally announced June 2024.

  22. arXiv:2406.02126  [pdf, other

    eess.SY cs.AI cs.LG cs.MA

    CityLight: A Universal Model for Coordinated Traffic Signal Control in City-scale Heterogeneous Intersections

    Authors: Jinwei Zeng, Chao Yu, Xinyi Yang, Wenxuan Ao, Qianyue Hao, Jian Yuan, Yong Li, Yu Wang, Huazhong Yang

    Abstract: The increasingly severe congestion problem in modern cities strengthens the significance of developing city-scale traffic signal control (TSC) methods for traffic efficiency enhancement. While reinforcement learning has been widely explored in TSC, most of them still target small-scale optimization and cannot directly scale to the city level due to unbearable resource demand. Only a few of them ma… ▽ More

    Submitted 28 August, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

  23. arXiv:2405.10550  [pdf, other

    eess.IV cs.CV

    LighTDiff: Surgical Endoscopic Image Low-Light Enhancement with T-Diffusion

    Authors: Tong Chen, Qingcheng Lyu, Long Bai, Erjian Guo, Huxin Gao, Xiaoxiao Yang, Hongliang Ren, Luping Zhou

    Abstract: Advances in endoscopy use in surgeries face challenges like inadequate lighting. Deep learning, notably the Denoising Diffusion Probabilistic Model (DDPM), holds promise for low-light image enhancement in the medical field. However, DDPMs are computationally demanding and slow, limiting their practical medical applications. To bridge this gap, we propose a lightweight DDPM, dubbed LighTDiff. It ad… ▽ More

    Submitted 17 May, 2024; originally announced May 2024.

  24. arXiv:2404.19087  [pdf, other

    cs.RO cs.AI cs.LG eess.SY

    Deep Reinforcement Learning for Advanced Longitudinal Control and Collision Avoidance in High-Risk Driving Scenarios

    Authors: Dianwei Chen, Yaobang Gong, Xianfeng Yang

    Abstract: Existing Advanced Driver Assistance Systems primarily focus on the vehicle directly ahead, often overlooking potential risks from following vehicles. This oversight can lead to ineffective handling of high risk situations, such as high speed, closely spaced, multi vehicle scenarios where emergency braking by one vehicle might trigger a pile up collision. To overcome these limitations, this study i… ▽ More

    Submitted 29 April, 2024; originally announced April 2024.

  25. arXiv:2404.18105  [pdf, other

    cs.RO eess.SP

    Tightly-Coupled VLP/INS Integrated Navigation by Inclination Estimation and Blockage Handling

    Authors: Xiao Sun, Yuan Zhuang, Xiansheng Yang, Jianzhu Huai, Tianming Huang, Daquan Feng

    Abstract: Visible Light Positioning (VLP) has emerged as a promising technology capable of delivering indoor localization with high accuracy. In VLP systems that use Photodiodes (PDs) as light receivers, the Received Signal Strength (RSS) is affected by the incidence angle of light, making the inclination of PDs a critical parameter in the positioning model. Currently, most studies assume the inclination to… ▽ More

    Submitted 28 April, 2024; originally announced April 2024.

  26. arXiv:2404.15946  [pdf

    cs.CV cs.AI eess.IV

    Mammo-CLIP: Leveraging Contrastive Language-Image Pre-training (CLIP) for Enhanced Breast Cancer Diagnosis with Multi-view Mammography

    Authors: Xuxin Chen, Yuheng Li, Mingzhe Hu, Ella Salari, Xiaoqian Chen, Richard L. J. Qiu, Bin Zheng, Xiaofeng Yang

    Abstract: Although fusion of information from multiple views of mammograms plays an important role to increase accuracy of breast cancer detection, developing multi-view mammograms-based computer-aided diagnosis (CAD) schemes still faces challenges and no such CAD schemes have been used in clinical practice. To overcome the challenges, we investigate a new approach based on Contrastive Language-Image Pre-tr… ▽ More

    Submitted 24 April, 2024; originally announced April 2024.

  27. arXiv:2404.12725  [pdf, other

    cs.SD cs.CV cs.LG cs.MM eess.AS

    Separate in the Speech Chain: Cross-Modal Conditional Audio-Visual Target Speech Extraction

    Authors: Zhaoxi Mu, Xinyu Yang

    Abstract: The integration of visual cues has revitalized the performance of the target speech extraction task, elevating it to the forefront of the field. Nevertheless, this multi-modal learning paradigm often encounters the challenge of modality imbalance. In audio-visual target speech extraction tasks, the audio modality tends to dominate, potentially overshadowing the importance of visual guidance. To ta… ▽ More

    Submitted 5 May, 2024; v1 submitted 19 April, 2024; originally announced April 2024.

    Comments: Accepted by IJCAI 2024

  28. arXiv:2404.10640  [pdf, other

    eess.IV

    Adapting SAM for Surgical Instrument Tracking and Segmentation in Endoscopic Submucosal Dissection Videos

    Authors: Jieming Yu, Long Bai, Guankun Wang, An Wang, Xiaoxiao Yang, Huxin Gao, Hongliang Ren

    Abstract: The precise tracking and segmentation of surgical instruments have led to a remarkable enhancement in the efficiency of surgical procedures. However, the challenge lies in achieving accurate segmentation of surgical instruments while minimizing the need for manual annotation and reducing the time required for the segmentation process. To tackle this, we propose a novel framework for surgical instr… ▽ More

    Submitted 8 August, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

    Comments: IEEE ICRA 2024 C4SR+ Workshop

  29. Cost-effective company response policy for product co-creation in company-sponsored online community

    Authors: Jiamin Hu, Lu-Xing Yang, Xiaofan Yang, Kaifan Huang, Gang Li, Yong Xiang

    Abstract: Product co-creation based on company-sponsored online community has come to be a paradigm of developing new products collaboratively with customers. In such a product co-creation campaign, the sponsoring company needs to interact intensively with active community members about the design scheme of the product. We call the collection of the rates of the company's response to active community member… ▽ More

    Submitted 14 April, 2024; originally announced April 2024.

  30. arXiv:2404.09000  [pdf, other

    eess.IV cs.CV cs.LG

    MaSkel: A Model for Human Whole-body X-rays Generation from Human Masking Images

    Authors: Yingjie Xi, Boyuan Cheng, Jingyao Cai, Jian Jun Zhang, Xiaosong Yang

    Abstract: The human whole-body X-rays could offer a valuable reference for various applications, including medical diagnostics, digital animation modeling, and ergonomic design. The traditional method of obtaining X-ray information requires the use of CT (Computed Tomography) scan machines, which emit potentially harmful radiation. Thus it faces a significant limitation for realistic applications because it… ▽ More

    Submitted 13 April, 2024; originally announced April 2024.

  31. arXiv:2404.03883  [pdf, other

    eess.IV cs.CV

    LiDAR-Guided Cross-Attention Fusion for Hyperspectral Band Selection and Image Classification

    Authors: Judy X Yang, Jun Zhou, Jing Wang, Hui Tian, Alan Wee-Chung Liew

    Abstract: The fusion of hyperspectral and LiDAR data has been an active research topic. Existing fusion methods have ignored the high-dimensionality and redundancy challenges in hyperspectral images, despite that band selection methods have been intensively studied for hyperspectral image (HSI) processing. This paper addresses this significant gap by introducing a cross-attention mechanism from the transfor… ▽ More

    Submitted 15 April, 2024; v1 submitted 5 April, 2024; originally announced April 2024.

    Comments: 15 pages, 13 figures

    MSC Class: F.2.2; I.2.7

    Journal ref: IEEE - TGRS-2024-00264.R1 Final Files Received

  32. arXiv:2404.00837  [pdf

    eess.IV cs.CV cs.LG physics.med-ph

    Automated HER2 Scoring in Breast Cancer Images Using Deep Learning and Pyramid Sampling

    Authors: Sahan Yoruc Selcuk, Xilin Yang, Bijie Bai, Yijie Zhang, Yuzhu Li, Musa Aydin, Aras Firat Unal, Aditya Gomatam, Zhen Guo, Darrow Morgan Angus, Goren Kolodney, Karine Atlan, Tal Keidar Haran, Nir Pillar, Aydogan Ozcan

    Abstract: Human epidermal growth factor receptor 2 (HER2) is a critical protein in cancer cell growth that signifies the aggressiveness of breast cancer (BC) and helps predict its prognosis. Accurate assessment of immunohistochemically (IHC) stained tissue slides for HER2 expression levels is essential for both treatment guidance and understanding of cancer mechanisms. Nevertheless, the traditional workflow… ▽ More

    Submitted 31 March, 2024; originally announced April 2024.

    Comments: 21 Pages, 7 Figures

    Journal ref: BME Frontiers (2024)

  33. arXiv:2403.17337  [pdf, other

    eess.SY eess.SP

    Destination-Constrained Linear Dynamical System Modeling in Set-Valued Frameworks

    Authors: Xiaowei Yang, Haiqi Liu, Fanqin Meng, Xiaojing Shen

    Abstract: Directional motion towards a specified destination is a common occurrence in physical processes and human societal activities. Utilizing this prior information can significantly improve the control and predictive performance of system models. This paper primarily focuses on reconstructing linear dynamic system models based on destination constraints in the set-valued framework. We treat destinatio… ▽ More

    Submitted 25 March, 2024; originally announced March 2024.

    Comments: 15 pages, 11 figures

  34. arXiv:2403.15716  [pdf, other

    cs.RO cs.AI eess.SY

    Distributed Robust Learning based Formation Control of Mobile Robots based on Bioinspired Neural Dynamics

    Authors: Zhe Xu, Tao Yan, Simon X. Yang, S. Andrew Gadsden, Mohammad Biglarbegian

    Abstract: This paper addresses the challenges of distributed formation control in multiple mobile robots, introducing a novel approach that enhances real-world practicability. We first introduce a distributed estimator using a variable structure and cascaded design technique, eliminating the need for derivative information to improve the real time performance. Then, a kinematic tracking control method is de… ▽ More

    Submitted 23 March, 2024; originally announced March 2024.

    Comments: This paper is accepted by IEEE Transactions on Intelligent Vehicles

  35. arXiv:2403.09100  [pdf

    physics.med-ph cs.CV cs.LG eess.IV physics.optics

    Virtual birefringence imaging and histological staining of amyloid deposits in label-free tissue using autofluorescence microscopy and deep learning

    Authors: Xilin Yang, Bijie Bai, Yijie Zhang, Musa Aydin, Sahan Yoruc Selcuk, Zhen Guo, Gregory A. Fishbein, Karine Atlan, William Dean Wallace, Nir Pillar, Aydogan Ozcan

    Abstract: Systemic amyloidosis is a group of diseases characterized by the deposition of misfolded proteins in various organs and tissues, leading to progressive organ dysfunction and failure. Congo red stain is the gold standard chemical stain for the visualization of amyloid deposits in tissue sections, as it forms complexes with the misfolded proteins and shows a birefringence pattern under polarized lig… ▽ More

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: 20 Pages, 5 Figures

  36. arXiv:2403.08238  [pdf, other

    cs.RO cs.AI eess.SY

    A Novel Feature Learning-based Bio-inspired Neural Network for Real-time Collision-free Rescue of Multi-Robot Systems

    Authors: Junfei Li, Simon X. Yang

    Abstract: Natural disasters and urban accidents drive the demand for rescue robots to provide safer, faster, and more efficient rescue trajectories. In this paper, a feature learning-based bio-inspired neural network (FLBBINN) is proposed to quickly generate a heuristic rescue path in complex and dynamic environments, as traditional approaches usually cannot provide a satisfactory solution to real-time resp… ▽ More

    Submitted 13 March, 2024; originally announced March 2024.

    Comments: This paper is accepted to publish in IEEE Transactions on Industrial Electronics

  37. arXiv:2403.04116  [pdf, other

    eess.IV cs.CV

    Radiative Gaussian Splatting for Efficient X-ray Novel View Synthesis

    Authors: Yuanhao Cai, Yixun Liang, Jiahao Wang, Angtian Wang, Yulun Zhang, Xiaokang Yang, Zongwei Zhou, Alan Yuille

    Abstract: X-ray is widely applied for transmission imaging due to its stronger penetration than natural light. When rendering novel view X-ray projections, existing methods mainly based on NeRF suffer from long training time and slow inference speed. In this paper, we propose a 3D Gaussian splatting-based framework, namely X-Gaussian, for X-ray novel view synthesis. Firstly, we redesign a radiative Gaussian… ▽ More

    Submitted 8 July, 2024; v1 submitted 6 March, 2024; originally announced March 2024.

    Comments: ECCV 2024; The first 3D Gaussian Splatting-based method for X-ray 3D reconstruction

  38. arXiv:2402.19004  [pdf, other

    cs.CV eess.IV

    RSAM-Seg: A SAM-based Approach with Prior Knowledge Integration for Remote Sensing Image Semantic Segmentation

    Authors: Jie Zhang, Xubing Yang, Rui Jiang, Wei Shao, Li Zhang

    Abstract: The development of high-resolution remote sensing satellites has provided great convenience for research work related to remote sensing. Segmentation and extraction of specific targets are essential tasks when facing the vast and complex remote sensing images. Recently, the introduction of Segment Anything Model (SAM) provides a universal pre-training model for image segmentation tasks. While the… ▽ More

    Submitted 29 February, 2024; originally announced February 2024.

    Comments: 12 pages, 11 figures

  39. arXiv:2402.17146  [pdf

    eess.AS

    Target Speaker Extraction by Directly Exploiting Contextual Information in the Time-Frequency Domain

    Authors: Xue Yang, Changchun Bao, Jing Zhou, Xianhong Chen

    Abstract: In target speaker extraction, many studies rely on the speaker embedding which is obtained from an enrollment of the target speaker and employed as the guidance. However, solely using speaker embedding may not fully utilize the contextual information contained in the enrollment. In this paper, we directly exploit this contextual information in the time-frequency (T-F) domain. Specifically, the T-F… ▽ More

    Submitted 26 February, 2024; originally announced February 2024.

    Comments: Accepted by ICASSP 2024

  40. arXiv:2402.08235  [pdf, other

    eess.IV cs.CV

    Color Image Denoising Using The Green Channel Prior

    Authors: Zhaoming Kong, Xiaowei Yang

    Abstract: Noise removal in the standard RGB (sRGB) space remains a challenging task, in that the noise statistics of real-world images can be different in R, G and B channels. In fact, the green channel usually has twice the sampling rate in raw data and a higher signal-to-noise ratio than red/blue ones. However, the green channel prior (GCP) is often understated or ignored in color image denoising since ma… ▽ More

    Submitted 13 February, 2024; originally announced February 2024.

  41. arXiv:2402.04228  [pdf, other

    cs.RO cs.AI eess.SY

    Intelligent Collective Escape of Swarm Robots Based on a Novel Fish-inspired Self-adaptive Approach with Neurodynamic Models

    Authors: Junfei Li, Simon X. Yang

    Abstract: Fish schools present high-efficiency group behaviors through simple individual interactions to collective migration and dynamic escape from the predator. The school behavior of fish is usually a good inspiration to design control architecture for swarm robots. In this paper, a novel fish-inspired self-adaptive approach is proposed for collective escape for the swarm robots. In addition, a bio-insp… ▽ More

    Submitted 6 February, 2024; originally announced February 2024.

    Comments: This article is accepted for publication in a future issue of IEEE Transactions on Industrial Electronics

  42. arXiv:2401.15913  [pdf, other

    eess.IV cs.CV cs.LG physics.flu-dyn stat.AP

    Vision-Informed Flow Image Super-Resolution with Quaternion Spatial Modeling and Dynamic Flow Convolution

    Authors: Qinglong Cao, Zhengqin Xu, Chao Ma, Xiaokang Yang, Yuntian Chen

    Abstract: Flow image super-resolution (FISR) aims at recovering high-resolution turbulent velocity fields from low-resolution flow images. Existing FISR methods mainly process the flow images in natural image patterns, while the critical and distinct flow visual properties are rarely considered. This negligence would cause the significant domain gap between flow and natural images to severely hamper the acc… ▽ More

    Submitted 29 January, 2024; originally announced January 2024.

  43. arXiv:2401.11205  [pdf, other

    cs.IT eess.SP

    Joint Beamforming Optimization and Mode Selection for RDARS-aided MIMO Systems

    Authors: Jintao Wang, Chengzhi Ma, Shiqi Gong, Xi Yang, Shaodan Ma

    Abstract: Considering the appealing distribution gains of distributed antenna systems (DAS) and passive gains of reconfigurable intelligent surface (RIS), a flexible reconfigurable architecture called reconfigurable distributed antenna and reflecting surface (RDARS) is proposed. RDARS encompasses DAS and RIS as two special cases and maintains the advantages of distributed antennas while reducing the hardwar… ▽ More

    Submitted 20 January, 2024; originally announced January 2024.

    Comments: 13 pages, 9 figures. This paper has been submitted to IEEE journal for possible publication

  44. arXiv:2401.05521  [pdf, other

    cs.RO cs.AI eess.SY

    Current Effect-eliminated Optimal Target Assignment and Motion Planning for a Multi-UUV System

    Authors: Danjie Zhu, Simon X. Yang

    Abstract: The paper presents an innovative approach (CBNNTAP) that addresses the complexities and challenges introduced by ocean currents when optimizing target assignment and motion planning for a multi-unmanned underwater vehicle (UUV) system. The core of the proposed algorithm involves the integration of several key components. Firstly, it incorporates a bio-inspired neural network-based (BINN) approach… ▽ More

    Submitted 10 January, 2024; originally announced January 2024.

    Comments: This paper was accepted by IEEE Transactions on Intelligent Transportation Systems

  45. arXiv:2401.05412  [pdf, other

    cs.CV cs.AI eess.SP

    Spatial-Related Sensors Matters: 3D Human Motion Reconstruction Assisted with Textual Semantics

    Authors: Xueyuan Yang, Chao Yao, Xiaojuan Ban

    Abstract: Leveraging wearable devices for motion reconstruction has emerged as an economical and viable technique. Certain methodologies employ sparse Inertial Measurement Units (IMUs) on the human body and harness data-driven strategies to model human poses. However, the reconstruction of motion based solely on sparse IMUs data is inherently fraught with ambiguity, a consequence of numerous identical IMU r… ▽ More

    Submitted 26 December, 2023; originally announced January 2024.

    Comments: Accepted by AAAI 2024

  46. arXiv:2312.16247  [pdf, other

    cs.CV eess.IV

    Toward Accurate and Temporally Consistent Video Restoration from Raw Data

    Authors: Shi Guo, Jianqi Ma, Xi Yang, Zhengqiang Zhang, Lei Zhang

    Abstract: Denoising and demosaicking are two fundamental steps in reconstructing a clean full-color video from raw data, while performing video denoising and demosaicking jointly, namely VJDD, could lead to better video restoration performance than performing them separately. In addition to restoration accuracy, another key challenge to VJDD lies in the temporal consistency of consecutive frames. This issue… ▽ More

    Submitted 25 December, 2023; originally announced December 2023.

  47. arXiv:2312.16040  [pdf, other

    cs.CV eess.IV

    Multi-scale Progressive Feature Embedding for Accurate NIR-to-RGB Spectral Domain Translation

    Authors: Xingxing Yang, Jie Chen, Zaifeng Yang

    Abstract: NIR-to-RGB spectral domain translation is a challenging task due to the mapping ambiguities, and existing methods show limited learning capacities. To address these challenges, we propose to colorize NIR images via a multi-scale progressive feature embedding network (MPFNet), with the guidance of grayscale image colorization. Specifically, we first introduce a domain translation module that transl… ▽ More

    Submitted 26 December, 2023; originally announced December 2023.

    Comments: Accepted by IEEE VCIP 2023

  48. arXiv:2312.10307  [pdf, other

    cs.SD cs.AI cs.MM eess.AS

    MusER: Musical Element-Based Regularization for Generating Symbolic Music with Emotion

    Authors: Shulei Ji, Xinyu Yang

    Abstract: Generating music with emotion is an important task in automatic music generation, in which emotion is evoked through a variety of musical elements (such as pitch and duration) that change over time and collaborate with each other. However, prior research on deep learning-based emotional music generation has rarely explored the contribution of different musical elements to emotions, let alone the d… ▽ More

    Submitted 1 January, 2024; v1 submitted 15 December, 2023; originally announced December 2023.

    Comments: Accepted by AAAI 2024

  49. arXiv:2312.10305  [pdf, other

    cs.SD cs.AI cs.LG eess.AS

    Self-Supervised Disentangled Representation Learning for Robust Target Speech Extraction

    Authors: Zhaoxi Mu, Xinyu Yang, Sining Sun, Qing Yang

    Abstract: Speech signals are inherently complex as they encompass both global acoustic characteristics and local semantic information. However, in the task of target speech extraction, certain elements of global and local semantic information in the reference speech, which are irrelevant to speaker identity, can lead to speaker confusion within the speech extraction network. To overcome this challenge, we p… ▽ More

    Submitted 24 August, 2024; v1 submitted 15 December, 2023; originally announced December 2023.

    Comments: Accepted by AAAI2024

  50. arXiv:2312.04062  [pdf, other

    cs.IT cs.AI eess.SP

    A Low-Overhead Incorporation-Extrapolation based Few-Shot CSI Feedback Framework for Massive MIMO Systems

    Authors: Binggui Zhou, Xi Yang, Jintao Wang, Shaodan Ma, Feifei Gao, Guanghua Yang

    Abstract: Accurate channel state information (CSI) is essential for downlink precoding in frequency division duplexing (FDD) massive multiple-input multiple-output (MIMO) systems with orthogonal frequency-division multiplexing (OFDM). However, obtaining CSI through feedback from the user equipment (UE) becomes challenging with the increasing scale of antennas and subcarriers and leads to extremely high CSI… ▽ More

    Submitted 21 June, 2024; v1 submitted 7 December, 2023; originally announced December 2023.

    Comments: 16 pages, 12 figures, 5 tables. Accepted by IEEE Transactions on Wireless Communications