Showing 1–50 of 129 results for author: Zhu, Z

Searching in archive eess.
  1. arXiv:2407.18449  [pdf, other]

    eess.IV cs.CV cs.LG

    Towards A Generalizable Pathology Foundation Model via Unified Knowledge Distillation

    Authors: Jiabo Ma, Zhengrui Guo, Fengtao Zhou, Yihui Wang, Yingxue Xu, Yu Cai, Zhengjie Zhu, Cheng Jin, Yi Lin, Xinrui Jiang, Anjia Han, Li Liang, Ronald Cheong Kin Chan, Jiguang Wang, Kwang-Ting Cheng, Hao Chen

    Abstract: Foundation models pretrained on large-scale datasets are revolutionizing the field of computational pathology (CPath). The generalization ability of foundation models is crucial for the success in various downstream clinical tasks. However, current foundation models have only been evaluated on a limited type and number of tasks, leaving their generalization ability and overall performance unclear.…

    Submitted 3 August, 2024; v1 submitted 25 July, 2024; originally announced July 2024.

    Report number: I.2.10

  2. arXiv:2407.16404  [pdf]

    eess.SY

    Evaluating Uncertainties in Electricity Markets via Machine Learning and Quantum Computing

    Authors: Shuyang Zhu, Ziqing Zhu, Linghua Zhu, Yujian Ye, Siqi Bu, Sasa Z. Djokic

    Abstract: The analysis of decision-making process in electricity markets is crucial for understanding and resolving issues related to market manipulation and reduced social welfare. Traditional Multi-Agent Reinforcement Learning (MARL) method can model decision-making of generation companies (GENCOs), but faces challenges due to uncertainties in policy functions, reward functions, and inter-agent interactio…

    Submitted 23 July, 2024; originally announced July 2024.

    Comments: 3 pages, 3 figures, plan for submitting to IEEE Power Engineering Letters

  3. arXiv:2407.11389  [pdf, ps, other]

    cs.NI eess.SP

    Spatial-spectral Cell-free Networks: A Large-scale Case Study

    Authors: Zesheng Zhu, Lifeng Wang, Xin Wang, Dongming Wang, Kai-Kit Wong

    Abstract: This paper studies the large-scale cell-free networks where dense distributed access points (APs) serve many users. As a promising next-generation network architecture, cell-free networks enable ultra-reliable connections and minimal fading/blockage, which are much favorable to the millimeter wave and Terahertz transmissions. However, conventional beam management with large phased arrays in a cell…

    Submitted 16 July, 2024; originally announced July 2024.

  4. arXiv:2407.10408  [pdf, other]

    cs.IT eess.SP

    Latency Minimization for IRS-enhanced Wideband MEC Networks with Practical Reflection Model

    Authors: N. Li, W. Hao, X. Li, Z. Zhu, Z. Tang, S. Yang

    Abstract: Intelligent reflecting surface (IRS) has been considered as an efficient way to boost the computation capability of mobile edge computing (MEC) system, especially when the communication links is blocked or the communication signal is weak. However, most existing works are restricted to narrow-band channel and ideal IRS reflection model, which is not practical and may lead to significant performanc…

    Submitted 14 July, 2024; originally announced July 2024.

    Comments: 13 pages, 9 figures

  5. arXiv:2407.04561  [pdf, other]

    cs.NI eess.SP

    Wireless Spectrum in Rural Farmlands: Status, Challenges and Opportunities

    Authors: Mukaram Shahid, Kunal Das, Taimoor Ul Islam, Christ Somiah, Daji Qiao, Arsalan Ahmad, Jimming Song, Zhengyuan Zhu, Sarath Babu, Yong Guan, Tusher Chakraborty, Suraj Jog, Ranveer Chandra, Hongwei Zhang

    Abstract: Due to factors such as low population density and expansive geographical distances, network deployment falls behind in rural regions, leading to a broadband divide. Wireless spectrum serves as the blood and flesh of wireless communications. Shared white spaces such as those in the TVWS and CBRS spectrum bands offer opportunities to expand connectivity, innovate, and provide affordable access to hi…

    Submitted 5 July, 2024; originally announced July 2024.

  6. arXiv:2407.00964  [pdf, other]

    eess.SP

    Multi-Modal Fusion-Based Multi-Task Semantic Communication System

    Authors: Zengle Zhu, Rongqing Zhang, Xiang Cheng, Liuqing Yang

    Abstract: In recent years, there has been significant progress in semantic communication systems empowered by deep learning techniques. It has greatly improved the efficiency of information transmission. Nevertheless, traditional semantic communication models still face challenges, particularly due to their single-task and single-modal orientation. Many of these models are designed for specific tasks, which…

    Submitted 1 July, 2024; originally announced July 2024.

  7. arXiv:2406.18009  [pdf, other]

    eess.AS cs.SD

    E2 TTS: Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS

    Authors: Sefik Emre Eskimez, Xiaofei Wang, Manthan Thakker, Canrun Li, Chung-Hsien Tsai, Zhen Xiao, Hemin Yang, Zirun Zhu, Min Tang, Xu Tan, Yanqing Liu, Sheng Zhao, Naoyuki Kanda

    Abstract: This paper introduces Embarrassingly Easy Text-to-Speech (E2 TTS), a fully non-autoregressive zero-shot text-to-speech system that offers human-level naturalness and state-of-the-art speaker similarity and intelligibility. In the E2 TTS framework, the text input is converted into a character sequence with filler tokens. The flow-matching-based mel spectrogram generator is then trained based on the…

    Submitted 25 June, 2024; originally announced June 2024.

  8. arXiv:2406.09931  [pdf, other]

    eess.IV cs.CV cs.LG

    SCKansformer: Fine-Grained Classification of Bone Marrow Cells via Kansformer Backbone and Hierarchical Attention Mechanisms

    Authors: Yifei Chen, Zhu Zhu, Shenghao Zhu, Linwei Qiu, Binfeng Zou, Fan Jia, Yunpeng Zhu, Chenyan Zhang, Zhaojie Fang, Feiwei Qin, Jin Fan, Changmiao Wang, Yu Gao, Gang Yu

    Abstract: The incidence and mortality rates of malignant tumors, such as acute leukemia, have risen significantly. Clinically, hospitals rely on cytological examination of peripheral blood and bone marrow smears to diagnose malignant tumors, with accurate blood cell counting being crucial. Existing automated methods face challenges such as low feature expression capability, poor interpretability, and redund…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: 15 pages, 6 figures

  9. arXiv:2406.06002  [pdf, other]

    cs.LG eess.SP math.OC

    Computational and Statistical Guarantees for Tensor-on-Tensor Regression with Tensor Train Decomposition

    Authors: Zhen Qin, Zhihui Zhu

    Abstract: Recently, a tensor-on-tensor (ToT) regression model has been proposed to generalize tensor recovery, encompassing scenarios like scalar-on-tensor regression and tensor-on-vector regression. However, the exponential growth in tensor complexity poses challenges for storage and computation in ToT regression. To overcome this hurdle, tensor decompositions have been introduced, with the tensor train (T…

    Submitted 9 June, 2024; originally announced June 2024.

    Comments: arXiv admin note: text overlap with arXiv:2401.02592

  10. arXiv:2406.05699  [pdf, ps, other]

    eess.AS cs.AI eess.SP

    An Investigation of Noise Robustness for Flow-Matching-Based Zero-Shot TTS

    Authors: Xiaofei Wang, Sefik Emre Eskimez, Manthan Thakker, Hemin Yang, Zirun Zhu, Min Tang, Yufei Xia, Jinzhu Li, Sheng Zhao, Jinyu Li, Naoyuki Kanda

    Abstract: Recently, zero-shot text-to-speech (TTS) systems, capable of synthesizing any speaker's voice from a short audio prompt, have made rapid advancements. However, the quality of the generated speech significantly deteriorates when the audio prompt contains noise, and limited research has been conducted to address this issue. In this paper, we explored various strategies to enhance the quality of audi…

    Submitted 9 June, 2024; originally announced June 2024.

    Comments: Accepted to INTERSPEECH2024

  11. arXiv:2406.04281  [pdf, other]

    eess.AS

    Total-Duration-Aware Duration Modeling for Text-to-Speech Systems

    Authors: Sefik Emre Eskimez, Xiaofei Wang, Manthan Thakker, Chung-Hsien Tsai, Canrun Li, Zhen Xiao, Hemin Yang, Zirun Zhu, Min Tang, Jinyu Li, Sheng Zhao, Naoyuki Kanda

    Abstract: Accurate control of the total duration of generated speech by adjusting the speech rate is crucial for various text-to-speech (TTS) applications. However, the impact of adjusting the speech rate on speech quality, such as intelligibility and speaker characteristics, has been underexplored. In this work, we propose a novel total-duration-aware (TDA) duration model for TTS, where phoneme durations a…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: Accepted to Interspeech 2024

  12. arXiv:2405.10691  [pdf, other]

    eess.IV cs.CV

    LoCI-DiffCom: Longitudinal Consistency-Informed Diffusion Model for 3D Infant Brain Image Completion

    Authors: Zihao Zhu, Tianli Tao, Yitian Tao, Haowen Deng, Xinyi Cai, Gaofeng Wu, Kaidong Wang, Haifeng Tang, Lixuan Zhu, Zhuoyang Gu, Jiawei Huang, Dinggang Shen, Han Zhang

    Abstract: The infant brain undergoes rapid development in the first few years after birth. Compared to cross-sectional studies, longitudinal studies can depict the trajectories of infant brain development with higher accuracy, statistical power and flexibility. However, the collection of infant longitudinal magnetic resonance (MR) data suffers a notorious dropout problem, resulting in incomplete datasets wit…

    Submitted 17 May, 2024; originally announced May 2024.

  13. arXiv:2404.19167  [pdf]

    eess.IV physics.med-ph

    Advancing low-field MRI with a universal denoising imaging transformer: Towards fast and high-quality imaging

    Authors: Zheren Zhu, Azaan Rehman, Xiaozhi Cao, Congyu Liao, Yoo Jin Lee, Michael Ohliger, Hui Xue, Yang Yang

    Abstract: Recent developments in low-field (LF) magnetic resonance imaging (MRI) systems present remarkable opportunities for affordable and widespread MRI access. A robust denoising method to overcome the intrinsic low signal-noise-ratio (SNR) barrier is critical to the success of LF MRI. However, current data-driven MRI denoising methods predominantly handle magnitude images and rely on customized models…

    Submitted 29 April, 2024; originally announced April 2024.

  14. arXiv:2404.10643  [pdf, other]

    cs.NI eess.SP

    A Calibrated and Automated Simulator for Innovations in 5G

    Authors: Conrado Boeira, Antor Hasan, Khaleda Papry, Yue Ju, Zhongwen Zhu, Israat Haque

    Abstract: The rise of 5G deployments has created the environment for many emerging technologies to flourish. Self-driving vehicles, Augmented and Virtual Reality, and remote operations are examples of applications that leverage 5G networks' support for extremely low latency, high bandwidth, and increased throughput. However, the complex architecture of 5G hinders innovation due to the lack of accessibility…

    Submitted 16 April, 2024; originally announced April 2024.

  15. arXiv:2404.02382  [pdf]

    eess.IV

    Imaging transformer for MRI denoising with the SNR unit training: enabling generalization across field-strengths, imaging contrasts, and anatomy

    Authors: Hui Xue, Sarah Hooper, Azaan Rehman, Iain Pierce, Thomas Treibel, Rhodri Davies, W Patricia Bandettini, Rajiv Ramasawmy, Ahsan Javed, Zheren Zhu, Yang Yang, James Moon, Adrienne Campbell, Peter Kellman

    Abstract: The ability to recover MRI signal from noise is key to achieve fast acquisition, accurate quantification, and high image quality. Past work has shown convolutional neural networks can be used with abundant and paired low and high-SNR images for training. However, for applications where high-SNR data is difficult to produce at scale (e.g. with aggressive acceleration, high resolution, or low field…

    Submitted 2 April, 2024; originally announced April 2024.

  16. Development of a full-Scale approach to predict overlay reflective crack

    Authors: Zehui Zhu, Imad L. Al-Qadi

    Abstract: Resurfacing a moderately deteriorated Portland cement concrete (PCC) pavement with asphalt concrete (AC) layers is considered an efficient rehabilitation practice. However, reflective cracks may develop shortly after resurfacing because of discontinuities (e.g. joints and cracks) in existing PCC pavement. In this paper, a new accelerated full-scale testing approach was developed to study reflectiv…

    Submitted 29 March, 2024; originally announced March 2024.

    Journal ref: International Journal of Pavement Engineering, 25(1), 2310095 (2024)

  17. arXiv:2403.16476  [pdf]

    eess.IV

    A Method for Target Detection Based on Mmw Radar and Vision Fusion

    Authors: Ming Zong, Jiaying Wu, Zhanyu Zhu, Jingen Ni

    Abstract: An efficient and accurate traffic monitoring system often takes advantages of multi-sensor detection to ensure the safety of urban traffic, promoting the accuracy and robustness of target detection and tracking. A method for target detection using Radar-Vision Fusion Path Aggregation Fully Convolutional One-Stage Network (RV-PAFCOS) is proposed in this paper, which is extended from Fully Convoluti…

    Submitted 25 March, 2024; originally announced March 2024.

  18. arXiv:2403.11699  [pdf, other]

    eess.IV cs.CV

    A Spatial-Temporal Progressive Fusion Network for Breast Lesion Segmentation in Ultrasound Videos

    Authors: Zhengzheng Tu, Zigang Zhu, Yayang Duan, Bo Jiang, Qishun Wang, Chaoxue Zhang

    Abstract: Ultrasound video-based breast lesion segmentation provides a valuable assistance in early breast lesion detection and treatment. However, existing works mainly focus on lesion segmentation based on ultrasound breast images which usually can not be adapted well to obtain desirable results on ultrasound videos. The main challenge for ultrasound video-based breast lesion segmentation is how to exploi…

    Submitted 18 March, 2024; originally announced March 2024.

  19. arXiv:2403.11556  [pdf, other]

    eess.IV cs.CV

    Hierarchical Frequency-based Upsampling and Refining for Compressed Video Quality Enhancement

    Authors: Qianyu Zhang, Bolun Zheng, Xinying Chen, Quan Chen, Zhunjie Zhu, Canjin Wang, Zongpeng Li, Chengang Yan

    Abstract: Video compression artifacts arise due to the quantization operation in the frequency domain. The goal of video quality enhancement is to reduce compression artifacts and reconstruct a visually-pleasant result. In this work, we propose a hierarchical frequency-based upsampling and refining neural network (HFUR) for compressed video quality enhancement. HFUR consists of two modules: implicit frequen…

    Submitted 18 March, 2024; originally announced March 2024.

  20. arXiv:2402.16413  [pdf, other]

    eess.SP

    AI-enabled STAR-RIS aided MISO ISAC Secure Communications

    Authors: Zhengyu Zhu, Mengfei Gong, Gangcan Sun, Peijia Liu, De Mi

    Abstract: A simultaneous transmitting and reflecting reconfigurable intelligent surface (STAR-RIS) aided integrated sensing and communication (ISAC) dual-secure communication system is studied in this paper. The sensed target and legitimate users (LUs) are situated on the opposite sides of the STAR-RIS, and the energy splitting and time switching protocols are applied in the STAR-RIS, respectively. The long…

    Submitted 27 February, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

  21. arXiv:2402.13776  [pdf, other]

    eess.IV cs.CV cs.LG

    Cas-DiffCom: Cascaded diffusion model for infant longitudinal super-resolution 3D medical image completion

    Authors: Lianghu Guo, Tianli Tao, Xinyi Cai, Zihao Zhu, Jiawei Huang, Lixuan Zhu, Zhuoyang Gu, Haifeng Tang, Rui Zhou, Siyan Han, Yan Liang, Qing Yang, Dinggang Shen, Han Zhang

    Abstract: Early infancy is a rapid and dynamic neurodevelopmental period for behavior and neurocognition. Longitudinal magnetic resonance imaging (MRI) is an effective tool to investigate such a crucial stage by capturing the developmental trajectories of the brain structures. However, longitudinal MRI acquisition always meets a serious data-missing problem due to participant dropout and failed scans, makin…

    Submitted 21 February, 2024; originally announced February 2024.

  22. arXiv:2402.07383  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    Making Flow-Matching-Based Zero-Shot Text-to-Speech Laugh as You Like

    Authors: Naoyuki Kanda, Xiaofei Wang, Sefik Emre Eskimez, Manthan Thakker, Hemin Yang, Zirun Zhu, Min Tang, Canrun Li, Chung-Hsien Tsai, Zhen Xiao, Yufei Xia, Jinzhu Li, Yanqing Liu, Sheng Zhao, Michael Zeng

    Abstract: Laughter is one of the most expressive and natural aspects of human speech, conveying emotions, social cues, and humor. However, most text-to-speech (TTS) systems lack the ability to produce realistic and appropriate laughter sounds, limiting their applications and user experience. While there have been prior works to generate natural laughter, they fell short in terms of controlling the timing an…

    Submitted 4 March, 2024; v1 submitted 11 February, 2024; originally announced February 2024.

    Comments: See https://aka.ms/elate/ for demo samples, v2: subjective evaluation has been added

  23. arXiv:2402.05412  [pdf]

    eess.SY

    Multi-Network Constrained Operational Optimization in Community Integrated Energy Systems: A Safe Reinforcement Learning Approach

    Authors: Ze Hu, Ka Wing Chan, Ziqing Zhu, Xiang Wei, Siqi Bu

    Abstract: The integrated community energy system (ICES) has emerged as a promising solution for enhancing the efficiency of the distribution system by effectively coordinating multiple energy sources. However, the operational optimization of ICES is hindered by the physical constraints of heterogeneous networks including electricity, natural gas, and heat. These challenges are difficult to address due to th…

    Submitted 8 February, 2024; originally announced February 2024.

  24. arXiv:2401.13893  [pdf, ps, other]

    eess.SP

    A Survey on Indoor Visible Light Positioning Systems: Fundamentals, Applications, and Challenges

    Authors: Zhiyu Zhu, Yang Yang, Mingzhe Chen, Caili Guo, Julian Cheng, Shuguang Cui

    Abstract: The growing demand for location-based services in areas like virtual reality, robot control, and navigation has intensified the focus on indoor localization. Visible light positioning (VLP), leveraging visible light communications (VLC), becomes a promising indoor positioning technology due to its high accuracy and low cost. This paper provides a comprehensive survey of VLP systems. In particular,…

    Submitted 24 January, 2024; originally announced January 2024.

  25. arXiv:2401.10218  [pdf]

    physics.optics eess.SP

    A mode-multiplexed photonic integrated vector dot-product core from inverse design

    Authors: Zheyuan Zhu, Raktim Sarma, Seth Smith-Dryden, Guifang Li, Shuo Pang

    Abstract: Photonic computing has the potential of harnessing the full degrees of freedom (DOFs) of the light field, including wavelength, spatial mode, spatial location, phase quadrature, and polarization, to achieve higher level of computation parallelization and scalability than digital electronic processors. While multiplexing using wavelength and other DOFs can be readily integrated on silicon photonics…

    Submitted 9 February, 2024; v1 submitted 18 January, 2024; originally announced January 2024.

    Comments: 13 pages, 8 figures. Initial submission to Optica

  26. arXiv:2401.09032  [pdf, other]

    cs.RO cs.MA eess.SY

    Improved Consensus ADMM for Cooperative Motion Planning of Large-Scale Connected Autonomous Vehicles with Limited Communication

    Authors: Haichao Liu, Zhenmin Huang, Zicheng Zhu, Yulin Li, Shaojie Shen, Jun Ma

    Abstract: This paper investigates a cooperative motion planning problem for large-scale connected autonomous vehicles (CAVs) under limited communications, which addresses the challenges of high communication and computing resource requirements. Our proposed methodology incorporates a parallel optimization algorithm with improved consensus ADMM considering a more realistic locally connected topology network,…

    Submitted 17 January, 2024; originally announced January 2024.

    Comments: 15 pages, 10 figures

  27. arXiv:2401.08920  [pdf, other]

    eess.IV cs.CV

    Idempotence and Perceptual Image Compression

    Authors: Tongda Xu, Ziran Zhu, Dailan He, Yanghao Li, Lina Guo, Yuanyuan Wang, Zhe Wang, Hongwei Qin, Yan Wang, Jingjing Liu, Ya-Qin Zhang

    Abstract: Idempotence is the stability of image codec to re-compression. At the first glance, it is unrelated to perceptual image compression. However, we find that theoretically: 1) Conditional generative model-based perceptual codec satisfies idempotence; 2) Unconditional generative model with idempotence constraint is equivalent to conditional generative codec. Based on this newfound equivalence, we prop…

    Submitted 16 January, 2024; originally announced January 2024.

    Comments: ICLR 2024

  28. arXiv:2401.04953  [pdf, other]

    eess.IV eess.SP

    Adaptive-avg-pooling based Attention Vision Transformer for Face Anti-spoofing

    Authors: Jichen Yang, Fangfan Chen, Rohan Kumar Das, Zhengyu Zhu, Shunsi Zhang

    Abstract: Traditional vision transformer consists of two parts: transformer encoder and multi-layer perception (MLP). The former plays the role of feature learning to obtain better representation, while the latter plays the role of classification. Here, the MLP is constituted of two fully connected (FC) layers, average value computing, FC layer and softmax layer. However, due to the use of average value com…

    Submitted 10 January, 2024; originally announced January 2024.

    Comments: Accepted for Publication in IEEE ICASSP 2024

  29. Real-Time Asphalt Pavement Layer Thickness Prediction Using Ground-Penetrating Radar Based on a Modified Extended Common Mid-Point (XCMP) Approach

    Authors: Siqi Wang, Zhen Leng, Xin Sui, Weiguang Zhang, Tao Ma, Zehui Zhu

    Abstract: The conventional surface reflection method has been widely used to measure the asphalt pavement layer dielectric constant using ground-penetrating radar (GPR). This method may be inaccurate for in-service pavement thickness estimation with dielectric constant variation through the depth, which could be addressed using the extended common mid-point method (XCMP) with air-coupled GPR antennas. Howev…

    Submitted 6 January, 2024; originally announced January 2024.

    Comments: IEEE Transactions on Intelligent Transportation Systems (2024)

  30. arXiv:2401.02592  [pdf, other]

    stat.ML cs.LG eess.SP math.OC

    Guaranteed Nonconvex Factorization Approach for Tensor Train Recovery

    Authors: Zhen Qin, Michael B. Wakin, Zhihui Zhu

    Abstract: In this paper, we provide the first convergence guarantee for the factorization approach. Specifically, to avoid the scaling ambiguity and to facilitate theoretical analysis, we optimize over the so-called left-orthogonal TT format which enforces orthonormality among most of the factors. To ensure the orthonormal structure, we utilize the Riemannian gradient descent (RGD) for optimizing those fact…

    Submitted 4 January, 2024; originally announced January 2024.

  31. arXiv:2401.00194  [pdf, other]

    cs.IT eess.SP

    On the Identifiability from Modulo Measurements under DFT Sensing Matrix

    Authors: Qi Zhang, Jiang Zhu, Fengzhong Qu, Zheng Zhu, De Wen Soh

    Abstract: Modulo sampling (MS) has been recently introduced to enhance the dynamic range of conventional ADCs by applying a modulo operator before sampling. This paper examines the identifiability of a measurement model where measurements are taken using a discrete Fourier transform (DFT) sensing matrix, followed by a modulo operator (modulo-DFT). Firstly, we derive a necessary and sufficient condition for…

    Submitted 6 August, 2024; v1 submitted 30 December, 2023; originally announced January 2024.

  32. arXiv:2311.12852  [pdf, ps, other]

    cs.IT eess.SP

    Cell-free Terahertz Networks: A Spatial-spectral Approach

    Authors: Zesheng Zhu, Lifeng Wang, Xin Wang, Bo Tan, Shi Jin

    Abstract: Cell-free network architecture plays a promising role in the terahertz (THz) networks since it provides better link reliability and uniformly good services for all the users compared to the co-located massive MIMO counterpart, and the spatial-spectral THz link has the advantages of lower initial access latency and fast beam operations. To this end, this work studies cell-free spatial-spectral THz…

    Submitted 21 October, 2023; originally announced November 2023.

  33. arXiv:2311.09775  [pdf, other]

    cs.AR eess.SP

    MEGA: A Memory-Efficient GNN Accelerator Exploiting Degree-Aware Mixed-Precision Quantization

    Authors: Zeyu Zhu, Fanrong Li, Gang Li, Zejian Liu, Zitao Mo, Qinghao Hu, Xiaoyao Liang, Jian Cheng

    Abstract: Graph Neural Networks (GNNs) are becoming a promising technique in various domains due to their excellent capabilities in modeling non-Euclidean data. Although a spectrum of accelerators has been proposed to accelerate the inference of GNNs, our analysis demonstrates that the latency and energy consumption induced by DRAM access still significantly impedes the improvement of performance and energy…

    Submitted 16 November, 2023; originally announced November 2023.

    Comments: 15 pages, 22 figures. Accepted at HPCA 2024

  34. arXiv:2311.08153  [pdf, other]

    eess.SY cs.AI

    When Mining Electric Locomotives Meet Reinforcement Learning

    Authors: Ying Li, Zhencai Zhu, Xiaoqiang Li, Chunyu Yang, Hao Lu

    Abstract: As the most important auxiliary transportation equipment in coal mines, mining electric locomotives are mostly operated manually at present. However, due to the complex and ever-changing coal mine environment, electric locomotive safety accidents occur frequently these years. A mining electric locomotive control method that can adapt to different complex mining environments is needed. Reinforcemen…

    Submitted 14 November, 2023; originally announced November 2023.

  35. arXiv:2310.18222  [pdf]

    eess.IV cs.CV cs.LG

    TBDLNet: a network for classifying multidrug-resistant and drug-sensitive tuberculosis

    Authors: Ziquan Zhu, Jing Tao, Shuihua Wang, Xin Zhang, Yudong Zhang

    Abstract: This paper proposes applying a novel deep-learning model, TBDLNet, to recognize CT images to classify multidrug-resistant and drug-sensitive tuberculosis automatically. The pre-trained ResNet50 is selected to extract features. Three randomized neural networks are used to alleviate the overfitting problem. The ensemble of three RNNs is applied to boost the robustness via majority voting. The propos…

    Submitted 27 October, 2023; originally announced October 2023.

    Journal ref: Engineering Reports (2023)

  36. arXiv:2310.10997  [pdf]

    eess.SY

    Cooperative Dispatch of Microgrids Community Using Risk-Sensitive Reinforcement Learning with Monotonously Improved Performance

    Authors: Ziqing Zhu, Xiang Gao, Siqi Bu, Ka Wing Chan, Bin Zhou, Shiwei Xia

    Abstract: The integration of individual microgrids (MGs) into Microgrid Clusters (MGCs) significantly improves the reliability and flexibility of energy supply, through resource sharing and ensuring backup during outages. The dispatch of MGCs is the key challenge to be tackled to ensure their secure and economic operation. Currently, there is a lack of optimization method that can achieve a trade-off among…

    Submitted 17 October, 2023; originally announced October 2023.

  37. arXiv:2310.09918  [pdf, other]

    eess.IV

    Pedestrian Accessible Infrastructure Inventory: Assessing Zero-Shot Segmentation on Multi-Mode Geospatial Data for All Pedestrian Types

    Authors: Jiahao Xia, Gavin Gong, Jiawei Liu, Zhigang Zhu, Hao Tang

    Abstract: In this paper, a Segment Anything Model (SAM)-based pedestrian infrastructure segmentation workflow is designed and optimized, which is capable of efficiently processing multi-sourced geospatial data including LiDAR data and satellite imagery data. We used an expanded definition of pedestrian infrastructure inventory which goes beyond the traditional transportation elements to include street furni…

    Submitted 27 November, 2023; v1 submitted 15 October, 2023; originally announced October 2023.

  38. arXiv:2309.06909  [pdf, other]

    eess.SP

    Intelligent Reflective Surface Assisted Integrated Sensing and Wireless Power Transfer

    Authors: Zheng Li, Zhengyu Zhu, Zheng Chu, Yingying Guan, De Mi, Fan Liu, Lie-Liang Yang

    Abstract: Wireless sensing and wireless energy are enablers to pave the way for smart transportation and a greener future. In this paper, an intelligent reflecting surface (IRS) assisted integrated sensing and wireless power transfer (ISWPT) system is investigated, where the transmitter in transportation infrastructure networks sends signals to sense multiple targets and simultaneously to multiple energy ha…

    Submitted 30 November, 2023; v1 submitted 13 September, 2023; originally announced September 2023.

    Comments: Firstly, the simulation has some errors and needs to be checked. Secondly, the author relationship between Zheng Li and Zheng Chu needs to be corrected

    ACM Class: H.4.3

  39. arXiv:2309.05446  [pdf, other]

    eess.IV cs.CV

    A Localization-to-Segmentation Framework for Automatic Tumor Segmentation in Whole-Body PET/CT Images

    Authors: Linghan Cai, Jianhao Huang, Zihang Zhu, Jinpeng Lu, Yongbing Zhang

    Abstract: Fluorodeoxyglucose (FDG) positron emission tomography (PET) combined with computed tomography (CT) is considered the primary solution for detecting some cancers, such as lung cancer and melanoma. Automatic segmentation of tumors in PET/CT images can help reduce doctors' workload, thereby improving diagnostic quality. However, precise tumor segmentation is challenging due to the small size of many…

    Submitted 14 September, 2023; v1 submitted 11 September, 2023; originally announced September 2023.

    Comments: 7 pages, 3 figures

  40. arXiv:2308.06064  [pdf, other]

    cs.IT eess.SP

    Joint Beamforming Optimization for Active STAR-RIS Assisted ISAC systems

    Authors: Shuang Zhang, Wanming Hao, Gangcan Sun, Chongwen Huang, Zhengyu Zhu, Xingwang Li, Chau Yuen

    Abstract: In this paper, we investigate an active simultaneously transmitting and reflecting reconfigurable intelligent surface (STAR-RIS) assisted integrated sensing and communications (ISAC) system, in which a dual-function base station (DFBS) equipped with multiple antennas provides communication services for multiple users with the assistance of an active STARRIS and performs target sensing simultaneous…

    Submitted 11 August, 2023; originally announced August 2023.

  41. arXiv:2307.14335  [pdf, other]

    cs.SD cs.AI cs.MM eess.AS

    WavJourney: Compositional Audio Creation with Large Language Models

    Authors: Xubo Liu, Zhongkai Zhu, Haohe Liu, Yi Yuan, Meng Cui, Qiushi Huang, Jinhua Liang, Yin Cao, Qiuqiang Kong, Mark D. Plumbley, Wenwu Wang

    Abstract: Despite breakthroughs in audio generation models, their capabilities are often confined to domain-specific conditions such as speech transcriptions and audio captions. However, real-world audio creation aims to generate harmonious audio containing various elements such as speech, music, and sound effects with controllable conditions, which is challenging to address using existing audio generation…

    Submitted 26 November, 2023; v1 submitted 26 July, 2023; originally announced July 2023.

    Comments: GitHub: https://github.com/Audio-AGI/WavJourney

  42. arXiv:2307.12262  [pdf, other]

    cs.SD cs.CL cs.HC eess.AS

    A meta learning scheme for fast accent domain expansion in Mandarin speech recognition

    Authors: Ziwei Zhu, Changhao Shan, Bihong Zhang, Jian Yu

    Abstract: Spoken languages show significant variation across Mandarin and accents. Despite the high performance of Mandarin automatic speech recognition (ASR), accent ASR is still a challenging task. In this paper, we introduce meta-learning techniques for fast accent domain expansion in Mandarin speech recognition, which expands the field of accents without deteriorating the performance of Mandarin ASR. Meta-…

    Submitted 23 July, 2023; originally announced July 2023.

  43. arXiv:2307.01486  [pdf, other]

    eess.IV cs.CV

    H-DenseFormer: An Efficient Hybrid Densely Connected Transformer for Multimodal Tumor Segmentation

    Authors: Jun Shi, Hongyu Kan, Shulan Ruan, Ziqi Zhu, Minfan Zhao, Liang Qiao, Zhaohui Wang, Hong An, Xudong Xue

    Abstract: Recently, deep learning methods have been widely used for tumor segmentation of multimodal medical images with promising results. However, most existing methods are limited by insufficient representational ability, specific modality number and high computational complexity. In this paper, we propose a hybrid densely connected network for tumor segmentation, named H-DenseFormer, which combines the…

    Submitted 4 July, 2023; originally announced July 2023.

    Comments: 11 pages, 2 figures. This paper has been accepted by Medical Image Computing and Computer-Assisted Intervention (MICCAI) 2023

  44. arXiv:2306.12099  [pdf, ps, other]

    eess.SP

    Anti-jamming Method for SAR Using Joint Waveform Modulation and Azimuth Mismatched Filtering

    Authors: Zhuan Sun, Zhanyu Zhu

    Abstract: High-fidelity deception jamming can seriously mislead Synthetic Aperture Radar (SAR) image interpretation and target detection, which is difficult to identify or eliminate through traditional anti-jamming methods. Based on the Range-Doppler Algorithm (RDA), an anti-jamming method for SAR by using joint waveform modulation and azimuth mismatched filtering is proposed in this paper. The signal model…

    Submitted 21 June, 2023; originally announced June 2023.

  45. arXiv:2306.09432  [pdf, other]

    quant-ph cs.IT eess.SP

    Quantum State Tomography for Matrix Product Density Operators

    Authors: Zhen Qin, Casey Jameson, Zhexuan Gong, Michael B. Wakin, Zhihui Zhu

    Abstract: The reconstruction of quantum states from experimental measurements, often achieved using quantum state tomography (QST), is crucial for the verification and benchmarking of quantum devices. However, performing QST for a generic unstructured quantum state requires an enormous number of state copies that grows exponentially with the number of individual quanta in the system, even for the mos…

    Submitted 18 February, 2024; v1 submitted 15 June, 2023; originally announced June 2023.

  46. arXiv:2306.04554  [pdf]

    physics.optics eess.IV

    Integrated Photonic Encoder for Terapixel Image Processing

    Authors: Xiao Wang, Brandon Redding, Nicholas Karl, Christopher Long, Zheyuan Zhu, Shuo Pang, David Brady, Raktim Sarma

    Abstract: Modern lens designs are capable of resolving >10 gigapixels, while advances in camera frame-rate and hyperspectral imaging have made Terapixel/s data acquisition a real possibility. The main bottlenecks preventing such high data-rate systems are power consumption and data storage. In this work, we show that analog photonic encoders could address this challenge, enabling high-speed image compressio…

    Submitted 7 June, 2023; originally announced June 2023.

  47. arXiv:2305.10925  [pdf, other]

    cs.CV eess.IV

    Unsupervised Hyperspectral Pansharpening via Low-rank Diffusion Model

    Authors: Xiangyu Rui, Xiangyong Cao, Li Pang, Zeyu Zhu, Zongsheng Yue, Deyu Meng

    Abstract: Hyperspectral pansharpening is a process of merging a high-resolution panchromatic (PAN) image and a low-resolution hyperspectral (LRHS) image to create a single high-resolution hyperspectral (HRHS) image. Existing Bayesian-based HS pansharpening methods require designing handcrafted image priors to characterize the image features, and deep learning-based HS pansharpening methods usually require a la…

    Submitted 19 November, 2023; v1 submitted 18 May, 2023; originally announced May 2023.

  48. arXiv:2305.08527  [pdf, other]

    cs.IT eess.SP

    Sum Secrecy Rate Maximization for IRS-aided Multi-Cluster MIMO-NOMA Terahertz Systems

    Authors: Jinlei Xu, Zhengyu Zhu, Zheng Chu, Hehao Niu, Pei Xiao, Inkyu Lee

    Abstract: Intelligent reflecting surface (IRS) is a promising technique to extend the network coverage and improve spectral efficiency. This paper investigates an IRS-assisted terahertz (THz) multiple-input multiple-output (MIMO)-nonorthogonal multiple access (NOMA) system based on hybrid precoding in the presence of an eavesdropper. Two types of sparse RF chain antenna structures are adopted, i.e., sub-conn…

    Submitted 11 June, 2023; v1 submitted 15 May, 2023; originally announced May 2023.

    Comments: 11 pages, 8 figures; references added

  49. arXiv:2305.02485   

    cs.AI cs.LG eess.SY

    How to Use Reinforcement Learning to Facilitate Future Electricity Market Design? Part 1: A Paradigmatic Theory

    Authors: Ziqing Zhu, Siqi Bu, Ka Wing Chan, Bin Zhou, Shiwei Xia

    Abstract: In the face of the pressing need for decarbonization in the power sector, the redesign of the electricity market is necessary as a macro-level approach to accommodate the high penetration of renewable generation, and to achieve power system operation security, economic efficiency, and environmental friendliness. However, existing market design methodologies suffer from the lack of coordination among ener…

    Submitted 11 May, 2023; v1 submitted 3 May, 2023; originally announced May 2023.

    Comments: This is an old version with mistakes

  50. arXiv:2304.06042  [pdf]

    eess.SP math.OC physics.optics

    A physical neural network training approach toward multi-plane light conversion design

    Authors: Zheyuan Zhu, Joe H. Doerr, Guifang Li, Shuo Pang

    Abstract: Multi-plane light converter (MPLC) designs supporting hundreds of modes are attractive in high-throughput optical communications. These photonic structures typically comprise >10 phase masks in free space, with millions of independent design parameters. Conventional MPLC design using wavefront matching updates one mask at a time while fixing the rest. Here we construct a physical neural network (P…

    Submitted 5 April, 2023; originally announced April 2023.

    Comments: Draft for submission to Optics Express