Showing 1–50 of 3,484 results for author: Wu

Searching in archive eess.
  1. arXiv:2408.17252  [pdf, other]

    eess.SP

    A Homogeneous Graph Neural Network for Precoding and Power Allocation in Scalable Wireless Networks

    Authors: Mingjun Sun, Zeng Li, Shaochuan Wu, Yuanwei Liu, Guoyu Li, Tong Zhang

    Abstract: Deep learning is widely used in wireless communications but struggles with fixed neural network sizes, which limit their adaptability in environments where the number of users and antennas varies. To overcome this, this paper introduces a generalization strategy for precoding and power allocation in scalable wireless networks. Initially, we employ an innovative approach to abstract the wireless ne…

    Submitted 30 August, 2024; originally announced August 2024.

    Comments: This work is submitted to IEEE for possible publication

  2. arXiv:2408.17186  [pdf, other]

    cs.HC cs.AI eess.SY

    "Benefit Game: Alien Seaweed Swarms" -- Real-time Gamification of Digital Seaweed Ecology

    Authors: Dan-Lu Fei, Zi-Wei Wu, Kang Zhang

    Abstract: "Benefit Game: Alien Seaweed Swarms" combines artificial life art and interactive game with installation to explore the impact of human activity on fragile seaweed ecosystems. The project aims to promote ecological consciousness by creating a balance in digital seaweed ecologies. Inspired by the real species "Laminaria saccharina", the author employs Procedural Content Generation via Machine Learn…

    Submitted 30 August, 2024; originally announced August 2024.

    Comments: Paper accepted at ISEA 24, The 29th International Symposium on Electronic Art, Brisbane, Australia, 21-29 June 2024

  3. arXiv:2408.16725  [pdf, other]

    cs.AI cs.CL cs.HC cs.LG cs.SD eess.AS

    Mini-Omni: Language Models Can Hear, Talk While Thinking in Streaming

    Authors: Zhifei Xie, Changqiao Wu

    Abstract: Recent advances in language models have achieved significant progress. GPT-4o, as a new milestone, has enabled real-time conversations with humans, demonstrating near-human natural fluency. Such human-computer interaction necessitates models with the capability to perform reasoning directly with the audio modality and generate output in streaming. However, this remains beyond the reach of current…

    Submitted 29 August, 2024; v1 submitted 29 August, 2024; originally announced August 2024.

    Comments: Technical report, work in progress. Demo and code: https://github.com/gpt-omni/mini-omni

  4. arXiv:2408.16277  [pdf]

    eess.IV cs.CV

    Fine-grained Classification of Port Wine Stains Using Optical Coherence Tomography Angiography

    Authors: Xiaofeng Deng, Defu Chen, Bowen Liu, Xiwan Zhang, Haixia Qiu, Wu Yuan, Hongliang Ren

    Abstract: Accurate classification of port wine stains (PWS, vascular malformations present at birth) is critical for subsequent treatment planning. However, the current method of classifying PWS based on the external skin appearance rarely reflects the underlying angiopathological heterogeneity of PWS lesions, resulting in inconsistent outcomes with the common vascular-targeted photodynamic therapy (V-PDT)…

    Submitted 29 August, 2024; originally announced August 2024.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  5. arXiv:2408.16197  [pdf, other]

    eess.SY

    Economic Optimal Power Management of Second-Life Battery Energy Storage Systems

    Authors: Amir Farakhor, Di Wu, Pingen Chen, Junmin Wang, Yebin Wang, Huazhen Fang

    Abstract: Second-life battery energy storage systems (SL-BESS) are an economical means of long-duration grid energy storage. They utilize retired battery packs from electric vehicles to store and provide electrical energy at the utility scale. However, they pose critical challenges in achieving optimal utilization and extending their remaining useful life. These complications primarily result from the const…

    Submitted 28 August, 2024; originally announced August 2024.

  6. arXiv:2408.16030  [pdf]

    cs.SD cs.AI cs.LG eess.AS

    A Deep Learning Approach to Localizing Multi-level Airway Collapse Based on Snoring Sounds

    Authors: Ying-Chieh Hsu, Stanley Yung-Chuan Liu, Chao-Jung Huang, Chi-Wei Wu, Ren-Kai Cheng, Jane Yung-Jen Hsu, Shang-Ran Huang, Yuan-Ren Cheng, Fu-Shun Hsu

    Abstract: This study investigates the application of machine/deep learning to classify snoring sounds excited at different levels of the upper airway in patients with obstructive sleep apnea (OSA) using data from drug-induced sleep endoscopy (DISE). The snoring sounds of 39 subjects were analyzed and labeled according to the Velum, Oropharynx, Tongue Base, and Epiglottis (VOTE) classification system. The da…

    Submitted 28 August, 2024; originally announced August 2024.

  7. VoxInstruct: Expressive Human Instruction-to-Speech Generation with Unified Multilingual Codec Language Modelling

    Authors: Yixuan Zhou, Xiaoyu Qin, Zeyu Jin, Shuoyi Zhou, Shun Lei, Songtao Zhou, Zhiyong Wu, Jia Jia

    Abstract: Recent AIGC systems possess the capability to generate digital multimedia content based on human language instructions, such as text, image and video. However, when it comes to speech, existing methods related to human instruction-to-speech generation exhibit two limitations. Firstly, they require the division of inputs into content prompt (transcript) and description prompt (style and speaker), i…

    Submitted 28 August, 2024; originally announced August 2024.

    Comments: Accepted by ACM Multimedia 2024

  8. arXiv:2408.15668  [pdf, ps, other]

    cs.IT eess.SP

    Movable Antennas Meet Intelligent Reflecting Surface: When Do We Need Movable Antennas?

    Authors: Xin Wei, Weidong Mei, Qingqing Wu, Boyu Ning, Zhi Chen

    Abstract: Intelligent reflecting surface (IRS) and movable antenna (MA)/fluid antenna (FA) techniques have both received increasing attention in the realm of wireless communications due to their ability to reconfigure and improve wireless channel conditions. In this paper, we investigate the integration of MAs/FAs into an IRS-assisted wireless communication system. In particular, we consider the downlink tr…

    Submitted 29 August, 2024; v1 submitted 28 August, 2024; originally announced August 2024.

    Comments: 6 pages, 6 figures, submitted to IEEE WCNC 2025

  9. arXiv:2408.15490  [pdf, ps, other]

    eess.SP

    Symbiotic Sensing and Communication: Framework and Beamforming Design

    Authors: Fanghao Xia, Zesong Fei, Xinyi Wang, Weijie Yuan, Qingqing Wu, Yuanwei Liu, Tony Q. S. Quek

    Abstract: In this paper, we propose a novel symbiotic sensing and communication (SSAC) framework, comprising a base station (BS) and a passive sensing node. In particular, the BS transmits a communication waveform to serve vehicle users (VUEs), while the sensing node is employed to execute sensing tasks based on the echoes in a bistatic manner, thereby avoiding the issue of self-interference. Besides the weak…

    Submitted 27 August, 2024; originally announced August 2024.

    Comments: 16 pages, 11 figures, submitted to IEEE journals for possible publication

  10. arXiv:2408.15435  [pdf, other]

    eess.SP

    Globally Optimal Movable Antenna-Enhanced multi-user Communication: Discrete Antenna Positioning, Motion Power Consumption, and Imperfect CSI

    Authors: Yifei Wu, Dongfang Xu, Derrick Wing Kwan Ng, Wolfgang Gerstacker, Robert Schober

    Abstract: Movable antennas (MAs) represent a promising paradigm to enhance the spatial degrees of freedom of conventional multi-antenna systems by dynamically adapting the positions of antenna elements within a designated transmit area. In particular, by employing electro-mechanical MA drivers, the positions of the MA elements can be adjusted to shape a favorable spatial correlation for improving system per…

    Submitted 27 August, 2024; originally announced August 2024.

  11. arXiv:2408.14758  [pdf, other]

    eess.SY

    Learning-Based Adaptive Dynamic Routing with Stability Guarantee for a Single-Origin-Single-Destination Network

    Authors: Yidan Wu, Feixiang Shu, Jianan Zhang, Li Jin

    Abstract: We consider learning-based adaptive dynamic routing for a single-origin-single-destination queuing network with stability guarantees. Specifically, we study a class of generalized shortest path policies that can be parameterized by only two constants via a piecewise-linear function. Using the Foster-Lyapunov stability theory, we develop a criterion on the parameters to ensure mean boundedness of t…

    Submitted 26 August, 2024; originally announced August 2024.

  12. arXiv:2408.14340  [pdf, other]

    cs.SD cs.AI cs.CL cs.LG eess.AS

    Foundation Models for Music: A Survey

    Authors: Yinghao Ma, Anders Øland, Anton Ragni, Bleiz MacSen Del Sette, Charalampos Saitis, Chris Donahue, Chenghua Lin, Christos Plachouras, Emmanouil Benetos, Elio Quinton, Elona Shatri, Fabio Morreale, Ge Zhang, György Fazekas, Gus Xia, Huan Zhang, Ilaria Manco, Jiawen Huang, Julien Guinot, Liwei Lin, Luca Marinelli, Max W. Y. Lam, Megha Sharma, Qiuqiang Kong, Roger B. Dannenberg, et al. (18 additional authors not shown)

    Abstract: In recent years, foundation models (FMs) such as large language models (LLMs) and latent diffusion models (LDMs) have profoundly impacted diverse sectors, including music. This comprehensive review examines state-of-the-art (SOTA) pre-trained models and foundation models in music, spanning from representation learning, generative learning and multimodal learning. We first contextualise the signifi…

    Submitted 27 August, 2024; v1 submitted 26 August, 2024; originally announced August 2024.

  13. arXiv:2408.14057  [pdf, other]

    math.NA cs.DC cs.NE eess.SY nlin.CD

    Revisiting time-variant complex conjugate matrix equations with their corresponding real field time-variant large-scale linear equations, neural hypercomplex numbers space compressive approximation approach

    Authors: Jiakuang He, Dongqing Wu

    Abstract: Large-scale linear equations and high dimension have been hot topics in deep learning, machine learning, control, and scientific computing. Because of special conjugate operation characteristics, time-variant complex conjugate matrix equations need to be transformed into corresponding real field time-variant large-scale linear equations. In this paper, zeroing neural dynamic models based on complex…

    Submitted 26 August, 2024; originally announced August 2024.

  14. arXiv:2408.13893  [pdf, other]

    cs.SD cs.CL eess.AS

    SimpleSpeech 2: Towards Simple and Efficient Text-to-Speech with Flow-based Scalar Latent Transformer Diffusion Models

    Authors: Dongchao Yang, Rongjie Huang, Yuanyuan Wang, Haohan Guo, Dading Chong, Songxiang Liu, Xixin Wu, Helen Meng

    Abstract: Scaling Text-to-speech (TTS) to large-scale datasets has been demonstrated as an effective method for improving the diversity and naturalness of synthesized speech. At a high level, previous large-scale TTS models can be categorized into either Auto-regressive (AR) based (e.g., VALL-E) or Non-auto-regressive (NAR) based models (e.g., NaturalSpeech 2/3). Although these works dem…

    Submitted 28 August, 2024; v1 submitted 25 August, 2024; originally announced August 2024.

    Comments: Submitted to TASLP

  15. arXiv:2408.13483  [pdf, other]

    eess.SP cs.IT

    Transmissive RIS Enabled Transceiver Systems: Architecture, Design Issues and Opportunities

    Authors: Zhendong Li, Wen Chen, Qingqing Wu, Ziwei Liu, Chong He, Xudong Bai, Jun Li

    Abstract: Reconfigurable intelligent surface (RIS) is anticipated to augment the performance of beyond fifth-generation (B5G) and sixth-generation (6G) networks by intelligently manipulating the state of its components. Rather than employing reflective RIS for aided communications, this paper proposes an innovative transmissive RIS-enabled transceiver (TRTC) architecture that can accomplish the functions of…

    Submitted 24 August, 2024; originally announced August 2024.

    Journal ref: IEEE VTM, 2024

  16. arXiv:2408.13447  [pdf, ps, other]

    eess.SP

    FAS-RIS Communication: Model, Analysis, and Optimization

    Authors: Junteng Yao, Jianchao Zheng, Tuo Wu, Ming Jin, Chau Yuen, Kai-Kit Wong, Fumiyuki Adachi

    Abstract: This correspondence investigates the novel fluid antenna system (FAS) technology, combined with a reconfigurable intelligent surface (RIS) for wireless communications, where a base station (BS) communicates with a FAS-enabled user with the assistance of a RIS. To analyze this technology, we derive the outage probability based on the block-diagonal matrix approximation (BDMA) model. With this, we ob…

    Submitted 23 August, 2024; originally announced August 2024.

  17. arXiv:2408.13444  [pdf, ps, other]

    eess.SP

    FAS-RIS: A Block-Correlation Model Analysis

    Authors: Xiazhi Lai, Junteng Yao, Kangda Zhi, Tuo Wu, David Morales-Jimenez, Kai-Kit Wong

    Abstract: In this correspondence, we analyze the performance of a reconfigurable intelligent surface (RIS)-aided communication system that involves a fluid antenna system (FAS)-enabled receiver. By applying the central limit theorem (CLT), we derive approximate expressions for the system outage probability when the RIS has a large number of elements. Also, we adopt the block-correlation channel model to sim…

    Submitted 23 August, 2024; originally announced August 2024.

  18. arXiv:2408.13403  [pdf, other]

    eess.SP

    Beam Profiling and Beamforming Modeling for mmWave NextG Networks

    Authors: Efat Samir Fathalla, Sahar Zargarzadeh, Chunsheng Xin, Hongyi Wu, Peng Jiang, Joao F. Santos, Jacek Kibilda, Aloizio Pereira da

    Abstract: This paper presents an experimental study on mmWave beam profiling on a mmWave testbed, and develops a machine learning model for beamforming based on the experiment data. The datasets we have obtained from the beam profiling and the machine learning model for beamforming are valuable for a broad set of network design problems, such as network topology optimization, user equipment association, pow…

    Submitted 23 August, 2024; originally announced August 2024.

    Comments: In Proceedings of IEEE International Conference on Computer Communications and Networks (ICCCN), 2023

  19. arXiv:2408.13290  [pdf, ps, other]

    eess.IV cs.CV

    Multi-modal Intermediate Feature Interaction AutoEncoder for Overall Survival Prediction of Esophageal Squamous Cell Cancer

    Authors: Chengyu Wu, Yatao Zhang, Yaqi Wang, Qifeng Wang, Shuai Wang

    Abstract: Survival prediction for esophageal squamous cell cancer (ESCC) is crucial for doctors to assess a patient's condition and tailor treatment plans. The application and development of multi-modal deep learning in this field have attracted attention in recent years. However, the prognostically relevant features between cross-modalities have not been further explored in previous studies, which could hi…

    Submitted 23 August, 2024; originally announced August 2024.

    Comments: Accepted by ISBI 2024

  20. arXiv:2408.13054  [pdf, other]

    cs.RO cs.AI cs.LG eess.SY

    cc-DRL: a Convex Combined Deep Reinforcement Learning Flight Control Design for a Morphing Quadrotor

    Authors: Tao Yang, Huai-Ning Wu, Jun-Wei Wang

    Abstract: In comparison to common quadrotors, the shape change of morphing quadrotors endows them with better flight performance but also results in more complex flight dynamics. Generally, it is extremely difficult or even impossible to establish an accurate mathematical model describing the complex flight dynamics of morphing quadrotors. To address the issue of flight control design for morphin…

    Submitted 23 August, 2024; originally announced August 2024.

  21. arXiv:2408.13040  [pdf, other]

    eess.AS cs.AI cs.CL cs.LG

    SpeechPrompt: Prompting Speech Language Models for Speech Processing Tasks

    Authors: Kai-Wei Chang, Haibin Wu, Yu-Kai Wang, Yuan-Kuei Wu, Hua Shen, Wei-Cheng Tseng, Iu-thing Kang, Shang-Wen Li, Hung-yi Lee

    Abstract: Prompting has become a practical method for utilizing pre-trained language models (LMs). This approach offers several advantages. It allows an LM to adapt to new tasks with minimal training and parameter updates, thus achieving efficiency in both storage and computation. Additionally, prompting modifies only the LM's inputs and harnesses the generative capabilities of language models to address va…

    Submitted 23 August, 2024; originally announced August 2024.

    Comments: Published in IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP)

    Journal ref: in IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 32, pp. 3730-3744, 2024

  22. arXiv:2408.12658  [pdf, other]

    cs.SD cs.AI cs.LG eess.AS

    Hierarchical Generative Modeling of Melodic Vocal Contours in Hindustani Classical Music

    Authors: Nithya Shikarpur, Krishna Maneesha Dendukuri, Yusong Wu, Antoine Caillon, Cheng-Zhi Anna Huang

    Abstract: Hindustani music is a performance-driven oral tradition that exhibits the rendition of rich melodic patterns. In this paper, we focus on generative modeling of singers' vocal melodies extracted from audio recordings, as the voice is musically prominent within the tradition. Prior generative work in Hindustani music models melodies as coarse discrete symbols, which fails to capture the rich expressi…

    Submitted 26 August, 2024; v1 submitted 22 August, 2024; originally announced August 2024.

    Comments: Accepted at International Society for Music Information Retrieval (ISMIR) 2024

  23. arXiv:2408.12329  [pdf, ps, other]

    cs.IT eess.SP

    Asynchronous Cell-Free Massive MIMO-OFDM: Mixed Coherent and Non-Coherent Transmissions

    Authors: Guoyu Li, Shaochuan Wu, Changsheng You, Wenbin Zhang, Guanyu Shang

    Abstract: In this letter, we analyze the performance of a mixed coherent and non-coherent transmission approach, which can improve the performance of cell-free multiple-input multiple-output orthogonal frequency division multiplexing (CF mMIMO-OFDM) systems under asynchronous reception. To this end, we first obtain the achievable downlink sum-rate for the mixed coherent and non-coherent transmissions, and th…

    Submitted 22 August, 2024; originally announced August 2024.

    Comments: This work is submitted to IEEE for possible publication

  24. arXiv:2408.11405  [pdf, other]

    cs.SD eess.AS

    DDSP Guitar Amp: Interpretable Guitar Amplifier Modeling

    Authors: Yen-Tung Yeh, Yu-Hua Chen, Yuan-Chiao Cheng, Jui-Te Wu, Jun-Jie Fu, Yi-Fan Yeh, Yi-Hsuan Yang

    Abstract: Neural network models for guitar amplifier emulation, while being effective, often demand high computational cost and lack interpretability. Drawing ideas from physical amplifier design, this paper aims to address these issues with a new differentiable digital signal processing (DDSP)-based model, called "DDSP guitar amp," that models the four components of a guitar amp (i.e., preamp, tone stack…

    Submitted 21 August, 2024; originally announced August 2024.

    Comments: Preprint paper

  25. arXiv:2408.11227  [pdf]

    eess.IV cs.AI cs.CV

    OCTCube: A 3D foundation model for optical coherence tomography that improves cross-dataset, cross-disease, cross-device and cross-modality analysis

    Authors: Zixuan Liu, Hanwen Xu, Addie Woicik, Linda G. Shapiro, Marian Blazes, Yue Wu, Cecilia S. Lee, Aaron Y. Lee, Sheng Wang

    Abstract: Optical coherence tomography (OCT) has become critical for diagnosing retinal diseases as it enables 3D images of the retina and optic nerve. OCT acquisition is fast, non-invasive, affordable, and scalable. Due to its broad applicability, massive numbers of OCT images have been accumulated in routine exams, making it possible to train large-scale foundation models that can generalize to various di…

    Submitted 20 August, 2024; originally announced August 2024.

  26. arXiv:2408.10390  [pdf, other]

    eess.SY

    Self-Refined Generative Foundation Models for Wireless Traffic Prediction

    Authors: Chengming Hu, Hao Zhou, Di Wu, Xi Chen, Jun Yan, Xue Liu

    Abstract: With a broad range of emerging applications in 6G networks, wireless traffic prediction has become a critical component of network management. However, the dynamically shifting distribution of wireless traffic in non-stationary 6G networks presents significant challenges to achieving accurate and stable predictions. Motivated by recent advancements in Generative AI (GAI)-enabled 6G networks, this…

    Submitted 19 August, 2024; originally announced August 2024.

  27. arXiv:2408.10236  [pdf, other]

    eess.IV cs.CV

    AID-DTI: Accelerating High-fidelity Diffusion Tensor Imaging with Detail-preserving Model-based Deep Learning

    Authors: Wenxin Fan, Jian Cheng, Cheng Li, Jing Yang, Ruoyou Wu, Juan Zou, Shanshan Wang

    Abstract: Deep learning has shown great potential in accelerating diffusion tensor imaging (DTI). Nevertheless, existing methods tend to suffer from Rician noise and eddy current, leading to detail loss in reconstructing the DTI-derived parametric maps especially when sparsely sampled q-space data are used. To address this, this paper proposes a novel method, AID-DTI (Accelerating hIgh fi…

    Submitted 4 August, 2024; originally announced August 2024.

    Comments: 12 pages, 3 figures, MICCAI 2024 Workshop on Computational Diffusion MRI. arXiv admin note: text overlap with arXiv:2401.01693, arXiv:2405.03159

  28. arXiv:2408.09357  [pdf, other]

    cs.GR cs.AI cs.SD eess.AS

    Meta-Learning Empowered Meta-Face: Personalized Speaking Style Adaptation for Audio-Driven 3D Talking Face Animation

    Authors: Xukun Zhou, Fengxin Li, Ziqiao Peng, Kejian Wu, Jun He, Biao Qin, Zhaoxin Fan, Hongyan Liu

    Abstract: Audio-driven 3D face animation is increasingly vital in live streaming and augmented reality applications. While remarkable progress has been observed, most existing approaches are designed for specific individuals with predefined speaking styles, thus neglecting the adaptability to varied speaking styles. To address this limitation, this paper introduces MetaFace, a novel methodology meticulously…

    Submitted 18 August, 2024; originally announced August 2024.

  29. arXiv:2408.09315  [pdf, other]

    eess.IV cs.CV

    Unpaired Volumetric Harmonization of Brain MRI with Conditional Latent Diffusion

    Authors: Mengqi Wu, Minhui Yu, Shuaiming Jing, Pew-Thian Yap, Zhengwu Zhang, Mingxia Liu

    Abstract: Multi-site structural MRI is increasingly used in neuroimaging studies to diversify subject cohorts. However, combining MR images acquired from various sites/centers may introduce site-related non-biological variations. Retrospective image harmonization helps address this issue, but current methods usually perform harmonization on pre-extracted hand-crafted radiomic features, limiting downstream a…

    Submitted 17 August, 2024; originally announced August 2024.

  30. arXiv:2408.09067  [pdf, ps, other]

    eess.SP

    FAS vs. ARIS: Which Is More Important for FAS-ARIS Communication Systems?

    Authors: Junteng Yao, Liaoshi Zhou, Tuo Wu, Ming Jin, Chongwen Huang, Chau Yuen

    Abstract: In this paper, we investigate the question of which technology, fluid antenna systems (FAS) or active reconfigurable intelligent surfaces (ARIS), plays a more crucial role in FAS-ARIS wireless communication systems. To address this, we develop a comprehensive system model and explore the problem from an optimization perspective. We introduce an alternating optimization (AO) algorithm incorporating…

    Submitted 16 August, 2024; originally announced August 2024.

  31. arXiv:2408.08456  [pdf, other]

    eess.IV cs.AI cs.CV cs.LG

    Efficient Data-Sketches and Fine-Tuning for Early Detection of Distributional Drift in Medical Imaging

    Authors: Yusen Wu, Hao Chen, Alex Pissinou Makki, Phuong Nguyen, Yelena Yesha

    Abstract: Distributional drift detection is important in medical applications as it helps ensure the accuracy and reliability of models by identifying changes in the underlying data distribution that could affect diagnostic or treatment decisions. However, current methods have limitations in detecting drift; for example, the inclusion of abnormal datasets can lead to unfair comparisons. This paper presents…

    Submitted 15 August, 2024; originally announced August 2024.

  32. arXiv:2408.08228  [pdf, other]

    eess.IV cs.CV

    Rethinking Medical Anomaly Detection in Brain MRI: An Image Quality Assessment Perspective

    Authors: Zixuan Pan, Jun Xia, Zheyu Yan, Guoyue Xu, Yawen Wu, Zhenge Jia, Jianxu Chen, Yiyu Shi

    Abstract: Reconstruction-based methods, particularly those leveraging autoencoders, have been widely adopted to perform anomaly detection in brain MRI. While most existing works try to improve detection accuracy by proposing new model structures or algorithms, we tackle the problem through image quality assessment, an underexplored perspective in the field. We propose a fusion quality loss function that com…

    Submitted 15 August, 2024; originally announced August 2024.

  33. arXiv:2408.08121  [pdf]

    eess.SY

    Optimizing Highway Ramp Merge Safety and Efficiency via Spatio-Temporal Cooperative Control and Vehicle-Road Coordination

    Authors: Ting Peng, Xiaoxue Xu, Yuan Li, Jie Wu, Tao Li, Xiang Dong, Yincai Cai, Peng Wu

    Abstract: In existing automated driving, it is difficult to obtain the status and driving intention of other vehicles accurately and in a timely manner. The safety risk and urgency of autonomous vehicles in the absence of collision are evaluated. To ensure safety and improve road efficiency, a method of pre-compiling the spatio-temporal trajectory of vehicles is established to eliminate conflicts between vehicl…

    Submitted 15 August, 2024; originally announced August 2024.

  34. arXiv:2408.07931  [pdf, other]

    cs.CV cs.AI cs.RO eess.IV

    Surgical SAM 2: Real-time Segment Anything in Surgical Video by Efficient Frame Pruning

    Authors: Haofeng Liu, Erli Zhang, Junde Wu, Mingxuan Hong, Yueming Jin

    Abstract: Surgical video segmentation is a critical task in computer-assisted surgery and is vital for enhancing surgical quality and patient outcomes. Recently, the Segment Anything Model 2 (SAM2) framework has shown superior advancements in image and video segmentation. However, SAM2 struggles with efficiency due to the high computational demands of processing high-resolution images and complex and long-r…

    Submitted 15 August, 2024; originally announced August 2024.

    Comments: 16 pages, 2 figures

  35. arXiv:2408.07897  [pdf, other]

    cs.LG cs.IR cs.MA eess.SY

    The Nah Bandit: Modeling User Non-compliance in Recommendation Systems

    Authors: Tianyue Zhou, Jung-Hoon Cho, Cathy Wu

    Abstract: Recommendation systems now pervade the digital world, ranging from advertising to entertainment. However, it remains challenging to implement effective recommendation systems in the physical world, such as in mobility or health. This work focuses on a key challenge: in the physical world, it is often easy for the user to opt out of taking any recommendation if it is not to their liking, and to fa…

    Submitted 14 August, 2024; originally announced August 2024.

    Comments: 12 pages, 8 figures, under review

  36. arXiv:2408.07770  [pdf, ps, other]

    eess.SP

    User-Centric Machine Learning for Resource Allocation in MPTCP-Enabled Hybrid LiFi and WiFi Networks

    Authors: Han Ji, Declan T. Delaney, Xiping Wu

    Abstract: As an emerging paradigm of heterogeneous networks (HetNets) towards 6G, the hybrid light fidelity (LiFi) and wireless fidelity (WiFi) networks (HLWNets) have the potential to explore the complementary advantages of the optical and radio spectra. Like other cooperation-native HetNets, HLWNets face a crucial load balancing (LB) problem due to the heterogeneity of access points (APs). The existing litera…

    Submitted 14 August, 2024; originally announced August 2024.

  37. arXiv:2408.07325  [pdf, other]

    eess.IV cs.GR

    RoCoSDF: Row-Column Scanned Neural Signed Distance Fields for Freehand 3D Ultrasound Imaging Shape Reconstruction

    Authors: Hongbo Chen, Yuchong Gao, Shuhang Zhang, Jiangjie Wu, Yuexin Ma, Rui Zheng

    Abstract: The reconstruction of high-quality shape geometry is crucial for developing freehand 3D ultrasound imaging. However, the shape reconstruction of multi-view ultrasound data remains challenging due to the elevation distortion caused by thick transducer probes. In this paper, we present a novel learning-based framework RoCoSDF, which can effectively generate an implicit surface through continuous sha…

    Submitted 14 August, 2024; originally announced August 2024.

    Comments: Accepted by MICCAI 2024

  38. arXiv:2408.07085  [pdf]

    physics.class-ph eess.SP physics.optics

    New Bounds on Antenna Bandwidth and Directivity: Corrections to the Chu-Harrington Limits

    Authors: Carl Pfeiffer, Bae-Ian Wu

    Abstract: The Chu circuit model provides the basis for analyzing the minimum radiation quality factor, Q, of a given spherical mode. However, examples of electrically large radiators readily demonstrate that this Q limit is incorrect. Spherical mode radiation is reexamined and an equivalent 1D transmission line model is derived that exactly models the fields. This model leads to a precise cutoff frequency o…

    Submitted 8 August, 2024; originally announced August 2024.

    Comments: 17 pages, 15 figures

  39. arXiv:2408.06870  [pdf, ps, other]

    eess.SP

    Spectrum Prediction With Deep 3D Pyramid Vision Transformer Learning

    Authors: Guangliang Pan, Qihui Wu, Bo Zhou, Jie Li, Wei Wang, Guoru Ding, David K. Y. Yau

    Abstract: In this paper, we propose a deep learning (DL)-based task-driven spectrum prediction framework, named DeepSPred. The DeepSPred comprises a feature encoder and a task predictor, where the encoder extracts spectrum usage pattern features, and the predictor configures different networks according to the task requirements to predict future spectrum. Based on the DeepSPred, we first propose a novel 3…

    Submitted 20 August, 2024; v1 submitted 13 August, 2024; originally announced August 2024.

  40. Sum Rate Maximization for Movable Antenna Enabled Uplink NOMA

    Authors: Nianzu Li, Peiran Wu, Boyu Ning, Lipeng Zhu

    Abstract: Movable antenna (MA) has been recently proposed as a promising candidate technology for the next generation wireless communication systems due to its significant capability of reconfiguring wireless channels via antenna movement. In this letter, we study an MA-enabled uplink non-orthogonal multiple access (NOMA) system, where each user is equipped with a single MA. Our objective is to maximize the…

    Submitted 13 August, 2024; originally announced August 2024.

    Comments: 5 pages, 3 figures. Accepted to IEEE Wireless Communications Letters

  41. arXiv:2408.06667  [pdf, ps, other]

    eess.SP

    Joint Source-Channel Optimization for UAV Video Coding and Transmission

    Authors: Kesong Wu, Xianbin Cao, Peng Yang, Haijun Zhang, Tony Q. S. Quek, Dapeng Oliver Wu

    Abstract: This paper is concerned with unmanned aerial vehicle (UAV) video coding and transmission in scenarios such as emergency rescue and environmental monitoring. Unlike existing methods of modeling UAV video source coding and channel transmission separately, we investigate the joint source-channel optimization issue for video coding and transmission. Particularly, we design eight-dimensional delay-powe…

    Submitted 19 August, 2024; v1 submitted 13 August, 2024; originally announced August 2024.

  42. Deep Inertia $L_p$ Half-Quadratic Splitting Unrolling Network for Sparse View CT Reconstruction

    Authors: Yu Guo, Caiying Wu, Yaxin Li, Qiyu Jin, Tieyong Zeng

    Abstract: Sparse view computed tomography (CT) reconstruction poses a challenging ill-posed inverse problem, necessitating effective regularization techniques. In this letter, we employ $L_p$-norm ($0<p<1$) regularization to induce sparsity and introduce inertial steps, leading to the development of the inertial $L_p$-norm half-quadratic splitting algorithm. We rigorously prove the convergence of this algor…

    Submitted 12 August, 2024; originally announced August 2024.

    Comments: This paper was accepted by IEEE Signal Processing Letters on July 28, 2024

    Journal ref: IEEE Signal Processing Letters, 2024, 31:2030-2034

  43. arXiv:2408.06351  [pdf, other]

    eess.SP math.ST

    A Probabilistic Approach for Queue Length Estimation Using License Plate Recognition Data: Considering Overtaking in Multi-lane Scenarios

    Authors: Lyuzhou Luo, Hao Wu, Jiahao Liu, Keshuang Tang, Chaopeng Tan

    Abstract: Multi-section license plate recognition (LPR) data provides input-output information and sampled travel times of the investigated link, serving as an ideal data source for lane-based queue length estimation in recent studies. However, most of these studies assumed the strict FIFO rule or a specific arrival process, thus ignoring the potential impact of overtaking and the variation of traffic flows…

    Submitted 24 July, 2024; originally announced August 2024.

    Comments: 30 pages, 20 figures

  44. arXiv:2408.06185  [pdf, other]

    eess.SY cs.CY cs.GT cs.NI

    Hi-SAM: A high-scalable authentication model for satellite-ground Zero-Trust system using mean field game

    Authors: Xuesong Wu, Tianshuai Zheng, Runfang Wu, Jie Ren, Junyan Guo, Ye Du

    Abstract: As more and more Internet of Things (IoT) devices are connected to satellite networks, the Zero-Trust Architecture brings dynamic security to the satellite-ground system, while frequent authentication creates challenges for system availability. To make the system accommodate more IoT devices, this paper proposes a high-scalable authentication model (Hi-SAM). Hi-SAM introduces the Proof-of-Work id…

    Submitted 12 August, 2024; originally announced August 2024.

  45. arXiv:2408.06027  [pdf, other]

    eess.SP cs.LG

    A Comprehensive Survey on EEG-Based Emotion Recognition: A Graph-Based Perspective

    Authors: Chenyu Liu, Xinliang Zhou, Yihao Wu, Yi Ding, Liming Zhai, Kun Wang, Ziyu Jia, Yang Liu

    Abstract: Compared to other modalities, electroencephalogram (EEG) based emotion recognition can intuitively respond to emotional patterns in the human brain and, therefore, has become one of the most studied tasks in affective computing. The nature of emotions is a physiological and psychological state change in response to brain region connectivity, making emotion recognition focus more on the dependency…

    Submitted 13 August, 2024; v1 submitted 12 August, 2024; originally announced August 2024.

  46. arXiv:2408.05746  [pdf, ps, other]

    cs.IT eess.SP

    Movable Antenna Enhanced AF Relaying: Two-Stage Antenna Position Optimization

    Authors: Nianzu Li, Weidong Mei, Boyu Ning, Peiran Wu

    Abstract: The movable antenna (MA) technology has attracted increasing attention in wireless communications due to its capability for flexibly adjusting the positions of multiple antennas in a local region to reconfigure channel conditions. In this paper, we investigate its application in an amplify-and-forward (AF) relay system, where a multi-MA AF relay is deployed to assist in the wireless communications…

    Submitted 11 August, 2024; originally announced August 2024.

  47. arXiv:2408.05609  [pdf, other]

    eess.SY cs.AI cs.LG cs.MA cs.RO

    Mitigating Metropolitan Carbon Emissions with Dynamic Eco-driving at Scale

    Authors: Vindula Jayawardana, Baptiste Freydt, Ao Qu, Cameron Hickert, Edgar Sanchez, Catherine Tang, Mark Taylor, Blaine Leonard, Cathy Wu

    Abstract: The sheer scale and diversity of transportation make it a formidable sector to decarbonize. Here, we consider an emerging opportunity to reduce carbon emissions: the growing adoption of semi-autonomous vehicles, which can be programmed to mitigate stop-and-go traffic through intelligent speed commands and, thus, reduce emissions. But would such dynamic eco-driving move the needle on climate change…

    Submitted 10 August, 2024; originally announced August 2024.

    Comments: In review

  48. arXiv:2408.04972  [pdf, other]

    eess.SP

    Digital Semantic Communications: An Alternating Multi-Phase Training Strategy with Mask Attack

    Authors: Mingze Gong, Shuoyao Wang, Suzhi Bi, Yuan Wu, Liping Qian

    Abstract: Semantic communication (SemComm) has emerged as a new paradigm shift. Most existing SemComm systems transmit continuously distributed signals in an analog fashion. However, the analog paradigm is not compatible with current digital communication frameworks. In this paper, we propose an alternating multi-phase training strategy (AMP) to enable the joint training of the networks in the encoder and decoder…

    Submitted 9 August, 2024; originally announced August 2024.

  49. arXiv:2408.04358  [pdf, other]

    eess.SY

    Goal-Oriented UAV Communication Design and Optimization for Target Tracking: A Machine Learning Approach

    Authors: Wenchao Wu, Yanning Wu, Yuanqing Yang, Yansha Deng

    Abstract: To accomplish various tasks, safe and smooth control of unmanned aerial vehicles (UAVs) needs to be guaranteed, which cannot be met by existing ultra-reliable low latency communications (URLLC). This has attracted the attention of the communication field, where most existing work mainly focused on optimizing communication performance (i.e., delay) and ignored the performance of the task (i.e., tra…

    Submitted 8 August, 2024; originally announced August 2024.

  50. arXiv:2408.04325  [pdf, other]

    eess.AS cs.CL

    HydraFormer: One Encoder For All Subsampling Rates

    Authors: Yaoxun Xu, Xingchen Song, Zhiyong Wu, Di Wu, Zhendong Peng, Binbin Zhang

    Abstract: In automatic speech recognition, subsampling is essential for tackling diverse scenarios. However, the inadequacy of a single subsampling rate to address various real-world situations often necessitates training and deploying multiple models, consequently increasing associated costs. To address this issue, we propose HydraFormer, comprising HydraSub, a Conformer-based encoder, and a BiTransformer-…

    Submitted 8 August, 2024; originally announced August 2024.

    Comments: accepted by ICME 2024