Showing 1–50 of 112 results for author: Liang, J

Searching in archive eess.

  1. arXiv:2408.05596  [pdf, other]

    eess.SP

    Semantic Communications with Explicit Semantic Bases: Model, Architecture, and Open Problems

    Authors: Fengyu Wang, Yuan Zheng, Wenjun Xu, Junxiao Liang, Ping Zhang

    Abstract: The increasing demands for massive data transmission pose great challenges to communication systems. Compared to traditional communication systems that focus on the accurate reconstruction of bit sequences, semantic communications (SemComs), which aim to successfully deliver information connotation, have been regarded as the key technology for next-generation communication systems. Most current Se…

    Submitted 10 August, 2024; originally announced August 2024.

  2. arXiv:2407.05726  [pdf, other]

    cs.CV eess.IV

    Gait Patterns as Biomarkers: A Video-Based Approach for Classifying Scoliosis

    Authors: Zirui Zhou, Junhao Liang, Zizhao Peng, Chao Fan, Fengwei An, Shiqi Yu

    Abstract: Scoliosis presents significant diagnostic challenges, particularly in adolescents, where early detection is crucial for effective treatment. Traditional diagnostic and follow-up methods, which rely on physical examinations and radiography, face limitations due to the need for clinical expertise and the risk of radiation exposure, thus restricting their use for widespread early screening. In respon…

    Submitted 23 August, 2024; v1 submitted 8 July, 2024; originally announced July 2024.

    Comments: Accepted to MICCAI 2024

  3. arXiv:2407.04518  [pdf, other]

    eess.AS

    From Audio Encoders to Piano Judges: Benchmarking Performance Understanding for Solo Piano

    Authors: Huan Zhang, Jinhua Liang, Simon Dixon

    Abstract: Our study investigates an approach for understanding musical performances through the lens of audio encoding models, focusing on the domain of solo Western classical piano music. Compared to composition-level attribute understanding such as key or genre, we identify a knowledge gap in performance-level music understanding, and address three critical tasks: expertise ranking, difficulty estimation,…

    Submitted 19 July, 2024; v1 submitted 5 July, 2024; originally announced July 2024.

    Comments: Accepted by the 25th International Society for Music Information Retrieval (ISMIR)

  4. arXiv:2407.03131  [pdf, other]

    cs.NE cs.AI eess.SP

    MVGT: A Multi-view Graph Transformer Based on Spatial Relations for EEG Emotion Recognition

    Authors: Yanjie Cui, Xiaohong Liu, Jing Liang, Yamin Fu

    Abstract: Electroencephalography (EEG), a medical imaging technique that captures scalp electrical activity of brain structures via electrodes, has been widely used in affective computing. The spatial domain of EEG is rich in affective information. However, few of the existing studies have simultaneously analyzed EEG signals from multiple perspectives of geometric and anatomical structures in spatial domain…

    Submitted 6 August, 2024; v1 submitted 3 July, 2024; originally announced July 2024.

  5. arXiv:2406.18345  [pdf, other]

    cs.LG eess.SP

    EmT: A Novel Transformer for Generalized Cross-subject EEG Emotion Recognition

    Authors: Yi Ding, Chengxuan Tong, Shuailei Zhang, Muyun Jiang, Yong Li, Kevin Lim Jun Liang, Cuntai Guan

    Abstract: Integrating prior knowledge of neurophysiology into neural network architecture enhances the performance of emotion decoding. While numerous techniques emphasize learning spatial and short-term temporal patterns, there has been limited emphasis on capturing the vital long-term contextual information associated with emotional cognitive processes. In order to address this discrepancy, we introduce a…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: 11 pages, 5 figures. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  6. arXiv:2406.14850  [pdf, other]

    eess.AS

    DExter: Learning and Controlling Performance Expression with Diffusion Models

    Authors: Huan Zhang, Shreyan Chowdhury, Carlos Eduardo Cancino-Chacón, Jinhua Liang, Simon Dixon, Gerhard Widmer

    Abstract: In the pursuit of developing expressive music performance models using artificial intelligence, this paper introduces DExter, a new approach leveraging diffusion probabilistic models to render Western classical piano performances. In this approach, performance parameters are represented in a continuous expression space and a diffusion model is trained to predict these continuous parameters while b…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: in submission to appsci special session

  7. arXiv:2405.09923  [pdf, other]

    cs.CV eess.IV

    NTIRE 2024 Restore Any Image Model (RAIM) in the Wild Challenge

    Authors: Jie Liang, Radu Timofte, Qiaosi Yi, Shuaizheng Liu, Lingchen Sun, Rongyuan Wu, Xindong Zhang, Hui Zeng, Lei Zhang

    Abstract: In this paper, we review the NTIRE 2024 challenge on Restore Any Image Model (RAIM) in the Wild. The RAIM challenge constructed a benchmark for image restoration in the wild, including real-world images with/without reference ground truth in various scenarios from real applications. The participants were required to restore the real-captured images from complex and unknown degradation, where gener…

    Submitted 16 May, 2024; originally announced May 2024.

  8. arXiv:2405.05757  [pdf, other]

    cs.ET eess.SY

    Design and Implementation of Energy-Efficient Wireless Tire Sensing System with Delay Analysis for Intelligent Vehicles

    Authors: Shashank Mishra, Jia-Ming Liang

    Abstract: The growing prevalence of Internet of Things (IoT) technologies has led to a rise in the popularity of intelligent vehicles that incorporate a range of sensors to monitor various aspects, such as driving speed, fuel usage, distance proximity and tire anomalies. Nowadays, real-time tire sensing systems play important roles for intelligent vehicles in increasing mileage, reducing fuel consumption, i…

    Submitted 27 May, 2024; v1 submitted 9 May, 2024; originally announced May 2024.

  9. arXiv:2404.14713  [pdf, other]

    eess.SY

    Enhancing High-Speed Cruising Performance of Autonomous Vehicles through Integrated Deep Reinforcement Learning Framework

    Authors: Jinhao Liang, Kaidi Yang, Chaopeng Tan, Jinxiang Wang, Guodong Yin

    Abstract: High-speed cruising scenarios with mixed traffic greatly challenge the road safety of autonomous vehicles (AVs). Unlike existing works that only look at fundamental modules in isolation, this work enhances AV safety in mixed-traffic high-speed cruising scenarios by proposing an integrated framework that synthesizes three fundamental modules, i.e., behavioral decision-making, path-planning, and mot…

    Submitted 22 April, 2024; originally announced April 2024.

  10. arXiv:2404.05609  [pdf, other]

    math.OC eess.SY

    Feedback Stability Under Mixed Gain and Phase Uncertainty

    Authors: Jiajin Liang, Di Zhao, Li Qiu

    Abstract: In this study, we investigate the robust feedback stability problem for multiple-input-multiple-output linear time-invariant systems involving sectored-disk uncertainty, namely, dynamic uncertainty subject to simultaneous gain and phase constraints. This problem is thereby called a sectored-disk problem. Employing a frequency-wise analysis approach, we derive a fundamental static matrix problem th…

    Submitted 8 April, 2024; originally announced April 2024.

  11. arXiv:2404.02663  [pdf]

    eess.SP cs.IT

    Ground-to-UAV sub-Terahertz channel measurement and modeling

    Authors: Da Li, Peian Li, Jiabiao Zhao, Jianjian Liang, Jiacheng Liu, Guohao Liu, Yuanshuai Lei, Wenbo Liu, Jianqin Deng, Fuyong Liu, Jianjun Ma

    Abstract: Unmanned Aerial Vehicle (UAV) assisted terahertz (THz) wireless communications have been expected to play a vital role in the next generation of wireless networks. UAVs can serve as either repeaters or data collectors within the communication link, thereby potentially augmenting the efficacy of communication systems. Despite their promise, the channel analysis and modeling specific to THz wireless…

    Submitted 30 July, 2024; v1 submitted 3 April, 2024; originally announced April 2024.

    Comments: To be published in Optics Express

  12. arXiv:2403.18638  [pdf, other]

    eess.AS

    Mind the Domain Gap: a Systematic Analysis on Bioacoustic Sound Event Detection

    Authors: Jinhua Liang, Ines Nolasco, Burooj Ghani, Huy Phan, Emmanouil Benetos, Dan Stowell

    Abstract: Detecting the presence of animal vocalisations in nature is essential to study animal populations and their behaviors. A recent development in the field is the introduction of the task known as few-shot bioacoustic sound event detection, which aims to train a versatile animal sound detector using only a small set of audio samples. Previous efforts in this area have utilized different architectures…

    Submitted 27 March, 2024; originally announced March 2024.

  13. arXiv:2403.16225  [pdf, other]

    eess.SY

    Bi-Level Control of Weaving Sections in Mixed Traffic Environments with Connected and Automated Vehicles

    Authors: Longhao Yan, Jinhao Liang, Kaidi Yang

    Abstract: Connected and automated vehicles (CAVs) can be beneficial for improving the operation of highway bottlenecks such as weaving sections. This paper proposes a bi-level control approach based on an upper-level deep reinforcement learning controller and a lower-level model predictive controller to coordinate the lane-changings of a mixed fleet of CAVs and human-driven vehicles (HVs) in weaving section…

    Submitted 24 March, 2024; originally announced March 2024.

    Comments: 12 pages, 8 figures

  14. arXiv:2403.09527  [pdf, other]

    eess.AS

    WavCraft: Audio Editing and Generation with Large Language Models

    Authors: Jinhua Liang, Huan Zhang, Haohe Liu, Yin Cao, Qiuqiang Kong, Xubo Liu, Wenwu Wang, Mark D. Plumbley, Huy Phan, Emmanouil Benetos

    Abstract: We introduce WavCraft, a collective system that leverages large language models (LLMs) to connect diverse task-specific models for audio content creation and editing. Specifically, WavCraft describes the content of raw audio materials in natural language and prompts the LLM conditioned on audio descriptions and user requests. WavCraft leverages the in-context learning ability of the LLM to decompo…

    Submitted 10 May, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

  15. arXiv:2402.09463  [pdf]

    eess.IV

    Multi-Center Fetal Brain Tissue Annotation (FeTA) Challenge 2022 Results

    Authors: Kelly Payette, Céline Steger, Roxane Licandro, Priscille de Dumast, Hongwei Bran Li, Matthew Barkovich, Liu Li, Maik Dannecker, Chen Chen, Cheng Ouyang, Niccolò McConnell, Alina Miron, Yongmin Li, Alena Uus, Irina Grigorescu, Paula Ramirez Gilliland, Md Mahfuzur Rahman Siddiquee, Daguang Xu, Andriy Myronenko, Haoyu Wang, Ziyan Huang, Jin Ye, Mireia Alenyà, Valentin Comte, Oscar Camara, et al. (42 additional authors not shown)

    Abstract: Segmentation is a critical step in analyzing the developing human fetal brain. There have been vast improvements in automatic segmentation methods in the past several years, and the Fetal Brain Tissue Annotation (FeTA) Challenge 2021 helped to establish an excellent standard of fetal brain segmentation. However, FeTA 2021 was a single center study, and the generalizability of algorithms across dif…

    Submitted 8 February, 2024; originally announced February 2024.

    Comments: Results from FeTA Challenge 2022, held at MICCAI; Manuscript submitted. Supplementary Info (including submission methods descriptions) available here: https://zenodo.org/records/10628648

  16. arXiv:2402.02634  [pdf, other]

    cs.CV cs.LG eess.IV

    Key-Graph Transformer for Image Restoration

    Authors: Bin Ren, Yawei Li, Jingyun Liang, Rakesh Ranjan, Mengyuan Liu, Rita Cucchiara, Luc Van Gool, Nicu Sebe

    Abstract: While it is crucial to capture global information for effective image restoration (IR), integrating such cues into transformer-based methods becomes computationally expensive, especially with high input resolution. Furthermore, the self-attention mechanism in transformers is prone to considering unnecessary global cues from unrelated objects or regions, introducing computational inefficiencies. In…

    Submitted 4 February, 2024; originally announced February 2024.

    Comments: 9 pages, 6 figures

  17. arXiv:2312.17446  [pdf, other]

    cs.LG cs.AI eess.SP

    ClST: A Convolutional Transformer Framework for Automatic Modulation Recognition by Knowledge Distillation

    Authors: Dongbin Hou, Lixin Li, Wensheng Lin, Junli Liang, Zhu Han

    Abstract: With the rapid development of deep learning (DL) in recent years, automatic modulation recognition (AMR) with DL has achieved high accuracy. However, insufficient training signal data in complicated channel environments and large-scale DL models are critical factors that make DL methods difficult to deploy in practice. Aiming to these problems, we propose a novel neural network named convolution-l…

    Submitted 28 December, 2023; originally announced December 2023.

  18. arXiv:2312.15408  [pdf, other]

    eess.IV cs.CV

    Perception-Distortion Balanced Super-Resolution: A Multi-Objective Optimization Perspective

    Authors: Lingchen Sun, Jie Liang, Shuaizheng Liu, Hongwei Yong, Lei Zhang

    Abstract: High perceptual quality and low distortion degree are two important goals in image restoration tasks such as super-resolution (SR). Most of the existing SR methods aim to achieve these goals by minimizing the corresponding yet conflicting losses, such as the $\ell_1$ loss and the adversarial loss. Unfortunately, the commonly used gradient-based optimizers, such as Adam, are hard to balance these o…

    Submitted 23 December, 2023; originally announced December 2023.

  19. arXiv:2312.04795  [pdf, other]

    eess.SP

    Latency versus Transmission Power Trade-off in Free-Space Optical (FSO) Satellite Networks with Multiple Inter-Continental Connections

    Authors: Jintao Liang, Aizaz Chaudhry, John Chinneck, Halim Yanikomeroglu, Gunes Kurt, Peng Hu, Khaled Ahmed, Stephane Martel

    Abstract: In free-space optical satellite networks (FSOSNs), satellites connected via laser inter-satellite links (LISLs), latency is a critical factor, especially for long-distance inter-continental connections. Since satellites depend on solar panels for power supply, power consumption is also a vital factor. We investigate the minimization of total network latency (i.e., the sum of the network latencies…

    Submitted 7 December, 2023; originally announced December 2023.

    Comments: Accepted for publication in IEEE Open Journal of the Communications Society

  20. arXiv:2312.04788  [pdf, other]

    eess.SP

    Free-Space Optical (FSO) Satellite Networks Performance Analysis: Transmission Power, Latency, and Outage Probability

    Authors: Jintao Liang, Aizaz U. Chaudhry, Eylem Erdogan, Halim Yanikomeroglu, Gunes Karabulut Kurt, Peng Hu, Khaled Ahmed, Stephane Martel

    Abstract: In free-space optical satellite networks (FSOSNs), satellites can have different laser inter-satellite link (LISL) ranges for connectivity. Greater LISL ranges can reduce network latency of the path but can also result in an increase in transmission power for satellites on the path. Consequently, this tradeoff between satellite transmission power and network latency should be investigated, and in…

    Submitted 7 December, 2023; originally announced December 2023.

    Comments: Accepted for publication in IEEE Open Journal of Vehicular Technology

  21. arXiv:2312.00249  [pdf, other]

    eess.AS

    Acoustic Prompt Tuning: Empowering Large Language Models with Audition Capabilities

    Authors: Jinhua Liang, Xubo Liu, Wenwu Wang, Mark D. Plumbley, Huy Phan, Emmanouil Benetos

    Abstract: The auditory system plays a substantial role in shaping the overall human perceptual experience. While prevailing large language models (LLMs) and visual language models (VLMs) have shown their promise in solving a wide variety of vision and language understanding tasks, only a few of them can be generalised to the audio domain without compromising their domain-specific capacity. In this work, we…

    Submitted 30 November, 2023; originally announced December 2023.

  22. arXiv:2311.11325  [pdf, other]

    cs.CV eess.IV

    MoVideo: Motion-Aware Video Generation with Diffusion Models

    Authors: Jingyun Liang, Yuchen Fan, Kai Zhang, Radu Timofte, Luc Van Gool, Rakesh Ranjan

    Abstract: While recent years have witnessed great progress on using diffusion models for video generation, most of them are simple extensions of image generation frameworks, which fail to explicitly consider one of the key differences between videos and images, i.e., motion. In this paper, we propose a novel motion-aware video generation (MoVideo) framework that takes motion into consideration from two aspe…

    Submitted 29 July, 2024; v1 submitted 19 November, 2023; originally announced November 2023.

    Comments: Accepted by ECCV2024. Project page: https://jingyunliang.github.io/MoVideo

  23. arXiv:2311.07912  [pdf, other]

    cs.CV eess.SP

    Detection of Small Targets in Sea Clutter Based on RepVGG and Continuous Wavelet Transform

    Authors: Jingchen Ni, Haoru Li, Lilin Xu, Jing Liang

    Abstract: Constructing a high-performance target detector under the background of sea clutter is always necessary and important. In this work, we propose a RepVGGA0-CWT detector, where RepVGG is a residual network that gains a high detection accuracy. Different from traditional residual networks, RepVGG keeps an acceptable calculation speed. Giving consideration to both accuracy and speed, the RepVGGA0 is s…

    Submitted 14 November, 2023; originally announced November 2023.

  24. arXiv:2309.02529  [pdf, other]

    eess.IV

    Fast and High-Performance Learned Image Compression With Improved Checkerboard Context Model, Deformable Residual Module, and Knowledge Distillation

    Authors: Haisheng Fu, Feng Liang, Jie Liang, Yongqiang Wang, Guohe Zhang, Jingning Han

    Abstract: Deep learning-based image compression has made great progresses recently. However, many leading schemes use serial context-adaptive entropy model to improve the rate-distortion (R-D) performance, which is very slow. In addition, the complexities of the encoding and decoding networks are quite high and not suitable for many practical applications. In this paper, we introduce four techniques to bala…

    Submitted 5 September, 2023; originally announced September 2023.

    Comments: Submitted to Trans. Journal

  25. arXiv:2308.11864  [pdf, other]

    eess.IV

    Enhanced Residual SwinV2 Transformer for Learned Image Compression

    Authors: Yongqiang Wang, Feng Liang, Haisheng Fu, Jie Liang, Haipeng Qin, Junzhe Liang

    Abstract: Recently, the deep learning technology has been successfully applied in the field of image compression, leading to superior rate-distortion performance. However, a challenge of many learning-based approaches is that they often achieve better performance via sacrificing complexity, which making practical deployment difficult. To alleviate this issue, in this paper, we propose an effective and effic…

    Submitted 22 August, 2023; originally announced August 2023.

  26. arXiv:2308.09103  [pdf, other]

    cs.RO eess.SY

    Efficient collision avoidance for autonomous vehicles in polygonal domains

    Authors: Jiayu Fan, Nikolce Murgovski, Jun Liang

    Abstract: This research focuses on trajectory planning problems for autonomous vehicles utilizing numerical optimal control techniques. The study reformulates the constrained optimization problem into a nonlinear programming problem, incorporating explicit collision avoidance constraints. We present three novel, exact formulations to describe collision constraints. The first formulation is derived from a pr…

    Submitted 12 December, 2023; v1 submitted 17 August, 2023; originally announced August 2023.

    Comments: 10 pages,2 figures

  27. arXiv:2308.08269  [pdf, other]

    eess.IV cs.CV

    OnUVS: Online Feature Decoupling Framework for High-Fidelity Ultrasound Video Synthesis

    Authors: Han Zhou, Dong Ni, Ao Chang, Xinrui Zhou, Rusi Chen, Yanlin Chen, Lian Liu, Jiamin Liang, Yuhao Huang, Tong Han, Zhe Liu, Deng-Ping Fan, Xin Yang

    Abstract: Ultrasound (US) imaging is indispensable in clinical practice. To diagnose certain diseases, sonographers must observe corresponding dynamic anatomic structures to gather comprehensive information. However, the limited availability of specific US video cases causes teaching difficulties in identifying corresponding diseases, which potentially impacts the detection rate of such cases. The synthesis…

    Submitted 16 August, 2023; originally announced August 2023.

    Comments: 14 pages, 13 figures and 6 tables

  28. arXiv:2307.14335  [pdf, other]

    cs.SD cs.AI cs.MM eess.AS

    WavJourney: Compositional Audio Creation with Large Language Models

    Authors: Xubo Liu, Zhongkai Zhu, Haohe Liu, Yi Yuan, Meng Cui, Qiushi Huang, Jinhua Liang, Yin Cao, Qiuqiang Kong, Mark D. Plumbley, Wenwu Wang

    Abstract: Despite breakthroughs in audio generation models, their capabilities are often confined to domain-specific conditions such as speech transcriptions and audio captions. However, real-world audio creation aims to generate harmonious audio containing various elements such as speech, music, and sound effects with controllable conditions, which is challenging to address using existing audio generation…

    Submitted 26 November, 2023; v1 submitted 26 July, 2023; originally announced July 2023.

    Comments: GitHub: https://github.com/Audio-AGI/WavJourney

  29. arXiv:2305.17719  [pdf, other]

    eess.AS cs.SD

    Adapting Language-Audio Models as Few-Shot Audio Learners

    Authors: Jinhua Liang, Xubo Liu, Haohe Liu, Huy Phan, Emmanouil Benetos, Mark D. Plumbley, Wenwu Wang

    Abstract: We presented the Treff adapter, a training-efficient adapter for CLAP, to boost zero-shot classification performance by making use of a small set of labelled data. Specifically, we designed CALM to retrieve the probability distribution of text-audio clips over classes using a set of audio-label pairs and combined it with CLAP's zero-shot classification results. Furthermore, we designed a training-…

    Submitted 28 May, 2023; originally announced May 2023.

  30. arXiv:2305.08995  [pdf, other]

    cs.CV eess.IV

    Denoising Diffusion Models for Plug-and-Play Image Restoration

    Authors: Yuanzhi Zhu, Kai Zhang, Jingyun Liang, Jiezhang Cao, Bihan Wen, Radu Timofte, Luc Van Gool

    Abstract: Plug-and-play Image Restoration (IR) has been widely recognized as a flexible and interpretable method for solving various inverse problems by utilizing any off-the-shelf denoiser as the implicit image prior. However, most existing methods focus on discriminative Gaussian denoisers. Although diffusion models have shown impressive performance for high-quality image synthesis, their potential to ser…

    Submitted 15 May, 2023; originally announced May 2023.

  31. arXiv:2305.07783  [pdf, other]

    cs.CV eess.IV

    ROI-based Deep Image Compression with Swin Transformers

    Authors: Binglin Li, Jie Liang, Haisheng Fu, Jingning Han

    Abstract: Encoding the Region Of Interest (ROI) with better quality than the background has many applications including video conferencing systems, video surveillance and object-oriented vision tasks. In this paper, we propose a ROI-based image compression framework with Swin transformers as main building blocks for the autoencoder network. The binary ROI mask is integrated into different layers of the netw…

    Submitted 12 May, 2023; originally announced May 2023.

    Comments: This paper has been accepted by ICASSP 2023

  32. arXiv:2305.02269  [pdf, other]

    cs.SD cs.CL eess.AS

    M2-CTTS: End-to-End Multi-scale Multi-modal Conversational Text-to-Speech Synthesis

    Authors: Jinlong Xue, Yayue Deng, Fengping Wang, Ya Li, Yingming Gao, Jianhua Tao, Jianqing Sun, Jiaen Liang

    Abstract: Conversational text-to-speech (TTS) aims to synthesize speech with proper prosody of reply based on the historical conversation. However, it is still a challenge to comprehensively model the conversation, and a majority of conversational TTS systems only focus on extracting global information and omit local prosody features, which contain important fine-grained information like keywords and emphas…

    Submitted 3 May, 2023; originally announced May 2023.

    Comments: 5 pages, 1 figures, 2 tables. Accepted by ICASSP 2023

  33. arXiv:2303.14720  [pdf, other]

    eess.SP cs.LG eess.SY

    Driver Profiling and Bayesian Workload Estimation Using Naturalistic Peripheral Detection Study Data

    Authors: Nermin Caber, Bashar I. Ahmad, Jiaming Liang, Simon Godsill, Alexandra Bremers, Philip Thomas, David Oxtoby, Lee Skrypchuk

    Abstract: Monitoring drivers' mental workload facilitates initiating and maintaining safe interactions with in-vehicle information systems, and thus delivers adaptive human machine interaction with reduced impact on the primary task of driving. In this paper, we tackle the problem of workload estimation from driving performance data. First, we present a novel on-road study for collecting subjective workload…

    Submitted 8 September, 2023; v1 submitted 26 March, 2023; originally announced March 2023.

    Comments: Accepted for IEEE Transactions on Intelligent Vehicles

  34. arXiv:2303.09112  [pdf, other]

    eess.IV cs.AI cs.LG cs.MM

    SigVIC: Spatial Importance Guided Variable-Rate Image Compression

    Authors: Jiaming Liang, Meiqin Liu, Chao Yao, Chunyu Lin, Yao Zhao

    Abstract: Variable-rate mechanism has improved the flexibility and efficiency of learning-based image compression that trains multiple models for different rate-distortion tradeoffs. One of the most common approaches for variable-rate is to channel-wisely or spatial-uniformly scale the internal features. However, the diversity of spatial importance is instructive for bit allocation of image compression. In…

    Submitted 16 March, 2023; originally announced March 2023.

    Comments: Accepted by IEEE ICASSP2023 (Camera Ready)

  35. arXiv:2303.03857  [pdf, other]

    cs.SD cs.AI cs.MM eess.AS

    Leveraging Pre-trained AudioLDM for Sound Generation: A Benchmark Study

    Authors: Yi Yuan, Haohe Liu, Jinhua Liang, Xubo Liu, Mark D. Plumbley, Wenwu Wang

    Abstract: Deep neural networks have recently achieved breakthroughs in sound generation. Despite the outstanding sample quality, current sound generation models face issues on small-scale datasets (e.g., overfitting), significantly limiting performance. In this paper, we make the first attempt to investigate the benefits of pre-training on sound generation with AudioLDM, the cutting-edge model for audio gen…

    Submitted 29 July, 2024; v1 submitted 7 March, 2023; originally announced March 2023.

    Comments: Updated for EUSIPCO 2023 proceedings version

  36. arXiv:2212.08952  [pdf, other]

    cs.SD eess.AS

    Learning from Taxonomy: Multi-label Few-Shot Classification for Everyday Sound Recognition

    Authors: Jinhua Liang, Huy Phan, Emmanouil Benetos

    Abstract: Everyday sound recognition aims to infer types of sound events in audio streams. While many works succeeded in training models with high performance in a fully-supervised manner, they are still restricted to the demand of large quantities of labelled data and the range of predefined classes. To overcome these drawbacks, this work firstly curates a new database named FSD-FS for multi-label few-shot…

    Submitted 17 December, 2022; originally announced December 2022.

    Comments: submitted to ICASSP2023

  37. arXiv:2212.08525  [pdf, other]

    cs.CR eess.SY

    Resource-Interaction Graph: Efficient Graph Representation for Anomaly Detection

    Authors: James Pope, Jinyuan Liang, Vijay Kumar, Francesco Raimondo, Xinyi Sun, Ryan McConville, Thomas Pasquier, Rob Piechocki, George Oikonomou, Bo Luo, Dan Howarth, Ioannis Mavromatis, Adrian Sanchez Mompo, Pietro Carnelli, Theodoros Spyridopoulos, Aftab Khan

    Abstract: Security research has concentrated on converting operating system audit logs into suitable graphs, such as provenance graphs, for analysis. However, provenance graphs can grow very large requiring significant computational resources beyond what is necessary for many security tasks and are not feasible for resource constrained environments, such as edge devices. To address this problem, we present…

    Submitted 16 December, 2022; originally announced December 2022.

    Comments: 15 pages, 11 figures, 6 tables, for dataset see https://github.com/jpope8/container-escape-dataset, for code see https://github.com/jpope8/container-escape-analysis

  38. arXiv:2212.06557  [pdf, ps, other]

    eess.SP

    A Data Quality Assessment Framework for AI-enabled Wireless Communication

    Authors: Hanning Tang, Liusha Yang, Rui Zhou, Jing Liang, Hong Wei, Xuan Wang, Qingjiang Shi, Zhi-Quan Luo

    Abstract: Using artificial intelligent (AI) to re-design and enhance the current wireless communication system is a promising pathway for the future sixth-generation (6G) wireless network. The performance of AI-enabled wireless communication depends heavily on the quality of wireless air-interface data. Although there are various approaches to data quality assessment (DQA) for different applications, none h…

    Submitted 13 December, 2022; originally announced December 2022.

  39. arXiv:2211.00323  [pdf, other]

    cs.IT eess.SP

    Reconfigurable Intelligent Surface: Power Consumption Modeling and Practical Measurement Validation

    Authors: Jinghe Wang, Wankai Tang, Jing Cheng Liang, Lei Zhang, Jun Yan Dai, Xiao Li, Shi Jin, Qiang Cheng, Tie Jun Cui

    Abstract: The reconfigurable intelligent surface (RIS) has received a lot of interest because of its capacity to reconfigure the wireless communication environment in a cost- and energy-efficient way. However, the realistic power consumption modeling and measurement validation of RIS has received far too little attention. Therefore, in this work, we model the power consumption of RIS and conduct measurement…

    Submitted 6 February, 2024; v1 submitted 1 November, 2022; originally announced November 2022.

  40. arXiv:2210.17407  [pdf, other]

    eess.SY physics.app-ph

    Circuit Solutions towards Broadband Piezoelectric Energy Harvesting: An Impedance Analysis

    Authors: Bao Zhao, Junrui Liang

    Abstract: In the studies of piezoelectric energy harvesting (PEH) systems, literature has shown that circuit advancement has a significant effect on the enhancement of energy harvesting capability in resonance. On the other hand, some recent studies using the phase-variable (PV) synchronized switch technologies have found that the advanced circuit solutions can also broaden the harvesting bandwidth. However…

    Submitted 31 October, 2022; originally announced October 2022.

  41. arXiv:2210.12772  [pdf, other]

    physics.med-ph eess.IV eess.SP eess.SY

    Electroanatomic Mapping to determine Scar Regions in patients with Atrial Fibrillation

    Authors: Jiyue He, Kuk Jin Jang, Katie Walsh, Jackson Liang, Sanjay Dixit, Rahul Mangharam

    Abstract: Left atrial voltage maps are routinely acquired during electroanatomic mapping in patients undergoing catheter ablation for atrial fibrillation. For patients, who have prior catheter ablation when they are in sinus rhythm, the voltage map can be used to identify low voltage areas using a threshold of 0.2 - 0.45 mV. However, such a voltage threshold for maps acquired during atrial fibrillation has…

    Submitted 8 November, 2022; v1 submitted 23 October, 2022; originally announced October 2022.

    Journal ref: 2019 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC)

  42. arXiv:2208.11803  [pdf, other]

    cs.CV eess.IV

    Learning Task-Oriented Flows to Mutually Guide Feature Alignment in Synthesized and Real Video Denoising

    Authors: Jiezhang Cao, Qin Wang, Jingyun Liang, Yulun Zhang, Kai Zhang, Radu Timofte, Luc Van Gool

    Abstract: Video denoising aims at removing noise from videos to recover clean ones. Some existing works show that optical flow can help the denoising by exploiting the additional spatial-temporal clues from nearby frames. However, the flow estimation itself is also sensitive to noise, and can be unusable under large noise levels. To this end, we propose a new multi-scale refined optical flow-guided video de…

    Submitted 25 March, 2023; v1 submitted 24 August, 2022; originally announced August 2022.

  43. arXiv:2208.05186  [pdf, other]

    cs.IT cs.LG eess.SP

    Learning Quantization in LDPC Decoders

    Authors: Marvin Geiselhart, Ahmed Elkelesh, Jannis Clausius, Fei Liang, Wen Xu, Jing Liang, Stephan ten Brink

    Abstract: Finding optimal message quantization is a key requirement for low complexity belief propagation (BP) decoding. To this end, we propose a floating-point surrogate model that imitates quantization effects as additions of uniform noise, whose amplitudes are trainable variables. We verify that the surrogate model closely matches the behavior of a fixed-point implementation and propose a hand-crafted l…

    Submitted 10 August, 2022; originally announced August 2022.

    Comments: 6 Pages, 11 Figures, submitted to IEEE for possible publication

  44. arXiv:2207.00474  [pdf, other]

    cs.CV cs.AI cs.LG eess.IV

    Weakly-supervised High-fidelity Ultrasound Video Synthesis with Feature Decoupling

    Authors: Jiamin Liang, Xin Yang, Yuhao Huang, Kai Liu, Xinrui Zhou, Xindi Hu, Zehui Lin, Huanjia Luo, Yuanji Zhang, Yi Xiong, Dong Ni

    Abstract: Ultrasound (US) is widely used for its advantages of real-time imaging, freedom from radiation, and portability. In clinical practice, analysis and diagnosis often rely on US sequences rather than a single image to obtain dynamic anatomical information. This is challenging for novices to learn because practicing with adequate videos from patients is clinically impractical. In this paper, we propose a novel…

    Submitted 1 July, 2022; originally announced July 2022.

    Comments: Accepted by MICCAI 2022

  45. arXiv:2206.10618  [pdf, other]

    eess.IV cs.IT cs.LG

    Asymmetric Learned Image Compression with Multi-Scale Residual Block, Importance Map, and Post-Quantization Filtering

    Authors: Haisheng Fu, Feng Liang, Jie Liang, Binglin Li, Guohe Zhang, Jingning Han

    Abstract: Recently, deep learning-based image compression has made significant progress, and has achieved better rate-distortion (R-D) performance than the latest traditional method, H.266/VVC, in both subjective metric and the more challenging objective metric. However, a major problem is that many leading learned schemes cannot maintain a good trade-off between performance and complexity. In this paper, w…

    Submitted 21 June, 2022; originally announced June 2022.

    Comments: IEEE TRANSACTIONS ON MULTIMEDIA

  46. arXiv:2206.02146  [pdf, other]

    cs.CV eess.IV

    Recurrent Video Restoration Transformer with Guided Deformable Attention

    Authors: Jingyun Liang, Yuchen Fan, Xiaoyu Xiang, Rakesh Ranjan, Eddy Ilg, Simon Green, Jiezhang Cao, Kai Zhang, Radu Timofte, Luc Van Gool

    Abstract: Video restoration aims at restoring multiple high-quality frames from multiple low-quality frames. Existing video restoration methods generally fall into two extreme cases, i.e., they either restore all frames in parallel or restore the video frame by frame in a recurrent way, which would result in different merits and drawbacks. Typically, the former has the advantage of temporal information fusi…

    Submitted 12 November, 2022; v1 submitted 5 June, 2022; originally announced June 2022.

    Comments: Accepted by NeurIPS 2022. Code: https://github.com/JingyunLiang/RVRT

  47. arXiv:2205.10651  [pdf, other]

    eess.IV cs.LG cs.NE

    Tensor Shape Search for Optimum Data Compression

    Authors: Ryan Solgi, Zichang He, William Jiahua Liang, Zheng Zhang

    Abstract: Various tensor decomposition methods have been proposed for data compression. In real-world applications of tensor decomposition, selecting the tensor shape for the given data poses a challenge, and the shape of the tensor may affect the error and the compression ratio. In this work, we study the effect of the tensor shape on the tensor decomposition and propose an optimization model to find an…

    Submitted 21 May, 2022; originally announced May 2022.

  48. arXiv:2204.13177  [pdf, other]

    eess.SP

    Link Budget Analysis for Free-Space Optical Satellite Networks

    Authors: Jintao Liang, Aizaz U. Chaudhry, Eylem Erdogan, Halim Yanikomeroglu

    Abstract: Free-space optical satellite networks (FSOSNs) will employ free-space optical links between satellites and between satellites and ground stations, and the link budget for optical inter-satellite links and optical uplink/downlink is analyzed in this paper. The satellites in these FSOSNs will have limited energy and thereby limited power, and we investigate the effect of link distance and link margi…

    Submitted 27 April, 2022; originally announced April 2022.

    Comments: Accepted for NTN-6G Workshop at WoWMoM 2022

  49. arXiv:2204.10437  [pdf, other]

    cs.CV eess.IV

    DiRA: Discriminative, Restorative, and Adversarial Learning for Self-supervised Medical Image Analysis

    Authors: Fatemeh Haghighi, Mohammad Reza Hosseinzadeh Taher, Michael B. Gotway, Jianming Liang

    Abstract: Discriminative learning, restorative learning, and adversarial learning have proven beneficial for self-supervised learning schemes in computer vision and medical imaging. Existing efforts, however, omit their synergistic effects on each other in a ternary setup, which, we envision, can significantly benefit deep semantic representation learning. To realize this vision, we have developed DiRA, the…

    Submitted 21 April, 2022; originally announced April 2022.

    Comments: Accepted at CVPR 2022 [main conference]

  50. arXiv:2204.07344  [pdf, other]

    eess.IV cs.CV

    CAiD: Context-Aware Instance Discrimination for Self-supervised Learning in Medical Imaging

    Authors: Mohammad Reza Hosseinzadeh Taher, Fatemeh Haghighi, Michael B. Gotway, Jianming Liang

    Abstract: Recently, self-supervised instance discrimination methods have achieved significant success in learning visual representations from unlabeled photographic images. However, given the marked differences between photographic and medical images, the efficacy of instance-based objectives, focusing on learning the most discriminative global features in the image (e.g., wheels in a bicycle), remains unknow…

    Submitted 15 April, 2022; originally announced April 2022.

    Comments: Accepted at MIDL 2022 [main conference]