Showing 1–50 of 85 results for author: Hu, C

Searching in archive eess.
  1. arXiv:2408.14156  [pdf, other]

    eess.SP

    Integrated Sensing, Communication, and Powering over Multi-antenna OFDM Systems

    Authors: Yilong Chen, Chao Hu, Zixiang Ren, Han Hu, Jie Xu, Lexi Xu, Lei Liu, Shuguang Cui

    Abstract: This paper considers a multi-functional orthogonal frequency division multiplexing (OFDM) system with integrated sensing, communication, and powering (ISCAP), in which a multi-antenna base station (BS) transmits OFDM signals to simultaneously deliver information to multiple information receivers (IRs), provide energy supply to multiple energy receivers (ERs), and sense potential targets based on t…

    Submitted 26 August, 2024; originally announced August 2024.

    Comments: 13 pages, 12 figures

  2. arXiv:2408.10390  [pdf, other]

    eess.SY

    Self-Refined Generative Foundation Models for Wireless Traffic Prediction

    Authors: Chengming Hu, Hao Zhou, Di Wu, Xi Chen, Jun Yan, Xue Liu

    Abstract: With a broad range of emerging applications in 6G networks, wireless traffic prediction has become a critical component of network management. However, the dynamically shifting distribution of wireless traffic in non-stationary 6G networks presents significant challenges to achieving accurate and stable predictions. Motivated by recent advancements in Generative AI (GAI)-enabled 6G networks, this…

    Submitted 19 August, 2024; originally announced August 2024.

  3. arXiv:2408.09851  [pdf, other]

    cs.NI eess.SY

    ISAC-Fi: Enabling Full-fledged Monostatic Sensing over Wi-Fi Communication

    Authors: Zhe Chen, Chao Hu, Tianyue Zheng, Hangcheng Cao, Yanbing Yang, Yen Chu, Hongbo Jiang, Jun Luo

    Abstract: Whereas Wi-Fi communications have been exploited for sensing purposes for over a decade, the bistatic or multistatic nature of Wi-Fi still poses multiple challenges, hampering real-life deployment of integrated sensing and communication (ISAC) within the Wi-Fi framework. In this paper, we aim to re-design Wi-Fi so that monostatic sensing (mimicking radar) can be achieved over the multistatic communicati…

    Submitted 19 August, 2024; originally announced August 2024.

    Comments: 14 pages, 22 figures

  4. arXiv:2408.02549  [pdf, other]

    eess.SY

    Generative AI as a Service in 6G Edge-Cloud: Generation Task Offloading by In-context Learning

    Authors: Hao Zhou, Chengming Hu, Dun Yuan, Ye Yuan, Di Wu, Xue Liu, Zhu Han, Charlie Zhang

    Abstract: Generative artificial intelligence (GAI) is a promising technique towards 6G networks, and generative foundation models such as large language models (LLMs) have attracted considerable interest from academia and telecom industry. This work considers a novel edge-cloud deployment of foundation models in 6G networks. Specifically, it aims to minimize the service delay of foundation models by radio r…

    Submitted 5 August, 2024; originally announced August 2024.

  5. arXiv:2408.00214  [pdf, other]

    eess.SY

    Large Language Model (LLM)-enabled In-context Learning for Wireless Network Optimization: A Case Study of Power Control

    Authors: Hao Zhou, Chengming Hu, Dun Yuan, Ye Yuan, Di Wu, Xue Liu, Charlie Zhang

    Abstract: Large language model (LLM) has recently been considered a promising technique for many fields. This work explores LLM-based wireless network optimization via in-context learning. To showcase the potential of LLM technologies, we consider the base station (BS) power control as a case study, a fundamental but crucial technique that is widely investigated in wireless networks. Different from existing…

    Submitted 31 July, 2024; originally announced August 2024.

  6. arXiv:2406.18067  [pdf, other]

    cs.CL eess.AS

    Exploring Energy-Based Models for Out-of-Distribution Detection in Dialect Identification

    Authors: Yaqian Hao, Chenguang Hu, Yingying Gao, Shilei Zhang, Junlan Feng

    Abstract: The diverse nature of dialects presents challenges for models trained on specific linguistic patterns, rendering them susceptible to errors when confronted with unseen or out-of-distribution (OOD) data. This study introduces a novel margin-enhanced joint energy model (MEJEM) tailored specifically for OOD detection in dialects. By integrating a generative model and the energy margin loss, our appro…

    Submitted 26 June, 2024; originally announced June 2024.

  7. arXiv:2406.18065  [pdf, other]

    eess.AS cs.SD

    On Calibration of Speech Classification Models: Insights from Energy-Based Model Investigations

    Authors: Yaqian Hao, Chenguang Hu, Yingying Gao, Shilei Zhang, Junlan Feng

    Abstract: For speech classification tasks, deep learning models often achieve high accuracy but exhibit shortcomings in calibration, manifesting as classifiers exhibiting overconfidence. The significance of calibration lies in its critical role in guaranteeing the reliability of decision-making within deep learning systems. This study explores the effectiveness of Energy-Based Models in calibrating confiden…

    Submitted 26 June, 2024; originally announced June 2024.

  8. arXiv:2406.13268  [pdf, other]

    eess.AS cs.SD

    CEC: A Noisy Label Detection Method for Speaker Recognition

    Authors: Yao Shen, Yingying Gao, Yaqian Hao, Chenguang Hu, Fulin Zhang, Junlan Feng, Shilei Zhang

    Abstract: Noisy labels are inevitable, even in well-annotated datasets. The detection of noisy labels is of significant importance to enhance the robustness of speaker recognition models. In this paper, we propose a novel noisy label detection approach based on two new statistical metrics: Continuous Inconsistent Counting (CIC) and Total Inconsistent Counting (TIC). These metrics are calculated through Cros…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: Interspeech 2024

  9. arXiv:2405.17439  [pdf, other]

    cs.NI cs.LG eess.SY

    An Overview of Machine Learning-Enabled Optimization for Reconfigurable Intelligent Surfaces-Aided 6G Networks: From Reinforcement Learning to Large Language Models

    Authors: Hao Zhou, Chengming Hu, Xue Liu

    Abstract: Reconfigurable intelligent surface (RIS) becomes a promising technique for 6G networks by reshaping signal propagation in smart radio environments. However, it also leads to significant complexity for network management due to the large number of elements and dedicated phase-shift optimization. In this work, we provide an overview of machine learning (ML)-enabled optimization for RIS-aided 6G netw…

    Submitted 8 May, 2024; originally announced May 2024.

  10. arXiv:2405.12046  [pdf, other]

    cs.LG cs.DC cs.IT eess.SP

    Energy-Efficient Federated Edge Learning with Streaming Data: A Lyapunov Optimization Approach

    Authors: Chung-Hsuan Hu, Zheng Chen, Erik G. Larsson

    Abstract: Federated learning (FL) has received significant attention in recent years for its advantages in efficient training of machine learning models across distributed clients without disclosing user-sensitive data. Specifically, in federated edge learning (FEEL) systems, the time-varying nature of wireless channels introduces inevitable system dynamics in the communication process, thereby affecting tr…

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: Submitted to IEEE journals for possible publication

  11. arXiv:2405.10825  [pdf, other]

    eess.SY cs.LG

    Large Language Model (LLM) for Telecommunications: A Comprehensive Survey on Principles, Key Techniques, and Opportunities

    Authors: Hao Zhou, Chengming Hu, Ye Yuan, Yufei Cui, Yili Jin, Can Chen, Haolun Wu, Dun Yuan, Li Jiang, Di Wu, Xue Liu, Charlie Zhang, Xianbin Wang, Jiangchuan Liu

    Abstract: Large language models (LLMs) have received considerable attention recently due to their outstanding comprehension and reasoning capabilities, leading to great progress in many fields. The advancement of LLM techniques also offers promising opportunities to automate many tasks in the telecommunication (telecom) field. After pre-training and fine-tuning, LLMs can perform diverse downstream tasks bas…

    Submitted 17 May, 2024; originally announced May 2024.

  12. arXiv:2405.05126  [pdf, other]

    cs.SD cs.AI cs.CL eess.AS

    Exploring Speech Pattern Disorders in Autism using Machine Learning

    Authors: Chuanbo Hu, Jacob Thrasher, Wenqi Li, Mindi Ruan, Xiangxu Yu, Lynn K Paul, Shuo Wang, Xin Li

    Abstract: Diagnosing autism spectrum disorder (ASD) by identifying abnormal speech patterns from examiner-patient dialogues presents significant challenges due to the subtle and diverse manifestations of speech-related symptoms in affected individuals. This study presents a comprehensive approach to identify distinctive speech patterns through the analysis of examiner-patient dialogues. Utilizing a dataset…

    Submitted 2 May, 2024; originally announced May 2024.

  13. arXiv:2404.12769  [pdf]

    eess.SY

    Towards Accurate and Efficient Sorting of Retired Lithium-ion Batteries: A Data Driven Based Electrode Aging Assessment Approach

    Authors: Ruohan Guo, Feng Wang, Cungang Hu, Weixiang Shen

    Abstract: Retired batteries (RBs) for second-life applications offer promising economic and environmental benefits. However, accurate and efficient sorting of RBs with discrepant characteristics persists as a pressing challenge. In this study, we introduce a data driven based electrode aging assessment approach to address this concern. To this end, a number of 15 feature points are extracted from battery op…

    Submitted 19 April, 2024; originally announced April 2024.

    Comments: 40 pages, 25 figures

  14. arXiv:2404.11836  [pdf, other]

    eess.SP

    AI-Empowered RIS-Assisted Networks: CV-Enabled RIS Selection and DNN-Enabled Transmission

    Authors: Conggang Hu, Yang Lu, Hongyang Du, Mi Yang, Bo Ai, Dusit Niyato

    Abstract: This paper investigates artificial intelligence (AI) empowered schemes for reconfigurable intelligent surface (RIS) assisted networks from the perspective of fast implementation. We formulate a weighted sum-rate maximization problem for a multi-RIS-assisted network. To avoid the huge channel estimation overhead of activating all RISs, we propose a computer vision (CV) enabled RIS selection scheme ba…

    Submitted 17 April, 2024; originally announced April 2024.

  15. arXiv:2404.08943  [pdf, other]

    math.OC eess.SY

    A Novel State-Centric Necessary Condition for Time-Optimal Control of Controllable Linear Systems Based on Augmented Switching Laws

    Authors: Yunan Wang, Chuxiong Hu, Yujie Lin, Zeyang Li, Shize Lin, Suqin He

    Abstract: Most existing necessary conditions for optimal control based on adjoining methods require both state information and costate information, yet the lack of costates for a given feasible trajectory in practice impedes the determination of optimality. This paper establishes a novel theoretical framework for time-optimal control of controllable linear systems, proposing the augmented switching law that…

    Submitted 13 April, 2024; originally announced April 2024.

  16. arXiv:2404.07577  [pdf, other]

    cs.LG eess.SP

    Generating Comprehensive Lithium Battery Charging Data with Generative AI

    Authors: Lidang Jiang, Changyan Hu, Sibei Ji, Hang Zhao, Junxiong Chen, Ge He

    Abstract: In optimizing performance and extending the lifespan of lithium batteries, accurate state prediction is pivotal. Traditional regression and classification methods have achieved some success in battery state prediction. However, the efficacy of these data-driven approaches heavily relies on the availability and quality of public datasets. Additionally, generating electrochemical data predominantly…

    Submitted 11 April, 2024; originally announced April 2024.

  17. arXiv:2403.17675  [pdf, other]

    math.OC eess.SY

    Chattering Phenomena in Time-Optimal Control for High-Order Chain-of-Integrators Systems with Full State Constraints

    Authors: Yunan Wang, Chuxiong Hu, Zeyang Li, Yujie Lin, Shize Lin, Suqin He

    Abstract: Time-optimal control for high-order chain-of-integrators systems with full state constraints remains an open and challenging problem in the optimal control theory domain. The behaviors of optimal control in high-order problems lack precision characterization, even where the existence of the chattering phenomenon remains unknown and overlooked. This paper establishes a theoretical framework for cha…

    Submitted 29 March, 2024; v1 submitted 26 March, 2024; originally announced March 2024.

  18. arXiv:2402.00320  [pdf]

    eess.IV

    DARCS: Memory-Efficient Deep Compressed Sensing Reconstruction for Acceleration of 3D Whole-Heart Coronary MR Angiography

    Authors: Zhihao Xue, Fan Yang, Juan Gao, Zhuo Chen, Hao Peng, Chao Zou, Hang Jin, Chenxi Hu

    Abstract: Three-dimensional coronary magnetic resonance angiography (CMRA) demands reconstruction algorithms that can significantly suppress the artifacts from a heavily undersampled acquisition. While unrolling-based deep reconstruction methods have achieved state-of-the-art performance on 2D image reconstruction, their application to 3D reconstruction is hindered by the large amount of memory needed to tr…

    Submitted 2 February, 2024; v1 submitted 31 January, 2024; originally announced February 2024.

    Comments: 10 pages, 8 figures

  19. arXiv:2401.11500  [pdf, other]

    cs.RO cs.AI eess.SY

    Integration of Large Language Models in Control of EHD Pumps for Precise Color Synthesis

    Authors: Yanhong Peng, Ceng Zhang, Chenlong Hu, Zebing Mao

    Abstract: This paper presents an innovative approach to integrating Large Language Models (LLMs) with Arduino-controlled Electrohydrodynamic (EHD) pumps for precise color synthesis in automation systems. We propose a novel framework that employs fine-tuned LLMs to interpret natural language commands and convert them into specific operational instructions for EHD pump control. This approach aims to enhance u…

    Submitted 21 January, 2024; originally announced January 2024.

  20. arXiv:2401.00283  [pdf, other]

    cs.IT eess.SP

    Near-Space Communications: the Last Piece of 6G Space-Air-Ground-Sea Integrated Network Puzzle

    Authors: Hongshan Liu, Tong Qin, Zhen Gao, Tianqi Mao, Keke Ying, Ziwei Wan, Li Qiao, Rui Na, Zhongxiang Li, Chun Hu, Yikun Mei, Tuan Li, Guanghui Wen, Lei Chen, Zhonghuai Wu, Ruiqi Liu, Gaojie Chen, Shuo Wang, Dezhi Zheng

    Abstract: This article presents a comprehensive study on the emerging near-space communications (NS-COM) within the context of space-air-ground-sea integrated network (SAGSIN). Specifically, we firstly explore the recent technical developments of NS-COM, followed by the discussions about motivations behind integrating NS-COM into SAGSIN. To further demonstrate the necessity of NS-COM, a comparative analysis…

    Submitted 4 March, 2024; v1 submitted 30 December, 2023; originally announced January 2024.

    Comments: 28 pages, 8 figures, 2 tables

  21. arXiv:2311.07348  [pdf]

    eess.IV cs.CV

    Deformable Groupwise Registration Using a Locally Low-Rank Dissimilarity Metric for Myocardial Strain Estimation from Cardiac Cine MRI Images

    Authors: Haiyang Chen, Juan Gao, Chenxi Hu

    Abstract: Objective: Cardiovascular magnetic resonance-feature tracking (CMR-FT) represents a group of methods for myocardial strain estimation from cardiac cine MRI images. Established CMR-FT methods are mainly based on optical flow or pairwise registration. However, these methods suffer from either inaccurate estimation of large motion or drift effect caused by accumulative tracking errors. In this work,…

    Submitted 13 November, 2023; originally announced November 2023.

  22. arXiv:2311.07039  [pdf, other]

    eess.SY

    Time-Optimal Control for High-Order Chain-of-Integrators Systems with Full State Constraints and Arbitrary Terminal States (Extended Version)

    Authors: Yunan Wang, Chuxiong Hu, Zeyang Li, Shize Lin, Suqin He, Yu Zhu

    Abstract: Time-optimal control for high-order chain-of-integrators systems with full state constraints and arbitrarily given terminal states remains a challenging problem in the optimal control theory domain, yet to be resolved. To enhance further comprehension of the problem, this paper establishes a novel notation system and theoretical framework, providing the switching manifold for high-order problems i…

    Submitted 28 March, 2024; v1 submitted 12 November, 2023; originally announced November 2023.

  23. arXiv:2311.06769  [pdf, other]

    eess.SY cs.LG

    Learning Predictive Safety Filter via Decomposition of Robust Invariant Set

    Authors: Zeyang Li, Chuxiong Hu, Weiye Zhao, Changliu Liu

    Abstract: Ensuring safety of nonlinear systems under model uncertainty and external disturbances is crucial, especially for real-world control tasks. Predictive methods such as robust model predictive control (RMPC) require solving nonconvex optimization problems online, which leads to high computational burden and poor scalability. Reinforcement learning (RL) works well with complex systems, but pays the p…

    Submitted 12 November, 2023; originally announced November 2023.

  24. arXiv:2311.05372  [pdf, other]

    cs.IT eess.SP

    Joint Angle and Delay Cramér-Rao Bound Optimization for ISAC

    Authors: Chao Hu, Yuan Fang, Ling Qiu

    Abstract: In this paper, we study a multi-input multi-output (MIMO) beamforming design in an integrated sensing and communication (ISAC) system, in which an ISAC base station (BS) is used to communicate with multiple downlink users and simultaneously the communication signals are reused for sensing multiple targets. The sensing parameters of interest are the angle and delay information of the targets, which…

    Submitted 3 July, 2024; v1 submitted 9 November, 2023; originally announced November 2023.

  25. arXiv:2310.08445  [pdf, other]

    eess.SY

    Risk-informed Resilience Planning of Transmission Systems Against Ice Storms

    Authors: Chenxi Hu, Yujia Li, Yunhe Hou

    Abstract: Ice storms, known for their severity and predictability, necessitate proactive resilience enhancement in power systems. Traditional approaches often overlook the endogenous uncertainties inherent in human decisions and underutilize predictive information like forecast accuracy and preparation time. To bridge these gaps, we proposed a two-stage risk-informed decision-dependent resilience planning (…

    Submitted 22 January, 2024; v1 submitted 12 October, 2023; originally announced October 2023.

  26. arXiv:2310.03985  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    Dementia Assessment Using Mandarin Speech with an Attention-based Speech Recognition Encoder

    Authors: Zih-Jyun Lin, Yi-Ju Chen, Po-Chih Kuo, Likai Huang, Chaur-Jong Hu, Cheng-Yu Chen

    Abstract: Dementia diagnosis requires a series of different testing methods, which is complex and time-consuming. Early detection of dementia is crucial as it can prevent further deterioration of the condition. This paper utilizes a speech recognition model to construct a dementia assessment system tailored for Mandarin speakers during the picture description task. By training an attention-based speech reco…

    Submitted 15 December, 2023; v1 submitted 5 October, 2023; originally announced October 2023.

    Comments: Accepted to IEEE ICASSP 2024

  27. arXiv:2309.00883  [pdf, other]

    cs.SD eess.AS

    DiCLET-TTS: Diffusion Model based Cross-lingual Emotion Transfer for Text-to-Speech -- A Study between English and Mandarin

    Authors: Tao Li, Chenxu Hu, Jian Cong, Xinfa Zhu, Jingbei Li, Qiao Tian, Yuping Wang, Lei Xie

    Abstract: While the performance of cross-lingual TTS based on monolingual corpora has been significantly improved recently, generating cross-lingual speech still suffers from the foreign accent problem, leading to limited naturalness. Besides, current cross-lingual methods ignore modeling emotion, which is indispensable paralinguistic information in speech delivery. In this paper, we propose DiCLET-TTS, a D…

    Submitted 2 September, 2023; originally announced September 2023.

    Comments: Accepted by TASLP

  28. arXiv:2308.16160  [pdf, other]

    cs.CV eess.IV

    Occ$^2$Net: Robust Image Matching Based on 3D Occupancy Estimation for Occluded Regions

    Authors: Miao Fan, Mingrui Chen, Chen Hu, Shuchang Zhou

    Abstract: Image matching is a fundamental and critical task in various visual applications, such as Simultaneous Localization and Mapping (SLAM) and image retrieval, which require accurate pose estimation. However, most existing methods ignore the occlusion relations between objects caused by camera motion and scene structure. In this paper, we propose Occ$^2$Net, a novel image matching method that models o…

    Submitted 14 August, 2023; originally announced August 2023.

  29. Preference-based training framework for automatic speech quality assessment using deep neural network

    Authors: Cheng-Hung Hu, Yusuke Yasuda, Tomoki Toda

    Abstract: One objective of Speech Quality Assessment (SQA) is to estimate the ranks of synthetic speech systems. However, recent SQA models are typically trained using low-precision direct scores such as mean opinion scores (MOS) as the training objective, which is not straightforward to estimate ranking. Although it is effective for predicting quality scores of individual sentences, this approach does not…

    Submitted 29 August, 2023; originally announced August 2023.

    Comments: Accepted by Interspeech 2023, oral

  30. arXiv:2306.17203  [pdf, other]

    cs.SD cs.CV cs.LG eess.AS

    Diff-Foley: Synchronized Video-to-Audio Synthesis with Latent Diffusion Models

    Authors: Simian Luo, Chuanhao Yan, Chenxu Hu, Hang Zhao

    Abstract: The Video-to-Audio (V2A) model has recently gained attention for its practical application in generating audio directly from silent videos, particularly in video/film production. However, previous methods in V2A have limited generation quality in terms of temporal synchronization and audio-visual relevance. We present Diff-Foley, a synchronized Video-to-Audio synthesis method with a latent diffusi…

    Submitted 29 June, 2023; originally announced June 2023.

  31. arXiv:2305.05344  [pdf, other]

    eess.IV cs.CV

    Trustworthy Multi-phase Liver Tumor Segmentation via Evidence-based Uncertainty

    Authors: Chuanfei Hu, Tianyi Xia, Ying Cui, Quchen Zou, Yuancheng Wang, Wenbo Xiao, Shenghong Ju, Xinde Li

    Abstract: Multi-phase liver contrast-enhanced computed tomography (CECT) images convey the complementary multi-phase information for liver tumor segmentation (LiTS), which are crucial to assist the diagnosis of liver cancer clinically. However, the performances of existing multi-phase liver tumor segmentation (MPLiTS)-based methods suffer from redundancy and weak interpretability, % of the fused result, res…

    Submitted 20 June, 2023; v1 submitted 9 May, 2023; originally announced May 2023.

  32. arXiv:2304.08506  [pdf, other]

    eess.IV cs.CV

    When SAM Meets Medical Images: An Investigation of Segment Anything Model (SAM) on Multi-phase Liver Tumor Segmentation

    Authors: Chuanfei Hu, Tianyi Xia, Shenghong Ju, Xinde Li

    Abstract: Learning to segment without large-scale samples is an inherent human capability. Recently, the Segment Anything Model (SAM) has demonstrated significant zero-shot image segmentation, attracting considerable attention from the computer vision community. Here, we investigate the capability of SAM for medical image analysis, especially for multi-phase liver tumor segmentation (MPLiTS), in terms of pr…

    Submitted 21 December, 2023; v1 submitted 17 April, 2023; originally announced April 2023.

    Comments: Preliminary investigation

  33. arXiv:2212.07356  [pdf, other]

    cs.LG cs.DC cs.IT eess.SP

    Scheduling and Aggregation Design for Asynchronous Federated Learning over Wireless Networks

    Authors: Chung-Hsuan Hu, Zheng Chen, Erik G. Larsson

    Abstract: Federated Learning (FL) is a collaborative machine learning (ML) framework that combines on-device training and server-based aggregation to train a common ML model among distributed agents. In this work, we propose an asynchronous FL design with periodic aggregation to tackle the straggler issue in FL systems. Considering limited wireless communication resources, we investigate the effect of diffe…

    Submitted 21 March, 2023; v1 submitted 14 December, 2022; originally announced December 2022.

    Journal ref: IEEE Journal on Selected Areas in Communications, vol. 41, no. 4, pp. 874-886, April 2023

  34. arXiv:2211.08880  [pdf]

    eess.SP

    Temporal-spatial Representation Learning Transformer for EEG-based Emotion Recognition

    Authors: Zhe Wang, Yongxiong Wang, Chuanfei Hu, Zhong Yin, Yu Song

    Abstract: Both the temporal dynamics and spatial correlations of Electroencephalogram (EEG), which contain discriminative emotion information, are essential for emotion recognition. However, some redundant information within the EEG signals would degrade the performance. Specifically, the subjects reach prospective intense emotions for only a fraction of the stimulus duration. Besides, it is a challenge…

    Submitted 16 November, 2022; originally announced November 2022.

  35. arXiv:2210.12621  [pdf, other]

    eess.SY

    Development of a Hybrid Simulation and Experiment Test Platform for Dynamic Positioning Vessels

    Authors: Changjun Hu, Quan Shi, Xin Li, Xiaoxian Guo

    Abstract: The harsh ocean environment and complex operating conditions require high dynamic positioning (DP) capability of offshore vessels. The design, development and performance evaluation of DP systems are generally carried out by numerical simulations or scale model experiments. Compared with the time-consuming and laborious experiment, the simulation is convenient and low cost, but its results lack pract…

    Submitted 23 October, 2022; originally announced October 2022.

  36. Spatio-Temporal-based Context Fusion for Video Anomaly Detection

    Authors: Chao Hu, Weibin Qiu, Weijie Wu, Liqiang Zhu

    Abstract: Video anomaly detection aims to discover abnormal events in videos, and the principal objects are target objects such as people and vehicles. Each target in the video data has rich spatio-temporal context information. Most existing methods only focus on the temporal context, ignoring the role of the spatial context in anomaly detection. The spatial context information represents the relationship b…

    Submitted 18 October, 2022; originally announced October 2022.

    Report number: MIR-2022-09-302

    Journal ref: Machine Intelligence Research 2022

  37. arXiv:2210.05741  [pdf]

    eess.SY

    Road Slope Prediction and Vehicle Dynamics Control for Autonomous Vehicles

    Authors: Gautam Shetty, Sabir Hossain, Chuan Hu, Xianke Lin

    Abstract: Autonomous vehicles can enhance overall performance and implement safety measures in ways that are impossible with conventional automobiles. These functions are executed through vehicle control systems, which have been the subject of considerable research. Autonomous cars have a distinct advantage as they possess various perception sensors that can predict road surface conditions and other phenome…

    Submitted 11 October, 2022; originally announced October 2022.

    Comments: 16 pages, 15 figures

  38. arXiv:2209.12159  [pdf, other]

    eess.SP cs.IT

    Grant-Free NOMA-OTFS Paradigm: Enabling Efficient Ubiquitous Access for LEO Satellite Internet-of-Things

    Authors: Zhen Gao, Xingyu Zhou, Jingjing Zhao, Juan Li, Chunli Zhu, Chun Hu, Pei Xiao, Symeon Chatzinotas, Derrick Wing Kwan Ng, Bjorn Ottersten

    Abstract: With the blooming of Internet-of-Things (IoT), we are witnessing an explosion in the number of IoT terminals, triggering an unprecedented demand for ubiquitous wireless access globally. In this context, the emerging low-Earth-orbit satellites (LEO-SATs) have been regarded as a promising enabler to complement terrestrial wireless networks in providing ubiquitous connectivity and bridging the ever-g…

    Submitted 22 December, 2022; v1 submitted 25 September, 2022; originally announced September 2022.

    Comments: Accepted by IEEE Network

  39. arXiv:2209.11585  [pdf]

    cs.SD eess.AS

    Synthetic Voice Spoofing Detection Based On Online Hard Example Mining

    Authors: Chenlei Hu, Ruohua Zhou

    Abstract: The automatic speaker verification spoofing (ASVspoof) challenge series is crucial for enhancing the spoofing consideration and the countermeasures growth. Although the recent ASVspoof 2019 validation results indicate the significant capability to identify most attacks, the model's recognition effect is still poor for some attacks. This paper presents the Online Hard Example Mining (OHEM) algorith…

    Submitted 26 September, 2022; v1 submitted 23 September, 2022; originally announced September 2022.

  40. arXiv:2209.09635  [pdf]

    cs.SD eess.AS

    The BUCEA Speaker Diarization System for the VoxCeleb Speaker Recognition Challenge 2022

    Authors: Ruohua Zhou, Yuxuan Du, Chenlei Hu

    Abstract: This paper describes the BUCEA speaker diarization system for the 2022 VoxCeleb Speaker Recognition Challenge. Voxsrc-22 provides the development set and test set of VoxConverse, and we mainly use the test set of VoxConverse for parameter adjustment. Our system consists of several modules, including speech activity detection (VAD), speaker embedding extractor, clustering methods, overlapping speec…

    Submitted 20 September, 2022; originally announced September 2022.

  41. arXiv:2209.02832  [pdf, other]

    cs.CV cs.LG eess.IV

    Impact of Colour Variation on Robustness of Deep Neural Networks

    Authors: Chengyin Hu, Weiwen Shi

    Abstract: Deep neural networks (DNNs) have shown state-of-the-art performance for computer vision applications like image classification, segmentation and object detection. However, recent advances have shown their vulnerability to manual digital perturbations in the input data, namely adversarial attacks. The accuracy of the networks is significantly affected by the data distribution of their training…

    Submitted 23 May, 2023; v1 submitted 2 September, 2022; originally announced September 2022.

    Comments: arXiv admin note: substantial text overlap with arXiv:2209.02132

  42. Key frames assisted hybrid encoding for photorealistic compressive video sensing

    Authors: Honghao Huang, Jiajie Teng, Yu Liang, Chengyang Hu, Minghua Chen, Sigang Yang, Hongwei Chen

    Abstract: Snapshot compressive imaging (SCI) encodes high-speed scene video into a snapshot measurement and then computationally makes reconstructions, allowing for efficient high-dimensional data acquisition. Numerous algorithms, ranging from regularization-based optimization to deep learning, are being investigated to improve reconstruction quality, but they are still limited by the ill-posed and informa…

    Submitted 25 July, 2022; originally announced July 2022.

  43. arXiv:2207.06088  [pdf, other]

    cs.SD eess.AS

    Controllable and Lossless Non-Autoregressive End-to-End Text-to-Speech

    Authors: Zhengxi Liu, Qiao Tian, Chenxu Hu, Xudong Liu, Menglin Wu, Yuping Wang, Hang Zhao, Yuxuan Wang

    Abstract: Some recent studies have demonstrated the feasibility of single-stage neural text-to-speech, which does not need to generate mel-spectrograms but generates the raw waveforms directly from the text. Single-stage text-to-speech often faces two problems: a) the one-to-many mapping problem due to multiple speech variations and b) insufficiency of high frequency reconstruction due to the lack of superv…

    Submitted 13 July, 2022; originally announced July 2022.

  44. A Distributionally Robust Resilience Enhancement Strategy for Distribution Networks Considering Decision-Dependent Contingencies

    Authors: Yujia Li, Shunbo Lei, Wei Sun, Chenxi Hu, Yunhe Hou

    Abstract: When performing resilience enhancement for distribution networks, there are two obstacles to reliably modeling the uncertain contingencies: 1) decision-dependent uncertainty (DDU) due to various line hardening decisions, and 2) distributional ambiguity due to limited outage information during extreme weather events (EWEs). To address these two challenges, this paper develops scenario-wise decisio…

    Submitted 23 August, 2022; v1 submitted 2 July, 2022; originally announced July 2022.

  45. arXiv:2206.14199  [pdf, other

    eess.IV physics.optics

    Hyperspectral image reconstruction for spectral camera based on ghost imaging via sparsity constraints using V-DUnet

    Authors: Ziyan Chen, Zhentao Liu, Chenyu Hu, Heng Wu, Jianrong Wu, Jinda Lin, Zhishen Tong, Hong Yu, Shensheng Han

    Abstract: Spectral camera based on ghost imaging via sparsity constraints (GISC spectral camera) obtains three-dimensional (3D) hyperspectral information with two-dimensional (2D) compressive measurements in a single shot, which has attracted much attention in recent years. However, its imaging quality and real-time performance of reconstruction still need to be further improved. Recently, deep learning has…

    Submitted 28 June, 2022; originally announced June 2022.

  46. arXiv:2206.09058  [pdf, other

    eess.AS cs.LG

    NASTAR: Noise Adaptive Speech Enhancement with Target-Conditional Resampling

    Authors: Chi-Chang Lee, Cheng-Hung Hu, Yu-Chen Lin, Chu-Song Chen, Hsin-Min Wang, Yu Tsao

    Abstract: For deep learning-based speech enhancement (SE) systems, the training-test acoustic mismatch can cause notable performance degradation. To address the mismatch issue, numerous noise adaptation strategies have been derived. In this paper, we propose a novel method, called noise adaptive speech enhancement with target-conditional resampling (NASTAR), which reduces mismatches with only one sample (on…

    Submitted 17 June, 2022; originally announced June 2022.

    Comments: Accepted to Interspeech 2022

  47. arXiv:2201.09145  [pdf, other

    cs.LG eess.SP

    GLassoformer: A Query-Sparse Transformer for Post-Fault Power Grid Voltage Prediction

    Authors: Yunling Zheng, Carson Hu, Guang Lin, Meng Yue, Bao Wang, Jack Xin

    Abstract: We propose GLassoformer, a novel and efficient transformer architecture leveraging group Lasso regularization to reduce the number of queries of the standard self-attention mechanism. Due to the sparsified queries, GLassoformer is more computationally efficient than the standard transformers. On the power grid post-fault voltage prediction task, GLassoformer shows remarkably better prediction than…

    Submitted 22 January, 2022; originally announced January 2022.

  48. arXiv:2201.06778  [pdf, other

    eess.SP cs.IT cs.LG

    Data-Driven Deep Learning Based Hybrid Beamforming for Aerial Massive MIMO-OFDM Systems with Implicit CSI

    Authors: Zhen Gao, Minghui Wu, Chun Hu, Feifei Gao, Guanghui Wen, Dezhi Zheng, Jun Zhang

    Abstract: In an aerial hybrid massive multiple-input multiple-output (MIMO) and orthogonal frequency division multiplexing (OFDM) system, how to design a spectral-efficient broadband multi-user hybrid beamforming with a limited pilot and feedback overhead is challenging. To this end, by modeling the key transmission modules as an end-to-end (E2E) neural network, this paper proposes a data-driven deep learni…

    Submitted 9 September, 2022; v1 submitted 18 January, 2022; originally announced January 2022.

    Comments: Accepted by IEEE Journal on Selected Areas in Communications

  49. arXiv:2110.08243  [pdf, other

    eess.AS cs.CL cs.CV cs.LG cs.SD eess.IV

    Neural Dubber: Dubbing for Videos According to Scripts

    Authors: Chenxu Hu, Qiao Tian, Tingle Li, Yuping Wang, Yuxuan Wang, Hang Zhao

    Abstract: Dubbing is a post-production process of re-recording actors' dialogues, which is extensively used in filmmaking and video production. It is usually performed manually by professional voice actors who read lines with proper prosody, and in synchronization with the pre-recorded videos. In this work, we propose Neural Dubber, the first neural network model to solve a novel automatic video dubbing (AV…

    Submitted 15 March, 2022; v1 submitted 15 October, 2021; originally announced October 2021.

    Comments: Accepted by NeurIPS 2021; Project page at https://tsinghua-mars-lab.github.io/NeuralDubber/

  50. arXiv:2109.12502  [pdf, other

    cs.CV eess.IV

    Self-Supervised Learning for MRI Reconstruction with a Parallel Network Training Framework

    Authors: Chen Hu, Cheng Li, Haifeng Wang, Qiegen Liu, Hairong Zheng, Shanshan Wang

    Abstract: Image reconstruction from undersampled k-space data plays an important role in accelerating the acquisition of MR data, and a lot of deep learning-based methods have been exploited recently. Despite the achieved inspiring results, the optimization of these methods commonly relies on the fully-sampled reference data, which are time-consuming and difficult to collect. To address this issue, we propo…

    Submitted 26 September, 2021; originally announced September 2021.

    Comments: 10 pages, 3 figures, 2 tables