Showing 1–50 of 94 results for author: Han, Y

Searching in archive eess.
  1. arXiv:2408.10737  [pdf, other]

    cs.IT eess.SP

    Mid-Band Extra Large-Scale MIMO System: Channel Modeling and Performance Analysis

    Authors: Jiachen Tian, Yu Han, Xiao Li, Shi Jin, Chao-Kai Wen

    Abstract: In pursuit of enhanced quality of service and higher transmission rates, communication within the mid-band spectrum, such as bands in the 6-15 GHz range, combined with extra large-scale multiple-input multiple-output (XL-MIMO), is considered a potential enabler for future communication systems. However, the characteristics introduced by mid-band XL-MIMO systems pose challenges for channel modeling…

    Submitted 20 August, 2024; originally announced August 2024.

    Comments: 16 pages, 10 figures

  2. arXiv:2407.11705  [pdf, other]

    cs.RO eess.SP

    Snail-Radar: A large-scale diverse dataset for the evaluation of 4D-radar-based SLAM systems

    Authors: Jianzhu Huai, Binliang Wang, Yuan Zhuang, Yiwen Chen, Qipeng Li, Yulong Han, Charles Toth

    Abstract: 4D radars are increasingly favored for odometry and mapping of autonomous systems due to their robustness in harsh weather and dynamic environments. Existing datasets, however, often cover limited areas and are typically captured using a single platform. To address this gap, we present a diverse large-scale dataset specifically designed for 4D radar-based localization and mapping. This dataset was…

    Submitted 22 July, 2024; v1 submitted 16 July, 2024; originally announced July 2024.

    Comments: 11 pages, 4 figures, 5 tables

  3. arXiv:2407.10377  [pdf]

    eess.IV cs.AI cs.CV

    Enhanced Self-supervised Learning for Multi-modality MRI Segmentation and Classification: A Novel Approach Avoiding Model Collapse

    Authors: Linxuan Han, Sa Xiao, Zimeng Li, Haidong Li, Xiuchao Zhao, Fumin Guo, Yeqing Han, Xin Zhou

    Abstract: Multi-modality magnetic resonance imaging (MRI) can provide complementary information for computer-aided diagnosis. Traditional deep learning algorithms are suitable for identifying specific anatomical structures, segmenting lesions, and classifying diseases with magnetic resonance images. However, manual labels are limited due to high expense, which hinders further improvement of model accuracy. Se…

    Submitted 17 July, 2024; v1 submitted 14 July, 2024; originally announced July 2024.

  4. arXiv:2406.19769  [pdf, other]

    eess.SP

    Decision Transformer for IRS-Assisted Systems with Diffusion-Driven Generative Channels

    Authors: Jie Zhang, Jun Li, Zhe Wang, Yu Han, Long Shi, Bin Cao

    Abstract: In this paper, we propose a novel diffusion-decision transformer (D2T) architecture to optimize the beamforming strategies for intelligent reflecting surface (IRS)-assisted multiple-input single-output (MISO) communication systems. The first challenge lies in the expensive computation cost to recover the real-time channel state information (CSI) from the received pilot signals, which usually requi…

    Submitted 28 June, 2024; originally announced June 2024.

  5. arXiv:2406.18425  [pdf, other]

    eess.SP

    L-Sort: An Efficient Hardware for Real-time Multi-channel Spike Sorting with Localization

    Authors: Yuntao Han, Shiwei Wang, Alister Hamilton

    Abstract: Spike sorting is essential for extracting neuronal information from neural signals and understanding brain function. With the advent of high-density microelectrode arrays (HDMEAs), the challenges and opportunities in multi-channel spike sorting have intensified. Real-time spike sorting is particularly crucial for closed-loop brain computer interface (BCI) applications, demanding efficient hardware…

    Submitted 26 June, 2024; originally announced June 2024.

    ACM Class: B.7.1

  6. arXiv:2406.03706  [pdf, other]

    cs.SD cs.CL eess.AS

    Improving Audio Codec-based Zero-Shot Text-to-Speech Synthesis with Multi-Modal Context and Large Language Model

    Authors: Jinlong Xue, Yayue Deng, Yicheng Han, Yingming Gao, Ya Li

    Abstract: Recent advances in large language models (LLMs) and the development of audio codecs have greatly propelled zero-shot TTS. They can synthesize personalized speech with only a 3-second speech of an unseen speaker as acoustic prompt. However, they only support short speech prompts and cannot leverage longer context information, as required in audiobook and conversational TTS scenarios. In this paper, we intr…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: Accepted by Interspeech 2024

  7. arXiv:2405.20969  [pdf, other]

    cs.RO eess.SY

    Design, Calibration, and Control of Compliant Force-sensing Gripping Pads for Humanoid Robots

    Authors: Yuanfeng Han, Boren Jiang, Gregory S. Chirikjian

    Abstract: This paper introduces a pair of low-cost, light-weight and compliant force-sensing gripping pads used for manipulating box-like objects with smaller-sized humanoid robots. These pads measure normal gripping forces and center of pressure (CoP). A calibration method is developed to improve the CoP measurement accuracy. A hybrid force-alignment-position control framework is proposed to regulate the g…

    Submitted 31 May, 2024; originally announced May 2024.

    Comments: 21 pages, 16 figures, Published in ASME Journal of Mechanisms and Robotics

    Journal ref: Journal of Mechanisms and Robotics, 15, 031010, 2023

  8. arXiv:2405.16715  [pdf]

    eess.SP

    Coil Reweighting to Suppress Motion Artifacts in Real-Time Exercise Cine Imaging

    Authors: Chong Chen, Yingmin Liu, Yu Ding, Matthew Tong, Preethi Chandrasekaran, Christopher Crabtree, Syed M. Arshad, Yuchi Han, Rizwan Ahmad

    Abstract: Background: Accelerated real-time cine (RT-Cine) imaging enables cardiac function assessment without the need for breath-holding. However, when performed during in-magnet exercise, RT-Cine images may exhibit significant motion artifacts. Methods: By projecting the time-averaged images to the subspace spanned by the coil sensitivity maps, we propose a coil reweighting (CR) method to automatically s…

    Submitted 26 May, 2024; originally announced May 2024.

  9. arXiv:2405.00367  [pdf, other]

    cs.IR cs.AI cs.SD eess.AS

    Distance Sampling-based Paraphraser Leveraging ChatGPT for Text Data Manipulation

    Authors: Yoori Oh, Yoseob Han, Kyogu Lee

    Abstract: There has been growing interest in audio-language retrieval research, where the objective is to establish the correlation between audio and text modalities. However, most audio-text paired datasets often lack rich expression of the text data compared to the audio samples. One of the significant challenges facing audio-text datasets is the presence of similar or identical captions despite different…

    Submitted 1 May, 2024; originally announced May 2024.

    Comments: Accepted at SIGIR 2024 short paper track

  10. arXiv:2404.16318  [pdf, other]

    eess.SY

    The Continuous-Time Weighted-Median Opinion Dynamics

    Authors: Yi Han, Ge Chen, Florian Dörfler, Wenjun Mei

    Abstract: Opinion dynamics models are important in understanding and predicting opinion formation processes within social groups. Although the weighted-averaging opinion-update mechanism is widely adopted as the micro-foundation of opinion dynamics, it bears a non-negligibly unrealistic implication: opinion attractiveness increases with opinion distance. Recently, the weighted-median mechanism has been prop…

    Submitted 28 April, 2024; v1 submitted 24 April, 2024; originally announced April 2024.

    Comments: 13 pages, 1 figure

    MSC Class: 91D30(Primary) 93A16(Secondary)

  11. arXiv:2403.08580  [pdf, other]

    cs.CV cs.MM eess.IV

    Leveraging Compressed Frame Sizes For Ultra-Fast Video Classification

    Authors: Yuxing Han, Yunan Ding, Chen Ye Gan, Jiangtao Wen

    Abstract: Classifying videos into distinct categories, such as Sport and Music Video, is crucial for multimedia understanding and retrieval, especially when an immense volume of video content is being constantly generated. Traditional methods require video decompression to extract pixel-level features like color, texture, and motion, thereby increasing computational and storage demands. Moreover, these meth…

    Submitted 13 March, 2024; originally announced March 2024.

    Comments: 5 pages, 5 figures, 1 table. arXiv admin note: substantial text overlap with arXiv:2309.07361

  12. arXiv:2403.06998  [pdf]

    eess.SP cs.HC cs.NE

    High-speed Low-consumption sEMG-based Transient-state micro-Gesture Recognition

    Authors: Youfang Han, Wei Zhao, Xiangjin Chen, Xin Meng

    Abstract: Gesture recognition on wearable devices is extensively applied in human-computer interaction. Electromyography (EMG) has been used in many gesture recognition systems for its rapid perception of muscle signals. However, analyzing EMG signals on devices, like smart wristbands, usually needs inference models to have high performances, such as low inference latency, low power consumption, and low mem…

    Submitted 12 March, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

  13. arXiv:2402.17877  [pdf, other]

    eess.SP eess.IV

    Accelerated Real-time Cine and Flow under In-magnet Staged Exercise

    Authors: Preethi Chandrasekaran, Chong Chen, Yingmin Liu, Syed Murtaza Arshad, Christopher Crabtree, Matthew Tong, Yuchi Han, Rizwan Ahmad

    Abstract: Background: Cardiovascular magnetic resonance imaging (CMR) is a well-established imaging tool for diagnosing and managing cardiac conditions. The integration of exercise stress with CMR (ExCMR) can enhance its diagnostic capacity. Despite recent advances in CMR technology, quantitative ExCMR during exercise remains technically challenging due to motion artifacts and limited spatial and temporal re…

    Submitted 21 May, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

  14. arXiv:2401.08121  [pdf, other]

    cs.LG cs.AI eess.SY

    CycLight: learning traffic signal cooperation with a cycle-level strategy

    Authors: Gengyue Han, Xiaohan Liu, Xianyue Peng, Hao Wang, Yu Han

    Abstract: This study introduces CycLight, a novel cycle-level deep reinforcement learning (RL) approach for network-level adaptive traffic signal control (NATSC) systems. Unlike most traditional RL-based traffic controllers that focus on step-by-step decision making, CycLight adopts a cycle-level strategy, optimizing cycle length and splits simultaneously using Parameterized Deep Q-Networks (PDQN) algorithm…

    Submitted 16 January, 2024; originally announced January 2024.

  15. arXiv:2312.17282  [pdf]

    eess.SY nlin.CD

    Nonlinear energy harvesting system with multiple stability

    Authors: Yanwei Han, Zijian Zhang

    Abstract: The nonlinear energy harvesting systems of the forced vibration with an electron-mechanical coupling are widely used to capture ambient vibration energy and convert mechanical energy into electrical energy. However, the nonlinear response mechanism of the friction induced vibration (FIV) energy harvesting system with multiple stability and stick-slip motion is still unclear. In the current paper,…

    Submitted 27 December, 2023; originally announced December 2023.

    Comments: 29 Pages, 29 figures

    MSC Class: 34-xx ACM Class: J.2

  16. arXiv:2312.16383  [pdf, ps, other]

    cs.SD cs.AI eess.AS

    Frame-level emotional state alignment method for speech emotion recognition

    Authors: Qifei Li, Yingming Gao, Cong Wang, Yayue Deng, Jinlong Xue, Yichen Han, Ya Li

    Abstract: Speech emotion recognition (SER) systems aim to recognize human emotional state during human-computer interaction. Most existing SER systems are trained based on utterance-level labels. However, not all frames in an audio have affective states consistent with the utterance-level label, which makes it difficult for the model to distinguish the true emotion of the audio and degrades its performance. To address th…

    Submitted 26 December, 2023; originally announced December 2023.

    Comments: Accepted by ICASSP 2024

  17. arXiv:2312.10112  [pdf, other]

    cs.CV cs.LG eess.IV

    NM-FlowGAN: Modeling sRGB Noise with a Hybrid Approach based on Normalizing Flows and Generative Adversarial Networks

    Authors: Young Joo Han, Ha-Jin Yu

    Abstract: Modeling and synthesizing real sRGB noise is crucial for various low-level vision tasks, such as building datasets for training image denoising systems. The distribution of real sRGB noise is highly complex and affected by a multitude of factors, making its accurate modeling extremely challenging. Therefore, recent studies have proposed methods that employ data-driven generative models, such as ge…

    Submitted 14 March, 2024; v1 submitted 15 December, 2023; originally announced December 2023.

    Comments: 25 pages, 11 figures, 7 tables

    MSC Class: 68T45 ACM Class: I.4.4

  18. arXiv:2310.11044  [pdf, ps, other]

    cs.IT eess.SP

    A Tutorial on Near-Field XL-MIMO Communications Towards 6G

    Authors: Haiquan Lu, Yong Zeng, Changsheng You, Yu Han, Jiayi Zhang, Zhe Wang, Zhenjun Dong, Shi Jin, Cheng-Xiang Wang, Tao Jiang, Xiaohu You, Rui Zhang

    Abstract: Extremely large-scale multiple-input multiple-output (XL-MIMO) is a promising technology for the sixth-generation (6G) mobile communication networks. By significantly boosting the antenna number or size to at least an order of magnitude beyond current massive MIMO systems, XL-MIMO is expected to unprecedentedly enhance the spectral efficiency and spatial resolution for wireless communication. The…

    Submitted 3 April, 2024; v1 submitted 17 October, 2023; originally announced October 2023.

    Comments: 42 pages

  19. arXiv:2310.07464  [pdf]

    eess.IV cs.LG q-bio.QM

    Deep Learning Predicts Biomarker Status and Discovers Related Histomorphology Characteristics for Low-Grade Glioma

    Authors: Zijie Fang, Yihan Liu, Yifeng Wang, Xiangyang Zhang, Yang Chen, Changjing Cai, Yiyang Lin, Ying Han, Zhi Wang, Shan Zeng, Hong Shen, Jun Tan, Yongbing Zhang

    Abstract: Biomarker detection is an indispensable part in the diagnosis and treatment of low-grade glioma (LGG). However, current LGG biomarker detection methods rely on expensive and complex molecular genetic testing, for which professionals are required to analyze the results, and intra-rater variability is often reported. To overcome these challenges, we propose an interpretable deep learning pipeline, a…

    Submitted 11 October, 2023; originally announced October 2023.

    Comments: 47 pages, 6 figures

  20. arXiv:2309.16128  [pdf, other]

    cs.CV eess.IV

    Joint Correcting and Refinement for Balanced Low-Light Image Enhancement

    Authors: Nana Yu, Hong Shi, Yahong Han

    Abstract: Low-light image enhancement tasks demand an appropriate balance among brightness, color, and illumination. Existing methods, however, often focus on one aspect of the image without considering this balance, which causes problems such as color distortion and overexposure. This seriously affects both human visual perception and the performance of high-level visual models. In t…

    Submitted 19 October, 2023; v1 submitted 27 September, 2023; originally announced September 2023.

  21. arXiv:2309.11977  [pdf, other]

    cs.SD eess.AS

    Improving Language Model-Based Zero-Shot Text-to-Speech Synthesis with Multi-Scale Acoustic Prompts

    Authors: Shun Lei, Yixuan Zhou, Liyang Chen, Dan Luo, Zhiyong Wu, Xixin Wu, Shiyin Kang, Tao Jiang, Yahui Zhou, Yuxing Han, Helen Meng

    Abstract: Zero-shot text-to-speech (TTS) synthesis aims to clone any unseen speaker's voice without adaptation parameters. By quantizing speech waveform into discrete acoustic tokens and modeling these tokens with the language model, recent language model-based TTS models show zero-shot speaker adaptation capabilities with only a 3-second acoustic prompt of an unseen speaker. However, they are limited by th…

    Submitted 9 April, 2024; v1 submitted 21 September, 2023; originally announced September 2023.

    Comments: Accepted by ICASSP 2024

  22. arXiv:2309.03686  [pdf, other]

    eess.IV cs.CV

    MS-UNet-v2: Adaptive Denoising Method and Training Strategy for Medical Image Segmentation with Small Training Data

    Authors: Haoyuan Chen, Yufei Han, Pin Xu, Yanyi Li, Kuan Li, Jianping Yin

    Abstract: Models based on U-like structures have improved the performance of medical image segmentation. However, the single-layer decoder structure of U-Net is too "thin" to exploit enough information, resulting in large semantic differences between the encoder and decoder parts. Things get worse if the number of training sets of data is not sufficiently large, which is common in medical image processing t…

    Submitted 7 September, 2023; originally announced September 2023.

  23. arXiv:2309.03451  [pdf, other]

    cs.SD cs.LG eess.AS

    Cross-domain Sound Recognition for Efficient Underwater Data Analysis

    Authors: Jeongsoo Park, Dong-Gyun Han, Hyoung Sul La, Sangmin Lee, Yoonchang Han, Eun-Jin Yang

    Abstract: This paper presents a novel deep learning approach for analyzing massive underwater acoustic data by leveraging a model trained on a broad spectrum of non-underwater (aerial) sounds. Recognizing the challenge in labeling vast amounts of underwater data, we propose a two-fold methodology to accelerate this labor-intensive procedure. The first part of our approach involves PCA and UMAP visualizati…

    Submitted 21 February, 2024; v1 submitted 6 September, 2023; originally announced September 2023.

    Comments: Accepted to APSIPA 2023

  24. arXiv:2308.15752  [pdf, other]

    cs.CV eess.IV

    Large-scale data extraction from the UNOS organ donor documents

    Authors: Marek Rychlik, Bekir Tanriover, Yan Han

    Abstract: In this paper we focus on three major tasks: 1) discussing our methods: Our method captures a portion of the data in DCD flowsheets, kidney perfusion data, and Flowsheet data captured peri-organ recovery surgery. 2) demonstrating the result: We built a comprehensive, analyzable database from 2022 OPTN data. This dataset is by far larger than any previously available even in this preliminary phase;…

    Submitted 4 January, 2024; v1 submitted 30 August, 2023; originally announced August 2023.

    MSC Class: 62; 68 ACM Class: I.5.4

  25. arXiv:2308.12985  [pdf]

    cs.AI eess.SY

    Perimeter Control with Heterogeneous Metering Rates for Cordon Signals: A Physics-Regularized Multi-Agent Reinforcement Learning Approach

    Authors: Jiajie Yu, Pierre-Antoine Laharotte, Yu Han, Wei Ma, Ludovic Leclercq

    Abstract: Perimeter Control (PC) strategies have been proposed to address urban road network control in oversaturated situations by regulating the transfer flow of the Protected Network (PN) based on the Macroscopic Fundamental Diagram (MFD). The uniform metering rate for cordon signals in most existing studies overlooks the variance of local traffic states at the intersection level, which may cause severe…

    Submitted 31 May, 2024; v1 submitted 24 August, 2023; originally announced August 2023.

    Comments: 21 pages, 24 figures

  26. arXiv:2308.02088  [pdf, other]

    eess.IV eess.SP

    Motion-robust free-running volumetric cardiovascular MRI

    Authors: Syed M. Arshad, Lee C. Potter, Chong Chen, Yingmin Liu, Preethi Chandrasekaran, Christopher Crabtree, Matthew S. Tong, Orlando P. Simonetti, Yuchi Han, Rizwan Ahmad

    Abstract: PURPOSE: To present and assess an outlier mitigation method that makes free-running volumetric cardiovascular MRI (CMR) more robust to motion. METHODS: The proposed method, called compressive recovery with outlier rejection (CORe), models outliers in the measured data as an additive auxiliary variable. We enforce MR physics-guided group sparsity on the auxiliary variable, and jointly estimate it…

    Submitted 24 June, 2024; v1 submitted 3 August, 2023; originally announced August 2023.

    Journal ref: Magnetic Resonance in Medicine 92(3) (2024) 1248-1262

  27. Rank Optimization for MIMO Channel with RIS: Simulation and Measurement

    Authors: Shengguo Meng, Wankai Tang, Weicong Chen, Jifeng Lan, Qun Yan Zhou, Yu Han, Xiao Li, Shi Jin

    Abstract: Reconfigurable intelligent surface (RIS) is a promising technology that can reshape the electromagnetic environment in wireless networks, offering various possibilities for enhancing wireless channels. Motivated by this, we investigate the channel optimization for multiple-input multiple-output (MIMO) systems assisted by RIS. In this paper, an efficient RIS optimization method is proposed to enhan…

    Submitted 8 December, 2023; v1 submitted 25 July, 2023; originally announced July 2023.

    Comments: This work has been accepted by IEEE WCL

  28. arXiv:2307.09823  [pdf, other]

    eess.IV cs.CV cs.LG

    Multi-modal Learning based Prediction for Disease

    Authors: Yaran Chen, Xueyu Chen, Yu Han, Haoran Li, Dongbin Zhao, Jingzhong Li, Xu Wang

    Abstract: Non-alcoholic fatty liver disease (NAFLD) is the most common cause of chronic liver disease, which can be predicted accurately to prevent advanced fibrosis and cirrhosis. However, a liver biopsy, the gold standard for NAFLD diagnosis, is invasive, expensive, and prone to sampling errors. Therefore, non-invasive studies are extremely promising, yet they are still in their infancy due to the lack of c…

    Submitted 19 July, 2023; originally announced July 2023.

  29. arXiv:2306.07650  [pdf, other]

    cs.CL cs.SD eess.AS

    Modality Adaption or Regularization? A Case Study on End-to-End Speech Translation

    Authors: Yuchen Han, Chen Xu, Tong Xiao, Jingbo Zhu

    Abstract: Pre-training and fine-tuning is a paradigm for alleviating the data scarcity problem in end-to-end speech translation (E2E ST). The commonplace "modality gap" between speech and text data often leads to inconsistent inputs between pre-training and fine-tuning. However, we observe that this gap occurs in the early stages of fine-tuning, but does not have a major impact on the final performance. On…

    Submitted 13 June, 2023; originally announced June 2023.

    Comments: ACL 2023 Main Conference

  30. arXiv:2304.14467  [pdf, other]

    eess.SP

    Distributed Quantized Detection of Sparse Signals Under Byzantine Attacks

    Authors: Chen Quan, Yunghsiang S. Han, Baocheng Geng, Pramod K. Varshney

    Abstract: This paper investigates distributed detection of sparse stochastic signals with quantized measurements under Byzantine attacks. Under this type of attack, sensors in the networks might send falsified data to degrade system performance. The Bernoulli-Gaussian (BG) distribution in terms of the sparsity degree of the stochastic signal is utilized for modeling the sparsity of signals. Several detector…

    Submitted 27 April, 2023; originally announced April 2023.

  31. arXiv:2301.10815  [pdf, other]

    eess.SP

    Human-machine Hierarchical Networks for Decision Making under Byzantine Attacks

    Authors: Chen Quan, Baocheng Geng, Yunghsiang S. Han, Pramod K. Varshney

    Abstract: This paper proposes a belief-updating scheme in a human-machine collaborative decision-making network to combat Byzantine attacks. A hierarchical framework is used to realize the network where local decisions from physical sensors act as reference decisions to improve the quality of human sensor decisions. During the decision-making process, the belief that each physical sensor is malicious is upd…

    Submitted 25 January, 2023; originally announced January 2023.

  32. arXiv:2301.09058  [pdf, other]

    eess.AS cs.LG

    Leveraging Speaker Embeddings with Adversarial Multi-task Learning for Age Group Classification

    Authors: Kwangje Baeg, Yeong-Gwan Kim, Young-Sub Han, Byoung-Ki Jeon

    Abstract: Recently, researchers have utilized neural network-based speaker embedding techniques in speaker-recognition tasks to identify speakers accurately. However, speaker-discriminative embeddings do not always represent speech features such as age group well. In an embedding model that has been highly trained to capture speaker traits, the task of age group classification is closer to speech informatio…

    Submitted 22 January, 2023; originally announced January 2023.

  33. arXiv:2211.06160  [pdf, other]

    eess.AS cs.LG cs.SD

    Semi-supervised learning for continuous emotional intensity controllable speech synthesis with disentangled representations

    Authors: Yoori Oh, Juheon Lee, Yoseob Han, Kyogu Lee

    Abstract: Recent text-to-speech models have reached the level of generating natural speech similar to what humans say. However, they still have limitations in terms of expressiveness. The existing emotional speech synthesis models have shown controllability using interpolated features with scaling parameters in emotional latent space. However, the emotional latent space generated from the existing models is dif…

    Submitted 29 May, 2023; v1 submitted 11 November, 2022; originally announced November 2022.

    Comments: Accepted by Interspeech 2023

  34. arXiv:2210.14627  [pdf, other]

    cs.IT eess.SP

    Channel-Aware Ordered Successive Relaying with Finite-Blocklength Coding

    Authors: Lingrui Zhang, Yuxing Han, Qiong Wang, Wei Chen

    Abstract: Successive relaying can improve the transmission rate by allowing the source and relays to transmit messages simultaneously, but it may cause severe inter-relay interference (IRI). IRI cancellation schemes have been proposed to mitigate IRI. However, interference cancellation methods have a high risk of error propagation, resulting in a severe transmission rate loss in finite blocklength regimes.…

    Submitted 26 October, 2022; originally announced October 2022.

    Comments: 11 pages, 5 figures

  35. Efficient Ordered-Transmission Based Distributed Detection under Data Falsification Attacks

    Authors: Chen Quan, Nandan Sriranga, Haodong Yang, Yunghsiang S. Han, Baocheng Geng, Pramod K. Varshney

    Abstract: In distributed detection systems, energy-efficient ordered transmission (EEOT) schemes are able to reduce the number of transmissions required to make a final decision. In this work, we investigate the effect of data falsification attacks on the performance of EEOT-based systems. We derive the probability of error for an EEOT-based system under attack and find an upper bound (UB) on the expected n…

    Submitted 18 July, 2022; originally announced July 2022.

  36. arXiv:2206.02507  [pdf, other]

    cs.LG eess.SY

    Learning to Control under Time-Varying Environment

    Authors: Yuzhen Han, Ruben Solozabal, Jing Dong, Xingyu Zhou, Martin Takac, Bin Gu

    Abstract: This paper investigates the problem of regret minimization in linear time-varying (LTV) dynamical systems. Due to the simultaneous presence of uncertainty and non-stationarity, designing online control algorithms for unknown LTV systems remains a challenging task. At a cost of NP-hard offline planning, prior works have introduced online convex optimization algorithms, although they suffer from non…

    Submitted 6 June, 2022; originally announced June 2022.

  37. arXiv:2206.01833  [pdf, other]

    cs.RO cs.MA eess.SY

    Leveraging Heterogeneous Capabilities in Multi-Agent Systems for Environmental Conflict Resolution

    Authors: Michael Enqi Cao, Jonas Warnke, Yunhai Han, Xinpei Ni, Ye Zhao, Samuel Coogan

    Abstract: In this paper, we introduce a high-level controller synthesis framework that enables teams of heterogeneous agents to assist each other in resolving environmental conflicts that appear at runtime. This conflict resolution method is built upon temporal-logic-based reactive synthesis to guarantee safety and task completion under specific environment assumptions. In heterogeneous multi-agent systems,…

    Submitted 1 September, 2022; v1 submitted 3 June, 2022; originally announced June 2022.

    Comments: Submitted to The International Symposium on Safety, Security, and Rescue Robotics (SSRR) 2022

  38. arXiv:2205.15170  [pdf, other]

    eess.IV cs.CR cs.CV

    GAN-based Medical Image Small Region Forgery Detection via a Two-Stage Cascade Framework

    Authors: Jianyi Zhang, Xuanxi Huang, Yaqi Liu, Yuyang Han, Zixiao Xiang

    Abstract: Using generative adversarial network (GAN) for data enhancement of medical images is significantly helpful for many computer-aided diagnosis (CAD) tasks. A new attack called CT-GAN has emerged. It can inject or remove lung cancer lesions to CT scans. Because the tampering region may even account for less than 1% of the original image, even state-of-the-art methods are challenging to de…

    Submitted 30 May, 2022; originally announced May 2022.

  39. arXiv:2204.08686  [pdf, ps, other]

    cs.SD eess.AS

    Audio-Visual Wake Word Spotting System For MISP Challenge 2021

    Authors: Yanguang Xu, Jianwei Sun, Yang Han, Shuaijiang Zhao, Chaoyang Mei, Tingwei Guo, Shuran Zhou, Chuandong Xie, Wei Zou, Xiangang Li

    Abstract: This paper presents the details of our system designed for the Task 1 of Multimodal Information Based Speech Processing (MISP) Challenge 2021. The purpose of Task 1 is to leverage both audio and video information to improve the environmental robustness of far-field wake word spotting. In the proposed system, firstly, we take advantage of speech enhancement algorithms such as beamforming and weight…

    Submitted 19 April, 2022; v1 submitted 19 April, 2022; originally announced April 2022.

    Comments: Accepted to ICASSP 2022

  40. arXiv:2204.07212  [pdf, other]

    cs.CR eess.SP

    Reputation and Audit Bit Based Distributed Detection in the Presence of Byzantine

    Authors: Chen Quan, Yunghsiang S. Han, Baocheng Geng, Pramod K. Varshney

    Abstract: In this paper, two reputation based algorithms called Reputation and audit based clustering (RAC) algorithm and Reputation and audit based clustering with auxiliary anchor node (RACA) algorithm are proposed to defend against Byzantine attacks in distributed detection networks when the fusion center (FC) has no prior knowledge of the attacking strategy of Byzantine nodes. By updating the reputation…

    Submitted 14 April, 2022; originally announced April 2022.

  41. arXiv:2203.10473  [pdf, other]

    cs.SD cs.LG eess.AS

    ECAPA-TDNN for Multi-speaker Text-to-speech Synthesis

    Authors: Jinlong Xue, Yayue Deng, Yichen Han, Ya Li, Jianqing Sun, Jiaen Liang

    Abstract: In recent years, neural network based methods for multi-speaker text-to-speech synthesis (TTS) have made significant progress. However, the current speaker encoder models used in these methods still cannot capture enough speaker information. In this paper, we focus on accurate speaker encoder modeling and propose an end-to-end method that can generate high-quality speech and better similarity for…

    Submitted 26 March, 2022; v1 submitted 20 March, 2022; originally announced March 2022.

    Comments: 5 pages, 2 figures, submitted to interspeech2022

  42. arXiv:2203.03969  [pdf, other]

    cs.GT cs.NI eess.SY

    A Dynamic Hierarchical Framework for IoT-assisted Metaverse Synchronization

    Authors: Yue Han, Dusit Niyato, Cyril Leung, Dong In Kim, Kun Zhu, Shaohan Feng, Sherman Xuemin Shen, Chunyan Miao

    Abstract: Metaverse has recently attracted much attention from both academia and industry. Virtual services, ranging from virtual driver training to online route optimization for smart goods delivery, are emerging in the Metaverse. To make the human experience of virtual life more real, digital twins (DTs), namely digital replicas of physical objects, are key enablers. However, DT status may not always accu…

    Submitted 14 March, 2022; v1 submitted 8 March, 2022; originally announced March 2022.

  43. arXiv:2202.07108  [pdf]

    eess.IV eess.SY physics.bio-ph physics.med-ph

    Dynamic optical contrast imaging for real-time delineation of tumor resection margins using head and neck cancer as a model

    Authors: Yong Hu, Shan Huang, Albert Y. Han, Seong Moon, Jeffrey F. Krane, Oscar Stafsudd, Warren Grundfest, Maie A. St. John

    Abstract: Complete surgical resection of the tumor for head and neck squamous cell carcinoma (HNSCC) remains challenging, given the devastating side effects of aggressive surgery and the anatomic proximity to vital structures. To address the clinical challenges, we introduce a wide-field, label-free imaging tool that can help surgeons delineate tumor margins in real time. We assume that autofluorescence life…

    Submitted 18 February, 2024; v1 submitted 14 February, 2022; originally announced February 2022.

    Comments: 21 pages, 7 figures and 1 table

  44. arXiv:2112.07854  [pdf]

    physics.bio-ph eess.SY q-bio.QM

    A terrain treadmill to study animal locomotion through large obstacles

    Authors: Ratan Othayoth, Blake Strebel, Yuanfeng Han, Evains Francois, Chen Li

    Abstract: A major challenge to understanding locomotion in complex 3-D terrain with large obstacles is to create tools for controlled, systematic lab experiments. Existing terrain arenas only allow observations at small spatiotemporal scales (~10 body length, ~10 stride cycles). Here, we create a terrain treadmill to enable high-resolution observations of animal locomotion through large obstacles over large…

    Submitted 14 December, 2021; originally announced December 2021.

  45. arXiv:2112.02743  [pdf, other]

    eess.IV cs.CV cs.LG

    Separated Contrastive Learning for Organ-at-Risk and Gross-Tumor-Volume Segmentation with Limited Annotation

    Authors: Jiacheng Wang, Xiaomeng Li, Yiming Han, Jing Qin, Liansheng Wang, Zhou Qichao

    Abstract: Automatic delineation of organ-at-risk (OAR) and gross-tumor-volume (GTV) is of great significance for radiotherapy planning. However, it is a challenging task to learn powerful representations for accurate delineation under limited pixel (voxel)-wise annotations. Contrastive learning at pixel-level can alleviate the dependency on annotations by learning dense representations from unlabeled data.…

    Submitted 20 April, 2022; v1 submitted 5 December, 2021; originally announced December 2021.

    Comments: Accepted in AAAI-22 (Oral)

  46. arXiv:2111.03729  [pdf, other]

    eess.IV cs.CV cs.LG

    Explaining neural network predictions of material strength

    Authors: Ian A. Palmer, T. Nathan Mundhenk, Brian Gallagher, Yong Han

    Abstract: We recently developed a deep learning method that can determine the critical peak stress of a material by looking at scanning electron microscope (SEM) images of the material's crystals. However, it has been somewhat unclear what kind of image features the network is keying off of when it makes its prediction. It is common in computer vision to employ an explainable AI saliency map to tell one wha…

    Submitted 5 November, 2021; originally announced November 2021.

  47. arXiv:2110.14787  [pdf, other]

    eess.IV cs.CV

    SCALP -- Supervised Contrastive Learning for Cardiopulmonary Disease Classification and Localization in Chest X-rays using Patient Metadata

    Authors: Ajay Jaiswal, Tianhao Li, Cyprian Zander, Yan Han, Justin F. Rousseau, Yifan Peng, Ying Ding

    Abstract: Computer-aided diagnosis plays a salient role in more accessible and accurate cardiopulmonary diseases classification and localization on chest radiography. Millions of people get affected and die due to these diseases without an accurate and timely diagnosis. Recently proposed contrastive learning heavily relies on data augmentation, especially positive data augmentation. However, generating clin…

    Submitted 27 October, 2021; originally announced October 2021.

  48. arXiv:2109.13325  [pdf, other]

    cs.CR eess.SP

    Enhanced Audit Bit Based Distributed Bayesian Detection in the Presence of Strategic Attacks

    Authors: Chen Quan, Baocheng Geng, Yunghsiang S. Han, Pramod K. Varshney

    Abstract: This paper employs an audit bit based mechanism to mitigate the effect of Byzantine attacks. In this framework, the optimal attacking strategy for intelligent attackers is investigated for the traditional audit bit based scheme (TAS) to evaluate the robustness of the system. We show that it is possible for an intelligent attacker to degrade the performance of TAS to the system without audit bits.…

    Submitted 27 September, 2021; originally announced September 2021.

  49. arXiv:2108.12094  [pdf, other]

    eess.SY

    A Numerical Verification Framework for Differential Privacy in Estimation

    Authors: Yunhai Han, Sonia Martínez

    Abstract: This work proposes an algorithmic method to verify differential privacy for estimation mechanisms with performance guarantees. Differential privacy makes it hard to distinguish outputs of a mechanism produced by adjacent inputs. While obtaining theoretical conditions that guarantee differential privacy may be possible, evaluating these conditions in practice can be hard. This is especially true fo…

    Submitted 2 December, 2021; v1 submitted 26 August, 2021; originally announced August 2021.

    Comments: The paper is accepted by IEEE Control System Letter (L-CSS)

  50. arXiv:2108.00952  [pdf, other]

    cs.CV cs.LG eess.IV

    An Applied Deep Learning Approach for Estimating Soybean Relative Maturity from UAV Imagery to Aid Plant Breeding Decisions

    Authors: Saba Moeinizade, Hieu Pham, Ye Han, Austin Dobbels, Guiping Hu

    Abstract: For a global breeding organization, identifying the next generation of superior crops is vital for its success. Recognizing new genetic varieties requires years of in-field testing to gather data about the crop's yield, pest resistance, heat resistance, etc. At the conclusion of the growing season, organizations need to determine which varieties will be advanced to the next growing season (or sold…

    Submitted 2 August, 2021; originally announced August 2021.

    Comments: 22 pages, 7 figures