
Showing 1–50 of 190 results for author: Huang, S

Searching in archive eess.
  1. arXiv:2408.16030  [pdf]

    cs.SD cs.AI cs.LG eess.AS

    A Deep Learning Approach to Localizing Multi-level Airway Collapse Based on Snoring Sounds

    Authors: Ying-Chieh Hsu, Stanley Yung-Chuan Liu, Chao-Jung Huang, Chi-Wei Wu, Ren-Kai Cheng, Jane Yung-Jen Hsu, Shang-Ran Huang, Yuan-Ren Cheng, Fu-Shun Hsu

    Abstract: This study investigates the application of machine/deep learning to classify snoring sounds excited at different levels of the upper airway in patients with obstructive sleep apnea (OSA) using data from drug-induced sleep endoscopy (DISE). The snoring sounds of 39 subjects were analyzed and labeled according to the Velum, Oropharynx, Tongue Base, and Epiglottis (VOTE) classification system. The da…

    Submitted 28 August, 2024; originally announced August 2024.

  2. arXiv:2408.15217  [pdf, other]

    eess.IV cs.AI cs.CV

    Fundus2Video: Cross-Modal Angiography Video Generation from Static Fundus Photography with Clinical Knowledge Guidance

    Authors: Weiyi Zhang, Siyu Huang, Jiancheng Yang, Ruoyu Chen, Zongyuan Ge, Yingfeng Zheng, Danli Shi, Mingguang He

    Abstract: Fundus Fluorescein Angiography (FFA) is a critical tool for assessing retinal vascular dynamics and aiding in the diagnosis of eye diseases. However, its invasive nature and lower accessibility compared to Color Fundus (CF) images pose significant challenges. Current CF-to-FFA translation methods are limited to static generation. In this work, we pioneer dynamic FFA video generation from static CF…

    Submitted 27 August, 2024; originally announced August 2024.

    Comments: The paper has been accepted by Medical Image Computing and Computer Assisted Intervention Society (MICCAI) 2024

  3. arXiv:2408.07566  [pdf]

    eess.SY

    Startup Control Optimization of He-Xe Cooled Space Nuclear Reactors Using a System Analysis Program

    Authors: Chengyuan Li, Leran Guo, Shanfang Huang, Jian Deng, Jiahe Shang

    Abstract: In recent years, achieving autonomous control in nuclear reactor operations has become pivotal for the effectiveness of Space Nuclear Power Systems (SNPS). However, compared to power control, the startup control of SNPS remains underexplored. This study introduces a multi-objective optimization framework aimed at enhancing startup control, leveraging a system level analysis program to simulate the…

    Submitted 14 August, 2024; v1 submitted 14 August, 2024; originally announced August 2024.

  4. arXiv:2408.00808  [pdf, other]

    eess.IV

    LightViz: Autonomous Light-field Surveying and Mapping for Distributed Light Pollution Monitoring

    Authors: Sheng-En Huang, Kazi Farha Farzana Suhi, Md Jahidul Islam

    Abstract: Existing technologies for distributed light-field mapping and light pollution monitoring (LPM) rely on either remote satellite imagery or manual light surveying with single-point sensors such as SQMs (sky quality meters). These modalities offer low-resolution data that are not informative for dense light-field mapping, pollutant factor identification, or sustainable policy implementation. In this…

    Submitted 31 July, 2024; originally announced August 2024.

    Comments: 10 pages, 11 figures

  5. arXiv:2407.21600  [pdf, other]

    eess.IV cs.AI cs.CV eess.SP physics.med-ph

    Robust Simultaneous Multislice MRI Reconstruction Using Deep Generative Priors

    Authors: Shoujin Huang, Guanxiong Luo, Yuwan Wang, Kexin Yang, Lingyan Zhang, Jingzhe Liu, Hua Guo, Min Wang, Mengye Lyu

    Abstract: Simultaneous multislice (SMS) imaging is a powerful technique for accelerating magnetic resonance imaging (MRI) acquisitions. However, SMS reconstruction remains challenging due to the complex signal interactions between and within the excited slices. This study presents a robust SMS MRI reconstruction method using deep generative priors. Starting from Gaussian noise, we leverage denoising diffusi…

    Submitted 31 July, 2024; originally announced July 2024.

  6. arXiv:2407.17727  [pdf, other]

    eess.SP

    Distributed Memory Approximate Message Passing

    Authors: Jun Lu, Lei Liu, Shunqi Huang, Ning Wei, Xiaoming Chen

    Abstract: Approximate message passing (AMP) algorithms are iterative methods for signal recovery in noisy linear systems. In some scenarios, AMP algorithms need to operate within a distributed network. To address this challenge, the distributed extensions of AMP (D-AMP, FD-AMP) and orthogonal/vector AMP (D-OAMP/D-VAMP) were proposed, but they still inherit the limitations of centralized algorithms. In this…

    Submitted 24 July, 2024; originally announced July 2024.

    Comments: Submitted to the IEEE Journal

  7. arXiv:2407.16447  [pdf, ps, other]

    eess.AS cs.SD

    The CHiME-8 DASR Challenge for Generalizable and Array Agnostic Distant Automatic Speech Recognition and Diarization

    Authors: Samuele Cornell, Taejin Park, Steve Huang, Christoph Boeddeker, Xuankai Chang, Matthew Maciejewski, Matthew Wiesner, Paola Garcia, Shinji Watanabe

    Abstract: This paper presents the CHiME-8 DASR challenge which carries on from the previous edition CHiME-7 DASR (C7DASR) and the past CHiME-6 challenge. It focuses on joint multi-channel distant speech recognition (DASR) and diarization with one or more, possibly heterogeneous, devices. The main goal is to spur research towards meeting transcription approaches that can generalize across arbitrary number of…

    Submitted 23 July, 2024; originally announced July 2024.

  8. arXiv:2407.01909  [pdf, other]

    cs.CL cs.SD eess.AS

    Pinyin Regularization in Error Correction for Chinese Speech Recognition with Large Language Models

    Authors: Zhiyuan Tang, Dong Wang, Shen Huang, Shidong Shang

    Abstract: Recent studies have demonstrated the efficacy of large language models (LLMs) in error correction for automatic speech recognition (ASR). However, much of the research focuses on the English language. This paper redirects the attention to Chinese. Firstly, we construct a specialized benchmark dataset aimed at error correction for Chinese ASR with 724K hypotheses-transcription pairs, named the Chin…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: Interspeech 2024

  9. arXiv:2406.16297  [pdf, other]

    cs.CV eess.IV

    Priorformer: A UGC-VQA Method with content and distortion priors

    Authors: Yajing Pei, Shiyu Huang, Yiting Lu, Xin Li, Zhibo Chen

    Abstract: User Generated Content (UGC) videos are susceptible to complicated and varied degradations and contents, which prevents existing blind video quality assessment (BVQA) models from performing well because they lack adaptability to distortions and contents. To mitigate this, we propose a novel prior-augmented perceptual vision transformer (PriorFormer) for the BVQA of UGC, which boosts its ad…

    Submitted 23 June, 2024; originally announced June 2024.

    Comments: 7 pages

  10. arXiv:2406.13209  [pdf, other]

    eess.IV cs.CV physics.med-ph

    Diffusion Model-based FOD Restoration from High Distortion in dMRI

    Authors: Shuo Huang, Lujia Zhong, Yonggang Shi

    Abstract: Fiber orientation distributions (FODs) are a popular model for representing diffusion MRI (dMRI) data. However, imaging artifacts such as susceptibility-induced distortion in dMRI can cause signal loss and lead to corrupted reconstruction of FODs, which prevents successful fiber tracking and connectivity analysis in affected brain regions such as the brain stem. Generative models, such as the…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: 11 pages, 7 figures

  11. arXiv:2406.12946  [pdf]

    eess.AS cs.AI cs.CL cs.LG

    Instruction Data Generation and Unsupervised Adaptation for Speech Language Models

    Authors: Vahid Noroozi, Zhehuai Chen, Somshubra Majumdar, Steve Huang, Jagadeesh Balam, Boris Ginsburg

    Abstract: In this paper, we propose three methods for generating synthetic samples to train and evaluate multimodal large language models capable of processing both text and speech inputs. Addressing the scarcity of samples containing both modalities, synthetic data generation emerges as a crucial strategy to enhance the performance of such systems and facilitate the modeling of cross-modal relationships be…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: Accepted for Interspeech 2024

  12. arXiv:2406.09656  [pdf, other]

    cs.CV cs.AI cs.LG eess.IV

    RSEND: Retinex-based Squeeze and Excitation Network with Dark Region Detection for Efficient Low Light Image Enhancement

    Authors: Jingcheng Li, Ye Qiao, Haocheng Xu, Sitao Huang

    Abstract: Images captured under low-light scenarios often suffer from low quality. Previous CNN-based deep learning methods often involve using Retinex theory. Nevertheless, most of them perform poorly on more complicated datasets such as LOL-v2 while consuming too many computational resources. Besides, some of these methods require sophisticated training at different stages, making the procedure even mor…

    Submitted 13 June, 2024; originally announced June 2024.

  13. arXiv:2405.14327  [pdf, other]

    eess.IV cs.AI cs.CV

    Autoregressive Image Diffusion: Generation of Image Sequence and Application in MRI

    Authors: Guanxiong Luo, Shoujin Huang, Martin Uecker

    Abstract: Magnetic resonance imaging (MRI) is a widely used non-invasive imaging modality. However, a persistent challenge lies in balancing image quality with imaging speed. This trade-off is primarily constrained by k-space measurements, which traverse specific trajectories in the spatial Fourier domain (k-space). These measurements are often undersampled to shorten acquisition times, resulting in image a…

    Submitted 24 May, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

  14. arXiv:2405.13199  [pdf, ps, other]

    eess.IV cs.CV

    TauAD: MRI-free Tau Anomaly Detection in PET Imaging via Conditioned Diffusion Models

    Authors: Lujia Zhong, Shuo Huang, Jiaxin Yue, Jianwei Zhang, Zhiwei Deng, Wenhao Chi, Yonggang Shi

    Abstract: The emergence of tau PET imaging over the last decade has enabled Alzheimer's disease (AD) researchers to examine tau pathology in vivo and more effectively characterize the disease trajectories of AD. Current tau PET analysis methods, however, typically perform inferences on large cortical ROIs and are limited in the detection of localized tau pathology that varies across subjects. Furthermore, a…

    Submitted 21 May, 2024; originally announced May 2024.

  15. arXiv:2405.12667  [pdf, other]

    eess.SY

    Spatial Mode Multiplexing for Fiber-Coupled IM/DD Optical Wireless Links with Misalignment

    Authors: Jinzhe Che, Shenjie Huang, Majid Safari

    Abstract: Optical wireless communication (OWC) emerges as a pivotal solution for achieving terabit-level aggregate throughput in next-generation wireless networks. With the mature high-speed transceivers and advanced (de)multiplexing techniques designed for fiber optics, fiber-coupled OWC can be seamlessly integrated into existing ultra-high-speed networks such as data centres. In particular, OWC leveraging…

    Submitted 25 May, 2024; v1 submitted 21 May, 2024; originally announced May 2024.

    Comments: 13 pages, 15 figures

  16. arXiv:2404.15584  [pdf]

    eess.SY

    Research on OPF control of three-phase four-wire low-voltage distribution network considering uncertainty

    Authors: Rui Wang, Xiaoqing Bai, Shengquan Huang, Shoupu Wei

    Abstract: As power systems become more complex and uncertain, low-voltage distribution networks face numerous challenges, including three-phase imbalances caused by asymmetrical loads and distributed energy resources. We propose a robust stochastic optimization (RSO) based optimal power flow (OPF) control method for three-phase, four-wire low-voltage distribution networks that consider uncertainty to addres…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: systems optimization, robust optimization, local control

  17. arXiv:2404.13677  [pdf, other]

    cs.CV eess.IV

    A Dataset and Model for Realistic License Plate Deblurring

    Authors: Haoyan Gong, Yuzheng Feng, Zhenrong Zhang, Xianxu Hou, Jingxin Liu, Siqi Huang, Hongbin Liu

    Abstract: Vehicle license plate recognition is a crucial task in intelligent traffic management systems. However, the challenge of achieving accurate recognition persists due to motion blur from fast-moving vehicles. Despite the widespread use of image synthesis approaches in existing deblurring and recognition algorithms, their effectiveness in real-world scenarios remains unproven. To address this, we int…

    Submitted 22 April, 2024; v1 submitted 21 April, 2024; originally announced April 2024.

    Comments: Accepted by IJCAI 2024

  18. arXiv:2404.06393  [pdf, other]

    cs.SD cs.AI eess.AS

    MuPT: A Generative Symbolic Music Pretrained Transformer

    Authors: Xingwei Qu, Yuelin Bai, Yinghao Ma, Ziya Zhou, Ka Man Lo, Jiaheng Liu, Ruibin Yuan, Lejun Min, Xueling Liu, Tianyu Zhang, Xinrun Du, Shuyue Guo, Yiming Liang, Yizhi Li, Shangda Wu, Junting Zhou, Tianyu Zheng, Ziyang Ma, Fengze Han, Wei Xue, Gus Xia, Emmanouil Benetos, Xiang Yue, Chenghua Lin, Xu Tan, et al. (4 additional authors not shown)

    Abstract: In this paper, we explore the application of Large Language Models (LLMs) to the pre-training of music. While the prevalent use of MIDI in music modeling is well-established, our findings suggest that LLMs are inherently more compatible with ABC Notation, which aligns more closely with their design and strengths, thereby enhancing the model's performance in musical composition. To address the chal…

    Submitted 10 April, 2024; v1 submitted 9 April, 2024; originally announced April 2024.

  19. arXiv:2403.15853  [pdf]

    eess.IV cs.CV

    An edge detection-based deep learning approach for tear meniscus height measurement

    Authors: Kesheng Wang, Kunhui Xu, Xiaoyu Chen, Chunlei He, Jianfeng Zhang, Dexing Kong, Qi Dai, Shoujun Huang

    Abstract: Automatic measurements of tear meniscus height (TMH) have been achieved by using deep learning techniques; however, annotation is significantly influenced by subjective factors and is both time-consuming and labor-intensive. In this paper, we introduce an automatic TMH measurement technique based on edge detection-assisted annotation within a deep learning framework. This method generates mask lab…

    Submitted 23 March, 2024; originally announced March 2024.

    Comments: 22 pages, 5 figures

  20. arXiv:2403.05912  [pdf, other]

    eess.IV cs.CV

    Mask-Enhanced Segment Anything Model for Tumor Lesion Semantic Segmentation

    Authors: Hairong Shi, Songhao Han, Shaofei Huang, Yue Liao, Guanbin Li, Xiangxing Kong, Hua Zhu, Xiaomu Wang, Si Liu

    Abstract: Tumor lesion segmentation on CT or MRI images plays a critical role in cancer diagnosis and treatment planning. Considering the inherent differences in tumor lesion segmentation data across various medical imaging modalities and equipment, integrating medical knowledge into the Segment Anything Model (SAM) presents promising capability due to its versatility and generalization potential. Recent st…

    Submitted 11 July, 2024; v1 submitted 9 March, 2024; originally announced March 2024.

  21. arXiv:2403.05245  [pdf, other]

    eess.IV cs.AI cs.CV

    Noise Level Adaptive Diffusion Model for Robust Reconstruction of Accelerated MRI

    Authors: Shoujin Huang, Guanxiong Luo, Xi Wang, Ziran Chen, Yuwan Wang, Huaishui Yang, Pheng-Ann Heng, Lingyan Zhang, Mengye Lyu

    Abstract: In general, diffusion model-based MRI reconstruction methods incrementally remove artificially added noise while imposing data consistency to reconstruct the underlying images. However, real-world MRI acquisitions already contain inherent noise due to thermal fluctuations. This phenomenon is particularly notable when using ultra-fast, high-resolution imaging sequences for advanced research, or usi…

    Submitted 31 July, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  22. arXiv:2402.11419  [pdf, other]

    eess.SP

    A Self-Healing Magnetic-Array-Type Current Sensor with Data-Driven Identification of Abnormal Magnetic Measurement Units

    Authors: Xiaohu Liu, Kang Ma, Jian Liu, Wei Zhao, Lisha Peng, Songling Huang, Shisong Li

    Abstract: Magnetic-array-type current sensors have garnered increasing popularity owing to their notable advantages, including broadband functionality, a large dynamic range, cost-effectiveness, and compact dimensions. However, the susceptibility of the measurement error of one or more magnetic measurement units (MMUs) within the current sensor to drift significantly from the nominal value due to environmen…

    Submitted 15 August, 2024; v1 submitted 17 February, 2024; originally announced February 2024.

    Comments: 11 pages, 10 figures

  23. arXiv:2402.09430  [pdf, other]

    eess.SP cs.AI cs.CV cs.MM

    WiMANS: A Benchmark Dataset for WiFi-based Multi-user Activity Sensing

    Authors: Shuokang Huang, Kaihan Li, Di You, Yichong Chen, Arvin Lin, Siying Liu, Xiaohui Li, Julie A. McCann

    Abstract: WiFi-based human sensing has exhibited remarkable potential to analyze user behaviors in a non-intrusive and device-free manner, benefiting applications as diverse as smart homes and healthcare. However, most previous works focus on single-user sensing, which has limited practicability in scenarios involving multiple users. Although recent studies have begun to investigate WiFi-based multi-user se…

    Submitted 12 March, 2024; v1 submitted 24 January, 2024; originally announced February 2024.

    Comments: We present WiMANS, to our knowledge, the first dataset for multi-user activity sensing based on WiFi

  24. arXiv:2402.02694  [pdf, other]

    eess.AS cs.LG cs.SD

    Description on IEEE ICME 2024 Grand Challenge: Semi-supervised Acoustic Scene Classification under Domain Shift

    Authors: Jisheng Bai, Mou Wang, Haohe Liu, Han Yin, Yafei Jia, Siwei Huang, Yutong Du, Dongzhe Zhang, Dongyuan Shi, Woon-Seng Gan, Mark D. Plumbley, Susanto Rahardja, Bin Xiang, Jianfeng Chen

    Abstract: Acoustic scene classification (ASC) is a crucial research problem in computational auditory scene analysis, and it aims to recognize the unique acoustic characteristics of an environment. One of the challenges of the ASC task is the domain shift between training and testing data. Since 2018, ASC challenges have focused on the generalization of ASC models across different recording devices. Althoug…

    Submitted 28 February, 2024; v1 submitted 4 February, 2024; originally announced February 2024.

  25. arXiv:2401.11620  [pdf, other]

    eess.SY

    Real-Time Systems Optimization with Black-box Constraints and Hybrid Variables

    Authors: Sen Wang, Dong Li, Shao-Yu Huang, Xuanliang Deng, Ashrarul H. Sifat, Changhee Jung, Ryan Williams, Haibo Zeng

    Abstract: When optimizing real-time systems, designers often face a challenging problem where the schedulability constraints are non-convex, non-continuous, or lack an analytical form to understand their properties. Although the optimization framework NORTH proposed in previous work is general (it works with arbitrary schedulability analysis) and scalable, it can only handle problems with continuous variabl…

    Submitted 21 January, 2024; originally announced January 2024.

    Comments: Workshop on OPtimization for Embedded and ReAl-time systems (OPERA 2023) co-located with the 44th IEEE Real-Time Systems Symposium (RTSS)

  26. arXiv:2401.03476  [pdf, other]

    cs.MM cs.AI cs.HC cs.SD eess.AS

    Freetalker: Controllable Speech and Text-Driven Gesture Generation Based on Diffusion Models for Enhanced Speaker Naturalness

    Authors: Sicheng Yang, Zunnan Xu, Haiwei Xue, Yongkang Cheng, Shaoli Huang, Mingming Gong, Zhiyong Wu

    Abstract: Current talking avatars mostly generate co-speech gestures based on audio and text of the utterance, without considering the non-speaking motion of the speaker. Furthermore, previous works on co-speech gesture generation have designed network structures based on individual gesture datasets, which results in limited data volume, compromised generalizability, and restricted speaker movements. To tac…

    Submitted 7 January, 2024; originally announced January 2024.

    Comments: 6 pages, 3 figures, ICASSP 2024

  27. arXiv:2401.03284  [pdf, other]

    eess.SY

    A General and Scalable Method for Optimizing Real-Time Systems

    Authors: Sen Wang, Dong Li, Shao-Yu Huang, Xuanliang Deng, Ashrarul H. Sifat, Changhee Jung, Ryan Williams, Haibo Zeng

    Abstract: In real-time systems optimization, designers often face a challenging problem posed by the non-convex and non-continuous schedulability conditions, which may even lack an analytical form to understand their properties. To tackle this challenging problem, we treat the schedulability analysis as a black box that only returns true/false results. We propose a general and scalable framework to optimize…

    Submitted 6 January, 2024; originally announced January 2024.

    Comments: Extension of a conference paper

  28. Implementing Digital Twin in Field-Deployed Optical Networks: Uncertain Factors, Operational Guidance, and Field-Trial Demonstration

    Authors: Yuchen Song, Min Zhang, Yao Zhang, Yan Shi, Shikui Shen, Bingli Guo, Shanguo Huang, Danshi Wang

    Abstract: Digital twin has revolutionized optical communication networks by enabling their full life-cycle management, including design, troubleshooting, optimization, upgrade, and prediction. While extensive literature exists on frameworks, standards, and applications of digital twin, there is a pressing need for implementing digital twin in field-deployed optical networks operating in real-world environmen…

    Submitted 6 December, 2023; originally announced December 2023.

    Comments: 10 pages, 5 figures. Accepted by IEEE Network Magazine, early access

  29. arXiv:2311.10798  [pdf, other]

    cs.LG cs.AI cs.CV eess.IV

    INSPECT: A Multimodal Dataset for Pulmonary Embolism Diagnosis and Prognosis

    Authors: Shih-Cheng Huang, Zepeng Huo, Ethan Steinberg, Chia-Chun Chiang, Matthew P. Lungren, Curtis P. Langlotz, Serena Yeung, Nigam H. Shah, Jason A. Fries

    Abstract: Synthesizing information from multiple data sources plays a crucial role in the practice of modern medicine. Current applications of artificial intelligence in medicine often focus on single-modality data due to a lack of publicly available, multimodal medical datasets. To address this limitation, we introduce INSPECT, which contains de-identified longitudinal records from a large cohort of patien…

    Submitted 17 November, 2023; originally announced November 2023.

  30. arXiv:2311.03887  [pdf, other]

    physics.optics eess.IV physics.med-ph

    Toward ground-truth optical coherence tomography via three-dimensional unsupervised deep learning processing and data

    Authors: Renxiong Wu, Fei Zheng, Meixuan Li, Shaoyan Huang, Xin Ge, Linbo Liu, Yong Liu, Guangming Ni

    Abstract: Optical coherence tomography (OCT) can perform non-invasive high-resolution three-dimensional (3D) imaging and has been widely used in biomedical fields, while it is inevitably affected by coherence speckle noise which degrades OCT imaging performance and restricts its applications. Here we present a novel speckle-free OCT imaging strategy, named toward-ground-truth OCT (tGT-OCT), that utilizes un…

    Submitted 7 November, 2023; originally announced November 2023.

  31. arXiv:2311.02511  [pdf]

    cs.NI eess.SP physics.optics quant-ph

    EU COST Action on future generation optical wireless communication technologies, 2nd White paper

    Authors: Z. Ghassemlooy, M. A. Khalighi, S. Zvanovec, A. Shrestha, B. Ortega, M. Petkovic, X. Pang, C. Sirtori, D. Orsucci, A. Shrestha, F. Moll, G. Cossu, V. Spirito, M. P. Ninos, E. Ciaramella, J. Bas, M. Amay, S. Huang, M. Safari, T. Gutema, W. Popoola, Vicente Matus, Jose Rabadan, Rafael Perez-Jimenez, E. Panayirci, et al. (3 additional authors not shown)

    Abstract: NEWFOCUS is an EU COST Action targeted at exploring radical solutions that could influence the design of future wireless networks. The project aims to address some of the challenges associated with optical wireless communication (OWC) and to establish it as a complementary technology to the radio frequency (RF)-based wireless systems in order to meet the demanding requirements of the fifth generat…

    Submitted 14 June, 2023; originally announced November 2023.

  32. arXiv:2310.19699  [pdf, other]

    eess.SY cs.OS cs.SC

    Optimizing Logical Execution Time Model for Both Determinism and Low Latency

    Authors: Sen Wang, Dong Li, Ashrarul H. Sifat, Shao-Yu Huang, Xuanliang Deng, Changhee Jung, Ryan Williams, Haibo Zeng

    Abstract: The Logical Execution Time (LET) programming model has recently received considerable attention, particularly because of its timing and dataflow determinism. In LET, task computation appears always to take the same amount of time (called the task's LET interval), and the task reads (resp. writes) at the beginning (resp. end) of the interval. Compared to other communication mechanisms, such as impl…

    Submitted 7 March, 2024; v1 submitted 30 October, 2023; originally announced October 2023.

    Comments: Accepted at RTAS'24

  33. arXiv:2310.08021  [pdf, other]

    eess.SP

    Channel-robust Automatic Modulation Classification Using Spectral Quotient Cumulants

    Authors: Sai Huang, Yuting Chen, Jiashuo He, Shuo Chang, Zhiyong Feng

    Abstract: Automatic modulation classification (AMC) aims to identify the modulation format of a received signal corrupted by channel effects and noise. Most existing works focus on the impact of noise while relatively little attention has been paid to the impact of channel effects. However, the instability posed by multipath fading channels leads to significant performance degradation. To mitigate the a…

    Submitted 11 October, 2023; originally announced October 2023.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible. 5 pages

  34. arXiv:2309.09032  [pdf, other]

    cs.IT cs.LG eess.SP stat.ML

    Solving Quadratic Systems with Full-Rank Matrices Using Sparse or Generative Priors

    Authors: Junren Chen, Shuai Huang, Michael K. Ng, Zhaoqiang Liu

    Abstract: The problem of recovering a signal $\boldsymbol{x} \in \mathbb{R}^n$ from a quadratic system $\{y_i=\boldsymbol{x}^\top\boldsymbol{A}_i\boldsymbol{x},\ i=1,\ldots,m\}$ with full-rank matrices $\boldsymbol{A}_i$ frequently arises in applications such as unassigned distance geometry and sub-wavelength imaging. With i.i.d. standard Gaussian matrices $\boldsymbol{A}_i$, this paper addresses the high-d…

    Submitted 16 September, 2023; originally announced September 2023.

  35. arXiv:2308.16460  [pdf, other]

    eess.IV cs.CV

    Improving Lens Flare Removal with General Purpose Pipeline and Multiple Light Sources Recovery

    Authors: Yuyan Zhou, Dong Liang, Songcan Chen, Sheng-Jun Huang, Shuo Yang, Chongyi Li

    Abstract: When taking images against strong light sources, the resulting images often contain heterogeneous flare artifacts. These artifacts can significantly affect image visual quality and downstream computer vision tasks. While collecting real data pairs of flare-corrupted/flare-free images for training flare removal models is challenging, current methods utilize the direct-add approach to synthesize data.…

    Submitted 31 August, 2023; originally announced August 2023.

    Comments: ICCV 2023

  36. arXiv:2308.08172  [pdf, other]

    eess.IV cs.CV cs.LG

    AATCT-IDS: A Benchmark Abdominal Adipose Tissue CT Image Dataset for Image Denoising, Semantic Segmentation, and Radiomics Evaluation

    Authors: Zhiyu Ma, Chen Li, Tianming Du, Le Zhang, Dechao Tang, Deguo Ma, Shanchuan Huang, Yan Liu, Yihao Sun, Zhihao Chen, Jin Yuan, Qianqing Nie, Marcin Grzegorzek, Hongzan Sun

    Abstract: Methods: In this study, a benchmark Abdominal Adipose Tissue CT Image Dataset (AATCT-IDS) containing 300 subjects is prepared and published. AATCT-IDS publishes 13,732 raw CT slices, and the researchers individually annotate the subcutaneous and visceral adipose tissue regions of 3,213 of those slices that have the same slice distance to validate denoising methods, train semantic segmentati…

    Submitted 16 August, 2023; originally announced August 2023.

    Comments: 17 pages, 7 figures

  37. arXiv:2308.06762  [pdf, other]

    eess.IV cs.CV

    Tissue Segmentation of Thick-Slice Fetal Brain MR Scans with Guidance from High-Quality Isotropic Volumes

    Authors: Shijie Huang, Xukun Zhang, Zhiming Cui, He Zhang, Geng Chen, Dinggang Shen

    Abstract: Accurate tissue segmentation of thick-slice fetal brain magnetic resonance (MR) scans is crucial for both reconstruction of isotropic brain MR volumes and the quantification of fetal brain development. However, this task is challenging due to the use of thick-slice scans in clinically-acquired fetal brain data. To address this issue, we propose to leverage high-quality isotropic fetal brain MR vol…

    Submitted 4 December, 2023; v1 submitted 13 August, 2023; originally announced August 2023.

    Comments: 10 pages, 9 figures, 5 tables, Fetal MRI, Brain tissue segmentation, Unsupervised domain adaptation, Cycle-consistency

  38. arXiv:2308.05862  [pdf, other]

    eess.IV cs.AI cs.CV

    Unleashing the Strengths of Unlabeled Data in Pan-cancer Abdominal Organ Quantification: the FLARE22 Challenge

    Authors: Jun Ma, Yao Zhang, Song Gu, Cheng Ge, Shihao Ma, Adamo Young, Cheng Zhu, Kangkang Meng, Xin Yang, Ziyan Huang, Fan Zhang, Wentao Liu, YuanKe Pan, Shoujin Huang, Jiacheng Wang, Mingze Sun, Weixin Xu, Dengqiang Jia, Jae Won Choi, Natália Alves, Bram de Wilde, Gregor Koehler, Yajun Wu, Manuel Wiesenfarth, Qiongjie Zhu, et al. (4 additional authors not shown)

    Abstract: Quantitative organ assessment is an essential step in automated abdominal disease diagnosis and treatment planning. Artificial intelligence (AI) has shown great potential to automatize this process. However, most existing AI algorithms rely on many expert annotations and lack a comprehensive evaluation of accuracy and efficiency in real-world multinational settings. To overcome these limitations,…

    Submitted 10 August, 2023; originally announced August 2023.

    Comments: MICCAI FLARE22: https://flare22.grand-challenge.org/

  39. arXiv:2307.10316  [pdf, other]

    cs.CV eess.IV

    CPCM: Contextual Point Cloud Modeling for Weakly-supervised Point Cloud Semantic Segmentation

    Authors: Lizhao Liu, Zhuangwei Zhuang, Shangxin Huang, Xunlong Xiao, Tianhang Xiang, Cen Chen, Jingdong Wang, Mingkui Tan

    Abstract: We study the task of weakly-supervised point cloud semantic segmentation with sparse annotations (e.g., less than 0.1% points are labeled), aiming to reduce the expensive cost of dense annotations. Unfortunately, with extremely sparse annotated points, it is very difficult to extract both contextual and object information for scene understanding such as semantic segmentation. Motivated by masked m…

    Submitted 19 July, 2023; originally announced July 2023.

    Comments: Accepted by ICCV 2023

  40. arXiv:2307.08239  [pdf, other]

    eess.AS

    Dynamic Kernel Convolution Network with Scene-dedicate Training for Sound Event Localization and Detection

    Authors: Siwei Huang, Jianfeng Chen, Jisheng Bai, Yafei Jia, Dongzhe Zhang

    Abstract: DNN-based methods have shown high performance in sound event localization and detection (SELD). However, in real spatial sound scenes, reverberation and the imbalanced presence of various sound events increase the complexity of the SELD task. In this paper, we propose an effective SELD system in real spatial scenes. In our approach, a dynamic kernel convolution module is introduced after the convolutio…

    Submitted 17 July, 2023; originally announced July 2023.

    Comments: 11 pages, 6 figures

  41. arXiv:2307.02015  [pdf, other]

    eess.IV

    Model-based T1, T2* and Proton Density Mapping Using a Bayesian Approach with Parameter Estimation and Complementary Undersampling Patterns

    Authors: Shuai Huang, James J. Lah, Jason W. Allen, Deqiang Qiu

    Abstract: Purpose: To achieve automatic hyperparameter estimation for the joint recovery of quantitative MR images, we propose a Bayesian formulation of the reconstruction problem that incorporates the signal model. Additionally, we investigate the use of complementary undersampling patterns to determine optimal undersampling schemes for quantitative MRI. Theory: We introduce a novel nonlinear approximate…

    Submitted 10 September, 2023; v1 submitted 5 July, 2023; originally announced July 2023.

  42. arXiv:2306.06669  [pdf, other]

    eess.IV cs.CV cs.LG

    TransMRSR: Transformer-based Self-Distilled Generative Prior for Brain MRI Super-Resolution

    Authors: Shan Huang, Xiaohong Liu, Tao Tan, Menghan Hu, Xiaoer Wei, Tingli Chen, Bin Sheng

    Abstract: Magnetic resonance images (MRI) acquired with low through-plane resolution compromise time and cost. The poor resolution in one orientation is insufficient to meet the requirement of high resolution for early diagnosis of brain disease and morphometric study. The common Single image super-resolution (SISR) solutions face two main challenges: (1) local detailed and global anatomical structural info…

    Submitted 11 June, 2023; originally announced June 2023.

    Comments: 2023 CGI

  43. arXiv:2306.04987  [pdf, other]

    eess.AS cs.SD

    Convolutional Recurrent Neural Network with Attention for 3D Speech Enhancement

    Authors: Han Yin, Jisheng Bai, Mou Wang, Siwei Huang, Yafei Jia, Jianfeng Chen

    Abstract: 3D speech enhancement can effectively improve the auditory experience and plays a crucial role in augmented reality technology. However, traditional convolutional-based speech enhancement methods have limitations in extracting dynamic voice information. In this paper, we incorporate a dual-path recurrent neural network block into the U-Net to iteratively extract dynamic audio information in both t…

    Submitted 19 November, 2023; v1 submitted 8 June, 2023; originally announced June 2023.

    Comments: Published in the IEEE International Conference on Signal Processing, Communications and Computing (ICSPCC 2023)

  44. arXiv:2306.02596  [pdf, other]

    eess.AS cs.CL cs.CV

    A Novel Interpretable and Generalizable Re-synchronization Model for Cued Speech based on a Multi-Cuer Corpus

    Authors: Lufei Gao, Shan Huang, Li Liu

    Abstract: Cued Speech (CS) is a multi-modal visual coding system combining lip reading with several hand cues at the phonetic level to make the spoken language visible to the hearing impaired. Previous studies solved asynchronous problems between lip and hand movements by a cuer-dependent piecewise linear model for English and French CS (the cuer is the person who performs Cued Speech). In t…

    Submitted 5 June, 2023; originally announced June 2023.

    Comments: 5 pages, 4 figures, Accepted to INTERSPEECH 2023

  45. Average AoI Minimization for Energy Harvesting Relay-aided Status Update Network Using Deep Reinforcement Learning

    Authors: Sin-Yu Huang, Kuang-Hao Liu

    Abstract: A dual-hop status update system aided by energy harvesting (EH) relays with finite data and energy buffers is studied in this work. To achieve timely status updates, the best relays should be selected to minimize the average age of information (AoI), which is a recently proposed metric to evaluate information freshness. The average AoI minimization can be formulated as a Markov decision process (M…

    Submitted 1 June, 2023; originally announced June 2023.

    Comments: This article has been accepted for publication in IEEE Wireless Communications Letters. Citation information: DOI 10.1109/LWC.2023.3278864

  46. Learning Music Sequence Representation from Text Supervision

    Authors: Tianyu Chen, Yuan Xie, Shuai Zhang, Shaohan Huang, Haoyi Zhou, Jianxin Li

    Abstract: Music representation learning is notoriously difficult for its complex human-related concepts contained in the sequence of numerical signals. To excavate better MUsic SEquence Representation from labeled audio, we propose a novel text-supervision pre-training method, namely MUSER. MUSER adopts an audio-spectrum-text tri-modal contrastive learning framework, where the text input could be any form o…

    Submitted 31 May, 2023; originally announced May 2023.

    Journal ref: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2022: 4583-4587

  47. arXiv:2305.16621  [pdf, other]

    cs.AI eess.SY

    A Reminder of its Brittleness: Language Reward Shaping May Hinder Learning for Instruction Following Agents

    Authors: Sukai Huang, Nir Lipovetzky, Trevor Cohn

    Abstract: Teaching agents to follow complex written instructions has been an important yet elusive goal. One technique for enhancing learning efficiency is language reward shaping (LRS). Within a reinforcement learning (RL) framework, LRS involves training a reward function that rewards behaviours precisely aligned with given language instructions. We argue that the apparent success of LRS is brittle, and p…

    Submitted 17 August, 2023; v1 submitted 26 May, 2023; originally announced May 2023.

  48. arXiv:2305.10631  [pdf]

    eess.IV cs.CV cs.LG

    An image segmentation algorithm based on multi-scale feature pyramid network

    Authors: Yu Xiao, Xin Yang, Sijuan Huang, Lihua Guo

    Abstract: Medical image segmentation is particularly critical as a prerequisite for relevant quantitative analysis in the treatment of clinical diseases. For example, in clinical cervical cancer radiotherapy, after acquiring subabdominal MRI images, a fast and accurate image segmentation of organs and tumors in MRI images can optimize the clinical radiotherapy process, whereas traditional approaches use man…

    Submitted 28 June, 2023; v1 submitted 17 May, 2023; originally announced May 2023.

  49. arXiv:2305.09821  [pdf, other]

    cs.IT eess.SP

    Single-Photon Counting Receivers for Optical Wireless Communications in Future 6G Networks

    Authors: Shenjie Huang, Danial Chitnis, Cheng Chen, Harald Haas, Mohammad-Ali Khalighi, Robert K. Henderson, Majid Safari

    Abstract: Optical wireless communication (OWC) offers several complementary advantages to radio-frequency wireless networks such as its massive available spectrum; hence, it is widely anticipated that OWC will assume a pivotal role in the forthcoming sixth generation wireless communication networks. Although significant progress has been achieved in OWC over the past decades, the outage induced by occasiona…

    Submitted 30 October, 2023; v1 submitted 16 May, 2023; originally announced May 2023.

  50. arXiv:2305.04047  [pdf, other]

    eess.IV cs.CV

    Degradation-Noise-Aware Deep Unfolding Transformer for Hyperspectral Image Denoising

    Authors: Haijin Zeng, Jiezhang Cao, Kai Feng, Shaoguang Huang, Hongyan Zhang, Hiep Luong, Wilfried Philips

    Abstract: Hyperspectral imaging (HI) has emerged as a powerful tool in diverse fields such as medical diagnosis, industrial inspection, and agriculture, owing to its ability to detect subtle differences in physical properties through high spectral resolution. However, hyperspectral images (HSIs) are often quite noisy because of narrow band spectral filtering. To reduce the noise in HSI data cubes, both mode…

    Submitted 6 May, 2023; originally announced May 2023.