Showing 1–50 of 58 results for author: Tang, M

Searching in archive eess.
  1. arXiv:2408.16905  [pdf, ps, other]

    eess.SY

    On Fixed-Time Stability for a Class of Singularly Perturbed Systems using Composite Lyapunov Functions

    Authors: Michael Tang, Miroslav Krstic, Jorge Poveda

    Abstract: Fixed-time stable dynamical systems are capable of achieving exact convergence to an equilibrium point within a fixed time that is independent of the initial conditions of the system. This property makes them highly appealing for designing control, estimation, and optimization algorithms in applications with stringent performance requirements. However, the set of tools available for analyzing the…

    Submitted 29 August, 2024; originally announced August 2024.

  2. arXiv:2408.03647  [pdf, other]

    eess.SY eess.SP

    Real-time Event Recognition of Long-distance Distributed Vibration Sensing with Knowledge Distillation and Hardware Acceleration

    Authors: Zhongyao Luo, Hao Wu, Zhao Ge, Ming Tang

    Abstract: Fiber-optic sensing, especially distributed optical fiber vibration (DVS) sensing, is gaining importance in internet of things (IoT) applications, such as industrial safety monitoring and intrusion detection. Despite their wide application, existing post-processing methods that rely on deep learning models for event recognition in DVS systems face challenges with real-time processing of large samp…

    Submitted 22 August, 2024; v1 submitted 7 August, 2024; originally announced August 2024.

    Comments: 9 pages, 10 figures

  3. arXiv:2407.06373  [pdf]

    eess.IV eess.SP

    Enhancing super-resolution ultrasound localisation through multi-frame deconvolution exploiting spatiotemporal coherence

    Authors: Su Yan, Clotilde Vié, Marcelo Lerendegui, Herman Verinaz-Jadan, Jipeng Yan, Martina Tashkova, James Burn, Bingxue Wang, Gary Frost, Kevin G. Murphy, Meng-Xing Tang

    Abstract: Super-resolution ultrasound imaging through microbubble (MB) localisation and tracking, also known as ultrasound localisation microscopy, allows non-invasive sub-diffraction resolution imaging of microvasculature in animals and humans. The number of MBs localised from the acquired contrast-enhanced ultrasound (CEUS) images and the localisation precision directly influence the quality of the result…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: 26 pages, 1 table, 7 figures

  4. arXiv:2407.05168  [pdf, other]

    eess.SY

    Deception in Nash Equilibrium Seeking

    Authors: Michael Tang, Umar Javed, Xudong Chen, Miroslav Krstic, Jorge I. Poveda

    Abstract: In socio-technical multi-agent systems, deception exploits privileged information to induce false beliefs in "victims," keeping them oblivious and leading to outcomes detrimental to them or advantageous to the deceiver. We consider model-free Nash-equilibrium-seeking for non-cooperative games with asymmetric information and introduce model-free deceptive algorithms with stability guarantees. In th…

    Submitted 6 July, 2024; originally announced July 2024.

  5. arXiv:2406.18009  [pdf, other]

    eess.AS cs.SD

    E2 TTS: Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS

    Authors: Sefik Emre Eskimez, Xiaofei Wang, Manthan Thakker, Canrun Li, Chung-Hsien Tsai, Zhen Xiao, Hemin Yang, Zirun Zhu, Min Tang, Xu Tan, Yanqing Liu, Sheng Zhao, Naoyuki Kanda

    Abstract: This paper introduces Embarrassingly Easy Text-to-Speech (E2 TTS), a fully non-autoregressive zero-shot text-to-speech system that offers human-level naturalness and state-of-the-art speaker similarity and intelligibility. In the E2 TTS framework, the text input is converted into a character sequence with filler tokens. The flow-matching-based mel spectrogram generator is then trained based on the…

    Submitted 25 June, 2024; originally announced June 2024.

  6. arXiv:2406.16317  [pdf]

    cs.SD eess.AS

    SNR-Progressive Model with Harmonic Compensation for Low-SNR Speech Enhancement

    Authors: Zhongshu Hou, Tong Lei, Qinwen Hu, Zhanzhong Cao, Ming Tang, Jing Lu

    Abstract: Despite significant progress made in the last decade, deep neural network (DNN) based speech enhancement (SE) still faces the challenge of notable degradation in the quality of recovered speech under low signal-to-noise ratio (SNR) conditions. In this letter, we propose an SNR-progressive speech enhancement model with harmonic compensation for low-SNR SE. Reliable pitch estimation is obtained from…

    Submitted 18 August, 2024; v1 submitted 24 June, 2024; originally announced June 2024.

  7. arXiv:2406.15885  [pdf, other]

    cs.SD cs.AI eess.AS

    The Music Maestro or The Musically Challenged, A Massive Music Evaluation Benchmark for Large Language Models

    Authors: Jiajia Li, Lu Yang, Mingni Tang, Cong Chen, Zuchao Li, Ping Wang, Hai Zhao

    Abstract: Benchmark plays a pivotal role in assessing the advancements of large language models (LLMs). While numerous benchmarks have been proposed to evaluate LLMs' capabilities, there is a notable absence of a dedicated benchmark for assessing their musical abilities. To address this gap, we present ZIQI-Eval, a comprehensive and large-scale music benchmark specifically designed to evaluate the music-rel…

    Submitted 22 June, 2024; originally announced June 2024.

    Comments: Accepted to ACL-Findings 2024

  8. arXiv:2406.05699  [pdf, ps, other]

    eess.AS cs.AI eess.SP

    An Investigation of Noise Robustness for Flow-Matching-Based Zero-Shot TTS

    Authors: Xiaofei Wang, Sefik Emre Eskimez, Manthan Thakker, Hemin Yang, Zirun Zhu, Min Tang, Yufei Xia, Jinzhu Li, Sheng Zhao, Jinyu Li, Naoyuki Kanda

    Abstract: Recently, zero-shot text-to-speech (TTS) systems, capable of synthesizing any speaker's voice from a short audio prompt, have made rapid advancements. However, the quality of the generated speech significantly deteriorates when the audio prompt contains noise, and limited research has been conducted to address this issue. In this paper, we explored various strategies to enhance the quality of audi…

    Submitted 9 June, 2024; originally announced June 2024.

    Comments: Accepted to INTERSPEECH2024

  9. arXiv:2406.04281  [pdf, other]

    eess.AS

    Total-Duration-Aware Duration Modeling for Text-to-Speech Systems

    Authors: Sefik Emre Eskimez, Xiaofei Wang, Manthan Thakker, Chung-Hsien Tsai, Canrun Li, Zhen Xiao, Hemin Yang, Zirun Zhu, Min Tang, Jinyu Li, Sheng Zhao, Naoyuki Kanda

    Abstract: Accurate control of the total duration of generated speech by adjusting the speech rate is crucial for various text-to-speech (TTS) applications. However, the impact of adjusting the speech rate on speech quality, such as intelligibility and speaker characteristics, has been underexplored. In this work, we propose a novel total-duration-aware (TDA) duration model for TTS, where phoneme durations a…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: Accepted to Interspeech 2024

  10. arXiv:2405.19685  [pdf]

    eess.IV

    Identifying Functional Brain Networks of Spatiotemporal Wide-Field Calcium Imaging Data via a Long Short-Term Memory Autoencoder

    Authors: Xiaohui Zhang, Eric C Landsness, Lindsey M Brier, Wei Chen, Michelle J. Tang, Hanyang Miao, Jin-Moo Lee, Mark A. Anastasio, Joseph P. Culver

    Abstract: Wide-field calcium imaging (WFCI) that records neural calcium dynamics allows for identification of functional brain networks (FBNs) in mice that express genetically encoded calcium indicators. Estimating FBNs from WFCI data is commonly achieved by use of seed-based correlation (SBC) analysis and independent component analysis (ICA). These two methods are conceptually distinct and each possesses l…

    Submitted 30 May, 2024; originally announced May 2024.

  11. arXiv:2405.04253  [pdf]

    eess.SP

    Fermat Number Transform Based Chromatic Dispersion Compensation and Adaptive Equalization Algorithm

    Authors: Siyu Chen, Zheli Liu, Weihao Li, Zihe Hu, Mingming Zhang, Sheng Cui, Ming Tang

    Abstract: By introducing the Fermat number transform into chromatic dispersion compensation and adaptive equalization, the computational complexity has been reduced by 68% compared with the conventional implementation. Experimental results validate its transmission performance with only 0.8 dB receiver sensitivity penalty in a 75 km-40 GBaud-PDM-16QAM system.

    Submitted 7 May, 2024; originally announced May 2024.

  12. arXiv:2403.19996  [pdf, other]

    cs.LG eess.SP

    DeepHeteroIoT: Deep Local and Global Learning over Heterogeneous IoT Sensor Data

    Authors: Muhammad Sakib Khan Inan, Kewen Liao, Haifeng Shen, Prem Prakash Jayaraman, Dimitrios Georgakopoulos, Ming Jian Tang

    Abstract: Internet of Things (IoT) sensor data or readings evince variations in timestamp range, sampling frequency, geographical location, unit of measurement, etc. Such presented sequence data heterogeneity makes it difficult for traditional time series classification algorithms to perform well. Therefore, addressing the heterogeneity challenge demands learning not only the sub-patterns (local features) b…

    Submitted 29 March, 2024; originally announced March 2024.

    Comments: Accepted for Publication and Presented in EAI MobiQuitous 2023 - 20th EAI International Conference on Mobile and Ubiquitous Systems: Computing, Networking and Services

  13. arXiv:2402.07383  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    Making Flow-Matching-Based Zero-Shot Text-to-Speech Laugh as You Like

    Authors: Naoyuki Kanda, Xiaofei Wang, Sefik Emre Eskimez, Manthan Thakker, Hemin Yang, Zirun Zhu, Min Tang, Canrun Li, Chung-Hsien Tsai, Zhen Xiao, Yufei Xia, Jinzhu Li, Yanqing Liu, Sheng Zhao, Michael Zeng

    Abstract: Laughter is one of the most expressive and natural aspects of human speech, conveying emotions, social cues, and humor. However, most text-to-speech (TTS) systems lack the ability to produce realistic and appropriate laughter sounds, limiting their applications and user experience. While there have been prior works to generate natural laughter, they fell short in terms of controlling the timing an…

    Submitted 4 March, 2024; v1 submitted 11 February, 2024; originally announced February 2024.

    Comments: See https://aka.ms/elate/ for demo samples, v2: subjective evaluation has been added

  14. arXiv:2401.08887  [pdf, ps, other]

    cs.SD cs.AI cs.CL eess.AS

    NOTSOFAR-1 Challenge: New Datasets, Baseline, and Tasks for Distant Meeting Transcription

    Authors: Alon Vinnikov, Amir Ivry, Aviv Hurvitz, Igor Abramovski, Sharon Koubi, Ilya Gurvich, Shai Pe'er, Xiong Xiao, Benjamin Martinez Elizalde, Naoyuki Kanda, Xiaofei Wang, Shalev Shaer, Stav Yagev, Yossi Asher, Sunit Sivasankaran, Yifan Gong, Min Tang, Huaming Wang, Eyal Krupka

    Abstract: We introduce the first Natural Office Talkers in Settings of Far-field Audio Recordings ("NOTSOFAR-1") Challenge alongside datasets and baseline system. The challenge focuses on distant speaker diarization and automatic speech recognition (DASR) in far-field meeting scenarios, with single-channel and known-geometry multi-channel tracks, and serves as a launch platform for two new datasets: First…

    Submitted 16 January, 2024; originally announced January 2024.

    Comments: preprint

  15. arXiv:2401.08098  [pdf]

    eess.IV q-bio.NC

    Attention-Based CNN-BiLSTM for Sleep State Classification of Spatiotemporal Wide-Field Calcium Imaging Data

    Authors: Xiaohui Zhang, Eric C. Landsness, Hanyang Miao, Wei Chen, Michelle Tang, Lindsey M. Brier, Joseph P. Culver, Jin-Moo Lee, Mark A. Anastasio

    Abstract: Background: Wide-field calcium imaging (WFCI) with genetically encoded calcium indicators allows for spatiotemporal recordings of neuronal activity in mice. When applied to the study of sleep, WFCI data are manually scored into the sleep states of wakefulness, non-REM (NREM) and REM by use of adjunct EEG and EMG recordings. However, this process is time-consuming, invasive and often suffers from l…

    Submitted 15 January, 2024; originally announced January 2024.

  16. arXiv:2312.10418  [pdf, other]

    cs.LG cs.NI eess.SP

    Fractional Deep Reinforcement Learning for Age-Minimal Mobile Edge Computing

    Authors: Lyudong Jin, Ming Tang, Meng Zhang, Hao Wang

    Abstract: Mobile edge computing (MEC) is a promising paradigm for real-time applications with intensive computational needs (e.g., autonomous driving), as it can reduce the processing delay. In this work, we focus on the timeliness of computational-intensive updates, measured by Age-of-Information (AoI), and study how to jointly optimize the task updating and offloading policies for AoI with fractional form.…

    Submitted 19 December, 2023; v1 submitted 16 December, 2023; originally announced December 2023.

  17. arXiv:2311.08823  [pdf, other]

    physics.med-ph eess.IV

    Ultrafast 3-D Super Resolution Ultrasound using Row-Column Array specific Coherence-based Beamforming and Rolling Acoustic Sub-aperture Processing: In Vitro, In Vivo and Clinical Study

    Authors: Joseph Hansen-Shearer, Jipeng Yan, Marcelo Lerendegui, Biao Huang, Matthieu Toulemonde, Kai Riemer, Qingyuan Tan, Johanna Tonko, Peter D. Weinberg, Chris Dunsby, Meng-Xing Tang

    Abstract: The row-column addressed array is an emerging probe for ultrafast 3-D ultrasound imaging. It achieves this with far fewer independent electronic channels and a wider field of view than traditional 2-D matrix arrays, of the same channel count, making it a good candidate for clinical translation. However, the image quality of row-column arrays is generally poor, particularly when investigating tissu…

    Submitted 15 November, 2023; originally announced November 2023.

  18. arXiv:2308.13575  [pdf]

    eess.SP cs.AI physics.optics

    FrFT based estimation of linear and nonlinear impairments using Vision Transformer

    Authors: Ting Jiang, Zheng Gao, Yizhao Chen, Zihe Hu, Ming Tang

    Abstract: To comprehensively assess optical fiber communication system conditions, it is essential to implement joint estimation of the following four critical impairments: nonlinear signal-to-noise ratio (SNRNL), optical signal-to-noise ratio (OSNR), chromatic dispersion (CD) and differential group delay (DGD). However, current studies only achieve identifying a limited number of impairments within a narro…

    Submitted 25 August, 2023; originally announced August 2023.

    Comments: 15 pages, 10 figures

  19. arXiv:2308.06873  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    SpeechX: Neural Codec Language Model as a Versatile Speech Transformer

    Authors: Xiaofei Wang, Manthan Thakker, Zhuo Chen, Naoyuki Kanda, Sefik Emre Eskimez, Sanyuan Chen, Min Tang, Shujie Liu, Jinyu Li, Takuya Yoshioka

    Abstract: Recent advancements in generative speech models based on audio-text prompts have enabled remarkable innovations like high-quality zero-shot text-to-speech. However, existing models still face limitations in handling diverse audio-text speech generation tasks involving transforming input speech and processing audio captured in adverse acoustic conditions. This paper introduces SpeechX, a versatile…

    Submitted 25 June, 2024; v1 submitted 13 August, 2023; originally announced August 2023.

    Comments: To appear in TASLP. See https://aka.ms/speechx for demo samples

  20. arXiv:2308.04683  [pdf, other]

    eess.SP eess.SY

    Real-time FPGA Implementation of CNN-based Distributed Fiber Optic Vibration Event Recognition Method

    Authors: Zhongyao Luo, Zhao Ge, Hao Wu, Ming Tang

    Abstract: Utilizing optical fibers to detect and pinpoint vibrations, Distributed Optical Fiber Vibration Sensing (DVS) technology provides real-time monitoring and surveillance of wide-reaching areas. This field has been leveraging Convolutional Neural Networks (CNN). Recently, a study has accomplished end-to-end vibration event recognition, enabling utilization of CNN-based DVS algorithms as real-time emb…

    Submitted 8 August, 2023; originally announced August 2023.

    Comments: 5 pages, 6 figures

  21. arXiv:2308.04013  [pdf, other]

    eess.SY cs.IT

    Distributed Target Tracking with Fading Channels over Underwater Wireless Sensor Networks

    Authors: Miaoyi Tang, Meiqin Liu, Senlin Zhang, Ronghao Zheng, Shanling Dong

    Abstract: This paper investigates the problem of distributed target tracking via underwater wireless sensor networks (UWSNs) with fading channels. The degradation of signal quality due to wireless channel fading can significantly impact network reliability and subsequently reduce the tracking accuracy. To address this issue, we propose a modified distributed unscented Kalman filter (DUKF) named DUKF-Fc, whi…

    Submitted 7 August, 2023; originally announced August 2023.

    Comments: 12 pages, 6 figures, 6 tables

  22. arXiv:2304.12783  [pdf, other]

    physics.med-ph cs.CE cs.CV eess.IV

    On the Use of Singular Value Decomposition as a Clutter Filter for Ultrasound Flow Imaging

    Authors: Kai Riemer, Marcelo Lerendegui, Matthieu Toulemonde, Jiaqi Zhu, Christopher Dunsby, Peter D. Weinberg, Meng-Xing Tang

    Abstract: Filtering based on Singular Value Decomposition (SVD) provides substantial separation of clutter, flow and noise in high frame rate ultrasound flow imaging. The use of SVD as a clutter filter has greatly improved techniques such as vector flow imaging, functional ultrasound and super-resolution ultrasound localization microscopy. The removal of clutter and noise relies on the assumption that tissu…

    Submitted 25 April, 2023; originally announced April 2023.

    Comments: 10 pages, 7 figures

  23. arXiv:2304.00819  [pdf, other]

    eess.IV

    Acceleration-Based Kalman Tracking for Super-Resolution Ultrasound Imaging in vivo

    Authors: Biao Huang, Jipeng Yan, Megan Morris, Victoria Sinnett, Navita Somaiah, Meng-Xing Tang

    Abstract: Super-resolution ultrasound can image microvascular structure and flow at sub-wave-diffraction resolution based on localising and tracking microbubbles. Currently, tracking microbubbles accurately under limited imaging frame rates and high microbubble concentrations remains a challenge, especially under the effect of cardiac pulsatility and in highly curved vessels. In this study, an acceleration-…

    Submitted 3 April, 2023; originally announced April 2023.

    Comments: 15 pages, 10 figures

  24. arXiv:2303.14003  [pdf]

    eess.IV eess.SP

    Transthoracic super-resolution ultrasound localisation microscopy of myocardial vasculature in patients

    Authors: Jipeng Yan, Biao Huang, Johanna Tonko, Matthieu Toulemonde, Joseph Hansen-Shearer, Qingyuan Tan, Kai Riemer, Konstantinos Ntagiantas, Rasheda A Chowdhury, Pier Lambiase, Roxy Senior, Meng-Xing Tang

    Abstract: Micro-vascular flow in the myocardium is of significant importance clinically but remains poorly understood. Up to 25% of patients with symptoms of coronary heart diseases have no obstructive coronary arteries and have suspected microvascular diseases. However, such microvasculature is difficult to image in vivo with existing modalities due to the lack of resolution and sensitivity. Here, we demon…

    Submitted 28 March, 2023; v1 submitted 24 March, 2023; originally announced March 2023.

    Comments: 22 pages, 10 figures

  25. arXiv:2303.11510  [pdf, other]

    cs.SD eess.AS

    ICASSP 2023 Deep Noise Suppression Challenge

    Authors: Harishchandra Dubey, Ashkan Aazami, Vishak Gopal, Babak Naderi, Sebastian Braun, Ross Cutler, Alex Ju, Mehdi Zohourian, Min Tang, Hannes Gamper, Mehrsa Golestaneh, Robert Aichner

    Abstract: Deep Speech Enhancement Challenge is the 5th edition of deep noise suppression (DNS) challenges organized at ICASSP 2023 Signal Processing Grand Challenges. DNS challenges were organized during 2019-2023 to stimulate research in deep speech enhancement (DSE). Previous DNS challenges were organized at INTERSPEECH 2020, ICASSP 2021, INTERSPEECH 2021, and ICASSP 2022. From prior editions, we learnt t…

    Submitted 8 May, 2023; v1 submitted 20 March, 2023; originally announced March 2023.

    Comments: 6 pages, 1 figure. arXiv admin note: text overlap with arXiv:2202.13288

  26. arXiv:2303.07005  [pdf, other]

    eess.AS

    Real-Time Audio-Visual End-to-End Speech Enhancement

    Authors: Zirun Zhu, Hemin Yang, Min Tang, Ziyi Yang, Sefik Emre Eskimez, Huaming Wang

    Abstract: Audio-visual speech enhancement (AV-SE) methods utilize auxiliary visual cues to enhance speakers' voices. Therefore, technically they should be able to outperform the audio-only speech enhancement (SE) methods. However, there are few works in the literature on an AV-SE system that can work in real time on a CPU. In this paper, we propose a low-latency real-time audio-visual end-to-end enhancement…

    Submitted 13 March, 2023; originally announced March 2023.

    Comments: Accepted by ICASSP 2023

  27. arXiv:2211.09988  [pdf, ps, other]

    eess.AS cs.SD

    Exploring WavLM on Speech Enhancement

    Authors: Hyungchan Song, Sanyuan Chen, Zhuo Chen, Yu Wu, Takuya Yoshioka, Min Tang, Jong Won Shin, Shujie Liu

    Abstract: There is a surge in interest in self-supervised learning approaches for end-to-end speech encoding in recent years as they have achieved great success. Especially, WavLM showed state-of-the-art performance on various speech processing tasks. To better understand the efficacy of self-supervised learning models for speech enhancement, in this work, we design and conduct a series of experiments with…

    Submitted 17 November, 2022; originally announced November 2022.

    Comments: Accepted by IEEE SLT 2022

  28. arXiv:2211.02773  [pdf, other]

    eess.AS cs.SD

    Real-Time Joint Personalized Speech Enhancement and Acoustic Echo Cancellation

    Authors: Sefik Emre Eskimez, Takuya Yoshioka, Alex Ju, Min Tang, Tanel Parnamaa, Huaming Wang

    Abstract: Personalized speech enhancement (PSE) is a real-time SE approach utilizing a speaker embedding of a target person to remove background noise, reverberation, and interfering voices. To deploy a PSE model for full duplex communications, the model must be combined with acoustic echo cancellation (AEC), although such a combination has been less explored. This paper proposes a series of methods that ar…

    Submitted 25 May, 2023; v1 submitted 4 November, 2022; originally announced November 2022.

    Comments: Accepted to Interspeech 2023

  29. arXiv:2211.00754  [pdf, other]

    eess.IV

    BUbble Flow Field: a Simulation Framework for Evaluating Ultrasound Localization Microscopy Algorithms

    Authors: Marcelo Lerendegui, Kai Riemer, Bingxue Wang, Christopher Dunsby, Meng-Xing Tang

    Abstract: Ultrasound contrast enhanced imaging has seen widespread uptake in research and clinical diagnostic imaging. This includes applications such as vector flow imaging, functional ultrasound and super-resolution Ultrasound Localization Microscopy (ULM). All of these require testing and validation during development of new algorithms with ground truth data. In this work we present a comprehensive simul…

    Submitted 1 November, 2022; originally announced November 2022.

    Comments: 10 Pages, 9 Figures

  30. arXiv:2210.00801  [pdf, other]

    eess.SY

    Design of the PID temperature controller for an alkaline electrolysis system with time delays

    Authors: Ruomei Qi, Jiarong Li, Jin Lin, Yonghua Song, Jiepeng Wang, Qiangqiang Cui, Yiwei Qiu, Ming Tang, Jian Wang

    Abstract: Electrolysis systems use proportional-integral-derivative (PID) temperature controllers to maintain stack temperatures around set points. However, heat transfer delays in electrolysis systems cause manual tuning of PID temperature controllers to be time-consuming, and temperature oscillations often occur. This paper focuses on the design of the PID temperature controller for an alkaline electrolys…

    Submitted 3 October, 2022; originally announced October 2022.

  31. arXiv:2209.10382  [pdf, other]

    cs.IT cs.LG eess.SP

    Robust Information Bottleneck for Task-Oriented Communication with Digital Modulation

    Authors: Songjie Xie, Shuai Ma, Ming Ding, Yuanming Shi, Mingjian Tang, Youlong Wu

    Abstract: Task-oriented communications, mostly using learning-based joint source-channel coding (JSCC), aim to design a communication-efficient edge inference system by transmitting task-relevant information to the receiver. However, only transmitting task-relevant information without introducing any redundancy may cause robustness issues in learning due to the channel variations, and the JSCC which directl…

    Submitted 9 May, 2023; v1 submitted 21 September, 2022; originally announced September 2022.

  32. arXiv:2208.12176  [pdf, other]

    eess.IV eess.SP physics.med-ph

    3D Super-Resolution Ultrasound with Adaptive Weight-Based Beamforming

    Authors: Jipeng Yan, Bingxue Wang, Kai Riemer, Joseph Hansen-Shearer, Marcelo Lerendegui, Matthieu Toulemonde, Christopher J Rowlands, Peter D. Weinberg, Meng-Xing Tang

    Abstract: Super-resolution ultrasound (SRUS) imaging through localising and tracking sparse microbubbles has been shown to reveal microvascular structure and flow beyond the wave diffraction limit. Most SRUS studies use standard delay and sum (DAS) beamforming, where large main lobe and significant side lobes make separation and localisation of densely distributed bubbles challenging, particularly in 3D due…

    Submitted 25 August, 2022; originally announced August 2022.

    Comments: Ultrasound localisation microscopy (ULM), super-resolution, contrast-enhanced ultrasound, 3D beamforming

  33. Towards Better Dermoscopic Image Feature Representation Learning for Melanoma Classification

    Authors: ChengHui Yu, MingKang Tang, ShengGe Yang, MingQing Wang, Zhe Xu, JiangPeng Yan, HanMo Chen, Yu Yang, Xiao-Jun Zeng, Xiu Li

    Abstract: Deep learning-based melanoma classification with dermoscopic images has recently shown great potential in automatic early-stage melanoma diagnosis. However, limited by the significant data imbalance and obvious extraneous artifacts, i.e., the hair and ruler markings, discriminative feature extraction from dermoscopic images is very challenging. In this study, we seek to resolve these problems resp…

    Submitted 15 July, 2022; originally announced July 2022.

    Comments: ICONIP 2021 conference

  34. arXiv:2206.03912  [pdf, other]

    eess.SP

    Volumetric Image Projection Super-Resolution Ultrasound (VIP-SR) with a 1D Unfocused Linear Array

    Authors: B. Wang, K. Riemer, M. Toulemonde, J. Yan, X. Zhou, M. Tang

    Abstract: Super-Resolution Ultrasound (SRUS) through localizing spatially isolated microbubbles has been demonstrated to overcome the wave diffraction limit and reveal the microvascular structure and flow information at the microscopic scale. However, 3D SRUS imaging remains a challenge due to the fabrication and computational complexity of 2D matrix array probes and connections. Inspired by X-ray radiograp…

    Submitted 8 June, 2022; originally announced June 2022.

    Comments: 19 pages, 9 figures

  35. arXiv:2203.09461  [pdf]

    eess.SP physics.optics

    Beyond the Limitation of Pulse Width in Optical Time-domain Reflectometry

    Authors: Hao Wu, Ming Tang

    Abstract: Optical time-domain reflectometry (OTDR) is the basis for distributed time-domain optical fiber sensing techniques. By injecting pulse light into an optical fiber, the distance information of an event can be obtained based on the time of light flight. The minimum distinguishable event separation along the fiber length is called the spatial resolution, which is determined by the optical pulse width…

    Submitted 13 March, 2022; originally announced March 2022.

  36. Fast and selective super-resolution ultrasound in vivo with sono-switchable nanodroplets

    Authors: Kai Riemer, Matthieu Toulemonde, Jipeng Yan, Marcelo Lerendegui, Eleanor Stride, Peter D. Weinberg, Christopher Dunsby, Meng-Xing Tang

    Abstract: Perfusion by the microcirculation is key to the development, maintenance and pathology of tissue. Its measurement with high spatiotemporal resolution is consequently valuable but remains a challenge in deep tissue. Ultrasound Localization Microscopy (ULM) provides very high spatiotemporal resolution but the use of microbubbles requires low contrast agent concentrations, a long acquisition time, an…

    Submitted 8 March, 2022; originally announced March 2022.

    Comments: phase-change contrast agent, low-boiling point nanodroplet, acoustic vaporization, droplet activation, microcirculation, contrast enhanced ultrasound, plane wave

  37. arXiv:2202.13422  [pdf, other]

    eess.SY

    Thermal Modelling and Controller Design of an Alkaline Electrolysis System under Dynamic Operating Conditions

    Authors: Ruomei Qi, Jiarong Li, Jin Lin, Yonghua Song, Jiepeng Wang, Qiangqiang Cui, Yiwei Qiu, Ming Tang, Jian Wang

    Abstract: Thermal management is vital for the efficient and safe operation of alkaline electrolysis systems. Traditional alkaline electrolysis systems use simple proportional-integral-differentiation (PID) controllers to maintain the stack temperature near the rated value. However, in renewable-to-hydrogen scenarios, the stack temperature is disturbed by load fluctuations, and the temperature overshoot phen…

    Submitted 27 February, 2022; originally announced February 2022.

  38. arXiv:2110.05745  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    VarArray: Array-Geometry-Agnostic Continuous Speech Separation

    Authors: Takuya Yoshioka, Xiaofei Wang, Dongmei Wang, Min Tang, Zirun Zhu, Zhuo Chen, Naoyuki Kanda

    Abstract: Continuous speech separation using a microphone array was shown to be promising in dealing with the speech overlap problem in natural conversation transcription. This paper proposes VarArray, an array-geometry-agnostic speech separation neural network model. The proposed model is applicable to any number of microphones without retraining while leveraging the nonlinear correlation between the input…

    Submitted 26 October, 2021; v1 submitted 12 October, 2021; originally announced October 2021.

    Comments: 5 pages, 1 figure, 3 tables, submitted to ICASSP 2022; updated reference information of [33]

  39. arXiv:2110.03345  [pdf, other]

    physics.med-ph eess.SP physics.app-ph physics.comp-ph

    Stride: a flexible platform for high-performance ultrasound computed tomography

    Authors: Carlos Cueto, Oscar Bates, George Strong, Javier Cudeiro, Fabio Luporini, Oscar Calderon Agudo, Gerard Gorman, Lluis Guasch, Meng-Xing Tang

    Abstract: Advanced ultrasound computed tomography techniques like full-waveform inversion are mathematically challenging and orders of magnitude more computationally expensive than conventional ultrasound imaging methods. This computational and algorithmic complexity, and a lack of open-source libraries in this field, represent a barrier preventing the generalised adoption of these techniques, slowing the p…

    Submitted 18 May, 2022; v1 submitted 7 October, 2021; originally announced October 2021.

    Journal ref: Computer Methods and Programs in Biomedicine, 221, 2022

  40. arXiv:2109.10349  [pdf]

    eess.SP physics.optics

    Enabling variable high spatial resolution retrieval from a long pulse BOTDA sensor

    Authors: Zhao Ge, Li Shen, Can Zhao, Hao Wu, Zhiyong Zhao, Ming Tang

    Abstract: In the field of Internet of Things, there is an urgent need for sensors with large-scale sensing capability for scenarios such as intelligent monitoring of production lines and urban infrastructure. Brillouin optical time domain analysis (BOTDA) sensors, which can monitor thousands of continuous points simultaneously, show great advantages in these applications. We propose a convolutional neural n…

    Submitted 8 September, 2021; originally announced September 2021.

    Comments: 7 pages, 6 figures

    MSC Class: 78A15; ACM Class: I.2.1

  41. arXiv:2108.05096  [pdf]

    physics.optics eess.IV

    Omnidirectional ghost imaging system and unwrapping-free panoramic ghost imaging

    Authors: Huan Cui, Jie Cao, Qun Hao, Dong Zhou, Mingyuan Tang, Kaiyu Zhang, Yingqiang Zhang

    Abstract: Ghost imaging (GI) is a novel imaging method, which can reconstruct the object information by the light intensity correlation measurements. However, at present, the field of view (FOV) is limited to the illuminating range of the light patterns. To enlarge FOV of GI efficiently, here we proposed the omnidirectional ghost imaging system (OGIS), which can achieve a 360° omnidirectional FOV at one sho…

    Submitted 11 August, 2021; originally announced August 2021.

  42. arXiv:2106.02896  [pdf, other]

    eess.AS

    Human Listening and Live Captioning: Multi-Task Training for Speech Enhancement

    Authors: Sefik Emre Eskimez, Xiaofei Wang, Min Tang, Hemin Yang, Zirun Zhu, Zhuo Chen, Huaming Wang, Takuya Yoshioka

    Abstract: With the surge of online meetings, it has become more critical than ever to provide high-quality speech audio and live captioning under various noise conditions. However, most monaural speech enhancement (SE) models introduce processing artifacts and thus degrade the performance of downstream tasks, including automatic speech recognition (ASR). This paper proposes a multi-task training framework t…

    Submitted 5 June, 2021; originally announced June 2021.

    Comments: Accepted to INTERSPEECH2021

  43. Spatial response identification enables robust experimental ultrasound computed tomography

    Authors: Carlos Cueto, Lluis Guasch, Javier Cudeiro, Oscar Calderon Agudo, Oscar Bates, George Strong, Meng-Xing Tang

    Abstract: Ultrasound computed tomography techniques have the potential to provide clinicians with 3D, quantitative and high-resolution information of both soft and hard tissues such as the breast or the adult human brain. Their practical application requires accurate modelling of the acquisition setup: the spatial location, orientation, and impulse response of each ultrasound transducer. However, existing c…

    Submitted 5 April, 2021; v1 submitted 19 March, 2021; originally announced March 2021.

    Journal ref: IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control, 69 (1) 27-37, 2022

  44. arXiv:2102.07746  [pdf, other]

    eess.SP

    High contrast Ultrafast 3D Ultrasound Imaging using Row Column specific Frame Multiply and Sum

    Authors: Joseph Hansen-Shearer, Marcelo Lerendegui, Matthieu Toulemonde, Meng-Xing Tang

    Abstract: Row-column arrays have been shown to be able to generate 3-D ultrafast ultrasound images with an order of magnitude fewer independent electronic channels than classic 2D matrix arrays. Unfortunately, row-column array images suffer from major imaging artefacts due to the high side lobes. This paper proposes a row-column specific beamforming technique that exploits the incoherent nature of certain row colu…

    Submitted 15 February, 2021; originally announced February 2021.

  45. arXiv:2102.04799  [pdf, other]

    eess.IV

    Multi-scale GCN-assisted two-stage network for joint segmentation of retinal layers and disc in peripapillary OCT images

    Authors: Jiaxuan Li, Peiyao Jin, Jianfeng Zhu, Haidong Zou, Xun Xu, Min Tang, Minwen Zhou, Yu Gan, Jiangnan He, Yuye Ling, Yikai Su

    Abstract: An accurate and automated tissue segmentation algorithm for retinal optical coherence tomography (OCT) images is crucial for the diagnosis of glaucoma. However, due to the presence of the optic disc, the anatomical structure of the peripapillary region of the retina is complicated and is challenging for segmentation. To address this issue, we developed a novel graph convolutional network (GCN)-ass…

    Submitted 9 February, 2021; originally announced February 2021.

  46. arXiv:2011.05122  [pdf]

    eess.IV physics.optics

    Scannerless non-line-of-sight three dimensional imaging with a 32x32 SPAD array

    Authors: Chenfei Jin, Meng Tang, Legeng Jia, Xiaorui Tian, Jie Yang, Kai Qiao, Siqi Zhang

    Abstract: We develop a scannerless non-line-of-sight three dimensional imaging system based on a commercial 32x32 SPAD camera combined with a 70 ps pulsed laser. In our experiment, 1024 time histograms can be achieved synchronously in 3s with an average time resolution of about 165 ps. The result with filtered back projection shows a discernable reconstruction while the result using virtual wave field demon…

    Submitted 10 November, 2020; originally announced November 2020.

    Comments: 10 pages, 8 figures

  47. arXiv:2009.08804  [pdf]

    eess.SP physics.optics

    Improving the spatial resolution of a BOTDA sensor using deconvolution algorithm

    Authors: Li Shen, Zhiyong Zhao, Can Zhao, Hao Wu, Chao Lu, Ming Tang

    Abstract: Spatial resolution improvement from an acquired measurement using long pulse is developed for Brillouin optical time domain analysis (BOTDA) systems based on the total variation deconvolution algorithm. The frequency dependency of Brillouin gain temporal envelope is investigated by simulation, and its impact on the recovered results of deconvolution algorithm is thoroughly analyzed. To implement a…

    Submitted 15 September, 2020; originally announced September 2020.

  48. arXiv:2005.07796  [pdf, other]

    cs.CV cs.LG eess.IV

    FuSSI-Net: Fusion of Spatio-temporal Skeletons for Intention Prediction Network

    Authors: Francesco Piccoli, Rajarathnam Balakrishnan, Maria Jesus Perez, Moraldeepsingh Sachdeo, Carlos Nunez, Matthew Tang, Kajsa Andreasson, Kalle Bjurek, Ria Dass Raj, Ebba Davidsson, Colin Eriksson, Victor Hagman, Jonas Sjoberg, Ying Li, L. Srikar Muppirisetty, Sohini Roychowdhury

    Abstract: Pedestrian intention recognition is very important to develop robust and safe autonomous driving (AD) and advanced driver assistance systems (ADAS) functionalities for urban driving. In this work, we develop an end-to-end pedestrian intention framework that performs well on day- and night-time scenarios. Our framework relies on object detection bounding boxes combined with skeletal features of…

    Submitted 15 May, 2020; originally announced May 2020.

    Comments: 5 pages, 6 figures, 5 tables, IEEE Asilomar SSC

  49. arXiv:2004.04362  [pdf, other]

    cs.LG eess.SP physics.soc-ph q-bio.NC stat.ML

    Detecting Dynamic Community Structure in Functional Brain Networks Across Individuals: A Multilayer Approach

    Authors: Chee-Ming Ting, S. Balqis Samdin, Meini Tang, Hernando Ombao

    Abstract: We present a unified statistical framework for characterizing community structure of brain functional networks that captures variation across individuals and evolution over time. Existing methods for community detection focus only on single-subject analysis of dynamic networks; while recent extensions to multiple-subjects analysis are limited to static networks. To overcome these limitations, we p…

    Submitted 16 October, 2020; v1 submitted 9 April, 2020; originally announced April 2020.

    Comments: Main paper: 12 pages, 13 figures. Supplemental file: 16 pages. Accepted for IEEE Trans Medical Imaging

    Journal ref: IEEE Trans Medical Imaging, vol. 40, no. 2 (2021) 468 - 480

  50. arXiv:2001.03030  [pdf]

    eess.SP

    Distributed Brillouin frequency shift extraction via a convolutional neural network

    Authors: Yiqing Chang, Hao Wu, Can Zhao, Li Shen, Songnian Fu, Ming Tang

    Abstract: Distributed optical fiber Brillouin sensors detect the temperature and strain along a fiber according to the local Brillouin frequency shift, which is usually calculated by the measured Brillouin spectrum using Lorentzian curve fitting. In addition, cross-correlation, principal component analysis, and machine learning methods have been proposed for the more efficient extraction of Brillouin freque…

    Submitted 9 January, 2020; originally announced January 2020.