Showing 1–50 of 51 results for author: Hu, P

Searching in archive eess.
  1. arXiv:2408.07644  [pdf, other]

    cs.RO cs.LG cs.MA eess.SY

    SigmaRL: A Sample-Efficient and Generalizable Multi-Agent Reinforcement Learning Framework for Motion Planning

    Authors: Jianye Xu, Pan Hu, Bassam Alrifaee

    Abstract: This paper introduces an open-source, decentralized framework named SigmaRL, designed to enhance both sample efficiency and generalization of multi-agent Reinforcement Learning (RL) for motion planning of connected and automated vehicles. Most RL agents exhibit a limited capacity to generalize, often focusing narrowly on specific scenarios, and are usually evaluated in similar or even the same sce…

    Submitted 14 August, 2024; originally announced August 2024.

    Comments: 8 pages, 5 figures, accepted for presentation at the IEEE International Conference on Intelligent Transportation Systems (ITSC) 2024

  2. arXiv:2408.04737  [pdf, other]

    cs.SD cs.LG eess.AS

    Quantifying the Corpus Bias Problem in Automatic Music Transcription Systems

    Authors: Lukáš Samuel Marták, Patricia Hu, Gerhard Widmer

    Abstract: Automatic Music Transcription (AMT) is the task of recognizing notes in audio recordings of music. The State-of-the-Art (SotA) benchmarks have been dominated by deep learning systems. Due to the scarcity of high quality data, they are usually trained and evaluated exclusively or predominantly on classical piano music. Unfortunately, that hinders our ability to understand how they generalize to oth…

    Submitted 8 August, 2024; originally announced August 2024.

    Comments: 2 pages, 1 figure, presented in the 1st International Workshop on Sound Signal Processing Applications (IWSSPA) 2024

  3. arXiv:2408.03124  [pdf, other]

    eess.SY cs.LG

    Closed-loop Diffusion Control of Complex Physical Systems

    Authors: Long Wei, Haodong Feng, Peiyan Hu, Tao Zhang, Yuchen Yang, Xiang Zheng, Ruiqi Feng, Dixia Fan, Tailin Wu

    Abstract: The control problems of complex physical systems have wide applications in science and engineering. Several previous works have demonstrated that generative control methods based on diffusion models have significant advantages for solving these problems. However, existing generative control methods face challenges in handling closed-loop control, which is an inherent constraint for effective contr…

    Submitted 31 July, 2024; originally announced August 2024.

  4. arXiv:2407.11620  [pdf]

    eess.SP

    A Deep Learning-Based Target Radial Length Estimation Method through HRRP Sequence

    Authors: Lingfeng Chen, Panhe Hu, Zhiliang Pan, Xiao Sun, Zehao Wang

    Abstract: This paper introduces an innovative deep learning-based method for end-to-end target radial length estimation from HRRP (High Resolution Range Profile) sequences. Firstly, the HRRP sequences are normalized and transformed into GAF (Gram Angular Field) images to effectively capture and utilize the temporal information. Subsequently, these GAF images serve as the input for a pretrained ResNet-101 mo…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: 2 pages, 2 figures. Accepted by APCAP 2024

  5. arXiv:2407.08236  [pdf, other]

    eess.SP

    HRRPGraphNet: A Graph Neural Network Based Approach for HRRP Radar Target Recognition

    Authors: Lingfeng Chen, Panhe Hu, Zhiliang Pan, Xiao Sun, Zehao Wang

    Abstract: High Resolution Range Profiles (HRRP) have become a key area of focus in the domain of Radar Automatic Target Recognition (RATR). Despite the success of data-driven neural network-based HRRP recognition, challenges such as insufficient training samples persist in its real-world application. This letter introduces HRRPGraphNet, a novel Graph Neural Network (GNN) model designed specifically for HRRP…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: 5 pages, 4 figures

  6. arXiv:2406.15160  [pdf, other]

    eess.AS eess.SP

    Exploring Audio-Visual Information Fusion for Sound Event Localization and Detection In Low-Resource Realistic Scenarios

    Authors: Ya Jiang, Qing Wang, Jun Du, Maocheng Hu, Pengfei Hu, Zeyan Liu, Shi Cheng, Zhaoxu Nian, Yuxuan Dong, Mingqi Cai, Xin Fang, Chin-Hui Lee

    Abstract: This study presents an audio-visual information fusion approach to sound event localization and detection (SELD) in low-resource scenarios. We aim at utilizing audio and video modality information through cross-modal learning and multi-modal fusion. First, we propose a cross-modal teacher-student learning (TSL) framework to transfer information from an audio-only teacher model, trained on a rich c…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: Accepted by ICME 2024

  7. arXiv:2406.08454  [pdf, other]

    cs.SD eess.AS

    Towards Musically Informed Evaluation of Piano Transcription Models

    Authors: Patricia Hu, Lukáš Samuel Marták, Carlos Cancino-Chacón, Gerhard Widmer

    Abstract: Automatic piano transcription models are typically evaluated using simple frame- or note-wise information retrieval (IR) metrics. Such benchmark metrics do not provide insights into the transcription quality of specific musical aspects such as articulation, dynamics, or rhythmic precision of the output, which are essential in the context of expressive performance analysis. Furthermore, in recent y…

    Submitted 29 July, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

    Comments: Accepted at the 25th International Society for Music Information Retrieval Conference (ISMIR 2024)

  8. arXiv:2405.15438  [pdf, other]

    cs.CV cs.LG eess.IV

    Comparing remote sensing-based forest biomass mapping approaches using new forest inventory plots in contrasting forests in northeastern and southwestern China

    Authors: Wenquan Dong, Edward T. A. Mitchard, Yuwei Chen, Man Chen, Congfeng Cao, Peilun Hu, Cong Xu, Steven Hancock

    Abstract: Large-scale high spatial resolution aboveground biomass (AGB) maps play a crucial role in determining forest carbon stocks and how they are changing, which is instrumental in understanding the global carbon cycle, and implementing policy to mitigate climate change. The advent of the new space-borne LiDAR sensor, NASA's GEDI instrument, provides unparalleled possibilities for the accurate and unbia…

    Submitted 24 May, 2024; originally announced May 2024.

  9. arXiv:2312.04795  [pdf, other]

    eess.SP

    Latency versus Transmission Power Trade-off in Free-Space Optical (FSO) Satellite Networks with Multiple Inter-Continental Connections

    Authors: Jintao Liang, Aizaz Chaudhry, John Chinneck, Halim Yanikomeroglu, Gunes Kurt, Peng Hu, Khaled Ahmed, Stephane Martel

    Abstract: In free-space optical satellite networks (FSOSNs), satellites connected via laser inter-satellite links (LISLs), latency is a critical factor, especially for long-distance inter-continental connections. Since satellites depend on solar panels for power supply, power consumption is also a vital factor. We investigate the minimization of total network latency (i.e., the sum of the network latencies…

    Submitted 7 December, 2023; originally announced December 2023.

    Comments: Accepted for publication in IEEE Open Journal of the Communications Society

  10. arXiv:2312.04788  [pdf, other]

    eess.SP

    Free-Space Optical (FSO) Satellite Networks Performance Analysis: Transmission Power, Latency, and Outage Probability

    Authors: Jintao Liang, Aizaz U. Chaudhry, Eylem Erdogan, Halim Yanikomeroglu, Gunes Karabulut Kurt, Peng Hu, Khaled Ahmed, Stephane Martel

    Abstract: In free-space optical satellite networks (FSOSNs), satellites can have different laser inter-satellite link (LISL) ranges for connectivity. Greater LISL ranges can reduce network latency of the path but can also result in an increase in transmission power for satellites on the path. Consequently, this tradeoff between satellite transmission power and network latency should be investigated, and in…

    Submitted 7 December, 2023; originally announced December 2023.

    Comments: Accepted for publication in IEEE Open Journal of Vehicular Technology

  11. Decoupling and Interacting Multi-Task Learning Network for Joint Speech and Accent Recognition

    Authors: Qijie Shao, Pengcheng Guo, Jinghao Yan, Pengfei Hu, Lei Xie

    Abstract: Accents, as variations from standard pronunciation, pose significant challenges for speech recognition systems. Although joint automatic speech recognition (ASR) and accent recognition (AR) training has been proven effective in handling multi-accent scenarios, current multi-task ASR-AR approaches overlook the granularity differences between tasks. Fine-grained units capture pronunciation-related a…

    Submitted 17 November, 2023; v1 submitted 12 November, 2023; originally announced November 2023.

    Comments: Accepted by IEEE Transactions on Audio, Speech and Language Processing (TASLP)

  12. arXiv:2309.07925  [pdf, other]

    eess.AS cs.AI cs.MM cs.SD

    Hierarchical Audio-Visual Information Fusion with Multi-label Joint Decoding for MER 2023

    Authors: Haotian Wang, Yuxuan Xi, Hang Chen, Jun Du, Yan Song, Qing Wang, Hengshun Zhou, Chenxi Wang, Jiefeng Ma, Pengfei Hu, Ya Jiang, Shi Cheng, Jie Zhang, Yuzhe Weng

    Abstract: In this paper, we propose a novel framework for recognizing both discrete and dimensional emotions. In our framework, deep features extracted from foundation models are used as robust acoustic and visual representations of raw video. Three different structures based on attention-guided feature gathering (AFG) are designed for deep feature fusion. Then, we introduce a joint decoding structure for e…

    Submitted 10 September, 2023; originally announced September 2023.

    Comments: 5 pages, 4 figures

    Journal ref: The 31st ACM International Conference on Multimedia (MM'23), 2023

  13. arXiv:2309.02399  [pdf, other]

    cs.SD cs.DL eess.AS

    The Batik-plays-Mozart Corpus: Linking Performance to Score to Musicological Annotations

    Authors: Patricia Hu, Gerhard Widmer

    Abstract: We present the Batik-plays-Mozart Corpus, a piano performance dataset combining professional Mozart piano sonata performances with expert-labelled scores at a note-precise level. The performances originate from a recording by Viennese pianist Roland Batik on a computer-monitored Bösendorfer grand piano, and are available both as MIDI files and audio recordings. They have been precisely aligned, no…

    Submitted 6 September, 2023; v1 submitted 5 September, 2023; originally announced September 2023.

    Comments: To be published in the Proceedings of the 24th International Society for Music Information Retrieval Conference (ISMIR 2023), Milan, Italy

  14. arXiv:2306.14471  [pdf]

    physics.med-ph eess.IV physics.ins-det physics.optics

    Single-shot 3D photoacoustic computed tomography with a densely packed array for transcranial functional imaging

    Authors: Rui Cao, Yilin Luo, Jinhua Xu, Xiaofei Luo, Ku Geng, Yousuf Aborahama, Manxiu Cui, Samuel Davis, Shuai Na, Xin Tong, Cindy Liu, Karteek Sastry, Konstantin Maslov, Peng Hu, Yide Zhang, Li Lin, Yang Zhang, Lihong V. Wang

    Abstract: Photoacoustic computed tomography (PACT) is emerging as a new technique for functional brain imaging, primarily due to its capabilities in label-free hemodynamic imaging. Despite its potential, the transcranial application of PACT has encountered hurdles, such as acoustic attenuations and distortions by the skull and limited light penetration through the skull. To overcome these challenges, we hav…

    Submitted 26 June, 2023; originally announced June 2023.

  15. arXiv:2306.09397  [pdf, other]

    cs.LG cs.MA eess.SP

    Non-Asymptotic Performance of Social Machine Learning Under Limited Data

    Authors: Ping Hu, Virginia Bordignon, Mert Kayaalp, Ali H. Sayed

    Abstract: This paper studies the probability of error associated with the social machine learning framework, which involves an independent training phase followed by a cooperative decision-making phase over a graph. This framework addresses the problem of classifying a stream of unlabeled data in a distributed manner. In this work, we examine the classification task with limited observations during the deci…

    Submitted 9 July, 2024; v1 submitted 15 June, 2023; originally announced June 2023.

  16. arXiv:2304.12939  [pdf, other]

    cs.SD cs.HC eess.AS

    The ACCompanion: Combining Reactivity, Robustness, and Musical Expressivity in an Automatic Piano Accompanist

    Authors: Carlos Cancino-Chacón, Silvan Peter, Patricia Hu, Emmanouil Karystinaios, Florian Henkel, Francesco Foscarin, Nimrod Varga, Gerhard Widmer

    Abstract: This paper introduces the ACCompanion, an expressive accompaniment system. Similarly to a musician who accompanies a soloist playing a given musical piece, our system can produce a human-like rendition of the accompaniment part that follows the soloist's choices in terms of tempo, dynamics, and articulation. The ACCompanion works in the symbolic domain, i.e., it needs a musical instrument capable…

    Submitted 30 May, 2023; v1 submitted 24 April, 2023; originally announced April 2023.

    Comments: In Proceedings of the 32nd International Joint Conference on Artificial Intelligence (IJCAI-23), Macao, China. The differences/extensions with the previous version include a technical appendix, added missing links, and minor text updates. 10 pages, 4 figures

  17. arXiv:2303.12883  [pdf, other]

    eess.SY

    HAPS-UAV-Enabled Heterogeneous Networks: A Deep Reinforcement Learning Approach

    Authors: Atefeh H. Arani, Peng Hu, Yeying Zhu

    Abstract: The integrated use of non-terrestrial network (NTN) entities such as the high-altitude platform station (HAPS) and low-altitude platform station (LAPS) has become an essential element in the space-air-ground integrated networks (SAGINs). However, the complexity, mobility, and heterogeneity of NTN entities and resources present various challenges from system design to deployment. This paper proposes…

    Submitted 22 March, 2023; originally announced March 2023.

  18. arXiv:2303.05697  [pdf]

    physics.med-ph eess.IV eess.SP

    Quantification of cervical elasticity during pregnancy based on transvaginal ultrasound imaging and stress measurement

    Authors: Peng Hu, Peinan Zhao, Yuan Qu, Konstantin Maslov, Jessica Chubiz, Methodius G. Tuuli, Molly J. Stout, Lihong V. Wang

    Abstract: Objective: Strain elastography and shear wave elastography are two commonly used methods to quantify cervical elasticity; however, they have limitations. Strain elastography is effective in showing tissue elasticity distribution in a single image, but the absence of stress information causes difficulty in comparing the results acquired from different imaging sessions. Shear wave elastography is ef…

    Submitted 9 March, 2023; originally announced March 2023.

    Comments: 26 pages, 8 figures, 1 table

  19. arXiv:2302.13130  [pdf, other]

    cs.CV eess.SP

    Point Cloud Forecasting as a Proxy for 4D Occupancy Forecasting

    Authors: Tarasha Khurana, Peiyun Hu, David Held, Deva Ramanan

    Abstract: Predicting how the world can evolve in the future is crucial for motion planning in autonomous systems. Classical methods are limited because they rely on costly human annotations in the form of semantic class labels, bounding boxes, and tracks or HD maps of cities to plan their motion and thus are difficult to scale to large unlabeled datasets. One promising self-supervised task is 3D point cloud…

    Submitted 30 April, 2023; v1 submitted 25 February, 2023; originally announced February 2023.

    Comments: CVPR 2023. Project page: https://www.cs.cmu.edu/~tkhurana/ff4d/index.html Code: https://github.com/tarashakhurana/4d-occ-forecasting

  20. arXiv:2302.05525  [pdf, other]

    cs.LG cs.NE eess.SY

    Satellite Anomaly Detection Using Variance Based Genetic Ensemble of Neural Networks

    Authors: Mohammad Amin Maleki Sadr, Yeying Zhu, Peng Hu

    Abstract: In this paper, we use a variance-based genetic ensemble (VGE) of Neural Networks (NNs) to detect anomalies in the satellite's historical data. We use an efficient ensemble of the predictions from multiple Recurrent Neural Networks (RNNs) by leveraging each model's uncertainty level (variance). For prediction, each RNN is guided by a Genetic Algorithm (GA) which constructs the optimal structure for…

    Submitted 10 February, 2023; originally announced February 2023.

  21. arXiv:2301.03641  [pdf, other]

    cs.NI eess.SY

    SatNetOps: Toward Multi-Layer Networking for Satellite Network Operations

    Authors: Peng Hu

    Abstract: Recent advancements in low-Earth-orbit (LEO) satellites aim to bring resilience, ubiquitous, and high-quality service to future Internet infrastructure. However, the soaring number of space assets, increasing dynamics of LEO satellites and expanding dimensions of network threats call for an enhanced approach to efficient satellite operations. To address these pressing challenges, we propose an app…

    Submitted 9 January, 2023; originally announced January 2023.

  22. arXiv:2212.05986  [pdf, other]

    cs.NI eess.SY

    A Cross-Layer Descent Approach for Resilient Network Operations of Proliferated LEO Satellites

    Authors: Peng Hu

    Abstract: With the proliferated low-Earth-orbit (LEO) satellites in mega-constellations, the future Internet will be able to reach any place on Earth, providing high-quality services to everyone. However, high-quality operations in terms of timeliness and resilience are lacking in the current solutions. This paper proposes a multi-layer networking approach called "Cross-Layer Descent (CLD)". Based on the pr…

    Submitted 12 December, 2022; originally announced December 2022.

    Comments: 2023 IEEE Wireless Communications and Networking Conference (WCNC), 26–29 March 2023, Glasgow, UK

  23. arXiv:2212.04148  [pdf, other]

    cs.CV eess.IV

    Relationship Quantification of Image Degradations

    Authors: Wenxin Wang, Boyun Li, Yuanbiao Gou, Peng Hu, Wangmeng Zuo, Xi Peng

    Abstract: In this paper, we study two challenging but less-touched problems in image restoration, namely, i) how to quantify the relationship between image degradations and ii) how to improve the performance of a specific restoration task using the quantified relationship. To tackle the first challenge, we proposed a Degradation Relationship Index (DRI) which is defined as the mean drop rate difference in t…

    Submitted 5 August, 2023; v1 submitted 8 December, 2022; originally announced December 2022.

  24. arXiv:2212.03729  [pdf, other]

    eess.SY cs.NI

    Enabling Resilient and Real-Time Network Operations in Space: A Novel Multi-Layer Satellite Networking Scheme

    Authors: Peng Hu

    Abstract: Recently advanced low-Earth-orbit (LEO) satellite networks represented by large constellations and advanced payloads provide great promises for enabling high-quality Internet connectivity to any place on Earth. However, the traditional access-based approach to satellite operations cannot meet the pressing requirements of real-time, reliable, and resilient operations for LEO satellites. A new schem…

    Submitted 7 December, 2022; originally announced December 2022.

    Comments: Published in the Proceedings of the 2022 IEEE Latin-American Conference on Communications (LATINCOM), 30 November - 2 December 2022, Rio de Janeiro, Brazil

  25. AccEar: Accelerometer Acoustic Eavesdropping with Unconstrained Vocabulary

    Authors: Pengfei Hu, Hui Zhuang, Panneer Selvam Santhalingam, Riccardo Spolaor, Parth Pathak, Guoming Zhang, Xiuzhen Cheng

    Abstract: With the increasing popularity of voice-based applications, acoustic eavesdropping has become a serious threat to users' privacy. While on smartphones access to microphones requires explicit user permission, acoustic eavesdropping attacks can rely on motion sensors (such as the accelerometer and gyroscope), access to which is unrestricted. However, previous instances of such attacks can only recogniz…

    Submitted 2 December, 2022; originally announced December 2022.

    Comments: 2022 IEEE Symposium on Security and Privacy (SP)

    Journal ref: 2022 IEEE Symposium on Security and Privacy (SP)

  26. arXiv:2211.14938  [pdf, other]

    cs.LG cs.AI eess.SP

    An Anomaly Detection Method for Satellites Using Monte Carlo Dropout

    Authors: Mohammad Amin Maleki Sadr, Yeying Zhu, Peng Hu

    Abstract: Recently, there has been a significant amount of interest in satellite telemetry anomaly detection (AD) using neural networks (NN). For AD purposes, the current approaches focus on either forecasting or reconstruction of the time series, and they cannot measure the level of reliability or the probability of correct detection. Although the Bayesian neural network (BNN)-based approaches are well kno…

    Submitted 27 November, 2022; originally announced November 2022.

    Journal ref: IEEE Transactions on Aerospace and Electronic Systems, 2022

  27. arXiv:2211.14931  [pdf, other]

    eess.SY cs.LG cs.NI

    UAV-Assisted Space-Air-Ground Integrated Networks: A Technical Review of Recent Learning Algorithms

    Authors: Atefeh H. Arani, Peng Hu, Yeying Zhu

    Abstract: Recent technological advancements in space, air, and ground components have made possible a new network paradigm called space-air-ground integrated network (SAGIN). Unmanned aerial vehicles (UAVs) play a key role in SAGINs. However, due to UAVs' high dynamics and complexity, real-world deployment of a SAGIN becomes a significant barrier to realizing such SAGINs. UAVs are expected to meet key perfo…

    Submitted 16 July, 2024; v1 submitted 27 November, 2022; originally announced November 2022.

    Comments: Accepted by the IEEE Open Journal of Vehicular Technology in July 2024

  28. arXiv:2209.04093  [pdf, other]

    cs.CV cs.MM cs.SD eess.AS

    Learning Audio-Visual embedding for Person Verification in the Wild

    Authors: Peiwen Sun, Shanshan Zhang, Zishan Liu, Yougen Yuan, Taotao Zhang, Honggang Zhang, Pengfei Hu

    Abstract: It has already been observed that audio-visual embedding is more robust than uni-modality embedding for person verification. Here, we proposed a novel audio-visual strategy that considers aggregators from a fusion perspective. First, we introduced weight-enhanced attentive statistics pooling for the first time in face verification. We find that a strong correlation exists between modalities during…

    Submitted 26 October, 2022; v1 submitted 8 September, 2022; originally announced September 2022.

  29. arXiv:2209.02205  [pdf, other]

    cs.CV eess.SY

    High Speed Rotation Estimation with Dynamic Vision Sensors

    Authors: Guangrong Zhao, Yiran Shen, Ning Chen, Pengfei Hu, Lei Liu, Hongkai Wen

    Abstract: Rotational speed is an important metric for calibrating electric motors in manufacturing, monitoring engines during car repair, detecting faults in electrical appliances, etc. However, existing measurement techniques either require prohibitive hardware (e.g., high-speed camera) or are inconvenient to use in real-world application scenarios. In this paper, we propose…

    Submitted 6 September, 2022; originally announced September 2022.

    Comments: 10 pages, 13 figures

  30. arXiv:2206.00248  [pdf]

    physics.med-ph eess.SP

    Transcranial photoacoustic computed tomography of human brain function

    Authors: Yang Zhang, Shuai Na, Karteekeya Sastry, Jonathan J. Russin, Peng Hu, Li Lin, Xin Tong, Kay B. Jann, Danny J. Wang, Charles Y. Liu, Lihong V. Wang

    Abstract: Herein we report the first in-human transcranial imaging of brain function using photoacoustic computed tomography. Functional responses to benchmark motor tasks were imaged on both the skull-less and the skull-intact hemispheres of a hemicraniectomy patient. The observed brain responses in these preliminary results demonstrate the potential of photoacoustic computed tomography for achieving trans…

    Submitted 1 June, 2022; originally announced June 2022.

  31. arXiv:2205.12459  [pdf, other]

    cs.CV eess.IV

    A CNN with Noise Inclined Module and Denoise Framework for Hyperspectral Image Classification

    Authors: Zhiqiang Gong, Ping Zhong, Jiahao Qi, Panhe Hu

    Abstract: Deep Neural Networks have been successfully applied in hyperspectral image classification. However, most prior works adopt general deep architectures while ignoring the intrinsic structure of the hyperspectral image, such as the physical noise generation. This would make these deep models unable to generate discriminative features and provide impressive classification performance. To leverage suc…

    Submitted 24 May, 2022; originally announced May 2022.

    Journal ref: IET Image Processing, 2022

  32. arXiv:2204.04956  [pdf, other]

    eess.IV cs.CV

    Segmentation Network with Compound Loss Function for Hydatidiform Mole Hydrops Lesion Recognition

    Authors: Chengze Zhu, Pingge Hu, Xianxu Zeng, Xingtong Wang, Zehua Ji, Li Shi

    Abstract: Pathological morphology diagnosis is the standard diagnosis method of hydatidiform mole. As a disease with malignant potential, the hydatidiform mole section of hydrops lesions is an important basis for diagnosis. Due to incomplete lesion development, early hydatidiform mole is difficult to distinguish, resulting in a low accuracy of clinical diagnosis. As a remarkable machine learning technology,…

    Submitted 11 April, 2022; originally announced April 2022.

  33. arXiv:2204.04949  [pdf]

    eess.IV cs.CV

    A Semantic Segmentation Network Based Real-Time Computer-Aided Diagnosis System for Hydatidiform Mole Hydrops Lesion Recognition in Microscopic View

    Authors: Chengze Zhu, Pingge Hu, Xianxu Zeng, Xingtong Wang, Zehua Ji, Li Shi

    Abstract: As a disease with malignant potential, hydatidiform mole (HM) is one of the most common gestational trophoblastic diseases. For pathologists, the HM section of hydrops lesions is an important basis for diagnosis. In pathology departments, the diverse microscopic manifestations of HM lesions and the limited view under the microscope mean that physicians with extensive diagnostic experience are requ…

    Submitted 11 April, 2022; originally announced April 2022.

  34. arXiv:2204.03398  [pdf, other]

    cs.SD eess.AS

    Linguistic-Acoustic Similarity Based Accent Shift for Accent Recognition

    Authors: Qijie Shao, Jinghao Yan, Jian Kang, Pengcheng Guo, Xian Shi, Pengfei Hu, Lei Xie

    Abstract: General accent recognition (AR) models tend to directly extract low-level information from spectrums, which always significantly overfit on speakers or channels. Considering accent can be regarded as a series of shifts relative to native pronunciation, distinguishing accents will be an easier task with accent shift as input. But due to the lack of native utterance as an anchor, estimating the acce…

    Submitted 1 July, 2022; v1 submitted 7 April, 2022; originally announced April 2022.

    Comments: Accepted by Interspeech 2022

  35. Leveraging Phone Mask Training for Phonetic-Reduction-Robust E2E Uyghur Speech Recognition

    Authors: Guodong Ma, Pengfei Hu, Jian Kang, Shen Huang, Hao Huang

    Abstract: In Uyghur speech, consonant and vowel reduction are often encountered, especially in spontaneous speech with high speech rate, which will cause a degradation of speech recognition performance. To solve this problem, we propose an effective phone mask training method for Conformer-based Uyghur end-to-end (E2E) speech recognition. The idea is to randomly mask off a certain percentage features of pho…

    Submitted 2 April, 2022; originally announced April 2022.

    Comments: Accepted by INTERSPEECH 2021

    Journal ref: INTERSPEECH 2021

  36. arXiv:2203.15249  [pdf, other]

    cs.SD eess.AS

    MFA-Conformer: Multi-scale Feature Aggregation Conformer for Automatic Speaker Verification

    Authors: Yang Zhang, Zhiqiang Lv, Haibin Wu, Shanshan Zhang, Pengfei Hu, Zhiyong Wu, Hung-yi Lee, Helen Meng

    Abstract: In this paper, we present Multi-scale Feature Aggregation Conformer (MFA-Conformer), an easy-to-implement, simple but effective backbone for automatic speaker verification based on the Convolution-augmented Transformer (Conformer). The architecture of the MFA-Conformer is inspired by recent state-of-the-art models in speech recognition and speaker verification. Firstly, we introduce a convolution s…

    Submitted 10 November, 2022; v1 submitted 29 March, 2022; originally announced March 2022.

    Comments: Accepted by INTERSPEECH 2022

  37. arXiv:2203.07065  [pdf, other]

    eess.SP cs.IT cs.MA

    Optimal Aggregation Strategies for Social Learning over Graphs

    Authors: Ping Hu, Virginia Bordignon, Stefan Vlaski, Ali H. Sayed

    Abstract: Adaptive social learning is a useful tool for studying distributed decision-making problems over graphs. This paper investigates the effect of combination policies on the performance of adaptive social learning strategies. Using large-deviation analysis, it first derives a bound on the steady-state error probability and characterizes the optimal selection for the Perron eigenvectors of the combina…

    Submitted 31 May, 2023; v1 submitted 14 March, 2022; originally announced March 2022.

  38. arXiv:2203.04313  [pdf, other]

    eess.IV cs.CV

    Multi-Scale Adaptive Network for Single Image Denoising

    Authors: Yuanbiao Gou, Peng Hu, Jiancheng Lv, Joey Tianyi Zhou, Xi Peng

    Abstract: Multi-scale architectures have shown effectiveness in a variety of tasks thanks to appealing cross-scale complementarity. However, existing architectures treat different scale features equally without considering the scale-specific characteristics, i.e., the within-scale characteristics are ignored in the architecture design. In this paper, we reveal this missing piece for multi-scale arc…

    Submitted 29 October, 2022; v1 submitted 8 March, 2022; originally announced March 2022.

    Journal ref: the Thirty-Sixth Annual Conference on Neural Information Processing Systems (NeurIPS 2022)

  39. 5G Enabled Fault Detection and Diagnostics: How Do We Achieve Efficiency?

    Authors: Peng Hu, Jinhuan Zhang

    Abstract: The 5th-generation wireless networks (5G) technologies and mobile edge computing (MEC) provide great promises of enabling new capabilities for the industrial Internet of Things. However, the solutions enabled by the 5G ultra-reliable low-latency communication (URLLC) paradigm come with challenges, where URLLC alone does not necessarily guarantee the efficient execution of time-critical fault detec…

    Submitted 15 February, 2022; originally announced February 2022.

  40. arXiv:2112.08133  [pdf]

    physics.ins-det eess.IV physics.optics

    Ptychographic sensor for large-scale lensless microbial monitoring with high spatiotemporal resolution

    Authors: Shaowei Jiang, Chengfei Guo, Zichao Bian, Ruihai Wang, Jiakai Zhu, Pengming Song, Patrick Hu, Derek Hu, Zibang Zhang, Kazunori Hoshino, Bin Feng, Guoan Zheng

    Abstract: Traditional microbial detection methods often rely on the overall property of microbial cultures and cannot resolve individual growth events at high spatiotemporal resolution. As a result, they require bacteria to grow to confluence and then interpret the results. Here, we demonstrate the application of an integrated ptychographic sensor for lensless cytometric analysis of microbial cultures over a…

    Submitted 15 December, 2021; originally announced December 2021.

    Comments: 18 pages, 6 figures

  41. arXiv:2112.06721  [pdf, other]

    cs.SD cs.CL eess.AS

    PM-MMUT: Boosted Phone-Mask Data Augmentation using Multi-Modeling Unit Training for Phonetic-Reduction-Robust E2E Speech Recognition

    Authors: Guodong Ma, Pengfei Hu, Nurmemet Yolwas, Shen Huang, Hao Huang

    Abstract: Consonant and vowel reduction are often encountered in speech, which might cause performance degradation in automatic speech recognition (ASR). Our recently proposed learning strategy based on masking, Phone Masking Training (PMT), alleviates the impact of such phenomena in Uyghur ASR. Although PMT achieves remarkable improvements, there still exists room for further gains due to the granularity…

    Submitted 2 July, 2022; v1 submitted 13 December, 2021; originally announced December 2021.

    Comments: Accepted to INTERSPEECH 2022

  42. arXiv:2110.15316  [pdf]

    cs.SD eess.AS

    VRM-Phase I VKW system description of long-short video customizable keyword wakeup challenge

    Authors: Yougen Yuan, Zhiqiang Lv, Shen Huang, Pengfei Hu

    Abstract: Keyword wakeup technology has always been a research hotspot in speech processing, but many related works were done on different datasets. We organized a Chinese long-short video keyword wakeup challenge (Video Keyword Wakeup Challenge, VKW) for testing the ability of each participating team to build a keyword wakeup system under the public dataset. All submitted systems not only need to support t…

    Submitted 18 October, 2021; originally announced October 2021.

    Comments: 6 pages, in Chinese language, 3 tables, NCMMC 2021 conference paper

  43. arXiv:2110.09121  [pdf, ps, other]

    cs.SD eess.AS

    KaraTuner: Towards end to end natural pitch correction for singing voice in karaoke

    Authors: Xiaobin Zhuang, Huiran Yu, Weifeng Zhao, Tao Jiang, Peng Hu

    Abstract: An automatic pitch correction system typically includes several stages, such as pitch extraction, deviation estimation, pitch shift processing, and cross-fade smoothing. However, designing these components with strategies often requires domain expertise and they are likely to fail on corner cases. In this paper, we present KaraTuner, an end-to-end neural architecture that predicts pitch curve and…

    Submitted 26 June, 2022; v1 submitted 18 October, 2021; originally announced October 2021.

    Comments: To be published in Proc. Interspeech 2022, Incheon, South Korea

  44. High-throughput lensless whole slide imaging via continuous height-varying modulation of tilted sensor

    Authors: Shaowei Jiang, Chengfei Guo, Patrick Hu, Derek Hu, Pengming Song, Tianbo Wang, Zichao Bian, Zibang Zhang, Guoan Zheng

    Abstract: We report a new lensless microscopy configuration by integrating the concepts of transverse translational ptychography and defocus multi-height phase retrieval. In this approach, we place a tilted image sensor under the specimen for linearly-increasing phase modulation along one lateral direction. Similar to the operation of ptychography, we laterally translate the specimen and acquire the diffrac…

    Submitted 28 September, 2021; originally announced October 2021.

  45. arXiv:2107.01805  [pdf]

    eess.SY

    Dual Synchronous Generator: Inertial Current Source based Grid-Forming Solution for VSC

    Authors: Huanhai Xin, Kehao Zhuang, Pengfei Hu, Yunjie Gu, Ping Ju

    Abstract: In order to improve dynamic characteristics of the power system with high-proportion renewable energy sources (RESs), it is necessary for the voltage source converter (VSC), interfaces of RESs, to provide inertial and frequency regulation. In practical applications, VSCs are better controlled as current sources due to their weak overcurrent capacity. According to the characteristic, a dual sy…

    Submitted 17 July, 2021; v1 submitted 5 July, 2021; originally announced July 2021.

  46. arXiv:2004.07135  [pdf, other]

    cs.NI eess.SY

    A 5G NR based System Architecture for Real-Time Control with Batteryless RFID Sensors

    Authors: Peng Hu

    Abstract: The fifth-generation wireless networking (5G) technologies have been developed to meet various time-critical use cases with ultra-reliable, low-latency and massive machine-type communications which are indispensable for tactile Internet applications. Recent advancements in very low-cost and batteryless radio-frequency identification (RFID) sensors have given promises of deploying a massive amount…

    Submitted 12 August, 2020; v1 submitted 15 April, 2020; originally announced April 2020.

  47. arXiv:2004.01800  [pdf, other]

    cs.CV cs.LG cs.MM eess.IV

    Temporally Distributed Networks for Fast Video Semantic Segmentation

    Authors: Ping Hu, Fabian Caba Heilbron, Oliver Wang, Zhe Lin, Stan Sclaroff, Federico Perazzi

    Abstract: We present TDNet, a temporally distributed network designed for fast and accurate video semantic segmentation. We observe that features extracted from a certain high-level layer of a deep CNN can be approximated by composing features extracted from several shallower sub-networks. Leveraging the inherent temporal continuity in videos, we distribute these sub-networks over sequential frames. Therefo…

    Submitted 6 April, 2020; v1 submitted 3 April, 2020; originally announced April 2020.

    Comments: [CVPR2020] Project: https://github.com/feinanshan/TDNet

  48. arXiv:1912.11264  [pdf, other]

    cs.CV eess.IV

    Deep Manifold Embedding for Hyperspectral Image Classification

    Authors: Zhiqiang Gong, Weidong Hu, Xiaoyong Du, Ping Zhong, Panhe Hu

    Abstract: Deep learning methods have played an increasingly important role in hyperspectral image classification. However, the general deep learning methods mainly take advantage of the information of the sample itself or the pairwise information between samples while ignoring the intrinsic data structure within the whole data. To tackle this problem, this work develops a novel deep manifold embedding method (DMEM…

    Submitted 27 March, 2021; v1 submitted 24 December, 2019; originally announced December 2019.

    Comments: Accepted by IEEE TCYB

  49. arXiv:1908.05033  [pdf, other]

    cs.CV cs.LG eess.IV

    Differentiable Soft Quantization: Bridging Full-Precision and Low-Bit Neural Networks

    Authors: Ruihao Gong, Xianglong Liu, Shenghu Jiang, Tianxiang Li, Peng Hu, Jiazhen Lin, Fengwei Yu, Junjie Yan

    Abstract: Hardware-friendly network quantization (e.g., binary/uniform quantization) can efficiently accelerate the inference and meanwhile reduce memory consumption of the deep neural networks, which is crucial for model deployment on resource-limited devices like mobile phones. However, due to the discreteness of low-bit quantization, existing quantization methods often face the unstable training process…

    Submitted 14 August, 2019; originally announced August 2019.

    Comments: IEEE ICCV 2019

  50. arXiv:1907.11138  [pdf]

    physics.app-ph eess.SP

    Effect of Surrounding Conductive Object on Four-Plate Capacitive Power Transfer System

    Authors: Qi Zhu, Lixiang Jackie Zou, Shaoge Zang, Mei Su, Aiguo Patrick Hu

    Abstract: In this paper, the effect of a surrounding conductive object on a typical capacitive power transfer (CPT) system with two pairs of parallel plates is studied by considering the mutual coupling between the conductive object and the plates. A mathematical model is established based on a 5*5 mutual capacitance matrix by using a larger additional conductive plate to represent the surrounding conductiv…

    Submitted 7 June, 2019; originally announced July 2019.

    Comments: 9 pages, 15 figures, 4 tables