Showing 1–50 of 70 results for author: Ya, M

Searching in archive cs.
  1. arXiv:2408.16313  [pdf, other]

    cs.CV cs.AI

    FA-YOLO: Research On Efficient Feature Selection YOLO Improved Algorithm Based On FMDS and AGMF Modules

    Authors: Yukang Huo, Mingyuan Yao, Qingbin Tian, Tonghao Wang, Ruifeng Wang, Haihua Wang

    Abstract: Over the past few years, the YOLO series of models has emerged as one of the dominant methodologies in the realm of object detection. Many studies have advanced these baseline models by modifying their architectures, enhancing data quality, and developing new loss functions. However, current models still exhibit deficiencies in processing feature maps, such as overlooking the fusion of cross-scale…

    Submitted 29 August, 2024; originally announced August 2024.

    Comments: 11 pages and 4 figures

  2. arXiv:2408.00788  [pdf, other]

    cs.NE cs.LG

    SpikeVoice: High-Quality Text-to-Speech Via Efficient Spiking Neural Network

    Authors: Kexin Wang, Jiahong Zhang, Yong Ren, Man Yao, Di Shang, Bo Xu, Guoqi Li

    Abstract: Brain-inspired Spiking Neural Networks (SNNs) have demonstrated their effectiveness and efficiency in vision, natural language, and speech understanding tasks, indicating their capacity to "see", "listen", and "read". In this paper, we design SpikeVoice, which performs high-quality Text-To-Speech (TTS) via SNNs, to explore the potential of SNNs to "speak". A major obstacle to using SNNs for such…

    Submitted 17 July, 2024; originally announced August 2024.

    Comments: 9 pages

  3. arXiv:2408.00381  [pdf, other]

    cs.IT eess.SY

    Statistical AoI Guarantee Optimization for Supporting xURLLC in ISAC-enabled V2I Networks

    Authors: Yanxi Zhang, Mingwu Yao, Qinghai Yang, Dongqi Yan, Xu Zhang, Xu Bao, Muyu Mei

    Abstract: This paper addresses the critical challenge of supporting next-generation ultra-reliable and low-latency communication (xURLLC) within integrated sensing and communication (ISAC)-enabled vehicle-to-infrastructure (V2I) networks. We incorporate channel evaluation and retransmission mechanisms for real-time reliability enhancement. Using stochastic network calculus (SNC), we establish a theoretical…

    Submitted 1 August, 2024; originally announced August 2024.

  4. arXiv:2407.20708  [pdf, other]

    cs.AI

    Integer-Valued Training and Spike-Driven Inference Spiking Neural Network for High-performance and Energy-efficient Object Detection

    Authors: Xinhao Luo, Man Yao, Yuhong Chou, Bo Xu, Guoqi Li

    Abstract: Brain-inspired Spiking Neural Networks (SNNs) have bio-plausibility and low-power advantages over Artificial Neural Networks (ANNs). Applications of SNNs are currently limited to simple classification tasks because of their poor performance. In this work, we focus on bridging the performance gap between ANNs and SNNs on object detection. Our design revolves around network architecture and spiking…

    Submitted 5 August, 2024; v1 submitted 30 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV2024; 19 pages, 4 figures

  5. arXiv:2407.20099  [pdf, other]

    cs.CV

    RSC-SNN: Exploring the Trade-off Between Adversarial Robustness and Accuracy in Spiking Neural Networks via Randomized Smoothing Coding

    Authors: Keming Wu, Man Yao, Yuhong Chou, Xuerui Qiu, Rui Yang, Bo Xu, Guoqi Li

    Abstract: Spiking Neural Networks (SNNs) have received widespread attention due to their unique neuronal dynamics and low-power nature. Previous research empirically shows that SNNs with Poisson coding are more robust than Artificial Neural Networks (ANNs) on small-scale datasets. However, it is still unclear in theory how the adversarial robustness of SNNs is derived, and whether SNNs can still maintain it…

    Submitted 29 July, 2024; originally announced July 2024.

    Comments: Accepted by ACM MM 2024

  6. arXiv:2407.10485  [pdf, other]

    cs.CV

    MM-Tracker: Motion Mamba with Margin Loss for UAV-platform Multiple Object Tracking

    Authors: Mufeng Yao, Jinlong Peng, Qingdong He, Bo Peng, Hao Chen, Mingmin Chi, Chao Liu, Jon Atli Benediktsson

    Abstract: Multiple object tracking (MOT) from unmanned aerial vehicle (UAV) platforms requires efficient motion modeling. This is because UAV-MOT faces both local object motion and global camera motion. Motion blur also increases the difficulty of detecting large moving objects. Previous UAV motion modeling approaches either focus only on local motion or ignore motion blurring effects, thus limiting their t…

    Submitted 17 August, 2024; v1 submitted 15 July, 2024; originally announced July 2024.

    Comments: arXiv admin note: text overlap with arXiv:2308.07207

  7. arXiv:2407.02073  [pdf, other]

    cs.LG

    Contribution Evaluation of Heterogeneous Participants in Federated Learning via Prototypical Representations

    Authors: Qi Guo, Minghao Yao, Zhen Tian, Saiyu Qi, Yong Qi, Yun Lin, Jin Song Dong

    Abstract: Contribution evaluation in federated learning (FL) has become a pivotal research area due to its applicability across various domains, such as detecting low-quality datasets, enhancing model robustness, and designing incentive mechanisms. Existing contribution evaluation methods, which primarily rely on data volume, model similarity, and auxiliary test datasets, have shown success in diverse scena…

    Submitted 2 July, 2024; originally announced July 2024.

  8. arXiv:2406.08204  [pdf, other]

    cs.CV

    Diffusion-Promoted HDR Video Reconstruction

    Authors: Yuanshen Guan, Ruikang Xu, Mingde Yao, Ruisheng Gao, Lizhi Wang, Zhiwei Xiong

    Abstract: High dynamic range (HDR) video reconstruction aims to generate HDR videos from low dynamic range (LDR) frames captured with alternating exposures. Most existing works solely rely on the regression-based paradigm, leading to adverse effects such as ghosting artifacts and missing details in saturated regions. In this paper, we propose a diffusion-promoted method for HDR video reconstruction, termed…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: arXiv preprint

  9. arXiv:2406.01003  [pdf, other]

    cs.CV

    Uni-ISP: Unifying the Learning of ISPs from Multiple Cameras

    Authors: Lingen Li, Mingde Yao, Xingyu Meng, Muquan Yu, Tianfan Xue, Jinwei Gu

    Abstract: Modern end-to-end image signal processors (ISPs) can learn complex mappings from RAW/XYZ data to sRGB (or inverse), opening new possibilities in image processing. However, as the diversity of camera models continues to expand, developing and maintaining individual ISPs is not sustainable in the long term, which inherently lacks versatility, hindering the adaptability to multiple camera models. In…

    Submitted 3 June, 2024; originally announced June 2024.

  10. arXiv:2405.16466  [pdf, other]

    cs.NE

    High-Performance Temporal Reversible Spiking Neural Networks with $O(L)$ Training Memory and $O(1)$ Inference Cost

    Authors: JiaKui Hu, Man Yao, Xuerui Qiu, Yuhong Chou, Yuxuan Cai, Ning Qiao, Yonghong Tian, Bo Xu, Guoqi Li

    Abstract: Multi-timestep simulation of brain-inspired Spiking Neural Networks (SNNs) boosts memory requirements during training and increases inference energy cost. Current training methods cannot simultaneously solve both training and inference dilemmas. This work proposes a novel Temporal Reversible architecture for SNNs (T-RevSNN) to jointly address the training and inference challenges by altering the for…

    Submitted 26 May, 2024; originally announced May 2024.

    Comments: Accepted by ICML2024

  11. arXiv:2405.14839  [pdf, other]

    cs.CV cs.CL

    A Textbook Remedy for Domain Shifts: Knowledge Priors for Medical Image Analysis

    Authors: Yue Yang, Mona Gandhi, Yufei Wang, Yifan Wu, Michael S. Yao, Chris Callison-Burch, James C. Gee, Mark Yatskar

    Abstract: While deep networks have achieved broad success in analyzing natural images, when applied to medical scans, they often fail in unexpected situations. We investigate this challenge and focus on model sensitivity to domain shifts, such as data sampled from different hospitals or data confounded by demographic variables such as sex, race, etc., in the context of chest X-rays and skin lesion images. A…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: 23 pages, 9 figures, 12 tables, project page: https://yueyang1996.github.io/knobo/

  12. arXiv:2405.10987  [pdf, other]

    cs.LG cs.AI

    Manifold-based Incomplete Multi-view Clustering via Bi-Consistency Guidance

    Authors: Huibing Wang, Mingze Yao, Yawei Chen, Yunqiu Xu, Haipeng Liu, Wei Jia, Xianping Fu, Yang Wang

    Abstract: Incomplete multi-view clustering primarily focuses on dividing unlabeled data into corresponding categories with missing instances, and has received intensive attention due to its superiority in real applications. Considering the influence of incomplete data, the existing methods mostly attempt to recover data by adding extra terms. However, for the unsupervised methods, a simple recovery strategy…

    Submitted 16 May, 2024; originally announced May 2024.

  13. arXiv:2404.19534  [pdf, other]

    cs.CV

    MIPI 2024 Challenge on Nighttime Flare Removal: Methods and Results

    Authors: Yuekun Dai, Dafeng Zhang, Xiaoming Li, Zongsheng Yue, Chongyi Li, Shangchen Zhou, Ruicheng Feng, Peiqing Yang, Zhezhu Jin, Guanqun Liu, Chen Change Loy, Lize Zhang, Shuai Liu, Chaoyu Feng, Luyang Wang, Shuan Chen, Guangqi Shao, Xiaotao Wang, Lei Lei, Qirui Yang, Qihua Cheng, Zhiqiang Xu, Yihao Liu, Huanjing Yue, Jingyu Yang, et al. (38 additional authors not shown)

    Abstract: The increasing demand for computational photography and imaging on mobile platforms has led to the widespread development and integration of advanced image sensors with novel algorithms in camera systems. However, the scarcity of high-quality data for research and the rare opportunity for in-depth exchange of views from industry and academia constrain the development of mobile intelligent photogra…

    Submitted 27 May, 2024; v1 submitted 30 April, 2024; originally announced April 2024.

    Comments: CVPR 2024 Mobile Intelligent Photography and Imaging (MIPI) Workshop--Nighttime Flare Removal Challenge Report. Website: https://mipi-challenge.org/MIPI2024/

  14. arXiv:2404.15244  [pdf, other]

    cs.CV cs.LG

    Efficient Transformer Encoders for Mask2Former-style models

    Authors: Manyi Yao, Abhishek Aich, Yumin Suh, Amit Roy-Chowdhury, Christian Shelton, Manmohan Chandraker

    Abstract: Vision transformer based models bring significant improvements for image segmentation tasks. Although these architectures offer powerful capabilities irrespective of specific segmentation tasks, their use of computational resources can be taxing on deployed devices. One way to overcome this challenge is by adapting the computation level to the specific needs of the input image rather than the curr…

    Submitted 23 April, 2024; originally announced April 2024.

  15. arXiv:2404.03663  [pdf, other]

    cs.NE cs.CV

    Spike-driven Transformer V2: Meta Spiking Neural Network Architecture Inspiring the Design of Next-generation Neuromorphic Chips

    Authors: Man Yao, Jiakui Hu, Tianxiang Hu, Yifan Xu, Zhaokun Zhou, Yonghong Tian, Bo Xu, Guoqi Li

    Abstract: Neuromorphic computing, which exploits Spiking Neural Networks (SNNs) on neuromorphic chips, is a promising energy-efficient alternative to traditional AI. CNN-based SNNs are the current mainstream of neuromorphic computing. By contrast, no neuromorphic chips are designed especially for Transformer-based SNNs, which have just emerged, and their performance is only on par with CNN-based SNNs, offer…

    Submitted 15 February, 2024; originally announced April 2024.

    Comments: Accepted by ICLR2024. Code and Model: https://github.com/BICLab/Spike-Driven-Transformer-V2

  16. arXiv:2404.00714  [pdf, other]

    cs.CV

    Neural Radiance Field-based Visual Rendering: A Comprehensive Review

    Authors: Mingyuan Yao, Yukang Huo, Yang Ran, Qingbin Tian, Ruifeng Wang, Haihua Wang

    Abstract: In recent years, Neural Radiance Fields (NeRF) has made remarkable progress in the field of computer vision and graphics, providing strong technical support for solving key tasks including 3D scene understanding, new perspective synthesis, human body reconstruction, robotics, and so on, and academic attention to this line of work is growing. As a revolutionary neural implicit field represen…

    Submitted 31 March, 2024; originally announced April 2024.

    Comments: 35 pages, 22 figures, 14 tables, 18 formulas

  17. arXiv:2403.15919  [pdf, other]

    cs.HC cs.CY

    Negotiating the Shared Agency between Humans & AI in the Recommender System

    Authors: Mengke Wu, Weizi Liu, Yanyun Wang, Mike Yao

    Abstract: Smart recommendation algorithms have revolutionized information dissemination, enhancing efficiency and reshaping content delivery across various domains. However, concerns about user agency have arisen due to the inherent opacity (information asymmetry) and the nature of one-way output (power asymmetry) of algorithms. While both issues have been criticized by scholars via advocating explainable A…

    Submitted 19 April, 2024; v1 submitted 23 March, 2024; originally announced March 2024.

  18. arXiv:2403.05606  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV

    A Concept-based Interpretable Model for the Diagnosis of Choroid Neoplasias using Multimodal Data

    Authors: Yifan Wu, Yang Liu, Yue Yang, Michael S. Yao, Wenli Yang, Xuehui Shi, Lihong Yang, Dongjun Li, Yueming Liu, James C. Gee, Xuan Yang, Wenbin Wei, Shi Gu

    Abstract: Diagnosing rare diseases presents a common challenge in clinical practice, necessitating the expertise of specialists for accurate identification. The advent of machine learning offers a promising solution, while the development of such technologies is hindered by the scarcity of data on rare conditions and the demand for models that are both interpretable and trustworthy in a clinical context. In…

    Submitted 8 March, 2024; originally announced March 2024.

  19. arXiv:2402.07369  [pdf, other]

    cs.LG

    Diff-RNTraj: A Structure-aware Diffusion Model for Road Network-constrained Trajectory Generation

    Authors: Tonglong Wei, Youfang Lin, Shengnan Guo, Yan Lin, Yiheng Huang, Chenyang Xiang, Yuqing Bai, Menglu Ya, Huaiyu Wan

    Abstract: Trajectory data is essential for various applications as it records the movement of vehicles. However, publicly available trajectory datasets remain limited in scale due to privacy concerns, which hinders the development of trajectory data mining and trajectory-based applications. To address this issue, some methods for generating synthetic trajectories have been proposed to expand the scale of th…

    Submitted 11 February, 2024; originally announced February 2024.

  20. arXiv:2402.06532  [pdf, other]

    cs.LG cs.AI

    Generative Adversarial Bayesian Optimization for Surrogate Objectives

    Authors: Michael S. Yao, Yimeng Zeng, Hamsa Bastani, Jacob Gardner, James C. Gee, Osbert Bastani

    Abstract: Offline model-based policy optimization seeks to optimize a learned surrogate objective function without querying the true oracle objective during optimization. However, inaccurate surrogate model predictions are frequently encountered along the optimization trajectory. To address this limitation, we propose generative adversarial Bayesian optimization (GABO) using adaptive source critic regulariz…

    Submitted 9 February, 2024; originally announced February 2024.

    Comments: 15 pages, 3 figures

  21. arXiv:2310.12848  [pdf, other]

    cs.CV

    Neural Degradation Representation Learning for All-In-One Image Restoration

    Authors: Mingde Yao, Ruikang Xu, Yuanshen Guan, Jie Huang, Zhiwei Xiong

    Abstract: Existing methods have demonstrated effective performance on a single degradation type. In practical applications, however, the degradation is often unknown, and the mismatch between the model and the degradation will result in a severe performance drop. In this paper, we propose an all-in-one image restoration network that tackles multiple degradations. Due to the heterogeneous nature of different…

    Submitted 19 October, 2023; originally announced October 2023.

  22. arXiv:2309.17334  [pdf, other]

    eess.IV cs.CV

    Multi-Depth Branch Network for Efficient Image Super-Resolution

    Authors: Huiyuan Tian, Li Zhang, Shijian Li, Min Yao, Gang Pan

    Abstract: A longstanding challenge in Super-Resolution (SR) is how to efficiently enhance high-frequency details in Low-Resolution (LR) images while maintaining semantic coherence. This is particularly crucial in practical applications where SR models are often deployed on low-power devices. To address this issue, we propose an innovative asymmetric SR architecture featuring Multi-Depth Branch Module (MDBM)…

    Submitted 15 January, 2024; v1 submitted 29 September, 2023; originally announced September 2023.

  23. arXiv:2309.11753  [pdf, other]

    cs.AI

    Improve the efficiency of deep reinforcement learning through semantic exploration guided by natural language

    Authors: Zhourui Guo, Meng Yao, Yang Yu, Qiyue Yin

    Abstract: Reinforcement learning is a powerful technique for learning from trial and error, but it often requires a large number of interactions to achieve good performance. In some domains, such as sparse-reward tasks, an oracle that can provide useful feedback or guidance to the agent during the learning process is really of great importance. However, querying the oracle too frequently may be costly or im…

    Submitted 20 September, 2023; originally announced September 2023.

  24. arXiv:2308.14018  [pdf, other]

    cs.CV

    VQ-Font: Few-Shot Font Generation with Structure-Aware Enhancement and Quantization

    Authors: Mingshuai Yao, Yabo Zhang, Xianhui Lin, Xiaoming Li, Wangmeng Zuo

    Abstract: Few-shot font generation is challenging, as it needs to capture the fine-grained stroke styles from a limited set of reference glyphs, and then transfer to other characters, which are expected to have similar styles. However, due to the diversity and complexity of Chinese font styles, the synthesized glyphs of existing methods usually exhibit visible artifacts, such as missing details and distorte…

    Submitted 27 August, 2023; originally announced August 2023.

    Comments: 13 pages, 14 figures

  25. arXiv:2308.13783  [pdf, other]

    cs.CV

    Generalized Lightness Adaptation with Channel Selective Normalization

    Authors: Mingde Yao, Jie Huang, Xin Jin, Ruikang Xu, Shenglong Zhou, Man Zhou, Zhiwei Xiong

    Abstract: Lightness adaptation is vital to the success of image processing to avoid unexpected visual deterioration, which covers multiple aspects, e.g., low-light image enhancement, image retouching, and inverse tone mapping. Existing methods typically work well on their trained lightness conditions but perform poorly in unknown ones due to their limited generalization ability. To address this limitation,…

    Submitted 26 August, 2023; originally announced August 2023.

    Comments: Accepted to ICCV 2023. Code: https://github.com/mdyao/CSNorm/

  26. arXiv:2308.12538  [pdf, other]

    cs.CV

    Mutual-Guided Dynamic Network for Image Fusion

    Authors: Yuanshen Guan, Ruikang Xu, Mingde Yao, Lizhi Wang, Zhiwei Xiong

    Abstract: Image fusion aims to generate a high-quality image from multiple images captured under varying conditions. The key problem of this task is to preserve complementary information while filtering out irrelevant information for the fused result. However, existing methods address this problem by leveraging static convolutional neural networks (CNNs), suffering two inherent limitations during feature ex…

    Submitted 1 September, 2023; v1 submitted 23 August, 2023; originally announced August 2023.

    Comments: ACMMM 2023 accepted

  27. arXiv:2308.08227  [pdf, other]

    cs.NE cs.CV cs.LG

    Inherent Redundancy in Spiking Neural Networks

    Authors: Man Yao, Jiakui Hu, Guangshe Zhao, Yaoyuan Wang, Ziyang Zhang, Bo Xu, Guoqi Li

    Abstract: Spiking Neural Networks (SNNs) are well known as a promising energy-efficient alternative to conventional artificial neural networks. Subject to the preconceived impression that SNNs are sparse firing, the analysis and optimization of inherent redundancy in SNNs have been largely overlooked, thus the potential advantages of spike-based neuromorphic computing in accuracy and energy efficiency are i…

    Submitted 16 August, 2023; originally announced August 2023.

    Comments: Accepted by ICCV2023

  28. arXiv:2308.07207  [pdf, other]

    cs.CV

    FOLT: Fast Multiple Object Tracking from UAV-captured Videos Based on Optical Flow

    Authors: Mufeng Yao, Jiaqi Wang, Jinlong Peng, Mingmin Chi, Chao Liu

    Abstract: Multiple object tracking (MOT) has been successfully investigated in computer vision. However, MOT for the videos captured by unmanned aerial vehicles (UAV) is still challenging due to small object size, blurred object appearance, and very large and/or irregular motion in both ground objects and UAV platforms. In this paper, we propose FOLT to mitigate these problems and reach fast and accurat…

    Submitted 14 August, 2023; v1 submitted 14 August, 2023; originally announced August 2023.

    Comments: Accepted by ACM Multi-Media 2023

  29. arXiv:2308.04322  [pdf, other]

    cs.CV

    Domain Adaptive Person Search via GAN-based Scene Synthesis for Cross-scene Videos

    Authors: Huibing Wang, Tianxiang Cui, Mingze Yao, Huijuan Pang, Yushan Du

    Abstract: Person search has recently been a challenging task in the computer vision domain, which aims to search specific pedestrians from real cameras. Nevertheless, most surveillance videos comprise only a handful of images of each pedestrian, which often feature identical backgrounds and clothing. Hence, it is difficult to learn more discriminative features for person search in real scenes. To tackle this…

    Submitted 8 August, 2023; originally announced August 2023.

  30. arXiv:2307.01694  [pdf, other]

    cs.NE cs.CV

    Spike-driven Transformer

    Authors: Man Yao, Jiakui Hu, Zhaokun Zhou, Li Yuan, Yonghong Tian, Bo Xu, Guoqi Li

    Abstract: Spiking Neural Networks (SNNs) provide an energy-efficient deep learning option due to their unique spike-based event-driven (i.e., spike-driven) paradigm. In this paper, we incorporate the spike-driven paradigm into Transformer by the proposed Spike-driven Transformer with four unique properties: 1) Event-driven, no calculation is triggered when the input of Transformer is zero; 2) Binary spike c…

    Submitted 4 July, 2023; originally announced July 2023.

  31. arXiv:2305.14725  [pdf, other]

    cs.CL

    AMELI: Enhancing Multimodal Entity Linking with Fine-Grained Attributes

    Authors: Barry Menglong Yao, Yu Chen, Qifan Wang, Sijia Wang, Minqian Liu, Zhiyang Xu, Licheng Yu, Lifu Huang

    Abstract: We propose attribute-aware multimodal entity linking, where the input is a mention described with a text and image, and the goal is to predict the corresponding target entity from a multimodal knowledge base (KB) where each entity is also described with a text description, a visual image and a set of attributes and values. To support this research, we construct AMELI, a large-scale dataset consist…

    Submitted 24 May, 2023; originally announced May 2023.

    Comments: 12 pages, 4 figures

    ACM Class: I.2.7

  32. arXiv:2305.12148  [pdf, other]

    cs.LG

    Probabilistic Modeling: Proving the Lottery Ticket Hypothesis in Spiking Neural Network

    Authors: Man Yao, Yuhong Chou, Guangshe Zhao, Xiawu Zheng, Yonghong Tian, Bo Xu, Guoqi Li

    Abstract: The Lottery Ticket Hypothesis (LTH) states that a randomly-initialized large neural network contains a small sub-network (i.e., winning tickets) which, when trained in isolation, can achieve comparable performance to the large network. LTH opens up a new path for network pruning. Existing proofs of LTH in Artificial Neural Networks (ANNs) are based on continuous activation functions, such as ReLU,…

    Submitted 20 May, 2023; originally announced May 2023.

    Comments: 22 pages, 5 figures

  33. arXiv:2305.08394  [pdf, ps, other]

    cs.MA

    More Like Real World Game Challenge for Partially Observable Multi-Agent Cooperation

    Authors: Meng Yao, Xueou Feng, Qiyue Yin

    Abstract: Some standardized environments have been designed for partially observable multi-agent cooperation, but we find most current environments are synchronous, whereas real-world agents often have their own action spaces leading to asynchrony. Furthermore, a fixed number of agents limits the scalability of the action space, whereas in reality the number of agents can change, resulting in a flexible action space. In add…

    Submitted 15 May, 2023; originally announced May 2023.

  34. arXiv:2305.06925  [pdf, other]

    cond-mat.mtrl-sci cs.LG physics.chem-ph physics.comp-ph

    Accurate Surface and Finite Temperature Bulk Properties of Lithium Metal at Large Scales using Machine Learning Interaction Potentials

    Authors: Mgcini Keith Phuthi, Archie Mingze Yao, Simon Batzner, Albert Musaelian, Boris Kozinsky, Ekin Dogus Cubuk, Venkatasubramanian Viswanathan

    Abstract: The properties of lithium metal are key parameters in the design of lithium ion and lithium metal batteries. They are difficult to probe experimentally due to the high reactivity and low melting point of lithium as well as the microscopic scales at which lithium exists in batteries where it is found to have enhanced strength, with implications for dendrite suppression strategies. Computationally,…

    Submitted 22 May, 2023; v1 submitted 24 April, 2023; originally announced May 2023.

    Comments: 9 pages, 4 figures, 3 pages of Supporting Information

  35. arXiv:2304.11631  [pdf, other]

    cs.CV cs.AI

    TSGCNeXt: Dynamic-Static Multi-Graph Convolution for Efficient Skeleton-Based Action Recognition with Long-term Learning Potential

    Authors: Dongjingdin Liu, Pengpeng Chen, Miao Yao, Yijing Lu, Zijie Cai, Yuxin Tian

    Abstract: Skeleton-based action recognition has achieved remarkable results in human action recognition with the development of graph convolutional networks (GCNs). However, recent works tend to construct complex learning mechanisms with redundant training and face a bottleneck on long time series. To solve these problems, we propose the Temporal-Spatio Graph ConvNeXt (TSGCNeXt) to explore efficient l…

    Submitted 23 April, 2023; originally announced April 2023.

  36. arXiv:2303.03933  [pdf, other]

    cs.LG cs.AI cs.DC

    DEDGAT: Dual Embedding of Directed Graph Attention Networks for Detecting Financial Risk

    Authors: Jiafu Wu, Mufeng Yao, Dong Wu, Mingmin Chi, Baokun Wang, Ruofan Wu, Xin Fu, Changhua Meng, Weiqiang Wang

    Abstract: Graph representation plays an important role in the field of financial risk control, where the relationship among users can be constructed in a graph manner. In practical scenarios, the relationships between nodes in risk control tasks are bidirectional, e.g., merchants having both revenue and expense behaviors. Graph neural networks designed for undirected graphs usually aggregate discriminative…

    Submitted 6 March, 2023; originally announced March 2023.

  37. arXiv:2302.01762  [pdf, other]

    cs.CR cs.CV cs.LG

    BackdoorBox: A Python Toolbox for Backdoor Learning

    Authors: Yiming Li, Mengxi Ya, Yang Bai, Yong Jiang, Shu-Tao Xia

    Abstract: Third-party resources (e.g., samples, backbones, and pre-trained models) are usually involved in the training of deep neural networks (DNNs), which brings backdoor attacks as a new training-phase threat. In general, backdoor attackers intend to implant hidden backdoors in DNNs, so that the attacked DNNs behave normally on benign samples whereas their predictions will be maliciously changed to a p…

    Submitted 1 February, 2023; originally announced February 2023.

    Comments: BackdoorBox V0.1. The first two authors contributed equally to this toolbox. 13 pages

  38. arXiv:2301.02484  [pdf, other]

    cs.CV

    Graph-Collaborated Auto-Encoder Hashing for Multi-view Binary Clustering

    Authors: Huibing Wang, Mingze Yao, Guangqi Jiang, Zetian Mi, Xianping Fu

    Abstract: Unsupervised hashing methods have attracted widespread attention with the explosive growth of large-scale data, which can greatly reduce storage and computation by learning compact binary codes. Existing unsupervised hashing methods attempt to exploit the valuable information from samples, which fails to take the local geometric structure of unlabeled samples into consideration. Moreover, hashing…

    Submitted 6 January, 2023; originally announced January 2023.

  39. arXiv:2211.12156  [pdf, other]

    cs.CV

    MSS-DepthNet: Depth Prediction with Multi-Step Spiking Neural Network

    Authors: Xiaoshan Wu, Weihua He, Man Yao, Ziyang Zhang, Yaoyuan Wang, Guoqi Li

    Abstract: Event cameras are considered to have great potential for computer vision and robotics applications because of their high temporal resolution and low power consumption characteristics. However, the event stream output from event cameras has asynchronous, sparse characteristics that existing computer vision algorithms cannot handle. Spiking neural network is a novel event-based computational paradig…

    Submitted 22 November, 2022; originally announced November 2022.

  40. arXiv:2209.13929  [pdf, other]

    cs.CV

    Attention Spiking Neural Networks

    Authors: Man Yao, Guangshe Zhao, Hengyu Zhang, Yifan Hu, Lei Deng, Yonghong Tian, Bo Xu, Guoqi Li

    Abstract: Benefiting from the event-driven and sparse spiking characteristics of the brain, spiking neural networks (SNNs) are becoming an energy-efficient alternative to artificial neural networks (ANNs). However, the performance gap between SNNs and ANNs has been a great hindrance to deploying SNNs ubiquitously for a long time. To leverage the full potential of SNNs, we study the effect of attention mecha…

    Submitted 28 September, 2022; originally announced September 2022.

    Comments: 18 pages, 8 figures, Under Review

  41. arXiv:2209.10043  [pdf, other]

    cs.LG cs.AI eess.IV q-bio.QM

    SynthA1c: Towards Clinically Interpretable Patient Representations for Diabetes Risk Stratification

    Authors: Michael S. Yao, Allison Chae, Matthew T. MacLean, Anurag Verma, Jeffrey Duda, James Gee, Drew A. Torigian, Daniel Rader, Charles Kahn, Walter R. Witschey, Hersh Sagreiya

    Abstract: Early diagnosis of Type 2 Diabetes Mellitus (T2DM) is crucial to enable timely therapeutic interventions and lifestyle modifications. As the time available for clinical office visits shortens and medical imaging data become more widely available, patient image data could be used to opportunistically identify patients for additional T2DM diagnostic workup by physicians. We investigated whether imag…

    Submitted 27 July, 2023; v1 submitted 20 September, 2022; originally announced September 2022.

    Comments: 12 pages. Accepted to PRIME MICCAI 2023

  42. arXiv:2208.12835  [pdf, other]

    eess.IV cs.CV cs.LG

    A Path Towards Clinical Adaptation of Accelerated MRI

    Authors: Michael S. Yao, Michael S. Hansen

    Abstract: Accelerated MRI reconstructs images of clinical anatomies from sparsely sampled signal data to reduce patient scan times. While recent works have leveraged deep learning to accomplish this task, such approaches have often only been explored in simulated environments where there is no signal corruption or resource limitations. In this work, we explore augmentations to neural network MRI image recon…

    Submitted 28 November, 2022; v1 submitted 26 August, 2022; originally announced August 2022.

    Comments: Accepted to ML4H 2022

    Journal ref: In Proceedings of the 2nd Machine Learning for Health Symposium 193:489-511, 2022

  43. arXiv:2208.08052  [pdf, other]

    cs.CV cs.CR

    Imperceptible and Robust Backdoor Attack in 3D Point Cloud

    Authors: Kuofeng Gao, Jiawang Bai, Baoyuan Wu, Mengxi Ya, Shu-Tao Xia

    Abstract: With the thriving of deep learning in processing point cloud data, recent works show that backdoor attacks pose a severe security threat to 3D vision applications. The attacker injects the backdoor into the 3D model by poisoning a few training samples with trigger, such that the backdoored model performs well on clean samples but behaves maliciously when the trigger pattern appears. Existing attac…

    Submitted 16 August, 2022; originally announced August 2022.

  44. arXiv:2207.08736  [pdf, other]

    cs.CV

    Towards Diverse and Faithful One-shot Adaption of Generative Adversarial Networks

    Authors: Yabo Zhang, Mingshuai Yao, Yuxiang Wei, Zhilong Ji, Jinfeng Bai, Wangmeng Zuo

    Abstract: One-shot generative domain adaption aims to transfer a pre-trained generator on one domain to a new domain using one reference image only. However, it remains very challenging for the adapted generator (i) to generate diverse images inherited from the pre-trained generator while (ii) faithfully acquiring the domain-specific attributes and styles of the reference image. In this paper, we present a…

    Submitted 25 September, 2022; v1 submitted 18 July, 2022; originally announced July 2022.

    Comments: Accepted at NeurIPS 2022. Code is available at https://github.com/1170300521/DiFa

  45. Predicting Electricity Infrastructure Induced Wildfire Risk in California

    Authors: Mengqi Yao, Meghana Bharadwaj, Zheng Zhang, Baihong Jin, Duncan S. Callaway

    Abstract: This paper examines the use of risk models to predict the timing and location of wildfires caused by electricity infrastructure. Our data include historical ignition and wire-down points triggered by grid infrastructure collected between 2015 to 2019 in Pacific Gas & Electricity territory along with various weather, vegetation, and very high resolution data on grid infrastructure including locatio…

    Submitted 11 August, 2022; v1 submitted 6 June, 2022; originally announced June 2022.

  46. End-to-End Multimodal Fact-Checking and Explanation Generation: A Challenging Dataset and Models

    Authors: Barry Menglong Yao, Aditya Shah, Lichao Sun, Jin-Hee Cho, Lifu Huang

    Abstract: We propose end-to-end multimodal fact-checking and explanation generation, where the input is a claim and a large collection of web sources, including articles, images, videos, and tweets, and the goal is to assess the truthfulness of the claim by retrieving relevant evidence and predicting a truthfulness label (e.g., support, refute or not enough information), and to generate a statement to summa…

    Submitted 6 July, 2023; v1 submitted 25 May, 2022; originally announced May 2022.

    Comments: Accepted by SIGIR 23, 11 pages, 4 figures

    Journal ref: Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '23), July 23--27, 2023, Taipei, Taiwan

  47. arXiv:2205.05703  [pdf, other]

    cs.CV cs.RO

    Multi-Class 3D Object Detection with Single-Class Supervision

    Authors: Mao Ye, Chenxi Liu, Maoqing Yao, Weiyue Wang, Zhaoqi Leng, Charles R. Qi, Dragomir Anguelov

    Abstract: While multi-class 3D detectors are needed in many robotics applications, training them with fully labeled datasets can be expensive in labeling cost. An alternative approach is to have targeted single-class labels on disjoint data samples. In this paper, we are interested in training a multi-class 3D object detection model, while using these single-class labeled data. We begin by detailing the uni…

    Submitted 11 May, 2022; originally announced May 2022.

    Comments: ICRA 2022

  48. arXiv:2203.00911  [pdf, other]

    eess.IV cs.CV

    Towards Bidirectional Arbitrary Image Rescaling: Joint Optimization and Cycle Idempotence

    Authors: Zhihong Pan, Baopu Li, Dongliang He, Mingde Yao, Wenhao Wu, Tianwei Lin, Xin Li, Errui Ding

    Abstract: Deep learning based single image super-resolution models have been widely studied and superb results are achieved in upscaling low-resolution images with fixed scale factor and downscaling degradation kernel. To improve real world applicability of such models, there are growing interests to develop models optimized for arbitrary upscaling factors. Our proposed method is the first to treat arbitrar…

    Submitted 7 March, 2022; v1 submitted 2 March, 2022; originally announced March 2022.

    Comments: To appear at CVPR 2022

  49. arXiv:2201.06244  [pdf, other]

    cs.LG cs.CR

    EFMVFL: An Efficient and Flexible Multi-party Vertical Federated Learning without a Third Party

    Authors: Yimin Huang, Xinyu Feng, Wanwan Wang, Hao He, Yukun Wang, Ming Yao

    Abstract: Federated learning allows multiple participants to conduct joint modeling without disclosing their local data. Vertical federated learning (VFL) handles the situation where participants share the same ID space and different feature spaces. In most VFL frameworks, to protect the security and privacy of the participants' local data, a third party is needed to generate homomorphic encryption key pair…

    Submitted 17 January, 2022; originally announced January 2022.

    Comments: 9 pages, 2 figures

  50. arXiv:2112.13003  [pdf, other]

    cs.CV

    Continuous Spectral Reconstruction from RGB Images via Implicit Neural Representation

    Authors: Ruikang Xu, Mingde Yao, Chang Chen, Lizhi Wang, Zhiwei Xiong

    Abstract: Existing methods for spectral reconstruction usually learn a discrete mapping from RGB images to a number of spectral bands. However, this modeling strategy ignores the continuous nature of spectral signature. In this paper, we propose Neural Spectral Reconstruction (NeSR) to lift this limitation, by introducing a novel continuous spectral representation. To this end, we embrace the concept of imp…

    Submitted 31 August, 2022; v1 submitted 24 December, 2021; originally announced December 2021.

    Comments: Accepted to ECCV Workshop 2022