
Showing 1–50 of 1,704 results for author: Wang, D

Searching in archive cs.
  1. arXiv:2407.12888  [pdf]

    cs.CL cs.AI

    Explainable Biomedical Hypothesis Generation via Retrieval Augmented Generation enabled Large Language Models

    Authors: Alexander R. Pelletier, Joseph Ramirez, Irsyad Adam, Simha Sankar, Yu Yan, Ding Wang, Dylan Steinecke, Wei Wang, Peipei Ping

    Abstract: The vast amount of biomedical information available today presents a significant challenge for investigators seeking to digest, process, and understand these findings effectively. Large Language Models (LLMs) have emerged as powerful tools to navigate this complex and challenging data landscape. However, LLMs may lead to hallucinatory responses, making Retrieval Augmented Generation (RAG) crucial…

    Submitted 17 July, 2024; originally announced July 2024.

  2. arXiv:2407.12593  [pdf, other]

    cs.CV

    EvSign: Sign Language Recognition and Translation with Streaming Events

    Authors: Pengyu Zhang, Hao Yin, Zeren Wang, Wenyue Chen, Shengming Li, Dong Wang, Huchuan Lu, Xu Jia

    Abstract: Sign language is one of the most effective communication tools for people with hearing difficulties. Most existing works focus on improving the performance of sign language tasks on RGB videos, which may suffer from degraded recording conditions, such as fast movement of hands with motion blur and textured signer's appearance. The bio-inspired event camera, which asynchronously captures brightness…

    Submitted 17 July, 2024; originally announced July 2024.

    Comments: To appear at ECCV 2024

  3. arXiv:2407.12580  [pdf, other]

    cs.CL cs.CV cs.IR

    E5-V: Universal Embeddings with Multimodal Large Language Models

    Authors: Ting Jiang, Minghui Song, Zihan Zhang, Haizhen Huang, Weiwei Deng, Feng Sun, Qi Zhang, Deqing Wang, Fuzhen Zhuang

    Abstract: Multimodal large language models (MLLMs) have shown promising advancements in general visual and language understanding. However, the representation of multimodal information using MLLMs remains largely unexplored. In this work, we introduce a new framework, E5-V, designed to adapt MLLMs for achieving universal multimodal embeddings. Our findings highlight the significant potential of MLLMs in rep…

    Submitted 17 July, 2024; originally announced July 2024.

    Comments: Code and models are available at https://github.com/kongds/E5-V

  4. arXiv:2407.12338  [pdf, other]

    cs.IR cs.AI

    GUME: Graphs and User Modalities Enhancement for Long-Tail Multimodal Recommendation

    Authors: Guojiao Lin, Zhen Meng, Dongjie Wang, Qingqing Long, Yuanchun Zhou, Meng Xiao

    Abstract: Multimodal recommendation systems (MMRS) have received considerable attention from the research community due to their ability to jointly utilize information from user behavior and product images and text. Previous research has two main issues. First, many long-tail items in recommendation systems have limited interaction data, making it difficult to learn comprehensive and informative representat…

    Submitted 17 July, 2024; originally announced July 2024.

    Comments: 11 pages, accepted by CIKM 2024

  5. arXiv:2407.12015  [pdf, other]

    cs.CL cs.AI cs.HC

    The Great AI Witch Hunt: Reviewers Perception and (Mis)Conception of Generative AI in Research Writing

    Authors: Hilda Hadan, Derrick Wang, Reza Hadi Mogavi, Joseph Tu, Leah Zhang-Kennedy, Lennart E. Nacke

    Abstract: Generative AI (GenAI) use in research writing is growing fast. However, it is unclear how peer reviewers recognize or misjudge AI-augmented manuscripts. To investigate the impact of AI-augmented writing on peer reviews, we conducted a snippet-based online survey with 17 peer reviewers from top-tier HCI conferences. Our findings indicate that while AI-augmented writing improves readability, languag…

    Submitted 26 June, 2024; originally announced July 2024.

  6. arXiv:2407.11853  [pdf, other]

    cs.ET

    A Case for Application-Aware Space Radiation Tolerance in Orbital Computing

    Authors: Meiqi Wang, Han Qiu, Longnv Xu, Di Wang, Yuanjie Li, Tianwei Zhang, Jun Liu, Hewu Li

    Abstract: We are witnessing a surge in the use of commercial off-the-shelf (COTS) hardware for cost-effective in-orbit computing, such as deep neural network (DNN) based on-satellite sensor data processing, Earth object detection, and task decision. However, once exposed to harsh space environments, COTS hardware is vulnerable to cosmic radiation and suffers from exhaustive single-event upsets (SEUs) and mul…

    Submitted 16 July, 2024; originally announced July 2024.

  7. arXiv:2407.11477  [pdf, other]

    cs.LG cs.AI

    XTraffic: A Dataset Where Traffic Meets Incidents with Explainability and More

    Authors: Xiaochuan Gou, Ziyue Li, Tian Lan, Junpeng Lin, Zhishuai Li, Bingyu Zhao, Chen Zhang, Di Wang, Xiangliang Zhang

    Abstract: Long-separated research has been conducted on two highly correlated tracks: traffic and incidents. Traffic track witnesses complicating deep learning models, e.g., to push the prediction a few percent more accurate, and the incident track only studies the incidents alone, e.g., to infer the incident risk. We, for the first time, spatiotemporally aligned the two tracks in a large-scale region (16,9…

    Submitted 16 July, 2024; originally announced July 2024.

  8. arXiv:2407.11389  [pdf, ps, other]

    cs.NI eess.SP

    Spatial-spectral Cell-free Networks: A Large-scale Case Study

    Authors: Zesheng Zhu, Lifeng Wang, Xin Wang, Dongming Wang, Kai-Kit Wong

    Abstract: This paper studies the large-scale cell-free networks where dense distributed access points (APs) serve many users. As a promising next-generation network architecture, cell-free networks enable ultra-reliable connections and minimal fading/blockage, which are highly favorable to the millimeter wave and Terahertz transmissions. However, conventional beam management with large phased arrays in a cell…

    Submitted 16 July, 2024; originally announced July 2024.

  9. arXiv:2407.11038  [pdf, other]

    cs.LG cs.AI cs.NE

    Fuzzy Recurrent Stochastic Configuration Networks for Industrial Data Analytics

    Authors: Dianhui Wang, Gang Dang

    Abstract: This paper presents a novel neuro-fuzzy model, termed fuzzy recurrent stochastic configuration networks (F-RSCNs), for industrial data analytics. Unlike the original recurrent stochastic configuration network (RSCN), the proposed F-RSCN is constructed by multiple sub-reservoirs, and each sub-reservoir is associated with a Takagi-Sugeno-Kang (TSK) fuzzy rule. Through this hybrid framework, first, t…

    Submitted 5 July, 2024; originally announced July 2024.

  10. arXiv:2407.10401  [pdf, ps, other]

    cs.DS cs.GT

    The Average-Value Allocation Problem

    Authors: Kshipra Bhawalkar, Zhe Feng, Anupam Gupta, Aranyak Mehta, David Wajc, Di Wang

    Abstract: We initiate the study of centralized algorithms for welfare-maximizing allocation of goods to buyers subject to average-value constraints. We show that this problem is NP-hard to approximate beyond a factor of $\frac{e}{e-1}$, and provide a $\frac{4e}{e-1}$-approximate offline algorithm. For the online setting, we show that no non-trivial approximations are achievable under adversarial arrivals. U…

    Submitted 14 July, 2024; originally announced July 2024.

  11. The Jade Gateway to Exergaming: How Socio-Cultural Factors Shape Exergaming Among East Asian Older Adults

    Authors: Reza Hadi Mogavi, Juhyung Son, Simin Yang, Derrick M. Wang, Lydia Choong, Ahmad Alhilal, Peng Yuan Zhou, Pan Hui, Lennart E. Nacke

    Abstract: Exergaming, blending exercise and gaming, improves the physical and mental health of older adults. We currently do not fully know the factors that drive older adults to either engage in or abstain from exergaming. Large-scale studies investigating this are still scarce, particularly those studying East Asian older adults. To address this, we interviewed 64 older adults from China, Japan, and South…

    Submitted 13 July, 2024; originally announced July 2024.

    Comments: This manuscript is the pre-print version of our paper, which has been accepted for the ACM CHI Play 2024. Please visit https://doi.org/10.1145/3677106

  12. arXiv:2407.09007  [pdf, other]

    cs.CL

    Benchmarking Language Model Creativity: A Case Study on Code Generation

    Authors: Yining Lu, Dixuan Wang, Tianjian Li, Dongwei Jiang, Daniel Khashabi

    Abstract: As LLMs become increasingly prevalent, it is interesting to consider how "creative" these models can be. From cognitive science, creativity consists of at least two key characteristics: convergent thinking (purposefulness to achieve a given goal) and divergent thinking (adaptability to new environments or constraints) \citep{runco2003critical}. In this work, we introduce a framewor…

    Submitted 12 July, 2024; originally announced July 2024.

  13. arXiv:2407.08948  [pdf, other]

    eess.IV cs.CV

    Symmetry Awareness Encoded Deep Learning Framework for Brain Imaging Analysis

    Authors: Yang Ma, Dongang Wang, Peilin Liu, Lynette Masters, Michael Barnett, Weidong Cai, Chenyu Wang

    Abstract: The heterogeneity of neurological conditions, ranging from structural anomalies to functional impairments, presents a significant challenge in medical imaging analysis tasks. Moreover, the limited availability of well-annotated datasets constrains the development of robust analysis models. Against this backdrop, this study introduces a novel approach leveraging the inherent anatomical symmetrical…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: MICCAI 2024

    ACM Class: I.2.10; I.4.10

  14. arXiv:2407.08845  [pdf, ps, other]

    cs.DS

    Optimal Protocols for 2-Party Contention Resolution

    Authors: Dingyu Wang

    Abstract: Contention Resolution is a fundamental symmetry-breaking problem in which $n$ devices must acquire temporary and exclusive access to some shared resource, without the assistance of a mediating authority. For example, the $n$ devices may be sensors that each need to transmit a single packet of data over a broadcast channel. In each time step, devices can (probabilistically) choose to…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: full version of the corresponding conference version

  15. arXiv:2407.08514  [pdf, other]

    cs.CV

    Rethinking the Threat and Accessibility of Adversarial Attacks against Face Recognition Systems

    Authors: Yuxin Cao, Yumeng Zhu, Derui Wang, Sheng Wen, Minhui Xue, Jin Lu, Hao Ge

    Abstract: Face recognition pipelines have been widely deployed in various mission-critical systems in trust, equitable and responsible AI applications. However, the emergence of adversarial attacks has threatened the security of the entire recognition pipeline. Despite the sheer number of attack methods proposed for crafting adversarial examples in both digital and physical forms, it is never an easy task t…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: 19 pages, 12 figures

  16. arXiv:2407.08022  [pdf, other]

    cs.GT cs.AI cs.LG

    Deep Reinforcement Learning for Sequential Combinatorial Auctions

    Authors: Sai Srivatsa Ravindranath, Zhe Feng, Di Wang, Manzil Zaheer, Aranyak Mehta, David C. Parkes

    Abstract: Revenue-optimal auction design is a challenging problem with significant theoretical and practical implications. Sequential auction mechanisms, known for their simplicity and strong strategyproofness guarantees, are often limited by theoretical results that are largely existential, except for certain restrictive settings. Although traditional reinforcement learning methods such as Proximal Policy…

    Submitted 10 July, 2024; originally announced July 2024.

  17. arXiv:2407.07476  [pdf, other]

    cs.DC

    A Transverse-Read-assisted Valid-Bit Collection to Accelerate Stochastic Computing MAC for Energy-Efficient in-RTM DNNs

    Authors: Jihe Wang, Zhiying Zhang, Xingwu Dong, Danghui Wang

    Abstract: It looks very attractive to coordinate racetrack-memory (RM) and stochastic-computing (SC) jointly to build an ultra-low power neuron-architecture. However, the above combination has always been questioned in a fatal weakness that the narrow bit-view of the RM-MTJ structure, a.k.a. shift-and-access pattern, cannot physically match the great throughput of direct-stored stochastic sequences. Fortunately, a…

    Submitted 10 July, 2024; originally announced July 2024.

  18. arXiv:2407.07304  [pdf, other]

    cs.AI

    Inference Performance Optimization for Large Language Models on CPUs

    Authors: Pujiang He, Shan Zhou, Wenhuan Huang, Changqing Li, Duyi Wang, Bin Guo, Chen Meng, Sheng Gui, Weifei Yu, Yi Xie

    Abstract: Large language models (LLMs) have shown exceptional performance and vast potential across diverse tasks. However, the deployment of LLMs with high performance in low-resource environments has garnered significant attention in the industry. When GPU hardware resources are limited, we can explore alternative options on CPUs. To mitigate the financial burden and alleviate constraints imposed by hardw…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: 5 pages, 6 figures, ICML 2024 on Foundation Models in the Wild

  19. arXiv:2407.07099  [pdf, other]

    cs.CL cs.AI cs.GT cs.LG

    Nash CoT: Multi-Path Inference with Preference Equilibrium

    Authors: Ziqi Zhang, Cunxiang Wang, Xiong Xiao, Yue Zhang, Donglin Wang

    Abstract: Chain-of-thought (CoT) prompting has emerged as a powerful technique for enhancing the reasoning capabilities of Large Language Models (LLMs) on complex problems. Among CoT-related studies, self-consistency (Multi-path inference with answer filtering through voting) involves generating multiple reasoning paths using the CoT framework and then selecting the most frequently produced outputs standing…

    Submitted 18 June, 2024; originally announced July 2024.

  20. arXiv:2407.06714  [pdf, other]

    cs.CV

    Improving the Transferability of Adversarial Examples by Feature Augmentation

    Authors: Donghua Wang, Wen Yao, Tingsong Jiang, Xiaohu Zheng, Junqi Wu, Xiaoqian Chen

    Abstract: Despite the success of input transformation-based attacks on boosting adversarial transferability, the performance is unsatisfying due to the ignorance of the discrepancy across models. In this paper, we propose a simple but effective feature augmentation attack (FAUG) method, which improves adversarial transferability without introducing extra computation costs. Specifically, we inject the random…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: 19 pages, 4 figures, 4 tables

  21. arXiv:2407.06688  [pdf, other]

    cs.CV

    Universal Multi-view Black-box Attack against Object Detectors via Layout Optimization

    Authors: Donghua Wang, Wen Yao, Tingsong Jiang, Chao Li, Xiaoqian Chen

    Abstract: Object detectors have demonstrated vulnerability to adversarial examples crafted by small perturbations that can deceive the object detector. Existing adversarial attacks mainly focus on white-box attacks and are merely valid at a specific viewpoint, while the universal multi-view black-box attack is less explored, limiting their generalization in practice. In this paper, we propose a novel univer…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: 12 pages, 13 figures, 5 tables

  22. arXiv:2407.06649  [pdf, ps, other]

    cs.SC math.AC

    On the equivalence problem of Smith forms for multivariate polynomial matrices

    Authors: Dong Lu, Dingkang Wang, Fanghui Xiao, Xiaopeng Zheng

    Abstract: This paper delves into the equivalence problem of Smith forms for multivariate polynomial matrices. Generally speaking, multivariate ($n \geq 2$) polynomial matrices and their Smith forms may not be equivalent. However, under a certain specific condition, we derive the necessary and sufficient condition for their equivalence. Let $F\in K[x_1,\ldots,x_n]^{l\times m}$ be of rank $r$,…

    Submitted 9 July, 2024; originally announced July 2024.

  23. arXiv:2407.06078  [pdf, ps, other]

    cs.SD

    Few-Shot Keyword Spotting from Mixed Speech

    Authors: Junming Yuan, Ying Shi, LanTian Li, Dong Wang, Askar Hamdulla

    Abstract: Few-shot keyword spotting (KWS) aims to detect unknown keywords with limited training samples. A commonly used approach is the pre-training and fine-tuning framework. While effective in clean conditions, this approach struggles with mixed keyword spotting -- simultaneously detecting multiple keywords blended in an utterance, which is crucial in real-world applications. Previous research has propos…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: accepted by INTERSPEECH 2024

  24. arXiv:2407.05873  [pdf, other]

    eess.SP cs.IT

    Receiver Selection and Transmit Beamforming for Multi-static Integrated Sensing and Communications

    Authors: Dan Wang, Yuanming Tian, Chuan Huang, Hao Chen, Xiaodong Xu, Ping Zhang

    Abstract: Next-generation wireless networks are expected to develop a novel paradigm of integrated sensing and communications (ISAC) to enable both the high-accuracy sensing and high-speed communications. However, conventional mono-static ISAC systems, which simultaneously transmit and receive at the same equipment, may suffer from severe self-interference, and thus significantly degrade the system performa…

    Submitted 8 July, 2024; originally announced July 2024.

  25. arXiv:2407.05840  [pdf, other]

    cs.ET physics.optics

    A 103-TOPS/mm$^2$ Integrated Photonic Computing Engine Enabling Next-Generation Reservoir Computing

    Authors: Dongliang Wang, Yikun Nie, Gaolei Hu, Hon Ki Tsang, Chaoran Huang

    Abstract: Reservoir computing (RC) is a leading machine learning algorithm for information processing due to its rich expressiveness. A new RC paradigm has recently emerged, showcasing superior performance and delivering more interpretable results with shorter training data sets and training times, representing the next generation of RC computing. This work presents the first realization of a high-speed nex…

    Submitted 31 May, 2024; originally announced July 2024.

  26. arXiv:2407.05165  [pdf, other]

    cs.SE

    Feedback-Driven Automated Whole Bug Report Reproduction for Android Apps

    Authors: Dingbang Wang, Yu Zhao, Sidong Feng, Zhaoxu Zhang, William G. J. Halfond, Chunyang Chen, Xiaoxia Sun, Jiangfan Shi, Tingting Yu

    Abstract: In software development, bug report reproduction is a challenging task. This paper introduces ReBL, a novel feedback-driven approach that leverages GPT-4, a large-scale language model, to automatically reproduce Android bug reports. Unlike traditional methods, ReBL bypasses the use of Step to Reproduce (S2R) entities. Instead, it leverages the entire textual bug report and employs innovative promp…

    Submitted 6 July, 2024; originally announced July 2024.

    Comments: Accepted by ISSTA 2024

  27. arXiv:2407.05112  [pdf, other]

    cs.CR cs.AI

    Releasing Malevolence from Benevolence: The Menace of Benign Data on Machine Unlearning

    Authors: Binhao Ma, Tianhang Zheng, Hongsheng Hu, Di Wang, Shuo Wang, Zhongjie Ba, Zhan Qin, Kui Ren

    Abstract: Machine learning models trained on vast amounts of real or synthetic data often achieve outstanding predictive performance across various domains. However, this utility comes with increasing concerns about privacy, as the training data may include sensitive information. To address these concerns, machine unlearning has been proposed to erase specific data samples from models. While some unlearning…

    Submitted 6 July, 2024; originally announced July 2024.

  28. arXiv:2407.04976  [pdf, other]

    cs.DS

    Congestion-Approximators from the Bottom Up

    Authors: Jason Li, Satish Rao, Di Wang

    Abstract: We develop a novel algorithm to construct a congestion-approximator with polylogarithmic quality on a capacitated, undirected graph in nearly-linear time. Our approach is the first *bottom-up* hierarchical construction, in contrast to previous *top-down* approaches including that of Racke, Shah, and Taubig (SODA 2014), the only other construction achieving polylogarithmic quality that is implement…

    Submitted 6 July, 2024; originally announced July 2024.

    Comments: 46 pages

  29. arXiv:2407.04931  [pdf, other]

    cs.DS math.PR

    Universal Perfect Samplers for Incremental Streams

    Authors: Seth Pettie, Dingyu Wang

    Abstract: If $G : \mathbb{R}_+ \to \mathbb{R}_+$, the $G$-moment of a vector $\mathbf{x}\in\mathbb{R}_+^n$ is $G(\mathbf{x}) = \sum_{v\in[n]} G(\mathbf{x}(v))$ and the $G$-sampling problem is to select an index $v_*\in [n]$ according to its contribution to the $G$-moment, i.e., such that $\Pr(v_*=v) = G(\mathbf{x}(v))/G(\mathbf{x})$. Approximate $G$-samplers may introduce multiplicative and/or additive erro…

    Submitted 5 July, 2024; originally announced July 2024.

  30. arXiv:2407.04929  [pdf, other]

    cs.RO

    Toward Precise Robotic Weed Flaming Using a Mobile Manipulator with a Flamethrower

    Authors: Di Wang, Chengsong Hu, Shuangyu Xie, Joe Johnson, Hojun Ji, Yingtao Jiang, Muthukumar Bagavathiannan, Dezhen Song

    Abstract: Robotic weed flaming is a new and environmentally friendly approach to weed removal in the agricultural field. Using a mobile manipulator equipped with a flamethrower, we design a new system and algorithm to enable effective weed flaming, which requires robotic manipulation with a soft and deformable end effector, as the thermal coverage of the flame is affected by dynamic or unknown environmental…

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: IROS 2024

  31. arXiv:2407.04466  [pdf, other]

    cs.CL

    Using LLMs to label medical papers according to the CIViC evidence model

    Authors: Markus Hisch, Xing David Wang

    Abstract: We introduce the sequence classification problem CIViC Evidence to the field of medical NLP. CIViC Evidence denotes the multi-label classification problem of assigning labels of clinical evidence to abstracts of scientific papers which have examined various combinations of genomic variants, cancer types, and treatment approaches. We approach CIViC Evidence using different language models: We fine-…

    Submitted 5 July, 2024; originally announced July 2024.

  32. arXiv:2407.04404  [pdf]

    cs.AR

    Fixed and Movable Antenna Technology for 6G Integrated Sensing and Communication

    Authors: Yong Zeng, Zhenjun Dong, Huizhi Wang, Lipeng Zhu, Ziyao Hong, Qingji Jiang, Dongming Wang, Shi Jin, Rui Zhang

    Abstract: By deploying antenna arrays at the transmitter/receiver to provide additional spatial-domain degrees of freedom (DoFs), multi-antenna technology greatly improves the reliability and efficiency of wireless communication. Meanwhile, the application of multi-antenna technology in the radar field has achieved spatial angle resolution and improved sensing DoF, thus significantly enhancing wireless sens…

    Submitted 16 July, 2024; v1 submitted 5 July, 2024; originally announced July 2024.

    Comments: in Chinese language

  33. arXiv:2407.04267  [pdf, other]

    cs.DC

    A High-Quality Workflow for Multi-Resolution Scientific Data Reduction and Visualization

    Authors: Daoce Wang, Pascal Grosset, Jesus Pulido, Tushar M. Athawale, Jiannan Tian, Kai Zhao, Zarija Lukić, Axel Huebl, Zhe Wang, James Ahrens, Dingwen Tao

    Abstract: Multi-resolution methods such as Adaptive Mesh Refinement (AMR) can enhance storage efficiency for HPC applications generating vast volumes of data. However, their applicability is limited and cannot be universally deployed across all applications. Furthermore, integrating lossy compression with multi-resolution techniques to further boost storage efficiency encounters significant barriers. To thi…

    Submitted 11 July, 2024; v1 submitted 5 July, 2024; originally announced July 2024.

    Comments: accepted by SC '24

  34. arXiv:2407.04064  [pdf, other]

    cs.RO

    Collision Avoidance for Multiple UAVs in Unknown Scenarios with Causal Representation Disentanglement

    Authors: Jiafan Zhuang, Zihao Xia, Gaofei Han, Boxi Wang, Wenji Li, Dongliang Wang, Zhifeng Hao, Ruichu Cai, Zhun Fan

    Abstract: Deep reinforcement learning (DRL) has achieved remarkable progress in online path planning tasks for multi-UAV systems. However, existing DRL-based methods often suffer from performance degradation when tackling unseen scenarios, since the non-causal factors in visual representations adversely affect policy learning. To address this issue, we propose a novel representation learning approach, i.e.,…

    Submitted 15 July, 2024; v1 submitted 4 July, 2024; originally announced July 2024.

  35. arXiv:2407.04056  [pdf, other]

    cs.RO

    Robust Policy Learning for Multi-UAV Collision Avoidance with Causal Feature Selection

    Authors: Jiafan Zhuang, Gaofei Han, Zihao Xia, Boxi Wang, Wenji Li, Dongliang Wang, Zhifeng Hao, Ruichu Cai, Zhun Fan

    Abstract: In unseen and complex outdoor environments, collision avoidance navigation for unmanned aerial vehicle (UAV) swarms presents a challenging problem. It requires UAVs to navigate through various obstacles and complex backgrounds. Existing collision avoidance navigation methods based on deep reinforcement learning show promising performance but suffer from poor generalization abilities, resulting in…

    Submitted 15 July, 2024; v1 submitted 4 July, 2024; originally announced July 2024.

  36. arXiv:2407.03966  [pdf, other]

    cs.SD cs.AI eess.AS

    Serialized Output Training by Learned Dominance

    Authors: Ying Shi, Lantian Li, Shi Yin, Dong Wang, Jiqing Han

    Abstract: Serialized Output Training (SOT) has showcased state-of-the-art performance in multi-talker speech recognition by sequentially decoding the speech of individual speakers. To address the challenging label-permutation issue, prior methods have relied on either the Permutation Invariant Training (PIT) or the time-based First-In-First-Out (FIFO) rule. This study presents a model-based serialization st…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: accepted by INTERSPEECH 2024

  37. arXiv:2407.03647  [pdf, other]

    math.OC cs.AI

    WANCO: Weak Adversarial Networks for Constrained Optimization problems

    Authors: Gang Bao, Dong Wang, Boyi Zou

    Abstract: This paper focuses on integrating the networks and adversarial training into constrained optimization problems to develop a framework algorithm for constrained optimization problems. For such problems, we first transform them into minimax problems using the augmented Lagrangian method and then use two (or several) deep neural networks (DNNs) to represent the primal and dual variables respectively.…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: 24 pages, 18 figures

  38. arXiv:2407.03531  [pdf, other]

    cs.RO

    OrbitGrasp: $SE(3)$-Equivariant Grasp Learning

    Authors: Boce Hu, Xupeng Zhu, Dian Wang, Zihao Dong, Haojie Huang, Chenghao Wang, Robin Walters, Robert Platt

    Abstract: While grasp detection is an important part of any robotic manipulation pipeline, reliable and accurate grasp detection in $SE(3)$ remains a research challenge. Many robotics applications in unstructured environments such as the home or warehouse would benefit a lot from better grasp performance. This paper proposes a novel framework for detecting $SE(3)$ grasp poses based on point cloud input. Our…

    Submitted 3 July, 2024; originally announced July 2024.

  39. arXiv:2407.01909  [pdf, other]

    cs.CL cs.SD eess.AS

    Pinyin Regularization in Error Correction for Chinese Speech Recognition with Large Language Models

    Authors: Zhiyuan Tang, Dong Wang, Shen Huang, Shidong Shang

    Abstract: Recent studies have demonstrated the efficacy of large language models (LLMs) in error correction for automatic speech recognition (ASR). However, much of the research focuses on the English language. This paper redirects the attention to Chinese. Firstly, we construct a specialized benchmark dataset aimed at error correction for Chinese ASR with 724K hypotheses-transcription pairs, named the Chin…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: Interspeech 2024

  40. arXiv:2407.01812  [pdf, other]

    cs.RO cs.LG

    Equivariant Diffusion Policy

    Authors: Dian Wang, Stephen Hart, David Surovik, Tarik Kelestemur, Haojie Huang, Haibo Zhao, Mark Yeatman, Jiuguang Wang, Robin Walters, Robert Platt

    Abstract: Recent work has shown diffusion models are an effective approach to learning the multimodal distributions arising from demonstration data in behavior cloning. However, a drawback of this approach is the need to learn a denoising function, which is significantly more complex than learning an explicit policy. In this work, we propose Equivariant Diffusion Policy, a novel diffusion policy learning me…

    Submitted 1 July, 2024; originally announced July 2024.

  41. arXiv:2407.01251  [pdf, other]

    cs.CR cs.AI

    QUEEN: Query Unlearning against Model Extraction

    Authors: Huajie Chen, Tianqing Zhu, Lefeng Zhang, Bo Liu, Derui Wang, Wanlei Zhou, Minhui Xue

    Abstract: Model extraction attacks currently pose a non-negligible threat to the security and privacy of deep learning models. By querying the model with a small dataset and using the query results as the ground-truth labels, an adversary can steal a piracy model with performance comparable to the original model. Two key issues that cause the threat are, on the one hand, accurate and unlimited queries can be…

    Submitted 1 July, 2024; originally announced July 2024.

  42. arXiv:2407.01131  [pdf, other]

    cs.CV

    M$^2$IST: Multi-Modal Interactive Side-Tuning for Memory-efficient Referring Expression Comprehension

    Authors: Xuyang Liu, Ting Liu, Siteng Huang, Yue Hu, Quanjun Yin, Donglin Wang, Honggang Chen

    Abstract: Referring expression comprehension (REC) is a vision-language task to locate a target object in an image based on a language expression. Fully fine-tuning general-purpose pre-trained models for REC yields impressive performance but becomes increasingly costly. Parameter-efficient transfer learning (PETL) methods have shown strong performance with fewer tunable parameters. However, applying PETL to…

    Submitted 1 July, 2024; originally announced July 2024.

  43. arXiv:2407.00995  [pdf, other]

    cs.CY eess.SY physics.app-ph

    Data on the Move: Traffic-Oriented Data Trading Platform Powered by AI Agent with Common Sense

    Authors: Yi Yu, Shengyue Yao, Tianchen Zhou, Yexuan Fu, Jingru Yu, Ding Wang, Xuhong Wang, Cen Chen, Yilun Lin

    Abstract: In the digital era, data has become a pivotal asset, advancing technologies such as autonomous driving. Despite this, data trading faces challenges like the absence of robust pricing methods and the lack of trustworthy trading mechanisms. To address these challenges, we introduce a traffic-oriented data trading platform named Data on The Move (DTM), integrating traffic simulation, data trading, an…

    Submitted 1 July, 2024; originally announced July 2024.

  44. arXiv:2407.00290  [pdf, other]

    cs.RO

    Variable Time Step Reinforcement Learning for Robotic Applications

    Authors: Dong Wang, Giovanni Beltrame

    Abstract: Traditional reinforcement learning (RL) generates discrete control policies, assigning one action per cycle. These policies are usually implemented in a fixed-frequency control loop. This rigidity presents challenges as optimal control frequency is task-dependent; suboptimal frequencies increase computational demands and reduce exploration efficiency. Variable Time Step Reinforcement…

    Submitted 28 June, 2024; originally announced July 2024.

  45. arXiv:2407.00029  [pdf, other]

    cs.DC

    Distributed Inference Performance Optimization for LLMs on CPUs

    Authors: Pujiang He, Shan Zhou, Changqing Li, Wenhuan Huang, Weifei Yu, Duyi Wang, Chen Meng, Sheng Gui

    Abstract: Large language models (LLMs) hold tremendous potential for addressing numerous real-world challenges, yet they typically demand significant computational resources and memory. Deploying LLMs onto a resource-limited hardware device with restricted memory capacity presents considerable challenges. Distributed computing emerges as a prevalent strategy to mitigate single-node memory constraints and ex…

    Submitted 16 May, 2024; originally announced July 2024.

    Comments: 4 pages, 3 figures, Practical ML for Low Resource Settings Workshop @ ICLR 2024

  46. arXiv:2406.19774  [pdf, other]

    cs.CL

    Direct Preference Knowledge Distillation for Large Language Models

    Authors: Yixing Li, Yuxian Gu, Li Dong, Dequan Wang, Yu Cheng, Furu Wei

    Abstract: In the field of large language models (LLMs), Knowledge Distillation (KD) is a critical technique for transferring capabilities from teacher models to student models. However, existing KD methods face limitations and challenges in distillation of LLMs, including efficiency and insufficient measurement capabilities of traditional KL divergence. It is shown that LLMs can serve as an implicit reward…

    Submitted 28 June, 2024; originally announced June 2024.

  47. arXiv:2406.19596  [pdf, other]

    cs.CR cs.AI cs.LG

    Optimizing Cyber Defense in Dynamic Active Directories through Reinforcement Learning

    Authors: Diksha Goel, Kristen Moore, Mingyu Guo, Derui Wang, Minjune Kim, Seyit Camtepe

    Abstract: This paper addresses a significant gap in Autonomous Cyber Operations (ACO) literature: the absence of effective edge-blocking ACO strategies in dynamic, real-world networks. It specifically targets the cybersecurity vulnerabilities of organizational Active Directory (AD) systems. Unlike the existing literature on edge-blocking defenses which considers AD systems as static entities, our study coun…

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: The manuscript has been accepted as a full paper at the European Symposium on Research in Computer Security (ESORICS) 2024

  48. arXiv:2406.18992  [pdf, other]

    cs.CV cs.AI cs.LG

    Semi-supervised Concept Bottleneck Models

    Authors: Lijie Hu, Tianhao Huang, Huanyi Xie, Chenyang Ren, Zhengyu Hu, Lu Yu, Di Wang

    Abstract: Concept Bottleneck Models (CBMs) have garnered increasing attention due to their ability to provide concept-based explanations for black-box deep learning models while achieving high final prediction accuracy using human-like concepts. However, the training of current CBMs heavily relies on the accuracy and richness of annotated concepts in the dataset. These concept labels are typically provided…

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: 17 pages

  49. arXiv:2406.18145  [pdf, other]

    cs.CR cs.LG

    Beyond Statistical Estimation: Differentially Private Individual Computation via Shuffling

    Authors: Shaowei Wang, Changyu Dong, Xiangfu Song, Jin Li, Zhili Zhou, Di Wang, Han Wu

    Abstract: In data-driven applications, preserving user privacy while enabling valuable computations remains a critical challenge. Technologies like Differential Privacy (DP) have been pivotal in addressing these concerns. The shuffle model of DP requires no trusted curators and can achieve high utility by leveraging the privacy amplification effect yielded from shuffling. These benefits have led to signific…

    Submitted 11 July, 2024; v1 submitted 26 June, 2024; originally announced June 2024.

  50. arXiv:2406.16997  [pdf, ps, other]

    cs.LG cs.AI

    Wavelet Attention GRU for Efficient Industrial Gas Recognition with Novel Metrics

    Authors: Ding Wang

    Abstract: Gas recognition technology has received considerable attention from researchers in recent years. Nevertheless, the gas recognition area has faced obstacles in implementing deep learning-based recognition solutions due to the absence of standardized protocols. To tackle this problem, we suggest using two sets of specialized evaluation measures for gas recognition algorithms. These metrics will make…

    Submitted 24 June, 2024; originally announced June 2024.