
Showing 1–50 of 730 results for author: Ding, Y

Searching in archive cs.

  1. arXiv:2407.13126  [pdf, other]

    cs.DC

    Improving GPU Multi-Tenancy Through Dynamic Multi-Instance GPU Reconfiguration

    Authors: Tianyu Wang, Sheng Li, Bingyao Li, Yue Dai, Ao Li, Geng Yuan, Yufei Ding, Youtao Zhang, Xulong Tang

    Abstract: Continuous learning (CL) has emerged as one of the most popular deep learning paradigms deployed in modern cloud GPUs. Specifically, CL has the capability to continuously update the model parameters (through model retraining) and use the updated model (if available) to serve overtime arriving inference requests. It is generally beneficial to co-locate the retraining and inference together to enabl…

    Submitted 17 July, 2024; originally announced July 2024.

  2. arXiv:2407.08255  [pdf, other]

    cs.CV cs.LG

    GraphMamba: An Efficient Graph Structure Learning Vision Mamba for Hyperspectral Image Classification

    Authors: Aitao Yang, Min Li, Yao Ding, Leyuan Fang, Yaoming Cai, Yujie He

    Abstract: Efficient extraction of spectral sequences and geospatial information has always been a hot topic in hyperspectral image classification. In terms of spectral sequence feature capture, RNN and Transformer have become mainstream classification frameworks due to their long-range feature capture capabilities. In terms of spatial information aggregation, CNN enhances the receptive field to retain integ…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: 13 pages, 10 figures

  3. arXiv:2407.06505  [pdf]

    cs.HC

    Not all explicit cues help communicate: Pedestrians' perceptions, fixations, and decisions toward automated vehicles with varied appearance

    Authors: Wei Lyu, Yaqin Cao, Yi Ding, Jingyu Li, Kai Tian, Hui Zhang

    Abstract: Given pedestrians' vulnerability in road traffic, it remains unclear how novel AV appearances will impact pedestrians crossing behaviour. To address this gap, this study pioneers an investigation into the influence of AVs' exterior design, correlated with their kinematics, on pedestrians' road-crossing perception and decision-making. A video-based eye-tracking experimental study was conducted with…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: 37 pages, 13 figures, 4 tables

  4. CrowdTransfer: Enabling Crowd Knowledge Transfer in AIoT Community

    Authors: Yan Liu, Bin Guo, Nuo Li, Yasan Ding, Zhouyangzi Zhang, Zhiwen Yu

    Abstract: Artificial Intelligence of Things (AIoT) is an emerging frontier based on the deep fusion of Internet of Things (IoT) and Artificial Intelligence (AI) technologies. Although advanced deep learning techniques enhance the efficient data processing and intelligent analysis of complex IoT data, they still suffer from notable challenges when deployed to practical AIoT applications, such as constrained…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: This paper has been accepted for publication in IEEE Communications Surveys & Tutorials. Copyright will be transferred without notice, after this version may no longer be accessible

  5. arXiv:2407.05739  [pdf, other]

    cs.NE cs.AI

    Multi-Bit Mechanism: A Novel Information Transmission Paradigm for Spiking Neural Networks

    Authors: Yongjun Xiao, Xianlong Tian, Yongqi Ding, Pei He, Mengmeng Jing, Lin Zuo

    Abstract: Since proposed, spiking neural networks (SNNs) gain recognition for their high performance, low power consumption and enhanced biological interpretability. However, while bringing these advantages, the binary nature of spikes also leads to considerable information loss in SNNs, ultimately causing performance degradation. We claim that the limited expressiveness of current binary spikes, resulting…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: Under review

  6. arXiv:2407.03474  [pdf]

    cs.CY

    How high-status women promote repeated collaboration among women in male-dominated contexts

    Authors: Huimin Xu, Jamie Strassman, Ying Ding, Steven Gray, Maytal Saar-Tsechansky

    Abstract: Male-dominated contexts pose a dilemma: they increase the benefits of repeated collaboration among women, yet at the same time, make such collaborations less likely. This paper seeks to understand the conditions that foster repeated collaboration among women versus men in male-dominated settings by examining the critical role of status hierarchies. Using collaboration data on 8,232,769 computer sc…

    Submitted 3 July, 2024; originally announced July 2024.

  7. arXiv:2407.03374  [pdf]

    cs.AI cs.SE eess.SP eess.SY

    An Outline of Prognostics and Health Management Large Model: Concepts, Paradigms, and Challenges

    Authors: Laifa Tao, Shangyu Li, Haifei Liu, Qixuan Huang, Liang Ma, Guoao Ning, Yiling Chen, Yunlong Wu, Bin Li, Weiwei Zhang, Zhengduo Zhao, Wenchao Zhan, Wenyan Cao, Chao Wang, Hongmei Liu, Jian Ma, Mingliang Suo, Yujie Cheng, Yu Ding, Dengwei Song, Chen Lu

    Abstract: Prognosis and Health Management (PHM), critical for ensuring task completion by complex systems and preventing unexpected failures, is widely adopted in aerospace, manufacturing, maritime, rail, energy, etc. However, PHM's development is constrained by bottlenecks like generalization, interpretation and verification abilities. Presently, generative artificial intelligence (AI), represented by Larg…

    Submitted 1 July, 2024; originally announced July 2024.

  8. arXiv:2407.02390  [pdf, other]

    cs.DC cs.LG

    Uncertainty-Aware Decarbonization for Datacenters

    Authors: Amy Li, Sihang Liu, Yi Ding

    Abstract: This paper represents the first effort to quantify uncertainty in carbon intensity forecasting for datacenter decarbonization. We identify and analyze two types of uncertainty -- temporal and spatial -- and discuss their system implications. To address the temporal dynamics in quantifying uncertainty for carbon intensity forecasting, we introduce a conformal prediction-based framework. Evaluation…

    Submitted 2 July, 2024; originally announced July 2024.

  9. arXiv:2407.02159  [pdf, other]

    cs.CV eess.IV

    SparseSSP: 3D Subcellular Structure Prediction from Sparse-View Transmitted Light Images

    Authors: Jintu Zheng, Yi Ding, Qizhe Liu, Yi Cao, Ying Hu, Zenan Wang

    Abstract: Traditional fluorescence staining is phototoxic to live cells, slow, and expensive; thus, the subcellular structure prediction (SSP) from transmitted light (TL) images is emerging as a label-free, faster, low-cost alternative. However, existing approaches utilize 3D networks for one-to-one voxel level dense prediction, which necessitates a frequent and time-consuming Z-axis imaging process. Moreov…

    Submitted 3 July, 2024; v1 submitted 2 July, 2024; originally announced July 2024.

    Comments: Accepted to ECCV2024

  10. arXiv:2407.00105  [pdf, other]

    cs.LG cs.AI

    Multiple Kronecker RLS fusion-based link propagation for drug-side effect prediction

    Authors: Yuqing Qian, Ziyu Zheng, Prayag Tiwari, Yijie Ding, Quan Zou

    Abstract: Drug-side effect prediction has become an essential area of research in the field of pharmacology. As the use of medications continues to rise, so does the importance of understanding and mitigating the potential risks associated with them. At present, researchers have turned to data-driven methods to predict drug-side effects. Drug-side effect prediction is a link prediction problem, and the rela…

    Submitted 27 June, 2024; originally announced July 2024.

    Comments: Transactions on Machine Learning Research (TMLR 2024)

  11. arXiv:2406.18938  [pdf, other]

    cs.IR

    Towards Personalized Federated Multi-scenario Multi-task Recommendation

    Authors: Yue Ding, Yanbiao Ji, Xun Cai, Xin Xin, Xiaofeng Gao, Hongtao Lu

    Abstract: In modern recommender system applications, such as e-commerce, predicting multiple targets like click-through rate (CTR) and post-view click-through & conversion rate (CTCVR) is common. Multi-task recommender systems are gaining traction in research and practical use. Existing multi-task recommender systems tackle diverse business scenarios, merging and modeling these scenarios unlocks shared kno…

    Submitted 27 June, 2024; originally announced June 2024.

  12. arXiv:2406.18585  [pdf, other]

    cs.CV cs.AI

    Flexible ViG: Learning the Self-Saliency for Flexible Object Recognition

    Authors: Lin Zuo, Kunshan Yang, Xianlong Tian, Kunbin He, Yongqi Ding, Mengmeng Jing

    Abstract: Existing computer vision methods mainly focus on the recognition of rigid objects, whereas the recognition of flexible objects remains unexplored. Recognizing flexible objects poses significant challenges due to their inherently diverse shapes and sizes, translucent attributes, ambiguous boundaries, and subtle inter-class differences. In this paper, we claim that these problems primarily arise fro…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: under review

  13. arXiv:2406.18345  [pdf, other]

    cs.LG eess.SP

    EmT: A Novel Transformer for Generalized Cross-subject EEG Emotion Recognition

    Authors: Yi Ding, Chengxuan Tong, Shuailei Zhang, Muyun Jiang, Yong Li, Kevin Lim Jun Liang, Cuntai Guan

    Abstract: Integrating prior knowledge of neurophysiology into neural network architecture enhances the performance of emotion decoding. While numerous techniques emphasize learning spatial and short-term temporal patterns, there has been limited emphasis on capturing the vital long-term contextual information associated with emotional cognitive processes. In order to address this discrepancy, we introduce a…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: 11 pages, 5 figures. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  14. arXiv:2406.17659  [pdf, other]

    cs.AI cs.RO

    DKPROMPT: Domain Knowledge Prompting Vision-Language Models for Open-World Planning

    Authors: Xiaohan Zhang, Zainab Altaweel, Yohei Hayamizu, Yan Ding, Saeid Amiri, Hao Yang, Andy Kaminski, Chad Esselink, Shiqi Zhang

    Abstract: Vision-language models (VLMs) have been applied to robot task planning problems, where the robot receives a task in natural language and generates plans based on visual inputs. While current VLMs have demonstrated strong vision-language understanding capabilities, their performance is still far from being satisfactory in planning tasks. At the same time, although classical task planners, such as P…

    Submitted 25 June, 2024; originally announced June 2024.

  15. arXiv:2406.16034  [pdf, ps, other]

    math.LO cs.LO

    Some General Completeness Results for Propositionally Quantified Modal Logics

    Authors: Yifeng Ding, Yipu Li

    Abstract: We study the completeness problem for propositionally quantified modal logics on quantifiable general frames, where the admissible sets are the propositions the quantifiers can range over and expressible sets of worlds are admissible, and Kripke frames, where the quantifiers range over all sets of worlds. We show that any normal propositionally quantified modal logic containing all instances of th…

    Submitted 23 June, 2024; originally announced June 2024.

  16. arXiv:2406.13951  [pdf, other]

    cs.CV

    Towards the in-situ Trunk Identification and Length Measurement of Sea Cucumbers via Bézier Curve Modelling

    Authors: Shuaixin Liu, Kunqian Li, Yilin Ding, Kuangwei Xu, Qianli Jiang, Q. M. Jonathan Wu, Dalei Song

    Abstract: We introduce a novel vision-based framework for in-situ trunk identification and length measurement of sea cucumbers, which plays a crucial role in the monitoring of marine ranching resources and mechanized harvesting. To model sea cucumber trunk curves with varying degrees of bending, we utilize the parametric Bézier curve due to its computational simplicity, stability, and extensive range of tra…

    Submitted 19 June, 2024; originally announced June 2024.

  17. arXiv:2406.13344  [pdf, other]

    cs.CV

    WaterMono: Teacher-Guided Anomaly Masking and Enhancement Boosting for Robust Underwater Self-Supervised Monocular Depth Estimation

    Authors: Yilin Ding, Kunqian Li, Han Mei, Shuaixin Liu, Guojia Hou

    Abstract: Depth information serves as a crucial prerequisite for various visual tasks, whether on land or underwater. Recently, self-supervised methods have achieved remarkable performance on several terrestrial benchmarks despite the absence of depth annotations. However, in more challenging underwater scenarios, they encounter numerous brand-new obstacles such as the influence of marine life and degradati…

    Submitted 19 June, 2024; originally announced June 2024.

  18. arXiv:2406.12404  [pdf]

    cs.CV

    Scan-to-BIM for As-built Roads: Automatic Road Digital Twinning from Semantically Labeled Point Cloud Data

    Authors: Yuexiong Ding, Mengtian Yin, Ran Wei, Ioannis Brilakis, Muyang Liu, Xiaowei Luo

    Abstract: Creating geometric digital twins (gDT) for as-built roads still faces many challenges, such as low automation level and accuracy, limited asset types and shapes, and reliance on engineering experience. A novel scan-to-building information modeling (scan-to-BIM) framework is proposed for automatic road gDT creation based on semantically labeled point cloud data (PCD), which considers six asset type…

    Submitted 18 June, 2024; originally announced June 2024.

  19. arXiv:2406.12395  [pdf]

    cs.CV

    SDNIA-YOLO: A Robust Object Detection Model for Extreme Weather Conditions

    Authors: Yuexiong Ding, Xiaowei Luo

    Abstract: Though current object detection models based on deep learning have achieved excellent results on many conventional benchmark datasets, their performance will dramatically decline on real-world images taken under extreme conditions. Existing methods either used image augmentation based on traditional image processing algorithms or applied customized and scene-limited image adaptation technologies f…

    Submitted 18 June, 2024; originally announced June 2024.

  20. arXiv:2406.09998  [pdf, other]

    eess.AS cs.AI cs.LG cs.MM cs.SD

    Understanding Pedestrian Movement Using Urban Sensing Technologies: The Promise of Audio-based Sensors

    Authors: Chaeyeon Han, Pavan Seshadri, Yiwei Ding, Noah Posner, Bon Woo Koo, Animesh Agrawal, Alexander Lerch, Subhrajit Guhathakurta

    Abstract: While various sensors have been deployed to monitor vehicular flows, sensing pedestrian movement is still nascent. Yet walking is a significant mode of travel in many cities, especially those in Europe, Africa, and Asia. Understanding pedestrian volumes and flows is essential for designing safer and more attractive pedestrian infrastructure and for controlling periodic overcrowding. This study dis…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: submitted to Urban Informatics

  21. arXiv:2406.07862  [pdf, other]

    cs.LG cs.AI cs.CV cs.NE

    Self-Distillation Learning Based on Temporal-Spatial Consistency for Spiking Neural Networks

    Authors: Lin Zuo, Yongqi Ding, Mengmeng Jing, Kunshan Yang, Yunqian Yu

    Abstract: Spiking neural networks (SNNs) have attracted considerable attention for their event-driven, low-power characteristics and high biological interpretability. Inspired by knowledge distillation (KD), recent research has improved the performance of the SNN model with a pre-trained teacher model. However, additional teacher models require significant computational resources, and it is tedious to manua…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: 17 pages, 6 figures

    ACM Class: I.2.6; I.5.1

  22. 1-D CNN-Based Online Signature Verification with Federated Learning

    Authors: Lingfeng Zhang, Yuheng Guo, Yepeng Ding, Hiroyuki Sato

    Abstract: Online signature verification plays a pivotal role in security infrastructures. However, conventional online signature verification models pose significant risks to data privacy, especially during training processes. To mitigate these concerns, we propose a novel federated learning framework that leverages 1-D Convolutional Neural Networks (CNN) for online signature verification. Furthermore, our…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: 8 pages, 11 figures, 1 table

  23. arXiv:2406.06025  [pdf, other]

    cs.SE cs.CL cs.LG

    RepoQA: Evaluating Long Context Code Understanding

    Authors: Jiawei Liu, Jia Le Tian, Vijay Daita, Yuxiang Wei, Yifeng Ding, Yuhan Katherine Wang, Jun Yang, Lingming Zhang

    Abstract: Recent advances have been improving the context windows of Large Language Models (LLMs). To quantify the real long-context capabilities of LLMs, evaluators such as the popular Needle in a Haystack have been developed to test LLMs over a large chunk of raw texts. While effective, current evaluations overlook the insight of how LLMs work with long-context code, i.e., repositories. To this end, we in…

    Submitted 10 June, 2024; originally announced June 2024.

  24. arXiv:2406.04802  [pdf, other]

    cs.CV cs.LG

    Predictive Dynamic Fusion

    Authors: Bing Cao, Yinan Xia, Yi Ding, Changqing Zhang, Qinghua Hu

    Abstract: Multimodal fusion is crucial in joint decision-making systems for rendering holistic judgments. Since multimodal data changes in open environments, dynamic fusion has emerged and achieved remarkable progress in numerous applications. However, most existing dynamic multimodal fusion methods lack theoretical guarantees and easily fall into suboptimal problems, yielding unreliability and instability.…

    Submitted 13 July, 2024; v1 submitted 7 June, 2024; originally announced June 2024.

    Comments: Accepted by ICML 2024

  25. arXiv:2406.04151  [pdf, other]

    cs.AI cs.CL

    AgentGym: Evolving Large Language Model-based Agents across Diverse Environments

    Authors: Zhiheng Xi, Yiwen Ding, Wenxiang Chen, Boyang Hong, Honglin Guo, Junzhe Wang, Dingwen Yang, Chenyang Liao, Xin Guo, Wei He, Songyang Gao, Lu Chen, Rui Zheng, Yicheng Zou, Tao Gui, Qi Zhang, Xipeng Qiu, Xuanjing Huang, Zuxuan Wu, Yu-Gang Jiang

    Abstract: Building generalist agents that can handle diverse tasks and evolve themselves across different environments is a long-term goal in the AI community. Large language models (LLMs) are considered a promising foundation to build such agents due to their generalized capabilities. Current approaches either have LLM-based agents imitate expert-provided trajectories step-by-step, requiring human supervis…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: Project site: https://agentgym.github.io

  26. arXiv:2406.02133  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    SimulTron: On-Device Simultaneous Speech to Speech Translation

    Authors: Alex Agranovich, Eliya Nachmani, Oleg Rybakov, Yifan Ding, Ye Jia, Nadav Bar, Heiga Zen, Michelle Tadmor Ramanovich

    Abstract: Simultaneous speech-to-speech translation (S2ST) holds the promise of breaking down communication barriers and enabling fluid conversations across languages. However, achieving accurate, real-time translation through mobile devices remains a major challenge. We introduce SimulTron, a novel S2ST architecture designed to tackle this task. SimulTron is a lightweight direct S2ST model that uses the st…

    Submitted 4 June, 2024; originally announced June 2024.

  27. arXiv:2406.01006  [pdf, other]

    cs.CL cs.AI cs.SE

    SemCoder: Training Code Language Models with Comprehensive Semantics

    Authors: Yangruibo Ding, Jinjun Peng, Marcus J. Min, Gail Kaiser, Junfeng Yang, Baishakhi Ray

    Abstract: Code Large Language Models (Code LLMs) have excelled at tasks like code completion but often miss deeper semantics such as execution effects and dynamic states. This paper aims to bridge the gap between Code LLMs' reliance on static text data and the need for thorough semantic understanding for complex tasks like debugging and program repair. We introduce a novel strategy to train Code LLMs with c…

    Submitted 3 June, 2024; originally announced June 2024.

  28. Model-Driven Security Analysis of Self-Sovereign Identity Systems

    Authors: Yepeng Ding, Hiroyuki Sato

    Abstract: Best practices of self-sovereign identity (SSI) are being intensively explored in academia and industry. Reusable solutions obtained from best practices are generalized as architectural patterns for systematic analysis and design reference, which significantly boosts productivity and increases the dependability of future implementations. For security-sensitive projects, architects make architectur…

    Submitted 2 June, 2024; originally announced June 2024.

  29. arXiv:2405.20860  [pdf, other]

    cs.LG

    Enhancing Efficiency of Safe Reinforcement Learning via Sample Manipulation

    Authors: Shangding Gu, Laixi Shi, Yuhao Ding, Alois Knoll, Costas Spanos, Adam Wierman, Ming Jin

    Abstract: Safe reinforcement learning (RL) is crucial for deploying RL agents in real-world applications, as it aims to maximize long-term rewards while satisfying safety constraints. However, safe RL often suffers from sample inefficiency, requiring extensive interactions with the environment to learn a safe policy. We propose Efficient Safe Policy Optimization (ESPO), a novel approach that enhances the ef…

    Submitted 31 May, 2024; originally announced May 2024.

  30. arXiv:2405.19338  [pdf, other]

    eess.SP cs.AI cs.CV

    Accurate Patient Alignment without Unnecessary Imaging Dose via Synthesizing Patient-specific 3D CT Images from 2D kV Images

    Authors: Yuzhen Ding, Jason M. Holmes, Hongying Feng, Baoxin Li, Lisa A. McGee, Jean-Claude M. Rwigema, Sujay A. Vora, Daniel J. Ma, Robert L. Foote, Samir H. Patel, Wei Liu

    Abstract: In radiotherapy, 2D orthogonally projected kV images are used for patient alignment when 3D-on-board imaging(OBI) unavailable. But tumor visibility is constrained due to the projection of patient's anatomy onto a 2D plane, potentially leading to substantial setup errors. In treatment room with 3D-OBI such as cone beam CT(CBCT), the field of view(FOV) of CBCT is limited with unnecessarily high imag…

    Submitted 1 April, 2024; originally announced May 2024.

    Comments: 17 pages, 8 figures and tables

  31. arXiv:2405.18194  [pdf, other]

    cs.LG cs.CR

    Delving into Differentially Private Transformer

    Authors: Youlong Ding, Xueyang Wu, Yining Meng, Yonggang Luo, Hao Wang, Weike Pan

    Abstract: Deep learning with differential privacy (DP) has garnered significant attention over the past years, leading to the development of numerous methods aimed at enhancing model accuracy and training efficiency. This paper delves into the problem of training Transformer models with differential privacy. Our treatment is modular: the logic is to `reduce' the problem of training DP Transformer to the mor…

    Submitted 29 May, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

    Comments: ICML 2024

  32. Superpixelwise Low-rank Approximation based Partial Label Learning for Hyperspectral Image Classification

    Authors: Shujun Yang, Yu Zhang, Yao Ding, Danfeng Hong

    Abstract: Insufficient prior knowledge of a captured hyperspectral image (HSI) scene may lead the experts or the automatic labeling systems to offer incorrect labels or ambiguous labels (i.e., assigning each training sample to a group of candidate labels, among which only one of them is valid; this is also known as partial label learning) during the labeling process. Accordingly, how to learn from such data…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: 0

    Journal ref: IEEE Geoscience and Remote Sensing Letters, 2023, IEEE

  33. arXiv:2405.16601  [pdf, other]

    cs.LG

    A CMDP-within-online framework for Meta-Safe Reinforcement Learning

    Authors: Vanshaj Khattar, Yuhao Ding, Bilgehan Sel, Javad Lavaei, Ming Jin

    Abstract: Meta-reinforcement learning has widely been used as a learning-to-learn framework to solve unseen tasks with limited experience. However, the aspect of constraint violations has not been adequately addressed in the existing works, making their application restricted in real-world settings. In this paper, we study the problem of meta-safe reinforcement learning (Meta-SRL) through the CMDP-within-on…

    Submitted 26 May, 2024; originally announced May 2024.

    Journal ref: ICLR 2023

  34. arXiv:2405.16390  [pdf, other]

    cs.AI cs.LG

    Safe and Balanced: A Framework for Constrained Multi-Objective Reinforcement Learning

    Authors: Shangding Gu, Bilgehan Sel, Yuhao Ding, Lu Wang, Qingwei Lin, Alois Knoll, Ming Jin

    Abstract: In numerous reinforcement learning (RL) problems involving safety-critical systems, a key challenge lies in balancing multiple objectives while simultaneously meeting all stringent safety constraints. To tackle this issue, we propose a primal-based framework that orchestrates policy optimization between multi-objective learning and constraint adherence. Our method employs a novel natural policy gr…

    Submitted 25 May, 2024; originally announced May 2024.

  35. arXiv:2405.16214  [pdf, other]

    cs.CV

    Underwater Image Enhancement by Diffusion Model with Customized CLIP-Classifier

    Authors: Shuaixin Liu, Kunqian Li, Yilin Ding, Qi Qi

    Abstract: Underwater Image Enhancement (UIE) aims to improve the visual quality from a low-quality input. Unlike other image enhancement tasks, underwater images suffer from the unavailability of real reference images. Although existing works exploit synthetic images and manually select well-enhanced images as reference images to train enhancement networks, their upper performance bound is limited by the re…

    Submitted 7 June, 2024; v1 submitted 25 May, 2024; originally announced May 2024.

  36. arXiv:2405.13401  [pdf, ps, other]

    cs.CR cs.CL

    TrojanRAG: Retrieval-Augmented Generation Can Be Backdoor Driver in Large Language Models

    Authors: Pengzhou Cheng, Yidong Ding, Tianjie Ju, Zongru Wu, Wei Du, Ping Yi, Zhuosheng Zhang, Gongshen Liu

    Abstract: Large language models (LLMs) have raised concerns about potential security threats despite performing significantly in Natural Language Processing (NLP). Backdoor attacks initially verified that LLM is doing substantial harm at all stages, but the cost and robustness have been criticized. Attacking LLMs is inherently risky in security review, while prohibitively expensive. Besides, the continuous…

    Submitted 7 July, 2024; v1 submitted 22 May, 2024; originally announced May 2024.

    Comments: 19 pages, 14 figures, 4 tables

  37. arXiv:2405.11647  [pdf, other]

    cs.AI cs.LG

    Hummer: Towards Limited Competitive Preference Dataset

    Authors: Li Jiang, Yusen Wu, Junwu Xiong, Jingqing Ruan, Yichuan Ding, Qingpei Guo, Zujie Wen, Jun Zhou, Xiaotie Deng

    Abstract: Preference datasets are essential for incorporating human preferences into pre-trained language models, playing a key role in the success of Reinforcement Learning from Human Feedback. However, these datasets often demonstrate conflicting alignment objectives, leading to increased vulnerability to jailbreak attacks and challenges in adapting downstream tasks to prioritize specific alignment object…

    Submitted 20 May, 2024; v1 submitted 19 May, 2024; originally announced May 2024.

    Comments: 9 pages, 5 figures

  38. arXiv:2405.06211  [pdf, other]

    cs.CL cs.AI cs.IR

    A Survey on RAG Meeting LLMs: Towards Retrieval-Augmented Large Language Models

    Authors: Wenqi Fan, Yujuan Ding, Liangbo Ning, Shijie Wang, Hengyun Li, Dawei Yin, Tat-Seng Chua, Qing Li

    Abstract: As one of the most advanced techniques in AI, Retrieval-Augmented Generation (RAG) can offer reliable and up-to-date external knowledge, providing huge convenience for numerous tasks. Particularly in the era of AI-Generated Content (AIGC), the powerful capacity of retrieval in providing additional knowledge enables RAG to assist existing generative AI in producing high-quality outputs. Recently, L…

    Submitted 17 June, 2024; v1 submitted 9 May, 2024; originally announced May 2024.

    Comments: This is the long version of the corresponding survey paper accepted by KDD2024

  39. arXiv:2405.05957  [pdf, other]

    cs.CL

    OpenBA-V2: Reaching 77.3% High Compression Ratio with Fast Multi-Stage Pruning

    Authors: Dan Qiao, Yi Su, Pinzheng Wang, Jing Ye, Wenjing Xie, Yuechi Zhou, Yuyang Ding, Zecheng Tang, Jikai Wang, Yixin Ji, Yue Wang, Pei Guo, Zechen Sun, Zikang Zhang, Juntao Li, Pingfu Chao, Wenliang Chen, Guohong Fu, Guodong Zhou, Qiaoming Zhu, Min Zhang

    Abstract: Large Language Models (LLMs) have played an important role in many fields due to their powerful capabilities. However, their massive number of parameters leads to high deployment requirements and incurs significant inference costs, which impedes their practical applications. Training smaller models is an effective way to address this problem. Therefore, we introduce OpenBA-V2, a 3.4B model derived…

    Submitted 9 May, 2024; originally announced May 2024.

  40. arXiv:2405.04819  [pdf, other]

    cs.CL cs.AI

    DALK: Dynamic Co-Augmentation of LLMs and KG to answer Alzheimer's Disease Questions with Scientific Literature

    Authors: Dawei Li, Shu Yang, Zhen Tan, Jae Young Baik, Sukwon Yun, Joseph Lee, Aaron Chacko, Bojian Hou, Duy Duong-Tran, Ying Ding, Huan Liu, Li Shen, Tianlong Chen

    Abstract: Recent advancements in large language models (LLMs) have achieved promising performances across various applications. Nonetheless, the ongoing challenge of integrating long-tail knowledge continues to impede the seamless adoption of LLMs in specialized domains. In this work, we introduce DALK, a.k.a. Dynamic Co-Augmentation of LLMs and KG, to address this limitation and demonstrate its ability on…

    Submitted 12 May, 2024; v1 submitted 8 May, 2024; originally announced May 2024.

    Comments: Under Review; Incorrect author name revised

  41. arXiv:2405.03144  [pdf, other]

    cs.CV cs.LG

    PTQ4SAM: Post-Training Quantization for Segment Anything

    Authors: Chengtao Lv, Hong Chen, Jinyang Guo, Yifu Ding, Xianglong Liu

    Abstract: Segment Anything Model (SAM) has achieved impressive performance in many computer vision tasks. However, as a large-scale model, the immense memory and computation costs hinder its practical deployment. In this paper, we propose a post-training quantization (PTQ) framework for Segment Anything Model, namely PTQ4SAM. First, we investigate the inherent bottleneck of SAM quantization attributed to th…

    Submitted 5 May, 2024; originally announced May 2024.

    Comments: CVPR 2024

  42. arXiv:2405.02969  [pdf, other]

    cs.LG cs.DC

    Towards a Flexible and High-Fidelity Approach to Distributed DNN Training Emulation

    Authors: Banruo Liu, Mubarak Adetunji Ojewale, Yuhan Ding, Marco Canini

    Abstract: We propose NeuronaBox, a flexible, user-friendly, and high-fidelity approach to emulate DNN training workloads. We argue that to accurately observe performance, it is possible to execute the training workload on a subset of real nodes and emulate the networked execution environment along with the collective communication operations. Initial results from a proof-of-concept implementation show that…

    Submitted 5 May, 2024; originally announced May 2024.

  43. arXiv:2405.02004  [pdf, other]

    cs.CV

    M${^2}$Depth: Self-supervised Two-Frame Multi-camera Metric Depth Estimation

    Authors: Yingshuang Zou, Yikang Ding, Xi Qiu, Haoqian Wang, Haotian Zhang

    Abstract: This paper presents a novel self-supervised two-frame multi-camera metric depth estimation network, termed M${^2}$Depth, which is designed to predict reliable scale-aware surrounding depth in autonomous driving. Unlike the previous works that use multi-view images from a single time-step or multiple time-step images from a single camera, M${^2}$Depth takes temporally adjacent two-frame images from…

    Submitted 3 May, 2024; originally announced May 2024.

  44. arXiv:2405.01677  [pdf, other]

    cs.LG cs.AI

    Balance Reward and Safety Optimization for Safe Reinforcement Learning: A Perspective of Gradient Manipulation

    Authors: Shangding Gu, Bilgehan Sel, Yuhao Ding, Lu Wang, Qingwei Lin, Ming Jin, Alois Knoll

    Abstract: Ensuring the safety of Reinforcement Learning (RL) is crucial for its deployment in real-world applications. Nevertheless, managing the trade-off between reward and safety during exploration presents a significant challenge. Improving reward performance through policy adjustments may adversely affect safety performance. In this study, we aim to address this conflicting relation by leveraging the t…

    Submitted 7 June, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

  45. arXiv:2405.00719  [pdf, other]

    eess.SP cs.LG q-bio.NC

    EEG-Deformer: A Dense Convolutional Transformer for Brain-computer Interfaces

    Authors: Yi Ding, Yong Li, Hao Sun, Rui Liu, Chengxuan Tong, Cuntai Guan

    Abstract: Effectively learning the temporal dynamics in electroencephalogram (EEG) signals is challenging yet essential for decoding brain activities using brain-computer interfaces (BCIs). Although Transformers are popular for their long-term sequential learning ability in the BCI field, most methods combining Transformers with convolutional neural networks (CNNs) fail to capture the coarse-to-fine tempora…

    Submitted 25 April, 2024; originally announced May 2024.

    Comments: 10 pages, 9 figures. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  46. arXiv:2405.00218  [pdf, other]

    cs.CR cs.AI cs.LG cs.SE

    Constrained Decoding for Secure Code Generation

    Authors: Yanjun Fu, Ethan Baker, Yu Ding, Yizheng Chen

    Abstract: Code Large Language Models (Code LLMs) have been increasingly used by developers to boost productivity, but they often generate vulnerable code. Thus, there is an urgent need to ensure that code generated by Code LLMs is correct and secure. Previous research has primarily focused on generating secure code, overlooking the fact that secure code also needs to be correct. This oversight can lead to a…

    Submitted 7 June, 2024; v1 submitted 30 April, 2024; originally announced May 2024.

    Comments: 17 pages, 9 figures

  47. arXiv:2404.19209  [pdf, other]

    cs.DC

    AdaOper: Energy-efficient and Responsive Concurrent DNN Inference on Mobile Devices

    Authors: Zheng Lin, Bin Guo, Sicong Liu, Wentao Zhou, Yasan Ding, Yu Zhang, Zhiwen Yu

    Abstract: Deep neural networks (DNNs) have driven extensive applications in mobile technology. However, for long-running mobile apps like voice assistants or video applications on smartphones, energy efficiency is critical for battery-powered devices. The rise of heterogeneous processors in mobile devices today has introduced new challenges for optimizing energy efficiency. Our key insight is that partitioning…

    Submitted 29 April, 2024; originally announced April 2024.

  48. arXiv:2404.17454  [pdf, other]

    cs.LG cs.AI q-bio.QM

    Domain Adaptive and Fine-grained Anomaly Detection for Single-cell Sequencing Data and Beyond

    Authors: Kaichen Xu, Yueyang Ding, Suyang Hou, Weiqiang Zhan, Nisang Chen, Jun Wang, Xiaobo Sun

    Abstract: Fine-grained anomalous cell detection from affected tissues is critical for clinical diagnosis and pathological research. Single-cell sequencing data provide unprecedented opportunities for this task. However, current anomaly detection methods struggle to handle domain shifts prevalent in multi-sample and multi-domain single-cell sequencing data, leading to suboptimal performance. Moreover, these…

    Submitted 29 April, 2024; v1 submitted 26 April, 2024; originally announced April 2024.

    Comments: 17 pages, 2 figures. Accepted by IJCAI 2024

  49. arXiv:2404.15247  [pdf, other]

    cs.CL cs.AI cs.LG cs.SE

    XFT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts

    Authors: Yifeng Ding, Jiawei Liu, Yuxiang Wei, Terry Yue Zhuo, Lingming Zhang

    Abstract: We introduce XFT, a simple yet powerful training scheme, by simply merging upcycled Mixture-of-Experts (MoE) to unleash the performance limit of instruction-tuned code Large Language Models (LLMs). While vanilla sparse upcycling fails to improve instruction tuning, XFT introduces a shared expert mechanism with a novel routing weight normalization strategy into sparse upcycling, which significantly…

    Submitted 6 June, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

  50. arXiv:2404.12720  [pdf, other]

    cs.CV cs.CL

    PDF-MVQA: A Dataset for Multimodal Information Retrieval in PDF-based Visual Question Answering

    Authors: Yihao Ding, Kaixuan Ren, Jiabin Huang, Siwen Luo, Soyeon Caren Han

    Abstract: Document Question Answering (QA) presents a challenge in understanding visually-rich documents (VRD), particularly those dominated by lengthy textual content like research journal articles. Existing studies primarily focus on real-world documents with sparse text, while challenges persist in comprehending the hierarchical semantic relations among multiple pages to locate multimodal components. To…

    Submitted 19 April, 2024; originally announced April 2024.

    Comments: Accepted by IJCAI 2024