Showing 1–50 of 153 results for author: Lou, J

Searching in archive cs.
  1. arXiv:2408.16326  [pdf, other]

    cs.CL

    Critic-CoT: Boosting the reasoning abilities of large language model via Chain-of-thoughts Critic

    Authors: Xin Zheng, Jie Lou, Boxi Cao, Xueru Wen, Yuqiu Ji, Hongyu Lin, Yaojie Lu, Xianpei Han, Debing Zhang, Le Sun

    Abstract: Self-critic has become an important mechanism for enhancing the reasoning performance of LLMs. However, current approaches mainly involve basic prompts without further training, which tend to be over-simplified, leading to limited accuracy. Moreover, there is a lack of in-depth investigation of the relationship between an LLM's ability to criticize and its task-solving performance. To address these iss…

    Submitted 29 August, 2024; originally announced August 2024.

  2. arXiv:2408.00764  [pdf, other]

    cs.CL cs.AI cs.LG

    AgentGen: Enhancing Planning Abilities for Large Language Model based Agent via Environment and Task Generation

    Authors: Mengkang Hu, Pu Zhao, Can Xu, Qingfeng Sun, Jianguang Lou, Qingwei Lin, Ping Luo, Saravan Rajmohan, Dongmei Zhang

    Abstract: Large Language Model (LLM) based agents have garnered significant attention and are becoming increasingly popular. Furthermore, planning ability is a crucial component of an LLM-based agent, involving interaction with the environment and executing actions to complete a planning task, which generally entails achieving a desired goal from an initial state. This paper investigates enhancing the plann…

    Submitted 1 August, 2024; originally announced August 2024.

  3. Towards Robust Vision Transformer via Masked Adaptive Ensemble

    Authors: Fudong Lin, Jiadong Lou, Xu Yuan, Nian-Feng Tzeng

    Abstract: Adversarial training (AT) can help improve the robustness of Vision Transformers (ViT) against adversarial attacks by intentionally injecting adversarial examples into the training data. However, this way of adversarial injection inevitably incurs standard accuracy degradation to some extent, thereby calling for a trade-off between standard accuracy and robustness. Besides, the prominent AT soluti…

    Submitted 22 July, 2024; originally announced July 2024.

    Comments: 9 pages

    Journal ref: 2024 ACM International Conference on Information & Knowledge Management (CIKM)

  4. arXiv:2407.11033  [pdf, other]

    cs.LG cs.CL

    Hadamard Adapter: An Extreme Parameter-Efficient Adapter Tuning Method for Pre-trained Language Models

    Authors: Yuyan Chen, Qiang Fu, Ge Fan, Lun Du, Jian-Guang Lou, Shi Han, Dongmei Zhang, Zhixu Li, Yanghua Xiao

    Abstract: In recent years, pre-trained language models (PLMs) have swept into various fields of artificial intelligence and achieved great success. However, most PLMs, such as T5 and GPT3, have a huge number of parameters; fine-tuning them is often expensive and time-consuming, and storing them takes up a lot of space. Therefore, it is necessary to adopt a parameter-efficient approach to reduce parameters of P…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: Accepted to CIKM 2023 (Long Paper)

  5. arXiv:2407.10627  [pdf, other]

    cs.CL cs.AI cs.LG

    Arena Learning: Build Data Flywheel for LLMs Post-training via Simulated Chatbot Arena

    Authors: Haipeng Luo, Qingfeng Sun, Can Xu, Pu Zhao, Qingwei Lin, Jianguang Lou, Shifeng Chen, Yansong Tang, Weizhu Chen

    Abstract: Assessing the effectiveness of large language models (LLMs) presents substantial challenges. The method of conducting human-annotated battles in an online Chatbot Arena is a highly effective evaluative technique. However, this approach is limited by the costs and time required for human annotation. In this paper, we introduce Arena Learning, an innovative offline strategy designed to simulate thes…

    Submitted 15 July, 2024; originally announced July 2024.

  6. arXiv:2407.06915  [pdf, ps, other]

    cs.RO

    FE-GUT: Factor Graph Optimization hybrid with Extended Kalman Filter for tightly coupled GNSS/UWB Integration

    Authors: Qijia Zhao, Shaolin Lü, Jianan Lou, Rong Zhang

    Abstract: Precise positioning and navigation information has become increasingly important with the development of the consumer electronics market. Due to some deficits of the Global Navigation Satellite System (GNSS), such as susceptibility to interference, integrating GNSS with additional alternative sensors is a promising approach to overcome the performance limitations of GNSS-based localization systems. Ult…

    Submitted 9 July, 2024; originally announced July 2024.

  7. arXiv:2406.13404  [pdf, other]

    cs.DC

    Low-Latency Layer-Aware Proactive and Passive Container Migration in Meta Computing

    Authors: Mengjie Liu, Yihua Li, Fangyi Mou, Zhiqing Tang, Jiong Lou, Jianxiong Guo, Weijia Jia

    Abstract: Meta computing is a new computing paradigm that aims to efficiently utilize all network computing resources to provide fault-tolerant, personalized services with strong security and privacy guarantees. It also seeks to virtualize the Internet as many meta computers. In meta computing, tasks can be assigned to containers at edge nodes for processing, based on container images with multiple layers.…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: to be published in IEEE ICMC 2024

  8. arXiv:2406.13399  [pdf, other]

    cs.AI

    VELO: A Vector Database-Assisted Cloud-Edge Collaborative LLM QoS Optimization Framework

    Authors: Zhi Yao, Zhiqing Tang, Jiong Lou, Ping Shen, Weijia Jia

    Abstract: The Large Language Model (LLM) has gained significant popularity and is extensively utilized across various domains. Most LLM deployments occur within cloud data centers, where they encounter substantial response delays and incur high costs, thereby impacting the Quality of Services (QoS) at the network edge. Leveraging vector database caching to store LLM request results at the edge can substanti…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: to be published in IEEE ICWS 2024

  9. arXiv:2406.00770  [pdf, other]

    cs.CL cs.AI

    Automatic Instruction Evolving for Large Language Models

    Authors: Weihao Zeng, Can Xu, Yingxiu Zhao, Jian-Guang Lou, Weizhu Chen

    Abstract: Fine-tuning large pre-trained language models with Evol-Instruct has achieved encouraging results across a wide range of tasks. However, designing effective evolving methods for instruction evolution requires substantial human expertise. This paper proposes Auto Evol-Instruct, an end-to-end framework that evolves instruction datasets using large language models without any human effort. The framew…

    Submitted 2 June, 2024; originally announced June 2024.

  10. arXiv:2404.19417  [pdf, other]

    cs.CV

    Physical Backdoor: Towards Temperature-based Backdoor Attacks in the Physical World

    Authors: Wen Yin, Jian Lou, Pan Zhou, Yulai Xie, Dan Feng, Yuhua Sun, Tailai Zhang, Lichao Sun

    Abstract: Backdoor attacks have been well-studied in visible light object detection (VLOD) in recent years. However, VLOD cannot work effectively in dark and temperature-sensitive scenarios. Instead, thermal infrared object detection (TIOD) is the most accessible and practical in such environments. In this paper, our team is the first to investigate the security vulnerabilities associated with TIOD in the…

    Submitted 30 April, 2024; originally announced April 2024.

    Comments: To appear in CVPR 2024. 11 pages, 8 figures and 4 tables

  11. arXiv:2404.16811  [pdf, other]

    cs.CL cs.AI

    Make Your LLM Fully Utilize the Context

    Authors: Shengnan An, Zexiong Ma, Zeqi Lin, Nanning Zheng, Jian-Guang Lou

    Abstract: While many contemporary large language models (LLMs) can process lengthy input, they still struggle to fully utilize information within the long context, known as the lost-in-the-middle challenge. We hypothesize that it stems from insufficient explicit supervision during the long-context training, which fails to emphasize that any position in a long context can hold crucial information. Based on t…

    Submitted 26 April, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

    Comments: 19 pages, 7 figures, 3 tables, 9 examples

  12. Merits of Time-Domain Computing for VMM -- A Quantitative Comparison

    Authors: Florian Freye, Jie Lou, Christian Lanius, Tobias Gemmeke

    Abstract: Vector-matrix-multiplication (VMM) accelerators have gained a lot of traction, especially due to the rise of convolutional neural networks (CNNs) and the desire to compute them on the edge. Besides the classical digital approach, analog computing has gone through a renaissance to push energy efficiency further. A more recent approach is called time-domain (TD) computing. In contrast to analog compu…

    Submitted 21 May, 2024; v1 submitted 27 March, 2024; originally announced March 2024.

    Comments: 8 pages, 12 figures. This paper was accepted at the 25th International Symposium on Quality Electronic Design (ISQED) 2024. DOI: 10.1109/ISQED60706.2024.10528682

  13. arXiv:2403.05307  [pdf, other]

    cs.AI

    Tapilot-Crossing: Benchmarking and Evolving LLMs Towards Interactive Data Analysis Agents

    Authors: Jinyang Li, Nan Huo, Yan Gao, Jiayi Shi, Yingxiu Zhao, Ge Qu, Yurong Wu, Chenhao Ma, Jian-Guang Lou, Reynold Cheng

    Abstract: Interactive Data Analysis, the collaboration between humans and LLM agents, enables real-time data exploration for informed decision-making. The challenges and costs of collecting realistic interactive logs for data analysis hinder the quantitative evaluation of Large Language Model (LLM) agents in this task. To mitigate this issue, we introduce Tapilot-Crossing, a new benchmark to evaluate LLM ag…

    Submitted 8 March, 2024; originally announced March 2024.

    Comments: 30 pages, 7 figures

  14. arXiv:2402.07818  [pdf, other]

    cs.LG cs.AI cs.CL

    Differentially Private Zeroth-Order Methods for Scalable Large Language Model Finetuning

    Authors: Z Liu, J Lou, W Bao, Y Hu, B Li, Z Qin, K Ren

    Abstract: Fine-tuning on task-specific datasets is a widely embraced paradigm for harnessing the powerful capability of pretrained LLMs for various downstream tasks. Due to the popularity of LLM fine-tuning and its accompanying privacy concerns, differentially private (DP) fine-tuning of pretrained LLMs has been widely used to safeguard the privacy of task-specific datasets. Lying at the design core of D…

    Submitted 9 May, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

  15. arXiv:2402.07002  [pdf, other]

    cs.LG cs.AI cs.CR

    Clients Collaborate: Flexible Differentially Private Federated Learning with Guaranteed Improvement of Utility-Privacy Trade-off

    Authors: Yuecheng Li, Tong Wang, Chuan Chen, Jian Lou, Bin Chen, Lei Yang, Zibin Zheng

    Abstract: To defend against privacy leakage of user data, differential privacy is widely used in federated learning, but it is not free. The addition of noise randomly disrupts the semantic integrity of the model and this disturbance accumulates with increased communication rounds. In this paper, we introduce a novel federated learning framework with rigorous privacy guarantees, named FedCEO, designed to st…

    Submitted 10 February, 2024; originally announced February 2024.

    Comments: 22 pages, 8 figures

  16. arXiv:2401.16251  [pdf, other]

    cs.CR cs.AI cs.LG

    Cross-silo Federated Learning with Record-level Personalized Differential Privacy

    Authors: Junxu Liu, Jian Lou, Li Xiong, Jinfei Liu, Xiaofeng Meng

    Abstract: Federated learning (FL) enhanced by differential privacy has emerged as a popular approach to better safeguard the privacy of client-side data by protecting clients' contributions during the training process. Existing solutions typically assume a uniform privacy budget for all records and provide one-size-fits-all solutions that may not be adequate to meet each record's privacy requirement. In thi…

    Submitted 29 June, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

    Comments: 15 pages, 8 figures, accepted by CCS'2024

  17. arXiv:2401.10458  [pdf, other]

    cs.LG cs.CR

    Contrastive Unlearning: A Contrastive Approach to Machine Unlearning

    Authors: Hong kyu Lee, Qiuchen Zhang, Carl Yang, Jian Lou, Li Xiong

    Abstract: Machine unlearning aims to eliminate the influence of a subset of training samples (i.e., unlearning samples) from a trained model. Effectively and efficiently removing the unlearning samples without negatively impacting the overall model performance is still challenging. In this paper, we propose a contrastive unlearning framework, leveraging the concept of representation learning for more effect…

    Submitted 18 January, 2024; originally announced January 2024.

  18. arXiv:2312.15395  [pdf, other]

    cs.CL cs.DB cs.LG

    Prompt Valuation Based on Shapley Values

    Authors: Hanxi Liu, Xiaokai Mao, Haocheng Xia, Jian Lou, Jinfei Liu

    Abstract: Large language models (LLMs) excel on new tasks without additional training, simply by providing natural language prompts that demonstrate how the task should be performed. Prompt ensemble methods comprehensively harness the knowledge of LLMs while mitigating individual biases and errors and further enhancing performance. However, more prompts do not necessarily lead to better results, and not all…

    Submitted 23 December, 2023; originally announced December 2023.

  19. arXiv:2312.13694  [pdf, other]

    cs.CL

    Data Transformation to Construct a Dataset for Generating Entity-Relationship Model from Natural Language

    Authors: Zhenwen Li, Jian-Guang Lou, Tao Xie

    Abstract: In order to reduce the manual cost of designing ER models, recent approaches have been proposed to address the task of NL2ERM, i.e., automatically generating entity-relationship (ER) models from natural language (NL) utterances such as software requirements. These approaches are typically rule-based ones, which rely on rigid heuristic rules; these approaches cannot generalize well to various lingu…

    Submitted 21 December, 2023; originally announced December 2023.

  20. arXiv:2312.11198  [pdf, other]

    cs.LG cs.AI

    Signed Graph Neural Ordinary Differential Equation for Modeling Continuous-time Dynamics

    Authors: Lanlan Chen, Kai Wu, Jian Lou, Jing Liu

    Abstract: Modeling continuous-time dynamics constitutes a foundational challenge, and uncovering inter-component correlations within complex systems holds promise for enhancing the efficacy of dynamic modeling. The prevailing approach of integrating graph neural networks with ordinary differential equations has demonstrated promising performance. However, they disregard the crucial signed information intrin…

    Submitted 18 December, 2023; originally announced December 2023.

    Comments: AAAI 2024

  21. arXiv:2312.10336  [pdf, ps, other]

    cs.LG

    Certified Minimax Unlearning with Generalization Rates and Deletion Capacity

    Authors: Jiaqi Liu, Jian Lou, Zhan Qin, Kui Ren

    Abstract: We study the problem of $(ε,δ)$-certified machine unlearning for minimax models. Most of the existing works focus on unlearning from standard statistical learning models that have a single variable and their unlearning steps hinge on the direct Hessian-based conventional Newton update. We develop a new $(ε,δ)$-certified machine unlearning algorithm for minimax models. It proposes a minimax unlearn…

    Submitted 16 December, 2023; originally announced December 2023.

    Comments: NeurIPS 2023

  22. arXiv:2311.16136  [pdf, other]

    cs.CR cs.AI

    ERASER: Machine Unlearning in MLaaS via an Inference Serving-Aware Approach

    Authors: Yuke Hu, Jian Lou, Jiaqi Liu, Wangze Ni, Feng Lin, Zhan Qin, Kui Ren

    Abstract: Over the past years, Machine Learning-as-a-Service (MLaaS) has received a surging demand for supporting Machine Learning-driven services to offer revolutionized user experience across diverse application areas. MLaaS provides inference service with low inference latency based on an ML model trained using a dataset collected from numerous individual data owners. Recently, for the sake of data owner…

    Submitted 18 June, 2024; v1 submitted 3 November, 2023; originally announced November 2023.

    Comments: Accepted by CCS'24

  23. arXiv:2311.16062  [pdf, other]

    cs.CR

    Local Differentially Private Heavy Hitter Detection in Data Streams with Bounded Memory

    Authors: Xiaochen Li, Weiran Liu, Jian Lou, Yuan Hong, Lei Zhang, Zhan Qin, Kui Ren

    Abstract: Top-$k$ frequent items detection is a fundamental task in data stream mining. Many promising solutions are proposed to improve memory efficiency while still maintaining high accuracy for detecting the Top-$k$ items. Despite the memory efficiency concern, the users could suffer from privacy loss if participating in the task without proper protection, since their contributed local data streams may c…

    Submitted 27 November, 2023; originally announced November 2023.

  24. arXiv:2311.08734  [pdf, other]

    cs.CL

    Thread of Thought Unraveling Chaotic Contexts

    Authors: Yucheng Zhou, Xiubo Geng, Tao Shen, Chongyang Tao, Guodong Long, Jian-Guang Lou, Jianbing Shen

    Abstract: Large Language Models (LLMs) have ushered in a transformative era in the field of natural language processing, excelling in tasks related to text comprehension and generation. Nevertheless, they encounter difficulties when confronted with chaotic contexts (e.g., distractors rather than long irrelevant context), leading to the inadvertent omission of certain details within the chaotic context. In r…

    Submitted 15 November, 2023; originally announced November 2023.

    Comments: 11 pages, 7 figures, 5 tables

  25. arXiv:2311.06495  [pdf, other]

    cs.CV

    LayoutPrompter: Awaken the Design Ability of Large Language Models

    Authors: Jiawei Lin, Jiaqi Guo, Shizhao Sun, Zijiang James Yang, Jian-Guang Lou, Dongmei Zhang

    Abstract: Conditional graphic layout generation, which automatically maps user constraints to high-quality layouts, has attracted widespread attention today. Although recent works have achieved promising performance, the lack of versatility and data efficiency hinders their practical applications. In this work, we propose LayoutPrompter, which leverages large language models (LLMs) to address the above prob…

    Submitted 11 November, 2023; originally announced November 2023.

    Comments: NeurIPS 2023

  26. arXiv:2311.06304  [pdf, other]

    cs.LG cs.AI q-bio.BM

    Retro-BLEU: Quantifying Chemical Plausibility of Retrosynthesis Routes through Reaction Template Sequence Analysis

    Authors: Junren Li, Lei Fang, Jian-Guang Lou

    Abstract: Computer-assisted methods have emerged as valuable tools for retrosynthesis analysis. However, quantifying the plausibility of generated retrosynthesis routes remains a challenging task. We introduce Retro-BLEU, a statistical metric adapted from the well-established BLEU score in machine translation, to evaluate the plausibility of retrosynthesis routes based on reaction template sequences analysi…

    Submitted 7 November, 2023; originally announced November 2023.

    Journal ref: https://pubs.rsc.org/en/content/articlelanding/2024/dd/d3dd00219e

  27. arXiv:2311.06227  [pdf, other]

    cs.CR cs.LG

    Does Differential Privacy Prevent Backdoor Attacks in Practice?

    Authors: Fereshteh Razmi, Jian Lou, Li Xiong

    Abstract: Differential Privacy (DP) was originally developed to protect privacy. However, it has recently been utilized to secure machine learning (ML) models from poisoning attacks, with DP-SGD receiving substantial attention. Nevertheless, a thorough investigation is required to assess the effectiveness of different DP techniques in preventing backdoor attacks in practice. In this paper, we investigate th…

    Submitted 10 November, 2023; originally announced November 2023.

  28. arXiv:2311.04686  [pdf, other]

    cs.LG cs.DC stat.ML

    Robust and Communication-Efficient Federated Domain Adaptation via Random Features

    Authors: Zhanbo Feng, Yuanjie Wang, Jie Li, Fan Yang, Jiong Lou, Tiebin Mi, Robert. C. Qiu, Zhenyu Liao

    Abstract: Modern machine learning (ML) models have grown to a scale where training them on a single machine becomes impractical. As a result, there is a growing trend to leverage federated learning (FL) techniques to train large ML models in a distributed and collaborative manner. These models, however, when deployed on new devices, might struggle to generalize well due to domain shifts. In this context, fe…

    Submitted 8 November, 2023; originally announced November 2023.

    Comments: 21 pages

  29. arXiv:2310.20689  [pdf, other]

    cs.CL cs.AI

    Learning From Mistakes Makes LLM Better Reasoner

    Authors: Shengnan An, Zexiong Ma, Zeqi Lin, Nanning Zheng, Jian-Guang Lou, Weizhu Chen

    Abstract: Large language models (LLMs) have recently exhibited remarkable reasoning capabilities in solving math problems. To further improve their reasoning capabilities, this work explores whether LLMs can LEarn from MistAkes (LEMA), akin to the human learning process. Consider a human student who failed to solve a math problem: he will learn from the mistake he made and how to correct it. Mimicking this…

    Submitted 29 March, 2024; v1 submitted 31 October, 2023; originally announced October 2023.

    Comments: 23 pages, 13 figures, 6 tables

  30. arXiv:2310.16475  [pdf, other]

    cs.DC

    Efficient Serverless Function Scheduling at the Network Edge

    Authors: Jiong Lou, Zhiqing Tang, Shijing Yuan, Jie Li, Chengtao Wu, Weijia Jia

    Abstract: Serverless computing is a promising approach for edge computing because of its inherent features, e.g., lightweight virtualization, rapid scalability, and economic efficiency. However, previous studies have not adequately addressed the issues of significant cold start latency and highly dynamic workloads in serverless function scheduling, which are exacerbated at the resource-limited network edge. In this pape…

    Submitted 31 October, 2023; v1 submitted 25 October, 2023; originally announced October 2023.

  31. arXiv:2310.12439  [pdf, other]

    cs.CL cs.AI

    PoisonPrompt: Backdoor Attack on Prompt-based Large Language Models

    Authors: Hongwei Yao, Jian Lou, Zhan Qin

    Abstract: Prompts have significantly improved the performance of pretrained Large Language Models (LLMs) on various downstream tasks recently, making them increasingly indispensable for a diverse range of LLM application scenarios. However, the backdoor vulnerability, a serious security threat that can maliciously alter the victim model's normal predictions, has not been sufficiently explored for prompt-bas…

    Submitted 18 December, 2023; v1 submitted 18 October, 2023; originally announced October 2023.

    Comments: To Appear in IEEE ICASSP 2024, code is available at: https://github.com/grasses/PoisonPrompt

  32. arXiv:2310.04711  [pdf, other]

    cs.CR cs.DB

    DP-starJ: A Differential Private Scheme towards Analytical Star-Join Queries

    Authors: Congcong Fu, Hui Li, Jian Lou, Jiangtao Cui

    Abstract: Star-join query is the fundamental task in data warehouse and has wide applications in On-line Analytical Processing (OLAP) scenarios. Due to the large number of foreign key constraints and the asymmetric effect in the neighboring instance between the fact and dimension tables, even those latest DP efforts specifically designed for join, if directly applied to star-join query, will suffer from ext…

    Submitted 17 November, 2023; v1 submitted 7 October, 2023; originally announced October 2023.

  33. arXiv:2310.00560  [pdf, other]

    cs.DC

    Joint Task Scheduling and Container Image Caching in Edge Computing

    Authors: Fangyi Mou, Zhiqing Tang, Jiong Lou, Jianxiong Guo, Wenhua Wang, Tian Wang

    Abstract: In Edge Computing (EC), containers have been increasingly used to deploy applications to provide mobile users services. Each container must run based on a container image file that exists locally. However, it has been conspicuously neglected by existing work that effective task scheduling combined with dynamic container image caching is a promising way to reduce the container image download time w…

    Submitted 30 September, 2023; originally announced October 2023.

  34. arXiv:2309.11979  [pdf, other]

    q-fin.CP cs.CL cs.LG

    Stock Market Sentiment Classification and Backtesting via Fine-tuned BERT

    Authors: Jiashu Lou

    Abstract: With the rapid development of big data and computing devices, low-latency automatic trading platforms based on real-time information acquisition have become the main components of the stock trading market, so the topic of quantitative trading has received widespread attention. And for non-strongly efficient trading markets, human emotions and expectations always dominate market trends and trading…

    Submitted 21 September, 2023; originally announced September 2023.

  35. arXiv:2309.06275  [pdf, other]

    cs.CL

    Re-Reading Improves Reasoning in Large Language Models

    Authors: Xiaohan Xu, Chongyang Tao, Tao Shen, Can Xu, Hongbo Xu, Guodong Long, Jian-guang Lou

    Abstract: To enhance the reasoning capabilities of off-the-shelf Large Language Models (LLMs), we introduce a simple, yet general and effective prompting method, Re2, i.e., Re-Reading the question as input. Unlike most thought-eliciting prompting methods, such as Chain-of-Thought (CoT), which aim to elicit the reasoning process in the output, Re2 shifts the focus to the input by processing…

    Submitted 29 February, 2024; v1 submitted 12 September, 2023; originally announced September 2023.

    Comments: 25 pages

  36. arXiv:2309.01371  [pdf, other]

    cs.HC

    A Survey for Graphic Design Intelligence

    Authors: Danqing Huang, Jiaqi Guo, Shizhao Sun, Hanling Tian, Jieru Lin, Zheng Hu, Chin-Yew Lin, Jian-Guang Lou, Dongmei Zhang

    Abstract: Graphic design is an effective language for visual communication. Using complex composition of visual elements (e.g., shape, color, font) guided by design principles and aesthetics, design helps produce more visually-appealing content. The creation of a harmonious design requires carefully selecting and combining different visual elements, which can be challenging and time-consuming. To expedite t…

    Submitted 4 September, 2023; originally announced September 2023.

    Comments: 10 pages, 2 figures

  37. arXiv:2308.12700  [pdf, other]

    cs.CV

    A Parse-Then-Place Approach for Generating Graphic Layouts from Textual Descriptions

    Authors: Jiawei Lin, Jiaqi Guo, Shizhao Sun, Weijiang Xu, Ting Liu, Jian-Guang Lou, Dongmei Zhang

    Abstract: Creating layouts is a fundamental step in graphic design. In this work, we propose to use text as the guidance to create graphic layouts, i.e., Text-to-Layout, aiming to lower the design barriers. Text-to-Layout is a challenging task, because it needs to consider the implicit, combined, and incomplete layout constraints from text, each of which has not been studied in previous work. To address thi…

    Submitted 24 August, 2023; originally announced August 2023.

    Comments: Accepted by ICCV2023

  38. arXiv:2308.12319  [pdf, other]

    cs.CV cs.AI

    RemovalNet: DNN Fingerprint Removal Attacks

    Authors: Hongwei Yao, Zheng Li, Kunzhe Huang, Jian Lou, Zhan Qin, Kui Ren

    Abstract: With the performance of deep neural networks (DNNs) remarkably improving, DNNs have been widely used in many areas. Consequently, the DNN model has become a valuable asset, and its intellectual property is safeguarded by ownership verification techniques (e.g., DNN fingerprinting). However, the feasibility of the DNN fingerprint removal attack and its potential influence remains an open problem. I…

    Submitted 22 November, 2023; v1 submitted 23 August, 2023; originally announced August 2023.

    Comments: Accepted to IEEE TDSC, code is available at: https://github.com/grasses/RemovalNet

  39. arXiv:2308.09583  [pdf, other]

    cs.CL cs.AI cs.LG

    WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct

    Authors: Haipeng Luo, Qingfeng Sun, Can Xu, Pu Zhao, Jianguang Lou, Chongyang Tao, Xiubo Geng, Qingwei Lin, Shifeng Chen, Dongmei Zhang

    Abstract: Large language models (LLMs), such as GPT-4, have shown remarkable performance in natural language processing (NLP) tasks, including challenging mathematical reasoning. However, most existing open-source models are only pre-trained on large-scale internet data and without math-related optimization. In this paper, we present WizardMath, which enhances the mathematical reasoning abilities of Llama-2…

    Submitted 18 August, 2023; originally announced August 2023.

    Comments: LLM, Mathematical Reasoning

  40. arXiv:2308.05362  [pdf, other]

    cs.CR cs.LG cs.SE

    FINER: Enhancing State-of-the-art Classifiers with Feature Attribution to Facilitate Security Analysis

    Authors: Yiling He, Jian Lou, Zhan Qin, Kui Ren

    Abstract: Deep learning classifiers achieve state-of-the-art performance in various risk detection applications. They explore rich semantic representations and are supposed to automatically discover risk behaviors. However, due to the lack of transparency, the behavioral semantics cannot be conveyed to downstream security experts to reduce their heavy workload in security analysis. Although feature attribut…

    Submitted 10 August, 2023; originally announced August 2023.

  41. arXiv:2308.02816  [pdf, other]

    cs.MM cs.CR

    PromptCARE: Prompt Copyright Protection by Watermark Injection and Verification

    Authors: Hongwei Yao, Jian Lou, Kui Ren, Zhan Qin

    Abstract: Large language models (LLMs) have witnessed a meteoric rise in popularity among the general public users over the past few months, facilitating diverse downstream tasks with human-level accuracy and proficiency. Prompts play an essential role in this success, which efficiently adapt pre-trained LLMs to task-specific applications by simply prepending a sequence of tokens to the query texts. However…

    Submitted 28 November, 2023; v1 submitted 5 August, 2023; originally announced August 2023.

    Comments: To Appear in the 45th IEEE Symposium on Security and Privacy 2024, code is available at: https://github.com/grasses/PromptCARE

  42. arXiv:2308.00728  [pdf, other]

    cs.CV

    ELFNet: Evidential Local-global Fusion for Stereo Matching

    Authors: Jieming Lou, Weide Liu, Zhuo Chen, Fayao Liu, Jun Cheng

    Abstract: Although existing stereo matching models have achieved continuous improvement, they often face issues related to trustworthiness due to the absence of uncertainty estimation. Additionally, effectively leveraging multi-scale and multi-view knowledge of stereo pairs remains unexplored. In this paper, we introduce the Evidential Local-global Fusion (ELF) framework for stere…

    Submitted 1 August, 2023; originally announced August 2023.

    Comments: ICCV 2023

  43. arXiv:2307.12121  [pdf, other]

    cs.DC

    Online Container Scheduling for Low-Latency IoT Services in Edge Cluster Upgrade: A Reinforcement Learning Approach

    Authors: Hanshuai Cui, Zhiqing Tang, Jiong Lou, Weijia Jia

    Abstract: In Mobile Edge Computing (MEC), Internet of Things (IoT) devices offload computationally-intensive tasks to edge nodes, where they are executed within containers, reducing the reliance on centralized cloud infrastructure. Frequent upgrades are essential to maintain the efficient and secure operation of edge clusters. However, traditional cloud cluster upgrade strategies are ill-suited for edge clu…

    Submitted 22 July, 2023; originally announced July 2023.

  44. arXiv:2306.10675  [pdf, other]

    cs.DB cs.AI

    LaDe: The First Comprehensive Last-mile Delivery Dataset from Industry

    Authors: Lixia Wu, Haomin Wen, Haoyuan Hu, Xiaowei Mao, Yutong Xia, Ergang Shan, Jianbin Zhen, Junhong Lou, Yuxuan Liang, Liuqing Yang, Roger Zimmermann, Youfang Lin, Huaiyu Wan

    Abstract: Real-world last-mile delivery datasets are crucial for research in logistics, supply chain management, and spatio-temporal data mining. Despite a plethora of algorithms developed to date, no widely accepted, publicly available last-mile delivery dataset exists to support research in this field. In this paper, we introduce LaDe, the first publicly available last-mile delivery dataset with…

    Submitted 2 January, 2024; v1 submitted 18 June, 2023; originally announced June 2023.

  45. arXiv:2306.01762  [pdf, other]

    cs.CR cs.AI cs.CV cs.LG

    Pre-trained transformer for adversarial purification

    Authors: Kai Wu, Yujian Betterest Li, Jian Lou, Xiaoyu Zhang, Handing Wang, Jing Liu

    Abstract: With more and more deep neural networks being deployed as various daily services, their reliability is essential. Alarmingly, deep neural networks are vulnerable and sensitive to adversarial attacks, the most common of which for these services is evasion-based. Recent works usually strengthen robustness by adversarial training or by leveraging the knowledge of an amount of clean data…

    Submitted 25 September, 2023; v1 submitted 27 May, 2023; originally announced June 2023.

  46. arXiv:2305.16253  [pdf, other]

    cs.CL

    Uncovering and Categorizing Social Biases in Text-to-SQL

    Authors: Yan Liu, Yan Gao, Zhe Su, Xiaokang Chen, Elliott Ash, Jian-Guang Lou

    Abstract: Content Warning: This work contains examples that potentially implicate stereotypes, associations, and other harms that could be offensive to individuals in certain social groups. Large pre-trained language models are acknowledged to carry social biases towards different demographics, which can further amplify existing stereotypes in our society and cause even more harm. Text-to-SQL is an importa…

    Submitted 7 June, 2023; v1 submitted 25 May, 2023; originally announced May 2023.

  47. arXiv:2305.15377  [pdf, other]

    cs.CL

    Uncovering and Quantifying Social Biases in Code Generation

    Authors: Yan Liu, Xiaokang Chen, Yan Gao, Zhe Su, Fengji Zhang, Daoguang Zan, Jian-Guang Lou, Pin-Yu Chen, Tsung-Yi Ho

    Abstract: With the popularity of automatic code generation tools, such as Copilot, the study of the potential hazards of these tools is gaining importance. In this work, we explore the social bias problem in pre-trained code generation models. We propose a new paradigm to construct code prompts and successfully uncover social biases in code generation models. To quantify the severity of social biases in gen…

    Submitted 24 May, 2023; originally announced May 2023.

  48. arXiv:2305.14682  [pdf, other]

    cs.CL

    TACR: A Table-alignment-based Cell-selection and Reasoning Model for Hybrid Question-Answering

    Authors: Jian Wu, Yicheng Xu, Yan Gao, Jian-Guang Lou, Börje F. Karlsson, Manabu Okumura

    Abstract: Hybrid Question-Answering (HQA), which targets reasoning over tables and passages linked from table cells, has witnessed significant research in recent years. A common challenge in HQA and other passage-table QA datasets is that it is generally unrealistic to iterate over all table rows, columns, and linked passages to retrieve evidence. Such a challenge made it difficult for previous studies to s…

    Submitted 23 May, 2023; originally announced May 2023.

    Comments: Accepted at Findings of ACL 2023

  49. arXiv:2305.14221  [pdf, other]

    cs.CL

    Question Answering as Programming for Solving Time-Sensitive Questions

    Authors: Xinyu Zhu, Cheng Yang, Bei Chen, Siheng Li, Jian-Guang Lou, Yujiu Yang

    Abstract: Question answering plays a pivotal role in human daily life because it involves our acquisition of knowledge about the world. However, due to the dynamic and ever-changing nature of real-world facts, the answer can be completely different when the time constraint in the question changes. Recently, Large Language Models (LLMs) have shown remarkable intelligence in question answering, while our expe…

    Submitted 20 October, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: Accepted to EMNLP 2023 Main Conference

  50. arXiv:2305.14210  [pdf, other]

    cs.CL cs.AI

    Skill-Based Few-Shot Selection for In-Context Learning

    Authors: Shengnan An, Bo Zhou, Zeqi Lin, Qiang Fu, Bei Chen, Nanning Zheng, Weizhu Chen, Jian-Guang Lou

    Abstract: In-context learning is the paradigm that adapts large language models to downstream tasks by providing a few examples. Few-shot selection -- selecting appropriate examples for each test instance separately -- is important for in-context learning. In this paper, we propose Skill-KNN, a skill-based few-shot selection method for in-context learning. The key advantages of Skill-KNN include: (1) it add…

    Submitted 10 October, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: Accepted by EMNLP 2023 main conference