Showing 1–50 of 743 results for author: Liu, P

Searching in archive cs.
  1. arXiv:2407.13709  [pdf, other]

    cs.CL cs.LG

    Understanding Reference Policies in Direct Preference Optimization

    Authors: Yixin Liu, Pengfei Liu, Arman Cohan

    Abstract: Direct Preference Optimization (DPO) has become a widely used training method for the instruction fine-tuning of large language models (LLMs). In this work, we explore an under-investigated aspect of DPO - its dependency on the reference model or policy. Such reference policies, typically instantiated as the model to be further fine-tuned, are important since they can impose an upper limit on DPO'…

    Submitted 18 July, 2024; originally announced July 2024.

  2. arXiv:2407.13647  [pdf, other]

    cs.CL cs.AI

    Weak-to-Strong Reasoning

    Authors: Yuqing Yang, Yan Ma, Pengfei Liu

    Abstract: When large language models (LLMs) exceed human-level capabilities, it becomes increasingly challenging to provide full-scale and accurate supervisions for these models. Weak-to-strong learning, which leverages a less capable model to unlock the latent abilities of a stronger model, proves valuable in this context. Yet, the efficacy of this approach for complex reasoning tasks is still untested. Fu…

    Submitted 18 July, 2024; originally announced July 2024.

  3. arXiv:2407.13537  [pdf, other]

    cs.CV

    GlobalPointer: Large-Scale Plane Adjustment with Bi-Convex Relaxation

    Authors: Bangyan Liao, Zhenjun Zhao, Lu Chen, Haoang Li, Daniel Cremers, Peidong Liu

    Abstract: Plane adjustment (PA) is crucial for many 3D applications, involving simultaneous pose estimation and plane recovery. Despite recent advancements, it remains a challenging problem in the realm of multi-view point cloud registration. Current state-of-the-art methods can achieve globally optimal convergence only with good initialization. Furthermore, their high time complexity renders them impractic…

    Submitted 18 July, 2024; originally announced July 2024.

    Comments: Accepted to ECCV 2024. The first two authors contributed equally to this work. Code: https://github.com/wu-cvgl/GlobalPointer

  4. arXiv:2407.13279  [pdf, other]

    cs.LG

    Analyzing and Bridging the Gap between Maximizing Total Reward and Discounted Reward in Deep Reinforcement Learning

    Authors: Shuyu Yin, Fei Wen, Peilin Liu, Tao Luo

    Abstract: In deep reinforcement learning applications, maximizing discounted reward is often employed instead of maximizing total reward to ensure the convergence and stability of algorithms, even though the performance metric for evaluating the policy remains the total reward. However, the optimal policies corresponding to these two objectives may not always be consistent. To address this issue, we analyze…

    Submitted 18 July, 2024; originally announced July 2024.

  5. arXiv:2407.13093  [pdf, other]

    cs.CR

    Using LLMs to Automate Threat Intelligence Analysis Workflows in Security Operation Centers

    Authors: PeiYu Tseng, ZihDwo Yeh, Xushu Dai, Peng Liu

    Abstract: SIEM systems are prevalent and play a critical role in a variety of analyst workflows in Security Operation Centers. However, modern SIEMs face a big challenge: they still cannot relieve analysts from the repetitive tasks involved in analyzing CTI (Cyber Threat Intelligence) reports written in natural languages. This project aims to develop an AI agent to replace the labor intensive repetitive tas…

    Submitted 17 July, 2024; originally announced July 2024.

  6. arXiv:2407.12943  [pdf, other]

    cs.CL cs.AI

    Halu-J: Critique-Based Hallucination Judge

    Authors: Binjie Wang, Steffi Chern, Ethan Chern, Pengfei Liu

    Abstract: Large language models (LLMs) frequently generate non-factual content, known as hallucinations. Existing retrieval-augmented-based hallucination detection approaches typically address this by framing it as a classification task, evaluating hallucinations based on their consistency with retrieved evidence. However, this approach usually lacks detailed explanations for these evaluations and does not…

    Submitted 17 July, 2024; originally announced July 2024.

  7. arXiv:2407.12239  [pdf, other]

    cs.CV

    Motion and Structure from Event-based Normal Flow

    Authors: Zhongyang Ren, Bangyan Liao, Delei Kong, Jinghang Li, Peidong Liu, Laurent Kneip, Guillermo Gallego, Yi Zhou

    Abstract: Recovering the camera motion and scene geometry from visual data is a fundamental problem in the field of computer vision. Its success in standard vision is attributed to the maturity of feature extraction, data association and multi-view geometry. The recent emergence of neuromorphic event-based cameras places great demands on approaches that use raw event data as input to solve this fundamental…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: This paper has been accepted by ECCV 2024

  8. arXiv:2407.10990  [pdf]

    cs.CL cs.AI

    MedBench: A Comprehensive, Standardized, and Reliable Benchmarking System for Evaluating Chinese Medical Large Language Models

    Authors: Mianxin Liu, Jinru Ding, Jie Xu, Weiguo Hu, Xiaoyang Li, Lifeng Zhu, Zhian Bai, Xiaoming Shi, Benyou Wang, Haitao Song, Pengfei Liu, Xiaofan Zhang, Shanshan Wang, Kang Li, Haofen Wang, Tong Ruan, Xuanjing Huang, Xin Sun, Shaoting Zhang

    Abstract: Ensuring the general efficacy and goodness for human beings from medical large language models (LLM) before real-world deployment is crucial. However, a widely accepted and accessible evaluation process for medical LLM, especially in the Chinese context, remains to be established. In this work, we introduce "MedBench", a comprehensive, standardized, and reliable benchmarking system for Chinese med…

    Submitted 23 June, 2024; originally announced July 2024.

    Comments: 25 pages, 4 figures

  9. arXiv:2407.10830  [pdf, other]

    cs.DS

    Almost-Linear Time Algorithms for Decremental Graphs: Min-Cost Flow and More via Duality

    Authors: Jan van den Brand, Li Chen, Rasmus Kyng, Yang P. Liu, Simon Meierhans, Maximilian Probst Gutenberg, Sushant Sachdeva

    Abstract: We give the first almost-linear total time algorithm for deciding if a flow of cost at most $F$ still exists in a directed graph, with edge costs and capacities, undergoing decremental updates, i.e., edge deletions, capacity decreases, and cost increases. This implies almost-linear time algorithms for approximating the minimum-cost flow value and $s$-$t$ distance on such decremental graphs. Our fr…

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: 61 pages, Accepted to FOCS 2024

  10. arXiv:2407.08948  [pdf, other]

    eess.IV cs.CV

    Symmetry Awareness Encoded Deep Learning Framework for Brain Imaging Analysis

    Authors: Yang Ma, Dongang Wang, Peilin Liu, Lynette Masters, Michael Barnett, Weidong Cai, Chenyu Wang

    Abstract: The heterogeneity of neurological conditions, ranging from structural anomalies to functional impairments, presents a significant challenge in medical imaging analysis tasks. Moreover, the limited availability of well-annotated datasets constrains the development of robust analysis models. Against this backdrop, this study introduces a novel approach leveraging the inherent anatomical symmetrical…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: MICCAI 2024

    ACM Class: I.2.10; I.4.10

  11. arXiv:2407.07503  [pdf, other]

    cs.CV cs.IR

    Metasurface-based Snapshot Shortwave-Infrared Hyperspectral Image Reconstruction with Inter and Intra Prior Learning Network

    Authors: Linqiang Li, Jinglei Hao, Yongqiang Zhao, Pan Liu, Haofang Yan, Ziqin Zhang, Seong G. Kong

    Abstract: Shortwave-infrared (SWIR) spectral information, ranging from 1 μm to 2.5 μm, breaks the limitations of traditional color cameras in acquiring scene information and has been used in many fields. However, conventional SWIR hyperspectral imaging systems face challenges due to their bulky setups and low acquisition speed. In this work, we introduce a snapshot SWIR hyperspectral imaging system based on a…

    Submitted 10 July, 2024; v1 submitted 10 July, 2024; originally announced July 2024.

    Comments: 10 pages, 5 figures

  12. arXiv:2407.07324  [pdf, other]

    cs.CV

    Event-Aided Time-to-Collision Estimation for Autonomous Driving

    Authors: Jinghang Li, Bangyan Liao, Xiuyuan LU, Peidong Liu, Shaojie Shen, Yi Zhou

    Abstract: Predicting a potential collision with leading vehicles is an essential functionality of any autonomous/assisted driving system. One bottleneck of existing vision-based solutions is that their updating rate is limited to the frame rate of standard cameras used. In this paper, we present a novel method that estimates the time to collision using a neuromorphic event-based camera, a biologically inspi…

    Submitted 16 July, 2024; v1 submitted 9 July, 2024; originally announced July 2024.

    Comments: Accepted to European Conference on Computer Vision 2024, dataset used in this paper can be found at https://nail-hnu.github.io/EventAidedTTC

  13. arXiv:2407.07307  [pdf, other]

    cs.CV

    Dual-stage Hyperspectral Image Classification Model with Spectral Supertoken

    Authors: Peifu Liu, Tingfa Xu, Jie Wang, Huan Chen, Huiyan Bai, Jianan Li

    Abstract: Hyperspectral image classification, a task that assigns pre-defined classes to each pixel in a hyperspectral image of remote sensing scenes, often faces challenges due to the neglect of correlations between spectrally similar pixels. This oversight can lead to inaccurate edge definitions and difficulties in managing minor spectral variations in contiguous areas. To address these issues, we introdu…

    Submitted 13 July, 2024; v1 submitted 9 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV 2024

  14. arXiv:2407.06135  [pdf, other]

    cs.CL cs.AI cs.CV

    ANOLE: An Open, Autoregressive, Native Large Multimodal Models for Interleaved Image-Text Generation

    Authors: Ethan Chern, Jiadi Su, Yan Ma, Pengfei Liu

    Abstract: Previous open-source large multimodal models (LMMs) have faced several limitations: (1) they often lack native integration, requiring adapters to align visual representations with pre-trained large language models (LLMs); (2) many are restricted to single-modal generation; (3) while some support multimodal generation, they rely on separate diffusion models for visual modeling and generation. To mi…

    Submitted 8 July, 2024; originally announced July 2024.

  15. arXiv:2407.05872  [pdf, other]

    cs.LG

    Scaling Exponents Across Parameterizations and Optimizers

    Authors: Katie Everett, Lechao Xiao, Mitchell Wortsman, Alexander A. Alemi, Roman Novak, Peter J. Liu, Izzeddin Gur, Jascha Sohl-Dickstein, Leslie Pack Kaelbling, Jaehoon Lee, Jeffrey Pennington

    Abstract: Robust and effective scaling of models from small to large width typically requires the precise adjustment of many algorithmic and architectural details, such as parameterization and optimizer choices. In this work, we propose a new perspective on parameterization by investigating a key assumption in prior work about the alignment between parameters and data and derive new theoretical results unde…

    Submitted 16 July, 2024; v1 submitted 8 July, 2024; originally announced July 2024.

    Comments: 63 pages, International Conference on Machine Learning 2024

  16. arXiv:2407.05311  [pdf, other]

    cs.CV

    MMAD: Multi-label Micro-Action Detection in Videos

    Authors: Kun Li, Dan Guo, Pengyu Liu, Guoliang Chen, Meng Wang

    Abstract: Human body actions are an important form of non-verbal communication in social interactions. This paper focuses on a specific subset of body actions known as micro-actions, which are subtle, low-intensity body movements that provide a deeper understanding of inner human feelings. In real-world scenarios, human micro-actions often co-occur, with multiple micro-actions overlapping in time, such as s…

    Submitted 7 July, 2024; originally announced July 2024.

    Comments: Work in Progress

  17. arXiv:2407.05232  [pdf, other]

    cs.LG

    PAPM: A Physics-aware Proxy Model for Process Systems

    Authors: Pengwei Liu, Zhongkai Hao, Xingyu Ren, Hangjie Yuan, Jiayang Ren, Dong Ni

    Abstract: In the context of proxy modeling for process systems, traditional data-driven deep learning approaches frequently encounter significant challenges, such as substantial training costs induced by large amounts of data, and limited generalization capabilities. As a promising alternative, physics-aware models incorporate partial physics knowledge to ameliorate these challenges. Although demonstrating…

    Submitted 6 July, 2024; originally announced July 2024.

    Comments: ICML 2024

  18. arXiv:2407.05013  [pdf, other]

    cs.CL

    Progress or Regress? Self-Improvement Reversal in Post-training

    Authors: Ting Wu, Xuefeng Li, Pengfei Liu

    Abstract: Self-improvement through post-training methods such as iterative preference learning has been acclaimed for enhancing the problem-solving capabilities (e.g., mathematical reasoning) of Large Language Models (LLMs) without human intervention. However, as exploration deepens, it becomes crucial to assess whether these improvements genuinely signify progress in solving more challenging problems or if…

    Submitted 6 July, 2024; originally announced July 2024.

  19. arXiv:2407.04923  [pdf, other]

    cs.CV cs.CL

    OmChat: A Recipe to Train Multimodal Language Models with Strong Long Context and Video Understanding

    Authors: Tiancheng Zhao, Qianqian Zhang, Kyusong Lee, Peng Liu, Lu Zhang, Chunxin Fang, Jiajia Liao, Kelei Jiang, Yibo Ma, Ruochen Xu

    Abstract: We introduce OmChat, a model designed to excel in handling long contexts and video understanding tasks. OmChat's new architecture standardizes how different visual inputs are processed, making it more efficient and adaptable. It uses a dynamic vision encoding process to effectively handle images of various resolutions, capturing fine details across a range of image qualities. OmChat utilizes an ac…

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: 14 pages

  20. arXiv:2407.04490  [pdf, other]

    cs.CV

    Micro-gesture Online Recognition using Learnable Query Points

    Authors: Pengyu Liu, Fei Wang, Kun Li, Guoliang Chen, Yanyan Wei, Shengeng Tang, Zhiliang Wu, Dan Guo

    Abstract: In this paper, we briefly introduce the solution developed by our team, HFUT-VUT, for the Micro-gesture Online Recognition track in the MiGA challenge at IJCAI 2024. The Micro-gesture Online Recognition task involves identifying the category and locating the start and end times of micro-gestures in video clips. Compared to the typical Temporal Action Detection task, the Micro-gesture Online Recogn…

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: Technical Report of HFUT-VUT for the MiGA challenge at IJCAI 2024

  21. arXiv:2407.03705  [pdf, other]

    cs.RO

    Energy-based Contact Planning under Uncertainty for Robot Air Hockey

    Authors: Julius Jankowski, Ante Marić, Puze Liu, Davide Tateo, Jan Peters, Sylvain Calinon

    Abstract: Planning robot contact often requires reasoning over a horizon to anticipate outcomes, making such planning problems computationally expensive. In this letter, we propose a learning framework for efficient contact planning in real-time subject to uncertain contact dynamics. We implement our approach for the example task of robot air hockey. Based on a learned stochastic model of puck dynamics, we…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: This work has been submitted to the IEEE Robotics & Automation Letters for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  22. arXiv:2407.02174  [pdf, other]

    cs.CV

    BeNeRF: Neural Radiance Fields from a Single Blurry Image and Event Stream

    Authors: Wenpu Li, Pian Wan, Peng Wang, Jinghang Li, Yi Zhou, Peidong Liu

    Abstract: Neural implicit representation of visual scenes has attracted a lot of attention in recent research of computer vision and graphics. Most prior methods focus on how to reconstruct 3D scene representation from a set of images. In this work, we demonstrate the possibility to recover the neural radiance fields (NeRF) from a single blurry image and its corresponding event stream. We model the camera m…

    Submitted 3 July, 2024; v1 submitted 2 July, 2024; originally announced July 2024.

    Comments: Accepted to ECCV 2024

  23. arXiv:2407.01636  [pdf, other]

    cs.CV

    Learning Frequency-Aware Dynamic Transformers for All-In-One Image Restoration

    Authors: Zenglin Shi, Tong Su, Pei Liu, Yunpeng Wu, Le Zhang, Meng Wang

    Abstract: This work aims to tackle the all-in-one image restoration task, which seeks to handle multiple types of degradation with a single model. The primary challenge is to extract degradation representations from the input degraded images and use them to guide the model's adaptation to specific degradation types. Recognizing that various degradations affect image content differently across frequency band…

    Submitted 30 June, 2024; originally announced July 2024.

    Comments: 8 pages

  24. arXiv:2407.01046  [pdf, other]

    cs.AI cs.CL

    FRoG: Evaluating Fuzzy Reasoning of Generalized Quantifiers in Large Language Models

    Authors: Yiyuan Li, Shichao Sun, Pengfei Liu

    Abstract: Fuzzy reasoning is vital due to the frequent use of imprecise information in daily contexts. However, the ability of current large language models (LLMs) to handle such reasoning remains largely uncharted. In this paper, we introduce a new benchmark, FRoG, for fuzzy reasoning, featuring real-world mathematical word problems that incorporate generalized quantifiers. Our experimental findings reveal…

    Submitted 2 July, 2024; v1 submitted 1 July, 2024; originally announced July 2024.

    Comments: Under review

  25. arXiv:2407.00578  [pdf, other]

    cs.RO

    UniQuad: A Unified and Versatile Quadrotor Platform Series for UAV Research and Application

    Authors: Yichen Zhang, Xinyi Chen, Peize Liu, Junzhe Wang, Hetai Zou, Neng Pan, Fei Gao, Shaojie Shen

    Abstract: As quadrotors take on an increasingly diverse range of roles, researchers often need to develop new hardware platforms tailored for specific tasks, introducing significant engineering overhead. In this article, we introduce the UniQuad series, a unified and versatile quadrotor platform series that offers high flexibility to adapt to a wide range of common tasks, excellent customizability for advan…

    Submitted 4 July, 2024; v1 submitted 29 June, 2024; originally announced July 2024.

    Comments: Submitted to 40th Anniversary of the IEEE Conference on Robotics and Automation (ICRA-X40)

  26. arXiv:2406.19741  [pdf, other]

    cs.RO cs.AI

    ROS-LLM: A ROS framework for embodied AI with task feedback and structured reasoning

    Authors: Christopher E. Mower, Yuhui Wan, Hongzhan Yu, Antoine Grosnit, Jonas Gonzalez-Billandon, Matthieu Zimmer, Jinlong Wang, Xinyu Zhang, Yao Zhao, Anbang Zhai, Puze Liu, Daniel Palenicek, Davide Tateo, Cesar Cadena, Marco Hutter, Jan Peters, Guangjian Tian, Yuzheng Zhuang, Kun Shao, Xingyue Quan, Jianye Hao, Jun Wang, Haitham Bou-Ammar

    Abstract: We present a framework for intuitive robot programming by non-experts, leveraging natural language prompts and contextual information from the Robot Operating System (ROS). Our system integrates large language models (LLMs), enabling non-experts to articulate task requirements to the system through a chat interface. Key features of the framework include: integration of ROS with an AI agent connect…

    Submitted 12 July, 2024; v1 submitted 28 June, 2024; originally announced June 2024.

    Comments: This document contains 26 pages and 13 figures

  27. arXiv:2406.17694  [pdf, other]

    cs.CR

    Protecting the 'Stop Using My Data' Right through Blockchain-assisted Evidence Generation

    Authors: Fan Zhang, Peng Liu

    Abstract: In order to provide personalized services to users, Internet-based platforms collect and utilize user-generated behavioral data. Although the 'stop using my data' right should be a fundamental data right, which allows individuals to request their personal data to be no longer utilized by online platforms, the existing preventive data protection measures (e.g., cryptographic data elimination, diffe…

    Submitted 29 June, 2024; v1 submitted 25 June, 2024; originally announced June 2024.

  28. arXiv:2406.17431  [pdf, other]

    cs.SE

    A Large-scale Investigation of Semantically Incompatible APIs behind Compatibility Issues in Android Apps

    Authors: Shidong Pan, Tianchen Guo, Lihong Zhang, Pei Liu, Zhenchang Xing, Xiaoyu Sun

    Abstract: Application Programming Interface (API) incompatibility is a long-standing issue in Android application development. The rapid evolution of Android APIs results in a significant number of API additions, removals, and changes between adjacent versions. Unfortunately, this high frequency of alterations may lead to compatibility issues, often without adequate notification to developers regarding thes…

    Submitted 26 June, 2024; v1 submitted 25 June, 2024; originally announced June 2024.

  29. arXiv:2406.16772  [pdf, other]

    cs.CL cs.AI

    OlympicArena Medal Ranks: Who Is the Most Intelligent AI So Far?

    Authors: Zhen Huang, Zengzhi Wang, Shijie Xia, Pengfei Liu

    Abstract: In this report, we pose the following question: Who is the most intelligent AI model to date, as measured by the OlympicArena (an Olympic-level, multi-discipline, multi-modal benchmark for superintelligent AI)? We specifically focus on the most recently released models: Claude-3.5-Sonnet, Gemini-1.5-Pro, and GPT-4o. For the first time, we propose using an Olympic medal Table approach to rank AI mo…

    Submitted 26 June, 2024; v1 submitted 24 June, 2024; originally announced June 2024.

    Comments: 10 pages

  30. arXiv:2406.14883  [pdf, other]

    cs.CL cs.CY

    OATH-Frames: Characterizing Online Attitudes Towards Homelessness with LLM Assistants

    Authors: Jaspreet Ranjit, Brihi Joshi, Rebecca Dorn, Laura Petry, Olga Koumoundouros, Jayne Bottarini, Peichen Liu, Eric Rice, Swabha Swayamdipta

    Abstract: Warning: Contents of this paper may be upsetting. Public attitudes towards key societal issues, expressed on online media, are of immense value in policy and reform efforts, yet challenging to understand at scale. We study one such social issue: homelessness in the U.S., by leveraging the remarkable capabilities of large language models to assist social work experts in analyzing millions of post…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: Project website: https://dill-lab.github.io/oath-frames/

  31. arXiv:2406.13261  [pdf, other]

    cs.CL cs.AI

    BeHonest: Benchmarking Honesty in Large Language Models

    Authors: Steffi Chern, Zhulin Hu, Yuqing Yang, Ethan Chern, Yuan Guo, Jiahe Jin, Binjie Wang, Pengfei Liu

    Abstract: Previous works on Large Language Models (LLMs) have mainly focused on evaluating their helpfulness or harmlessness. However, honesty, another crucial alignment criterion, has received relatively less attention. Dishonest behaviors in LLMs, such as spreading misinformation and defrauding users, present severe risks that intensify as these models approach superintelligent levels. Enhancing honesty i…

    Submitted 8 July, 2024; v1 submitted 19 June, 2024; originally announced June 2024.

  32. arXiv:2406.12753  [pdf, other]

    cs.CL cs.AI

    OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI

    Authors: Zhen Huang, Zengzhi Wang, Shijie Xia, Xuefeng Li, Haoyang Zou, Ruijie Xu, Run-Ze Fan, Lyumanshan Ye, Ethan Chern, Yixin Ye, Yikai Zhang, Yuqing Yang, Ting Wu, Binjie Wang, Shichao Sun, Yang Xiao, Yiyuan Li, Fan Zhou, Steffi Chern, Yiwei Qin, Yan Ma, Jiadi Su, Yixiu Liu, Yuxiang Zheng, Shaoting Zhang, et al. (3 additional authors not shown)

    Abstract: The evolution of Artificial Intelligence (AI) has been significantly accelerated by advancements in Large Language Models (LLMs) and Large Multimodal Models (LMMs), gradually showcasing potential cognitive reasoning abilities in problem-solving and scientific discovery (i.e., AI4Science) once exclusive to human intellect. To comprehensively evaluate current models' performance in cognitive reasoni…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 44 pages

  33. arXiv:2406.11562  [pdf, other]

    cs.LG cs.RO

    An Imitative Reinforcement Learning Framework for Autonomous Dogfight

    Authors: Siyuan Li, Rongchang Zuo, Peng Liu, Yingnan Zhao

    Abstract: Unmanned Combat Aerial Vehicle (UCAV) dogfight, which refers to a fight between two or more UCAVs usually at close quarters, plays a decisive role on the aerial battlefields. With the evolution of artificial intelligence, dogfight progressively transits towards intelligent and autonomous modes. However, the development of autonomous dogfight policy learning is hindered by challenges such as weak e…

    Submitted 17 June, 2024; originally announced June 2024.

  34. arXiv:2406.10514  [pdf, other]

    eess.AS cs.AI cs.CL cs.LG cs.SD

    GTR-Voice: Articulatory Phonetics Informed Controllable Expressive Speech Synthesis

    Authors: Zehua Kcriss Li, Meiying Melissa Chen, Yi Zhong, Pinxin Liu, Zhiyao Duan

    Abstract: Expressive speech synthesis aims to generate speech that captures a wide range of para-linguistic features, including emotion and articulation, though current research primarily emphasizes emotional aspects over the nuanced articulatory features mastered by professional voice actors. Inspired by this, we explore expressive speech synthesis through the lens of articulatory phonetics. Specifically,…

    Submitted 15 June, 2024; originally announced June 2024.

  35. arXiv:2406.10395  [pdf, other]

    eess.IV cs.CV q-bio.NC

    BrainFounder: Towards Brain Foundation Models for Neuroimage Analysis

    Authors: Joseph Cox, Peng Liu, Skylar E. Stolte, Yunchao Yang, Kang Liu, Kyle B. See, Huiwen Ju, Ruogu Fang

    Abstract: The burgeoning field of brain health research increasingly leverages artificial intelligence (AI) to interpret and analyze neurological data. This study introduces a novel approach towards the creation of medical foundation models by integrating a large-scale multi-modal magnetic resonance imaging (MRI) dataset derived from 41,400 participants in its own. Our method involves a novel two-stage pret…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: 17 pages, 5 figures, to be published in Medical Image Analysis

  36. arXiv:2406.08148  [pdf, other]

    cs.LG cs.AI

    Probing Implicit Bias in Semi-gradient Q-learning: Visualizing the Effective Loss Landscapes via the Fokker--Planck Equation

    Authors: Shuyu Yin, Fei Wen, Peilin Liu, Tao Luo

    Abstract: Semi-gradient Q-learning is applied in many fields, but due to the absence of an explicit loss function, studying its dynamics and implicit bias in the parameter space is challenging. This paper introduces the Fokker--Planck equation and employs partial data obtained through sampling to construct and visualize the effective loss landscape within a two-dimensional parameter space. This visualizatio…

    Submitted 12 June, 2024; originally announced June 2024.

  37. arXiv:2406.06965  [pdf, other]

    cs.CV

    Evolving from Single-modal to Multi-modal Facial Deepfake Detection: A Survey

    Authors: Ping Liu, Qiqi Tao, Joey Tianyi Zhou

    Abstract: This survey addresses the critical challenge of deepfake detection amidst the rapid advancements in artificial intelligence. As AI-generated media, including video, audio and text, become more realistic, the risk of misuse to spread misinformation and commit identity fraud increases. Focused on face-centric deepfakes, this work traces the evolution from traditional single-modality methods to sophi…

    Submitted 14 July, 2024; v1 submitted 11 June, 2024; originally announced June 2024.

    Comments: P. Liu is with the Department of Computer Science and Engineering, University of Nevada, Reno, NV, 89512. Q. Tao and J. Zhou are with Centre for Frontier AI Research (CFAR), and Institute of High Performance Computing (IHPC), A*STAR, Singapore. J. Zhou is also with Centre for Advanced Technologies in Online Safety (CATOS), A*STAR, Singapore. J. Zhou is the corresponding author

  38. arXiv:2406.05690  [pdf, other]

    cs.CL

    MoPS: Modular Story Premise Synthesis for Open-Ended Automatic Story Generation

    Authors: Yan Ma, Yu Qiao, Pengfei Liu

    Abstract: A story premise succinctly defines a story's main idea, foundation, and trajectory. It serves as the initial trigger in automatic story generation. Existing sources of story premises are limited by a lack of diversity, uneven quality, and high costs that make them difficult to scale. In response, we introduce Modular Story Premise Synthesis (MoPS) which breaks down story premises into modules like…

    Submitted 9 June, 2024; originally announced June 2024.

    Comments: ACL 2024, camera-ready

  39. arXiv:2406.03751  [pdf, other]

    cs.LG

    Adaptive Multi-Scale Decomposition Framework for Time Series Forecasting

    Authors: Yifan Hu, Peiyuan Liu, Peng Zhu, Dawei Cheng, Tao Dai

    Abstract: Transformer-based and MLP-based methods have emerged as leading approaches in time series forecasting (TSF). While Transformer-based methods excel in capturing long-range dependencies, they suffer from high computational complexities and tend to overfit. Conversely, MLP-based methods offer computational efficiency and adeptness in modeling temporal dynamics, but they struggle with capturing comple…

    Submitted 6 June, 2024; originally announced June 2024.

  40. arXiv:2406.02212  [pdf, other]

    cs.CE

    Generative Pre-Trained Diffusion Paradigm for Zero-Shot Time Series Forecasting

    Authors: Jiarui Yang, Tao Dai, Naiqi Li, Junxi Wu, Peiyuan Liu, Jinmin Li, Jigang Bao, Haigang Zhang, Shutao Xia

    Abstract: In recent years, generative pre-trained paradigms such as Large Language Models (LLMs) and Large Vision Models (LVMs) have achieved revolutionary advancements and widespread real-world applications. Particularly, the emergence of pre-trained LLMs-based temporal works, compared to previous deep model approaches, has demonstrated superior generalization and robustness, showcasing the potential of ge…

    Submitted 4 June, 2024; originally announced June 2024.

  41. arXiv:2406.02066  [pdf, other]

    cs.LG q-bio.BM

    Preference Optimization for Molecule Synthesis with Conditional Residual Energy-based Models

    Authors: Songtao Liu, Hanjun Dai, Yue Zhao, Peng Liu

    Abstract: Molecule synthesis through machine learning is one of the fundamental problems in drug discovery. Current data-driven strategies employ one-step retrosynthesis models and search algorithms to predict synthetic routes in a top-bottom manner. Despite their effective performance, these strategies face limitations in the molecule synthetic route generation due to a greedy selection of the next molecul…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: Accepted by ICML 2024 (Oral)

  42. arXiv:2406.01252  [pdf, other]

    cs.CL cs.AI stat.ML

    Towards Scalable Automated Alignment of LLMs: A Survey

    Authors: Boxi Cao, Keming Lu, Xinyu Lu, Jiawei Chen, Mengjie Ren, Hao Xiang, Peilin Liu, Yaojie Lu, Ben He, Xianpei Han, Le Sun, Hongyu Lin, Bowen Yu

    Abstract: Alignment is the most critical step in building large language models (LLMs) that meet human needs. With the rapid development of LLMs gradually surpassing human capabilities, traditional alignment methods based on human-annotation are increasingly unable to meet the scalability demands. Therefore, there is an urgent need to explore new sources of automated alignment signals and technical approach…

    Submitted 16 July, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

  43. arXiv:2406.00507  [pdf, other]

    cs.CL cs.AI

    Prompt Chaining or Stepwise Prompt? Refinement in Text Summarization

    Authors: Shichao Sun, Ruifeng Yuan, Ziqiang Cao, Wenjie Li, Pengfei Liu

    Abstract: Large language models (LLMs) have demonstrated the capacity to improve summary quality by mirroring a human-like iterative process of critique and refinement starting from the initial draft. Two strategies are designed to perform this iterative process: Prompt Chaining and Stepwise Prompt. Prompt chaining orchestrates the drafting, critiquing, and refining phases through a series of three discrete…

    Submitted 1 June, 2024; originally announced June 2024.

    Comments: Accepted to Findings of ACL 2024

  44. arXiv:2406.00050  [pdf, other]

    cs.CL cs.AI

    An Empirical Analysis on Large Language Models in Debate Evaluation

    Authors: Xinyi Liu, Pinxin Liu, Hangfeng He

    Abstract: In this study, we investigate the capabilities and inherent biases of advanced large language models (LLMs) such as GPT-3.5 and GPT-4 in the context of debate evaluation. We discover that LLM's performance exceeds humans and surpasses the performance of state-of-the-art methods fine-tuned on extensive datasets in debate evaluation. We additionally explore and analyze biases present in LLMs, includ…

    Submitted 4 June, 2024; v1 submitted 28 May, 2024; originally announced June 2024.

    Comments: Accepted to ACL 2024 main

  45. arXiv:2405.20727  [pdf, other]

    cs.CR cs.AI cs.DC

    GANcrop: A Contrastive Defense Against Backdoor Attacks in Federated Learning

    Authors: Xiaoyun Gan, Shanyu Gan, Taizhi Su, Peng Liu

    Abstract: With heightened awareness of data privacy protection, Federated Learning (FL) has attracted widespread attention as a privacy-preserving distributed machine learning method. However, the distributed nature of federated learning also provides opportunities for backdoor attacks, where attackers can guide the model to produce incorrect predictions without affecting the global model training process.…

    Submitted 31 May, 2024; originally announced May 2024.

  46. arXiv:2405.19085  [pdf, other]

    cs.AI cs.CV

    Patch-enhanced Mask Encoder Prompt Image Generation

    Authors: Shusong Xu, Peiye Liu

    Abstract: Artificial Intelligence Generated Content (AIGC), known for its superior visual results, represents a promising mitigation method for high-cost advertising applications. Numerous approaches have been developed to manipulate generated content under different conditions. However, a crucial limitation lies in the accurate description of products in advertising applications. Applying previous methods d…

    Submitted 29 May, 2024; originally announced May 2024.

  47. arXiv:2405.17198  [pdf, other]

    cs.LG math.OC

    Convex Relaxation for Solving Large-Margin Classifiers in Hyperbolic Space

    Authors: Sheng Yang, Peihan Liu, Cengiz Pehlevan

    Abstract: Hyperbolic spaces have increasingly been recognized for their outstanding performance in handling data with inherent hierarchical structures compared to their Euclidean counterparts. However, learning in hyperbolic spaces poses significant challenges. In particular, extending support vector machines to hyperbolic spaces is in general a constrained non-convex optimization problem. Previous and popu…

    Submitted 27 May, 2024; originally announced May 2024.

  48. arXiv:2405.14486  [pdf, other]

    cs.CL

    RefChecker: Reference-based Fine-grained Hallucination Checker and Benchmark for Large Language Models

    Authors: Xiangkun Hu, Dongyu Ru, Lin Qiu, Qipeng Guo, Tianhang Zhang, Yang Xu, Yun Luo, Pengfei Liu, Yue Zhang, Zheng Zhang

    Abstract: Large Language Models (LLMs) have shown impressive capabilities but also a concerning tendency to hallucinate. This paper presents RefChecker, a framework that introduces claim-triplets to represent claims in LLM responses, aiming to detect fine-grained hallucinations. In RefChecker, an extractor generates claim-triplets from a response, which are then evaluated by a checker against a reference. W…

    Submitted 23 May, 2024; originally announced May 2024.

  49. arXiv:2405.13238  [pdf]

    cs.IR cs.LG

    Enhancing User Interest based on Stream Clustering and Memory Networks in Large-Scale Recommender Systems

    Authors: Peng Liu, Nian Wang, Cong Xu, Ming Zhao, Bin Wang, Yi Ren

    Abstract: Recommender Systems (RSs) provide personalized recommendation service based on user interest, which are widely used in various platforms. However, there are lots of users with sparse interest due to lacking consumption behaviors, which leads to poor recommendation results for them. This problem is widespread in large-scale RSs and is particularly difficult to address. To solve this problem, we pro…

    Submitted 26 May, 2024; v1 submitted 21 May, 2024; originally announced May 2024.

  50. arXiv:2405.13089  [pdf, other]

    cs.LG

    SEGAN: semi-supervised learning approach for missing data imputation

    Authors: Xiaohua Pan, Weifeng Wu, Peiran Liu, Zhen Li, Peng Lu, Peijian Cao, Jianfeng Zhang, Xianfei Qiu, YangYang Wu

    Abstract: In many practical real-world applications, data missing is a very common phenomenon, making the development of data-driven artificial intelligence theory and technology increasingly difficult. Data completion is an important method for missing data preprocessing. Most existing missing data completion models directly use the known information in the missing data set but ignore the impact of the da…

    Submitted 12 June, 2024; v1 submitted 21 May, 2024; originally announced May 2024.