
Showing 151–200 of 1,033 results for author: Gu, J

  1. arXiv:2402.04676  [pdf, other]

    cs.LG cs.AI cs.CV

    Group Distributionally Robust Dataset Distillation with Risk Minimization

    Authors: Saeed Vahidian, Mingyu Wang, Jianyang Gu, Vyacheslav Kungurtsev, Wei Jiang, Yiran Chen

    Abstract: Dataset distillation (DD) has emerged as a widely adopted technique for crafting a synthetic dataset that captures the essential information of a training dataset, facilitating the training of accurate neural models. Its applications span various domains, including transfer learning, federated learning, and neural architecture search. The most popular methods for constructing the synthetic data re…

    Submitted 11 March, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

  2. arXiv:2402.03628  [pdf, other]

    cs.CL

    Professional Agents -- Evolving Large Language Models into Autonomous Experts with Human-Level Competencies

    Authors: Zhixuan Chu, Yan Wang, Feng Zhu, Lu Yu, Longfei Li, Jinjie Gu

    Abstract: The advent of large language models (LLMs) such as ChatGPT, PaLM, and GPT-4 has catalyzed remarkable advances in natural language processing, demonstrating human-like language fluency and reasoning capacities. This position paper introduces the concept of Professional Agents (PAgents), an application framework harnessing LLM capabilities to create autonomous agents with controllable, specialized,…

    Submitted 5 February, 2024; originally announced February 2024.

    Comments: 14 pages, 1 figure

  3. arXiv:2402.03190  [pdf, other]

    cs.CL cs.AI cs.IR cs.LG cs.MM

    Unified Hallucination Detection for Multimodal Large Language Models

    Authors: Xiang Chen, Chenxi Wang, Yida Xue, Ningyu Zhang, Xiaoyan Yang, Qiang Li, Yue Shen, Lei Liang, Jinjie Gu, Huajun Chen

    Abstract: Despite significant strides in multimodal tasks, Multimodal Large Language Models (MLLMs) are plagued by the critical issue of hallucination. The reliable detection of such hallucinations in MLLMs has, therefore, become a vital aspect of model evaluation and the safeguarding of practical application deployment. Prior research in this domain has been constrained by a narrow focus on singular tasks,…

    Submitted 27 May, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

    Comments: Accepted by ACL 2024 (main conference)

  4. arXiv:2402.00893  [pdf, other]

    cs.LG cs.AI

    MoDE: A Mixture-of-Experts Model with Mutual Distillation among the Experts

    Authors: Zhitian Xie, Yinger Zhang, Chenyi Zhuang, Qitao Shi, Zhining Liu, Jinjie Gu, Guannan Zhang

    Abstract: The application of mixture-of-experts (MoE) is gaining popularity due to its ability to improve a model's performance. In an MoE structure, the gate layer plays a significant role in distinguishing and routing input features to different experts. This enables each expert to specialize in processing its corresponding sub-tasks. However, the gate's routing mechanism also gives rise to narrow vision:…

    Submitted 30 January, 2024; originally announced February 2024.

    Comments: Accepted by AAAI-24

  5. arXiv:2402.00390  [pdf, other]

    cs.IR cs.AI

    EASRec: Elastic Architecture Search for Efficient Long-term Sequential Recommender Systems

    Authors: Sheng Zhang, Maolin Wang, Yao Zhao, Chenyi Zhuang, Jinjie Gu, Ruocheng Guo, Xiangyu Zhao, Zijian Zhang, Hongzhi Yin

    Abstract: In this age where data is abundant, the ability to distill meaningful insights from the sea of information is essential. Our research addresses the computational and resource inefficiencies that current Sequential Recommender Systems (SRSs) suffer from, especially those employing attention-based models like SASRec. These systems are designed for next-item recommendations in various applications, f…

    Submitted 1 February, 2024; originally announced February 2024.

  6. arXiv:2402.00059  [pdf, other]

    cs.LG cs.AI physics.ao-ph

    FengWu-GHR: Learning the Kilometer-scale Medium-range Global Weather Forecasting

    Authors: Tao Han, Song Guo, Fenghua Ling, Kang Chen, Junchao Gong, Jingjia Luo, Junxia Gu, Kan Dai, Wanli Ouyang, Lei Bai

    Abstract: Kilometer-scale modeling of global atmosphere dynamics enables fine-grained weather forecasting and decreases the risk of disastrous weather and climate activity. Therefore, building a kilometer-scale global forecast model is a persistent pursuit in the meteorology domain. Active international efforts have been made in past decades to improve the spatial resolution of numerical weather models. Non…

    Submitted 28 January, 2024; originally announced February 2024.

    Comments: 19 pages

  7. arXiv:2401.17623  [pdf, other]

    cs.CL

    Neighboring Perturbations of Knowledge Editing on Large Language Models

    Authors: Jun-Yu Ma, Zhen-Hua Ling, Ningyu Zhang, Jia-Chen Gu

    Abstract: Despite their exceptional capabilities, large language models (LLMs) are prone to generating unintended text due to false or outdated knowledge. Given the resource-intensive nature of retraining LLMs, there has been a notable increase in the development of knowledge editing. However, current approaches and evaluations rarely explore the perturbation of editing on neighboring knowledge. This paper…

    Submitted 26 May, 2024; v1 submitted 31 January, 2024; originally announced January 2024.

    Comments: Accepted by ICML 2024

  8. arXiv:2401.17450  [pdf, other]

    quant-ph cs.AR eess.SY

    Qplacer: Frequency-Aware Component Placement for Superconducting Quantum Computers

    Authors: Junyao Zhang, Hanrui Wang, Qi Ding, Jiaqi Gu, Reouven Assouly, William D. Oliver, Song Han, Kenneth R. Brown, Hai "Helen" Li, Yiran Chen

    Abstract: Noisy Intermediate-Scale Quantum (NISQ) computers face a critical limitation in qubit numbers, hindering their progression towards large-scale and fault-tolerant quantum computing. A significant challenge impeding scaling is crosstalk, characterized by unwanted interactions among neighboring components on quantum chips, including qubits, resonators, and substrate. We motivate a general approach to…

    Submitted 8 May, 2024; v1 submitted 30 January, 2024; originally announced January 2024.

  9. arXiv:2401.16184  [pdf, other]

    cs.CL cs.LG

    Realizing Disentanglement in LM Latent Space via Vocabulary-Defined Semantics

    Authors: Jian Gu, Aldeida Aleti, Chunyang Chen, Hongyu Zhang

    Abstract: Understanding the latent space of language models (LMs) is important for improving the performance and interpretability of LMs. Existing analyses often fail to provide insights that take advantage of the semantic properties of language models and often overlook crucial aspects of language model adaptation. In response, we introduce a pioneering approach called vocabulary-defined semantics, which e…

    Submitted 26 May, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

    Comments: under peer-review

  10. arXiv:2401.15884  [pdf, other]

    cs.CL

    Corrective Retrieval Augmented Generation

    Authors: Shi-Qi Yan, Jia-Chen Gu, Yun Zhu, Zhen-Hua Ling

    Abstract: Large language models (LLMs) inevitably exhibit hallucinations since the accuracy of generated texts cannot be secured solely by the parametric knowledge they encapsulate. Although retrieval-augmented generation (RAG) is a practicable complement to LLMs, it relies heavily on the relevance of retrieved documents, raising concerns about how the model behaves if retrieval goes wrong. To this end, we…

    Submitted 16 February, 2024; v1 submitted 28 January, 2024; originally announced January 2024.

  11. arXiv:2401.15847  [pdf, other]

    cs.CV cs.AI cs.CL

    Muffin or Chihuahua? Challenging Multimodal Large Language Models with Multipanel VQA

    Authors: Yue Fan, Jing Gu, Kaiwen Zhou, Qianqi Yan, Shan Jiang, Ching-Chen Kuo, Xinze Guan, Xin Eric Wang

    Abstract: Multipanel images, commonly seen as web screenshots, posters, etc., pervade our daily lives. These images, characterized by their composition of multiple subfigures in distinct layouts, effectively convey information to people. Toward building advanced multimodal AI applications, such as agents that understand complex scenes and navigate through webpages, the skill of multipanel visual reasoning i…

    Submitted 27 June, 2024; v1 submitted 28 January, 2024; originally announced January 2024.

    Comments: ACL 2024

  12. arXiv:2401.14823  [pdf, other]

    cs.NI

    A Deep Reinforcement Learning-based Approach for Adaptive Handover Protocols in Mobile Networks

    Authors: Peter J. Gu, Johannes Voigt, Peter M. Rost

    Abstract: Due to an ever-increasing number of participants and new areas of application, the demands on mobile communications systems are continually increasing. In order to deliver higher data rates, enable mobility and guarantee QoS requirements of subscribers, these systems and the protocols used are becoming more complex. By using higher frequency spectrums, cells become smaller and more base stations h…

    Submitted 26 January, 2024; originally announced January 2024.

    Comments: Submitted to EuCNC

  13. arXiv:2401.14742  [pdf, other]

    hep-ph astro-ph.CO gr-qc physics.atom-ph

    Violation of the equivalence principle induced by oscillating rest mass and transition frequency, and its detection in atom interferometers

    Authors: Jordan Gué, Aurélien Hees, Peter Wolf

    Abstract: We present a theoretical investigation of the expected experimental signals produced by freely falling atoms with time oscillating mass and transition frequency. These oscillations could be produced in a variety of models, in particular, models of scalar dark matter (DM) non universally coupled to the standard matter (SM) such as axion-like particles (ALP) and dilatons. Performing complete and rig…

    Submitted 15 May, 2024; v1 submitted 26 January, 2024; originally announced January 2024.

    Comments: 30 pages, 7 figures, 2 tables

  14. arXiv:2401.13627  [pdf, other]

    cs.CV

    Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild

    Authors: Fanghua Yu, Jinjin Gu, Zheyuan Li, Jinfan Hu, Xiangtao Kong, Xintao Wang, Jingwen He, Yu Qiao, Chao Dong

    Abstract: We introduce SUPIR (Scaling-UP Image Restoration), a groundbreaking image restoration method that harnesses generative prior and the power of model scaling up. Leveraging multi-modal techniques and advanced generative prior, SUPIR marks a significant advance in intelligent and realistic image restoration. As a pivotal catalyst within SUPIR, model scaling dramatically enhances its capabilities and…

    Submitted 3 April, 2024; v1 submitted 24 January, 2024; originally announced January 2024.

    Comments: This paper has been accepted by CVPR 2024

  15. arXiv:2401.13354  [pdf, other]

    cs.OS cs.NI

    Characterizing Network Requirements for GPU API Remoting in AI Applications

    Authors: Tianxia Wang, Zhuofu Chen, Xingda Wei, Jinyu Gu, Rong Chen, Haibo Chen

    Abstract: GPU remoting is a promising technique for supporting AI applications. Networking plays a key role in enabling remoting. However, for efficient remoting, the network requirements in terms of latency and bandwidth are unknown. In this paper, we take a GPU-centric approach to derive the minimum latency and bandwidth requirements for GPU remoting, while ensuring no (or little) performance degradation…

    Submitted 24 January, 2024; originally announced January 2024.

  16. arXiv:2401.11505  [pdf, other]

    cs.CL cs.IR

    CheX-GPT: Harnessing Large Language Models for Enhanced Chest X-ray Report Labeling

    Authors: Jawook Gu, Han-Cheol Cho, Jiho Kim, Kihyun You, Eun Kyoung Hong, Byungseok Roh

    Abstract: Free-text radiology reports present a rich data source for various medical tasks, but effectively labeling these texts remains challenging. Traditional rule-based labeling methods fall short of capturing the nuances of diverse free-text patterns. Moreover, models using expert-annotated data are limited by data scarcity and pre-defined classes, impacting their performance, flexibility and scalabili…

    Submitted 21 January, 2024; originally announced January 2024.

    Comments: 16 pages, 3 figures

  17. arXiv:2401.11170  [pdf, other]

    cs.CV cs.CR

    Inducing High Energy-Latency of Large Vision-Language Models with Verbose Images

    Authors: Kuofeng Gao, Yang Bai, Jindong Gu, Shu-Tao Xia, Philip Torr, Zhifeng Li, Wei Liu

    Abstract: Large vision-language models (VLMs) such as GPT-4 have achieved exceptional performance across various multi-modal tasks. However, the deployment of VLMs necessitates substantial energy consumption and computational resources. Once attackers maliciously induce high energy consumption and latency time (energy-latency cost) during inference of VLMs, it will exhaust computational resources. In this p…

    Submitted 22 March, 2024; v1 submitted 20 January, 2024; originally announced January 2024.

    Comments: Accepted by ICLR 2024

  18. arXiv:2401.10559  [pdf, other]

    cs.LG cs.AI cs.CL

    OrchMoE: Efficient Multi-Adapter Learning with Task-Skill Synergy

    Authors: Haowen Wang, Tao Sun, Kaixiang Ji, Jian Wang, Cong Fan, Jinjie Gu

    Abstract: We advance the field of Parameter-Efficient Fine-Tuning (PEFT) with our novel multi-adapter method, OrchMoE, which capitalizes on modular skill architecture for enhanced forward transfer in neural networks. Unlike prior models that depend on explicit task identification inputs, OrchMoE automatically discerns task categories, streamlining the learning process. This is achieved through an integrated…

    Submitted 19 January, 2024; originally announced January 2024.

    Comments: 9 pages, 3 figures

  19. arXiv:2401.08941   

    stat.ME

    A Powerful and Precise Feature-level Filter using Group Knockoffs

    Authors: Jiaqi Gu, Zihuai He

    Abstract: Selecting important features that have substantial effects on the response with provable type-I error rate control is a fundamental concern in statistics, with wide-ranging practical applications. Existing knockoff filters, although shown to provide theoretical guarantees on false discovery rate (FDR) control, often struggle to strike a balance between high power and precision in pinpointing import…

    Submitted 27 February, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

    Comments: We need a major revision of this paper

  20. arXiv:2401.07103  [pdf, other]

    cs.CL

    Leveraging Large Language Models for NLG Evaluation: Advances and Challenges

    Authors: Zhen Li, Xiaohan Xu, Tao Shen, Can Xu, Jia-Chen Gu, Yuxuan Lai, Chongyang Tao, Shuai Ma

    Abstract: In the rapidly evolving domain of Natural Language Generation (NLG) evaluation, introducing Large Language Models (LLMs) has opened new avenues for assessing generated content quality, e.g., coherence, creativity, and context relevance. This paper aims to provide a thorough overview of leveraging LLMs for NLG evaluation, a burgeoning area that lacks a systematic analysis. We propose a coherent tax…

    Submitted 12 June, 2024; v1 submitted 13 January, 2024; originally announced January 2024.

    Comments: 21 pages, 5 figures

  21. arXiv:2401.05571  [pdf, other]

    quant-ph cs.AR cs.LG

    QuantumSEA: In-Time Sparse Exploration for Noise Adaptive Quantum Circuits

    Authors: Tianlong Chen, Zhenyu Zhang, Hanrui Wang, Jiaqi Gu, Zirui Li, David Z. Pan, Frederic T. Chong, Song Han, Zhangyang Wang

    Abstract: Parameterized Quantum Circuits (PQC) have obtained increasing popularity thanks to their great potential for near-term Noisy Intermediate-Scale Quantum (NISQ) computers. Achieving quantum advantages usually requires a large number of qubits and quantum circuits with enough capacity. However, limited coherence time and massive quantum noises severely constrain the size of quantum circuits that can…

    Submitted 10 January, 2024; originally announced January 2024.

    Comments: IEEE International Conference on Quantum Computing and Engineering (QCE 2023)

  22. arXiv:2401.05391  [pdf]

    cs.AR cs.AI

    Efficient LLM inference solution on Intel GPU

    Authors: Hui Wu, Yi Gan, Feng Yuan, Jing Ma, Wei Zhu, Yutao Xu, Hong Zhu, Yuhua Zhu, Xiaoli Liu, Jinghui Gu, Peng Zhao

    Abstract: Transformer-based Large Language Models (LLMs) have been widely used in many fields, and the efficiency of LLM inference has become a hot topic in real applications. However, LLMs usually have complicated model structures with massive operations and perform inference in the auto-regressive mode, making it a challenging task to design a system with high efficiency. In this paper, we propos…

    Submitted 23 June, 2024; v1 submitted 19 December, 2023; originally announced January 2024.

  23. arXiv:2401.04700  [pdf, other]

    cs.CL

    Model Editing Harms General Abilities of Large Language Models: Regularization to the Rescue

    Authors: Jia-Chen Gu, Hao-Xiang Xu, Jun-Yu Ma, Pan Lu, Zhen-Hua Ling, Kai-Wei Chang, Nanyun Peng

    Abstract: Model editing is a technique that edits large language models (LLMs) with updated knowledge to alleviate hallucinations without resource-intensive retraining. While current model editing methods can effectively modify a model's behavior within a specific area of interest, they often overlook the potential unintended side effects on the general abilities of LLMs such as reasoning, natural langu…

    Submitted 16 June, 2024; v1 submitted 9 January, 2024; originally announced January 2024.

    Comments: Propose a new regularization method

  24. arXiv:2401.04319  [pdf, other]

    cs.CL cs.AI

    Know Your Needs Better: Towards Structured Understanding of Marketer Demands with Analogical Reasoning Augmented LLMs

    Authors: Junjie Wang, Dan Yang, Binbin Hu, Yue Shen, Wen Zhang, Jinjie Gu

    Abstract: In this paper, we explore a new way for user targeting, where non-expert marketers could select their target users solely given demands in natural language form. The key to this issue is how to transform natural languages into practical structured logical languages, i.e., the structured understanding of marketer demands. In practical scenarios, the demands of non-expert marketers are often abstrac…

    Submitted 11 June, 2024; v1 submitted 8 January, 2024; originally announced January 2024.

    Comments: Accepted by KDD 2024

  25. arXiv:2401.04115  [pdf, ps, other]

    math.AP

    Soliton resolution for the energy critical damped wave equations in the radial case

    Authors: Jingyuan Gu, Lifeng Zhao

    Abstract: We consider the energy-critical damped wave equation \begin{equation*} \partial_{tt}u-\Delta u+\alpha\partial_t u=\left|u\right|^{\frac{4}{D-2}}u \end{equation*} with radial initial data in dimensions $D\geq 4$. The equation has a nontrivial radial stationary solution $W$, called the ground state, which is unique up to sign and scale. We prove that any bounded energy norm solution behaves asymptotically as a sup…

    Submitted 11 March, 2024; v1 submitted 22 December, 2023; originally announced January 2024.

    Comments: arXiv admin note: text overlap with arXiv:2203.09614, arXiv:2106.10738 by other authors

  26. arXiv:2401.03512  [pdf, other]

    cs.CL cs.AI cs.LG

    CharPoet: A Chinese Classical Poetry Generation System Based on Token-free LLM

    Authors: Chengyue Yu, Lei Zang, Jiaotuan Wang, Chenyi Zhuang, Jinjie Gu

    Abstract: Automatic Chinese classical poetry generation has attracted much research interest, but achieving effective control over format and content simultaneously remains challenging. Traditional systems usually accept keywords as user inputs, resulting in limited control over content. Large language models (LLMs) improve content control by allowing unrestricted user instructions, but the token-by-token g…

    Submitted 20 March, 2024; v1 submitted 7 January, 2024; originally announced January 2024.

  27. arXiv:2401.02474  [pdf, other]

    hep-ph hep-ex

    From Optimal Observables to Machine Learning: an Effective-Field-Theory Analysis of $e^+e^- \to W^+W^-$ at Future Lepton Colliders

    Authors: Shengdu Chai, Jiayin Gu, Lingfeng Li

    Abstract: We apply machine-learning techniques to the effective-field-theory analysis of the $e^+e^- \to W^+W^-$ processes at future lepton colliders, and demonstrate their advantages in comparison with conventional methods, such as optimal observables. Compared to traditional algorithms, we show that simulation-based inference methods are more robust to detector effects and backgrounds, and could in princi…

    Submitted 30 June, 2024; v1 submitted 4 January, 2024; originally announced January 2024.

    Comments: 31 pages, 8 figures, minor updates

  28. arXiv:2401.01377  [pdf, other]

    cs.CR cs.AI

    Does Few-shot Learning Suffer from Backdoor Attacks?

    Authors: Xinwei Liu, Xiaojun Jia, Jindong Gu, Yuan Xun, Siyuan Liang, Xiaochun Cao

    Abstract: The field of few-shot learning (FSL) has shown promising results in scenarios where training data is limited, but its vulnerability to backdoor attacks remains largely unexplored. We first explore this topic by evaluating the performance of the existing backdoor attack methods on few-shot learning scenarios. Unlike in standard supervised learning, existing backdoor attack methods failed to p…

    Submitted 31 December, 2023; originally announced January 2024.

    Comments: AAAI2024

  29. arXiv:2401.01286  [pdf, other]

    cs.CL cs.AI cs.CV cs.HC cs.LG

    A Comprehensive Study of Knowledge Editing for Large Language Models

    Authors: Ningyu Zhang, Yunzhi Yao, Bozhong Tian, Peng Wang, Shumin Deng, Mengru Wang, Zekun Xi, Shengyu Mao, Jintian Zhang, Yuansheng Ni, Siyuan Cheng, Ziwen Xu, Xin Xu, Jia-Chen Gu, Yong Jiang, Pengjun Xie, Fei Huang, Lei Liang, Zhiqiang Zhang, Xiaowei Zhu, Jun Zhou, Huajun Chen

    Abstract: Large Language Models (LLMs) have shown extraordinary capabilities in understanding and generating text that closely mirrors human communication. However, a primary limitation lies in the significant computational demands during training, arising from their extensive parameterization. This challenge is further intensified by the dynamic nature of the world, necessitating frequent updates to LLMs t…

    Submitted 28 March, 2024; v1 submitted 2 January, 2024; originally announced January 2024.

    Comments: Ongoing work; 52 pages, 282 citations; benchmark is available at https://huggingface.co/datasets/zjunlp/KnowEdit; code is available at https://github.com/zjunlp/EasyEdit; paper list is available at https://github.com/zjunlp/KnowledgeEditingPapers

  30. arXiv:2312.17624  [pdf, other]

    cs.LG cs.AI

    XAI for In-hospital Mortality Prediction via Multimodal ICU Data

    Authors: Xingqiao Li, Jindong Gu, Zhiyong Wang, Yancheng Yuan, Bo Du, Fengxiang He

    Abstract: Predicting in-hospital mortality for intensive care unit (ICU) patients is key to final clinical outcomes. AI has shown superior accuracy but suffers from a lack of explainability. To address this issue, this paper proposes an eXplainable Multimodal Mortality Predictor (X-MMP) approaching an efficient, explainable AI solution for predicting in-hospital mortality via multimodal ICU data. We emp…

    Submitted 29 December, 2023; originally announced December 2023.

  31. GreenFlow: A Computation Allocation Framework for Building Environmentally Sound Recommendation System

    Authors: Xingyu Lu, Zhining Liu, Yanchu Guan, Hongxuan Zhang, Chenyi Zhuang, Wenqi Ma, Yize Tan, Jinjie Gu, Guannan Zhang

    Abstract: Given the enormous number of users and items, industrial cascade recommendation systems (RS) are continuously expanded in size and complexity to deliver relevant items, such as news, services, and commodities, to the appropriate users. In a real-world scenario with hundreds of thousands of requests per second, significant computation is required to infer personalized results for each request, resulti…

    Submitted 15 December, 2023; originally announced December 2023.

    Journal ref: Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence AI for Good. Pages 6103-6111

  32. arXiv:2312.15939  [pdf]

    cond-mat.mes-hall cond-mat.mtrl-sci

    Acousto-drag photovoltaic effect by piezoelectric integration of two-dimensional semiconductors

    Authors: Jiaming Gu, Yicheng Mou, Jianwen Ma, Haonan Chen, Chuanxin Zhang, Yuxiang Wang, Jiayu Wang, Hangwen Guo, Wu Shi, Xiang Yuan, Xue Jiang, Dean Ta, Jian Shen, Cheng Zhang

    Abstract: Light-to-electricity conversion is crucial for energy harvesting and photodetection, requiring efficient electron-hole pair separation to prevent recombination. Traditional junction-based mechanisms using built-in electric fields fail in non-barrier regions. Homogeneous material harvesting under photovoltaic effect is appealing but only realized in non-centrosymmetric systems via bulk photovoltai…

    Submitted 8 August, 2024; v1 submitted 26 December, 2023; originally announced December 2023.

    Comments: 14 pages, 4 figures

  33. arXiv:2312.12871  [pdf, other]

    cs.LG stat.ML

    Effect Size Estimation for Duration Recommendation in Online Experiments: Leveraging Hierarchical Models and Objective Utility Approaches

    Authors: Yu Liu, Runzhe Wan, James McQueen, Doug Hains, Jinxiang Gu, Rui Song

    Abstract: The selection of the assumed effect size (AES) critically determines the duration of an experiment, and hence its accuracy and efficiency. Traditionally, experimenters determine AES based on domain knowledge. However, this method becomes impractical for online experimentation services managing numerous experiments, and a more automated approach is hence in great demand. We initiate the study of da…

    Submitted 17 April, 2024; v1 submitted 20 December, 2023; originally announced December 2023.

  34. arXiv:2312.12742  [pdf, other]

    cs.CV

    Cached Transformers: Improving Transformers with Differentiable Memory Cache

    Authors: Zhaoyang Zhang, Wenqi Shao, Yixiao Ge, Xiaogang Wang, Jinwei Gu, Ping Luo

    Abstract: This work introduces a new Transformer model called Cached Transformer, which uses Gated Recurrent Cached (GRC) attention to extend the self-attention mechanism with a differentiable memory cache of tokens. GRC attention enables attending to both past and current tokens, increasing the receptive field of attention and allowing for exploring long-range dependencies. By utilizing a recurrent gating…

    Submitted 19 December, 2023; originally announced December 2023.

    Comments: AAAI 2024

  35. arXiv:2312.12728  [pdf, other]

    cs.IR cs.AI cs.LG

    Lookahead: An Inference Acceleration Framework for Large Language Model with Lossless Generation Accuracy

    Authors: Yao Zhao, Zhitian Xie, Chen Liang, Chenyi Zhuang, Jinjie Gu

    Abstract: As Large Language Models (LLMs) have made significant advancements across various tasks, such as question answering, translation, text summarization, and dialogue systems, the need for accuracy in information becomes crucial, especially for serious financial products serving billions of users like Alipay. However, for a real-world product serving millions of users, the inference speed of LLMs beco…

    Submitted 30 May, 2024; v1 submitted 19 December, 2023; originally announced December 2023.

    Comments: 10 pages, 6 figures

  36. arXiv:2312.09785  [pdf, other]

    cs.CL

    RJUA-QA: A Comprehensive QA Dataset for Urology

    Authors: Shiwei Lyu, Chenfei Chi, Hongbo Cai, Lei Shi, Xiaoyan Yang, Lei Liu, Xiang Chen, Deng Zhao, Zhiqiang Zhang, Xianguo Lyu, Ming Zhang, Fangzhou Li, Xiaowei Ma, Yue Shen, Jinjie Gu, Wei Xue, Yiran Huang

    Abstract: We introduce RJUA-QA, a novel medical dataset for question answering (QA) and reasoning with clinical evidence, helping to bridge the gap between general large language models (LLMs) and medical-specific LLM applications. RJUA-QA is derived from realistic clinical scenarios and aims to facilitate LLMs in generating reliable diagnoses and advice. The dataset contains 2,132 curated Question-Co…

    Submitted 7 January, 2024; v1 submitted 15 December, 2023; originally announced December 2023.

    Comments: An initial version

  37. Multiple Instance Learning for Uplift Modeling

    Authors: Yao Zhao, Haipeng Zhang, Shiwei Lyu, Ruiying Jiang, Jinjie Gu, Guannan Zhang

    Abstract: Uplift modeling is widely used in performance marketing to estimate effects of promotion campaigns (e.g., increase of customer retention rate). Since it is impossible to observe outcomes of a recipient in treatment (e.g., receiving a certain promotion) and control (e.g., without promotion) groups simultaneously (i.e., counter-factual), uplift models are mainly trained on instances of treatment and…

    Submitted 15 December, 2023; originally announced December 2023.

    Comments: short paper of CIKM22 (full version)

    Journal ref: Proceedings of the 31st ACM International Conference on Information and Knowledge Management (2022) 4727-4731

  38. arXiv:2312.08962  [pdf, other]

    cs.CV

    Depicting Beyond Scores: Advancing Image Quality Assessment through Multi-modal Language Models

    Authors: Zhiyuan You, Zheyuan Li, Jinjin Gu, Zhenfei Yin, Tianfan Xue, Chao Dong

    Abstract: We introduce a Depicted image Quality Assessment method (DepictQA), overcoming the constraints of traditional score-based methods. DepictQA allows for detailed, language-based, human-like evaluation of image quality by leveraging Multi-modal Large Language Models (MLLMs). Unlike conventional Image Quality Assessment (IQA) methods relying on scores, DepictQA interprets image content and distortions…

    Submitted 14 July, 2024; v1 submitted 14 December, 2023; originally announced December 2023.

    Comments: Accepted to ECCV2024, Camera Ready Version

  39. arXiv:2312.08860  [pdf, other]

    hep-lat hep-ph nucl-ex nucl-th

    Baryon electric charge correlation as a magnetometer of QCD

    Authors: Heng-Tong Ding, Jin-Biao Gu, Arpith Kumar, Sheng-Tai Li, Jun-Hong Liu

    Abstract: The correlation between net baryon number and electric charge, $\chi_{11}^{\rm BQ}$, can serve as a magnetometer of QCD. This is demonstrated by lattice QCD computations using the highly improved staggered quarks with physical pion mass of $M_\pi=135~$MeV on $N_\tau=8$ and 12 lattices. We find that $\chi_{11}^{\rm BQ}$ along the transition line starts to increase rapidly with magnetic field strength…

    Submitted 27 March, 2024; v1 submitted 14 December, 2023; originally announced December 2023.

    Comments: 6 pages main text + 6 pages supplemental material, discussions added on the continuum estimate and extrapolation along with additional lattice QCD simulations on Nt=16 lattices, and the proxy of baryon electric charge correlation as well as the thermal fits to obtain the baryon and electric charge chemical potential

  40. arXiv:2312.08563  [pdf, other]

    cs.CV

    Efficient-NeRF2NeRF: Streamlining Text-Driven 3D Editing with Multiview Correspondence-Enhanced Diffusion Models

    Authors: Liangchen Song, Liangliang Cao, Jiatao Gu, Yifan Jiang, Junsong Yuan, Hao Tang

    Abstract: The advancement of text-driven 3D content editing has been blessed by the progress from 2D generative diffusion models. However, a major obstacle hindering the widespread adoption of 3D content editing is its time-intensive processing. This challenge arises from the iterative and refining steps required to achieve consistent 3D outputs from 2D image-based generative models. Recent state-of-the-art…

    Submitted 26 December, 2023; v1 submitted 13 December, 2023; originally announced December 2023.

    Comments: Project page: https://lsongx.github.io/projects/en2n.html

  41. arXiv:2312.06677  [pdf, other]

    cs.LG cs.AI cs.CL

    Intelligent Virtual Assistants with LLM-based Process Automation

    Authors: Yanchu Guan, Dong Wang, Zhixuan Chu, Shiyu Wang, Feiyue Ni, Ruihua Song, Longfei Li, Jinjie Gu, Chenyi Zhuang

    Abstract: While intelligent virtual assistants like Siri, Alexa, and Google Assistant have become ubiquitous in modern life, they still face limitations in their ability to follow multi-step instructions and accomplish complex goals articulated in natural language. However, recent breakthroughs in large language models (LLMs) show promise for overcoming existing barriers by enhancing natural language proces…

    Submitted 4 December, 2023; originally announced December 2023.

  42. arXiv:2312.05795  [pdf, other]

    cs.AI

    Large Multimodal Model Compression via Efficient Pruning and Distillation at AntGroup

    Authors: Maolin Wang, Yao Zhao, Jiajia Liu, Jingdong Chen, Chenyi Zhuang, Jinjie Gu, Ruocheng Guo, Xiangyu Zhao

    Abstract: The deployment of Large Multimodal Models (LMMs) within AntGroup has significantly advanced multimodal tasks in payment, security, and advertising, notably enhancing advertisement audition tasks in Alipay. However, the deployment of such sizable models introduces challenges, particularly in increased latency and carbon emissions, which are antithetical to the ideals of Green AI. This paper introdu…

    Submitted 24 June, 2024; v1 submitted 10 December, 2023; originally announced December 2023.

  43. arXiv:2312.05716  [pdf, other]

    cs.CV

    Initialization Matters for Adversarial Transfer Learning

    Authors: Andong Hua, Jindong Gu, Zhiyu Xue, Nicholas Carlini, Eric Wong, Yao Qin

    Abstract: With the prevalence of the Pretraining-Finetuning paradigm in transfer learning, the robustness of downstream tasks has become a critical concern. In this work, we delve into adversarial robustness in transfer learning and reveal the critical role of initialization, including both the pretrained model and the linear head. First, we discover the necessity of an adversarially robust pretrained model…

    Submitted 30 March, 2024; v1 submitted 9 December, 2023; originally announced December 2023.

    Comments: CVPR 2024

  44. arXiv:2312.05356  [pdf, other]

    cs.SE cs.CL cs.LG

    Neuron Patching: Semantic-based Neuron-level Language Model Repair for Code Generation

    Authors: Jian Gu, Aldeida Aleti, Chunyang Chen, Hongyu Zhang

    Abstract: Large Language Models (LLMs) have already gained widespread adoption in software engineering, particularly in code generation tasks. However, updating these models with new knowledge can be prohibitively expensive, yet it is essential to maximize their utility, such as implementing a hotfix technique to address urgent or critical LLM errors. In this paper, we propose \textsc{MENT}, a novel and eff…

    Submitted 5 August, 2024; v1 submitted 8 December, 2023; originally announced December 2023.

    Comments: 12 pages, 7 figures, 7 tables, under peer-review

  45. arXiv:2312.05276  [pdf, other]

    cs.AI cs.LG

    Making Large Language Models Better Knowledge Miners for Online Marketing with Progressive Prompting Augmentation

    Authors: Chunjing Gan, Dan Yang, Binbin Hu, Ziqi Liu, Yue Shen, Zhiqiang Zhang, Jinjie Gu, Jun Zhou, Guannan Zhang

    Abstract: Nowadays, the rapid development of mobile economy has promoted the flourishing of online marketing campaigns, whose success greatly hinges on the efficient matching between user preferences and desired marketing campaigns where a well-established Marketing-oriented Knowledge Graph (dubbed as MoKG) could serve as the critical "bridge" for preference propagation. In this paper, we seek to carefully…

    Submitted 7 December, 2023; originally announced December 2023.

  46. arXiv:2312.04403  [pdf, other]

    cs.CV

    OT-Attack: Enhancing Adversarial Transferability of Vision-Language Models via Optimal Transport Optimization

    Authors: Dongchen Han, Xiaojun Jia, Yang Bai, Jindong Gu, Yang Liu, Xiaochun Cao

    Abstract: Vision-language pre-training (VLP) models demonstrate impressive abilities in processing both images and text. However, they are vulnerable to multi-modal adversarial examples (AEs). Investigating the generation of high-transferability adversarial examples is crucial for uncovering VLP models' vulnerabilities in practical scenarios. Recent works have indicated that leveraging data augmentation and…

    Submitted 7 December, 2023; originally announced December 2023.

  47. arXiv:2312.03248  [pdf, other]

    cs.LG cs.AI

    Customizable Combination of Parameter-Efficient Modules for Multi-Task Learning

    Authors: Haowen Wang, Tao Sun, Cong Fan, Jinjie Gu

    Abstract: Modular and composable transfer learning is an emerging direction in the field of Parameter Efficient Fine-Tuning, as it enables neural networks to better organize various aspects of knowledge, leading to improved cross-task generalization. In this paper, we introduce a novel approach Customized Polytropon C-Poly that combines task-common skills and task-specific skills, while the skill parameters…

    Submitted 5 December, 2023; originally announced December 2023.

    Comments: 22 pages, 9 figures

  48. arXiv:2312.03045  [pdf, other]

    cs.CV

    Customization Assistant for Text-to-image Generation

    Authors: Yufan Zhou, Ruiyi Zhang, Jiuxiang Gu, Tong Sun

    Abstract: Customizing pre-trained text-to-image generation model has attracted massive research interest recently, due to its huge potential in real-world applications. Although existing methods are able to generate creative content for a novel concept contained in single user-input image, their capability are still far from perfection. Specifically, most existing methods require fine-tuning the generative…

    Submitted 8 May, 2024; v1 submitted 5 December, 2023; originally announced December 2023.

    Comments: CVPR 2024

  49. arXiv:2312.03018  [pdf, other]

    cs.CV

    DreamVideo: High-Fidelity Image-to-Video Generation with Image Retention and Text Guidance

    Authors: Cong Wang, Jiaxi Gu, Panwen Hu, Songcen Xu, Hang Xu, Xiaodan Liang

    Abstract: Image-to-video generation, which aims to generate a video starting from a given reference image, has drawn great attention. Existing methods try to extend pre-trained text-guided image diffusion models to image-guided video generation models. Nevertheless, these methods often result in either low fidelity or flickering over time due to their limitation to shallow image guidance and poor temporal c…

    Submitted 12 December, 2023; v1 submitted 4 December, 2023; originally announced December 2023.

  50. arXiv:2312.03015  [pdf, other]

    cs.CV cs.AI cs.LG

    PartSLIP++: Enhancing Low-Shot 3D Part Segmentation via Multi-View Instance Segmentation and Maximum Likelihood Estimation

    Authors: Yuchen Zhou, Jiayuan Gu, Xuanlin Li, Minghua Liu, Yunhao Fang, Hao Su

    Abstract: Open-world 3D part segmentation is pivotal in diverse applications such as robotics and AR/VR. Traditional supervised methods often grapple with limited 3D data availability and struggle to generalize to unseen object categories. PartSLIP, a recent advancement, has made significant strides in zero- and few-shot 3D part segmentation. This is achieved by harnessing the capabilities of the 2D open-vo…

    Submitted 4 December, 2023; originally announced December 2023.