One Token Can Help! Learning Scalable and Pluggable Virtual Tokens for Retrieval-Augmented Large Language Models

Zhu, Yutao; Huang, Zhaoheng; Dou, Zhicheng; Wen, Ji-Rong

arXiv:2405.19670 (cs)

[Submitted on 30 May 2024 (v1), last revised 8 Jun 2024 (this version, v3)]

Abstract: Retrieval-augmented generation (RAG) is a promising way to improve large language models (LLMs) for generating more factual, accurate, and up-to-date content. Existing methods either optimize prompts to guide LLMs in leveraging retrieved information or directly fine-tune LLMs to adapt to RAG scenarios. Although fine-tuning can yield better performance, it often compromises the LLMs' general generation capabilities by modifying their parameters. This limitation poses challenges in practical applications, especially when LLMs are already deployed, as parameter adjustments may affect their original functionality. To address this, we propose a novel method that involves learning scalable and pluggable virtual tokens for RAG. By maintaining the LLMs' original parameters and fine-tuning only the embeddings of these pluggable tokens, our approach not only enhances LLMs' performance but also preserves their general generation capabilities. Furthermore, we design several training strategies to improve the scalability, flexibility, and generalizability of our method. Comprehensive experiments across nine question-answering tasks demonstrate the superiority of our approach.
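
The core idea in the abstract, keeping the model's parameters frozen and training only the embeddings of a few added virtual tokens, can be illustrated with a toy sketch. This is not the authors' implementation: the "model" here is a hypothetical mean-pooling scorer with a squared-error loss, and every name (`frozen_emb`, `virtual_emb`, `forward`, `train_step`) and hyperparameter is an assumption made for illustration only.

```python
import random

random.seed(0)
DIM = 4        # embedding width of the toy model (assumed)
VOCAB = 10     # size of the frozen toy vocabulary (assumed)
N_VIRTUAL = 2  # number of learnable virtual tokens (assumed)

# Frozen stand-in for the LLM: an embedding table and a linear readout.
# Neither is ever updated during training.
frozen_emb = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(VOCAB)]
frozen_w = [random.uniform(-1, 1) for _ in range(DIM)]

# The only trainable parameters: the embeddings of the virtual tokens.
virtual_emb = [[0.0] * DIM for _ in range(N_VIRTUAL)]


def forward(token_ids, use_virtual):
    """Mean-pool the input embeddings and apply the frozen readout."""
    vectors = [frozen_emb[t] for t in token_ids]
    if use_virtual:
        vectors = vectors + virtual_emb  # plug the virtual tokens in
    pooled = [sum(v[d] for v in vectors) / len(vectors) for d in range(DIM)]
    return sum(w * p for w, p in zip(frozen_w, pooled))


def train_step(token_ids, target, lr=0.5):
    """One gradient step on squared error; only virtual_emb is modified."""
    n = len(token_ids) + N_VIRTUAL
    err = forward(token_ids, use_virtual=True) - target
    for row in virtual_emb:
        for d in range(DIM):
            # d(err^2)/d(row[d]) = 2 * err * frozen_w[d] / n
            row[d] -= lr * 2 * err * frozen_w[d] / n
    return err ** 2


context = [1, 2, 3]  # stand-in ids for retrieved passage plus question
snapshot = [row[:] for row in frozen_emb]
before = forward(context, use_virtual=False)
losses = [train_step(context, target=1.0) for _ in range(200)]
after = forward(context, use_virtual=False)
```

In this sketch, calling `forward(context, use_virtual=False)` after training gives the same value as before training, which mirrors the "pluggable" property: removing the virtual tokens recovers the original model's behaviour, because only `virtual_emb` was ever changed.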

Comments:	work in progress, repo: this https URL
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2405.19670 [cs.CL]
	(or arXiv:2405.19670v3 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2405.19670

Submission history

From: Yutao Zhu
[v1] Thu, 30 May 2024 03:44:54 UTC (264 KB)
[v2] Fri, 31 May 2024 02:56:56 UTC (264 KB)
[v3] Sat, 8 Jun 2024 07:14:13 UTC (313 KB)
