Identifying and Analyzing Task-Encoding Tokens in Large Language Models

arXiv:2401.11323 (cs)

[Submitted on 20 Jan 2024 (v1), last revised 16 Feb 2024 (this version, v2)]

Authors: Yu Bai, Heyan Huang, Cesare Spinoso-Di Piano, Marc-Antoine Rondeau, Sanxing Chen, Yang Gao, Jackie Chi Kit Cheung

Abstract: In-context learning (ICL) has become an effective solution for few-shot learning in natural language processing. However, our understanding of ICL's working mechanisms is limited, specifically regarding how models learn to perform tasks from ICL demonstrations. For example, unexpectedly large changes in performance can arise from small changes in the prompt, leaving prompt design a largely empirical endeavour. In this paper, we investigate this problem by identifying and analyzing task-encoding tokens on whose representations the task performance depends. Using experiments that ablate the representations of different token types, we find that template and stopword tokens are the most prone to be task-encoding. In addition, we demonstrate experimentally that lexical meaning, repetition, and text formatting are the main distinguishing characteristics of these tokens. Our work sheds light on how large language models (LLMs) learn to perform a task from demonstrations, deepens our understanding of the varied roles different types of tokens play in LLMs, and provides insights for avoiding instability from improperly utilizing task-encoding tokens.
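
Illustrative sketch (not the authors' code): the abstract describes ablating the representations of different token types in ICL demonstrations. The snippet below shows one way such an experiment could be set up at the bookkeeping level, by labelling each demonstration token as template, stopword, or content and building a visibility mask that hides one type from the test example. The stopword list, the whitespace tokenization, the function names `classify_tokens` and `ablation_mask`, and the example prompt are all assumptions made for illustration; the paper's actual categories and masking procedure may differ.

```python
from typing import List, Set

# Hypothetical, deliberately tiny stopword list (assumption, for illustration only).
STOPWORDS: Set[str] = {"the", "a", "an", "is", "of", "to", "and", ".", ","}


def classify_tokens(tokens: List[str], template_tokens: Set[str]) -> List[str]:
    """Label each token as 'template', 'stopword', or 'content'."""
    labels = []
    for tok in tokens:
        if tok in template_tokens:
            labels.append("template")
        elif tok.lower() in STOPWORDS:
            labels.append("stopword")
        else:
            labels.append("content")
    return labels


def ablation_mask(labels: List[str], ablate: str) -> List[int]:
    """Return 1 where the test example may still see the token, 0 where it is ablated."""
    return [0 if lab == ablate else 1 for lab in labels]


# One whitespace-tokenized demonstration (assumed prompt format).
demo = "Review: the movie is great . Sentiment: positive".split()
labels = classify_tokens(demo, template_tokens={"Review:", "Sentiment:"})
mask = ablation_mask(labels, ablate="template")
```

In a real experiment the mask would be applied inside the model, for example as an attention mask restricting which demonstration positions the test example can attend to, and task accuracy would be compared across the ablated token types. Note that this sketch treats the label word as a content token, which is a simplification.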

Comments:	Work in progress
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2401.11323 [cs.CL]
	(or arXiv:2401.11323v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2401.11323

Submission history

From: Yu Bai
[v1] Sat, 20 Jan 2024 20:55:21 UTC (928 KB)
[v2] Fri, 16 Feb 2024 16:43:35 UTC (713 KB)
