The Past, Present and Better Future of Feedback Learning in Large Language Models for Subjective Human Preferences and Values

Kirk, Hannah Rose; Bean, Andrew M.; Vidgen, Bertie; Röttger, Paul; Hale, Scott A.

arXiv:2310.07629 (cs)

[Submitted on 11 Oct 2023]

Abstract: Human feedback is increasingly used to steer the behaviours of Large Language Models (LLMs). However, it is unclear how to collect and incorporate feedback in a way that is efficient, effective and unbiased, especially for highly subjective human preferences and values. In this paper, we survey existing approaches for learning from human feedback, drawing on 95 papers primarily from the ACL and arXiv repositories. First, we summarise the past, pre-LLM trends for integrating human feedback into language models. Second, we give an overview of present techniques and practices, as well as the motivations for using feedback; conceptual frameworks for defining values and preferences; and how feedback is collected and from whom. Finally, we encourage a better future of feedback learning in LLMs by raising five unresolved conceptual and practical challenges.

Comments:	Accepted for the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP, Main)
Subjects:	Computation and Language (cs.CL); Computers and Society (cs.CY)
Cite as:	arXiv:2310.07629 [cs.CL]
	(or arXiv:2310.07629v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2310.07629

Submission history

From: Hannah Rose Kirk
[v1] Wed, 11 Oct 2023 16:18:13 UTC (263 KB)
