InferAct: Inferring Safe Actions for LLM-Based Agents Through Preemptive Evaluation and Human Feedback

Fang, Haishuo; Zhu, Xiaodan; Gurevych, Iryna

Computer Science > Computation and Language

arXiv:2407.11843 (cs)

[Submitted on 16 Jul 2024]

Title:InferAct: Inferring Safe Actions for LLM-Based Agents Through Preemptive Evaluation and Human Feedback

Authors:Haishuo Fang, Xiaodan Zhu, Iryna Gurevych

View PDF HTML (experimental)

Abstract:A crucial requirement for deploying LLM-based agents in real-life applications is robustness against risky or irreversible mistakes. However, existing research lacks a focus on the preemptive evaluation of reasoning trajectories performed by LLM agents, leading to a gap in ensuring safe and reliable operations. To explore better solutions, this paper introduces InferAct, a novel approach that leverages the Theory-of-Mind capability of LLMs to proactively detect potential errors before critical actions are executed (e.g., "buy-now" in automatic online trading or web shopping). InferAct is also capable of integrating human feedback to prevent irreversible risks and enhance the actor agent's decision-making process. Experiments on three widely used tasks demonstrate the effectiveness of InferAct. The proposed solution presents a novel approach and concrete contributions toward developing LLM agents that can be safely deployed in different environments involving critical decision-making.

Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2407.11843 [cs.CL]
	(or arXiv:2407.11843v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2407.11843

Submission history

From: Haishuo Fang [view email]
[v1] Tue, 16 Jul 2024 15:24:44 UTC (8,557 KB)

Computer Science > Computation and Language

Title:InferAct: Inferring Safe Actions for LLM-Based Agents Through Preemptive Evaluation and Human Feedback

Submission history

Access Paper:

References & Citations

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators

Computer Science > Computation and Language

Title:InferAct: Inferring Safe Actions for LLM-Based Agents Through Preemptive Evaluation and Human Feedback

Submission history

Access Paper:

References & Citations

BibTeX formatted citation

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators