FlexKBQA: A Flexible LLM-Powered Framework for Few-Shot Knowledge Base Question Answering

Li, Zhenyu; Fan, Sunqi; Gu, Yu; Li, Xiuxing; Duan, Zhichao; Dong, Bowen; Liu, Ning; Wang, Jianyong

arXiv:2308.12060 (cs)

[Submitted on 23 Aug 2023 (v1), last revised 26 Jan 2024 (this version, v3)]

Abstract: Knowledge base question answering (KBQA) is a critical yet challenging task due to the vast number of entities within knowledge bases and the diversity of natural language questions posed by users. Unfortunately, the performance of most KBQA models tends to decline significantly in real-world scenarios where high-quality annotated data is insufficient. To mitigate the burden of manual annotation, we introduce FlexKBQA, which uses Large Language Models (LLMs) as program translators to address the challenges inherent in the few-shot KBQA task. Specifically, FlexKBQA leverages automated algorithms to sample diverse programs, such as SPARQL queries, from the knowledge base, which are subsequently converted into natural language questions via LLMs. This synthetic dataset facilitates training a specialized lightweight model for the KB. Additionally, to reduce the distribution shift between synthetic data and real user questions, FlexKBQA introduces an execution-guided self-training method that iteratively leverages unlabeled user questions. Furthermore, we explore harnessing the inherent reasoning capability of LLMs to enhance the entire framework. Consequently, FlexKBQA delivers substantial flexibility in data annotation and deployment, and it is domain agnostic. Through extensive experiments on GrailQA, WebQSP, and KQA Pro, we observe that under the few-shot and even the more challenging zero-shot scenarios, FlexKBQA achieves impressive results with only a few annotations, surpassing all previous baselines and reaching 93% of the performance of fully supervised models. We posit that FlexKBQA represents a significant advancement towards exploring better integration of large and lightweight models. The code is open-sourced.
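
The pipeline described in the abstract (sample programs from the KB, verbalize them, then filter pseudo-labels by execution) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy triple store, the tuple-shaped one-hop "programs" (standing in for SPARQL queries), the template-based `translate_to_question` (standing in for the LLM call), and the "non-empty answer" filtering rule are all assumptions made for illustration.

```python
import random

# Toy knowledge base as (subject, relation, object) triples; purely illustrative.
KB = [
    ("Inception", "directed_by", "Christopher Nolan"),
    ("Interstellar", "directed_by", "Christopher Nolan"),
    ("Inception", "released_in", "2010"),
]

def sample_programs(kb, n, seed=0):
    """Sample up to n distinct one-hop programs (subject, relation) grounded in the KB."""
    rng = random.Random(seed)
    candidates = sorted({(s, r) for s, r, _ in kb})
    return rng.sample(candidates, min(n, len(candidates)))

def execute(kb, program):
    """Run a one-hop program against the KB and return its sorted answers."""
    subject, relation = program
    return sorted(o for s, r, o in kb if s == subject and r == relation)

def translate_to_question(program):
    """Hypothetical stand-in for the LLM that turns a program into a question."""
    subject, relation = program
    return f"What is the {relation.replace('_', ' ')} of {subject}?"

def execution_filter(kb, predictions):
    """Assumed filter: keep (question, program) pairs whose program returns answers."""
    return [(q, p) for q, p in predictions if execute(kb, p)]

# Step 1: synthetic (question, program) pairs for training a lightweight model.
synthetic = [(translate_to_question(p), p) for p in sample_programs(KB, 2)]

# Step 2: pseudo-labels predicted for unlabeled user questions, filtered by execution.
pseudo_labels = execution_filter(KB, [
    ("Who directed Inception?", ("Inception", "directed_by")),
    ("Who wrote Inception?", ("Inception", "written_by")),  # returns nothing, dropped
])
```

Because sampled programs are drawn from triples that exist in the KB, every synthetic program is executable by construction; the filter only matters for programs predicted by a model on real user questions.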

Comments:	Accepted as AAAI-24 Oral paper; Knowledge Base Question Answering; Large Language Model; Data Generation; Few-Shot & Zero-Shot
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2308.12060 [cs.CL]
	(or arXiv:2308.12060v3 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2308.12060

Submission history

From: Zhenyu Li
[v1] Wed, 23 Aug 2023 11:00:36 UTC (1,642 KB)
[v2] Sat, 9 Dec 2023 10:23:55 UTC (1,642 KB)
[v3] Fri, 26 Jan 2024 12:49:04 UTC (1,642 KB)
