Large Language Models Only Pass Primary School Exams in Indonesia: A Comprehensive Test on IndoMMLU

Koto, Fajri; Aisyah, Nurul; Li, Haonan; Baldwin, Timothy

arXiv:2310.04928 (cs)

[Submitted on 7 Oct 2023 (v1), last revised 21 Oct 2023 (this version, v2)]

Abstract: Although large language models (LLMs) are often pre-trained on large-scale multilingual texts, their reasoning abilities and real-world knowledge are mainly evaluated on English datasets. Assessing LLM capabilities beyond English is increasingly vital but is hindered by the lack of suitable datasets. In this work, we introduce IndoMMLU, the first multi-task language understanding benchmark for Indonesian culture and languages, which consists of questions from primary school to university entrance exams in Indonesia. By employing professional teachers, we obtain 14,981 questions across 64 tasks and education levels, with 46% of the questions focusing on assessing proficiency in the Indonesian language and knowledge of nine local languages and cultures in Indonesia. Our empirical evaluations show that GPT-3.5 only manages to pass the Indonesian primary school level, with limited knowledge of local Indonesian languages and culture. Other smaller models such as BLOOMZ and Falcon perform at even lower levels.

Comments:	Accepted at EMNLP 2023
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2310.04928 [cs.CL]
	(or arXiv:2310.04928v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2310.04928

Submission history

From: Fajri Koto
[v1] Sat, 7 Oct 2023 21:49:38 UTC (8,459 KB)
[v2] Sat, 21 Oct 2023 17:13:05 UTC (8,459 KB)