CLIcK: A Benchmark Dataset of Cultural and Linguistic Intelligence in Korean

rladmstn1714/CLIcK

CLIcK 🇰🇷🧠

Dataset Paper

Introduction 🎉

CLIcK (Cultural and Linguistic Intelligence in Korean) is a comprehensive dataset designed to evaluate cultural and linguistic intelligence in the context of Korean language models. In an era where diverse language models are continually emerging, there is a pressing need for robust evaluation datasets, especially for non-English languages like Korean. CLIcK fills this gap by providing a rich, well-categorized dataset focusing on both cultural and linguistic aspects, enabling a nuanced assessment of Korean language models.

News 📰

  • [LREC-COLING] Our paper introducing CLIcK has been accepted to LREC-COLING 2024! 🎉
  • [3/20] We revised some grammatical errors in the dataset. Please test with the new version of CLIcK!

Dataset Description 📊

The CLIcK benchmark comprises two broad categories: Culture and Language, which are further divided into 11 fine-grained subcategories.

Categories 📂

  • Language 🗣️

    • Textual Knowledge
    • Grammatical Knowledge
    • Functional Knowledge
  • Culture 🌍

    • Korean Society
    • Korean Tradition
    • Korean Politics
    • Korean Economy
    • Korean Law
    • Korean History
    • Korean Geography
    • Korean Popular Culture (K-Pop)
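
The taxonomy above can be written down as a small Python mapping, which may be convenient when grouping scores by subcategory. This is only an illustrative sketch: the names `CLICK_TAXONOMY` and `count_subcategories` are hypothetical and are not part of the repository; the labels are copied from the list above.

```python
# Illustrative mapping of the two CLIcK categories to their subcategories.
# The variable and function names are hypothetical; labels come from the list above.
CLICK_TAXONOMY = {
    "Language": [
        "Textual Knowledge",
        "Grammatical Knowledge",
        "Functional Knowledge",
    ],
    "Culture": [
        "Korean Society",
        "Korean Tradition",
        "Korean Politics",
        "Korean Economy",
        "Korean Law",
        "Korean History",
        "Korean Geography",
        "Korean Popular Culture (K-Pop)",
    ],
}


def count_subcategories(taxonomy):
    """Return the total number of fine-grained subcategories."""
    return sum(len(subcategories) for subcategories in taxonomy.values())


print(count_subcategories(CLICK_TAXONOMY))  # 11
```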

Construction 🏗️

CLIcK was developed using two human-centric approaches:

  1. Reclassification of official and well-designed exam data into our defined categories.
  2. Generation of questions using ChatGPT, based on official educational materials from the Korean Ministry of Justice, followed by our own validation process.

Structure 🏛️

The dataset is organized as follows, with each subcategory containing relevant JSON files:

📦CLIcK
 └─ Dataset
    ├─ Culture
    │  ├─ [Each cultural subcategory with associated JSON files]
    └─ Language
       ├─ [Each language subcategory with associated JSON files]
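
As a rough sketch of how a local checkout with this layout could be traversed, the snippet below indexes every JSON file under `Dataset` by category and subcategory. It makes several assumptions that the README does not confirm: the function names are hypothetical, the exact nesting of subcategory folders versus files is unknown (so a file placed directly under a category is keyed by its file stem), and nothing is assumed about the fields inside each JSON file.

```python
import json
from collections import defaultdict
from pathlib import Path


def index_click_files(root):
    """Group JSON file paths under <root>/Dataset by (category, subcategory).

    Assumption: the first path component below Dataset is the category
    (Culture or Language). If a file sits inside a subfolder, that folder
    name is used as the subcategory; otherwise the file stem is used.
    """
    dataset_dir = Path(root) / "Dataset"
    if not dataset_dir.is_dir():
        return {}
    index = defaultdict(list)
    for path in sorted(dataset_dir.rglob("*.json")):
        parts = path.relative_to(dataset_dir).parts
        if len(parts) < 2:
            continue  # a file directly under Dataset has no category
        category = parts[0]
        subcategory = parts[1] if len(parts) > 2 else path.stem
        index[(category, subcategory)].append(path)
    return dict(index)


def load_json(path):
    """Read one JSON file as UTF-8 (the questions contain Korean text)."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


if __name__ == "__main__":
    # "CLIcK" is assumed to be the path of a local clone of the repository.
    for (category, subcategory), paths in index_click_files("CLIcK").items():
        print(category, subcategory, len(paths))
```

Keeping the indexing step separate from `load_json` lets you inspect which files exist before deciding how to parse them.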

Exam Code Descriptions 📜

  • KIIP: Korea Immigration & Integration Program (Website)
  • CSAT: College Scholastic Ability Test for Korean (Website)
  • Kedu: Test of Teaching Korean as a Foreign Language (Website)
  • PSE: Public Service Exam for Grade 9 civil servants
  • TOPIK: Test of Proficiency in Korean (Website)
  • KHB: Korean History Exam Basic (Website)
  • PSAT: Public Service Aptitude Test in Korea

Results

| Model | Average Accuracy (Korean Culture) | Average Accuracy (Korean Language) |
|---|---|---|
| Polyglot-Ko 1.3B | 32.71% | 22.88% |
| Polyglot-Ko 3.8B | 32.90% | 22.38% |
| Polyglot-Ko 5.8B | 33.14% | 23.27% |
| Polyglot-Ko 12.8B | 33.40% | 22.24% |
| KULLM 5.8B | 33.79% | 23.50% |
| KULLM 12.8B | 33.51% | 23.78% |
| KoAlpaca 5.8B | 32.33% | 23.87% |
| KoAlpaca 12.8B | 33.80% | 22.42% |
| LLaMA-Ko 7B | 33.26% | 25.69% |
| LLaMA 7B | 35.44% | 27.17% |
| LLaMA 13B | 36.22% | 26.71% |
| GPT-3.5 | 49.30% | 42.32% |
| Claude2 | 51.72% | 45.39% |
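
For readers who want to work with these figures programmatically, the results can be transcribed into a small Python structure. The sketch below only re-expresses the numbers reported above; the names `RESULTS`, `culture_language_gap`, and `best_model` are illustrative and do not come from the repository.

```python
# Accuracy figures (in percent) transcribed from the results reported above.
# Each row is (model, culture_accuracy, language_accuracy).
RESULTS = [
    ("Polyglot-Ko 1.3B", 32.71, 22.88),
    ("Polyglot-Ko 3.8B", 32.90, 22.38),
    ("Polyglot-Ko 5.8B", 33.14, 23.27),
    ("Polyglot-Ko 12.8B", 33.40, 22.24),
    ("KULLM 5.8B", 33.79, 23.50),
    ("KULLM 12.8B", 33.51, 23.78),
    ("KoAlpaca 5.8B", 32.33, 23.87),
    ("KoAlpaca 12.8B", 33.80, 22.42),
    ("LLaMA-Ko 7B", 33.26, 25.69),
    ("LLaMA 7B", 35.44, 27.17),
    ("LLaMA 13B", 36.22, 26.71),
    ("GPT-3.5", 49.30, 42.32),
    ("Claude2", 51.72, 45.39),
]


def culture_language_gap(results):
    """Return {model: culture - language} in percentage points (2 decimals)."""
    return {
        model: round(culture - language, 2)
        for model, culture, language in results
    }


def best_model(results, column):
    """Return the top-scoring model; column 1 is culture, column 2 is language."""
    return max(results, key=lambda row: row[column])[0]


gaps = culture_language_gap(RESULTS)
print(best_model(RESULTS, 1), best_model(RESULTS, 2))
print(all(gap > 0 for gap in gaps.values()))
```

In these reported figures, every model scores higher on Korean Culture than on Korean Language, and Claude2 leads both columns.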

Dataset Link 🔗

The CLIcK dataset is available on the Hugging Face Hub: CLIcK Dataset

Citation 📝

If you use CLIcK in your research, please cite our paper:

@misc{kim2024click,
      title={CLIcK: A Benchmark Dataset of Cultural and Linguistic Intelligence in Korean}, 
      author={Eunsu Kim and Juyoung Suk and Philhoon Oh and Haneul Yoo and James Thorne and Alice Oh},
      year={2024},
      eprint={2403.06412},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Contact 📧

For any questions or inquiries, please contact [email protected].
