Showing 1–3 of 3 results for author: Phatthiyaphaibun, W

Searching in archive cs.
  1. arXiv:2403.16127  [pdf, other]

    cs.CL cs.AI

    WangchanLion and WangchanX MRC Eval

    Authors: Wannaphong Phatthiyaphaibun, Surapon Nonesung, Patomporn Payoungkhamdee, Peerat Limkonchotiwat, Can Udomcharoenchaikit, Jitkapat Sawatphol, Chompakorn Chaksangchaichot, Ekapol Chuangsuwanich, Sarana Nutanong

    Abstract: This technical report describes the development of WangchanLion, an instruction fine-tuned model focusing on Machine Reading Comprehension (MRC) in the Thai language. Our model is based on SEA-LION and a collection of instruction following datasets. To promote open research and reproducibility, we publicly release all training data, code, and the final model weights under the Apache-2 license. To…

    Submitted 23 April, 2024; v1 submitted 24 March, 2024; originally announced March 2024.

  2. arXiv:2312.04649  [pdf, other]

    cs.CL

    PyThaiNLP: Thai Natural Language Processing in Python

    Authors: Wannaphong Phatthiyaphaibun, Korakot Chaovavanich, Charin Polpanumas, Arthit Suriyawongkul, Lalita Lowphansirikul, Pattarawat Chormai, Peerat Limkonchotiwat, Thanathip Suntorntip, Can Udomcharoenchaikit

    Abstract: We present PyThaiNLP, a free and open-source natural language processing (NLP) library for Thai language implemented in Python. It provides a wide range of software, models, and datasets for Thai language. We first provide a brief historical context of tools for Thai language prior to the development of PyThaiNLP. We then outline the functionalities it provided as well as datasets and pre-trained…

    Submitted 7 December, 2023; originally announced December 2023.

    Comments: 12 pages, 2 figures, LaTeX; typos corrected, timeline clarified for section 2. In Proceedings of the 3rd Workshop for Natural Language Processing Open Source Software (NLP-OSS 2023), pages 25-36, Singapore, Singapore. Empirical Methods in Natural Language Processing

    ACM Class: I.2.7

  3. arXiv:2208.04799  [pdf, ps, other]

    cs.CL cs.SD eess.AS

    Thai Wav2Vec2.0 with CommonVoice V8

    Authors: Wannaphong Phatthiyaphaibun, Chompakorn Chaksangchaichot, Peerat Limkonchotiwat, Ekapol Chuangsuwanich, Sarana Nutanong

    Abstract: Recently, Automatic Speech Recognition (ASR), a system that converts audio into text, has caught a lot of attention in the machine learning community. Thus, a lot of publicly available models were released in HuggingFace. However, most of these ASR models are available in English; only a minority of the models are available in Thai. Additionally, most of the Thai ASR models are closed-sourced, and…

    Submitted 9 August, 2022; originally announced August 2022.