Showing 1–2 of 2 results for author: Csaki, Z

  1. arXiv:2404.05829

    cs.CL cs.AI cs.LG

    SambaLingo: Teaching Large Language Models New Languages

    Authors: Zoltan Csaki, Bo Li, Jonathan Li, Qiantong Xu, Pian Pawakapan, Leon Zhang, Yun Du, Hengyu Zhao, Changran Hu, Urmish Thakker

    Abstract: Despite the widespread availability of LLMs, there remains a substantial gap in their capabilities and availability across diverse languages. One approach to address these issues has been to take an existing pre-trained LLM and continue to train it on new languages. While prior works have experimented with language adaptation, many questions around best practices and methodology have not been cove…

    Submitted 17 July, 2024; v1 submitted 8 April, 2024; originally announced April 2024.

    Comments: 23 pages

  2. arXiv:2311.05741

    cs.CL cs.AI cs.LG

    Efficiently Adapting Pretrained Language Models To New Languages

    Authors: Zoltan Csaki, Pian Pawakapan, Urmish Thakker, Qiantong Xu

    Abstract: Recent large language models (LLM) exhibit sub-optimal performance on low-resource languages, as the training data of these models is usually dominated by English and other high-resource languages. Furthermore, it is challenging to train models for low-resource languages, especially from scratch, due to a lack of high quality training data. Adapting pretrained LLMs reduces the need for data in the…

    Submitted 14 December, 2023; v1 submitted 9 November, 2023; originally announced November 2023.

    Comments: Accepted to "The third NeurIPS Workshop on Efficient Natural Language and Speech Processing 2023" (ENLSP-III)