Showing 1–2 of 2 results for author: Kittask, C

Searching in archive cs.
  1. arXiv:2011.04784  [pdf, other]

    cs.CL

    EstBERT: A Pretrained Language-Specific BERT for Estonian

    Authors: Hasan Tanvir, Claudia Kittask, Sandra Eiche, Kairit Sirts

    Abstract: This paper presents EstBERT, a large pretrained transformer-based language-specific BERT model for Estonian. Recent work has evaluated multilingual BERT models on Estonian tasks and found them to outperform the baselines. Still, based on existing studies on other languages, a language-specific BERT model is expected to improve over the multilingual ones. We first describe the EstBERT pretraining p…

    Submitted 28 April, 2021; v1 submitted 9 November, 2020; originally announced November 2020.

    Comments: NoDaLiDa 2021

  2. arXiv:2010.00454  [pdf, ps, other]

    cs.CL

    Evaluating Multilingual BERT for Estonian

    Authors: Claudia Kittask, Kirill Milintsevich, Kairit Sirts

    Abstract: Recently, large pre-trained language models, such as BERT, have reached state-of-the-art performance in many natural language processing tasks, but for many languages, including Estonian, BERT models are not yet available. However, there exist several multilingual BERT models that can handle multiple languages simultaneously and that have been trained also on Estonian data. In this paper, we evalu…

    Submitted 8 January, 2021; v1 submitted 1 October, 2020; originally announced October 2020.

    Comments: V1: Baltic HLT 2020; V2: Changed NER baseline results