Matej Ulčar and Marko Robnik-Sikonja are qualified to endorse.

Training dataset and dictionary sizes matter in BERT models: the case of Baltic languages

Matej Ulčar: registered as an author of this paper.
Can endorse for cs.CL, cs.LG.
Marko Robnik-Sikonja: registered as an author of this paper.
Can endorse for cs.AI, cs.CL, cs.CR, cs.CY, cs.LG, cs.SE, stat.AP, stat.ML.