Showing 1–1 of 1 results for author: Schagen, J

  1. arXiv:2402.10675  [pdf, other]

    cs.CL

    German Text Simplification: Finetuning Large Language Models with Semi-Synthetic Data

    Authors: Lars Klöser, Mika Beele, Jan-Niklas Schagen, Bodo Kraft

    Abstract: This study pioneers the use of synthetically generated data for training generative models in document-level text simplification of German texts. We demonstrate the effectiveness of our approach with real-world online texts. Addressing the challenge of data scarcity in language simplification, we crawled professionally simplified German texts and synthesized a corpus using GPT-4. We finetune Large…

    Submitted 16 February, 2024; originally announced February 2024.

    Comments: Accepted at Fourth Workshop on Language Technology for Equality, Diversity, Inclusion - EACL 2024

    ACM Class: I.2.7