Showing 1–5 of 5 results for author: Weber-Genzel, L

  1. arXiv:2403.01931

    cs.CL

    VariErr NLI: Separating Annotation Error from Human Label Variation

    Authors: Leon Weber-Genzel, Siyao Peng, Marie-Catherine de Marneffe, Barbara Plank

    Abstract: Human label variation arises when annotators assign different labels to the same item for valid reasons, while annotation errors occur when labels are assigned for invalid reasons. These two issues are prevalent in NLP benchmarks, yet existing research has studied them in isolation. To the best of our knowledge, there exists no prior work that focuses on teasing apart error from signal, especially…

    Submitted 6 June, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

    Comments: 14 pages, accepted at ACL 2024 main

  2. arXiv:2402.14499

    cs.CL

    "My Answer is C": First-Token Probabilities Do Not Match Text Answers in Instruction-Tuned Language Models

    Authors: Xinpeng Wang, Bolei Ma, Chengzhi Hu, Leon Weber-Genzel, Paul Röttger, Frauke Kreuter, Dirk Hovy, Barbara Plank

    Abstract: The open-ended nature of language generation makes the evaluation of autoregressive large language models (LLMs) challenging. One common evaluation approach uses multiple-choice questions (MCQ) to limit the response space. The model is then evaluated by ranking the candidate answers by the log probability of the first token prediction. However, first-tokens may not consistently reflect the final r…

    Submitted 4 July, 2024; v1 submitted 22 February, 2024; originally announced February 2024.

    Comments: ACL 2024 Findings

  3. arXiv:2402.12372

    cs.CL

    HunFlair2 in a cross-corpus evaluation of biomedical named entity recognition and normalization tools

    Authors: Mario Sänger, Samuele Garda, Xing David Wang, Leon Weber-Genzel, Pia Droop, Benedikt Fuchs, Alan Akbik, Ulf Leser

    Abstract: With the exponential growth of the life science literature, biomedical text mining (BTM) has become an essential technology for accelerating the extraction of insights from publications. Identifying named entities (e.g., diseases, drugs, or genes) in texts and their linkage to reference knowledge bases are crucial steps in BTM pipelines to enable information aggregation from different documents. H…

    Submitted 20 February, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

  4. arXiv:2309.01669

    cs.CL

    Donkii: Can Annotation Error Detection Methods Find Errors in Instruction-Tuning Datasets?

    Authors: Leon Weber-Genzel, Robert Litschko, Ekaterina Artemova, Barbara Plank

    Abstract: Instruction tuning has become an integral part of training pipelines for Large Language Models (LLMs) and has been shown to yield strong performance gains. In an orthogonal line of research, Annotation Error Detection (AED) has emerged as a tool for detecting quality problems in gold standard labels. So far, however, the application of AED methods has been limited to classification tasks. It is an…

    Submitted 22 February, 2024; v1 submitted 4 September, 2023; originally announced September 2023.

    Comments: Camera ready version for LAW-XVIII

  5. arXiv:2308.11537

    cs.CL

    BELB: a Biomedical Entity Linking Benchmark

    Authors: Samuele Garda, Leon Weber-Genzel, Robert Martin, Ulf Leser

    Abstract: Biomedical entity linking (BEL) is the task of grounding entity mentions to a knowledge base. It plays a vital role in information extraction pipelines for the life sciences literature. We review recent work in the field and find that, as the task is absent from existing benchmarks for biomedical text mining, different studies adopt different experimental setups making comparisons based on publish…

    Submitted 22 August, 2023; originally announced August 2023.