Showing 1–2 of 2 results for author: Beloch, R

Searching in archive cs.
  1. arXiv:2404.01822

    cs.LG cs.CL cs.SI

    A (More) Realistic Evaluation Setup for Generalisation of Community Models on Malicious Content Detection

    Authors: Ivo Verhoeven, Pushkar Mishra, Rahel Beloch, Helen Yannakoudakis, Ekaterina Shutova

    Abstract: Community models for malicious content detection, which take into account the context from a social graph alongside the content itself, have shown remarkable performance on benchmark datasets. Yet, misinformation and hate speech continue to propagate on social media networks. This mismatch can be partially attributed to the limitations of current evaluation setups that neglect the rapid evolution…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: To be published at Findings of NAACL 2024

  2. arXiv:2310.12611

    cs.CL cs.AI

    Identifying and Adapting Transformer-Components Responsible for Gender Bias in an English Language Model

    Authors: Abhijith Chintam, Rahel Beloch, Willem Zuidema, Michael Hanna, Oskar van der Wal

    Abstract: Language models (LMs) exhibit and amplify many types of undesirable biases learned from the training data, including gender bias. However, we lack tools for effectively and efficiently changing this behavior without hurting general language modeling performance. In this paper, we study three methods for identifying causal relations between LM components and particular output: causal mediation anal…

    Submitted 19 October, 2023; originally announced October 2023.

    Comments: Accepted at BlackboxNLP 2023