Showing 1–15 of 15 results for author: Fokkens, A

  1. arXiv:2407.10629  [pdf, other]

    cs.LG cs.CL cs.CY

    Balancing the Scales: Reinforcement Learning for Fair Classification

    Authors: Leon Eshuijs, Shihan Wang, Antske Fokkens

    Abstract: Fairness in classification tasks has traditionally focused on bias removal from neural representations, but recent trends favor algorithmic methods that embed fairness into the training process. These methods steer models towards fair performance, preventing potential elimination of valuable information that arises from representation manipulation. Reinforcement Learning (RL), with its capacity fo…

    Submitted 15 July, 2024; originally announced July 2024.

  2. arXiv:2407.04615  [pdf, other]

    cs.CL

    ARM: Efficient Guided Decoding with Autoregressive Reward Models

    Authors: Sergey Troshin, Vlad Niculae, Antske Fokkens

    Abstract: Language models trained on large amounts of data require careful tuning to be safely deployed in the real world. We revisit the guided decoding paradigm, where the goal is to augment the logits of the base language model using the scores from a task-specific reward model. We propose a simple but efficient parameterization of the autoregressive reward model enabling fast and effective guided decoding.…

    Submitted 5 July, 2024; originally announced July 2024.

  3. arXiv:2404.03987  [pdf, other]

    cs.CL

    Investigating the Robustness of Modelling Decisions for Few-Shot Cross-Topic Stance Detection: A Preregistered Study

    Authors: Myrthe Reuver, Suzan Verberne, Antske Fokkens

    Abstract: For a viewpoint-diverse news recommender, identifying whether two news articles express the same viewpoint is essential. One way to determine "same or different" viewpoint is stance detection. In this paper, we investigate the robustness of operationalization choices for few-shot stance detection, with special attention to modelling stance across different topics. Our experiments test pre-register…

    Submitted 5 April, 2024; originally announced April 2024.

    Comments: Accepted at LREC-COLING 2024: cite the published version when available

  4. arXiv:2403.19424  [pdf, other]

    cs.CL cs.AI

    The Role of Syntactic Span Preferences in Post-Hoc Explanation Disagreement

    Authors: Jonathan Kamp, Lisa Beinborn, Antske Fokkens

    Abstract: Post-hoc explanation methods are an important tool for increasing model transparency for users. Unfortunately, the currently used methods for attributing token importance often yield diverging patterns. In this work, we study potential sources of disagreement across methods from a linguistic perspective. We find that different methods systematically select different classes of words and that metho…

    Submitted 28 March, 2024; originally announced March 2024.

    Comments: Long paper accepted to LREC-COLING 2024 main conference. Please cite the conference proceedings version when available

  5. arXiv:2310.05619  [pdf, other]

    cs.CL cs.AI

    Dynamic Top-k Estimation Consolidates Disagreement between Feature Attribution Methods

    Authors: Jonathan Kamp, Lisa Beinborn, Antske Fokkens

    Abstract: Feature attribution scores are used for explaining the prediction of a text classifier to users by highlighting a k number of tokens. In this work, we propose a way to determine the number of optimal k tokens that should be displayed from sequential properties of the attribution scores. Our approach is dynamic across sentences, method-agnostic, and deals with sentence length bias. We compare agree…

    Submitted 3 November, 2023; v1 submitted 9 October, 2023; originally announced October 2023.

    Comments: Short paper accepted to EMNLP 2023 main conference. Please cite the EMNLP version when available

  6. arXiv:2309.06192  [pdf, other]

    cs.CL cs.CY cs.IR

    Improving and Evaluating the Detection of Fragmentation in News Recommendations with the Clustering of News Story Chains

    Authors: Alessandra Polimeno, Myrthe Reuver, Sanne Vrijenhoek, Antske Fokkens

    Abstract: News recommender systems play an increasingly influential role in shaping information access within democratic societies. However, tailoring recommendations to users' specific interests can result in the divergence of information streams. Fragmented access to information poses challenges to the integrity of the public sphere, thereby influencing democracy and public discourse. The Fragmentation me…

    Submitted 18 September, 2023; v1 submitted 12 September, 2023; originally announced September 2023.

    Comments: Cite published version: Polimeno et al., Improving and Evaluating the Detection of Fragmentation in News Recommendations with the Clustering of News Story Chains, NORMalize 2023: The First Workshop on the Normative Design and Evaluation of Recommender Systems, September 19, 2023, co-located with the ACM Conference on Recommender Systems 2023 (RecSys 2023), Singapore

    Journal ref: NORMalize 2023: The First Workshop on the Normative Design and Evaluation of Recommender Systems, September 19, 2023, co-located with the ACM Conference on Recommender Systems 2023 (RecSys 2023), Singapore

  7. arXiv:2212.04273  [pdf, other]

    cs.LG cs.CY

    Better Hit the Nail on the Head than Beat around the Bush: Removing Protected Attributes with a Single Projection

    Authors: Pantea Haghighatkhah, Antske Fokkens, Pia Sommerauer, Bettina Speckmann, Kevin Verbeek

    Abstract: Bias elimination and recent probing studies attempt to remove specific information from embedding spaces. Here it is important to remove as much of the target information as possible, while preserving any other information present. INLP is a popular recent method which removes specific information through iterative nullspace projections. Multiple iterations, however, increase the risk that informa…

    Submitted 8 December, 2022; originally announced December 2022.

    Comments: EMNLP 2022

    Journal ref: https://aclanthology.org/2022.emnlp-main.575

  8. arXiv:2211.02429  [pdf, other]

    cs.CL

    Dealing with Abbreviations in the Slovenian Biographical Lexicon

    Authors: Angel Daza, Antske Fokkens, Tomaž Erjavec

    Abstract: Abbreviations present a significant challenge for NLP systems because they cause tokenization and out-of-vocabulary errors. They can also make the text less readable, especially in reference printed books, where they are extensively used. Abbreviations are especially problematic in low-resource settings, where systems are less robust to begin with. In this paper, we propose a new method for addres…

    Submitted 4 November, 2022; originally announced November 2022.

    Comments: To be presented at The 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP 2022)

  9. arXiv:2209.14780  [pdf, other]

    cs.CL

    Perturbations and Subpopulations for Testing Robustness in Token-Based Argument Unit Recognition

    Authors: Jonathan Kamp, Lisa Beinborn, Antske Fokkens

    Abstract: Argument Unit Recognition and Classification aims at identifying argument units from text and classifying them as pro or against. One of the design choices that need to be made when developing systems for this task is what the unit of classification should be: segments of tokens or full sentences. Previous research suggests that fine-tuning language models on the token-level yields more robust res…

    Submitted 29 September, 2022; originally announced September 2022.

    Comments: Accepted at the 9th Workshop on Argument Mining, co-located with COLING 2022. Please cite the published version when available

  10. arXiv:2206.15455  [pdf, other]

    cs.CL

    Hate Speech Criteria: A Modular Approach to Task-Specific Hate Speech Definitions

    Authors: Urja Khurana, Ivar Vermeulen, Eric Nalisnick, Marloes van Noorloos, Antske Fokkens

    Abstract: Offensive Content Warning: This paper contains offensive language only for providing examples that clarify this research and do not reflect the authors' opinions. Please be aware that these examples are offensive and may cause you distress. The subjectivity of recognizing hate speech makes it a complex task. This is also reflected by different and incomplete definitions in NLP.…

    Submitted 30 June, 2022; originally announced June 2022.

    Comments: Accepted at WOAH 2022, co-located at NAACL 2022. Cite ACL version

  11. arXiv:2111.09612  [pdf, ps, other]

    cs.CL cs.LG

    How Emotionally Stable is ALBERT? Testing Robustness with Stochastic Weight Averaging on a Sentiment Analysis Task

    Authors: Urja Khurana, Eric Nalisnick, Antske Fokkens

    Abstract: Despite their success, modern language models are fragile. Even small changes in their training pipeline can lead to unexpected results. We study this phenomenon by examining the robustness of ALBERT (arXiv:1909.11942) in combination with Stochastic Weight Averaging (SWA) (arXiv:1803.05407) -- a cheap way of ensembling -- on a sentiment analysis task (SST-2). In particular, we analyze SWA's stabil…

    Submitted 18 November, 2021; originally announced November 2021.

    Comments: Accepted at the second workshop on Evaluation & Comparison of NLP Systems, co-located at EMNLP 2021. Cite ACL version

  12. arXiv:2110.07693  [pdf, other]

    cs.CL

    Is Stance Detection Topic-Independent and Cross-topic Generalizable? -- A Reproduction Study

    Authors: Myrthe Reuver, Suzan Verberne, Roser Morante, Antske Fokkens

    Abstract: Cross-topic stance detection is the task to automatically detect stances (pro, against, or neutral) on unseen topics. We successfully reproduce state-of-the-art cross-topic stance detection work (Reimers et al., 2019), and systematically analyze its reproducibility. Our attention then turns to the cross-topic aspect of this work, and the specificity of topics in terms of vocabulary and socio-cult…

    Submitted 14 October, 2021; originally announced October 2021.

    Comments: Accepted at the 8th Workshop on Argument Mining, 2021 co-located with EMNLP 2021. Cite the published version

  13. arXiv:1809.01375  [pdf, ps, other]

    cs.CL

    Firearms and Tigers are Dangerous, Kitchen Knives and Zebras are Not: Testing whether Word Embeddings Can Tell

    Authors: Pia Sommerauer, Antske Fokkens

    Abstract: This paper presents an approach for investigating the nature of semantic information captured by word embeddings. We propose a method that extends an existing human-elicited semantic property dataset with gold negative examples using crowd judgments. Our experimental approach tests the ability of supervised classifiers to identify semantic features in word embedding vectors and compares this to…

    Submitted 5 September, 2018; originally announced September 2018.

    Comments: Accepted to the EMNLP workshop "Analyzing and interpreting neural networks for NLP"

  14. arXiv:1801.07073  [pdf]

    cs.CL

    BiographyNet: Extracting Relations Between People and Events

    Authors: Antske Fokkens, Serge ter Braake, Niels Ockeloen, Piek Vossen, Susan Legêne, Guus Schreiber, Victor de Boer

    Abstract: This paper describes BiographyNet, a digital humanities project (2012-2016) that brings together researchers from history, computational linguistics and computer science. The project uses data from the Biography Portal of the Netherlands (BPN), which contains approximately 125,000 biographies from a variety of Dutch biographical dictionaries from the eighteenth century until now, describing around…

    Submitted 26 December, 2018; v1 submitted 22 January, 2018; originally announced January 2018.

    Comments: 35 pages, 5 figures, Á. Z. Bernád, C. Gruber, M. Kaiser (editors). Europa baut auf Biographien: Aspekte, Bausteine, Normen und Standards für eine europäische Biographik (2017)

  15. arXiv:1702.06794  [pdf, other]

    cs.CL

    Tackling Error Propagation through Reinforcement Learning: A Case of Greedy Dependency Parsing

    Authors: Minh Le, Antske Fokkens

    Abstract: Error propagation is a common problem in NLP. Reinforcement learning explores erroneous states during training and can therefore be more robust when mistakes are made early in a process. In this paper, we apply reinforcement learning to greedy dependency parsing which is known to suffer from error propagation. Reinforcement learning improves accuracy of both labeled and unlabeled dependencies of t…

    Submitted 22 February, 2017; originally announced February 2017.