Showing 1–44 of 44 results for author: Shutova, E

  1. arXiv:2407.07011

    cs.CL

    Induction Heads as an Essential Mechanism for Pattern Matching in In-context Learning

    Authors: J. Crosbie, E. Shutova

    Abstract: Large language models (LLMs) have shown a remarkable ability to learn and perform complex tasks through in-context learning (ICL). However, a comprehensive understanding of its internal mechanisms is still lacking. This paper explores the role of induction heads in a few-shot ICL setting. We analyse two state-of-the-art models, Llama-3-8B and InternLM2-20B on abstract pattern recognition and NLP t…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: 9 pages, 7 figures

  2. arXiv:2407.03952

    cs.CL

    A framework for annotating and modelling intentions behind metaphor use

    Authors: Gianluca Michelli, Xiaoyu Tong, Ekaterina Shutova

    Abstract: Metaphors are part of everyday language and shape the way in which we conceptualize the world. Moreover, they play a multifaceted role in communication, making their understanding and generation a challenging task for language models (LMs). While there has been extensive work in the literature linking metaphor to the fulfilment of individual intentions, no comprehensive taxonomy of such intentions…

    Submitted 4 July, 2024; originally announced July 2024.

  3. arXiv:2406.14267

    cs.CL cs.AI

    On the Evaluation Practices in Multilingual NLP: Can Machine Translation Offer an Alternative to Human Translations?

    Authors: Rochelle Choenni, Sara Rajaee, Christof Monz, Ekaterina Shutova

    Abstract: While multilingual language models (MLMs) have been trained on 100+ languages, they are typically only evaluated across a handful of them due to a lack of available test data in most languages. This is particularly problematic when assessing MLM's potential for low-resource and unseen languages. In this paper, we present an analysis of existing evaluation frameworks in multilingual NLP, discuss th…

    Submitted 20 June, 2024; originally announced June 2024.

  4. arXiv:2406.06590

    cs.CL cs.AI

    Are LLMs classical or nonmonotonic reasoners? Lessons from generics

    Authors: Alina Leidinger, Robert van Rooij, Ekaterina Shutova

    Abstract: Recent scholarship on reasoning in LLMs has supplied evidence of impressive performance and flexible adaptation to machine generated or human feedback. Nonmonotonic reasoning, crucial to human cognition for navigating the real world, remains a challenging, yet understudied task. In this work, we study nonmonotonic reasoning capabilities of seven state-of-the-art LLMs in one abstract and one common…

    Submitted 12 June, 2024; v1 submitted 5 June, 2024; originally announced June 2024.

    Comments: Accepted at ACL 2024 (main)

  5. arXiv:2405.12744

    cs.CL

    The Echoes of Multilinguality: Tracing Cultural Value Shifts during LM Fine-tuning

    Authors: Rochelle Choenni, Anne Lauscher, Ekaterina Shutova

    Abstract: Texts written in different languages reflect different culturally-dependent beliefs of their writers. Thus, we expect multilingual LMs (MLMs), that are jointly trained on a concatenation of text in multiple languages, to encode different cultural values for each language. Yet, as the 'multilinguality' of these LMs is driven by cross-lingual sharing, we also have reason to believe that cultural valu…

    Submitted 21 May, 2024; originally announced May 2024.

  6. arXiv:2404.01822

    cs.LG cs.CL cs.SI

    A (More) Realistic Evaluation Setup for Generalisation of Community Models on Malicious Content Detection

    Authors: Ivo Verhoeven, Pushkar Mishra, Rahel Beloch, Helen Yannakoudakis, Ekaterina Shutova

    Abstract: Community models for malicious content detection, which take into account the context from a social graph alongside the content itself, have shown remarkable performance on benchmark datasets. Yet, misinformation and hate speech continue to propagate on social media networks. This mismatch can be partially attributed to the limitations of current evaluation setups that neglect the rapid evolution…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: To be published at Findings of NAACL 2024

  7. arXiv:2403.11810

    cs.CL

    Metaphor Understanding Challenge Dataset for LLMs

    Authors: Xiaoyu Tong, Rochelle Choenni, Martha Lewis, Ekaterina Shutova

    Abstract: Metaphors in natural language are a reflection of fundamental cognitive processes such as analogical reasoning and categorisation, and are deeply rooted in everyday communication. Metaphor understanding is therefore an essential task for large language models (LLMs). We release the Metaphor Understanding Challenge Dataset (MUNCH), designed to evaluate the metaphor understanding capabilities of LLM…

    Submitted 18 March, 2024; originally announced March 2024.

  8. arXiv:2312.10136

    cs.CV

    Gradient-based Parameter Selection for Efficient Fine-Tuning

    Authors: Zhi Zhang, Qizhe Zhang, Zijun Gao, Renrui Zhang, Ekaterina Shutova, Shiji Zhou, Shanghang Zhang

    Abstract: With the growing size of pre-trained models, full fine-tuning and storing all the parameters for various downstream tasks is costly and infeasible. In this paper, we propose a new parameter-efficient fine-tuning method, Gradient-based Parameter Selection (GPS), demonstrating that only tuning a few selected parameters from the pre-trained model while keeping the remainder of the model frozen can ge…

    Submitted 11 June, 2024; v1 submitted 15 December, 2023; originally announced December 2023.

    Journal ref: CVPR2024

  9. arXiv:2311.08273

    cs.CL

    Examining Modularity in Multilingual LMs via Language-Specialized Subnetworks

    Authors: Rochelle Choenni, Ekaterina Shutova, Dan Garrette

    Abstract: Recent work has proposed explicitly inducing language-wise modularity in multilingual LMs via sparse fine-tuning (SFT) on per-language subnetworks as a means of better guiding cross-lingual sharing. In this work, we investigate (1) the degree to which language-wise modularity naturally arises within models with no special modularity interventions, and (2) how cross-lingual sharing and interference…

    Submitted 14 November, 2023; originally announced November 2023.

  10. arXiv:2311.01967

    cs.CL cs.AI cs.LG

    The language of prompting: What linguistic properties make a prompt successful?

    Authors: Alina Leidinger, Robert van Rooij, Ekaterina Shutova

    Abstract: The latest generation of LLMs can be prompted to achieve impressive zero-shot or few-shot performance in many NLP tasks. However, since performance is highly sensitive to the choice of prompts, considerable effort has been devoted to crowd-sourcing prompts or designing methods for prompt optimisation. Yet, we still lack a systematic understanding of how linguistic properties of prompts correlate w…

    Submitted 3 November, 2023; originally announced November 2023.

    Comments: Accepted to EMNLP 2023 Findings

  11. arXiv:2310.20384

    cs.CL cs.AI

    Do large language models solve verbal analogies like children do?

    Authors: Claire E. Stevenson, Mathilde ter Veen, Rochelle Choenni, Han L. J. van der Maas, Ekaterina Shutova

    Abstract: Analogy-making lies at the heart of human cognition. Adults solve analogies such as "Horse belongs to stable like chicken belongs to ...?" by mapping relations ("kept in") and answering "chicken coop". In contrast, children often use association, e.g., answering "egg". This paper investigates whether large language models (LLMs) solve verbal analogies in A:B::C:? form u…

    Submitted 31 October, 2023; originally announced October 2023.

  12. arXiv:2310.18696

    cs.CL cs.AI cs.LG

    Probing LLMs for Joint Encoding of Linguistic Categories

    Authors: Giulio Starace, Konstantinos Papakostas, Rochelle Choenni, Apostolos Panagiotopoulos, Matteo Rosati, Alina Leidinger, Ekaterina Shutova

    Abstract: Large Language Models (LLMs) exhibit impressive performance on a range of NLP tasks, due to the general-purpose linguistic knowledge acquired during pretraining. Existing model interpretability research (Tenney et al., 2019) suggests that a linguistic hierarchy emerges in the LLM layers, with lower layers better suited to solving syntactic tasks and higher layers employed for semantic processing.…

    Submitted 28 October, 2023; originally announced October 2023.

    Comments: Accepted in EMNLP Findings 2023

  13. arXiv:2305.13286

    cs.CL

    How do languages influence each other? Studying cross-lingual data sharing during LM fine-tuning

    Authors: Rochelle Choenni, Dan Garrette, Ekaterina Shutova

    Abstract: Multilingual large language models (MLLMs) are jointly trained on data from many different languages such that representation of individual languages can benefit from other languages' data. Impressive performance on zero-shot cross-lingual transfer shows that these models are capable of exploiting data from other languages. Yet, it remains unclear to what extent, and under which conditions, langua…

    Submitted 21 May, 2024; v1 submitted 22 May, 2023; originally announced May 2023.

  14. arXiv:2305.08414

    cs.CL cs.AI

    What's the Meaning of Superhuman Performance in Today's NLU?

    Authors: Simone Tedeschi, Johan Bos, Thierry Declerck, Jan Hajic, Daniel Hershcovich, Eduard H. Hovy, Alexander Koller, Simon Krek, Steven Schockaert, Rico Sennrich, Ekaterina Shutova, Roberto Navigli

    Abstract: In the last five years, there has been a significant focus in Natural Language Processing (NLP) on developing larger Pretrained Language Models (PLMs) and introducing benchmarks such as SuperGLUE and SQuAD to measure their abilities in language understanding, reasoning, and reading comprehension. These PLMs have achieved impressive results on these benchmarks, even surpassing human performance in…

    Submitted 15 May, 2023; originally announced May 2023.

    Comments: 9 pages, long paper at ACL 2023 proceedings

  15. arXiv:2302.09027

    cs.CV cs.AI cs.CL cs.MM

    CK-Transformer: Commonsense Knowledge Enhanced Transformers for Referring Expression Comprehension

    Authors: Zhi Zhang, Helen Yannakoudakis, Xiantong Zhen, Ekaterina Shutova

    Abstract: The task of multimodal referring expression comprehension (REC), aiming at localizing an image region described by a natural language expression, has recently received increasing attention within the research community. In this paper, we specifically focus on referring expression comprehension with commonsense knowledge (KB-Ref), a task which typically requires reasoning beyond spatial, visual or…

    Submitted 17 February, 2023; originally announced February 2023.

  16. arXiv:2301.10481

    cs.CL cs.LG

    FewShotTextGCN: K-hop neighborhood regularization for few-shot learning on graphs

    Authors: Niels van der Heijden, Ekaterina Shutova, Helen Yannakoudakis

    Abstract: We present FewShotTextGCN, a novel method designed to effectively utilize the properties of word-document graphs for improved learning in low-resource settings. We introduce K-hop Neighbourhood Regularization, a regularizer for heterogeneous graphs, and show that it stabilizes and improves learning when only a few training samples are available. We furthermore propose a simplification in the graph…

    Submitted 6 February, 2023; v1 submitted 25 January, 2023; originally announced January 2023.

    Comments: 8 pages, 4 figures, EACL 2023

  17. arXiv:2211.15268

    cs.CL cs.LG

    Scientific and Creative Analogies in Pretrained Language Models

    Authors: Tamara Czinczoll, Helen Yannakoudakis, Pushkar Mishra, Ekaterina Shutova

    Abstract: This paper examines the encoding of analogy in large-scale pretrained language models, such as BERT and GPT-2. Existing analogy datasets typically focus on a limited set of analogical relations, with a high similarity of the two domains between which the analogy holds. As a more realistic setup, we introduce the Scientific and Creative Analogy dataset (SCAN), a novel analogy dataset containing sys…

    Submitted 28 November, 2022; originally announced November 2022.

    Comments: To be published in Findings of EMNLP 2022

  18. arXiv:2211.00106

    cs.CL

    Data-Efficient Cross-Lingual Transfer with Language-Specific Subnetworks

    Authors: Rochelle Choenni, Dan Garrette, Ekaterina Shutova

    Abstract: Large multilingual language models typically share their parameters across all languages, which enables cross-lingual task transfer, but learning can also be hindered when training updates from different languages are in conflict. In this paper, we propose novel methods for using language-specific subnetworks, which control cross-lingual parameter sharing, to reduce conflicts and increase positive…

    Submitted 31 October, 2022; originally announced November 2022.

  19. arXiv:2210.17437

    cs.LG cs.CL

    Learning New Tasks from a Few Examples with Soft-Label Prototypes

    Authors: Avyav Kumar Singh, Ekaterina Shutova, Helen Yannakoudakis

    Abstract: Existing approaches to few-shot learning in NLP rely on large language models and fine-tuning of these to generalise on out-of-distribution data. In this work, we propose a simple yet powerful approach to "extreme" few-shot learning, wherein models are exposed to as little as 4 examples per class, based on soft-label prototypes that collectively capture the distribution of different classes across…

    Submitted 14 March, 2024; v1 submitted 31 October, 2022; originally announced October 2022.

  20. arXiv:2206.04615

    cs.CL cs.AI cs.CY cs.LG stat.ML

    Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

    Authors: Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, et al. (426 additional authors not shown)

    Abstract: Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-futur…

    Submitted 12 June, 2023; v1 submitted 9 June, 2022; originally announced June 2022.

    Comments: 27 pages, 17 figures + references and appendices, repo: https://github.com/google/BIG-bench

    Journal ref: Transactions on Machine Learning Research, May/2022, https://openreview.net/forum?id=uyTL5Bvosj

  21. arXiv:2109.10052

    cs.CL

    Stepmothers are mean and academics are pretentious: What do pretrained language models learn about you?

    Authors: Rochelle Choenni, Ekaterina Shutova, Robert van Rooij

    Abstract: In this paper, we investigate what types of stereotypical information are captured by pretrained language models. We present the first dataset comprising stereotypical attributes of a range of social groups and propose a method to elicit stereotypes encoded by pretrained language models in an unsupervised fashion. Moreover, we link the emergent stereotypes to their manifestation as basic emotions…

    Submitted 21 September, 2021; originally announced September 2021.

  22. arXiv:2106.05664

    cs.CL cs.AI

    Ruddit: Norms of Offensiveness for English Reddit Comments

    Authors: Rishav Hada, Sohi Sudhir, Pushkar Mishra, Helen Yannakoudakis, Saif M. Mohammad, Ekaterina Shutova

    Abstract: On social media platforms, hateful and offensive language negatively impacts the mental well-being of users and the participation of people from diverse backgrounds. Automatic methods to detect offensive language have largely relied on datasets with categorical labels. However, comments can vary in their degree of offensiveness. We create the first dataset of English language Reddit comments that h…

    Submitted 25 January, 2022; v1 submitted 10 June, 2021; originally announced June 2021.

    Comments: Camera-ready version in ACL 2021

  23. arXiv:2106.02960

    cs.CL

    Meta-Learning with Variational Semantic Memory for Word Sense Disambiguation

    Authors: Yingjun Du, Nithin Holla, Xiantong Zhen, Cees G. M. Snoek, Ekaterina Shutova

    Abstract: A critical challenge faced by supervised word sense disambiguation (WSD) is the lack of large annotated datasets with sufficient coverage of words in their diversity of senses. This inspired recent research on few-shot WSD using meta-learning. While such work has successfully applied meta-learning to learn new word senses from very few examples, its performance still lags behind its fully supervis…

    Submitted 5 June, 2021; originally announced June 2021.

    Comments: 15 pages, 5 figures

    Journal ref: ACL-IJCNLP 2021

  24. arXiv:2104.04736

    cs.CL cs.AI

    Meta-Learning for Fast Cross-Lingual Adaptation in Dependency Parsing

    Authors: Anna Langedijk, Verna Dankers, Phillip Lippe, Sander Bos, Bryan Cardenas Guevara, Helen Yannakoudakis, Ekaterina Shutova

    Abstract: Meta-learning, or learning to learn, is a technique that can help to overcome resource scarcity in cross-lingual NLP problems, by enabling fast adaptation to new tasks. We apply model-agnostic meta-learning (MAML) to the task of cross-lingual dependency parsing. We train our model on a diverse set of languages to learn a parameter initialization that can adapt quickly to new languages. We find tha…

    Submitted 23 March, 2022; v1 submitted 10 April, 2021; originally announced April 2021.

    Comments: - Add additional results (Appendix D) - Cosmetic updates for camera-ready version ACL 2022

  25. arXiv:2104.03928

    cs.CL cs.SI

    How Metaphors Impact Political Discourse: A Large-Scale Topic-Agnostic Study Using Neural Metaphor Detection

    Authors: Vinodkumar Prabhakaran, Marek Rei, Ekaterina Shutova

    Abstract: Metaphors are widely used in political rhetoric as an effective framing device. While the efficacy of specific metaphors such as the war metaphor in political discourse has been documented before, those studies often rely on a small number of hand-coded instances of metaphor use. Larger-scale topic-agnostic studies are required to establish the general persuasiveness of metaphors as a device, and to…

    Submitted 8 April, 2021; originally announced April 2021.

    Comments: Published at ICWSM 2021. Please cite that version for academic publications

    Journal ref: The International AAAI Conference on Web and Social Media (ICWSM) 2021

  26. arXiv:2103.17191

    cs.CL cs.AI

    Modeling Users and Online Communities for Abuse Detection: A Position on Ethics and Explainability

    Authors: Pushkar Mishra, Helen Yannakoudakis, Ekaterina Shutova

    Abstract: Abuse on the Internet is an important societal problem of our time. Millions of Internet users face harassment, racism, personal attacks, and other types of abuse across various platforms. The psychological effects of abuse on individuals can be profound and lasting. Consequently, over the past few years, there has been a substantial research effort towards automated abusive language detection in…

    Submitted 14 April, 2021; v1 submitted 31 March, 2021; originally announced March 2021.

  27. Us vs. Them: A Dataset of Populist Attitudes, News Bias and Emotions

    Authors: Pere-Lluís Huguet-Cabot, David Abadi, Agneta Fischer, Ekaterina Shutova

    Abstract: Computational modelling of political discourse tasks has become an increasingly important area of research in natural language processing. Populist rhetoric has risen across the political sphere in recent years; however, computational approaches to it have been scarce due to its complex nature. In this paper, we present the new Us vs. Them dataset, consisting of 6861 Reddit comments ann…

    Submitted 14 February, 2021; v1 submitted 28 January, 2021; originally announced January 2021.

    Comments: Camera-ready version in EACL 2021

  28. arXiv:2101.11302

    cs.CL

    Multilingual and cross-lingual document classification: A meta-learning approach

    Authors: Niels van der Heijden, Helen Yannakoudakis, Pushkar Mishra, Ekaterina Shutova

    Abstract: The great majority of languages in the world are considered under-resourced for the successful application of deep learning methods. In this work, we propose a meta-learning approach to document classification in limited-resource settings and demonstrate its effectiveness in two different settings: few-shot, cross-lingual adaptation to previously unseen languages; and multilingual joint training wh…

    Submitted 24 April, 2021; v1 submitted 27 January, 2021; originally announced January 2021.

    Comments: 11 pages, 1 figure

    Journal ref: Association for Computational Linguistics, Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, 2021, 1966--1976

  29. arXiv:2012.12871

    cs.CL cs.AI

    A Multimodal Framework for the Detection of Hateful Memes

    Authors: Phillip Lippe, Nithin Holla, Shantanu Chandra, Santhosh Rajamanickam, Georgios Antoniou, Ekaterina Shutova, Helen Yannakoudakis

    Abstract: An increasingly common expression of online hate speech is multimodal in nature and comes in the form of memes. Designing systems to automatically detect hateful content is of paramount importance if we are to mitigate its undesirable effects on the society at large. The detection of multimodal hate speech is an intrinsically difficult and open problem: memes convey a message using both images and…

    Submitted 24 December, 2020; v1 submitted 23 December, 2020; originally announced December 2020.

    Journal ref: PMLR 133:344-360, 2021

  30. arXiv:2010.12825

    cs.CL

    Cross-neutralising: Probing for joint encoding of linguistic information in multilingual models

    Authors: Rochelle Choenni, Ekaterina Shutova

    Abstract: Multilingual sentence encoders are widely used to transfer NLP models across languages. The success of this transfer is, however, dependent on the model's ability to encode the patterns of cross-lingual similarity and variation. Yet, little is known as to how these models are able to do this. We propose a simple method to study how relationships between languages are encoded in two state-of-the-ar…

    Submitted 13 March, 2021; v1 submitted 24 October, 2020; originally announced October 2020.

  31. arXiv:2009.12862

    cs.CL

    What does it mean to be language-agnostic? Probing multilingual sentence encoders for typological properties

    Authors: Rochelle Choenni, Ekaterina Shutova

    Abstract: Multilingual sentence encoders have seen much success in cross-lingual model transfer for downstream NLP tasks. Yet, we know relatively little about the properties of individual languages or the general patterns of linguistic variation that they encode. We propose methods for probing sentence representations from state-of-the-art multilingual encoders (LASER, M-BERT, XLM and XLM-R) with respect to…

    Submitted 27 September, 2020; originally announced September 2020.

  32. arXiv:2009.04891

    cs.CL cs.LG

    Meta-Learning with Sparse Experience Replay for Lifelong Language Learning

    Authors: Nithin Holla, Pushkar Mishra, Helen Yannakoudakis, Ekaterina Shutova

    Abstract: Lifelong learning requires models that can continuously learn from sequential streams of data without suffering catastrophic forgetting due to shifts in data distributions. Deep learning models have thrived in the non-sequential learning paradigm; however, when used to learn a sequence of tasks, they fail to retain past knowledge and learn incrementally. We propose a novel approach to lifelong lea…

    Submitted 25 July, 2021; v1 submitted 10 September, 2020; originally announced September 2020.

  33. arXiv:2008.06274

    cs.CL cs.LG

    Graph-based Modeling of Online Communities for Fake News Detection

    Authors: Shantanu Chandra, Pushkar Mishra, Helen Yannakoudakis, Madhav Nimishakavi, Marzieh Saeidi, Ekaterina Shutova

    Abstract: Over the past few years, there has been a substantial effort towards automated detection of fake news on social media platforms. Existing research has modeled the structure, style, content, and patterns in dissemination of online posts, as well as the demographic traits of users who interact with them. However, no attention has been directed towards modeling the properties of online communities th…

    Submitted 23 November, 2020; v1 submitted 14 August, 2020; originally announced August 2020.

  34. arXiv:2005.14028

    cs.CL cs.LG

    Joint Modelling of Emotion and Abusive Language Detection

    Authors: Santhosh Rajamanickam, Pushkar Mishra, Helen Yannakoudakis, Ekaterina Shutova

    Abstract: The rise of online communication platforms has been accompanied by some undesirable effects, such as the proliferation of aggressive and abusive behaviour online. Aiming to tackle this problem, the natural language processing (NLP) community has experimented with a range of techniques for abuse detection. While achieving substantial success, these methods have so far only focused on modelling the…

    Submitted 28 May, 2020; originally announced May 2020.

    Comments: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

  35. arXiv:2004.14355

    cs.CL cs.LG

    Learning to Learn to Disambiguate: Meta-Learning for Few-Shot Word Sense Disambiguation

    Authors: Nithin Holla, Pushkar Mishra, Helen Yannakoudakis, Ekaterina Shutova

    Abstract: The success of deep learning methods hinges on the availability of large training datasets annotated for the task of interest. In contrast to human intelligence, these methods lack versatility and struggle to learn and adapt quickly to new tasks, where labeled data is scarce. Meta-learning aims to solve this problem by training a model on a large number of few-shot tasks, with an objective to lear…

    Submitted 12 October, 2020; v1 submitted 29 April, 2020; originally announced April 2020.

    Comments: Camera-ready: Findings of EMNLP

  36. arXiv:1912.10169

    cs.CL

    A Comparison of Architectures and Pretraining Methods for Contextualized Multilingual Word Embeddings

    Authors: Niels van der Heijden, Samira Abnar, Ekaterina Shutova

    Abstract: The lack of annotated data in many languages is a well-known challenge within the field of multilingual natural language processing (NLP). Therefore, many recent studies focus on zero-shot transfer learning and joint training across languages to overcome data scarcity for low-resource languages. In this work we (i) perform a comprehensive comparison of state-of-the-art multilingual word and sentenc…

    Submitted 15 December, 2019; originally announced December 2019.

    Comments: 7 pages, 6 figures

  37. arXiv:1908.06024

    cs.CL

    Tackling Online Abuse: A Survey of Automated Abuse Detection Methods

    Authors: Pushkar Mishra, Helen Yannakoudakis, Ekaterina Shutova

    Abstract: Abuse on the Internet represents an important societal problem of our time. Millions of Internet users face harassment, racism, personal attacks, and other types of abuse on online platforms. The psychological effects of such abuse on individuals can be profound and lasting. Consequently, over the past few years, there has been a substantial research effort towards automated abuse detection in the…

    Submitted 30 September, 2020; v1 submitted 13 August, 2019; originally announced August 2019.

    Comments: In preparation for Computational Linguistics

  38. arXiv:1904.04073

    cs.CL

    Abusive Language Detection with Graph Convolutional Networks

    Authors: Pushkar Mishra, Marco Del Tredici, Helen Yannakoudakis, Ekaterina Shutova

    Abstract: Abuse on the Internet represents a significant societal problem of our time. Previous research on automated abusive language detection in Twitter has shown that community-based profiling of users is a promising technique for this task. However, existing approaches only capture shallow properties of online communities by modeling follower-following relationships. In contrast, working with graph con…

    Submitted 4 April, 2019; originally announced April 2019.

    Comments: Proceedings of the 2019 Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT)

  39. arXiv:1904.02246

    cs.CL

    Learning Outside the Box: Discourse-level Features Improve Metaphor Identification

    Authors: Jesse Mu, Helen Yannakoudakis, Ekaterina Shutova

    Abstract: Most current approaches to metaphor identification use restricted linguistic contexts, e.g. by considering only a verb's arguments or the sentence containing a phrase. Inspired by pragmatic accounts of metaphor, we argue that broader discourse features are crucial for better metaphor identification. We train simple gradient boosting classifiers on representations of an utterance and its surroundin…

    Submitted 9 April, 2019; v1 submitted 3 April, 2019; originally announced April 2019.

    Comments: NAACL 2019; 6 pages; code available at https://github.com/jayelm/broader-metaphor; v2 updates affiliations and acknowledgments

  40. arXiv:1902.06734

    cs.CL

    Author Profiling for Hate Speech Detection

    Authors: Pushkar Mishra, Marco Del Tredici, Helen Yannakoudakis, Ekaterina Shutova

    Abstract: The rapid growth of social media in recent years has fed into some highly undesirable phenomena such as proliferation of abusive and offensive language on the Internet. Previous research suggests that such hateful content tends to come from users who share a set of common stereotypes and form communities around them. The current state-of-the-art approaches to hate speech detection are oblivious to…

    Submitted 14 February, 2019; originally announced February 2019.

    Comments: Proceedings of the 27th International Conference on Computational Linguistics (COLING) 2018. arXiv admin note: text overlap with arXiv:1809.00378

  41. arXiv:1809.00378

    cs.CL

    Neural Character-based Composition Models for Abuse Detection

    Authors: Pushkar Mishra, Helen Yannakoudakis, Ekaterina Shutova

    Abstract: The advent of social media in recent years has fed into some highly undesirable phenomena such as proliferation of offensive language, hate speech, sexist remarks, etc. on the Internet. In light of this, there have been several efforts to automate the detection and moderation of such abusive content. However, deliberate obfuscation of words by users to evade detection poses a serious challenge to…

    Submitted 2 September, 2018; originally announced September 2018.

    Comments: In Proceedings of the EMNLP Workshop on Abusive Language Online 2018

  42. arXiv:1807.00914

    cs.CL

    Modeling Language Variation and Universals: A Survey on Typological Linguistics for Natural Language Processing

    Authors: Edoardo Maria Ponti, Helen O'Horan, Yevgeni Berzak, Ivan Vulić, Roi Reichart, Thierry Poibeau, Ekaterina Shutova, Anna Korhonen

    Abstract: Linguistic typology aims to capture structural and semantic variation across the world's languages. A large-scale typology could provide excellent guidance for multilingual Natural Language Processing (NLP), particularly for languages that suffer from the lack of human labeled resources. We present an extensive literature survey on the use of typological information in the development of NLP techn…

    Submitted 26 October, 2020; v1 submitted 2 July, 2018; originally announced July 2018.

  43. arXiv:1709.00575

    cs.CL cs.LG cs.NE

    Grasping the Finer Point: A Supervised Similarity Network for Metaphor Detection

    Authors: Marek Rei, Luana Bulat, Douwe Kiela, Ekaterina Shutova

    Abstract: The ubiquity of metaphor in our everyday communication makes it an important problem for natural language understanding. Yet, the majority of metaphor processing systems to date rely on hand-engineered features and there is still no consensus in the field as to which features are optimal for this task. In this paper, we present the first deep learning architecture designed to capture metaphorical…

    Submitted 2 September, 2017; originally announced September 2017.

    Comments: EMNLP 2017

    ACM Class: I.2.7; I.2.6; I.5.1

  44. arXiv:1609.09019

    cs.CL

    Psychologically Motivated Text Mining

    Authors: Ekaterina Shutova, Patricia Lichtenstein

    Abstract: Natural language processing techniques are increasingly applied to identify social trends and predict behavior based on large text collections. Existing methods typically rely on surface lexical and syntactic information. Yet, research in psychology shows that patterns of human conceptualisation, such as metaphorical framing, are reliable predictors of human expectations and decisions. In this pap…

    Submitted 28 September, 2016; originally announced September 2016.