Showing 1–38 of 38 results for author: Elazar, Y

  1. arXiv:2407.21530

    cs.CL cs.LG

    Data Contamination Report from the 2024 CONDA Shared Task

    Authors: Oscar Sainz, Iker García-Ferrero, Alon Jacovi, Jon Ander Campos, Yanai Elazar, Eneko Agirre, Yoav Goldberg, Wei-Lin Chen, Jenny Chim, Leshem Choshen, Luca D'Amico-Wong, Melissa Dell, Run-Ze Fan, Shahriar Golchin, Yucheng Li, Pengfei Liu, Bhavish Pahwa, Ameya Prabhu, Suryansh Sharma, Emily Silcock, Kateryna Solonko, David Stap, Mihai Surdeanu, Yu-Min Tseng, Vishaal Udandarao, et al. (3 additional authors not shown)

    Abstract: The 1st Workshop on Data Contamination (CONDA 2024) focuses on all relevant aspects of data contamination in natural language processing, where data contamination is understood as situations where evaluation data is included in pre-training corpora used to train large scale models, compromising evaluation results. The workshop fostered a shared task to collect evidence on data contamination in cur…

    Submitted 4 August, 2024; v1 submitted 31 July, 2024; originally announced July 2024.

    Comments: https://huggingface.co/spaces/CONDA-Workshop/Data-Contamination-Database

  2. arXiv:2407.14985

    cs.CL cs.AI cs.LG

    Generalization v.s. Memorization: Tracing Language Models' Capabilities Back to Pretraining Data

    Authors: Antonis Antoniades, Xinyi Wang, Yanai Elazar, Alfonso Amayuelas, Alon Albalak, Kexun Zhang, William Yang Wang

    Abstract: Despite the proven utility of large language models (LLMs) in real-world applications, there remains a lack of understanding regarding how they leverage their large-scale pretraining text corpora to achieve such capabilities. In this work, we investigate the interplay between generalization and memorization in pretrained LLMs at scale, through a comprehensive $n$-gram analysis of their training da…

    Submitted 20 July, 2024; originally announced July 2024.

    Comments: ICML FM-Wild workshop version

  3. arXiv:2407.00211

    cs.CL

    Detection and Measurement of Syntactic Templates in Generated Text

    Authors: Chantal Shaib, Yanai Elazar, Junyi Jessy Li, Byron C. Wallace

    Abstract: Recent work on evaluating the diversity of text generated by LLMs has focused on word-level features. Here we offer an analysis of syntactic features to characterize general repetition in models, beyond frequent n-grams. Specifically, we define syntactic templates and show that models tend to produce templated text in downstream tasks at a higher rate than what is found in human-reference texts. W…

    Submitted 28 June, 2024; originally announced July 2024.

  4. arXiv:2406.13069

    cs.CL cs.AI

    Evaluating $n$-Gram Novelty of Language Models Using Rusty-DAWG

    Authors: William Merrill, Noah A. Smith, Yanai Elazar

    Abstract: How novel are texts generated by language models (LMs) relative to their training corpora? In this work, we investigate the extent to which modern LMs generate $n$-grams from their training data, evaluating both (i) the probability LMs assign to complete training $n$-grams and (ii) $n$-novelty, the proportion of $n$-grams generated by an LM that did not appear in the training data (for arbitrarily…

    Submitted 25 June, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

    Comments: 8 page preprint + appendix. Minor fixes and appendix changes June 25, 2024

  5. arXiv:2406.00787

    cs.CL

    Applying Intrinsic Debiasing on Downstream Tasks: Challenges and Considerations for Machine Translation

    Authors: Bar Iluz, Yanai Elazar, Asaf Yehudai, Gabriel Stanovsky

    Abstract: Most works on gender bias focus on intrinsic bias -- removing traces of information about a protected group from the model's internal representation. However, these works are often disconnected from the impact of such debiasing on downstream applications, which is the main motivation for debiasing in the first place. In this work, we systematically test how methods for intrinsic debiasing affect n…

    Submitted 2 June, 2024; originally announced June 2024.

  6. arXiv:2402.16827

    cs.CL cs.LG

    A Survey on Data Selection for Language Models

    Authors: Alon Albalak, Yanai Elazar, Sang Michael Xie, Shayne Longpre, Nathan Lambert, Xinyi Wang, Niklas Muennighoff, Bairu Hou, Liangming Pan, Haewon Jeong, Colin Raffel, Shiyu Chang, Tatsunori Hashimoto, William Yang Wang

    Abstract: A major factor in the recent success of large language models is the use of enormous and ever-growing text datasets for unsupervised pre-training. However, naively training a model on all available data may not be optimal (or feasible), as the quality of available text data can vary. Filtering out data can also decrease the carbon footprint and financial costs of training models by reducing the am…

    Submitted 2 August, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

    Comments: Paper list available at https://github.com/alon-albalak/data-selection-survey

  7. arXiv:2402.13904

    cs.CL

    Calibrating Large Language Models with Sample Consistency

    Authors: Qing Lyu, Kumar Shridhar, Chaitanya Malaviya, Li Zhang, Yanai Elazar, Niket Tandon, Marianna Apidianaki, Mrinmaya Sachan, Chris Callison-Burch

    Abstract: Accurately gauging the confidence level of Large Language Models' (LLMs) predictions is pivotal for their reliable application. However, LLMs are often uncalibrated inherently and elude conventional calibration techniques due to their proprietary nature and massive scale. In this work, we explore the potential of deriving confidence from the distribution of multiple randomly sampled model generati…

    Submitted 21 February, 2024; originally announced February 2024.

  8. arXiv:2402.00838

    cs.CL

    OLMo: Accelerating the Science of Language Models

    Authors: Dirk Groeneveld, Iz Beltagy, Pete Walsh, Akshita Bhagia, Rodney Kinney, Oyvind Tafjord, Ananya Harsh Jha, Hamish Ivison, Ian Magnusson, Yizhong Wang, Shane Arora, David Atkinson, Russell Authur, Khyathi Raghavi Chandu, Arman Cohan, Jennifer Dumas, Yanai Elazar, Yuling Gu, Jack Hessel, Tushar Khot, William Merrill, Jacob Morrison, Niklas Muennighoff, Aakanksha Naik, Crystal Nam, et al. (18 additional authors not shown)

    Abstract: Language models (LMs) have become ubiquitous in both NLP research and in commercial product offerings. As their commercial importance has surged, the most powerful models have become closed off, gated behind proprietary interfaces, with important details of their training data, architectures, and development undisclosed. Given the importance of these details in scientifically studying these models…

    Submitted 7 June, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

  9. arXiv:2402.00159

    cs.CL

    Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research

    Authors: Luca Soldaini, Rodney Kinney, Akshita Bhagia, Dustin Schwenk, David Atkinson, Russell Authur, Ben Bogin, Khyathi Chandu, Jennifer Dumas, Yanai Elazar, Valentin Hofmann, Ananya Harsh Jha, Sachin Kumar, Li Lucy, Xinxi Lyu, Nathan Lambert, Ian Magnusson, Jacob Morrison, Niklas Muennighoff, Aakanksha Naik, Crystal Nam, Matthew E. Peters, Abhilasha Ravichander, Kyle Richardson, Zejiang Shen, et al. (11 additional authors not shown)

    Abstract: Information about pretraining corpora used to train the current best-performing language models is seldom discussed: commercial models rarely detail their data, and even open models are often released without accompanying training data or recipes to reproduce them. As a result, it is challenging to conduct and advance scientific research on language modeling, such as understanding how training dat…

    Submitted 6 June, 2024; v1 submitted 31 January, 2024; originally announced February 2024.

    Comments: Accepted at ACL 2024; Dataset: https://hf.co/datasets/allenai/dolma; Code: https://github.com/allenai/dolma

  10. arXiv:2312.10523

    cs.CL cs.AI cs.LG

    Paloma: A Benchmark for Evaluating Language Model Fit

    Authors: Ian Magnusson, Akshita Bhagia, Valentin Hofmann, Luca Soldaini, Ananya Harsh Jha, Oyvind Tafjord, Dustin Schwenk, Evan Pete Walsh, Yanai Elazar, Kyle Lo, Dirk Groeneveld, Iz Beltagy, Hannaneh Hajishirzi, Noah A. Smith, Kyle Richardson, Jesse Dodge

    Abstract: Language models (LMs) commonly report perplexity on monolithic data held out from training. Implicitly or explicitly, this data is composed of domains – varying distributions of language. Rather than assuming perplexity on one distribution extrapolates to others, Perplexity Analysis for Language Model Assessment (Paloma), measures LM fit to 585 text domains, ranging from nytimes.com…

    Submitted 16 December, 2023; originally announced December 2023.

    Comments: Project Page: https://paloma.allen.ai/

  11. arXiv:2311.09605

    cs.CL

    Measuring and Improving Attentiveness to Partial Inputs with Counterfactuals

    Authors: Yanai Elazar, Bhargavi Paranjape, Hao Peng, Sarah Wiegreffe, Khyathi Raghavi, Vivek Srikumar, Sameer Singh, Noah A. Smith

    Abstract: The inevitable appearance of spurious correlations in training datasets hurts the generalization of NLP models on unseen data. Previous work has found that datasets with paired inputs are prone to correlations between a specific part of the input (e.g., the hypothesis in NLI) and the label; consequently, models trained only on those outperform chance. Are these correlations picked up by models tra…

    Submitted 16 November, 2023; originally announced November 2023.

  12. arXiv:2310.20707

    cs.CL cs.LG

    What's In My Big Data?

    Authors: Yanai Elazar, Akshita Bhagia, Ian Magnusson, Abhilasha Ravichander, Dustin Schwenk, Alane Suhr, Pete Walsh, Dirk Groeneveld, Luca Soldaini, Sameer Singh, Hanna Hajishirzi, Noah A. Smith, Jesse Dodge

    Abstract: Large text corpora are the backbone of language models. However, we have a limited understanding of the content of these corpora, including general statistics, quality, social factors, and inclusion of evaluation data (contamination). In this work, we propose What's In My Big Data? (WIMBD), a platform and a set of sixteen analyses that allow us to reveal and compare the contents of large text corp…

    Submitted 5 March, 2024; v1 submitted 31 October, 2023; originally announced October 2023.

    Comments: Published at ICLR 2024 spotlight

  13. arXiv:2308.00755

    cs.LG cs.CL cs.CV cs.CY

    The Bias Amplification Paradox in Text-to-Image Generation

    Authors: Preethi Seshadri, Sameer Singh, Yanai Elazar

    Abstract: Bias amplification is a phenomenon in which models exacerbate biases or stereotypes present in the training data. In this paper, we study bias amplification in the text-to-image domain using Stable Diffusion by comparing gender ratios in training vs. generated images. We find that the model appears to amplify gender-occupation biases found in the training data (LAION) considerably. However, we dis…

    Submitted 15 November, 2023; v1 submitted 1 August, 2023; originally announced August 2023.

  14. arXiv:2306.13891

    cs.CL

    Estimating the Causal Effect of Early ArXiving on Paper Acceptance

    Authors: Yanai Elazar, Jiayao Zhang, David Wadden, Bo Zhang, Noah A. Smith

    Abstract: What is the effect of releasing a preprint of a paper before it is submitted for peer review? No randomized controlled trial has been conducted, so we turn to observational data to answer this question. We use data from the ICLR conference (2018--2022) and apply methods from causal inference to estimate the effect of arXiving a paper before the reviewing period (early arXiving) on its acceptance t…

    Submitted 20 February, 2024; v1 submitted 24 June, 2023; originally announced June 2023.

    Comments: Published at CLeaR 2024

  15. arXiv:2305.16938

    cs.CL

    Few-shot Fine-tuning vs. In-context Learning: A Fair Comparison and Evaluation

    Authors: Marius Mosbach, Tiago Pimentel, Shauli Ravfogel, Dietrich Klakow, Yanai Elazar

    Abstract: Few-shot fine-tuning and in-context learning are two alternative strategies for task adaptation of pre-trained language models. Recently, in-context learning has gained popularity over fine-tuning due to its simplicity and improved out-of-domain generalization, and because extensive evidence shows that fine-tuned models pick up on spurious correlations. Unfortunately, previous comparisons of the t…

    Submitted 30 May, 2023; v1 submitted 26 May, 2023; originally announced May 2023.

    Comments: Accepted to Findings of ACL 2023

  16. arXiv:2303.03745

    cs.CV

    At Your Fingertips: Extracting Piano Fingering Instructions from Videos

    Authors: Amit Moryossef, Yanai Elazar, Yoav Goldberg

    Abstract: Piano fingering -- knowing which finger to use to play each note in a musical piece, is a hard and important skill to master when learning to play the piano. While some sheet music is available with expert-annotated fingering information, most pieces lack this information, and people often resort to learning the fingering from demonstrations in online videos. We consider the AI task of automating…

    Submitted 7 March, 2023; originally announced March 2023.

    Comments: 6 pages, paper from 2019

  17. arXiv:2210.12673

    cs.CL

    Lexical Generalization Improves with Larger Models and Longer Training

    Authors: Elron Bandel, Yoav Goldberg, Yanai Elazar

    Abstract: While fine-tuned language models perform well on many tasks, they were also shown to rely on superficial surface features such as lexical overlap. Excessive utilization of such heuristics can lead to failure on challenging inputs. We analyze the use of lexical overlap heuristics in natural language inference, paraphrase detection, and reading comprehension (using a novel contrastive dataset), and…

    Submitted 25 October, 2022; v1 submitted 23 October, 2022; originally announced October 2022.

    Comments: Accepted to EMNLP 2022 as Findings Paper, Presented at BlackboxNLP 2022

  18. arXiv:2210.06246

    cs.CL

    CIKQA: Learning Commonsense Inference with a Unified Knowledge-in-the-loop QA Paradigm

    Authors: Hongming Zhang, Yintong Huo, Yanai Elazar, Yangqiu Song, Yoav Goldberg, Dan Roth

    Abstract: Recently, the community has achieved substantial progress on many commonsense reasoning benchmarks. However, it is still unclear what is learned from the training process: the knowledge, inference capability, or both? We argue that due to the large scale of commonsense knowledge, it is infeasible to annotate a large enough training set for each task to cover all commonsense for learning. Thus we s…

    Submitted 12 October, 2022; originally announced October 2022.

  19. State-of-the-art generalisation research in NLP: A taxonomy and review

    Authors: Dieuwke Hupkes, Mario Giulianelli, Verna Dankers, Mikel Artetxe, Yanai Elazar, Tiago Pimentel, Christos Christodoulopoulos, Karim Lasri, Naomi Saphra, Arabella Sinclair, Dennis Ulmer, Florian Schottmann, Khuyagbaatar Batsuren, Kaiser Sun, Koustuv Sinha, Leila Khalatbari, Maria Ryskina, Rita Frieske, Ryan Cotterell, Zhijing Jin

    Abstract: The ability to generalise well is one of the primary desiderata of natural language processing (NLP). Yet, what 'good generalisation' entails and how it should be evaluated is not well understood, nor are there any evaluation standards for generalisation. In this paper, we lay the groundwork to address both of these issues. We present a taxonomy for characterising and understanding generalisation…

    Submitted 12 January, 2024; v1 submitted 6 October, 2022; originally announced October 2022.

    Comments: This preprint was published as an Analysis article in Nature Machine Intelligence. Please refer to the published version when citing this work. 28 pages of content + 6 pages of appendix + 52 pages of references

    Journal ref: Nat Mach Intell 5, 1161-1174 (2023)

  20. arXiv:2207.14251

    cs.CL

    Measuring Causal Effects of Data Statistics on Language Model's `Factual' Predictions

    Authors: Yanai Elazar, Nora Kassner, Shauli Ravfogel, Amir Feder, Abhilasha Ravichander, Marius Mosbach, Yonatan Belinkov, Hinrich Schütze, Yoav Goldberg

    Abstract: Large amounts of training data are one of the major reasons for the high performance of state-of-the-art NLP models. But what exactly in the training data causes a model to make a certain prediction? We seek to answer this question by providing a language for describing how training data influences predictions, through a causal framework. Importantly, our framework bypasses the need to retrain exp…

    Submitted 24 March, 2023; v1 submitted 28 July, 2022; originally announced July 2022.

    Comments: We received criticism regarding the validity of the causal formulation in this paper. We will address it in an upcoming version

  21. arXiv:2109.12085

    cs.CL

    Text-based NP Enrichment

    Authors: Yanai Elazar, Victoria Basmov, Yoav Goldberg, Reut Tsarfaty

    Abstract: Understanding the relations between entities denoted by NPs in a text is a critical part of human-like natural language understanding. However, only a fraction of such relations is covered by standard NLP tasks and benchmarks nowadays. In this work, we propose a novel task termed text-based NP enrichment (TNE), in which we aim to enrich each NP in a text with all the preposition-mediated relations…

    Submitted 11 April, 2022; v1 submitted 24 September, 2021; originally announced September 2021.

    Comments: Accepted to the TACL journal, pre-MIT Press publication version

  22. arXiv:2104.08481

    cs.CL

    Revisiting Few-shot Relation Classification: Evaluation Data and Classification Schemes

    Authors: Ofer Sabo, Yanai Elazar, Yoav Goldberg, Ido Dagan

    Abstract: We explore Few-Shot Learning (FSL) for Relation Classification (RC). Focusing on the realistic scenario of FSL, in which a test instance might not belong to any of the target categories (none-of-the-above, aka NOTA), we first revisit the recent popular dataset structure for FSL, pointing out its unrealistic data distribution. To remedy this, we propose a novel methodology for deriving more realist…

    Submitted 17 April, 2021; originally announced April 2021.

    Comments: Accepted to TACL 2021

  23. arXiv:2104.08161

    cs.CL

    Back to Square One: Artifact Detection, Training and Commonsense Disentanglement in the Winograd Schema

    Authors: Yanai Elazar, Hongming Zhang, Yoav Goldberg, Dan Roth

    Abstract: The Winograd Schema (WS) has been proposed as a test for measuring commonsense capabilities of models. Recently, pre-trained language model-based approaches have boosted performance on some WS benchmarks but the source of improvement is still not clear. This paper suggests that the apparent progress on WS may not necessarily reflect progress in commonsense reasoning. To support this claim, we firs…

    Submitted 13 October, 2021; v1 submitted 16 April, 2021; originally announced April 2021.

    Comments: Accepted to EMNLP 2021

  24. arXiv:2103.01378

    cs.CL cs.AI cs.LG

    Contrastive Explanations for Model Interpretability

    Authors: Alon Jacovi, Swabha Swayamdipta, Shauli Ravfogel, Yanai Elazar, Yejin Choi, Yoav Goldberg

    Abstract: Contrastive explanations clarify why an event occurred in contrast to another. They are more inherently intuitive to humans to both produce and comprehend. We propose a methodology to produce contrastive explanations for classification models by modifying the representation to disregard non-contrastive information, and modifying model behavior to only be based on contrastive reasoning. Our method…

    Submitted 14 September, 2021; v1 submitted 1 March, 2021; originally announced March 2021.

    Comments: Accepted to EMNLP 2021 as a long paper

  25. arXiv:2102.01017

    cs.CL

    Measuring and Improving Consistency in Pretrained Language Models

    Authors: Yanai Elazar, Nora Kassner, Shauli Ravfogel, Abhilasha Ravichander, Eduard Hovy, Hinrich Schütze, Yoav Goldberg

    Abstract: Consistency of a model -- that is, the invariance of its behavior under meaning-preserving alternations in its input -- is a highly desirable property in natural language processing. In this paper we study the question: Are Pretrained Language Models (PLMs) consistent with respect to factual knowledge? To this end, we create ParaRel, a high-quality resource of cloze-style query English paraphrases…

    Submitted 29 May, 2021; v1 submitted 1 February, 2021; originally announced February 2021.

    Comments: Accepted to the TACL journal, pre-MIT Press publication version

  26. arXiv:2101.11109

    cs.CL

    First Align, then Predict: Understanding the Cross-Lingual Ability of Multilingual BERT

    Authors: Benjamin Muller, Yanai Elazar, Benoît Sagot, Djamé Seddah

    Abstract: Multilingual pretrained language models have demonstrated remarkable zero-shot cross-lingual transfer capabilities. Such transfer emerges by fine-tuning on a task of interest in one language and evaluating on a distinct language, not seen during the fine-tuning. Despite promising results, we still lack a proper understanding of the source of this transfer. Using a novel layer ablation technique an…

    Submitted 26 January, 2021; originally announced January 2021.

    Comments: Accepted at EACL 2021

  27. arXiv:2010.08275

    cs.CL

    It's not Greek to mBERT: Inducing Word-Level Translations from Multilingual BERT

    Authors: Hila Gonen, Shauli Ravfogel, Yanai Elazar, Yoav Goldberg

    Abstract: Recent works have demonstrated that multilingual BERT (mBERT) learns rich cross-lingual representations, that allow for transfer across languages. We study the word-level translation information embedded in mBERT and present two simple methods that expose remarkable translation capabilities with no fine-tuning. The results suggest that most of this information is encoded in a non-linear way, while…

    Submitted 16 October, 2020; originally announced October 2020.

    Comments: BlackboxNLP 2020

  28. arXiv:2010.05971

    cs.CL

    The Extraordinary Failure of Complement Coercion Crowdsourcing

    Authors: Yanai Elazar, Victoria Basmov, Shauli Ravfogel, Yoav Goldberg, Reut Tsarfaty

    Abstract: Crowdsourcing has eased and scaled up the collection of linguistic annotation in recent years. In this work, we follow known methodologies of collecting labeled data for the complement coercion phenomenon. These are constructions with an implied action -- e.g., "I started a new book I bought last week", where the implied action is reading. We aim to collect annotated data for this phenomenon by re…

    Submitted 12 October, 2020; originally announced October 2020.

    Comments: Workshop on Insights from Negative Results in NLP, co-located with EMNLP 2020

  29. arXiv:2010.05345

    cs.CL

    Do Language Embeddings Capture Scales?

    Authors: Xikun Zhang, Deepak Ramachandran, Ian Tenney, Yanai Elazar, Dan Roth

    Abstract: Pretrained Language Models (LMs) have been shown to possess significant linguistic, common sense, and factual knowledge. One form of knowledge that has not been studied yet in this context is information about the scalar magnitudes of objects. We show that pretrained language models capture a significant amount of this information but are short of the capability required for general common-sense r…

    Submitted 24 November, 2020; v1 submitted 11 October, 2020; originally announced October 2020.

    Comments: Accepted at EMNLP Findings 2020 and EMNLP BlackboxNLP workshop 2020; 8 pages, 2 figures; Minor changes to the acknowledgment section

    ACM Class: I.2.7

  30. arXiv:2010.05265

    cs.CL cs.LG

    Unsupervised Distillation of Syntactic Information from Contextualized Word Representations

    Authors: Shauli Ravfogel, Yanai Elazar, Jacob Goldberger, Yoav Goldberg

    Abstract: Contextualized word representations, such as ELMo and BERT, were shown to perform well on various semantic and syntactic tasks. In this work, we tackle the task of unsupervised disentanglement between semantics and structure in neural language representations: we aim to learn a transformation of the contextualized vectors, that discards the lexical semantics, but keeps the structural information.…

    Submitted 11 March, 2021; v1 submitted 11 October, 2020; originally announced October 2020.

    Comments: Accepted in BlackboxNLP@EMNLP2020

  31. arXiv:2006.00995

    cs.CL

    Amnesic Probing: Behavioral Explanation with Amnesic Counterfactuals

    Authors: Yanai Elazar, Shauli Ravfogel, Alon Jacovi, Yoav Goldberg

    Abstract: A growing body of work makes use of probing to investigate the working of neural models, often considered black boxes. Recently, an ongoing debate emerged surrounding the limitations of the probing paradigm. In this work, we point out the inability to infer behavioral conclusions from probing results and offer an alternative method that focuses on how the information is being used, rather than on…

    Submitted 19 February, 2021; v1 submitted 1 June, 2020; originally announced June 2020.

    Comments: TACL journal. Initial title was: "When Bert Forgets How To POS: Amnesic Probing of Linguistic Properties and MLM Predictions"

  32. arXiv:2004.07667

    cs.CL cs.LG

    Null It Out: Guarding Protected Attributes by Iterative Nullspace Projection

    Authors: Shauli Ravfogel, Yanai Elazar, Hila Gonen, Michael Twiton, Yoav Goldberg

    Abstract: The ability to control for the kinds of information encoded in neural representation has a variety of use cases, especially in light of the challenge of interpreting these models. We present Iterative Null-space Projection (INLP), a novel method for removing information from neural representations. Our method is based on repeated training of linear classifiers that predict a certain property we ai…

    Submitted 28 April, 2020; v1 submitted 16 April, 2020; originally announced April 2020.

    Comments: Accepted as a long paper in ACL 2020

  33. arXiv:2004.02709

    cs.CL

    Evaluating Models' Local Decision Boundaries via Contrast Sets

    Authors: Matt Gardner, Yoav Artzi, Victoria Basmova, Jonathan Berant, Ben Bogin, Sihao Chen, Pradeep Dasigi, Dheeru Dua, Yanai Elazar, Ananth Gottumukkala, Nitish Gupta, Hanna Hajishirzi, Gabriel Ilharco, Daniel Khashabi, Kevin Lin, Jiangming Liu, Nelson F. Liu, Phoebe Mulcaire, Qiang Ning, Sameer Singh, Noah A. Smith, Sanjay Subramanian, Reut Tsarfaty, Eric Wallace, Ally Zhang, et al. (1 additional author not shown)

    Abstract: Standard test sets for supervised learning evaluate in-distribution generalization. Unfortunately, when a dataset has systematic gaps (e.g., annotation artifacts), these evaluations are misleading: a model can learn simple decision rules that perform well on the test set but do not capture a dataset's intended capabilities. We propose a new annotation paradigm for NLP that helps to close systemati…

    Submitted 1 October, 2020; v1 submitted 6 April, 2020; originally announced April 2020.

  34. arXiv:1912.13283

    cs.CL cs.AI cs.LG

    oLMpics -- On what Language Model Pre-training Captures

    Authors: Alon Talmor, Yanai Elazar, Yoav Goldberg, Jonathan Berant

    Abstract: Recent success of pre-trained language models (LMs) has spurred widespread interest in the language capabilities that they possess. However, efforts to understand whether LM representations are useful for symbolic reasoning tasks have been limited and scattered. In this work, we propose eight reasoning tasks, which conceptually require operations such as comparison, conjunction, and composition. A…

    Submitted 19 November, 2020; v1 submitted 31 December, 2019; originally announced December 2019.

    Comments: TACL 2020

  35. arXiv:1906.01327

    cs.CL

    How Large Are Lions? Inducing Distributions over Quantitative Attributes

    Authors: Yanai Elazar, Abhijit Mahabal, Deepak Ramachandran, Tania Bedrax-Weiss, Dan Roth

    Abstract: Most current NLP systems have little knowledge about quantitative attributes of objects and events. We propose an unsupervised method for collecting quantitative information from large amounts of web data, and use it to create a new, very large resource consisting of distributions over physical quantities associated with objects, adjectives, and verbs which we call Distributions over Quantitative…

    Submitted 4 June, 2019; originally announced June 2019.

  36. arXiv:1905.10886

    cs.CL

    Where's My Head? Definition, Dataset and Models for Numeric Fused-Heads Identification and Resolution

    Authors: Yanai Elazar, Yoav Goldberg

    Abstract: We provide the first computational treatment of fused-heads constructions (FH), focusing on the numeric fused-heads (NFH). FHs constructions are noun phrases (NPs) in which the head noun is missing and is said to be `fused' with its dependent modifier. This missing information is implicit and is important for sentence understanding. The missing references are easily filled in by humans but pose a…

    Submitted 26 May, 2019; originally announced May 2019.

  37. arXiv:1808.06640

    cs.CL cs.LG stat.ML

    Adversarial Removal of Demographic Attributes from Text Data

    Authors: Yanai Elazar, Yoav Goldberg

    Abstract: Recent advances in Representation Learning and Adversarial Training seem to succeed in removing unwanted features from the learned representation. We show that demographic information of authors is encoded in -- and can be recovered from -- the intermediate representations learned by text-based neural classifiers. The implication is that decisions of classifiers trained on textual data are not agn…

    Submitted 2 September, 2018; v1 submitted 20 August, 2018; originally announced August 2018.

  38. arXiv:1807.03521

    cs.IR cs.LG stat.ML

    Privacy and Fairness in Recommender Systems via Adversarial Training of User Representations

    Authors: Yehezkel S. Resheff, Yanai Elazar, Moni Shahar, Oren Sar Shalom

    Abstract: Latent factor models for recommender systems represent users and items as low dimensional vectors. Privacy risks of such systems have previously been studied mostly in the context of recovery of personal information in the form of usage records from the training data. However, the user representations themselves may be used together with external data to recover private user information such as ge…

    Submitted 18 December, 2018; v1 submitted 10 July, 2018; originally announced July 2018.

    Comments: International Conference on Pattern Recognition and Methods