
Showing 1–19 of 19 results for author: Blevins, T

Searching in archive cs.
  1. arXiv:2408.06518

    cs.CL

    Does Liking Yellow Imply Driving a School Bus? Semantic Leakage in Language Models

    Authors: Hila Gonen, Terra Blevins, Alisa Liu, Luke Zettlemoyer, Noah A. Smith

    Abstract: Despite their wide adoption, the biases and unintended behaviors of language models remain poorly understood. In this paper, we identify and characterize a phenomenon never discussed before, which we call semantic leakage, where models leak irrelevant information from the prompt into the generation in unexpected ways. We propose an evaluation setting to detect semantic leakage both by humans and a…

    Submitted 12 August, 2024; originally announced August 2024.

  2. arXiv:2405.12413

    cs.CL

    Targeted Multilingual Adaptation for Low-resource Language Families

    Authors: C. M. Downey, Terra Blevins, Dhwani Serai, Dwija Parikh, Shane Steinert-Threlkeld

    Abstract: The "massively-multilingual" training of multilingual models is known to limit their utility in any one language, and they perform particularly poorly on low-resource languages. However, there is evidence that low-resource languages can benefit from targeted multilinguality, where the model is trained on closely related languages. To test this approach more rigorously, we systematically study best…

    Submitted 20 May, 2024; originally announced May 2024.

  3. arXiv:2403.10691

    cs.CL cs.AI cs.LG

    MYTE: Morphology-Driven Byte Encoding for Better and Fairer Multilingual Language Modeling

    Authors: Tomasz Limisiewicz, Terra Blevins, Hila Gonen, Orevaoghene Ahia, Luke Zettlemoyer

    Abstract: A major consideration in multilingual language modeling is how to best represent languages with diverse vocabularies and scripts. Although contemporary text encoding methods cover most of the world's writing systems, they exhibit bias towards the high-resource languages of the Global West. As a result, texts of underrepresented languages tend to be segmented into long sequences of linguistically m…

    Submitted 15 March, 2024; originally announced March 2024.

  4. arXiv:2402.10496

    cs.CL cs.AI

    Comparing Hallucination Detection Metrics for Multilingual Generation

    Authors: Haoqiang Kang, Terra Blevins, Luke Zettlemoyer

    Abstract: While many hallucination detection techniques have been evaluated on English text, their effectiveness in multilingual contexts remains unknown. This paper assesses how well various factual hallucination detection metrics (lexical metrics like ROUGE and Named Entity Overlap, and Natural Language Inference (NLI)-based metrics) identify hallucinations in generated biographical summaries across langu…

    Submitted 15 June, 2024; v1 submitted 16 February, 2024; originally announced February 2024.

  5. arXiv:2401.10440

    cs.CL

    Breaking the Curse of Multilinguality with Cross-lingual Expert Language Models

    Authors: Terra Blevins, Tomasz Limisiewicz, Suchin Gururangan, Margaret Li, Hila Gonen, Noah A. Smith, Luke Zettlemoyer

    Abstract: Despite their popularity in non-English NLP, multilingual language models often underperform monolingual ones due to inter-language competition for model parameters. We propose Cross-lingual Expert Language Models (X-ELM), which mitigate this competition by independently training language models on subsets of the multilingual corpus. This process specializes X-ELMs to different languages while rem…

    Submitted 18 January, 2024; originally announced January 2024.

  6. arXiv:2311.09122

    cs.CL

    Universal NER: A Gold-Standard Multilingual Named Entity Recognition Benchmark

    Authors: Stephen Mayhew, Terra Blevins, Shuheng Liu, Marek Šuppa, Hila Gonen, Joseph Marvin Imperial, Börje F. Karlsson, Peiqin Lin, Nikola Ljubešić, LJ Miranda, Barbara Plank, Arij Riabi, Yuval Pinter

    Abstract: We introduce Universal NER (UNER), an open, community-driven project to develop gold-standard NER benchmarks in many languages. The overarching goal of UNER is to provide high-quality, cross-lingually consistent annotations to facilitate and standardize multilingual NER research. UNER v1 contains 18 datasets annotated with named entities in a cross-lingual consistent schema across 12 diverse langu…

    Submitted 29 June, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

    Comments: NAACL 2024 Camera-ready

  7. arXiv:2310.16789

    cs.CL cs.CR cs.LG

    Detecting Pretraining Data from Large Language Models

    Authors: Weijia Shi, Anirudh Ajith, Mengzhou Xia, Yangsibo Huang, Daogao Liu, Terra Blevins, Danqi Chen, Luke Zettlemoyer

    Abstract: Although large language models (LLMs) are widely deployed, the data used to train them is rarely disclosed. Given the incredible scale of this data, up to trillions of tokens, it is all but certain that it includes potentially problematic text such as copyrighted materials, personally identifiable information, and test data for widely reported reference benchmarks. However, we currently have no wa…

    Submitted 9 March, 2024; v1 submitted 25 October, 2023; originally announced October 2023.

  8. arXiv:2309.04679

    cs.CL

    Embedding structure matters: Comparing methods to adapt multilingual vocabularies to new languages

    Authors: C. M. Downey, Terra Blevins, Nora Goldfine, Shane Steinert-Threlkeld

    Abstract: Pre-trained multilingual language models underpin a large portion of modern NLP tools outside of English. A strong baseline for specializing these models for specific languages is Language-Adaptive Pre-Training (LAPT). However, retaining a large cross-lingual vocabulary and embedding matrix comes at considerable excess computational cost during adaptation. In this study, we propose several simple…

    Submitted 26 October, 2023; v1 submitted 9 September, 2023; originally announced September 2023.

    Comments: Camera-ready for Proceedings of the 3rd Workshop on Multilingual Representation Learning

  9. arXiv:2305.14857

    cs.CL

    BUFFET: Benchmarking Large Language Models for Few-shot Cross-lingual Transfer

    Authors: Akari Asai, Sneha Kudugunta, Xinyan Velocity Yu, Terra Blevins, Hila Gonen, Machel Reid, Yulia Tsvetkov, Sebastian Ruder, Hannaneh Hajishirzi

    Abstract: Despite remarkable advancements in few-shot generalization in natural language processing, most models are developed and evaluated primarily in English. To facilitate research on few-shot cross-lingual transfer, we introduce a new benchmark, called BUFFET, which unifies 15 diverse tasks across 54 languages in a sequence-to-sequence format and provides a fixed set of few-shot examples and instructi…

    Submitted 24 May, 2023; originally announced May 2023.

    Comments: The data and code are available at https://buffetfs.github.io/

  10. arXiv:2304.13803

    cs.CL

    Translate to Disambiguate: Zero-shot Multilingual Word Sense Disambiguation with Pretrained Language Models

    Authors: Haoqiang Kang, Terra Blevins, Luke Zettlemoyer

    Abstract: Pretrained Language Models (PLMs) learn rich cross-lingual knowledge and can be finetuned to perform well on diverse tasks such as translation and multilingual word sense disambiguation (WSD). However, they often struggle at disambiguating word sense in a zero-shot setting. To better understand this contrast, we present a new study investigating how well PLMs capture cross-lingual word sense with…

    Submitted 26 April, 2023; originally announced April 2023.

  11. arXiv:2212.04037

    cs.CL

    Demystifying Prompts in Language Models via Perplexity Estimation

    Authors: Hila Gonen, Srini Iyer, Terra Blevins, Noah A. Smith, Luke Zettlemoyer

    Abstract: Language models can be prompted to perform a wide variety of zero- and few-shot learning problems. However, performance varies significantly with the choice of prompt, and we do not yet understand why this happens or how to pick the best prompts. In this work, we analyze the factors that contribute to this variance and establish a new empirical hypothesis: the performance of a prompt is coupled wi…

    Submitted 7 December, 2022; originally announced December 2022.

  12. arXiv:2211.07830

    cs.CL

    Prompting Language Models for Linguistic Structure

    Authors: Terra Blevins, Hila Gonen, Luke Zettlemoyer

    Abstract: Although pretrained language models (PLMs) can be prompted to perform a wide range of language tasks, it remains an open question how much this ability comes from generalizable linguistic understanding versus surface-level lexical patterns. To test this, we present a structured prompting approach for linguistic structured prediction tasks, allowing us to perform zero- and few-shot sequence tagging…

    Submitted 20 May, 2023; v1 submitted 14 November, 2022; originally announced November 2022.

    Comments: ACL 2023

  13. arXiv:2205.11758

    cs.CL

    Analyzing the Mono- and Cross-Lingual Pretraining Dynamics of Multilingual Language Models

    Authors: Terra Blevins, Hila Gonen, Luke Zettlemoyer

    Abstract: The emergent cross-lingual transfer seen in multilingual pretrained models has sparked significant interest in studying their behavior. However, because these analyses have focused on fully trained multilingual models, little is known about the dynamics of the multilingual pretraining process. We investigate when these models acquire their in-language and cross-lingual abilities by probing checkpo…

    Submitted 22 October, 2022; v1 submitted 23 May, 2022; originally announced May 2022.

    Comments: EMNLP 2022

  14. arXiv:2205.04050

    cs.CL

    Few-shot Mining of Naturally Occurring Inputs and Outputs

    Authors: Mandar Joshi, Terra Blevins, Mike Lewis, Daniel S. Weld, Luke Zettlemoyer

    Abstract: Creating labeled natural language training data is expensive and requires significant human effort. We mine input output examples from large corpora using a supervised mining function trained using a small seed set of only 100 examples. The mining consists of two stages -- (1) a biencoder-based recall-oriented dense search which pairs inputs with potential outputs, and (2) a crossencoder-based fil…

    Submitted 9 May, 2022; originally announced May 2022.

  15. arXiv:2204.08110

    cs.CL

    Language Contamination Helps Explain the Cross-lingual Capabilities of English Pretrained Models

    Authors: Terra Blevins, Luke Zettlemoyer

    Abstract: English pretrained language models, which make up the backbone of many modern NLP systems, require huge amounts of unlabeled training data. These models are generally presented as being trained only on English text but have been found to transfer surprisingly well to other languages. We investigate this phenomenon and find that common English pretraining corpora actually contain significant amount…

    Submitted 16 November, 2022; v1 submitted 17 April, 2022; originally announced April 2022.

    Comments: EMNLP 2022; corrected typos in appendix tables

  16. arXiv:2102.07983

    cs.CL

    FEWS: Large-Scale, Low-Shot Word Sense Disambiguation with the Dictionary

    Authors: Terra Blevins, Mandar Joshi, Luke Zettlemoyer

    Abstract: Current models for Word Sense Disambiguation (WSD) struggle to disambiguate rare senses, despite reaching human performance on global WSD metrics. This stems from a lack of data for both modeling and evaluating rare senses in existing WSD datasets. In this paper, we introduce FEWS (Few-shot Examples of Word Senses), a new low-shot WSD dataset automatically extracted from example sentences in Wikti…

    Submitted 16 February, 2021; originally announced February 2021.

    Comments: EACL 2021

  17. arXiv:2005.02590

    cs.CL

    Moving Down the Long Tail of Word Sense Disambiguation with Gloss-Informed Biencoders

    Authors: Terra Blevins, Luke Zettlemoyer

    Abstract: A major obstacle in Word Sense Disambiguation (WSD) is that word senses are not uniformly distributed, causing existing models to generally perform poorly on senses that are either rare or unseen during training. We propose a bi-encoder model that independently embeds (1) the target word with its surrounding context and (2) the dictionary definition, or gloss, of each sense. The encoders are joint…

    Submitted 2 June, 2020; v1 submitted 6 May, 2020; originally announced May 2020.

    Comments: Accepted to ACL 2020; current version corrects typos and formatting

  18. arXiv:1906.01037

    cs.CL

    Better Character Language Modeling Through Morphology

    Authors: Terra Blevins, Luke Zettlemoyer

    Abstract: We incorporate morphological supervision into character language models (CLMs) via multitasking and show that this addition improves bits-per-character (BPC) performance across 24 languages, even when the morphology data and language modeling data are disjoint. Analyzing the CLMs shows that inflected words benefit more from explicitly modeling morphology than uninflected words, and that morphologi…

    Submitted 12 June, 2019; v1 submitted 3 June, 2019; originally announced June 2019.

    Comments: Accepted to ACL 2019

  19. arXiv:1805.04218

    cs.CL

    Deep RNNs Encode Soft Hierarchical Syntax

    Authors: Terra Blevins, Omer Levy, Luke Zettlemoyer

    Abstract: We present a set of experiments to demonstrate that deep recurrent neural networks (RNNs) learn internal representations that capture soft hierarchical notions of syntax from highly varied supervision. We consider four syntax tasks at different depths of the parse tree; for each word, we predict its part of speech as well as the first (parent), second (grandparent) and third level (great-grandpare…

    Submitted 10 May, 2018; originally announced May 2018.

    Comments: Accepted to ACL 2018