Showing 1–4 of 4 results for author: Kauf, C

  1. arXiv:2405.09605  [pdf, other]

    cs.CL cs.AI cs.LG

    Elements of World Knowledge (EWOK): A cognition-inspired framework for evaluating basic world knowledge in language models

    Authors: Anna A. Ivanova, Aalok Sathe, Benjamin Lipkin, Unnathi Kumar, Setayesh Radkani, Thomas H. Clark, Carina Kauf, Jennifer Hu, R. T. Pramod, Gabriel Grand, Vivian Paulun, Maria Ryskina, Ekin Akyürek, Ethan Wilcox, Nafisa Rashid, Leshem Choshen, Roger Levy, Evelina Fedorenko, Joshua Tenenbaum, Jacob Andreas

    Abstract: The ability to build and leverage world models is essential for a general-purpose AI agent. Testing such capabilities is hard, in part because the building blocks of world models are ill-defined. We present Elements of World Knowledge (EWOK), a framework for evaluating world modeling in language models by testing their ability to use knowledge of a concept to match a target text with a plausible/i…

    Submitted 15 May, 2024; originally announced May 2024.

    Comments: 21 pages (11 main), 7 figures. Authors Anna Ivanova, Aalok Sathe, Benjamin Lipkin contributed equally

  2. arXiv:2403.14859  [pdf, other]

    cs.CL cs.AI

    Comparing Plausibility Estimates in Base and Instruction-Tuned Large Language Models

    Authors: Carina Kauf, Emmanuele Chersoni, Alessandro Lenci, Evelina Fedorenko, Anna A. Ivanova

    Abstract: Instruction-tuned LLMs can respond to explicit queries formulated as prompts, which greatly facilitates interaction with human users. However, prompt-based approaches might not always be able to tap into the wealth of implicit knowledge acquired by LLMs during pre-training. This paper presents a comprehensive study of ways to evaluate semantic plausibility in LLMs. We compare base and instruction-…

    Submitted 21 March, 2024; originally announced March 2024.

  3. arXiv:2305.10588  [pdf, other]

    cs.CL

    A Better Way to Do Masked Language Model Scoring

    Authors: Carina Kauf, Anna Ivanova

    Abstract: Estimating the log-likelihood of a given sentence under an autoregressive language model is straightforward: one can simply apply the chain rule and sum the log-likelihood values for each successive token. However, for masked language models (MLMs), there is no direct way to estimate the log-likelihood of a sentence. To address this issue, Salazar et al. (2020) propose to estimate sentence pseudo-…

    Submitted 23 May, 2023; v1 submitted 17 May, 2023; originally announced May 2023.

  4. arXiv:2212.01488  [pdf]

    cs.CL cs.AI

    Event knowledge in large language models: the gap between the impossible and the unlikely

    Authors: Carina Kauf, Anna A. Ivanova, Giulia Rambelli, Emmanuele Chersoni, Jingyuan Selena She, Zawad Chowdhury, Evelina Fedorenko, Alessandro Lenci

    Abstract: Word co-occurrence patterns in language corpora contain a surprising amount of conceptual knowledge. Large language models (LLMs), trained to predict words in context, leverage these patterns to achieve impressive performance on diverse semantic tasks requiring world knowledge. An important but understudied question about LLMs' semantic abilities is whether they acquire generalized knowledge of co…

    Submitted 26 October, 2023; v1 submitted 2 December, 2022; originally announced December 2022.

    Comments: The two lead authors contributed equally to this work