Showing 1–20 of 20 results for author: Weir, N

Searching in archive cs.
  1. arXiv:2407.03572  [pdf, other]

    cs.CL

    Core: Robust Factual Precision Scoring with Informative Sub-Claim Identification

    Authors: Zhengping Jiang, Jingyu Zhang, Nathaniel Weir, Seth Ebner, Miriam Wanner, Kate Sanders, Daniel Khashabi, Anqi Liu, Benjamin Van Durme

    Abstract: Hallucinations -- the generation of untrue claims -- pose a challenge to the application of large language models (LLMs) [1] thereby motivating the development of metrics to evaluate factual precision. We observe that popular metrics using the Decompose-Then-Verify framework, such as FActScore [2], can be manipulated by adding obvious or repetitive claims to artificially inflate scores. We expand…

    Submitted 3 July, 2024; originally announced July 2024.

  2. arXiv:2405.16337  [pdf, other]

    cs.CL cs.AI

    Learning to Reason via Program Generation, Emulation, and Search

    Authors: Nathaniel Weir, Muhammad Khalifa, Linlu Qiu, Orion Weller, Peter Clark

    Abstract: Program synthesis with language models (LMs) has unlocked a large set of reasoning abilities; code-tuned LMs have proven adept at generating programs that solve a wide variety of algorithmic symbolic manipulation tasks (e.g. word concatenation). However, not all reasoning tasks are easily expressible as code, e.g. tasks involving commonsense reasoning, moral decision-making, and sarcasm understand…

    Submitted 28 May, 2024; v1 submitted 25 May, 2024; originally announced May 2024.

    Comments: 16 pages, 10 figures

  3. arXiv:2404.04298  [pdf, other]

    cs.AI cs.CL cs.LG

    SELF-[IN]CORRECT: LLMs Struggle with Refining Self-Generated Responses

    Authors: Dongwei Jiang, Jingyu Zhang, Orion Weller, Nathaniel Weir, Benjamin Van Durme, Daniel Khashabi

    Abstract: Can LLMs continually improve their previous outputs for better results? An affirmative answer would require LLMs to be better at discriminating among previously-generated alternatives than at generating initial responses. We explore the validity of this hypothesis in practice. We first introduce a unified framework that allows us to compare the generative and discriminative capability of any model o…

    Submitted 4 April, 2024; originally announced April 2024.

  4. arXiv:2402.19467  [pdf, other]

    cs.CL cs.AI cs.CV

    TV-TREES: Multimodal Entailment Trees for Neuro-Symbolic Video Reasoning

    Authors: Kate Sanders, Nathaniel Weir, Benjamin Van Durme

    Abstract: It is challenging to perform question-answering over complex, multimodal content such as television clips. This is in part because current video-language models rely on single-modality reasoning, have lowered performance on long inputs, and lack interpretability. We propose TV-TREES, the first multimodal entailment tree generator. TV-TREES serves as an approach to video understanding that promotes…

    Submitted 10 March, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

    Comments: 9 pages, preprint

    ACM Class: I.2.7; I.2.10

  5. arXiv:2402.14798  [pdf, other]

    cs.CL cs.AI

    Enhancing Systematic Decompositional Natural Language Inference Using Informal Logic

    Authors: Nathaniel Weir, Kate Sanders, Orion Weller, Shreya Sharma, Dongwei Jiang, Zhengping Jiang, Bhavana Dalvi Mishra, Oyvind Tafjord, Peter Jansen, Peter Clark, Benjamin Van Durme

    Abstract: Contemporary language models enable new opportunities for structured reasoning with text, such as the construction and evaluation of intuitive, proof-like textual entailment trees without relying on brittle formal logic. However, progress in this direction has been hampered by a long-standing lack of a clear protocol for determining what valid compositional entailment is. This absence causes noisy…

    Submitted 27 February, 2024; v1 submitted 22 February, 2024; originally announced February 2024.

  6. arXiv:2401.06715  [pdf, other]

    cs.CL cs.AI

    Reframing Tax Law Entailment as Analogical Reasoning

    Authors: Xinrui Zou, Ming Zhang, Nathaniel Weir, Benjamin Van Durme, Nils Holzenberger

    Abstract: Statutory reasoning refers to the application of legislative provisions to a series of case facts described in natural language. We re-frame statutory reasoning as an analogy task, where each instance of the analogy task involves a combination of two instances of statutory reasoning. This increases the dataset size by two orders of magnitude, and introduces an element of interpretability. We show…

    Submitted 12 January, 2024; originally announced January 2024.

  7. arXiv:2305.13252  [pdf, other]

    cs.CL cs.AI

    "According to ...": Prompting Language Models Improves Quoting from Pre-Training Data

    Authors: Orion Weller, Marc Marone, Nathaniel Weir, Dawn Lawrie, Daniel Khashabi, Benjamin Van Durme

    Abstract: Large Language Models (LLMs) may hallucinate and generate fake information, despite pre-training on factual data. Inspired by the journalistic device of "according to sources", we propose according-to prompting: directing LLMs to ground responses against previously observed text. To quantify this grounding, we propose a novel evaluation metric (QUIP-Score) that measures the extent to which model-p…

    Submitted 26 February, 2024; v1 submitted 22 May, 2023; originally announced May 2023.

    Comments: Accepted to EACL 2024

  8. arXiv:2212.10618  [pdf, ps, other]

    cs.CL

    Ontologically Faithful Generation of Non-Player Character Dialogues

    Authors: Nathaniel Weir, Ryan Thomas, Randolph D'Amore, Kellie Hill, Benjamin Van Durme, Harsh Jhamtani

    Abstract: We introduce a language generation task grounded in a popular video game environment. KNUDGE (KNowledge Constrained User-NPC Dialogue GEneration) requires models to produce trees of dialogue between video game characters that accurately reflect quest and entity specifications stated in natural language. KNUDGE is constructed from side quest dialogues drawn directly from game data of Obsidian Enter…

    Submitted 13 May, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

  9. arXiv:2212.10002  [pdf, other]

    cs.CL cs.IR

    Defending Against Disinformation Attacks in Open-Domain Question Answering

    Authors: Orion Weller, Aleem Khan, Nathaniel Weir, Dawn Lawrie, Benjamin Van Durme

    Abstract: Recent work in open-domain question answering (ODQA) has shown that adversarial poisoning of the search collection can cause large drops in accuracy for production systems. However, little to no work has proposed methods to defend against these attacks. To do so, we rely on the intuition that redundant information often exists in large corpora. To find it, we introduce a method that uses query aug…

    Submitted 26 February, 2024; v1 submitted 20 December, 2022; originally announced December 2022.

    Comments: Accepted to EACL 2024

  10. arXiv:2209.07662  [pdf, other]

    cs.CL

    NELLIE: A Neuro-Symbolic Inference Engine for Grounded, Compositional, and Explainable Reasoning

    Authors: Nathaniel Weir, Peter Clark, Benjamin Van Durme

    Abstract: Our goal is a modern approach to answering questions via systematic reasoning where answers are supported by human interpretable proof trees grounded in an NL corpus of authoritative facts. Such a system would help alleviate the challenges of interpretability and hallucination with modern LMs, and the lack of grounding of current explanation methods (e.g., Chain-of-Thought). This paper proposes a…

    Submitted 21 December, 2023; v1 submitted 15 September, 2022; originally announced September 2022.

  11. arXiv:2203.04806  [pdf, other]

    cs.CL

    One-Shot Learning from a Demonstration with Hierarchical Latent Language

    Authors: Nathaniel Weir, Xingdi Yuan, Marc-Alexandre Côté, Matthew Hausknecht, Romain Laroche, Ida Momennejad, Harm Van Seijen, Benjamin Van Durme

    Abstract: Humans have the capability, aided by the expressive compositionality of their language, to learn quickly by demonstration. They are able to describe unseen task-performing procedures and generalize their execution to other contexts. In this work, we introduce DescribeWorld, an environment designed to test this sort of generalization skill in grounded agents, where tasks are linguistically and proc…

    Submitted 9 March, 2022; originally announced March 2022.

  12. arXiv:2103.04941  [pdf, other]

    cs.CL

    InFillmore: Frame-Guided Language Generation with Bidirectional Context

    Authors: Jiefu Ou, Nathaniel Weir, Anton Belyy, Felix Yu, Benjamin Van Durme

    Abstract: We propose a structured extension to bidirectional-context conditional language generation, or "infilling," inspired by Frame Semantic theory (Fillmore, 1976). Guidance is provided through two approaches: (1) model fine-tuning, conditioning directly on observed symbolic frames, and (2) a novel extension to disjunctive lexically constrained decoding that leverages frame semantic lexical units. Auto…

    Submitted 22 March, 2022; v1 submitted 8 March, 2021; originally announced March 2021.

    Comments: Appearing in *SEM 2021

  13. arXiv:2102.04420  [pdf, other]

    cs.CV

    The Multi-Temporal Urban Development SpaceNet Dataset

    Authors: Adam Van Etten, Daniel Hogan, Jesus Martinez-Manso, Jacob Shermeyer, Nicholas Weir, Ryan Lewis

    Abstract: Satellite imagery analytics have numerous human development and disaster response applications, particularly when time series methods are involved. For example, quantifying population statistics is fundamental to 67 of the 231 United Nations Sustainable Development Goals Indicators, but the World Bank estimates that over 100 countries currently lack effective Civil Registration systems. To help ad…

    Submitted 8 February, 2021; originally announced February 2021.

    Comments: 8 pages, 10 figures, 3 tables

  14. arXiv:2010.02882  [pdf, other]

    cs.CL

    COD3S: Diverse Generation with Discrete Semantic Signatures

    Authors: Nathaniel Weir, João Sedoc, Benjamin Van Durme

    Abstract: We present COD3S, a novel method for generating semantically diverse sentences using neural sequence-to-sequence (seq2seq) models. Conditioned on an input, seq2seq models typically produce semantically and syntactically homogeneous sets of sentences and thus perform poorly on one-to-many sequence generation tasks. Our two-stage approach improves output diversity by conditioning generation on local…

    Submitted 6 October, 2020; originally announced October 2020.

    Comments: EMNLP2020 preprint

  15. arXiv:2004.06500  [pdf, other]

    eess.IV cs.CV

    SpaceNet 6: Multi-Sensor All Weather Mapping Dataset

    Authors: Jacob Shermeyer, Daniel Hogan, Jason Brown, Adam Van Etten, Nicholas Weir, Fabio Pacifici, Ronny Haensch, Alexei Bastidas, Scott Soenen, Todd Bacastow, Ryan Lewis

    Abstract: Within the remote sensing domain, a diverse set of acquisition modalities exist, each with their own unique strengths and weaknesses. Yet, most of the current literature and open datasets only deal with electro-optical (optical) data for different detection and segmentation tasks at high spatial resolutions. Optical data is often the preferred choice for geospatial applications, but requires clear…

    Submitted 14 April, 2020; originally announced April 2020.

    Comments: To appear in CVPR EarthVision Proceedings, 10 pages, 7 figures

  16. arXiv:2004.04877  [pdf, other]

    cs.CL

    Probing Neural Language Models for Human Tacit Assumptions

    Authors: Nathaniel Weir, Adam Poliak, Benjamin Van Durme

    Abstract: Humans carry stereotypic tacit assumptions (STAs) (Prince, 1978), or propositional beliefs about generic concepts. Such associations are crucial for understanding natural language. We construct a diagnostic set of word prediction prompts to evaluate whether recent neural contextualized language models trained on large text corpora capture STAs. Our prompts are based on human responses in a psychol…

    Submitted 16 June, 2020; v1 submitted 9 April, 2020; originally announced April 2020.

    Comments: To be published in CogSci 2020

  17. Road Network and Travel Time Extraction from Multiple Look Angles with SpaceNet Data

    Authors: Adam Van Etten, Jacob Shermeyer, Daniel Hogan, Nicholas Weir, Ryan Lewis

    Abstract: Identification of road networks and optimal routes directly from remote sensing is of critical importance to a broad array of humanitarian and commercial applications. Yet while identification of road pixels has been attempted before, estimation of route travel times from overhead imagery remains a novel problem, particularly for off-nadir overhead imagery. To this end, we extract road networks wi…

    Submitted 2 March, 2021; v1 submitted 16 January, 2020; originally announced January 2020.

    Comments: 4 pages, 5 figures. To appear at the 2020 IEEE International Geoscience and Remote Sensing Symposium

  18. arXiv:1909.06182  [pdf, other]

    cs.DB

    DBPal: Weak Supervision for Learning a Natural Language Interface to Databases

    Authors: Nathaniel Weir, Andrew Crotty, Alex Galakatos, Amir Ilkhechi, Shekar Ramaswamy, Rohin Bhushan, Ugur Cetintemel, Prasetya Utama, Nadja Geisler, Benjamin Hättasch, Steffen Eger, Carsten Binnig

    Abstract: This paper describes DBPal, a new system to translate natural language utterances into SQL statements using a neural machine translation model. While other recent approaches use neural machine translation to implement a Natural Language Interface to Databases (NLIDB), existing techniques rely on supervised learning with manually curated training data, which results in substantial overhead for supp…

    Submitted 11 September, 2019; originally announced September 2019.

    Comments: arXiv admin note: text overlap with arXiv:1804.00401

  19. SpaceNet MVOI: a Multi-View Overhead Imagery Dataset

    Authors: Nicholas Weir, David Lindenbaum, Alexei Bastidas, Adam Van Etten, Sean McPherson, Jacob Shermeyer, Varun Kumar, Hanlin Tang

    Abstract: Detection and segmentation of objects in overhead imagery is a challenging task. The variable density, random orientation, small size, and instance-to-instance heterogeneity of objects in overhead imagery call for approaches distinct from existing models designed for natural scene datasets. Though new overhead imagery datasets are being developed, they almost universally comprise a single view t…

    Submitted 15 August, 2019; v1 submitted 28 March, 2019; originally announced March 2019.

    Comments: Accepted into IEEE International Conference on Computer Vision (ICCV) 2019

  20. arXiv:1804.00401  [pdf, other]

    cs.DB cs.CL cs.HC

    An End-to-end Neural Natural Language Interface for Databases

    Authors: Prasetya Utama, Nathaniel Weir, Fuat Basik, Carsten Binnig, Ugur Cetintemel, Benjamin Hättasch, Amir Ilkhechi, Shekar Ramaswamy, Arif Usta

    Abstract: The ability to extract insights from new data sets is critical for decision making. Visual interactive tools play an important role in data exploration since they provide non-technical users with an effective way to visually compose queries and comprehend the results. Natural language has recently gained traction as an alternative query interface to databases with the potential to enable non-exper…

    Submitted 2 April, 2018; originally announced April 2018.