Small Models Are (Still) Effective Cross-Domain Argument Extractors

Authors: William Gantt, Aaron Steven White

Abstract: Effective ontology transfer has been a major goal of recent work on event argument extraction (EAE). Two methods in particular -- question answering (QA) and template infilling (TI) -- have emerged as promising approaches to this problem. However, detailed explorations of these techniques' ability to actually enable this transfer are lacking. In this work, we provide such a study, exploring zero-shot transfer using both techniques on six major EAE datasets at both the sentence and document levels. Further, we challenge the growing reliance on LLMs for zero-shot extraction, showing that vastly smaller models trained on an appropriate source ontology can yield zero-shot performance superior to that of GPT-3.5 or GPT-4.

Submitted 12 April, 2024; originally announced April 2024.

Comments: ACL Rolling Review Short Paper

arXiv:2402.06973 [pdf, other]

Event-Keyed Summarization

Authors: William Gantt, Alexander Martin, Pavlo Kuchmiichuk, Aaron Steven White

Abstract: We introduce event-keyed summarization (EKS), a novel task that marries traditional summarization and document-level event extraction, with the goal of generating a contextualized summary for a specific event, given a document and an extracted event structure. We introduce a dataset for this task, MUCSUM, consisting of summaries of all events in the classic MUC-4 dataset, along with a set of baselines that comprises both pretrained LM standards in the summarization literature, as well as larger frontier models. We show that ablations that reduce EKS to traditional summarization or structure-to-text yield inferior summaries of target events and that MUCSUM is a robust benchmark for this task. Lastly, we conduct a human evaluation of both reference and model summaries, and provide some detailed analysis of the results.

Submitted 10 February, 2024; originally announced February 2024.

Comments: ARR short paper (under review)

arXiv:2401.16209 [pdf, other]

MultiMUC: Multilingual Template Filling on MUC-4

Authors: William Gantt, Shabnam Behzad, Hannah YoungEun An, Yunmo Chen, Aaron Steven White, Benjamin Van Durme, Mahsa Yarmohammadi

Abstract: We introduce MultiMUC, the first multilingual parallel corpus for template filling, comprising translations of the classic MUC-4 template filling benchmark into five languages: Arabic, Chinese, Farsi, Korean, and Russian. We obtain automatic translations from a strong multilingual machine translation system and manually project the original English annotations into each target language. For all languages, we also provide human translations for sentences in the dev and test splits that contain annotated template arguments. Finally, we present baselines on MultiMUC both with state-of-the-art template filling models and with ChatGPT.

Submitted 29 January, 2024; originally announced January 2024.

Comments: EACL 2024

arXiv:2311.05601 [pdf, other]

FAMuS: Frames Across Multiple Sources

Authors: Siddharth Vashishtha, Alexander Martin, William Gantt, Benjamin Van Durme, Aaron Steven White

Abstract: Understanding event descriptions is a central aspect of language processing, but current approaches focus overwhelmingly on single sentences or documents. Aggregating information about an event across documents can offer a much richer understanding. To this end, we present FAMuS, a new corpus of Wikipedia passages that report on some event, paired with underlying, genre-diverse (non-Wikipedia) source articles for the same event. Events and (cross-sentence) arguments in both report and source are annotated against FrameNet, providing broad coverage of different event types. We present results on two key event understanding tasks enabled by FAMuS: source validation -- determining whether a document is a valid source for a target report event -- and cross-document argument extraction -- full-document argument extraction for a target event from both its report and the correct source article. We release both FAMuS and our models to support further research.

Submitted 9 November, 2023; originally announced November 2023.

arXiv:2310.13793 [pdf, other]

A Unified View of Evaluation Metrics for Structured Prediction

Authors: Yunmo Chen, William Gantt, Tongfei Chen, Aaron Steven White, Benjamin Van Durme

Abstract: We present a conceptual framework that unifies a variety of evaluation metrics for different structured prediction tasks (e.g. event and relation extraction, syntactic and semantic parsing). Our framework requires representing the outputs of these tasks as objects of certain data types, and derives metrics through matching of common substructures, possibly followed by normalization. We demonstrate how commonly used metrics for a number of tasks can be succinctly expressed by this framework, and show that new metrics can be naturally derived in a bottom-up way based on an output structure. We release a library that enables this derivation to create new metrics. Finally, we consider how specific characteristics of tasks motivate metric design decisions, and suggest possible modifications to existing metrics in line with those motivations.

Submitted 20 October, 2023; originally announced October 2023.

Comments: Accepted at EMNLP 2023 Main Track

arXiv:2212.09702 [pdf, other]

On Event Individuation for Document-Level Information Extraction

Authors: William Gantt, Reno Kriz, Yunmo Chen, Siddharth Vashishtha, Aaron Steven White

Abstract: As information extraction (IE) systems have grown more adept at processing whole documents, the classic task of template filling has seen renewed interest as a benchmark for document-level IE. In this position paper, we call into question the suitability of template filling for this purpose. We argue that the task demands definitive answers to thorny questions of event individuation -- the problem of distinguishing distinct events -- about which even human experts disagree. Through an annotation study and error analysis, we show that this raises concerns about the usefulness of template filling metrics, the quality of datasets for the task, and the ability of models to learn it. Finally, we consider possible solutions.

Submitted 20 October, 2023; v1 submitted 19 December, 2022; originally announced December 2022.

Comments: EMNLP: Findings 2023

arXiv:2210.06600 [pdf, other]

Iterative Document-level Information Extraction via Imitation Learning

Authors: Yunmo Chen, William Gantt, Weiwei Gu, Tongfei Chen, Aaron Steven White, Benjamin Van Durme

Abstract: We present a novel iterative extraction model, IterX, for extracting complex relations, or templates (i.e., N-tuples representing a mapping from named slots to spans of text) within a document. Documents may feature zero or more instances of a template of any given type, and the task of template extraction entails identifying the templates in a document and extracting each template's slot values. Our imitation learning approach casts the problem as a Markov decision process (MDP), and relieves the need to use predefined template orders to train an extractor. It leads to state-of-the-art results on two established benchmarks -- 4-ary relation extraction on SciREX and template extraction on MUC-4 -- as well as a strong baseline on the new BETTER Granular task.

Submitted 1 May, 2023; v1 submitted 12 October, 2022; originally announced October 2022.

Comments: Accepted to EACL 2023

arXiv:2107.08523 [pdf, other]

Argument Linking: A Survey and Forecast

Authors: William Gantt

Abstract: Semantic role labeling (SRL) -- identifying the semantic relationships between a predicate and other constituents in the same sentence -- is a well-studied task in natural language understanding (NLU). However, many of these relationships are evident only at the level of the document, as a role for a predicate in one sentence may often be filled by an argument in a different one. This more general task, known as implicit semantic role labeling or argument linking, has received increased attention in recent years, as researchers have recognized its centrality to information extraction and NLU. This paper surveys the literature on argument linking and identifies several notable shortcomings of existing approaches that indicate the paths along which future research effort could most profitably be spent.

Submitted 18 July, 2021; originally announced July 2021.

Comments: An unpublished survey

arXiv:2103.10387 [pdf, other]

Decomposing and Recomposing Event Structure

Authors: William Gantt, Lelia Glass, Aaron Steven White

Abstract: We present an event structure classification empirically derived from inferential properties annotated on sentence- and document-level Universal Decompositional Semantics (UDS) graphs. We induce this classification jointly with semantic role, entity, and event-event relation classifications using a document-level generative model structured by these graphs. To support this induction, we augment existing annotations found in the UDS1.0 dataset, which covers the entirety of the English Web Treebank, with an array of inferential properties capturing fine-grained aspects of the temporal and aspectual structure of events. The resulting dataset (available at decomp.io) is the largest annotation of event structure and (partial) event coreference to date.

Submitted 29 September, 2021; v1 submitted 18 March, 2021; originally announced March 2021.

Comments: Accepted to Transactions of the Association for Computational Linguistics

arXiv:2010.10501 [pdf, other]

Natural Language Inference with Mixed Effects

Authors: William Gantt, Benjamin Kane, Aaron Steven White

Abstract: There is growing evidence that the prevalence of disagreement in the raw annotations used to construct natural language inference datasets makes the common practice of aggregating those annotations to a single label problematic. We propose a generic method that allows one to skip the aggregation step and train on the raw annotations directly without subjecting the model to unwanted noise that can arise from annotator response biases. We demonstrate that this method, which generalizes the notion of a mixed effects model by incorporating annotator random effects into any existing neural model, improves performance over models that do not incorporate such effects.

Submitted 20 October, 2020; originally announced October 2020.

Journal ref: The Ninth Joint Conference on Lexical and Computational Semantics (*SEM2020)

Showing 1–10 of 10 results for author: Gantt, W