Showing 1–50 of 114 results for author: Lapata, M

Searching in archive cs.
  1. arXiv:2407.13578  [pdf, other]

    cs.CL cs.AI

    Large Language Models as Reliable Knowledge Bases?

    Authors: Danna Zheng, Mirella Lapata, Jeff Z. Pan

    Abstract: The NLP community has recently shown a growing interest in leveraging Large Language Models (LLMs) for knowledge-intensive tasks, viewing LLMs as potential knowledge bases (KBs). However, the reliability and extent to which LLMs can function as KBs remain underexplored. While previous studies suggest LLMs can encode knowledge within their parameters, the amount of parametric knowledge alone is not…

    Submitted 18 July, 2024; originally announced July 2024.

  2. arXiv:2407.09506  [pdf, other]

    cs.CL

    Integrating Large Language Models with Graph-based Reasoning for Conversational Question Answering

    Authors: Parag Jain, Mirella Lapata

    Abstract: We focus on a conversational question answering task which combines the challenges of understanding questions in context and reasoning over evidence gathered from heterogeneous sources like text, knowledge graphs, tables, and infoboxes. Our method utilizes a graph structured representation to aggregate information about a question and its context (i.e., the conversation so far and evidence retriev…

    Submitted 14 June, 2024; originally announced July 2024.

  3. arXiv:2406.19073  [pdf, other]

    cs.CL

    AMBROSIA: A Benchmark for Parsing Ambiguous Questions into Database Queries

    Authors: Irina Saparina, Mirella Lapata

    Abstract: Practical semantic parsers are expected to understand user utterances and map them to executable programs, even when these are ambiguous. We introduce a new benchmark, AMBROSIA, which we hope will inform and inspire the development of text-to-SQL parsers capable of recognizing and interpreting ambiguous requests. Our dataset contains questions showcasing three different types of ambiguity (scope a…

    Submitted 27 June, 2024; originally announced June 2024.

  4. arXiv:2406.10190  [pdf, other]

    cs.CL

    CHIRON: Rich Character Representations in Long-Form Narratives

    Authors: Alexander Gurung, Mirella Lapata

    Abstract: Characters are integral to long-form narratives, but are poorly understood by existing story analysis and generation systems. While prior work has simplified characters via graph-based methods and brief character descriptions, we aim to better tackle the problem of representing complex characters by taking inspiration from advice given to professional writers. We propose CHIRON, a new `character s…

    Submitted 26 June, 2024; v1 submitted 14 June, 2024; originally announced June 2024.

  5. arXiv:2406.04106  [pdf, other]

    cs.CL

    Explainability and Hate Speech: Structured Explanations Make Social Media Moderators Faster

    Authors: Agostina Calabrese, Leonardo Neves, Neil Shah, Maarten W. Bos, Björn Ross, Mirella Lapata, Francesco Barbieri

    Abstract: Content moderators play a key role in keeping the conversation on social media healthy. While the high volume of content they need to judge represents a bottleneck to the moderation pipeline, no studies have explored how models could support them to make faster decisions. There is, by now, a vast body of research into detecting hate speech, sometimes explicitly motivated by a desire to help improv…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: 11 pages, 14 figures, to be published at ACL 2024

  6. arXiv:2405.06524  [pdf, other]

    cs.CL

    Prompting Large Language Models with Knowledge Graphs for Question Answering Involving Long-tail Facts

    Authors: Wenyu Huang, Guancheng Zhou, Mirella Lapata, Pavlos Vougiouklis, Sebastien Montella, Jeff Z. Pan

    Abstract: Although Large Language Models (LLMs) are effective in performing various NLP tasks, they still struggle to handle tasks that require extensive, real-world knowledge, especially when dealing with long-tail facts (facts related to long-tail entities). This limitation highlights the need to supplement LLMs with non-parametric knowledge. To address this issue, we analysed the effects of different typ…

    Submitted 10 May, 2024; originally announced May 2024.

  7. arXiv:2405.05938  [pdf, other]

    cs.CL

    DOLOMITES: Domain-Specific Long-Form Methodical Tasks

    Authors: Chaitanya Malaviya, Priyanka Agrawal, Kuzman Ganchev, Pranesh Srinivasan, Fantine Huot, Jonathan Berant, Mark Yatskar, Dipanjan Das, Mirella Lapata, Chris Alberti

    Abstract: Experts in various fields routinely perform methodical writing tasks to plan, organize, and report their work. From a clinician writing a differential diagnosis for a patient, to a teacher writing a lesson plan for students, these tasks are pervasive, requiring to methodically generate structured long-form output for a given input. We develop a typology of methodical tasks structured in the form o…

    Submitted 28 May, 2024; v1 submitted 9 May, 2024; originally announced May 2024.

    Comments: Dataset now available at https://dolomites-benchmark.github.io

  8. arXiv:2404.03381  [pdf, other]

    cs.CL

    Learning to Plan and Generate Text with Citations

    Authors: Constanza Fierro, Reinald Kim Amplayo, Fantine Huot, Nicola De Cao, Joshua Maynez, Shashi Narayan, Mirella Lapata

    Abstract: The increasing demand for the deployment of LLMs in information-seeking scenarios has spurred efforts in creating verifiable systems, which generate responses to queries along with supporting evidence. In this paper, we explore the attribution capabilities of plan-based models which have been recently shown to improve the faithfulness, grounding, and controllability of generated text. We conceptua…

    Submitted 13 May, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

  9. arXiv:2403.03823  [pdf, other]

    cs.CL

    A Modular Approach for Multimodal Summarization of TV Shows

    Authors: Louis Mahon, Mirella Lapata

    Abstract: In this paper we address the task of summarizing television shows, which touches key areas in AI research: complex reasoning, multiple modalities, and long narratives. We present a modular approach where separate components perform specialized sub-tasks which we argue affords greater flexibility compared to end-to-end methods. Our modules involve detecting scene boundaries, reordering scenes so as…

    Submitted 6 July, 2024; v1 submitted 6 March, 2024; originally announced March 2024.

  10. arXiv:2403.00435  [pdf, other]

    cs.CL

    Hierarchical Indexing for Retrieval-Augmented Opinion Summarization

    Authors: Tom Hosking, Hao Tang, Mirella Lapata

    Abstract: We propose a method for unsupervised abstractive opinion summarization, that combines the attributability and scalability of extractive approaches with the coherence and fluency of Large Language Models (LLMs). Our method, HIRO, learns an index structure that maps sentences to a path through a semantically organized discrete hierarchy. At inference time, we populate the index and use it to identif…

    Submitted 17 July, 2024; v1 submitted 1 March, 2024; originally announced March 2024.

    Comments: Accepted to TACL; Pre MIT Press version

  11. arXiv:2402.12554  [pdf, other]

    cs.CL

    Archer: A Human-Labeled Text-to-SQL Dataset with Arithmetic, Commonsense and Hypothetical Reasoning

    Authors: Danna Zheng, Mirella Lapata, Jeff Z. Pan

    Abstract: We present Archer, a challenging bilingual text-to-SQL dataset specific to complex reasoning, including arithmetic, commonsense and hypothetical reasoning. It contains 1,042 English questions and 1,042 Chinese questions, along with 521 unique SQL queries, covering 20 English databases across 20 domains. Notably, this dataset demonstrates a significantly higher level of complexity compared to exist…

    Submitted 24 February, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

    Comments: EACL 2024

  12. arXiv:2402.12545  [pdf, other]

    cs.CL

    TrustScore: Reference-Free Evaluation of LLM Response Trustworthiness

    Authors: Danna Zheng, Danyang Liu, Mirella Lapata, Jeff Z. Pan

    Abstract: Large Language Models (LLMs) have demonstrated impressive capabilities across various domains, prompting a surge in their practical applications. However, concerns have arisen regarding the trustworthiness of LLMs outputs, particularly in closed-book question-answering tasks, where non-experts may struggle to identify inaccuracies due to the absence of contextual or ground truth information. This…

    Submitted 6 May, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

  13. arXiv:2402.08666  [pdf, other]

    cs.CL

    Improving Generalization in Semantic Parsing by Increasing Natural Language Variation

    Authors: Irina Saparina, Mirella Lapata

    Abstract: Text-to-SQL semantic parsing has made significant progress in recent years, with various models demonstrating impressive performance on the challenging Spider benchmark. However, it has also been shown that these models often struggle to generalize even when faced with small perturbations of previously (accurately) parsed expressions. This is mainly due to the linguistic form of questions in Spide…

    Submitted 13 February, 2024; originally announced February 2024.

    Comments: EACL 2024

  14. arXiv:2312.14215  [pdf, other]

    cs.CL cs.AI

    SimLM: Can Language Models Infer Parameters of Physical Systems?

    Authors: Sean Memery, Mirella Lapata, Kartic Subr

    Abstract: Several machine learning methods aim to learn or reason about complex physical systems. A common first-step towards reasoning is to infer system parameters from observations of its behavior. In this paper, we investigate the performance of Large Language Models (LLMs) at performing parameter inference in the context of physical systems. Our experiments suggest that they are not inherently suited t…

    Submitted 6 February, 2024; v1 submitted 21 December, 2023; originally announced December 2023.

    ACM Class: I.2.7; I.6

  15. arXiv:2312.02748  [pdf, other]

    cs.CL cs.LG

    Compositional Generalization for Data-to-Text Generation

    Authors: Xinnuo Xu, Ivan Titov, Mirella Lapata

    Abstract: Data-to-text generation involves transforming structured data, often represented as predicate-argument tuples, into coherent textual descriptions. Despite recent advances, systems still struggle when confronted with unseen combinations of predicates, producing unfaithful descriptions (e.g. hallucinations or omissions). We refer to this issue as compositional generalisation, and it encouraged us to…

    Submitted 5 December, 2023; originally announced December 2023.

    Journal ref: Findings of EMNLP 2023

  16. arXiv:2311.09808  [pdf, other]

    cs.CL

    PixT3: Pixel-based Table-To-Text Generation

    Authors: Iñigo Alonso, Eneko Agirre, Mirella Lapata

    Abstract: Table-to-text generation involves generating appropriate textual descriptions given structured tabular data. It has attracted increasing attention in recent years thanks to the popularity of neural network models and the availability of large-scale datasets. A common feature across existing methods is their treatment of the input as a string, i.e., by employing linearization techniques that do not…

    Submitted 3 June, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

  17. arXiv:2311.08572  [pdf, other]

    cs.CL cs.AI cs.LG

    Low-Rank Adaptation for Multilingual Summarization: An Empirical Study

    Authors: Chenxi Whitehouse, Fantine Huot, Jasmijn Bastings, Mostafa Dehghani, Chu-Cheng Lin, Mirella Lapata

    Abstract: Although the advancements of pre-trained Large Language Models have significantly accelerated recent progress in NLP, their ever-increasing size poses significant challenges for conventional fine-tuning, especially in memory-intensive tasks. We investigate the potential of Parameter-Efficient Fine-Tuning, focusing on Low-Rank Adaptation (LoRA), in the domain of multilingual summarization, a task t…

    Submitted 31 March, 2024; v1 submitted 14 November, 2023; originally announced November 2023.

    Comments: Findings of NAACL 2024

  18. arXiv:2310.05295  [pdf, other]

    cs.CL

    Visual Storytelling with Question-Answer Plans

    Authors: Danyang Liu, Mirella Lapata, Frank Keller

    Abstract: Visual storytelling aims to generate compelling narratives from image sequences. Existing models often focus on enhancing the representation of the image sequence, e.g., with external knowledge sources or advanced graph structures. Despite recent progress, the stories are often repetitive, illogical, and lacking in detail. To mitigate these issues, we present a novel framework which integrates vis…

    Submitted 17 October, 2023; v1 submitted 8 October, 2023; originally announced October 2023.

    Comments: EMNLP 2023 Findings

  19. arXiv:2307.04096  [pdf, other]

    cs.CL

    Optimal Transport Posterior Alignment for Cross-lingual Semantic Parsing

    Authors: Tom Sherborne, Tom Hosking, Mirella Lapata

    Abstract: Cross-lingual semantic parsing transfers parsing capability from a high-resource language (e.g., English) to low-resource languages with scarce training data. Previous work has primarily considered silver-standard data augmentation or zero-shot methods, however, exploiting few-shot gold data is comparatively unexplored. We propose a new approach to cross-lingual semantic parsing by explicitly mini…

    Submitted 9 July, 2023; originally announced July 2023.

    Comments: Accepted to TACL 2023. Pre-MIT Press publication. 17 pages, 3 figures, 6 tables

  20. arXiv:2305.14205  [pdf, other]

    cs.CL

    $μ$PLAN: Summarizing using a Content Plan as Cross-Lingual Bridge

    Authors: Fantine Huot, Joshua Maynez, Chris Alberti, Reinald Kim Amplayo, Priyanka Agrawal, Constanza Fierro, Shashi Narayan, Mirella Lapata

    Abstract: Cross-lingual summarization consists of generating a summary in one language given an input document in a different language, allowing for the dissemination of relevant content across speakers of other languages. The task is challenging mainly due to the paucity of cross-lingual datasets and the compounded difficulty of summarizing and translating. This work presents $μ$PLAN, an approach to cross-…

    Submitted 31 January, 2024; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: EACL 2024

  21. arXiv:2305.11603  [pdf, other]

    cs.CL

    Attributable and Scalable Opinion Summarization

    Authors: Tom Hosking, Hao Tang, Mirella Lapata

    Abstract: We propose a method for unsupervised opinion summarization that encodes sentences from customer reviews into a hierarchical discrete latent space, then identifies common opinions based on the frequency of their encodings. We are able to generate both abstractive summaries by decoding these frequent encodings, and extractive summaries by selecting the sentences assigned to the same frequent encodin…

    Submitted 19 May, 2023; originally announced May 2023.

    Comments: ACL 2023

  22. arXiv:2305.10142  [pdf, other]

    cs.CL

    Improving Language Model Negotiation with Self-Play and In-Context Learning from AI Feedback

    Authors: Yao Fu, Hao Peng, Tushar Khot, Mirella Lapata

    Abstract: We study whether multiple large language models (LLMs) can autonomously improve each other in a negotiation game by playing, reflecting, and criticizing. We are interested in this question because if LLMs were able to improve each other, it would imply the possibility of creating strong AI agents with minimal human intervention. We ask two LLMs to negotiate with each other, playing the roles of a…

    Submitted 17 May, 2023; originally announced May 2023.

    Comments: Preprint. Code at https://github.com/FranxYao/GPT-Bargaining

  23. arXiv:2305.06164  [pdf, other]

    cs.CL cs.AI

    Conversational Semantic Parsing using Dynamic Context Graphs

    Authors: Parag Jain, Mirella Lapata

    Abstract: In this paper we consider the task of conversational semantic parsing over general purpose knowledge graphs (KGs) with millions of entities, and thousands of relation-types. We focus on models which are capable of interactively mapping user utterances into executable logical forms (e.g., Sparql) in the context of the conversational history. Our key idea is to represent information about an utteran…

    Submitted 7 December, 2023; v1 submitted 4 May, 2023; originally announced May 2023.

    Comments: camera ready

  24. arXiv:2305.00034  [pdf, other]

    cs.CL

    Text-Blueprint: An Interactive Platform for Plan-based Conditional Generation

    Authors: Fantine Huot, Joshua Maynez, Shashi Narayan, Reinald Kim Amplayo, Kuzman Ganchev, Annie Louis, Anders Sandholm, Dipanjan Das, Mirella Lapata

    Abstract: While conditional generation models can now generate natural language well enough to create fluent text, it is still difficult to control the generation process, leading to irrelevant, repetitive, and hallucinated content. Recent work shows that planning can be a useful intermediate step to render conditional generation less opaque and more grounded. We present a web browser-based demonstration fo…

    Submitted 28 April, 2023; originally announced May 2023.

    Comments: Accepted at EACL Call for System Demonstrations 2023

  25. arXiv:2301.12217  [pdf, other]

    cs.CL

    Semantic Parsing for Conversational Question Answering over Knowledge Graphs

    Authors: Laura Perez-Beltrachini, Parag Jain, Emilio Monti, Mirella Lapata

    Abstract: In this paper, we are interested in developing semantic parsers which understand natural language questions embedded in a conversation with a user and ground them to formal queries over definitions in a general purpose knowledge graph (KG) with very large vocabularies (covering thousands of concept names and relations, and millions of entities). To this end, we develop a dataset where user questio…

    Submitted 28 January, 2023; originally announced January 2023.

    Comments: EACL 2023

  26. arXiv:2212.10622  [pdf, other]

    cs.CL

    mFACE: Multilingual Summarization with Factual Consistency Evaluation

    Authors: Roee Aharoni, Shashi Narayan, Joshua Maynez, Jonathan Herzig, Elizabeth Clark, Mirella Lapata

    Abstract: Abstractive summarization has enjoyed renewed interest in recent years, thanks to pre-trained language models and the availability of large-scale datasets. Despite promising results, current models still suffer from generating factually inconsistent summaries, reducing their utility for real-world application. Several recent efforts attempt to address this by devising models that automatically det…

    Submitted 5 January, 2024; v1 submitted 20 December, 2022; originally announced December 2022.

    Comments: 28 pages with links to released data

  27. arXiv:2212.10471  [pdf, other]

    cs.CL

    Little Red Riding Hood Goes Around the Globe: Crosslingual Story Planning and Generation with Large Language Models

    Authors: Evgeniia Razumovskaia, Joshua Maynez, Annie Louis, Mirella Lapata, Shashi Narayan

    Abstract: Previous work has demonstrated the effectiveness of planning for story generation exclusively in a monolingual setting focusing primarily on English. We consider whether planning brings advantages to automatic story generation across languages. We propose a new task of cross-lingual story generation with planning and present a new dataset for this task. We conduct a comprehensive study of differen…

    Submitted 25 March, 2024; v1 submitted 20 December, 2022; originally announced December 2022.

    Comments: Accepted to LREC-COLING 2024

  28. arXiv:2212.05982  [pdf, other]

    cs.CL cs.AI

    Real-World Compositional Generalization with Disentangled Sequence-to-Sequence Learning

    Authors: Hao Zheng, Mirella Lapata

    Abstract: Compositional generalization is a basic mechanism in human language learning, which current neural networks struggle with. A recently proposed Disentangled sequence-to-sequence model (Dangle) shows promising generalization capability by learning specialized encodings for each decoding step. We introduce two key modifications to this model which encourage more disentangled representations and impro…

    Submitted 12 December, 2022; originally announced December 2022.

  29. arXiv:2211.08264  [pdf, other]

    cs.CL

    QAmeleon: Multilingual QA with Only 5 Examples

    Authors: Priyanka Agrawal, Chris Alberti, Fantine Huot, Joshua Maynez, Ji Ma, Sebastian Ruder, Kuzman Ganchev, Dipanjan Das, Mirella Lapata

    Abstract: The availability of large, high-quality datasets has been one of the main drivers of recent progress in question answering (QA). Such annotated datasets however are difficult and costly to collect, and rarely exist in languages other than English, rendering QA technology inaccessible to underrepresented languages. An alternative to building large monolingual training datasets is to leverage pre-tr…

    Submitted 7 August, 2023; v1 submitted 15 November, 2022; originally announced November 2022.

    Comments: To Appear at Transactions of Association for Computational Linguistics (TACL)

  30. arXiv:2210.04829  [pdf, other]

    cs.CL

    Hierarchical3D Adapters for Long Video-to-text Summarization

    Authors: Pinelopi Papalampidi, Mirella Lapata

    Abstract: In this paper, we focus on video-to-text summarization and investigate how to best utilize multimodal information for summarizing long inputs (e.g., an hour-long TV show) into long outputs (e.g., a multi-sentence summary). We extend SummScreen (Chen et al., 2021), a dialogue summarization dataset consisting of transcripts of TV episodes with reference summaries, and create a multimodal variant by…

    Submitted 10 October, 2022; originally announced October 2022.

  31. arXiv:2210.02659  [pdf, other]

    cs.CL

    Explainable Abuse Detection as Intent Classification and Slot Filling

    Authors: Agostina Calabrese, Björn Ross, Mirella Lapata

    Abstract: To proactively offer social media users a safe online experience, there is a need for systems that can detect harmful posts and promptly alert platform moderators. In order to guarantee the enforcement of a consistent policy, moderators are provided with detailed guidelines. In contrast, most state-of-the-art models learn what abuse is from labelled examples and as a result base their predictions…

    Submitted 5 October, 2022; originally announced October 2022.

    Comments: 14 pages, 2 figures, to be published in TACL (pre-MIT Press publication version)

    ACM Class: I.2.7

  32. arXiv:2209.12714  [pdf, other]

    cs.CL cs.LG

    Text Summarization with Oracle Expectation

    Authors: Yumo Xu, Mirella Lapata

    Abstract: Extractive summarization produces summaries by identifying and concatenating the most important sentences in a document. Since most summarization datasets do not come with gold labels indicating whether document sentences are summary-worthy, different labeling algorithms have been proposed to extrapolate oracle extracts for model training. In this work, we identify two flaws with the widely used g…

    Submitted 26 September, 2022; originally announced September 2022.

    Comments: 18 pages, 5 figures

  33. arXiv:2209.12577  [pdf, other]

    cs.CL

    Meta-Learning a Cross-lingual Manifold for Semantic Parsing

    Authors: Tom Sherborne, Mirella Lapata

    Abstract: Localizing a semantic parser to support new languages requires effective cross-lingual generalization. Recent work has found success with machine-translation or zero-shot methods although these approaches can struggle to model how native speakers ask questions. We consider how to effectively leverage minimal annotated examples in new languages for few-shot cross-lingual semantic parsing. We introd…

    Submitted 27 September, 2022; v1 submitted 26 September, 2022; originally announced September 2022.

    Comments: Accepted to TACL 2022. Pre-MIT Press publication

  34. arXiv:2207.00397  [pdf, ps, other]

    cs.CL

    Conditional Generation with a Question-Answering Blueprint

    Authors: Shashi Narayan, Joshua Maynez, Reinald Kim Amplayo, Kuzman Ganchev, Annie Louis, Fantine Huot, Anders Sandholm, Dipanjan Das, Mirella Lapata

    Abstract: The ability to convey relevant and faithful information is critical for many tasks in conditional generation and yet remains elusive for neural seq-to-seq models whose outputs often reveal hallucinations and fail to correctly cover important details. In this work, we advocate planning as a useful intermediate representation for rendering conditional generation less opaque and more grounded. Our wo…

    Submitted 1 May, 2023; v1 submitted 1 July, 2022; originally announced July 2022.

    Comments: 22 pages, Accepted at TACL. Pre-MIT Press publication version

  35. arXiv:2206.01512  [pdf, other]

    cs.CL cs.AI cs.LG cs.NE

    Latent Topology Induction for Understanding Contextualized Representations

    Authors: Yao Fu, Mirella Lapata

    Abstract: In this work, we study the representation space of contextualized embeddings and gain insight into the hidden topology of large language models. We show there exists a network of latent states that summarize linguistic properties of contextualized representations. Instead of seeking alignments to existing well-defined annotations, we infer this latent network in a fully unsupervised way using a st…

    Submitted 3 June, 2022; originally announced June 2022.

    Comments: Preprint

  36. arXiv:2203.15108  [pdf, other]

    cs.CL

    A Well-Composed Text is Half Done! Composition Sampling for Diverse Conditional Generation

    Authors: Shashi Narayan, Gonçalo Simões, Yao Zhao, Joshua Maynez, Dipanjan Das, Michael Collins, Mirella Lapata

    Abstract: We propose Composition Sampling, a simple but effective method to generate diverse outputs for conditional generation of higher quality compared to previous stochastic decoding strategies. It builds on recently proposed plan-based neural generation models (Narayan et al, 2021) that are trained to first create a composition of the output and then generate by conditioning on it and the input. Our ap…

    Submitted 28 March, 2022; originally announced March 2022.

    Comments: 21 pages, ACL 2022

  37. arXiv:2203.03463  [pdf, other]

    cs.CL

    Hierarchical Sketch Induction for Paraphrase Generation

    Authors: Tom Hosking, Hao Tang, Mirella Lapata

    Abstract: We propose a generative model of paraphrase generation, that encourages syntactic diversity by conditioning on an explicit syntactic sketch. We introduce Hierarchical Refinement Quantized Variational Autoencoders (HRQ-VAE), a method for learning decompositions of dense encodings as a sequence of discrete latent variables that make iterative refinements of increasing granularity. This hierarchy of…

    Submitted 21 March, 2022; v1 submitted 7 March, 2022; originally announced March 2022.

    Comments: Accepted at ACL 2022

  38. arXiv:2202.13756  [pdf, other]

    cs.CL

    Data-to-text Generation with Variational Sequential Planning

    Authors: Ratish Puduppully, Yao Fu, Mirella Lapata

    Abstract: We consider the task of data-to-text generation, which aims to create textual output from non-linguistic input. We focus on generating long-form text, i.e., documents with multiple paragraphs, and propose a neural model enhanced with a planning component responsible for organizing high-level information in a coherent and meaningful way. We infer latent plans sequentially with a structured variatio…

    Submitted 28 February, 2022; originally announced February 2022.

    Comments: To appear in Transactions of the Association for Computational Linguistics (TACL); 18 pages

  39. arXiv:2202.09583  [pdf, other]

    cs.CL

    Models and Datasets for Cross-Lingual Summarisation

    Authors: Laura Perez-Beltrachini, Mirella Lapata

    Abstract: We present a cross-lingual summarisation corpus with long documents in a source language associated with multi-sentence summaries in a target language. The corpus covers twelve language pairs and directions for four European languages, namely Czech, English, French and German, and the methodology for its creation can be applied to several other languages. We derive cross-lingual document-summary i…

    Submitted 19 February, 2022; originally announced February 2022.

    Comments: EMNLP 2021

  40. arXiv:2112.03638  [pdf, other]

    cs.LG cs.CL cs.DS stat.AP stat.ML

    Scaling Structured Inference with Randomization

    Authors: Yao Fu, John P. Cunningham, Mirella Lapata

    Abstract: Deep discrete structured models have seen considerable progress recently, but traditional inference using dynamic programming (DP) typically works with a small number of states (less than hundreds), which severely limits model capacity. At the same time, across machine learning, there is a recent trend of using randomized truncation techniques to accelerate computations involving large sums. Here,…

    Submitted 24 July, 2022; v1 submitted 7 December, 2021; originally announced December 2021.

    Comments: ICML 2022 camera ready

  41. arXiv:2111.08774  [pdf, other]

    cs.CV

    Film Trailer Generation via Task Decomposition

    Authors: Pinelopi Papalampidi, Frank Keller, Mirella Lapata

    Abstract: Movie trailers perform multiple functions: they introduce viewers to the story, convey the mood and artistic style of the film, and encourage audiences to see the movie. These diverse functions make automatic trailer generation a challenging endeavor. We decompose it into two subtasks: narrative structure identification and sentiment prediction. We model movies as graphs, where nodes are shots and…

    Submitted 16 November, 2021; originally announced November 2021.

  42. arXiv:2110.07358  [pdf, other]

    cs.CL

    Memory-Based Semantic Parsing

    Authors: Parag Jain, Mirella Lapata

    Abstract: We present a memory-based model for context-dependent semantic parsing. Previous approaches focus on enabling the decoder to copy or modify the parse from the previous utterance, assuming there is a dependency between the current and previous parses. In this work, we propose to represent contextual information using an external memory. We learn a context memory controller that manages the memory b…

    Submitted 7 September, 2021; originally announced October 2021.

  43. arXiv:2110.04655  [pdf, other]

    cs.CL

    Disentangled Sequence to Sequence Learning for Compositional Generalization

    Authors: Hao Zheng, Mirella Lapata

    Abstract: There is mounting evidence that existing neural network models, in particular the very popular sequence-to-sequence architecture, struggle to systematically generalize to unseen compositions of seen components. We demonstrate that one of the reasons hindering compositional generalization relates to representations being entangled. We propose an extension to sequence-to-sequence models which encour…

    Submitted 22 March, 2022; v1 submitted 9 October, 2021; originally announced October 2021.

    Comments: ACL 2022

  44. arXiv:2109.04325  [pdf, other]

    cs.CL cs.AI cs.LG

    Learning Opinion Summarizers by Selecting Informative Reviews

    Authors: Arthur Bražinskas, Mirella Lapata, Ivan Titov

    Abstract: Opinion summarization has been traditionally approached with unsupervised, weakly-supervised and few-shot learning techniques. In this work, we collect a large dataset of summaries paired with user reviews for over 31,000 products, enabling supervised training. However, the number of reviews per product is large (320 on average), making summarization - and especially training a summarizer - imprac…

    Submitted 9 September, 2021; originally announced September 2021.

    Comments: EMNLP 2021

  45. arXiv:2109.03171  [pdf, other]

    cs.CL

    Aspect-Controllable Opinion Summarization

    Authors: Reinald Kim Amplayo, Stefanos Angelidis, Mirella Lapata

    Abstract: Recent work on opinion summarization produces general summaries based on a set of input reviews and the popularity of opinions expressed in them. In this paper, we propose an approach that allows the generation of customized summaries based on aspect queries (e.g., describing the location and room of a hotel). Using a review corpus, we create a synthetic training dataset of (review, summary) pairs…

    Submitted 7 September, 2021; originally announced September 2021.

    Comments: EMNLP 2021

  46. arXiv:2106.03257  [pdf, other]

    cs.CL cs.LG

    Structured Reordering for Modeling Latent Alignments in Sequence Transduction

    Authors: Bailin Wang, Mirella Lapata, Ivan Titov

    Abstract: Despite success in many domains, neural models struggle in settings where train and test examples are drawn from different distributions. In particular, in contrast to humans, conventional sequence-to-sequence (seq2seq) models fail to generalize systematically, i.e., interpret sentences representing novel combinations of concepts (e.g., text segments) seen in training. Traditional grammar formalis…

    Submitted 26 October, 2021; v1 submitted 6 June, 2021; originally announced June 2021.

    Comments: NeurIPS 2021

  47. arXiv:2106.00104  [pdf, other]

    cs.CL cs.LG

    Text Summarization with Latent Queries

    Authors: Yumo Xu, Mirella Lapata

    Abstract: The availability of large-scale datasets has driven the development of neural models that create summaries from single documents, for generic purposes. When using a summarization system, users often have specific intents with various language realizations, which, depending on the information need, can range from a single keyword to a long narrative composed of multiple questions. Existing summariz…

    Submitted 31 May, 2021; originally announced June 2021.

    Comments: 12 pages

  48. arXiv:2105.15053  [pdf, other]

    cs.CL

    Factorising Meaning and Form for Intent-Preserving Paraphrasing

    Authors: Tom Hosking, Mirella Lapata

    Abstract: We propose a method for generating paraphrases of English questions that retain the original intent but use a different surface form. Our model combines a careful choice of training objective with a principled information bottleneck, to induce a latent encoding space that disentangles meaning and form. We train an encoder-decoder model to reconstruct a question from a paraphrase with the same mean…

    Submitted 31 May, 2021; originally announced May 2021.

    Comments: ACL 2021

  49. arXiv:2104.07554  [pdf, other]

    cs.CL

    Zero-Shot Cross-lingual Semantic Parsing

    Authors: Tom Sherborne, Mirella Lapata

    Abstract: Recent work in cross-lingual semantic parsing has successfully applied machine translation to localize parsers to new languages. However, these advances assume access to high-quality machine translation systems and word alignment tools. We remove these assumptions and study cross-lingual semantic parsing as a zero-shot problem, without parallel data (i.e., utterance-logical form pairs) for new lan…

    Submitted 7 March, 2022; v1 submitted 15 April, 2021; originally announced April 2021.

    Comments: Accepted to ACL2022 Main Conference. 19 pages, 3 figures, 12 tables

  50. arXiv:2104.05819  [pdf, other]

    cs.CL

    Learning from Executions for Semantic Parsing

    Authors: Bailin Wang, Mirella Lapata, Ivan Titov

    Abstract: Semantic parsing aims at translating natural language (NL) utterances onto machine-interpretable programs, which can be executed against a real-world environment. The expensive annotation of utterance-program pairs has long been acknowledged as a major bottleneck for the deployment of contemporary neural models to real-life applications. In this work, we focus on the task of semi-supervised learni…

    Submitted 12 April, 2021; originally announced April 2021.

    Comments: NAACL 2021 Camera Ready