
Showing 1–27 of 27 results for author: Gardent, C

  1. arXiv:2404.07836

    cs.CL

    Question Generation in Knowledge-Driven Dialog: Explainability and Evaluation

    Authors: Juliette Faille, Quentin Brabant, Gwenole Lecorve, Lina M. Rojas-Barahona, Claire Gardent

    Abstract: We explore question generation in the context of knowledge-grounded dialogs focusing on explainability and evaluation. Inspired by previous work on planning-based summarisation, we present a model which instead of directly generating a question, sequentially predicts first a fact then a question. We evaluate our approach on 37k test dialogs adapted from the KGConv dataset and we show that, althoug…

    Submitted 11 April, 2024; originally announced April 2024.

  2. arXiv:2404.03278

    cs.CL

    Evaluating Document Simplification: On the Importance of Separately Assessing Simplicity and Meaning Preservation

    Authors: Liam Cripwell, Joël Legrand, Claire Gardent

    Abstract: Text simplification intends to make a text easier to read while preserving its core meaning. Intuitively and as shown in previous works, these two dimensions (simplification and meaning preservation) are often-times inversely correlated. An overly conservative text will fail to simplify sufficiently, whereas extreme simplification will degrade meaning preservation. Yet, popular evaluation metrics…

    Submitted 4 April, 2024; originally announced April 2024.

    Comments: Accepted to READI Workshop 2024

  3. ModelWriter: Text & Model-Synchronized Document Engineering Platform

    Authors: Ferhat Erata, Claire Gardent, Bikash Gyawali, Anastasia Shimorina, Yvan Lussaud, Bedir Tekinerdogan, Geylani Kardas, Anne Monceaux

    Abstract: The ModelWriter platform provides a generic framework for automated traceability analysis. In this paper, we demonstrate how this framework can be used to trace the consistency and completeness of technical documents that consist of a set of System Installation Design Principles used by Airbus to ensure the correctness of aircraft system installation. We show in particular, how the platform allows…

    Submitted 2 March, 2024; originally announced March 2024.

    Comments: Published in: 2017 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE)

  4. arXiv:2310.08170

    cs.CL

    Simplicity Level Estimate (SLE): A Learned Reference-Less Metric for Sentence Simplification

    Authors: Liam Cripwell, Joël Legrand, Claire Gardent

    Abstract: Automatic evaluation for sentence simplification remains a challenging problem. Most popular evaluation metrics require multiple high-quality references -- something not readily available for simplification -- which makes it difficult to test performance on unseen domains. Furthermore, most existing metrics conflate simplicity with correlated attributes such as fluency or meaning preservation. We…

    Submitted 12 October, 2023; originally announced October 2023.

    Comments: Accepted to EMNLP 2023 (Main Conference)

  5. arXiv:2310.03878

    cs.CL

    Automatic and Human-AI Interactive Text Generation

    Authors: Yao Dou, Philippe Laban, Claire Gardent, Wei Xu

    Abstract: In this tutorial, we focus on text-to-text generation, a class of natural language generation (NLG) tasks, that takes a piece of text as input and then generates a revision that is improved according to some specific criteria (e.g., readability or linguistic styles), while largely retaining the original meaning and the length of the text. This includes many useful applications, such as text simpli…

    Submitted 5 October, 2023; originally announced October 2023.

    Comments: To appear at ACL 2024, Tutorial

  6. arXiv:2308.15298

    cs.CL cs.AI

    KGConv, a Conversational Corpus grounded in Wikidata

    Authors: Quentin Brabant, Gwenole Lecorve, Lina M. Rojas-Barahona, Claire Gardent

    Abstract: We present KGConv, a large, conversational corpus of 71k conversations where each question-answer pair is grounded in a Wikidata fact. Conversations contain on average 8.6 questions and for each Wikidata fact, we provide multiple variants (12 on average) of the corresponding question using templates, human annotations, hand-crafted rules and a question rewriting neural model. We provide baselines…

    Submitted 29 August, 2023; originally announced August 2023.

  7. arXiv:2305.06274

    cs.CL

    Context-Aware Document Simplification

    Authors: Liam Cripwell, Joël Legrand, Claire Gardent

    Abstract: To date, most work on text simplification has focused on sentence-level inputs. Early attempts at document simplification merely applied these approaches iteratively over the sentences of a document. However, this fails to coherently preserve the discourse structure, leading to suboptimal output quality. Recently, strategies from controllable simplification have been leveraged to achieve state-of-…

    Submitted 10 May, 2023; originally announced May 2023.

    Comments: Accepted to Findings of ACL 2023

  8. arXiv:2302.14785

    cs.CL

    Joint Representations of Text and Knowledge Graphs for Retrieval and Evaluation

    Authors: Teven Le Scao, Claire Gardent

    Abstract: A key feature of neural models is that they can produce semantic vector representations of objects (texts, images, speech, etc.) ensuring that similar objects are close to each other in the vector space. While much work has focused on learning representations for other modalities, there are no aligned cross-modal representations for text and knowledge base (KB) elements. One challenge for learning…

    Submitted 28 February, 2023; originally announced February 2023.

  9. arXiv:2207.03145

    cs.CL

    Active Learning and Multi-label Classification for Ellipsis and Coreference Detection in Conversational Question-Answering

    Authors: Quentin Brabant, Lina Maria Rojas-Barahona, Claire Gardent

    Abstract: In human conversations, ellipsis and coreference are commonly occurring linguistic phenomena. Although these phenomena are a means of making human-machine conversations more fluent and natural, only a few dialogue corpora contain explicit indications on which turns contain ellipses and/or coreferences. In this paper we address the task of automatically detecting ellipsis and coreferences in conversat…

    Submitted 7 July, 2022; originally announced July 2022.

    Comments: Published in IWSDS 2021

  10. arXiv:2204.05879

    cs.CL

    Generating Full Length Wikipedia Biographies: The Impact of Gender Bias on the Retrieval-Based Generation of Women Biographies

    Authors: Angela Fan, Claire Gardent

    Abstract: Generating factual, long-form text such as Wikipedia articles raises three key challenges: how to gather relevant evidence, how to structure information into well-formed text, and how to ensure that the generated text is factually correct. We address these by developing a model for English text that uses a retrieval mechanism to identify relevant supporting information on the web and a cache-based…

    Submitted 12 April, 2022; originally announced April 2022.

  11. arXiv:2011.05443

    cs.CL

    Multilingual AMR-to-Text Generation

    Authors: Angela Fan, Claire Gardent

    Abstract: Generating text from structured data is challenging because it requires bridging the gap between (i) structure and natural language (NL) and (ii) semantically underspecified input and fully specified NL output. Multilingual generation brings in an additional challenge: that of generating into languages with varied word order and morphological properties. In this work, we focus on Abstract Meaning…

    Submitted 10 November, 2020; originally announced November 2020.

  12. arXiv:2004.12744

    cs.CL

    Augmenting Transformers with KNN-Based Composite Memory for Dialogue

    Authors: Angela Fan, Claire Gardent, Chloe Braud, Antoine Bordes

    Abstract: Various machine learning tasks can benefit from access to external information of different modalities, such as text and images. Recent work has focused on learning architectures with large memories capable of storing this knowledge. We propose augmenting generative Transformer neural networks with KNN-based Information Fetching (KIF) modules. Each KIF module learns a read operation to access fixe…

    Submitted 10 November, 2020; v1 submitted 27 April, 2020; originally announced April 2020.

  13. arXiv:2001.11003

    cs.CL

    Modeling Global and Local Node Contexts for Text Generation from Knowledge Graphs

    Authors: Leonardo F. R. Ribeiro, Yue Zhang, Claire Gardent, Iryna Gurevych

    Abstract: Recent graph-to-text models generate text from graph-based data using either global or local aggregation to learn node representations. Global node encoding allows explicit communication between two distant nodes, thereby neglecting graph topology as all nodes are directly connected. In contrast, local node encoding considers the relations between neighbor nodes capturing the graph structure, but…

    Submitted 22 June, 2020; v1 submitted 29 January, 2020; originally announced January 2020.

    Comments: Accepted for publication in Transactions of the Association for Computational Linguistics (TACL), 2020; Author's final version; pre-MIT Press publication version

  14. arXiv:1912.05493

    cs.CL

    Quality of syntactic implication of RL-based sentence summarization

    Authors: Hoa T. Le, Christophe Cerisara, Claire Gardent

    Abstract: Work on summarization has explored both reinforcement learning (RL) optimization using ROUGE as a reward and syntax-aware models, such as models whose input is enriched with part-of-speech (POS)-tags and dependency information. However, it is not clear what is the respective impact of these approaches beyond the standard ROUGE evaluation metric. Especially, RL-based for summarization is becoming m…

    Submitted 11 December, 2019; originally announced December 2019.

    Comments: AAAI-20 Workshop on Engineering Dependable and Secure Machine Learning Systems (EDSMLS 2020)

  15. arXiv:1910.08435

    cs.CL

    Using Local Knowledge Graph Construction to Scale Seq2Seq Models to Multi-Document Inputs

    Authors: Angela Fan, Claire Gardent, Chloe Braud, Antoine Bordes

    Abstract: Query-based open-domain NLP tasks require information synthesis from long and diverse web results. Current approaches extractively select portions of web text as input to Sequence-to-Sequence models using methods such as TF-IDF ranking. We propose constructing a local graph structured knowledge base for each query, which compresses the web search information and reduces redundancy. We show that by…

    Submitted 18 October, 2019; originally announced October 2019.

  16. arXiv:1909.00352

    cs.CL

    Enhancing AMR-to-Text Generation with Dual Graph Representations

    Authors: Leonardo F. R. Ribeiro, Claire Gardent, Iryna Gurevych

    Abstract: Generating text from graph-based data, such as Abstract Meaning Representation (AMR), is a challenging task due to the inherent difficulty in how to properly encode the structure of a graph with labeled edges. To address this difficulty, we propose a novel graph-to-sequence model that encodes different but complementary perspectives of the structural information contained in the AMR graph. The mod…

    Submitted 1 September, 2019; originally announced September 2019.

    Comments: Accepted as a long conference paper to EMNLP 2019

  17. arXiv:1809.07721

    cs.CL

    Symbolic Priors for RNN-based Semantic Parsing

    Authors: Chunyang Xiao, Marc Dymetman, Claire Gardent

    Abstract: Seq2seq models based on Recurrent Neural Networks (RNNs) have recently received a lot of attention in the domain of Semantic Parsing for Question Answering. While in principle they can be trained directly on pairs (natural language utterances, logical forms), their performance is limited by the amount of available data. To alleviate this problem, we propose to exploit various sources of prior know…

    Submitted 20 September, 2018; originally announced September 2018.

  18. arXiv:1707.06971

    cs.CL

    Split and Rephrase

    Authors: Shashi Narayan, Claire Gardent, Shay B. Cohen, Anastasia Shimorina

    Abstract: We propose a new sentence simplification task (Split-and-Rephrase) where the aim is to split a complex sentence into a meaning preserving sequence of shorter sentences. Like sentence simplification, splitting-and-rephrasing has the potential of benefiting both natural language processing and societal applications. Because shorter sentences are generally better processed by NLP systems, it could be…

    Submitted 21 July, 2017; originally announced July 2017.

    Comments: 11 pages, EMNLP 2017

  19. arXiv:1705.03802

    cs.CL

    Analysing Data-To-Text Generation Benchmarks

    Authors: Laura Perez-Beltrachini, Claire Gardent

    Abstract: Recently, several data-sets associating data to text have been created to train data-to-text surface realisers. It is unclear however to what extent the surface realisation task exercised by these data-sets is linguistically challenging. Do these data-sets provide enough variety to encourage the development of generic, high-quality data-to-text surface realisers? In this paper, we argue that thes…

    Submitted 10 May, 2017; originally announced May 2017.

  20. arXiv:1507.08452

    cs.CL

    Unsupervised Sentence Simplification Using Deep Semantics

    Authors: Shashi Narayan, Claire Gardent

    Abstract: We present a novel approach to sentence simplification which departs from previous work in two main ways. First, it requires neither hand written rules nor a training corpus of aligned standard and simplified sentences. Second, sentence splitting operates on deep semantic structure. We show (i) that the unsupervised framework we propose is competitive with four state-of-the-art supervised systems…

    Submitted 7 September, 2016; v1 submitted 30 July, 2015; originally announced July 2015.

    Comments: 10 pages, INLG 2016

  21. arXiv:0909.3445

    cs.CL

    Grouping Synonyms by Definitions

    Authors: Ingrid Falk, Claire Gardent, Evelyne Jacquey, Fabienne Venant

    Abstract: We present a method for grouping the synonyms of a lemma according to its dictionary senses. The senses are defined by a large machine readable dictionary for French, the TLFi (Trésor de la langue française informatisé) and the synonyms are given by 5 synonym dictionaries (also for French). To evaluate the proposed method, we manually constructed a gold standard where for each (word, definition)…

    Submitted 18 September, 2009; originally announced September 2009.

    Journal ref: Recent Advances in Natural Language Processing (RANLP), Borovets : Bulgaria (2009)

  22. Computing Parallelism in Discourse

    Authors: Claire Gardent, Michael Kohlhase

    Abstract: Although much has been said about parallelism in discourse, a formal, computational theory of parallelism structure is still outstanding. In this paper, we present a theory which given two parallel utterances predicts which are the parallel elements. The theory consists of a sorted, higher-order abductive calculus and we show that it reconciles the insights of discourse theories of parallelism w…

    Submitted 1 May, 1997; originally announced May 1997.

    Comments: 6 pages

    Report number: CLAUS Nr. 90

    Journal ref: Proceedings of IJCAI'97

  23. Sloppy Identity

    Authors: Claire Gardent

    Abstract: Although sloppy interpretation is usually accounted for by theories of ellipsis, it often arises in non-elliptical contexts. In this paper, a theory of sloppy interpretation is provided which captures this fact. The underlying idea is that sloppy interpretation results from a semantic constraint on parallel structures and the theory is shown to predict sloppy readings for deaccented and paycheck…

    Submitted 1 May, 1997; originally announced May 1997.

    Comments: 20 pages

    Report number: CLAUS Nr.88, University of Saarbruecken

    Journal ref: Logical Aspects of Computational Linguistics, Springer-Verlag.

  24. Corrections and Higher-Order Unification

    Authors: Claire Gardent, Michael Kohlhase, Noor van Leusen

    Abstract: We propose an analysis of corrections which models some of the requirements corrections place on context. We then show that this analysis naturally extends to the interaction of corrections with pronominal anaphora on the one hand, and (in)definiteness on the other. The analysis builds on previous unification-based approaches to NL semantics and relies on Higher-Order Unification with Equivale…

    Submitted 2 September, 1996; originally announced September 1996.

    Comments: 12 pages, LaTeX file, in Proceedings of the 3. Konferenz zur Verarbeitung natuerlicher Sprache (KONVENS), Bielefeld, 1996

    Report number: CLAUS Report Nr. 77

  25. Focus and Higher-Order Unification

    Authors: Claire Gardent, Michael Kohlhase

    Abstract: Pulman has shown that Higher-Order Unification (HOU) can be used to model the interpretation of focus. In this paper, we extend the unification-based approach to cases which are often seen as a test-bed for focus theory: utterances with multiple focus operators and second occurrence expressions. We then show that the resulting analysis favourably compares with two prominent theories of focus…

    Submitted 2 May, 1996; originally announced May 1996.

    Comments: 6 pages, LaTeX file, uses colap.sty, to appear in Proceedings of COLING 96

    Report number: CLAUS-75

  26. Higher-Order Coloured Unification and Natural Language Semantics

    Authors: Claire Gardent, Michael Kohlhase

    Abstract: In this paper, we show that Higher-Order Coloured Unification - a form of unification developed for automated theorem proving - provides a general theory for modeling the interface between the interpretation process and other sources of linguistic, non semantic information. In particular, it provides the general theory for the Primary Occurrence Restriction which (Dalrymple, Shieber and Pereira,…

    Submitted 2 May, 1996; originally announced May 1996.

    Comments: 9 pages, LaTeX file, uses aclap.sty, to appear in Proceedings of ACL96

    Report number: CLAUS-76

  27. A specification language for Lexical Functional Grammars

    Authors: Patrick Blackburn, Claire Gardent

    Abstract: This paper defines a language L for specifying LFG grammars. This enables constraints on LFG's composite ontology (c-structures synchronised with f-structures) to be stated directly; no appeal to the LFG construction algorithm is needed. We use L to specify schemata annotated rules and the LFG uniqueness, completeness and coherence principles. Broader issues raised by this work are noted and discu…

    Submitted 3 March, 1995; originally announced March 1995.

    Comments: 6 pages, LaTeX uses eaclap.sty; Procs of Euro ACL-95

    Report number: CLAUS Report Nr. 51