-
Extracting Formal Models from Normative Texts
Authors:
John J. Camilleri,
Normunds Grūzītis,
Gerardo Schneider
Abstract:
We are concerned with the analysis of normative texts - documents based on the deontic notions of obligation, permission, and prohibition. Our goal is to make queries about these notions and verify that a text satisfies certain properties concerning causality of actions and timing constraints. This requires taking the original text and building a representation (model) of it in a formal language, in our case the C-O Diagram formalism. We present an experimental, semi-automatic aid that helps to bridge the gap between a normative text in natural language and its C-O Diagram representation. Our approach consists of using dependency structures obtained from the state-of-the-art Stanford Parser, and applying our own rules and heuristics in order to extract the relevant components. The result is a tabular data structure where each sentence is split into suitable fields, which can then be converted into a C-O Diagram. The process is not fully automatic, however, and some post-editing is generally required of the user. We apply our tool and perform experiments on documents from different domains, and report an initial evaluation of the accuracy and feasibility of our approach.
Submitted 15 June, 2017;
originally announced June 2017.
-
Towards Self-explanatory Ontology Visualization with Contextual Verbalization
Authors:
Renārs Liepiņš,
Uldis Bojārs,
Normunds Grūzītis,
Kārlis Čerāns,
Edgars Celms
Abstract:
Ontologies are one of the core foundations of the Semantic Web. To participate in Semantic Web projects, domain experts need to be able to understand the ontologies involved. Visual notations can provide an overview of the ontology and help users to understand the connections among entities. However, the users first need to learn the visual notation before they can interpret it correctly. A controlled natural language representation would be readable right away and might be preferred in the case of complex axioms; however, the structure of the ontology would remain less apparent. We propose to combine ontology visualizations with contextual ontology verbalizations of selected ontology (diagram) elements, displaying controlled natural language (CNL) explanations of OWL axioms corresponding to the selected visual notation elements. Thus, the domain experts will benefit from both the high-level overview provided by the graphical notation and the detailed textual explanations of particular elements in the diagram.
Submitted 6 July, 2016;
originally announced July 2016.
-
Extracting Formal Models from Normative Texts
Authors:
John J. Camilleri,
Normunds Gruzitis,
Gerardo Schneider
Abstract:
Normative texts are documents based on the deontic notions of obligation, permission, and prohibition. Our goal is to model such texts using the C-O Diagram formalism, making them amenable to formal analysis, in particular verifying that a text satisfies properties concerning causality of actions and timing constraints. We present an experimental, semi-automatic aid to bridge the gap between a normative text and its formal representation. Our approach uses dependency trees combined with our own rules and heuristics for extracting the relevant components. The resulting tabular data can then be converted into a C-O Diagram.
Submitted 6 July, 2016;
originally announced July 2016.
-
The Role of CNL and AMR in Scalable Abstractive Summarization for Multilingual Media Monitoring
Authors:
Normunds Gruzitis,
Guntis Barzdins
Abstract:
In the era of Big Data and Deep Learning, there is a common view that machine learning approaches are the only way to achieve robust and scalable information extraction and summarization. It has been recently proposed that the CNL approach could be scaled up, building on the concept of embedded CNL and, thus, allowing for CNL-based information extraction from e.g. normative or medical texts that are rather controlled by nature but still infringe the boundaries of CNL. Although it is arguable whether CNL can be exploited for robust wide-coverage semantic parsing in use cases like media monitoring, its potential becomes much more obvious in the opposite direction: generation of story highlights from the summarized AMR graphs, which is the focus of this position paper.
Submitted 20 June, 2016;
originally announced June 2016.
-
Polysemy in Controlled Natural Language Texts
Authors:
Normunds Gruzitis,
Guntis Barzdins
Abstract:
Computational semantics and logic-based controlled natural languages (CNL) do not systematically address the word sense disambiguation problem of content words, i.e., they tend to interpret only some functional words that are crucial for construction of discourse representation structures. We show that micro-ontologies and multi-word units allow integration of the rich and polysemous multi-domain background knowledge into CNL, thus providing interpretation for the content words. The proposed approach is demonstrated by extending Attempto Controlled English (ACE) with polysemous and procedural constructs, resulting in a more natural CNL named PAO covering narrative multi-domain texts.
Submitted 20 November, 2015;
originally announced November 2015.
-
A Multilingual FrameNet-based Grammar and Lexicon for Controlled Natural Language
Authors:
Normunds Gruzitis,
Dana Dannélls
Abstract:
Berkeley FrameNet is a lexico-semantic resource for English based on the theory of frame semantics. It has been exploited in a range of natural language processing applications and has inspired the development of framenets for many languages. We present a methodological approach to the extraction and generation of a computational multilingual FrameNet-based grammar and lexicon. The approach leverages FrameNet-annotated corpora to automatically extract a set of cross-lingual semantico-syntactic valence patterns. Based on data from Berkeley FrameNet and Swedish FrameNet, the proposed approach has been implemented in Grammatical Framework (GF), a categorial grammar formalism specialized for multilingual grammars. The implementation of the grammar and lexicon is supported by the design of FrameNet, providing a frame semantic abstraction layer, an interlingual semantic API (application programming interface), over the interlingual syntactic API already provided by GF Resource Grammar Library. The evaluation of the acquired grammar and lexicon shows the feasibility of the approach. Additionally, we illustrate how the FrameNet-based grammar and lexicon are exploited in two distinct multilingual controlled natural language applications. The produced resources are available under an open source license.
Submitted 12 November, 2015;
originally announced November 2015.
-
FrameNet Resource Grammar Library for GF
Authors:
Normunds Gruzitis,
Peteris Paikens,
Guntis Barzdins
Abstract:
In this paper we present ongoing research investigating the possibility and potential of integrating frame semantics, particularly FrameNet, in the Grammatical Framework (GF) application grammar development. An important component of GF is its Resource Grammar Library (RGL) that encapsulates the low-level linguistic knowledge about morphology and syntax of currently more than 20 languages, facilitating rapid development of multilingual applications. In the ideal case, porting a GF application grammar to a new language would only require introducing the domain lexicon - translation equivalents that are interlinked via common abstract terms. While it is possible for a highly restricted CNL, developing and porting a less restricted CNL requires above-average linguistic knowledge about the particular language, and above-average GF experience. Specifying a lexicon is mostly straightforward in the case of nouns (incl. multi-word units); however, verbs are the most complex category (in terms of both inflectional paradigms and argument structure), and adding them to a GF application grammar is not a straightforward task. In this paper we are focusing on verbs, investigating the possibility of creating a multilingual FrameNet-based GF library. We propose an extension to the current RGL, allowing GF application developers to define clauses on the semantic level, thus leaving the language-specific syntactic mapping to this extension. We demonstrate our approach by reengineering the MOLTO Phrasebook application grammar.
Submitted 26 June, 2014;
originally announced June 2014.
-
Controlled Natural Language Generation from a Multilingual FrameNet-based Grammar
Authors:
Dana Dannélls,
Normunds Grūzītis
Abstract:
This paper presents a currently bilingual but potentially multilingual FrameNet-based grammar library implemented in Grammatical Framework. The contribution of this paper is two-fold. First, it offers a methodological approach to automatically generate the grammar based on semantico-syntactic valence patterns extracted from FrameNet-annotated corpora. Second, it provides a proof of concept for two use cases illustrating how the acquired multilingual grammar can be exploited in different CNL applications in the domains of arts and tourism.
Submitted 9 June, 2014;
originally announced June 2014.
-
Extracting a bilingual semantic grammar from FrameNet-annotated corpora
Authors:
Dana Dannélls,
Normunds Grūzītis
Abstract:
We present the creation of an English-Swedish FrameNet-based grammar in Grammatical Framework. The aim of this research is to make existing framenets computationally accessible for multilingual natural language applications via a common semantic grammar API, and to facilitate the porting of such grammar to other languages. In this paper, we describe the abstract syntax of the semantic grammar while focusing on its automatic extraction possibilities. We have extracted a shared abstract syntax from ~58,500 annotated sentences in Berkeley FrameNet (BFN) and ~3,500 annotated sentences in Swedish FrameNet (SweFN). The abstract syntax defines 769 frame-specific valence patterns that cover 77.8% of the examples in BFN and 74.9% in SweFN belonging to the shared set of 471 frames. As a side result, we provide a unified method for comparing semantic and syntactic valence patterns across framenets.
Submitted 8 April, 2014;
originally announced April 2014.
-
Verbalizing Ontologies in Controlled Baltic Languages
Authors:
Normunds Grūzītis,
Gunta Nešpore,
Baiba Saulīte
Abstract:
Controlled natural languages (mostly English-based) have recently emerged as seemingly informal supplementary means for OWL ontology authoring, if compared to the formal notations that are used by professional knowledge engineers. In this paper we present, by example, a controlled Latvian language that has been designed to be compliant with the state-of-the-art Attempto Controlled English. We also discuss its relation to a controlled Lithuanian language that is being designed in parallel.
Submitted 2 November, 2012;
originally announced November 2012.