ModelWriter: Text & Model-Synchronized Document Engineering Platform

Ferhat Erata14, Claire Gardent2, Bikash Gyawali2, Anastasia Shimorina2, Yvan Lussaud6, Bedir Tekinerdogan1,
Geylani Kardas57 and Anne Monceaux3 1Information Technology Group, Wageningen University and Research Centre, The Netherlands 2CNRS, LORIA, UMR 7503 Vandoeuvre-les-Nancy, F-54500, Nancy, France 3System Engineering Platforms, Airbus Group Innovations, Toulouse, France 4UNIT Information Technologies R&D Ltd., Izmir, Turkey 5Ege University, International Computer Institute, Izmir, Turkey 7KoçSistem Information and Communication Services Inc. Istanbul, Turkey 6OBEO, Nantes, France

Abstract

The ModelWriter platform provides a generic framework for automated traceability analysis. In this paper, we demonstrate how this framework can be used to trace the consistency and completeness of technical documents that consist of a set of System Installation Design Principles used by Airbus to ensure the correctness of aircraft system installation. We show in particular, how the platform allows the integration of two types of reasoning: reasoning about the meaning of text using semantic parsing and description logic theorem proving; and reasoning about document structure using first-order relational logic and finite model finding for traceability analysis.

https://itea3.org/project/modelwriter.html

I Introduction

The complexity of software systems in safety critical domains (e.g. avionics and automotive) has significantly increased over the years. Development of such systems requires various phases which result in several artifacts (e.g., requirements documents, architecture models and test cases). In this context, traceability [1, 2] not only establishes and maintains consistency between these artifacts but also helps guarantee that each requirement is fulfilled by the source code and test cases properly cover all requirements, a very important objective in safety critical systems and the standards they need to comply with DO-178C (Software Considerations in Airborne Systems and Equipment Certification) [3] and ISO-26262 (Road Vehicles - Functional Safety) [4]. As a result, the engineers have to establish and maintain several types of traces, having different semantics, between and within various development artifacts.

Traceability is a quality concern that helps users understand each and every steps in the development or even the end to end life cycle of a product. Its implementation is highly contextual as the key artifacts produced or used along a process differ depending on the product. We want to provide a framework for users to specify which artifacts they want to precisely identify and monitor and what is the meaning for trace links between these artifacts.

The considered Artifacts represented in our context by trace locations might be of different levels of granularity, ranging from a complete document or model to fragments of text or code. Focusing on documents and text, both the structure and the content might be used to reason about traceability.

To this end, ModelWriter platform provides a generic traceability analysis applicable to Text & Model artifacts. Trace locations can be fragments of text, elements of an architectural model, and parts of program codes. Traces are relations between trace locations. ModelWriter platform allows axiomatization of these relations and reasoning about them, i.e. supporting traceability analysis for different types of artifacts.

In this paper, we focus on demonstrating the features of ModelWriter platform for the traceability analysis applied to technical documentation. A particular challenge in this use case is to take into account the meaning of natural language. We integrate techniques from Natural Language Processing (NLP) and Automated Reasoning to reason both about the meaning and about the structure of text. We use techniques from semantic parsing to assign formal meaning representations to NL text. We then use techniques from theorem proving and model building to infer traceability relations between text fragments (here SIDPs), to check consistency and to ensure completeness.

II The Airbus SIDP Usecase

We illustrate the workings of the ModelWriter platform based on a set of System Installation Design Principles (SIDP) used by Airbus to ensure the correctness of aircraft design. A SIDP rule is actually a kind of system installation requirement, that is a description of system properties which should be fulfilled. In this usecase, SIDPs are trace locations and there are five types of trace links defined between trace locations, namely contains, refines, conflicts, equals, and requires. In the following, we informally give the meaning of the trace-types.

Rule $r_{1}$ contains Rule $r_{2}\dots r_{n}$ if $r_{2}\dots r_{n}$ are parts of the whole $r_{1}$ (part-whole hierarchy). The contained rule is a sub-rule of containing rule. Rule $r_{1}$ refines another Rule $r_{2}$ if $r_{1}$ is derived from $r_{2}$ by adding more details to its properties. The refined rule can be seen as an abstraction of the detailed rules. In Fig. 1 contains and refines traces are illustrated. Each box represents a property of the corresponding rule.

Figure 1:

r_{1}

contains

r_{2}

and

r_{3}

r_{4}

refines

r_{5}

Rule $r_{1}$ conflicts with Rule $r_{2}$ if the fulfillment of $r_{1}$ excludes the fulfillment of $r_{2}$ and vice versa. The existence of a conflict trace indicates an inconsistency between two rules. Rule $r_{1}$ equals to Rule $r_{2}$ if $r_{1}$ states exactly the same properties with their constraints with $r_{2}$ and vice versa. Rule $r_{1}$ requires Rule $r_{2}$ if $r_{1}$ is fulfilled only when $r_{2}$ is fulfilled. The required rule can be seen as a pre-condition for the requiring rule. In the following Fig. 2 conflicts, equals and requires traces are illustrated.

Figure 2: Illustration of “conflicts, equals and requires”

Given a set of SIDPs, the ModelWriter platform can be used to check completeness and consistency as follows. First, SIDPs are parsed and assigned Description Logic formulae representing their meaning (cf. Section III-A). Second, traces are either manually specified by the end user or can be inferred using semantic parsing and DL theorem proving (cf. Section III-B). Third, new traces can be inferred upon existing ones using Relational Logic (cf. Section III-C) and Model Finding (cf. Section III-D). Importantly, the inference of trace links allows for the detection of missing or inconsistent SIDPs.

Table I illustrates this process. Given the SIDPs $r_{1}$ - $r_{6}$ , conflicts and refines trace links are first inferred using semantic parsing and the Hermit theorem prover [5] (DL lines in the table).

Table I: Example SIDPs and inference of Trace Links

Nr.	Artifact Annotations (Trace-locations)
$\text{r}_{\text{1}}$	Bracket shall be used in hydraulic area Alpha
$\text{r}_{\text{2}}$	Adhesive bonded bracket shall be used in hydraulic area
$\text{r}_{\text{3}}$	Adhesive bonded bracket shall be used in hydraulic area Alpha
$\text{r}_{\text{4}}$	Bracket shall be used in hydraulic area
$\text{r}_{\text{5}}$	Bracket shall be installed in hydraulic area
$\text{r}_{\text{6}}$	Bracket shall be installed in fuel tank

Nr.	Inferred Traces	Nr.	Inferred Traces
$\text{DL}_{\text{1}}$	$conflicts(r_{5},r_{6})$	$\text{RL}_{\text{1}}$	$conflicts(r_{6},r_{4})$
$\text{DL}_{\text{2}}$	$refines(r_{3},r_{2})$	$\text{RL}_{\text{2}}$	$requires(r_{1},r_{5})$
$\text{DL}_{\text{3}}$	$refines(r_{2},r_{4})$	$\text{RL}_{\text{3}}$	$conflicts(r_{6},r_{1})$
$\text{DL}_{\text{4}}$	$refines(r_{1},r_{4})$	$\text{RL}_{\text{4}}$	$requires(r_{2},r_{5})$
$\text{DL}_{\text{5}}$	$requires(r_{4},r_{5})$	$\text{RL}_{\text{5}}$	$conflicts(r_{2},r_{6})$

For example, the DL formulae obtained by parsing sentences $r_{5}$ and $r_{6}$ conflict with each other because the underlying ontology to which these axioms are added specifies that concepts “hydraulic area” and “fuel tank” are disjoint. Similarly, the axiom obtained for the sentence $r_{2}$ refines the axiom obtained for $r_{4}$ because the ontology specifies that “Bracket” is a sub concept of “Adhesive bonded bracket”. In Fig. 3, Table I is represented as a digraph model in which the nodes represent trace-locations, i.e. SIDP rules listed in the table and edges represents traces. A red edge specifically corresponds to the trace inferred using semantic parsing and DL theorem proving. The black one is an example trace, $refines(r_{3},r_{6})$ created by the user manually.

Figure 3: Inferred Traces (red traces indicate reasoning using DL, blue indicates reasoning using RL, the black one indicates a manual trace)

Later, additional trace links are inferred using Relational Model Finding (RL lines in the Table I and dashed blue edges on Fig. 3). For instance, as part of the trace semantics of this use case, according to the axiom schema (3) formalized in Section III-C where a, b and c are artifact elements, if a refines, requires oder contains b, while b conflicts with c, then a also conflicts with c. In this way, ModelWriter generates conflicts traces such that combination of $conflicts(r_{5},r_{6})$ and $requires(r_{4},r_{5})$ makes $conflicts(r_{6},r_{4})$ ; on the other hand, according to axiom schema (1) described in Section III-C, the combination of $refines(r_{2},r_{4})$ and $requires(r_{4},r_{5})$ generates $requires(r_{2},r_{5})$ corresponding to the patterns shown in Fig. 4.

Figure 4: Inferring “requires” with “refines” and inferring “conflicts”

Finally, in this example, DL-based reasoning process inferred only one conflicts trace using the meaning of the sentences, i.e. $r_{5}$ conflicts with $r_{6}$ whereas the ModelWriter detects three more conflicts traces using the meaning of trace types by means of RL-based reasoning on top of DL-based reasoning. As a result, it can be seen that not only $r_{5}$ and $r_{6}$ but also $r_{4}$ , $r_{1}$ , and $r_{2}$ are inconsistent.

III Overview of the Approach

We now describe the four main modules making up the ModelWriter platform. Section III-A introduces the semantic parser, i.e., the module that converts text to Description Logic formulae. Section III-B explains how the Hermit reasoner can be used to detect refines, conflicts and equals trace links between text fragments (here, SIDPs). Section III-C shows how Alloy formalism [6] can be customized to axiomatize trace types and semantics. Finally, Section III-D explains how the KodKod model finder [7] is used to infer new trace links between SIDPs to detect the inconsistent SIDPs.

III-A Mapping Text to Description Logic Formulae

The semantic parser used in ModelWriter to convert text to DL formulae is described in details in [8]. In what follows, we briefly summarize its working and some evaluation results on a set of 960 SIDPs used for testing.

The ModelWriter semantic processing framework combines an automatically derived lexicon, a small hand-written grammar, a parsing algorithm to convert text to DL formulae and a generation algorithm to generate text from DL formulae. This framework is modular, robust and reversible. It is modular in that, different lexicons or grammars may be plugged to meet the requirements of the semantic application being considered. For instance, the lexicon (which relates words and concepts) could be built using a concept extraction tool, i.e. a text mining tool that extracts concepts from text (e.g., [9]). And the grammar could be replaced by a grammar describing the syntax of other document styles such as cooking recipes. It is robust in that, in the presence of unknown words, the parser can skip words and deliver a connected (partial) parse. And, it is reversible in that the same grammar and lexicon can be used both for parsing and for generation. Fig. 5 outlines our approach showing the interaction of various components.

Figure 5: Parsing and Generation of Airbus SIDPs.

The lexicon maps verbs and noun phrases to grammar rules and to complex and simple concepts respectively. Fig. 6 shows an illustrating example with a lexical entry on the left and the corresponding grammar unit on the right. During generation/parsing, the semantic literals listed in the lexicon (here, Use and useArg2inv) are used to instantiate the variables (here, A2 and Rel) in the semantic schema (here, L ${}_{0}$ :subset(X,L ${}_{1}$ ) L ${}_{2}$ :exists(A2,L ${}_{3}$ ) L ${}_{3}$ :Rel(Y)). Similarly, the Anchor value (used) is used to label the terminal node marked with the anchor sign ( $\diamond$ ) and each coanchor is used to label the terminal node with corresponding name. Thus, the strings shall and be will be used to label the terminal nodes $V1$ and $V2$ respectively. Importantly, this separation between grammar and lexicon supports modularity in that e.g., different lexicons and/or grammars could be plugged into the system. For the work presented here, we built this lexicon by applying regular expressions and a customised NP chunker (the NLTK regular expression chunker) to extract verbal and nominal lexical entries from SIDPs.

Refer to caption — Figure 6: Example Lexical Entry and Grammar Unit

The grammar provides a declarative specification of how text relates to meaning (as represented by OWL DL [10] formulae). We use a Feature-Based Lexicalised Tree Adjoining Grammar (FB-LTAG) [11] augmented with a unification-based flat semantics. Fig. 7 shows an example FB-LTAG for the words “not”, “pipes” and “shall be used”. An FB-LTAG tree is a set of initial and auxiliary trees which have been lexicalised using the lexicon and can be combined using either substitution or adjunction. Auxiliary trees are trees such as the tree for “not” which contains a foot node (marked with *) whose category (here AUX) matches that of the root node. Initial trees are trees such as that of “pipes” and “shall be used” whose terminal nodes may be substitution nodes (marked with $\downarrow$ ). Substitution inserts a tree with root category $C$ into a substitution node of the same category. For instance, the tree for “pipes” may be substituted in the $NP_{\downarrow}^{Y}$ node of th “shall be used” tree. Adjunction inserts an auxiliary tree with foot node category $C$ into a tree at a node of category $C$ . For instance, the tree for “not” may be adjoined into the tree for “shall be used” at the AUX node.

The parser and the generator exploit the grammar and the lexicon to map natural language to OWL DL formulae (semantic parsing) and OWL DL formulae to natural language (generation) respectively. For instance, given the sentence “Pipes shall not be used”, the parser will first select the grammar trees associated with “Pipes”, “shall be used” and “not” and then combines these trees using substitution and adjunction. As shown in Fig. 8, the semantics derived for the input sentence is then the union of the semantics of these trees modulo unification. Conversely, given the flat semantics shown in the figure the generator will generate the sentence “Pipes shall not be used” by first, selecting grammar trees whose semantics subsumes the input and then combining them using substitution and operation. The generated sentences are given by the yield of the derived trees whose root is of category S (sentence) and whose semantics is exactly the input semantics.

While the grammar integrates a so-called flat semantics, as shown in Fig. 9, there is a direct translation from this semantics to OWL functional syntax. Further details about Semantic Parser can be found at:

https://github.com/ModelWriter/Deliverables/tree/master/WP2

\tau(\phi)=\begin{cases}\mbox{ObjectSomeValuesFrom}(\mbox{:R }\tau(C))&\text{% if }\phi=l_{i}:exists(R,l_{j})\;\;l_{j}:C\\ \mbox{SubClassOf}(\tau(C_{1})\;\tau(C_{2}))&\text{if }\phi=l_{i}:subset(l_{j},% l_{k})\;\;l_{j}:C_{1}\;\;l_{k}:C_{2}\\ \mbox{ObjectIntersectionOf}(\tau(C_{1})\;\tau(C_{2}))&\text{if }\phi=l_{i}:and% (l_{j},l_{k})\;\;l_{j}:C_{1}\;\;l_{k}:C_{2}\\ (\tau(C1)\sqcap\tau(C2))&\text{if }\phi=l_{i}:and(l_{j},l_{k})\;\;l_{j}:C1\;\;% l_{k}:C2\\ (\tau(C1)\sqcup\tau(C2))&\text{if }\phi=l_{i}:or(l_{j},l_{k})\;\;l_{j}:C1\;\;l% _{k}:C2\\ \mbox{not}(\tau(C))&\text{if }\phi=l_{i}:not(l_{j})\;\;l_{j}:C\\ \mbox{R${}^{-}$}&\text{if }\phi=Rinv\\ \mbox{C}&\text{if }\phi=l_{i}:C(x)\\ \end{cases}

Figure 9: Mapping Flat Semantics to Owl Functional Syntax

III-B Inferring Traces using DL Theorem Proving

We use Hermit theorem prover to detect inconsistencies, entailment and equivalence between two SIDPs $s_{1}$ and $s_{2}$ . Given the DL formulae $\phi_{1}$ and $\phi_{2}$ associated by the semantic parsing process to $s_{1}$ and $s_{2}$ , we determine these relations as follows: (i) if $\phi_{1}\sqcap\phi_{2}$ is not satisfiable, we infer a conflicts trace between $s_{1}$ and $s_{2}$ , (ii) if $\neg\phi_{1}\sqcup\phi_{2}$ is satisfiable, we infer a requires trace between $s_{1}$ and $s_{2}$ , and (iii) if $\phi_{1}\equiv\phi_{2}$ is satisfiable, we infer an equals trace between $s_{1}$ and $s_{2}$ .

III-C Formal Semantics of Trace-types

Tarski is the module of ModelWriter approach for automated reasoning about traces based on configurable trace semantics, recently described in [12]. The tool provides an enhanced text editor to allow users to define new trace types in a restricted form of Alloy [6], i.e., First-Order Relational Logic.

In the following, we axiomatize trace semantics based on the informal definition explained in Section II using First-order Predicate Logic with the signature:

	$\displaystyle\Sigma_{T}:\{=,\in\}\cup\Sigma^{1}_{T}\cup\Sigma^{2}_{T}$
	$\displaystyle\Sigma^{1}_{T}:\{Artifact,Requirement,Specification\}$
	$\displaystyle\Sigma^{2}_{T}:\{requires,refines,contains,equals,conflicts\}$

$\Sigma^{1}_{T}$ is the set of unary predicate symbols and $\Sigma^{2}_{T}$ is the set of binary predicate symbols. For simplicity, we assume that the universe only consists of the type, Artifact which is partitioned into disjoint subsets of Requirement and Specification. From now on, $A$ represents the set of Artifacts. $=$ and $\in$ symbols are interpreted and represent equality and membership respectively. In the following several axiom schemas are listed to formalize Traceability Theory, that is used in the SIDP case.

Reasoning about requires traces is stated as follows:

	$\displaystyle\vdash\forall a,b,c\in A\mid(a,b)\in\operatorname{\mathrel{% \square}}\wedge\,(b,c)\in\operatorname{\mathrel{require}}\to(a,c)\in% \operatorname{\mathrel{require}}$		(1)
	$\displaystyle\vdash\forall a,b,c\in A\mid(a,b)\in\operatorname{\mathrel{% require}}\wedge\,(b,c)\in\operatorname{\mathrel{\square}}\to(a,c)\in% \operatorname{\mathrel{require}}$		(2)
	$\displaystyle\mbox{\quad where \>}\,\square\in\{\operatorname{\mathrel{% requires}},\operatorname{\mathrel{refines}},\operatorname{\mathrel{contains}}\}$

The following axiom schema is being used for generating conflicts traces.

	$\displaystyle\vdash\forall a,b,c\in A\mid(a,b)\in\square\wedge(b,c)\in% \triangle\to(a,c)\in\triangle$		(3)
	$\displaystyle\vdash\forall a\in A\mid(a,a)\in\triangle$		(4)
	$\displaystyle\mbox{\quad where \>}\square\in\{\operatorname{\mathrel{requires}% },\operatorname{\mathrel{refines}},\operatorname{\mathrel{contains}}\}\mbox{ % and \>}\triangle=\operatorname{\mathrel{conflicts}}$

Reasoning about equals traces:

	$\displaystyle\vdash\forall a,b,c\in A\mid(a,b)\in\operatorname{\mathrel{equals% }}\wedge\,(b,c)\in\square\to(a,c)\in\square$		(5)
	$\displaystyle\vdash\forall a,b,c\in A\mid(a,b)\in\operatorname{\mathrel{equals% }}\wedge\,(c,b)\in\square\to(c,a)\in\square$		(6)
	$\displaystyle\vdash\forall a\in A\mid(a,a)\in\operatorname{\mathrel{equals}}$		(7)
	$\displaystyle\mbox{where \>}\square\in\{\operatorname{\mathrel{contains}},% \operatorname{\mathrel{requires}},\operatorname{\mathrel{refines}},% \operatorname{\mathrel{conflicts}}\}$

In the following axiom schema, transitivity (8) is used for reasoning new traces, whereas anti-symmetry (9) and irreflexivity (10) are used to check consistency.

	$\displaystyle\vdash\forall a,b,c\in A\mid(a,b)\in\square\wedge(b,c)\in\square% \to(a,c)\in\square,$		(8)
	$\displaystyle\vdash\forall a,b\in A\mid(a,b)\in\square\wedge(b,a)\in\square\to a% =b,$		(9)
	$\displaystyle\vdash\forall a\in A\mid(a,a)\notin\square,$		(10)
	$\displaystyle\mbox{where \>}\square\in\{\operatorname{\mathrel{contains}},% \operatorname{\mathrel{requires}},\operatorname{\mathrel{refines}}\}$

contains traces is left-unique (injective relation) in some scenarios that induces an inconsistency when transitivity axiom (8) for contains is instantiated in the specification.

	$\displaystyle\vdash\forall a,a^{\prime},b\in A\mid(a,b)\in\square\wedge(a^{% \prime},b)\in\square\to a=a^{\prime}$		(11)
	$\displaystyle\mbox{\quad where \>}\,\square=\operatorname{\mathrel{contains}}$

We encode above axioms in First-order Relational Logic using the Tarski’s text editor to configure the Tarski module (see Figure 10).

III-D Inferring Trace Links using Model Finding

We employ Kodkod [13, 7], an efficient SAT-based constraint solver for FOL with relational algebra and partial models, for automated trace reasoning using the trace semantics that user provides. Once the user performs reasoning operations about traces, the result is reported back to the user by dashed traces as shown in Fig. 11. If there exists different solutions, the user can traverse them back and forth. He can also accept the inferred traces, and perform another analysis operation including inferred traces. Further details about Tarski can be found at:

https://modelwriter.github.io/Tarski/

IV Evaluation

We evaluate Semantic Parsing approach of ModelWriter on a dataset of 960 SIDPs provided by Airbus which demonstrate (i) that the approach is robust (97.50% of the SIDPs can be parsed) and (ii) that DL axioms assigned to full parses are very likely to be correct in 96% of the cases. Regarding inference phase, since we observed that DL-based reasoning is relatively faster than the SAT-based reasoning in the context of SIDP case, we only focus on the Tarski module to evaluate the performance of ModelWriter approach. Table II shows the solving results of three configurations of the formal trace specification running with Alloy Analyzer [6], KodKod[7] and Z3[14]. Minisat[15] SAT Solver is chosen for both Alloy (alloy4.2-2015-02-22.jar) and KodKod (Kodkod 2.1) solvers. From Alloy to SMT solver translation for these cases, we employ the translation method proposed by El Ghazi et.al. [16] and the problems are encoded in SMT-LIB [17] syntax which is fed into Z3 solver. Transitive closure and integer arithmetic are not used in these use cases to fairly benchmark the results with the SMT solver. In SMT-LIB, the logic is set for Equality Logic with Uninterpreted Functions (UF).

Table II: Comparisons of Several Use Cases for Trace Inferring

	Artifacts	Traces	Inferred	Alloy	KodKod	Z3
$\#1$	123	102	89	67922	25668	40900
$\#2$	56	27	25	4428	84	480
$\#3$	42	103	75	724	1	1460

Evaluation results are obtained on a machine, that runs 64 bit debian linux operating system with 8 GB of memory and 2.90GHz Intel i7-3520M CPU. Solving times are indicated in milliseconds. The best results are obtained by the direct use of KodKod API since to find satisfiable models, KodKod allows us to configure lower and upper bounds for the solution space employing different pre-processing techniques such as slicing, incremental upper bounds and unrolling transitive closures. The evaluation shows that our tool is practical and beneficial in industrial settings to specify trace semantics for automated trace reasoning. We plan to conduct more case studies to better evaluate the practical utility and usability of the platform.

V Related Work

Many existing works on semantic parsing describe the task of obtaining axiomatic representation of natural language sentences. However, they suffer from two main limitations: (i) use of controlled languages such as Attempto Controlled English[18] (e.g. [19, 20]) and/or (ii) inability to deduce complex axioms involving logical connectives, role restrictions and other expressive features of OWL (e.g. [21, 22]), as noted in [23]. In contrast, we work on human authored real-world text (Airbus SIDPs) and produce complex OWL axioms involving the following DL constructs: $\top$ (the most general concept), disjunction, conjunction, negation, role inverse, universal and existential restrictions. Moreover, we extended the scope of our application by deducing traces among the semantic parse outputs. Such traces were then used as baseline input to Tarski platform which could infer additional traces propagating over the whole system.

Similarly, several approaches and tools have been proposed for automated trace reasoning using the trace semantics [24, 25, 26, 27, 28, 29, 30, 31]. These approaches employ a predefined set of trace types and their corresponding semantics. For instance, Goknil et al. [29] provide a tool for inferencing and consistency checking of traces between requirements using a set of trace types and their formal semantics. Similarly, Egyed and Grünbacher [25] propose a trace generation approach. They do not allow the user to introduce new trace types and their semantics for automated reasoning. In the development of complex systems, it is required to enable the adoption of various trace types, and herewith automated reasoning using their semantics. Tarski module of ModelWriter allows the user to interactively define new trace types with their semantics to be used in automated reasoning about traces.

VI Conclusion

We presented an integrated platform for automatically mapping natural language text to trace types and performing further inferencing on those traces. Starting with the semantic parser module, we showed how complex axioms could be derived to represent text coming from real world use case. We identified the traces among the parse outputs and fed it to the Tarski tool. The Tarski tool, in turn, allowed users to specify configurable trace semantics for various forms of automated trace reasoning such as inferencing and consistency checking. The key characteristics of our tool are (1) automatic identification of traces existing in texts using semantic parsing (2) allowing user to define new trace types and their semantics which can be later configured, (3) deducing new traces based on the traces which the user has already specified, and (4) identifying traces whose existence causes a contradiction.

Acknowledgment

This work is conducted within ModelWriter project[32] labeled by the European Union’s EUREKA Cluster programme ITEA and partially supported by the Scientific and Technological Research Council of Turkey under project #9140014, #9150181 and Industry and Digital Affairs of France, Directorate-General for Enterprise under contract #142930204. The authors would like to acknowledge networking support by European Cooperation in Science and Technology Action IC1404 ”Multi-Paradigm Modelling for Cyber-Physical Systems”.

References

[1] B. Ramesh and M. Jarke, “Toward reference models for requirements traceability,” IEEE Transactions on Software Engineering, vol. 27, no. 1, pp. 58–93, 2001.
[2] I. C. Society, P. Bourque, and R. E. Fairley, Guide to the Software Engineering Body of Knowledge (SWEBOK(R)): Version 3.0, 3rd ed. Los Alamitos, CA, USA: IEEE Computer Society Press, 2014.
[3] RTCA and EUROCAE, “DO-178C: Software considerations in airborne systems and equipment certification,” 2017.
[4] ISO, “ISO-26262: Road vehicles – functional safety,” 2017.
[5] R. Shearer, B. Motik, and I. Horrocks, “Hermit: A highly-efficient owl reasoner.” in 5th OWL Experienced and Directions Workshop, vol. 432, 2008, p. 91.
[6] D. Jackson, Software Abstractions: Logic, Language, and Analysis. MIT press, 2012.
[7] E. Torlak, “A constraint solver for software engineering: Finding models and cores of large relational specifications,” Ph.D. dissertation, Massachusetts Institute of Technology, 2008.
[8] B. Gyawali, A. Shimorina, C. Gardent, S. Cruz-Lara, and M. Mahfoudh, “Mapping natural language to description logic,” in The Semantic Web: 14th International Conference, ESWC 2017, Portorož, Slovenia, May 28 – June 1, 2017, Proceedings, Part I, E. Blomqvist, D. Maynard, A. Gangemi, R. Hoekstra, P. Hitzler, and O. Hartig, Eds. Cham: Springer International Publishing, 2017, pp. 273–288. [Online]. Available: http://dx.doi.org/10.1007/978-3-319-58068-5_17
[9] E. Bozsak, M. Ehrig, S. Handschuh, A. Hotho, A. Maedche, B. Motik, D. Oberle, C. Schmitz, S. Staab, L. Stojanovic et al., “KAON – towards a large scale Semantic Web,” in E-Commerce and Web Technologies. Springer, 2002, pp. 304–313.
[10] D. L. McGuinness, F. Van Harmelen et al., “OWL web ontology language overview,” W3C recommendation, vol. 10, no. 10, p. 2004, 2004.
[11] C. Gardent and L. Kallmeyer, “Semantic construction in feature-based TAG,” in Proceedings of EACL. Association for Computational Linguistics, 2003, pp. 123–130.
[12] F. Erata, M. Challenger, B. Tekinerdogan, A. Monceaux, E. Tüzün, and G. Kardas, “Tarski: A platform for automated analysis of dynamically configurable traceability semantics,” in Proceedings of the Symposium on Applied Computing, ser. SAC ’17. New York, NY, USA: ACM, 2017, pp. 1607–1614.
[13] E. Torlak and D. Jackson, “Kodkod: A relational model finder,” in the 13th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS’07), 2007, pp. 632–647.
[14] L. D. Moura and N. Bjørner, “Z3: An efficient smt solver,” in International conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS’08), 2008, pp. 337–340.
[15] N. Eén and N. Sörensson, “An extensible sat-solver,” in the 6th International Conference on Theory and Applications of Satisfiability Testing (SAT’03), 2003, pp. 502–518.
[16] A. A. E. Ghazi and M. Taghdiri, “Relational reasoning via smt solving,” in the 17th International Conference on Formal Methods (FM’11), 2011, pp. 133–148.
[17] C. Barrett, A. Stump, C. Tinelli et al., “The smt-lib standard: Version 2.0,” in the 8th International Workshop on Satisfiability Modulo Theories, vol. 13, 2010, p. 14.
[18] K. Kaljurand and N. E. Fuchs, “Verbalizing OWL in Attempto Controlled English,” in OWLED, vol. 258, 2007.
[19] V. Tablan, T. Polajnar, H. Cunningham, and K. Bontcheva, “User-friendly ontology authoring using a controlled language,” in Proceedings of LREC, 2006.
[20] A. Bernstein, E. Kaufmann, A. Göhring, and C. Kiefer, “Querying ontologies: A controlled english interface for end-users,” in International Semantic Web Conference. Springer, 2005, pp. 112–126.
[21] P. Buitelaar, P. Cimiano, and B. Magnini, Ontology learning from text: methods, evaluation and applications. IOS press, 2005, vol. 123.
[22] M. Ruiz-Casado, E. Alfonseca, and P. Castells, “Automatic Extraction of Semantic Relationships for Wordnet by Means of Pattern Learning from Wikipedia,” in International Conference on Application of Natural Language to Information Systems. Springer, 2005, pp. 67–79.
[23] J. Völker, P. Hitzler, and P. Cimiano, “Acquisition of OWL DL axioms from lexical resources,” in European Semantic Web Conference. Springer, 2007, pp. 670–685.
[24] N. Aizenbud-Reshef, R. F. Paige, J. Rubin, Y. Shaham-Gafni, and D. S. Kolovos, “Operational semantics for traceability,” in ECMDA Traceability Workshop (ECMDA-TW’05), 2005, pp. 8–14.
[25] A. Egyed and P. Grünbacher, “Supporting software understanding with automated requirements traceability,” International Journal of Software Engineering and Knowledge Engineering, vol. 15, no. 5, pp. 783–810, 2005.
[26] A. Egyed, “A scenario-driven approach to trace dependency analysis,” IEEE Transactions on Software Engineering, vol. 29, no. 2, pp. 116–132, 2003.
[27] J. Cleland-Huang, C. K. Chang, and M. J. Christensen, “Event-based traceability for managing evolutionary change,” IEEE Transactions on Software Engineering, vol. 29, no. 9, pp. 796–810, 2003.
[28] L. C. Lamb, W. Jirapanthong, and A. Zisman, “Formalizing traceability relations for product lines,” in the 6th International Workshop on Traceability in Emerging Forms of Software Engineering (TEFSE’11), 2011, pp. 42–45.
[29] A. Goknil, I. Kurtev, K. van den Berg, and J.-W. Veldhuis, “Semantics of trace relations in requirements models for consistency checking and inferencing,” Software and System Modeling, vol. 10, no. 1, pp. 31–54, 2011.
[30] A. Goknil, I. Kurtev, and K. V. D. Berg, “Generation and validation of traces between requirements and architecture based on formal trace semantics,” Journal of Systems and Software, vol. 88, pp. 112–137, 2014.
[31] N. Drivalos, D. S. Kolovos, R. F. Paige, and K. J. Fernandes, “Engineering a dsl for software traceability,” in 1st International Conference on Software Language Engineering (SLE’08), 2008, pp. 151–167.
[32] F. Erata, “ModelWriter: Text & model synchronized document engineering platform,” https://itea3.org/project/modelwriter.html, Sep 2014.

Appendix A Availability & Open Source License

This work is being developed under technical Work Package 2 (Semantic Parser - Text Part), Work Package 3 (Tarski - Model Part), and Work Package 4 (Federated Knowledge Base) within ModelWriter project, labeled by the European Union’s EUREKA Cluster programme ITEA (Information Technology for European Advancement). Further details about the project can be found at:

https://itea3.org/project/modelwriter.html

A video demonstration of ModelWriter is available shows the use of ModelWriter in the context of the industrial use case SIDP presented in the paper. The video is available at:

https://youtu.be/TcVCosW8HkU

The source codes files and datasets of ModelWriter are publicly available for download and use at the project repository. Tarski and SemanticParser are components of ModelWriter platform. Source codes, screencast and datasets regarding the project are also available and can be found at:

https://github.com/ModelWriter/Deliverables/tree/master/WP2

https://modelwriter.github.io/Tarski/

https://github.com/ModelWriter/Source/

https://github.com/ModelWriter/Demonstration/

ModelWriter is distributed with an open source software license, namely Eclipse Public License v1. This commercially friendly copyleft license provides the ability to commercially license binaries; a modern royalty-free patent license grant; and the ability for linked works to use other licenses, including commercial ones.

https://github.com/ModelWriter/Source/blob/master/LICENSE

Appendix B Tool Demonstration Plan

There will be four parts to our presentation: (1) motivation and industrial use cases, (2) overview of the approach and tool architecture, (3) demonstration walktrough, and (4) evaluation. Parts 1, 2 and 4 are presented using slides while Part 3 is presented as a demo using the industrial use case scenario described in Section II. To present these parts, we use a combination of slides, animations, and a live demo. In the following subsections, we provide further details about our presentation plan.

B-A Motivation & Challenges

B-A1 Motivation

We will emphasize the importance of traceability by introducing ”DO-178C Software Considerations in Airborne Systems and Equipment Certification” [3] from aviation industry.

B-A2 Industrial Use Cases

We will briefly describe the challenges of Traceability Analysis Activities faced in industry by introducing industrial use cases from Airbus. We will explain the importance of semantically meaningful traceability, traceability configuration and automated traceability analysis in industry.

B-B Tool Overview

B-B1 Overview of the Solution.

We will explain the approach and the user workflow of ModelWriter by following the steps of Section III.

B-B2 ModelWriter Features

We will briefly explain tool features such as semantic parser and its reasoning engine and configurable automated traceability analysis using animated slides by giving concrete examples from the industrial use case, namely System Installation Design Principles (SIDP) presented in the paper.

B-C Walk-trough of the Tool Demonstration

In this section, we will perform a live demonstration aligns with the industrial use case SIDP, which is illustrated in Section II.

B-D Evaluation and Lessons Learned

We conclude with a summary that presents the evaluation results and the lessons learned.