Normative Requirements Operationalization with Large Language Models

Authors: Nick Feng, Lina Marsso, S. Getir Yaman, Isobel Standen, Yesugen Baatartogtokh, Reem Ayad, Victória Oldemburgo de Mello, Bev Townsend, Hanne Bartels, Ana Cavalcanti, Radu Calinescu, Marsha Chechik

Abstract: Normative non-functional requirements specify constraints that a system must observe in order to avoid violations of social, legal, ethical, empathetic, and cultural norms. As these requirements are typically defined by non-technical system stakeholders with different expertise and priorities (ethicists, lawyers, social scientists, etc.), ensuring their well-formedness and consistency is very challenging. Recent research has tackled this challenge using a domain-specific language to specify normative requirements as rules whose consistency can then be analysed with formal methods. In this paper, we propose a complementary approach that uses Large Language Models to extract semantic relationships between abstract representations of system capabilities. These relations, which are often assumed implicitly by non-technical stakeholders (e.g., based on common sense or domain knowledge), are then used to enrich the automated reasoning techniques for eliciting and analyzing the consistency of normative requirements. We show the effectiveness of our approach to normative requirements elicitation and operationalization through a range of real-world case studies.

Submitted 28 May, 2024; v1 submitted 18 April, 2024; originally announced April 2024.

arXiv:2402.19401 [pdf, other]

Assessing Visually-Continuous Corruption Robustness of Neural Networks Relative to Human Performance

Authors: Huakun Shen, Boyue Caroline Hu, Krzysztof Czarnecki, Lina Marsso, Marsha Chechik

Abstract: While Neural Networks (NNs) have surpassed human accuracy in image classification on ImageNet, they often lack robustness against image corruption, i.e., corruption robustness. Yet such robustness is seemingly effortless for human perception. In this paper, we propose visually-continuous corruption robustness (VCR) -- an extension of corruption robustness to allow assessing it over the wide and continuous range of changes that correspond to the human perceptive quality (i.e., from the original image to the full distortion of all perceived visual information), along with two novel human-aware metrics for NN evaluation. To compare VCR of NNs with human perception, we conducted extensive experiments on 14 commonly used image corruptions with 7,718 human participants and state-of-the-art robust NN models with different training objectives (e.g., standard, adversarial, corruption robustness), different architectures (e.g., convolution NNs, vision transformers), and different amounts of training data augmentation. Our study showed that: 1) assessing robustness against continuous corruption can reveal insufficient robustness undetected by existing benchmarks; as a result, 2) the gap between NN and human robustness is larger than previously known; and finally, 3) some image corruptions have a similar impact on human perception, offering opportunities for more cost-effective robustness assessments. Our validation set with 14 image corruptions, human robustness data, and the evaluation code is provided as a toolbox and a benchmark.

Submitted 29 February, 2024; originally announced February 2024.

arXiv:2401.05673 [pdf, other]

doi 10.1145/3597503.3639093

Analyzing and Debugging Normative Requirements via Satisfiability Checking

Authors: Nick Feng, Lina Marsso, Sinem Getir Yaman, Yesugen Baatartogtokh, Reem Ayad, Victória Oldemburgo de Mello, Beverley Townsend, Isobel Standen, Ioannis Stefanakos, Calum Imrie, Genaína Nunes Rodrigues, Ana Cavalcanti, Radu Calinescu, Marsha Chechik

Abstract: As software systems increasingly interact with humans in application domains such as transportation and healthcare, they raise concerns related to the social, legal, ethical, empathetic, and cultural (SLEEC) norms and values of their stakeholders. Normative non-functional requirements (N-NFRs) are used to capture these concerns by setting SLEEC-relevant boundaries for system behavior. Since N-NFRs need to be specified by multiple stakeholders with widely different, non-technical expertise (ethicists, lawyers, regulators, end users, etc.), N-NFR elicitation is very challenging. To address this challenge, we introduce N-Check, a novel tool-supported formal approach to N-NFR analysis and debugging. N-Check employs satisfiability checking to identify a broad spectrum of N-NFR well-formedness issues (WFI), such as conflicts, redundancy, restrictiveness, insufficiency, yielding diagnostics which pinpoint their causes in a user-friendly way that enables non-technical stakeholders to understand and fix them. We show the effectiveness and usability of our approach through nine case studies in which teams of ethicists, lawyers, philosophers, psychologists, safety analysts, and engineers used N-Check to analyse and debug 233 N-NFRs comprising 62 issues for the software underpinning the operation of systems ranging from assistive-care robots and tree-disease detection drones to manufacturing collaborative robots.

Submitted 11 January, 2024; originally announced January 2024.

arXiv:2209.04052 [pdf, ps, other]

Early Verification of Legal Compliance via Bounded Satisfiability Checking

Authors: Nick Feng, Lina Marsso, Mehrdad Sabetzadeh, Marsha Chechik

Abstract: Legal properties involve reasoning about data values and time. Metric first-order temporal logic (MFOTL) provides a rich formalism for specifying legal properties. While MFOTL has been successfully used for verifying legal properties over operational systems via runtime monitoring, no solution exists for MFOTL-based verification in early-stage system development captured by requirements. Given a legal property and system requirements, both formalized in MFOTL, the compliance of the property can be verified on the requirements via satisfiability checking. In this paper, we propose a practical, sound, and complete (within a given bound) satisfiability checking approach for MFOTL. The approach, based on satisfiability modulo theories (SMT), employs a counterexample-guided strategy to incrementally search for a satisfying solution. We implemented our approach using the Z3 SMT solver and evaluated it on five case studies spanning the healthcare, business administration, banking and aviation domains. Our results indicate that our approach can efficiently determine whether legal properties of interest are met, or generate counterexamples that lead to compliance violations.

Submitted 27 May, 2023; v1 submitted 8 September, 2022; originally announced September 2022.

arXiv:2203.09885 [pdf, ps, other]

doi 10.4204/EPTCS.355.5

Formally Modeling Autonomous Vehicles in LNT for Simulation and Testing

Authors: Lina Marsso, Radu Mateescu, Lucie Muller, Wendelin Serwe

Abstract: We present two behavioral models of an autonomous vehicle and its interaction with the environment. Both models use the formal modeling language LNT provided by the CADP toolbox. This paper discusses the modeling choices and the challenges of our autonomous vehicle models, and also illustrates how formal validation tools can be applied to a single component or the overall vehicle.

Submitted 18 March, 2022; originally announced March 2022.

Comments: In Proceedings MARS 2022, arXiv:2203.09299

Journal ref: EPTCS 355, 2022, pp. 60-117

arXiv:2202.03930 [pdf, other]

doi 10.1145/3510003.3510109

If a Human Can See It, So Should Your System: Reliability Requirements for Machine Vision Components

Authors: Boyue Caroline Hu, Lina Marsso, Krzysztof Czarnecki, Rick Salay, Huakun Shen, Marsha Chechik

Abstract: Machine Vision Components (MVC) are becoming safety-critical. Assuring their quality, including safety, is essential for their successful deployment. Assurance relies on the availability of precisely specified and, ideally, machine-verifiable requirements. MVCs with state-of-the-art performance rely on machine learning (ML) and training data but largely lack such requirements. In this paper, we address the need for defining machine-verifiable reliability requirements for MVCs against transformations that simulate the full range of realistic and safety-critical changes in the environment. Using human performance as a baseline, we define reliability requirements as: 'if the changes in an image do not affect a human's decision, neither should they affect the MVC's.' To this end, we provide: (1) a class of safety-related image transformations; (2) reliability requirement classes to specify correctness-preservation and prediction-preservation for MVCs; (3) a method to instantiate machine-verifiable requirements from these requirements classes using human performance experiment data; (4) human performance experiment data for image recognition involving eight commonly used transformations, from about 2000 human participants; and (5) a method for automatically checking whether an MVC satisfies our requirements. Further, we show that our reliability requirements are feasible and reusable by evaluating our methods on 13 state-of-the-art pre-trained image classification models. Finally, we demonstrate that our approach detects reliability gaps in MVCs that other existing methods are unable to detect.

Submitted 8 February, 2022; originally announced February 2022.

arXiv:2004.14212 [pdf, other]

doi 10.4204/EPTCS.316.7

Specifying a Cryptographical Protocol in Lustre and SCADE

Authors: Lina Marsso

Abstract: We present SCADE and Lustre models of the Message Authenticator Algorithm (MAA), which is one of the first cryptographic functions for computing a message authentication code. The MAA was adopted between 1987 and 2001, in international standards (ISO 8730 and ISO 8731-2), to ensure the authenticity and integrity of banking transactions. This paper discusses the choices and the challenges of our MAA implementations. Our SCADE and Lustre models validate 201 official test vectors for the MAA.

Submitted 28 April, 2020; originally announced April 2020.

Comments: In Proceedings MARS 2020, arXiv:2004.12403. arXiv admin note: text overlap with arXiv:1703.06573

Journal ref: EPTCS 316, 2020, pp. 149-199

arXiv:1803.10322 [pdf, ps, other]

doi 10.4204/EPTCS.268.2

Comparative Study of Eight Formal Specifications of the Message Authenticator Algorithm

Authors: Hubert Garavel, Lina Marsso

Abstract: The Message Authenticator Algorithm (MAA) is one of the first cryptographic functions for computing a Message Authentication Code. Between 1987 and 2001, the MAA was adopted in international standards (ISO 8730 and ISO 8731-2) to ensure the authenticity and integrity of banking transactions. In 1990 and 1991, three formal, yet non-executable, specifications of the MAA (in VDM, Z, and LOTOS) were developed at NPL. Since then, five formal executable specifications of the MAA (in LOTOS, LNT, and term rewrite systems) have been designed at INRIA Grenoble. This article provides an overview of the MAA and compares its formal specifications with respect to common-sense criteria, such as conciseness, readability, and efficiency of code generation.

Submitted 27 March, 2018; originally announced March 2018.

Comments: In Proceedings MARS/VPT 2018, arXiv:1803.08668

Journal ref: EPTCS 268, 2018, pp. 41-87

arXiv:1803.10319 [pdf, other]

doi 10.4204/EPTCS.268.1

A Formal TLS Handshake Model in LNT

Authors: Josip Bozic, Lina Marsso, Radu Mateescu, Franz Wotawa

Abstract: Testing of network services represents one of the biggest challenges in cyber security. Because new vulnerabilities are detected on a regular basis, more research is needed. These faults have their roots in the software development cycle or because of intrinsic leaks in the system specification. Conformance testing checks whether a system behaves according to its specification. Here model-based testing provides several methods for automated detection of shortcomings. The formal specification of a system behavior represents the starting point of the testing process. In this paper, a widely used cryptographic protocol is specified and tested for conformance with a test execution framework. The first empirical results are presented and discussed.

Submitted 27 March, 2018; originally announced March 2018.

Comments: In Proceedings MARS/VPT 2018, arXiv:1803.08668

Journal ref: EPTCS 268, 2018, pp. 1-40

arXiv:1703.06573 [pdf, ps, other]

doi 10.4204/EPTCS.244.6

A Large Term Rewrite System Modelling a Pioneering Cryptographic Algorithm

Authors: Hubert Garavel, Lina Marsso

Abstract: We present a term rewrite system that formally models the Message Authenticator Algorithm (MAA), which was one of the first cryptographic functions for computing a Message Authentication Code and was adopted, between 1987 and 2001, in international standards (ISO 8730 and ISO 8731-2) to ensure the authenticity and integrity of banking transactions. Our term rewrite system is large (13 sorts, 18 constructors, 644 non-constructors, and 684 rewrite rules), confluent, and terminating. Implementations in thirteen different languages have been automatically derived from this model and used to validate 200 official test vectors for the MAA.

Submitted 19 March, 2017; originally announced March 2017.

Comments: In Proceedings MARS 2017, arXiv:1703.05812

Journal ref: EPTCS 244, 2017, pp. 129-183

Showing 1–10 of 10 results for author: Marsso, L