-
IoT Monitoring with Blockchain: Generating Smart Contracts from Service Level Agreements
Authors:
Adam Booth,
Awatif Alqahtani,
Ellis Solaiman
Abstract:
A Service Level Agreement (SLA) is a commitment between a client and provider that assures the quality of service (QoS) a client can expect to receive when purchasing a service. However, evidence of SLA violations in Internet of Things (IoT) service monitoring data can be manipulated by the provider or consumer, resulting in an issue of trust between contracted parties. The following research aims to explore the use of blockchain technology in monitoring IoT systems using smart contracts so that SLA violations captured are irrefutable amongst service providers and clients. The research focuses on the development of a Java library that is capable of generating a smart contract from a given SLA. A smart contract generated by this library is validated through a mock scenario presented in the form of a Remote Patient Monitoring IoT system. In this scenario, the findings demonstrate a 100 percent success rate in capturing all emulated violations.
Submitted 23 August, 2024;
originally announced August 2024.
-
Voronoi Candidates for Bayesian Optimization
Authors:
Nathan Wycoff,
John W. Smith,
Annie S. Booth,
Robert B. Gramacy
Abstract:
Bayesian optimization (BO) offers an elegant approach for efficiently optimizing black-box functions. However, acquisition criteria demand their own challenging inner-optimization, which can induce significant overhead. Many practical BO methods, particularly in high dimension, eschew a formal, continuous optimization of the acquisition function and instead search discretely over a finite set of space-filling candidates. Here, we propose to use candidates which lie on the boundary of the Voronoi tessellation of the current design points, so they are equidistant to two or more of them. We discuss strategies for efficient implementation by directly sampling the Voronoi boundary without explicitly generating the tessellation, thus accommodating large designs in high dimension. On a battery of test problems optimized via Gaussian processes with expected improvement, our proposed approach significantly improves the execution time of a multi-start continuous search without a loss in accuracy.
Submitted 7 February, 2024;
originally announced February 2024.
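As a rough illustration of the idea in the abstract above (candidates that lie on the Voronoi boundary, found without building the tessellation), the following sketch walks along a ray from one design point and bisects for the step at which the nearest design point changes. This is not the authors' implementation: the function names, the box bounds `lower` and `upper`, and the bisection tolerance are assumptions made for the example. It relies only on the fact that Voronoi cells are convex, so the ray leaves the cell of its starting point at most once.

```python
import math


def nearest_index(point, design):
    """Index of the design point closest to `point` (Euclidean distance)."""
    return min(range(len(design)), key=lambda k: math.dist(point, design[k]))


def voronoi_boundary_candidate(design, i, direction, lower, upper, tol=1e-9):
    """Hypothetical helper: find a point on the boundary of the Voronoi cell
    of design[i] by bisecting along the ray design[i] + t * direction.

    `direction` must be nonzero and design[i] must lie inside the box
    [lower, upper]. Returns None when the ray reaches the edge of the box
    before the nearest design point changes.
    """
    x0 = design[i]
    norm = math.hypot(*direction)
    d = [c / norm for c in direction]

    # Largest step t that keeps x0 + t * d inside the box.
    t_max = math.inf
    for x, dk, lo, hi in zip(x0, d, lower, upper):
        if dk > 0:
            t_max = min(t_max, (hi - x) / dk)
        elif dk < 0:
            t_max = min(t_max, (lo - x) / dk)

    def point_at(t):
        return [x + t * dk for x, dk in zip(x0, d)]

    if nearest_index(point_at(t_max), design) == i:
        return None  # the cell extends to the box edge in this direction

    # Invariant: design[i] is nearest at t_lo and is not nearest at t_hi.
    t_lo, t_hi = 0.0, t_max
    while t_hi - t_lo > tol:
        t_mid = 0.5 * (t_lo + t_hi)
        if nearest_index(point_at(t_mid), design) == i:
            t_lo = t_mid
        else:
            t_hi = t_mid
    return point_at(0.5 * (t_lo + t_hi))


if __name__ == "__main__":
    design = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
    candidate = voronoi_boundary_candidate(
        design, 0, [1.0, 0.2], lower=[-1.0, -1.0], upper=[2.0, 2.0]
    )
    print(candidate)
```

Each call costs one nearest-neighbour query per bisection step, so the cost grows with the number of design points rather than with the size of the tessellation, which is the property the abstract emphasises for large designs in high dimension.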
-
Informative path planning for scalar dynamic reconstruction using coregionalized Gaussian processes and a spatiotemporal kernel
Authors:
Lorenzo Booth,
Stefano Carpin
Abstract:
The proliferation of unmanned vehicles offers many opportunities for solving environmental sampling tasks with applications in resource monitoring and precision agriculture. Informative path planning (IPP) includes a family of methods which offer improvements over traditional surveying techniques for suggesting locations for observation collection. In this work, we present a novel solution to the IPP problem by using coregionalized Gaussian processes to estimate a dynamic scalar field that varies in space and time. Our method improves on previous approaches by using a composite kernel that accounts for spatiotemporal correlations and, at the same time, can be readily incorporated into existing IPP algorithms. Through extensive simulations, we show that our novel modeling approach leads to more accurate estimations when compared with formerly proposed methods that do not account for the temporal dimension.
Submitted 13 September, 2023;
originally announced September 2023.
-
A portable coding strategy to exploit vectorization on combustion simulations
Authors:
Fabio Banchelli,
Guillermo Oyarzun,
Marta Garcia-Gasulla,
Filippo Mantovani,
Ambrus Both,
Guillaume Houzeaux,
Daniel Mira
Abstract:
The complexity of combustion simulations demands the latest high-performance computing tools to reduce their time to solution. A current trend on HPC systems is the utilization of CPUs with SIMD or vector extensions to exploit data parallelism. Our work proposes a strategy to improve the automatic vectorization of finite element-based scientific codes. The approach applies a parametric configuration to the data structures to help the compiler detect the blocks of code that can take advantage of vector computation while keeping the code portable. A detailed analysis of the computational impact of this methodology on the different stages of a CFD solver is carried out on the PRECCINSTA burner simulation. Our parametric implementation has proven to help the compiler generate more vector instructions in the assembly operation: this results in a reduction of up to 9.3 times in the total number of executed instructions while keeping the Instructions Per Cycle and the CPU frequency constant. The proposed strategy improves the performance of the CFD case under study by up to 4.67 times on the MareNostrum 4 supercomputer.
Submitted 21 October, 2022;
originally announced October 2022.
-
QALD-9-plus: A Multilingual Dataset for Question Answering over DBpedia and Wikidata Translated by Native Speakers
Authors:
Aleksandr Perevalov,
Dennis Diefenbach,
Ricardo Usbeck,
Andreas Both
Abstract:
The ability to have the same experience for different user groups (i.e., accessibility) is one of the most important characteristics of Web-based systems. The same is true for Knowledge Graph Question Answering (KGQA) systems that provide access to Semantic Web data via a natural language interface. While following our research agenda on the multilingual aspect of accessibility of KGQA systems, we identified several ongoing challenges. One of them is the lack of multilingual KGQA benchmarks. In this work, we extend one of the most popular KGQA benchmarks, QALD-9, by introducing high-quality translations of the questions into 8 languages provided by native speakers, and by transferring the SPARQL queries of QALD-9 from DBpedia to Wikidata, so that the usability and relevance of the dataset are strongly increased. Five of the languages (Armenian, Ukrainian, Lithuanian, Bashkir and Belarusian) have, to the best of our knowledge, never been considered in the KGQA research community before. The latter two languages are considered "endangered" by UNESCO. We call the extended dataset QALD-9-plus and have made it available online at https://github.com/Perevalov/qald_9_plus.
Submitted 7 February, 2022; v1 submitted 31 January, 2022;
originally announced February 2022.
-
Knowledge Graph Question Answering Leaderboard: A Community Resource to Prevent a Replication Crisis
Authors:
Aleksandr Perevalov,
Xi Yan,
Liubov Kovriguina,
Longquan Jiang,
Andreas Both,
Ricardo Usbeck
Abstract:
Data-driven systems need to be evaluated to establish trust in the scientific approach and its applicability. In particular, this is true for Knowledge Graph (KG) Question Answering (QA), where complex data structures are made accessible via natural-language interfaces. Evaluating the capabilities of these systems has been a driver for the community for more than ten years while establishing different KGQA benchmark datasets. However, comparing different approaches is cumbersome. The lack of existing and curated leaderboards leads to a missing global view over the research field and could inject mistrust into the results. In particular, the latest and most-used datasets in the KGQA community, LC-QuAD and QALD, miss providing central and up-to-date points of trust. In this paper, we survey and analyze a wide range of evaluation results with significant coverage of 100 publications and 98 systems from the last decade. We provide a new central and open leaderboard for any KGQA benchmark dataset as a focal point for the community - https://kgqa.github.io/leaderboard. Our analysis highlights existing problems during the evaluation of KGQA systems. Thus, we will point to possible improvements for future evaluations.
Submitted 20 January, 2022;
originally announced January 2022.
-
Activity-based and agent-based Transport model of Melbourne (AToM): an open multi-modal transport simulation model for Greater Melbourne
Authors:
Afshin Jafari,
Dhirendra Singh,
Alan Both,
Mahsa Abdollahyar,
Lucy Gunn,
Steve Pemberton,
Billie Giles-Corti
Abstract:
Agent-based and activity-based models for simulating transportation systems have attracted significant attention in recent years. Few studies, however, include a detailed representation of active modes of transportation - such as walking and cycling - at a city-wide level, where dominating motorised modes are often of primary concern. This paper presents an open workflow for creating a multi-modal agent-based and activity-based transport simulation model, focusing on Greater Melbourne, and including the process of mode choice calibration for the four main travel modes of driving, public transport, cycling and walking. The synthetic population generated and used as an input for the simulation model represented Melbourne's population based on Census 2016, with daily activities and trips based on the Victoria's 2016-18 travel survey data. The road network used in the simulation model includes all public roads accessible via the included travel modes. We compared the output of the simulation model with observations from the real world in terms of mode share, road volume, travel time, and travel distance. Through these comparisons, we showed that our model is suitable for studying mode choice and road usage behaviour of travellers.
Submitted 15 December, 2021;
originally announced December 2021.
-
Improving the Question Answering Quality using Answer Candidate Filtering based on Natural-Language Features
Authors:
Aleksandr Gashkov,
Aleksandr Perevalov,
Maria Eltsova,
Andreas Both
Abstract:
Software with natural-language user interfaces has an ever-increasing importance. However, the quality of the included Question Answering (QA) functionality is still not sufficient regarding the number of questions that are answered correctly. In our work, we address the research problem of how the QA quality of a given system can be improved just by evaluating the natural-language input (i.e., the user's question) and output (i.e., the system's answer). Our main contribution is an approach capable of identifying wrong answers provided by a QA system. Hence, filtering incorrect answers from a list of answer candidates is leading to a highly improved QA quality. In particular, our approach has shown its potential while removing in many cases the majority of incorrect answers, which increases the QA quality significantly in comparison to the non-filtered output of a system.
Submitted 10 December, 2021;
originally announced December 2021.
-
An Activity-Based Model of Transport Demand for Greater Melbourne
Authors:
Alan Both,
Dhirendra Singh,
Afshin Jafari,
Billie Giles-Corti,
Lucy Gunn
Abstract:
In this paper, we present an algorithm for creating a synthetic population for the Greater Melbourne area using a combination of machine learning, probabilistic, and gravity-based approaches. We combine these techniques in a hybrid model with three primary innovations: 1. when assigning activity patterns, we generate individual activity chains for every agent, tailored to their cohort; 2. when selecting destinations, we aim to strike a balance between the distance-decay of trip lengths and the activity-based attraction of destination locations; and 3. we take into account the number of trips remaining for an agent so as to ensure they do not select a destination that would be unreasonable to return home from. Our method is completely open and replicable, requiring only publicly available data to generate a synthetic population of agents compatible with commonly used agent-based modeling software such as MATSim. The synthetic population was found to be accurate in terms of distance distribution, mode choice, and destination choice for a variety of population sizes.
Submitted 19 November, 2021;
originally announced November 2021.
-
Measuring information exchange and brokerage capacity of healthcare teams
Authors:
F. Grippa,
J. Bucuvalas,
A. Booth,
E. Alessandrini,
A. Fronzetti Colladon,
L. M. Wade
Abstract:
Purpose: The purpose of this paper is to explore possible factors impacting team performance in healthcare, by focusing on information exchange within and across hospital's boundaries. Design/methodology/approach: Through a web-survey and group interviews, the authors collected data on the communication networks of 31 members of four interdisciplinary healthcare teams involved in a system redesign initiative within a large US children's hospital. The authors mapped their internal and external social networks based on management advice, technical support and knowledge dissemination within and across departments, studying interaction patterns that involved more than 700 actors. The authors then compared team performance and social network metrics such as degree, closeness and betweenness centrality, and computed cross ties and constraint levels for each team. Findings: The results indicate that highly effective teams were more inwardly focused and less connected to outside members. Moreover, highly recognized teams communicated frequently but, overall, less intensely than the others. Originality/value: Mapping knowledge flows and balancing internal focus and outward connectivity of interdisciplinary teams may help healthcare decision makers in their attempt to achieve high value for patients, families and employees.
Submitted 26 May, 2021;
originally announced May 2021.
-
Development of a dynamic type 2 diabetes risk prediction tool: a UK Biobank study
Authors:
Nikola Dolezalova,
Massimo Cairo,
Alex Despotovic,
Adam T. C. Booth,
Angus B. Reed,
Davide Morelli,
David Plans
Abstract:
Diabetes affects over 400 million people and is among the leading causes of morbidity worldwide. Identification of high-risk individuals can support early diagnosis and prevention of disease development through lifestyle changes. However, the majority of existing risk scores require information about blood-based factors which are not obtainable outside of the clinic. Here, we aimed to develop an accessible solution that could be deployed digitally and at scale. We developed a predictive 10-year type 2 diabetes risk score using 301 features derived from 472,830 participants in the UK Biobank dataset while excluding any features which are not easily obtainable by a smartphone. Using a data-driven feature selection process, 19 features were included in the final reduced model. A Cox proportional hazards model slightly outperformed a DeepSurv model trained using the same features, achieving a concordance index of 0.818 (95% CI: 0.812-0.823), compared to 0.811 (95% CI: 0.806-0.815). The final model showed good calibration. This tool can be used for clinical screening of individuals at risk of developing type 2 diabetes and to foster patient empowerment by broadening their knowledge of the factors affecting their personal risk.
Submitted 20 April, 2021;
originally announced April 2021.
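The abstract above compares models by their concordance index. For readers unfamiliar with the metric, here is a minimal sketch of Harrell's concordance index for right-censored data: among all pairs in which the subject with the shorter follow-up time actually had the event, it is the fraction in which that subject was also given the higher risk score, with tied scores counting one half. The function name and the toy data are assumptions for illustration, and this is not the evaluation code used in the study.

```python
def concordance_index(times, events, risks):
    """Harrell's C for right-censored survival data.

    times  -- observed follow-up time per subject
    events -- truthy if the event occurred, falsy if the subject was censored
    risks  -- predicted risk score (higher means the event is expected sooner)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    # Toy data: the third subject is censored at time 3.
    times = [1.0, 2.0, 3.0]
    events = [1, 1, 0]
    risks = [0.9, 0.5, 0.1]
    print(concordance_index(times, events, risks))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives some context for the reported values of 0.818 and 0.811.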
-
Machine learning approach to dynamic risk modeling of mortality in COVID-19: a UK Biobank study
Authors:
Mohammad A. Dabbah,
Angus B. Reed,
Adam T. C. Booth,
Arrash Yassaee,
Alex Despotovic,
Benjamin Klasmer,
Emily Binning,
Mert Aral,
David Plans,
Alain B. Labrique,
Diwakar Mohan
Abstract:
The COVID-19 pandemic has created an urgent need for robust, scalable monitoring tools supporting stratification of high-risk patients. This research aims to develop and validate prediction models, using the UK Biobank, to estimate COVID-19 mortality risk in confirmed cases. From the 11,245 participants testing positive for COVID-19, we develop a data-driven random forest classification model with excellent performance (AUC: 0.91), using baseline characteristics, pre-existing conditions, symptoms, and vital signs, such that the score could dynamically assess mortality risk with disease deterioration. We also identify several significant novel predictors of COVID-19 mortality with equivalent or greater predictive value than established high-risk comorbidities, such as detailed anthropometrics and prior acute kidney failure, urinary tract infection, and pneumonias. The model design and feature selection enables utility in outpatient settings. Possible applications include supporting individual-level risk profiling and monitoring disease progression across patients with COVID-19 at-scale, especially in hospital-at-home settings.
Submitted 19 April, 2021;
originally announced April 2021.
-
Better Call the Plumber: Orchestrating Dynamic Information Extraction Pipelines
Authors:
Mohamad Yaser Jaradeh,
Kuldeep Singh,
Markus Stocker,
Andreas Both,
Sören Auer
Abstract:
In the last decade, a large number of Knowledge Graph (KG) information extraction approaches were proposed. Albeit effective, these efforts are disjoint, and their collective strengths and weaknesses in effective KG information extraction (IE) have not been studied in the literature. We propose Plumber, the first framework that brings together the research community's disjoint IE efforts. The Plumber architecture comprises 33 reusable components for various KG information extraction subtasks, such as coreference resolution, entity linking, and relation extraction. Using these components, Plumber dynamically generates suitable information extraction pipelines and offers 264 distinct pipelines overall. We study the optimization problem of choosing suitable pipelines based on input sentences. To do so, we train a transformer-based classification model that extracts contextual embeddings from the input and finds an appropriate pipeline. We study the efficacy of Plumber for extracting the KG triples using standard datasets over two KGs: DBpedia, and Open Research Knowledge Graph (ORKG). Our results demonstrate the effectiveness of Plumber in dynamically generating KG information extraction pipelines, outperforming all baselines agnostic of the underlying KG. Furthermore, we provide an analysis of collective failure cases, study the similarities and synergies among integrated components, and discuss their limitations.
Submitted 22 February, 2021;
originally announced February 2021.
-
Template-Based Question Answering over Linked Geospatial Data
Authors:
Dharmen Punjani,
Markos Iliakis,
Theodoros Stefou,
Kuldeep Singh,
Andreas Both,
Manolis Koubarakis,
Iosif Angelidis,
Konstantina Bereta,
Themis Beris,
Dimitris Bilidas,
Theofilos Ioannidis,
Nikolaos Karalis,
Christoph Lange,
Despina-Athanasia Pantazi,
Christos Papaloukas,
Georgios Stamoulis
Abstract:
Large amounts of geospatial data have been made available recently on the linked open data cloud and the portals of many national cartographic agencies (e.g., OpenStreetMap data, administrative geographies of various countries, or land cover/land use data sets). These datasets use various geospatial vocabularies and can be queried using SPARQL or its OGC-standardized extension GeoSPARQL. In this paper, we go beyond these approaches to offer a question-answering engine for natural language questions on top of linked geospatial data sources. Our system has been implemented as re-usable components of the Frankenstein question answering architecture. We give a detailed description of the system's architecture, its underlying algorithms, and its evaluation using a set of 201 natural language questions. The set of questions is offered to the research community as a gold standard dataset for the comparative evaluation of future geospatial question answering engines.
Submitted 29 April, 2021; v1 submitted 14 July, 2020;
originally announced July 2020.
-
Towards a Question Answering System over the Semantic Web
Authors:
Dennis Diefenbach,
Andreas Both,
Kamal Singh,
Pierre Maret
Abstract:
Thanks to the development of the Semantic Web, a lot of new structured data has become available on the Web in the form of knowledge bases (KBs). Making this valuable data accessible and usable for end-users is one of the main goals of Question Answering (QA) over KBs. Most current QA systems query one KB, in one language (namely English). The existing approaches are not designed to be easily adaptable to new KBs and languages. We first introduce a new approach for translating natural language questions to SPARQL queries. It is able to query several KBs simultaneously, in different languages, and can easily be ported to other KBs and languages. In our evaluation, the impact of our approach is proven using 5 different well-known and large KBs: Wikidata, DBpedia, MusicBrainz, DBLP and Freebase as well as 5 different languages namely English, German, French, Italian and Spanish. Second, we show how we integrated our approach, to make it easily accessible by the research community and by end-users. To summarize, we provided a conceptional solution for multilingual, KB-agnostic Question Answering over the Semantic Web. The provided first approximation validates this concept.
Submitted 2 March, 2018;
originally announced March 2018.
-
REFOCUS: Current & Future Search Interface Requirements for German-speaking Users
Authors:
Maximilian Speicher,
Andreas Both,
Martin Gaedke
Abstract:
While smartphones are widely used for web browsing, other novel devices such as Smart TVs are also becoming increasingly popular. Yet, current interfaces do not cater for the newly available devices beyond touch and small screens, if at all for the latter. Search engines in particular -- today's entry points of the WWW -- must ensure their interfaces are easy to use on any web-enabled device. We report on a survey that investigated (1) users' perception and usage of current search interfaces, and (2) their expectations towards current and future search interfaces. Users are mostly satisfied with desktop and mobile search, but seem to be skeptical towards web search with novel devices and input modalities. Hence, we derive REFOCUS -- a novel set of requirements for current and future search interfaces, which addresses the demand for improvement of novel web search and has been validated by 12 dedicated experts.
Submitted 10 August, 2022; v1 submitted 29 October, 2016;
originally announced October 2016.
-
Evaluating topic coherence measures
Authors:
Frank Rosner,
Alexander Hinneburg,
Michael Röder,
Martin Nettling,
Andreas Both
Abstract:
Topic models extract representative word sets - called topics - from word counts in documents without requiring any semantic annotations. Topics are not guaranteed to be well interpretable, therefore, coherence measures have been proposed to distinguish between good and bad topics. Studies of topic coherence so far are limited to measures that score pairs of individual words. For the first time, we include coherence measures from scientific philosophy that score pairs of more complex word subsets and apply them to topic scoring.
Submitted 25 March, 2014;
originally announced March 2014.
-
On Redundant Topological Constraints
Authors:
Sanjiang Li,
Zhiguo Long,
Weiming Liu,
Matt Duckham,
Alan Both
Abstract:
The Region Connection Calculus (RCC) is a well-known calculus for representing part-whole and topological relations. It plays an important role in qualitative spatial reasoning, geographical information science, and ontology. The computational complexity of reasoning with RCC5 and RCC8 (two fragments of RCC) as well as other qualitative spatial/temporal calculi has been investigated in depth in the literature. Most of these works focus on the consistency of qualitative constraint networks. In this paper, we consider the important problem of redundant qualitative constraints. For a set $Γ$ of qualitative constraints, we say a constraint $(x R y)$ in $Γ$ is redundant if it is entailed by the rest of $Γ$. A prime subnetwork of $Γ$ is a subset of $Γ$ which contains no redundant constraints and has the same solution set as $Γ$. It is natural to ask how to compute such a prime subnetwork, and when it is unique.
In this paper, we show that this problem is in general intractable, but becomes tractable if $Γ$ is over a tractable subalgebra $\mathcal{S}$ of a qualitative calculus. Furthermore, if $\mathcal{S}$ is a subalgebra of RCC5 or RCC8 in which weak composition distributes over nonempty intersections, then $Γ$ has a unique prime subnetwork, which can be obtained in cubic time by removing all redundant constraints simultaneously from $Γ$. As a byproduct, we show that any path-consistent network over such a distributive subalgebra is weakly globally consistent and minimal. A thorough empirical analysis of the prime subnetwork upon real geographical data sets demonstrates the approach is able to identify significantly more redundant constraints than previously proposed algorithms, especially in constraint networks with larger proportions of partial overlap relations.
Submitted 13 February, 2015; v1 submitted 3 March, 2014;
originally announced March 2014.