Search | arXiv e-print repository

Decidability of Graph Neural Networks via Logical Characterizations

Authors: Michael Benedikt, Chia-Hsuan Lu, Boris Motik, Tony Tan

Abstract: We present results concerning the expressiveness and decidability of a popular graph learning formalism, graph neural networks (GNNs), exploiting connections with logic. We use a family of recently-discovered decidable logics involving "Presburger quantifiers". We show how to use these logics to measure the expressiveness of classes of GNNs, in some cases getting exact correspondences between the… ▽ More We present results concerning the expressiveness and decidability of a popular graph learning formalism, graph neural networks (GNNs), exploiting connections with logic. We use a family of recently-discovered decidable logics involving "Presburger quantifiers". We show how to use these logics to measure the expressiveness of classes of GNNs, in some cases getting exact correspondences between the expressiveness of logics and GNNs. We also employ the logics, and the techniques used to analyze them, to obtain decision procedures for verification problems over GNNs. We complement this with undecidability results for static analysis problems involving the logics, as well as for GNN verification problems. △ Less

Submitted 23 May, 2024; v1 submitted 28 April, 2024; originally announced April 2024.

arXiv:2305.18015 [pdf, ps, other]

On the Correspondence Between Monotonic Max-Sum GNNs and Datalog

Authors: David Tena Cucala, Bernardo Cuenca Grau, Boris Motik, Egor V. Kostylev

Abstract: Although there has been significant interest in applying machine learning techniques to structured data, the expressivity (i.e., a description of what can be learned) of such techniques is still poorly understood. In this paper, we study data transformations based on graph neural networks (GNNs). First, we note that the choice of how a dataset is encoded into a numeric form processable by a GNN ca… ▽ More Although there has been significant interest in applying machine learning techniques to structured data, the expressivity (i.e., a description of what can be learned) of such techniques is still poorly understood. In this paper, we study data transformations based on graph neural networks (GNNs). First, we note that the choice of how a dataset is encoded into a numeric form processable by a GNN can obscure the characterisation of a model's expressivity, and we argue that a canonical encoding provides an appropriate basis. Second, we study the expressivity of monotonic max-sum GNNs, which cover a subclass of GNNs with max and sum aggregation functions. We show that, for each such GNN, one can compute a Datalog program such that applying the GNN to any dataset produces the same facts as a single round of application of the program's rules to the dataset. Monotonic max-sum GNNs can sum an unbounded number of feature vectors which can result in arbitrarily large feature values, whereas rule application requires only a bounded number of constants. Hence, our result shows that the unbounded summation of monotonic max-sum GNNs does not increase their expressive power. Third, we sharpen our result to the subclass of monotonic max GNNs, which use only the max aggregation function, and identify a corresponding class of Datalog programs. △ Less

Submitted 15 June, 2023; v1 submitted 29 May, 2023; originally announced May 2023.

arXiv:2212.08522 [pdf, other]

doi 10.14778/3551793.3551851

Rewriting the Infinite Chase

Authors: Michael Benedikt, Maxime Buron, Stefano Germano, Kevin Kappelmann, Boris Motik

Abstract: Guarded tuple-generating dependencies (GTGDs) are a natural extension of description logics and referential constraints. It has long been known that queries over GTGDs can be answered by a variant of the chase - a quintessential technique for reasoning with dependencies. However, there has been little work on concrete algorithms and even less on implementation. To address this gap, we revisit Data… ▽ More Guarded tuple-generating dependencies (GTGDs) are a natural extension of description logics and referential constraints. It has long been known that queries over GTGDs can be answered by a variant of the chase - a quintessential technique for reasoning with dependencies. However, there has been little work on concrete algorithms and even less on implementation. To address this gap, we revisit Datalog rewriting approaches to query answering, where GTGDs are transformed to a Datalog program that entails the same base facts on each base instance. We show that the rewriting can be seen as containing "shortcut" rules that circumvent certain chase steps, we present several algorithms that compute the rewriting by simulating specific types of chase steps, and we discuss important implementation issues. Finally, we show empirically that our techniques can process complex GTGDs derived from synthetic and real benchmarks and are thus suitable for practical use. △ Less

Submitted 16 December, 2022; originally announced December 2022.

Journal ref: Proceedings of the VLDB Endowment, Volume 15, Issue 11, July 2022, pp 3045-3057

arXiv:1908.10177 [pdf, ps, other]

doi 10.1145/3357384.3358147

Datalog Reasoning over Compressed RDF Knowledge Bases

Authors: Pan Hu, Jacopo Urbani, Boris Motik, Ian Horrocks

Abstract: Materialisation is often used in RDF systems as a preprocessing step to derive all facts implied by given RDF triples and rules. Although widely used, materialisation considers all possible rule applications and can use a lot of memory for storing the derived facts, which can hinder performance. We present a novel materialisation technique that compresses the RDF triples so that the rules can some… ▽ More Materialisation is often used in RDF systems as a preprocessing step to derive all facts implied by given RDF triples and rules. Although widely used, materialisation considers all possible rule applications and can use a lot of memory for storing the derived facts, which can hinder performance. We present a novel materialisation technique that compresses the RDF triples so that the rules can sometimes be applied to multiple facts at once, and the derived facts can be represented using structure sharing. Our technique can thus require less space, as well as skip certain rule applications. Our experiments show that our technique can be very effective: when the rules are relatively simple, our system is both faster and requires less memory than prominent state-of-the-art RDF systems. △ Less

Submitted 29 August, 2019; v1 submitted 27 August, 2019; originally announced August 2019.

Comments: CIKM 2019

arXiv:1906.10261 [pdf, ps, other]

Datalog Materialisation in Distributed RDF Stores with Dynamic Data Exchange

Authors: Temitope Ajileye, Boris Motik, Ian Horrocks

Abstract: Several centralised RDF systems support datalog reasoning by precomputing and storing all logically implied triples using the wellknown seminaive algorithm. Large RDF datasets often exceed the capacity of centralised RDF systems, and a common solution is to distribute the datasets in a cluster of shared-nothing servers. While numerous distributed query answering techniques are known, distributed s… ▽ More Several centralised RDF systems support datalog reasoning by precomputing and storing all logically implied triples using the wellknown seminaive algorithm. Large RDF datasets often exceed the capacity of centralised RDF systems, and a common solution is to distribute the datasets in a cluster of shared-nothing servers. While numerous distributed query answering techniques are known, distributed seminaive evaluation of arbitrary datalog rules is less understood. In fact, most distributed RDF stores either support no reasoning or can handle only limited datalog fragments. In this paper we extend the dynamic data exchange approach for distributed query answering by Potter et al. [12] to a reasoning algorithm that can handle arbitrary rules while preserving important properties such as nonrepetition of inferences. We also show empirically that our algorithm scales well to very large RDF datasets △ Less

Submitted 24 June, 2019; originally announced June 2019.

Comments: 16 pages, ISWC conference

arXiv:1811.02304 [pdf, ps, other]

Modular Materialisation of Datalog Programs

Authors: Pan Hu, Boris Motik, Ian Horrocks

Abstract: The seminaïve algorithm can materialise all consequences of arbitrary datalog rules, and it also forms the basis for incremental algorithms that update a materialisation as the input facts change. Certain (combinations of) rules, however, can be handled much more efficiently using custom algorithms. To integrate such algorithms into a general reasoning approach that can handle arbitrary rules, we… ▽ More The seminaïve algorithm can materialise all consequences of arbitrary datalog rules, and it also forms the basis for incremental algorithms that update a materialisation as the input facts change. Certain (combinations of) rules, however, can be handled much more efficiently using custom algorithms. To integrate such algorithms into a general reasoning approach that can handle arbitrary rules, we propose a modular framework for materialisation computation and its maintenance. We split a datalog program into modules that can be handled using specialised algorithms, and handle the remaining rules using the seminaïve algorithm. We also present two algorithms for computing the transitive and the symmetric-transitive closure of a relation that can be used within our framework. Finally, we show empirically that our framework can handle arbitrary datalog programs while outperforming existing approaches, often by orders of magnitude. △ Less

Submitted 13 November, 2018; v1 submitted 6 November, 2018; originally announced November 2018.

Comments: Accepted at AAAI 2019

arXiv:1804.09473 [pdf, other]

Stratified Negation in Limit Datalog Programs

Authors: Mark Kaminski, Bernardo Cuenca Grau, Egor V. Kostylev, Boris Motik, Ian Horrocks

Abstract: There has recently been an increasing interest in declarative data analysis, where analytic tasks are specified using a logical language, and their implementation and optimisation are delegated to a general-purpose query engine. Existing declarative languages for data analysis can be formalised as variants of logic programming equipped with arithmetic function symbols and/or aggregation, and are t… ▽ More There has recently been an increasing interest in declarative data analysis, where analytic tasks are specified using a logical language, and their implementation and optimisation are delegated to a general-purpose query engine. Existing declarative languages for data analysis can be formalised as variants of logic programming equipped with arithmetic function symbols and/or aggregation, and are typically undecidable. In prior work, the language of $\mathit{limit\ programs}$ was proposed, which is sufficiently powerful to capture many analysis tasks and has decidable entailment problem. Rules in this language, however, do not allow for negation. In this paper, we study an extension of limit programs with stratified negation-as-failure. We show that the additional expressive power makes reasoning computationally more demanding, and provide tight data complexity bounds. We also identify a fragment with tractable data complexity and sufficient expressivity to capture many relevant tasks. △ Less

Submitted 25 April, 2018; originally announced April 2018.

Comments: 14 pages; full version of a paper accepted at IJCAI-18

arXiv:1801.09619 [pdf, other]

doi 10.1145/3178876.3186003

Estimating the Cardinality of Conjunctive Queries over RDF Data Using Graph Summarisation

Authors: Giorgio Stefanoni, Boris Motik, Egor V. Kostylev

Abstract: Estimating the cardinality (i.e., the number of answers) of conjunctive queries is particularly difficult in RDF systems: queries over RDF data are navigational and thus tend to involve many joins. We present a new, principled cardinality estimation technique based on graph summarisation. We interpret a summary of an RDF graph using a possible world semantics and formalise the estimation problem a… ▽ More Estimating the cardinality (i.e., the number of answers) of conjunctive queries is particularly difficult in RDF systems: queries over RDF data are navigational and thus tend to involve many joins. We present a new, principled cardinality estimation technique based on graph summarisation. We interpret a summary of an RDF graph using a possible world semantics and formalise the estimation problem as computing the expected cardinality over all RDF graphs represented by the summary, and we present a closed-form formula for computing the expectation of arbitrary queries. We also discuss approaches to RDF graph summarisation. Finally, we show empirically that our cardinality technique is more accurate and more consistent, often by orders of magnitude, than the state of the art. △ Less

Submitted 16 February, 2018; v1 submitted 29 January, 2018; originally announced January 2018.

arXiv:1711.05227 [pdf, ps, other]

Goal-Driven Query Answering for Existential Rules with Equality

Authors: Michael Benedikt, Boris Motik, Efthymia Tsamoura

Abstract: Inspired by the magic sets for Datalog, we present a novel goal-driven approach for answering queries over terminating existential rules with equality (aka TGDs and EGDs). Our technique improves the performance of query answering by pruning the consequences that are not relevant for the query. This is challenging in our setting because equalities can potentially affect all predicates in a dataset.… ▽ More Inspired by the magic sets for Datalog, we present a novel goal-driven approach for answering queries over terminating existential rules with equality (aka TGDs and EGDs). Our technique improves the performance of query answering by pruning the consequences that are not relevant for the query. This is challenging in our setting because equalities can potentially affect all predicates in a dataset. We address this problem by combining the existing singularization technique with two new ingredients: an algorithm for identifying the rules relevant to a query and a new magic sets algorithm. We show empirically that our technique can significantly improve the performance of query answering, and that it can mean the difference between answering a query in a few seconds or not being able to process the query at all. △ Less

Submitted 20 November, 2017; v1 submitted 14 November, 2017; originally announced November 2017.

arXiv:1711.04013 [pdf, other]

Stream Reasoning in Temporal Datalog

Authors: Alessandro Ronca, Mark Kaminski, Bernardo Cuenca Grau, Boris Motik, Ian Horrocks

Abstract: In recent years, there has been an increasing interest in extending traditional stream processing engines with logical, rule-based, reasoning capabilities. This poses significant theoretical and practical challenges since rules can derive new information and propagate it both towards past and future time points; as a result, streamed query answers can depend on data that has not yet been received,… ▽ More In recent years, there has been an increasing interest in extending traditional stream processing engines with logical, rule-based, reasoning capabilities. This poses significant theoretical and practical challenges since rules can derive new information and propagate it both towards past and future time points; as a result, streamed query answers can depend on data that has not yet been received, as well as on data that arrived far in the past. Stream reasoning algorithms, however, must be able to stream out query answers as soon as possible, and can only keep a limited number of previous input facts in memory. In this paper, we propose novel reasoning problems to deal with these challenges, and study their computational properties on Datalog extended with a temporal sort and the successor function (a core rule-based language for stream reasoning applications). △ Less

Submitted 15 November, 2018; v1 submitted 10 November, 2017; originally announced November 2017.

arXiv:1711.03987 [pdf, ps, other]

Optimised Maintenance of Datalog Materialisations

Authors: Pan Hu, Boris Motik, Ian Horrocks

Abstract: To efficiently answer queries, datalog systems often materialise all consequences of a datalog program, so the materialisation must be updated whenever the input facts change. Several solutions to the materialisation update problem have been proposed. The Delete/Rederive (DRed) and the Backward/Forward (B/F) algorithms solve this problem for general datalog, but both contain steps that evaluate ru… ▽ More To efficiently answer queries, datalog systems often materialise all consequences of a datalog program, so the materialisation must be updated whenever the input facts change. Several solutions to the materialisation update problem have been proposed. The Delete/Rederive (DRed) and the Backward/Forward (B/F) algorithms solve this problem for general datalog, but both contain steps that evaluate rules 'backwards' by matching their heads to a fact and evaluating the partially instantiated rule bodies as queries. We show that this can be a considerable source of overhead even on very small updates. In contrast, the Counting algorithm does not evaluate the rules 'backwards', but it can handle only nonrecursive rules. We present two hybrid approaches that combine DRed and B/F with Counting so as to reduce or even eliminate 'backward' rule evaluation while still handling arbitrary datalog programs. We show empirically that our hybrid algorithms are usually significantly faster than existing approaches, sometimes by orders of magnitude. △ Less

Submitted 20 November, 2017; v1 submitted 10 November, 2017; originally announced November 2017.

Comments: AAAI 2018

arXiv:1705.06927 [pdf, other]

Foundations of Declarative Data Analysis Using Limit Datalog Programs

Authors: Mark Kaminski, Bernardo Cuenca Grau, Egor V. Kostylev, Boris Motik, Ian Horrocks

Abstract: Motivated by applications in declarative data analysis, we study $\mathit{Datalog}_{\mathbb{Z}}$---an extension of positive Datalog with arithmetic functions over integers. This language is known to be undecidable, so we propose two fragments. In $\mathit{limit}~\mathit{Datalog}_{\mathbb{Z}}$ predicates are axiomatised to keep minimal/maximal numeric values, allowing us to show that fact entailmen… ▽ More Motivated by applications in declarative data analysis, we study $\mathit{Datalog}_{\mathbb{Z}}$---an extension of positive Datalog with arithmetic functions over integers. This language is known to be undecidable, so we propose two fragments. In $\mathit{limit}~\mathit{Datalog}_{\mathbb{Z}}$ predicates are axiomatised to keep minimal/maximal numeric values, allowing us to show that fact entailment is coNExpTime-complete in combined, and coNP-complete in data complexity. Moreover, an additional $\mathit{stability}$ requirement causes the complexity to drop to ExpTime and PTime, respectively. Finally, we show that stable $\mathit{Datalog}_{\mathbb{Z}}$ can express many useful data analysis tasks, and so our results provide a sound foundation for the development of advanced information systems. △ Less

Submitted 12 November, 2017; v1 submitted 19 May, 2017; originally announced May 2017.

Comments: 23 pages; full version of a paper accepted at IJCAI-17; v2 fixes some typos and improves the acknowledgments

arXiv:1602.04498 [pdf, other]

Extending Consequence-Based Reasoning to SRIQ

Authors: Andrew Bate, Boris Motik, Bernardo Cuenca Grau, František Simančík, Ian Horrocks

Abstract: Consequence-based calculi are a family of reasoning algorithms for description logics (DLs), and they combine hypertableau and resolution in a way that often achieves excellent performance in practice. Up to now, however, they were proposed for either Horn DLs (which do not support disjunction), or for DLs without counting quantifiers. In this paper we present a novel consequence-based calculus fo… ▽ More Consequence-based calculi are a family of reasoning algorithms for description logics (DLs), and they combine hypertableau and resolution in a way that often achieves excellent performance in practice. Up to now, however, they were proposed for either Horn DLs (which do not support disjunction), or for DLs without counting quantifiers. In this paper we present a novel consequence-based calculus for SRIQ---a rich DL that supports both features. This extension is non-trivial since the intermediate consequences that need to be derived during reasoning cannot be captured using DLs themselves. The results of our preliminary performance evaluation suggest the feasibility of our approach in practice. △ Less

Submitted 23 February, 2016; v1 submitted 14 February, 2016; originally announced February 2016.

arXiv:1505.00212 [pdf, other]

Combining Rewriting and Incremental Materialisation Maintenance for Datalog Programs with Equality

Authors: Boris Motik, Yavor Nenov, Robert Piro, Ian Horrocks

Abstract: Materialisation precomputes all consequences of a set of facts and a datalog program so that queries can be evaluated directly (i.e., independently from the program). Rewriting optimises materialisation for datalog programs with equality by replacing all equal constants with a single representative; and incremental maintenance algorithms can efficiently update a materialisation for small changes i… ▽ More Materialisation precomputes all consequences of a set of facts and a datalog program so that queries can be evaluated directly (i.e., independently from the program). Rewriting optimises materialisation for datalog programs with equality by replacing all equal constants with a single representative; and incremental maintenance algorithms can efficiently update a materialisation for small changes in the input facts. Both techniques are critical to practical applicability of datalog systems; however, we are unaware of an approach that combines rewriting and incremental maintenance. In this paper we present the first such combination, and we show empirically that it can speed up updates by several orders of magnitude compared to using either rewriting or incremental maintenance in isolation. △ Less

Submitted 1 May, 2015; originally announced May 2015.

Comments: All proofs contained in the appendix. 7 pages + 4 pages appendix. 7 algorithms and one table with evaluation results

arXiv:1411.3622 [pdf, ps, other]

Handling owl:sameAs via Rewriting

Authors: Boris Motik, Yavor Nenov, Robert Piro, Ian Horrocks

Abstract: Rewriting is widely used to optimise owl:sameAs reasoning in materialisation based OWL 2 RL systems. We investigate issues related to both the correctness and efficiency of rewriting, and present an algorithm that guarantees correctness, improves efficiency, and can be effectively parallelised. Our evaluation shows that our approach can reduce reasoning times on practical data sets by orders of ma… ▽ More Rewriting is widely used to optimise owl:sameAs reasoning in materialisation based OWL 2 RL systems. We investigate issues related to both the correctness and efficiency of rewriting, and present an algorithm that guarantees correctness, improves efficiency, and can be effectively parallelised. Our evaluation shows that our approach can reduce reasoning times on practical data sets by orders of magnitude. △ Less

Submitted 13 November, 2014; originally announced November 2014.

Comments: This is the technical report supporting the AAAI 2015 Conference submission with the same title

arXiv:1411.2516 [pdf, other]

Answering Conjunctive Queries over $\mathcal{EL}$ Knowledge Bases with Transitive and Reflexive Roles

Authors: Giorgio Stefanoni, Boris Motik

Abstract: Answering conjunctive queries (CQs) over $\mathcal{EL}$ knowledge bases (KBs) with complex role inclusions is PSPACE-hard and in PSPACE in certain cases; however, if complex role inclusions are restricted to role transitivity, the tight upper complexity bound has so far been unknown. Furthermore, the existing algorithms cannot handle reflexive roles, and they are not practicable. Finally, the prob… ▽ More Answering conjunctive queries (CQs) over $\mathcal{EL}$ knowledge bases (KBs) with complex role inclusions is PSPACE-hard and in PSPACE in certain cases; however, if complex role inclusions are restricted to role transitivity, the tight upper complexity bound has so far been unknown. Furthermore, the existing algorithms cannot handle reflexive roles, and they are not practicable. Finally, the problem is tractable for acyclic CQs and $\mathcal{ELH}$, and NP-complete for unrestricted CQs and $\mathcal{ELHO}$ KBs. In this paper we complete the complexity landscape of CQ answering for several important cases. In particular, we present a practicable NP algorithm for answering CQs over $\mathcal{ELHO}^s$ KBs---a logic containing all of OWL 2 EL, but with complex role inclusions restricted to role transitivity. Our preliminary evaluation suggests that the algorithm can be suitable for practical use. Moreover, we show that, even for a restricted class of so-called arborescent acyclic queries, CQ answering over $\mathcal{EL}$ KBs becomes NP-hard in the presence of either transitive or reflexive roles. Finally, we show that answering arborescent CQs over $\mathcal{ELHO}$ KBs is tractable, whereas answering acyclic CQs is NP-hard. △ Less

Submitted 13 May, 2015; v1 submitted 10 November, 2014; originally announced November 2014.

Comments: Extended version of a paper to appear on AAAI-15. In this version of the report, we fixed a few typos; all the results are unchanged

arXiv:1406.4110 [pdf]

doi 10.1613/jair.3949

Acyclicity Notions for Existential Rules and Their Application to Query Answering in Ontologies

Authors: Bernardo Cuenca Grau, Ian Horrocks, Markus Krötzsch, Clemens Kupke, Despoina Magka, Boris Motik, Zhe Wang

Abstract: Answering conjunctive queries (CQs) over a set of facts extended with existential rules is a prominent problem in knowledge representation and databases. This problem can be solved using the chase algorithm, which extends the given set of facts with fresh facts in order to satisfy the rules. If the chase terminates, then CQs can be evaluated directly in the resulting set of facts. The chase, howev… ▽ More Answering conjunctive queries (CQs) over a set of facts extended with existential rules is a prominent problem in knowledge representation and databases. This problem can be solved using the chase algorithm, which extends the given set of facts with fresh facts in order to satisfy the rules. If the chase terminates, then CQs can be evaluated directly in the resulting set of facts. The chase, however, does not terminate necessarily, and checking whether the chase terminates on a given set of rules and facts is undecidable. Numerous acyclicity notions were proposed as sufficient conditions for chase termination. In this paper, we present two new acyclicity notions called model-faithful acyclicity (MFA) and model-summarising acyclicity (MSA). Furthermore, we investigate the landscape of the known acyclicity notions and establish a complete taxonomy of all notions known to us. Finally, we show that MFA and MSA generalise most of these notions. Existential rules are closely related to the Horn fragments of the OWL 2 ontology language; furthermore, several prominent OWL 2 reasoners implement CQ answering by using the chase to materialise all relevant facts. In order to avoid termination problems, many of these systems handle only the OWL 2 RL profile of OWL 2; furthermore, some systems go beyond OWL 2 RL, but without any termination guarantees. In this paper we also investigate whether various acyclicity notions can provide a principled and practical solution to these problems. On the theoretical side, we show that query answering for acyclic ontologies is of lower complexity than for general ontologies. On the practical side, we show that many of the commonly used OWL 2 ontologies are MSA, and that the number of facts obtained by materialisation is not too large. Our results thus suggest that principled development of materialisation-based OWL 2 reasoners is practically feasible. △ Less

Submitted 3 February, 2014; originally announced June 2014.

Journal ref: Journal Of Artificial Intelligence Research, Volume 47, pages 741-808, 2013

arXiv:1401.5853 [pdf]

doi 10.1613/jair.3579

Reasoning over Ontologies with Hidden Content: The Import-by-Query Approach

Authors: Bernardo Cuenca Grau, Boris Motik

Abstract: There is currently a growing interest in techniques for hiding parts of the signature of an ontology Kh that is being reused by another ontology Kv. Towards this goal, in this paper we propose the import-by-query framework, which makes the content of Kh accessible through a limited query interface. If Kv reuses the symbols from Kh in a certain restricted way, one can reason over Kv U Kh by accessi… ▽ More There is currently a growing interest in techniques for hiding parts of the signature of an ontology Kh that is being reused by another ontology Kv. Towards this goal, in this paper we propose the import-by-query framework, which makes the content of Kh accessible through a limited query interface. If Kv reuses the symbols from Kh in a certain restricted way, one can reason over Kv U Kh by accessing only Kv and the query interface. We map out the landscape of the import-by-query problem. In particular, we outline the limitations of our framework and prove that certain restrictions on the expressivity of Kh and the way in which Kv reuses symbols from Kh are strictly necessary to enable reasoning in our setting. We also identify cases in which reasoning is possible and we present suitable import-by-query reasoning algorithms. △ Less

Submitted 22 January, 2014; originally announced January 2014.

Journal ref: Journal Of Artificial Intelligence Research, Volume 45, pages 197-255, 2012

arXiv:1401.4604 [pdf]

doi 10.1613/jair.3470

Completeness Guarantees for Incomplete Ontology Reasoners: Theory and Practice

Authors: Bernardo Cuenca Grau, Boris Motik, Giorgos Stoilos, Ian Horrocks

Abstract: To achieve scalability of query answering, the developers of Semantic Web applications are often forced to use incomplete OWL 2 reasoners, which fail to derive all answers for at least one query, ontology, and data set. The lack of completeness guarantees, however, may be unacceptable for applications in areas such as health care and defence, where missing answers can adversely affect the applicat… ▽ More To achieve scalability of query answering, the developers of Semantic Web applications are often forced to use incomplete OWL 2 reasoners, which fail to derive all answers for at least one query, ontology, and data set. The lack of completeness guarantees, however, may be unacceptable for applications in areas such as health care and defence, where missing answers can adversely affect the applications functionality. Furthermore, even if an application can tolerate some level of incompleteness, it is often advantageous to estimate how many and what kind of answers are being lost. In this paper, we present a novel logic-based framework that allows one to check whether a reasoner is complete for a given query Q and ontology T---that is, whether the reasoner is guaranteed to compute all answers to Q w.r.t. T and an arbitrary data set A. Since ontologies and typical queries are often fixed at application design time, our approach allows application developers to check whether a reasoner known to be incomplete in general is actually complete for the kinds of input relevant for the application. We also present a technique that, given a query Q, an ontology T, and reasoners R_1 and R_2 that satisfy certain assumptions, can be used to determine whether, for each data set A, reasoner R_1 computes more answers to Q w.r.t. T and A than reasoner R_2. This allows application developers to select the reasoner that provides the highest degree of completeness for Q and T that is compatible with the applications scalability requirements. Our results thus provide a theoretical and practical foundation for the design of future ontology-based information systems that maximise scalability while minimising or even eliminating incompleteness of query answers. △ Less

Submitted 18 January, 2014; originally announced January 2014.

Journal ref: Journal Of Artificial Intelligence Research, Volume 43, pages 419-476, 2012

arXiv:1401.3485 [pdf]

doi 10.1613/jair.2811

Hypertableau Reasoning for Description Logics

Authors: Boris Motik, Rob Shearer, Ian Horrocks

Abstract: We present a novel reasoning calculus for the description logic SHOIQ^+---a knowledge representation formalism with applications in areas such as the Semantic Web. Unnecessary nondeterminism and the construction of large models are two primary sources of inefficiency in the tableau-based reasoning calculi used in state-of-the-art reasoners. In order to reduce nondeterminism, we base our calculus o… ▽ More We present a novel reasoning calculus for the description logic SHOIQ^+---a knowledge representation formalism with applications in areas such as the Semantic Web. Unnecessary nondeterminism and the construction of large models are two primary sources of inefficiency in the tableau-based reasoning calculi used in state-of-the-art reasoners. In order to reduce nondeterminism, we base our calculus on hypertableau and hyperresolution calculi, which we extend with a blocking condition to ensure termination. In order to reduce the size of the constructed models, we introduce anywhere pairwise blocking. We also present an improved nominal introduction rule that ensures termination in the presence of nominals, inverse roles, and number restrictions---a combination of DL constructs that has proven notoriously difficult to handle. Our implementation shows significant performance improvements over state-of-the-art reasoners on several well-known ontologies. △ Less

Submitted 15 January, 2014; originally announced January 2014.

Journal ref: Journal Of Artificial Intelligence Research, Volume 36, pages 165-228, 2009

arXiv:1304.1402 [pdf, ps, other]

Computing Datalog Rewritings beyond Horn Ontologies

Authors: Bernardo Cuenca Grau, Boris Motik, Giorgos Stoilos, Ian Horrocks

Abstract: Rewriting-based approaches for answering queries over an OWL 2 DL ontology have so far been developed mainly for Horn fragments of OWL 2 DL. In this paper, we study the possibilities of answering queries over non-Horn ontologies using datalog rewritings. We prove that this is impossible in general even for very simple ontology languages, and even if PTIME = NP. Furthermore, we present a resolution… ▽ More Rewriting-based approaches for answering queries over an OWL 2 DL ontology have so far been developed mainly for Horn fragments of OWL 2 DL. In this paper, we study the possibilities of answering queries over non-Horn ontologies using datalog rewritings. We prove that this is impossible in general even for very simple ontology languages, and even if PTIME = NP. Furthermore, we present a resolution-based procedure for $\SHI$ ontologies that, in case it terminates, produces a datalog rewriting of the ontology. Our procedure necessarily terminates on DL-Lite_{bool}^H ontologies---an extension of OWL 2 QL with transitive roles and Boolean connectives. △ Less

Submitted 8 April, 2013; v1 submitted 4 April, 2013; originally announced April 2013.

Comments: 14 pages. To appear at IJCAI 2013

arXiv:1303.7430 [pdf, other]

Introducing Nominals to the Combined Query Answering Approaches for EL

Authors: Giorgio Stefanoni, Boris Motik, Ian Horrocks

Abstract: So-called combined approaches answer a conjunctive query over a description logic ontology in three steps: first, they materialise certain consequences of the ontology and the data; second, they evaluate the query over the data; and third, they filter the result of the second phase to eliminate unsound answers. Such approaches were developed for various members of the DL-Lite and the EL families o… ▽ More So-called combined approaches answer a conjunctive query over a description logic ontology in three steps: first, they materialise certain consequences of the ontology and the data; second, they evaluate the query over the data; and third, they filter the result of the second phase to eliminate unsound answers. Such approaches were developed for various members of the DL-Lite and the EL families of languages, but none of them can handle ontologies containing nominals. In our work, we bridge this gap and present a combined query answering approach for ELHO---a logic that contains all features of the OWL 2 EL standard apart from transitive roles and complex role inclusions. This extension is nontrivial because nominals require equality reasoning, which introduces complexity into the first and the third step. Our empirical evaluation suggests that our technique is suitable for practical application, and so it provides a practical basis for conjunctive query answering in a large fragment of OWL 2 EL. △ Less

Submitted 1 April, 2013; v1 submitted 29 March, 2013; originally announced March 2013.

Comments: Extended version of a paper to appear on AAAI-13

Showing 1–22 of 22 results for author: Motik, B