-
Swift Markov Logic for Probabilistic Reasoning on Knowledge Graphs
Authors:
Luigi Bellomarini,
Eleonora Laurenza,
Emanuel Sallinger,
Evgeny Sherkhonov
Abstract:
We provide a framework for probabilistic reasoning in Vadalog-based Knowledge Graphs (KGs), satisfying the requirements of ontological reasoning: full recursion, powerful existential quantification, expression of inductive definitions. Vadalog is a Knowledge Representation and Reasoning (KRR) language based on Warded Datalog+/-, a logical core language of existential rules, with a good balance bet…
▽ More
We provide a framework for probabilistic reasoning in Vadalog-based Knowledge Graphs (KGs), satisfying the requirements of ontological reasoning: full recursion, powerful existential quantification, expression of inductive definitions. Vadalog is a Knowledge Representation and Reasoning (KRR) language based on Warded Datalog+/-, a logical core language of existential rules, with a good balance between computational complexity and expressive power. Handling uncertainty is essential for reasoning with KGs. Yet Vadalog and Warded Datalog+/- are not covered by the existing probabilistic logic programming and statistical relational learning approaches for several reasons, including insufficient support for recursion with existential quantification, and the impossibility to express inductive definitions. In this work, we introduce Soft Vadalog, a probabilistic extension to Vadalog, satisfying these desiderata. A Soft Vadalog program induces what we call a Probabilistic Knowledge Graph (PKG), which consists of a probability distribution on a network of chase instances, structures obtained by grounding the rules over a database using the chase procedure. We exploit PKGs for probabilistic marginal inference. We discuss the theory and present MCMC-chase, a Monte Carlo method to use Soft Vadalog in practice. We apply our framework to solve data management and industrial problems, and experimentally evaluate it in the Vadalog system.
Under consideration in Theory and Practice of Logic Programming (TPLP).
△ Less
Submitted 1 October, 2022;
originally announced October 2022.
-
Enriching a Fashion Knowledge Graph from Product Textual Descriptions
Authors:
João Barroca,
Abhishek Shivkumar,
Beatriz Quintino Ferreira,
Evgeny Sherkhonov,
João Faria
Abstract:
Knowledge Graphs offer a very useful and powerful structure for representing information, consequently, they have been adopted as the backbone for many applications in e-commerce scenarios. In this paper, we describe an application of existing techniques for enriching thelarge-scale Fashion Knowledge Graph (FKG) that we build at Farfetch. In particular, we apply techniques for named entity recogni…
▽ More
Knowledge Graphs offer a very useful and powerful structure for representing information, consequently, they have been adopted as the backbone for many applications in e-commerce scenarios. In this paper, we describe an application of existing techniques for enriching thelarge-scale Fashion Knowledge Graph (FKG) that we build at Farfetch. In particular, we apply techniques for named entity recognition (NER) and entity linking (EL) in order to extract and link rich metadata from product textual descriptions to entities in the FKG. Having a complete and enriched FKG as an e-commerce backbone can have a highly valuable impact on downstream applications such as search and recommendations. However, enriching a Knowledge Graph in the fashion domain has its own challenges. Data representation is different from a more generic KG, like Wikidata and Yago, as entities (e.g. product attributes) are too specific to the domain, and long textual descriptions are not readily available. Data itself is also scarce, as labelling datasets to train supervised models is a very laborious task. Even more, fashion products display a high variability and require an intricate ontology of attributes to link to. We use a transfer learning based approach to train an NER module on a small amount of manually labeled data, followed by an EL module that links the previously identified named entities to the appropriate entities within the FKG. Experiments using a pre-trained model show that it is possible to achieve 89.75% accuracy in NER even with a small manually labeled dataset. Moreover, the EL module, despite relying on simple rule-based or ML models (due to lack of training data), is able to link relevant attributes to products, thus automatically enriching the FKG.
△ Less
Submitted 2 June, 2022;
originally announced June 2022.
-
Data Science with Vadalog: Bridging Machine Learning and Reasoning
Authors:
Luigi Bellomarini,
Ruslan R. Fayzrakhmanov,
Georg Gottlob,
Andrey Kravchenko,
Eleonora Laurenza,
Yavor Nenov,
Stephane Reissfelder,
Emanuel Sallinger,
Evgeny Sherkhonov,
Lianlong Wu
Abstract:
Following the recent successful examples of large technology companies, many modern enterprises seek to build knowledge graphs to provide a unified view of corporate knowledge and to draw deep insights using machine learning and logical reasoning. There is currently a perceived disconnect between the traditional approaches for data science, typically based on machine learning and statistical model…
▽ More
Following the recent successful examples of large technology companies, many modern enterprises seek to build knowledge graphs to provide a unified view of corporate knowledge and to draw deep insights using machine learning and logical reasoning. There is currently a perceived disconnect between the traditional approaches for data science, typically based on machine learning and statistical modelling, and systems for reasoning with domain knowledge. In this paper we present a state-of-the-art Knowledge Graph Management System, Vadalog, which delivers highly expressive and efficient logical reasoning and provides seamless integration with modern data science toolkits, such as the Jupyter platform. We demonstrate how to use Vadalog to perform traditional data wrangling tasks, as well as complex logical and probabilistic reasoning. We argue that this is a significant step forward towards combining machine learning and reasoning in data science.
△ Less
Submitted 23 July, 2018;
originally announced July 2018.
-
Containment for Conditional Tree Patterns
Authors:
Alessandro Facchini,
Yoichi Hirai,
Maarten Marx,
Evgeny Sherkhonov
Abstract:
A Conditional Tree Pattern (CTP) expands an XML tree pattern with labels attached to the descendant edges. These labels can be XML element names or Boolean CTPs. The meaning of a descendant edge labelled by A and ending in a node labelled by B is a path of child steps ending in a B node such that all intermediate nodes are A nodes. In effect this expresses the until B, A holds construction from t…
▽ More
A Conditional Tree Pattern (CTP) expands an XML tree pattern with labels attached to the descendant edges. These labels can be XML element names or Boolean CTPs. The meaning of a descendant edge labelled by A and ending in a node labelled by B is a path of child steps ending in a B node such that all intermediate nodes are A nodes. In effect this expresses the until B, A holds construction from temporal logic.This paper studies the containment problem for CTP. For tree patterns (TP), this problem is known to be coNP-complete. We show that it is PSPACE-complete for CTP. This increase in complexity is due to the fact that CTP is expressive enough to encode an unrestricted form of label negation: ${*}\setminus a$, meaning "any node except an a-node". Containment of TP expanded with this type of negation is already PSPACE-hard. CTP is a positive, forward, first order fragment of Regular XPath. Unlike TP, CTP expanded with disjunction is not equivalent to unions of CTP's. Like TP, CTP is a natural fragment to consider: CTP is closed under intersections and CTP with disjunction is equally expressive as positive existential first order logic expanded with the until operator.
△ Less
Submitted 6 June, 2015; v1 submitted 7 March, 2015;
originally announced March 2015.
-
High-Level Why-Not Explanations using Ontologies
Authors:
Balder ten Cate,
Cristina Civili,
Evgeny Sherkhonov,
Wang-Chiew Tan
Abstract:
We propose a novel foundational framework for why-not explanations, that is, explanations for why a tuple is missing from a query result. Our why-not explanations leverage concepts from an ontology to provide high-level and meaningful reasons for why a tuple is missing from the result of a query. A key algorithmic problem in our framework is that of computing a most-general explanation for a why-n…
▽ More
We propose a novel foundational framework for why-not explanations, that is, explanations for why a tuple is missing from a query result. Our why-not explanations leverage concepts from an ontology to provide high-level and meaningful reasons for why a tuple is missing from the result of a query. A key algorithmic problem in our framework is that of computing a most-general explanation for a why-not question, relative to an ontology, which can either be provided by the user, or it may be automatically derived from the data and/or schema. We study the complexity of this problem and associated problems, and present concrete algorithms for computing why-not explanations. In the case where an external ontology is provided, we first show that the problem of deciding the existence of an explanation to a why-not question is NP-complete in general. However, the problem is solvable in polynomial time for queries of bounded arity, provided that the ontology is specified in a suitable language, such as a member of the DL-Lite family of description logics, which allows for efficient concept subsumption checking. Furthermore, we show that a most-general explanation can be computed in polynomial time in this case. In addition, we propose a method for deriving a suitable (virtual) ontology from a database and/or a data workspace schema, and we present an algorithm for computing a most-general explanation to a why-not question, relative to such ontologies. This algorithm runs in polynomial-time in the case when concepts are defined in a selection-free language, or if the underlying schema is fixed. Finally, we also study the problem of computing short most-general explanations, and we briefly discuss alternative definitions of what it means to be an explanation, and to be most general.
△ Less
Submitted 31 March, 2015; v1 submitted 7 December, 2014;
originally announced December 2014.