-
Veni, Vidi, Vici: Solving the Myriad of Challenges before Knowledge Graph Learning
Authors:
Jeffrey Sardina,
Luca Costabello,
Christophe Guéret
Abstract:
Knowledge Graphs (KGs) have become increasingly common for representing large-scale linked data. However, their immense size has required graph learning systems to assist humans in analysis, interpretation, and pattern detection. While there have been promising results for researcher- and clinician- empowerment through a variety of KG learning systems, we identify four key deficiencies in state-of-the-art graph learning that simultaneously limit KG learning performance and diminish the ability of humans to interface optimally with these learning systems. These deficiencies are: 1) lack of expert knowledge integration, 2) instability to node degree extremity in the KG, 3) lack of consideration for uncertainty and relevance while learning, and 4) lack of explainability. Furthermore, we characterise state-of-the-art attempts to solve each of these problems and note that each attempt has largely been isolated from attempts to solve the other problems. Through a formalisation of these problems and a review of the literature that addresses them, we adopt the position that not only are deficiencies in these four key areas holding back human-KG empowerment, but that the divide-and-conquer approach to solving these problems as individual units rather than a whole is a significant barrier to the interface between humans and KG learning systems. We propose that it is only through integrated, holistic solutions to the limitations of KG learning systems that human and KG learning co-empowerment will be efficiently effected. We finally present our "Veni, Vidi, Vici" framework that sets a roadmap for effectively and efficiently shifting to a holistic co-empowerment model in both the KG learning and the broader machine learning domain.
Submitted 8 February, 2024;
originally announced February 2024.
-
Explaining Groups of Instances Counterfactually for XAI: A Use Case, Algorithm and User Study for Group-Counterfactuals
Authors:
Greta Warren,
Mark T. Keane,
Christophe Gueret,
Eoin Delaney
Abstract:
Counterfactual explanations are an increasingly popular form of post hoc explanation due to their (i) applicability across problem domains, (ii) proposed legal compliance (e.g., with GDPR), and (iii) reliance on the contrastive nature of human explanation. Although counterfactual explanations are normally used to explain individual predictive-instances, we explore a novel use case in which groups of similar instances are explained in a collective fashion using "group counterfactuals" (e.g., to highlight a repeating pattern of illness in a group of patients). These group counterfactuals meet a human preference for coherent, broad explanations covering multiple events/instances. A novel, group-counterfactual algorithm is proposed to generate high-coverage explanations that are faithful to the to-be-explained model. This explanation strategy is also evaluated in a large, controlled user study (N=207), using objective (i.e., accuracy) and subjective (i.e., confidence, explanation satisfaction, and trust) psychological measures. The results show that group counterfactuals elicit modest but definite improvements in people's understanding of an AI system. The implications of these findings for counterfactual methods and for XAI are discussed.
Submitted 16 March, 2023;
originally announced March 2023.
-
Predicting Illness for a Sustainable Dairy Agriculture: Predicting and Explaining the Onset of Mastitis in Dairy Cows
Authors:
Cathal Ryan,
Christophe Guéret,
Donagh Berry,
Medb Corcoran,
Mark T. Keane,
Brian Mac Namee
Abstract:
Mastitis is a billion-dollar health problem for the modern dairy industry, with implications for antibiotic resistance. The use of AI techniques to identify the early onset of this disease therefore has significant implications for the sustainability of this agricultural sector. Current approaches to treating mastitis involve antibiotics, and this practice is coming under ever-increasing scrutiny. Using machine learning models to identify cows at risk of developing mastitis and applying targeted treatment regimes to only those animals promotes a more sustainable approach. Incorrect predictions from such models, however, can lead to monetary losses, unnecessary use of antibiotics, and even the premature death of animals, so it is important to generate compelling explanations for predictions to build trust with users and to better support their decision making. In this paper we demonstrate a system developed to predict mastitis infections in cows and provide explanations of these predictions using counterfactuals. We demonstrate the system and describe the engagement with farmers undertaken to build it.
Submitted 7 January, 2021; v1 submitted 6 January, 2021;
originally announced January 2021.
-
Can We Detect Mastitis earlier than Farmers?
Authors:
Cathal Ryan,
Christophe Guéret,
Donagh Berry,
Brian Mac Namee
Abstract:
The aim of this study was to build a modelling framework that uses machine learning techniques to detect mastitis infections before they would normally be found by farmers. To this end we created two modelling frameworks: SMA, which detects Sub Clinical mastitis infections one Somatic Cell Count recording in advance, and AMA, which tries to detect both Sub Clinical and Clinical mastitis (CM) infections at any time the cow is milked. We also introduce two feature sets representing different characteristics that should be taken into account when detecting infections: how a cow differs from the farm mean, and trends in the lactation. The results for SMA are better than those for AMA on Sub Clinical infections, yet SMA has the significant disadvantage of only being able to classify Sub Clinical infections, because we recorded a Sub Clinical infection as any time a Somatic Cell Count measurement went above a certain threshold, whereas CM could appear at any stage of lactation. Thus, in some cases the lower accuracy values for AMA might in fact be more beneficial to farmers.
Submitted 5 November, 2020;
originally announced November 2020.
-
Background Knowledge Injection for Interpretable Sequence Classification
Authors:
Severin Gsponer,
Luca Costabello,
Chan Le Van,
Sumit Pai,
Christophe Gueret,
Georgiana Ifrim,
Freddy Lecue
Abstract:
Sequence classification is the supervised learning task of building models that predict class labels of unseen sequences of symbols. Although accuracy is paramount, in certain scenarios interpretability is a must. Unfortunately, such a trade-off is often hard to achieve since we lack human-independent interpretability metrics. We introduce a novel sequence learning algorithm that combines (i) linear classifiers, which are known to strike a good balance between predictive power and interpretability, and (ii) background knowledge embeddings. We extend the classic subsequence feature space with groups of symbols which are generated by background knowledge injected via word or graph embeddings, and use this new feature space to learn a linear classifier. We also present a new measure to evaluate the interpretability of a set of symbolic features based on the symbol embeddings. Experiments on human activity recognition from wearables and amino acid sequence classification show that our classification approach preserves predictive power, while delivering more interpretable models.
Submitted 25 June, 2020;
originally announced June 2020.
-
Release Early, Release Often: Predicting Change in Versioned Knowledge Organization Systems on the Web
Authors:
Albert Meroño-Peñuela,
Christophe Guéret,
Stefan Schlobach
Abstract:
The Semantic Web is built on top of Knowledge Organization Systems (KOS) (vocabularies, ontologies, concept schemes) that provide a structured, interoperable and distributed access to Linked Data on the Web. The maintenance of these KOS over time has produced a number of KOS version chains: subsequent unique version identifiers to unique states of a KOS. However, the release of new KOS versions poses challenges to both KOS publishers and users. For publishers, updating a KOS is a knowledge intensive task that requires a lot of manual effort, often implying deep deliberation on the set of changes to introduce. For users that link their datasets to these KOS, a new version compromises the validity of their links, often creating ramifications. In this paper we describe a method to automatically detect which parts of a Web KOS are likely to change in a next version, using supervised learning on past versions in the KOS version chain. We use a set of ontology change features to model and predict change in arbitrary Web KOS. We apply our method on 139 varied datasets systematically retrieved from the Semantic Web, obtaining robust results at correctly predicting change. To illustrate the accuracy, genericity and domain independence of the method, we study the relationship between its effectiveness and several characterizations of the evaluated datasets, finding that predictors like the number of versions in a chain and their release frequency have a fundamental impact on the predictability of change in Web KOS. Consequently, we argue for adopting a release early, release often philosophy in Web KOS development cycles.
Submitted 15 September, 2015; v1 submitted 12 May, 2015;
originally announced May 2015.
-
Knowledge Maps and Information Retrieval (KMIR)
Authors:
Peter Mutschke,
Andrea Scharnhorst,
Christophe Guéret,
Philipp Mayr,
Preben Hansen,
Aida Slavic
Abstract:
Information systems usually show as a particular point of failure the vagueness between user search terms and the knowledge orders of the information space in question. Some kind of guided searching therefore becomes more and more important in order to precisely discover information without knowing the right search terms. Knowledge maps of digital library collections are promising navigation tools through knowledge spaces but still far away from being applicable for searching digital libraries. However, there is no continuous knowledge exchange between the "map makers" on the one hand and the Information Retrieval (IR) specialists on the other hand. Thus, there is also a lack of models that properly combine insights of the two strands. The proposed workshop aims at bringing together these two communities: experts in IR reflecting on visually enhanced search interfaces and experts in knowledge mapping reflecting on visualizations of the content of a collection that might also present a context for a search term in a visual manner. The intention of the workshop is to raise awareness of the potential of interactive knowledge maps for information seeking purposes and to create a common ground for experiments aiming at the incorporation of knowledge maps into IR models at the level of the user interface.
Submitted 30 May, 2014;
originally announced May 2014.
-
The Entity Registry System: Implementing 5-Star Linked Data Without the Web
Authors:
Marat Charlaganov,
Philippe Cudré-Mauroux,
Cristian Dinu,
Christophe Guéret,
Martin Grund,
Teodor Macicas
Abstract:
Linked Data applications often assume that connectivity to data repositories and entity resolution services are always available. This may not be a valid assumption in many cases. Indeed, there are about 4.5 billion people in the world who have no or limited Web access. Many data-driven applications may have a critical impact on the life of those people, but are inaccessible to those populations due to the architecture of today's data registries. In this paper, we propose and evaluate a new open-source system that can be used as a general-purpose entity registry suitable for deployment in poorly-connected or ad-hoc environments.
Submitted 15 August, 2013;
originally announced August 2013.
-
Genericity versus expressivity - an exercise in semantic interoperable research information systems for Web Science
Authors:
Christophe Guéret,
Tamy Chambers,
Linda Reijnhoudt,
Frank van der Most,
Andrea Scharnhorst
Abstract:
The web does not only enable new forms of science, it also creates new possibilities to study science and new digital scholarship. This paper brings together multiple perspectives: from individual researchers seeking the best options to display their activities and market their skills on the academic job market; to academic institutions, national funding agencies, and countries needing to monitor the science system and account for public money spending. We also address the research interests aimed at better understanding the self-organising and complex nature of the science system through researcher tracing, the identification of the emergence of new fields, and knowledge discovery using large-data mining and non-linear dynamics. In particular this paper draws attention to the need for standardisation and data interoperability in the area of research information as an indispensable pre-condition for any science modelling. We discuss which levels of complexity are needed to provide a globally, interoperable, and expressive data infrastructure for research information. With possible dynamic science model applications in mind, we introduce the need for a "middle-range" level of complexity for data representation and propose a conceptual model for research data based on a core international ontology with national and local extensions.
Submitted 21 April, 2013;
originally announced April 2013.