
Does It Make Sense to Explain a Black Box With Another Black Box?

Authors: Julien Delaunay, Luis Galárraga, Christine Largouët

Abstract: Although counterfactual explanations are a popular approach to explain ML black-box classifiers, they are less widespread in NLP. Most methods find those explanations by iteratively perturbing the target document until it is classified differently by the black box. We identify two main families of counterfactual explanation methods in the literature, namely, (a) transparent methods that perturb the target by adding, removing, or replacing words, and (b) opaque approaches that project the target document into a latent, non-interpretable space where the perturbation is carried out subsequently. This article offers a comparative study of the performance of these two families of methods on three classical NLP tasks. Our empirical evidence shows that opaque approaches can be an overkill for downstream applications such as fake news detection or sentiment analysis since they add an additional level of complexity with no significant performance gain. These observations motivate our discussion, which raises the question of whether it makes sense to explain a black box using another black box.

Submitted 23 April, 2024; originally announced April 2024.

Comments: This article was originally published in French at the Journal TAL. VOL 64 n°3/2023. arXiv admin note: substantial text overlap with arXiv:2402.10888

MSC Class: 68T50

arXiv:2312.12115 [pdf, other]

Shaping Up SHAP: Enhancing Stability through Layer-Wise Neighbor Selection

Authors: Gwladys Kelodjou, Laurence Rozé, Véronique Masson, Luis Galárraga, Romaric Gaudel, Maurice Tchuente, Alexandre Termier

Abstract: Machine learning techniques, such as deep learning and ensemble methods, are widely used in various domains due to their ability to handle complex real-world tasks. However, their black-box nature has raised multiple concerns about the fairness, trustworthiness, and transparency of computer-assisted decision-making. This has led to the emergence of local post-hoc explainability methods, which offer explanations for individual decisions made by black-box algorithms. Among these methods, Kernel SHAP is widely used due to its model-agnostic nature and its well-founded theoretical framework. Despite these strengths, Kernel SHAP suffers from high instability: different executions of the method with the same inputs can lead to significantly different explanations, which diminishes the relevance of the explanations. The contribution of this paper is two-fold. On the one hand, we show that Kernel SHAP's instability is caused by its stochastic neighbor selection procedure, which we adapt to achieve full stability without compromising explanation fidelity. On the other hand, we show that by restricting the neighbors generation to perturbations of size 1 (which we call the coalitions of Layer 1), we obtain a novel feature-attribution method that is fully stable, computationally efficient, and still meaningful.

Submitted 17 June, 2024; v1 submitted 19 December, 2023; originally announced December 2023.

Journal ref: AAAI Conference on Artificial Intelligence, 2024

arXiv:2302.06967 [pdf, ps, other]

Effects of Locality and Rule Language on Explanations for Knowledge Graph Embeddings

Authors: Luis Galárraga

Abstract: Knowledge graphs (KGs) are key tools in many AI-related tasks such as reasoning or question answering. This has, in turn, propelled research in link prediction in KGs, the task of predicting missing relationships from the available knowledge. Solutions based on KG embeddings have shown promising results in this matter. On the downside, these approaches are usually unable to explain their predictions. While some works have proposed to compute post-hoc rule explanations for embedding-based link predictors, these efforts have mostly resorted to rules with unbounded atoms, e.g., bornIn(x,y) => residence(x,y), learned on a global scope, i.e., the entire KG. None of these works has considered the impact of rules with bounded atoms such as nationality(x,England) => speaks(x,English), or the impact of learning from regions of the KG, i.e., local scopes. We therefore study the effects of these factors on the quality of rule-based explanations for embedding-based link predictors. Our results suggest that more specific rules and local scopes can improve the accuracy of the explanations. Moreover, these rules can provide further insights about the inner workings of KG embeddings for link prediction.

Submitted 14 February, 2023; originally announced February 2023.

Journal ref: Full paper at the Symposium on Intelligent Data Analysis (IDA) 2023

arXiv:2208.01510 [pdf, other]

s-LIME: Reconciling Locality and Fidelity in Linear Explanations

Authors: Romaric Gaudel, Luis Galárraga, Julien Delaunay, Laurence Rozé, Vaishnavi Bhargava

Abstract: The benefit of locality is one of the major premises of LIME, one of the most prominent methods to explain black-box machine learning models. This emphasis relies on the postulate that the more locally we look at the vicinity of an instance, the simpler the black-box model becomes, and the more accurately we can mimic it with a linear surrogate. As logical as this seems, our findings suggest that, with the current design of LIME, the surrogate model may degenerate when the explanation is too local, namely, when the bandwidth parameter $σ$ tends to zero. Based on this observation, the contribution of this paper is twofold. Firstly, we study the impact of both the bandwidth and the training vicinity on the fidelity and semantics of LIME explanations. Secondly, and based on our findings, we propose s-LIME, an extension of LIME that reconciles fidelity and locality.

Submitted 2 August, 2022; originally announced August 2022.

Journal ref: Symposium on Intelligent Data Analysis (IDA'22), Apr 2022, Rennes, France

arXiv:2109.07519 [pdf, other]

Discovering Useful Compact Sets of Sequential Rules in a Long Sequence

Authors: Erwan Bourrand, Luis Galárraga, Esther Galbrun, Elisa Fromont, Alexandre Termier

Abstract: We are interested in understanding the underlying generation process for long sequences of symbolic events. To do so, we propose COSSU, an algorithm to mine small and meaningful sets of sequential rules. The rules are selected using an MDL-inspired criterion that favors compactness and relies on a novel rule-based encoding scheme for sequences. Our evaluation shows that COSSU can successfully retrieve relevant sets of closed sequential rules from a long sequence. Such rules constitute an interpretable model that exhibits competitive accuracy for the tasks of next-element prediction and classification.

Submitted 30 December, 2022; v1 submitted 15 September, 2021; originally announced September 2021.

Comments: 8 pages, published in the proceedings of the 33rd IEEE International Conference on Tools with Artificial Intelligence

arXiv:2104.00148 [pdf, other]

doi 10.1145/3411764

Showing Academic Performance Predictions during Term Planning: Effects on Students' Decisions, Behaviors, and Preferences

Authors: Gonzalo Gabriel Méndez, Luis Galárraga, Katherine Chiluiza

Abstract: Course selection is a crucial activity for students as it directly impacts their workload and performance. It is also time-consuming, prone to subjectivity, and often carried out based on incomplete information. This task can, nevertheless, be assisted with computational tools, for instance, by predicting performance based on historical data. We investigate the effects of showing grade predictions to students through an interactive visualization tool. A qualitative study suggests that in the presence of predictions, students may focus too much on maximizing their performance, to the detriment of other factors such as the workload. A follow-up quantitative study explored whether these effects are mitigated by changing how predictions are conveyed. Our observations suggest the presence of a framing effect that induces students to put more effort into course selection when faced with more specific predictions. We discuss these and other findings and outline considerations for designing better data-driven course selection tools.

Submitted 31 March, 2021; originally announced April 2021.

Comments: 17 pages

arXiv:2102.12370 [pdf, other]

HiPaR: Hierarchical Pattern-aided Regression

Authors: Luis Galárraga, Olivier Pelgrin, Alexandre Termier

Abstract: We introduce HiPaR, a novel pattern-aided regression method for tabular data containing both categorical and numerical attributes. HiPaR mines hybrid rules of the form $p \Rightarrow y = f(X)$ where $p$ is the characterization of a data region and $f(X)$ is a linear regression model on a variable of interest $y$. HiPaR relies on pattern mining techniques to identify regions of the data where the target variable can be accurately explained via local linear models. The novelty of the method lies in the combination of an enumerative approach to explore the space of regions and efficient heuristics that guide the search. Such a strategy provides more flexibility when selecting a small set of jointly accurate and human-readable hybrid rules that explain the entire dataset. As our experiments show, HiPaR mines fewer rules than existing pattern-based regression methods while still attaining state-of-the-art prediction performance.

Submitted 24 February, 2021; originally announced February 2021.

Journal ref: Proceedings of the 25th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2021)

arXiv:1911.01157 [pdf, other]

REMI: Mining Intuitive Referring Expressions on Knowledge Bases

Authors: Luis Galárraga, Julien Delaunay, Jean-Louis Dessalles

Abstract: A referring expression (RE) is a description that identifies a set of instances unambiguously. Mining REs from data finds applications in natural language generation, algorithmic journalism, and data maintenance. Since there may exist multiple REs for a given set of entities, it is common to focus on the most intuitive ones, i.e., the most concise and informative. In this paper, we present REMI, a system that can mine intuitive REs on large RDF knowledge bases. Our experimental evaluation shows that REMI finds REs deemed intuitive by users. Moreover, we show that REMI is several orders of magnitude faster than an approach based on inductive logic programming.

Submitted 4 November, 2019; originally announced November 2019.

arXiv:1612.05786 [pdf, other]

doi 10.1145/3018661.3018739

Predicting Completeness in Knowledge Bases

Authors: Luis Galárraga, Simon Razniewski, Antoine Amarilli, Fabian M. Suchanek

Abstract: Knowledge bases such as Wikidata, DBpedia, or YAGO contain millions of entities and facts. In some knowledge bases, the correctness of these facts has been evaluated. However, much less is known about their completeness, i.e., the proportion of real facts that the knowledge bases cover. In this work, we investigate different signals to identify the areas where a knowledge base is complete. We show that we can combine these signals in a rule mining approach, which allows us to predict where facts may be missing. We also show that completeness predictions can help other applications such as fact prediction.

Submitted 17 December, 2016; originally announced December 2016.

Comments: 21 pages, 19 references, 1 figure, 5 tables. Complete version of the article accepted at WSDM'17

arXiv:1212.5636 [pdf, other]

Partout: A Distributed Engine for Efficient RDF Processing

Authors: Luis Galárraga, Katja Hose, Ralf Schenkel

Abstract: The increasing interest in Semantic Web technologies has led not only to a rapid growth of semantic data on the Web but also to an increasing number of backend applications with already more than a trillion triples in some cases. Confronted with such huge amounts of data and the future growth, existing state-of-the-art systems for storing RDF and processing SPARQL queries are no longer sufficient. In this paper, we introduce Partout, a distributed engine for efficient RDF processing in a cluster of machines. We propose an effective approach for fragmenting RDF data sets based on a query log, allocating the fragments to nodes in a cluster, and finding the optimal configuration. Partout can efficiently handle updates and its query optimizer produces efficient query execution plans for ad-hoc SPARQL queries. Our experiments show the superiority of our approach to state-of-the-art approaches for partitioning and distributed SPARQL query processing.

Submitted 21 December, 2012; originally announced December 2012.

Showing 1–10 of 10 results for author: Galárraga, L