Search | arXiv e-print repository

FocusLearn: Fully-Interpretable, High-Performance Modular Neural Networks for Time Series

Authors: Qiqi Su, Christos Kloukinas, Artur d'Avila Garcez

Abstract: Multivariate time series have many applications, from healthcare and meteorology to life science. Although deep learning models have shown excellent predictive performance for time series, they have been criticised for being "black-boxes" or non-interpretable. This paper proposes a novel modular neural network model for multivariate time series prediction that is interpretable by construction. A r… ▽ More Multivariate time series have many applications, from healthcare and meteorology to life science. Although deep learning models have shown excellent predictive performance for time series, they have been criticised for being "black-boxes" or non-interpretable. This paper proposes a novel modular neural network model for multivariate time series prediction that is interpretable by construction. A recurrent neural network learns the temporal dependencies in the data while an attention-based feature selection component selects the most relevant features and suppresses redundant features used in the learning of the temporal dependencies. A modular deep network is trained from the selected features independently to show the users how features influence outcomes, making the model interpretable. Experimental results show that this approach can outperform state-of-the-art interpretable Neural Additive Models (NAM) and variations thereof in both regression and classification of time series tasks, achieving a predictive performance that is comparable to the top non-interpretable methods for time series, LSTM and XGBoost. △ Less

Submitted 3 May, 2024; v1 submitted 28 November, 2023; originally announced November 2023.

arXiv:2310.19174 [pdf]

Predicting recovery following stroke: deep learning, multimodal data and feature selection using explainable AI

Authors: Adam White, Margarita Saranti, Artur d'Avila Garcez, Thomas M. H. Hope, Cathy J. Price, Howard Bowman

Abstract: Machine learning offers great potential for automated prediction of post-stroke symptoms and their response to rehabilitation. Major challenges for this endeavour include the very high dimensionality of neuroimaging data, the relatively small size of the datasets available for learning, and how to effectively combine neuroimaging and tabular data (e.g. demographic information and clinical characte… ▽ More Machine learning offers great potential for automated prediction of post-stroke symptoms and their response to rehabilitation. Major challenges for this endeavour include the very high dimensionality of neuroimaging data, the relatively small size of the datasets available for learning, and how to effectively combine neuroimaging and tabular data (e.g. demographic information and clinical characteristics). This paper evaluates several solutions based on two strategies. The first is to use 2D images that summarise MRI scans. The second is to select key features that improve classification accuracy. Additionally, we introduce the novel approach of training a convolutional neural network (CNN) on images that combine regions-of-interest extracted from MRIs, with symbolic representations of tabular data. We evaluate a series of CNN architectures (both 2D and a 3D) that are trained on different representations of MRI and tabular data, to predict whether a composite measure of post-stroke spoken picture description ability is in the aphasic or non-aphasic range. MRI and tabular data were acquired from 758 English speaking stroke survivors who participated in the PLORAS study. The classification accuracy for a baseline logistic regression was 0.678 for lesion size alone, rising to 0.757 and 0.813 when initial symptom severity and recovery time were successively added. The highest classification accuracy 0.854 was observed when 8 regions-of-interest was extracted from each MRI scan and combined with lesion size, initial severity and recovery time in a 2D Residual Neural Network.Our findings demonstrate how imaging and tabular data can be combined for high post-stroke classification accuracy, even when the dataset is small in machine learning terms. We conclude by proposing how the current models could be improved to achieve even higher levels of accuracy using images from hospital scanners. △ Less

Submitted 29 October, 2023; originally announced October 2023.

arXiv:2308.14898

doi 10.4204/EPTCS.385

Proceedings 39th International Conference on Logic Programming

Authors: Enrico Pontelli, Stefania Costantini, Carmine Dodaro, Sarah Gaggl, Roberta Calegari, Artur D'Avila Garcez, Francesco Fabiano, Alessandra Mileo, Alessandra Russo, Francesca Toni

Abstract: This volume contains the Technical Communications presented at the 39th International Conference on Logic Programming (ICLP 2023), held at Imperial College London, UK from July 9 to July 15, 2023. Technical Communications included here concern the Main Track, the Doctoral Consortium, the Application and Systems/Demo track, the Recently Published Research Track, the Birds-of-a-Feather track, the Th… ▽ More This volume contains the Technical Communications presented at the 39th International Conference on Logic Programming (ICLP 2023), held at Imperial College London, UK from July 9 to July 15, 2023. Technical Communications included here concern the Main Track, the Doctoral Consortium, the Application and Systems/Demo track, the Recently Published Research Track, the Birds-of-a-Feather track, the Thematic Tracks on Logic Programming and Machine Learning, and Logic Programming and Explainability, Ethics, and Trustworthiness. △ Less

Submitted 28 August, 2023; originally announced August 2023.

Journal ref: EPTCS 385, 2023

arXiv:2212.12050 [pdf, other]

A Semantic Framework for Neural-Symbolic Computing

Authors: Simon Odense, Artur d'Avila Garcez

Abstract: Two approaches to AI, neural networks and symbolic systems, have been proven very successful for an array of AI problems. However, neither has been able to achieve the general reasoning ability required for human-like intelligence. It has been argued that this is due to inherent weaknesses in each approach. Luckily, these weaknesses appear to be complementary, with symbolic systems being adept at… ▽ More Two approaches to AI, neural networks and symbolic systems, have been proven very successful for an array of AI problems. However, neither has been able to achieve the general reasoning ability required for human-like intelligence. It has been argued that this is due to inherent weaknesses in each approach. Luckily, these weaknesses appear to be complementary, with symbolic systems being adept at the kinds of things neural networks have trouble with and vice-versa. The field of neural-symbolic AI attempts to exploit this asymmetry by combining neural networks and symbolic AI into integrated systems. Often this has been done by encoding symbolic knowledge into neural networks. Unfortunately, although many different methods for this have been proposed, there is no common definition of an encoding to compare them. We seek to rectify this problem by introducing a semantic framework for neural-symbolic AI, which is then shown to be general enough to account for a large family of neural-symbolic systems. We provide a number of examples and proofs of the application of the framework to the neural encoding of various forms of knowledge representation and neural network. These, at first sight disparate approaches, are all shown to fall within the framework's formal definition of what we call semantic encoding for neural-symbolic AI. △ Less

Submitted 27 June, 2024; v1 submitted 22 December, 2022; originally announced December 2022.

arXiv:2210.07117 [pdf, other]

Graph-based Neural Modules to Inspect Attention-based Architectures: A Position Paper

Authors: Breno W. Carvalho, Artur D'Avilla Garcez, Luis C. Lamb

Abstract: Encoder-decoder architectures are prominent building blocks of state-of-the-art solutions for tasks across multiple fields where deep learning (DL) or foundation models play a key role. Although there is a growing community working on the provision of interpretation for DL models as well as considerable work in the neuro-symbolic community seeking to integrate symbolic representations and DL, many… ▽ More Encoder-decoder architectures are prominent building blocks of state-of-the-art solutions for tasks across multiple fields where deep learning (DL) or foundation models play a key role. Although there is a growing community working on the provision of interpretation for DL models as well as considerable work in the neuro-symbolic community seeking to integrate symbolic representations and DL, many open questions remain around the need for better tools for visualization of the inner workings of DL architectures. In particular, encoder-decoder models offer an exciting opportunity for visualization and editing by humans of the knowledge implicitly represented in model weights. In this work, we explore ways to create an abstraction for segments of the network as a two-way graph-based representation. Changes to this graph structure should be reflected directly in the underlying tensor representations. Such two-way graph representation enables new neuro-symbolic systems by leveraging the pattern recognition capabilities of the encoder-decoder along with symbolic reasoning carried out on the graphs. The approach is expected to produce new ways of interacting with DL models but also to improve performance as a result of the combination of learning and reasoning capabilities. △ Less

Submitted 13 October, 2022; originally announced October 2022.

arXiv:2112.11805 [pdf, other]

Neural-Symbolic Integration for Interactive Learning and Conceptual Grounding

Authors: Benedikt Wagner, Artur d'Avila Garcez

Abstract: We propose neural-symbolic integration for abstract concept explanation and interactive learning. Neural-symbolic integration and explanation allow users and domain-experts to learn about the data-driven decision making process of large neural models. The models are queried using a symbolic logic language. Interaction with the user then confirms or rejects a revision of the neural model using logi… ▽ More We propose neural-symbolic integration for abstract concept explanation and interactive learning. Neural-symbolic integration and explanation allow users and domain-experts to learn about the data-driven decision making process of large neural models. The models are queried using a symbolic logic language. Interaction with the user then confirms or rejects a revision of the neural model using logic-based constraints that can be distilled into the model architecture. The approach is illustrated using the Logic Tensor Network framework alongside Concept Activation Vectors and applied to a Convolutional Neural Network. △ Less

Submitted 17 January, 2022; v1 submitted 22 December, 2021; originally announced December 2021.

Comments: Corrected references

Journal ref: 1st Workshop on Human and Machine Decisions (WHMD 2021), NeurIPS 2021

arXiv:2112.05841 [pdf, other]

Logical Boltzmann Machines

Authors: Son N. Tran, Artur d'Avila Garcez

Abstract: The idea of representing symbolic knowledge in connectionist systems has been a long-standing endeavour which has attracted much attention recently with the objective of combining machine learning and scalable sound reasoning. Early work has shown a correspondence between propositional logic and symmetrical neural networks which nevertheless did not scale well with the number of variables and whos… ▽ More The idea of representing symbolic knowledge in connectionist systems has been a long-standing endeavour which has attracted much attention recently with the objective of combining machine learning and scalable sound reasoning. Early work has shown a correspondence between propositional logic and symmetrical neural networks which nevertheless did not scale well with the number of variables and whose training regime was inefficient. In this paper, we introduce Logical Boltzmann Machines (LBM), a neurosymbolic system that can represent any propositional logic formula in strict disjunctive normal form. We prove equivalence between energy minimization in LBM and logical satisfiability thus showing that LBM is capable of sound reasoning. We evaluate reasoning empirically to show that LBM is capable of finding all satisfying assignments of a class of logical formulae by searching fewer than 0.75% of the possible (approximately 1 billion) assignments. We compare learning in LBM with a symbolic inductive logic programming system, a state-of-the-art neurosymbolic system and a purely neural network-based system, achieving better learning performance in five out of seven data sets. △ Less

Submitted 10 December, 2021; originally announced December 2021.

Comments: 15 pages, 5 figures, 2 tables

MSC Class: 68T01 ACM Class: I.2.4; I.2.6; I.2.11

arXiv:2111.14260 [pdf, other]

A Practical guide on Explainable AI Techniques applied on Biomedical use case applications

Authors: Adrien Bennetot, Ivan Donadello, Ayoub El Qadi, Mauro Dragoni, Thomas Frossard, Benedikt Wagner, Anna Saranti, Silvia Tulli, Maria Trocan, Raja Chatila, Andreas Holzinger, Artur d'Avila Garcez, Natalia Díaz-Rodríguez

Abstract: Last years have been characterized by an upsurge of opaque automatic decision support systems, such as Deep Neural Networks (DNNs). Although they have great generalization and prediction skills, their functioning does not allow obtaining detailed explanations of their behaviour. As opaque machine learning models are increasingly being employed to make important predictions in critical environments… ▽ More Last years have been characterized by an upsurge of opaque automatic decision support systems, such as Deep Neural Networks (DNNs). Although they have great generalization and prediction skills, their functioning does not allow obtaining detailed explanations of their behaviour. As opaque machine learning models are increasingly being employed to make important predictions in critical environments, the danger is to create and use decisions that are not justifiable or legitimate. Therefore, there is a general agreement on the importance of endowing machine learning models with explainability. EXplainable Artificial Intelligence (XAI) techniques can serve to verify and certify model outputs and enhance them with desirable notions such as trustworthiness, accountability, transparency and fairness. This guide is meant to be the go-to handbook for any audience with a computer science background aiming at getting intuitive insights on machine learning models, accompanied with straight, fast, and intuitive explanations out of the box. This article aims to fill the lack of compelling XAI guide by applying XAI techniques in their particular day-to-day models, datasets and use-cases. Figure 1 acts as a flowchart/map for the reader and should help him to find the ideal method to use according to his type of data. In each chapter, the reader will find a description of the proposed method as well as an example of use on a Biomedical application and a Python notebook. It can be easily modified in order to be applied to specific applications. △ Less

Submitted 5 September, 2022; v1 submitted 13 November, 2021; originally announced November 2021.

arXiv:2109.12546 [pdf, other]

Synthetic Data Generation for Fraud Detection using GANs

Authors: Charitos Charitou, Simo Dragicevic, Artur d'Avila Garcez

Abstract: Detecting money laundering in gambling is becoming increasingly challenging for the gambling industry as consumers migrate to online channels. Whilst increasingly stringent regulations have been applied over the years to prevent money laundering in gambling, despite this, online gambling is still a channel for criminals to spend proceeds from crime. Complementing online gambling's growth more conc… ▽ More Detecting money laundering in gambling is becoming increasingly challenging for the gambling industry as consumers migrate to online channels. Whilst increasingly stringent regulations have been applied over the years to prevent money laundering in gambling, despite this, online gambling is still a channel for criminals to spend proceeds from crime. Complementing online gambling's growth more concerns are raised to its effects compared with gambling in traditional, physical formats, as it might introduce higher levels of problem gambling or fraudulent behaviour due to its nature of immediate interaction with online gambling experience. However, in most cases the main issue when organisations try to tackle those areas is the absence of high quality data. Since fraud detection related issues face the significant problem of the class imbalance, in this paper we propose a novel system based on Generative Adversarial Networks (GANs) for generating synthetic data in order to train a supervised classifier. Our framework Synthetic Data Generation GAN (SDG-GAN), manages to outperformed density based over-sampling methods and improve the classification performance of benchmarks datasets and the real world gambling fraud dataset. △ Less

Submitted 26 September, 2021; originally announced September 2021.

Comments: 8 pages

arXiv:2109.09809 [pdf, other]

Counterfactual Instances Explain Little

Authors: Adam White, Artur d'Avila Garcez

Abstract: In many applications, it is important to be able to explain the decisions of machine learning systems. An increasingly popular approach has been to seek to provide \emph{counterfactual instance explanations}. These specify close possible worlds in which, contrary to the facts, a person receives their desired decision from the machine learning system. This paper will draw on literature from the phi… ▽ More In many applications, it is important to be able to explain the decisions of machine learning systems. An increasingly popular approach has been to seek to provide \emph{counterfactual instance explanations}. These specify close possible worlds in which, contrary to the facts, a person receives their desired decision from the machine learning system. This paper will draw on literature from the philosophy of science to argue that a satisfactory explanation must consist of both counterfactual instances and a causal equation (or system of equations) that support the counterfactual instances. We will show that counterfactual instances by themselves explain little. We will further illustrate how explainable AI methods that provide both causal equations and counterfactual instances can successfully explain machine learning predictions. △ Less

Submitted 20 September, 2021; originally announced September 2021.

arXiv:2106.14556 [pdf, other]

Contrastive Counterfactual Visual Explanations With Overdetermination

Authors: Adam White, Kwun Ho Ngan, James Phelan, Saman Sadeghi Afgeh, Kevin Ryan, Constantino Carlos Reyes-Aldasoro, Artur d'Avila Garcez

Abstract: A novel explainable AI method called CLEAR Image is introduced in this paper. CLEAR Image is based on the view that a satisfactory explanation should be contrastive, counterfactual and measurable. CLEAR Image explains an image's classification probability by contrasting the image with a corresponding image generated automatically via adversarial learning. This enables both salient segmentation and… ▽ More A novel explainable AI method called CLEAR Image is introduced in this paper. CLEAR Image is based on the view that a satisfactory explanation should be contrastive, counterfactual and measurable. CLEAR Image explains an image's classification probability by contrasting the image with a corresponding image generated automatically via adversarial learning. This enables both salient segmentation and perturbations that faithfully determine each segment's importance. CLEAR Image was successfully applied to a medical imaging case study where it outperformed methods such as Grad-CAM and LIME by an average of 27% using a novel pointing game metric. CLEAR Image excels in identifying cases of "causal overdetermination" where there are multiple patches in an image, any one of which is sufficient by itself to cause the classification probability to be close to one. △ Less

Submitted 9 June, 2022; v1 submitted 28 June, 2021; originally announced June 2021.

arXiv:2012.13635 [pdf, other]

doi 10.1016/j.artint.2021.103649

Logic Tensor Networks

Authors: Samy Badreddine, Artur d'Avila Garcez, Luciano Serafini, Michael Spranger

Abstract: Artificial Intelligence agents are required to learn from their surroundings and to reason about the knowledge that has been learned in order to make decisions. While state-of-the-art learning from data typically uses sub-symbolic distributed representations, reasoning is normally useful at a higher level of abstraction with the use of a first-order logic language for knowledge representation. As… ▽ More Artificial Intelligence agents are required to learn from their surroundings and to reason about the knowledge that has been learned in order to make decisions. While state-of-the-art learning from data typically uses sub-symbolic distributed representations, reasoning is normally useful at a higher level of abstraction with the use of a first-order logic language for knowledge representation. As a result, attempts at combining symbolic AI and neural computation into neural-symbolic systems have been on the increase. In this paper, we present Logic Tensor Networks (LTN), a neurosymbolic formalism and computational model that supports learning and reasoning through the introduction of a many-valued, end-to-end differentiable first-order logic called Real Logic as a representation language for deep learning. We show that LTN provides a uniform language for the specification and the computation of several AI tasks such as data clustering, multi-label classification, relational learning, query answering, semi-supervised learning, regression and embedding learning. We implement and illustrate each of the above tasks with a number of simple explanatory examples using TensorFlow 2. Keywords: Neurosymbolic AI, Deep Learning and Reasoning, Many-valued Logic. △ Less

Submitted 22 December, 2021; v1 submitted 25 December, 2020; originally announced December 2020.

Comments: 68 pages, 28 figures, 6 tables

ACM Class: I.2.4; I.2.6

Journal ref: Artificial Intelligence, Volume 303, February 2022, 103649

arXiv:2012.05876 [pdf, ps, other]

Neurosymbolic AI: The 3rd Wave

Authors: Artur d'Avila Garcez, Luis C. Lamb

Abstract: Current advances in Artificial Intelligence (AI) and Machine Learning (ML) have achieved unprecedented impact across research communities and industry. Nevertheless, concerns about trust, safety, interpretability and accountability of AI were raised by influential thinkers. Many have identified the need for well-founded knowledge representation and reasoning to be integrated with deep learning and… ▽ More Current advances in Artificial Intelligence (AI) and Machine Learning (ML) have achieved unprecedented impact across research communities and industry. Nevertheless, concerns about trust, safety, interpretability and accountability of AI were raised by influential thinkers. Many have identified the need for well-founded knowledge representation and reasoning to be integrated with deep learning and for sound explainability. Neural-symbolic computing has been an active area of research for many years seeking to bring together robust learning in neural networks with reasoning and explainability via symbolic representations for network models. In this paper, we relate recent and early research results in neurosymbolic AI with the objective of identifying the key ingredients of the next wave of AI systems. We focus on research that integrates in a principled way neural network-based learning with symbolic knowledge representation and logical reasoning. The insights provided by 20 years of neural-symbolic computing are shown to shed new light onto the increasingly prominent role of trust, safety, interpretability and accountability of AI. We also identify promising directions and challenges for the next decade of AI research from the perspective of neural-symbolic systems. △ Less

Submitted 16 December, 2020; v1 submitted 10 December, 2020; originally announced December 2020.

Comments: 37 pages

ACM Class: I.2.4; I.2.6

arXiv:2011.07137 [pdf, other]

On the Transferability of VAE Embeddings using Relational Knowledge with Semi-Supervision

Authors: Harald Strömfelt, Luke Dickens, Artur d'Avila Garcez, Alessandra Russo

Abstract: We propose a new model for relational VAE semi-supervision capable of balancing disentanglement and low complexity modelling of relations with different symbolic properties. We compare the relative benefits of relation-decoder complexity and latent space structure on both inductive and transductive transfer learning. Our results depict a complex picture where enforcing structure on semi-supervised… ▽ More We propose a new model for relational VAE semi-supervision capable of balancing disentanglement and low complexity modelling of relations with different symbolic properties. We compare the relative benefits of relation-decoder complexity and latent space structure on both inductive and transductive transfer learning. Our results depict a complex picture where enforcing structure on semi-supervised representations can greatly improve zero-shot transductive transfer, but may be less favourable or even impact negatively the capacity for inductive transfer. △ Less

Submitted 13 November, 2020; originally announced November 2020.

arXiv:2003.09000 [pdf, other]

Layerwise Knowledge Extraction from Deep Convolutional Networks

Authors: Simon Odense, Artur d'Avila Garcez

Abstract: Knowledge extraction is used to convert neural networks into symbolic descriptions with the objective of producing more comprehensible learning models. The central challenge is to find an explanation which is more comprehensible than the original model while still representing that model faithfully. The distributed nature of deep networks has led many to believe that the hidden features of a neura… ▽ More Knowledge extraction is used to convert neural networks into symbolic descriptions with the objective of producing more comprehensible learning models. The central challenge is to find an explanation which is more comprehensible than the original model while still representing that model faithfully. The distributed nature of deep networks has led many to believe that the hidden features of a neural network cannot be explained by logical descriptions simple enough to be comprehensible. In this paper, we propose a novel layerwise knowledge extraction method using M-of-N rules which seeks to obtain the best trade-off between the complexity and accuracy of rules describing the hidden features of a deep network. We show empirically that this approach produces rules close to an optimal complexity-error tradeoff. We apply this method to a variety of deep networks and find that in the internal layers we often cannot find rules with a satisfactory complexity and accuracy, suggesting that rule extraction as a general purpose method for explaining the internal logic of a neural network may be impossible. However, we also find that the softmax layer in Convolutional Neural Networks and Autoencoders using either tanh or relu activation functions is highly explainable by rule extraction, with compact rules consisting of as little as 3 units out of 128 often reaching over 99% accuracy. This shows that rule extraction can be a useful component for explaining parts (or modules) of a deep neural network. △ Less

Submitted 19 March, 2020; originally announced March 2020.

arXiv:1911.04489 [pdf, other]

Making Good on LSTMs' Unfulfilled Promise

Authors: Daniel Philps, Artur d'Avila Garcez, Tillman Weyde

Abstract: LSTMs promise much to financial time-series analysis, temporal and cross-sectional inference, but we find that they do not deliver in a real-world financial management task. We examine an alternative called Continual Learning (CL), a memory-augmented approach, which can provide transparent explanations, i.e. which memory did what and when. This work has implications for many financial applications… ▽ More LSTMs promise much to financial time-series analysis, temporal and cross-sectional inference, but we find that they do not deliver in a real-world financial management task. We examine an alternative called Continual Learning (CL), a memory-augmented approach, which can provide transparent explanations, i.e. which memory did what and when. This work has implications for many financial applications including credit, time-varying fairness in decision making and more. We make three important new observations. Firstly, as well as being more explainable, time-series CL approaches outperform LSTMs as well as a simple sliding window learner using feed-forward neural networks (FFNN). Secondly, we show that CL based on a sliding window learner (FFNN) is more effective than CL based on a sequential learner (LSTM). Thirdly, we examine how real-world, time-series noise impacts several similarity approaches used in CL memory addressing. We provide these insights using an approach called Continual Learning Augmentation (CLA) tested on a complex real-world problem, emerging market equities investment decision making. CLA provides a test-bed as it can be based on different types of time-series learners, allowing testing of LSTM and FFNN learners side by side. CLA is also used to test several distance approaches used in a memory recall-gate: Euclidean distance (ED), dynamic time war** (DTW), auto-encoders (AE) and a novel hybrid approach, warp-AE. We find that ED under-performs DTW and AE but warp-AE shows the best overall performance in a real-world financial task. △ Less

Submitted 8 December, 2019; v1 submitted 11 November, 2019; originally announced November 2019.

Comments: 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada. arXiv admin note: text overlap with arXiv:1812.02340

arXiv:1908.03020 [pdf, other]

Measurable Counterfactual Local Explanations for Any Classifier

Authors: Adam White, Artur d'Avila Garcez

Abstract: We propose a novel method for explaining the predictions of any classifier. In our approach, local explanations are expected to explain both the outcome of a prediction and how that prediction would change if 'things had been different'. Furthermore, we argue that satisfactory explanations cannot be dissociated from a notion and measure of fidelity, as advocated in the early days of neural network… ▽ More We propose a novel method for explaining the predictions of any classifier. In our approach, local explanations are expected to explain both the outcome of a prediction and how that prediction would change if 'things had been different'. Furthermore, we argue that satisfactory explanations cannot be dissociated from a notion and measure of fidelity, as advocated in the early days of neural networks' knowledge extraction. We introduce a definition of fidelity to the underlying classifier for local explanation models which is based on distances to a target decision boundary. A system called CLEAR: Counterfactual Local Explanations via Regression, is introduced and evaluated. CLEAR generates w-counterfactual explanations that state minimum changes necessary to flip a prediction's classification. CLEAR then builds local regression models, using the w-counterfactuals to measure and improve the fidelity of its regressions. By contrast, the popular LIME method, which also uses regression to generate local explanations, neither measures its own fidelity nor generates counterfactuals. CLEAR's regressions are found to have significantly higher fidelity than LIME's, averaging over 45% higher in this paper's four case studies. △ Less

Submitted 23 November, 2019; v1 submitted 8 August, 2019; originally announced August 2019.

arXiv:1906.06455 [pdf, other]

Efficient predicate invention using shared "NeMuS"

Authors: Edjard Mota, Jacob M. Howe, Ana Schramm, Artur d'Avila Garcez

Abstract: Amao is a cognitive agent framework that tackles the invention of predicates with a different strategy as compared to recent advances in Inductive Logic Programming (ILP) approaches like Meta-Intepretive Learning (MIL) technique. It uses a Neural Multi-Space (NeMuS) graph structure to anti-unify atoms from the Herbrand base, which passes in the inductive momentum check. Inductive Clause Learning (… ▽ More Amao is a cognitive agent framework that tackles the invention of predicates with a different strategy as compared to recent advances in Inductive Logic Programming (ILP) approaches like Meta-Intepretive Learning (MIL) technique. It uses a Neural Multi-Space (NeMuS) graph structure to anti-unify atoms from the Herbrand base, which passes in the inductive momentum check. Inductive Clause Learning (ICL), as it is called, is extended here by using the weights of logical components, already present in NeMuS, to support inductive learning by expanding clause candidates with anti-unified atoms. An efficient invention mechanism is achieved, including the learning of recursive hypotheses, while restricting the shape of the hypothesis by adding bias definitions or idiosyncrasies of the language. △ Less

Submitted 14 June, 2019; originally announced June 2019.

Comments: 7 pages, 5 figures, Proceedings of the 2019 International Workshop on Neural-Symbolic Learning and Reasoning

ACM Class: I.2.0; I.2.1; I.2.3; I.2.4; I.2.8; I.2.11

arXiv:1905.06088 [pdf, other]

Neural-Symbolic Computing: An Effective Methodology for Principled Integration of Machine Learning and Reasoning

Authors: Artur d'Avila Garcez, Marco Gori, Luis C. Lamb, Luciano Serafini, Michael Spranger, Son N. Tran

Abstract: Current advances in Artificial Intelligence and machine learning in general, and deep learning in particular have reached unprecedented impact not only across research communities, but also over popular media channels. However, concerns about interpretability and accountability of AI have been raised by influential thinkers. In spite of the recent impact of AI, several works have identified the ne… ▽ More Current advances in Artificial Intelligence and machine learning in general, and deep learning in particular have reached unprecedented impact not only across research communities, but also over popular media channels. However, concerns about interpretability and accountability of AI have been raised by influential thinkers. In spite of the recent impact of AI, several works have identified the need for principled knowledge representation and reasoning mechanisms integrated with deep learning-based systems to provide sound and explainable models for such systems. Neural-symbolic computing aims at integrating, as foreseen by Valiant, two most fundamental cognitive abilities: the ability to learn from the environment, and the ability to reason from what has been learned. Neural-symbolic computing has been an active topic of research for many years, reconciling the advantages of robust learning in neural networks and reasoning and interpretability of symbolic representation. In this paper, we survey recent accomplishments of neural-symbolic computing as a principled methodology for integrated machine learning and reasoning. We illustrate the effectiveness of the approach by outlining the main characteristics of the methodology: principled integration of neural learning with symbolic knowledge representation and reasoning allowing for the construction of explainable AI systems. The insights provided by neural-symbolic computing shed new light on the increasingly prominent need for interpretable and accountable AI systems. △ Less

Submitted 15 May, 2019; originally announced May 2019.

arXiv:1812.02340 [pdf, other]

Continual Learning Augmented Investment Decisions

Authors: Daniel Philps, Tillman Weyde, Artur d'Avila Garcez, Roy Batchelor

Abstract: Investment decisions can benefit from incorporating an accumulated knowledge of the past to drive future decision making. We introduce Continual Learning Augmentation (CLA) which is based on an explicit memory structure and a feed forward neural network (FFNN) base model and used to drive long term financial investment decisions. We demonstrate that our approach improves accuracy in investment dec… ▽ More Investment decisions can benefit from incorporating an accumulated knowledge of the past to drive future decision making. We introduce Continual Learning Augmentation (CLA) which is based on an explicit memory structure and a feed forward neural network (FFNN) base model and used to drive long term financial investment decisions. We demonstrate that our approach improves accuracy in investment decision making while memory is addressed in an explainable way. Our approach introduces novel remember cues, consisting of empirically learned change points in the absolute error series of the FFNN. Memory recall is also novel, with contextual similarity assessed over time by sampling distances using dynamic time war** (DTW). We demonstrate the benefits of our approach by using it in an expected return forecasting task to drive investment decisions. In an investment simulation in a broad international equity universe between 2003-2017, our approach significantly outperforms FFNN base models. We also illustrate how CLA's memory addressing works in practice, using a worked example to demonstrate the explainability of our approach. △ Less

Submitted 25 January, 2019; v1 submitted 5 December, 2018; originally announced December 2018.

Comments: NeurIPS 2018 Workshop on Challenges and Opportunities for AI in Financial Services: the Impact of Fairness, Explainability, Accuracy, and Privacy, Montreal, Canada. This is a non-archival publication - the authors may submit revisions and extensions of this paper to other publication venues

arXiv:1804.08597 [pdf, other]

Towards Symbolic Reinforcement Learning with Common Sense

Authors: Artur d'Avila Garcez, Aimore Resende Riquetti Dutra, Eduardo Alonso

Abstract: Deep Reinforcement Learning (deep RL) has made several breakthroughs in recent years in applications ranging from complex control tasks in unmanned vehicles to game playing. Despite their success, deep RL still lacks several important capacities of human intelligence, such as transfer learning, abstraction and interpretability. Deep Symbolic Reinforcement Learning (DSRL) seeks to incorporate such… ▽ More Deep Reinforcement Learning (deep RL) has made several breakthroughs in recent years in applications ranging from complex control tasks in unmanned vehicles to game playing. Despite their success, deep RL still lacks several important capacities of human intelligence, such as transfer learning, abstraction and interpretability. Deep Symbolic Reinforcement Learning (DSRL) seeks to incorporate such capacities to deep Q-networks (DQN) by learning a relevant symbolic representation prior to using Q-learning. In this paper, we propose a novel extension of DSRL, which we call Symbolic Reinforcement Learning with Common Sense (SRL+CS), offering a better balance between generalization and specialization, inspired by principles of common sense when assigning rewards and aggregating Q-values. Experiments reported in this paper show that SRL+CS learns consistently faster than Q-learning and DSRL, achieving also a higher accuracy. In the hardest case, where agents were trained in a deterministic environment and tested in a random environment, SRL+CS achieves nearly 100% average accuracy compared to DSRL's 70% and DQN's 50% accuracy. To the best of our knowledge, this is the first case of near perfect zero-shot transfer learning using Reinforcement Learning. △ Less

Submitted 23 April, 2018; originally announced April 2018.

Comments: 15 pages, 13 figures, 26 references

ACM Class: I.2.6

arXiv:1711.03902 [pdf, other]

Neural-Symbolic Learning and Reasoning: A Survey and Interpretation

Authors: Tarek R. Besold, Artur d'Avila Garcez, Sebastian Bader, Howard Bowman, Pedro Domingos, Pascal Hitzler, Kai-Uwe Kuehnberger, Luis C. Lamb, Daniel Lowd, Priscila Machado Vieira Lima, Leo de Penning, Gadi Pinkas, Hoifung Poon, Gerson Zaverucha

Abstract: The study and understanding of human behaviour is relevant to computer science, artificial intelligence, neural computation, cognitive science, philosophy, psychology, and several other areas. Presupposing cognition as basis of behaviour, among the most prominent tools in the modelling of behaviour are computational-logic systems, connectionist models of cognition, and models of uncertainty. Recen… ▽ More The study and understanding of human behaviour is relevant to computer science, artificial intelligence, neural computation, cognitive science, philosophy, psychology, and several other areas. Presupposing cognition as basis of behaviour, among the most prominent tools in the modelling of behaviour are computational-logic systems, connectionist models of cognition, and models of uncertainty. Recent studies in cognitive science, artificial intelligence, and psychology have produced a number of cognitive models of reasoning, learning, and language that are underpinned by computation. In addition, efforts in computer science research have led to the development of cognitive computational systems integrating machine learning and automated reasoning. Such systems have shown promise in a range of applications, including computational biology, fault diagnosis, training and assessment in simulators, and software verification. This joint survey reviews the personal ideas and views of several researchers on neural-symbolic learning and reasoning. The article is organised in three parts: Firstly, we frame the scope and goals of neural-symbolic computation and have a look at the theoretical foundations. We then proceed to describe the realisations of neural-symbolic computation, systems, and applications. Finally we present the challenges facing the area and avenues for further research. △ Less

Submitted 10 November, 2017; originally announced November 2017.

Comments: 58 pages, work in progress

arXiv:1705.08968 [pdf, other]

Logic Tensor Networks for Semantic Image Interpretation

Authors: Ivan Donadello, Luciano Serafini, Artur d'Avila Garcez

Abstract: Semantic Image Interpretation (SII) is the task of extracting structured semantic descriptions from images. It is widely agreed that the combined use of visual data and background knowledge is of great importance for SII. Recently, Statistical Relational Learning (SRL) approaches have been developed for reasoning under uncertainty and learning in the presence of data and rich knowledge. Logic Tens… ▽ More Semantic Image Interpretation (SII) is the task of extracting structured semantic descriptions from images. It is widely agreed that the combined use of visual data and background knowledge is of great importance for SII. Recently, Statistical Relational Learning (SRL) approaches have been developed for reasoning under uncertainty and learning in the presence of data and rich knowledge. Logic Tensor Networks (LTNs) are an SRL framework which integrates neural networks with first-order fuzzy logic to allow (i) efficient learning from noisy data in the presence of logical constraints, and (ii) reasoning with logical formulas describing general properties of the data. In this paper, we develop and apply LTNs to two of the main tasks of SII, namely, the classification of an image's bounding boxes and the detection of the relevant part-of relations between objects. To the best of our knowledge, this is the first successful application of SRL to such SII tasks. The proposed approach is evaluated on a standard image processing benchmark. Experiments show that the use of background knowledge in the form of logical constraints can improve the performance of purely data-driven approaches, including the state-of-the-art Fast Region-based Convolutional Neural Networks (Fast R-CNN). Moreover, we show that the use of logical background knowledge adds robustness to the learning system when errors are present in the labels of the training data. △ Less

Submitted 24 May, 2017; originally announced May 2017.

Comments: 14 pages, 2 figures, IJCAI 2017

arXiv:1701.05226 [pdf, other]

Reasoning in Non-Probabilistic Uncertainty: Logic Programming and Neural-Symbolic Computing as Examples

Authors: Tarek R. Besold, Artur d'Avila Garcez, Keith Stenning, Leendert van der Torre, Michiel van Lambalgen

Abstract: This article aims to achieve two goals: to show that probability is not the only way of dealing with uncertainty (and even more, that there are kinds of uncertainty which are for principled reasons not addressable with probabilistic means); and to provide evidence that logic-based methods can well support reasoning with uncertainty. For the latter claim, two paradigmatic examples are presented: Lo… ▽ More This article aims to achieve two goals: to show that probability is not the only way of dealing with uncertainty (and even more, that there are kinds of uncertainty which are for principled reasons not addressable with probabilistic means); and to provide evidence that logic-based methods can well support reasoning with uncertainty. For the latter claim, two paradigmatic examples are presented: Logic Programming with Kleene semantics for modelling reasoning from information in a discourse, to an interpretation of the state of affairs of the intended model, and a neural-symbolic implementation of Input/Output logic for dealing with uncertainty in dynamic normative contexts. △ Less

Submitted 1 March, 2017; v1 submitted 18 January, 2017; originally announced January 2017.

Comments: Forthcoming with DOI 10.1007/s11023-017-9428-3 in the Special Issue "Reasoning with Imperfect Information and Knowledge" of Minds and Machines (2017). The final publication will be available at http://link.springer.com. --- Changes to previous version: Fixed some typos and a broken reference

arXiv:1606.04422 [pdf, ps, other]

Logic Tensor Networks: Deep Learning and Logical Reasoning from Data and Knowledge

Authors: Luciano Serafini, Artur d'Avila Garcez

Abstract: We propose Logic Tensor Networks: a uniform framework for integrating automatic learning and reasoning. A logic formalism called Real Logic is defined on a first-order language whereby formulas have truth-value in the interval [0,1] and semantics defined concretely on the domain of real numbers. Logical constants are interpreted as feature vectors of real numbers. Real Logic promotes a well-founde… ▽ More We propose Logic Tensor Networks: a uniform framework for integrating automatic learning and reasoning. A logic formalism called Real Logic is defined on a first-order language whereby formulas have truth-value in the interval [0,1] and semantics defined concretely on the domain of real numbers. Logical constants are interpreted as feature vectors of real numbers. Real Logic promotes a well-founded integration of deductive reasoning on a knowledge-base and efficient data-driven relational machine learning. We show how Real Logic can be implemented in deep Tensor Neural Networks with the use of Google's tensorflow primitives. The paper concludes with experiments applying Logic Tensor Networks on a simple but representative example of knowledge completion. △ Less

Submitted 7 July, 2016; v1 submitted 14 June, 2016; originally announced June 2016.

Comments: 12 pages, 2 figs, 1 table, 27 references

arXiv:1604.01806 [pdf, ps, other]

Generalising the Discriminative Restricted Boltzmann Machine

Authors: Srikanth Cherla, Son N Tran, Tillman Weyde, Artur d'Avila Garcez

Abstract: We present a novel theoretical result that generalises the Discriminative Restricted Boltzmann Machine (DRBM). While originally the DRBM was defined assuming the {0, 1}-Bernoulli distribution in each of its hidden units, this result makes it possible to derive cost functions for variants of the DRBM that utilise other distributions, including some that are often encountered in the literature. This… ▽ More We present a novel theoretical result that generalises the Discriminative Restricted Boltzmann Machine (DRBM). While originally the DRBM was defined assuming the {0, 1}-Bernoulli distribution in each of its hidden units, this result makes it possible to derive cost functions for variants of the DRBM that utilise other distributions, including some that are often encountered in the literature. This is illustrated with the Binomial and {-1, +1}-Bernoulli distributions here. We evaluate these two DRBM variants and compare them with the original one on three benchmark datasets, namely the MNIST and USPS digit classification datasets, and the 20 Newsgroups document classification dataset. Results show that each of the three compared models outperforms the remaining two in one of the three datasets, thus indicating that the proposed theoretical generalisation of the DRBM may be valuable in practice. △ Less

Submitted 6 April, 2016; originally announced April 2016.

Comments: Submitted to ECML 2016 conference track

arXiv:1412.1156 [pdf, other]

doi 10.4204/EPTCS.169.8

Runtime Verification Through Forward Chaining

Authors: Alan Perotti, Guido Boella, Artur d'Avila Garcez

Abstract: In this paper we present a novel rule-based approach for Runtime Verification of FLTL properties over finite but expanding traces. Our system exploits Horn clauses in implication form and relies on a forward chaining-based monitoring algorithm. This approach avoids the branching structure and exponential complexity typical of tableaux-based formulations, creating monitors with a single state and a… ▽ More In this paper we present a novel rule-based approach for Runtime Verification of FLTL properties over finite but expanding traces. Our system exploits Horn clauses in implication form and relies on a forward chaining-based monitoring algorithm. This approach avoids the branching structure and exponential complexity typical of tableaux-based formulations, creating monitors with a single state and a fixed number of rules. This allows for a fast and scalable tool for Runtime Verification: we present the technical details together with a working implementation. △ Less

Submitted 2 December, 2014; originally announced December 2014.

Comments: In Proceedings HCVS 2014, arXiv:1412.0825

Journal ref: EPTCS 169, 2014, pp. 68-81

arXiv:1312.6190 [pdf, other]

Adaptive Feature Ranking for Unsupervised Transfer Learning

Authors: Son N. Tran, Artur d'Avila Garcez

Abstract: Transfer Learning is concerned with the application of knowledge gained from solving a problem to a different but related problem domain. In this paper, we propose a method and efficient algorithm for ranking and selecting representations from a Restricted Boltzmann Machine trained on a source domain to be transferred onto a target domain. Experiments carried out using the MNIST, ICDAR and TiCC im… ▽ More Transfer Learning is concerned with the application of knowledge gained from solving a problem to a different but related problem domain. In this paper, we propose a method and efficient algorithm for ranking and selecting representations from a Restricted Boltzmann Machine trained on a source domain to be transferred onto a target domain. Experiments carried out using the MNIST, ICDAR and TiCC image datasets show that the proposed adaptive feature ranking and transfer learning method offers statistically significant improvements on the training of RBMs. Our method is general in that the knowledge chosen by the ranking function does not depend on its relation to any specific target domain, and it works with unsupervised learning and knowledge-based transfer. △ Less

Submitted 28 May, 2014; v1 submitted 20 December, 2013; originally announced December 2013.

Comments: 9 pages 7 figures, new experimental results on ranking and transfer have been added, typo fixed

Showing 1–28 of 28 results for author: Garcez, A D