
CNN-based explanation ensembling for dataset, representation and explanations evaluation

Authors: Weronika Hryniewska-Guzik, Luca Longo, Przemysław Biecek

Abstract: Explainable Artificial Intelligence has gained significant attention due to the widespread use of complex deep learning models in high-stakes domains such as medicine, finance, and autonomous cars. However, different explanations often present different aspects of the model's behavior. In this research manuscript, we explore the potential of ensembling explanations generated by deep classification models using a convolutional model. Through experimentation and analysis, we aim to investigate the implications of combining explanations to uncover more coherent and reliable patterns of the model's behavior, leading to the possibility of evaluating the representation learned by the model. With our method, we can uncover problems of under-representation of images in a certain class. Moreover, we discuss other side benefits like features' reduction by replacing the original image with its explanations resulting in the removal of some sensitive information. Through the use of carefully selected evaluation metrics from the Quantus library, we demonstrated the method's superior performance in terms of Localisation and Faithfulness, compared to individual explanations.

Submitted 16 April, 2024; originally announced April 2024.

Comments: accepted at 2nd World Conference on eXplainable Artificial Intelligence

arXiv:2310.19775 [pdf, other]

doi 10.1016/j.inffus.2024.102301

Explainable Artificial Intelligence (XAI) 2.0: A Manifesto of Open Challenges and Interdisciplinary Research Directions

Authors: Luca Longo, Mario Brcic, Federico Cabitza, Jaesik Choi, Roberto Confalonieri, Javier Del Ser, Riccardo Guidotti, Yoichi Hayashi, Francisco Herrera, Andreas Holzinger, Richard Jiang, Hassan Khosravi, Freddy Lecue, Gianclaudio Malgieri, Andrés Páez, Wojciech Samek, Johannes Schneider, Timo Speith, Simone Stumpf

Abstract: As systems based on opaque Artificial Intelligence (AI) continue to flourish in diverse real-world applications, understanding these black box models has become paramount. In response, Explainable AI (XAI) has emerged as a field of research with practical and ethical benefits across various domains. This paper not only highlights the advancements in XAI and its application in real-world scenarios but also addresses the ongoing challenges within XAI, emphasizing the need for broader perspectives and collaborative efforts. We bring together experts from diverse fields to identify open problems, striving to synchronize research agendas and accelerate XAI in practical applications. By fostering collaborative discussion and interdisciplinary cooperation, we aim to propel XAI forward, contributing to its continued success. Our goal is to put forward a comprehensive proposal for advancing XAI. To achieve this goal, we present a manifesto of 27 open problems categorized into nine categories. These challenges encapsulate the complexities and nuances of XAI and offer a road map for future research. For each problem, we provide promising research directions in the hope of harnessing the collective intelligence of interested stakeholders.

Submitted 30 October, 2023; originally announced October 2023.

ACM Class: F.2.0; H.1.2; I.2; I.2.6; K.4; K.5

Journal ref: Information Fusion 2024

arXiv:2307.10283 [pdf, other]

Interpretable Timbre Synthesis using Variational Autoencoders Regularized on Timbre Descriptors

Authors: Anastasia Natsiou, Luca Longo, Sean O'Leary

Abstract: Controllable timbre synthesis has been a subject of research for several decades, and deep neural networks have been the most successful in this area. Deep generative models such as Variational Autoencoders (VAEs) have the ability to generate a high-level representation of audio while providing a structured latent space. Despite their advantages, the interpretability of these latent spaces in terms of human perception is often limited. To address this limitation and enhance the control over timbre generation, we propose a regularized VAE-based latent space that incorporates timbre descriptors. Moreover, we suggest a more concise representation of sound by utilizing its harmonic content, in order to minimize the dimensionality of the latent space.

Submitted 18 July, 2023; originally announced July 2023.

arXiv:2306.15500 [pdf, other]

doi 10.1007/978-3-031-44070-0_20

A novel structured argumentation framework for improved explainability of classification tasks

Authors: Lucas Rizzo, Luca Longo

Abstract: This paper presents a novel framework for structured argumentation, named extend argumentative decision graph ($xADG$). It is an extension of argumentative decision graphs built upon Dung's abstract argumentation graphs. The $xADG$ framework allows for arguments to use boolean logic operators and multiple premises (supports) within their internal structure, resulting in more concise argumentation graphs that may be easier for users to understand. The study presents a methodology for construction of $xADGs$ and evaluates their size and predictive capacity for classification tasks of varying magnitudes. Resulting $xADGs$ achieved strong (balanced) accuracy, which was accomplished through an input decision tree, while also reducing the average number of supports needed to reach a conclusion. The results further indicated that it is possible to construct plausibly understandable $xADGs$ that outperform other techniques for building $ADGs$ in terms of predictive capacity and overall size. In summary, the study suggests that $xADG$ represents a promising framework for developing more concise argumentative models that can be used for classification tasks and knowledge discovery, acquisition, and refinement.

Submitted 27 June, 2023; originally announced June 2023.

Comments: Submitted to the World Conference on eXplainable Artificial Intelligence (xAI 2023)

arXiv:2301.07665 [pdf, other]

An investigation of the reconstruction capacity of stacked convolutional autoencoders for log-mel-spectrograms

Authors: Anastasia Natsiou, Luca Longo, Sean O'Leary

Abstract: In audio processing applications, the generation of expressive sounds based on high-level representations demonstrates a high demand. These representations can be used to manipulate the timbre and influence the synthesis of creative instrumental notes. Modern algorithms, such as neural networks, have inspired the development of expressive synthesizers based on musical instrument timbre compression. Unsupervised deep learning methods can achieve audio compression by training the network to learn a mapping from waveforms or spectrograms to low-dimensional representations. This study investigates the use of stacked convolutional autoencoders for the compression of time-frequency audio representations for a variety of instruments for a single pitch. Further exploration of hyper-parameters and regularization techniques is demonstrated to enhance the performance of the initial design. In an unsupervised manner, the network is able to reconstruct a monophonic and harmonic sound based on latent representations. In addition, we introduce an evaluation metric to measure the similarity between the original and reconstructed samples. Evaluating a deep generative model for the synthesis of sound is a challenging task. Our approach is based on the accuracy of the generated frequencies as it presents a significant metric for the perception of harmonic sounds. This work is expected to accelerate future experiments on audio compression using neural autoencoders.

Submitted 18 January, 2023; originally announced January 2023.

arXiv:2209.10992 [pdf, other]

Modeling cognitive load as a self-supervised brain rate with electroencephalography and deep learning

Authors: Luca Longo

Abstract: The principal reason for measuring mental workload is to quantify the cognitive cost of performing tasks to predict human performance. Unfortunately, a method for assessing mental workload that has general applicability does not exist yet. This research presents a novel self-supervised method for mental workload modelling from EEG data employing Deep Learning and a continuous brain rate, an index of cognitive activation, without requiring human declarative knowledge. This method is a convolutional recurrent neural network trainable with spatially preserving spectral topographic head-maps from EEG data to fit the brain rate variable. Findings demonstrate the capacity of the convolutional layers to learn meaningful high-level representations from EEG data since within-subject models had a test Mean Absolute Percentage Error average of 11%. The addition of a Long-Short Term Memory layer for handling sequences of high-level representations was not significant, although it did improve their accuracy. Findings point to the existence of quasi-stable blocks of learnt high-level representations of cognitive activation because they can be induced through convolution and seem not to be dependent on each other over time, intuitively matching the non-stationary nature of brain responses. Across-subject models, induced with data from an increasing number of participants, thus containing more variability, obtained a similar accuracy to the within-subject models. This highlights the potential generalisability of the induced high-level representations across people, suggesting the existence of subject-independent cognitive activation patterns. This research contributes to the body of knowledge by providing scholars with a novel computational method for mental workload modelling that aims to be generally applicable and does not rely on ad-hoc human-crafted models, supporting replicability and falsifiability.

Submitted 21 September, 2022; originally announced September 2022.

Comments: 18 pages, 12 figures, 1 table

ACM Class: I.2.6

arXiv:2206.13959 [pdf, other]

doi 10.1016/j.inffus.2022.08.025

Comparing and extending the use of defeasible argumentation with quantitative data in real-world contexts

Authors: Lucas Rizzo, Luca Longo

Abstract: Dealing with uncertain, contradicting, and ambiguous information is still a central issue in Artificial Intelligence (AI). As a result, many formalisms have been proposed or adapted so as to consider non-monotonicity, with only a limited number of works and researchers performing any sort of comparison among them. A non-monotonic formalism is one that allows the retraction of previous conclusions or claims, from premises, in light of new evidence, offering some desirable flexibility when dealing with uncertainty. This research article focuses on evaluating the inferential capacity of defeasible argumentation, a formalism particularly envisioned for modelling non-monotonic reasoning. In addition to this, fuzzy reasoning and expert systems, extended for handling non-monotonicity of reasoning, are selected and employed as baselines, due to their vast and accepted use within the AI community. Computational trust was selected as the domain of application of such models. Trust is an ill-defined construct, hence, reasoning applied to the inference of trust can be seen as non-monotonic. Inference models were designed to assign trust scalars to editors of the Wikipedia project. In particular, argument-based models demonstrated more robustness than those built upon the baselines despite the knowledge bases or datasets employed. This study contributes to the body of knowledge through the exploitation of defeasible argumentation and its comparison to similar approaches. The practical use of such approaches coupled with a modular design that facilitates similar experiments was exemplified and their respective implementations made publicly available on GitHub [120, 121]. This work adds to previous works, empirically enhancing the generalisability of defeasible argumentation as a compelling approach to reason with quantitative data and uncertain knowledge.

Submitted 28 June, 2022; originally announced June 2022.

ACM Class: I.2.4; I.2.1

arXiv:2202.12937 [pdf, ps, other]

An Evaluation of the EEG alpha-to-theta and theta-to-alpha band Ratios as Indexes of Mental Workload

Authors: Bujar Raufi, Luca Longo

Abstract: Many research works indicate that EEG bands, specifically the alpha and theta bands, have been potentially helpful cognitive load indicators. However, minimal research exists to validate this claim. This study aims to assess and analyze the impact of the alpha-to-theta and the theta-to-alpha band ratios on supporting the creation of models capable of discriminating self-reported perceptions of mental workload. A dataset of raw EEG data was utilized in which 48 subjects performed a resting activity and an induced task demanding exercise in the form of a multitasking SIMKAP test. Band ratios were devised from frontal and parietal electrode clusters. Building and model testing was done with high-level independent features from the frequency and temporal domains extracted from the computed ratios over time. Target features for model training were extracted from the subjective ratings collected after resting and task demand activities. Models were built by employing Logistic Regression, Support Vector Machines and Decision Trees and were evaluated with performance measures including accuracy, recall, precision and f1-score. The results indicate high classification accuracy of those models trained with the high-level features extracted from the alpha-to-theta ratios and theta-to-alpha ratios. Preliminary results also show that models trained with logistic regression and support vector machines can accurately classify self-reported perceptions of mental workload. This research contributes to the body of knowledge by demonstrating the richness of the information in the temporal, spectral and statistical domains extracted from the alpha-to-theta and theta-to-alpha EEG band ratios for the discrimination of self-reported perceptions of mental workload.

Submitted 2 March, 2022; v1 submitted 22 February, 2022; originally announced February 2022.

Comments: 25 pages, 12 figures, and 6 tables

ACM Class: I.6; G.3

arXiv:2202.12205 [pdf, ps, other]

doi 10.3233/SW-223228

Is Neuro-Symbolic AI Meeting its Promise in Natural Language Processing? A Structured Review

Authors: Kyle Hamilton, Aparna Nayak, Bojan Božić, Luca Longo

Abstract: Advocates for Neuro-Symbolic Artificial Intelligence (NeSy) assert that combining deep learning with symbolic reasoning will lead to stronger AI than either paradigm on its own. As successful as deep learning has been, it is generally accepted that even our best deep learning systems are not very good at abstract reasoning. And since reasoning is inextricably linked to language, it makes intuitive sense that Natural Language Processing (NLP) would be a particularly well-suited candidate for NeSy. We conduct a structured review of studies implementing NeSy for NLP, with the aim of answering the question of whether NeSy is indeed meeting its promises: reasoning, out-of-distribution generalization, interpretability, learning and reasoning from small data, and transferability to new domains. We examine the impact of knowledge representation, such as rules and semantic networks, language structure and relational structure, and whether implicit or explicit reasoning contributes to higher promise scores. We find that systems where logic is compiled into the neural network lead to the most NeSy goals being satisfied, while other factors such as knowledge representation, or type of neural architecture do not exhibit a clear correlation with goals being met. We find many discrepancies in how reasoning is defined, specifically in relation to human level reasoning, which impact decisions about model architectures and drive conclusions which are not always consistent across studies. Hence we advocate for a more methodical approach to the application of theories of human reasoning as well as the development of appropriate benchmarks, which we hope can lead to a better understanding of progress in the field. We make our data and code available on GitHub for further analysis.

Submitted 30 June, 2022; v1 submitted 24 February, 2022; originally announced February 2022.

Comments: Survey

Journal ref: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-42, 2022

arXiv:2103.10701 [pdf, other]

Weakly Complete Semantics Based on Undecidedness Blocking

Authors: Pierpaolo Dondio, Luca Longo

Abstract: In this paper we introduce a novel family of semantics called weakly complete semantics. Differently from Dung's complete semantics, weakly complete semantics employs a mechanism called undecidedness blocking by which the label undecided of an attacking argument is not always propagated to an otherwise accepted attacked argument. The new semantics are conflict-free, non-admissible but employing a weaker notion of admissibility; they allow reinstatement and they retain the majority of properties of complete semantics. We show how both weakly complete and Dung's complete semantics can be generated by applying different undecidedness blocking strategies, making undecidedness blocking a unifying mechanism underlying argumentation semantics. The semantics are also an example of ambiguity blocking Dunganian semantics and the first semantics to tackle the problem of self-defeating attacking arguments. In the last part of the paper we compare weakly complete semantics with the recent work of Baumann et al. on weakly admissible semantics. Since the two families of semantics do not coincide, a principle-based analysis of the two approaches is provided. The analysis shows how our semantics satisfy a number of principles satisfied by Dung's complete semantics but not by Baumann et al.'s semantics, including directionality, abstention, SCC-decomposability and cardinality of extensions, making them a more faithful non-admissible version of Dung's semantics.

Submitted 19 March, 2021; originally announced March 2021.

Comments: 48 pages, 9 figures. Preprint

MSC Class: 68T27 (Primary) 68T30; 68T37 (Secondary) ACM Class: I.2.4

arXiv:2006.00093 [pdf, other]

Explainable Artificial Intelligence: a Systematic Review

Authors: Giulia Vilone, Luca Longo

Abstract: Explainable Artificial Intelligence (XAI) has experienced significant growth over the last few years. This is due to the widespread application of machine learning, particularly deep learning, that has led to the development of highly accurate models that lack explainability and interpretability. A plethora of methods to tackle this problem have been proposed, developed and tested. This systematic review contributes to the body of knowledge by clustering these methods with a hierarchical classification system with four main clusters: review articles, theories and notions, methods and their evaluation. It also summarises the state-of-the-art in XAI and recommends future research directions.

Submitted 12 October, 2020; v1 submitted 29 May, 2020; originally announced June 2020.

Comments: 78 pages, 18 figures, journal paper to be submitted to Information Fusion

ACM Class: I.2.0; I.2.6; I.2.m

arXiv:1903.05981 [pdf, other]

Expressing Trust with Temporal Frequency of User Interaction in Online Communities

Authors: Ekaterina Yashkina, Arseny Pinigin, JooYoung Lee, Manuel Mazzara, Akinlolu Solomon Adekotujo, Adam Zubair, Luca Longo

Abstract: Reputation systems concern soft security dynamics in diverse areas. Trust dynamics in a reputation system should be stable and adaptable at the same time to serve the purpose. Many reputation mechanisms have been proposed and tested over time. However, the main drawback of reputation management is that users need to share private information to gain trust in a system such as phone numbers, reviews, and ratings. Recently, a novel model that tries to overcome this issue was presented: the Dynamic Interaction-based Reputation Model (DIBRM). This approach to trust considers only implicit information automatically deduced from the interactions of users within an online community. In this primary research study, the Reddit and MathOverflow online social communities have been selected for testing DIBRM. Results show how this novel approach to trust can mimic behaviors of the selected reputation systems, namely Reddit and MathOverflow, only with temporal information.

Submitted 29 January, 2019; originally announced March 2019.

arXiv:1801.03904 [pdf, other]

Towards dynamic interaction-based model

Authors: Almaz Melnikov, Manuel Mazzara, Victor Rivera, JooYoung Lee, Luca Longo

Abstract: In this paper, we investigate how dynamic properties of reputation can influence the quality of user ranking. Reputation systems should be based on rules that can guarantee a high level of trust and help identify unreliable units. To understand the effectiveness of dynamic properties in the evaluation of reputation, we propose our own model (DIB-RM) that is based on three factors: forgetting, cumulative, and activity period. In order to evaluate the model, we use data from StackOverflow, which also has its own reputation model. We estimate the similarity of ratings between DIB-RM and the StackOverflow model so as to check our hypothesis. We use two values to calculate our metric: DIB-RM reputation and $historical$ reputation. We found that $historical$ reputation gives better metric values. Our preliminary results are presented for different sets of values of the aforementioned factors in order to analyze how effectively the model can be used for modeling reputation systems.

Submitted 11 January, 2018; originally announced January 2018.

arXiv:1712.07686 [pdf, other]

Pseudorehearsal in actor-critic agents with neural network function approximation

Authors: Vladimir Marochko, Leonard Johard, Manuel Mazzara, Luca Longo

Abstract: Catastrophic forgetting has a significant negative impact in reinforcement learning. The purpose of this study is to investigate how pseudorehearsal can change the performance of an actor-critic agent with neural-network function approximation. We tested the agent in a pole balancing task and compared different pseudorehearsal approaches. We have found that pseudorehearsal can assist learning and decrease forgetting.

Submitted 19 February, 2018; v1 submitted 20 December, 2017; originally announced December 2017.

Showing 1–14 of 14 results for author: Longo, L