-
Explainable Artificial Intelligence (XAI) 2.0: A Manifesto of Open Challenges and Interdisciplinary Research Directions
Authors:
Luca Longo,
Mario Brcic,
Federico Cabitza,
Jaesik Choi,
Roberto Confalonieri,
Javier Del Ser,
Riccardo Guidotti,
Yoichi Hayashi,
Francisco Herrera,
Andreas Holzinger,
Richard Jiang,
Hassan Khosravi,
Freddy Lecue,
Gianclaudio Malgieri,
Andrés Páez,
Wojciech Samek,
Johannes Schneider,
Timo Speith,
Simone Stumpf
Abstract:
As systems based on opaque Artificial Intelligence (AI) continue to flourish in diverse real-world applications, understanding these black box models has become paramount. In response, Explainable AI (XAI) has emerged as a field of research with practical and ethical benefits across various domains. This paper not only highlights the advancements in XAI and its application in real-world scenarios but also addresses the ongoing challenges within XAI, emphasizing the need for broader perspectives and collaborative efforts. We bring together experts from diverse fields to identify open problems, striving to synchronize research agendas and accelerate XAI in practical applications. By fostering collaborative discussion and interdisciplinary cooperation, we aim to propel XAI forward, contributing to its continued success. Our goal is to put forward a comprehensive proposal for advancing XAI. To achieve this goal, we present a manifesto of 27 open problems categorized into nine categories. These challenges encapsulate the complexities and nuances of XAI and offer a road map for future research. For each problem, we provide promising research directions in the hope of harnessing the collective intelligence of interested stakeholders.
Submitted 30 October, 2023;
originally announced October 2023.
-
Painting the black box white: experimental findings from applying XAI to an ECG reading setting
Authors:
Federico Cabitza,
Matteo Cameli,
Andrea Campagner,
Chiara Natali,
Luca Ronzio
Abstract:
The shift from symbolic AI systems to black-box, sub-symbolic, and statistical ones has motivated a rapid increase in the interest toward explainable AI (XAI), i.e. approaches to make black-box AI systems explainable to human decision makers with the aim of making these systems more acceptable and more usable tools and supports. However, we make the point that, rather than always making black boxes transparent, these approaches are at risk of "painting the black boxes white", thus failing to provide a level of transparency that would increase the system's usability and comprehensibility; or, even, at risk of generating new errors, in what we termed the "white-box paradox". To address these usability-related issues, in this work we focus on the cognitive dimension of users' perception of explanations and XAI systems. To this aim, we designed and conducted a questionnaire-based experiment by which we involved 44 cardiology residents and specialists in an AI-supported ECG reading task. In doing so, we investigated different research questions concerning the relationship between users' characteristics (e.g. expertise) and their perception of AI and XAI systems, including their trust, the perceived explanations' quality and their tendency to defer the decision process to automation (i.e. technology dominance), as well as the mutual relationships among these different dimensions. Our findings provide a contribution to the evaluation of AI-based support systems from a Human-AI interaction-oriented perspective and lay the ground for further investigation of XAI and its effects on decision making and user experience.
Submitted 27 October, 2022;
originally announced October 2022.
-
Everything is Varied: The Surprising Impact of Individual Variation on ML Robustness in Medicine
Authors:
Andrea Campagner,
Lorenzo Famiglini,
Anna Carobene,
Federico Cabitza
Abstract:
In medical settings, Individual Variation (IV) refers to variation that is due not to population differences or errors, but rather to within-subject variation, that is the intrinsic and characteristic patterns of variation pertaining to a given instance or the measurement process. While taking into account IV has been deemed critical for proper analysis of medical data, this source of uncertainty and its impact on robustness have so far been neglected in Machine Learning (ML). To fill this gap, we look at how IV affects ML performance and generalization and how its impact can be mitigated. Specifically, we provide a methodological contribution to formalize the problem of IV in the statistical learning framework and, through an experiment based on one of the largest real-world laboratory medicine datasets for the problem of COVID-19 diagnosis, we show that: 1) common state-of-the-art ML models are severely impacted by the presence of IV in data; and 2) advanced learning strategies, based on data augmentation and data imprecisiation, and proper study designs can be effective at improving robustness to IV. Our findings demonstrate the critical relevance of correctly accounting for IV to enable safe deployment of ML in clinical settings.
Submitted 11 October, 2022; v1 submitted 10 October, 2022;
originally announced October 2022.
-
Responsible AI in Healthcare
Authors:
Federico Cabitza,
Davide Ciucci,
Gabriella Pasi,
Marco Viviani
Abstract:
This article discusses open problems, implemented solutions, and future research in the area of responsible AI in healthcare. In particular, we illustrate two main research themes related to the work of two laboratories within the Department of Informatics, Systems, and Communication at the University of Milano-Bicocca. The problems addressed concern, in particular, uncertainty in medical data and machine advice, and the problem of online health information disorder.
Submitted 19 February, 2022;
originally announced March 2022.
-
Toward a Perspectivist Turn in Ground Truthing for Predictive Computing
Authors:
Valerio Basile,
Federico Cabitza,
Andrea Campagner,
Michael Fell
Abstract:
Most Artificial Intelligence applications are based on supervised machine learning (ML), which is ultimately grounded in manually annotated data. The annotation process is often performed in terms of a majority vote, and this has often proved problematic, as highlighted by recent studies on the evaluation of ML models. In this article we describe and advocate for a different paradigm, which we call data perspectivism, which moves away from traditional gold standard datasets, towards the adoption of methods that integrate the opinions and perspectives of the human subjects involved in the knowledge representation step of ML processes. Drawing on previous works which inspired our proposal, we describe its potential not only for the more subjective tasks (e.g. those related to human language) but also for tasks commonly understood as objective (e.g. medical decision making), and present the main advantages of adopting a perspectivist stance in ML, as well as possible disadvantages, and various ways in which such a stance can be implemented in practice. Finally, we share a set of recommendations and outline a research agenda to advance the perspectivist stance in ML.
Submitted 29 June, 2023; v1 submitted 9 September, 2021;
originally announced September 2021.
-
Who wants accurate models? Arguing for a different metrics to take classification models seriously
Authors:
Federico Cabitza,
Andrea Campagner
Abstract:
With the increasing availability of AI-based decision support, there is an increasing need for their certification by both AI manufacturers and notified bodies, as well as the pragmatic (real-world) validation of these systems. Therefore, there is the need for meaningful and informative ways to assess the performance of AI systems in clinical practice. Common metrics (like accuracy scores and areas under the ROC curve) have known problems and they do not take into account important information about the preferences of clinicians and the needs of their specialist practice, like the likelihood and impact of errors and the complexity of cases. In this paper, we present a new accuracy measure, the H-accuracy (Ha), which we claim is more informative in the medical domain (and others of similar needs) for the elements it encompasses. We also provide proof that the H-accuracy is a generalization of the balanced accuracy and establish a relation between the H-accuracy and the Net Benefit. Finally, we illustrate an experimentation in two user studies to show the descriptive power of the Ha score and how complementary and differently informative measures can be derived from its formulation (a Python script to compute Ha is also made available).
Submitted 22 October, 2019; v1 submitted 21 October, 2019;
originally announced October 2019.
-
A giant with feet of clay: on the validity of the data that feed machine learning in medicine
Authors:
Federico Cabitza,
Davide Ciucci,
Raffaele Rasoini
Abstract:
This paper considers the use of Machine Learning (ML) in medicine by focusing on the main problem that this computational approach has been aimed at solving or at least minimizing: uncertainty. To this aim, we point out how uncertainty is so ingrained in medicine that it also biases the representation of clinical phenomena, that is, the very input of ML models, thus undermining the clinical significance of their output. Recognizing this can motivate both medical doctors, in taking more responsibility in the development and use of these decision aids, and researchers, in pursuing different ways to assess the value of these systems. In so doing, both designers and users could take this intrinsic characteristic of medicine more seriously and consider alternative approaches that do not "sweep uncertainty under the rug" within an objectivist fiction, which everyone can end up believing to be true.
Submitted 14 May, 2018; v1 submitted 21 June, 2017;
originally announced June 2017.
-
Breeding electric zebras in the fields of Medicine
Authors:
Federico Cabitza
Abstract:
A few notes on the use of machine learning in medicine and the related unintended consequences.
Submitted 27 January, 2017; v1 submitted 15 January, 2017;
originally announced January 2017.
-
Human-Data Interaction in Healthcare
Authors:
Federico Cabitza,
Angela Locoro
Abstract:
In this paper, we focus on an emerging strand of IT-oriented research, namely Human-Data Interaction (HDI), and how this can be applied to healthcare. HDI concerns both how humans create and use data by means of interactive systems, which can both assist and constrain them, and how data are passively collected and proactively generated. Healthcare provides a challenging arena to test the potential of HDI to provide a new, user-centered perspective on how data work should be supported and assessed, especially in light of the fact that data are becoming increasingly big and that many tools are now available for lay people, including doctors and nurses, to interact with health-related data.
Submitted 13 May, 2016; v1 submitted 18 February, 2016;
originally announced February 2016.
-
Appropriation as neglected practice in communities: presenting a framework to enable EUD design for CoPs
Authors:
Federico Cabitza,
Carla Simone
Abstract:
Communities present considerable challenges for the design and application of supportive information technology (IT), especially when these develop in loosely-integrated, informal and scarcely organized contexts, as is often the case with Communities of Practice (CoPs). An approach that actively supports user communities in the process of IT appropriation can help compensate for the fact that members of these communities cannot rely on professional support, and enable even complex forms of tailoring and End-User Development (EUD). Although this approach has already been explored by an increasing number of researchers, there is still a lack of a general framework that could play a role in the comparison of existing proposals and in the development of new EUD solutions for CoPs. The paper proposes a conceptual framework and a related architecture, called Logic of Bricolage, that aims to be a step further in this direction to enable better EUD-oriented support for digitized communities. The framework is described and the architecture instantiated in three concrete EUD environments that specifically regard collaborative activities in order to show the generality and applicability of the framework.
Submitted 7 June, 2013;
originally announced June 2013.
-
Design Ltd.: Renovated Myths for the Development of Socially Embedded Technologies
Authors:
Federico Cabitza,
Carla Simone
Abstract:
This paper argues that traditional and mainstream mythologies, which have been continually told within the Information Technology domain among designers and advocates of conceptual modelling since the 1960s in different fields of computing sciences, could now be renovated or substituted in the mould of more recent discourses about performativity, complexity and end-user creativity that have been constructed across different fields in the meantime. In the paper, it is submitted that these discourses could motivate IT professionals to undertake alternative approaches toward the co-construction of socio-technical systems, i.e., social settings where humans cooperate to reach common goals by means of mediating computational tools. The authors advocate further discussion about and consolidation of some concepts in design research, design practice and more generally Information Technology (IT) development, like those of: task-artifact entanglement, universatility (sic) of End-User Development (EUD) environments, bricolant/bricoleur end-user, logic of bricolage, maieuta-designers (sic), and laissez-faire method to socio-technical construction. Points backing these and similar concepts are made to promote further discussion on the need to rethink the main assumptions underlying IT design and development some fifty years after the coming of age of software and modern IT in the organizational domain.
Submitted 14 March, 2013; v1 submitted 23 November, 2012;
originally announced November 2012.