
Designing for Complementarity: A Conceptual Framework to Go Beyond the Current Paradigm of Using XAI in Healthcare

Authors: Elisa Rubegni, Omran Ayoub, Stefania Maria Rita Rizzo, Marco Barbero, Guenda Bernegger, Francesca Faraci, Francesca Mangili, Emiliano Soldini, Pierpaolo Trimboli, Alessandro Facchini

Abstract: The widespread use of Artificial Intelligence-based tools in the healthcare sector raises many ethical and legal problems, one of the main reasons being their black-box nature and therefore the seeming opacity and inscrutability of their characteristics and decision-making process. Literature extensively discusses how this can lead to phenomena of over-reliance and under-reliance, ultimately limiting the adoption of AI. We addressed these issues by building a theoretical framework based on three concepts: Feature Importance, Counterexample Explanations, and Similar-Case Explanations. Grounded in the literature, the model was deployed within a case study in which, using a participatory design approach, we designed and developed a high-fidelity prototype. Through the co-design and development of the prototype and the underlying model, we advanced the knowledge on how to design AI-based systems for enabling complementarity in the decision-making process in the healthcare domain. Our work aims at contributing to the current discourse on designing AI systems to support clinicians' decision-making processes.

Submitted 6 April, 2024; originally announced April 2024.

arXiv:2403.19475

A theoretical framework for the design and analysis of computational thinking problems in education

Authors: Giorgia Adorni, Alberto Piatti, Engin Bumbacher, Lucio Negrini, Francesco Mondada, Dorit Assaf, Francesca Mangili, Luca Gambardella

Abstract: The field of computational thinking education has grown in recent years as researchers and educators have sought to develop and assess students' computational thinking abilities. While much of the research in this area has focused on defining computational thinking, the competencies it involves and how to assess them in teaching and learning contexts, this work takes a different approach. We provide a more situated perspective on computational thinking, focusing on the types of problems that require computational thinking skills to be solved and the features that support these processes. We develop a framework for analysing existing computational thinking problems in an educational context. We conduct a comprehensive literature review to identify prototypical activities from areas where computational thinking is typically pursued in education. We identify the main components and characteristics of these activities, along with their influence on activating computational thinking competencies. The framework provides a catalogue of computational thinking skills that can be used to understand the relationship between problem features and competencies activated. This study contributes to the field of computational thinking education by offering a tool for evaluating and revising existing problems to activate specific skills and for assisting in designing new problems that target the development of particular competencies. The results of this study may be of interest to researchers and educators working in computational thinking education.

Submitted 28 March, 2024; originally announced March 2024.

arXiv:2209.05467

Modelling Assessment Rubrics through Bayesian Networks: a Pragmatic Approach

Authors: Francesca Mangili, Giorgia Adorni, Alberto Piatti, Claudio Bonesana, Alessandro Antonucci

Abstract: Automatic assessment of learner competencies is a fundamental task in intelligent tutoring systems. An assessment rubric typically and effectively describes relevant competencies and competence levels. This paper presents an approach to deriving a learner model directly from an assessment rubric defining some (partial) ordering of competence levels. The model is based on Bayesian networks and exploits logical gates with uncertainty (often referred to as noisy gates) to reduce the number of parameters of the model, so as to simplify their elicitation by experts and allow real-time inference in intelligent tutoring systems. We illustrate how the approach can be applied to automate the human assessment of an activity developed for testing computational thinking skills. The simple elicitation of the model starting from the assessment rubric opens up the possibility of quickly automating the assessment of several tasks, making them more easily exploitable in the context of adaptive assessment tools and intelligent tutoring systems.

Submitted 7 September, 2022; originally announced September 2022.

Journal ref: Proceedings of 2022 International Conference on Software, Telecommunications and Computer Networks (SoftCOM)

arXiv:2112.14476

ADAPQUEST: A Software for Web-Based Adaptive Questionnaires based on Bayesian Networks

Authors: Claudio Bonesana, Francesca Mangili, Alessandro Antonucci

Abstract: We introduce ADAPQUEST, a software tool written in Java for the development of adaptive questionnaires based on Bayesian networks. Adaptiveness is intended here as the dynamic choice of the question sequence on the basis of an evolving model of the skill level of the test taker. Bayesian networks offer a flexible and highly interpretable framework to describe such a testing process, especially when coping with multiple skills. ADAPQUEST embeds dedicated elicitation strategies to simplify the elicitation of the questionnaire parameters. An application of this tool for the diagnosis of mental disorders is also discussed together with some implementation details.

Submitted 29 December, 2021; originally announced December 2021.

Comments: Presented at the IJCAI 2021 Workshop on Artificial Intelligence for Education

arXiv:2105.12205

doi 10.1007/978-3-030-86772-0_29

A New Score for Adaptive Tests in Bayesian and Credal Networks

Authors: Alessandro Antonucci, Francesca Mangili, Claudio Bonesana, Giorgia Adorni

Abstract: A test is adaptive when its sequence and number of questions are dynamically tuned on the basis of the estimated skills of the taker. Graphical models, such as Bayesian networks, are used for adaptive tests as they allow one to model the uncertainty about the questions and the skills in an explainable fashion, especially when coping with multiple skills. A better elicitation of the uncertainty in the question/skills relations can be achieved by interval probabilities. This turns the model into a credal network, thus increasing the inferential complexity of the queries required to select questions. This is especially the case for the information-theoretic quantities used as scores to drive the adaptive mechanism. We present an alternative family of scores, based on the mode of the posterior probabilities, and hence easier to explain. This makes the evaluation in the credal case considerably simpler, without significantly affecting the quality of the adaptive process. Numerical tests on synthetic and real-world data are used to support this claim.

Submitted 25 May, 2021; originally announced May 2021.

Journal ref: Vejnarová J., Wilson N. (eds) Symbolic and Quantitative Approaches to Reasoning with Uncertainty. ECSQARU 2021. Lecture Notes in Computer Science, vol 12897. Springer, Cham

arXiv:2002.05063

A Bayesian Approach to Conversational Recommendation Systems

Authors: Francesca Mangili, Denis Broggini, Alessandro Antonucci, Marco Alberti, Lorenzo Cimasoni

Abstract: We present a conversational recommendation system based on a Bayesian approach. A probability mass function over the items is updated after any interaction with the user, with information-theoretic criteria optimally shaping the interaction and deciding when the conversation should be terminated and the most probable item consequently recommended. Dedicated elicitation techniques for the prior probabilities of the parameters modeling the interactions are derived from basic structural judgements. Such prior information can be combined with historical data to discriminate items with different recommendation histories. A case study based on the application of this approach to stagend.com, an online platform for booking entertainers, is finally discussed together with an empirical analysis showing the advantages in terms of recommendation quality and efficiency.

Submitted 12 February, 2020; originally announced February 2020.

Comments: Accepted for oral presentation at the AAAI 2020 Workshop on Interactive and Conversational Recommendation Systems (WICRS)

arXiv:1609.08905

Statistical comparison of classifiers through Bayesian hierarchical modelling

Authors: Giorgio Corani, Alessio Benavoli, Janez Demšar, Francesca Mangili, Marco Zaffalon

Abstract: Usually one compares the accuracy of two competing classifiers via null hypothesis significance tests (NHST). Yet NHST suffers from important shortcomings, which can be overcome by switching to Bayesian hypothesis testing. We propose a Bayesian hierarchical model which jointly analyzes the cross-validation results obtained by two classifiers on multiple data sets. It returns the posterior probability of the accuracies of the two classifiers being practically equivalent or significantly different. A further strength of the hierarchical model is that, by jointly analyzing the results obtained on all data sets, it reduces the estimation error compared to the usual approach of averaging the cross-validation results obtained on a given data set.

Submitted 22 November, 2016; v1 submitted 28 September, 2016; originally announced September 2016.

arXiv:1505.02288

Should we really use post-hoc tests based on mean-ranks?

Authors: Alessio Benavoli, Giorgio Corani, Francesca Mangili

Abstract: The statistical comparison of multiple algorithms over multiple data sets is fundamental in machine learning. This is typically carried out by the Friedman test. When the Friedman test rejects the null hypothesis, multiple comparisons are carried out to establish which are the significant differences among algorithms. The multiple comparisons are usually performed using the mean-ranks test. The aim of this technical note is to discuss the inconsistencies of the mean-ranks post-hoc test with the goal of discouraging its use in machine learning as well as in medicine, psychology, etc. We show that the outcome of the mean-ranks test depends on the pool of algorithms originally included in the experiment. In other words, the outcome of the comparison between algorithms A and B depends also on the performance of the other algorithms included in the original experiment. This can lead to paradoxical situations. For instance, the difference between A and B could be declared significant if the pool comprises algorithms C, D, E and not significant if the pool comprises algorithms F, G, H. To overcome these issues, we suggest instead performing the multiple comparison using a test whose outcome only depends on the two algorithms being compared, such as the sign test or the Wilcoxon signed-rank test.

Submitted 9 May, 2015; originally announced May 2015.

Showing 1–8 of 8 results for author: Mangili, F