TaCo: Targeted Concept Removal in Output Embeddings for NLP via Information Theory and Explainability

Authors: Fanny Jourdan, Louis Béthune, Agustin Picard, Laurent Risser, Nicholas Asher

Abstract: The fairness of Natural Language Processing (NLP) models has emerged as a crucial concern. Information theory indicates that to achieve fairness, a model should not be able to predict sensitive variables, such as gender, ethnicity, and age. However, information related to these variables often appears implicitly in language, posing a challenge in identifying and mitigating biases effectively. To tackle this issue, we present a novel approach that operates at the embedding level of an NLP model, independent of the specific architecture. Our method leverages insights from recent advances in XAI techniques and employs an embedding transformation to eliminate implicit information from a selected variable. By directly manipulating the embeddings in the final layer, our approach enables a seamless integration into existing models without requiring significant modifications or retraining. In evaluation, we show that the proposed post-hoc approach significantly reduces gender-related associations in NLP models while preserving the overall performance and functionality of the models. An implementation of our method is available: https://github.com/fanny-jourdan/TaCo

Submitted 12 April, 2024; v1 submitted 11 December, 2023; originally announced December 2023.

arXiv:2306.05307 [pdf, other]

Are fairness metric scores enough to assess discrimination biases in machine learning?

Authors: Fanny Jourdan, Laurent Risser, Jean-Michel Loubes, Nicholas Asher

Abstract: This paper presents novel experiments shedding light on the shortcomings of current metrics for assessing biases of gender discrimination made by machine learning algorithms on textual data. We focus on the Bios dataset, and our learning task is to predict the occupation of individuals, based on their biography. Such prediction tasks are common in commercial Natural Language Processing (NLP) applications such as automatic job recommendations. We address an important limitation of theoretical discussions dealing with group-wise fairness metrics: they focus on large datasets, although the norm in many industrial NLP applications is to use small to reasonably large linguistic datasets for which the main practical constraint is to get a good prediction accuracy. We then question how reliable are different popular measures of bias when the size of the training set is simply sufficient to learn reasonably accurate predictions. Our experiments sample the Bios dataset and learn more than 200 models on different sample sizes. This allows us to statistically study our results and to confirm that common gender bias indices provide diverging and sometimes unreliable results when applied to relatively small training and test samples. This highlights the crucial importance of variance calculations for providing sound results in this field.

Submitted 8 June, 2023; originally announced June 2023.

Comments: Accepted for publication at Third Workshop on Trustworthy Natural Language Processing, ACL 2023

arXiv:2305.06754 [pdf, other]

COCKATIEL: COntinuous Concept ranKed ATtribution with Interpretable ELements for explaining neural net classifiers on NLP tasks

Authors: Fanny Jourdan, Agustin Picard, Thomas Fel, Laurent Risser, Jean Michel Loubes, Nicholas Asher

Abstract: Transformer architectures are complex and their use in NLP, while it has engendered many successes, makes their interpretability or explainability challenging. Recent debates have shown that attention maps and attribution methods are unreliable (Pruthi et al., 2019; Brunner et al., 2019). In this paper, we present some of their limitations and introduce COCKATIEL, which successfully addresses some of them. COCKATIEL is a novel, post-hoc, concept-based, model-agnostic XAI technique that generates meaningful explanations from the last layer of a neural net model trained on an NLP classification task by using Non-Negative Matrix Factorization (NMF) to discover the concepts the model leverages to make predictions and by exploiting a Sensitivity Analysis to estimate accurately the importance of each of these concepts for the model. It does so without compromising the accuracy of the underlying model or requiring a new one to be trained. We conduct experiments in single and multi-aspect sentiment analysis tasks and we show COCKATIEL's superior ability to discover concepts that align with humans' on Transformer models without any supervision, we objectively verify the faithfulness of its explanations through fidelity metrics, and we showcase its ability to provide meaningful explanations in two different datasets.

Submitted 14 May, 2023; v1 submitted 11 May, 2023; originally announced May 2023.

Comments: Accepted for publication at Findings of ACL 2023

arXiv:2302.14063 [pdf, other]

How optimal transport can tackle gender biases in multi-class neural-network classifiers for job recommendations?

Authors: Fanny Jourdan, Titon Tshiongo Kaninku, Nicholas Asher, Jean-Michel Loubes, Laurent Risser

Abstract: Automatic recommendation systems based on deep neural networks have become extremely popular during the last decade. Some of these systems can however be used for applications which are ranked as High Risk by the European Commission in the A.I. act, as for instance for online job candidate recommendation. When used in the European Union, commercial AI systems for this purpose will then be required to have proper statistical properties with regard to potential discrimination they could engender. This motivated our contribution, where we present a novel optimal transport strategy to mitigate undesirable algorithmic biases in multi-class neural-network classification. Our strategy is model agnostic and can be used on any multi-class classification neural-network model. To anticipate the certification of recommendation systems using textual data, we then used it on the Bios dataset, for which the learning task consists in predicting the occupation of female and male individuals, based on their LinkedIn biography. Results show that it can reduce undesired algorithmic biases in this context to lower levels than a standard strategy.

Submitted 27 February, 2023; originally announced February 2023.

arXiv:2210.10418 [pdf, other]

p$^3$VAE: a physics-integrated generative model. Application to the pixel-wise classification of airborne hyperspectral images

Authors: Romain Thoreau, Laurent Risser, Véronique Achard, Béatrice Berthelot, Xavier Briottet

Abstract: The combination of machine learning models with physical models is a recent research path to learn robust data representations. In this paper, we introduce p$^3$VAE, a generative model that integrates a physical model which deterministically models some of the true underlying factors of variation in the data. To fully leverage our hybrid design, we enhance an existing semi-supervised optimization technique and introduce a new inference scheme that comes along with meaningful uncertainty estimates. We apply p$^3$VAE to the pixel-wise classification of airborne hyperspectral images. Our experiments on simulated and real data demonstrate the benefits of our hybrid model against conventional machine learning models in terms of extrapolation capabilities and interpretability. In particular, we show that p$^3$VAE naturally has high disentanglement capabilities. Our code and data have been made publicly available at https://github.com/Romain3Ch216/p3VAE.

Submitted 25 September, 2023; v1 submitted 19 October, 2022; originally announced October 2022.

Comments: 29 pages, 14 figures, submitted to Springer Machine Learning

MSC Class: 68T45 ACM Class: I.2.6; I.2.10

arXiv:2210.04491 [pdf, other]

A survey of Identification and mitigation of Machine Learning algorithmic biases in Image Analysis

Authors: Laurent Risser, Agustin Picard, Lucas Hervier, Jean-Michel Loubes

Abstract: The problem of algorithmic bias in machine learning has gained a lot of attention in recent years due to its concrete and potentially hazardous implications in society. In much the same manner, biases can also alter modern industrial and safety-critical applications where machine learning is based on high dimensional inputs such as images. This issue has however been mostly left out of the spotlight in the machine learning literature. Contrary to societal applications where a set of proxy variables can be provided by common sense or by regulations to draw attention to potential risks, industrial and safety-critical applications are most of the time sailing blind. The variables related to undesired biases can indeed be indirectly represented in the input data, or can be unknown, thus making them harder to tackle. This raises serious and well-founded concerns towards the commercial deployment of AI-based solutions, especially in a context where new regulations clearly address the issues opened by undesired biases in AI. Consequently, we propose here to make an overview of recent advances in this area, firstly by presenting how such biases can demonstrate themselves, then by exploring different ways to bring them to light, and by probing different possibilities to mitigate them. We finally present a practical remote sensing use-case of industrial Fairness.

Submitted 10 October, 2022; originally announced October 2022.

arXiv:2003.14263 [pdf, other]

A survey of bias in Machine Learning through the prism of Statistical Parity for the Adult Data Set

Authors: Philippe Besse, Eustasio del Barrio, Paula Gordaliza, Jean-Michel Loubes, Laurent Risser

Abstract: Applications based on Machine Learning models have now become an indispensable part of everyday life and the professional world. A critical question then recently arose among the population: Do algorithmic decisions convey any type of discrimination against specific groups of population or minorities? In this paper, we show the importance of understanding how a bias can be introduced into automatic decisions. We first present a mathematical framework for the fair learning problem, specifically in the binary classification setting. We then propose to quantify the presence of bias by using the standard Disparate Impact index on the real and well-known Adult income data set. Finally, we check the performance of different approaches aiming to reduce the bias in binary classification outcomes. Importantly, we show that some intuitive methods are ineffective. This sheds light on the fact that trying to make fair machine learning models may be a particularly challenging task, in particular when the training observations contain a bias.

Submitted 6 April, 2020; v1 submitted 31 March, 2020; originally announced March 2020.

arXiv:1908.05783 [pdf, other]

Tackling Algorithmic Bias in Neural-Network Classifiers using Wasserstein-2 Regularization

Authors: Laurent Risser, Alberto Gonzalez Sanz, Quentin Vincenot, Jean-Michel Loubes

Abstract: The increasingly common use of neural network classifiers in industrial and social applications of image analysis has allowed impressive progress these last years. Such methods are however sensitive to algorithmic bias, i.e. to an under- or an over-representation of positive predictions or to higher prediction errors in specific subgroups of images. We then introduce in this paper a new method to temper the algorithmic bias in Neural-Network based classifiers. Our method is Neural-Network architecture agnostic and scales well to massive training sets of images. It indeed only overloads the loss function with a Wasserstein-2 based regularization term for which we back-propagate the impact of specific output predictions using a new model, based on the Gateaux derivatives of the predictions distribution. This model is algorithmically reasonable and makes it possible to use our regularized loss with standard stochastic gradient-descent strategies. Its good behavior is assessed on the reference Adult census, MNIST, CelebA datasets.

Submitted 12 November, 2021; v1 submitted 15 August, 2019; originally announced August 2019.

arXiv:1810.07924 [pdf, other]

Explaining Machine Learning Models using Entropic Variable Projection

Authors: François Bachoc, Fabrice Gamboa, Max Halford, Jean-Michel Loubes, Laurent Risser

Abstract: In this paper, we present a new explainability formalism designed to shed light on how each input variable of a test set impacts the predictions of machine learning models. Hence, we propose a group explainability formalism for trained machine learning decision rules, based on their response to the variability of the input variables distribution. In order to emphasize the impact of each input variable, this formalism uses an information theory framework that quantifies the influence of all input-output observations based on entropic projections. This is thus the first unified and model agnostic formalism enabling data scientists to interpret the dependence between the input variables, their impact on the prediction errors, and their influence on the output predictions. Convergence rates of the entropic projections are provided in the large sample case. Most importantly, we prove that computing an explanation in our framework has a low algorithmic complexity, making it scalable to real-life large datasets. We illustrate our strategy by explaining complex decision rules learned by using XGBoost, Random Forest or Deep Neural Network classifiers on various datasets such as Adult Income, MNIST, CelebA, Boston Housing, Iris, as well as synthetic ones. We finally make clear its differences with the explainability strategies LIME and SHAP, that are based on single observations. Results can be reproduced by using the freely distributed Python toolbox https://gems-ai.aniti.fr/.

Submitted 11 August, 2022; v1 submitted 18 October, 2018; originally announced October 2018.

arXiv:1805.10211 [pdf, other]

COREclust: a new package for a robust and scalable analysis of complex data

Authors: Camille Champion, Anne-Claire Brunet, Jean-Michel Loubes, Laurent Risser

Abstract: In this paper, we present a new R package COREclust dedicated to the detection of representative variables in high dimensional spaces with a potentially limited number of observations. Variable sets detection is based on an original graph clustering strategy denoted CORE-clustering algorithm that detects CORE-clusters, i.e. variable sets having a user defined size range and in which each variable is very similar to at least another variable. Representative variables are then robustly estimated as the CORE-cluster centers. This strategy is entirely coded in C++ and wrapped by R using the Rcpp package. A particular effort has been dedicated to keep its algorithmic cost reasonable so that it can be used on large datasets. After motivating our work, we will explain the CORE-clustering algorithm as well as a greedy extension of this algorithm. We will then present how to use it and results obtained on synthetic and real data.

Submitted 25 May, 2018; originally announced May 2018.

Showing 1–10 of 10 results for author: Risser, L