Showing 1–20 of 20 results for author: Cadene, R

  1. arXiv:2306.07304  [pdf, other]

    cs.LG cs.AI

    A Holistic Approach to Unifying Automatic Concept Extraction and Concept Importance Estimation

    Authors: Thomas Fel, Victor Boutin, Mazda Moayeri, Rémi Cadène, Louis Bethune, Léo Andéol, Mathieu Chalvidal, Thomas Serre

    Abstract: In recent years, concept-based approaches have emerged as some of the most promising explainability methods to help us interpret the decisions of Artificial Neural Networks (ANNs). These methods seek to discover intelligible visual 'concepts' buried within the complex patterns of ANN activations in two key steps: (1) concept extraction followed by (2) importance estimation. While these two steps a…

    Submitted 29 October, 2023; v1 submitted 11 June, 2023; originally announced June 2023.

    Journal ref: Conference on Neural Information Processing Systems (NeurIPS), 2023

  2. arXiv:2306.06805  [pdf, other]

    cs.CV cs.AI

    Unlocking Feature Visualization for Deeper Networks with MAgnitude Constrained Optimization

    Authors: Thomas Fel, Thibaut Boissin, Victor Boutin, Agustin Picard, Paul Novello, Julien Colin, Drew Linsley, Tom Rousseau, Rémi Cadène, Laurent Gardes, Thomas Serre

    Abstract: Feature visualization has gained substantial popularity, particularly after the influential work by Olah et al. in 2017, which established it as a crucial tool for explainability. However, its widespread adoption has been limited due to a reliance on tricks to generate interpretable images, and corresponding challenges in scaling it to deeper neural networks. Here, we describe MACO, a simple appro…

    Submitted 29 October, 2023; v1 submitted 11 June, 2023; originally announced June 2023.

    Journal ref: Conference on Neural Information Processing Systems (NeurIPS), 2023

  3. arXiv:2211.10154  [pdf, other]

    cs.CV cs.AI

    CRAFT: Concept Recursive Activation FacTorization for Explainability

    Authors: Thomas Fel, Agustin Picard, Louis Bethune, Thibaut Boissin, David Vigouroux, Julien Colin, Rémi Cadène, Thomas Serre

    Abstract: Attribution methods, which employ heatmaps to identify the most influential regions of an image that impact model decisions, have gained widespread popularity as a type of explainability method. However, recent research has exposed the limited practical value of these methods, attributed in part to their narrow focus on the most prominent regions of an image -- revealing "where" the model looks, b…

    Submitted 28 March, 2023; v1 submitted 17 November, 2022; originally announced November 2022.

    Journal ref: Proceedings of the IEEE / CVF Computer Vision and Pattern Recognition Conference (CVPR), 2023

  4. arXiv:2206.04394  [pdf, other]

    cs.LG cs.AI

    Xplique: A Deep Learning Explainability Toolbox

    Authors: Thomas Fel, Lucas Hervier, David Vigouroux, Antonin Poche, Justin Plakoo, Remi Cadene, Mathieu Chalvidal, Julien Colin, Thibaut Boissin, Louis Bethune, Agustin Picard, Claire Nicodeme, Laurent Gardes, Gregory Flandin, Thomas Serre

    Abstract: Today's most advanced machine-learning models are hardly scrutable. The key challenge for explainability methods is to help assisting researchers in opening up these black boxes, by revealing the strategy that led to a given decision, by characterizing their internal states or by studying the underlying data representation. To address this challenge, we have developed Xplique: a software library f…

    Submitted 9 June, 2022; originally announced June 2022.

  5. arXiv:2202.07728  [pdf, other]

    cs.CV

    Don't Lie to Me! Robust and Efficient Explainability with Verified Perturbation Analysis

    Authors: Thomas Fel, Melanie Ducoffe, David Vigouroux, Remi Cadene, Mikael Capelle, Claire Nicodeme, Thomas Serre

    Abstract: A variety of methods have been proposed to try to explain how deep neural networks make their decisions. Key to those approaches is the need to sample the pixel space efficiently in order to derive importance maps. However, it has been shown that the sampling methods used to date introduce biases and other artifacts, leading to inaccurate estimates of the importance of individual pixels and severe…

    Submitted 18 March, 2023; v1 submitted 15 February, 2022; originally announced February 2022.

    Journal ref: Proceedings of the IEEE / CVF Computer Vision and Pattern Recognition Conference (CVPR), 2023

  6. arXiv:2112.04417  [pdf, other]

    cs.CV cs.HC cs.LG

    What I Cannot Predict, I Do Not Understand: A Human-Centered Evaluation Framework for Explainability Methods

    Authors: Julien Colin, Thomas Fel, Remi Cadene, Thomas Serre

    Abstract: A multitude of explainability methods and associated fidelity performance metrics have been proposed to help better understand how modern AI systems make decisions. However, much of the current work has remained theoretical -- without much consideration for the human end-user. In particular, it is not yet known (1) how useful current explainability methods are in practice for more real-world scena…

    Submitted 31 January, 2023; v1 submitted 6 December, 2021; originally announced December 2021.

  7. arXiv:2111.04138  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    Look at the Variance! Efficient Black-box Explanations with Sobol-based Sensitivity Analysis

    Authors: Thomas Fel, Remi Cadene, Mathieu Chalvidal, Matthieu Cord, David Vigouroux, Thomas Serre

    Abstract: We describe a novel attribution method which is grounded in Sensitivity Analysis and uses Sobol indices. Beyond modeling the individual contributions of image regions, Sobol indices provide an efficient way to capture higher-order interactions between image regions and their contributions to a neural network's prediction through the lens of variance. We describe an approach that makes the computat…

    Submitted 7 November, 2021; originally announced November 2021.

    Comments: NeurIPS2021

    Journal ref: Conference on Neural Information Processing Systems (NeurIPS), Dec 2021, Sydney, Australia

  8. Understanding the computational demands underlying visual reasoning

    Authors: Mohit Vaishnav, Remi Cadene, Andrea Alamia, Drew Linsley, Rufin VanRullen, Thomas Serre

    Abstract: Visual understanding requires comprehending complex visual relations between objects within a scene. Here, we seek to characterize the computational demands for abstract visual reasoning. We do this by systematically assessing the ability of modern deep convolutional neural networks (CNNs) to learn to solve the "Synthetic Visual Reasoning Test" (SVRT) challenge, a collection of twenty-three visual…

    Submitted 8 December, 2021; v1 submitted 8 August, 2021; originally announced August 2021.

    Comments: 26 pages, 16 figures

    Journal ref: Neural Computation, 2022

  9. arXiv:2104.03149  [pdf, other]

    cs.CV cs.AI

    Beyond Question-Based Biases: Assessing Multimodal Shortcut Learning in Visual Question Answering

    Authors: Corentin Dancette, Remi Cadene, Damien Teney, Matthieu Cord

    Abstract: We introduce an evaluation methodology for visual question answering (VQA) to better diagnose cases of shortcut learning. These cases happen when a model exploits spurious statistical regularities to produce correct answers but does not actually deploy the desired behavior. There is a need to identify possible shortcuts in a dataset and assess their use before deploying a model in the real world.…

    Submitted 1 September, 2021; v1 submitted 7 April, 2021; originally announced April 2021.

    Comments: Accepted at ICCV 2021. Code is available at https://github.com/cdancette/detect-shortcuts

  10. arXiv:2009.04521  [pdf, other]

    cs.LG cs.AI

    How Good is your Explanation? Algorithmic Stability Measures to Assess the Quality of Explanations for Deep Neural Networks

    Authors: Thomas Fel, David Vigouroux, Rémi Cadène, Thomas Serre

    Abstract: A plethora of methods have been proposed to explain how deep neural networks reach their decisions but comparatively, little effort has been made to ensure that the explanations produced by these methods are objectively relevant. While several desirable properties for trustworthy explanations have been formulated, objective measures have been harder to derive. Here, we propose two new measures to…

    Submitted 9 November, 2021; v1 submitted 7 September, 2020; originally announced September 2020.

    Journal ref: 2022 CVF Winter Conference on Applications of Computer Vision (WACV), Jan 2022, Hawaii, United States

  11. arXiv:2006.10079  [pdf, other]

    cs.CV cs.CL cs.LG eess.IV

    Overcoming Statistical Shortcuts for Open-ended Visual Counting

    Authors: Corentin Dancette, Remi Cadene, Xinlei Chen, Matthieu Cord

    Abstract: Machine learning models tend to over-rely on statistical shortcuts. These spurious correlations between parts of the input and the output labels does not hold in real-world settings. We target this issue on the recent open-ended visual counting task which is well suited to study statistical shortcuts. We aim to develop models that learn a proper mechanism of counting regardless of the output label…

    Submitted 1 July, 2020; v1 submitted 17 June, 2020; originally announced June 2020.

    Comments: 17 pages, 8 figures

  12. arXiv:1906.10169  [pdf, other]

    cs.CV cs.CL cs.LG

    RUBi: Reducing Unimodal Biases in Visual Question Answering

    Authors: Remi Cadene, Corentin Dancette, Hedi Ben-younes, Matthieu Cord, Devi Parikh

    Abstract: Visual Question Answering (VQA) is the task of answering questions about an image. Some VQA models often exploit unimodal biases to provide the correct answer without using the image information. As a result, they suffer from a huge drop in performance when evaluated on data outside their training set distribution. This critical issue makes them unsuitable for real-world settings. We propose RUB…

    Submitted 23 March, 2020; v1 submitted 24 June, 2019; originally announced June 2019.

    Comments: NeurIPS 2019 http://papers.nips.cc/paper/8371-rubi-reducing-unimodal-biases-for-visual-question-answering

    Journal ref: Advances in Neural Information Processing Systems 2019 (pp. 839-850)

  13. arXiv:1902.09487  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    MUREL: Multimodal Relational Reasoning for Visual Question Answering

    Authors: Remi Cadene, Hedi Ben-younes, Matthieu Cord, Nicolas Thome

    Abstract: Multimodal attentional networks are currently state-of-the-art models for Visual Question Answering (VQA) tasks involving real images. Although attention allows to focus on the visual content relevant to the question, this simple mechanism is arguably insufficient to model complex reasoning features required for VQA or other high-level tasks. In this paper, we propose MuRel, a multimodal relatio…

    Submitted 25 February, 2019; originally announced February 2019.

    Comments: CVPR2019 accepted paper

  14. arXiv:1902.00038  [pdf, other]

    cs.CV

    BLOCK: Bilinear Superdiagonal Fusion for Visual Question Answering and Visual Relationship Detection

    Authors: Hedi Ben-younes, Rémi Cadene, Nicolas Thome, Matthieu Cord

    Abstract: Multimodal representation learning is gaining more and more interest within the deep learning community. While bilinear models provide an interesting framework to find subtle combination of modalities, their number of parameters grows quadratically with the input dimensions, making their practical implementation within classical deep learning pipelines challenging. In this paper, we introduce BLOC…

    Submitted 12 February, 2019; v1 submitted 31 January, 2019; originally announced February 2019.

  15. Benchmark Analysis of Representative Deep Neural Network Architectures

    Authors: Simone Bianco, Remi Cadene, Luigi Celona, Paolo Napoletano

    Abstract: This work presents an in-depth analysis of the majority of the deep neural networks (DNNs) proposed in the state of the art for image recognition. For each DNN multiple performance indices are observed, such as recognition accuracy, model complexity, computational complexity, memory usage, and inference time. The behavior of such performance indices and some combinations of them are analyzed and d…

    Submitted 19 October, 2018; v1 submitted 1 October, 2018; originally announced October 2018.

    Comments: Will appear in IEEE Access

    Journal ref: IEEE Access, 6 (2018) 64270-64277

  16. arXiv:1805.00900  [pdf, other]

    cs.AI cs.CL cs.CV cs.IR

    Images & Recipes: Retrieval in the cooking context

    Authors: Micael Carvalho, Rémi Cadène, David Picard, Laure Soulier, Matthieu Cord

    Abstract: Recent advances in the machine learning community allowed different use cases to emerge, as its association to domains like cooking which created the computational cuisine. In this paper, we tackle the picture-recipe alignment problem, having as target application the large-scale retrieval task (finding a recipe given a picture, and vice versa). Our approach is validated on the Recipe1M dataset, c…

    Submitted 2 May, 2018; originally announced May 2018.

    Comments: Published at DECOR / ICDE 2018. Extended version accepted at SIGIR 2018, available here: arXiv:1804.11146

  17. arXiv:1804.11146  [pdf, other]

    cs.CL cs.CV cs.IR

    Cross-Modal Retrieval in the Cooking Context: Learning Semantic Text-Image Embeddings

    Authors: Micael Carvalho, Rémi Cadène, David Picard, Laure Soulier, Nicolas Thome, Matthieu Cord

    Abstract: Designing powerful tools that support cooking activities has rapidly gained popularity due to the massive amounts of available data, as well as recent advances in machine learning that are capable of analyzing them. In this paper, we propose a cross-modal retrieval model aligning visual and textual data (like pictures of dishes and their recipes) in a shared representation space. We describe an ef…

    Submitted 30 April, 2018; originally announced April 2018.

    Comments: accepted at the 41st International ACM SIGIR Conference on Research and Development in Information Retrieval, 2018

  18. arXiv:1705.06676  [pdf, other]

    cs.CV

    MUTAN: Multimodal Tucker Fusion for Visual Question Answering

    Authors: Hedi Ben-younes, Rémi Cadene, Matthieu Cord, Nicolas Thome

    Abstract: Bilinear models provide an appealing framework for mixing and merging information in Visual Question Answering (VQA) tasks. They help to learn high level associations between question meaning and visual concepts in the image, but they suffer from huge dimensionality issues. We introduce MUTAN, a multimodal tensor-based Tucker decomposition to efficiently parametrize bilinear interactions between v…

    Submitted 18 May, 2017; originally announced May 2017.

  19. arXiv:1610.05567  [pdf, other]

    cs.CV

    Master's Thesis: Deep Learning for Visual Recognition

    Authors: Rémi Cadène, Nicolas Thome, Matthieu Cord

    Abstract: The goal of our research is to develop methods advancing automatic visual recognition. In order to predict the unique or multiple labels associated to an image, we study different kind of Deep Neural Networks architectures and methods for supervised features learning. We first draw up a state-of-the-art review of the Convolutional Neural Networks aiming to understand the history behind this family…

    Submitted 18 October, 2016; originally announced October 2016.

  20. arXiv:1610.05541  [pdf, other]

    cs.CV

    M2CAI Workflow Challenge: Convolutional Neural Networks with Time Smoothing and Hidden Markov Model for Video Frames Classification

    Authors: Rémi Cadène, Thomas Robert, Nicolas Thome, Matthieu Cord

    Abstract: Our approach is among the three best to tackle the M2CAI Workflow challenge. The latter consists in recognizing the operation phase for each frames of endoscopic videos. In this technical report, we compare several classification models and temporal smoothing methods. Our submitted solution is a fine tuned Residual Network-200 on 80% of the training set with temporal smoothing using simple tempora…

    Submitted 2 December, 2016; v1 submitted 18 October, 2016; originally announced October 2016.