Search | arXiv e-print repository

Restyling Unsupervised Concept Based Interpretable Networks with Generative Models

Authors: Jayneel Parekh, Quentin Bouniot, Pavlo Mozharovskyi, Alasdair Newson, Florence d'Alché-Buc

Abstract: Develo** inherently interpretable models for prediction has gained prominence in recent years. A subclass of these models, wherein the interpretable network relies on learning high-level concepts, are valued because of closeness of concept representations to human communication. However, the visualization and understanding of the learnt unsupervised dictionary of concepts encounters major limita… ▽ More Develo** inherently interpretable models for prediction has gained prominence in recent years. A subclass of these models, wherein the interpretable network relies on learning high-level concepts, are valued because of closeness of concept representations to human communication. However, the visualization and understanding of the learnt unsupervised dictionary of concepts encounters major limitations, specially for large-scale images. We propose here a novel method that relies on map** the concept features to the latent space of a pretrained generative model. The use of a generative model enables high quality visualization, and naturally lays out an intuitive and interactive procedure for better interpretation of the learnt concepts. Furthermore, leveraging pretrained generative models has the additional advantage of making the training of the system more efficient. We quantitatively ascertain the efficacy of our method in terms of accuracy of the interpretable prediction network, fidelity of reconstruction, as well as faithfulness and consistency of learnt concepts. The experiments are conducted on multiple image recognition benchmarks for large-scale images. Project page available at https://jayneelparekh.github.io/VisCoIN_project_page/ △ Less

Submitted 1 July, 2024; originally announced July 2024.

Comments: Project page available at https://jayneelparekh.github.io/VisCoIN_project_page/

arXiv:2406.08074 [pdf, other]

A Concept-Based Explainability Framework for Large Multimodal Models

Authors: Jayneel Parekh, Pegah Khayatan, Mustafa Shukor, Alasdair Newson, Matthieu Cord

Abstract: Large multimodal models (LMMs) combine unimodal encoders and large language models (LLMs) to perform multimodal tasks. Despite recent advancements towards the interpretability of these models, understanding internal representations of LMMs remains largely a mystery. In this paper, we present a novel framework for the interpretation of LMMs. We propose a dictionary learning based approach, applied… ▽ More Large multimodal models (LMMs) combine unimodal encoders and large language models (LLMs) to perform multimodal tasks. Despite recent advancements towards the interpretability of these models, understanding internal representations of LMMs remains largely a mystery. In this paper, we present a novel framework for the interpretation of LMMs. We propose a dictionary learning based approach, applied to the representation of tokens. The elements of the learned dictionary correspond to our proposed concepts. We show that these concepts are well semantically grounded in both vision and text. Thus we refer to these as "multi-modal concepts". We qualitatively and quantitatively evaluate the results of the learnt concepts. We show that the extracted multimodal concepts are useful to interpret representations of test samples. Finally, we evaluate the disentanglement between different concepts and the quality of grounding concepts visually and textually. We will publicly release our code. △ Less

Submitted 12 June, 2024; originally announced June 2024.

arXiv:2305.07132 [pdf, other]

Tackling Interpretability in Audio Classification Networks with Non-negative Matrix Factorization

Authors: Jayneel Parekh, Sanjeel Parekh, Pavlo Mozharovskyi, Gaël Richard, Florence d'Alché-Buc

Abstract: This paper tackles two major problem settings for interpretability of audio processing networks, post-hoc and by-design interpretation. For post-hoc interpretation, we aim to interpret decisions of a network in terms of high-level audio objects that are also listenable for the end-user. This is extended to present an inherently interpretable model with high performance. To this end, we propose a n… ▽ More This paper tackles two major problem settings for interpretability of audio processing networks, post-hoc and by-design interpretation. For post-hoc interpretation, we aim to interpret decisions of a network in terms of high-level audio objects that are also listenable for the end-user. This is extended to present an inherently interpretable model with high performance. To this end, we propose a novel interpreter design that incorporates non-negative matrix factorization (NMF). In particular, an interpreter is trained to generate a regularized intermediate embedding from hidden layers of a target network, learnt as time-activations of a pre-learnt NMF dictionary. Our methodology allows us to generate intuitive audio-based interpretations that explicitly enhance parts of the input signal most relevant for a network's decision. We demonstrate our method's applicability on a variety of classification tasks, including multi-label data for real-world audio and music. △ Less

Submitted 11 May, 2023; originally announced May 2023.

Comments: Under submission at IEEE/ACM TASLP. arXiv admin note: text overlap with arXiv:2202.11479

arXiv:2202.11479 [pdf, other]

Listen to Interpret: Post-hoc Interpretability for Audio Networks with NMF

Authors: Jayneel Parekh, Sanjeel Parekh, Pavlo Mozharovskyi, Florence d'Alché-Buc, Gaël Richard

Abstract: This paper tackles post-hoc interpretability for audio processing networks. Our goal is to interpret decisions of a network in terms of high-level audio objects that are also listenable for the end-user. To this end, we propose a novel interpreter design that incorporates non-negative matrix factorization (NMF). In particular, a carefully regularized interpreter module is trained to take hidden la… ▽ More This paper tackles post-hoc interpretability for audio processing networks. Our goal is to interpret decisions of a network in terms of high-level audio objects that are also listenable for the end-user. To this end, we propose a novel interpreter design that incorporates non-negative matrix factorization (NMF). In particular, a carefully regularized interpreter module is trained to take hidden layer representations of the targeted network as input and produce time activations of pre-learnt NMF components as intermediate outputs. Our methodology allows us to generate intuitive audio-based interpretations that explicitly enhance parts of the input signal most relevant for a network's decision. We demonstrate our method's applicability on popular benchmarks, including a real-world multi-label classification task. △ Less

Submitted 24 October, 2022; v1 submitted 23 February, 2022; originally announced February 2022.

Comments: Accepted at NeurIPS 2022

arXiv:2110.04259 [pdf, other]

A Wireless Intrusion Detection System for 802.11 WPA3 Networks

Authors: Neil Dalal, Nadeem Akhtar, Anubhav Gupta, Nikhil Karamchandani, Gaurav S. Kasbekar, Jatin Parekh

Abstract: Wi-Fi (802.11) networks have become an essential part of our daily lives; hence, their security is of utmost importance. However, Wi-Fi Protected Access 3 (WPA3), the latest security certification for 802.11 standards, has recently been shown to be vulnerable to several attacks. In this paper, we first describe the attacks on WPA3 networks that have been reported in prior work; additionally, we sh… ▽ More Wi-Fi (802.11) networks have become an essential part of our daily lives; hence, their security is of utmost importance. However, Wi-Fi Protected Access 3 (WPA3), the latest security certification for 802.11 standards, has recently been shown to be vulnerable to several attacks. In this paper, we first describe the attacks on WPA3 networks that have been reported in prior work; additionally, we show that a deauthentication attack and a beacon flood attack, known to be possible on a WPA2 network, are still possible with WPA3. We launch and test all the above (a total of nine) attacks using a testbed that contains an enterprise Access Point (AP) and Intrusion Detection System (IDS). Our experimental results show that the AP is vulnerable to eight out of the nine attacks and the IDS is unable to detect any of them. We propose a design for a signature-based IDS, which incorporates techniques to detect all the above attacks. Also, we implement these techniques on our testbed and verify that our IDS is able to successfully detect all the above attacks. We provide schemes for mitigating the impact of the above attacks once they are detected. We make the code to perform the above attacks as well as that of our IDS publicly available, so that it can be used for future work by the research community at large. △ Less

Submitted 8 October, 2021; originally announced October 2021.

Comments: Nine pages including one page of references

arXiv:2010.09345 [pdf, other]

A Framework to Learn with Interpretation

Authors: Jayneel Parekh, Pavlo Mozharovskyi, Florence d'Alché-Buc

Abstract: To tackle interpretability in deep learning, we present a novel framework to jointly learn a predictive model and its associated interpretation model. The interpreter provides both local and global interpretability about the predictive model in terms of human-understandable high level attribute functions, with minimal loss of accuracy. This is achieved by a dedicated architecture and well chosen r… ▽ More To tackle interpretability in deep learning, we present a novel framework to jointly learn a predictive model and its associated interpretation model. The interpreter provides both local and global interpretability about the predictive model in terms of human-understandable high level attribute functions, with minimal loss of accuracy. This is achieved by a dedicated architecture and well chosen regularization penalties. We seek for a small-size dictionary of high level attribute functions that take as inputs the outputs of selected hidden layers and whose outputs feed a linear classifier. We impose strong conciseness on the activation of attributes with an entropy-based criterion while enforcing fidelity to both inputs and outputs of the predictive model. A detailed pipeline to visualize the learnt features is also developed. Moreover, besides generating interpretable models by design, our approach can be specialized to provide post-hoc interpretations for a pre-trained neural network. We validate our approach against several state-of-the-art methods on multiple datasets and show its efficacy on both kinds of tasks. △ Less

Submitted 23 February, 2022; v1 submitted 19 October, 2020; originally announced October 2020.

arXiv:2003.07703 [pdf, other]

Flexible and Context-Specific AI Explainability: A Multidisciplinary Approach

Authors: Valérie Beaudouin, Isabelle Bloch, David Bounie, Stéphan Clémençon, Florence d'Alché-Buc, James Eagan, Winston Maxwell, Pavlo Mozharovskyi, Jayneel Parekh

Abstract: The recent enthusiasm for artificial intelligence (AI) is due principally to advances in deep learning. Deep learning methods are remarkably accurate, but also opaque, which limits their potential use in safety-critical applications. To achieve trust and accountability, designers and operators of machine learning algorithms must be able to explain the inner workings, the results and the causes of… ▽ More The recent enthusiasm for artificial intelligence (AI) is due principally to advances in deep learning. Deep learning methods are remarkably accurate, but also opaque, which limits their potential use in safety-critical applications. To achieve trust and accountability, designers and operators of machine learning algorithms must be able to explain the inner workings, the results and the causes of failures of algorithms to users, regulators, and citizens. The originality of this paper is to combine technical, legal and economic aspects of explainability to develop a framework for defining the "right" level of explain-ability in a given context. We propose three logical steps: First, define the main contextual factors, such as who the audience of the explanation is, the operational context, the level of harm that the system could cause, and the legal/regulatory framework. This step will help characterize the operational and legal needs for explanation, and the corresponding social benefits. Second, examine the technical tools available, including post hoc approaches (input perturbation, saliency maps...) and hybrid AI approaches. Third, as function of the first two steps, choose the right levels of global and local explanation outputs, taking into the account the costs involved. We identify seven kinds of costs and emphasize that explanations are socially useful only when total social benefits exceed costs. △ Less

Submitted 13 March, 2020; originally announced March 2020.

arXiv:2002.06595 [pdf, other]

Speech-to-Singing Conversion in an Encoder-Decoder Framework

Authors: Jayneel Parekh, Preeti Rao, Yi-Hsuan Yang

Abstract: In this paper our goal is to convert a set of spoken lines into sung ones. Unlike previous signal processing based methods, we take a learning based approach to the problem. This allows us to automatically model various aspects of this transformation, thus overcoming dependence on specific inputs such as high quality singing templates or phoneme-score synchronization information. Specifically, we… ▽ More In this paper our goal is to convert a set of spoken lines into sung ones. Unlike previous signal processing based methods, we take a learning based approach to the problem. This allows us to automatically model various aspects of this transformation, thus overcoming dependence on specific inputs such as high quality singing templates or phoneme-score synchronization information. Specifically, we propose an encoder--decoder framework for our task. Given time-frequency representations of speech and a target melody contour, we learn encodings that enable us to synthesize singing that preserves the linguistic content and timbre of the speaker while adhering to the target melody. We also propose a multi-task learning based objective to improve lyric intelligibility. We present a quantitative and qualitative analysis of our framework. △ Less

Submitted 16 February, 2020; originally announced February 2020.

Comments: Accepted at IEEE ICASSP 2020

Showing 1–8 of 8 results for author: Parekh, J