Search | arXiv e-print repository

LLM meets Vision-Language Models for Zero-Shot One-Class Classification

Authors: Yassir Bendou, Giulia Lioi, Bastien Pasdeloup, Lukas Mauch, Ghouthi Boukli Hacene, Fabien Cardinaux, Vincent Gripon

Abstract: We consider the problem of zero-shot one-class visual classification, extending traditional one-class classification to scenarios where only the label of the target class is available. This method aims to discriminate between positive and negative query samples without requiring examples from the target class. We propose a two-step solution that first queries large language models for visually confusing objects and then relies on vision-language pre-trained models (e.g., CLIP) to perform classification. By adapting large-scale vision benchmarks, we demonstrate the ability of the proposed method to outperform adapted off-the-shelf alternatives in this setting. Namely, we propose a realistic benchmark where negative query samples are drawn from the same original dataset as positive ones, including a granularity-controlled version of iNaturalist, where negative samples are at a fixed distance in the taxonomy tree from the positive ones. To our knowledge, we are the first to demonstrate the ability to discriminate a single category from other semantically related ones using only its label.

Submitted 27 May, 2024; v1 submitted 31 March, 2024; originally announced April 2024.

arXiv:2403.15438

Unsupervised Adaptive Deep Learning Method For BCI Motor Imagery Decoding

Authors: Yassine El Ouahidi, Giulia Lioi, Nicolas Farrugia, Bastien Pasdeloup, Vincent Gripon

Abstract: In the context of Brain-Computer Interfaces, we propose an adaptive method that reaches offline performance level while being usable online without requiring supervision. Interestingly, our method does not require retraining the model, as it consists in using a frozen efficient deep learning backbone while continuously realigning data, both at input and latent spaces, based on streaming observations. We demonstrate its efficiency for Motor Imagery brain decoding from electroencephalography data, considering challenging cross-subject scenarios. For reproducibility, we share the code of our experiments.

Submitted 15 March, 2024; originally announced March 2024.

arXiv:2311.14544

Inferring Latent Class Statistics from Text for Robust Visual Few-Shot Learning

Authors: Yassir Bendou, Vincent Gripon, Bastien Pasdeloup, Giulia Lioi, Lukas Mauch, Fabien Cardinaux, Ghouthi Boukli Hacene

Abstract: In the realm of few-shot learning, foundation models like CLIP have proven effective but exhibit limitations in cross-domain robustness especially in few-shot settings. Recent works add text as an extra modality to enhance the performance of these models. Most of these approaches treat text as an auxiliary modality without fully exploring its potential to elucidate the underlying class visual features distribution. In this paper, we present a novel approach that leverages text-derived statistics to predict the mean and covariance of the visual feature distribution for each class. This predictive framework enriches the latent space, yielding more robust and generalizable few-shot learning models. We demonstrate the efficacy of incorporating both mean and covariance statistics in improving few-shot classification performance across various datasets. Our method shows that we can use text to predict the mean and covariance of the distribution offering promising improvements in few-shot learning scenarios.

Submitted 24 November, 2023; originally announced November 2023.

Comments: R0-FoMo: Workshop on Robustness of Few-shot and Zero-shot Learning in Foundation Models at NeurIPS 2023

arXiv:2309.07159

A Strong and Simple Deep Learning Baseline for BCI MI Decoding

Authors: Yassine El Ouahidi, Vincent Gripon, Bastien Pasdeloup, Ghaith Bouallegue, Nicolas Farrugia, Giulia Lioi

Abstract: We propose EEG-SimpleConv, a straightforward 1D convolutional neural network for Motor Imagery decoding in BCI. Our main motivation is to propose a simple and performing baseline to compare to, using only very standard ingredients from the literature. We evaluate its performance on four EEG Motor Imagery datasets, including simulated online setups, and compare it to recent Deep Learning and Machine Learning approaches. EEG-SimpleConv is at least as good or far more efficient than other approaches, showing strong knowledge-transfer capabilities across subjects, at the cost of a low inference time. We advocate that using off-the-shelf ingredients rather than coming with ad-hoc solutions can significantly help the adoption of Deep Learning approaches for BCI. We make the code of the models and the experiments accessible.

Submitted 25 January, 2024; v1 submitted 11 September, 2023; originally announced September 2023.

arXiv:2301.06372

Disambiguation of One-Shot Visual Classification Tasks: A Simplex-Based Approach

Authors: Yassir Bendou, Lucas Drumetz, Vincent Gripon, Giulia Lioi, Bastien Pasdeloup

Abstract: The field of visual few-shot classification aims at transferring the state-of-the-art performance of deep learning visual systems onto tasks where only a very limited number of training samples are available. The main solution consists in training a feature extractor using a large and diverse dataset to be applied to the considered few-shot task. Thanks to the encoded priors in the feature extractors, classification tasks with as little as one example (or "shot") for each class can be solved with high accuracy, even when the shots display individual features not representative of their classes. Yet, the problem becomes more complicated when some of the given shots display multiple objects. In this paper, we present a strategy which aims at detecting the presence of multiple and previously unseen objects in a given shot. This methodology is based on identifying the corners of a simplex in a high dimensional space. We introduce an optimization routine and showcase its ability to successfully detect multiple (previously unseen) objects in raw images. Then, we introduce a downstream classifier meant to exploit the presence of multiple objects to improve the performance of few-shot classification, in the case of extreme settings where only one shot is given for its class. Using standard benchmarks of the field, we show the ability of the proposed method to slightly, yet statistically significantly, improve accuracy in these settings.

Submitted 16 January, 2023; originally announced January 2023.

arXiv:2211.02624

Spatial Graph Signal Interpolation with an Application for Merging BCI Datasets with Various Dimensionalities

Authors: Yassine El Ouahidi, Lucas Drumetz, Giulia Lioi, Nicolas Farrugia, Bastien Pasdeloup, Vincent Gripon

Abstract: BCI Motor Imagery datasets usually are small and have different electrodes setups. When training a Deep Neural Network, one may want to capitalize on all these datasets to increase the amount of data available and hence obtain good generalization results. To this end, we introduce a spatial graph signal interpolation technique, that allows to interpolate efficiently multiple electrodes. We conduct a set of experiments with five BCI Motor Imagery datasets comparing the proposed interpolation with spherical splines interpolation. We believe that this work provides novel ideas on how to leverage graphs to interpolate electrodes and on how to homogenize multiple datasets.

Submitted 28 October, 2022; originally announced November 2022.

Comments: Submitted to the 2023 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2023)

arXiv:2203.04455

Pruning Graph Convolutional Networks to select meaningful graph frequencies for fMRI decoding

Authors: Yassine El Ouahidi, Hugo Tessier, Giulia Lioi, Nicolas Farrugia, Bastien Pasdeloup, Vincent Gripon

Abstract: Graph Signal Processing is a promising framework to manipulate brain signals as it allows to encompass the spatial dependencies between the activity in regions of interest in the brain. In this work, we are interested in better understanding what are the graph frequencies that are the most useful to decode fMRI signals. To this end, we introduce a deep learning architecture and adapt a pruning methodology to automatically identify such frequencies. We experiment with various datasets, architectures and graphs, and show that low graph frequencies are consistently identified as the most important for fMRI decoding, with a stronger contribution for the functional graph over the structural one. We believe that this work provides novel insights on how graph-based methods can be deployed to increase fMRI decoding accuracy and interpretability.

Submitted 8 March, 2022; originally announced March 2022.

Comments: Submitted to the 30th European Signal Processing Conference, EUSIPCO 2022

arXiv:2201.09699

EASY: Ensemble Augmented-Shot Y-shaped Learning: State-Of-The-Art Few-Shot Classification with Simple Ingredients

Authors: Yassir Bendou, Yuqing Hu, Raphael Lafargue, Giulia Lioi, Bastien Pasdeloup, Stéphane Pateux, Vincent Gripon

Abstract: Few-shot learning aims at leveraging knowledge learned by one or more deep learning models, in order to obtain good classification performance on new problems, where only a few labeled samples per class are available. Recent years have seen a fair number of works in the field, introducing methods with numerous ingredients. A frequent problem, though, is the use of suboptimally trained models to extract knowledge, leading to interrogations on whether proposed approaches bring gains compared to using better initial models without the introduced ingredients. In this work, we propose a simple methodology, that reaches or even beats state of the art performance on multiple standardized benchmarks of the field, while adding almost no hyperparameters or parameters to those used for training the initial deep learning models on the generic dataset. This methodology offers a new baseline on which to propose (and fairly compare) new techniques or adapt existing ones.

Submitted 7 February, 2022; v1 submitted 24 January, 2022; originally announced January 2022.

arXiv:2010.12500

Few-shot Decoding of Brain Activation Maps

Authors: Myriam Bontonou, Giulia Lioi, Nicolas Farrugia, Vincent Gripon

Abstract: Few-shot learning addresses problems for which a limited number of training examples are available. So far, the field has been mostly driven by applications in computer vision. Here, we are interested in adapting recently introduced few-shot methods to solve problems dealing with neuroimaging data, a promising application field. To this end, we create a neuroimaging benchmark dataset for few-shot learning and compare multiple learning paradigms, including meta-learning, as well as various backbone networks. Our experiments show that few-shot methods are able to efficiently decode brain signals using few examples, which paves the way for a number of applications in clinical and cognitive neuroscience, such as identifying biomarkers from brain scans or understanding the generalization of brain representations across a wide range of cognitive tasks.

Submitted 19 May, 2021; v1 submitted 23 October, 2020; originally announced October 2020.

Comments: 5 pages. Updated title and minor modifications

Showing 1–9 of 9 results for author: Lioi, G