Search | arXiv e-print repository

LLM meets Vision-Language Models for Zero-Shot One-Class Classification

Authors: Yassir Bendou, Giulia Lioi, Bastien Pasdeloup, Lukas Mauch, Ghouthi Boukli Hacene, Fabien Cardinaux, Vincent Gripon

Abstract: We consider the problem of zero-shot one-class visual classification, extending traditional one-class classification to scenarios where only the label of the target class is available. This method aims to discriminate between positive and negative query samples without requiring examples from the target class. We propose a two-step solution that first queries large language models for visually con… ▽ More We consider the problem of zero-shot one-class visual classification, extending traditional one-class classification to scenarios where only the label of the target class is available. This method aims to discriminate between positive and negative query samples without requiring examples from the target class. We propose a two-step solution that first queries large language models for visually confusing objects and then relies on vision-language pre-trained models (e.g., CLIP) to perform classification. By adapting large-scale vision benchmarks, we demonstrate the ability of the proposed method to outperform adapted off-the-shelf alternatives in this setting. Namely, we propose a realistic benchmark where negative query samples are drawn from the same original dataset as positive ones, including a granularity-controlled version of iNaturalist, where negative samples are at a fixed distance in the taxonomy tree from the positive ones. To our knowledge, we are the first to demonstrate the ability to discriminate a single category from other semantically related ones using only its label. △ Less

Submitted 27 May, 2024; v1 submitted 31 March, 2024; originally announced April 2024.

arXiv:2403.15438 [pdf, ps, other]

Unsupervised Adaptive Deep Learning Method For BCI Motor Imagery Decoding

Authors: Yassine El Ouahidi, Giulia Lioi, Nicolas Farrugia, Bastien Pasdeloup, Vincent Gripon

Abstract: In the context of Brain-Computer Interfaces, we propose an adaptive method that reaches offline performance level while being usable online without requiring supervision. Interestingly, our method does not require retraining the model, as it consists in using a frozen efficient deep learning backbone while continuously realigning data, both at input and latent spaces, based on streaming observatio… ▽ More In the context of Brain-Computer Interfaces, we propose an adaptive method that reaches offline performance level while being usable online without requiring supervision. Interestingly, our method does not require retraining the model, as it consists in using a frozen efficient deep learning backbone while continuously realigning data, both at input and latent spaces, based on streaming observations. We demonstrate its efficiency for Motor Imagery brain decoding from electroencephalography data, considering challenging cross-subject scenarios. For reproducibility, we share the code of our experiments. △ Less

Submitted 15 March, 2024; originally announced March 2024.

arXiv:2403.03569 [pdf, other]

On Transfer in Classification: How Well do Subsets of Classes Generalize?

Authors: Raphael Baena, Lucas Drumetz, Vincent Gripon

Abstract: In classification, it is usual to observe that models trained on a given set of classes can generalize to previously unseen ones, suggesting the ability to learn beyond the initial task. This ability is often leveraged in the context of transfer learning where a pretrained model can be used to process new classes, with or without fine tuning. Surprisingly, there are a few papers looking at the the… ▽ More In classification, it is usual to observe that models trained on a given set of classes can generalize to previously unseen ones, suggesting the ability to learn beyond the initial task. This ability is often leveraged in the context of transfer learning where a pretrained model can be used to process new classes, with or without fine tuning. Surprisingly, there are a few papers looking at the theoretical roots beyond this phenomenon. In this work, we are interested in laying the foundations of such a theoretical framework for transferability between sets of classes. Namely, we establish a partially ordered set of subsets of classes. This tool allows to represent which subset of classes can generalize to others. In a more practical setting, we explore the ability of our framework to predict which subset of classes can lead to the best performance when testing on all of them. We also explore few-shot learning, where transfer is the golden standard. Our work contributes to better understanding of transfer mechanics and model generalization. △ Less

Submitted 6 March, 2024; originally announced March 2024.

arXiv:2401.15834 [pdf, other]

Few and Fewer: Learning Better from Few Examples Using Fewer Base Classes

Authors: Raphael Lafargue, Yassir Bendou, Bastien Pasdeloup, Jean-Philippe Diguet, Ian Reid, Vincent Gripon, Jack Valmadre

Abstract: When training data is scarce, it is common to make use of a feature extractor that has been pre-trained on a large base dataset, either by fine-tuning its parameters on the ``target'' dataset or by directly adopting its representation as features for a simple classifier. Fine-tuning is ineffective for few-shot learning, since the target dataset contains only a handful of examples. However, directl… ▽ More When training data is scarce, it is common to make use of a feature extractor that has been pre-trained on a large base dataset, either by fine-tuning its parameters on the ``target'' dataset or by directly adopting its representation as features for a simple classifier. Fine-tuning is ineffective for few-shot learning, since the target dataset contains only a handful of examples. However, directly adopting the features without fine-tuning relies on the base and target distributions being similar enough that these features achieve separability and generalization. This paper investigates whether better features for the target dataset can be obtained by training on fewer base classes, seeking to identify a more useful base dataset for a given task.We consider cross-domain few-shot image classification in eight different domains from Meta-Dataset and entertain multiple real-world settings (domain-informed, task-informed and uninformed) where progressively less detail is known about the target task. To our knowledge, this is the first demonstration that fine-tuning on a subset of carefully selected base classes can significantly improve few-shot learning. Our contributions are simple and intuitive methods that can be implemented in any few-shot solution. We also give insights into the conditions in which these solutions are likely to provide a boost in accuracy. We release the code to reproduce all experiments from this paper on GitHub. https://github.com/RafLaf/Few-and-Fewer.git △ Less

Submitted 28 January, 2024; originally announced January 2024.

Comments: 9.5 pages + bibliography and supplementary material

MSC Class: 68T ACM Class: I.2; I.4; I.5

arXiv:2401.11311 [pdf, other]

A Novel Benchmark for Few-Shot Semantic Segmentation in the Era of Foundation Models

Authors: Reda Bensaid, Vincent Gripon, François Leduc-Primeau, Lukas Mauch, Ghouthi Boukli Hacene, Fabien Cardinaux

Abstract: In recent years, the rapid evolution of computer vision has seen the emergence of various foundation models, each tailored to specific data types and tasks. In this study, we explore the adaptation of these models for few-shot semantic segmentation. Specifically, we conduct a comprehensive comparative analysis of four prominent foundation models: DINO V2, Segment Anything, CLIP, Masked AutoEncoder… ▽ More In recent years, the rapid evolution of computer vision has seen the emergence of various foundation models, each tailored to specific data types and tasks. In this study, we explore the adaptation of these models for few-shot semantic segmentation. Specifically, we conduct a comprehensive comparative analysis of four prominent foundation models: DINO V2, Segment Anything, CLIP, Masked AutoEncoders, and of a straightforward ResNet50 pre-trained on the COCO dataset. We also include 5 adaptation methods, ranging from linear probing to fine tuning. Our findings show that DINO V2 outperforms other models by a large margin, across various datasets and adaptation methods. On the other hand, adaptation methods provide little discrepancy in the obtained results, suggesting that a simple linear probing can compete with advanced, more computationally intensive, alternatives △ Less

Submitted 2 April, 2024; v1 submitted 20 January, 2024; originally announced January 2024.

arXiv:2311.14544 [pdf, other]

Inferring Latent Class Statistics from Text for Robust Visual Few-Shot Learning

Authors: Yassir Bendou, Vincent Gripon, Bastien Pasdeloup, Giulia Lioi, Lukas Mauch, Fabien Cardinaux, Ghouthi Boukli Hacene

Abstract: In the realm of few-shot learning, foundation models like CLIP have proven effective but exhibit limitations in cross-domain robustness especially in few-shot settings. Recent works add text as an extra modality to enhance the performance of these models. Most of these approaches treat text as an auxiliary modality without fully exploring its potential to elucidate the underlying class visual feat… ▽ More In the realm of few-shot learning, foundation models like CLIP have proven effective but exhibit limitations in cross-domain robustness especially in few-shot settings. Recent works add text as an extra modality to enhance the performance of these models. Most of these approaches treat text as an auxiliary modality without fully exploring its potential to elucidate the underlying class visual features distribution. In this paper, we present a novel approach that leverages text-derived statistics to predict the mean and covariance of the visual feature distribution for each class. This predictive framework enriches the latent space, yielding more robust and generalizable few-shot learning models. We demonstrate the efficacy of incorporating both mean and covariance statistics in improving few-shot classification performance across various datasets. Our method shows that we can use text to predict the mean and covariance of the distribution offering promising improvements in few-shot learning scenarios. △ Less

Submitted 24 November, 2023; originally announced November 2023.

Comments: R0-FoMo: Workshop on Robustness of Few-shot and Zero-shot Learning in Foundation Models at NeurIPS 2023

arXiv:2309.12854 [pdf, ps, other]

ThinResNet: A New Baseline for Structured Convolutional Networks Pruning

Authors: Hugo Tessier, Ghouti Boukli Hacene, Vincent Gripon

Abstract: Pruning is a compression method which aims to improve the efficiency of neural networks by reducing their number of parameters while maintaining a good performance, thus enhancing the performance-to-cost ratio in nontrivial ways. Of particular interest are structured pruning techniques, in which whole portions of parameters are removed altogether, resulting in easier to leverage shrunk architectur… ▽ More Pruning is a compression method which aims to improve the efficiency of neural networks by reducing their number of parameters while maintaining a good performance, thus enhancing the performance-to-cost ratio in nontrivial ways. Of particular interest are structured pruning techniques, in which whole portions of parameters are removed altogether, resulting in easier to leverage shrunk architectures. Since its growth in popularity in the recent years, pruning gave birth to countless papers and contributions, resulting first in critical inconsistencies in the way results are compared, and then to a collective effort to establish standardized benchmarks. However, said benchmarks are based on training practices that date from several years ago and do not align with current practices. In this work, we verify how results in the recent literature of pruning hold up against networks that underwent both state-of-the-art training methods and trivial model scaling. We find that the latter clearly and utterly outperform all the literature we compared to, proving that updating standard pruning benchmarks and re-evaluating classical methods in their light is an absolute necessity. We thus introduce a new challenging baseline to compare structured pruning to: ThinResNet. △ Less

Submitted 22 September, 2023; originally announced September 2023.

Comments: 11 pages, 2 figures, 3 tables

arXiv:2309.07159 [pdf, other]

A Strong and Simple Deep Learning Baseline for BCI MI Decoding

Authors: Yassine El Ouahidi, Vincent Gripon, Bastien Pasdeloup, Ghaith Bouallegue, Nicolas Farrugia, Giulia Lioi

Abstract: We propose EEG-SimpleConv, a straightforward 1D convolutional neural network for Motor Imagery decoding in BCI. Our main motivation is to propose a simple and performing baseline to compare to, using only very standard ingredients from the literature. We evaluate its performance on four EEG Motor Imagery datasets, including simulated online setups, and compare it to recent Deep Learning and Machin… ▽ More We propose EEG-SimpleConv, a straightforward 1D convolutional neural network for Motor Imagery decoding in BCI. Our main motivation is to propose a simple and performing baseline to compare to, using only very standard ingredients from the literature. We evaluate its performance on four EEG Motor Imagery datasets, including simulated online setups, and compare it to recent Deep Learning and Machine Learning approaches. EEG-SimpleConv is at least as good or far more efficient than other approaches, showing strong knowledge-transfer capabilities across subjects, at the cost of a low inference time. We advocate that using off-the-shelf ingredients rather than coming with ad-hoc solutions can significantly help the adoption of Deep Learning approaches for BCI. We make the code of the models and the experiments accessible. △ Less

Submitted 25 January, 2024; v1 submitted 11 September, 2023; originally announced September 2023.

arXiv:2301.06372 [pdf, other]

Disambiguation of One-Shot Visual Classification Tasks: A Simplex-Based Approach

Authors: Yassir Bendou, Lucas Drumetz, Vincent Gripon, Giulia Lioi, Bastien Pasdeloup

Abstract: The field of visual few-shot classification aims at transferring the state-of-the-art performance of deep learning visual systems onto tasks where only a very limited number of training samples are available. The main solution consists in training a feature extractor using a large and diverse dataset to be applied to the considered few-shot task. Thanks to the encoded priors in the feature extract… ▽ More The field of visual few-shot classification aims at transferring the state-of-the-art performance of deep learning visual systems onto tasks where only a very limited number of training samples are available. The main solution consists in training a feature extractor using a large and diverse dataset to be applied to the considered few-shot task. Thanks to the encoded priors in the feature extractors, classification tasks with as little as one example (or "shot'') for each class can be solved with high accuracy, even when the shots display individual features not representative of their classes. Yet, the problem becomes more complicated when some of the given shots display multiple objects. In this paper, we present a strategy which aims at detecting the presence of multiple and previously unseen objects in a given shot. This methodology is based on identifying the corners of a simplex in a high dimensional space. We introduce an optimization routine and showcase its ability to successfully detect multiple (previously unseen) objects in raw images. Then, we introduce a downstream classifier meant to exploit the presence of multiple objects to improve the performance of few-shot classification, in the case of extreme settings where only one shot is given for its class. Using standard benchmarks of the field, we show the ability of the proposed method to slightly, yet statistically significantly, improve accuracy in these settings. △ Less

Submitted 16 January, 2023; originally announced January 2023.

arXiv:2212.06461 [pdf, ps, other]

A Statistical Model for Predicting Generalization in Few-Shot Classification

Authors: Yassir Bendou, Vincent Gripon, Bastien Pasdeloup, Lukas Mauch, Stefan Uhlich, Fabien Cardinaux, Ghouthi Boukli Hacene, Javier Alonso Garcia

Abstract: The estimation of the generalization error of classifiers often relies on a validation set. Such a set is hardly available in few-shot learning scenarios, a highly disregarded shortcoming in the field. In these scenarios, it is common to rely on features extracted from pre-trained neural networks combined with distance-based classifiers such as nearest class mean. In this work, we introduce a Gaus… ▽ More The estimation of the generalization error of classifiers often relies on a validation set. Such a set is hardly available in few-shot learning scenarios, a highly disregarded shortcoming in the field. In these scenarios, it is common to rely on features extracted from pre-trained neural networks combined with distance-based classifiers such as nearest class mean. In this work, we introduce a Gaussian model of the feature distribution. By estimating the parameters of this model, we are able to predict the generalization error on new classification tasks with few samples. We observe that accurate distance estimates between class-conditional densities are the key to accurate estimates of the generalization performance. Therefore, we propose an unbiased estimator for these distances and integrate it in our numerical analysis. We empirically show that our approach outperforms alternatives such as the leave-one-out cross-validation strategy. △ Less

Submitted 28 March, 2023; v1 submitted 13 December, 2022; originally announced December 2022.

arXiv:2211.02624 [pdf, other]

Spatial Graph Signal Interpolation with an Application for Merging BCI Datasets with Various Dimensionalities

Authors: Yassine El Ouahidi, Lucas Drumetz, Giulia Lioi, Nicolas Farrugia, Bastien Pasdeloup, Vincent Gripon

Abstract: BCI Motor Imagery datasets usually are small and have different electrodes setups. When training a Deep Neural Network, one may want to capitalize on all these datasets to increase the amount of data available and hence obtain good generalization results. To this end, we introduce a spatial graph signal interpolation technique, that allows to interpolate efficiently multiple electrodes. We conduct… ▽ More BCI Motor Imagery datasets usually are small and have different electrodes setups. When training a Deep Neural Network, one may want to capitalize on all these datasets to increase the amount of data available and hence obtain good generalization results. To this end, we introduce a spatial graph signal interpolation technique, that allows to interpolate efficiently multiple electrodes. We conduct a set of experiments with five BCI Motor Imagery datasets comparing the proposed interpolation with spherical splines interpolation. We believe that this work provides novel ideas on how to leverage graphs to interpolate electrodes and on how to homogenize multiple datasets. △ Less

Submitted 28 October, 2022; originally announced November 2022.

Comments: Submitted to the 2023 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2023)

arXiv:2209.11481 [pdf, ps, other]

Active Few-Shot Classification: a New Paradigm for Data-Scarce Learning Settings

Authors: Aymane Abdali, Vincent Gripon, Lucas Drumetz, Bartosz Boguslawski

Abstract: We consider a novel formulation of the problem of Active Few-Shot Classification (AFSC) where the objective is to classify a small, initially unlabeled, dataset given a very restrained labeling budget. This problem can be seen as a rival paradigm to classical Transductive Few-Shot Classification (TFSC), as both these approaches are applicable in similar conditions. We first propose a methodology t… ▽ More We consider a novel formulation of the problem of Active Few-Shot Classification (AFSC) where the objective is to classify a small, initially unlabeled, dataset given a very restrained labeling budget. This problem can be seen as a rival paradigm to classical Transductive Few-Shot Classification (TFSC), as both these approaches are applicable in similar conditions. We first propose a methodology that combines statistical inference, and an original two-tier active learning strategy that fits well into this framework. We then adapt several standard vision benchmarks from the field of TFSC. Our experiments show the potential benefits of AFSC can be substantial, with gains in average weighted accuracy of up to 10% compared to state-of-the-art TFSC methods for the same labeling budget. We believe this new paradigm could lead to new developments and standards in data-scarce learning settings. △ Less

Submitted 23 September, 2022; originally announced September 2022.

arXiv:2209.08527 [pdf, other]

Adaptive Dimension Reduction and Variational Inference for Transductive Few-Shot Classification

Authors: Yuqing Hu, Stéphane Pateux, Vincent Gripon

Abstract: Transductive Few-Shot learning has gained increased attention nowadays considering the cost of data annotations along with the increased accuracy provided by unlabelled samples in the domain of few shot. Especially in Few-Shot Classification (FSC), recent works explore the feature distributions aiming at maximizing likelihoods or posteriors with respect to the unknown parameters. Following this ve… ▽ More Transductive Few-Shot learning has gained increased attention nowadays considering the cost of data annotations along with the increased accuracy provided by unlabelled samples in the domain of few shot. Especially in Few-Shot Classification (FSC), recent works explore the feature distributions aiming at maximizing likelihoods or posteriors with respect to the unknown parameters. Following this vein, and considering the parallel between FSC and clustering, we seek for better taking into account the uncertainty in estimation due to lack of data, as well as better statistical properties of the clusters associated with each class. Therefore in this paper we propose a new clustering method based on Variational Bayesian inference, further improved by Adaptive Dimension Reduction based on Probabilistic Linear Discriminant Analysis. Our proposed method significantly improves accuracy in the realistic unbalanced transductive setting on various Few-Shot benchmarks when applied to features used in previous studies, with a gain of up to $6\%$ in accuracy. In addition, when applied to balanced setting, we obtain very competitive results without making use of the class-balance artefact which is disputable for practical use cases. We also provide the performance of our method on a high performing pretrained backbone, with the reported results further surpassing the current state-of-the-art accuracy, suggesting the genericity of the proposed method. △ Less

Submitted 18 September, 2022; originally announced September 2022.

arXiv:2208.03684 [pdf, other]

Preserving Fine-Grain Feature Information in Classification via Entropic Regularization

Authors: Raphael Baena, Lucas Drumetz, Vincent Gripon

Abstract: Labeling a classification dataset implies to define classes and associated coarse labels, that may approximate a smoother and more complicated ground truth. For example, natural images may contain multiple objects, only one of which is labeled in many vision datasets, or classes may result from the discretization of a regression problem. Using cross-entropy to train classification models on such c… ▽ More Labeling a classification dataset implies to define classes and associated coarse labels, that may approximate a smoother and more complicated ground truth. For example, natural images may contain multiple objects, only one of which is labeled in many vision datasets, or classes may result from the discretization of a regression problem. Using cross-entropy to train classification models on such coarse labels is likely to roughly cut through the feature space, potentially disregarding the most meaningful such features, in particular losing information on the underlying fine-grain task. In this paper we are interested in the problem of solving fine-grain classification or regression, using a model trained on coarse-grain labels only. We show that standard cross-entropy can lead to overfitting to coarse-related features. We introduce an entropy-based regularization to promote more diversity in the feature space of trained models, and empirically demonstrate the efficacy of this methodology to reach better performance on the fine-grain problems. Our results are supported through theoretical developments and empirical validation. △ Less

Submitted 7 August, 2022; originally announced August 2022.

arXiv:2206.06255 [pdf, ps, other]

doi 10.1007/978-3-031-16281-7_52

Energy Consumption Analysis of pruned Semantic Segmentation Networks on an Embedded GPU

Authors: Hugo Tessier, Vincent Gripon, Mathieu Léonardon, Matthieu Arzel, David Bertrand, Thomas Hannagan

Abstract: Deep neural networks are the state of the art in many computer vision tasks. Their deployment in the context of autonomous vehicles is of particular interest, since their limitations in terms of energy consumption prohibit the use of very large networks, that typically reach the best performance. A common method to reduce the complexity of these architectures, without sacrificing accuracy, is to r… ▽ More Deep neural networks are the state of the art in many computer vision tasks. Their deployment in the context of autonomous vehicles is of particular interest, since their limitations in terms of energy consumption prohibit the use of very large networks, that typically reach the best performance. A common method to reduce the complexity of these architectures, without sacrificing accuracy, is to rely on pruning, in which the least important portions are eliminated. There is a large literature on the subject, but interestingly few works have measured the actual impact of pruning on energy. In this work, we are interested in measuring it in the specific context of semantic segmentation for autonomous driving, using the Cityscapes dataset. To this end, we analyze the impact of recently proposed structured pruning methods when trained architectures are deployed on a Jetson Xavier embedded GPU. △ Less

Submitted 13 June, 2022; originally announced June 2022.

Comments: 10 pages, 3 figures, submitted to SysInt 2022

MSC Class: 68T07

arXiv:2206.06247 [pdf, ps, other]

doi 10.1109/SiPS55645.2022.9919253

Leveraging Structured Pruning of Convolutional Neural Networks

Authors: Hugo Tessier, Vincent Gripon, Mathieu Léonardon, Matthieu Arzel, David Bertrand, Thomas Hannagan

Abstract: Structured pruning is a popular method to reduce the cost of convolutional neural networks, that are the state of the art in many computer vision tasks. However, depending on the architecture, pruning introduces dimensional discrepancies which prevent the actual reduction of pruned networks. To tackle this problem, we propose a method that is able to take any structured pruning mask and generate a… ▽ More Structured pruning is a popular method to reduce the cost of convolutional neural networks, that are the state of the art in many computer vision tasks. However, depending on the architecture, pruning introduces dimensional discrepancies which prevent the actual reduction of pruned networks. To tackle this problem, we propose a method that is able to take any structured pruning mask and generate a network that does not encounter any of these problems and can be leveraged efficiently. We provide an accurate description of our solution and show results of gains, in energy consumption and inference time on embedded hardware, of pruned convolutional neural networks. △ Less

Submitted 13 June, 2022; originally announced June 2022.

Comments: 6 pages, 5 figures, submitted to SiPS 2022

MSC Class: 68T07

arXiv:2203.04455 [pdf, other]

Pruning Graph Convolutional Networks to select meaningful graph frequencies for fMRI decoding

Authors: Yassine El Ouahidi, Hugo Tessier, Giulia Lioi, Nicolas Farrugia, Bastien Pasdeloup, Vincent Gripon

Abstract: Graph Signal Processing is a promising framework to manipulate brain signals as it allows to encompass the spatial dependencies between the activity in regions of interest in the brain. In this work, we are interested in better understanding what are the graph frequencies that are the most useful to decode fMRI signals. To this end, we introduce a deep learning architecture and adapt a pruning met… ▽ More Graph Signal Processing is a promising framework to manipulate brain signals as it allows to encompass the spatial dependencies between the activity in regions of interest in the brain. In this work, we are interested in better understanding what are the graph frequencies that are the most useful to decode fMRI signals. To this end, we introduce a deep learning architecture and adapt a pruning methodology to automatically identify such frequencies. We experiment with various datasets, architectures and graphs, and show that low graph frequencies are consistently identified as the most important for fMRI decoding, with a stronger contribution for the functional graph over the structural one. We believe that this work provides novel insights on how graph-based methods can be deployed to increase fMRI decoding accuracy and interpretability. △ Less

Submitted 8 March, 2022; originally announced March 2022.

Comments: Submitted to the 30th European Signal Processing Conference, EUSIPCO 2022

arXiv:2201.09699 [pdf, other]

EASY: Ensemble Augmented-Shot Y-shaped Learning: State-Of-The-Art Few-Shot Classification with Simple Ingredients

Authors: Yassir Bendou, Yuqing Hu, Raphael Lafargue, Giulia Lioi, Bastien Pasdeloup, Stéphane Pateux, Vincent Gripon

Abstract: Few-shot learning aims at leveraging knowledge learned by one or more deep learning models, in order to obtain good classification performance on new problems, where only a few labeled samples per class are available. Recent years have seen a fair number of works in the field, introducing methods with numerous ingredients. A frequent problem, though, is the use of suboptimally trained models to ex… ▽ More Few-shot learning aims at leveraging knowledge learned by one or more deep learning models, in order to obtain good classification performance on new problems, where only a few labeled samples per class are available. Recent years have seen a fair number of works in the field, introducing methods with numerous ingredients. A frequent problem, though, is the use of suboptimally trained models to extract knowledge, leading to interrogations on whether proposed approaches bring gains compared to using better initial models without the introduced ingredients. In this work, we propose a simple methodology, that reaches or even beats state of the art performance on multiple standardized benchmarks of the field, while adding almost no hyperparameters or parameters to those used for training the initial deep learning models on the generic dataset. This methodology offers a new baseline on which to propose (and fairly compare) new techniques or adapt existing ones. △ Less

Submitted 7 February, 2022; v1 submitted 24 January, 2022; originally announced January 2022.

arXiv:2201.04368 [pdf, ps, other]

Preventing Manifold Intrusion with Locality: Local Mixup

Authors: Raphael Baena, Lucas Drumetz, Vincent Gripon

Abstract: Mixup is a data-dependent regularization technique that consists in linearly interpolating input samples and associated outputs. It has been shown to improve accuracy when used to train on standard machine learning datasets. However, authors have pointed out that Mixup can produce out-of-distribution virtual samples and even contradictions in the augmented training set, potentially resulting in ad… ▽ More Mixup is a data-dependent regularization technique that consists in linearly interpolating input samples and associated outputs. It has been shown to improve accuracy when used to train on standard machine learning datasets. However, authors have pointed out that Mixup can produce out-of-distribution virtual samples and even contradictions in the augmented training set, potentially resulting in adversarial effects. In this paper, we introduce Local Mixup in which distant input samples are weighted down when computing the loss. In constrained settings we demonstrate that Local Mixup can create a trade-off between bias and variance, with the extreme cases reducing to vanilla training and classical Mixup. Using standardized computer vision benchmarks , we also show that Local Mixup can improve test accuracy. △ Less

Submitted 12 January, 2022; originally announced January 2022.

arXiv:2110.09446 [pdf, other]

Squeezing Backbone Feature Distributions to the Max for Efficient Few-Shot Learning

Authors: Yuqing Hu, Vincent Gripon, Stéphane Pateux

Abstract: Few-shot classification is a challenging problem due to the uncertainty caused by using few labelled samples. In the past few years, many methods have been proposed with the common aim of transferring knowledge acquired on a previously solved task, what is often achieved by using a pretrained feature extractor. Following this vein, in this paper we propose a novel transfer-based method which aims… ▽ More Few-shot classification is a challenging problem due to the uncertainty caused by using few labelled samples. In the past few years, many methods have been proposed with the common aim of transferring knowledge acquired on a previously solved task, what is often achieved by using a pretrained feature extractor. Following this vein, in this paper we propose a novel transfer-based method which aims at processing the feature vectors so that they become closer to Gaussian-like distributions, resulting in increased accuracy. In the case of transductive few-shot learning where unlabelled test samples are available during training, we also introduce an optimal-transport inspired algorithm to boost even further the achieved performance. Using standardized vision benchmarks, we show the ability of the proposed methodology to achieve state-of-the-art accuracy with various datasets, backbone architectures and few-shot settings. △ Less

Submitted 18 October, 2021; originally announced October 2021.

Comments: Init commit. arXiv admin note: text overlap with arXiv:2006.03806

arXiv:2110.03999 [pdf, other]

Graphs as Tools to Improve Deep Learning Methods

Authors: Carlos Lassance, Myriam Bontonou, Mounia Hamidouche, Bastien Pasdeloup, Lucas Drumetz, Vincent Gripon

Abstract: In recent years, deep neural networks (DNNs) have known an important rise in popularity. However, although they are state-of-the-art in many machine learning challenges, they still suffer from several limitations. For example, DNNs require a lot of training data, which might not be available in some practical applications. In addition, when small perturbations are added to the inputs, DNNs are pro… ▽ More In recent years, deep neural networks (DNNs) have known an important rise in popularity. However, although they are state-of-the-art in many machine learning challenges, they still suffer from several limitations. For example, DNNs require a lot of training data, which might not be available in some practical applications. In addition, when small perturbations are added to the inputs, DNNs are prone to misclassification errors. DNNs are also viewed as black-boxes and as such their decisions are often criticized for their lack of interpretability. In this chapter, we review recent works that aim at using graphs as tools to improve deep learning methods. These graphs are defined considering a specific layer in a deep learning architecture. Their vertices represent distinct samples, and their edges depend on the similarity of the corresponding intermediate representations. These graphs can then be leveraged using various methodologies, many of which built on top of graph signal processing. This chapter is composed of four main parts: tools for visualizing intermediate layers in a DNN, denoising data representations, optimizing graph objective functions and regularizing the learning process. △ Less

Submitted 8 October, 2021; originally announced October 2021.

Comments: arXiv admin note: text overlap with arXiv:2012.07439

arXiv:2108.10427 [pdf, other]

Graph-LDA: Graph Structure Priors to Improve the Accuracy in Few-Shot Classification

Authors: Myriam Bontonou, Nicolas Farrugia, Vincent Gripon

Abstract: It is very common to face classification problems where the number of available labeled samples is small compared to their dimension. These conditions are likely to cause underdetermined settings, with high risk of overfitting. To improve the generalization ability of trained classifiers, common solutions include using priors about the data distribution. Among many options, data structure priors,… ▽ More It is very common to face classification problems where the number of available labeled samples is small compared to their dimension. These conditions are likely to cause underdetermined settings, with high risk of overfitting. To improve the generalization ability of trained classifiers, common solutions include using priors about the data distribution. Among many options, data structure priors, often represented through graphs, are increasingly popular in the field. In this paper, we introduce a generic model where observed class signals are supposed to be deteriorated with two sources of noise, one independent of the underlying graph structure and isotropic, and the other colored by a known graph operator. Under this model, we derive an optimal methodology to classify such signals. Interestingly, this methodology includes a single parameter, making it particularly suitable for cases where available data is scarce. Using various real datasets, we showcase the ability of the proposed model to be implemented in real world scenarios, resulting in increased generalization accuracy compared to popular alternatives. △ Less

Submitted 23 August, 2021; originally announced August 2021.

arXiv:2105.13331 [pdf, other]

doi 10.3390/s21092984

Quantization and Deployment of Deep Neural Networks on Microcontrollers

Authors: Pierre-Emmanuel Novac, Ghouthi Boukli Hacene, Alain Pegatoquet, Benoît Miramond, Vincent Gripon

Abstract: Embedding Artificial Intelligence onto low-power devices is a challenging task that has been partly overcome with recent advances in machine learning and hardware design. Presently, deep neural networks can be deployed on embedded targets to perform different tasks such as speech recognition,object detection or Human Activity Recognition. However, there is still room for optimization of deep neura… ▽ More Embedding Artificial Intelligence onto low-power devices is a challenging task that has been partly overcome with recent advances in machine learning and hardware design. Presently, deep neural networks can be deployed on embedded targets to perform different tasks such as speech recognition,object detection or Human Activity Recognition. However, there is still room for optimization of deep neural networks onto embedded devices. These optimizations mainly address power consumption,memory and real-time constraints, but also an easier deployment at the edge. Moreover, there is still a need for a better understanding of what can be achieved for different use cases. This work focuses on quantization and deployment of deep neural networks onto low-power 32-bit microcontrollers. The quantization methods, relevant in the context of an embedded execution onto a microcontroller, are first outlined. Then, a new framework for end-to-end deep neural networks training, quantization and deployment is presented. This framework, called MicroAI, is designed as an alternative to existing inference engines (TensorFlow Lite for Microcontrollers and STM32CubeAI). Our framework can indeed be easily adjusted and/or extended for specific use cases. Execution using single precision 32-bit floating-point as well as fixed-point on 8- and 16-bit integers are supported. The proposed quantization method is evaluated with three different datasets (UCI-HAR, Spoken MNIST and GTSRB). Finally, a comparison study between MicroAI and both existing embedded inference engines is provided in terms of memory and power efficiency. On-device evaluation is done using ARM Cortex-M4F-based microcontrollers (Ambiq Apollo3 and STM32L452RE). △ Less

Submitted 23 September, 2021; v1 submitted 27 May, 2021; originally announced May 2021.

Comments: 34 pages, 14 figures. Published in MDPI Sensors 2021, special issue "Embedded Artificial Intelligence (AI) for Smart Sensing and IoT Applications": https://www.mdpi.com/1424-8220/21/9/2984 . v2: add reference for MicroAI software and link to source code repository; fix Eq. 3 according to implementation; improve English grammar and spelling; improve page layout

Journal ref: Sensors 2021, 21, 2984

arXiv:2105.04922 [pdf, ps, other]

Using Deep Neural Networks to Predict and Improve the Performance of Polar Codes

Authors: Mathieu Léonardon, Vincent Gripon

Abstract: Polar codes can theoretically achieve very competitive Frame Error Rates. In practice, their performance may depend on the chosen decoding procedure, as well as other parameters of the communication system they are deployed upon. As a consequence, designing efficient polar codes for a specific context can quickly become challenging. In this paper, we introduce a methodology that consists in traini… ▽ More Polar codes can theoretically achieve very competitive Frame Error Rates. In practice, their performance may depend on the chosen decoding procedure, as well as other parameters of the communication system they are deployed upon. As a consequence, designing efficient polar codes for a specific context can quickly become challenging. In this paper, we introduce a methodology that consists in training deep neural networks to predict the frame error rate of polar codes based on their frozen bit construction sequence. We introduce an algorithm based on Projected Gradient Descent that leverages the gradient of the neural network function to generate promising frozen bit sequences. We showcase on generated datasets the ability of the proposed methodology to produce codes more efficient than those used to train the neural networks, even when the latter are selected among the most efficient ones. △ Less

Submitted 11 May, 2021; originally announced May 2021.

Comments: Submitted to ISTC 2021

arXiv:2102.09493 [pdf, other]

doi 10.23919/EUSIPCO54536.2021.9616010

Inferring Graph Signal Translations as Invariant Transformations for Classification Tasks

Authors: Raphael Baena, Lucas Drumetz, Vincent Gripon

Abstract: The field of Graph Signal Processing (GSP) has proposed tools to generalize harmonic analysis to complex domains represented through graphs. Among these tools are translations, which are required to define many others. Most works propose to define translations using solely the graph structure (i.e. edges). Such a problem is ill-posed in general as a graph conveys information about neighborhood but… ▽ More The field of Graph Signal Processing (GSP) has proposed tools to generalize harmonic analysis to complex domains represented through graphs. Among these tools are translations, which are required to define many others. Most works propose to define translations using solely the graph structure (i.e. edges). Such a problem is ill-posed in general as a graph conveys information about neighborhood but not about directions. In this paper, we propose to infer translations as edge-constrained operations that make a supervised classification problem invariant using a deep learning framework. As such, our methodology uses both the graph structure and labeled signals to infer translations. We perform experiments with regular 2D images and abstract hyperlink networks to show the effectiveness of the proposed methodology in inferring meaningful translations for signals supported on graphs. △ Less

Submitted 18 February, 2021; originally announced February 2021.

arXiv:2101.04789 [pdf, ps, other]

Improving Classification Accuracy with Graph Filtering

Authors: Mounia Hamidouche, Carlos Lassance, Yuqing Hu, Lucas Drumetz, Bastien Pasdeloup, Vincent Gripon

Abstract: In machine learning, classifiers are typically susceptible to noise in the training data. In this work, we aim at reducing intra-class noise with the help of graph filtering to improve the classification performance. Considered graphs are obtained by connecting samples of the training set that belong to a same class depending on the similarity of their representation in a latent space. We show tha… ▽ More In machine learning, classifiers are typically susceptible to noise in the training data. In this work, we aim at reducing intra-class noise with the help of graph filtering to improve the classification performance. Considered graphs are obtained by connecting samples of the training set that belong to a same class depending on the similarity of their representation in a latent space. We show that the proposed graph filtering methodology has the effect of asymptotically reducing intra-class variance, while maintaining the mean. While our approach applies to all classification problems in general, it is particularly useful in few-shot settings, where intra-class noise can have a huge impact due to the small sample selection. Using standardized benchmarks in the field of vision, we empirically demonstrate the ability of the proposed method to slightly improve state-of-the-art results in both cases of few-shot and standard classification. △ Less

Submitted 25 January, 2021; v1 submitted 12 January, 2021; originally announced January 2021.

arXiv:2012.01509 [pdf, ps, other]

DecisiveNets: Training Deep Associative Memories to Solve Complex Machine Learning Problems

Authors: Vincent Gripon, Carlos Lassance, Ghouthi Boukli Hacene

Abstract: Learning deep representations to solve complex machine learning tasks has become the prominent trend in the past few years. Indeed, Deep Neural Networks are now the golden standard in domains as various as computer vision, natural language processing or even playing combinatorial games. However, problematic limitations are hidden behind this surprising universal capability. Among other things, exp… ▽ More Learning deep representations to solve complex machine learning tasks has become the prominent trend in the past few years. Indeed, Deep Neural Networks are now the golden standard in domains as various as computer vision, natural language processing or even playing combinatorial games. However, problematic limitations are hidden behind this surprising universal capability. Among other things, explainability of the decisions is a major concern, especially since deep neural networks are made up of a very large number of trainable parameters. Moreover, computational complexity can quickly become a problem, especially in contexts constrained by real time or limited resources. Therefore, understanding how information is stored and the impact this storage can have on the system remains a major and open issue. In this chapter, we introduce a method to transform deep neural network models into deep associative memories, with simpler, more explicable and less expensive operations. We show through experiments that these transformations can be done without penalty on predictive performance. The resulting deep associative memories are excellent candidates for artificial intelligence that is easier to theorize and manipulate. △ Less

Submitted 2 December, 2020; originally announced December 2020.

Comments: Submitted as chapter proposal

arXiv:2011.12737 [pdf, ps, other]

Ranking Deep Learning Generalization using Label Variation in Latent Geometry Graphs

Authors: Carlos Lassance, Louis Béthune, Myriam Bontonou, Mounia Hamidouche, Vincent Gripon

Abstract: Measuring the generalization performance of a Deep Neural Network (DNN) without relying on a validation set is a difficult task. In this work, we propose exploiting Latent Geometry Graphs (LGGs) to represent the latent spaces of trained DNN architectures. Such graphs are obtained by connecting samples that yield similar latent representations at a given layer of the considered DNN. We then obtain… ▽ More Measuring the generalization performance of a Deep Neural Network (DNN) without relying on a validation set is a difficult task. In this work, we propose exploiting Latent Geometry Graphs (LGGs) to represent the latent spaces of trained DNN architectures. Such graphs are obtained by connecting samples that yield similar latent representations at a given layer of the considered DNN. We then obtain a generalization score by looking at how strongly connected are samples of distinct classes in LGGs. This score allowed us to rank 3rd on the NeurIPS 2020 Predicting Generalization in Deep Learning (PGDL) competition. △ Less

Submitted 25 November, 2020; originally announced November 2020.

Comments: Short paper describing submission that got the 3rd place on the NeurIPS 2020 Predicting Generalization in Deep Learning (PGDL) competition. We hope to update this with more analysis when the full data is made available

arXiv:2011.10520 [pdf, other]

doi 10.3390/jimaging8030064

Rethinking Weight Decay For Efficient Neural Network Pruning

Authors: Hugo Tessier, Vincent Gripon, Mathieu Léonardon, Matthieu Arzel, Thomas Hannagan, David Bertrand

Abstract: Introduced in the late 1980s for generalization purposes, pruning has now become a staple for compressing deep neural networks. Despite many innovations in recent decades, pruning approaches still face core issues that hinder their performance or scalability. Drawing inspiration from early work in the field, and especially the use of weight decay to achieve sparsity, we introduce Selective Weight… ▽ More Introduced in the late 1980s for generalization purposes, pruning has now become a staple for compressing deep neural networks. Despite many innovations in recent decades, pruning approaches still face core issues that hinder their performance or scalability. Drawing inspiration from early work in the field, and especially the use of weight decay to achieve sparsity, we introduce Selective Weight Decay (SWD), which carries out efficient, continuous pruning throughout training. Our approach, theoretically grounded on Lagrangian smoothing, is versatile and can be applied to multiple tasks, networks, and pruning structures. We show that SWD compares favorably to state-of-the-art approaches, in terms of performance-to-parameters ratio, on the CIFAR-10, Cora, and ImageNet ILSVRC2012 datasets. △ Less

Submitted 9 March, 2022; v1 submitted 20 November, 2020; originally announced November 2020.

Comments: 23 pages, 18 figures, published at Journal of Imaging, update : added new results, rewrite

Journal ref: Journal of Imaging 8 (2022), no. 3: 64

arXiv:2011.07343 [pdf, other]

Representing Deep Neural Networks Latent Space Geometries with Graphs

Authors: Carlos Lassance, Vincent Gripon, Antonio Ortega

Abstract: Deep Learning (DL) has attracted a lot of attention for its ability to reach state-of-the-art performance in many machine learning tasks. The core principle of DL methods consists in training composite architectures in an end-to-end fashion, where inputs are associated with outputs trained to optimize an objective function. Because of their compositional nature, DL architectures naturally exhibit… ▽ More Deep Learning (DL) has attracted a lot of attention for its ability to reach state-of-the-art performance in many machine learning tasks. The core principle of DL methods consists in training composite architectures in an end-to-end fashion, where inputs are associated with outputs trained to optimize an objective function. Because of their compositional nature, DL architectures naturally exhibit several intermediate representations of the inputs, which belong to so-called latent spaces. When treated individually, these intermediate representations are most of the time unconstrained during the learning process, as it is unclear which properties should be favored. However, when processing a batch of inputs concurrently, the corresponding set of intermediate representations exhibit relations (what we call a geometry) on which desired properties can be sought. In this work, we show that it is possible to introduce constraints on these latent geometries to address various problems. In more details, we propose to represent geometries by constructing similarity graphs from the intermediate representations obtained when processing a batch of inputs. By constraining these Latent Geometry Graphs (LGGs), we address the three following problems: i) Reproducing the behavior of a teacher architecture is achieved by mimicking its geometry, ii) Designing efficient embeddings for classification is achieved by targeting specific geometries, and iii) Robustness to deviations on inputs is achieved via enforcing smooth variation of geometry between consecutive latent spaces. Using standard vision benchmarks, we demonstrate the ability of the proposed geometry-based methods in solving the considered problems. △ Less

Submitted 14 November, 2020; originally announced November 2020.

Comments: 15 pages, submitted to MDPI Algorithms

arXiv:2010.12500 [pdf, other]

Few-shot Decoding of Brain Activation Maps

Authors: Myriam Bontonou, Giulia Lioi, Nicolas Farrugia, Vincent Gripon

Abstract: Few-shot learning addresses problems for which a limited number of training examples are available. So far, the field has been mostly driven by applications in computer vision. Here, we are interested in adapting recently introduced few-shot methods to solve problems dealing with neuroimaging data, a promising application field. To this end, we create a neuroimaging benchmark dataset for few-shot… ▽ More Few-shot learning addresses problems for which a limited number of training examples are available. So far, the field has been mostly driven by applications in computer vision. Here, we are interested in adapting recently introduced few-shot methods to solve problems dealing with neuroimaging data, a promising application field. To this end, we create a neuroimaging benchmark dataset for few-shot learning and compare multiple learning paradigms, including meta-learning, as well as various backbone networks. Our experiments show that few-shot methods are able to efficiently decode brain signals using few examples, which paves the way for a number of applications in clinical and cognitive neuroscience, such as identifying biomarkers from brain scans or understanding the generalization of brain representations across a wide range of cognitive tasks. △ Less

Submitted 19 May, 2021; v1 submitted 23 October, 2020; originally announced October 2020.

Comments: 5 pages. Updated title and minor modifications

arXiv:2009.14702 [pdf, ps, other]

doi 10.1007/s10955-021-02727-z

Some Remarks on Replicated Simulated Annealing

Authors: Vincent Gripon, Matthias Löwe, Franck Vermet

Abstract: Recently authors have introduced the idea of training discrete weights neural networks using a mix between classical simulated annealing and a replica ansatz known from the statistical physics literature. Among other points, they claim their method is able to find robust configurations. In this paper, we analyze this so-called "replicated simulated annealing" algorithm. In particular, we explicit… ▽ More Recently authors have introduced the idea of training discrete weights neural networks using a mix between classical simulated annealing and a replica ansatz known from the statistical physics literature. Among other points, they claim their method is able to find robust configurations. In this paper, we analyze this so-called "replicated simulated annealing" algorithm. In particular, we explicit criteria to guarantee its convergence, and study when it successfully samples from configurations. We also perform experiments using synthetic and real data bases. △ Less

Submitted 2 December, 2020; v1 submitted 30 September, 2020; originally announced September 2020.

arXiv:2009.12567 [pdf]

Gradients of Connectivity as Graph Fourier Bases of Brain Activity

Authors: Giulia Lioi, Vincent Gripon, Abdelbasset Brahim, François Rousseau, Nicolas Farrugia

Abstract: The application of graph theory to model the complex structure and function of the brain has shed new light on its organization and function, prompting the emergence of network neuroscience. Despite the tremendous progress that has been achieved in this field, still relatively few methods exploit the topology of brain networks to analyze brain activity. Recent attempts in this direction have lever… ▽ More The application of graph theory to model the complex structure and function of the brain has shed new light on its organization and function, prompting the emergence of network neuroscience. Despite the tremendous progress that has been achieved in this field, still relatively few methods exploit the topology of brain networks to analyze brain activity. Recent attempts in this direction have leveraged on graph spectral analysis and graph signal processing to decompose brain activity in connectivity eigenmodes or gradients. If results are promising in terms of interpretability and functional relevance, methodologies and terminology are sometimes confusing. The goals of this paper are twofold. First, we summarize recent contributions related to connectivity gradients and graph signal processing, and attempt a clarification of the terminology and methods used in the field, while pointing out current methodological limitations. Second, we discuss the perspective that the functional relevance of connectivity gradients could be fruitfully exploited by considering them as graph Fourier bases of brain activity. △ Less

Submitted 26 September, 2020; originally announced September 2020.

Comments: 14 pages, 1 figure, 2 Tables, submitted to Network Neuroscience

arXiv:2009.03665 [pdf, other]

GPU-based Self-Organizing Maps for Post-Labeled Few-Shot Unsupervised Learning

Authors: Lyes Khacef, Vincent Gripon, Benoit Miramond

Abstract: Few-shot classification is a challenge in machine learning where the goal is to train a classifier using a very limited number of labeled examples. This scenario is likely to occur frequently in real life, for example when data acquisition or labeling is expensive. In this work, we consider the problem of post-labeled few-shot unsupervised learning, a classification task where representations are… ▽ More Few-shot classification is a challenge in machine learning where the goal is to train a classifier using a very limited number of labeled examples. This scenario is likely to occur frequently in real life, for example when data acquisition or labeling is expensive. In this work, we consider the problem of post-labeled few-shot unsupervised learning, a classification task where representations are learned in an unsupervised fashion, to be later labeled using very few annotated examples. We argue that this problem is very likely to occur on the edge, when the embedded device directly acquires the data, and the expert needed to perform labeling cannot be prompted often. To address this problem, we consider an algorithm consisting of the concatenation of transfer learning with clustering using Self-Organizing Maps (SOMs). We introduce a TensorFlow-based implementation to speed-up the process in multi-core CPUs and GPUs. Finally, we demonstrate the effectiveness of the method using standard off-the-shelf few-shot classification benchmarks. △ Less

Submitted 4 September, 2020; originally announced September 2020.

Comments: Accepted for publication in the International Conference on Neural Information Processing (ICONIP) 2020. arXiv admin note: text overlap with arXiv:2009.02174

arXiv:2007.10106 [pdf, other]

ThriftyNets : Convolutional Neural Networks with Tiny Parameter Budget

Authors: Guillaume Coiffier, Ghouthi Boukli Hacene, Vincent Gripon

Abstract: Typical deep convolutional architectures present an increasing number of feature maps as we go deeper in the network, whereas spatial resolution of inputs is decreased through downsampling operations. This means that most of the parameters lay in the final layers, while a large portion of the computations are performed by a small fraction of the total parameters in the first layers. In an effort t… ▽ More Typical deep convolutional architectures present an increasing number of feature maps as we go deeper in the network, whereas spatial resolution of inputs is decreased through downsampling operations. This means that most of the parameters lay in the final layers, while a large portion of the computations are performed by a small fraction of the total parameters in the first layers. In an effort to use every parameter of a network at its maximum, we propose a new convolutional neural network architecture, called ThriftyNet. In ThriftyNet, only one convolutional layer is defined and used recursively, leading to a maximal parameter factorization. In complement, normalization, non-linearities, downsamplings and shortcut ensure sufficient expressivity of the model. ThriftyNet achieves competitive performance on a tiny parameters budget, exceeding 91% accuracy on CIFAR-10 with less than 40K parameters in total, and 74.3% on CIFAR-100 with less than 600K parameters. △ Less

Submitted 20 July, 2020; originally announced July 2020.

arXiv:2007.08216 [pdf, ps, other]

Graph topology inference benchmarks for machine learning

Authors: Carlos Lassance, Vincent Gripon, Gonzalo Mateos

Abstract: Graphs are nowadays ubiquitous in the fields of signal processing and machine learning. As a tool used to express relationships between objects, graphs can be deployed to various ends: I) clustering of vertices, II) semi-supervised classification of vertices, III) supervised classification of graph signals, and IV) denoising of graph signals. However, in many practical cases graphs are not explici… ▽ More Graphs are nowadays ubiquitous in the fields of signal processing and machine learning. As a tool used to express relationships between objects, graphs can be deployed to various ends: I) clustering of vertices, II) semi-supervised classification of vertices, III) supervised classification of graph signals, and IV) denoising of graph signals. However, in many practical cases graphs are not explicitly available and must therefore be inferred from data. Validation is a challenging endeavor that naturally depends on the downstream task for which the graph is learnt. Accordingly, it has often been difficult to compare the efficacy of different algorithms. In this work, we introduce several ease-to-use and publicly released benchmarks specifically designed to reveal the relative merits and limitations of graph inference methods. We also contrast some of the most prominent techniques in the literature. △ Less

Submitted 16 July, 2020; originally announced July 2020.

Comments: To appear in 2020 Machine Learning for Signal Processing. Code available at https://github.com/cadurosar/benchmark_graphinference

arXiv:2007.04238 [pdf, other]

Predicting the Accuracy of a Few-Shot Classifier

Authors: Myriam Bontonou, Louis Béthune, Vincent Gripon

Abstract: In the context of few-shot learning, one cannot measure the generalization ability of a trained classifier using validation sets, due to the small number of labeled samples. In this paper, we are interested in finding alternatives to answer the question: is my classifier generalizing well to previously unseen data? We first analyze the reasons for the variability of generalization performances. We… ▽ More In the context of few-shot learning, one cannot measure the generalization ability of a trained classifier using validation sets, due to the small number of labeled samples. In this paper, we are interested in finding alternatives to answer the question: is my classifier generalizing well to previously unseen data? We first analyze the reasons for the variability of generalization performances. We then investigate the case of using transfer-based solutions, and consider three settings: i) supervised where we only have access to a few labeled samples, ii) semi-supervised where we have access to both a few labeled samples and a set of unlabeled samples and iii) unsupervised where we only have access to unlabeled samples. For each setting, we propose reasonable measures that we empirically demonstrate to be correlated with the generalization ability of considered classifiers. We also show that these simple measures can be used to predict generalization up to a certain confidence. We conduct our experiments on standard few-shot vision datasets. △ Less

Submitted 8 July, 2020; originally announced July 2020.

arXiv:2006.05095 [pdf, other]

Towards an Intrinsic Definition of Robustness for a Classifier

Authors: Théo Giraudon, Vincent Gripon, Matthias Löwe, Franck Vermet

Abstract: The robustness of classifiers has become a question of paramount importance in the past few years. Indeed, it has been shown that state-of-the-art deep learning architectures can easily be fooled with imperceptible changes to their inputs. Therefore, finding good measures of robustness of a trained classifier is a key issue in the field. In this paper, we point out that averaging the radius of rob… ▽ More The robustness of classifiers has become a question of paramount importance in the past few years. Indeed, it has been shown that state-of-the-art deep learning architectures can easily be fooled with imperceptible changes to their inputs. Therefore, finding good measures of robustness of a trained classifier is a key issue in the field. In this paper, we point out that averaging the radius of robustness of samples in a validation set is a statistically weak measure. We propose instead to weight the importance of samples depending on their difficulty. We motivate the proposed score by a theoretical case study using logistic regression, where we show that the proposed score is independent of the choice of the samples it is evaluated upon. We also empirically demonstrate the ability of the proposed score to measure robustness of classifiers with little dependence on the choice of samples in more complex settings, including deep convolutional neural networks and real datasets. △ Less

Submitted 11 June, 2020; v1 submitted 9 June, 2020; originally announced June 2020.

Comments: 13 pages

arXiv:2006.03806 [pdf, other]

Leveraging the Feature Distribution in Transfer-based Few-Shot Learning

Authors: Yuqing Hu, Vincent Gripon, Stéphane Pateux

Abstract: Few-shot classification is a challenging problem due to the uncertainty caused by using few labelled samples. In the past few years, many methods have been proposed to solve few-shot classification, among which transfer-based methods have proved to achieve the best performance. Following this vein, in this paper we propose a novel transfer-based method that builds on two successive steps: 1) prepr… ▽ More Few-shot classification is a challenging problem due to the uncertainty caused by using few labelled samples. In the past few years, many methods have been proposed to solve few-shot classification, among which transfer-based methods have proved to achieve the best performance. Following this vein, in this paper we propose a novel transfer-based method that builds on two successive steps: 1) preprocessing the feature vectors so that they become closer to Gaussian-like distributions, and 2) leveraging this preprocessing using an optimal-transport inspired algorithm (in the case of transductive settings). Using standardized vision benchmarks, we prove the ability of the proposed methodology to achieve state-of-the-art accuracy with various datasets, backbone architectures and few-shot settings. △ Less

Submitted 26 January, 2021; v1 submitted 6 June, 2020; originally announced June 2020.

arXiv:2002.03090 [pdf, other]

BitPruning: Learning Bitlengths for Aggressive and Accurate Quantization

Authors: Miloš Nikolić, Ghouthi Boukli Hacene, Ciaran Bannon, Alberto Delmas Lascorz, Matthieu Courbariaux, Yoshua Bengio, Vincent Gripon, Andreas Moshovos

Abstract: Neural networks have demonstrably achieved state-of-the art accuracy using low-bitlength integer quantization, yielding both execution time and energy benefits on existing hardware designs that support short bitlengths. However, the question of finding the minimum bitlength for a desired accuracy remains open. We introduce a training method for minimizing inference bitlength at any granularity whi… ▽ More Neural networks have demonstrably achieved state-of-the art accuracy using low-bitlength integer quantization, yielding both execution time and energy benefits on existing hardware designs that support short bitlengths. However, the question of finding the minimum bitlength for a desired accuracy remains open. We introduce a training method for minimizing inference bitlength at any granularity while maintaining accuracy. Namely, we propose a regularizer that penalizes large bitlength representations throughout the architecture and show how it can be modified to minimize other quantifiable criteria, such as number of operations or memory footprint. We demonstrate that our method learns thrifty representations while maintaining accuracy. With ImageNet, the method produces an average per layer bitlength of 4.13, 3.76 and 4.36 bits on AlexNet, ResNet18 and MobileNet V2 respectively, remaining within 2.0%, 0.5% and 0.5% of the base TOP-1 accuracy. △ Less

Submitted 11 August, 2020; v1 submitted 7 February, 2020; originally announced February 2020.

arXiv:2001.09849 [pdf, other]

Graph-based Interpolation of Feature Vectors for Accurate Few-Shot Classification

Authors: Yuqing Hu, Vincent Gripon, Stéphane Pateux

Abstract: In few-shot classification, the aim is to learn models able to discriminate classes using only a small number of labeled examples. In this context, works have proposed to introduce Graph Neural Networks (GNNs) aiming at exploiting the information contained in other samples treated concurrently, what is commonly referred to as the transductive setting in the literature. These GNNs are trained all t… ▽ More In few-shot classification, the aim is to learn models able to discriminate classes using only a small number of labeled examples. In this context, works have proposed to introduce Graph Neural Networks (GNNs) aiming at exploiting the information contained in other samples treated concurrently, what is commonly referred to as the transductive setting in the literature. These GNNs are trained all together with a backbone feature extractor. In this paper, we propose a new method that relies on graphs only to interpolate feature vectors instead, resulting in a transductive learning setting with no additional parameters to train. Our proposed method thus exploits two levels of information: a) transfer features obtained on generic datasets, b) transductive information obtained from other samples to be classified. Using standard few-shot vision classification datasets, we demonstrate its ability to bring significant gains compared to other works. △ Less

Submitted 28 January, 2021; v1 submitted 27 January, 2020; originally announced January 2020.

arXiv:1911.10287 [pdf, ps, other]

Training Modern Deep Neural Networks for Memory-Fault Robustness

Authors: Ghouthi Boukli Hacene, François Leduc-Primeau, Amal Ben Soussia, Vincent Gripon, François Gagnon

Abstract: Because deep neural networks (DNNs) rely on a large number of parameters and computations, their implementation in energy-constrained systems is challenging. In this paper, we investigate the solution of reducing the supply voltage of the memories used in the system, which results in bit-cell faults. We explore the robustness of state-of-the-art DNN architectures towards such defects and propose a… ▽ More Because deep neural networks (DNNs) rely on a large number of parameters and computations, their implementation in energy-constrained systems is challenging. In this paper, we investigate the solution of reducing the supply voltage of the memories used in the system, which results in bit-cell faults. We explore the robustness of state-of-the-art DNN architectures towards such defects and propose a regularizer meant to mitigate their effects on accuracy. Our experiments clearly demonstrate the interest of operating the system in a faulty regime to save energy without reducing accuracy. △ Less

Submitted 22 November, 2019; originally announced November 2019.

Journal ref: In 2019 IEEE International Symposium on Circuits and Systems (ISCAS) (pp. 1-5). IEEE

arXiv:1911.07847 [pdf, ps, other]

Efficient Hardware Implementation of Incremental Learning and Inference on Chip

Authors: Ghouthi Boukli Hacene, Vincent Gripon, Nicolas Farrugia, Matthieu Arzel, Michel Jezequel

Abstract: In this paper, we tackle the problem of incrementally learning a classifier, one example at a time, directly on chip. To this end, we propose an efficient hardware implementation of a recently introduced incremental learning procedure that achieves state-of-the-art performance by combining transfer learning with majority votes and quantization techniques. The proposed design is able to accommodate… ▽ More In this paper, we tackle the problem of incrementally learning a classifier, one example at a time, directly on chip. To this end, we propose an efficient hardware implementation of a recently introduced incremental learning procedure that achieves state-of-the-art performance by combining transfer learning with majority votes and quantization techniques. The proposed design is able to accommodate for both new examples and new classes directly on the chip. We detail the hardware implementation of the method (implemented on FPGA target) and show it requires limited resources while providing a significant acceleration compared to using a CPU. △ Less

Submitted 17 November, 2019; originally announced November 2019.

Comments: In 2019 IEEE International NEWCAS Conference

arXiv:1911.03080 [pdf, ps, other]

Deep geometric knowledge distillation with graphs

Authors: Carlos Lassance, Myriam Bontonou, Ghouthi Boukli Hacene, Vincent Gripon, Jian Tang, Antonio Ortega

Abstract: In most cases deep learning architectures are trained disregarding the amount of operations and energy consumption. However, some applications, like embedded systems, can be resource-constrained during inference. A popular approach to reduce the size of a deep learning architecture consists in distilling knowledge from a bigger network (teacher) to a smaller one (student). Directly training the st… ▽ More In most cases deep learning architectures are trained disregarding the amount of operations and energy consumption. However, some applications, like embedded systems, can be resource-constrained during inference. A popular approach to reduce the size of a deep learning architecture consists in distilling knowledge from a bigger network (teacher) to a smaller one (student). Directly training the student to mimic the teacher representation can be effective, but it requires that both share the same latent space dimensions. In this work, we focus instead on relative knowledge distillation (RKD), which considers the geometry of the respective latent spaces, allowing for dimension-agnostic transfer of knowledge. Specifically we introduce a graph-based RKD method, in which graphs are used to capture the geometry of latent spaces. Using classical computer vision benchmarks, we demonstrate the ability of the proposed method to efficiently distillate knowledge from the teacher to the student, leading to better accuracy for the same budget as compared to existing RKD alternatives. △ Less

Submitted 8 November, 2019; originally announced November 2019.

arXiv:1911.02961 [pdf, ps, other]

Improved Visual Localization via Graph Smoothing

Authors: Carlos Lassance, Yasir Latif, Ravi Garg, Vincent Gripon, Ian Reid

Abstract: Vision based localization is the problem of inferring the pose of the camera given a single image. One solution to this problem is to learn a deep neural network to infer the pose of a query image after learning on a dataset of images with known poses. Another more commonly used approach rely on image retrieval where the query image is compared against the database of images and its pose is inferr… ▽ More Vision based localization is the problem of inferring the pose of the camera given a single image. One solution to this problem is to learn a deep neural network to infer the pose of a query image after learning on a dataset of images with known poses. Another more commonly used approach rely on image retrieval where the query image is compared against the database of images and its pose is inferred with the help of the retrieved images. The latter approach assumes that images taken from the same places consists of the same landmarks and, thus would have similar feature representations. These representation can be learned using full supervision to be robust to different variations in capture conditions like time of the day and weather. In this work, we introduce a framework to enhance the performance of these retrieval based localization methods by taking into account the additional information including GPS coordinates and temporal neighbourhood of the images provided by the acquisition process in addition to the descriptor similarity of pairs of images in the reference or query database which is used traditionally for localization. Our method constructs a graph based on this additional information and use it for robust retrieval by smoothing the feature representation of reference and/or query images. We show that the proposed method is able to significantly improve the localization accuracy on two large scale datasets over the baselines. △ Less

Submitted 7 November, 2019; originally announced November 2019.

arXiv:1909.05095 [pdf, ps, other]

doi 10.1109/DSW.2019.8755564

Structural Robustness for Deep Learning Architectures

Authors: Carlos Lassance, Vincent Gripon, Jian Tang, Antonio Ortega

Abstract: Deep Networks have been shown to provide state-of-the-art performance in many machine learning challenges. Unfortunately, they are susceptible to various types of noise, including adversarial attacks and corrupted inputs. In this work we introduce a formal definition of robustness which can be viewed as a localized Lipschitz constant of the network function, quantified in the domain of the data to… ▽ More Deep Networks have been shown to provide state-of-the-art performance in many machine learning challenges. Unfortunately, they are susceptible to various types of noise, including adversarial attacks and corrupted inputs. In this work we introduce a formal definition of robustness which can be viewed as a localized Lipschitz constant of the network function, quantified in the domain of the data to be classified. We compare this notion of robustness to existing ones, and study its connections with methods in the literature. We evaluate this metric by performing experiments on various competitive vision datasets. △ Less

Submitted 11 September, 2019; originally announced September 2019.

arXiv:1908.06868 [pdf, other]

Comparing linear structure-based and data-driven latent spatial representations for sequence prediction

Authors: Myriam Bontonou, Carlos Lassance, Vincent Gripon, Nicolas Farrugia

Abstract: Predicting the future of Graph-supported Time Series (GTS) is a key challenge in many domains, such as climate monitoring, finance or neuroimaging. Yet it is a highly difficult problem as it requires to account jointly for time and graph (spatial) dependencies. To simplify this process, it is common to use a two-step procedure in which spatial and time dependencies are dealt with separately. In th… ▽ More Predicting the future of Graph-supported Time Series (GTS) is a key challenge in many domains, such as climate monitoring, finance or neuroimaging. Yet it is a highly difficult problem as it requires to account jointly for time and graph (spatial) dependencies. To simplify this process, it is common to use a two-step procedure in which spatial and time dependencies are dealt with separately. In this paper, we are interested in comparing various linear spatial representations, namely structure-based ones and data-driven ones, in terms of how they help predict the future of GTS. To that end, we perform experiments with various datasets including spontaneous brain activity and raw videos. △ Less

Submitted 19 August, 2019; originally announced August 2019.

Journal ref: Wavelets and Sparsity XVIII, Aug 2019, San Diego, United States

arXiv:1905.12300 [pdf, ps, other]

Attention Based Pruning for Shift Networks

Authors: Ghouthi Boukli Hacene, Carlos Lassance, Vincent Gripon, Matthieu Courbariaux, Yoshua Bengio

Abstract: In many application domains such as computer vision, Convolutional Layers (CLs) are key to the accuracy of deep learning methods. However, it is often required to assemble a large number of CLs, each containing thousands of parameters, in order to reach state-of-the-art accuracy, thus resulting in complex and demanding systems that are poorly fitted to resource-limited devices. Recently, methods h… ▽ More In many application domains such as computer vision, Convolutional Layers (CLs) are key to the accuracy of deep learning methods. However, it is often required to assemble a large number of CLs, each containing thousands of parameters, in order to reach state-of-the-art accuracy, thus resulting in complex and demanding systems that are poorly fitted to resource-limited devices. Recently, methods have been proposed to replace the generic convolution operator by the combination of a shift operation and a simpler 1x1 convolution. The resulting block, called Shift Layer (SL), is an efficient alternative to CLs in the sense it allows to reach similar accuracies on various tasks with faster computations and fewer parameters. In this contribution, we introduce Shift Attention Layers (SALs), which extend SLs by using an attention mechanism that learns which shifts are the best at the same time the network function is trained. We demonstrate SALs are able to outperform vanilla SLs (and CLs) on various object recognition benchmarks while significantly reducing the number of float operations and parameters for the inference. △ Less

Submitted 29 May, 2019; originally announced May 2019.

arXiv:1905.00496 [pdf, ps, other]

A Unified Deep Learning Formalism For Processing Graph Signals

Authors: Myriam Bontonou, Carlos Lassance, Jean-Charles Vialatte, Vincent Gripon

Abstract: Convolutional Neural Networks are very efficient at processing signals defined on a discrete Euclidean space (such as images). However, as they can not be used on signals defined on an arbitrary graph, other models have emerged, aiming to extend its properties. We propose to review some of the major deep learning models designed to exploit the underlying graph structure of signals. We express them… ▽ More Convolutional Neural Networks are very efficient at processing signals defined on a discrete Euclidean space (such as images). However, as they can not be used on signals defined on an arbitrary graph, other models have emerged, aiming to extend its properties. We propose to review some of the major deep learning models designed to exploit the underlying graph structure of signals. We express them in a unified formalism, giving them a new and comparative reading. △ Less

Submitted 1 May, 2019; originally announced May 2019.

Comments: 2 pages, short version

arXiv:1905.00301 [pdf, ps, other]

Introducing Graph Smoothness Loss for Training Deep Learning Architectures

Authors: Myriam Bontonou, Carlos Lassance, Ghouthi Boukli Hacene, Vincent Gripon, Jian Tang, Antonio Ortega

Abstract: We introduce a novel loss function for training deep learning architectures to perform classification. It consists in minimizing the smoothness of label signals on similarity graphs built at the output of the architecture. Equivalently, it can be seen as maximizing the distances between the network function images of training inputs from distinct classes. As such, only distances between pairs of e… ▽ More We introduce a novel loss function for training deep learning architectures to perform classification. It consists in minimizing the smoothness of label signals on similarity graphs built at the output of the architecture. Equivalently, it can be seen as maximizing the distances between the network function images of training inputs from distinct classes. As such, only distances between pairs of examples in distinct classes are taken into account in the process, and the training does not prevent inputs from the same class to be mapped to distant locations in the output domain. We show that this loss leads to similar performance in classification as architectures trained using the classical cross-entropy, while offering interesting degrees of freedom and properties. We also demonstrate the interest of the proposed loss to increase robustness of trained architectures to deviations of the inputs. △ Less

Submitted 1 May, 2019; originally announced May 2019.

Comments: 5 pages

Showing 1–50 of 85 results for author: Gripon, V