-
Leveraging PAC-Bayes Theory and Gibbs Distributions for Generalization Bounds with Complexity Measures
Authors:
Paul Viallard,
Rémi Emonet,
Amaury Habrard,
Emilie Morvant,
Valentina Zantedeschi
Abstract:
In statistical learning theory, a generalization bound usually involves a complexity measure imposed by the considered theoretical framework. This limits the scope of such bounds, as other forms of capacity measures or regularizations are used in algorithms. In this paper, we leverage the framework of disintegrated PAC-Bayes bounds to derive a general generalization bound instantiable with arbitrary complexity measures. One trick to prove such a result involves considering a commonly used family of distributions: the Gibbs distributions. Our bound stands in probability jointly over the hypothesis and the learning sample, which allows the complexity to be adapted to the generalization gap as it can be customized to fit both the hypothesis class and the task.
Submitted 19 February, 2024;
originally announced February 2024.
-
Fair Text Classification with Wasserstein Independence
Authors:
Thibaud Leteno,
Antoine Gourru,
Charlotte Laclau,
Rémi Emonet,
Christophe Gravier
Abstract:
Group fairness is a central research topic in text classification, where reaching fair treatment between sensitive groups (e.g. women vs. men) remains an open challenge. This paper presents a novel method for mitigating biases in neural text classification, agnostic to the model architecture. Given the difficulty of distinguishing fair from unfair information in a text encoder, we take inspiration from adversarial training to induce Wasserstein independence between representations learned to predict our target label and the ones learned to predict some sensitive attribute. Our approach provides two significant advantages. First, it does not require annotations of sensitive attributes in both testing and training data. This is more suitable for real-life scenarios compared to existing methods that require annotations of sensitive attributes at train time. Second, our approach exhibits a comparable or better fairness-accuracy trade-off compared to existing methods.
Submitted 21 November, 2023;
originally announced November 2023.
-
Learning complexity to guide light-induced self-organized nanopatterns
Authors:
Eduardo Brandao,
Anthony Nakhoul,
Stefan Duffner,
Rémi Emonet,
Florence Garrelie,
Amaury Habrard,
François Jacquenet,
Florent Pigeon,
Marc Sebban,
Jean-Philippe Colombier
Abstract:
Ultrafast laser irradiation can induce spontaneous self-organization of surfaces into dissipative structures with nanoscale reliefs. These surface patterns emerge from symmetry-breaking dynamical processes that occur in Rayleigh-Bénard-like instabilities. In this study, we demonstrate that the coexistence and competition between surface patterns of different symmetries in two dimensions can be numerically unraveled using the stochastic generalized Swift-Hohenberg model. We originally propose a deep convolutional network to identify and learn the dominant modes that stabilize for a given bifurcation and quadratic model coefficients. The model is scale-invariant and has been calibrated on microscopy measurements using a physics-guided machine learning strategy. Our approach enables the identification of experimental irradiation conditions for a desired self-organization pattern. It can be applied generally to predict structure formation in situations where the underlying physics can be approximately described by a self-organization process and data is sparse and non-time series. Our work paves the way for supervised local manipulation of matter using timely-controlled optical fields in laser manufacturing.
Submitted 20 October, 2023;
originally announced October 2023.
-
Learning Stochastic Majority Votes by Minimizing a PAC-Bayes Generalization Bound
Authors:
Valentina Zantedeschi,
Paul Viallard,
Emilie Morvant,
Rémi Emonet,
Amaury Habrard,
Pascal Germain,
Benjamin Guedj
Abstract:
We investigate a stochastic counterpart of majority votes over finite ensembles of classifiers, and study its generalization properties. While our approach holds for arbitrary distributions, we instantiate it with Dirichlet distributions: this allows for a closed-form and differentiable expression for the expected risk, which then turns the generalization bound into a tractable training objective. The resulting stochastic majority vote learning algorithm achieves state-of-the-art accuracy and benefits from (non-vacuous) tight generalization bounds, in a series of numerical experiments when compared to competing algorithms which also minimize PAC-Bayes objectives -- both with uninformed (data-independent) and informed (data-dependent) priors.
Submitted 19 October, 2021; v1 submitted 23 June, 2021;
originally announced June 2021.
-
Mean Oriented Riesz Features for Micro Expression Classification
Authors:
Carlos Arango Duque,
Olivier Alata,
Rémi Emonet,
Hubert Konik,
Anne-Claire Legrand
Abstract:
Micro-expressions are brief and subtle facial expressions that go on and off the face in a fraction of a second. This kind of facial expression usually occurs in high-stakes situations and is considered to reflect a human's real intent. There has been some interest in micro-expression analysis; however, the great majority of methods are based on classically established computer vision methods such as local binary patterns, histogram of gradients and optical flow. A novel methodology for micro-expression recognition using the Riesz pyramid, a multi-scale steerable Hilbert transform, is presented. An image sequence is transformed with this tool, then the image phase variations are extracted and filtered as proxies for motion. Furthermore, the dominant orientation constancy from the Riesz transform is exploited to average the micro-expression sequence into an image pair. Based on that, the Mean Oriented Riesz Feature description is introduced. Finally, the performance of our methods is tested on two spontaneous micro-expression databases and compared to state-of-the-art methods.
Submitted 13 May, 2020;
originally announced May 2020.
-
An Adjusted Nearest Neighbor Algorithm Maximizing the F-Measure from Imbalanced Data
Authors:
Rémi Viola,
Rémi Emonet,
Amaury Habrard,
Guillaume Metzler,
Sébastien Riou,
Marc Sebban
Abstract:
In this paper, we address the challenging problem of learning from imbalanced data using a Nearest-Neighbor (NN) algorithm. In this setting, the minority examples typically belong to the class of interest requiring the optimization of specific criteria, like the F-Measure. Based on simple geometrical ideas, we introduce an algorithm that reweights the distance between a query sample and any positive training example. This leads to a modification of the Voronoi regions and thus of the decision boundaries of the NN algorithm. We provide a theoretical justification for the weighting scheme needed to reduce the False Negative rate while controlling the number of False Positives. We perform an extensive experimental study on many public imbalanced datasets, but also on large-scale non-public data from the French Ministry of Economy and Finance on a tax fraud detection task, showing that our method is very effective and, interestingly, yields the best performance when combined with state-of-the-art sampling methods.
Submitted 22 January, 2020; v1 submitted 2 September, 2019;
originally announced September 2019.
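The distance reweighting described in this abstract can be illustrated with a minimal sketch. It assumes a 1-nearest-neighbor rule and a single multiplicative factor `gamma` applied to distances from the query to positive examples; the name `gamma`, the helper `predict_1nn`, and the toy data are hypothetical and are not taken from the paper.

```python
import math

def predict_1nn(query, examples, gamma=1.0):
    """Return the label of the nearest example.

    Distances to positive examples (label 1) are multiplied by `gamma`;
    gamma < 1 pulls positives closer and enlarges their Voronoi regions.
    `gamma` is an illustrative assumption, not the paper's notation.
    """
    best_label, best_dist = None, math.inf
    for point, label in examples:
        d = math.dist(query, point)
        if label == 1:
            d *= gamma  # reweight only the distances to positive examples
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label

# Toy data: one positive (minority) point and one negative point.
train = [((0.0, 0.0), 1), ((1.0, 0.0), 0)]
query = (0.6, 0.0)

plain = predict_1nn(query, train)                 # unweighted nearest neighbor
adjusted = predict_1nn(query, train, gamma=0.5)   # positives pulled closer
```

With the unweighted rule the query falls on the negative side of the boundary; shrinking the distances to positives moves the boundary so the same query is labeled positive, which is how such a scheme can trade false negatives for false positives.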
-
Learning Interpretable Shapelets for Time Series Classification through Adversarial Regularization
Authors:
Yichang Wang,
Rémi Emonet,
Elisa Fromont,
Simon Malinowski,
Etienne Menager,
Loïc Mosser,
Romain Tavenard
Abstract:
Time series classification can be successfully tackled by jointly learning a shapelet-based representation of the series in the dataset and classifying the series according to this representation. However, although the learned shapelets are discriminative, they are not always similar to pieces of a real series in the dataset. This makes it difficult to interpret the decision, i.e. difficult to analyze if there are particular behaviors in a series that triggered the decision. In this paper, we make use of a simple convolutional network to tackle the time series classification task and we introduce an adversarial regularization to constrain the model to learn more interpretable shapelets. Our classification results on all the usual time series benchmarks are comparable with the results obtained by similar state-of-the-art algorithms, but our adversarially regularized method learns shapelets that are, by design, interpretable.
Submitted 12 June, 2019; v1 submitted 3 June, 2019;
originally announced June 2019.
-
End-to-End Learned Early Classification of Time Series for In-Season Crop Type Mapping
Authors:
Marc Rußwurm,
Nicolas Courty,
Rémi Emonet,
Sébastien Lefèvre,
Devis Tuia,
Romain Tavenard
Abstract:
Remote sensing satellites capture the cyclic dynamics of our planet in regular time intervals recorded in satellite time series data. End-to-end trained deep learning models use this time series data to make predictions at a large scale, for instance, to produce up-to-date crop cover maps. Most time series classification approaches focus on the accuracy of predictions. However, the earliness of the prediction is also of great importance since coming to an early decision can make a crucial difference in time-sensitive applications. In this work, we present an End-to-End Learned Early Classification of Time Series (ELECTS) model that estimates a classification score and a probability of whether sufficient data has been observed to come to an early and still accurate decision. ELECTS is modular: any deep time series classification model can adopt the ELECTS conceptual idea by adding a second prediction head that outputs a probability of stopping the classification. The ELECTS loss function then optimizes the overall model on a balanced objective of earliness and accuracy. Our experiments on four crop classification datasets from Europe and Africa show that ELECTS allows reaching state-of-the-art accuracy while massively reducing the quantity of data to be downloaded, stored, and processed. The source code is available at https://github.com/marccoru/elects.
Submitted 21 December, 2022; v1 submitted 30 January, 2019;
originally announced January 2019.
-
IoU is not submodular
Authors:
Tanguy Kerdoncuff,
Rémi Emonet
Abstract:
This short article aims to demonstrate that the Intersection over Union (or Jaccard index) is not a submodular function. This mistake has been made in an article which is cited and used as a foundation in another article. The Intersection over Union is widely used in machine learning as a cost function, especially for imbalanced data and semantic segmentation.
Submitted 3 September, 2018;
originally announced September 2018.
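The non-submodularity claim can be checked numerically on a tiny example. The sketch below assumes one particular formalization, which may differ from the paper's own construction: the IoU is treated as a set function of the predicted set `pred` for a fixed ground-truth set `truth`. A set function f is submodular when, for A ⊆ B and x ∉ B, the marginal gain of adding x to A is at least the gain of adding x to B. The sets and the helper `iou` are illustrative.

```python
from fractions import Fraction

def iou(pred, truth):
    """Intersection over Union of two finite sets, as an exact fraction."""
    union = pred | truth
    if not union:
        return Fraction(1)  # convention for two empty sets (assumption)
    return Fraction(len(pred & truth), len(union))

truth = frozenset({1})   # ground-truth set
a = frozenset({1})       # A: a perfect prediction
b = frozenset({1, 5})    # B, a superset of A, with one false positive
x = 4                    # a further false positive, not in B

# Marginal gains of adding x to A and to B.
gain_a = iou(a | {x}, truth) - iou(a, truth)
gain_b = iou(b | {x}, truth) - iou(b, truth)

# Submodularity would require gain_a >= gain_b.
violates = gain_a < gain_b
```

Here adding a false positive costs more on the smaller set A than on the larger set B, so the diminishing-returns inequality fails for this formalization.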
-
Residual Conv-Deconv Grid Network for Semantic Segmentation
Authors:
Damien Fourure,
Rémi Emonet,
Elisa Fromont,
Damien Muselet,
Alain Tremeau,
Christian Wolf
Abstract:
This paper presents GridNet, a new Convolutional Neural Network (CNN) architecture for semantic image segmentation (full scene labelling). Classical neural networks are implemented as one stream from the input to the output with subsampling operators applied in the stream in order to reduce the feature map size and to increase the receptive field for the final prediction. However, for semantic image segmentation, where the task consists in providing a semantic class to each pixel of an image, feature map reduction is harmful because it leads to a resolution loss in the output prediction. To tackle this problem, our GridNet follows a grid pattern allowing multiple interconnected streams to work at different resolutions. We show that our network generalizes many well-known networks such as conv-deconv, residual or U-Net networks. GridNet is trained from scratch and achieves competitive results on the Cityscapes dataset.
Submitted 26 July, 2017; v1 submitted 25 July, 2017;
originally announced July 2017.
-
Ten simple rules for collaborative lesson development
Authors:
Gabriel A. Devenyi,
Rémi Emonet,
Rayna M. Harris,
Kate L. Hertweck,
Damien Irving,
Ian Milligan,
Greg Wilson
Abstract:
The collaborative development methods pioneered by the open source software community offer a way to create lessons that are open, accessible, and sustainable. This paper presents ten simple rules for doing this drawn from our experience with several successful projects.
Submitted 9 July, 2017;
originally announced July 2017.
-
Improving Max-Sum through Decimation to Solve Loopy Distributed Constraint Optimization Problems
Authors:
Jesús Cerquides,
Rémi Emonet,
Gauthier Picard,
Juan A. Rodríguez-Aguilar
Abstract:
In the context of solving large distributed constraint optimization problems (DCOP), belief-propagation and approximate inference algorithms are candidates of choice. However, in general, when the factor graph is very loopy (i.e. cyclic), these solution methods suffer from bad performance, due to non-convergence and many exchanged messages. To improve the performance of the Max-Sum inference algorithm when solving loopy constraint optimization problems, we propose here to take inspiration from the belief-propagation-guided decimation used to solve sparse random graphs (k-satisfiability). We propose the novel DeciMaxSum method, which is parameterized in terms of policies to decide when to trigger decimation, which variables to decimate, and which values to assign to decimated variables. Based on an empirical evaluation on a classical BP benchmark (the Ising model), some of these combinations of policies exhibit better performance than state-of-the-art competitors.
Submitted 7 June, 2017;
originally announced June 2017.
-
L$^3$-SVMs: Landmarks-based Linear Local Support Vectors Machines
Authors:
Valentina Zantedeschi,
Rémi Emonet,
Marc Sebban
Abstract:
For their ability to capture non-linearities in the data and to scale to large training sets, local Support Vector Machines (SVMs) have received special attention during the past decade. In this paper, we introduce a new local SVM method, called L$^3$-SVMs, which clusters the input space, carries out dimensionality reduction by projecting the data on landmarks, and jointly learns a linear combination of local models. Simple and effective, our algorithm is also theoretically well-founded. Using the framework of Uniform Stability, we show that our SVM formulation comes with generalization guarantees on the true risk. The experiments based on the simplest configuration of our model (i.e. landmarks randomly selected, linear projection, linear kernel) show that L$^3$-SVMs is very competitive w.r.t. the state of the art and opens the door to new exciting lines of research.
Submitted 3 April, 2017; v1 submitted 1 March, 2017;
originally announced March 2017.
-
Lipschitz Continuity of Mahalanobis Distances and Bilinear Forms
Authors:
Valentina Zantedeschi,
Rémi Emonet,
Marc Sebban
Abstract:
Many theoretical results in the machine learning domain hold only for functions that are Lipschitz continuous. Lipschitz continuity is a strong form of continuity that linearly bounds the variations of a function. In this paper, we derive tight Lipschitz constants for two families of metrics: Mahalanobis distances and bounded-space bilinear forms. To our knowledge, this is the first time the Mahalanobis distance is formally proved to be Lipschitz continuous and that such tight Lipschitz constants are derived.
Submitted 4 April, 2016;
originally announced April 2016.