Showing 1–34 of 34 results for author: Habrard, A

  1. arXiv:2402.13285  [pdf, other]

    stat.ML cs.LG

    Leveraging PAC-Bayes Theory and Gibbs Distributions for Generalization Bounds with Complexity Measures

    Authors: Paul Viallard, Rémi Emonet, Amaury Habrard, Emilie Morvant, Valentina Zantedeschi

    Abstract: In statistical learning theory, a generalization bound usually involves a complexity measure imposed by the considered theoretical framework. This limits the scope of such bounds, as other forms of capacity measures or regularizations are used in algorithms. In this paper, we leverage the framework of disintegrated PAC-Bayes bounds to derive a general generalization bound instantiable with arbitra…

    Submitted 19 February, 2024; originally announced February 2024.

    Comments: AISTATS 2024

  2. Towards Few-Annotation Learning for Object Detection: Are Transformer-based Models More Efficient ?

    Authors: Quentin Bouniot, Angélique Loesch, Romaric Audigier, Amaury Habrard

    Abstract: For specialized and dense downstream tasks such as object detection, labeling data requires expertise and can be very expensive, making few-shot and semi-supervised models much more attractive alternatives. While in the few-shot setup we observe that transformer-based object detectors perform better than convolution-based two-stage models for a similar amount of parameters, they are not as effecti…

    Submitted 30 October, 2023; originally announced October 2023.

    Comments: Published at WACV 2023

  3. arXiv:2310.16835  [pdf, other]

    cs.CV cs.AI cs.LG

    Proposal-Contrastive Pretraining for Object Detection from Fewer Data

    Authors: Quentin Bouniot, Romaric Audigier, Angélique Loesch, Amaury Habrard

    Abstract: The use of pretrained deep neural networks represents an attractive way to achieve strong results with few data available. When specialized in dense problems such as object detection, learning local rather than global information in images has proven to be more efficient. However, for unsupervised pretraining, the popular contrastive learning requires a large batch size and, therefore, a lot of re…

    Submitted 25 October, 2023; originally announced October 2023.

    Comments: Published as a conference paper at ICLR 2023

  4. arXiv:2209.12727  [pdf, other]

    cs.LG

    A Simple Way to Learn Metrics Between Attributed Graphs

    Authors: Yacouba Kaloga, Pierre Borgnat, Amaury Habrard

    Abstract: The choice of good distances and similarity measures between objects is important for many machine learning methods. Therefore, many metric learning algorithms have been developed in recent years, mainly for Euclidean data in order to improve performance of classification or clustering methods. However, due to difficulties in establishing computable, efficient and differentiable distances between…

    Submitted 21 December, 2022; v1 submitted 26 September, 2022; originally announced September 2022.

  5. arXiv:2106.12535  [pdf, other]

    cs.LG stat.ME stat.ML

    Learning Stochastic Majority Votes by Minimizing a PAC-Bayes Generalization Bound

    Authors: Valentina Zantedeschi, Paul Viallard, Emilie Morvant, Rémi Emonet, Amaury Habrard, Pascal Germain, Benjamin Guedj

    Abstract: We investigate a stochastic counterpart of majority votes over finite ensembles of classifiers, and study its generalization properties. While our approach holds for arbitrary distributions, we instantiate it with Dirichlet distributions: this allows for a closed-form and differentiable expression for the expected risk, which then turns the generalization bound into a tractable training objective.…

    Submitted 19 October, 2021; v1 submitted 23 June, 2021; originally announced June 2021.

    Journal ref: Proceedings of the 35th Conference on Neural Information Processing Systems (NeurIPS 2021)

  6. arXiv:2104.13626  [pdf, other]

    stat.ML cs.LG

    Self-Bounding Majority Vote Learning Algorithms by the Direct Minimization of a Tight PAC-Bayesian C-Bound

    Authors: Paul Viallard, Pascal Germain, Amaury Habrard, Emilie Morvant

    Abstract: In the PAC-Bayesian literature, the C-Bound refers to an insightful relation between the risk of a majority vote classifier (under the zero-one loss) and the first two moments of its margin (i.e., the expected margin and the voters' diversity). Until now, learning algorithms developed in this framework minimize the empirical version of the C-Bound, instead of explicit PAC-Bayesian generalization b…

    Submitted 31 August, 2021; v1 submitted 28 April, 2021; originally announced April 2021.

    Journal ref: ECML PKDD 2021, Sep 2021, Bilbao, Spain

  7. arXiv:2102.11069  [pdf, other]

    cs.LG cs.AI stat.ML

    A PAC-Bayes Analysis of Adversarial Robustness

    Authors: Paul Viallard, Guillaume Vidot, Amaury Habrard, Emilie Morvant

    Abstract: We propose the first general PAC-Bayesian generalization bounds for adversarial robustness, that estimate, at test time, how much a model will be invariant to imperceptible perturbations in the input. Instead of deriving a worst-case analysis of the risk of a hypothesis over all the possible perturbations, we leverage the PAC-Bayesian framework to bound the averaged risk on the perturbations for m…

    Submitted 27 October, 2021; v1 submitted 19 February, 2021; originally announced February 2021.

    Journal ref: NeurIPS 2021, Dec 2021, Sydney, Australia

  8. arXiv:2102.08649  [pdf, other]

    stat.ML cs.LG

    A General Framework for the Practical Disintegration of PAC-Bayesian Bounds

    Authors: Paul Viallard, Pascal Germain, Amaury Habrard, Emilie Morvant

    Abstract: PAC-Bayesian bounds are known to be tight and informative when studying the generalization ability of randomized classifiers. However, they require a loose and costly derandomization step when applied to some families of deterministic models such as neural networks. As an alternative to this step, we introduce new PAC-Bayesian generalization bounds that have the originality to provide disintegrate…

    Submitted 18 September, 2023; v1 submitted 17 February, 2021; originally announced February 2021.

    Comments: Machine Learning, In press

  9. arXiv:2010.16132  [pdf, other]

    cs.LG stat.ML

    Multiview Variational Graph Autoencoders for Canonical Correlation Analysis

    Authors: Yacouba Kaloga, Pierre Borgnat, Sundeep Prabhakar Chepuri, Patrice Abry, Amaury Habrard

    Abstract: We present a novel multiview canonical correlation analysis model based on a variational approach. This is the first nonlinear model that takes into account the available graph-based geometric constraints while being scalable for processing large scale datasets with multiple views. It is based on an autoencoder architecture with graph convolutional neural network layers. We experiment with our app…

    Submitted 4 October, 2021; v1 submitted 30 October, 2020; originally announced October 2020.

    Comments: 4 pages, 3 figures, submitted

  10. arXiv:2010.01992  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    Improving Few-Shot Learning through Multi-task Representation Learning Theory

    Authors: Quentin Bouniot, Ievgen Redko, Romaric Audigier, Angélique Loesch, Amaury Habrard

    Abstract: In this paper, we consider the framework of multi-task representation (MTR) learning where the goal is to use source tasks to learn a representation that reduces the sample complexity of solving a target task. We start by reviewing recent advances in MTR theory and show that they can provide novel insights for popular meta-learning algorithms when analyzed within this framework. In particular, we…

    Submitted 2 August, 2022; v1 submitted 5 October, 2020; originally announced October 2020.

    Comments: Accepted at ECCV2022. Code: https://github.com/CEA-LIST/MetaMTReg

  11. arXiv:2007.03373  [pdf, other]

    cs.LG cs.CV stat.ML

    Hierarchical and Unsupervised Graph Representation Learning with Loukas's Coarsening

    Authors: Louis Béthune, Yacouba Kaloga, Pierre Borgnat, Aurélien Garivier, Amaury Habrard

    Abstract: We propose a novel algorithm for unsupervised graph representation learning with attributed graphs. It combines three advantages addressing some current limitations of the literature: i) The model is inductive: it can embed new graphs without re-training in the presence of new data; ii) The method takes into account both micro-structures and macro-structures by looking at the attributed graphs at…

    Submitted 17 August, 2020; v1 submitted 7 July, 2020; originally announced July 2020.

    Comments: 19 pages, 15 figures, submitted

  12. arXiv:2004.11829  [pdf, other]

    cs.LG stat.ML

    A survey on domain adaptation theory: learning bounds and theoretical guarantees

    Authors: Ievgen Redko, Emilie Morvant, Amaury Habrard, Marc Sebban, Younès Bennani

    Abstract: All famous machine learning algorithms that comprise both supervised and semi-supervised learning work well only under a common assumption: the training and test data follow the same distribution. When the distribution changes, most statistical models must be reconstructed from newly collected data, which for some applications can be costly or impossible to obtain. Therefore, it has become necessa…

    Submitted 13 July, 2022; v1 submitted 24 April, 2020; originally announced April 2020.

  13. arXiv:1909.01651  [pdf, other]

    stat.ML cs.LG

    Metric Learning from Imbalanced Data

    Authors: Léo Gautheron, Emilie Morvant, Amaury Habrard, Marc Sebban

    Abstract: A key element of any machine learning algorithm is the use of a function that measures the dis/similarity between data points. Given a task, such a function can be optimized with a metric learning algorithm. Although this research field has received a lot of attention during the past decade, very few approaches have focused on learning a metric in an imbalanced scenario where the number of positiv…

    Submitted 4 September, 2019; originally announced September 2019.

  14. arXiv:1909.00693  [pdf, other]

    cs.LG stat.ML

    An Adjusted Nearest Neighbor Algorithm Maximizing the F-Measure from Imbalanced Data

    Authors: Rémi Viola, Rémi Emonet, Amaury Habrard, Guillaume Metzler, Sébastien Riou, Marc Sebban

    Abstract: In this paper, we address the challenging problem of learning from imbalanced data using a Nearest-Neighbor (NN) algorithm. In this setting, the minority examples typically belong to the class of interest requiring the optimization of specific criteria, like the F-Measure. Based on simple geometrical ideas, we introduce an algorithm that reweights the distance between a query sample and any positi…

    Submitted 22 January, 2020; v1 submitted 2 September, 2019; originally announced September 2019.

    Comments: In Proceedings of the 31st International Conference on Tools with Artificial Intelligence (ICTAI 2019)

  15. arXiv:1906.06203  [pdf, other]

    stat.ML cs.LG

    Learning Landmark-Based Ensembles with Random Fourier Features and Gradient Boosting

    Authors: Léo Gautheron, Pascal Germain, Amaury Habrard, Emilie Morvant, Marc Sebban, Valentina Zantedeschi

    Abstract: We propose a Gradient Boosting algorithm for learning an ensemble of kernel functions adapted to the task at hand. Unlike state-of-the-art Multiple Kernel Learning techniques that make use of a pre-computed dictionary of kernel functions to select from, at each iteration we fit a kernel by approximating it as a weighted sum of Random Fourier Features (RFF) and by optimizing their barycenter. This…

    Submitted 14 June, 2019; originally announced June 2019.

  16. Near-lossless Binarization of Word Embeddings

    Authors: Julien Tissier, Christophe Gravier, Amaury Habrard

    Abstract: Word embeddings are commonly used as a starting point in many NLP models to achieve state-of-the-art performances. However, with a large vocabulary and many dimensions, these floating-point representations are expensive both in terms of memory and calculations which makes them unsuitable for use on low-resource devices. The method proposed in this paper transforms real-valued embeddings into binar…

    Submitted 15 November, 2018; v1 submitted 24 March, 2018; originally announced March 2018.

    Comments: Accepted as a long paper at AAAI 2019

  17. arXiv:1705.08848  [pdf, other]

    stat.ML cs.LG

    Joint Distribution Optimal Transportation for Domain Adaptation

    Authors: Nicolas Courty, Rémi Flamary, Amaury Habrard, Alain Rakotomamonjy

    Abstract: This paper deals with the unsupervised domain adaptation problem, where one wants to estimate a prediction function $f$ in a given target domain without any labeled sample by exploiting the knowledge available from a source domain where labels are known. Our work makes the following assumption: there exists a non-linear transformation between the joint feature/label space distributions of the two…

    Submitted 22 October, 2017; v1 submitted 24 May, 2017; originally announced May 2017.

    Comments: Accepted for publication at NIPS 2017

  18. arXiv:1610.04783  [pdf, other]

    cs.LG

    Similarity Learning for Time Series Classification

    Authors: Maria-Irina Nicolae, Éric Gaussier, Amaury Habrard, Marc Sebban

    Abstract: Multivariate time series naturally exist in many fields, like energy, bioinformatics, signal processing, and finance. Most of these applications need to be able to compare these structured data. In this context, dynamic time warping (DTW) is probably the most common comparison measure. However, not much research effort has been put into improving it by learning. In this paper, we propose a novel m…

    Submitted 15 October, 2016; originally announced October 2016.

    Comments: Techreport

  19. arXiv:1610.04420  [pdf, ps, other]

    stat.ML cs.LG

    Theoretical Analysis of Domain Adaptation with Optimal Transport

    Authors: Ievgen Redko, Amaury Habrard, Marc Sebban

    Abstract: Domain adaptation (DA) is an important and emerging field of machine learning that tackles the problem occurring when the distributions of training (source domain) and test (target domain) data are similar but different. Current theoretical results show that the efficiency of DA algorithms depends on their capacity of minimizing the divergence between source and target probability distributions. I…

    Submitted 28 July, 2017; v1 submitted 14 October, 2016; originally announced October 2016.

  20. arXiv:1506.04573  [pdf, other]

    stat.ML cs.LG

    A New PAC-Bayesian Perspective on Domain Adaptation

    Authors: Pascal Germain, Amaury Habrard, François Laviolette, Emilie Morvant

    Abstract: We study the issue of PAC-Bayesian domain adaptation: We want to learn, from a source domain, a majority vote model dedicated to a target one. Our theoretical contribution brings a new perspective by deriving an upper-bound on the target risk where the distributions' divergence---expressed as a ratio---controls the trade-off between a source error measure and the target voters' disagreement. Our b…

    Submitted 26 July, 2016; v1 submitted 15 June, 2015; originally announced June 2015.

    Comments: Published at ICML 2016

  21. arXiv:1503.06944  [pdf, other]

    stat.ML cs.LG

    PAC-Bayesian Theorems for Domain Adaptation with Specialization to Linear Classifiers

    Authors: Pascal Germain, Amaury Habrard, François Laviolette, Emilie Morvant

    Abstract: In this paper, we provide two main contributions in PAC-Bayesian theory for domain adaptation where the objective is to learn, from a source distribution, a well-performing majority vote on a different target distribution. On the one hand, we propose an improvement of the previous approach proposed by Germain et al. (2013), that relies on a novel distribution pseudodistance based on a disagreement…

    Submitted 9 August, 2016; v1 submitted 24 March, 2015; originally announced March 2015.

    Comments: This report is a long version of our paper entitled A PAC-Bayesian Approach for Domain Adaptation with Specialization to Linear Classifiers published in the proceedings of the International Conference on Machine Learning (ICML) 2013. We improved our main results, extended our experiments, and proposed an extension to multisource domain adaptation

  22. arXiv:1501.03002  [pdf, ps, other]

    stat.ML cs.LG

    An Improvement to the Domain Adaptation Bound in a PAC-Bayesian context

    Authors: Pascal Germain, Amaury Habrard, Francois Laviolette, Emilie Morvant

    Abstract: This paper provides a theoretical analysis of domain adaptation based on the PAC-Bayesian theory. We propose an improvement of the previous domain adaptation bound obtained by Germain et al. in two ways. We first give another generalization bound tighter and easier to interpret. Moreover, we provide a new analysis of the constant term appearing in the bound that can be of high interest for develop…

    Submitted 13 January, 2015; originally announced January 2015.

    Comments: NIPS 2014 Workshop on Transfer and Multi-task learning: Theory Meets Practice, Dec 2014, Montréal, Canada

  23. arXiv:1412.6452  [pdf, ps, other]

    cs.LG

    Algorithmic Robustness for Learning via $(ε, γ, τ)$-Good Similarity Functions

    Authors: Maria-Irina Nicolae, Marc Sebban, Amaury Habrard, Éric Gaussier, Massih-Reza Amini

    Abstract: The notion of metric plays a key role in machine learning problems such as classification, clustering or ranking. However, it is worth noting that there is a severe lack of theoretical guarantees that can be expected on the generalization capacity of the classifier associated to a given metric. The theoretical framework of $(ε, γ, τ)$-good similarity functions (Balcan et al., 2008) has been one of…

    Submitted 31 March, 2015; v1 submitted 19 December, 2014; originally announced December 2014.

    Comments: ICLR 2015 Workshop - accepted

  24. arXiv:1409.5241  [pdf, other]

    cs.CV

    Subspace Alignment For Domain Adaptation

    Authors: Basura Fernando, Amaury Habrard, Marc Sebban, Tinne Tuytelaars

    Abstract: In this paper, we introduce a new domain adaptation (DA) algorithm where the source and target domains are represented by subspaces spanned by eigenvectors. Our method seeks a domain invariant feature space by learning a mapping function which aligns the source subspace with the target one. We show that the solution of the corresponding optimization problem can be obtained in a simple closed form,…

    Submitted 23 October, 2014; v1 submitted 18 September, 2014; originally announced September 2014.

  25. arXiv:1404.7796  [pdf, other]

    stat.ML cs.LG cs.MM

    Majority Vote of Diverse Classifiers for Late Fusion

    Authors: Emilie Morvant, Amaury Habrard, Stéphane Ayache

    Abstract: In the past few years, a lot of attention has been devoted to multimedia indexing by fusing multimodal information. Two kinds of fusion schemes are generally considered: The early fusion and the late fusion. We focus on late classifier fusion, where one combines the scores of each modality at the decision level. To tackle this problem, we investigate a recent and elegant well-founded quadratic pr…

    Submitted 19 June, 2014; v1 submitted 30 April, 2014; originally announced April 2014.

    Comments: IAPR Joint International Workshops on Statistical Techniques in Pattern Recognition and Structural and Syntactic Pattern Recognition, Joensuu : Finland (2014)

  26. arXiv:1312.6282  [pdf, other]

    cs.LG

    Dimension-free Concentration Bounds on Hankel Matrices for Spectral Learning

    Authors: François Denis, Mattias Gybels, Amaury Habrard

    Abstract: Learning probabilistic models over strings is an important issue for many applications. Spectral methods propose elegant solutions to the problem of inferring weighted automata from finite samples of variable-length strings drawn from an unknown target distribution. These methods rely on a singular value decomposition of a matrix $H_S$, called the Hankel matrix, that records the frequencies of (so…

    Submitted 21 December, 2013; originally announced December 2013.

    Comments: Extended version of a paper to appear at ICML 2014

  27. arXiv:1306.6709  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    A Survey on Metric Learning for Feature Vectors and Structured Data

    Authors: Aurélien Bellet, Amaury Habrard, Marc Sebban

    Abstract: The need for appropriate ways to measure the distance or similarity between data is ubiquitous in machine learning, pattern recognition and data mining, but handcrafting such good metrics for specific problems is generally difficult. This has led to the emergence of metric learning, which aims at automatically learning a metric from data and has attracted a lot of interest in machine learning and…

    Submitted 12 February, 2014; v1 submitted 27 June, 2013; originally announced June 2013.

    Comments: Technical report, 59 pages. Changes in v2: fixed typos and improved presentation. Changes in v3: fixed typos. Changes in v4: fixed typos and new methods

  28. arXiv:1212.2340  [pdf, other]

    stat.ML cs.LG

    PAC-Bayesian Learning and Domain Adaptation

    Authors: Pascal Germain, Amaury Habrard, François Laviolette, Emilie Morvant

    Abstract: In machine learning, Domain Adaptation (DA) arises when the distribution generating the test (target) data differs from the one generating the learning (source) data. It is well known that DA is a hard task even under strong assumptions, among which the covariate-shift where the source and target distributions diverge only in their marginals, i.e. they have the same labeling function. Another p…

    Submitted 11 December, 2012; originally announced December 2012.

    Comments: https://sites.google.com/site/multitradeoffs2012/

    Journal ref: Multi-Trade-offs in Machine Learning, NIPS 2012 Workshop, Lake Tahoe : United States (2012)

  29. arXiv:1209.1086  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    Robustness and Generalization for Metric Learning

    Authors: Aurélien Bellet, Amaury Habrard

    Abstract: Metric learning has attracted a lot of interest over the last decade, but the generalization ability of such methods has not been thoroughly studied. In this paper, we introduce an adaptation of the notion of algorithmic robustness (previously introduced by Xu and Mannor) that can be used to derive generalization bounds for metric learning. We further show that a weak notion of robustness is in fa…

    Submitted 29 September, 2014; v1 submitted 5 September, 2012; originally announced September 2012.

    Comments: 16 pages, to appear in Neurocomputing

    Journal ref: Neurocomputing,151(1):259-267, 2015

  30. arXiv:1207.1019  [pdf, ps, other]

    stat.ML cs.CV cs.LG cs.MM

    PAC-Bayesian Majority Vote for Late Classifier Fusion

    Authors: Emilie Morvant, Amaury Habrard, Stéphane Ayache

    Abstract: A lot of attention has been devoted to multimedia indexing over the past few years. In the literature, we often consider two kinds of fusion schemes: The early fusion and the late fusion. In this paper we focus on late classifier fusion, where one combines the scores of each modality at the decision level. To tackle this problem, we investigate a recent and elegant well-founded quadratic program n…

    Submitted 4 July, 2012; originally announced July 2012.

    Comments: 7 pages, Research report

  31. arXiv:1206.6476  [pdf]

    cs.LG cs.AI stat.ML

    Similarity Learning for Provably Accurate Sparse Linear Classification

    Authors: Aurelien Bellet, Amaury Habrard, Marc Sebban

    Abstract: In recent years, the crucial importance of metrics in machine learning algorithms has led to an increasing interest for optimizing distance and similarity functions. Most of the state of the art focus on learning Mahalanobis distances (requiring to fulfill a constraint of positive semi-definiteness) for use in a local k-NN algorithm. However, no theoretical link is established between the learned…

    Submitted 27 June, 2012; originally announced June 2012.

    Comments: Appears in Proceedings of the 29th International Conference on Machine Learning (ICML 2012)

  32. arXiv:0807.2983  [pdf, ps, other]

    cs.LG

    On Probability Distributions for Trees: Representations, Inference and Learning

    Authors: François Denis, Amaury Habrard, Rémi Gilleron, Marc Tommasi, Édouard Gilbert

    Abstract: We study probability distributions over free algebras of trees. Probability distributions can be seen as particular (formal power) tree series [Berstel et al 82, Esik et al 03], i.e. mappings from trees to a semiring K. A widely studied class of tree series is the class of rational (or recognizable) tree series which can be defined either in an algebraic way or by means of multiplicity tree aut…

    Submitted 18 July, 2008; originally announced July 2008.

    Journal ref: In NIPS Workshop on Representations and Inference on Probability Distributions (2007)

  33. arXiv:cs/0607085  [pdf, ps, other]

    cs.LG

    Using Pseudo-Stochastic Rational Languages in Probabilistic Grammatical Inference

    Authors: Amaury Habrard, Francois Denis, Yann Esposito

    Abstract: In probabilistic grammatical inference, a usual goal is to infer a good approximation of an unknown distribution P called a stochastic language. The estimate of P stands in some class of probabilistic models such as probabilistic automata (PA). In this paper, we focus on probabilistic models based on multiplicity automata (MA). The stochastic languages generated by MA are called rational stochas…

    Submitted 7 November, 2008; v1 submitted 18 July, 2006; originally announced July 2006.

    Journal ref: 8th International Colloquium on Grammatical Inference (ICGI'06), Japan (2006)

  34. arXiv:cs/0602062  [pdf, ps, other]

    cs.LG

    Learning rational stochastic languages

    Authors: François Denis, Yann Esposito, Amaury Habrard

    Abstract: Given a finite set of words w1,...,wn independently drawn according to a fixed unknown distribution law P called a stochastic language, a usual goal in Grammatical Inference is to infer an estimate of P in some class of probabilistic models, such as Probabilistic Automata (PA). Here, we study the class of rational stochastic languages, which consists in stochastic languages that can be generate…

    Submitted 17 February, 2006; originally announced February 2006.

    Comments: 15 pages