
Showing 1–10 of 10 results for author: Laforgue, P

Searching in archive stat.
  1. arXiv:2406.09253  [pdf, other]

    stat.ML cs.LG

    Deep Sketched Output Kernel Regression for Structured Prediction

    Authors: Tamim El Ahmad, Junjie Yang, Pierre Laforgue, Florence d'Alché-Buc

    Abstract: By leveraging the kernel trick in the output space, kernel-induced losses provide a principled way to define structured output prediction tasks for a wide variety of output modalities. In particular, they have been successfully used in the context of surrogate non-parametric regression, where the kernel trick is typically exploited in the input space as well. However, when inputs are images or tex…

    Submitted 13 June, 2024; originally announced June 2024.

  2. arXiv:2302.10128  [pdf, other]

    stat.ML cs.LG

    Sketch In, Sketch Out: Accelerating both Learning and Inference for Structured Prediction with Kernels

    Authors: Tamim El Ahmad, Luc Brogat-Motte, Pierre Laforgue, Florence d'Alché-Buc

    Abstract: Leveraging the kernel trick in both the input and output spaces, surrogate kernel methods are a flexible and theoretically grounded solution to structured output prediction. If they provide state-of-the-art performance on complex data sets of moderate size (e.g., in chemoinformatics), these approaches however fail to scale. We propose to equip surrogate kernel methods with sketching-based approxim…

    Submitted 6 May, 2024; v1 submitted 20 February, 2023; originally announced February 2023.

    Journal ref: Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:109-117, 2024

  3. arXiv:2211.00603  [pdf, other]

    stat.ML cs.LG

    On Medians of (Randomized) Pairwise Means

    Authors: Pierre Laforgue, Stephan Clémençon, Patrice Bertail

    Abstract: Tournament procedures, recently introduced in Lugosi & Mendelson (2016), offer an appealing alternative, from a theoretical perspective at least, to the principle of Empirical Risk Minimization in machine learning. Statistical learning by Median-of-Means (MoM) basically consists in segmenting the training data into blocks of equal size and comparing the statistical performance of every pair of can…

    Submitted 1 November, 2022; originally announced November 2022.

  4. arXiv:2206.03827  [pdf, other]

    stat.ML cs.LG

    Fast Kernel Methods for Generic Lipschitz Losses via $p$-Sparsified Sketches

    Authors: Tamim El Ahmad, Pierre Laforgue, Florence d'Alché-Buc

    Abstract: Kernel methods are learning algorithms that enjoy solid theoretical foundations while suffering from important computational limitations. Sketching, which consists in looking for solutions among a subspace of reduced dimension, is a well studied approach to alleviate these computational burdens. However, statistically-accurate sketches, such as the Gaussian one, usually contain few null entries, s…

    Submitted 6 November, 2023; v1 submitted 8 June, 2022; originally announced June 2022.

    Journal ref: Transactions on Machine Learning Research (2023)

  5. arXiv:2109.02357  [pdf, other]

    cs.CV cs.CY cs.LG stat.ML

    Fighting Selection Bias in Statistical Learning: Application to Visual Recognition from Biased Image Databases

    Authors: Stephan Clémençon, Pierre Laforgue, Robin Vogel

    Abstract: In practice, and especially when training deep neural networks, visual recognition rules are often learned based on various sources of information. On the other hand, the recent deployment of facial recognition systems with uneven performances on different population segments has highlighted the representativeness issues induced by a naive aggregation of the datasets. In this paper, we show how bi…

    Submitted 1 November, 2022; v1 submitted 6 September, 2021; originally announced September 2021.

  6. arXiv:2006.10325  [pdf, other]

    stat.ML cs.LG

    When OT meets MoM: Robust estimation of Wasserstein Distance

    Authors: Guillaume Staerman, Pierre Laforgue, Pavlo Mozharovskyi, Florence d'Alché-Buc

    Abstract: Issued from Optimal Transport, the Wasserstein distance has gained importance in Machine Learning due to its appealing geometrical properties and the increasing availability of efficient approximations. In this work, we consider the problem of estimating the Wasserstein distance between two probability distributions when observations are polluted by outliers. To that end, we investigate how to lev…

    Submitted 18 February, 2022; v1 submitted 18 June, 2020; originally announced June 2020.

    Journal ref: Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, AISTATS 2021

  7. arXiv:2006.05240  [pdf, other]

    stat.ML cs.LG

    Generalization Bounds in the Presence of Outliers: a Median-of-Means Study

    Authors: Pierre Laforgue, Guillaume Staerman, Stephan Clémençon

    Abstract: In contrast to the empirical mean, the Median-of-Means (MoM) is an estimator of the mean $θ$ of a square integrable r.v. $Z$, around which accurate nonasymptotic confidence bounds can be built, even when $Z$ does not exhibit a sub-Gaussian tail behavior. Thanks to the high confidence it achieves on heavy-tailed data, MoM has found various applications in machine learning, where it is used to desig…

    Submitted 7 February, 2021; v1 submitted 9 June, 2020; originally announced June 2020.

  8. arXiv:1910.04621  [pdf, other]

    stat.ML cs.LG

    Duality in RKHSs with Infinite Dimensional Outputs: Application to Robust Losses

    Authors: Pierre Laforgue, Alex Lambert, Luc Brogat-Motte, Florence d'Alché-Buc

    Abstract: Operator-Valued Kernels (OVKs) and associated vector-valued Reproducing Kernel Hilbert Spaces provide an elegant way to extend scalar kernel methods when the output space is a Hilbert space. Although primarily used in finite dimension for problems like multi-task regression, the ability of this framework to deal with infinite dimensional output spaces unlocks many more applications, such as functi…

    Submitted 21 August, 2020; v1 submitted 10 October, 2019; originally announced October 2019.

  9. arXiv:1906.12304  [pdf, other]

    stat.ML cs.LG

    Statistical Learning from Biased Training Samples

    Authors: Stephan Clémençon, Pierre Laforgue

    Abstract: With the deluge of digitized information in the Big Data era, massive datasets are becoming increasingly available for learning predictive models. However, in many practical situations, the poor control of the data acquisition processes may naturally jeopardize the outputs of machine learning algorithms, and selection bias issues are now the subject of much attention in the literature. The present…

    Submitted 1 November, 2022; v1 submitted 28 June, 2019; originally announced June 2019.

  10. arXiv:1805.11028  [pdf, other]

    stat.ML cs.LG

    Autoencoding any Data through Kernel Autoencoders

    Authors: Pierre Laforgue, Stephan Clémençon, Florence d'Alché-Buc

    Abstract: This paper investigates a novel algorithmic approach to data representation based on kernel methods. Assuming that the observations lie in a Hilbert space X, the introduced Kernel Autoencoder (KAE) is the composition of mappings from vector-valued Reproducing Kernel Hilbert Spaces (vv-RKHSs) that minimizes the expected reconstruction error. Beyond a first extension of the autoencoding scheme to po…

    Submitted 2 December, 2020; v1 submitted 28 May, 2018; originally announced May 2018.