Showing 1–28 of 28 results for author: Dohmatob, E

Searching in archive cs.
  1. arXiv:2406.07515  [pdf, other]

    cs.LG cs.AI stat.ML

    Beyond Model Collapse: Scaling Up with Synthesized Data Requires Reinforcement

    Authors: Yunzhen Feng, Elvis Dohmatob, Pu Yang, Francois Charton, Julia Kempe

    Abstract: Synthesized data from generative models is increasingly considered as an alternative to human-annotated data for fine-tuning Large Language Models. This raises concerns about model collapse: a drop in performance of models fine-tuned on generated data. Considering that it is easier for both humans and machines to tell between good and bad examples than to generate high-quality samples, we investig…

    Submitted 11 June, 2024; originally announced June 2024.

  2. arXiv:2402.07712  [pdf, other]

    cs.LG cs.AI stat.ML

    Model Collapse Demystified: The Case of Regression

    Authors: Elvis Dohmatob, Yunzhen Feng, Julia Kempe

    Abstract: In the era of proliferation of large language and image generation models, the phenomenon of "model collapse" refers to the situation whereby, as a model is trained recursively on data generated from previous generations of itself over time, its performance degrades until the model eventually becomes completely useless, i.e., the model collapses. In this work, we study this phenomenon in the setting…

    Submitted 30 April, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

  3. arXiv:2402.07043  [pdf, other]

    cs.LG cs.AI cs.CL

    A Tale of Tails: Model Collapse as a Change of Scaling Laws

    Authors: Elvis Dohmatob, Yunzhen Feng, Pu Yang, Francois Charton, Julia Kempe

    Abstract: As AI model size grows, neural scaling laws have become a crucial tool to predict the improvements of large models when increasing capacity and the size of original (human or natural) training data. Yet, the widespread use of popular models means that the ecosystem of online data and text will co-evolve to progressively contain increased amounts of synthesized data. In this paper we ask: How will…

    Submitted 31 May, 2024; v1 submitted 10 February, 2024; originally announced February 2024.

    Journal ref: ICML 2024

  4. arXiv:2310.02984  [pdf, other]

    stat.ML cs.AI cs.CL cs.LG cs.NE

    Scaling Laws for Associative Memories

    Authors: Vivien Cabannes, Elvis Dohmatob, Alberto Bietti

    Abstract: Learning arguably involves the discovery and memorization of abstract rules. The aim of this paper is to study associative memory mechanisms. Our model is based on high-dimensional matrices consisting of outer products of embeddings, which relates to the inner layers of transformer language models. We derive precise scaling laws with respect to sample size and parameter size, and discuss the stati…

    Submitted 20 February, 2024; v1 submitted 4 October, 2023; originally announced October 2023.

    ACM Class: I.2.6; G.1.6

  5. arXiv:2308.00556  [pdf, other]

    stat.ML cs.LG

    Robust Linear Regression: Phase-Transitions and Precise Tradeoffs for General Norms

    Authors: Elvis Dohmatob, Meyer Scetbon

    Abstract: In this paper, we investigate the impact of test-time adversarial attacks on linear regression models and determine the optimal level of robustness that any model can reach while maintaining a given level of standard predictive performance (accuracy). Through quantitative estimates, we uncover fundamental tradeoffs between adversarial robustness and accuracy in different regimes. We obtain a preci…

    Submitted 1 August, 2023; originally announced August 2023.

  6. arXiv:2301.13486  [pdf, other]

    stat.ML cs.LG

    Robust Linear Regression: Gradient-descent, Early-stopping, and Beyond

    Authors: Meyer Scetbon, Elvis Dohmatob

    Abstract: In this work we study the robustness to adversarial attacks of early-stopping strategies on gradient-descent (GD) methods for linear regression. More precisely, we show that early-stopped GD is optimally robust (up to an absolute constant) against Euclidean-norm adversarial attacks. However, we show that this strategy can be arbitrarily sub-optimal in the case of general Mahalanobis attacks. This…

    Submitted 31 January, 2023; originally announced January 2023.

  7. arXiv:2211.02675  [pdf, other]

    cs.LG cs.CR stat.ML

    An Adversarial Robustness Perspective on the Topology of Neural Networks

    Authors: Morgane Goibert, Thomas Ricatte, Elvis Dohmatob

    Abstract: In this paper, we investigate the impact of neural network (NN) topology on adversarial robustness. Specifically, we study the graph produced when an input traverses all the layers of a NN, and show that such graphs are different for clean and adversarial inputs. We find that graphs from clean inputs are more centralized around highway edges, whereas those from adversaries are more diffuse, leve…

    Submitted 4 November, 2022; originally announced November 2022.

  8. arXiv:2210.09957  [pdf, other]

    cs.LG cs.AI cs.CY cs.IR stat.ML

    Contextual bandits with concave rewards, and an application to fair ranking

    Authors: Virginie Do, Elvis Dohmatob, Matteo Pirotta, Alessandro Lazaric, Nicolas Usunier

    Abstract: We consider Contextual Bandits with Concave Rewards (CBCR), a multi-objective bandit problem where the desired trade-off between the rewards is defined by a known concave objective function, and the reward vector depends on an observed stochastic context. We present the first algorithm with provably vanishing regret for CBCR without restrictions on the policy space, whereas prior works were restri…

    Submitted 28 February, 2023; v1 submitted 18 October, 2022; originally announced October 2022.

    Comments: ICLR 2023

  9. arXiv:2209.13019  [pdf, other]

    cs.IR cs.AI cs.CY cs.LG

    Fast online ranking with fairness of exposure

    Authors: Nicolas Usunier, Virginie Do, Elvis Dohmatob

    Abstract: As recommender systems become increasingly central for sorting and prioritizing the content available online, they have a growing impact on the opportunities or revenue of their item producers. For instance, they influence which recruiter a resume is recommended to, or to whom and how much a music track, video or news article is being exposed. This calls for recommendation approaches that not onl…

    Submitted 13 September, 2022; originally announced September 2022.

    Comments: FAccT 2022

  10. arXiv:2207.00486  [pdf, other]

    cs.LG stat.ML

    Scalable MCMC Sampling for Nonsymmetric Determinantal Point Processes

    Authors: Insu Han, Mike Gartrell, Elvis Dohmatob, Amin Karbasi

    Abstract: A determinantal point process (DPP) is an elegant model that assigns a probability to every subset of a collection of $n$ items. While conventionally a DPP is parameterized by a symmetric kernel matrix, removing this symmetry constraint, resulting in nonsymmetric DPPs (NDPPs), leads to significant improvements in modeling power and predictive performance. Recent work has studied an approximate Mar…

    Submitted 1 July, 2022; originally announced July 2022.

    Comments: ICML 2022

  11. arXiv:2203.13779  [pdf, other]

    stat.ML cs.LG

    Origins of Low-dimensional Adversarial Perturbations

    Authors: Elvis Dohmatob, Chuan Guo, Morgane Goibert

    Abstract: In this paper, we initiate a rigorous study of the phenomenon of low-dimensional adversarial perturbations (LDAPs) in classification. Unlike the classical setting, these perturbations are limited to a subspace of dimension $k$ which is much smaller than the dimension $d$ of the feature space. The case $k=1$ corresponds to so-called universal adversarial perturbations (UAPs; Moosavi-Dezfooli et al.…

    Submitted 4 July, 2022; v1 submitted 25 March, 2022; originally announced March 2022.

  12. arXiv:2203.11864  [pdf, other]

    stat.ML cs.LG

    On the (Non-)Robustness of Two-Layer Neural Networks in Different Learning Regimes

    Authors: Elvis Dohmatob, Alberto Bietti

    Abstract: Neural networks are known to be highly sensitive to adversarial examples. These may arise due to different factors, such as random initialization, or spurious correlations in the learning problem. To better understand these factors, we provide a precise study of the adversarial robustness in different scenarios, from initialization to the end of training in different regimes, as well as intermedia…

    Submitted 4 July, 2022; v1 submitted 22 March, 2022; originally announced March 2022.

  13. arXiv:2201.08417  [pdf, other]

    cs.LG stat.ML

    Scalable Sampling for Nonsymmetric Determinantal Point Processes

    Authors: Insu Han, Mike Gartrell, Jennifer Gillenwater, Elvis Dohmatob, Amin Karbasi

    Abstract: A determinantal point process (DPP) on a collection of $M$ items is a model, parameterized by a symmetric kernel matrix, that assigns a probability to every subset of those items. Recent work shows that removing the kernel symmetry constraint, yielding nonsymmetric DPPs (NDPPs), can lead to significant predictive performance gains for machine learning applications. However, existing work leaves op…

    Submitted 19 April, 2022; v1 submitted 20 January, 2022; originally announced January 2022.

    Comments: ICLR 2022

  14. arXiv:2106.02630  [pdf, other]

    stat.ML cs.LG

    Fundamental tradeoffs between memorization and robustness in random features and neural tangent regimes

    Authors: Elvis Dohmatob

    Abstract: This work studies the (non)robustness of two-layer neural networks in various high-dimensional linearized regimes. We establish fundamental trade-offs between memorization and robustness, as measured by the Sobolev-seminorm of the model w.r.t. the data distribution, i.e., the square root of the average squared $L_2$-norm of the gradients of the model w.r.t. its input. More precisely, if $n$ is the…

    Submitted 4 June, 2021; originally announced June 2021.

  15. arXiv:2011.06550  [pdf, other]

    stat.ML cs.LG

    Implicit bias of any algorithm: bounding bias via margin

    Authors: Elvis Dohmatob

    Abstract: Consider $n$ points $x_1,\ldots,x_n$ in finite-dimensional Euclidean space, each having one of two colors. Suppose there exists a separating hyperplane (identified with its unit normal vector $w$) for the points, i.e., a hyperplane such that points of the same color lie on the same side of the hyperplane. We measure the quality of such a hyperplane by its margin $γ(w)$, defined as the minimum distance betwe…

    Submitted 23 November, 2020; v1 submitted 12 November, 2020; originally announced November 2020.

  16. arXiv:2006.09989  [pdf, other]

    stat.ML cs.CV cs.LG cs.NE

    Classifier-independent Lower-Bounds for Adversarial Robustness

    Authors: Elvis Dohmatob

    Abstract: We theoretically analyse the limits of robustness to test-time adversarial and noisy examples in classification. Our work focuses on deriving bounds which uniformly apply to all classifiers (i.e., all measurable functions from features to labels) for a given problem. Our contributions are two-fold. (1) We use optimal transport theory to derive variational formulae for the Bayes-optimal error a class…

    Submitted 9 November, 2020; v1 submitted 17 June, 2020; originally announced June 2020.

  17. arXiv:2006.09862  [pdf, other]

    cs.LG stat.ML

    Scalable Learning and MAP Inference for Nonsymmetric Determinantal Point Processes

    Authors: Mike Gartrell, Insu Han, Elvis Dohmatob, Jennifer Gillenwater, Victor-Emmanuel Brunel

    Abstract: Determinantal point processes (DPPs) have attracted significant attention in machine learning for their ability to model subsets drawn from a large item collection. Recent work shows that nonsymmetric DPP (NDPP) kernels have significant advantages over symmetric kernels in terms of modeling power and predictive performance. However, for an item collection of size $M$, existing NDPP learning and in…

    Submitted 13 April, 2021; v1 submitted 17 June, 2020; originally announced June 2020.

    Comments: ICLR 2021

  18. arXiv:2006.04596  [pdf, other]

    stat.ML cs.LG

    Learning disconnected manifolds: a no GANs land

    Authors: Ugo Tanielian, Thibaut Issenhuth, Elvis Dohmatob, Jeremie Mary

    Abstract: Typical architectures of Generative Adversarial Networks make use of a unimodal latent distribution transformed by a continuous generator. Consequently, the modeled distribution always has connected support, which is cumbersome when learning a disconnected set of manifolds. We formalize this problem by establishing a no free lunch theorem for the disconnected manifold learning stating an upper bound…

    Submitted 10 December, 2020; v1 submitted 8 June, 2020; originally announced June 2020.

    Comments: 24 pages

    Journal ref: PMLR 119:9418-9427, 2020

  19. arXiv:1909.09621  [pdf, other]

    stat.ML cs.LG

    On the Convergence of Approximate and Regularized Policy Iteration Schemes

    Authors: Elena Smirnova, Elvis Dohmatob

    Abstract: Entropy regularized algorithms such as Soft Q-learning and Soft Actor-Critic recently showed state-of-the-art performance on a number of challenging reinforcement learning (RL) tasks. The regularized formulation modifies the standard RL objective and thus generally converges to a policy different from the optimal greedy policy of the original RL problem. Practically, it is important to control th…

    Submitted 14 October, 2019; v1 submitted 20 September, 2019; originally announced September 2019.

  20. arXiv:1906.11567  [pdf, other]

    cs.LG cs.AI stat.ML

    Adversarial Robustness via Label-Smoothing

    Authors: Morgane Goibert, Elvis Dohmatob

    Abstract: We study Label-Smoothing as a means for improving adversarial robustness of supervised deep-learning models. After establishing a thorough and unified framework, we propose several variations to this general method: adversarial, Boltzmann and second-best Label-Smoothing methods, and we explain how to construct your own. On various datasets (MNIST, CIFAR10, SVHN) and models (linear models, MLPs…

    Submitted 15 October, 2019; v1 submitted 27 June, 2019; originally announced June 2019.

  21. arXiv:1906.06211  [pdf, ps, other]

    stat.ML cs.LG

    Distributionally Robust Counterfactual Risk Minimization

    Authors: Louis Faury, Ugo Tanielian, Flavian Vasile, Elena Smirnova, Elvis Dohmatob

    Abstract: This manuscript introduces the idea of using Distributionally Robust Optimization (DRO) for the Counterfactual Risk Minimization (CRM) problem. Tapping into a rich existing literature, we show that DRO is a principled tool for counterfactual decision making. We also show that well-established solutions to the CRM problem like sample variance penalization schemes are special instances of a more gen…

    Submitted 14 December, 2019; v1 submitted 14 June, 2019; originally announced June 2019.

    Comments: Accepted at AAAI20

  22. arXiv:1905.12962  [pdf, other]

    cs.LG stat.ML

    Learning Nonsymmetric Determinantal Point Processes

    Authors: Mike Gartrell, Victor-Emmanuel Brunel, Elvis Dohmatob, Syrine Krichene

    Abstract: Determinantal point processes (DPPs) have attracted substantial attention as an elegant probabilistic model that captures the balance between quality and diversity within sets. DPPs are conventionally parameterized by a positive semi-definite kernel matrix, and this symmetric kernel encodes only repulsive interactions between items. These so-called symmetric DPPs have significant expressive power,…

    Submitted 12 November, 2020; v1 submitted 30 May, 2019; originally announced May 2019.

    Comments: NeurIPS 2019

  23. arXiv:1902.08708  [pdf, other]

    stat.ML cs.LG

    Distributionally Robust Reinforcement Learning

    Authors: Elena Smirnova, Elvis Dohmatob, Jérémie Mary

    Abstract: Real-world applications require RL algorithms to act safely. During the learning process, it is likely that the agent executes sub-optimal actions that may lead to unsafe/poor states of the system. Exploration is particularly brittle in high-dimensional state/action spaces due to the increased number of low-performing actions. In this work, we consider risk-averse exploration in the approximate RL setting. To…

    Submitted 14 June, 2019; v1 submitted 22 February, 2019; originally announced February 2019.

  24. arXiv:1811.07245  [pdf, other]

    stat.ML cs.LG

    Deep Determinantal Point Processes

    Authors: Mike Gartrell, Elvis Dohmatob, Jon Alberdi

    Abstract: Determinantal point processes (DPPs) have attracted significant attention as an elegant model that is able to capture the balance between quality and diversity within sets. DPPs are parameterized by a positive semi-definite kernel matrix. While DPPs have substantial expressive power, they are fundamentally limited by the parameterization of the kernel matrix and their inability to capture nonlinea…

    Submitted 29 May, 2019; v1 submitted 17 November, 2018; originally announced November 2018.

  25. arXiv:1810.04065  [pdf, other]

    stat.ML cs.LG

    Generalized No Free Lunch Theorem for Adversarial Robustness

    Authors: Elvis Dohmatob

    Abstract: This manuscript presents some new impossibility results on adversarial robustness in machine learning, a very important yet largely open problem. We show that if conditioned on a class label the data distribution satisfies the $W_2$ Talagrand transportation-cost inequality (for example, this condition is satisfied if the conditional distribution has density which is log-concave; is the uniform mea…

    Submitted 4 June, 2019; v1 submitted 8 October, 2018; originally announced October 2018.

  26. arXiv:1512.06999  [pdf, ps, other]

    q-bio.NC cs.LG stat.CO stat.ML

    FAASTA: A fast solver for total-variation regularization of ill-conditioned problems with application to brain imaging

    Authors: Gaël Varoquaux, Michael Eickenberg, Elvis Dohmatob, Bertrand Thirion

    Abstract: The total variation (TV) penalty, like many other analysis-sparsity problems, does not lead to separable factors or a proximal operator with a closed-form expression, such as soft thresholding for the $\ell_1$ penalty. As a result, in a variational formulation of an inverse problem or statistical learning estimation, it leads to challenging non-smooth optimization problems that are often solved with e…

    Submitted 22 December, 2015; originally announced December 2015.

    Journal ref: Colloque GRETSI, Sep 2015, Lyon, France. Gretsi, 2015, http://www.gretsi.fr/colloque2015/myGretsi/programme.php

  27. arXiv:1507.07901  [pdf, other]

    cs.GT

    A simple and numerically stable primal-dual algorithm for computing Nash-equilibria in sequential games with incomplete information

    Authors: Elvis Dohmatob

    Abstract: We present a simple primal-dual algorithm for computing approximate Nash-equilibria in two-person zero-sum sequential games with incomplete information and perfect recall (like Texas Hold'em Poker). Our algorithm is numerically stable, performs only basic iterations (i.e., matvec multiplications, clipping, etc., and no calls to external first-order oracles, no matrix inversions, etc.), and is applic…

    Submitted 23 December, 2015; v1 submitted 28 July, 2015; originally announced July 2015.

  28. arXiv:1412.3925  [pdf, other]

    q-bio.NC cs.CV

    Region segmentation for sparse decompositions: better brain parcellations from rest fMRI

    Authors: Alexandre Abraham, Elvis Dohmatob, Bertrand Thirion, Dimitris Samaras, Gael Varoquaux

    Abstract: Functional Magnetic Resonance Images acquired during resting-state provide information about the functional organization of the brain through measuring correlations between brain areas. Independent components analysis is the reference approach to estimate spatial components from weakly structured data such as brain signal time courses; each of these components may be referred to as a brain network…

    Submitted 12 December, 2014; originally announced December 2014.

    Journal ref: Sparsity Techniques in Medical Imaging, Sep 2014, Boston, United States. pp.8