
Showing 1–19 of 19 results for author: Domingo-Enrich, C

  1. arXiv:2406.00288  [pdf, other]

    cs.LG stat.ML

    Neural Optimal Transport with Lagrangian Costs

    Authors: Aram-Alexandre Pooladian, Carles Domingo-Enrich, Ricky T. Q. Chen, Brandon Amos

    Abstract: We investigate the optimal transport problem between probability measures when the underlying cost function is understood to satisfy a least action principle, also known as a Lagrangian cost. These generalizations are useful when connecting observations from a physical system where the transport dynamics are influenced by the geometry of the system, such as obstacles (e.g., incorporating barrier f…

    Submitted 31 May, 2024; originally announced June 2024.

    Comments: UAI 2024

  2. arXiv:2312.02027  [pdf, other]

    math.OC cs.LG math.NA math.PR stat.ML

    Stochastic Optimal Control Matching

    Authors: Carles Domingo-Enrich, Jiequn Han, Brandon Amos, Joan Bruna, Ricky T. Q. Chen

    Abstract: Stochastic optimal control, which has the goal of driving the behavior of noisy systems, is broadly applicable in science, engineering and artificial intelligence. Our work introduces Stochastic Optimal Control Matching (SOCM), a novel Iterative Diffusion Optimization (IDO) technique for stochastic optimal control that stems from the same philosophy as the conditional score matching loss for diffu…

    Submitted 28 June, 2024; v1 submitted 4 December, 2023; originally announced December 2023.

  3. arXiv:2306.15400  [pdf, other]

    cs.LG

    Length Generalization in Arithmetic Transformers

    Authors: Samy Jelassi, Stéphane d'Ascoli, Carles Domingo-Enrich, Yuhuai Wu, Yuanzhi Li, François Charton

    Abstract: We examine how transformers cope with two challenges: learning basic integer arithmetic, and generalizing to longer sequences than seen during training. We find that relative position embeddings enable length generalization for simple tasks, such as addition: models trained on $5$-digit numbers can perform $15$-digit sums. However, this method fails for multiplication, and we propose train set pri…

    Submitted 27 June, 2023; originally announced June 2023.

  4. arXiv:2306.11928  [pdf, ps, other]

    stat.ML cs.LG math.ST

    Open Problem: Learning with Variational Objectives on Measures

    Authors: Vivien Cabannes, Carles Domingo-Enrich

    Abstract: The theory of statistical learning has focused on variational objectives expressed on functions. In this note, we discuss motivations to write similar objectives on measures, in particular to discuss out-of-distribution generalization and weakly-supervised learning. It raises a natural question: can one cast usual statistical learning results to objectives expressed on measures? Does the resulting…

    Submitted 16 November, 2023; v1 submitted 20 June, 2023; originally announced June 2023.

    MSC Class: 68T05 ACM Class: I.2.6; F.2.2; G.3

    Journal ref: IEEE Big Data, 2023

  5. arXiv:2304.14772  [pdf, other]

    cs.LG

    Multisample Flow Matching: Straightening Flows with Minibatch Couplings

    Authors: Aram-Alexandre Pooladian, Heli Ben-Hamu, Carles Domingo-Enrich, Brandon Amos, Yaron Lipman, Ricky T. Q. Chen

    Abstract: Simulation-free methods for training continuous-time generative models construct probability paths that go between noise distributions and individual data samples. Recent works, such as Flow Matching, derived paths that are optimal for each data sample. However, these algorithms rely on independent data and noise samples, and do not exploit underlying structure in the data distribution for constru…

    Submitted 24 May, 2023; v1 submitted 28 April, 2023; originally announced April 2023.

  6. arXiv:2302.12229  [pdf, other]

    math.OC stat.ML

    An Explicit Expansion of the Kullback-Leibler Divergence along its Fisher-Rao Gradient Flow

    Authors: Carles Domingo-Enrich, Aram-Alexandre Pooladian

    Abstract: Let $V_* : \mathbb{R}^d \to \mathbb{R}$ be some (possibly non-convex) potential function, and consider the probability measure $π\propto e^{-V_*}$. When $π$ exhibits multiple modes, it is known that sampling techniques based on Wasserstein gradient flows of the Kullback-Leibler (KL) divergence (e.g. Langevin Monte Carlo) suffer poorly in the rate of convergence, where the dynamics are unable to ea…

    Submitted 23 February, 2023; originally announced February 2023.

    Comments: 15 pages, 4 figures

  7. arXiv:2301.05974  [pdf, other]

    stat.ML cs.LG math.ST stat.ME

    Compress Then Test: Powerful Kernel Testing in Near-linear Time

    Authors: Carles Domingo-Enrich, Raaz Dwivedi, Lester Mackey

    Abstract: Kernel two-sample testing provides a powerful framework for distinguishing any pair of distributions based on $n$ sample points. However, existing kernel tests either run in $n^2$ time or sacrifice undue power to improve runtime. To address these shortcomings, we introduce Compress Then Test (CTT), a new framework for high-powered kernel testing based on sample compression. CTT cheaply approximate…

    Submitted 23 February, 2023; v1 submitted 14 January, 2023; originally announced January 2023.

    Comments: Accepted as a paper at AISTATS 2023

  8. arXiv:2206.00632  [pdf, other]

    math.OC cs.LG stat.ML

    Computing the Variance of Shuffling Stochastic Gradient Algorithms via Power Spectral Density Analysis

    Authors: Carles Domingo-Enrich

    Abstract: When solving finite-sum minimization problems, two common alternatives to stochastic gradient descent (SGD) with theoretical benefits are random reshuffling (SGD-RR) and shuffle-once (SGD-SO), in which functions are sampled in cycles without replacement. Under a convenient stochastic noise approximation which holds experimentally, we study the stationary variances of the iterates of SGD, SGD-RR an…

    Submitted 1 June, 2022; originally announced June 2022.

    Comments: The code can be found at https://github.com/CDEnrich/sgd_shuffling

  9. arXiv:2205.13941  [pdf, other]

    cs.LG cs.CR cs.IT stat.ML

    Auditing Differential Privacy in High Dimensions with the Kernel Quantum Rényi Divergence

    Authors: Carles Domingo-Enrich, Youssef Mroueh

    Abstract: Differential privacy (DP) is the de facto standard for private data release and private machine learning. Auditing black-box DP algorithms and mechanisms to certify whether they satisfy a certain DP guarantee is challenging, especially in high dimension. We propose relaxations of differential privacy based on new divergences on probability distributions: the kernel Rényi divergence and its regular…

    Submitted 27 May, 2022; originally announced May 2022.

    Comments: Code at https://github.com/CDEnrich/kernel_renyi_dp

  10. arXiv:2205.13684  [pdf, other]

    stat.ML cs.LG math.PR math.ST

    Learning with Stochastic Orders

    Authors: Carles Domingo-Enrich, Yair Schiff, Youssef Mroueh

    Abstract: Learning high-dimensional distributions is often done with explicit likelihood modeling or implicit modeling via minimizing integral probability metrics (IPMs). In this paper, we expand this learning paradigm to stochastic orders, namely, the convex or Choquet order between probability measures. Towards this end, exploiting the relation between convex orders and optimal transport, we introduce the…

    Submitted 9 November, 2022; v1 submitted 26 May, 2022; originally announced May 2022.

    Comments: Code available at https://github.com/yair-schiff/stochastic-orders-ICMN

  11. arXiv:2202.06460   

    cs.LG math.OC stat.ML

    Simultaneous Transport Evolution for Minimax Equilibria on Measures

    Authors: Carles Domingo-Enrich, Joan Bruna

    Abstract: Min-max optimization problems arise in several key machine learning setups, including adversarial learning and generative modeling. In their general form, in the absence of convexity/concavity assumptions, finding pure equilibria of the underlying two-player zero-sum game is computationally hard [Daskalakis et al., 2021]. In this work we focus instead on finding mixed equilibria, and consider the asso…

    Submitted 21 February, 2022; v1 submitted 13 February, 2022; originally announced February 2022.

    Comments: Error in the proof of Lemma 1, which makes Theorem 1 not hold

  12. arXiv:2112.13867  [pdf, other]

    cs.LG

    Depth and Feature Learning are Provably Beneficial for Neural Network Discriminators

    Authors: Carles Domingo-Enrich

    Abstract: We construct pairs of distributions $μ_d, ν_d$ on $\mathbb{R}^d$ such that the quantity $|\mathbb{E}_{x \sim μ_d} [F(x)] - \mathbb{E}_{x \sim ν_d} [F(x)]|$ decreases as $Ω(1/d^2)$ for some three-layer ReLU network $F$ with polynomial width and weights, while declining exponentially in $d$ if $F$ is any two-layer network with polynomial weights. This shows that deep GAN discriminators are able to d…

    Submitted 27 December, 2021; originally announced December 2021.

  13. arXiv:2110.03673  [pdf, other]

    stat.ML cs.LG math.ST

    Tighter Sparse Approximation Bounds for ReLU Neural Networks

    Authors: Carles Domingo-Enrich, Youssef Mroueh

    Abstract: A well-known line of work (Barron, 1993; Breiman, 1993; Klusowski & Barron, 2018) provides bounds on the width $n$ of a ReLU two-layer neural network needed to approximate a function $f$ over the ball $\mathcal{B}_R(\mathbb{R}^d)$ up to error $ε$, when the Fourier based quantity $C_f = \frac{1}{(2π)^{d/2}} \int_{\mathbb{R}^d} \|ξ\|^2 |\hat{f}(ξ)| \ dξ$ is finite. More recently Ongie et al. (2019)…

    Submitted 25 November, 2021; v1 submitted 7 October, 2021; originally announced October 2021.

  14. arXiv:2107.05134  [pdf, other]

    cs.LG math.OC stat.ML

    Dual Training of Energy-Based Models with Overparametrized Shallow Neural Networks

    Authors: Carles Domingo-Enrich, Alberto Bietti, Marylou Gabrié, Joan Bruna, Eric Vanden-Eijnden

    Abstract: Energy-based models (EBMs) are generative models that are usually trained via maximum likelihood estimation. This approach becomes challenging in generic situations where the trained energy is non-convex, due to the need to sample the Gibbs distribution associated with this energy. Using general Fenchel duality results, we derive variational principles dual to maximum likelihood EBMs with shallow…

    Submitted 15 February, 2022; v1 submitted 11 July, 2021; originally announced July 2021.

  15. arXiv:2106.05739  [pdf, other]

    stat.ML cs.LG math.PR math.ST

    Separation Results between Fixed-Kernel and Feature-Learning Probability Metrics

    Authors: Carles Domingo-Enrich, Youssef Mroueh

    Abstract: Several works in implicit and explicit generative modeling empirically observed that feature-learning discriminators outperform fixed-kernel discriminators in terms of the sample quality of the models. We provide separation results between probability metrics with fixed-kernel and feature-learning discriminators using the function classes $\mathcal{F}_2$ and $\mathcal{F}_1$ respectively, which wer…

    Submitted 31 October, 2021; v1 submitted 10 June, 2021; originally announced June 2021.

  16. arXiv:2104.07531  [pdf, other]

    cs.LG stat.ML

    On Energy-Based Models with Overparametrized Shallow Neural Networks

    Authors: Carles Domingo-Enrich, Alberto Bietti, Eric Vanden-Eijnden, Joan Bruna

    Abstract: Energy-based models (EBMs) are a simple yet powerful framework for generative modeling. They are based on a trainable energy function which defines an associated Gibbs measure, and they can be trained and sampled from via well-established statistical tools, such as MCMC. Neural networks may be used as energy function approximators, providing both a rich class of expressive models as well as a flex…

    Submitted 5 May, 2021; v1 submitted 15 April, 2021; originally announced April 2021.

  17. arXiv:2010.02076  [pdf, other]

    math.OC cs.GT cs.LG

    Average-case Acceleration for Bilinear Games and Normal Matrices

    Authors: Carles Domingo-Enrich, Fabian Pedregosa, Damien Scieur

    Abstract: Advances in generative modeling and adversarial learning have given rise to renewed interest in smooth games. However, the absence of symmetry in the matrix of second derivatives poses challenges that are not present in the classical minimization framework. While a rich theory of average-case analysis has been developed for minimization problems, little is known in the context of smooth games. In…

    Submitted 5 October, 2020; originally announced October 2020.

    Comments: 24 pages, 1 figure

  18. arXiv:2002.06277  [pdf, other]

    cs.LG math.OC math.PR stat.ML

    A mean-field analysis of two-player zero-sum games

    Authors: Carles Domingo-Enrich, Samy Jelassi, Arthur Mensch, Grant Rotskoff, Joan Bruna

    Abstract: Finding Nash equilibria in two-player zero-sum continuous games is a central problem in machine learning, e.g. for training both GANs and robust models. The existence of pure Nash equilibria requires strong conditions which are not typically met in practice. Mixed Nash equilibria exist in greater generality and may be found using mirror descent. Yet this approach does not scale to high dimensions.…

    Submitted 6 May, 2021; v1 submitted 14 February, 2020; originally announced February 2020.

    Journal ref: Published at NeurIPS 2020

  19. arXiv:1905.12363  [pdf, other]

    stat.ML cs.LG math.OC

    Extragradient with player sampling for faster Nash equilibrium finding

    Authors: Samy Jelassi, Carles Domingo-Enrich, Damien Scieur, Arthur Mensch, Joan Bruna

    Abstract: Data-driven modeling increasingly requires finding a Nash equilibrium in multi-player games, e.g. when training GANs. In this paper, we analyse a new extra-gradient method for Nash equilibrium finding that performs gradient extrapolations and updates on a random subset of players at each iteration. This approach provably exhibits a better rate of convergence than full extra-gradient for non-smoot…

    Submitted 21 July, 2020; v1 submitted 29 May, 2019; originally announced May 2019.