Showing 1–21 of 21 results for author: Scieur, D

Searching in archive math.
  1. arXiv:2312.03583  [pdf, ps, other]

    math.OC

    Strong Convexity of Sets in Riemannian Manifolds

    Authors: Damien Scieur, Thomas Kerdreux, David Martínez-Rubio, Alexandre d'Aspremont, Sebastian Pokutta

    Abstract: Convex curvature properties are important in designing and analyzing convex optimization algorithms in the Hilbertian or Riemannian settings. In the case of the Hilbertian setting, strongly convex sets are well studied. Herein, we propose various definitions of strong convexity for uniquely geodesic sets in a Riemannian manifold. We study their relationship, propose tools to determine the geodesic…

    Submitted 6 February, 2024; v1 submitted 6 December, 2023; originally announced December 2023.

  2. arXiv:2305.19179  [pdf, ps, other]

    math.OC math.NA

    Adaptive Quasi-Newton and Anderson Acceleration Framework with Explicit Global (Accelerated) Convergence Rates

    Authors: Damien Scieur

    Abstract: Despite the impressive numerical performance of the quasi-Newton and Anderson/nonlinear acceleration methods, their global convergence rates have remained elusive for over 50 years. This study addresses this long-standing issue by introducing a framework that derives novel, adaptive quasi-Newton and nonlinear/Anderson acceleration schemes. Under mild assumptions, the proposed iterative methods exh…

    Submitted 14 November, 2023; v1 submitted 30 May, 2023; originally announced May 2023.

  3. arXiv:2209.13271  [pdf, other]

    math.OC stat.ML

    The Curse of Unrolling: Rate of Differentiating Through Optimization

    Authors: Damien Scieur, Quentin Bertrand, Gauthier Gidel, Fabian Pedregosa

    Abstract: Computing the Jacobian of the solution of an optimization problem is a central problem in machine learning, with applications in hyperparameter optimization, meta-learning, optimization as a layer, and dataset distillation, to name a few. Unrolled differentiation is a popular heuristic that approximates the solution using an iterative solver and differentiates it through the computational path. Th…

    Submitted 25 August, 2023; v1 submitted 27 September, 2022; originally announced September 2022.

  4. arXiv:2206.09901  [pdf, other]

    math.OC cs.LG

    Only Tails Matter: Average-Case Universality and Robustness in the Convex Regime

    Authors: Leonardo Cunha, Gauthier Gidel, Fabian Pedregosa, Damien Scieur, Courtney Paquette

    Abstract: The recently developed average-case analysis of optimization methods allows a more fine-grained and representative convergence analysis than usual worst-case results. In exchange, this analysis requires a more precise hypothesis over the data generating process, namely assuming knowledge of the expected spectral distribution (ESD) of the random matrix associated with the problem. This work shows t…

    Submitted 22 June, 2022; v1 submitted 20 June, 2022; originally announced June 2022.

    Comments: To be published in ICML 2022

  5. arXiv:2111.06826  [pdf, other]

    stat.ML cs.LG math.ST

    Convergence Rates for the MAP of an Exponential Family and Stochastic Mirror Descent -- an Open Problem

    Authors: Rémi Le Priol, Frederik Kunstner, Damien Scieur, Simon Lacoste-Julien

    Abstract: We consider the problem of upper bounding the expected log-likelihood sub-optimality of the maximum likelihood estimate (MLE), or a conjugate maximum a posteriori (MAP) for an exponential family, in a non-asymptotic way. Surprisingly, we found no general solution to this problem in the literature. In particular, current theories do not hold for a Gaussian or in the interesting few samples regime.…

    Submitted 12 November, 2021; originally announced November 2021.

    Comments: 9 pages and 3 figures + Appendix

  6. arXiv:2106.09687  [pdf, other]

    math.OC

    Super-Acceleration with Cyclical Step-sizes

    Authors: Baptiste Goujaud, Damien Scieur, Aymeric Dieuleveut, Adrien Taylor, Fabian Pedregosa

    Abstract: We develop a convergence-rate analysis of momentum with cyclical step-sizes. We show that under some assumption on the spectral gap of Hessians in machine learning, cyclical step-sizes are provably faster than constant step-sizes. More precisely, we develop a convergence rate analysis for quadratic objectives that provides optimal parameters and shows that cyclical learning rates can improve upon…

    Submitted 9 May, 2022; v1 submitted 17 June, 2021; originally announced June 2021.

    Journal ref: Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, PMLR 151:3028-3065, 2022

  7. arXiv:2101.09545  [pdf, ps, other]

    math.OC cs.LG math.NA

    Acceleration Methods

    Authors: Alexandre d'Aspremont, Damien Scieur, Adrien Taylor

    Abstract: This monograph covers some recent advances in a range of acceleration techniques frequently used in convex optimization. We first use quadratic optimization problems to introduce two key families of methods, namely momentum and nested optimization schemes. They coincide in the quadratic case to form the Chebyshev method. We discuss momentum methods in detail, starting with the seminal work of Nest…

    Submitted 21 December, 2021; v1 submitted 23 January, 2021; originally announced January 2021.

    Comments: Published in Foundations and Trends in Optimization (see https://www.nowpublishers.com/article/Details/OPT-036)

    Journal ref: Foundations and Trends in Optimization: Vol. 5: No. 1-2, pp 1-245 (2021)

  8. arXiv:2011.03358  [pdf, ps, other]

    math.OC math.NA

    Generalization of Quasi-Newton Methods: Application to Robust Symmetric Multisecant Updates

    Authors: Damien Scieur, Lewis Liu, Thomas Pumir, Nicolas Boumal

    Abstract: Quasi-Newton techniques approximate the Newton step by estimating the Hessian using the so-called secant equations. Some of these methods compute the Hessian using several secant equations but produce non-symmetric updates. Other quasi-Newton schemes, such as BFGS, enforce symmetry but cannot satisfy more than one secant equation. We propose a new type of quasi-Newton symmetric update using severa…

    Submitted 8 February, 2021; v1 submitted 6 November, 2020; originally announced November 2020.

    Comments: AISTATS 2021

  9. arXiv:2011.03351  [pdf, ps, other]

    math.OC

    Affine Invariant Analysis of Frank-Wolfe on Strongly Convex Sets

    Authors: Thomas Kerdreux, Lewis Liu, Simon Lacoste-Julien, Damien Scieur

    Abstract: It is known that the Frank-Wolfe (FW) algorithm, which is affine-covariant, enjoys accelerated convergence rates when the constraint set is strongly convex. However, these results rely on norm-dependent assumptions, usually incurring non-affine invariant bounds, in contradiction with FW's affine-covariant property. In this work, we introduce new structural assumptions on the problem (such as the d…

    Submitted 6 November, 2020; originally announced November 2020.

  10. arXiv:2010.02076  [pdf, other]

    math.OC cs.GT cs.LG

    Average-case Acceleration for Bilinear Games and Normal Matrices

    Authors: Carles Domingo-Enrich, Fabian Pedregosa, Damien Scieur

    Abstract: Advances in generative modeling and adversarial learning have given rise to renewed interest in smooth games. However, the absence of symmetry in the matrix of second derivatives poses challenges that are not present in the classical minimization framework. While a rich theory of average-case analysis has been developed for minimization problems, little is known in the context of smooth games. In…

    Submitted 5 October, 2020; originally announced October 2020.

    Comments: 24 pages, 1 figure

  11. arXiv:2002.04756  [pdf, other]

    math.OC cs.LG

    Average-case Acceleration Through Spectral Density Estimation

    Authors: Fabian Pedregosa, Damien Scieur

    Abstract: We develop a framework for the average-case analysis of random quadratic problems and derive algorithms that are optimal under this analysis. This yields a new class of methods that achieve acceleration given a model of the Hessian's eigenvalue distribution. We develop explicit algorithms for the uniform, Marchenko-Pastur, and exponential distributions. These methods are momentum-based algorithms,…

    Submitted 20 February, 2023; v1 submitted 11 February, 2020; originally announced February 2020.

    Journal ref: Proceedings of the 37th International Conference on Machine Learning, PMLR 119, 2020

  12. arXiv:2002.04664  [pdf, other]

    math.OC

    Universal Average-Case Optimality of Polyak Momentum

    Authors: Damien Scieur, Fabian Pedregosa

    Abstract: Polyak momentum (PM), also known as the heavy-ball method, is a widely used optimization method that enjoys an asymptotic optimal worst-case complexity on quadratic objectives. However, its remarkable empirical success is not fully explained by this optimality, as the worst-case analysis -- contrary to the average-case -- is not representative of the expected complexity of an algorithm. In this wo…

    Submitted 21 January, 2021; v1 submitted 11 February, 2020; originally announced February 2020.

    Comments: Added references in the proof of Theorem 4.1

    Journal ref: Proceedings of the 37th International Conference on Machine Learning, PMLR 119, 2020

  13. arXiv:2001.00602  [pdf, other]

    cs.LG math.OC stat.ML

    Accelerating Smooth Games by Manipulating Spectral Shapes

    Authors: Waïss Azizian, Damien Scieur, Ioannis Mitliagkas, Simon Lacoste-Julien, Gauthier Gidel

    Abstract: We use matrix iteration theory to characterize acceleration in smooth games. We define the spectral shape of a family of games as the set containing all eigenvalues of the Jacobians of standard gradient dynamics in the family. Shapes restricted to the real line represent well-understood classes of problems, like minimization. Shapes spanning the complex plane capture the added numerical challenges…

    Submitted 9 March, 2020; v1 submitted 2 January, 2020; originally announced January 2020.

    Comments: Appears in: Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS 2020). 34 pages

    ACM Class: G.1.6; I.2.6

  14. arXiv:1905.12363  [pdf, other]

    stat.ML cs.LG math.OC

    Extragradient with player sampling for faster Nash equilibrium finding

    Authors: Samy Jelassi, Carles Domingo-Enrich, Damien Scieur, Arthur Mensch, Joan Bruna

    Abstract: Data-driven modeling increasingly requires finding a Nash equilibrium in multi-player games, e.g. when training GANs. In this paper, we analyse a new extra-gradient method for Nash equilibrium finding that performs gradient extrapolations and updates on a random subset of players at each iteration. This approach provably exhibits a better rate of convergence than full extra-gradient for non-smoot…

    Submitted 21 July, 2020; v1 submitted 29 May, 2019; originally announced May 2019.

  15. arXiv:1903.08764  [pdf, ps, other]

    math.OC

    Generalized Framework for Nonlinear Acceleration

    Authors: Damien Scieur

    Abstract: Nonlinear acceleration algorithms improve the performance of iterative methods, such as gradient descent, using the information contained in past iterates. However, their efficiency is still not entirely understood even in the quadratic case. In this paper, we clarify the convergence analysis by giving general properties shared by several classes of nonlinear acceleration: Anderson acceleration (…

    Submitted 20 March, 2019; originally announced March 2019.

  16. arXiv:1810.04539  [pdf, other]

    math.OC

    Nonlinear Acceleration of Momentum and Primal-Dual Algorithms

    Authors: Raghu Bollapragada, Damien Scieur, Alexandre d'Aspremont

    Abstract: We describe convergence acceleration schemes for multistep optimization algorithms. The extrapolated solution is written as a nonlinear average of the iterates produced by the original optimization method. Our analysis does not need the underlying fixed-point operator to be symmetric, hence handles e.g. algorithms with momentum terms such as Nesterov's accelerated method, or primal-dual methods. T…

    Submitted 17 October, 2019; v1 submitted 10 October, 2018; originally announced October 2018.

  17. arXiv:1806.00370  [pdf, ps, other]

    math.OC cs.LG stat.ML

    Nonlinear Acceleration of CNNs

    Authors: Damien Scieur, Edouard Oyallon, Alexandre d'Aspremont, Francis Bach

    Abstract: The Regularized Nonlinear Acceleration (RNA) algorithm is an acceleration method capable of improving the rate of convergence of many optimization schemes such as gradient descent, SAGA or SVRG. Until now, its analysis has been limited to convex problems, but empirical observations show that RNA may be extended to wider settings. In this paper, we investigate further the benefits of RNA when applied to…

    Submitted 1 June, 2018; originally announced June 2018.

  18. arXiv:1805.09639  [pdf, ps, other]

    math.OC cs.LG stat.ML

    Online Regularized Nonlinear Acceleration

    Authors: Damien Scieur, Edouard Oyallon, Alexandre d'Aspremont, Francis Bach

    Abstract: Regularized nonlinear acceleration (RNA) estimates the minimum of a function by post-processing iterates from an algorithm such as the gradient method. It can be seen as a regularized version of Anderson acceleration, a classical acceleration scheme from numerical analysis. The new scheme provably improves the rate of convergence of fixed step gradient descent, and its empirical performance is com…

    Submitted 21 June, 2019; v1 submitted 24 May, 2018; originally announced May 2018.

  19. arXiv:1706.07270  [pdf, other]

    math.OC

    Nonlinear Acceleration of Stochastic Algorithms

    Authors: Damien Scieur, Alexandre d'Aspremont, Francis Bach

    Abstract: Extrapolation methods use the last few iterates of an optimization algorithm to produce a better estimate of the optimum. They were shown to achieve optimal convergence rates in a deterministic setting using simple gradient iterates. Here, we study extrapolation methods in a stochastic setting, where the iterates are produced by either a simple or an accelerated stochastic gradient algorithm. We f…

    Submitted 3 August, 2017; v1 submitted 22 June, 2017; originally announced June 2017.

  20. arXiv:1702.06751  [pdf, ps, other]

    math.OC

    Integration Methods and Accelerated Optimization Algorithms

    Authors: Damien Scieur, Vincent Roulet, Francis Bach, Alexandre d'Aspremont

    Abstract: We show that accelerated optimization methods can be seen as particular instances of multi-step integration schemes from numerical analysis, applied to the gradient flow equation. In comparison with recent advances in this vein, the differential equation considered here is the basic gradient flow and we show that multi-step schemes allow integration of this differential equation using larger step…

    Submitted 22 February, 2017; originally announced February 2017.

  21. arXiv:1606.04133  [pdf, ps, other]

    math.OC

    Regularized Nonlinear Acceleration

    Authors: Damien Scieur, Alexandre d'Aspremont, Francis Bach

    Abstract: We describe a convergence acceleration technique for unconstrained optimization problems. Our scheme computes estimates of the optimum from a nonlinear average of the iterates produced by any optimization method. The weights in this average are computed via a simple linear system, whose solution can be updated online. This acceleration scheme runs in parallel to the base algorithm, providing impro…

    Submitted 14 April, 2019; v1 submitted 13 June, 2016; originally announced June 2016.