Showing 1–25 of 25 results for author: Pauwels, E

Searching in archive cs.
  1. arXiv:2405.15894  [pdf, other]

    math.OC cs.LG

    Derivatives of Stochastic Gradient Descent

    Authors: Franck Iutzeler, Edouard Pauwels, Samuel Vaiter

    Abstract: We consider stochastic optimization problems where the objective depends on some parameter, as commonly found in hyperparameter optimization for instance. We investigate the behavior of the derivatives of the iterates of Stochastic Gradient Descent (SGD) with respect to that parameter and show that they are driven by an inexact SGD recursion on a different objective function, perturbed by the conv…

    Submitted 24 May, 2024; originally announced May 2024.

  2. arXiv:2305.13768  [pdf, other]

    math.OC cs.LG

    One-step differentiation of iterative algorithms

    Authors: Jérôme Bolte, Edouard Pauwels, Samuel Vaiter

    Abstract: In appropriate frameworks, automatic differentiation is transparent to the user at the cost of being a significant computational burden when the number of operations is large. For iterative algorithms, implicit differentiation alleviates this issue but requires custom implementation of Jacobian evaluation. In this paper, we study one-step differentiation, also known as Jacobian-free backpropagatio…

    Submitted 23 May, 2023; originally announced May 2023.

  3. arXiv:2212.07844  [pdf, ps, other]

    cs.LG math.OC

    Differentiating Nonsmooth Solutions to Parametric Monotone Inclusion Problems

    Authors: Jérôme Bolte, Edouard Pauwels, Antonio Silveti-Falls

    Abstract: We leverage path differentiability and a recent result on nonsmooth implicit differentiation calculus to give sufficient conditions ensuring that the solution to a monotone inclusion problem will be path differentiable, with formulas for computing its generalized gradient. A direct consequence of our result is that these solutions happen to be differentiable almost everywhere. Our approach is full…

    Submitted 15 December, 2022; originally announced December 2022.

  4. arXiv:2206.01730  [pdf, ps, other]

    math.NA cs.AI cs.LG math.OC

    On the complexity of nonsmooth automatic differentiation

    Authors: Jérôme Bolte, Ryan Boustany, Edouard Pauwels, Béatrice Pesquet-Popescu

    Abstract: Using the notion of conservative gradient, we provide a simple model to estimate the computational costs of the backward and forward modes of algorithmic differentiation for a wide class of nonsmooth programs. The overhead complexity of the backward mode turns out to be independent of the dimension when using programs with locally Lipschitz semi-algebraic or definable elementary functions. This co…

    Submitted 6 February, 2023; v1 submitted 1 June, 2022; originally announced June 2022.

  5. arXiv:2206.00457  [pdf, other]

    math.OC cs.LG

    Automatic differentiation of nonsmooth iterative algorithms

    Authors: Jérôme Bolte, Edouard Pauwels, Samuel Vaiter

    Abstract: Differentiation along algorithms, i.e., piggyback propagation of derivatives, is now routinely used to differentiate iterative solvers in differentiable programming. Asymptotics is well understood for many smooth problems but the nondifferentiable case is hardly considered. Is there a limiting object for nonsmooth piggyback automatic differentiation (AD)? Does it have any variational meaning and c…

    Submitted 31 May, 2022; originally announced June 2022.

  6. arXiv:2201.03819  [pdf, ps, other]

    cs.LG math.OC

    Path differentiability of ODE flows

    Authors: Swann Marx, Edouard Pauwels

    Abstract: We consider flows of ordinary differential equations (ODEs) driven by path differentiable vector fields. Path differentiable functions constitute a proper subclass of Lipschitz functions which admit conservative gradients, a notion of generalized derivative compatible with basic calculus rules. Our main result states that such flows inherit the path differentiability property of the driving vector…

    Submitted 11 January, 2022; originally announced January 2022.

  7. arXiv:2106.12955  [pdf, other]

    cs.CV cs.CE

    Regularisation for PCA- and SVD-type matrix factorisations

    Authors: Abdolrahman Khoshrou, Eric J. Pauwels

    Abstract: Singular Value Decomposition (SVD) and its close relative, Principal Component Analysis (PCA), are well-known linear matrix decomposition techniques that are widely used in applications such as dimension reduction and clustering. However, an important limitation of SVD/PCA is its sensitivity to noise in the input data. In this paper, we take another look at the problem of regularisation and show t…

    Submitted 24 June, 2021; originally announced June 2021.

  8. arXiv:2106.12915  [pdf, other]

    cs.LG cs.AI

    Numerical influence of ReLU'(0) on backpropagation

    Authors: David Bertoin, Jérôme Bolte, Sébastien Gerchinovitz, Edouard Pauwels

    Abstract: In theory, the choice of ReLU'(0) in [0, 1] for a neural network has a negligible influence both on backpropagation and training. Yet, in the real world, 32 bits default precision combined with the size of deep learning problems makes it a hyperparameter of training methods. We investigate the importance of the value of ReLU'(0) for several precision levels (16, 32, 64 bits), on various networks (f…

    Submitted 3 November, 2023; v1 submitted 23 June, 2021; originally announced June 2021.

    Journal ref: Advances in Neural Information Processing Systems, Dec 2021, Paris, France

  9. arXiv:2106.04350  [pdf, other]

    cs.LG cs.AI math.OC

    Nonsmooth Implicit Differentiation for Machine Learning and Optimization

    Authors: Jérôme Bolte, Tam Le, Edouard Pauwels, Antonio Silveti-Falls

    Abstract: In view of training increasingly complex learning architectures, we establish a nonsmooth implicit function theorem with an operational calculus. Our result applies to most practical problems (i.e., definable problems) provided that a nonsmooth form of the classical invertibility condition is fulfilled. This approach allows for formal subdifferentiation: for instance, replacing derivatives by Clar…

    Submitted 5 April, 2022; v1 submitted 8 June, 2021; originally announced June 2021.

    Journal ref: Advances in Neural Information Processing Systems, Dec 2021, Online, France

  10. Second-order step-size tuning of SGD for non-convex optimization

    Authors: Camille Castera, Jérôme Bolte, Cédric Févotte, Edouard Pauwels

    Abstract: In view of a direct and simple improvement of vanilla SGD, this paper presents a fine-tuning of its step-sizes in the mini-batch case. For doing so, one estimates curvature, based on a local quadratic model and using only noisy gradient approximations. One obtains a new stochastic first-order method (Step-Tuned SGD), enhanced by second-order information, which can be seen as a stochastic version o…

    Submitted 21 November, 2021; v1 submitted 5 March, 2021; originally announced March 2021.

    Comments: To appear in Neural Processing Letters (accepted Nov. 2021)

    Journal ref: Neural Processing Letters (2022)

  11. arXiv:2011.12341  [pdf, ps, other]

    math.OC cs.LG

    Sequential convergence of AdaGrad algorithm for smooth convex optimization

    Authors: Cheik Traoré, Edouard Pauwels

    Abstract: We prove that the iterates produced by either the scalar step size variant or the coordinatewise variant of the AdaGrad algorithm are convergent sequences when applied to convex objective functions with Lipschitz gradient. The key insight is that such AdaGrad sequences satisfy a variable metric quasi-Fejér monotonicity property, which allows us to prove convergence.

    Submitted 13 April, 2021; v1 submitted 24 November, 2020; originally announced November 2020.

    Comments: 9 pages

  12. arXiv:2007.08810  [pdf, other]

    cs.LG math.OC

    A Hölderian backtracking method for min-max and min-min problems

    Authors: Jérôme Bolte, Lilian Glaudin, Edouard Pauwels, Mathieu Serrurier

    Abstract: We present a new algorithm to solve min-max or min-min problems out of the convex world. We use rigidity assumptions, ubiquitous in learning, making our method applicable to many optimization problems. Our approach takes advantage of hidden regularity properties and allows us to devise a simple algorithm of ridge type. An original feature of our method is to come with automatic step size adaptatio…

    Submitted 17 July, 2020; originally announced July 2020.

  13. Incremental Without Replacement Sampling in Nonconvex Optimization

    Authors: Edouard Pauwels

    Abstract: Minibatch decomposition methods for empirical risk minimization are commonly analysed in a stochastic approximation setting, also known as sampling with replacement. On the other hand, modern implementations of such techniques are incremental: they rely on sampling without replacement, for which available analyses are much scarcer. We provide convergence guarantees for the latter variant by analys…

    Submitted 6 January, 2023; v1 submitted 15 July, 2020; originally announced July 2020.

    Comments: Journal of Optimization Theory and Applications, 2021

  14. arXiv:2006.02080  [pdf, other]

    cs.LG math.OC stat.ML

    A mathematical model for automatic differentiation in machine learning

    Authors: Jérôme Bolte, Edouard Pauwels

    Abstract: Automatic differentiation, as implemented today, does not have a simple mathematical model adapted to the needs of modern machine learning. In this work we articulate the relationships between differentiation of programs as implemented in practice and differentiation of nonsmooth functions. To this end we provide a simple class of functions, a nonsmooth calculus, and show how they apply to stochas…

    Submitted 29 October, 2020; v1 submitted 3 June, 2020; originally announced June 2020.

    Journal ref: Conference on Neural Information Processing Systems, Dec 2020, Vancouver, Canada

  15. arXiv:2002.03657  [pdf, other]

    math.OC cs.LG

    Semialgebraic Optimization for Lipschitz Constants of ReLU Networks

    Authors: Tong Chen, Jean-Bernard Lasserre, Victor Magron, Edouard Pauwels

    Abstract: The Lipschitz constant of a network plays an important role in many applications of deep learning, such as robustness certification and Wasserstein Generative Adversarial Network. We introduce a semidefinite programming hierarchy to estimate the global and local Lipschitz constant of a multiple layer deep neural network. The novelty is to combine a polynomial lifting for ReLU functions derivatives…

    Submitted 28 October, 2020; v1 submitted 10 February, 2020; originally announced February 2020.

    Comments: NeurIPS 2020

  16. arXiv:1910.14458  [pdf, other]

    math.ST cs.LG

    Rate of convergence for geometric inference based on the empirical Christoffel function

    Authors: Mai Trang Vu, François Bachoc, Edouard Pauwels

    Abstract: We consider the problem of estimating the support of a measure from a finite, independent, sample. The estimators which are considered are constructed based on the empirical Christoffel function. Such estimators have been proposed for the problem of set estimation with heuristic justifications. We carry out a detailed finite sample analysis, that allows us to select the threshold and degree parame…

    Submitted 19 May, 2020; v1 submitted 31 October, 2019; originally announced October 2019.

  17. arXiv:1909.10300  [pdf, other]

    math.OC cs.AI cs.LG

    Conservative set valued fields, automatic differentiation, stochastic gradient method and deep learning

    Authors: Jérôme Bolte, Edouard Pauwels

    Abstract: Modern problems in AI or in numerical analysis require nonsmooth approaches with a flexible calculus. We introduce generalized derivatives called conservative fields for which we develop a calculus and provide representation formulas. Functions having a conservative field are called path differentiable: convex, concave, Clarke regular and any semialgebraic Lipschitz continuous functions are path d…

    Submitted 9 April, 2020; v1 submitted 23 September, 2019; originally announced September 2019.

    Comments: Corrected typos

  18. arXiv:1905.12278  [pdf, other]

    cs.LG math.OC stat.ML

    An Inertial Newton Algorithm for Deep Learning

    Authors: Camille Castera, Jérôme Bolte, Cédric Févotte, Edouard Pauwels

    Abstract: We introduce a new second-order inertial optimization method for machine learning called INNA. It exploits the geometry of the loss function while only requiring stochastic approximations of the function values and the generalized gradients. This makes INNA fully implementable and adapted to large-scale optimization problems such as the training of deep neural networks. The algorithm combines both…

    Submitted 28 July, 2021; v1 submitted 29 May, 2019; originally announced May 2019.

    Comments: To appear in Journal of Machine Learning Research (JMLR), Volume 22, acceptance date: 5/21

    Journal ref: Journal of Machine Learning Research (JMLR), v22(134):1-31, 2021

  19. SVD-based Visualisation and Approximation for Time Series Data in Smart Energy Systems

    Authors: Abdolrahman Khoshrou, Andre B. Dorsman, Eric J. Pauwels

    Abstract: Many time series in smart energy systems exhibit two different timescales. On the one hand there are patterns linked to daily human activities. On the other hand, there are relatively slow trends linked to seasonal variations. In this paper we interpret these time series as matrices, to be visualized as images. This approach has two advantages: First of all, interpreting such time series as images…

    Submitted 11 July, 2018; originally announced July 2018.

  20. Quantifying Volatility Reduction in German Day-ahead Spot Market in the Period 2006 through 2016

    Authors: Abdolrahman Khoshrou, Eric J. Pauwels

    Abstract: In Europe, Germany is taking the lead in the switch from conventional to renewable energy. This poses new challenges as wind and solar energy are fundamentally intermittent, weather-dependent and less predictable. It is therefore of considerable interest to investigate the evolution of price volatility in this post-transition era. There are a number of reasons, however, that make the practica…

    Submitted 19 July, 2018; originally announced July 2018.

  21. arXiv:1805.07943  [pdf, other]

    cs.LG stat.ML

    Relating Leverage Scores and Density using Regularized Christoffel Functions

    Authors: Edouard Pauwels, Francis Bach, Jean-Philippe Vert

    Abstract: Statistical leverage scores emerged as a fundamental tool for matrix sketching and column sampling with applications to low rank approximation, regression, random feature learning and quadrature. Yet, the very nature of this quantity is barely understood. Borrowing ideas from the orthogonal polynomial literature, we introduce the regularized Christoffel function associated to a positive definite k…

    Submitted 21 November, 2018; v1 submitted 21 May, 2018; originally announced May 2018.

  22. On Fienup Methods for Regularized Phase Retrieval

    Authors: Edouard Pauwels, Amir Beck, Yonina C. Eldar, Shoham Sabach

    Abstract: Alternating minimization, or Fienup methods, have a long history in phase retrieval. We provide new insights related to the empirical and theoretical analysis of these algorithms when used with Fourier measurements and combined with convex priors. In particular, we show that Fienup methods can be viewed as performing alternating minimization on a regularized nonconvex least-squares problem with re…

    Submitted 27 February, 2017; originally announced February 2017.

  23. arXiv:1701.02886  [pdf, other]

    cs.LG

    The empirical Christoffel function with applications in data analysis

    Authors: Jean-Bernard Lasserre, Edouard Pauwels

    Abstract: We illustrate the potential applications in machine learning of the Christoffel function, or more precisely, its empirical counterpart associated with a counting measure uniformly supported on a finite set of points. Firstly, we provide a thresholding scheme which allows one to approximate the support of a measure from a finite subset of its moments with strong asymptotic guarantees. Secondly, we prov…

    Submitted 7 February, 2019; v1 submitted 11 January, 2017; originally announced January 2017.

  24. arXiv:1606.03858  [pdf, other]

    cs.LG

    Sorting out typicality with the inverse moment matrix SOS polynomial

    Authors: Jean-Bernard Lasserre, Edouard Pauwels

    Abstract: We study a surprising phenomenon related to the representation of a cloud of data points using polynomials. We start with the previously unnoticed empirical observation that, given a collection (a cloud) of data points, the sublevel sets of a certain distinguished polynomial capture the shape of the cloud very accurately. This distinguished polynomial is a sum-of-squares (SOS) derived in a simple…

    Submitted 14 June, 2016; v1 submitted 13 June, 2016; originally announced June 2016.

  25. arXiv:1307.1568  [pdf]

    cs.AI

    Using MathML to Represent Units of Measurement for Improved Ontology Alignment

    Authors: Chau Do, Eric J. Pauwels

    Abstract: Ontologies provide a formal description of concepts and their relationships in a knowledge domain. The goal of ontology alignment is to identify semantically matching concepts and relationships across independently developed ontologies that purport to describe the same knowledge. In order to handle the widest possible class of ontologies, many alignment algorithms rely on terminological and struct…

    Submitted 5 July, 2013; originally announced July 2013.

    Comments: Conferences on Intelligent Computer Mathematics (CICM 2013), Bath, England

    Journal ref: CICM 2013, LNAI (7961), Springer, 2013