Showing 1–8 of 8 results for author: Schechtman, S

  1. arXiv:2402.08272  [pdf, ps, other]

    math.OC

    The gradient's limit of a definable family of functions is a conservative set-valued field

    Authors: Sholom Schechtman

    Abstract: It is well known that the convergence of a family of smooth functions does not imply the convergence of its gradients. In this work, we show that if the family is definable in an o-minimal structure (for instance semialgebraic, subanalytic, or any composition of the previous with exp, log), then the gradient's limit is a conservative set-valued field in the sense introduced by Bolte and Pauwels. I…

    Submitted 21 February, 2024; v1 submitted 13 February, 2024; originally announced February 2024.

  2. arXiv:2305.13187  [pdf, ps, other]

    math.OC stat.ML

    SignSVRG: fixing SignSGD via variance reduction

    Authors: Evgenii Chzhen, Sholom Schechtman

    Abstract: We consider the problem of unconstrained minimization of finite sums of functions. We propose a simple yet practical way to incorporate variance reduction techniques into SignSGD, guaranteeing convergence that is similar to the full sign gradient descent. The core idea is first instantiated on the problem of minimizing sums of convex and Lipschitz functions and is then extended to the smooth cas…

    Submitted 22 May, 2023; originally announced May 2023.

  3. arXiv:2303.09261  [pdf, other]

    math.OC stat.ML

    Orthogonal Directions Constrained Gradient Method: from non-linear equality constraints to Stiefel manifold

    Authors: Sholom Schechtman, Daniil Tiapkin, Michael Muehlebach, Eric Moulines

    Abstract: We consider the problem of minimizing a non-convex function over a smooth manifold $\mathcal{M}$. We propose a novel algorithm, the Orthogonal Directions Constrained Gradient Method (ODCGM), which only requires computing a projection onto a vector space. ODCGM is infeasible but the iterates are constantly pulled towards the manifold, ensuring the convergence of ODCGM towards $\mathcal{M}$. ODCGM is…

    Submitted 16 March, 2023; originally announced March 2023.

  4. arXiv:2109.02455   

    math.OC stat.ML

    Stochastic Subgradient Descent on a Generic Definable Function Converges to a Minimizer

    Authors: Sholom Schechtman

    Abstract: It was previously shown by Davis and Drusvyatskiy that every Clarke critical point of a generic, semialgebraic (and more generally definable in an o-minimal structure), weakly convex function lies on an active manifold and is either a local minimum or an active strict saddle. In the first part of this work, we show that when the weak convexity assumption fails, a third type of point appears: a…

    Submitted 10 February, 2022; v1 submitted 6 September, 2021; originally announced September 2021.

    Comments: This paper was withdrawn due to a mistake in the work of Benaïm-Hofbauer-Sorin "Stochastic Approximations and Differential Inclusions". In the latter, the equivalence in Theorem 4.1 is not true and in particular the linearly interpolated process of the iterates is not an APT of the associated DI. This equivalence was at the heart of Propositions 7, 8 and Theorem 2 of the present paper

    MSC Class: 65K10; 62L20; 49J52; 32B20

  5. arXiv:2108.02072  [pdf, ps, other]

    math.OC stat.ML

    Stochastic Subgradient Descent Escapes Active Strict Saddles on Weakly Convex Functions

    Authors: Pascal Bianchi, Walid Hachem, Sholom Schechtman

    Abstract: In non-smooth stochastic optimization, we establish the non-convergence of the stochastic subgradient descent (SGD) to the critical points recently called active strict saddles by Davis and Drusvyatskiy. Such points lie on a manifold $M$ where the function $f$ has a direction of second-order negative curvature. Off this manifold, the norm of the Clarke subdifferential of $f$ is lower-bounded. We r…

    Submitted 25 July, 2023; v1 submitted 4 August, 2021; originally announced August 2021.

    Comments: Accepted for publication in Mathematics of Operations Research

    MSC Class: 65K10; 62L20 (Primary); 49J52; 32B20 (Secondary)

  6. Stochastic proximal subgradient descent oscillates in the vicinity of its accumulation set

    Authors: Sholom Schechtman

    Abstract: We analyze the stochastic proximal subgradient descent in the case where the objective functions are path differentiable and verify a Sard-type condition. While the accumulation set may not be reduced to a unique point, we show that the time spent by the iterates to move from one accumulation point to another goes to infinity. An oscillation-type behavior of the drift is established. These results s…

    Submitted 24 May, 2022; v1 submitted 30 March, 2021; originally announced March 2021.

    MSC Class: 65K10; 62L20 (Primary); 62M45 (Secondary)

    ACM Class: G.1.6; I.2.6

  7. arXiv:2012.04002  [pdf, ps, other]

    math.OC math.PR stat.ML

    Stochastic optimization with momentum: convergence, fluctuations, and traps avoidance

    Authors: A. Barakat, P. Bianchi, W. Hachem, Sh. Schechtman

    Abstract: In this paper, a general stochastic optimization procedure is studied, unifying several variants of the stochastic gradient descent such as, among others, the stochastic heavy ball method, the Stochastic Nesterov Accelerated Gradient algorithm (S-NAG), and the widely used Adam algorithm. The algorithm is seen as a noisy Euler discretization of a non-autonomous ordinary differential equation, recen…

    Submitted 10 July, 2021; v1 submitted 7 December, 2020; originally announced December 2020.

    Comments: Accepted for publication in Electronic Journal of Statistics. 49 pages

    MSC Class: 62L20; 34A12; 60F99

  8. arXiv:2005.08513  [pdf, ps, other]

    math.NA math.OC

    Convergence of constant step stochastic gradient descent for non-smooth non-convex functions

    Authors: Pascal Bianchi, Walid Hachem, Sholom Schechtman

    Abstract: This paper studies the asymptotic behavior of the constant step Stochastic Gradient Descent for the minimization of an unknown function F, defined as the expectation of a non-convex, non-smooth, locally Lipschitz random function. As the gradient may not exist, it is replaced by a certain operator: a reasonable choice is to use an element of the Clarke subdifferential of the random function; an ot…

    Submitted 12 April, 2022; v1 submitted 18 May, 2020; originally announced May 2020.

    Journal ref: Set-Valued and Variational Analysis, Springer, 2022