-
Variance reduction techniques for stochastic proximal point algorithms
Authors:
Cheik Traoré,
Vassilis Apidopoulos,
Saverio Salzo,
Silvia Villa
Abstract:
In the context of finite-sum minimization, variance reduction techniques are widely used to improve the performance of state-of-the-art stochastic gradient methods. Their practical impact is clear, as are their theoretical properties. Stochastic proximal point algorithms have been studied as an alternative to stochastic gradient algorithms since they are more stable with respect to the choice of the stepsize, but their variance-reduced versions are less studied than their gradient counterparts. In this work, we propose the first unified study of variance reduction techniques for stochastic proximal point algorithms. We introduce a generic stochastic proximal algorithm that can be specialized to give the proximal version of SVRG, SAGA, and some of their variants for smooth and convex functions. We provide several convergence results for the iterates and the objective function values. In addition, under the Polyak-Łojasiewicz (PL) condition, we obtain linear convergence rates for the iterates and the function values. Our numerical experiments demonstrate the advantages of the proximal variance reduction methods over their gradient counterparts, especially regarding stability with respect to the choice of the stepsize for difficult problems.
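As a rough illustration of the baseline that this abstract builds on, the sketch below runs a plain (not variance-reduced) stochastic proximal point iteration on a least-squares finite sum, where each proximal step has a closed form. It is not the SVRG or SAGA variant proposed in the paper; the function names, the choice of least-squares terms f_i(x) = 0.5 (<a_i, x> - b_i)^2, and the toy data are all assumptions made for this example.

```python
import random


def prox_least_squares(x, a, b, gamma):
    """Closed-form prox of gamma * 0.5 * (<a, x> - b)^2 evaluated at x.

    Hypothetical helper: solves argmin_y 0.5*(<a, y> - b)^2 + ||y - x||^2 / (2*gamma).
    """
    residual = sum(ai * xi for ai, xi in zip(a, x)) - b
    scale = gamma * residual / (1.0 + gamma * sum(ai * ai for ai in a))
    return [xi - scale * ai for xi, ai in zip(x, a)]


def stochastic_proximal_point(A, b, gamma, n_iters, seed=0):
    """Plain stochastic proximal point method (no variance reduction).

    At each iteration one term f_i is sampled uniformly at random and the
    iterate is replaced by the proximal point of gamma * f_i.
    """
    rng = random.Random(seed)
    x = [0.0] * len(A[0])
    for _ in range(n_iters):
        i = rng.randrange(len(A))
        x = prox_least_squares(x, A[i], b[i], gamma)
    return x


# Toy consistent system (assumed data): the exact solution is [2, -1].
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [2.0, -1.0, 1.0]
x_hat = stochastic_proximal_point(A, b, gamma=100.0, n_iters=2000)
```

On this toy problem the iteration stays well behaved even with a very large stepsize, which is the kind of stepsize robustness the abstract attributes to proximal methods; a gradient step with the same stepsize would diverge.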
Submitted 30 May, 2024; v1 submitted 18 August, 2023;
originally announced August 2023.
-
Regularization properties of dual subgradient flow
Authors:
Vassilis Apidopoulos,
Cesare Molinari,
Lorenzo Rosasco,
Silvia Villa
Abstract:
Dual gradient descent combined with early stopping represents an efficient alternative to the Tikhonov variational approach when the regularizer is strongly convex. However, for many relevant applications, it is crucial to deal with regularizers which are only convex. In this setting, the dual problem is nonsmooth, and dual gradient descent cannot be used. In this paper, we study the regularization properties of a subgradient dual flow, and we show that the proposed procedure achieves the same recovery accuracy as penalization methods, while being more efficient from the computational perspective.
Submitted 11 May, 2023;
originally announced May 2023.
-
Iterative regularization in classification via hinge loss diagonal descent
Authors:
Vassilis Apidopoulos,
Tomaso Poggio,
Lorenzo Rosasco,
Silvia Villa
Abstract:
Iterative regularization is a classic idea in regularization theory that has recently become popular in machine learning. On the one hand, it allows one to design efficient algorithms that control numerical and statistical accuracy at the same time. On the other hand, it sheds light on the learning curves observed while training neural networks. In this paper, we focus on iterative regularization in the context of classification. After contrasting this setting with that of regression and inverse problems, we develop an iterative regularization approach based on the use of the hinge loss function. More precisely, we consider a diagonal approach for a family of algorithms for which we prove convergence as well as rates of convergence. Our approach compares favorably with other alternatives, as also confirmed in numerical simulations.
Submitted 24 December, 2022;
originally announced December 2022.
-
Convergence rates for the Heavy-Ball continuous dynamics for non-convex optimization, under Polyak-Łojasiewicz condition
Authors:
Vassilis Apidopoulos,
Nicolò Ginatta,
Silvia Villa
Abstract:
We study convergence of the trajectories of the Heavy Ball dynamical system, with constant damping coefficient, in the framework of convex and non-convex smooth optimization. By using the Polyak-Łojasiewicz condition, we derive new linear convergence rates for the associated trajectory, in terms of objective function values, without assuming uniqueness of the minimizer.
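For orientation, the objects named in this abstract can be written in their standard textbook form as below. The symbols \alpha (damping coefficient), \mu (PL constant), C and \rho are notation assumed for this note; the abstract does not give the constants, and the paper's exact statements may differ.

```latex
\begin{align*}
  % Heavy Ball dynamics with constant damping coefficient \alpha > 0
  &\ddot{x}(t) + \alpha\,\dot{x}(t) + \nabla f(x(t)) = 0, \\
  % Polyak-Lojasiewicz condition with constant \mu > 0
  &\tfrac{1}{2}\,\|\nabla f(x)\|^{2} \;\ge\; \mu\,\bigl(f(x) - \min f\bigr)
    \quad \text{for all } x, \\
  % Shape of a linear (exponential-in-time) rate in function values
  &f(x(t)) - \min f \;\le\; C\,e^{-\rho t}, \qquad t \ge 0,
    \quad \text{for some } C, \rho > 0.
\end{align*}
```

The PL inequality does not require convexity or a unique minimizer, which is why a rate in function values can hold without the uniqueness assumption mentioned in the abstract.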
Submitted 26 January, 2022; v1 submitted 21 July, 2021;
originally announced July 2021.