Search | arXiv e-print repository

arXiv:1906.01164 [pdf, other]

A Generic Acceleration Framework for Stochastic Composite Optimization

Authors: Andrei Kulunchakov, Julien Mairal

Abstract: In this paper, we introduce various mechanisms to obtain accelerated first-order stochastic optimization algorithms when the objective function is convex or strongly convex. Specifically, we extend the Catalyst approach originally designed for deterministic objectives to the stochastic setting. Given an optimization method with mild convergence guarantees for strongly convex problems, the challeng… ▽ More In this paper, we introduce various mechanisms to obtain accelerated first-order stochastic optimization algorithms when the objective function is convex or strongly convex. Specifically, we extend the Catalyst approach originally designed for deterministic objectives to the stochastic setting. Given an optimization method with mild convergence guarantees for strongly convex problems, the challenge is to accelerate convergence to a noise-dominated region, and then achieve convergence with an optimal worst-case complexity depending on the noise variance of the gradients. A side contribution of our work is also a generic analysis that can handle inexact proximal operators, providing new insights about the robustness of stochastic algorithms when the proximal operator cannot be exactly computed. △ Less

Submitted 9 October, 2019; v1 submitted 3 June, 2019; originally announced June 2019.

Journal ref: Advances in Neural Information Processing Systems (NeurIPS), Dec 2019, Vancouver, Canada

arXiv:1905.02374 [pdf, other]

Estimate Sequences for Variance-Reduced Stochastic Composite Optimization

Authors: Andrei Kulunchakov, Julien Mairal

Abstract: In this paper, we propose a unified view of gradient-based algorithms for stochastic convex composite optimization by extending the concept of estimate sequence introduced by Nesterov. This point of view covers the stochastic gradient descent method, variants of the approaches SAGA, SVRG, and has several advantages: (i) we provide a generic proof of convergence for the aforementioned methods; (ii)… ▽ More In this paper, we propose a unified view of gradient-based algorithms for stochastic convex composite optimization by extending the concept of estimate sequence introduced by Nesterov. This point of view covers the stochastic gradient descent method, variants of the approaches SAGA, SVRG, and has several advantages: (i) we provide a generic proof of convergence for the aforementioned methods; (ii) we show that this SVRG variant is adaptive to strong convexity; (iii) we naturally obtain new algorithms with the same guarantees; (iv) we derive generic strategies to make these algorithms robust to stochastic noise, which is useful when data is corrupted by small random perturbations. Finally, we show that this viewpoint is useful to obtain new accelerated algorithms in the sense of Nesterov. △ Less

Submitted 7 May, 2019; originally announced May 2019.

Comments: short version of preprint arXiv:1901.08788

Journal ref: International Conference on Machine Learning (ICML), Jun 2019, Long Beach, United States

arXiv:1901.08788 [pdf, other]

Estimate Sequences for Stochastic Composite Optimization: Variance Reduction, Acceleration, and Robustness to Noise

Authors: Andrei Kulunchakov, Julien Mairal

Abstract: In this paper, we propose a unified view of gradient-based algorithms for stochastic convex composite optimization by extending the concept of estimate sequence introduced by Nesterov. More precisely, we interpret a large class of stochastic optimization methods as procedures that iteratively minimize a surrogate of the objective, which covers the stochastic gradient descent method and variants of… ▽ More In this paper, we propose a unified view of gradient-based algorithms for stochastic convex composite optimization by extending the concept of estimate sequence introduced by Nesterov. More precisely, we interpret a large class of stochastic optimization methods as procedures that iteratively minimize a surrogate of the objective, which covers the stochastic gradient descent method and variants of the incremental approaches SAGA, SVRG, and MISO/Finito/SDCA. This point of view has several advantages: (i) we provide a simple generic proof of convergence for all of the aforementioned methods; (ii) we naturally obtain new algorithms with the same guarantees; (iii) we derive generic strategies to make these algorithms robust to stochastic noise, which is useful when data is corrupted by small random perturbations. Finally, we propose a new accelerated stochastic gradient descent algorithm and an accelerated SVRG algorithm with optimal complexity that is robust to stochastic noise. △ Less

Submitted 4 September, 2020; v1 submitted 25 January, 2019; originally announced January 2019.

Comments: Journal of Machine Learning Research, Microtome Publishing, In press

Showing 1–3 of 3 results for author: Kulunchakov, A