
Heavy Ball Momentum for Non-Strongly Convex Optimization

Authors: Jean-François Aujol, Charles Dossal, Hippolyte Labarrière, Aude Rondepierre

Abstract: When considering the minimization of a quadratic or strongly convex function, it is well known that first-order methods involving an inertial term weighted by a constant-in-time parameter are particularly efficient (see Polyak [32], Nesterov [28], and references therein). By setting the inertial parameter according to the condition number of the objective function, these methods guarantee a fast exponential decay of the error. We prove that schemes of this type (later called Heavy Ball schemes) are relevant in a relaxed setting, i.e. for composite functions satisfying a quadratic growth condition. In particular, we adapt V-FISTA, introduced by Beck in [10] for strongly convex functions, to this broader class of functions. To the authors' knowledge, the resulting worst-case convergence rates are faster than any other in the literature, including those of FISTA restart schemes. No assumption on the set of minimizers is required, and guarantees are also given in the non-optimal case, i.e. when the condition number is not exactly known. This analysis follows the study of the corresponding continuous-time dynamical system (Heavy Ball with friction system), for which new convergence results of the trajectory are shown.
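As a rough illustration of the constant-inertia idea the abstract starts from, the sketch below runs the classical Polyak heavy ball iteration on a strongly convex quadratic, with the step and inertial parameter set from the condition number. It is not the adapted V-FISTA scheme of the paper, which handles composite functions under quadratic growth. The test problem, the function name `heavy_ball`, and the variables `x_hb` and `x_gd` are assumptions made for this example.

```python
import numpy as np

def heavy_ball(grad, x0, step, momentum, n_iter):
    """Gradient step plus a constant-in-time inertial term (hypothetical helper)."""
    x_prev = x0.copy()
    x = x0.copy()
    for _ in range(n_iter):
        x_next = x - step * grad(x) + momentum * (x - x_prev)
        x_prev, x = x, x_next
    return x

# Assumed test problem: F(x) = 0.5 * x^T A x, eigenvalues of A in [mu, L],
# so the unique minimizer is x = 0.
mu, L = 1.0, 100.0
A = np.diag(np.linspace(mu, L, 20))
grad = lambda x: A @ x
x0 = np.ones(20)

# Classical Polyak tuning for quadratics, driven by the condition number L / mu.
step = 4.0 / (np.sqrt(L) + np.sqrt(mu)) ** 2
momentum = ((np.sqrt(L) - np.sqrt(mu)) / (np.sqrt(L) + np.sqrt(mu))) ** 2

x_hb = heavy_ball(grad, x0, step, momentum, 200)   # with inertia
x_gd = heavy_ball(grad, x0, 1.0 / L, 0.0, 200)     # plain gradient descent baseline

print("heavy ball error:      ", np.linalg.norm(x_hb))
print("gradient descent error:", np.linalg.norm(x_gd))
```

On this badly conditioned quadratic the inertial iterate ends up much closer to the minimizer than plain gradient descent after the same number of iterations, which is the behaviour the abstract refers to as fast exponential decay.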

Submitted 11 March, 2024; originally announced March 2024.

MSC Class: 46N10; 65K10; 90C25; 90C30

arXiv:2307.14323 [pdf, other]

Parameter-Free FISTA by Adaptive Restart and Backtracking

Authors: Jean-François Aujol, Luca Calatroni, Charles Dossal, Hippolyte Labarrière, Aude Rondepierre

Abstract: We consider a combined restarting and adaptive backtracking strategy for the popular Fast Iterative Shrinking-Thresholding Algorithm frequently employed for accelerating the convergence speed of large-scale structured convex optimization problems. Several variants of FISTA enjoy a provable linear convergence rate for the function values $F(x_n)$ of the form $\mathcal{O}(e^{-K\sqrt{μ/L}~n})$ under the prior knowledge of problem conditioning, i.e. of the ratio between the (Łojasiewicz) parameter $μ$ determining the growth of the objective function and the Lipschitz constant $L$ of its smooth component. These parameters are nonetheless hard to estimate in many practical cases. Recent works address the problem by estimating either parameter via suitable adaptive strategies. In our work both parameters can be estimated at the same time by means of an algorithmic restarting scheme where, at each restart, a non-monotone estimation of $L$ is performed. For this scheme, theoretical convergence results are proved, showing that a $\mathcal{O}(e^{-K\sqrt{μ/L}~n})$ convergence speed can still be achieved along with quantitative estimates of the conditioning. The resulting Free-FISTA algorithm is therefore parameter-free. Several numerical results are reported to confirm the practical interest of its use in many exemplar problems.
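To make the restarting idea concrete, here is a minimal sketch of FISTA with a simple function-value restart on an assumed LASSO problem. It is not Free-FISTA: the Lipschitz constant $L$ is computed exactly instead of being estimated by backtracking, and no estimate of $μ$ is produced. The problem data, the restart test, and the names `soft_threshold` and `fista_restart` are assumptions made for this example.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def fista_restart(A, b, lam, n_iter=500):
    """FISTA on 0.5*||Ax - b||^2 + lam*||x||_1 with function-value restart."""
    L = np.linalg.norm(A, 2) ** 2  # assumption: L is known (spectral norm squared)
    F = lambda x: 0.5 * np.sum((A @ x - b) ** 2) + lam * np.sum(np.abs(x))
    prox_step = lambda y: soft_threshold(y - A.T @ (A @ y - b) / L, lam / L)

    x = np.zeros(A.shape[1])
    y, t = x.copy(), 1.0
    values = [F(x)]
    for _ in range(n_iter):
        x_new = prox_step(y)
        if F(x_new) > values[-1]:
            # Restart: discard the inertia and take a plain proximal
            # gradient step from the last iterate instead.
            t = 1.0
            x_new = prox_step(x)
        t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        y = x_new + ((t - 1.0) / t_new) * (x_new - x)
        x, t = x_new, t_new
        values.append(F(x))
    return x, values

# Assumed synthetic data: sparse signal, random Gaussian measurements.
rng = np.random.default_rng(0)
A = rng.standard_normal((40, 100))
x_true = np.zeros(100)
x_true[:5] = 1.0
b = A @ x_true

x_hat, values = fista_restart(A, b, lam=0.1)
print("initial objective:", values[0])
print("final objective:  ", values[-1])
```

Because a restart falls back to a proximal gradient step with step size $1/L$, the recorded objective values do not increase from one iteration to the next. A parameter-free variant in the spirit of the abstract would replace the exact `L` with a backtracking estimate refreshed at each restart.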

Submitted 26 July, 2023; originally announced July 2023.

arXiv:2206.06853 [pdf, other]

Fast convergence of inertial dynamics with Hessian-driven damping under geometry assumptions

Authors: Jean-François Aujol, Charles Dossal, Văn Hào Hoàng, Hippolyte Labarrière, Aude Rondepierre

Abstract: First-order optimization algorithms can be considered as a discretization of ordinary differential equations (ODEs) \cite{su2014differential}. In this perspective, studying the properties of the corresponding trajectories may lead to convergence results which can be transferred to the numerical scheme. In this paper we analyse the following ODE introduced by Attouch et al. in \cite{attouch2016fast}: \begin{equation*} \forall t\geqslant t_0,~\ddot{x}(t)+\frac{α}{t}\dot{x}(t)+β H_F(x(t))\dot{x}(t)+\nabla F(x(t))=0,\end{equation*} where $α>0$, $β>0$ and $H_F$ denotes the Hessian of $F$. This ODE can be derived to build numerical schemes which do not require $F$ to be twice differentiable as shown in \cite{attouch2020first,attouch2021convergence}. We provide strong convergence results on the error $F(x(t))-F^*$ and integrability properties on $\|\nabla F(x(t))\|$ under some geometry assumptions on $F$ such as quadratic growth around the set of minimizers. In particular, we show that the decay rate of the error for a strongly convex function is $O(t^{-α-\varepsilon})$ for any $\varepsilon>0$. These results are briefly illustrated at the end of the paper.

Submitted 20 June, 2022; v1 submitted 14 June, 2022; originally announced June 2022.

Showing 1–3 of 3 results for author: Labarrière, H