Search | arXiv e-print repository

doi 10.1007/s10107-024-02061-8

Automated tight Lyapunov analysis for first-order methods

Authors: Manu Upadhyaya, Sebastian Banert, Adrien B. Taylor, Pontus Giselsson

Abstract: We present a methodology for establishing the existence of quadratic Lyapunov inequalities for a wide range of first-order methods used to solve convex optimization problems. In particular, we consider (i) classes of optimization problems of finite-sum form with (possibly strongly) convex and possibly smooth functional components, (ii) first-order methods that can be written as a linear system on state-space form in feedback interconnection with the subdifferentials of the functional components of the objective function, and (iii) quadratic Lyapunov inequalities that can be used to draw convergence conclusions. We present a necessary and sufficient condition for the existence of a quadratic Lyapunov inequality within a predefined class of Lyapunov inequalities, which amounts to solving a small-sized semidefinite program. We showcase our methodology on several first-order methods that fit the framework. Most notably, our methodology allows us to significantly extend the region of parameter choices that allow for duality gap convergence in the Chambolle-Pock method when the linear operator is the identity map.

Submitted 27 February, 2024; v1 submitted 13 February, 2023; originally announced February 2023.

arXiv:1812.00146 [pdf, other]

Operator Splitting Performance Estimation: Tight contraction factors and optimal parameter selection

Authors: Ernest K. Ryu, Adrien B. Taylor, Carolina Bergeling, Pontus Giselsson

Abstract: We propose a methodology for studying the performance of common splitting methods through semidefinite programming. We prove tightness of the methodology and demonstrate its value by presenting two applications of it. First, we use the methodology as a tool for computer-assisted proofs to prove tight analytical contraction factors for Douglas–Rachford splitting that are likely too complicated for a human to find bare-handed. Second, we use the methodology as an algorithmic tool to computationally select the optimal splitting method parameters by solving a series of semidefinite programs.

Submitted 30 April, 2020; v1 submitted 1 December, 2018; originally announced December 2018.

Comments: Published in the SIAM Journal on Optimization

MSC Class: 47H05 47H09 68Q25 90C22 90C25 90C30 90C60

arXiv:1803.05676 [pdf, ps, other]

doi 10.1007/s10107-019-01410-2

Efficient First-order Methods for Convex Minimization: a Constructive Approach

Authors: Yoel Drori, Adrien B. Taylor

Abstract: We describe a novel constructive technique for devising efficient first-order methods for a wide range of large-scale convex minimization settings, including smooth, non-smooth, and strongly convex minimization. The technique builds upon a certain variant of the conjugate gradient method to construct a family of methods such that a) all methods in the family share the same worst-case guarantee as the base conjugate gradient method, and b) the family includes a fixed-step first-order method. We demonstrate the effectiveness of the approach by deriving optimal methods for the smooth and non-smooth cases, including new methods that forego knowledge of the problem parameters at the cost of a one-dimensional line search per iteration, and a universal method for the union of these classes that requires a three-dimensional search per iteration. In the strongly convex case, we show how numerical tools can be used to perform the construction, and show that the resulting method offers an improved worst-case bound compared to Nesterov's celebrated fast gradient method.

Submitted 26 June, 2019; v1 submitted 15 March, 2018; originally announced March 2018.

Comments: Accepted in Mathematical Programming (https://doi.org/10.1007/s10107-019-01410-2). Code available on GitHub (https://github.com/AdrienTaylor/GreedyMethods)

arXiv:1705.04398 [pdf, ps, other]

Exact worst-case convergence rates of the proximal gradient method for composite convex minimization

Authors: Adrien B. Taylor, Julien M. Hendrickx, François Glineur

Abstract: We study the worst-case convergence rates of the proximal gradient method for minimizing the sum of a smooth strongly convex function and a non-smooth convex function whose proximal operator is available. We establish the exact worst-case convergence rates of the proximal gradient method in this setting for any step size and for different standard performance measures: objective function accuracy, distance to optimality and residual gradient norm. The proof methodology relies on recent developments in performance estimation of first-order methods based on semidefinite programming. In the case of the proximal gradient method, this methodology allows obtaining exact and non-asymptotic worst-case guarantees that are conceptually very simple, although apparently new. On the way, we discuss how strong convexity can be replaced by weaker assumptions, while preserving the corresponding convergence rates. We also establish that the same fixed step size policy is optimal for all three performance measures. Finally, we extend recent results on the worst-case behavior of gradient descent with exact line search to the proximal case.

Submitted 29 February, 2020; v1 submitted 11 May, 2017; originally announced May 2017.

Comments: Journal of Optimization Theory and Applications (DOI:10.1007/s10957-018-1298-1) [V4: correction of a minor typo and one updated reference]

arXiv:1606.09365 [pdf, ps, other]

On the worst-case complexity of the gradient method with exact line search for smooth strongly convex functions

Authors: Etienne de Klerk, François Glineur, Adrien B. Taylor

Abstract: We consider the gradient (or steepest) descent method with exact line search applied to a strongly convex function with Lipschitz continuous gradient. We establish the exact worst-case rate of convergence of this scheme, and show that this worst-case behavior is exhibited by a certain convex quadratic function. We also give the tight worst-case complexity bound for a noisy variant of gradient descent method, where exact line-search is performed in a search direction that differs from negative gradient by at most a prescribed relative tolerance. The proofs are computer-assisted, and rely on the resolutions of semidefinite programming performance estimation problems as introduced in the paper [Y. Drori and M. Teboulle. Performance of first-order methods for smooth convex minimization: a novel approach. Mathematical Programming, 145(1-2):451-482, 2014].

Submitted 15 September, 2016; v1 submitted 30 June, 2016; originally announced June 2016.

Comments: 11 pages

arXiv:1512.07516 [pdf, other]

doi 10.1137/16M108104X

Exact Worst-case Performance of First-order Methods for Composite Convex Optimization

Authors: Adrien B. Taylor, Julien M. Hendrickx, François Glineur

Abstract: We provide a framework for computing the exact worst-case performance of any algorithm belonging to a broad class of oracle-based first-order methods for composite convex optimization, including those performing explicit, projected, proximal, conditional and inexact (sub)gradient steps. We simultaneously obtain tight worst-case guarantees and explicit instances of optimization problems on which the algorithm reaches this worst-case. We achieve this by reducing the computation of the worst-case to solving a convex semidefinite program, generalizing previous works on performance estimation by Drori and Teboulle [13] and the authors [43]. We use these developments to obtain a tighter analysis of the proximal point algorithm and of several variants of fast proximal gradient, conditional gradient, subgradient and alternating projection methods. In particular, we present a new analytical worst-case guarantee for the proximal point algorithm that is twice better than previously known, and improve the standard worst-case guarantee for the conditional gradient method by more than a factor of two. We also show how the optimized gradient method proposed by Kim and Fessler in [22] can be extended by incorporating a projection or a proximal operator, which leads to an algorithm that converges in the worst-case twice as fast as the standard accelerated proximal gradient method [2].

Submitted 21 November, 2019; v1 submitted 23 December, 2015; originally announced December 2015.

Comments: Published in SIOPT (updated version with corrected typo) Code available at https://github.com/AdrienTaylor/Performance-Estimation-Toolbox

Journal ref: SIAM Journal on Optimization, 27(3), 1283-1313 (2017)

arXiv:1502.05666 [pdf, other]

Smooth Strongly Convex Interpolation and Exact Worst-case Performance of First-order Methods

Authors: Adrien B. Taylor, Julien M. Hendrickx, François Glineur

Abstract: We show that the exact worst-case performance of fixed-step first-order methods for unconstrained optimization of smooth (possibly strongly) convex functions can be obtained by solving convex programs. Finding the worst-case performance of a black-box first-order method is formulated as an optimization problem over a set of smooth (strongly) convex functions and initial conditions. We develop closed-form necessary and sufficient conditions for smooth (strongly) convex interpolation, which provide a finite representation for those functions. This allows us to reformulate the worst-case performance estimation problem as an equivalent finite dimension-independent semidefinite optimization problem, whose exact solution can be recovered up to numerical precision. Optimal solutions to this performance estimation problem provide both worst-case performance bounds and explicit functions matching them, as our smooth (strongly) convex interpolation procedure is constructive. Our works build on those of Drori and Teboulle in [Math. Prog. 145 (1-2), 2014] who introduced and solved relaxations of the performance estimation problem for smooth convex functions. We apply our approach to different fixed-step first-order methods with several performance criteria, including objective function accuracy and gradient norm. We conjecture several numerically supported worst-case bounds on the performance of the fixed-step gradient, fast gradient and optimized gradient methods, both in the smooth convex and the smooth strongly convex cases, and deduce tight estimates of the optimal step size for the gradient method.

Submitted 31 October, 2016; v1 submitted 19 February, 2015; originally announced February 2015.

Comments: Accepted for publication in Mathematical Programming, DOI 10.1007/s10107-016-1009-3. Performance estimation code: http://perso.uclouvain.be/adrien.taylor/#projects

arXiv:1112.3298 [pdf]

doi 10.1073/pnas.1120597109

Evaluation of Competing J domain:Hsp70 Complex Models in Light of Existing Mutational and NMR Data

Authors: Rui Sousa, Jianwen Jiang, Eileen M. Lafer, Andrew P. Hinck, Li** Wang, Alexander B. Taylor, E. Guy Maes

Abstract: Ahmad et al. recently presented an NMR-based model for a bacterial DnaJ J domain:DnaK(Hsp70):ADP complex(1) that differs significantly from the crystal structure of a disulfide linked mammalian auxilin J domain:Hsc70 complex that we previously published(2). They claimed that their model could better account for existing mutational data, was in better agreement with previous NMR studies, and that the presence of a cross-link in our structure made it irrelevant to understanding J:Hsp70 interactions. Here we detail extensive NMR and mutational data relevant to understanding J:Hsp70 function and show that, in fact, our structure is much better able to account for the mutational data and is in much better agreement with a previous NMR study of a mammalian polyoma virus T-ag J domain:Hsc70 complex than is the Ahmad et al. complex, and that our structure is predictive and provides insight into J:Hsp70 interactions and mechanism of ATPase activation.

Submitted 14 December, 2011; originally announced December 2011.

Showing 1–8 of 8 results for author: Taylor, A B