-
The Randomized Block Coordinate Descent Method in the Hölder Smooth Setting
Authors:
Leandro Farias Maia,
David Huckleberry Gutman
Abstract:
This work provides the first convergence analysis for the Randomized Block Coordinate Descent method for minimizing a function that is both Hölder smooth and block Hölder smooth. Our analysis applies to objective functions that are non-convex, convex, and strongly convex. For non-convex functions, we show that the expected gradient norm reduces at an $O\left(k^{-\frac{γ}{1+γ}}\right)$ rate, where $k$ is the iteration count and $γ$ is the Hölder exponent. For convex functions, we show that the expected suboptimality gap reduces at the rate $O\left(k^{-γ}\right)$. In the strongly convex setting, we show this rate for the expected suboptimality gap improves to $O\left(k^{-\frac{2γ}{1-γ}}\right)$ when $γ<1$ and to a linear rate when $γ=1$. Notably, these new convergence rates coincide with those furnished in the existing literature for the Lipschitz smooth setting.
Submitted 12 March, 2024;
originally announced March 2024.
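For orientation, the following is a minimal sketch of randomized coordinate descent in the classical Lipschitz smooth case ($γ=1$) with blocks of size one. It is not the authors' algorithm or step-size rule for the Hölder setting; the quadratic objective, the function name `randomized_coordinate_descent`, and the test data `A`, `b` are all assumptions chosen for illustration.

```python
import random


def randomized_coordinate_descent(A, b, x0, iters, seed=0):
    """Minimize f(x) = 0.5 * x^T A x - b^T x, one random coordinate per step.

    Assumes A is symmetric positive definite, so the coordinate-wise
    Lipschitz constant of the gradient along coordinate i is A[i][i].
    """
    rng = random.Random(seed)  # fixed seed keeps the run reproducible
    x = list(x0)
    n = len(x)
    for _ in range(iters):
        i = rng.randrange(n)  # pick a coordinate uniformly at random
        # Partial derivative of f with respect to x[i].
        grad_i = sum(A[i][j] * x[j] for j in range(n)) - b[i]
        # Step of length 1 / L_i along coordinate i.
        x[i] -= grad_i / A[i][i]
    return x


# Hypothetical 2x2 problem; its exact minimizer solves A x = b, i.e. (0.4, 0.2).
A = [[2.0, 1.0], [1.0, 3.0]]
b = [1.0, 1.0]
x = randomized_coordinate_descent(A, b, [0.0, 0.0], iters=200)
```

For a quadratic, the step $1/L_i$ minimizes $f$ exactly along the chosen coordinate, which is why the iterates approach the solution of $Ax=b$ at a linear rate in this example.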
-
Quantum Enhanced Pattern Search Optimization
Authors:
Colton Mikes,
Ismael R. de Farias Jr.,
David Huckleberry Gutman,
Victoria E. Howle
Abstract:
This paper introduces a quantum-classical hybrid algorithm for generalized pattern search (GPS). We introduce a quantum search step algorithm using amplitude amplification, which reduces the number of oracle calls needed during the search step from O(N) classical calls to O(N^(1/2)) quantum calls. This work addresses three fundamental issues with using a quantum search step with GPS. First, we address the need to mark an improved mesh point, a requirement of the amplitude amplification algorithm. Second, we introduce a modified version of the amplitude amplification algorithm QSearch, which is guaranteed to terminate using a finite number of iterations. Third, we avoid disrupting the GPS algorithm's convergence by limiting the quantum algorithm to the search step.
Submitted 2 May, 2023;
originally announced May 2023.
-
The Inexact Cyclic Block Proximal Gradient Method and Properties of Inexact Proximal Maps
Authors:
Leandro Maia,
David Huckleberry Gutman,
Ryan Christopher Hughes
Abstract:
This paper expands the Cyclic Block Proximal Gradient method for block separable composite minimization by allowing for inexactly computed gradients and proximal maps. The resultant algorithm, the Inexact Cyclic Block Proximal Gradient (I-CBPG) method, shares the same convergence rate as its exactly computed analogue provided the allowable errors decrease sufficiently quickly or are pre-selected to be sufficiently small. We provide numerical experiments that showcase the practical computational advantage of I-CBPG for certain fixed tolerances of approximation error and for a dynamically decreasing error tolerance regime in particular. We establish a tight relationship between inexact proximal map evaluations and $δ$-subgradients in our $δ$-Second Prox Theorem. This theorem forms the foundation of our convergence analysis and enables us to show that inexact gradient computations and other notions of inexact proximal map computation can be subsumed within a single unifying framework.
Submitted 3 January, 2022;
originally announced January 2022.
-
Coordinate Descent Without Coordinates: Tangent Subspace Descent on Riemannian Manifolds
Authors:
David Huckleberry Gutman,
Nam Ho-Nguyen
Abstract:
We extend coordinate descent to manifold domains, and provide convergence analyses for geodesically convex and non-convex smooth objective functions. Our key insight is to draw an analogy between coordinate blocks in Euclidean space and tangent subspaces of a manifold. Hence, our method is called tangent subspace descent (TSD). The core principle behind ensuring convergence of TSD is the appropriate choice of subspace at each iteration. To this end, we propose two novel conditions, the gap ensuring and $C$-randomized norm conditions on deterministic and randomized modes of subspace selection respectively, that promise convergence for smooth functions and that are satisfied in practical contexts. We propose two subspace selection rules of particular practical interest that satisfy these conditions: a deterministic one for the manifold of square orthogonal matrices, and a randomized one for the Stiefel manifold. Our proof-of-concept numerical experiments on the orthogonal Procrustes problem demonstrate TSD's efficacy.
Submitted 13 June, 2020; v1 submitted 23 December, 2019;
originally announced December 2019.
-
The condition number of a function relative to a set
Authors:
David H. Gutman,
Javier F. Peña
Abstract:
The condition number of a differentiable convex function, namely the ratio of its smoothness to strong convexity constants, is closely tied to fundamental properties of the function. In particular, the condition number of a quadratic convex function is the square of the aspect ratio of a canonical ellipsoid associated to the function. Furthermore, the condition number of a function bounds the linear rate of convergence of the gradient descent algorithm for unconstrained convex minimization.
We propose a condition number of a differentiable convex function relative to a reference convex set and distance function pair. This relative condition number is defined as the ratio of a relative smoothness constant to a relative strong convexity constant. We show that the relative condition number extends the main properties of the traditional condition number both in terms of its geometric insight and in terms of its role in characterizing the linear convergence of first-order methods for constrained convex minimization.
When the reference set $X$ is a convex cone or a polyhedron and the function $f$ is of the form $f = g\circ A$, we provide characterizations of and bounds on the condition number of $f$ relative to $X$ in terms of the usual condition number of $g$ and a suitable condition number of the pair $(A,X)$.
Submitted 18 April, 2020; v1 submitted 24 January, 2019;
originally announced January 2019.
-
Perturbed Fenchel duality and first-order methods
Authors:
David H. Gutman,
Javier F. Peña
Abstract:
We show that the iterates generated by a generic first-order meta-algorithm satisfy a canonical perturbed Fenchel duality inequality. The latter in turn readily yields a unified derivation of the best known convergence rates for various popular first-order algorithms including the conditional gradient method as well as the main kinds of Bregman proximal methods: subgradient, gradient, fast gradient, and universal gradient methods.
Submitted 3 December, 2021; v1 submitted 25 December, 2018;
originally announced December 2018.
-
Enhanced Basic Procedures for the Projection and Rescaling Algorithm
Authors:
David H. Gutman
Abstract:
Using an efficient algorithmic implementation of Carathéodory's theorem, we propose three enhanced versions of the Projection and Rescaling algorithm's basic procedures, each of which improves upon the order of complexity of its analogue in [Mathematical Programming Series A, 166 (2017), pp. 87-111].
Submitted 16 July, 2018;
originally announced July 2018.
-
The condition of a function relative to a polytope
Authors:
David H. Gutman,
Javier F. Peña
Abstract:
The condition number of a smooth convex function, namely the ratio of its smoothness to strong convexity constants, is closely tied to fundamental properties of the function. In particular, the condition number of a quadratic convex function is precisely the square of the diameter-to-width ratio of a canonical ellipsoid associated to the function. Furthermore, the condition number of a function bounds the linear rate of convergence of the gradient descent algorithm for unconstrained minimization.
We propose a condition number of a smooth convex function relative to a reference polytope. This relative condition number is defined as the ratio of a relative smoothness constant to a relative strong convexity constant of the function, where both constants are relative to the reference polytope. The relative condition number extends the main properties of the traditional condition number. In particular, we show that the condition number of a quadratic convex function relative to a polytope is precisely the square of the diameter-to-facial-distance ratio of a scaled polytope for a canonical scaling induced by the function. Furthermore, we illustrate how the relative condition number of a function bounds the linear rate of convergence of first-order methods for minimization of the function over the polytope.
Submitted 1 February, 2018;
originally announced February 2018.
-
Convergence rates of proximal gradient methods via the convex conjugate
Authors:
David H. Gutman,
Javier F. Peña
Abstract:
We give a novel proof of the $O(1/k)$ and $O(1/k^2)$ convergence rates of the proximal gradient and accelerated proximal gradient methods for composite convex minimization. The crux of the new proof is an upper bound constructed via the convex conjugate of the objective function.
Submitted 8 January, 2018; v1 submitted 8 January, 2018;
originally announced January 2018.
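As a reminder of the method whose rates the abstract above concerns, here is a minimal sketch of the (unaccelerated) proximal gradient iteration for a composite problem $\min_x f(x) + g(x)$. It does not reproduce the paper's conjugate-based proof; the separable quadratic $f$, the choice $g = λ\|x\|_1$, and the names `soft_threshold`, `proximal_gradient`, `a`, `c`, and `lam` are assumptions made for illustration.

```python
def soft_threshold(v, tau):
    """Proximal map of tau * |.| applied to the scalar v."""
    if v > tau:
        return v - tau
    if v < -tau:
        return v + tau
    return 0.0


def proximal_gradient(a, c, lam, x0, iters):
    """Minimize sum_i 0.5 * a[i] * (x[i] - c[i])**2 + lam * sum_i |x[i]|.

    Assumes every a[i] > 0; the gradient of the smooth part is then
    Lipschitz with constant L = max(a), and the step size is 1 / L.
    """
    step = 1.0 / max(a)
    x = list(x0)
    for _ in range(iters):
        # Gradient step on the smooth part, then prox step on the l1 term.
        x = [
            soft_threshold(xi - step * ai * (xi - ci), step * lam)
            for xi, ai, ci in zip(x, a, c)
        ]
    return x


# Hypothetical data; the exact minimizer is soft_threshold(c[i], lam / a[i])
# per coordinate, i.e. (2.0, 0.0).
a = [1.0, 4.0]
c = [3.0, -0.1]
x = proximal_gradient(a, c, lam=1.0, x0=[0.0, 0.0], iters=200)
```

Because this toy objective is separable and strongly convex, the iterates converge linearly, faster than the general $O(1/k)$ guarantee for convex composite problems.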