
A penalty barrier framework for nonconvex constrained optimization

Authors: Alberto De Marchi, Andreas Themelis

Abstract: Focusing on minimization problems with structured objective function and smooth constraints, we present a flexible technique that combines the beneficial regularization effects of (exact) penalty and interior-point methods. Working in the fully nonconvex setting, a pure barrier approach requires careful steps when approaching the infeasible set, thus hindering convergence. We show how a tight integration with a penalty scheme overcomes such conservatism, does not require a strictly feasible starting point, and thus accommodates equality constraints. The crucial advancement that allows us to invoke generic (possibly accelerated) subsolvers is a marginalization step: amounting to a conjugacy operation, this step effectively merges (exact) penalty and barrier into a smooth, full domain functional object. When the penalty exactness takes effect, the generated subproblems do not suffer the ill-conditioning typical of penalty methods, nor do they exhibit the nonsmoothness of exact penalty terms. We provide a theoretical characterization of the algorithm and its asymptotic properties, deriving convergence results for fully nonconvex problems. Illustrative examples and numerical simulations demonstrate the wide range of problems our theory and algorithm are able to cover.

Submitted 14 June, 2024; originally announced June 2024.

MSC Class: 49J52; 49J53; 65K05; 90C06; 90C30

arXiv:2404.09617 [pdf, other]

Safeguarding adaptive methods: global convergence of Barzilai-Borwein and other stepsize choices

Authors: Hongjia Ou, Andreas Themelis

Abstract: Leveraging on recent advancements on adaptive methods for convex minimization problems, this paper provides a linesearch-free proximal gradient framework for globalizing the convergence of popular stepsize choices such as Barzilai-Borwein and one-dimensional Anderson acceleration. This framework can cope with problems in which the gradient of the differentiable function is merely locally Hölder continuous. Our analysis not only encompasses but also refines existing results upon which it builds. The theory is corroborated by numerical evidence that showcases the synergetic interplay between fast stepsize selections and adaptive methods.

Submitted 13 May, 2024; v1 submitted 15 April, 2024; originally announced April 2024.

MSC Class: 65K05; 90C06; 90C25; 90C30; 90C53

arXiv:2402.06271 [pdf, other]

Adaptive proximal gradient methods are universal without approximation

Authors: Konstantinos A. Oikonomidis, Emanuel Laude, Puya Latafat, Andreas Themelis, Panagiotis Patrinos

Abstract: We show that adaptive proximal gradient methods for convex problems are not restricted to traditional Lipschitzian assumptions. Our analysis reveals that a class of linesearch-free methods is still convergent under mere local Hölder gradient continuity, covering in particular continuously differentiable semi-algebraic functions. To mitigate the lack of local Lipschitz continuity, popular approaches revolve around $\varepsilon$-oracles and/or linesearch procedures. In contrast, we exploit plain Hölder inequalities not entailing any approximation, all while retaining the linesearch-free nature of adaptive schemes. Furthermore, we prove full sequence convergence without prior knowledge of local Hölder constants nor of the order of Hölder continuity. Numerical experiments make comparisons with baseline methods on diverse tasks from machine learning covering both the locally and the globally Hölder setting.

Submitted 5 July, 2024; v1 submitted 9 February, 2024; originally announced February 2024.

MSC Class: 65K05; 90C06; 90C25; 90C30; 90C47

arXiv:2311.18431 [pdf, other]

On the convergence of adaptive first order methods: proximal gradient and alternating minimization algorithms

Authors: Puya Latafat, Andreas Themelis, Panagiotis Patrinos

Abstract: Building upon recent works on linesearch-free adaptive proximal gradient methods, this paper proposes adaPG$^{q,r}$, a framework that unifies and extends existing results by providing larger stepsize policies and improved lower bounds. Different choices of the parameters $q$ and $r$ are discussed and the efficacy of the resulting methods is demonstrated through numerical simulations. In an attempt to better understand the underlying theory, its convergence is established in a more general setting that allows for time-varying parameters. Finally, an adaptive alternating minimization algorithm is presented by exploring the dual setting. This algorithm not only incorporates additional adaptivity, but also expands its applicability beyond standard strongly convex settings.

Submitted 15 May, 2024; v1 submitted 30 November, 2023; originally announced November 2023.

MSC Class: 65K05; 90C06; 90C25; 90C30; 90C47

Journal ref: Proceedings of the 6th Annual Learning for Dynamics & Control Conference, PMLR 242:197-208, 2024

arXiv:2305.03559 [pdf, other]

On the convergence of proximal gradient methods for convex simple bilevel optimization

Authors: Puya Latafat, Andreas Themelis, Silvia Villa, Panagiotis Patrinos

Abstract: This paper studies proximal gradient iterations for solving simple bilevel optimization problems where both the upper and the lower level cost functions are split as the sum of differentiable and (possibly nonsmooth) proximable functions. We develop a novel convergence recipe for iteration varying stepsizes that relies on Barzilai-Borwein type local estimates for the differentiable terms. Leveraging the convergence recipe, under global Lipschitz gradient continuity, we establish convergence for a nonadaptive stepsize sequence, without requiring any strong convexity or linesearch. In the locally Lipschitz differentiable setting, we develop an adaptive linesearch method that introduces a systematic adaptive scheme enabling large and nonmonotonic stepsize sequences while being insensitive to the choice of hyperparameters and initialization. Numerical simulations are provided showcasing favorable convergence speed of our methods.

Submitted 2 March, 2024; v1 submitted 5 May, 2023; originally announced May 2023.

MSC Class: 65K05; 90C06; 90C25; 90C30

arXiv:2301.04431 [pdf, other]

Adaptive proximal algorithms for convex optimization under local Lipschitz continuity of the gradient

Authors: Puya Latafat, Andreas Themelis, Lorenzo Stella, Panagiotis Patrinos

Abstract: Backtracking linesearch is the de facto approach for minimizing continuously differentiable functions with locally Lipschitz gradient. In recent years, it has been shown that in the convex setting it is possible to avoid linesearch altogether, and to allow the stepsize to adapt based on a local smoothness estimate without any backtracks or evaluations of the function value. In this work we propose an adaptive proximal gradient method, adaPG, that uses novel estimates of the local smoothness modulus which leads to less conservative stepsize updates and that can additionally cope with nonsmooth terms. This idea is extended to the primal-dual setting where an adaptive three-term primal-dual algorithm, adaPD, is proposed which can be viewed as an extension of the PDHG method. Moreover, in this setting the "essentially" fully adaptive variant adaPD$^+$ is proposed that avoids evaluating the linear operator norm by invoking a backtracking procedure, that, remarkably, does not require extra gradient evaluations. Numerical simulations demonstrate the effectiveness of the proposed algorithms compared to the state of the art.

Submitted 13 March, 2024; v1 submitted 11 January, 2023; originally announced January 2023.

MSC Class: 65K05; 90C06; 90C25; 90C30; 90C47

arXiv:2301.00931 [pdf, other]

doi 10.1109/TEMPR.2023.3289582

Optimal Grid Layouts for Hybrid Offshore Assets in the North Sea under Different Market Designs

Authors: Stephen Hardy, Andreas Themelis, Kaoru Yamamoto, Hakan Ergun, Dirk Van Hertem

Abstract: This work examines the Generation and Transmission Expansion (GATE) planning problem of offshore grids under different market clearing mechanisms: a Home Market Design (HMD), a zonal cleared Offshore Bidding Zone (zOBZ) and a nodal cleared Offshore Bidding Zone (nOBZ). It aims at answering two questions. 1) Is knowing the market structure a priori necessary for effective generation and transmission expansion planning? 2) Which market mechanism results in the highest overall social welfare? To this end a multi-period, stochastic GATE planning formulation is developed for both nodal and zonal market designs. The approach considers the costs and benefits among stake-holders of Hybrid Offshore Assets (HOA) as well as gross consumer surplus (GCS). The methodology is demonstrated on a North Sea test grid based on projects from the European Network of Transmission System Operators' (ENTSO-E) Ten Year Network Development Plan (TYNDP). An upper bound on potential social welfare in zonal market designs is calculated and it is concluded that from a generation and transmission perspective, planning under the assumption of an nOBZ results in the best risk adjusted return.

Submitted 2 January, 2023; originally announced January 2023.

Journal ref: IEEE Transactions on Energy Markets, Policy and Regulation, vol. 1, no. 4, pp. 468-479, Dec. 2023

arXiv:2212.04391 [pdf, other]

doi 10.1016/j.ifacol.2023.10.1254

Gauss-Newton meets PANOC: A fast and globally convergent algorithm for nonlinear optimal control

Authors: Pieter Pas, Andreas Themelis, Panagiotis Patrinos

Abstract: PANOC is an algorithm for nonconvex optimization that has recently gained popularity in real-time control applications due to its fast, global convergence. The present work proposes a variant of PANOC that makes use of Gauss-Newton directions to accelerate the method. Furthermore, we show that when applied to optimal control problems, the computation of this Gauss-Newton step can be cast as a linear quadratic regulator (LQR) problem, allowing for an efficient solution through the Riccati recursion. Finally, we demonstrate that the proposed algorithm is more than twice as fast as the traditional L-BFGS variant of PANOC when applied to an optimal control benchmark problem, and that the performance scales favorably with increasing horizon length.

Submitted 8 December, 2022; originally announced December 2022.

Comments: Submitted to the 2023 IFAC World Congress, Yokohama

MSC Class: 65K05; 49M37; 90C30

Journal ref: IFAC-PapersOnLine 56(2):4852-4857 (2023)

arXiv:2212.01504 [pdf, ps, other]

doi 10.1007/s10957-024-02383-9

A mirror inertial forward-reflected-backward splitting: Global convergence and linesearch extension beyond convexity and Lipschitz smoothness

Authors: Ziyuan Wang, Andreas Themelis, Hongjia Ou, Xianfu Wang

Abstract: This work investigates a Bregman and inertial extension of the forward-reflected-backward algorithm [Y. Malitsky and M. Tam, SIAM J. Optim., 30 (2020), pp. 1451--1472] applied to structured nonconvex minimization problems under relative smoothness. To this end, the proposed algorithm hinges on two key features: taking inertial steps in the dual space, and allowing for possibly negative inertial values. Our analysis begins with studying an associated envelope function that takes inertial terms into account through a novel product space formulation. Such construction substantially differs from similar objects in the literature and could offer new insights for extensions of splitting algorithms. Global convergence and rates are obtained by appealing to the generalized concave Kurdyka-Lojasiewicz (KL) property, which allows us to describe a sharp upper bound on the total length of iterates. Finally, a linesearch extension is given to enhance the proposed method.

Submitted 2 December, 2022; originally announced December 2022.

MSC Class: 90C26; 49J52; 49J53

Journal ref: J Optim Theory Appl (2024)

arXiv:2208.00799 [pdf, other]

doi 10.5802/ojmo.30

An interior proximal gradient method for nonconvex optimization

Authors: Alberto De Marchi, Andreas Themelis

Abstract: We consider structured minimization problems subject to smooth inequality constraints and present a flexible algorithm that combines interior point (IP) and proximal gradient schemes. While traditional IP methods cannot cope with nonsmooth objective functions and proximal algorithms cannot handle complicated constraints, their combined usage is shown to successfully compensate the respective shortcomings. We provide a theoretical characterization of the algorithm and its asymptotic properties, deriving convergence results for fully nonconvex problems, thus bridging the gap with previous works that successfully addressed the convex case. Our interior proximal gradient algorithm benefits from warm starting, generates strictly feasible iterates with decreasing objective value, and returns after finitely many iterations a primal-dual pair approximately satisfying suitable optimality conditions. As a byproduct of our analysis of proximal gradient iterations we demonstrate that a slight refinement of traditional backtracking techniques waives the need for upper bounding the stepsize sequence, as required in existing results for the nonconvex setting.

Submitted 28 January, 2024; v1 submitted 1 August, 2022; originally announced August 2022.

MSC Class: 49J52; 65K05; 90C30

Journal ref: Open Journal of Mathematical Optimization, Volume 5 (2024), article no. 3

arXiv:2207.08195 [pdf, other]

doi 10.1007/s10589-023-00550-8

SPIRAL: A superlinearly convergent incremental proximal algorithm for nonconvex finite sum minimization

Authors: Pourya Behmandpoor, Puya Latafat, Andreas Themelis, Marc Moonen, Panagiotis Patrinos

Abstract: We introduce SPIRAL, a SuPerlinearly convergent Incremental pRoximal ALgorithm, for solving nonconvex regularized finite sum problems under a relative smoothness assumption. Each iteration of SPIRAL consists of an inner and an outer loop. It combines incremental gradient updates with a linesearch that has the remarkable property of never being triggered asymptotically, leading to superlinear convergence under mild assumptions at the limit point. Simulation results with L-BFGS directions on different convex, nonconvex, and non-Lipschitz differentiable problems show that our algorithm, as well as its adaptive variant, are competitive to the state of the art.

Submitted 15 January, 2024; v1 submitted 17 July, 2022; originally announced July 2022.

MSC Class: 90C06; 90C25; 90C26; 49J52; 49J53; 90C53

Journal ref: Comput Optim Appl 88, 71-106 (2024)

arXiv:2202.11331 [pdf, other]

doi 10.1109/CCTA49430.2022.9966067

Flock navigation with dynamic hierarchy and subjective weights using nonlinear MPC

Authors: Aneek Nag, Shuo Huang, Andreas Themelis, Kaoru Yamamoto

Abstract: We propose a model predictive control (MPC) based approach to a flock control problem with obstacle avoidance capability in a leader-follower framework, utilizing the future trajectory prediction computed by each agent. We employ the traditional Reynolds' flocking rules (cohesion, separation, and alignment) as a basis, and tailor the model to fit a navigation (as opposed to formation) purpose. In particular, we introduce several concepts such as the credibility and the importance of the gathered information from neighbors, and dynamic trade-offs between references. They are based on the observations that near-future predictions are more reliable, agents closer to leaders are implicit carriers of more educated information, and the predominance of either cohesion or alignment is dictated by the distance between the agent and its neighbors. These features are incorporated in the MPC formulation, and their advantages are discussed through numerical simulations.

Submitted 17 June, 2022; v1 submitted 23 February, 2022; originally announced February 2022.

MSC Class: 93A13; 93A16; 93B45

Journal ref: 2022 IEEE Conference on Control Technology and Applications (CCTA), pp. 1135-1140

arXiv:2112.13000 [pdf, other]

doi 10.1007/s10957-022-02048-5

Proximal Gradient Algorithms under Local Lipschitz Gradient Continuity: A Convergence and Robustness Analysis of PANOC

Authors: Alberto De Marchi, Andreas Themelis

Abstract: Composite optimization offers a powerful modeling tool for a variety of applications and is often numerically solved by means of proximal gradient methods. In this paper, we consider fully nonconvex composite problems under only local Lipschitz gradient continuity for the smooth part of the objective function. We investigate an adaptive scheme for PANOC-type methods (Stella et al. in Proceedings of the IEEE 56th CDC, 1939--1944, 2017), namely accelerated linesearch algorithms requiring only the simple oracle of proximal gradient. While including the classical proximal gradient method, our theoretical results cover a broader class of algorithms and provide convergence guarantees for accelerated methods with possibly inexact computation of the proximal mapping. These findings have also significant practical impact, as they widen scope and performance of existing, and possibly future, general purpose optimization software that invoke PANOC as inner solver.

Submitted 18 April, 2022; v1 submitted 24 December, 2021; originally announced December 2021.

MSC Class: 49J52; 65K05; 90C30

Journal ref: J Optim Theory Appl 194, 771-794 (2022)

arXiv:2112.08886 [pdf, other]

doi 10.1137/21M1465913

Dualities for non-Euclidean smoothness and strong convexity under the light of generalized conjugacy

Authors: Emanuel Laude, Andreas Themelis, Panagiotis Patrinos

Abstract: Relative smoothness and strong convexity have recently gained considerable attention in optimization. These notions are generalizations of the classical Euclidean notions of smoothness and strong convexity that are known to be dual to each other. However, conjugate dualities for non-Euclidean relative smoothness and strong convexity remain an open problem as noted earlier by Lu, Freund and Nesterov [SIAM J. Optim., 28 (2018), pp. 333-354]. In this paper we address this question by introducing the notions of anisotropic strong convexity and smoothness as the respective dual counterparts. The dualities are developed under the light of generalized conjugacy which leads us to embed the anticipated dual notions within the superclasses of certain upper and lower envelopes. In contrast to the Euclidean case these inclusions are proper in general as showcased by means of counterexamples.

Submitted 23 January, 2023; v1 submitted 16 December, 2021; originally announced December 2021.

Journal ref: SIAM J Optim 33(4):2721-2749 (2023)

arXiv:2103.14343 [pdf, other]

doi 10.1109/CDC45484.2021.9682842

Neural Network Training as an Optimal Control Problem: An Augmented Lagrangian Approach

Authors: Brecht Evens, Puya Latafat, Andreas Themelis, Johan Suykens, Panagiotis Patrinos

Abstract: Training of neural networks amounts to nonconvex optimization problems that are typically solved by using backpropagation and (variants of) stochastic gradient descent. In this work we propose an alternative approach by viewing the training task as a nonlinear optimal control problem. Under this lens, backpropagation amounts to the sequential approach (single shooting) to optimal control, where the state variables have been eliminated. It is well known that single shooting may lead to ill conditioning, and for this reason the simultaneous approach (multiple shooting) is typically preferred. Motivated by this hypothesis, an augmented Lagrangian algorithm is developed that only requires an approximate solution to the Lagrangian subproblems up to a user-defined accuracy. By applying this framework to the training of neural networks, it is shown that the inner Lagrangian subproblems are amenable to be solved using Gauss-Newton iterations. To fully exploit the structure of neural networks, the resulting linear least squares problems are addressed by employing an approach based on forward dynamic programming. Finally, the effectiveness of our method is showcased on regression datasets.

Submitted 6 May, 2021; v1 submitted 26 March, 2021; originally announced March 2021.

Comments: 8 pages; typos corrected

MSC Class: 49L20; 49M15; 49M37; 68T07; 90C06; 90C26; 90C30

Journal ref: 60th IEEE Conference on Decision and Control (CDC 2021)

arXiv:2103.08533 [pdf, other]

doi 10.23919/EUSIPCO54536.2021.9616167

Lasry-Lions Envelopes and Nonconvex Optimization: A Homotopy Approach

Authors: Miguel Simões, Andreas Themelis, Panagiotis Patrinos

Abstract: In large-scale optimization, the presence of nonsmooth and nonconvex terms in a given problem typically makes it hard to solve. A popular approach to address nonsmooth terms in convex optimization is to approximate them with their respective Moreau envelopes. In this work, we study the use of Lasry-Lions double envelopes to approximate nonsmooth terms that are also not convex. These envelopes are an extension of the Moreau ones but exhibit an additional smoothness property that makes them amenable to fast optimization algorithms. Lasry-Lions envelopes can also be seen as an "intermediate" between a given function and its convex envelope, and we make use of this property to develop a method that builds a sequence of approximate subproblems that are easier to solve than the original problem. We discuss convergence properties of this method when used to address composite minimization problems; additionally, based on a number of experiments, we discuss settings where it may be more useful than classical alternatives in two domains: signal decoding and spectral unmixing.

Submitted 22 June, 2021; v1 submitted 15 March, 2021; originally announced March 2021.

Comments: 29th Eur. Signal Process. Conf. (EUSIPCO 2021), accepted. 5 pages, 2 figures, 2 tables

Journal ref: Eur Sig Proc Conf (EUSIPCO), 2021, pp 2089-2093

arXiv:2102.10312 [pdf, other]

doi 10.1137/21M140376X

Bregman Finito/MISO for nonconvex regularized finite sum minimization without Lipschitz gradient continuity

Authors: Puya Latafat, Andreas Themelis, Masoud Ahookhosh, Panagiotis Patrinos

Abstract: We introduce two algorithms for nonconvex regularized finite sum minimization, where typical Lipschitz differentiability assumptions are relaxed to the notion of relative smoothness. The first one is a Bregman extension of Finito/MISO, studied for fully nonconvex problems when the sampling is randomized, or under convexity of the nonsmooth term when it is essentially cyclic. The second algorithm is a low-memory variant, in the spirit of SVRG and SARAH, that also allows for fully nonconvex formulations. Our analysis is made remarkably simple by employing a Bregman Moreau envelope as Lyapunov function. In the randomized case, linear convergence is established when the cost function is strongly convex, yet with no convexity requirements on the individual functions in the sum. For the essentially cyclic and low-memory variants, global and linear convergence results are established when the cost function satisfies the Kurdyka-Łojasiewicz property.

Submitted 2 March, 2022; v1 submitted 20 February, 2021; originally announced February 2021.

MSC Class: 90C06; 90C25; 90C26; 49J52; 49J53

Journal ref: SIAM J Optim 32(3):2230-2262 (2022)

arXiv:2010.02653 [pdf, other]

doi 10.1007/s12532-022-00218-0

QPALM: A Proximal Augmented Lagrangian Method for Nonconvex Quadratic Programs

Authors: Ben Hermans, Andreas Themelis, Panagiotis Patrinos

Abstract: We propose QPALM, a nonconvex quadratic programming (QP) solver based on the proximal augmented Lagrangian method. This method solves a sequence of inner subproblems which can be enforced to be strongly convex and which therefore admit a unique solution. The resulting steps are shown to be equivalent to inexact proximal point iterations on the extended-real-valued cost function, which allows for a fairly simple analysis where convergence to a stationary point at an $R$-linear rate is shown. The QPALM algorithm solves the subproblems iteratively using semismooth Newton directions and an exact linesearch. The former can be computed efficiently in most iterations by making use of suitable factorization update routines, while the latter requires the zero of a monotone, one-dimensional, piecewise affine function. QPALM is implemented in open-source C code, with tailored linear algebra routines for the factorization in a self-written package LADEL. The resulting implementation is shown to be extremely robust in numerical simulations, solving all of the Maros-Meszaros problems and finding a stationary point for most of the nonconvex QPs in the Cutest test set. Furthermore, it is shown to be competitive against state-of-the-art convex QP solvers in typical QPs arising from application domains such as portfolio optimization and model predictive control. As such, QPALM strikes a unique balance between solving both easy and hard problems efficiently.

Submitted 2 September, 2021; v1 submitted 6 October, 2020; originally announced October 2020.

MSC Class: 90C05; 90C20; 90C26; 49J53; 49M15

Journal ref: Math. Prog. Comp. 14, 497-541 (2022)

arXiv:2005.10230 [pdf, other]

doi 10.1007/s10589-022-00366-y

Douglas-Rachford splitting and ADMM for nonconvex optimization: Accelerated and Newton-type linesearch algorithms

Authors: Andreas Themelis, Lorenzo Stella, Panagiotis Patrinos

Abstract: Although the performance of popular optimization algorithms such as Douglas-Rachford splitting (DRS) and the ADMM is satisfactory in small and well-scaled problems, ill conditioning and problem size pose a severe obstacle to their reliable employment. Expanding on recent convergence results for DRS and ADMM applied to nonconvex problems, we propose two linesearch algorithms to enhance and robustify these methods by means of quasi-Newton directions. The proposed algorithms are suited for nonconvex problems, require the same black-box oracle of DRS and ADMM, and maintain their (subsequential) convergence properties. Numerical evidence shows that the employment of L-BFGS in the proposed framework greatly improves convergence of DRS and ADMM, making them robust to ill conditioning. Under regularity and nondegeneracy assumptions at the limit point, superlinear convergence is shown when quasi-Newton Broyden directions are adopted.

Submitted 3 November, 2021; v1 submitted 20 May, 2020; originally announced May 2020.

MSC Class: 90C06; 90C25; 90C26; 49J52; 49J53

Journal ref: Comput Optim Appl 82, 395-440 (2022)

arXiv:2004.00083 [pdf, other]

doi 10.1109/CDC42340.2020.9304514

A new envelope function for nonsmooth DC optimization

Authors: Andreas Themelis, Ben Hermans, Panagiotis Patrinos

Abstract: Difference-of-convex (DC) optimization problems are shown to be equivalent to the minimization of a Lipschitz-differentiable "envelope". A gradient method on this surrogate function yields a novel (sub)gradient-free proximal algorithm which is inherently parallelizable and can handle fully nonsmooth formulations. Newton-type methods such as L-BFGS are directly applicable with a classical linesearch. Our analysis reveals a deep kinship between the novel DC envelope and the forward-backward envelope, the former being a smooth and convexity-preserving nonlinear reparametrization of the latter.

Submitted 31 March, 2020; originally announced April 2020.

MSC Class: 90C26; 90C53; 90C06

Journal ref: 59th IEEE Conference on Decision and Control (CDC), 2020, pp. 4697-4702

arXiv:2003.03502 [pdf, ps, other]

doi 10.23919/Eusipco47968.2020.9287549

A quadratically convergent proximal algorithm for nonnegative tensor decomposition

Authors: Nico Vervliet, Andreas Themelis, Panagiotis Patrinos, Lieven De Lathauwer

Abstract: The decomposition of tensors into simple rank-1 terms is key in a variety of applications in signal processing, data analysis and machine learning. While this canonical polyadic decomposition (CPD) is unique under mild conditions, including prior knowledge such as nonnegativity can facilitate interpretation of the components. Inspired by the effectiveness and efficiency of Gauss-Newton (GN) for unconstrained CPD, we derive a proximal, semismooth GN type algorithm for nonnegative tensor factorization. If the algorithm converges to the global optimum, we show that $Q$-quadratic convergence can be obtained in the exact case. Global convergence is achieved via backtracking on the forward-backward envelope function. The $Q$-quadratic convergence is verified experimentally, and we illustrate that using the GN step significantly reduces number of (expensive) gradient computations compared to proximal gradient descent.

Submitted 6 March, 2020; originally announced March 2020.

MSC Class: 15A69; 49J52; 90C26; 90C53

Journal ref: 28th European Signal Processing Conference (EUSIPCO), 2021, pp. 1020-1024

arXiv:1911.02934 [pdf, ps, other]

doi 10.1109/CDC40024.2019.9030211

QPALM: A Newton-type Proximal Augmented Lagrangian Method for Quadratic Programs

Authors: Ben Hermans, Andreas Themelis, Panagiotis Patrinos

Abstract: We present a proximal augmented Lagrangian based solver for general convex quadratic programs (QPs), relying on semismooth Newton iterations with exact line search to solve the inner subproblems. The exact line search reduces in this case to finding the zero of a one-dimensional monotone, piecewise affine function and can be carried out very efficiently. Our algorithm requires the solution of a linear system at every iteration, but as the matrix to be factorized depends on the active constraints, efficient sparse factorization updates can be employed like in active-set methods. Both primal and dual residuals can be enforced down to strict tolerances and otherwise infeasibility can be detected from intermediate iterates. A C implementation of the proposed algorithm is tested and benchmarked against other state-of-the-art QP solvers for a large variety of problem data and shown to compare favorably against these solvers.

Submitted 7 November, 2019; originally announced November 2019.

MSC Class: 49M15; 49M37; 90C05; 90C06; 90C20; 90C25; 90C46

Journal ref: 2019 IEEE 58th Conference on Decision and Control (CDC), Nice, France, 2019, pp. 4325-4330
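
The abstract of this entry states that the exact line search reduces to finding the zero of a one-dimensional monotone, piecewise affine function. The following is a minimal sketch of that kind of root-finding, not QPALM's actual routine. It assumes the function has the illustrative form phi(t) = a*t + b + sum_i c_i * max(0, t - k_i) with a > 0 and c_i >= 0; the names `zero_of_piecewise_affine`, `phi` and the `(k, c)` input format are hypothetical.

```python
def zero_of_piecewise_affine(a, b, kinks):
    """Return the unique zero of phi(t) = a*t + b + sum(c * max(0, t - k)).

    Assumption (not taken from the paper): a > 0 and every c >= 0, so phi is
    continuous and strictly increasing. `kinks` is a list of (k, c) pairs.
    """
    slope, intercept = a, b
    # Walk the affine pieces from left to right, one breakpoint at a time.
    for k, c in sorted(kinks):
        candidate = -intercept / slope
        if candidate <= k:
            # The zero of the current piece lies inside the piece.
            return candidate
        # Otherwise phi(k) < 0: move to the next piece and update its line.
        slope += c
        intercept -= c * k
    return -intercept / slope


def phi(t, a, b, kinks):
    """Evaluate the piecewise affine function directly (used for checking)."""
    return a * t + b + sum(c * max(0.0, t - k) for k, c in kinks)
```

After sorting the breakpoints, the search costs one pass over them, which is why a line search of this type can be exact rather than backtracking.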

arXiv:1906.10053 [pdf, ps, other]

doi 10.1007/s10107-020-01599-7

Block-coordinate and incremental aggregated proximal gradient methods for nonsmooth nonconvex problems

Authors: Puya Latafat, Andreas Themelis, Panagiotis Patrinos

Abstract: This paper analyzes block-coordinate proximal gradient methods for minimizing the sum of a separable smooth function and a (nonseparable) nonsmooth function, both of which are allowed to be nonconvex. The main tool in our analysis is the forward-backward envelope (FBE), which serves as a particularly suitable continuous and real-valued Lyapunov function. Global and linear convergence results are established when the cost function satisfies the Kurdyka-Łojasiewicz property without imposing convexity requirements on the smooth function. Two prominent special cases of the investigated setting are regularized finite sum minimization and the sharing problem; in particular, an immediate byproduct of our analysis leads to novel convergence results and rates for the popular Finito/MISO algorithm in the nonsmooth and nonconvex setting with very general sampling strategies.

Submitted 26 November, 2019; v1 submitted 24 June, 2019; originally announced June 2019.

MSC Class: 90C06; 90C25; 90C26; 49J52; 49J53

Journal ref: Math. Program. 193, 195-224 (2022)

arXiv:1905.11904 [pdf, ps, other]

doi 10.1137/19M1264783

A Bregman forward-backward linesearch algorithm for nonconvex composite optimization: superlinear convergence to nonisolated local minima

Authors: Masoud Ahookhosh, Andreas Themelis, Panagiotis Patrinos

Abstract: We introduce Bella, a locally superlinearly convergent Bregman forward backward splitting method for minimizing the sum of two nonconvex functions, one of which satisfying a relative smoothness condition and the other one possibly nonsmooth. A key tool of our methodology is the Bregman forward-backward envelope (BFBE), an exact and continuous penalty function with favorable first- and second-order properties, and enjoying a nonlinear error bound when the objective function satisfies a Lojasiewicz-type property. The proposed algorithm is of linesearch type over the BFBE along candidate update directions, and converges subsequentially to stationary points, globally under a KL condition, and owing to the given nonlinear error bound can attain superlinear convergence rates even when the limit point is a nonisolated minimum, provided the directions are suitably selected.

Submitted 1 May, 2020; v1 submitted 28 May, 2019; originally announced May 2019.

MSC Class: 90C06; 90C25; 90C26; 49J52; 49J53

Journal ref: SIAM J Optim 31(1):653-685 (2021)

arXiv:1904.10546 [pdf, other]

doi 10.23919/ECC.2018.8550253

Embedded nonlinear model predictive control for obstacle avoidance using PANOC

Authors: Ajay Sathya, Pantelis Sopasakis, Ruben Van Parys, Andreas Themelis, Goele Pipeleers, Panagiotis Patrinos

Abstract: We employ the proximal averaged Newton-type method for optimal control (PANOC) to solve obstacle avoidance problems in real time. We introduce a novel modeling framework for obstacle avoidance which allows us to easily account for generic, possibly nonconvex, obstacles involving polytopes, ellipsoids, semialgebraic sets and generic sets described by a set of nonlinear inequalities. PANOC is particularly well-suited for embedded applications as it involves simple steps, its implementation comes with a low memory footprint and its fast convergence meets the tight runtime requirements of fast dynamical systems one encounters in modern mechatronics and robotics. The proposed obstacle avoidance scheme is tested on a lab-scale autonomous vehicle.

Submitted 23 April, 2019; originally announced April 2019.

Journal ref: European Control Conference (ECC'18), pp.1523-1528, Cyprus, 2018

arXiv:1811.02935 [pdf, other]

doi 10.1007/978-3-030-25939-6_15

On the acceleration of forward-backward splitting via an inexact Newton method

Authors: Andreas Themelis, Masoud Ahookhosh, Panagiotis Patrinos

Abstract: We propose a Forward-Backward Truncated-Newton method (FBTN) for minimizing the sum of two convex functions, one of which smooth. Unlike other proximal Newton methods, our approach does not involve the employment of variable metrics, but is rather based on a reformulation of the original problem as the unconstrained minimization of a continuously differentiable function, the forward-backward envelope (FBE). We introduce a generalized Hessian for the FBE that symmetrizes the generalized Jacobian of the nonlinear system of equations representing the optimality conditions for the problem. This enables the employment of conjugate gradient method (CG) for efficiently solving the resulting (regularized) linear systems, which can be done inexactly. The employment of CG prevents the computation of full (generalized) Jacobians, as it requires only (generalized) directional derivatives. The resulting algorithm is globally (subsequentially) convergent, $Q$-linearly under an error bound condition, and up to $Q$-superlinearly and $Q$-quadratically under regularity assumptions at the possibly non-isolated limit point.

Submitted 7 November, 2018; originally announced November 2018.

MSC Class: 49J52; 49M15; 90C06; 90C25; 90C30

Journal ref: Splitting Algorithms, Modern Operator Theory, and Applications. Springer, Cham (2019)

arXiv:1803.05256 [pdf, other]

doi 10.1109/TAC.2018.2872203

Newton-type Alternating Minimization Algorithm for Convex Optimization

Authors: Lorenzo Stella, Andreas Themelis, Panagiotis Patrinos

Abstract: We propose NAMA (Newton-type Alternating Minimization Algorithm) for solving structured nonsmooth convex optimization problems where the sum of two functions is to be minimized, one being strongly convex and the other composed with a linear mapping. The proposed algorithm is a line-search method over a continuous, real-valued, exact penalty function for the corresponding dual problem, which is computed by evaluating the augmented Lagrangian at the primal points obtained by alternating minimizations. As a consequence, NAMA relies on exactly the same computations as the classical alternating minimization algorithm (AMA), also known as the dual proximal gradient method. Under standard assumptions the proposed algorithm possesses strong convergence properties, while under mild additional assumptions the asymptotic convergence is superlinear, provided that the search directions are chosen according to quasi-Newton formulas. Due to its simplicity, the proposed method is well suited for embedded applications and large-scale problems. Experiments show that using limited-memory directions in NAMA greatly improves the convergence speed over AMA and its accelerated variant.

Submitted 14 March, 2018; originally announced March 2018.

MSC Class: 90C06; 90C25; 90C53; 47N10; 49J52; 49J53

Journal ref: IEEE Transactions on Automatic Control, vol. 64, no. 2, pp. 697-711, Feb. 2019

arXiv:1709.06487 [pdf, other]

A Simple and Efficient Algorithm for Nonlinear Model Predictive Control

Authors: Lorenzo Stella, Andreas Themelis, Pantelis Sopasakis, Panagiotis Patrinos

Abstract: We present PANOC, a new algorithm for solving optimal control problems arising in nonlinear model predictive control (NMPC). A usual approach to this type of problems is sequential quadratic programming (SQP), which requires the solution of a quadratic program at every iteration and, consequently, inner iterative procedures. As a result, when the problem is ill-conditioned or the prediction horizon is large, each outer iteration becomes computationally very expensive. We propose a line-search algorithm that combines forward-backward iterations (FB) and Newton-type steps over the recently introduced forward-backward envelope (FBE), a continuous, real-valued, exact merit function for the original problem. The curvature information of Newton-type methods enables asymptotic superlinear rates under mild assumptions at the limit point, and the proposed algorithm is based on very simple operations: access to first-order information of the cost and dynamics and low-cost direct linear algebra. No inner iterative procedure nor Hessian evaluation is required, making our approach computationally simpler than SQP methods. The low-memory requirements and simple implementation make our method particularly suited for embedded NMPC applications.

Submitted 19 September, 2017; originally announced September 2017.

arXiv:1709.05747 [pdf, other]

doi 10.1137/18M1163993

Douglas-Rachford splitting and ADMM for nonconvex optimization: tight convergence results

Authors: Andreas Themelis, Panagiotis Patrinos

Abstract: Although originally designed and analyzed for convex problems, the alternating direction method of multipliers (ADMM) and its close relatives, Douglas-Rachford splitting (DRS) and Peaceman-Rachford splitting (PRS), have been observed to perform remarkably well when applied to certain classes of structured nonconvex optimization problems. However, partial global convergence results in the nonconvex setting have only recently emerged. In this paper we show how the Douglas-Rachford envelope (DRE), introduced in 2014, can be employed to unify and considerably simplify the theory for devising global convergence guarantees for ADMM, DRS and PRS applied to nonconvex problems under less restrictive conditions, larger prox-stepsizes and over-relaxation parameters than previously known. In fact, our bounds are tight whenever the over-relaxation parameter ranges in $(0,2]$. The analysis of ADMM uses a universal primal equivalence with DRS that generalizes the known duality of the algorithms.

Submitted 9 November, 2018; v1 submitted 17 September, 2017; originally announced September 2017.

MSC Class: 90C06; 90C25; 90C26; 90C53; 49J52; 49J53

Journal ref: SIAM J. Optim., 2018 30:1, 149-181
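
For readers unfamiliar with the scheme this entry analyzes, here is a minimal sketch of a relaxed Douglas-Rachford iteration on a toy problem. It is only an illustration under stated assumptions: the toy problem is convex and one-dimensional (the paper's setting is nonconvex), the update convention `s + relaxation * (v - u)` is the common one in which `relaxation=1` gives DRS and `relaxation=2` gives PRS, and all function names are hypothetical.

```python
def douglas_rachford(prox_f, prox_g, s, relaxation=1.0, iterations=200):
    """Run relaxed Douglas-Rachford splitting and return the final u-iterate.

    prox_f and prox_g are the proximal mappings of the two terms, already
    evaluated at a fixed stepsize chosen by the caller.
    """
    for _ in range(iterations):
        u = prox_f(s)                  # proximal step on the first term
        v = prox_g(2.0 * u - s)        # proximal step at the reflected point
        s = s + relaxation * (v - u)   # relaxed fixed-point update
    return prox_f(s)


# Toy problem (an assumption for illustration):
# minimize 0.5 * (x - 3)^2 subject to 0 <= x <= 1, whose solution is x = 1.
GAMMA = 1.0
TARGET = 3.0


def prox_quadratic(s):
    """Proximal mapping of f(x) = 0.5 * (x - TARGET)^2 with stepsize GAMMA."""
    return (s + GAMMA * TARGET) / (1.0 + GAMMA)


def project_unit_interval(s):
    """Proximal mapping of the indicator of [0, 1], i.e. the projection."""
    return min(max(s, 0.0), 1.0)
```

Calling `douglas_rachford(prox_quadratic, project_unit_interval, 0.0)` drives the iterate to the constrained minimizer of the toy problem.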

arXiv:1609.06955 [pdf, other]

doi 10.1109/TAC.2019.2906393

SuperMann: a superlinearly convergent algorithm for finding fixed points of nonexpansive operators

Authors: Andreas Themelis, Panagiotis Patrinos

Abstract: Operator splitting techniques have recently gained popularity in convex optimization problems arising in various control fields. Being fixed-point iterations of nonexpansive operators, such methods suffer many well known downsides, which include high sensitivity to ill conditioning and parameter selection, and consequent low accuracy and robustness. As universal solution we propose SuperMann, a Newton-type algorithm for finding fixed points of nonexpansive operators. It generalizes the classical Krasnosel'skii-Mann scheme, enjoys its favorable global convergence properties and requires exactly the same oracle. It is based on a novel separating hyperplane projection tailored for nonexpansive mappings which makes it possible to include steps along any direction. In particular, when the directions satisfy a Dennis-Moré condition we show that SuperMann converges superlinearly under mild assumptions, which, surprisingly, do not entail nonsingularity of the Jacobian at the solution but merely metric subregularity. As a result, SuperMann enhances and robustifies all operator splitting schemes for structured convex optimization, overcoming their well known sensitivity to ill conditioning.

Submitted 14 March, 2018; v1 submitted 22 September, 2016; originally announced September 2016.

MSC Class: 47H09; 90C25; 90C53; 65K15

Journal ref: IEEE Transactions on Automatic Control, vol. 64, no. 12, pp. 4875-4890, Dec. 2019
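
The classical Krasnosel'skii-Mann scheme that this entry generalizes averages the current point with its image under the operator. A minimal sketch of that baseline iteration follows (not of SuperMann itself); the quarter-turn rotation used as the nonexpansive operator and all function names are assumptions chosen for illustration.

```python
def krasnoselskii_mann(T, x, relaxation=0.5, iterations=100):
    """Iterate x <- (1 - relaxation) * x + relaxation * T(x).

    T is assumed nonexpansive and relaxation is assumed to lie in (0, 1).
    """
    for _ in range(iterations):
        Tx = T(x)
        x = [(1.0 - relaxation) * xi + relaxation * ti for xi, ti in zip(x, Tx)]
    return x


def rotate_quarter_turn(x):
    """Rotate a point of the plane by 90 degrees; the only fixed point is 0."""
    return [-x[1], x[0]]
```

The rotation is an isometry, so plain repeated application `x <- T(x)` cycles with period four and never converges, whereas the averaged iteration contracts toward the fixed point at the origin.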

arXiv:1606.06256 [pdf, other]

doi 10.1137/16M1080240

Forward-backward envelope for the sum of two nonconvex functions: Further properties and nonmonotone line-search algorithms

Authors: Andreas Themelis, Lorenzo Stella, Panagiotis Patrinos

Abstract: We propose ZeroFPR, a nonmonotone linesearch algorithm for minimizing the sum of two nonconvex functions, one of which is smooth and the other possibly nonsmooth. ZeroFPR is the first algorithm that, despite being fit for fully nonconvex problems and requiring only the black-box oracle of forward-backward splitting (FBS) --- namely evaluations of the gradient of the smooth term and of the proximity operator of the nonsmooth one --- achieves superlinear convergence rates under mild assumptions at the limit point when the linesearch directions satisfy a Dennis-Moré condition, and we show that this is the case for quasi-Newton directions. Our approach is based on the forward-backward envelope (FBE), an exact and strictly continuous penalty function for the original cost. Extending previous results we show that, despite being nonsmooth for fully nonconvex problems, the FBE still enjoys favorable first- and second-order properties which are key for the convergence results of ZeroFPR. Our theoretical results are backed up by promising numerical simulations. On large-scale problems, by computing linesearch directions using limited-memory quasi-Newton updates our algorithm greatly outperforms FBS and its accelerated variant (AFBS).

Submitted 23 May, 2017; v1 submitted 20 June, 2016; originally announced June 2016.

MSC Class: 90C06; 90C25; 90C26; 90C53; 49J52; 49J53

Journal ref: SIAM J. Optim., 2018 28:3, 2274-2303

arXiv:1604.08096 [pdf, other]

doi 10.1007/s10589-017-9912-y

Forward-backward quasi-Newton methods for nonsmooth optimization problems

Authors: Lorenzo Stella, Andreas Themelis, Panagiotis Patrinos

Abstract: The forward-backward splitting method (FBS) for minimizing a nonsmooth composite function can be interpreted as a (variable-metric) gradient method over a continuously differentiable function which we call forward-backward envelope (FBE). This allows to extend algorithms for smooth unconstrained optimization and apply them to nonsmooth (possibly constrained) problems. Since the FBE and its gradient can be computed by simply evaluating forward-backward steps, the resulting methods rely on the very same black-box oracle as FBS. We propose an algorithmic scheme that enjoys the same global convergence properties of FBS when the problem is convex, or when the objective function possesses the Kurdyka-Łojasiewicz property at its critical points. Moreover, when using quasi-Newton directions the proposed method achieves superlinear convergence provided that usual second-order sufficiency conditions on the FBE hold at the limit point of the generated sequence. Such conditions translate into milder requirements on the original function involving generalized second-order differentiability. We show that BFGS fits our framework and that the limited-memory variant L-BFGS is well suited for large-scale problems, greatly outperforming FBS or its accelerated version in practice. The analysis of superlinear convergence is based on an extension of the Dennis and Moré theorem for the proposed algorithmic scheme.

Submitted 2 May, 2016; v1 submitted 27 April, 2016; originally announced April 2016.

Journal ref: Comput Optim Appl (2017) 67: 443
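
The abstract of this entry notes that the FBE can be computed by simply evaluating forward-backward steps. A minimal sketch of that computation on a toy problem is given below, using the standard expression FBE(x) = f(x) + <grad f(x), z - x> + g(z) + ||z - x||^2 / (2*gamma), where z is the forward-backward step from x. The toy cost f(x) = 0.5 * ||x - b||^2 plus g(x) = lam * ||x||_1 and all function names are assumptions chosen for illustration, not taken from the paper.

```python
def soft_threshold(v, t):
    """Proximal mapping of t * |.| applied to a scalar."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def cost(x, b, lam):
    """Toy composite cost: 0.5 * ||x - b||^2 + lam * ||x||_1."""
    smooth = 0.5 * sum((xi - bi) ** 2 for xi, bi in zip(x, b))
    return smooth + lam * sum(abs(xi) for xi in x)


def forward_backward_envelope(x, b, lam, gamma):
    """Return (FBE value at x, forward-backward step z) for the toy cost.

    gamma is assumed to lie in (0, 1), since the smooth term has a
    1-Lipschitz gradient.
    """
    grad = [xi - bi for xi, bi in zip(x, b)]
    # Forward (gradient) step followed by backward (proximal) step.
    z = [soft_threshold(xi - gamma * gi, gamma * lam) for xi, gi in zip(x, grad)]
    f_x = 0.5 * sum(gi ** 2 for gi in grad)
    linear = sum(gi * (zi - xi) for gi, zi, xi in zip(grad, z, x))
    g_z = lam * sum(abs(zi) for zi in z)
    quadratic = sum((zi - xi) ** 2 for zi, xi in zip(z, x)) / (2.0 * gamma)
    return f_x + linear + g_z + quadratic, z
```

For this stepsize range the envelope value at any point is sandwiched between the cost at the forward-backward step and the cost at the point itself, and the two coincide at a minimizer, which is what makes the envelope usable as a merit function for line search.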

Showing 1–32 of 32 results for author: Themelis, A