-
Piecewise Polynomial Regression of Tame Functions via Integer Programming
Authors:
Gilles Bareilles,
Johannes Aspman,
Jiri Nemecek,
Jakub Marecek
Abstract:
Tame functions are a class of nonsmooth, nonconvex functions, which feature in a wide range of applications: functions encountered in the training of deep neural networks with all common activations, value functions of mixed-integer programs, or wave functions of small molecules. We consider approximating tame functions with piecewise polynomial functions. We bound the quality of approximation of a tame function by a piecewise polynomial function with a given number of segments on any full-dimensional cube. We also present the first mixed-integer programming formulation of piecewise polynomial regression. Together, these can be used to estimate tame functions. We demonstrate promising computational results.
Submitted 4 June, 2024; v1 submitted 22 November, 2023;
originally announced November 2023.
-
Hybrid Methods in Polynomial Optimisation
Authors:
Johannes Aspman,
Gilles Bareilles,
Vyacheslav Kungurtsev,
Jakub Marecek,
Martin Takáč
Abstract:
The Moment/Sum-of-squares hierarchy provides a way to compute the global minimizers of polynomial optimization problems (POP), at the cost of solving a sequence of increasingly large semidefinite programs (SDPs). We consider large-scale POPs, for which interior-point methods are no longer able to solve the resulting SDPs. We propose an algorithm that combines a first-order method for solving the SDP relaxation, and a second-order method on a non-convex problem obtained from the POP. The switch from the first to the second-order method is based on a quantitative criterion, whose satisfaction ensures that Newton's method converges quadratically from its first iteration. This criterion leverages the point-estimation theory of Smale and the active-set identification. We illustrate the methodology to obtain global minimizers of large-scale optimal power flow problems.
Submitted 12 September, 2023; v1 submitted 25 May, 2023;
originally announced May 2023.
-
Harnessing structure in composite nonsmooth minimization
Authors:
Gilles Bareilles,
Franck Iutzeler,
Jérôme Malick
Abstract:
We consider the problem of minimizing the composition of a nonsmooth function with a smooth mapping in the case where the proximity operator of the nonsmooth function can be explicitly computed. We first show that this proximity operator can provide the exact smooth substructure of minimizers, not only of the nonsmooth function, but also of the full composite function. We then exploit this proximal identification by proposing an algorithm which combines proximal steps with sequential quadratic programming steps. We show that our method locally identifies the optimal smooth substructure and then converges quadratically. We illustrate its behavior on two problems: the minimization of a maximum of quadratic functions and the minimization of the maximal eigenvalue of a parametrized matrix.
Submitted 20 March, 2023; v1 submitted 30 June, 2022;
originally announced June 2022.
-
Newton acceleration on manifolds identified by proximal-gradient methods
Authors:
Gilles Bareilles,
Franck Iutzeler,
Jérôme Malick
Abstract:
Proximal methods are known to identify the underlying substructure of nonsmooth optimization problems. Even more, in many interesting situations, the output of a proximity operator comes with its structure at no additional cost, and convergence is improved once it matches the structure of a minimizer. However, it is impossible in general to know whether the current structure is final or not; such highly valuable information has to be exploited adaptively. To do so, we place ourselves in the case where a proximal gradient method can identify manifolds of differentiability of the nonsmooth objective. Leveraging this manifold identification, we show that Riemannian Newton-like methods can be intertwined with the proximal gradient steps to drastically boost the convergence. We prove the superlinear convergence of the algorithm when solving some nondegenerate nonsmooth nonconvex optimization problems. We provide numerical illustrations on optimization problems regularized by the $\ell_1$-norm or the trace norm.
Submitted 25 May, 2022; v1 submitted 23 December, 2020;
originally announced December 2020.
-
Randomized Progressive Hedging methods for Multi-stage Stochastic Programming
Authors:
Gilles Bareilles,
Yassine Laguel,
Dmitry Grishchenko,
Franck Iutzeler,
Jérôme Malick
Abstract:
Progressive Hedging is a popular decomposition algorithm for solving multi-stage stochastic optimization problems. A computational bottleneck of this algorithm is that all scenario subproblems have to be solved at each iteration. In this paper, we introduce randomized versions of the Progressive Hedging algorithm able to produce new iterates as soon as a single scenario subproblem is solved. Building on the relation between Progressive Hedging and monotone operators, we leverage recent results on randomized fixed point methods to derive and analyze the proposed methods. Finally, we release the corresponding code as an easy-to-use Julia toolbox and report computational experiments showing the practical interest of randomized algorithms, notably in a parallel context. Throughout the paper, we pay special attention to presentation, stressing main ideas and avoiding extra technicalities, in order to make the randomized methods accessible to a broad audience in the Operations Research community.
Submitted 25 September, 2020;
originally announced September 2020.
-
On the Interplay between Acceleration and Identification for the Proximal Gradient algorithm
Authors:
Gilles Bareilles,
Franck Iutzeler
Abstract:
In this paper, we study the interplay between acceleration and structure identification for the proximal gradient algorithm. We report and analyze several cases where this interplay has negative effects on the algorithm behavior (iterates oscillation, loss of structure, etc.). We present a generic method that tames acceleration when structure identification may be at stake; it benefits from a convergence rate that matches that of the accelerated proximal gradient under some qualifying condition. We show empirically that the proposed method is much more stable in terms of subspace identification compared to the accelerated proximal gradient method while keeping a similar functional decrease.
Submitted 31 August, 2020; v1 submitted 19 September, 2019;
originally announced September 2019.