-
Derivative-Free Optimization of Noisy Functions via Quasi-Newton Methods
Authors:
Albert S. Berahas,
Richard H. Byrd,
Jorge Nocedal
Abstract:
This paper presents a finite difference quasi-Newton method for the minimization of noisy functions. The method takes advantage of the scalability and power of BFGS updating, and employs an adaptive procedure for choosing the differencing interval $h$ based on the noise estimation techniques of Hamming (2012) and Moré and Wild (2011). This noise estimation procedure and the selection of $h$ are inexpensive but not always accurate, and to prevent failures the algorithm incorporates a recovery mechanism that takes appropriate action in the case when the line search procedure is unable to produce an acceptable point. A novel convergence analysis is presented that considers the effect of a noisy line search procedure. Numerical experiments comparing the method to a function interpolating trust region method are presented.
Submitted 8 January, 2019; v1 submitted 27 March, 2018;
originally announced March 2018.
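The role of the differencing interval $h$ in the abstract above can be illustrated with a minimal sketch. It assumes a known noise level `eps_f` and a bound `curvature_bound` on the second derivative, both hypothetical inputs that the paper instead estimates adaptively. The step formula below is the textbook minimizer of the forward-difference error bound $hL/2 + 2\epsilon_f/h$, not necessarily the rule used by the authors.

```python
import math


def noise_aware_step(eps_f, curvature_bound):
    """Return the h that minimizes the bound h*L/2 + 2*eps_f/h.

    eps_f: assumed bound on the noise in each function evaluation.
    curvature_bound: assumed bound L on |f''| along the coordinate.
    """
    return 2.0 * math.sqrt(eps_f / curvature_bound)


def forward_difference_gradient(f, x, h):
    """Approximate the gradient of f at x with forward differences."""
    fx = f(x)
    grad = []
    for i in range(len(x)):
        shifted = list(x)
        shifted[i] += h  # perturb one coordinate at a time
        grad.append((f(shifted) - fx) / h)
    return grad


def example_objective(x):
    # Hypothetical smooth test function (noise-free here for clarity).
    return x[0] ** 2 + 3.0 * x[1]


h = noise_aware_step(eps_f=1e-8, curvature_bound=2.0)
gradient = forward_difference_gradient(example_objective, [1.0, 2.0], h)
```

A larger noise level pushes `h` up so that the noise term `2*eps_f/h` does not dominate, while a larger curvature bound pushes it down to limit truncation error.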
-
A Stochastic Quasi-Newton Method for Large-Scale Optimization
Authors:
R. H. Byrd,
S. L. Hansen,
J. Nocedal,
Y. Singer
Abstract:
The question of how to incorporate curvature information in stochastic approximation methods is challenging. The direct application of classical quasi-Newton updating techniques for deterministic optimization leads to noisy curvature estimates that have harmful effects on the robustness of the iteration. In this paper, we propose a stochastic quasi-Newton method that is efficient, robust and scalable. It employs the classical BFGS update formula in its limited memory form, and is based on the observation that it is beneficial to collect curvature information pointwise, and at regular intervals, through (sub-sampled) Hessian-vector products. This technique differs from the classical approach that would compute differences of gradients, and where controlling the quality of the curvature estimates can be difficult. We present numerical results on problems arising in machine learning that suggest that the proposed method shows much promise.
Submitted 18 February, 2015; v1 submitted 27 January, 2014;
originally announced January 2014.
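The idea in the abstract above, forming curvature pairs from Hessian-vector products rather than gradient differences and feeding them to limited-memory BFGS, can be sketched as follows. This is an illustration under simplifying assumptions: the Hessian-vector product is exact for a small hypothetical quadratic (the paper uses sub-sampled products), and the helper names are invented for this example.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def hessian_vector_pair(hess_vec, s):
    """Build a curvature pair (s, y) with y = H s.

    hess_vec is an assumed callable returning the Hessian-vector product;
    a gradient difference would be used in the classical approach.
    """
    return s, hess_vec(s)


def two_loop_direction(grad, pairs):
    """Standard L-BFGS two-loop recursion: approximates H^{-1} grad.

    pairs is a list of (s, y) tuples ordered from oldest to newest.
    """
    q = list(grad)
    history = []
    for s, y in reversed(pairs):  # newest to oldest
        rho = 1.0 / dot(y, s)
        alpha = rho * dot(s, q)
        q = [qi - alpha * yi for qi, yi in zip(q, y)]
        history.append((alpha, rho))
    if pairs:
        s, y = pairs[-1]
        gamma = dot(s, y) / dot(y, y)  # initial Hessian scaling
        q = [gamma * qi for qi in q]
    for (s, y), (alpha, rho) in zip(pairs, reversed(history)):  # oldest to newest
        beta = rho * dot(y, q)
        q = [qi + (alpha - beta) * si for qi, si in zip(q, s)]
    return q


def diag_hess_vec(v):
    # Hypothetical quadratic with Hessian diag(2, 4).
    return [2.0 * v[0], 4.0 * v[1]]


pairs = [
    hessian_vector_pair(diag_hess_vec, [1.0, 0.0]),
    hessian_vector_pair(diag_hess_vec, [0.0, 1.0]),
]
direction = two_loop_direction([2.0, 4.0], pairs)
```

On this toy quadratic the two stored pairs span the whole space, so the recursion reproduces the exact Newton direction; with noisy or sub-sampled products it would only approximate it.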
-
An Inexact Successive Quadratic Approximation Method for Convex L-1 Regularized Optimization
Authors:
Richard H. Byrd,
Jorge Nocedal,
Figen Oztoprak
Abstract:
We study a Newton-like method for the minimization of an objective function that is the sum of a smooth convex function and an l-1 regularization term. This method, which is sometimes referred to in the literature as a proximal Newton method, computes a step by minimizing a piecewise quadratic model of the objective function. In order to make this approach efficient in practice, it is imperative to perform this inner minimization inexactly. In this paper, we give inexactness conditions that guarantee global convergence and that can be used to control the local rate of convergence of the iteration. Our inexactness conditions are based on a semi-smooth function that represents a (continuous) measure of the optimality conditions of the problem, and that embodies the soft-thresholding iteration. We give careful consideration to the algorithm employed for the inner minimization, and report numerical results on two test sets originating in machine learning.
Submitted 13 September, 2013;
originally announced September 2013.
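The soft-thresholding iteration mentioned in the abstract above can be illustrated with a short sketch. The residual below is one common optimality measure for problems of the form $f(x) + \lambda \|x\|_1$ and is an assumption for illustration; the paper's exact definition and scaling may differ. The function names, the step length `tau`, and the test problem are hypothetical.

```python
import math


def soft_threshold(v, threshold):
    """Shrink each component of v toward zero by threshold."""
    return [math.copysign(max(abs(vi) - threshold, 0.0), vi) for vi in v]


def optimality_residual(x, grad, lam, tau=1.0):
    """Return x - S(x - tau*grad, tau*lam), where S is soft-thresholding.

    For convex smooth f, this vector is zero exactly at minimizers of
    f(x) + lam * ||x||_1. tau is an assumed positive step length.
    """
    shifted = [xi - tau * gi for xi, gi in zip(x, grad)]
    shrunk = soft_threshold(shifted, tau * lam)
    return [xi - si for xi, si in zip(x, shrunk)]


def example_gradient(x):
    # Gradient of the hypothetical f(x) = 0.5*(x0 - 3)^2 + 0.5*(x1 - 0.5)^2.
    return [x[0] - 3.0, x[1] - 0.5]


minimizer = [2.0, 0.0]  # minimizer of f(x) + 1.0 * ||x||_1 for this f
residual_at_minimizer = optimality_residual(minimizer, example_gradient(minimizer), lam=1.0)
residual_at_origin = optimality_residual([0.0, 0.0], example_gradient([0.0, 0.0]), lam=1.0)
```

Because the residual is continuous and vanishes only at solutions, its norm can serve as a stopping test or as a yardstick for how accurately an inner subproblem has been solved.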