Showing 1–21 of 21 results for author: Nocedal, J

  1. arXiv:2402.11920  [pdf, other]

    math.OC

    A Feasible Method for Constrained Derivative-Free Optimization

    Authors: Melody Qiming Xuan, Jorge Nocedal

    Abstract: This paper explores a method for solving constrained optimization problems when the derivatives of the objective function are unavailable, while the derivatives of the constraints are known. We allow the objective and constraint function to be nonconvex. The method constructs a quadratic model of the objective function via interpolation and computes a step by minimizing this model subject to the o…

    Submitted 19 February, 2024; originally announced February 2024.

  2. arXiv:2401.15007  [pdf, other]

    math.OC math.NA

    Noise-Tolerant Optimization Methods for the Solution of a Robust Design Problem

    Authors: Yuchen Lou, Shigeng Sun, Jorge Nocedal

    Abstract: The development of nonlinear optimization algorithms capable of performing reliably in the presence of noise has garnered considerable attention lately. This paper advocates for strategies to create noise-tolerant nonlinear optimization algorithms by adapting classical deterministic methods. These adaptations follow certain design guidelines described here, which make use of estimates of the noise…

    Submitted 26 January, 2024; originally announced January 2024.

    MSC Class: 90C30; 90C15; 93B51; 65K05

  3. arXiv:2201.00973  [pdf, other]

    math.OC

    A Trust Region Method for the Optimization of Noisy Functions

    Authors: Shigeng Sun, Jorge Nocedal

    Abstract: Classical trust region methods were designed to solve problems in which function and gradient information are exact. This paper considers the case when there are bounded errors (or noise) in the above computations and proposes a simple modification of the trust region method to cope with these errors. The new algorithm only requires information about the size of the errors in the function evaluati…

    Submitted 3 January, 2022; originally announced January 2022.

    MSC Class: 65K05; 68Q25; 65G99; 90C30

  4. arXiv:2110.06380  [pdf, other]

    math.OC

    Adaptive Finite-Difference Interval Estimation for Noisy Derivative-Free Optimization

    Authors: Hao-Jun Michael Shi, Yuchen Xie, Melody Qiming Xuan, Jorge Nocedal

    Abstract: A common approach for minimizing a smooth nonlinear function is to employ finite-difference approximations to the gradient. While this can be easily performed when no error is present within the function evaluations, when the function is noisy, the optimal choice requires information about the noise level and higher-order derivatives of the function, which is often unavailable. Given the noise lev…

    Submitted 22 March, 2022; v1 submitted 12 October, 2021; originally announced October 2021.

    Comments: 39 pages, 20 tables, 6 figures

  5. arXiv:2110.04355  [pdf, other]

    math.OC

    Constrained Optimization in the Presence of Noise

    Authors: Figen Oztoprak, Richard Byrd, Jorge Nocedal

    Abstract: The problem of interest is the minimization of a nonlinear function subject to nonlinear equality constraints using a sequential quadratic programming (SQP) method. The minimization must be performed while observing only noisy evaluations of the objective and constraint functions. In order to obtain stability, the classical SQP method is modified by relaxing the standard Armijo line search based o…

    Submitted 8 October, 2021; originally announced October 2021.

  6. arXiv:2102.09762  [pdf, other]

    math.OC

    On the Numerical Performance of Derivative-Free Optimization Methods Based on Finite-Difference Approximations

    Authors: Hao-Jun Michael Shi, Melody Qiming Xuan, Figen Oztoprak, Jorge Nocedal

    Abstract: The goal of this paper is to investigate an approach for derivative-free optimization that has not received sufficient attention in the literature and is yet one of the simplest to implement and parallelize. It consists of computing gradients of a smoothed approximation of the objective function (and constraints), and employing them within established codes. These gradient approximations are calcu…

    Submitted 19 February, 2021; originally announced February 2021.

    Comments: 82 pages, 38 tables, 29 figures

  7. arXiv:2012.15411  [pdf, other]

    math.OC stat.ML

    Constrained and Composite Optimization via Adaptive Sampling Methods

    Authors: Yuchen Xie, Raghu Bollapragada, Richard Byrd, Jorge Nocedal

    Abstract: The motivation for this paper stems from the desire to develop an adaptive sampling method for solving constrained optimization problems in which the objective function is stochastic and the constraints are deterministic. The method proposed in this paper is a proximal gradient method that can also be applied to the composite optimization problem min f(x) + h(x), where f is stochastic and h is c…

    Submitted 30 December, 2020; originally announced December 2020.

    Comments: 26 pages, 13 figures

  8. arXiv:2010.04352  [pdf, other]

    math.OC

    A Noise-Tolerant Quasi-Newton Algorithm for Unconstrained Optimization

    Authors: Hao-Jun Michael Shi, Yuchen Xie, Richard Byrd, Jorge Nocedal

    Abstract: This paper describes an extension of the BFGS and L-BFGS methods for the minimization of a nonlinear function subject to errors. This work is motivated by applications that contain computational noise, employ low-precision arithmetic, or are subject to statistical noise. The classical BFGS and L-BFGS methods can fail in such circumstances because the updating procedure can be corrupted and the lin…

    Submitted 8 September, 2021; v1 submitted 8 October, 2020; originally announced October 2020.

    Comments: 27 pages, 13 figures, 2 tables

  9. arXiv:1901.09063  [pdf, other]

    math.OC

    Analysis of the BFGS Method with Errors

    Authors: Yuchen Xie, Richard Byrd, Jorge Nocedal

    Abstract: The classical convergence analysis of quasi-Newton methods assumes that the function and gradients employed at each iteration are exact. In this paper, we consider the case when there are (bounded) errors in both computations and establish conditions under which a slight modification of the BFGS algorithm with an Armijo-Wolfe line search converges to a neighborhood of the solution that is determin…

    Submitted 25 January, 2019; originally announced January 2019.

  10. arXiv:1803.10173  [pdf, other]

    math.OC

    Derivative-Free Optimization of Noisy Functions via Quasi-Newton Methods

    Authors: Albert S. Berahas, Richard H. Byrd, Jorge Nocedal

    Abstract: This paper presents a finite difference quasi-Newton method for the minimization of noisy functions. The method takes advantage of the scalability and power of BFGS updating, and employs an adaptive procedure for choosing the differencing interval $h$ based on the noise estimation techniques of Hamming (2012) and Moré and Wild (2011). This noise estimation procedure and the selection of $h$ are in…

    Submitted 8 January, 2019; v1 submitted 27 March, 2018; originally announced March 2018.

    Comments: 26 pages, 9 figures

  11. arXiv:1802.05374  [pdf, other]

    math.OC cs.LG stat.ML

    A Progressive Batching L-BFGS Method for Machine Learning

    Authors: Raghu Bollapragada, Dheevatsa Mudigere, Jorge Nocedal, Hao-Jun Michael Shi, Ping Tak Peter Tang

    Abstract: The standard L-BFGS method relies on gradient approximations that are not dominated by noise, so that search directions are descent directions, the line search is reliable, and quasi-Newton updating yields useful quadratic models of the objective function. All of this appears to call for a full batch approach, but since small batch sizes give rise to faster algorithms with better generalization pr…

    Submitted 30 May, 2018; v1 submitted 14 February, 2018; originally announced February 2018.

    Comments: ICML 2018. 25 pages, 17 figures, 2 tables

  12. arXiv:1710.11258  [pdf, other]

    math.OC stat.ML

    Adaptive Sampling Strategies for Stochastic Optimization

    Authors: Raghu Bollapragada, Richard Byrd, Jorge Nocedal

    Abstract: In this paper, we propose a stochastic optimization method that adaptively controls the sample size used in the computation of gradient approximations. Unlike other variance reduction techniques that either require additional storage or the regular computation of full gradients, the proposed method reduces variance by increasing the sample size as needed. The decision to increase the sample size i…

    Submitted 30 October, 2017; originally announced October 2017.

    Comments: 32 pages

  13. arXiv:1705.06211  [pdf, other]

    math.OC cs.LG stat.ML

    An Investigation of Newton-Sketch and Subsampled Newton Methods

    Authors: Albert S. Berahas, Raghu Bollapragada, Jorge Nocedal

    Abstract: Sketching, a dimensionality reduction technique, has received much attention in the statistics community. In this paper, we study sketching in the context of Newton's method for solving finite-sum optimization problems in which the number of variables and data points are both large. We study two forms of sketching that perform dimensionality reduction in data space: Hessian subsampling and randomi…

    Submitted 30 May, 2019; v1 submitted 17 May, 2017; originally announced May 2017.

    Comments: 36 pages, 22 figures

  14. arXiv:1609.08502  [pdf, other]

    math.OC stat.ML

    Exact and Inexact Subsampled Newton Methods for Optimization

    Authors: Raghu Bollapragada, Richard Byrd, Jorge Nocedal

    Abstract: The paper studies the solution of stochastic optimization problems in which approximations to the gradient and Hessian are obtained through subsampling. We first consider Newton-like methods that employ these approximations and discuss how to coordinate the accuracy in the gradient and Hessian to yield a superlinear rate of convergence in expectation. The second part of the paper analyzes an inexa…

    Submitted 27 September, 2016; originally announced September 2016.

    Comments: 37 pages

  15. arXiv:1609.04836  [pdf, other]

    cs.LG math.OC

    On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima

    Authors: Nitish Shirish Keskar, Dheevatsa Mudigere, Jorge Nocedal, Mikhail Smelyanskiy, Ping Tak Peter Tang

    Abstract: The stochastic gradient descent (SGD) method and its variants are algorithms of choice for many Deep Learning tasks. These methods operate in a small-batch regime wherein a fraction of the training data, say $32$-$512$ data points, is sampled to compute an approximation to the gradient. It has been observed in practice that when using a larger batch there is a degradation in the quality of the mod…

    Submitted 9 February, 2017; v1 submitted 15 September, 2016; originally announced September 2016.

    Comments: Accepted as a conference paper at ICLR 2017

  16. arXiv:1606.04838  [pdf, other]

    stat.ML cs.LG math.OC

    Optimization Methods for Large-Scale Machine Learning

    Authors: Léon Bottou, Frank E. Curtis, Jorge Nocedal

    Abstract: This paper provides a review and commentary on the past, present, and future of numerical optimization algorithms in the context of machine learning applications. Through case studies on text classification and the training of deep neural networks, we discuss how optimization problems arise in machine learning and what makes them challenging. A major theme of our study is that large-scale machine…

    Submitted 8 February, 2018; v1 submitted 15 June, 2016; originally announced June 2016.

  17. arXiv:1605.06049  [pdf, other]

    math.OC cs.LG stat.ML

    A Multi-Batch L-BFGS Method for Machine Learning

    Authors: Albert S. Berahas, Jorge Nocedal, Martin Takáč

    Abstract: The question of how to parallelize the stochastic gradient descent (SGD) method has received much attention in the literature. In this paper, we focus instead on batch methods that use a sizeable fraction of the training set at each iteration to facilitate parallelism, and that employ second-order information. In order to improve the learning process, we follow a multi-batch approach in which the…

    Submitted 23 October, 2016; v1 submitted 19 May, 2016; originally announced May 2016.

    Comments: NIPS 2016. 31 pages, 22 figures

  18. arXiv:1505.04315  [pdf, ps, other]

    math.OC

    A Second-Order Method for Convex $\ell_1$-Regularized Optimization with Active Set Prediction

    Authors: Nitish Shirish Keskar, Jorge Nocedal, Figen Oztoprak, Andreas Waechter

    Abstract: We describe an active-set method for the minimization of an objective function $φ$ that is the sum of a smooth convex function and an $\ell_1$-regularization term. A distinctive feature of the method is the way in which active-set identification and second-order subspace minimization steps are integrated to combine the predictive power of the two approaches. At every iteration, the algorithm sel…

    Submitted 16 May, 2015; originally announced May 2015.

  19. arXiv:1412.1844  [pdf, ps, other]

    math.OC

    An Algorithm for Quadratic $\ell_1$-Regularized Optimization with a Flexible Active-Set Strategy

    Authors: Stefan Solntsev, Jorge Nocedal, Richard Byrd

    Abstract: We present an active-set method for minimizing an objective that is the sum of a convex quadratic and $\ell_1$ regularization term. Unlike two-phase methods that combine a first-order active set identification step and a subspace phase consisting of a cycle of conjugate gradient (CG) iterations, the method presented here has the flexibility of computing a first-order proximal gradient step…

    Submitted 4 December, 2014; originally announced December 2014.

  20. arXiv:1401.7020  [pdf, other]

    math.OC cs.LG stat.ML

    A Stochastic Quasi-Newton Method for Large-Scale Optimization

    Authors: R. H. Byrd, S. L. Hansen, J. Nocedal, Y. Singer

    Abstract: The question of how to incorporate curvature information in stochastic approximation methods is challenging. The direct application of classical quasi-Newton updating techniques for deterministic optimization leads to noisy curvature estimates that have harmful effects on the robustness of the iteration. In this paper, we propose a stochastic quasi-Newton method that is efficient, robust and scal…

    Submitted 18 February, 2015; v1 submitted 27 January, 2014; originally announced January 2014.

  21. arXiv:1309.3529  [pdf, ps, other]

    math.OC

    An Inexact Successive Quadratic Approximation Method for Convex L-1 Regularized Optimization

    Authors: Richard H. Byrd, Jorge Nocedal, Figen Oztoprak

    Abstract: We study a Newton-like method for the minimization of an objective function that is the sum of a smooth convex function and an l-1 regularization term. This method, which is sometimes referred to in the literature as a proximal Newton method, computes a step by minimizing a piecewise quadratic model of the objective function. In order to make this approach efficient in practice, it is imperative t…

    Submitted 13 September, 2013; originally announced September 2013.