Showing 1–15 of 15 results for author: Bullins, B

Searching in archive math.
  1. arXiv:2305.15643  [pdf, other]

    cs.LG math.OC stat.ML

    Federated Composite Saddle Point Optimization

    Authors: Site Bai, Brian Bullins

    Abstract: Federated learning (FL) approaches for saddle point problems (SPP) have recently gained in popularity due to the critical role they play in machine learning (ML). Existing works mostly target smooth unconstrained objectives in Euclidean space, whereas ML problems often involve constraints or non-smooth regularization, which results in a need for composite optimization. Addressing these issues, we…

    Submitted 24 May, 2023; originally announced May 2023.

  2. arXiv:2304.08389  [pdf, other]

    math.OC cs.LG

    Beyond first-order methods for non-convex non-concave min-max optimization

    Authors: Abhijeet Vyas, Brian Bullins

    Abstract: We propose a study of structured non-convex non-concave min-max problems which goes beyond standard first-order approaches. Inspired by the tight understanding established in recent works [Adil et al., 2022, Lin and Jordan, 2022b], we develop a suite of higher-order methods which show the improvements attainable beyond the monotone and Minty condition settings. Specifically, we provide a new under…

    Submitted 17 April, 2023; originally announced April 2023.

  3. arXiv:2205.06167  [pdf, ps, other]

    math.OC cs.DS

    Optimal Methods for Higher-Order Smooth Monotone Variational Inequalities

    Authors: Deeksha Adil, Brian Bullins, Arun Jambulapati, Sushant Sachdeva

    Abstract: In this work, we present new simple and optimal algorithms for solving the variational inequality (VI) problem for $p^{th}$-order smooth, monotone operators -- a problem that generalizes convex optimization and saddle-point problems. Recent works (Bullins and Lai (2020), Lin and Jordan (2021), Jiang and Mokhtari (2022)) present methods that achieve a rate of $\tilde{O}(ε^{-2/(p+1)})$ for…

    Submitted 31 May, 2022; v1 submitted 12 May, 2022; originally announced May 2022.

    Comments: 21 Pages

  4. arXiv:2110.02954  [pdf, other]

    math.OC cs.LG stat.ML

    A Stochastic Newton Algorithm for Distributed Convex Optimization

    Authors: Brian Bullins, Kumar Kshitij Patel, Ohad Shamir, Nathan Srebro, Blake Woodworth

    Abstract: We propose and analyze a stochastic Newton algorithm for homogeneous distributed stochastic convex optimization, where each machine can calculate stochastic gradients of the same population objective, as well as stochastic Hessian-vector products (products of an independent unbiased estimator of the Hessian of the population objective with arbitrary vectors), with many such stochastic computations…

    Submitted 7 October, 2021; originally announced October 2021.

  5. arXiv:2107.02432  [pdf, ps, other]

    math.OC cs.DS

    Unifying Width-Reduced Methods for Quasi-Self-Concordant Optimization

    Authors: Deeksha Adil, Brian Bullins, Sushant Sachdeva

    Abstract: We provide several algorithms for constrained optimization of a large class of convex problems, including softmax, $\ell_p$ regression, and logistic regression. Central to our approach is the notion of width reduction, a technique which has proven immensely useful in the context of maximum flow [Christiano et al., STOC'11] and, more recently, $\ell_p$ regression [Adil et al., SODA'19], in terms of…

    Submitted 6 July, 2021; originally announced July 2021.

  6. arXiv:2102.01583  [pdf, other]

    cs.LG math.OC

    The Min-Max Complexity of Distributed Stochastic Convex Optimization with Intermittent Communication

    Authors: Blake Woodworth, Brian Bullins, Ohad Shamir, Nathan Srebro

    Abstract: We resolve the min-max complexity of distributed stochastic convex optimization (up to a log factor) in the intermittent communication setting, where $M$ machines work in parallel over the course of $R$ rounds of communication to optimize the objective, and during each round of communication, each machine may sequentially compute $K$ stochastic gradient estimates. We present a novel lower bound wi…

    Submitted 5 August, 2021; v1 submitted 2 February, 2021; originally announced February 2021.

    Comments: 48 pages

  7. arXiv:2007.04528  [pdf, ps, other]

    math.OC cs.LG stat.ML

    Higher-order methods for convex-concave min-max optimization and monotone variational inequalities

    Authors: Brian Bullins, Kevin A. Lai

    Abstract: We provide improved convergence rates for constrained convex-concave min-max problems and monotone variational inequalities with higher-order smoothness. In min-max settings where the $p^{th}$-order derivatives are Lipschitz continuous, we give an algorithm HigherOrderMirrorProx that achieves an iteration complexity of $O(1/T^{\frac{p+1}{2}})$ when given access to an oracle for finding a fixed poi…

    Submitted 8 July, 2020; originally announced July 2020.

  8. arXiv:2002.07839  [pdf, other]

    cs.LG math.OC stat.ML

    Is Local SGD Better than Minibatch SGD?

    Authors: Blake Woodworth, Kumar Kshitij Patel, Sebastian U. Stich, Zhen Dai, Brian Bullins, H. Brendan McMahan, Ohad Shamir, Nathan Srebro

    Abstract: We study local SGD (also known as parallel SGD and federated averaging), a natural and frequently used stochastic distributed optimization method. Its theoretical foundations are currently lacking and we highlight how all existing error guarantees in the convex setting are dominated by a simple baseline, minibatch SGD. (1) For quadratic objectives we prove that local SGD strictly dominates minibat…

    Submitted 20 July, 2020; v1 submitted 18 February, 2020; originally announced February 2020.

    Comments: 29 pages

  9. arXiv:1906.01621  [pdf, ps, other]

    math.OC cs.LG stat.ML

    Higher-Order Accelerated Methods for Faster Non-Smooth Optimization

    Authors: Brian Bullins, Richard Peng

    Abstract: We provide improved convergence rates for various \emph{non-smooth} optimization problems via higher-order accelerated methods. In the case of $\ell_\infty$ regression, we achieve an $O(ε^{-4/5})$ iteration complexity, breaking the $O(ε^{-1})$ barrier so far present for previous methods. We arrive at a similar rate for the problem of $\ell_1$-SVM, going beyond what is attainable by first-order me…

    Submitted 4 June, 2019; originally announced June 2019.

  10. arXiv:1902.08721  [pdf, ps, other]

    cs.LG eess.SY math.OC stat.ML

    Online Control with Adversarial Disturbances

    Authors: Naman Agarwal, Brian Bullins, Elad Hazan, Sham M. Kakade, Karan Singh

    Abstract: We study the control of a linear dynamical system with adversarial disturbances (as opposed to statistical noise). The objective we consider is one of regret: we desire an online control procedure that can do nearly as well as that of a procedure that has full knowledge of the disturbances in hindsight. Our main result is an efficient algorithm that provides nearly tight regret bounds for this pro…

    Submitted 22 February, 2019; originally announced February 2019.

  11. arXiv:1812.10349  [pdf, ps, other]

    math.OC

    Fast minimization of structured convex quartics

    Authors: Brian Bullins

    Abstract: We propose faster methods for unconstrained optimization of \emph{structured convex quartics}, which are convex functions of the form \begin{equation*} f(x) = c^\top x + x^\top \mathbf{G} x + \mathbf{T}[x,x,x] + \frac{1}{24} \mathopen\| \mathbf{A} x \mathclose\|_4^4 \end{equation*} for $c \in \mathbb{R}^d$, $\mathbf{G} \in \mathbb{R}^{d \times d}$,…

    Submitted 26 December, 2018; originally announced December 2018.

  12. arXiv:1806.02958  [pdf, other]

    cs.LG math.OC stat.ML

    Efficient Full-Matrix Adaptive Regularization

    Authors: Naman Agarwal, Brian Bullins, Xinyi Chen, Elad Hazan, Karan Singh, Cyril Zhang, Yi Zhang

    Abstract: Adaptive regularization methods pre-multiply a descent direction by a preconditioning matrix. Due to the large number of parameters of machine learning problems, full-matrix preconditioning methods are prohibitively expensive. We show how to modify full-matrix adaptive regularization in order to make it practical and effective. We also provide a novel theoretical analysis for adaptive regularizati…

    Submitted 17 November, 2020; v1 submitted 7 June, 2018; originally announced June 2018.

    Comments: Updated to ICML 2019 camera-ready version. Title of preprint was "The Case for Full-Matrix Adaptive Regularization"

  13. Adaptive regularization with cubics on manifolds

    Authors: Naman Agarwal, Nicolas Boumal, Brian Bullins, Coralia Cartis

    Abstract: Adaptive regularization with cubics (ARC) is an algorithm for unconstrained, non-convex optimization. Akin to the popular trust-region method, its iterations can be thought of as approximate, safe-guarded Newton steps. For cost functions with Lipschitz continuous Hessian, ARC has optimal iteration complexity, in the sense that it produces an iterate with gradient smaller than $\varepsilon$ in…

    Submitted 16 May, 2020; v1 submitted 31 May, 2018; originally announced June 2018.

    Comments: 48 pages, 3 figures

  14. arXiv:1611.01146  [pdf, other]

    math.OC cs.DS cs.NE stat.ML

    Finding Approximate Local Minima Faster than Gradient Descent

    Authors: Naman Agarwal, Zeyuan Allen-Zhu, Brian Bullins, Elad Hazan, Tengyu Ma

    Abstract: We design a non-convex second-order optimization algorithm that is guaranteed to return an approximate local minimum in time which scales linearly in the underlying dimension and the number of training examples. The time complexity of our algorithm to find an approximate local minimum is even faster than that of gradient descent to find a critical point. Our algorithm applies to a general class of…

    Submitted 24 April, 2017; v1 submitted 3 November, 2016; originally announced November 2016.

  15. arXiv:1305.2147  [pdf, ps, other]

    math.SP math.CO

    When the largest eigenvalue of the modularity and normalized modularity matrix is zero

    Authors: Marianna Bolla, Brian Bullins, Sorathan Chaturapruek, Shiwen Chen, Katalin Friedl

    Abstract: In July 2012, at the Conference on Applications of Graph Spectra in Computer Science, Barcelona, D. Stevanovic posed the following open problem: which graphs have the zero as the largest eigenvalue of their modularity matrix? The conjecture was that only the complete and complete multipartite graphs. They indeed have this property, but are they the only ones? In this paper, we will give an affirma…

    Submitted 9 May, 2013; originally announced May 2013.

    MSC Class: 05C35; 62H30