Showing 1–8 of 8 results for author: Roosta-Khorasani, F

Searching in archive cs.
  1. arXiv:1802.09113  [pdf, other]

    cs.LG cs.DC math.OC

    GPU Accelerated Sub-Sampled Newton's Method

    Authors: Sudhir B. Kylasa, Farbod Roosta-Khorasani, Michael W. Mahoney, Ananth Grama

    Abstract: First order methods, which solely rely on gradient information, are commonly used in diverse machine learning (ML) and data analysis (DA) applications. This is attributed to the simplicity of their implementations, as well as low per-iteration computational/storage costs. However, they suffer from significant disadvantages; most notably, their performance degrades with increasing problem ill-condi…

    Submitted 2 March, 2018; v1 submitted 25 February, 2018; originally announced February 2018.

  2. arXiv:1711.09090  [pdf, other]

    cs.LG stat.ML

    Invariance of Weight Distributions in Rectified MLPs

    Authors: Russell Tsuchida, Farbod Roosta-Khorasani, Marcus Gallagher

    Abstract: An interesting approach to analyzing neural networks that has received renewed attention is to examine the equivalent kernel of the neural network. This is based on the fact that a fully connected feedforward network with one hidden layer, a certain weight distribution, an activation function, and an infinite number of neurons can be viewed as a mapping into a Hilbert space. We derive the equivale…

    Submitted 31 May, 2018; v1 submitted 24 November, 2017; originally announced November 2017.

    Comments: ICML 2018

  3. arXiv:1709.03528  [pdf, other]

    cs.LG cs.DC math.OC stat.ML

    GIANT: Globally Improved Approximate Newton Method for Distributed Optimization

    Authors: Shusen Wang, Farbod Roosta-Khorasani, Peng Xu, Michael W. Mahoney

    Abstract: For distributed computing environment, we consider the empirical risk minimization problem and propose a distributed and communication-efficient Newton-type optimization method. At every iteration, each worker locally finds an Approximate NewTon (ANT) direction, which is sent to the main driver. The main driver, then, averages all the ANT directions received from workers to form a Globally Im…

    Submitted 11 September, 2018; v1 submitted 11 September, 2017; originally announced September 2017.

    Comments: Fixed some typos. Improved writing

  4. arXiv:1708.07827  [pdf, other]

    math.OC cs.LG math.NA stat.ML

    Second-Order Optimization for Non-Convex Machine Learning: An Empirical Study

    Authors: Peng Xu, Farbod Roosta-Khorasani, Michael W. Mahoney

    Abstract: While first-order optimization methods such as stochastic gradient descent (SGD) are popular in machine learning (ML), they come with well-known deficiencies, including relatively-slow convergence, sensitivity to the settings of hyper-parameters such as learning rate, stagnation at high training errors, and difficulty in escaping flat regions and saddle points. These issues are particularly acute…

    Submitted 15 February, 2018; v1 submitted 24 August, 2017; originally announced August 2017.

    Comments: 21 pages, 11 figures. Restructure the paper and add experiments

  5. arXiv:1605.08108  [pdf, other]

    math.OC cs.LG stat.ML

    FLAG n' FLARE: Fast Linearly-Coupled Adaptive Gradient Methods

    Authors: Xiang Cheng, Farbod Roosta-Khorasani, Stefan Palombo, Peter L. Bartlett, Michael W. Mahoney

    Abstract: We consider first order gradient methods for effectively optimizing a composite objective in the form of a sum of smooth and, potentially, non-smooth functions. We present accelerated and adaptive gradient methods, called FLAG and FLARE, which can offer the best of both worlds. They can achieve the optimal convergence rate by attaining the optimal first-order oracle complexity for smooth convex op…

    Submitted 11 November, 2017; v1 submitted 25 May, 2016; originally announced May 2016.

  6. arXiv:1604.07515  [pdf, other]

    cs.DC

    Parallel Local Graph Clustering

    Authors: Julian Shun, Farbod Roosta-Khorasani, Kimon Fountoulakis, Michael W. Mahoney

    Abstract: Graph clustering has many important applications in computing, but due to growing sizes of graphs, even traditionally fast clustering methods such as spectral partitioning can be computationally expensive for real-world graphs of interest. Motivated partly by this, so-called local algorithms for graph clustering have received significant interest due to the fact that they can find good clusters in…

    Submitted 8 June, 2019; v1 submitted 26 April, 2016; originally announced April 2016.

    Comments: Fixed typo in Figure 5

  7. arXiv:1601.04738  [pdf, ps, other]

    math.OC cs.LG stat.ML

    Sub-Sampled Newton Methods II: Local Convergence Rates

    Authors: Farbod Roosta-Khorasani, Michael W. Mahoney

    Abstract: Many data-fitting applications require the solution of an optimization problem involving a sum of large number of functions of high dimensional parameter. Here, we consider the problem of minimizing a sum of $n$ functions over a convex constraint set $\mathcal{X} \subseteq \mathbb{R}^{p}$ where both $n$ and $p$ are large. In such problems, sub-sampling as a way to reduce $n$ can offer great amount…

    Submitted 25 February, 2016; v1 submitted 18 January, 2016; originally announced January 2016.

  8. arXiv:1601.04737  [pdf, other]

    math.OC cs.LG stat.ML

    Sub-Sampled Newton Methods I: Globally Convergent Algorithms

    Authors: Farbod Roosta-Khorasani, Michael W. Mahoney

    Abstract: Large scale optimization problems are ubiquitous in machine learning and data analysis and there is a plethora of algorithms for solving such problems. Many of these algorithms employ sub-sampling, as a way to either speed up the computations and/or to implicitly implement a form of statistical regularization. In this paper, we consider second-order iterative optimization algorithms and we provide…

    Submitted 25 February, 2016; v1 submitted 18 January, 2016; originally announced January 2016.