
Showing 1–13 of 13 results for author: Lacotte, J

  1. arXiv:2303.03382  [pdf, other]

    cs.LG stat.ML

    Globally Optimal Training of Neural Networks with Threshold Activation Functions

    Authors: Tolga Ergen, Halil Ibrahim Gulluk, Jonathan Lacotte, Mert Pilanci

    Abstract: Threshold activation functions are highly preferable in neural networks due to their efficiency in hardware implementations. Moreover, their mode of operation is more interpretable and resembles that of biological neurons. However, traditional gradient based algorithms such as Gradient Descent cannot be used to train the parameters of neural networks with threshold activations since the activation…

    Submitted 6 March, 2023; originally announced March 2023.

    Comments: Accepted to ICLR 2023

  2. arXiv:2107.07480  [pdf, other]

    math.OC cs.DS cs.LG stat.ML

    Newton-LESS: Sparsification without Trade-offs for the Sketched Newton Update

    Authors: Michał Dereziński, Jonathan Lacotte, Mert Pilanci, Michael W. Mahoney

    Abstract: In second-order optimization, a potential bottleneck can be computing the Hessian matrix of the optimized function at every iteration. Randomized sketching has emerged as a powerful technique for constructing estimates of the Hessian which can be used to perform approximate Newton steps. This involves multiplication by a random sketching matrix, which introduces a trade-off between the computation…

    Submitted 15 July, 2021; originally announced July 2021.

  3. arXiv:2105.07291  [pdf, other]

    math.OC cs.LG

    Adaptive Newton Sketch: Linear-time Optimization with Quadratic Convergence and Effective Hessian Dimensionality

    Authors: Jonathan Lacotte, Yifei Wang, Mert Pilanci

    Abstract: We propose a randomized algorithm with quadratic convergence rate for convex optimization problems with a self-concordant, composite, strongly convex objective function. Our method is based on performing an approximate Newton step using a random projection of the Hessian. Our first contribution is to show that, at each iteration, the embedding dimension (or sketch size) can be as small as the effe…

    Submitted 15 May, 2021; originally announced May 2021.

  4. arXiv:2104.14101  [pdf, other]

    cs.LG

    Fast Convex Quadratic Optimization Solvers with Adaptive Sketching-based Preconditioners

    Authors: Jonathan Lacotte, Mert Pilanci

    Abstract: We consider least-squares problems with quadratic regularization and propose novel sketching-based iterative methods with an adaptive sketch size. The sketch size can be as small as the effective dimension of the data matrix to guarantee linear convergence. However, a major difficulty in choosing the sketch size in terms of the effective dimension lies in the fact that the latter is usually unknow…

    Submitted 29 April, 2021; originally announced April 2021.

  5. arXiv:2012.07054  [pdf, other]

    cs.IT cs.LG

    Adaptive and Oblivious Randomized Subspace Methods for High-Dimensional Optimization: Sharp Analysis and Lower Bounds

    Authors: Jonathan Lacotte, Mert Pilanci

    Abstract: We propose novel randomized optimization methods for high-dimensional convex problems based on restrictions of variables to random subspaces. We consider oblivious and data-adaptive subspaces and study their approximation properties via convex duality and Fenchel conjugates. A suitable adaptive subspace can be generated by sampling a correlated random matrix whose second order statistics mirror th…

    Submitted 13 December, 2020; originally announced December 2020.

  6. arXiv:2006.05900  [pdf, other]

    cs.LG stat.ML

    The Hidden Convex Optimization Landscape of Two-Layer ReLU Neural Networks: an Exact Characterization of the Optimal Solutions

    Authors: Yifei Wang, Jonathan Lacotte, Mert Pilanci

    Abstract: We prove that finding all globally optimal two-layer ReLU neural networks can be performed by solving a convex optimization program with cone constraints. Our analysis is novel, characterizes all optimal solutions, and does not leverage duality-based analysis which was recently used to lift neural network training into convex spaces. Given the set of solutions of our convex optimization program, w…

    Submitted 13 March, 2022; v1 submitted 10 June, 2020; originally announced June 2020.

  7. arXiv:2006.05874  [pdf, other]

    cs.LG stat.ML

    Effective Dimension Adaptive Sketching Methods for Faster Regularized Least-Squares Optimization

    Authors: Jonathan Lacotte, Mert Pilanci

    Abstract: We propose a new randomized algorithm for solving L2-regularized least-squares problems based on sketching. We consider two of the most popular random embeddings, namely, Gaussian embeddings and the Subsampled Randomized Hadamard Transform (SRHT). While current randomized solvers for least-squares optimization prescribe an embedding dimension at least greater than the data dimension, we show that…

    Submitted 23 October, 2020; v1 submitted 10 June, 2020; originally announced June 2020.

  8. arXiv:2002.09488  [pdf, other]

    math.OC cs.LG

    Optimal Randomized First-Order Methods for Least-Squares Problems

    Authors: Jonathan Lacotte, Mert Pilanci

    Abstract: We provide an exact analysis of a class of randomized algorithms for solving overdetermined least-squares problems. We consider first-order methods, where the gradients are pre-conditioned by an approximation of the Hessian, based on a subspace embedding of the data matrix. This class of algorithms encompasses several randomized methods among the fastest solvers for least-squares problems. We focu…

    Submitted 25 February, 2020; v1 submitted 21 February, 2020; originally announced February 2020.

    Comments: arXiv admin note: text overlap with arXiv:2002.00864

  9. arXiv:2002.00864  [pdf, other]

    math.OC cs.LG

    Optimal Iterative Sketching with the Subsampled Randomized Hadamard Transform

    Authors: Jonathan Lacotte, Sifan Liu, Edgar Dobriban, Mert Pilanci

    Abstract: Random projections or sketching are widely used in many algorithmic and learning contexts. Here we study the performance of iterative Hessian sketch for least-squares problems. By leveraging and extending recent results from random matrix theory on the limiting spectrum of matrices randomly projected with the subsampled randomized Hadamard transform, and truncated Haar matrices, we can study and c…

    Submitted 23 October, 2020; v1 submitted 3 February, 2020; originally announced February 2020.

  10. arXiv:1911.02675  [pdf, other]

    math.NA cs.CC math.OC

    Faster Least Squares Optimization

    Authors: Jonathan Lacotte, Mert Pilanci

    Abstract: We investigate iterative methods with randomized preconditioners for solving overdetermined least-squares problems, where the preconditioners are based on a random embedding of the data matrix. We consider two distinct approaches: the sketch is either computed once (fixed preconditioner), or, the random projection is refreshed at each iteration, i.e., sampled independently of previous ones (varyin…

    Submitted 13 April, 2021; v1 submitted 6 November, 2019; originally announced November 2019.

  11. arXiv:1906.11809  [pdf, other]

    math.OC cs.LG

    High-Dimensional Optimization in Adaptive Random Subspaces

    Authors: Jonathan Lacotte, Mert Pilanci, Marco Pavone

    Abstract: We propose a new randomized optimization method for high-dimensional problems which can be seen as a generalization of coordinate descent to random subspaces. We show that an adaptive sampling strategy for the random subspace significantly outperforms the oblivious sampling method, which is the common choice in the recent literature. The adaptive subspace can be efficiently generated by a correlat…

    Submitted 18 December, 2019; v1 submitted 27 June, 2019; originally announced June 2019.

  12. arXiv:1808.04468  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    Risk-Sensitive Generative Adversarial Imitation Learning

    Authors: Jonathan Lacotte, Mohammad Ghavamzadeh, Yinlam Chow, Marco Pavone

    Abstract: We study risk-sensitive imitation learning where the agent's goal is to perform at least as well as the expert in terms of a risk profile. We first formulate our risk-sensitive imitation learning setting. We consider the generative adversarial approach to imitation learning (GAIL) and derive an optimization problem for our formulation, which we call risk-sensitive GAIL (RS-GAIL). We then derive…

    Submitted 23 December, 2018; v1 submitted 13 August, 2018; originally announced August 2018.

  13. arXiv:1711.10055  [pdf, other]

    cs.AI cs.LG cs.RO

    Risk-sensitive Inverse Reinforcement Learning via Semi- and Non-Parametric Methods

    Authors: Sumeet Singh, Jonathan Lacotte, Anirudha Majumdar, Marco Pavone

    Abstract: The literature on Inverse Reinforcement Learning (IRL) typically assumes that humans take actions in order to minimize the expected value of a cost function, i.e., that humans are risk neutral. Yet, in practice, humans are often far from being risk neutral. To fill this gap, the objective of this paper is to devise a framework for risk-sensitive IRL in order to explicitly account for a human's ris…

    Submitted 22 March, 2018; v1 submitted 27 November, 2017; originally announced November 2017.

    Comments: Submitted to International Journal of Robotics Research; Revision 1: (i) Clarified minor technical points; (ii) Revised proof for Theorem 3 to hold under weaker assumptions; (iii) Added additional figures and expanded discussions to improve readability