
Showing 1–13 of 13 results for author: Mania, H

Searching in archive cs.
  1. arXiv:2210.09206  [pdf, other]

    math.OC cs.LG

    Model Predictive Control via On-Policy Imitation Learning

    Authors: Kwangjun Ahn, Zakaria Mhammedi, Horia Mania, Zhang-Wei Hong, Ali Jadbabaie

    Abstract: In this paper, we leverage the rapid advances in imitation learning, a topic of intense recent focus in the Reinforcement Learning (RL) literature, to develop new sample complexity results and performance guarantees for data-driven Model Predictive Control (MPC) for constrained linear systems. In its simplest form, imitation learning is an approach that tries to learn an expert policy by querying…

    Submitted 17 October, 2022; originally announced October 2022.

    Comments: 26 pages

  2. arXiv:2012.15483  [pdf, other]

    cs.LG stat.ML

    Why do classifier accuracies show linear trends under distribution shift?

    Authors: Horia Mania, Suvrit Sra

    Abstract: Recent studies of generalization in deep learning have observed a puzzling trend: accuracies of models on one data distribution are approximately linear functions of the accuracies on another distribution. We explain this trend under an intuitive assumption on model similarity, which was verified empirically in prior work. More precisely, we assume the probability that two models agree in their pr…

    Submitted 22 February, 2021; v1 submitted 31 December, 2020; originally announced December 2020.

    Comments: 18 pages, 13 figures

  3. arXiv:2012.07348  [pdf, other]

    cs.LG cs.GT cs.MA stat.ML

    Bandit Learning in Decentralized Matching Markets

    Authors: Lydia T. Liu, Feng Ruan, Horia Mania, Michael I. Jordan

    Abstract: We study two-sided matching markets in which one side of the market (the players) does not have a priori knowledge about its preferences for the other side (the arms) and is required to learn its preferences from experience. Also, we assume the players have no direct means of communication. This model extends the standard stochastic multi-armed bandit framework to a decentralized multiple player s…

    Submitted 21 June, 2021; v1 submitted 14 December, 2020; originally announced December 2020.

    Comments: 34 pages

  4. arXiv:2006.10277  [pdf, ps, other]

    stat.ML cs.LG math.OC

    Active Learning for Nonlinear System Identification with Guarantees

    Authors: Horia Mania, Michael I. Jordan, Benjamin Recht

    Abstract: While the identification of nonlinear dynamical systems is a fundamental building block of model-based reinforcement learning and feedback control, its sample complexity is only understood for systems that either have discrete states and actions or for systems that can be identified from data generated by i.i.d. random inputs. Nonetheless, many interesting dynamical systems have continuous states…

    Submitted 18 June, 2020; originally announced June 2020.

    Comments: 29 pages

  5. arXiv:1906.05363  [pdf, other]

    cs.LG cs.GT cs.MA stat.ML

    Competing Bandits in Matching Markets

    Authors: Lydia T. Liu, Horia Mania, Michael I. Jordan

    Abstract: Stable matching, a classical model for two-sided markets, has long been studied with little consideration for how each side's preferences are learned. With the advent of massive online markets powered by data-driven matching platforms, it has become necessary to better understand the interplay between learning and market objectives. We propose a statistical learning model in which one side of the…

    Submitted 12 July, 2020; v1 submitted 12 June, 2019; originally announced June 2019.

    Comments: 15 pages, 3 figures. A version appears in the Proceedings of The 23rd International Conference on Artificial Intelligence and Statistics (AISTATS), 2020

  6. arXiv:1905.12580  [pdf, other]

    cs.LG stat.ML

    Model Similarity Mitigates Test Set Overuse

    Authors: Horia Mania, John Miller, Ludwig Schmidt, Moritz Hardt, Benjamin Recht

    Abstract: Excessive reuse of test data has become commonplace in today's machine learning workflows. Popular benchmarks, competitions, industrial scale tuning, among other applications, all involve test data reuse beyond guidance by statistical confidence bounds. Nonetheless, recent replication studies give evidence that popular benchmarks continue to support progress despite years of extensive reuse. We pr…

    Submitted 29 May, 2019; originally announced May 2019.

    Comments: 18 pages, 7 figures

  7. arXiv:1902.07826  [pdf, ps, other]

    math.OC cs.LG stat.ML

    Certainty Equivalence is Efficient for Linear Quadratic Control

    Authors: Horia Mania, Stephen Tu, Benjamin Recht

    Abstract: We study the performance of the certainty equivalent controller on Linear Quadratic (LQ) control problems with unknown transition dynamics. We show that for both the fully and partially observed settings, the sub-optimality gap between the cost incurred by playing the certainty equivalent controller on the true system and the cost incurred by using the optimal LQ controller enjoys a fast statistic…

    Submitted 24 June, 2019; v1 submitted 20 February, 2019; originally announced February 2019.

    Comments: In the current version we extended our analysis to the case of partially observable systems, i.e. we provided a suboptimality analysis for the Linear Quadratic Gaussian (LQG) setting

  8. arXiv:1805.09388  [pdf, other]

    cs.LG math.OC stat.ML

    Regret Bounds for Robust Adaptive Control of the Linear Quadratic Regulator

    Authors: Sarah Dean, Horia Mania, Nikolai Matni, Benjamin Recht, Stephen Tu

    Abstract: We consider adaptive control of the Linear Quadratic Regulator (LQR), where an unknown linear system is controlled subject to quadratic costs. Leveraging recent developments in the estimation of linear systems and in robust controller synthesis, we present the first provably polynomial time algorithm that provides high probability guarantees of sub-linear regret on this problem. We further study t…

    Submitted 23 May, 2018; originally announced May 2018.

  9. arXiv:1803.07055  [pdf, other]

    cs.LG cs.AI math.OC stat.ML

    Simple random search provides a competitive approach to reinforcement learning

    Authors: Horia Mania, Aurelia Guy, Benjamin Recht

    Abstract: A common belief in model-free reinforcement learning is that methods based on random search in the parameter space of policies exhibit significantly worse sample complexity than those that explore the space of actions. We dispel such beliefs by introducing a random search method for training static, linear policies for continuous control problems, matching state-of-the-art sample efficiency on the…

    Submitted 19 March, 2018; originally announced March 2018.

    Comments: 22 pages, 5 figures, 9 tables

  10. arXiv:1802.08334  [pdf, other]

    cs.LG math.OC stat.ML

    Learning Without Mixing: Towards A Sharp Analysis of Linear System Identification

    Authors: Max Simchowitz, Horia Mania, Stephen Tu, Michael I. Jordan, Benjamin Recht

    Abstract: We prove that the ordinary least-squares (OLS) estimator attains nearly minimax optimal performance for the identification of linear dynamical systems from a single observed trajectory. Our upper bound relies on a generalization of Mendelson's small-ball method to dependent data, eschewing the use of standard mixing-time arguments. Our lower bounds reveal that these upper bounds match up to logari…

    Submitted 24 May, 2018; v1 submitted 22 February, 2018; originally announced February 2018.

  11. arXiv:1710.01688  [pdf, other]

    math.OC cs.LG stat.ML

    On the Sample Complexity of the Linear Quadratic Regulator

    Authors: Sarah Dean, Horia Mania, Nikolai Matni, Benjamin Recht, Stephen Tu

    Abstract: This paper addresses the optimal control problem known as the Linear Quadratic Regulator in the case when the dynamics are unknown. We propose a multi-stage procedure, called Coarse-ID control, that estimates a model from a few experimental trials, estimates the error in that model with respect to the truth, and then designs a controller using both the model and uncertainty estimate. Our technique…

    Submitted 13 December, 2018; v1 submitted 4 October, 2017; originally announced October 2017.

    Comments: Contains a new analysis of finite-dimensional truncation, a new data-dependent estimation bound, and an expanded exposition on necessary background in control theory and System Level Synthesis

  12. arXiv:1603.08035  [pdf, ps, other]

    stat.ML cs.DM cs.LG

    On kernel methods for covariates that are rankings

    Authors: Horia Mania, Aaditya Ramdas, Martin J. Wainwright, Michael I. Jordan, Benjamin Recht

    Abstract: Permutation-valued features arise in a variety of applications, either in a direct way when preferences are elicited over a collection of items, or an indirect way in which numerical ratings are converted to a ranking. To date, there has been relatively limited study of regression, classification, and testing problems based on permutation-valued features, as opposed to permutation-valued responses…

    Submitted 20 July, 2017; v1 submitted 25 March, 2016; originally announced March 2016.

    Comments: 35 pages, 5 figures

  13. arXiv:1507.06970  [pdf, ps, other]

    stat.ML cs.DC cs.DS cs.LG math.OC

    Perturbed Iterate Analysis for Asynchronous Stochastic Optimization

    Authors: Horia Mania, Xinghao Pan, Dimitris Papailiopoulos, Benjamin Recht, Kannan Ramchandran, Michael I. Jordan

    Abstract: We introduce and analyze stochastic optimization methods where the input to each gradient update is perturbed by bounded noise. We show that this framework forms the basis of a unified approach to analyze asynchronous implementations of stochastic optimization algorithms. In this framework, asynchronous stochastic optimization algorithms can be thought of as serial methods operating on noisy inputs…

    Submitted 25 March, 2016; v1 submitted 24 July, 2015; originally announced July 2015.

    Comments: 30 pages

    MSC Class: 65K10; 65Y05; 68W10; 68W20