Showing 1–12 of 12 results for author: Kidambi, R

  1. arXiv:2204.10936 [pdf, other]

    cs.IR cs.LG stat.ML

    Counterfactual Learning To Rank for Utility-Maximizing Query Autocompletion

    Authors: Adam Block, Rahul Kidambi, Daniel N. Hill, Thorsten Joachims, Inderjit S. Dhillon

    Abstract: Conventional methods for query autocompletion aim to predict which completed query a user will select from a list. A shortcoming of this approach is that users often do not know which query will provide the best retrieval performance on the current information retrieval system, meaning that any query autocompletion methods trained to mimic user behavior can lead to suboptimal query suggestions. To…

    Submitted 22 April, 2022; originally announced April 2022.

  2. arXiv:2106.03207 [pdf, other]

    cs.LG stat.ML

    Mitigating Covariate Shift in Imitation Learning via Offline Data Without Great Coverage

    Authors: Jonathan D. Chang, Masatoshi Uehara, Dhruv Sreenivas, Rahul Kidambi, Wen Sun

    Abstract: This paper studies offline Imitation Learning (IL) where an agent learns to imitate an expert demonstrator without additional online environment interactions. Instead, the learner is presented with a static offline dataset of state-action-next state transition triples from a potentially less proficient behavior policy. We introduce Model-based IL from Offline data (MILO): an algorithmic framework…

    Submitted 31 January, 2022; v1 submitted 6 June, 2021; originally announced June 2021.

    Comments: 42 pages, 5 figures, 7 tables

  3. arXiv:2102.10769 [pdf, other]

    cs.LG stat.ML

    MobILE: Model-Based Imitation Learning From Observation Alone

    Authors: Rahul Kidambi, Jonathan Chang, Wen Sun

    Abstract: This paper studies Imitation Learning from Observations alone (ILFO) where the learner is presented with expert demonstrations that consist only of states visited by an expert (without access to actions taken by the expert). We present a provably efficient model-based framework MobILE to solve the ILFO problem. MobILE involves carefully trading off strategic exploration against imitation - this is…

    Submitted 31 January, 2022; v1 submitted 21 February, 2021; originally announced February 2021.

    Comments: 29 pages, 7 figures

  4. arXiv:2102.07800 [pdf, other]

    stat.ML cs.AI cs.LG

    Top-$k$ eXtreme Contextual Bandits with Arm Hierarchy

    Authors: Rajat Sen, Alexander Rakhlin, Lexing Ying, Rahul Kidambi, Dean Foster, Daniel Hill, Inderjit Dhillon

    Abstract: Motivated by modern applications, such as online advertisement and recommender systems, we study the top-$k$ extreme contextual bandits problem, where the total number of arms can be enormous, and the learner is allowed to select $k$ arms and observe all or some of the rewards for the chosen arms. We first propose an algorithm for the non-extreme realizable setting, utilizing the Inverse Gap Weigh…

    Submitted 15 February, 2021; originally announced February 2021.

  5. arXiv:2005.05951 [pdf, other]

    cs.LG cs.AI stat.ML

    MOReL: Model-Based Offline Reinforcement Learning

    Authors: Rahul Kidambi, Aravind Rajeswaran, Praneeth Netrapalli, Thorsten Joachims

    Abstract: In offline reinforcement learning (RL), the goal is to learn a highly rewarding policy based solely on a dataset of historical interactions with the environment. The ability to train RL policies offline can greatly expand the applicability of RL, its data efficiency, and its experimental velocity. Prior work in offline RL has been confined almost exclusively to model-free RL approaches. In this wo…

    Submitted 1 March, 2021; v1 submitted 12 May, 2020; originally announced May 2020.

    Comments: First two authors contributed equally. Published at NeurIPS 2020. After publication at NeurIPS 2020, (1) D4RL benchmark results have been added; (2) hyper-parameter ablation studies have been added; (3) scope of Lemma 3 has been extended

  6. arXiv:1904.12838 [pdf, other]

    cs.LG math.OC stat.ML

    The Step Decay Schedule: A Near Optimal, Geometrically Decaying Learning Rate Procedure For Least Squares

    Authors: Rong Ge, Sham M. Kakade, Rahul Kidambi, Praneeth Netrapalli

    Abstract: Minimax optimal convergence rates for classes of stochastic convex optimization problems are well characterized, where the majority of results utilize iterate averaged stochastic gradient descent (SGD) with polynomially decaying step sizes. In contrast, SGD's final iterate behavior has received much less attention despite its widespread use in practice. Motivated by this observation, this work p…

    Submitted 29 October, 2019; v1 submitted 29 April, 2019; originally announced April 2019.

    Comments: Appears in the proceedings of the Conference on Neural Information Processing Systems (NeurIPS), 2019. 28 pages, 4 tables, 1 Algorithm, 7 figures

  7. arXiv:1803.05591 [pdf, other]

    cs.LG math.OC stat.ML

    On the insufficiency of existing momentum schemes for Stochastic Optimization

    Authors: Rahul Kidambi, Praneeth Netrapalli, Prateek Jain, Sham M. Kakade

    Abstract: Momentum based stochastic gradient methods such as heavy ball (HB) and Nesterov's accelerated gradient descent (NAG) method are widely used in practice for training deep networks and other supervised learning models, as they often provide significant improvements over stochastic gradient descent (SGD). Rigorously speaking, "fast gradient" methods have provable improvements over gradient descent on…

    Submitted 31 July, 2018; v1 submitted 15 March, 2018; originally announced March 2018.

    Comments: 28 pages, 10 figures. Updated acknowledgements. Appeared as an oral presentation at International Conference on Learning Representations (ICLR), 2018. Code implementing the ASGD method can be found at https://github.com/rahulkidambi/AccSGD

  8. arXiv:1711.08426 [pdf, ps, other]

    stat.ML cs.LG math.OC

    Leverage Score Sampling for Faster Accelerated Regression and ERM

    Authors: Naman Agarwal, Sham Kakade, Rahul Kidambi, Yin Tat Lee, Praneeth Netrapalli, Aaron Sidford

    Abstract: Given a matrix $\mathbf{A}\in\mathbb{R}^{n\times d}$ and a vector $b \in\mathbb{R}^{n}$, we show how to compute an $\epsilon$-approximate solution to the regression problem $\min_{x\in\mathbb{R}^{d}}\frac{1}{2} \|\mathbf{A} x - b\|_{2}^{2}$ in time $\tilde{O} ((n+\sqrt{d\cdot\kappa_{\text{sum}}})\cdot s\cdot\log\epsilon^{-1})$ where…

    Submitted 22 November, 2017; originally announced November 2017.

  9. arXiv:1711.05482 [pdf, ps, other]

    cs.LG stat.ML

    Efficient Estimation of Generalization Error and Bias-Variance Components of Ensembles

    Authors: Dhruv Mahajan, Vivek Gupta, S Sathiya Keerthi, Sellamanickam Sundararajan, Shravan Narayanamurthy, Rahul Kidambi

    Abstract: For many applications, an ensemble of base classifiers is an effective solution. The tuning of its parameters (number of classes, amount of data on which each classifier is to be trained, etc.) requires G, the generalization error of a given ensemble. The efficient estimation of G is the focus of this paper. The key idea is to approximate the variance of the class scores/probabilities of the bas…

    Submitted 15 November, 2017; originally announced November 2017.

    Comments: 12 pages, 4 figures. Under review at SDM 2018

  10. arXiv:1710.09430 [pdf, ps, other]

    stat.ML cs.LG math.OC

    A Markov Chain Theory Approach to Characterizing the Minimax Optimality of Stochastic Gradient Descent (for Least Squares)

    Authors: Prateek Jain, Sham M. Kakade, Rahul Kidambi, Praneeth Netrapalli, Venkata Krishna Pillutla, Aaron Sidford

    Abstract: This work provides a simplified proof of the statistical minimax optimality of (iterate averaged) stochastic gradient descent (SGD), for the special case of least squares. This result is obtained by analyzing SGD as a stochastic process and by sharply characterizing the stationary covariance matrix of this process. The finite rate optimality characterization captures the constant factors and addre…

    Submitted 21 July, 2018; v1 submitted 25 October, 2017; originally announced October 2017.

    Comments: Lemma 1 has been updated in v2

  11. arXiv:1704.08227 [pdf, other]

    stat.ML cs.LG math.OC math.ST

    Accelerating Stochastic Gradient Descent For Least Squares Regression

    Authors: Prateek Jain, Sham M. Kakade, Rahul Kidambi, Praneeth Netrapalli, Aaron Sidford

    Abstract: There is widespread sentiment that it is not possible to effectively utilize fast gradient methods (e.g. Nesterov's acceleration, conjugate gradient, heavy ball) for the purposes of stochastic optimization due to their instability and error accumulation, a notion made precise in d'Aspremont 2008 and Devolder, Glineur, and Nesterov 2014. This work considers these issues for the special case of stoc…

    Submitted 31 July, 2018; v1 submitted 26 April, 2017; originally announced April 2017.

    Comments: 54 pages, 3 figures, 1 table; updated acknowledgements, minor title change. Paper appeared in the proceedings of the Conference on Learning Theory (COLT), 2018

  12. arXiv:1610.03774 [pdf, other]

    stat.ML cs.DS cs.LG

    Parallelizing Stochastic Gradient Descent for Least Squares Regression: mini-batching, averaging, and model misspecification

    Authors: Prateek Jain, Sham M. Kakade, Rahul Kidambi, Praneeth Netrapalli, Aaron Sidford

    Abstract: This work characterizes the benefits of averaging schemes widely used in conjunction with stochastic gradient descent (SGD). In particular, this work provides a sharp analysis of: (1) mini-batching, a method of averaging many samples of a stochastic gradient to both reduce the variance of the stochastic gradient estimate and for parallelizing SGD and (2) tail-averaging, a method involving averagin…

    Submitted 31 July, 2018; v1 submitted 12 October, 2016; originally announced October 2016.

    Comments: 39 pages. Published in the Journal of Machine Learning Research (JMLR)