Showing 1–6 of 6 results for author: Korattikara, A

  1. arXiv:1912.05663  [pdf, other]

    stat.ML cs.AI cs.LG

    Measuring the Reliability of Reinforcement Learning Algorithms

    Authors: Stephanie C. Y. Chan, Samuel Fishman, John Canny, Anoop Korattikara, Sergio Guadarrama

    Abstract: Lack of reliability is a well-known issue for reinforcement learning (RL) algorithms. This problem has gained increasing attention in recent years, and efforts to improve it have grown substantially. To aid RL researchers and production users with the evaluation and improvement of reliability, we propose a set of metrics that quantitatively measure different aspects of reliability. In this work, w…

    Submitted 12 February, 2020; v1 submitted 10 December, 2019; originally announced December 2019.

    Comments: Accepted for publication at ICLR 2020 (spotlight)

  2. arXiv:1902.07742  [pdf, other]

    cs.LG stat.ML

    From Language to Goals: Inverse Reinforcement Learning for Vision-Based Instruction Following

    Authors: Justin Fu, Anoop Korattikara, Sergey Levine, Sergio Guadarrama

    Abstract: Reinforcement learning is a promising framework for solving control problems, but its use in practical situations is hampered by the fact that reward functions are often difficult to engineer. Specifying goals and tasks for autonomous machines, such as robots, is a significant challenge: conventionally, reward functions and goal states have been used to communicate objectives. But people can commu…

    Submitted 20 February, 2019; originally announced February 2019.

  3. arXiv:1506.04416  [pdf, other]

    cs.LG stat.ML

    Bayesian Dark Knowledge

    Authors: Anoop Korattikara, Vivek Rathod, Kevin Murphy, Max Welling

    Abstract: We consider the problem of Bayesian parameter estimation for deep neural networks, which is important in problem settings where we may have little data, and/or where we need accurate posterior predictive densities, e.g., for applications involving bandits or active learning. One simple approach to this is to use online Monte Carlo methods, such as SGLD (stochastic gradient Langevin dynamics). Unf…

    Submitted 6 November, 2015; v1 submitted 14 June, 2015; originally announced June 2015.

    Comments: final version submitted to NIPS 2015

  4. arXiv:1503.01596  [pdf, other]

    cs.LG stat.ML

    Large-Scale Distributed Bayesian Matrix Factorization using Stochastic Gradient MCMC

    Authors: Sungjin Ahn, Anoop Korattikara, Nathan Liu, Suju Rajan, Max Welling

    Abstract: Despite having various attractive qualities such as high prediction accuracy and the ability to quantify uncertainty and avoid over-fitting, Bayesian Matrix Factorization has not been widely adopted because of the prohibitive cost of inference. In this paper, we propose a scalable distributed Bayesian matrix factorization algorithm using stochastic gradient MCMC. Our algorithm, based on Distribute…

    Submitted 9 March, 2015; v1 submitted 5 March, 2015; originally announced March 2015.

  5. arXiv:1304.5299  [pdf, other]

    cs.LG stat.ML

    Austerity in MCMC Land: Cutting the Metropolis-Hastings Budget

    Authors: Anoop Korattikara, Yutian Chen, Max Welling

    Abstract: Can we make Bayesian posterior MCMC sampling more efficient when faced with very large datasets? We argue that computing the likelihood for N datapoints in the Metropolis-Hastings (MH) test to reach a single binary decision is computationally inefficient. We introduce an approximate MH rule based on a sequential hypothesis test that allows us to accept or reject samples with high confidence using…

    Submitted 14 February, 2014; v1 submitted 18 April, 2013; originally announced April 2013.

    Comments: v4 - version accepted by ICML2014

  6. arXiv:1206.6380  [pdf]

    cs.LG stat.CO stat.ML

    Bayesian Posterior Sampling via Stochastic Gradient Fisher Scoring

    Authors: Sungjin Ahn, Anoop Korattikara, Max Welling

    Abstract: In this paper we address the following question: Can we approximately sample from a Bayesian posterior distribution if we are only allowed to touch a small mini-batch of data-items for every sample we generate? An algorithm based on the Langevin equation with stochastic gradients (SGLD) was previously proposed to solve this, but its mixing rate was slow. By leveraging the Bayesian Central Limit T…

    Submitted 27 June, 2012; originally announced June 2012.

    Comments: Appears in Proceedings of the 29th International Conference on Machine Learning (ICML 2012)