Showing 1–6 of 6 results for author: Keramati, R

  1. arXiv:2111.14272

    cs.LG cs.AI stat.ME

    Identification of Subgroups With Similar Benefits in Off-Policy Policy Evaluation

    Authors: Ramtin Keramati, Omer Gottesman, Leo Anthony Celi, Finale Doshi-Velez, Emma Brunskill

    Abstract: Off-policy policy evaluation methods for sequential decision making can be used to help identify if a proposed decision policy is better than a current baseline policy. However, a new decision policy may be better than a baseline policy for some individuals but not others. This has motivated a push towards personalization and accurate per-state estimates of heterogeneous treatment effects (HTEs).…

    Submitted 28 November, 2021; originally announced November 2021.

  2. arXiv:2007.05896

    cs.LG cs.AI stat.ML

    Learning Abstract Models for Strategic Exploration and Fast Reward Transfer

    Authors: Evan Zheran Liu, Ramtin Keramati, Sudarshan Seshadri, Kelvin Guu, Panupong Pasupat, Emma Brunskill, Percy Liang

    Abstract: Model-based reinforcement learning (RL) is appealing because (i) it enables planning and thus more strategic exploration, and (ii) by decoupling dynamics from rewards, it enables fast transfer to new reward functions. However, learning an accurate Markov Decision Process (MDP) over high-dimensional states (e.g., raw pixels) is extremely challenging because it requires function approximation, which…

    Submitted 11 July, 2020; originally announced July 2020.

  3. Value Driven Representation for Human-in-the-Loop Reinforcement Learning

    Authors: Ramtin Keramati, Emma Brunskill

    Abstract: Interactive adaptive systems powered by Reinforcement Learning (RL) have many potential applications, such as intelligent tutoring systems. In such systems there is typically an external human system designer that is creating, monitoring and modifying the interactive adaptive system, trying to improve its performance on the target outcomes. In this paper we focus on algorithmic foundation of how t…

    Submitted 2 April, 2020; originally announced April 2020.

    Journal ref: UMAP 2019, 27th ACM Conference on User Modeling, Adaptation and Personalization

  4. arXiv:2003.05623

    stat.ML cs.LG

    Off-policy Policy Evaluation For Sequential Decisions Under Unobserved Confounding

    Authors: Hongseok Namkoong, Ramtin Keramati, Steve Yadlowsky, Emma Brunskill

    Abstract: When observed decisions depend only on observed features, off-policy policy evaluation (OPE) methods for sequential decision making problems can estimate the performance of evaluation policies before deploying them. This assumption is frequently violated due to unobserved confounders, unrecorded variables that impact both the decisions and their outcomes. We assess robustness of OPE methods under…

    Submitted 12 March, 2020; originally announced March 2020.

  5. arXiv:1911.01546

    cs.LG cs.AI

    Being Optimistic to Be Conservative: Quickly Learning a CVaR Policy

    Authors: Ramtin Keramati, Christoph Dann, Alex Tamkin, Emma Brunskill

    Abstract: While maximizing expected return is the goal in most reinforcement learning approaches, risk-sensitive objectives such as conditional value at risk (CVaR) are more suitable for many high-stakes applications. However, relatively little is known about how to explore to quickly learn policies with good CVaR. In this paper, we present the first algorithm for sample-efficient learning of CVaR-optimal p…

    Submitted 2 April, 2020; v1 submitted 4 November, 2019; originally announced November 2019.

    Journal ref: Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI 2020)

  6. arXiv:1806.00175

    cs.AI

    Fast Exploration with Simplified Models and Approximately Optimistic Planning in Model Based Reinforcement Learning

    Authors: Ramtin Keramati, Jay Whang, Patrick Cho, Emma Brunskill

    Abstract: Humans learn to play video games significantly faster than the state-of-the-art reinforcement learning (RL) algorithms. People seem to build simple models that are easy to learn to support planning and strategic exploration. Inspired by this, we investigate two issues in leveraging model-based RL for sample efficiency. First we investigate how to perform strategic exploration when exact planning i…

    Submitted 25 November, 2018; v1 submitted 31 May, 2018; originally announced June 2018.