Showing 1–2 of 2 results for author: Voloshin, C

  1. arXiv:1911.06854  [pdf, other]

    cs.LG cs.AI cs.RO stat.ML

    Empirical Study of Off-Policy Policy Evaluation for Reinforcement Learning

    Authors: Cameron Voloshin, Hoang M. Le, Nan Jiang, Yisong Yue

    Abstract: We offer an experimental benchmark and empirical study for off-policy policy evaluation (OPE) in reinforcement learning, which is a key problem in many safety-critical applications. Given the increasing interest in deploying learning-based methods, there has been a flurry of recent proposals for OPE methods, leading to a need for standardized empirical analyses. Our work takes a strong focus on div…

    Submitted 27 November, 2021; v1 submitted 15 November, 2019; originally announced November 2019.

  2. arXiv:1903.08738  [pdf, other]

    cs.LG cs.AI math.OC stat.ML

    Batch Policy Learning under Constraints

    Authors: Hoang M. Le, Cameron Voloshin, Yisong Yue

    Abstract: When learning policies for real-world domains, two important questions arise: (i) how to efficiently use pre-collected off-policy, non-optimal behavior data; and (ii) how to mediate among different competing objectives and constraints. We thus study the problem of batch policy learning under multiple constraints, and offer a systematic solution. We first propose a flexible meta-algorithm that admi…

    Submitted 20 March, 2019; originally announced March 2019.