Showing 1–2 of 2 results for author: Benac, L

  1. arXiv:2308.05075  [pdf, other]

    cs.LG

    Bayesian Inverse Transition Learning for Offline Settings

    Authors: Leo Benac, Sonali Parbhoo, Finale Doshi-Velez

    Abstract: Offline reinforcement learning is commonly used for sequential decision-making in domains such as healthcare and education, where the rewards are known and the transition dynamics $T$ must be estimated on the basis of batch data. A key challenge for all tasks is how to learn a reliable estimate of the transition dynamics $T$ that produces near-optimal policies that are safe enough so that they neve…

    Submitted 9 August, 2023; originally announced August 2023.

    Comments: 8 pages, 1 plot, 2 tables

  2. arXiv:2109.13977  [pdf, other]

    cs.LG

    Risk averse non-stationary multi-armed bandits

    Authors: Leo Benac, Frédéric Godin

    Abstract: This paper tackles the risk averse multi-armed bandits problem when incurred losses are non-stationary. The conditional value-at-risk (CVaR) is used as the objective function. Two estimation methods are proposed for this objective function in the presence of non-stationary losses, one relying on a weighted empirical distribution of losses and another on the dual representation of the CVaR. Such es…

    Submitted 28 September, 2021; originally announced September 2021.