Showing 1–1 of 1 results for author: Lunghi, A

  1. arXiv:2405.14372

    cs.LG

    Learning Constrained Markov Decision Processes With Non-stationary Rewards and Constraints

    Authors: Francesco Emanuele Stradi, Anna Lunghi, Matteo Castiglioni, Alberto Marchesi, Nicola Gatti

    Abstract: In constrained Markov decision processes (CMDPs) with adversarial rewards and constraints, a well-known impossibility result prevents any algorithm from attaining both sublinear regret and sublinear constraint violation, when competing against a best-in-hindsight policy that satisfies constraints on average. In this paper, we show that this negative result can be eased in CMDPs with non-stationary…

    Submitted 23 May, 2024; originally announced May 2024.