Showing 1–1 of 1 results for author: Pratticò, M

  1. arXiv:2406.19861

    cs.LG math.OC stat.ML

    Operator World Models for Reinforcement Learning

    Authors: Pietro Novelli, Marco Pratticò, Massimiliano Pontil, Carlo Ciliberto

    Abstract: Policy Mirror Descent (PMD) is a powerful and theoretically sound methodology for sequential decision-making. However, it is not directly applicable to Reinforcement Learning (RL) due to the inaccessibility of explicit action-value functions. We address this challenge by introducing a novel approach based on learning a world model of the environment using conditional mean embeddings. We then lever…

    Submitted 28 June, 2024; originally announced June 2024.