Operator World Models for Reinforcement Learning

Novelli, Pietro; Pratticò, Marco; Pontil, Massimiliano; Ciliberto, Carlo

arXiv:2406.19861 (cs)

[Submitted on 28 Jun 2024]

Abstract: Policy Mirror Descent (PMD) is a powerful and theoretically sound methodology for sequential decision-making. However, it is not directly applicable to Reinforcement Learning (RL) due to the inaccessibility of explicit action-value functions. We address this challenge by introducing a novel approach based on learning a world model of the environment using conditional mean embeddings. We then leverage the operatorial formulation of RL to express the action-value function in terms of this quantity in closed form via matrix operations. Combining these estimators with PMD leads to POWR, a new RL algorithm for which we prove convergence rates to the global optimum. Preliminary experiments in finite and infinite state settings support the effectiveness of our method.
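As a rough illustration of the two ingredients the abstract combines, the following sketch runs PMD on a tiny tabular MDP, computing the action-value function in closed form as q = (I - gamma * P * Pi)^(-1) r. This is not the POWR algorithm: it assumes the true transition probabilities are known, whereas the paper learns a world model with conditional mean embeddings. The KL-based (multiplicative) update is one common PMD instance chosen here for simplicity, and the toy MDP and every function name are hypothetical.

```python
import math


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(M[k][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for k in range(col + 1, n):
            f = M[k][col] / M[col][col]
            for c in range(col, n + 1):
                M[k][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def q_closed_form(P, r, pi, gamma):
    """Action values of policy pi: q = (I - gamma * P Pi)^{-1} r.

    P[s][a][s2] are transition probabilities (assumed known here),
    r[s][a] are rewards, pi[s][a] are action probabilities.
    """
    S, A = len(r), len(r[0])
    n = S * A
    M = [[0.0] * n for _ in range(n)]
    for s in range(S):
        for a in range(A):
            i = s * A + a
            M[i][i] += 1.0
            for s2 in range(S):
                for a2 in range(A):
                    M[i][s2 * A + a2] -= gamma * P[s][a][s2] * pi[s2][a2]
    r_vec = [r[s][a] for s in range(S) for a in range(A)]
    q = solve(M, r_vec)
    return [[q[s * A + a] for a in range(A)] for s in range(S)]


def pmd_step(pi, q, eta):
    """KL mirror descent step: pi_new(a|s) proportional to pi(a|s) * exp(eta * q(s,a))."""
    new_pi = []
    for p_s, q_s in zip(pi, q):
        m = max(q_s)  # subtract the max for numerical stability
        w = [p * math.exp(eta * (qv - m)) for p, qv in zip(p_s, q_s)]
        z = sum(w)
        new_pi.append([wi / z for wi in w])
    return new_pi


def run_pmd(P, r, gamma, eta, steps):
    """Alternate exact policy evaluation and PMD updates from a uniform policy."""
    S, A = len(r), len(r[0])
    pi = [[1.0 / A] * A for _ in range(S)]
    for _ in range(steps):
        pi = pmd_step(pi, q_closed_form(P, r, pi, gamma), eta)
    return pi, q_closed_form(P, r, pi, gamma)


# Hypothetical toy MDP: action 0 stays put, action 1 switches state;
# every step taken from state 1 earns reward 1.
P = [[[1.0, 0.0], [0.0, 1.0]],
     [[0.0, 1.0], [1.0, 0.0]]]
r = [[0.0, 0.0], [1.0, 1.0]]

pi_final, q_final = run_pmd(P, r, gamma=0.9, eta=1.0, steps=50)
print(pi_final)
print(q_final)
```

In this toy problem the policy should concentrate on switching out of state 0 and staying in state 1. The linear solve is written out by hand only to keep the sketch dependency-free; a linear algebra library would normally handle that step.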

Subjects:	Machine Learning (cs.LG); Optimization and Control (math.OC); Machine Learning (stat.ML)
Cite as:	arXiv:2406.19861 [cs.LG]
	(or arXiv:2406.19861v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2406.19861

Submission history

From: Pietro Novelli
[v1] Fri, 28 Jun 2024 12:05:47 UTC (386 KB)
