Optimistic Active Exploration of Dynamical Systems

Sukhija, Bhavya; Treven, Lenart; Sancaktar, Cansu; Blaes, Sebastian; Coros, Stelian; Krause, Andreas

arXiv:2306.12371 (cs)

[Submitted on 21 Jun 2023 (v1), last revised 30 Oct 2023 (this version, v2)]

Abstract: Reinforcement learning algorithms commonly seek to optimize policies for solving one particular task. How should we explore an unknown dynamical system such that the estimated model globally approximates the dynamics and allows us to solve multiple downstream tasks in a zero-shot manner? In this paper, we address this challenge by developing an algorithm -- OPAX -- for active exploration. OPAX uses well-calibrated probabilistic models to quantify the epistemic uncertainty about the unknown dynamics. It optimistically -- w.r.t. plausible dynamics -- maximizes the information gain between the unknown dynamics and state observations. We show how the resulting optimization problem can be reduced to an optimal control problem that can be solved at each episode using standard approaches. We analyze our algorithm for general models, and, in the case of Gaussian process dynamics, we give a first-of-its-kind sample complexity bound and show that the epistemic uncertainty converges to zero. In our experiments, we compare OPAX with other heuristic active exploration approaches on several environments. Our experiments show that OPAX is not only theoretically sound but also performs well for zero-shot planning on novel downstream tasks.
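
As a rough illustration of the information-gain idea mentioned in the abstract, the sketch below scores candidate trajectories by the mutual information between a scalar Gaussian unknown and one noisy observation of it, 0.5 * log(1 + sigma_epistemic^2 / sigma_noise^2), summed over time steps. This is not the authors' implementation: the function names, the candidate trajectories, and all numbers are hypothetical, and the sketch leaves out both the optimism over plausible dynamics and the optimal-control step that the abstract describes.

```python
import math


def information_gain(epistemic_std, noise_std):
    """Mutual information (in nats) between a scalar Gaussian unknown
    and a single observation of it corrupted by Gaussian noise."""
    return 0.5 * math.log(1.0 + (epistemic_std / noise_std) ** 2)


def trajectory_score(epistemic_stds, noise_std):
    """Sum of per-step information gains along one candidate trajectory."""
    return sum(information_gain(s, noise_std) for s in epistemic_stds)


# Hypothetical per-step epistemic standard deviations that a probabilistic
# model might report along two candidate trajectories (made-up numbers).
candidates = {
    "well_explored": [0.05, 0.04, 0.05],
    "rarely_visited": [0.80, 0.60, 0.90],
}
noise_std = 0.1  # assumed observation-noise standard deviation

# Pick the trajectory whose observations are expected to be most informative.
best = max(candidates, key=lambda name: trajectory_score(candidates[name], noise_std))
print(best)
```

Under these assumed numbers the rarely visited trajectory scores higher, because the score grows with the model's epistemic uncertainty relative to the observation noise.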

Subjects:	Machine Learning (cs.LG); Robotics (cs.RO); Systems and Control (eess.SY)
Cite as:	arXiv:2306.12371 [cs.LG]
	(or arXiv:2306.12371v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2306.12371

Submission history

From: Lenart Treven
[v1] Wed, 21 Jun 2023 16:26:59 UTC (2,955 KB)
[v2] Mon, 30 Oct 2023 15:18:01 UTC (3,941 KB)
