Predictable Interval MDPs through Entropy Regularization

van Zutphen, Menno; Delimpaltadakis, Giannis; Heemels, Maurice; Antunes, Duarte

arXiv:2403.16711 (eess)

[Submitted on 25 Mar 2024]

Abstract: Regularizing control policies with entropy can be instrumental in adjusting the predictability of real-world systems. Applications that benefit from such approaches range from cybersecurity, which aims at maximal unpredictability, to human-robot interaction, where predictable behavior is highly desirable. In this paper, we consider entropy regularization for interval Markov decision processes (IMDPs). IMDPs are uncertain MDPs in which transition probabilities are only known to belong to intervals. Lately, IMDPs have gained significant popularity in the context of abstracting stochastic systems for control design. In this work, we address robust minimization of a linear combination of entropy and a standard cumulative cost in IMDPs, thereby establishing a trade-off between optimality and predictability. We show that optimal deterministic policies exist and devise a value-iteration algorithm to compute them. The algorithm solves a number of convex programs at each step. Finally, through an illustrative example, we show the benefits of penalizing entropy in IMDPs.
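
The value-iteration idea described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the finite-horizon recursion V(s) = min_a [ c(s,a) + max_p ( sum_j p_j V(j) + beta * H(p) ) ] with H(p) = -sum_j p_j log p_j, the zero terminal value, the bisection solver for the inner convex program, and all function and variable names. The paper's exact formulation and algorithm may differ.

```python
import math

def worst_case_distribution(lower, upper, values, beta):
    """Maximize sum_j p[j]*values[j] + beta*H(p) over the interval set
    {p : lower <= p <= upper, sum(p) == 1}, with H(p) = -sum_j p[j]*log(p[j]).
    Assumes beta > 0 and sum(lower) <= 1 <= sum(upper).
    KKT conditions give p[j] = clip(exp((values[j] - lam)/beta - 1), lower[j], upper[j]);
    the multiplier lam is found by bisection, since sum(p) is nonincreasing in lam."""
    def probs(lam):
        out = []
        for lo_j, up_j, v_j in zip(lower, upper, values):
            exponent = (v_j - lam) / beta - 1.0
            # Capping the exponent at 0 avoids overflow; the result is clipped to up_j <= 1 anyway.
            unclipped = math.exp(min(exponent, 0.0))
            out.append(min(max(unclipped, lo_j), up_j))
        return out

    lam_lo = min(values) - beta         # here every p[j] sits at its upper bound
    lam_hi = max(values) + 50.0 * beta  # here every p[j] is (numerically) at its lower bound
    for _ in range(100):
        mid = 0.5 * (lam_lo + lam_hi)
        if sum(probs(mid)) > 1.0:
            lam_lo = mid
        else:
            lam_hi = mid
    return probs(lam_hi)

def regularized_value(p, values, beta):
    """Expected next value plus beta times the entropy of p (0*log 0 taken as 0)."""
    expected = sum(p_j * v_j for p_j, v_j in zip(p, values))
    entropy = -sum(p_j * math.log(p_j) for p_j in p if p_j > 0.0)
    return expected + beta * entropy

def robust_value_iteration(cost, lower, upper, beta, horizon):
    """Backward recursion over `horizon` stages with zero terminal value.
    cost[s][a] is the stage cost; lower[s][a][j] and upper[s][a][j] bound P(j | s, a).
    Returns the stage-0 values and a deterministic policy with policy[k][s] = action."""
    n_states = len(cost)
    V = [0.0] * n_states
    policy = []
    for _ in range(horizon):
        new_V, stage_policy = [], []
        for s in range(n_states):
            q = []
            for a in range(len(cost[s])):
                p = worst_case_distribution(lower[s][a], upper[s][a], V, beta)
                q.append(cost[s][a] + regularized_value(p, V, beta))
            best = min(range(len(q)), key=q.__getitem__)
            new_V.append(q[best])
            stage_policy.append(best)
        V = new_V
        policy.append(stage_policy)
    policy.reverse()  # stages were computed last-to-first
    return V, policy

# Hypothetical two-state example: state 1 is absorbing with zero cost.
# In state 0, action 0 is costly but deterministic; action 1 is cheap but uncertain.
cost = [[1.0, 0.5], [0.0]]
lower = [[[0.0, 1.0], [0.3, 0.3]], [[0.0, 1.0]]]
upper = [[[0.0, 1.0], [0.7, 0.7]], [[0.0, 1.0]]]

V_predictable, policy_predictable = robust_value_iteration(cost, lower, upper, beta=1.0, horizon=1)
V_cheap, policy_cheap = robust_value_iteration(cost, lower, upper, beta=0.01, horizon=1)
```

In this toy instance, a large entropy weight (beta = 1.0) makes the deterministic action 0 optimal in state 0, while a small weight (beta = 0.01) favors the cheaper but less predictable action 1, which mirrors the trade-off between optimality and predictability that the abstract describes.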

Subjects:	Systems and Control (eess.SY)
Cite as:	arXiv:2403.16711 [eess.SY]
	(or arXiv:2403.16711v1 [eess.SY] for this version)
	https://doi.org/10.48550/arXiv.2403.16711

Submission history

From: Menno Van Zutphen
[v1] Mon, 25 Mar 2024 12:49:09 UTC (209 KB)
