Hierarchical Potential-based Reward Shaping from Task Specifications

Berducci, Luigi; Aguilar, Edgar A.; Ničković, Dejan; Grosu, Radu

arXiv:2110.02792 (cs)

[Submitted on 6 Oct 2021 (v1), last revised 3 Oct 2022 (this version, v3)]

Title: Hierarchical Potential-based Reward Shaping from Task Specifications

Authors: Luigi Berducci, Edgar A. Aguilar, Dejan Ničković, Radu Grosu

Abstract: The automatic synthesis of policies for robotic-control tasks through reinforcement learning relies on a reward signal that simultaneously captures many possibly conflicting requirements. In this paper, we introduce a novel, hierarchical, potential-based reward-shaping approach (HPRS) for defining effective, multivariate rewards for a large family of such control tasks. We formalize a task as a partially-ordered set of safety, target, and comfort requirements, and define an automated methodology to enforce a natural order among requirements and shape the associated reward. Building upon potential-based reward shaping, we show that HPRS preserves policy optimality. Our experimental evaluation demonstrates HPRS's superior ability in capturing the intended behavior, resulting in task-satisfying policies with improved comfort, and converging to optimal behavior faster than other state-of-the-art approaches. We demonstrate the practical usability of HPRS on several robotics applications and the smooth sim2real transition on two autonomous-driving scenarios for F1TENTH race cars.

Comments:	7 pages main, 5 pages appendix - added f1tenth racing car environment
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2110.02792 [cs.LG]
	(or arXiv:2110.02792v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2110.02792

Submission history

From: Edgar Alexis Aguilar
[v1] Wed, 6 Oct 2021 14:16:59 UTC (884 KB)
[v2] Tue, 8 Mar 2022 17:20:16 UTC (558 KB)
[v3] Mon, 3 Oct 2022 16:04:26 UTC (1,711 KB)
