Off-Policy Adversarial Inverse Reinforcement Learning

Arnob, Samin Yeasar

arXiv:2005.01138 (cs)

[Submitted on 3 May 2020]

Abstract: Adversarial Imitation Learning (AIL) is a class of algorithms in reinforcement learning (RL) that imitates an expert without using any reward from the environment and without providing expert behavior directly to policy training. Instead, the agent learns a policy distribution that minimizes the difference from expert behavior in an adversarial setting. Adversarial Inverse Reinforcement Learning (AIRL) builds on AIL by learning a reward function approximator alongside the policy, and shows the utility of IRL in the transfer learning setting. However, the reward function approximator that enables transfer learning does not perform well in imitation tasks. We propose an Off-Policy Adversarial Inverse Reinforcement Learning (Off-policy-AIRL) algorithm that is sample efficient and achieves good imitation performance compared to the state-of-the-art AIL algorithm in continuous control tasks. Using the same reward function approximator, we show the utility of our algorithm over AIL by using the learned reward function to retrain the policy on a task under significant variation where expert demonstrations are absent.
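
For orientation, the reward recovery that the abstract attributes to AIRL can be sketched as follows. The abstract does not state the discriminator, so this sketch assumes the form used in the original AIRL formulation, D = exp(f) / (exp(f) + pi(a|s)), with reward log D - log(1 - D). The function names and scalar inputs are hypothetical and chosen for illustration; this is not the Off-policy-AIRL implementation.

```python
import math


def airl_discriminator(f_value: float, log_pi: float) -> float:
    """Assumed AIRL discriminator: D = exp(f) / (exp(f) + pi).

    f_value is the output of the learned reward approximator f(s, a);
    log_pi is the policy log-probability log pi(a|s).
    Algebraically, D equals sigmoid(f - log pi).
    """
    return 1.0 / (1.0 + math.exp(-(f_value - log_pi)))


def airl_reward(f_value: float, log_pi: float) -> float:
    """Reward signal log D - log(1 - D), which simplifies to f - log pi."""
    d = airl_discriminator(f_value, log_pi)
    return math.log(d) - math.log(1.0 - d)
```

Under this assumed form, the reward passed to the policy reduces to f(s, a) minus the policy log-probability, so the approximator f is the only learned part of the reward and is the natural candidate for reuse when retraining on a changed task.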

Comments:	15 pages, 10 figures
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Robotics (cs.RO); Machine Learning (stat.ML)
Cite as:	arXiv:2005.01138 [cs.LG]
	(or arXiv:2005.01138v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2005.01138

Submission history

From: Samin Yeasar Arnob
[v1] Sun, 3 May 2020 16:51:40 UTC (1,017 KB)
