Showing 1–2 of 2 results for author: Pope, A P

Search v0.5.6 released 2020-02-24

arXiv:2202.02918 [pdf, other]

cs.LG cs.AI cs.NE

Soft Actor-Critic with Inhibitory Networks for Faster Retraining

Authors: Jaime S. Ide, Daria Mićović, Michael J. Guarino, Kevin Alcedo, David Rosenbluth, Adrian P. Pope

Abstract: Reusing previously trained models is critical in deep reinforcement learning to speed up training of new agents. However, it is unclear how to acquire new skills when objectives and constraints are in conflict with previously learned skills. Moreover, when retraining, there is an intrinsic conflict between exploiting what has already been learned and exploring new skills. In soft actor-critic (SAC… ▽ More Reusing previously trained models is critical in deep reinforcement learning to speed up training of new agents. However, it is unclear how to acquire new skills when objectives and constraints are in conflict with previously learned skills. Moreover, when retraining, there is an intrinsic conflict between exploiting what has already been learned and exploring new skills. In soft actor-critic (SAC) methods, a temperature parameter can be dynamically adjusted to weight the action entropy and balance the explore $\times$ exploit trade-off. However, controlling a single coefficient can be challenging within the context of retraining, even more so when goals are contradictory. In this work, inspired by neuroscience research, we propose a novel approach using inhibitory networks to allow separate and adaptive state value evaluations, as well as distinct automatic entropy tuning. Ultimately, our approach allows for controlling inhibition to handle conflict between exploiting less risky, acquired behaviors and exploring novel ones to overcome more challenging tasks. We validate our method through experiments in OpenAI Gym environments. △ Less

Submitted 7 February, 2022; v1 submitted 6 February, 2022; originally announced February 2022.

Comments: 16 pages including Appendix
arXiv:2105.00990 [pdf, other]

cs.LG

Hierarchical Reinforcement Learning for Air-to-Air Combat

Authors: Adrian P. Pope, Jaime S. Ide, Daria Micovic, Henry Diaz, David Rosenbluth, Lee Ritholtz, Jason C. Twedt, Thayne T. Walker, Kevin Alcedo, Daniel Javorsek

Abstract: Artificial Intelligence (AI) is becoming a critical component in the defense industry, as recently demonstrated by DARPA`s AlphaDogfight Trials (ADT). ADT sought to vet the feasibility of AI algorithms capable of piloting an F-16 in simulated air-to-air combat. As a participant in ADT, Lockheed Martin`s (LM) approach combines a hierarchical architecture with maximum-entropy reinforcement learning… ▽ More Artificial Intelligence (AI) is becoming a critical component in the defense industry, as recently demonstrated by DARPA`s AlphaDogfight Trials (ADT). ADT sought to vet the feasibility of AI algorithms capable of piloting an F-16 in simulated air-to-air combat. As a participant in ADT, Lockheed Martin`s (LM) approach combines a hierarchical architecture with maximum-entropy reinforcement learning (RL), integrates expert knowledge through reward sha**, and supports modularity of policies. This approach achieved a $2^{nd}$ place finish in the final ADT event (among eight total competitors) and defeated a graduate of the US Air Force's (USAF) F-16 Weapons Instructor Course in match play. △ Less

Submitted 11 June, 2021; v1 submitted 3 May, 2021; originally announced May 2021.

Comments: 10 pages, 10 figures, The 2021 International Conference on Unmanned Aircraft System (ICUAS 21), June 15-18, 2021, Athens, Greece

Search v0.5.6 released 2020-02-24