Showing 1–14 of 14 results for author: Graves, D

Searching in archive cs.
  1. arXiv:2103.08070  [pdf, other]

    cs.RO

    Learning robust driving policies without online exploration

    Authors: Daniel Graves, Nhat M. Nguyen, Kimia Hassanzadeh, Jun Jin, Jun Luo

    Abstract: We propose a multi-time-scale predictive representation learning method to efficiently learn robust driving policies in an offline manner that generalize well to novel road geometries, and damaged and distracting lane conditions which are not covered in the offline training data. We show that our proposed representation learning method can be applied easily in an offline (batch) reinforcement lear…

    Submitted 14 March, 2021; originally announced March 2021.

    Comments: Accepted in ICRA 2021. Due to format limitations of ICRA, we include appendix of our detailed evaluation results in this full version. arXiv admin note: substantial text overlap with arXiv:2006.15110

  2. arXiv:2102.07659  [pdf, other]

    cs.AI cs.MA

    Diverse Auto-Curriculum is Critical for Successful Real-World Multiagent Learning Systems

    Authors: Yaodong Yang, Jun Luo, Ying Wen, Oliver Slumbers, Daniel Graves, Haitham Bou Ammar, Jun Wang, Matthew E. Taylor

    Abstract: Multiagent reinforcement learning (MARL) has achieved a remarkable amount of success in solving various types of video games. A cornerstone of this success is the auto-curriculum framework, which shapes the learning process by continually creating new challenging tasks for agents to adapt to, thereby facilitating the acquisition of new skills. In order to extend MARL methods to real-world domains…

    Submitted 16 February, 2021; v1 submitted 15 February, 2021; originally announced February 2021.

    Comments: AAMAS 2021

  3. arXiv:2012.14942  [pdf, other]

    cs.LG cs.AI

    LISPR: An Options Framework for Policy Reuse with Reinforcement Learning

    Authors: Daniel Graves, Jun Jin, Jun Luo

    Abstract: We propose a framework for transferring any existing policy from a potentially unknown source MDP to a target MDP. This framework (1) enables reuse in the target domain of any form of source policy, including classical controllers, heuristic policies, or deep neural network-based policies, (2) attains optimality under suitable theoretical conditions, and (3) guarantees improvement over the source…

    Submitted 29 December, 2020; originally announced December 2020.

  4. arXiv:2011.05857  [pdf, other]

    cs.RO cs.AI

    Offline Learning of Counterfactual Predictions for Real-World Robotic Reinforcement Learning

    Authors: Jun Jin, Daniel Graves, Cameron Haigh, Jun Luo, Martin Jagersand

    Abstract: We consider real-world reinforcement learning (RL) of robotic manipulation tasks that involve both visuomotor skills and contact-rich skills. We aim to train a policy that maps multimodal sensory observations (vision and force) to a manipulator's joint velocities under practical considerations. We propose to use offline samples to learn a set of general value functions (GVFs) that make counterfact…

    Submitted 25 February, 2022; v1 submitted 11 November, 2020; originally announced November 2020.

    Comments: Accepted in ICRA 2022

  5. arXiv:2010.14289  [pdf, other]

    cs.AI

    Affordance as general value function: A computational model

    Authors: Daniel Graves, Johannes Günther, Jun Luo

    Abstract: General value functions (GVFs) in the reinforcement learning (RL) literature are long-term predictive summaries of the outcomes of agents following specific policies in the environment. Affordances as perceived action possibilities with specific valence may be cast into predicted policy-relative goodness and modelled as GVFs. A systematic explication of this connection shows that GVFs and especial…

    Submitted 7 May, 2021; v1 submitted 27 October, 2020; originally announced October 2020.

  6. arXiv:2010.09776  [pdf, other]

    cs.MA cs.AI cs.GT cs.LG eess.SY

    SMARTS: Scalable Multi-Agent Reinforcement Learning Training School for Autonomous Driving

    Authors: Ming Zhou, Jun Luo, Julian Villella, Yaodong Yang, David Rusu, Jiayu Miao, Weinan Zhang, Montgomery Alban, Iman Fadakar, Zheng Chen, Aurora Chongxi Huang, Ying Wen, Kimia Hassanzadeh, Daniel Graves, Dong Chen, Zhengbang Zhu, Nhat Nguyen, Mohamed Elsayed, Kun Shao, Sanjeevan Ahilan, Baokuan Zhang, Jiannan Wu, Zhengang Fu, Kasra Rezaee, Peyman Yadmellat, et al. (12 additional authors not shown)

    Abstract: Multi-agent interaction is a fundamental aspect of autonomous driving in the real world. Despite more than a decade of research and development, the problem of how to competently interact with diverse road users in diverse scenarios remains largely unsolved. Learning methods have much to offer towards solving this problem. But they require a realistic multi-agent simulator that generates diverse a…

    Submitted 31 October, 2020; v1 submitted 19 October, 2020; originally announced October 2020.

    Comments: 20 pages, 11 figures. Paper accepted to CoRL 2020

  7. arXiv:2010.09536  [pdf, other]

    cs.LG cs.AI

    What About Inputing Policy in Value Function: Policy Representation and Policy-extended Value Function Approximator

    Authors: Hongyao Tang, Zhaopeng Meng, Jianye Hao, Chen Chen, Daniel Graves, Dong Li, Changmin Yu, Hangyu Mao, Wulong Liu, Yaodong Yang, Wenyuan Tao, Li Wang

    Abstract: We study Policy-extended Value Function Approximator (PeVFA) in Reinforcement Learning (RL), which extends conventional value function approximator (VFA) to take as input not only the state (and action) but also an explicit policy representation. Such an extension enables PeVFA to preserve values of multiple policies at the same time and brings an appealing characteristic, i.e., \emph{value genera…

    Submitted 15 December, 2021; v1 submitted 19 October, 2020; originally announced October 2020.

    Comments: Accepted as a conference paper on AAAI 2022

  8. arXiv:2006.15110  [pdf, other]

    cs.LG cs.RO

    Learning predictive representations in autonomous driving to improve deep reinforcement learning

    Authors: Daniel Graves, Nhat M. Nguyen, Kimia Hassanzadeh, Jun Jin

    Abstract: Reinforcement learning using a novel predictive representation is applied to autonomous driving to accomplish the task of driving between lane markings where substantial benefits in performance and generalization are observed on unseen test roads in both simulation and on a real Jackal robot. The novel predictive representation is learned by general value functions (GVFs) to provide out-of-policy,…

    Submitted 26 June, 2020; originally announced June 2020.

  9. arXiv:2001.09113  [pdf, other]

    cs.RO cs.AI eess.SY

    Perception as prediction using general value functions in autonomous driving applications

    Authors: Daniel Graves, Kasra Rezaee, Sean Scheideman

    Abstract: We propose and demonstrate a framework called perception as prediction for autonomous driving that uses general value functions (GVFs) to learn predictions. Perception as prediction learns data-driven predictions relating to the impact of actions on the agent's perception of the world. It also provides a data-driven approach to predict the impact of the anticipated behavior of other agents on the…

    Submitted 24 January, 2020; originally announced January 2020.

    Comments: 8 pages, 6 figures, IROS 2019 conference

  10. arXiv:1911.08610  [pdf, other]

    cs.LG cs.AI stat.ML

    Efficient decorrelation of features using Gramian in Reinforcement Learning

    Authors: Borislav Mavrin, Daniel Graves, Alan Chan

    Abstract: Learning good representations is a long-standing problem in reinforcement learning (RL). One of the conventional ways to achieve this goal in the supervised setting is through regularization of the parameters. Extending some of these ideas to the RL setting has not yielded similar improvements in learning. In this paper, we develop an online regularization framework for decorrelating features in R…

    Submitted 19 November, 2019; originally announced November 2019.

  11. Mapless Navigation among Dynamics with Social-safety-awareness: a reinforcement learning approach from 2D laser scans

    Authors: Jun Jin, Nhat M. Nguyen, Nazmus Sakib, Daniel Graves, Hengshuai Yao, Martin Jagersand

    Abstract: We propose a method to tackle the problem of mapless collision-avoidance navigation where humans are present using 2D laser scans. Our proposed method uses ego-safety to measure collision from the robot's perspective while social-safety to measure the impact of our robot's actions on surrounding pedestrians. Specifically, the social-safety part predicts the intrusion impact of our robot's action i…

    Submitted 5 March, 2020; v1 submitted 8 November, 2019; originally announced November 2019.

    Comments: Accepted in ICRA 2020

  12. arXiv:1909.03906  [pdf, other]

    cs.LG cs.AI

    Fixed-Horizon Temporal Difference Methods for Stable Reinforcement Learning

    Authors: Kristopher De Asis, Alan Chan, Silviu Pitis, Richard S. Sutton, Daniel Graves

    Abstract: We explore fixed-horizon temporal difference (TD) methods, reinforcement learning algorithms for a new kind of value function that predicts the sum of rewards over a $\textit{fixed}$ number of future time steps. To learn the value function for horizon $h$, these algorithms bootstrap from the value function for horizon $h-1$, or some shorter horizon. Because no value function bootstraps from itself…

    Submitted 10 February, 2020; v1 submitted 9 September, 2019; originally announced September 2019.

    Comments: AAAI 2020

    ACM Class: I.2

  13. arXiv:1906.04328  [pdf, other]

    cs.LG cs.AI stat.ML

    Importance Resampling for Off-policy Prediction

    Authors: Matthew Schlegel, Wesley Chung, Daniel Graves, Jian Qian, Martha White

    Abstract: Importance sampling (IS) is a common reweighting strategy for off-policy prediction in reinforcement learning. While it is consistent and unbiased, it can result in high variance updates to the weights for the value function. In this work, we explore a resampling strategy as an alternative to reweighting. We propose Importance Resampling (IR) for off-policy prediction, which resamples experience f…

    Submitted 13 November, 2019; v1 submitted 10 June, 2019; originally announced June 2019.

    Comments: Recently published in NeurIPS 2019

  14. arXiv:1610.08833  [pdf, other]

    cs.DC astro-ph.HE gr-qc

    A Survey of High Level Frameworks in Block-Structured Adaptive Mesh Refinement Packages

    Authors: Anshu Dubey, Ann Almgren, John Bell, Martin Berzins, Steve Brandt, Greg Bryan, Phillip Colella, Daniel Graves, Michael Lijewski, Frank Löffler, Brian O'Shea, Erik Schnetter, Brian Van Straalen, Klaus Weide

    Abstract: Over the last decade block-structured adaptive mesh refinement (SAMR) has found increasing use in large, publicly available codes and frameworks. SAMR frameworks have evolved along different paths. Some have stayed focused on specific domain areas, others have pursued a more general functionality, providing the building blocks for a larger variety of applications. In this survey paper we examine a…

    Submitted 27 October, 2016; originally announced October 2016.

    Journal ref: Journal of Parallel and Distributed Computing, Volume 74, Issue 12, December 2014, Pages 3217-3227