Showing 1–18 of 18 results for author: Khetarpal, K

  1. arXiv:2407.01800  [pdf, other]

    cs.LG cs.AI

    Normalization and effective learning rates in reinforcement learning

    Authors: Clare Lyle, Zeyu Zheng, Khimya Khetarpal, James Martens, Hado van Hasselt, Razvan Pascanu, Will Dabney

    Abstract: Normalization layers have recently experienced a renaissance in the deep reinforcement learning and continual learning literature, with several works highlighting diverse benefits such as improving loss landscape conditioning and combatting overestimation bias. However, normalization brings with it a subtle but important side effect: an equivalence between growth in the norm of the network paramet…

    Submitted 1 July, 2024; originally announced July 2024.

  2. arXiv:2406.02035  [pdf, other]

    cs.LG cs.AI

    A Unifying Framework for Action-Conditional Self-Predictive Reinforcement Learning

    Authors: Khimya Khetarpal, Zhaohan Daniel Guo, Bernardo Avila Pires, Yunhao Tang, Clare Lyle, Mark Rowland, Nicolas Heess, Diana Borsa, Arthur Guez, Will Dabney

    Abstract: Learning a good representation is a crucial challenge for Reinforcement Learning (RL) agents. Self-predictive learning provides means to jointly learn a latent representation and dynamics model by bootstrapping from future latent representations (BYOL). Recent work has developed theoretical insights into these algorithms by studying a continuous-time ODE model for self-predictive representation le…

    Submitted 4 June, 2024; originally announced June 2024.

  3. arXiv:2402.18762  [pdf, other]

    cs.LG

    Disentangling the Causes of Plasticity Loss in Neural Networks

    Authors: Clare Lyle, Zeyu Zheng, Khimya Khetarpal, Hado van Hasselt, Razvan Pascanu, James Martens, Will Dabney

    Abstract: Underpinning the past decades of work on the design, initialization, and optimization of neural networks is a seemingly innocuous assumption: that the network is trained on a stationary data distribution. In settings where this assumption is violated, e.g. deep reinforcement learning, learning algorithms become unstable and brittle with respect to hyperparameters and even random seeds. O…

    Submitted 28 February, 2024; originally announced February 2024.

  4. arXiv:2402.03575  [pdf, other]

    cs.AI cs.HC

    Toward Human-AI Alignment in Large-Scale Multi-Player Games

    Authors: Sugandha Sharma, Guy Davidson, Khimya Khetarpal, Anssi Kanervisto, Udit Arora, Katja Hofmann, Ida Momennejad

    Abstract: Achieving human-AI alignment in complex multi-agent games is crucial for creating trustworthy AI agents that enhance gameplay. We propose a method to evaluate this alignment using an interpretable task-sets framework, focusing on high-level behavioral tasks instead of low-level policies. Our approach has three components. First, we analyze extensive human gameplay data from Xbox's Bleeding Edge (1…

    Submitted 18 June, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

  5. arXiv:2310.09997  [pdf, other]

    cs.AI cs.LG eess.SY

    Forecaster: Towards Temporally Abstract Tree-Search Planning from Pixels

    Authors: Thomas Jiralerspong, Flemming Kondrup, Doina Precup, Khimya Khetarpal

    Abstract: The ability to plan at many different levels of abstraction enables agents to envision the long-term repercussions of their decisions and thus enables sample-efficient learning. This becomes particularly beneficial in complex environments from high-dimensional state space such as pixels, where the goal is distant and the reward sparse. We introduce Forecaster, a deep hierarchical reinforcement lea…

    Submitted 15 October, 2023; originally announced October 2023.

  6. arXiv:2304.13892  [pdf, other]

    cs.LG cs.AI

    Discovering Object-Centric Generalized Value Functions From Pixels

    Authors: Somjit Nath, Gopeshh Raaj Subbaraj, Khimya Khetarpal, Samira Ebrahimi Kahou

    Abstract: Deep Reinforcement Learning has shown significant progress in extracting useful representations from high-dimensional inputs albeit using hand-crafted auxiliary tasks and pseudo rewards. Automatically learning such representations in an object-centric manner geared towards control and fast adaptation remains an open research problem. In this paper, we introduce a method that tries to discover mean…

    Submitted 27 June, 2023; v1 submitted 26 April, 2023; originally announced April 2023.

    Comments: Accepted at ICML 2023

  7. arXiv:2212.14530  [pdf, other]

    cs.AI cs.LG

    POMRL: No-Regret Learning-to-Plan with Increasing Horizons

    Authors: Khimya Khetarpal, Claire Vernade, Brendan O'Donoghue, Satinder Singh, Tom Zahavy

    Abstract: We study the problem of planning under model uncertainty in an online meta-reinforcement learning (RL) setting where an agent is presented with a sequence of related tasks with limited interactions per task. The agent can use its experience in each task and across tasks to estimate both the transition model and the distribution over tasks. We propose an algorithm to meta-learn the underlying struc…

    Submitted 29 December, 2022; originally announced December 2022.

    Comments: 24 pages, 6 figures

  8. arXiv:2201.09653  [pdf, other]

    cs.LG cs.AI

    The Paradox of Choice: Using Attention in Hierarchical Reinforcement Learning

    Authors: Andrei Nica, Khimya Khetarpal, Doina Precup

    Abstract: Decision-making AI agents are often faced with two important challenges: the depth of the planning horizon, and the branching factor due to having many choices. Hierarchical reinforcement learning methods aim to solve the first problem, by providing shortcuts that skip over multiple time steps. To cope with the breadth, it is desirable to restrict the agent's attention at each step to a reasonable…

    Submitted 24 January, 2022; originally announced January 2022.

    Comments: 20 pages, 15 figures

  9. arXiv:2108.03213  [pdf, other]

    cs.LG cs.AI stat.ML

    Temporally Abstract Partial Models

    Authors: Khimya Khetarpal, Zafarali Ahmed, Gheorghe Comanici, Doina Precup

    Abstract: Humans and animals have the ability to reason and make predictions about different courses of action at many time scales. In reinforcement learning, option models (Sutton, Precup & Singh, 1999; Precup, 2000) provide the framework for this kind of temporally abstract prediction and reasoning. Natural intelligent agents are also able to focus their attention on courses of action that are relevant o…

    Submitted 6 August, 2021; originally announced August 2021.

    Comments: 34 pages, 5 figures

  10. arXiv:2108.01005  [pdf, other]

    cs.LG

    Sequoia: A Software Framework to Unify Continual Learning Research

    Authors: Fabrice Normandin, Florian Golemo, Oleksiy Ostapenko, Pau Rodriguez, Matthew D Riemer, Julio Hurtado, Khimya Khetarpal, Ryan Lindeborg, Lucas Cecchi, Timothée Lesort, Laurent Charlin, Irina Rish, Massimo Caccia

    Abstract: The field of Continual Learning (CL) seeks to develop algorithms that accumulate knowledge and skills over time through interaction with non-stationary environments. In practice, a plethora of evaluation procedures (settings) and algorithmic solutions (methods) exist, each with their own potentially disjoint set of assumptions. This variety makes measuring progress in CL difficult. We propose a ta…

    Submitted 5 June, 2023; v1 submitted 2 August, 2021; originally announced August 2021.

  11. arXiv:2102.01985  [pdf, other]

    cs.LG cs.AI

    Variance Penalized On-Policy and Off-Policy Actor-Critic

    Authors: Arushi Jain, Gandharv Patil, Ayush Jain, Khimya Khetarpal, Doina Precup

    Abstract: Reinforcement learning algorithms are typically geared towards optimizing the expected return of an agent. However, in many practical applications, low variance in the return is desired to ensure the reliability of an algorithm. In this paper, we propose on-policy and off-policy actor-critic algorithms that optimize a performance criterion involving both mean and variance in the return. Previous w…

    Submitted 3 February, 2021; originally announced February 2021.

    Comments: Accepted to the Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21), 2021

  12. arXiv:2012.13490  [pdf, other]

    cs.LG cs.AI

    Towards Continual Reinforcement Learning: A Review and Perspectives

    Authors: Khimya Khetarpal, Matthew Riemer, Irina Rish, Doina Precup

    Abstract: In this article, we aim to provide a literature review of different formulations and approaches to continual reinforcement learning (RL), also known as lifelong or non-stationary RL. We begin by discussing our perspective on why RL is a natural fit for studying continual learning. We then provide a taxonomy of different continual RL formulations by mathematically characterizing two key properties…

    Submitted 11 November, 2022; v1 submitted 24 December, 2020; originally announced December 2020.

    Comments: Journal of Artificial Intelligence Research (JAIR)

  13. arXiv:2007.07206  [pdf, other]

    cs.LG cs.AI stat.ML

    Learning Robust State Abstractions for Hidden-Parameter Block MDPs

    Authors: Amy Zhang, Shagun Sodhani, Khimya Khetarpal, Joelle Pineau

    Abstract: Many control tasks exhibit similar dynamics that can be modeled as having common latent structure. Hidden-Parameter Markov Decision Processes (HiP-MDPs) explicitly model this structure to improve sample efficiency in multi-task settings. However, this setting makes strong assumptions on the observability of the state that limit its application in real-world scenarios with rich observation spaces.…

    Submitted 11 February, 2021; v1 submitted 14 July, 2020; originally announced July 2020.

    Comments: Accepted at the 9th International Conference on Learning Representations. 22 pages, 14 figures

  14. arXiv:2006.15085  [pdf, other]

    cs.LG cs.AI stat.ML

    What can I do here? A Theory of Affordances in Reinforcement Learning

    Authors: Khimya Khetarpal, Zafarali Ahmed, Gheorghe Comanici, David Abel, Doina Precup

    Abstract: Reinforcement learning algorithms usually assume that all actions are always available to an agent. However, both people and animals understand the general link between the features of their environment and the actions that are feasible. Gibson (1977) coined the term "affordances" to describe the fact that certain states enable an agent to do certain actions, in the context of embodied agents. In…

    Submitted 26 June, 2020; originally announced June 2020.

    Comments: Thirty-seventh International Conference on Machine Learning (ICML 2020)

  15. arXiv:2001.00271  [pdf, other]

    cs.LG cs.AI stat.ML

    Options of Interest: Temporal Abstraction with Interest Functions

    Authors: Khimya Khetarpal, Martin Klissarov, Maxime Chevalier-Boisvert, Pierre-Luc Bacon, Doina Precup

    Abstract: Temporal abstraction refers to the ability of an agent to use behaviours of controllers which act for a limited, variable amount of time. The options framework describes such behaviours as consisting of a subset of states in which they can initiate, an internal policy and a stochastic termination condition. However, much of the subsequent work on option discovery has ignored the initiation set, be…

    Submitted 1 January, 2020; originally announced January 2020.

    Comments: To appear in Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20)

  16. arXiv:1811.10732  [pdf, ps, other]

    cs.AI cs.LG

    Environments for Lifelong Reinforcement Learning

    Authors: Khimya Khetarpal, Shagun Sodhani, Sarath Chandar, Doina Precup

    Abstract: To achieve general artificial intelligence, reinforcement learning (RL) agents should learn not only to optimize returns for one specific task but also to constantly build more complex skills and scaffold their knowledge about the world, without forgetting what has already been learned. In this paper, we discuss the desired characteristics of environments that can support the training and evaluati…

    Submitted 6 December, 2018; v1 submitted 26 November, 2018; originally announced November 2018.

    Comments: Accepted at 2nd Continual Learning Workshop, Neural Information Processing Systems (NeurIPS) 2018

  17. arXiv:1807.09664  [pdf, other]

    cs.AI cs.CV

    Attend Before you Act: Leveraging human visual attention for continual learning

    Authors: Khimya Khetarpal, Doina Precup

    Abstract: When humans perform a task, such as playing a game, they selectively pay attention to certain parts of the visual input, gathering relevant information and sequentially combining it to build a representation from the sensory data. In this work, we explore leveraging where humans look in an image as an implicit indication of what is salient for decision making. We build on top of the UNREAL archite…

    Submitted 25 July, 2018; originally announced July 2018.

    Comments: Lifelong Learning: A Reinforcement Learning Approach (LLARLA) Workshop, ICML 2018

  18. Safe Option-Critic: Learning Safety in the Option-Critic Architecture

    Authors: Arushi Jain, Khimya Khetarpal, Doina Precup

    Abstract: Designing hierarchical reinforcement learning algorithms that exhibit safe behaviour is not only vital for practical applications but also facilitates a better understanding of an agent's decisions. We tackle this problem in the options framework, a particular way to specify temporally abstract actions which allow an agent to use sub-policies with start and end conditions. We consider a behaviour…

    Submitted 2 March, 2021; v1 submitted 20 July, 2018; originally announced July 2018.

    Comments: To appear at The Knowledge Engineering Review (KER), 2021. Previous draft appeared in Adaptive Learning Agents (ALA) 2018 workshop held at ICML, AAMAS in Stockholm. Corrected typos, added references and added extra figures

    Journal ref: The Knowledge Engineering Review 36 (2021) e4