
Showing 1–10 of 10 results for author: McInroe, T

  1. arXiv:2406.13376  [pdf, other]

    cs.LG

    Efficient Offline Reinforcement Learning: The Critic is Critical

    Authors: Adam Jelley, Trevor McInroe, Sam Devlin, Amos Storkey

    Abstract: Recent work has demonstrated both benefits and limitations from using supervised approaches (without temporal-difference learning) for offline reinforcement learning. While off-policy reinforcement learning provides a promising approach for improving performance beyond supervised approaches, we observe that training is often inefficient and unstable due to temporal difference bootstrapping. In thi…

    Submitted 19 June, 2024; originally announced June 2024.

  2. arXiv:2404.14285  [pdf, other]

    cs.RO cs.AI

    LLM-Personalize: Aligning LLM Planners with Human Preferences via Reinforced Self-Training for Housekeeping Robots

    Authors: Dongge Han, Trevor McInroe, Adam Jelley, Stefano V. Albrecht, Peter Bell, Amos Storkey

    Abstract: Large language models (LLMs) have shown significant potential for robotics applications, particularly task planning, by harnessing their language comprehension and text generation capabilities. However, in applications such as household robotics, a critical gap remains in the personalization of these models to individual user preferences. We introduce LLM-Personalize, a novel framework with an opt…

    Submitted 22 April, 2024; originally announced April 2024.

  3. arXiv:2310.05723  [pdf, other]

    cs.LG

    Planning to Go Out-of-Distribution in Offline-to-Online Reinforcement Learning

    Authors: Trevor McInroe, Adam Jelley, Stefano V. Albrecht, Amos Storkey

    Abstract: Offline pretraining with a static dataset followed by online fine-tuning (offline-to-online, or OtO) is a paradigm well matched to a real-world RL deployment process. In this scenario, we aim to find the best-performing policy within a limited budget of online interactions. Previous work in the OtO setting has focused on correcting for bias introduced by the policy-constraint mechanisms of offline…

    Submitted 21 June, 2024; v1 submitted 9 October, 2023; originally announced October 2023.

    Comments: 10 pages, 17 figures, published at RLC 2024

  4. arXiv:2305.14133  [pdf, other]

    cs.LG

    Conditional Mutual Information for Disentangled Representations in Reinforcement Learning

    Authors: Mhairi Dunion, Trevor McInroe, Kevin Sebastian Luck, Josiah P. Hanna, Stefano V. Albrecht

    Abstract: Reinforcement Learning (RL) environments can produce training data with spurious correlations between features due to the amount of training data or its limited feature coverage. This can lead to RL agents encoding these misleading correlations in their latent representation, preventing the agent from generalising if the correlation changes within the environment or when deployed in the real world…

    Submitted 12 October, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: Conference on Neural Information Processing Systems (NeurIPS), 2023

  5. arXiv:2208.01769  [pdf, other]

    cs.MA cs.AI cs.LG

    Deep Reinforcement Learning for Multi-Agent Interaction

    Authors: Ibrahim H. Ahmed, Cillian Brewitt, Ignacio Carlucho, Filippos Christianos, Mhairi Dunion, Elliot Fosong, Samuel Garcin, Shangmin Guo, Balint Gyevnar, Trevor McInroe, Georgios Papoudakis, Arrasy Rahman, Lukas Schäfer, Massimiliano Tamborski, Giuseppe Vecchio, Cheng Wang, Stefano V. Albrecht

    Abstract: The development of autonomous agents which can interact with other agents to accomplish a given task is a core area of research in artificial intelligence and machine learning. Towards this goal, the Autonomous Agents Research Group develops novel machine learning algorithms for autonomous systems control, with a specific focus on deep reinforcement learning and multi-agent reinforcement learning.…

    Submitted 2 August, 2022; originally announced August 2022.

    Comments: Published in AI Communications Special Issue on Multi-Agent Systems Research in the UK

  6. arXiv:2207.05480  [pdf, other]

    cs.LG

    Temporal Disentanglement of Representations for Improved Generalisation in Reinforcement Learning

    Authors: Mhairi Dunion, Trevor McInroe, Kevin Sebastian Luck, Josiah P. Hanna, Stefano V. Albrecht

    Abstract: Reinforcement Learning (RL) agents are often unable to generalise well to environment variations in the state space that were not observed during training. This issue is especially problematic for image-based RL, where a change in just one variable, such as the background colour, can change many pixels in the image. The changed pixels can lead to drastic changes in the agent's latent representatio…

    Submitted 27 February, 2023; v1 submitted 12 July, 2022; originally announced July 2022.

    Comments: International Conference on Learning Representations (ICLR), 2023

  7. arXiv:2206.11396  [pdf, other]

    cs.LG

    Multi-Horizon Representations with Hierarchical Forward Models for Reinforcement Learning

    Authors: Trevor McInroe, Lukas Schäfer, Stefano V. Albrecht

    Abstract: Learning control from pixels is difficult for reinforcement learning (RL) agents because representation learning and policy learning are intertwined. Previous approaches remedy this issue with auxiliary representation learning tasks, but they either do not consider the temporal aspect of the problem or only consider single-step transitions, which may cause learning inefficiencies if important envi…

    Submitted 29 January, 2024; v1 submitted 22 June, 2022; originally announced June 2022.

    Comments: Published in TMLR

  8. arXiv:2110.04935  [pdf, other]

    cs.LG cs.AI

    Learning Temporally-Consistent Representations for Data-Efficient Reinforcement Learning

    Authors: Trevor McInroe, Lukas Schäfer, Stefano V. Albrecht

    Abstract: Deep reinforcement learning (RL) agents that exist in high-dimensional state spaces, such as those composed of images, have interconnected learning burdens. Agents must learn an action-selection policy that completes their given task, which requires them to learn a representation of the state space that discerns between useful and useless information. The reward function is the only supervised fee…

    Submitted 10 October, 2021; originally announced October 2021.

  9. arXiv:2103.06398  [pdf, other]

    cs.LG

    Analyzing the Hidden Activations of Deep Policy Networks: Why Representation Matters

    Authors: Trevor A. McInroe, Michael Spurrier, Jennifer Sieber, Stephen Conneely

    Abstract: We analyze the hidden activations of neural network policies of deep reinforcement learning (RL) agents and show, empirically, that it's possible to know a priori if a state representation will lend itself to fast learning. RL agents in high-dimensional states have two main learning burdens: (1) to learn an action-selection policy and (2) to learn to discern between useful and non-useful informati…

    Submitted 10 March, 2021; originally announced March 2021.

    Comments: 11 pages, 29 figures

  10. arXiv:2008.12693  [pdf, other]

    cs.LG cs.AI stat.ML

    Sample Efficiency in Sparse Reinforcement Learning: Or Your Money Back

    Authors: Trevor A. McInroe

    Abstract: Sparse rewards present a difficult problem in reinforcement learning and may be inevitable in certain domains with complex dynamics such as real-world robotics. Hindsight Experience Replay (HER) is a recent replay memory development that allows agents to learn in sparse settings by altering memories to show them as successful even though they may not be. While, empirically, HER has shown some succ…

    Submitted 28 August, 2020; originally announced August 2020.