Showing 1–15 of 15 results for author: Lacerda, B

  1. arXiv:2404.07732 [pdf, other]

    cs.AI cs.LG

    Monte Carlo Tree Search with Boltzmann Exploration

    Authors: Michael Painter, Mohamed Baioumy, Nick Hawes, Bruno Lacerda

    Abstract: Monte-Carlo Tree Search (MCTS) methods, such as Upper Confidence Bound applied to Trees (UCT), are instrumental to automated planning techniques. However, UCT can be slow to explore an optimal action when it initially appears inferior to other actions. Maximum ENtropy Tree-Search (MENTS) incorporates the maximum entropy principle into an MCTS approach, utilising Boltzmann policies to sample action…

    Submitted 11 April, 2024; originally announced April 2024.

    Comments: Camera ready version of NeurIPS2023 paper

    Journal ref: Advances in Neural Information Processing Systems 36 (2024)

  2. arXiv:2311.10090 [pdf, other]

    cs.LG cs.AI cs.MA

    JaxMARL: Multi-Agent RL Environments in JAX

    Authors: Alexander Rutherford, Benjamin Ellis, Matteo Gallici, Jonathan Cook, Andrei Lupu, Gardar Ingvarsson, Timon Willi, Akbir Khan, Christian Schroeder de Witt, Alexandra Souly, Saptarashmi Bandyopadhyay, Mikayel Samvelyan, Minqi Jiang, Robert Tjarko Lange, Shimon Whiteson, Bruno Lacerda, Nick Hawes, Tim Rocktäschel, Chris Lu, Jakob Nicolaus Foerster

    Abstract: Benchmarks play an important role in the development of machine learning algorithms. For example, research in reinforcement learning (RL) has been heavily influenced by available environments and benchmarks. However, RL environments are traditionally run on the CPU, limiting their scalability with typical academic compute. Recent advancements in JAX have enabled the wider use of hardware accelerat…

    Submitted 19 December, 2023; v1 submitted 16 November, 2023; originally announced November 2023.

  3. arXiv:2306.09211 [pdf, other]

    cs.LG cs.RO

    A Framework for Learning from Demonstration with Minimal Human Effort

    Authors: Marc Rigter, Bruno Lacerda, Nick Hawes

    Abstract: We consider robot learning in the context of shared autonomy, where control of the system can switch between a human teleoperator and autonomous control. In this setting we address reinforcement learning, and learning from demonstration, where there is a cost associated with human time. This cost represents the human time required to teleoperate the robot, or recover the robot from failures. For e…

    Submitted 15 June, 2023; originally announced June 2023.

    Comments: Preprint version of IEEE Robotics and Automation Letters paper

  4. Formal Modelling for Multi-Robot Systems Under Uncertainty

    Authors: Charlie Street, Masoumeh Mansouri, Bruno Lacerda

    Abstract: Purpose of Review: To effectively synthesise and analyse multi-robot behaviour, we require formal task-level models which accurately capture multi-robot execution. In this paper, we review modelling formalisms for multi-robot systems under uncertainty, and discuss how they can be used for planning, reinforcement learning, model checking, and simulation. Recent Findings: Recent work has investiga…

    Submitted 15 August, 2023; v1 submitted 26 May, 2023; originally announced May 2023.

    Comments: 23 pages, 0 figures, 2 tables. Current Robotics Reports (2023). This version of the article has been accepted for publication, after peer review (when applicable) but is not the Version of Record and does not reflect post-acceptance improvements, or any corrections. The Version of Record is available online at: https://dx.doi.org/10.1007/s43154-023-00104-0

    ACM Class: I.2.9; G.3

  5. arXiv:2212.00124 [pdf, other]

    cs.LG

    One Risk to Rule Them All: A Risk-Sensitive Perspective on Model-Based Offline Reinforcement Learning

    Authors: Marc Rigter, Bruno Lacerda, Nick Hawes

    Abstract: Offline reinforcement learning (RL) is suitable for safety-critical domains where online exploration is too costly or dangerous. In such safety-critical settings, decision-making should take into consideration the risk of catastrophic outcomes. In other words, decision-making should be risk-sensitive. Previous works on risk in offline RL combine together offline RL techniques, to avoid distributio…

    Submitted 30 October, 2023; v1 submitted 30 November, 2022; originally announced December 2022.

    Comments: NeurIPS 2023

  6. arXiv:2204.12581 [pdf, other]

    cs.LG cs.AI

    RAMBO-RL: Robust Adversarial Model-Based Offline Reinforcement Learning

    Authors: Marc Rigter, Bruno Lacerda, Nick Hawes

    Abstract: Offline reinforcement learning (RL) aims to find performant policies from logged data without further environment interaction. Model-based algorithms, which learn a model of the environment from the dataset and perform conservative policy optimisation within that model, have emerged as a promising approach to this problem. In this work, we present Robust Adversarial Model-Based Offline RL (RAMBO),…

    Submitted 11 October, 2022; v1 submitted 26 April, 2022; originally announced April 2022.

    Comments: NeurIPS 2022

  7. arXiv:2110.12746 [pdf, other]

    cs.AI eess.SY

    Planning for Risk-Aversion and Expected Value in MDPs

    Authors: Marc Rigter, Paul Duckworth, Bruno Lacerda, Nick Hawes

    Abstract: Planning in Markov decision processes (MDPs) typically optimises the expected cost. However, optimising the expectation does not consider the risk that for any given run of the MDP, the total cost received may be unacceptably high. An alternative approach is to find a policy which optimises a risk-averse objective such as conditional value at risk (CVaR). However, optimising the CVaR alone may res…

    Submitted 10 March, 2022; v1 submitted 25 October, 2021; originally announced October 2021.

    Comments: Accepted to ICAPS 2022

  8. arXiv:2109.11287 [pdf, other]

    cs.RO

    Risk-Aware Motion Planning in Partially Known Environments

    Authors: Fernando S. Barbosa, Bruno Lacerda, Paul Duckworth, Jana Tumova, Nick Hawes

    Abstract: Recent trends envisage robots being deployed in areas deemed dangerous to humans, such as buildings with gas and radiation leaks. In such situations, the model of the underlying hazardous process might be unknown to the agent a priori, giving rise to the problem of planning for safe behaviour in partially known environments. We employ Gaussian process regression to create a probabilistic model of…

    Submitted 23 September, 2021; originally announced September 2021.

    Comments: 7 pages, 2 figures, to be published in CDC 2021

  9. arXiv:2109.05866 [pdf, other]

    cs.LG cs.AI

    On Solving a Stochastic Shortest-Path Markov Decision Process as Probabilistic Inference

    Authors: Mohamed Baioumy, Bruno Lacerda, Paul Duckworth, Nick Hawes

    Abstract: Previous work on planning as active inference addresses finite horizon problems and solutions valid for online planning. We propose solving the general Stochastic Shortest-Path Markov Decision Process (SSP MDP) as probabilistic inference. Furthermore, we discuss online and offline methods for planning under uncertainty. In an SSP MDP, the horizon is indefinite and unknown a priori. SSP MDPs genera…

    Submitted 13 September, 2021; originally announced September 2021.

    Comments: Presented at the second International Workshop on Active Inference (IWAI 2021); 11 pages, 2 figures

  10. arXiv:2102.05762 [pdf, other]

    cs.LG cs.AI

    Risk-Averse Bayes-Adaptive Reinforcement Learning

    Authors: Marc Rigter, Bruno Lacerda, Nick Hawes

    Abstract: In this work, we address risk-averse Bayes-adaptive reinforcement learning. We pose the problem of optimising the conditional value at risk (CVaR) of the total return in Bayes-adaptive Markov decision processes (MDPs). We show that a policy optimising CVaR in this setting is risk-averse to both the parametric uncertainty due to the prior distribution over MDPs, and the internal uncertainty due to…

    Submitted 26 October, 2021; v1 submitted 10 February, 2021; originally announced February 2021.

    Comments: Full version of NeurIPS 2021 paper

  11. arXiv:2012.04626 [pdf, other]

    cs.AI

    Minimax Regret Optimisation for Robust Planning in Uncertain Markov Decision Processes

    Authors: Marc Rigter, Bruno Lacerda, Nick Hawes

    Abstract: The parameters for a Markov Decision Process (MDP) often cannot be specified exactly. Uncertain MDPs (UMDPs) capture this model ambiguity by defining sets which the parameters belong to. Minimax regret has been proposed as an objective for planning in UMDPs to find robust policies which are not overly conservative. In this work, we focus on planning for Stochastic Shortest Path (SSP) UMDPs with un…

    Submitted 12 February, 2023; v1 submitted 8 December, 2020; originally announced December 2020.

    Comments: Full version of AAAI 2021 paper, with corrigendum attached that describes error in original paper

  12. arXiv:2005.05894 [pdf, other]

    cs.RO

    Active Inference for Integrated State-Estimation, Control, and Learning

    Authors: Mohamed Baioumy, Paul Duckworth, Bruno Lacerda, Nick Hawes

    Abstract: This work presents an approach for control, state-estimation and learning model (hyper)parameters for robotic manipulators. It is based on the active inference framework, prominent in computational neuroscience as a theory of the brain, where behaviour arises from minimizing variational free-energy. The robotic manipulator shows adaptive and robust behaviour compared to state-of-the-art methods. A…

    Submitted 30 March, 2021; v1 submitted 12 May, 2020; originally announced May 2020.

    Comments: 7 pages, 6 figures, accepted for presentation at the International Conference on Robotics and Automation (ICRA) 2021

  13. arXiv:2003.04445 [pdf, other]

    cs.AI

    Convex Hull Monte-Carlo Tree Search

    Authors: Michael Painter, Bruno Lacerda, Nick Hawes

    Abstract: This work investigates Monte-Carlo planning for agents in stochastic environments, with multiple objectives. We propose the Convex Hull Monte-Carlo Tree-Search (CHMCTS) framework, which builds upon Trial Based Heuristic Tree Search and Convex Hull Value Iteration (CHVI), as a solution to multi-objective planning in large environments. Moreover, we consider how to pose the problem of approximating…

    Submitted 23 March, 2020; v1 submitted 9 March, 2020; originally announced March 2020.

    Comments: Camera-ready version of paper accepted to ICAPS 2020, along with relevant appendices

  14. arXiv:1803.02906 [pdf, other]

    cs.AI cs.RO

    Simultaneous Task Allocation and Planning Under Uncertainty

    Authors: Fatma Faruq, Bruno Lacerda, Nick Hawes, David Parker

    Abstract: We propose novel techniques for task allocation and planning in multi-robot systems operating in uncertain environments. Task allocation is performed simultaneously with planning, which provides more detailed information about individual robot behaviour, but also exploits independence between tasks to do so efficiently. We use Markov decision processes to model robot behaviour and linear temporal…

    Submitted 10 August, 2018; v1 submitted 7 March, 2018; originally announced March 2018.

  15. The STRANDS Project: Long-Term Autonomy in Everyday Environments

    Authors: Nick Hawes, Chris Burbridge, Ferdian Jovan, Lars Kunze, Bruno Lacerda, Lenka Mudrová, Jay Young, Jeremy Wyatt, Denise Hebesberger, Tobias Körtner, Rares Ambrus, Nils Bore, John Folkesson, Patric Jensfelt, Lucas Beyer, Alexander Hermans, Bastian Leibe, Aitor Aldoma, Thomas Fäulhammer, Michael Zillich, Markus Vincze, Eris Chinellato, Muhannad Al-Omari, Paul Duckworth, Yiannis Gatsoulis, et al. (8 additional authors not shown)

    Abstract: Thanks to the efforts of the robotics and autonomous systems community, robots are becoming ever more capable. There is also an increasing demand from end-users for autonomous service robots that can operate in real environments for extended periods. In the STRANDS project we are tackling this demand head-on by integrating state-of-the-art artificial intelligence and robotics research into mobile…

    Submitted 14 October, 2016; v1 submitted 15 April, 2016; originally announced April 2016.