Skip to main content

Showing 1–9 of 9 results for author: Hayes, C F

Searching in archive cs. Search in all archives.
.
  1. arXiv:2402.02665  [pdf, ps, other

    cs.LG

    Utility-Based Reinforcement Learning: Unifying Single-objective and Multi-objective Reinforcement Learning

    Authors: Peter Vamplew, Cameron Foale, Conor F. Hayes, Patrick Mannion, Enda Howley, Richard Dazeley, Scott Johnson, Johan Källström, Gabriel Ramos, Roxana Rădulescu, Willem Röpke, Diederik M. Roijers

    Abstract: Research in multi-objective reinforcement learning (MORL) has introduced the utility-based paradigm, which makes use of both environmental rewards and a function that defines the utility derived by the user from those rewards. In this paper we extend this paradigm to the context of single-objective reinforcement learning (RL), and outline multiple potential benefits including the ability to perfor… ▽ More

    Submitted 4 February, 2024; originally announced February 2024.

    Comments: Accepted for the Blue Sky Track at AAMAS'24

  2. arXiv:2305.05560  [pdf, other

    cs.AI

    Distributional Multi-Objective Decision Making

    Authors: Willem Röpke, Conor F. Hayes, Patrick Mannion, Enda Howley, Ann Nowé, Diederik M. Roijers

    Abstract: For effective decision support in scenarios with conflicting objectives, sets of potentially optimal solutions can be presented to the decision maker. We explore both what policies these sets should contain and how such sets can be computed efficiently. With this in mind, we take a distributional approach and introduce a novel dominance criterion relating return distributions of policies directly.… ▽ More

    Submitted 18 July, 2023; v1 submitted 9 May, 2023; originally announced May 2023.

    Comments: Accepted at IJCAI 2023

  3. arXiv:2211.13032  [pdf, other

    cs.AI cs.LG

    Monte Carlo Tree Search Algorithms for Risk-Aware and Multi-Objective Reinforcement Learning

    Authors: Conor F. Hayes, Mathieu Reymond, Diederik M. Roijers, Enda Howley, Patrick Mannion

    Abstract: In many risk-aware and multi-objective reinforcement learning settings, the utility of the user is derived from a single execution of a policy. In these settings, making decisions based on the average future returns is not suitable. For example, in a medical setting a patient may only have one opportunity to treat their illness. Making decisions using just the expected future returns -- known in r… ▽ More

    Submitted 6 December, 2022; v1 submitted 23 November, 2022; originally announced November 2022.

    Comments: arXiv admin note: substantial text overlap with arXiv:2102.00966

  4. arXiv:2207.00368  [pdf, other

    cs.AI cs.LG

    Multi-Objective Coordination Graphs for the Expected Scalarised Returns with Generative Flow Models

    Authors: Conor F. Hayes, Timothy Verstraeten, Diederik M. Roijers, Enda Howley, Patrick Mannion

    Abstract: Many real-world problems contain multiple objectives and agents, where a trade-off exists between objectives. Key to solving such problems is to exploit sparse dependency structures that exist between agents. For example, in wind farm control a trade-off exists between maximising power and minimising stress on the systems components. Dependencies between turbines arise due to the wake effect. We m… ▽ More

    Submitted 1 July, 2022; originally announced July 2022.

  5. arXiv:2204.05027  [pdf, ps, other

    cs.LG cs.AI q-bio.PE

    Exploring the Pareto front of multi-objective COVID-19 mitigation policies using reinforcement learning

    Authors: Mathieu Reymond, Conor F. Hayes, Lander Willem, Roxana Rădulescu, Steven Abrams, Diederik M. Roijers, Enda Howley, Patrick Mannion, Niel Hens, Ann Nowé, Pieter Libin

    Abstract: Infectious disease outbreaks can have a disruptive impact on public health and societal processes. As decision making in the context of epidemic mitigation is hard, reinforcement learning provides a methodology to automatically learn prevention strategies in combination with complex epidemic models. Current research focuses on optimizing policies w.r.t. a single objective, such as the pathogen's a… ▽ More

    Submitted 11 April, 2022; originally announced April 2022.

  6. arXiv:2112.15422  [pdf, other

    cs.AI

    Scalar reward is not enough: A response to Silver, Singh, Precup and Sutton (2021)

    Authors: Peter Vamplew, Benjamin J. Smith, Johan Kallstrom, Gabriel Ramos, Roxana Radulescu, Diederik M. Roijers, Conor F. Hayes, Fredrik Heintz, Patrick Mannion, Pieter J. K. Libin, Richard Dazeley, Cameron Foale

    Abstract: The recent paper `"Reward is Enough" by Silver, Singh, Precup and Sutton posits that the concept of reward maximisation is sufficient to underpin all intelligence, both natural and artificial. We contest the underlying assumption of Silver et al. that such reward can be scalar-valued. In this paper we explain why scalar rewards are insufficient to account for some aspects of both biological and co… ▽ More

    Submitted 24 November, 2021; originally announced December 2021.

  7. Expected Scalarised Returns Dominance: A New Solution Concept for Multi-Objective Decision Making

    Authors: Conor F. Hayes, Timothy Verstraeten, Diederik M. Roijers, Enda Howley, Patrick Mannion

    Abstract: In many real-world scenarios, the utility of a user is derived from the single execution of a policy. In this case, to apply multi-objective reinforcement learning, the expected utility of the returns must be optimised. Various scenarios exist where a user's preferences over objectives (also known as the utility function) are unknown or difficult to specify. In such scenarios, a set of optimal pol… ▽ More

    Submitted 1 July, 2022; v1 submitted 2 June, 2021; originally announced June 2021.

  8. A Practical Guide to Multi-Objective Reinforcement Learning and Planning

    Authors: Conor F. Hayes, Roxana Rădulescu, Eugenio Bargiacchi, Johan Källström, Matthew Macfarlane, Mathieu Reymond, Timothy Verstraeten, Luisa M. Zintgraf, Richard Dazeley, Fredrik Heintz, Enda Howley, Athirai A. Irissappane, Patrick Mannion, Ann Nowé, Gabriel Ramos, Marcello Restelli, Peter Vamplew, Diederik M. Roijers

    Abstract: Real-world decision-making tasks are generally complex, requiring trade-offs between multiple, often conflicting, objectives. Despite this, the majority of research in reinforcement learning and decision-theoretic planning either assumes only a single objective, or that multiple objectives can be adequately handled via a simple linear combination. Such approaches may oversimplify the underlying pr… ▽ More

    Submitted 17 March, 2021; originally announced March 2021.

    Journal ref: Auton Agent Multi-Agent Syst 36, 26 (2022)

  9. arXiv:2102.00966  [pdf, other

    cs.LG cs.AI

    Risk Aware and Multi-Objective Decision Making with Distributional Monte Carlo Tree Search

    Authors: Conor F. Hayes, Mathieu Reymond, Diederik M. Roijers, Enda Howley, Patrick Mannion

    Abstract: In many risk-aware and multi-objective reinforcement learning settings, the utility of the user is derived from the single execution of a policy. In these settings, making decisions based on the average future returns is not suitable. For example, in a medical setting a patient may only have one opportunity to treat their illness. When making a decision, just the expected return -- known in reinfo… ▽ More

    Submitted 2 February, 2021; v1 submitted 1 February, 2021; originally announced February 2021.

    Comments: 8 pages, 4 figures