Showing 1–5 of 5 results for author: Castellini, J

  1. arXiv:2311.09459  [pdf, other]

    cs.MA

    On Convex Optimal Value Functions For POSGs

    Authors: Rafael F. Cunha, Jacopo Castellini, Johan Peralez, Jilles S. Dibangoye

    Abstract: Multi-agent planning and reinforcement learning can be challenging when agents cannot see the state of the world or communicate with each other due to communication costs, latency, or noise. Partially Observable Stochastic Games (POSGs) provide a mathematical framework for modelling such scenarios. This paper aims to improve the efficiency of planning and reinforcement learning algorithms for POSG…

    Submitted 6 December, 2023; v1 submitted 15 November, 2023; originally announced November 2023.

    Comments: Currently under review at JAIR

    MSC Class: I.2.6; I.2.8; I.2.11

  2. Difference Rewards Policy Gradients

    Authors: Jacopo Castellini, Sam Devlin, Frans A. Oliehoek, Rahul Savani

    Abstract: Policy gradient methods have become one of the most popular classes of algorithms for multi-agent reinforcement learning. A key challenge, however, that is not addressed by many of these methods is multi-agent credit assignment: assessing an agent's contribution to the overall performance, which is crucial for learning good policies. We propose a novel algorithm called Dr.Reinforce that explicitly…

    Submitted 9 November, 2023; v1 submitted 21 December, 2020; originally announced December 2020.

    Comments: This work has been accepted as an Extended Abstract in Proc. of the 20th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2021), U. Endriss, A. Nowé, F. Dignum, A. Lomuscio (eds.), May 3-7 2021, Online

    ACM Class: I.2.6; I.2.11

    Journal ref: Neural Comput & Applic (2022)

  3. arXiv:1904.02478  [pdf, other]

    cs.NE cs.LG

    Learning Numeracy: Binary Arithmetic with Neural Turing Machines

    Authors: Jacopo Castellini

    Abstract: One of the main problems encountered so far with recurrent neural networks is that they struggle to retain long-time information dependencies in their recurrent connections. Neural Turing Machines (NTMs) attempt to mitigate this issue by providing the neural network with an external portion of memory, in which information can be stored and manipulated later on. The whole mechanism is differentiabl…

    Submitted 6 September, 2019; v1 submitted 4 April, 2019; originally announced April 2019.

  4. Krylov Iterative Methods for the Geometric Mean of Two Matrices Times a Vector

    Authors: Jacopo Castellini

    Abstract: In this work, we present an efficient way to compute the geometric mean of two positive definite matrices times a vector. For this purpose, we investigate the application of methods based on Krylov spaces to compute the square root of a matrix. These methods, using only matrix-vector products, are capable of producing a good approximation of the result with a small computational cost.

    Submitted 3 April, 2019; v1 submitted 4 March, 2019; originally announced March 2019.

    Journal ref: Numerical Algorithms 74(2), 561-571, Springer, 2017

  5. Analysing Factorizations of Action-Value Networks for Cooperative Multi-Agent Reinforcement Learning

    Authors: Jacopo Castellini, Frans A. Oliehoek, Rahul Savani, Shimon Whiteson

    Abstract: Recent years have seen the application of deep reinforcement learning techniques to cooperative multi-agent systems, with great empirical success. However, given the lack of theoretical insight, it remains unclear what the employed neural networks are learning, or how we should enhance their learning power to address the problems on which they fail. In this work, we empirically investigate the lea…

    Submitted 9 November, 2023; v1 submitted 20 February, 2019; originally announced February 2019.

    Comments: This work has been accepted as an Extended Abstract in Proc. of the 18th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2019), N. Agmon, M. E. Taylor, E. Elkind, M. Veloso (eds.), May 2019, Montreal, Canada

    ACM Class: I.2.6; I.2.11

    Journal ref: Auton Agent Multi-Agent Syst 35, 25 (2021)