Showing 1–11 of 11 results for author: Madhushani, U

  1. arXiv:2201.13288  [pdf, other]

    math.OC cs.LG stat.ML

    A Regret Minimization Approach to Multi-Agent Control

    Authors: Udaya Ghai, Udari Madhushani, Naomi Leonard, Elad Hazan

    Abstract: We study the problem of multi-agent control of a dynamical system with known dynamics and adversarial disturbances. Our study focuses on optimal control without centralized precomputed policies, but rather with adaptive control policies for the different agents that are only equipped with a stabilizing controller. We give a reduction from any (standard) regret minimizing control method to a distri…

    Submitted 25 February, 2022; v1 submitted 28 January, 2022; originally announced January 2022.

    Journal ref: Proceedings of the 39th International Conference on Machine Learning, PMLR 162:7422-7434, 2022

  2. arXiv:2110.07392  [pdf, other]

    cs.LG cs.MA math.OC

    Provably Efficient Multi-Agent Reinforcement Learning with Fully Decentralized Communication

    Authors: Justin Lidard, Udari Madhushani, Naomi Ehrich Leonard

    Abstract: A challenge in reinforcement learning (RL) is minimizing the cost of sampling associated with exploration. Distributed exploration reduces sampling complexity in multi-agent RL (MARL). We investigate the benefits to performance in MARL when exploration is fully decentralized. Specifically, we consider a class of online, episodic, tabular $Q$-learning problems under time-varying reward and transiti…

    Submitted 2 May, 2022; v1 submitted 14 October, 2021; originally announced October 2021.

    Comments: Accepted as a conference paper to American Control Conference (ACC) 2022

  3. arXiv:2011.07720  [pdf, other]

    stat.ML cs.LG math.PR

    Distributed Bandits: Probabilistic Communication on $d$-regular Graphs

    Authors: Udari Madhushani, Naomi Ehrich Leonard

    Abstract: We study the decentralized multi-agent multi-armed bandit problem for agents that communicate with probability over a network defined by a $d$-regular graph. Every edge in the graph has probabilistic weight $p$ to account for the ($1\!-\!p$) probability of a communication link failure. At each time step, each agent chooses an arm and receives a numerical reward associated with the chosen arm. Afte…

    Submitted 8 October, 2021; v1 submitted 15 November, 2020; originally announced November 2020.

  4. arXiv:2011.05927  [pdf, other]

    cs.LG eess.SY math.OC

    On Using Hamiltonian Monte Carlo Sampling for Reinforcement Learning Problems in High-dimension

    Authors: Udari Madhushani, Biswadip Dey, Naomi Ehrich Leonard, Amit Chakraborty

    Abstract: Value function based reinforcement learning (RL) algorithms, for example, $Q$-learning, learn optimal policies from datasets of actions, rewards, and state transitions. However, when the underlying state transition dynamics are stochastic and evolve on a high-dimensional space, generating independent and identically distributed (IID) data samples for creating these datasets poses a significant cha…

    Submitted 28 March, 2022; v1 submitted 11 November, 2020; originally announced November 2020.

  5. arXiv:2009.01339  [pdf, other]

    math.OC stat.ML

    Heterogeneous Explore-Exploit Strategies on Multi-Star Networks

    Authors: Udari Madhushani, Naomi Leonard

    Abstract: We investigate the benefits of heterogeneity in multi-agent explore-exploit decision making where the goal of the agents is to maximize cumulative group reward. To do so we study a class of distributed stochastic bandit problems in which agents communicate over a multi-star network and make sequential choices among options in the same uncertain environment. Typically, in multi-agent bandit problem…

    Submitted 1 December, 2020; v1 submitted 2 September, 2020; originally announced September 2020.

  6. arXiv:2004.06171  [pdf, other]

    cs.LG math.OC stat.ML

    Distributed Learning: Sequential Decision Making in Resource-Constrained Environments

    Authors: Udari Madhushani, Naomi Ehrich Leonard

    Abstract: We study cost-effective communication strategies that can be used to improve the performance of distributed learning systems in resource-constrained environments. For distributed learning in sequential decision making, we propose a new cost-effective partial communication protocol. We illustrate that with this protocol the group obtains the same order of performance that it obtains with full commu…

    Submitted 13 April, 2020; originally announced April 2020.

  7. arXiv:2004.03793  [pdf, other]

    math.OC cs.LG

    A Dynamic Observation Strategy for Multi-agent Multi-armed Bandit Problem

    Authors: Udari Madhushani, Naomi Ehrich Leonard

    Abstract: We define and analyze a multi-agent multi-armed bandit problem in which decision-making agents can observe the choices and rewards of their neighbors under a linear observation cost. Neighbors are defined by a network graph that encodes the inherent observation constraints of the system. We define a cost associated with observations such that at every instance an agent makes an observation it rece…

    Submitted 7 April, 2020; originally announced April 2020.

  8. arXiv:1905.08731  [pdf, other]

    math.OC cs.LG

    Heterogeneous Stochastic Interactions for Multiple Agents in a Multi-armed Bandit Problem

    Authors: Udari Madhushani, Naomi Ehrich Leonard

    Abstract: We define and analyze a multi-agent multi-armed bandit problem in which decision-making agents can observe the choices and rewards of their neighbors. Neighbors are defined by a network graph with heterogeneous and stochastic interconnections. These interactions are determined by the sociability of each agent, which corresponds to the probability that the agent observes its neighbors. We design an…

    Submitted 21 May, 2019; originally announced May 2019.

  9. arXiv:1610.04395  [pdf, other]

    math.OC

    A Geometric PID Control Framework for Mechanical Systems

    Authors: D. H. S. Maithripala, T. W. U. Madhushani, J. M. Berg

    Abstract: These lectures demonstrate the development of a PID control framework for mechanical systems. Based on the observation that mechanical systems are essentially double integrator systems, we generalize the linear PID controller to mechanical systems that have a non-Euclidean configuration space. Specifically we start by presenting the development of the geometric PID controller for fully actuated me…

    Submitted 14 October, 2016; originally announced October 2016.

    Comments: Invited Talk, Systems & Control Engineering, IIT Bombay, 19-26 October 2016

    MSC Class: 93-02

  10. arXiv:1609.09557  [pdf, other]

    math.OC

    Feedback Regularization and Geometric PID Control for Trajectory Tracking of Coupled Mechanical Systems: Hoop Robots on an Inclined Plane

    Authors: T. W. U. Madhushani, D. H. S. Maithripala, J. M. Berg

    Abstract: This paper applies geometric PID control for asymptotic tracking of a desired trajectory by a hoop robot in the presence of disturbances and uncertainties. The hoop robot, consisting of a circular body rolling without slip along a one-dimensional surface, is a planar analog of a spherical robot. A variety of coupled mechanical systems may be used to actuate the hoop robot. This paper specifically c…

    Submitted 26 February, 2017; v1 submitted 29 September, 2016; originally announced September 2016.

    Comments: To be presented at the American Control Conference 2017, May 24--26, Seattle, WA, USA

    MSC Class: 93D06

  11. Semi-globally Exponential Trajectory Tracking for a Class of Spherical Robots

    Authors: T. W. U. Madhushani, D. H. S. Maithripala, J. V. Wijayakulasooriya, J. M. Berg

    Abstract: A spherical robot consists of an externally spherical rigid body rolling on a two-dimensional surface, actuated by an auxiliary mechanism. For a class of actuation mechanisms, we derive a controller for the geometric center of the sphere to asymptotically track any sufficiently smooth reference trajectory, with robustness to bounded, constant uncertainties in the inertial properties of the sphere…

    Submitted 1 March, 2017; v1 submitted 4 August, 2016; originally announced August 2016.

    Comments: Pre-print submitted to Automatica on 01/03/2017