-
A Decentralized Policy with Logarithmic Regret for a Class of Multi-Agent Multi-Armed Bandit Problems with Option Unavailability Constraints and Stochastic Communication Protocols
Authors:
Pathmanathan Pankayaraj,
D. H. S. Maithripala,
J. M. Berg
Abstract:
This paper considers a multi-armed bandit (MAB) problem in which multiple mobile agents receive rewards by sampling from a collection of spatially dispersed stochastic processes, called bandits. The goal is to formulate a decentralized policy for each agent, in order to maximize the total cumulative reward over all agents, subject to option availability and inter-agent communication constraints. The problem formulation is motivated by applications in which a team of autonomous mobile robots cooperates to accomplish an exploration and exploitation task in an uncertain environment. Bandit locations are represented by vertices of a spatial graph. At any time, an agent's options consist of sampling the bandit at its current location, or traveling along an edge of the spatial graph to a new bandit location. Communication constraints are described by a directed, non-stationary, stochastic communication graph. At any time, agents may receive data only from their communication graph in-neighbors. For the case of a single agent on a fully connected spatial graph, it is known that the expected regret for any optimal policy is necessarily bounded below by a function that grows as the logarithm of time. A class of policies called upper confidence bound (UCB) algorithms asymptotically achieves logarithmic regret for the classical MAB problem. In this paper, we propose a UCB-based decentralized motion and option selection policy and a non-stationary stochastic communication protocol that guarantee logarithmic regret. To our knowledge, this is the first such decentralized policy for non-fully connected spatial graphs with communication constraints. When the spatial graph is fully connected and the communication graph is stationary, our decentralized algorithm matches or exceeds the best reported prior results from the literature.
Submitted 31 March, 2020; v1 submitted 29 March, 2020;
originally announced March 2020.
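For orientation, the classical single-agent UCB idea mentioned in the abstract above can be sketched as follows. This is a minimal sketch of the standard UCB1 index policy with Bernoulli rewards on a fully connected set of arms, not the decentralized motion and communication policy proposed in the paper. The function name `ucb1`, the arm means, and the horizon are illustrative assumptions.

```python
import math
import random


def ucb1(arm_means, horizon, seed=0):
    """Run the standard UCB1 policy on Bernoulli arms (illustrative sketch).

    arm_means: hypothetical success probabilities, unknown to the policy.
    Returns (pull counts per arm, cumulative expected regret).
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms      # number of times each arm was sampled
    sums = [0.0] * n_arms      # total reward collected from each arm
    best_mean = max(arm_means)
    regret = 0.0

    for t in range(1, horizon + 1):
        if t <= n_arms:
            # Initialization: sample every arm once.
            arm = t - 1
        else:
            # Pick the arm with the largest upper confidence bound:
            # empirical mean plus an exploration bonus that shrinks
            # as the arm is sampled more often.
            arm = max(
                range(n_arms),
                key=lambda i: sums[i] / counts[i]
                + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        # Expected regret grows only when a suboptimal arm is chosen.
        regret += best_mean - arm_means[arm]

    return counts, regret


if __name__ == "__main__":
    counts, regret = ucb1([0.2, 0.5, 0.8], horizon=5000)
    print("pulls per arm:", counts)
    print("cumulative expected regret:", round(regret, 2))
```

The exploration bonus is what keeps the number of suboptimal pulls growing only logarithmically with the horizon in the classical setting; the paper's setting adds travel along the spatial graph and stochastic communication, which this sketch does not model.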
-
Feedback Regularization and Geometric PID Control for Robust Stabilization of a Planar Three-link Hybrid Bipedal Walking Model
Authors:
W. M. L. T. Weerakoon,
T. W. U. Madhushani,
D. H. S. Maithripala,
J. M. Berg
Abstract:
This paper applies a recently developed geometric PID controller to stabilize a three-link planar bipedal hybrid dynamic walking model. The three links represent the robot torso and two kneeless legs, with an independent control torque available at each hip joint. The geometric PID controller is derived for fully actuated mechanical systems; however, in the swing phase the three-link biped robot has three degrees of freedom and only two controls. Following the bipedal walking literature, underactuation is addressed by choosing two "virtual constraints" to enforce, and verifying the stability of the resulting two-dimensional zero dynamics. The resulting controlled dynamics do not have the structure of a mechanical system; however, this structure is restored using "feedback regularization," following which geometric PID control is used to provide robust asymptotic regulation of the virtual constraints. The proposed method can tolerate significantly greater variations in inclination, showing the value of the geometric methods and the benefit of integral action.
Submitted 5 October, 2017;
originally announced October 2017.
-
A Geometric PID Control Framework for Mechanical Systems
Authors:
D. H. S. Maithripala,
T. W. U. Madhushani,
J. M. Berg
Abstract:
These lectures demonstrate the development of a PID control framework for mechanical systems. Based on the observation that mechanical systems are essentially double integrator systems, we generalize the linear PID controller to mechanical systems that have a non-Euclidean configuration space. Specifically, we start by presenting the development of the geometric PID controller for fully actuated mechanical systems and then extend it to a class of underactuated interconnected mechanical systems of practical significance by introducing the notion of feedback regularization. We show that feedback regularization is the mechanical system equivalent to partial feedback linearization. We apply these results to trajectory tracking for several systems of interest in the field of robotics. First, we demonstrate the robust almost-global stability properties of the geometric PID controller developed for fully actuated mechanical systems using simulations and experiments on a multi-rotor aerial vehicle. The extension to the class of underactuated interconnected systems allows one to ensure the semi-almost-global locally exponential tracking of the geometric center of a spherical robot on an inclined plane of unknown angle of inclination. The results are demonstrated using simulations for a hoop rolling on an inclined plane and then for a sphere rolling on an inclined plane. The final extension that we present here is that of geometric PID control for holonomically or non-holonomically constrained mechanical systems on Lie groups. The results are demonstrated by ensuring the robust almost-global locally exponential tracking of a nontrivial spherical pendulum.
Submitted 14 October, 2016;
originally announced October 2016.
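As a point of reference for the abstract above, the Euclidean starting point (a linear PID controller acting on a double integrator) can be sketched as follows. This is a minimal sketch of the classical linear case only, not the geometric PID controller developed in the lectures. The function name `simulate_pid`, the gains, the unit mass, the constant disturbance, and the step size are all illustrative assumptions.

```python
def simulate_pid(kp=4.0, ki=2.0, kd=3.0, disturbance=1.0,
                 target=1.0, dt=0.001, steps=30000):
    """Regulate a unit-mass double integrator x'' = u + d to `target`.

    All numerical values are hypothetical. Returns the final (position,
    velocity) after `steps` semi-implicit Euler steps of size `dt`.
    """
    x, v, integral = 0.0, 0.0, 0.0
    for _ in range(steps):
        error = target - x
        integral += error * dt
        # Linear PID law; for a constant target the derivative of the
        # error is simply -v.
        u = kp * error + ki * integral - kd * v
        # Double integrator driven by the control and a constant
        # unknown disturbance force.
        a = u + disturbance
        v += a * dt
        x += v * dt
    return x, v


if __name__ == "__main__":
    # With integral action the constant disturbance is rejected.
    print("PID:", simulate_pid())
    # Without integral action a steady-state offset d / kp remains.
    print("PD: ", simulate_pid(ki=0.0))
```

With these assumed gains the closed-loop characteristic polynomial is s^3 + 3s^2 + 4s + 2, which is stable, so the PID run settles at the target, while the PD run settles with an offset of disturbance / kp. This is the kind of robustness to constant disturbances that the integral term contributes in the linear setting.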
-
Feedback Regularization and Geometric PID Control for Trajectory Tracking of Coupled Mechanical Systems: Hoop Robots on an Inclined Plane
Authors:
T. W. U. Madhushani,
D. H. S. Maithripala,
J. M. Berg
Abstract:
This paper applies geometric PID control for asymptotic tracking of a desired trajectory by a hoop robot in the presence of disturbances and uncertainties. The hoop robot, consisting of a circular body rolling without slip along a one-dimensional surface, is a planar analog of a spherical robot. A variety of coupled mechanical systems may be used to actuate the hoop robot. This paper specifically considers two different actuators, one a simple pendulum and the other an internal cart. The geometric PID controller requires the plant to be a mechanical system, and the hoop robot does not satisfy this condition. Therefore a geometric inner loop is presented that gives the hoop robot the required structure. This procedure is here referred to as feedback regularization. Feedback regularization, in contrast to feedback linearization, is coordinate independent, and hence reflects the fundamental system structure. Note also that the resulting mechanical system is nonlinear and underactuated. Subsequently, the geometric PID outer loop guarantees almost-semiglobal tracking with locally exponential convergence, and the integral action of the PID guarantees robustness to constant disturbances and parameter uncertainties, including constant inclination of the rolling surface. The complete tracking controller is the composition of the two coordinate-independent loops, and therefore is also coordinate independent.
Submitted 26 February, 2017; v1 submitted 29 September, 2016;
originally announced September 2016.
-
Semi-globally Exponential Trajectory Tracking for a Class of Spherical Robots
Authors:
T. W. U. Madhushani,
D. H. S. Maithripala,
J. V. Wijayakulasooriya,
J. M. Berg
Abstract:
A spherical robot consists of an externally spherical rigid body rolling on a two-dimensional surface, actuated by an auxiliary mechanism. For a class of actuation mechanisms, we derive a controller for the geometric center of the sphere to asymptotically track any sufficiently smooth reference trajectory, with robustness to bounded, constant uncertainties in the inertial properties of the sphere and actuation mechanism, and to constant disturbance forces including, for example, those from constant inclination of the rolling surface. The sphere and actuator are modeled as distinct systems, coupled by reaction forces. It is assumed that the actuator can provide three independent control torques, and that the actuator center of mass remains at a constant distance from the geometric center of the sphere. We show that a necessary and sufficient condition for such a controller to exist is that for any constant disturbance torque acting on the sphere there is a constant input such that the sphere and the actuator mechanism have a stable relative equilibrium. A geometric PID controller guarantees robust, semi-global, locally exponential stability for the position tracking error of the geometric center of the sphere, while ensuring that actuator velocities are bounded.
Submitted 1 March, 2017; v1 submitted 4 August, 2016;
originally announced August 2016.
-
Competition and cooperation: aspects of dynamics in sandpiles
Authors:
Anita Mehta,
J M Luck,
J M Berg,
G C Barker
Abstract:
In this article, we review some of our approaches to granular dynamics, now well known to consist of both fast and slow relaxational processes. In the first case, grains typically compete with each other, while in the second, they cooperate. A typical result of cooperation is the formation of stable bridges, signatures of spatiotemporal inhomogeneities; we review their geometrical characteristics and compare theoretical results with those of independent simulations. Cooperative excitations due to local density fluctuations are also responsible for relaxation at the angle of repose; the competition between these fluctuations and external driving forces can, on the other hand, result in a (rare) collapse of the sandpile to the horizontal. Both these features are present in a theory reviewed here. An arena where the effects of cooperation versus competition are felt most keenly is granular compaction; we review here a random graph model, where three-spin interactions are used to model compaction under tapping. The compaction curve shows distinct regions where 'fast' and 'slow' dynamics apply, separated by what we have called the single-particle relaxation threshold. In the final section of this paper, we explore the effect of shape (jagged vs. regular) on the compaction of packings near their jamming limit. One of our major results is an entropic landscape that, while microscopically rough, manifests Edwards' flatness at a macroscopic level. Another major result is that of surface intermittency under low-intensity shaking.
Submitted 28 March, 2005; v1 submitted 31 December, 2004;
originally announced December 2004.