Incentive Designs for Learning Agents to Stabilize Coupled Exogenous Systems

Authors: Jair Certório, Nuno C. Martins, Richard J. La, Murat Arcak

Abstract: We consider a large population of learning agents noncooperatively selecting strategies from a common set, influencing the dynamics of an exogenous system (ES) we seek to stabilize at a desired equilibrium. Our approach is to design a dynamic payoff mechanism capable of shaping the population's strategy profile, thus affecting the ES's state, by offering incentives for specific strategies within budget limits. Employing system-theoretic passivity concepts, we establish conditions under which a payoff mechanism can be systematically constructed to ensure the global asymptotic stabilization of the ES's equilibrium. In comparison to previous approaches originally studied in the context of the so-called epidemic population games, the method proposed here allows for more realistic epidemic models and other types of ES, such as predator-prey dynamics. Stabilization is established with the support of a Lyapunov function, which provides useful bounds on the transients.

Submitted 26 March, 2024; originally announced March 2024.

Comments: 8 pages, 3 figures

MSC Class: 92D10; 92D25

arXiv:2401.15475 [pdf, other]

Epidemic Population Games And Perturbed Best Response Dynamics

Authors: Shinkyu Park, Jair Certorio, Nuno C. Martins, Richard J. La

Abstract: This paper proposes an approach to mitigate epidemic spread in a population of strategic agents by encouraging safer behaviors through carefully designed rewards. These rewards, which vary according to the state of the epidemic, are ascribed by a dynamic payoff mechanism we seek to design. We use a modified SIRS model to track how the epidemic progresses in response to the strategic choices of the population's agents. By employing perturbed best response evolutionary dynamics to model the population's strategic behavior, we extend previous related work so as to allow for noise in the agents' perceptions of the rewards and intrinsic costs of the available strategies. Central to our approach is the use of system-theoretic methods and passivity concepts to obtain a Lyapunov function, ensuring the global asymptotic stability of an endemic equilibrium with minimized infection prevalence, under budget constraints. We use the Lyapunov function to construct anytime upper bounds for the size of the population's infectious fraction. For a class of one-parameter perturbed best response models, we propose a method to learn the model's parameter from data.

Submitted 22 February, 2024; v1 submitted 27 January, 2024; originally announced January 2024.

arXiv:2312.07598 [pdf, ps, other]

Differential Equation Approximations for Population Games using Elementary Probability

Authors: Semih Kara, Nuno C. Martins

Abstract: Population games model the evolution of strategic interactions among a large number of uniform agents. Due to the agents' uniformity and quantity, their aggregate strategic choices can be approximated by the solutions of a class of ordinary differential equations. This mean-field approach has been found to be an effective tool of analysis. However, its current proofs rely on advanced mathematical techniques, making them less accessible. In this article, we present a simpler derivation, using only undergraduate-level probability.

Submitted 11 December, 2023; originally announced December 2023.

arXiv:2204.00953 [pdf, ps, other]

Epidemic Population Games With Nonnegligible Disease Death Rate

Authors: Jair Certorio, Nuno C. Martins, Richard J. La

Abstract: A recent article that combines normalized epidemic compartmental models and population games put forth a system theoretic approach to capture the coupling between a population's strategic behavior and the course of an epidemic. It introduced a payoff mechanism that governs the population's strategic choices via incentives, leading to the lowest endemic proportion of infectious individuals subject to cost constraints. Under the assumption that the disease death rate is approximately zero, it uses a Lyapunov function to prove convergence and formulate a quasi-convex program to compute an upper bound for the peak size of the population's infectious fraction. In this article, we generalize these results to the case in which the disease death rate is nonnegligible. This generalization brings on additional coupling terms in the normalized compartmental model, leading to a more intricate Lyapunov function and payoff mechanism. Moreover, the associated upper bound can no longer be determined exactly, but it can be computed with arbitrary accuracy by solving a set of convex programs.

Submitted 2 April, 2022; originally announced April 2022.

Comments: 7 pages, 2 figures. arXiv admin note: text overlap with arXiv:2201.10529

MSC Class: 92D10; 92D25

arXiv:2204.00593 [pdf, ps, other]

Population Games With Erlang Clocks: Convergence to Nash Equilibria For Pairwise Comparison Dynamics

Authors: Semih Kara, Nuno C. Martins, Murat Arcak

Abstract: The prevailing methodology for analyzing population games and evolutionary dynamics in the large population limit assumes that a Poisson process (or clock) inherent to each agent determines when the agent can revise its strategy. Hence, such an approach presupposes exponentially distributed inter-revision intervals, and is inadequate for cases where each strategy entails a sequence of sub-tasks (sub-strategies) that must be completed before a new revision time occurs. This article proposes a methodology for such cases under the premise that a sub-strategy's duration is exponentially distributed, leading to Erlang-distributed inter-revision intervals. We assume that a so-called pairwise-comparison protocol captures the agents' revision preferences to render our analysis concrete. The presence of sub-strategies brings on additional dynamics that are incompatible with existing models and results. Our main contributions are twofold, both derived for a deterministic approximation valid for large populations. We prove convergence of the population's state to the Nash equilibrium set when a potential game generates a payoff for the strategies. We use system-theoretic passivity to determine conditions under which this convergence is guaranteed for contractive games.

Submitted 1 April, 2022; originally announced April 2022.

Comments: Submitted to the 61st IEEE Conference on Decision and Control

arXiv:2201.10529 [pdf, ps, other]

Epidemic Population Games And Evolutionary Dynamics

Authors: Nuno C. Martins, Jair Certorio, Richard J. La

Abstract: We propose a system theoretic approach to select and stabilize the endemic equilibrium of an SIRS epidemic model in which the decisions of a population of strategically interacting agents determine the transmission rate. Specifically, the population's agents recurrently revise their choices out of a set of strategies that impact to varying levels the transmission rate. A payoff vector quantifying the incentives provided by a planner for each strategy, after deducting the strategies' intrinsic costs, influences the revision process. An evolutionary dynamics model captures the population's preferences in the revision process by specifying as a function of the payoff vector the rates at which the agents' choices flow toward strategies with higher payoffs. Our main result is a dynamic payoff mechanism that is guaranteed to steer the epidemic variables (via incentives to the population) to the endemic equilibrium with the smallest infectious fraction, subject to cost constraints. We use a Lyapunov function not only to establish convergence but also to obtain an (anytime) upper bound for the peak size of the population's infectious portion.

Submitted 25 January, 2022; originally announced January 2022.

Comments: 12 pages, 3 figures

MSC Class: 92D10; 92D25

arXiv:2107.02835 [pdf, other]

Pairwise Comparison Evolutionary Dynamics with Strategy-Dependent Revision Rates: Stability and Delta-Passivity (Expanded Version)

Authors: Semih Kara, Nuno C. Martins

Abstract: We report on new stability conditions for evolutionary dynamics in the context of population games. We adhere to the prevailing framework consisting of many agents, grouped into populations, that interact noncooperatively by selecting strategies with a favorable payoff. Each agent is repeatedly allowed to revise its strategy at a rate referred to as the revision rate. Previous stability results considered either that the payoff mechanism was a memoryless potential game, or allowed for dynamics (in the payoff mechanism) at the expense of precluding any explicit dependence of the agents' revision rates on their current strategies. Allowing the dependence of revision rates on strategies is relevant because the agents' strategies at any point in time are generally unequal. To allow for strategy-dependent revision rates and payoff mechanisms that are dynamic (or memoryless games that are not potential), we focus on an evolutionary dynamics class obtained from a straightforward modification of one that stems from the so-called impartial pairwise comparison strategy revision protocol. Revision protocols consistent with the modified class retain from those in the original one the advantage that the agents operate in a fully decentralized manner and with minimal information requirements - they need to access only the payoff values (not the mechanism) of the available strategies. Our main results determine conditions under which system-theoretic passivity properties are assured, which we leverage for stability analysis.

Submitted 6 July, 2021; originally announced July 2021.

arXiv:2005.03797 [pdf, ps, other]

Dissipativity Tools for Convergence to Nash Equilibria in Population Games

Authors: Murat Arcak, Nuno C. Martins

Abstract: We analyze the stability of a nonlinear dynamical model describing the noncooperative strategic interactions among the agents of a finite collection of populations. Each agent selects one strategy at a time and revises it repeatedly according to a protocol that typically prioritizes strategies whose payoffs are either higher than that of the current strategy or exceed the population average. The model is predicated on well-established research in population and evolutionary games, and has two sub-components. The first is the payoff dynamics model (PDM), which ascribes the payoff to each strategy according to the proportions of every population adopting the available strategies. The second sub-component is the evolutionary dynamics model (EDM) that accounts for the revision process. In our model, the social state at equilibrium is a best response to the payoff, and can be viewed as a Nash-like solution that has predictive value when it is globally asymptotically stable (GAS). We present a systematic methodology that ascertains GAS by checking separately whether the EDM and PDM satisfy appropriately defined system-theoretic dissipativity properties. Our work generalizes pioneering methods based on notions of contractivity applicable to memoryless PDMs, and more general system-theoretic passivity conditions. As demonstrated with examples, the added flexibility afforded by our approach is particularly useful when the contraction properties of the PDM are unequal across populations.

Submitted 19 October, 2020; v1 submitted 7 May, 2020; originally announced May 2020.

arXiv:1905.04362 [pdf, ps, other]

Channels, Remote Estimation and Queueing Systems With A Utilization-Dependent Component: A Unifying Survey Of Recent Results

Authors: Varun Jog, Richard J. La, Michael Lin, Nuno C. Martins

Abstract: In this article, we survey the main models, techniques, concepts, and results centered on the design and performance evaluation of engineered systems that rely on a utilization-dependent component (UDC) whose operation may depend on its usage history or assigned workload. Specifically, we report on research themes concentrating on the characterization of the capacity of channels and the design with performance guarantees of remote estimation and queueing systems. Causes for the dependency of a UDC on past utilization include the use of replenishable energy sources to power the transmission of information among the sub-components of a networked system, and the assistance of a human operator for servicing a queue. Our analysis unveils the similarity of the UDC models typically adopted in each of the research themes, and it reveals the differences in the objectives and technical approaches employed. We also identify new challenges and future research directions inspired by the cross-pollination among the central concepts, techniques, and problem formulations of the research themes discussed.

Submitted 11 January, 2021; v1 submitted 10 May, 2019; originally announced May 2019.

arXiv:1605.00690 [pdf, other]

Optimal Remote Estimation Over Use-Dependent Packet-Drop Channels - Extended Version

Authors: David Ward, Nuno C. Martins

Abstract: Consider a discrete-time remote estimation system formed by an encoder, a transmission policy, a channel, and a remote estimator. The encoder assesses a random process that the remote estimator seeks to estimate based on information sent to it by the encoder via the channel. The channel is affected by Bernoulli drops. The instantaneous probability of a drop is governed by a finite state machine (FSM). The state of the FSM is denoted as the channel state. At each time step, the encoder decides whether to attempt a transmission through the packet-drop link. The sequence of encoder decisions is the input to the FSM. This paper seeks to design an encoder, transmission policy and remote estimator that minimize a finite-horizon mean squared error cost. We present two structural results. In the first result, in which we assume that the process to be estimated is white and Gaussian, we show that there is an optimal transmission policy governed by a threshold on the estimation error. The second result characterizes optimal symmetric transmission policies for the case when the measured process is the state of a scalar linear time-invariant plant driven by white Gaussian noise. Use-dependent packet-drop channels can be used to quantify the effect of transmission on channel quality when the encoder is powered by energy harvesting. An application to a mixed initiative system in which a human operator performs visual search tasks is also presented.

Submitted 2 May, 2016; originally announced May 2016.

arXiv:1401.0926 [pdf, ps, other]

A Class of LTI Distributed Observers for LTI Plants: Necessary and Sufficient Conditions for Stabilizability

Authors: Shinkyu Park, Nuno C. Martins

Abstract: Consider that an autonomous linear time-invariant (LTI) plant is given and that a network of LTI observers assesses its output vector. The dissemination of information within the network is dictated by a pre-specified directed graph in which each vertex represents an observer. Each observer computes its own state estimate using only the portion of the output vector accessible to it and the state estimates of other observers that are transmitted to it by its neighbors, according to the graph. This paper proposes an update rule that is a natural generalization of consensus, and for which we determine necessary and sufficient conditions for the existence of parameters for the update rule that lead to asymptotic omniscience of the state of the plant at all observers. The conditions reduce to certain detectability requirements that imply that if omniscience is not possible under the proposed scheme then it is not viable under any other scheme that is subject to the same communication graph, including nonlinear and time-varying ones.

Submitted 5 January, 2014; originally announced January 2014.

arXiv:1209.5805 [pdf, other]

Memoryless Control Design for Persistent Surveillance under Safety Constraints

Authors: Eduardo Arvelo, Eric Kim, Nuno C. Martins

Abstract: This paper deals with the design of time-invariant memoryless control policies for robots that move in a finite two-dimensional lattice and are tasked with persistent surveillance of an area in which there are forbidden regions. We model each robot as a controlled Markov chain whose state comprises its position in the lattice and the direction of motion. The goal is to find the minimum number of robots and an associated time-invariant memoryless control policy that guarantees that the largest number of states are persistently surveilled without ever visiting a forbidden state. We propose a design method that relies on a finitely parametrized convex program inspired by entropy maximization principles. Numerical examples are provided.

Submitted 8 November, 2012; v1 submitted 25 September, 2012; originally announced September 2012.

arXiv:1209.2883 [pdf, ps, other]

Control Design for Markov Chains under Safety Constraints: A Convex Approach

Authors: Eduardo Arvelo, Nuno C. Martins

Abstract: This paper focuses on the design of time-invariant memoryless control policies for fully observed controlled Markov chains, with a finite state space. Safety constraints are imposed through a pre-selected set of forbidden states. A state is qualified as safe if it is not a forbidden state and the probability of it transitioning to a forbidden state is zero. The main objective is to obtain control policies whose closed loop generates the maximal set of safe recurrent states, which may include multiple recurrent classes. A design method is proposed that relies on a finitely parametrized convex program inspired by entropy maximization principles. A numerical example is provided and the adoption of additional constraints is discussed.

Submitted 8 November, 2012; v1 submitted 13 September, 2012; originally announced September 2012.

arXiv:1209.1123 [pdf, ps, other]

Stabilizability and Norm-Optimal Control Design subject to Sparsity Constraints

Authors: Serban Sabau, Nuno C. Martins

Abstract: Consider that a linear time-invariant (LTI) plant is given and that we wish to design a stabilizing controller for it. Admissible controllers are LTI and must comply with a pre-selected sparsity pattern. The sparsity pattern is assumed to be quadratically invariant (QI) with respect to the plant, which, from prior results, guarantees that there is a convex parametrization of all admissible stabilizing controllers provided that an initial admissible stable stabilizing controller is given. This paper addresses the previously unsolved problem of determining necessary and sufficient conditions for the existence of an admissible stabilizing controller. The main idea is to cast the existence of such a controller as the feasibility of an exact model-matching problem with stability restrictions, which can be tackled using existing methods. Furthermore, we show that, when it exists, the solution of the model-matching problem can be used to compute an admissible stabilizing controller. This method also leads to a convex parametrization that may be viewed as an extension of Youla's classical approach so as to incorporate sparsity constraints. Applications of this parametrization on the design of norm-optimal controllers via convex methods are also explored. An illustrative example is provided, and a special case is discussed for which the exact model-matching problem has a unique and easily computable solution.

Submitted 5 September, 2012; originally announced September 2012.

arXiv:1203.3136 [pdf, other]

A Receding Horizon Strategy for Systems with Interval-Wise Energy Constraints

Authors: Eduardo Arvelo, Nuno C. Martins

Abstract: We propose a receding horizon control strategy that readily handles systems that exhibit interval-wise total energy constraints on the input control sequence. The approach is based on a variable optimization horizon length and contractive final state constraint sets. The optimization horizon, which recedes by N steps every N steps, is the key to accommodating the interval-wise total energy constraints. The varying optimization horizon along with the contractive constraints are used to achieve analytic asymptotic stability of the system under the proposed scheme. The strategy is demonstrated by simulation examples.

Submitted 21 March, 2012; v1 submitted 14 March, 2012; originally announced March 2012.

arXiv:1109.5466 [pdf, other]

Optimal Sensor Placement for Intruder Detection

Authors: Waseem A. Malik, Nuno C. Martins, Ananthram Swami

Abstract: We consider the centralized detection of an intruder, whose location is modeled as uniform across a specified set of points, using an optimally placed team of sensors. These sensors make conditionally independent observations. The local detectors at the sensors are also assumed to be identical, with detection probability $(P_{_{D}})$ and false alarm probability $(P_{_{F}})$. We formulate the problem as an N-ary hypothesis testing problem, jointly optimizing the sensor placement and detection policies at the fusion center. We prove that uniform sensor placement is never strictly optimal when the number of sensors $(M)$ equals the number of placement points $(N)$. We prove that for $N_{2} > N_{1} > M$, where $N_{1},N_{2}$ are numbers of placement points, the framework utilizing $M$ sensors and $N_{1}$ placement points has the same optimal placement structure as the one utilizing $M$ sensors and $N_{2}$ placement points. For $M\leq 5$ and for fixed $P_{_{D}}$, increasing $P_{_{F}}$ leads to optimal placements that are higher in the majorization-based placement scale. Similarly, for $M\leq 5$ and for fixed $P_{_{F}}$, increasing $P_{_{D}}$ leads to optimal placements that are higher in the majorization-based placement scale. For $M>5$, this result does not necessarily hold and we provide a simple counterexample. It is conjectured that the set of optimal placements for a given $(M,N)$ can always be placed on a majorization-based placement scale.

Submitted 26 September, 2011; originally announced September 2011.

Comments: 63 pages, 5 figures

Showing 1–16 of 16 results for author: Martins, N C