Search | arXiv e-print repository

Decentralized Fictitious Play Converges Near a Nash Equilibrium in Near-Potential Games

Authors: Sarper Aydin, Sina Arefizadeh, Ceyhun Eksin

Abstract: We investigate convergence of decentralized fictitious play (DFP) in near-potential games, wherein agents' preferences can almost be captured by a potential function. In DFP, agents keep local estimates of other agents' empirical frequencies, best-respond against these estimates, and receive information over a time-varying communication network. We prove that empirical frequencies of actions generated by DFP converge around a single Nash equilibrium (NE), assuming that there are only finitely many Nash equilibria and that the difference in utility functions resulting from unilateral deviations is close enough to the difference in the potential function values. This result assures that DFP has the same convergence properties as standard fictitious play (FP) in near-potential games.

Submitted 27 January, 2022; originally announced January 2022.

Comments: 5 pages, Accepted to 2021 The Asilomar Conference on Signals, Systems, and Computers

arXiv:2103.09845 [pdf, other]

Decentralized Fictitious Play in Near-Potential Games with Time-Varying Communication Networks

Authors: Sarper Aydın, Sina Arefizadeh, Ceyhun Eksin

Abstract: We study the convergence properties of decentralized fictitious play (DFP) for the class of near-potential games where the incentives of agents are nearly aligned with a potential function. In DFP, agents share information only with their current neighbors in a sequence of time-varying networks, keep estimates of other agents' empirical frequencies, and take actions to maximize their expected utility functions computed with respect to the estimated empirical frequencies. We show that empirical frequencies of actions converge to a set of strategies with potential function values that are larger than the potential function values obtained by approximate Nash equilibria of the closest potential game. This result establishes that DFP has identical convergence guarantees in near-potential games as the standard fictitious play in which agents observe the past actions of all the other agents.

Submitted 17 March, 2021; originally announced March 2021.

arXiv:2101.07178 [pdf, other]

Partial Observability Approach for the Optimal Transparency Problem in Multi-agent Systems

Authors: Sadegh Arefizadeh, Sadjaad Ozgoli, Sadegh Bolouki, Tamer Başar

Abstract: This paper considers a network of agents, where each agent is assumed to take actions optimally with respect to a predefined payoff function involving the latest actions of the agent's neighbors. Neighborhood relationships stem from payoff functions rather than actual communication channels between the agents. A principal is tasked to optimize the network's performance by controlling the information available to each agent with regard to other agents' latest actions. The information control by the principal is done via a partial observability approach, which comprises a static partitioning of agents into blocks and making the mean of agents' latest actions within each block publicly available. While the problem setup is general in terms of the payoff functions and the network's performance metric, this paper has a narrower focus to illuminate the problem and how it can be addressed in practice. In particular, the performance metric is assumed to be a function of the steady-state behavior of the agents. After conducting a comprehensive steady-state analysis of the network, an efficient algorithm finding optimal partitions with respect to various performance metrics is presented and validated via numerical examples.

Submitted 18 January, 2021; originally announced January 2021.

Comments: 14 pages

arXiv:1912.03592 [pdf, other]

Distributed Fictitious Play in Potential Games with Time-Varying Communication Networks

Authors: Sina Arefizadeh, Ceyhun Eksin

Abstract: We propose a distributed algorithm for multiagent systems that aim to optimize a common objective when agents differ in their estimates of the objective-relevant state of the environment. Each agent keeps an estimate of the environment and a model of the behavior of other agents. The model of other agents' behavior assumes agents choose their actions randomly based on a stationary distribution determined by the empirical frequencies of past actions. At each step, each agent takes the action that maximizes its expectation of the common objective computed with respect to its estimate of the environment and its model of others. We propose a weighted averaging rule with non-doubly stochastic weights for agents to estimate the empirical frequency of past actions of all other agents by exchanging their estimates with their neighbors over a time-varying communication network. Under this averaging rule, we show agents' estimates converge to the actual empirical frequencies fast enough. This implies convergence of actions to a Nash equilibrium of the game with identical payoffs given by the expectation of the common objective with respect to an asymptotically agreed estimate of the state of the environment.

Submitted 7 December, 2019; originally announced December 2019.

Comments: 5 pages, 1 figure, to appear in Proceedings of Asilomar Conference on Signals, Systems, and Computers

arXiv:1802.00122 [pdf]

Assessing Strong String Stability of Constant Spacing Policy under Speed Limit Fluctuations

Authors: Sina Arefizadeh, Aria Hasanzadezonuzy, Alireza Talebpour, Srinivas Shakkottai

Abstract: The speed limit changes frequently throughout the transportation network, due to either safety (e.g., change in geometry) or congestion management (e.g., speed harmonization systems). Any abrupt reduction in the speed limit can create a shockwave that propagates upstream in traffic. Dealing with such an abrupt reduction in speed limit is particularly important while designing control laws for a platoon of automated vehicles from both stability and efficiency perspectives. This paper focuses on Adaptive Cruise Control (ACC) based platooning under a constant spacing policy, and investigates the possibility of designing a controller that ensures stability, while tracking a given target velocity profile that changes as a function of location. An ideal controller should maintain a constant spacing between successive vehicles, while tracking the desired velocity profile. The analytical investigations of this paper suggest that such a controller does not exist.

Submitted 31 January, 2018; originally announced February 2018.

Comments: 10 pages, 4 figures

arXiv:1710.02186 [pdf, other]

Collaborative Platooning of Automated Vehicles Using Variable Time-Gaps

Authors: Aria HasanzadeZonuzy, Sina Arefizadeh, Alireza Talebpour, Srinivas Shakkottai, Swaroop Darbha

Abstract: Connected automated vehicles (CAVs) could potentially be coordinated to safely attain the maximum traffic flow on roadways under dynamic traffic patterns, such as those engendered by the merger of two strings of vehicles due to a lane drop. Strings of vehicles have to be shaped correctly in terms of the inter-vehicular time-gap and velocity to ensure that such operation is feasible. However, controllers that can achieve such traffic shaping over the multiple dimensions of target time-gap and velocity over a region of space are unknown. The objective of this work is to design such a controller, and to show that we can design candidate time-gap and velocity profiles such that it can stabilize the string of vehicles in attaining the target profiles. Our analysis is based on studying the system in the spatial rather than the time domain, which enables us to study stability in terms of minimizing errors from the target profile and across vehicles as a function of location. Finally, we conduct numerical simulations in the context of shaping two platoons for merger, which we use to illustrate how to select time-gap and velocity profiles for maximizing flow and maintaining safety.

Submitted 5 October, 2017; originally announced October 2017.

arXiv:1709.10083 [pdf, other]

Platooning in the Presence of a Speed Drop: A Generalized Control Model

Authors: Sina Arefizadeh, Alireza Talebpour, Igor Zelenko

Abstract: The positive impacts of platooning on travel time reliability, congestion, emissions, and energy consumption have been shown for homogeneous roadway segments. However, the speed limit changes frequently throughout the transportation network, due to either safety-related considerations (e.g., workzone operations) or congestion management schemes (e.g., speed harmonization systems). These abrupt changes in speed limit can result in shockwave formation and cause travel time unreliability. Therefore, designing a platooning strategy for tracking a reference velocity profile is critical to enabling end-to-end platooning. Accordingly, this study introduces a generalized control model to track a desired velocity profile, while ensuring safety in the platoon of autonomous vehicles. We define appropriate natural error terms and the target curve in the state space of the control system, which is the set of points where all error terms vanish and corresponds to the case when all vehicles move with the desired velocities and at the minimum safe distance between them. In this way, we change the velocity profile tracking problem into a state-feedback stabilization problem with respect to the target curve. Under certain mild assumptions on the Lipschitz constant of the speed drop profile, we show that the stabilizing feedback can be obtained via introducing a natural dynamics for the maximum of the error terms for each vehicle. Moreover, we show that with this stabilizing feedback collisions will not occur if the initial state of the system of vehicles is sufficiently close to the target curve. We also show that the error terms remain bounded throughout time and space. Two scenarios were simulated, with and without initial perturbations, and results confirmed the effectiveness of the proposed control model in tracking the speed drop while ensuring safety and string stability.

Submitted 28 September, 2017; originally announced September 2017.

Showing 1–7 of 7 results for author: Arefizadeh, S