Search | arXiv e-print repository

F-FOMAML: GNN-Enhanced Meta-Learning for Peak Period Demand Forecasting with Proxy Data

Authors: Zexing Xu, Linjun Zhang, Sitan Yang, Rasoul Etesami, Hanghang Tong, Huan Zhang, Jiawei Han

Abstract: Demand prediction is a crucial task for e-commerce and physical retail businesses, especially during high-stake sales events. However, the limited availability of historical data from these peak periods poses a significant challenge for traditional forecasting methods. In this paper, we propose a novel approach that leverages strategically chosen proxy data reflective of potential sales patterns from similar entities during non-peak periods, enriched by features learned from a graph neural network (GNN)-based forecasting model, to predict demand during peak events. We formulate the demand prediction as a meta-learning problem and develop the Feature-based First-Order Model-Agnostic Meta-Learning (F-FOMAML) algorithm that leverages proxy data from non-peak periods and GNN-generated relational metadata to learn feature-specific layer parameters, thereby adapting to demand forecasts for peak events. Theoretically, we show that by considering domain similarities through task-specific metadata, our model achieves improved generalization, where the excess risk decreases as the number of training tasks increases. Empirical evaluations on large-scale industrial datasets demonstrate the superiority of our approach. Compared to existing state-of-the-art models, our method demonstrates a notable improvement in demand prediction accuracy, reducing the Mean Absolute Error by 26.24% on an internal vending machine dataset and by 1.04% on the publicly accessible JD.com dataset.

Submitted 23 June, 2024; originally announced June 2024.

MSC Class: 68T07; 68T05; 62M10; 62M20; 90C90; 91B84

arXiv:2405.00665 [pdf, other]

Optimizing Profitability in Timely Gossip Networks

Authors: Priyanka Kaswan, Melih Bastopcu, Sennur Ulukus, S. Rasoul Etesami, Tamer Başar

Abstract: We consider a communication system where a group of users, interconnected in a bidirectional gossip network, wishes to follow a time-varying source, e.g., updates on an event, in real-time. The users wish to maintain their expected version ages below a threshold, and can either rely on gossip from their neighbors or directly subscribe to a server publishing about the event, if the former option does not meet the timeliness requirements. The server wishes to maximize its profit by increasing subscriptions from users and minimizing event sampling frequency to reduce costs. This leads to a Stackelberg game between the server and the users where the server is the leader deciding its sampling frequency and the users are the followers deciding their subscription strategies. We investigate equilibrium strategies for low-connectivity and high-connectivity topologies.

Submitted 1 May, 2024; originally announced May 2024.

arXiv:2404.16009 [pdf, other]

How to Make Money From Fresh Data: Subscription Strategies in Age-Based Systems

Authors: Priyanka Kaswan, Melih Bastopcu, Sennur Ulukus, S. Rasoul Etesami, Tamer Başar

Abstract: We consider a communication system consisting of a server that tracks and publishes updates about a time-varying data source or event, and a gossip network of users interested in closely tracking the event. The timeliness of the information is measured through the version age of information. The users wish to have their expected version ages remain below a threshold, and have the option to either rely on gossip from their neighbors or subscribe to the server directly to follow updates about the event if the former option does not meet the timeliness requirements. The server wishes to maximize its profit by increasing the number of subscribers and reducing costs associated with the frequent sampling of the event. We model the problem setup as a Stackelberg game between the server and the users, where the server commits to a frequency of sampling the event, and the users make decisions on whether to subscribe or not. As an initial work, we focus on directed networks with unidirectional flow of information and obtain the optimal equilibrium strategies for all the players. We provide simulation results to confirm the theoretical findings and provide additional insights.

Submitted 24 April, 2024; originally announced April 2024.

arXiv:2403.08741 [pdf, ps, other]

Learning How to Strategically Disclose Information

Authors: Raj Kiriti Velicheti, Melih Bastopcu, S. Rasoul Etesami, Tamer Başar

Abstract: Strategic information disclosure, in its simplest form, considers a game between an information provider (sender) who has access to some private information that an information receiver is interested in. While the receiver takes an action that affects the utilities of both players, the sender can design information (or modify beliefs) of the receiver through signal commitment, hence posing a Stackelberg game. However, obtaining a Stackelberg equilibrium for this game traditionally requires the sender to have access to the receiver's objective. In this work, we consider an online version of information design where a sender interacts with a receiver of an unknown type who is adversarially chosen at each round. Restricting attention to Gaussian prior and quadratic costs for the sender and the receiver, we show that $\mathcal{O}(\sqrt{T})$ regret is achievable with full information feedback, where $T$ is the total number of interactions between the sender and the receiver. Further, we propose a novel parametrization that allows the sender to achieve $\mathcal{O}(\sqrt{T})$ regret for a general convex utility function. We then consider the Bayesian Persuasion problem with an additional cost term in the objective function, which penalizes signaling policies that are more informative and obtain $\mathcal{O}(\log(T))$ regret. Finally, we establish a sublinear regret bound for the partial information feedback setting and provide simulations to support our theoretical results.

Submitted 13 March, 2024; originally announced March 2024.

arXiv:2312.01587 [pdf, other]

Scalable and Independent Learning of Nash Equilibrium Policies in $n$-Player Stochastic Games with Unknown Independent Chains

Authors: Tiancheng Qin, S. Rasoul Etesami

Abstract: We study a subclass of $n$-player stochastic games, namely, stochastic games with independent chains and unknown transition matrices. In this class of games, players control their own internal Markov chains whose transitions do not depend on the states/actions of other players. However, players' decisions are coupled through their payoff functions. We assume players can receive only realizations of their payoffs, and that the players cannot observe the states and actions of other players, nor do they know the transition probability matrices of their own Markov chain. Relying on a compact dual formulation of the game based on occupancy measures and the technique of confidence set to maintain high-probability estimates of the unknown transition matrices, we propose a fully decentralized mirror descent algorithm to learn an $ε$-NE for this class of games. The proposed algorithm has the desired properties of independence, scalability, and convergence. Specifically, under no assumptions on the reward functions, we show the proposed algorithm converges in polynomial time in a weaker distance (namely, the averaged Nikaido-Isoda gap) to the set of $ε$-NE policies with arbitrarily high probability. Moreover, assuming the existence of a variationally stable Nash equilibrium policy, we show that the proposed algorithm converges asymptotically to the stable $ε$-NE policy with arbitrarily high probability. In addition to Markov potential games and linear-quadratic stochastic games, this work provides another subclass of $n$-player stochastic games that, under some mild assumptions, admit polynomial-time learning algorithms for finding their stationary $ε$-NE policies.

Submitted 3 December, 2023; originally announced December 2023.

arXiv:2309.16911 [pdf, other]

Dynamic Batching of Online Arrivals to Leverage Economies of Scale

Authors: Akhil Bhimaraju, S. Rasoul Etesami, Lav R. Varshney

Abstract: Many settings, such as medical testing of patients in hospitals or matching riders to drivers in ride-hailing platforms, require handling arrivals over time. In such applications, it is often beneficial to group the arriving orders, samples, or requests into batches and process the larger batches rather than individual arrivals. However, waiting too long to create larger batches incurs a waiting cost for past arrivals. On the other hand, processing the arrivals too soon leads to higher processing costs by missing the economies of scale of grouping larger numbers of arrivals into larger batches. Moreover, the timing of the next arrival is often unknown, meaning that fixed-size batches or fixed wait times tend to be suboptimal. In this work, we consider the problem of finding the optimal batching schedule to minimize the average wait time plus the average processing cost under both offline and online settings. In the offline problem in which all arrival times are known a priori, we show that the optimal batching schedule can be found in polynomial time by reducing it to a shortest path problem on a weighted acyclic graph. For the online problem with unknown arrival times, we develop online algorithms that are provably competitive for a broad range of processing-cost functions. We also provide a lower bound on the competitive ratio that no online algorithm can beat. Finally, we run extensive numerical experiments on simulated and real data to demonstrate the effectiveness of our proposed algorithms against the optimal offline benchmark.

Submitted 28 September, 2023; originally announced September 2023.

Comments: 31 pages, 14 figures
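
The offline reduction mentioned in this abstract can be illustrated with a small sketch. It is not the authors' algorithm: it assumes that each batch is processed at the time of its last arrival, that waiting cost is the sum of the delays inside a batch, and that `processing_cost(k)` is a user-supplied function of batch size. The names `optimal_batches` and `processing_cost` are hypothetical. Node `i` of the acyclic graph means "the first `i` arrivals are already processed", and an edge from `i` to `j` is the cost of putting arrivals `i..j-1` into one batch.

```python
import math


def optimal_batches(arrivals, processing_cost):
    """Shortest path on a DAG over prefixes of the sorted arrival times.

    Assumption (not from the source): a batch arrivals[i:j] is processed at
    arrivals[j - 1], so its waiting cost is sum(arrivals[j - 1] - t) over the batch.
    Returns (total_cost, list_of_batches).
    """
    n = len(arrivals)
    best = [math.inf] * (n + 1)  # best[j]: cheapest way to process first j arrivals
    prev = [0] * (n + 1)         # prev[j]: start index of the last batch on that path
    best[0] = 0.0
    for j in range(1, n + 1):
        wait = 0.0
        # Grow the last batch backwards so the waiting cost updates in O(1).
        for i in range(j - 1, -1, -1):
            wait += arrivals[j - 1] - arrivals[i]
            cost = best[i] + wait + processing_cost(j - i)
            if cost < best[j]:
                best[j] = cost
                prev[j] = i
    # Walk the predecessor pointers to recover the batches.
    batches = []
    j = n
    while j > 0:
        i = prev[j]
        batches.append(arrivals[i:j])
        j = i
    batches.reverse()
    return best[n], batches


if __name__ == "__main__":
    # Fixed cost of 3 per batch: grouping the two early arrivals pays off,
    # while holding them until time 10 does not.
    total, batches = optimal_batches([0, 1, 10], lambda size: 3)
    print(total, batches)
```

The sketch minimizes total rather than average cost; with a fixed number of arrivals the two objectives have the same minimizer. The double loop runs in quadratic time in the number of arrivals.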

arXiv:2309.14317 [pdf, ps, other]

Online and Offline Dynamic Influence Maximization Games Over Social Networks

Authors: Melih Bastopcu, S. Rasoul Etesami, Tamer Başar

Abstract: In this work, we consider dynamic influence maximization games over social networks with multiple players (influencers). The goal of each influencer is to maximize their own reward subject to their limited total budget rate constraints. Thus, influencers need to carefully design their investment policies considering individuals' opinion dynamics and other influencers' investment strategies, leading to a dynamic game problem. We first consider the case of a single influencer who wants to maximize its utility subject to a total budget rate constraint. We study both offline and online versions of the problem where the opinion dynamics are either known or not known a priori. In the single-influencer case, we propose an online no-regret algorithm, meaning that as the number of campaign opportunities grows, the average utilities obtained by the offline and online solutions converge. Then, we consider the game formulation with multiple influencers in offline and online settings. For the offline setting, we show that the dynamic game admits a unique Nash equilibrium policy and provide a method to compute it. For the online setting and with two influencers, we show that if each influencer applies the same no-regret online algorithm proposed for the single-influencer maximization problem, they will converge to the set of $ε$-Nash equilibrium policies where $ε=O(\frac{1}{\sqrt{K}})$ scales in average inversely with the number of campaign times $K$ considering the average utilities of the influencers. Moreover, we extend this result to any finite number of influencers under more strict requirements on the information structure. Finally, we provide numerical analysis to validate our results under various settings.

Submitted 25 September, 2023; originally announced September 2023.

Comments: This work has been submitted to IEEE for possible publication

arXiv:2309.10340 [pdf, other]

Striking a Balance: An Optimal Mechanism Design for Heterogenous Differentially Private Data Acquisition for Logistic Regression

Authors: Ameya Anjarlekar, Rasoul Etesami, R. Srikant

Abstract: We investigate the problem of performing logistic regression on data collected from privacy-sensitive sellers. Since the data is private, sellers must be incentivized through payments to provide their data. Thus, the goal is to design a mechanism that optimizes a weighted combination of test loss, seller privacy, and payment, i.e., strikes a balance between multiple objectives of interest. We solve the problem by combining ideas from game theory, statistical learning theory, and differential privacy. The buyer's objective function can be highly non-convex. However, we show that, under certain conditions on the problem parameters, the problem can be convexified by using a change of variables. We also provide asymptotic results characterizing the buyer's test error and payments when the number of sellers becomes large. Finally, we demonstrate our ideas by applying them to a real healthcare data set.

Submitted 19 September, 2023; originally announced September 2023.

arXiv:2304.00155 [pdf, other]

doi 10.1109/CDC49753.2023.10383839

Online Reinforcement Learning in Markov Decision Process Using Linear Programming

Authors: Vincent Leon, S. Rasoul Etesami

Abstract: We consider online reinforcement learning in episodic Markov decision process (MDP) with unknown transition function and stochastic rewards drawn from some fixed but unknown distribution. The learner aims to learn the optimal policy and minimize their regret over a finite time horizon through interacting with the environment. We devise a simple and efficient model-based algorithm that achieves $\widetilde{O}(LX\sqrt{TA})$ regret with high probability, where $L$ is the episode length, $T$ is the number of episodes, and $X$ and $A$ are the cardinalities of the state space and the action space, respectively. The proposed algorithm, which is based on the concept of "optimism in the face of uncertainty", maintains confidence sets of transition and reward functions and uses occupancy measures to connect the online MDP with linear programming. It achieves a tighter regret bound compared to the existing works that use a similar confidence set framework and improves computational effort compared to those that use a different framework but with a slightly tighter regret bound.

Submitted 10 March, 2024; v1 submitted 31 March, 2023; originally announced April 2023.

Journal ref: 2023 62nd IEEE Conference on Decision and Control (CDC)

arXiv:2303.15386 [pdf, ps, other]

Robustness of Dynamics in Games: A Contraction Mapping Decomposition Approach

Authors: Sina Arefizadeh, Sadegh Arefizadeh, S. Rasoul Etesami, Sadegh Bolouki

Abstract: A systematic framework for analyzing dynamical attributes of games has not been well-studied except for the special class of potential or near-potential games. In particular, the existing results have shortcomings in determining the asymptotic behavior of a given dynamic in a designated game. Although there is a large body of literature on developing convergent dynamics to the Nash equilibrium (NE) of a game, in general, the asymptotic behavior of an underlying dynamic may not be even close to a NE. In this paper, we initiate a new direction towards game dynamics by studying the fundamental properties of the map of dynamics in games. To this aim, we first decompose the map of a given dynamic into contractive and non-contractive parts and then explore the asymptotic behavior of those dynamics using the proximity of such decomposition to contraction mappings. In particular, we analyze the non-contractive behavior for better/best response dynamics in discrete-action space sequential/repeated games and show that the non-contractive part of those dynamics is well-behaved in a certain sense. That allows us to estimate the asymptotic behavior of such dynamics using a neighborhood around the fixed point of their contractive part proxy. Finally, we demonstrate the practicality of our framework via an example from duopoly Cournot games.

Submitted 27 March, 2023; originally announced March 2023.

arXiv:2303.02725 [pdf, other]

Local Environment Poisoning Attacks on Federated Reinforcement Learning

Authors: Evelyn Ma, Praneet Rathi, S. Rasoul Etesami

Abstract: Federated learning (FL) has become a popular tool for solving traditional Reinforcement Learning (RL) tasks. The multi-agent structure addresses the data-hungry nature of traditional RL, while the federated mechanism protects the data privacy of individual agents. However, the federated mechanism also exposes the system to poisoning by malicious agents that can mislead the trained policy. Despite the advantage brought by FL, the vulnerability of Federated Reinforcement Learning (FRL) has not been well-studied before. In this work, we propose a general framework to characterize FRL poisoning as an optimization problem and design a poisoning protocol that can be applied to policy-based FRL. Our framework can also be extended to FRL with actor-critic as a local RL algorithm by training a pair of private and public critics. We provably show that our method can strictly hurt the global objective. We verify our poisoning effectiveness by conducting extensive experiments targeting mainstream RL algorithms and over various RL OpenAI Gym environments covering a wide range of difficulty levels. Within these experiments, we compare clean and baseline poisoning methods against our proposed framework. The results show that the proposed framework is successful in poisoning FRL systems and reducing performance across various environments and does so more effectively than baseline methods. Our work provides new insights into the vulnerability of FL in RL training and poses new challenges for designing robust FRL algorithms.

Submitted 4 January, 2024; v1 submitted 5 March, 2023; originally announced March 2023.

arXiv:2210.07461 [pdf, ps, other]

Distributed Computation for the Non-metric Data Placement Problem using Glauber Dynamics and Auctions

Authors: S. Rasoul Etesami

Abstract: We consider the non-metric data placement problem and develop distributed algorithms for computing or approximating its optimal integral solution. We first show that the non-metric data placement problem is inapproximable up to a logarithmic factor. We then provide a game-theoretic decomposition of the objective function and show that natural Glauber dynamics in which players update their resources with probability proportional to the utility they receive from caching those resources will converge to an optimal global solution for a sufficiently large noise parameter. In particular, we establish the polynomial mixing time of the Glauber dynamics for a certain range of noise parameters. Finally, we provide another auction-based distributed algorithm, which allows us to approximate the optimal global solution with a performance guarantee that depends on the ratio of the revenue vs. social welfare obtained from the underlying auction. Our results provide the first distributed computation algorithms for the non-metric data placement problem.

Submitted 13 October, 2022; originally announced October 2022.

arXiv:2206.06318 [pdf, other]

Limited-Trust in Diffusion of Competing Alternatives over Social Networks

Authors: Vincent Leon, S. Rasoul Etesami, Rakesh Nagi

Abstract: We consider the diffusion of two alternatives in social networks using a game-theoretic approach. Each individual plays a coordination game with its neighbors repeatedly and decides which to adopt. As products are used in conjunction with others and through repeated interactions, individuals are more interested in their long-term benefits and tend to show trust to others to maximize their long-term utility by choosing a suboptimal option with respect to instantaneous payoff. To capture such trust behavior, we deploy limited-trust equilibrium (LTE) in the diffusion process. We analyze the convergence of emerging dynamics to equilibrium points using mean-field approximation and study the equilibrium state and the convergence rate of diffusion using absorption probability and expected absorption time of a reduced-size absorbing Markov chain. We also show that the diffusion model on LTE under the best-response strategy can be converted to the well-known linear threshold model. Simulation results show that when agents behave in a trustworthy manner, their long-term utility will increase significantly compared to the case when they are solely self-interested. Moreover, the Markov chain analysis provides a good estimate of convergence properties over random networks.

Submitted 2 October, 2023; v1 submitted 13 June, 2022; originally announced June 2022.

arXiv:2203.06798 [pdf, other]

doi 10.1080/10556788.2023.2241151

The Role of Local Steps in Local SGD

Authors: Tiancheng Qin, S. Rasoul Etesami, César A. Uribe

Abstract: We consider the distributed stochastic optimization problem where $n$ agents want to minimize a global function given by the sum of agents' local functions, and focus on the heterogeneous setting when agents' local functions are defined over non-i.i.d. data sets. We study the Local SGD method, where agents perform a number of local stochastic gradient steps and occasionally communicate with a central node to improve their local optimization tasks. We analyze the effect of local steps on the convergence rate and the communication complexity of Local SGD. In particular, instead of assuming a fixed number of local steps across all communication rounds, we allow the number of local steps during the $i$-th communication round, $H_i$, to be different and arbitrary numbers. Our main contribution is to characterize the convergence rate of Local SGD as a function of $\{H_i\}_{i=1}^R$ under various settings of strongly convex, convex, and nonconvex local functions, where $R$ is the total number of communication rounds. Based on this characterization, we provide sufficient conditions on the sequence $\{H_i\}_{i=1}^R$ such that Local SGD can achieve linear speed-up with respect to the number of workers. Furthermore, we propose a new communication strategy with increasing local steps superior to existing communication strategies for strongly convex local functions. On the other hand, for convex and nonconvex local functions, we argue that fixed local steps are the best communication strategy for Local SGD and recover state-of-the-art convergence rate results. Finally, we justify our theoretical results through extensive numerical experiments.

Submitted 29 September, 2022; v1 submitted 13 March, 2022; originally announced March 2022.
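
As a rough illustration of the Local SGD scheme described in this abstract, the sketch below runs one communication round per entry of a local-step schedule $\{H_i\}$. It is a toy under stated assumptions, not the paper's setup: each agent's local function is taken to be the scalar quadratic $\frac{1}{2}(x - c_k)^2$, the doubling schedule in the usage example is arbitrary, and the names `local_sgd`, `centers`, and `local_steps` are hypothetical.

```python
import random


def local_sgd(centers, local_steps, lr=0.1, noise=0.0, seed=0):
    """Toy Local SGD on f_k(x) = 0.5 * (x - c_k)^2, one agent per center c_k.

    local_steps[i] plays the role of H_i: the number of local gradient
    steps every agent takes before the i-th averaging (communication) step.
    """
    rng = random.Random(seed)
    x_global = 0.0
    for h in local_steps:
        local_models = []
        for c in centers:
            x = x_global  # every agent starts the round from the shared model
            for _ in range(h):
                # Stochastic gradient: exact gradient plus optional Gaussian noise.
                grad = (x - c) + noise * rng.gauss(0.0, 1.0)
                x -= lr * grad
            local_models.append(x)
        # Communication round: the central node averages the local models.
        x_global = sum(local_models) / len(local_models)
    return x_global


if __name__ == "__main__":
    # Increasing schedule H = 1, 2, 4, 8 (an arbitrary choice for illustration).
    x = local_sgd([1.0, 3.0, 8.0], local_steps=[1, 2, 4, 8], lr=0.5)
    print(x)  # approaches the global minimizer, the mean of the centers (4.0)
```

Because all toy local functions share the same curvature, averaging introduces no drift here; with heterogeneous curvatures, long local phases would bias the averaged model, which is the kind of trade-off a schedule $\{H_i\}$ has to manage.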

arXiv:2201.12719 [pdf, other]

Faster Convergence of Local SGD for Over-Parameterized Models

Authors: Tiancheng Qin, S. Rasoul Etesami, César A. Uribe

Abstract: Modern machine learning architectures are often highly expressive. They are usually over-parameterized and can interpolate the data by driving the empirical loss close to zero. We analyze the convergence of Local SGD (or FedAvg) for such over-parameterized models in the heterogeneous data setting and improve upon the existing literature by establishing the following convergence rates. For general convex loss functions, we establish an error bound of $O(1/T)$ under a mild data similarity assumption and an error bound of $O(K/T)$ otherwise, where $K$ is the number of local steps and $T$ is the total number of iterations. For non-convex loss functions we prove an error bound of $O(K/T)$. These bounds improve upon the best previous bound of $O(1/\sqrt{nT})$ in both cases, where $n$ is the number of nodes, when no assumption on the model being over-parameterized is made. We complete our results by providing problem instances in which our established convergence rates are tight to a constant factor with a reasonably small stepsize. Finally, we validate our theoretical results by performing large-scale numerical experiments that reveal the convergence behavior of Local SGD for practical over-parameterized deep learning models, in which the $O(1/T)$ convergence rate of Local SGD is clearly shown.

Submitted 10 June, 2024; v1 submitted 29 January, 2022; originally announced January 2022.

Journal ref: Transactions on Machine Learning Research, ISSN 2835-8856, 2024

arXiv:2201.12224 [pdf, other]

Learning Stationary Nash Equilibrium Policies in $n$-Player Stochastic Games with Independent Chains

Authors: S. Rasoul Etesami

Abstract: We consider a subclass of $n$-player stochastic games, in which players have their own internal state/action spaces while they are coupled through their payoff functions. It is assumed that players' internal chains are driven by independent transition probabilities. Moreover, players can receive only realizations of their payoffs, not the actual functions, and cannot observe each other's states/actions. For this class of games, we first show that finding a stationary Nash equilibrium (NE) policy without any assumption on the reward functions is intractable. However, for general reward functions, we develop polynomial-time learning algorithms based on dual averaging and dual mirror descent, which converge in terms of the averaged Nikaido-Isoda distance to the set of $ε$-NE policies almost surely or in expectation. In particular, under extra assumptions on the reward functions such as social concavity, we derive polynomial upper bounds on the number of iterates to achieve an $ε$-NE policy with high probability. Finally, we evaluate the effectiveness of the proposed algorithms in learning $ε$-NE policies using numerical experiments for energy management in smart grids.

Submitted 21 March, 2023; v1 submitted 28 January, 2022; originally announced January 2022.

arXiv:2201.08365 [pdf, ps, other]

The Role of Gossiping for Information Dissemination over Networked Agents

Authors: Melih Bastopcu, S. Rasoul Etesami, Tamer Başar

Abstract: We consider information dissemination over a network of gossiping agents (nodes). In this model, a source keeps the most up-to-date information about a time-varying binary state of the world, and $n$ receiver nodes want to follow the information at the source as accurately as possible. When the information at the source changes, the source first sends updates to a subset of $m\leq n$ nodes. After that, the nodes share their local information during the gossiping period to disseminate the information further. The nodes then estimate the information at the source using the majority rule at the end of the gossiping period. To analyze information dissemination, we introduce a new error metric to find the average percentage of nodes that can accurately obtain the most up-to-date information at the source. We characterize the equations necessary to obtain the steady-state distribution for the average error and then analyze the system behavior under both high and low gossip rates. In the high gossip rate, in which each node can access other nodes' information more frequently, we show that the nodes update their information based on the majority of the information in the network. In the low gossip rate, we introduce and analyze the gossip gain, which is the reduction in the average error due to gossiping. In particular, we develop an adaptive policy that the source can use to determine its current transmission capacity $m$ based on its past transmission rates and the accuracy of the information at the nodes.
In numerical results, we show that when the source's transmission capacity $m$ is limited, gossiping can be harmful as it causes incorrect information to disseminate. We then find the optimal gossip rates to minimize the average error for a fixed $m$. Finally, we illustrate the outperformance of our adaptive policy compared to the constant $m$-selection policy even for the high gossip rates.

Submitted 20 January, 2022; originally announced January 2022.

Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2107.05138 [pdf, other]

Open-Loop Equilibrium Strategies for Dynamic Influence Maximization Game Over Social Networks

Authors: S. Rasoul Etesami

Abstract: We consider the problem of budget allocation for competitive influence maximization over social networks. In this problem, multiple competing parties (players) want to distribute their limited advertising resources over a set of social individuals to maximize their long-run cumulative payoffs. It is assumed that the individuals are connected via a social network and update their opinions based on the classical DeGroot model. The players must decide the budget distribution among the individuals at a finite number of campaign times to maximize their overall payoff given as a function of individuals' opinions. We show that i) the optimal investment strategy for the case of a single-player can be found in polynomial time by solving a concave program, and ii) the open-loop equilibrium strategies for the multiplayer dynamic game can be computed efficiently by following natural regret minimization dynamics. Our results extend the earlier work on the static version of the problem to a dynamic multistage game.

Submitted 30 August, 2021; v1 submitted 11 July, 2021; originally announced July 2021.

arXiv:2103.12833 [pdf, other]

doi 10.1007/s13235-023-00518-7

Online Learning in Budget-Constrained Dynamic Colonel Blotto Games

Authors: Vincent Leon, S. Rasoul Etesami

Abstract: In this paper, we study the strategic allocation of limited resources using a Colonel Blotto game (CBG) under a dynamic setting and analyze the problem using an online learning approach. In this model, one of the players is a learner who has limited troops to allocate over a finite time horizon, and the other player is an adversary. In each round, the learner plays a one-shot Colonel Blotto game with the adversary and strategically determines the allocation of troops among battlefields based on past observations. The adversary chooses its allocation action randomly from some fixed distribution that is unknown to the learner. The learner's objective is to minimize its regret, which is the difference between the cumulative reward of the best mixed strategy and the realized cumulative reward by following a learning algorithm while not violating the budget constraint. The learning in dynamic CBG is analyzed under the framework of combinatorial bandits and bandits with knapsacks. We first convert the budget-constrained dynamic CBG to a path planning problem on a directed graph. We then devise an efficient algorithm that combines a special combinatorial bandit algorithm for path planning problem and a bandits with knapsack algorithm to cope with the budget constraint. The theoretical analysis shows that the learner's regret is bounded by a term sublinear in time horizon and polynomial in other parameters. Finally, we justify our theoretical results by carrying out simulations for various scenarios.

Submitted 8 May, 2023; v1 submitted 23 March, 2021; originally announced March 2021.

arXiv:2102.08915 [pdf, other]

Maximizing Social Welfare Subject to Network Externalities: A Unifying Submodular Optimization Approach

Authors: S. Rasoul Etesami

Abstract: We consider the problem of allocating multiple indivisible items to a set of networked agents to maximize the social welfare subject to network externalities. Here, the social welfare is given by the sum of agents' utilities and externalities capture the effect that one user of an item has on the item's value to others. We first provide a general formulation that captures some of the existing models as a special case. We then show that the social welfare maximization problem benefits from some nice diminishing or increasing marginal return properties. That allows us to devise polynomial-time approximation algorithms using the Lovasz extension and multilinear extension of the objective functions. Our principled approach recovers or improves some of the existing algorithms and provides a simple and unifying framework for maximizing social welfare subject to network externalities.

Submitted 27 August, 2023; v1 submitted 17 February, 2021; originally announced February 2021.

arXiv:2011.03255 [pdf, other]

Communication-efficient Decentralized Local SGD over Undirected Networks

Authors: Tiancheng Qin, S. Rasoul Etesami, César A. Uribe

Abstract: We consider the distributed learning problem where a network of $n$ agents seeks to minimize a global function $F$. Agents have access to $F$ through noisy gradients, and they can locally communicate with their neighbors over a network. We study the Decentralized Local SGD method, where agents perform a number of local gradient steps and occasionally exchange information with their neighbors. Previous algorithmic analysis efforts have focused on the specific network topology (star topology) where a leader node aggregates all agents' information. We generalize that setting to an arbitrary network by analyzing the trade-off between the number of communication rounds and the computational effort of each agent. We bound the expected optimality gap in terms of the number of iterates $T$, the number of workers $n$, and the spectral gap of the underlying network. Our main results show that by using only $R=Ω(n)$ communication rounds, one can achieve an error that scales as $O({1}/{nT})$, where the number of communication rounds is independent of $T$ and only depends on the number of agents. Finally, we provide numerical evidence of our theoretical results through experiments on real and synthetic data.

Submitted 6 November, 2020; originally announced November 2020.

arXiv:2011.03212 [pdf, other]

Optimal Online Algorithms for File-Bundle Caching and Generalization to Distributed Caching

Authors: Tiancheng Qin, S. Rasoul Etesami

Abstract: We consider a generalization of the standard cache problem called file-bundle caching, where different queries (tasks), each containing $l\ge 1$ files, sequentially arrive. An online algorithm that does not know the sequence of queries ahead of time must adaptively decide on what files to keep in the cache to incur the minimum number of cache misses. Here a cache miss refers to the case where at least one file in a query is missing among the cache files. In the special case where $l=1$, this problem reduces to the standard cache problem. We first analyze the performance of the classic least recently used (LRU) algorithm in this setting and show that LRU is a near-optimal online deterministic algorithm for file-bundle caching with regard to competitive ratio. We then extend our results to a generalized $(h,k)$-paging problem in this file-bundle setting, where the performance of the online algorithm with a cache size $k$ is compared to an optimal offline benchmark of a smaller cache size $h<k$. In this latter case, we provide a randomized $O(l \ln \frac{k}{k-h})$-competitive algorithm for our generalized $(h,k)$-paging problem, which can be viewed as an extension of the classic marking algorithm. We complete this result by providing a matching lower bound for the competitive ratio, indicating that the performance of this modified marking algorithm is within a factor of two of any randomized online algorithm. Finally, we look at the distributed version of the file-bundle caching problem where there are $m\ge 1$ identical caches in the system.
In this case we show that for $m=l+1$ caches, there is a deterministic distributed caching algorithm which is $(l^2+l)$-competitive and a randomized distributed caching algorithm which is $O(l\ln(2l+1))$-competitive when $l\ge 2$.
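As an illustration of the file-bundle miss model described in this abstract, the following is a minimal Python sketch of LRU applied to bundled queries. It is not taken from the paper: the function name `count_bundle_misses`, the tie-breaking order inside a query, and the choice to load every missing file of a query are assumptions made only for this example.

```python
from collections import OrderedDict


def count_bundle_misses(queries, cache_size):
    """Count cache misses of an LRU cache on file-bundle queries.

    A query is an iterable of file names. It counts as ONE miss if at
    least one of its files is absent from the cache (the miss model in
    the abstract). Hypothetical helper, not the authors' code.
    """
    cache = OrderedDict()  # keys ordered from least to most recently used
    misses = 0
    for query in queries:
        files = list(dict.fromkeys(query))  # drop duplicates, keep order
        if len(files) > cache_size:
            raise ValueError("each query must fit in the cache")
        if any(f not in cache for f in files):
            misses += 1
        # Touch or load every file of the query so that the whole bundle
        # becomes the most recently used part of the cache.
        for f in files:
            if f in cache:
                cache.move_to_end(f)
            else:
                cache[f] = True
        # Evict least recently used files until the cache fits again.
        # Files of the current query sit at the end, so they are kept.
        while len(cache) > cache_size:
            cache.popitem(last=False)
    return misses
```

With single-file queries ($l=1$) the function behaves like standard LRU paging, matching the special case mentioned in the abstract.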

Submitted 6 November, 2020; originally announced November 2020.

arXiv:2009.05208 [pdf, other]

Maximizing Convergence Time in Network Averaging Dynamics Subject to Edge Removal

Authors: S. Rasoul Etesami

Abstract: We consider the consensus interdiction problem (CIP), in which the goal is to maximize the convergence time of consensus averaging dynamics subject to removing a limited number of network edges. We first show that CIP can be cast as an effective resistance interdiction problem (ERIP), in which the goal is to remove a limited number of network edges to maximize the effective resistance between a source node and a sink node. We show that ERIP is strongly NP-hard, even for bipartite graphs of diameter three with fixed source/sink edges, and establish the same hardness result for the CIP. We then show that both ERIP and CIP cannot be approximated up to a (nearly) polynomial factor assuming the exponential time hypothesis. Subsequently, we devise a polynomial-time $mn$-approximation algorithm for the ERIP that only depends on the number of nodes $n$ and the number of edges $m$, but is independent of the size of edge resistances. Finally, using a quadratic program formulation for the CIP, we devise an iterative approximation algorithm to find a first-order stationary solution for the CIP and evaluate its good performance through numerical results.

Submitted 21 March, 2022; v1 submitted 10 September, 2020; originally announced September 2020.

arXiv:2003.07695 [pdf, other]

Online Assortment and Market Segmentation under Bertrand Competition with Set-Dependent Revenues

Authors: S. Rasoul Etesami

Abstract: We consider an online assortment problem with $[n]:=\{1,2,\ldots,n\}$ sellers, each holding exactly one item $i\in[n]$ with initial inventory $c_i\in \mathbb{Z}_+$, and a sequence of homogeneous buyers arriving over a finite time horizon $t=1,2,\ldots,m$. There is an online platform whose goal is to offer a subset $S_t\subseteq [n]$ of sellers to the arriving buyer at time $t$ to maximize the expected revenue derived over the entire horizon while respecting the inventory constraints. Given an assortment $S_t$ at time $t$, it is assumed that the buyer will select an item from $S_t$ based on the well-known multinomial logit model, a well-justified choice model from the economic literature. In this model, the revenue obtained from selling an item $i$ at a given time $t$ critically depends on the assortment $S_t$ offered at that time and is given by the Nash equilibrium of a Bertrand game among the sellers in $S_t$. This imposes a strong dependence/externality among the offered assortments, sellers' revenues, and inventory levels. Despite that challenge, we devise a constant competitive algorithm for the online assortment problem with homogeneous buyers. We also show that the online assortment problem with heterogeneous buyers does not admit a constant competitive algorithm. To compensate for that issue, we then consider the assortment problem under an offline setting with heterogeneous buyers. Under a mild market consistency assumption, we show that the generalized Bertrand game admits a pure Nash equilibrium over general buyer-seller bipartite graphs.
Finally, we develop an $O(\ln m)$-approximation algorithm for optimal market segmentation of the generalized Bertrand game which allows the platform to derive higher revenues by partitioning the market into smaller pools.

Submitted 8 December, 2021; v1 submitted 14 March, 2020; originally announced March 2020.

arXiv:2001.00543 [pdf, other]

Toward Optimal Adversarial Policies in the Multiplicative Learning System with a Malicious Expert

Authors: S. Rasoul Etesami, Negar Kiyavash, Vincent Leon, H. Vincent Poor

Abstract: We consider a learning system based on the conventional multiplicative weight (MW) rule that combines experts' advice to predict a sequence of true outcomes. It is assumed that one of the experts is malicious and aims to impose the maximum loss on the system. The loss of the system is naturally defined to be the aggregate absolute difference between the sequence of predicted outcomes and the true outcomes. We consider this problem under both offline and online settings. In the offline setting where the malicious expert must choose its entire sequence of decisions a priori, we show somewhat surprisingly that a simple greedy policy of always reporting false prediction is asymptotically optimal with an approximation ratio of $1+O(\sqrt{\frac{\ln N}{N}})$, where $N$ is the total number of prediction stages. In particular, we describe a policy that closely resembles the structure of the optimal offline policy. For the online setting where the malicious expert can adaptively make its decisions, we show that the optimal online policy can be efficiently computed by solving a dynamic program in $O(N^3)$. Our results provide a new direction for vulnerability assessment of commonly used learning algorithms to adversarial attacks where the threat is an integral part of the system.
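For readers unfamiliar with the multiplicative weight rule mentioned in this abstract, here is a small Python sketch of one common variant. It is an assumption-laden illustration, not the paper's exact system: the linear update $w_i \leftarrow w_i(1-\eta\,|a_i-y|)$, the learning rate `eta`, and the function name `mw_total_loss` are chosen here for the example, with advice and outcomes assumed to lie in $[0,1]$.

```python
def mw_total_loss(advice, outcomes, eta=0.5):
    """Aggregate absolute loss of a multiplicative-weight forecaster.

    advice[t][i] is expert i's prediction at stage t and outcomes[t] is
    the true outcome, both assumed to lie in [0, 1]. The linear update
    used below is one standard MW variant, assumed for illustration.
    """
    n_experts = len(advice[0])
    weights = [1.0] * n_experts
    total_loss = 0.0
    for preds, y in zip(advice, outcomes):
        # The system predicts the weight-normalized average of the advice.
        forecast = sum(w * p for w, p in zip(weights, preds)) / sum(weights)
        total_loss += abs(forecast - y)
        # Shrink each expert's weight in proportion to its own error.
        weights = [w * (1.0 - eta * abs(p - y)) for w, p in zip(weights, preds)]
    return total_loss
```

In this sketch an expert that always reports the wrong outcome loses weight geometrically, so its per-stage influence on the forecast shrinks over time.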

Submitted 17 September, 2020; v1 submitted 2 January, 2020; originally announced January 2020.

arXiv:1910.14081 [pdf, other]

Duality and Stability in Complex Multiagent State-Dependent Network Dynamics

Authors: S. Rasoul Etesami

Abstract: Despite significant progress on stability analysis of conventional multiagent networked systems with weakly coupled state-network dynamics, most of the existing results have shortcomings in addressing multiagent systems with highly coupled state-network dynamics. Motivated by numerous applications of such dynamics, in our previous work [1], we initiated a new direction for stability analysis of such systems that uses a sequential optimization framework. Building upon that, in this paper, we extend our results by providing another angle on multiagent network dynamics from a duality perspective, which allows us to view the network structure as dual variables of a constrained nonlinear program. Leveraging that idea, we show that the evolution of the coupled state-network multiagent dynamics can be viewed as iterates of a primal-dual algorithm for a static constrained optimization/saddle-point problem. This view bridges the Lyapunov stability of state-dependent network dynamics and frequently used optimization techniques such as block coordinate descent, mirror descent, the Newton method, and the subgradient method. As a result, we develop a systematic framework for analyzing the Lyapunov stability of state-dependent network dynamics using techniques from nonlinear optimization. Finally, we support our theoretical results through numerical simulations from social science.

Submitted 19 July, 2020; v1 submitted 30 October, 2019; originally announced October 2019.

arXiv:1906.02644 [pdf, other]

An Optimal Control Framework for Online Job Scheduling with General Cost Functions

Authors: S. Rasoul Etesami

Abstract: We consider the problem of online job scheduling on a single machine or multiple unrelated machines with general job/machine-dependent cost functions. In this model, each job $j$ has a processing requirement (length) $v_{ij}$ and arrives with a nonnegative nondecreasing cost function $g_{ij}(t)$ if it has been dispatched to machine $i$, and this information is revealed to the system upon arrival of job $j$ at time $r_j$. The goal is to dispatch the jobs to the machines in an online fashion and process them preemptively on the machines so as to minimize the generalized completion time $\sum_{j}g_{i(j)j}(C_j)$. Here $i(j)$ refers to the machine to which job $j$ is dispatched, and $C_j$ is the completion time of job $j$ on that machine. It is assumed that jobs cannot migrate between machines and that each machine can work on a single job at any time instance. In particular, we are interested in finding an online scheduling policy whose objective cost is competitive with respect to a slower optimal offline benchmark, i.e., the one that knows all the job specifications a priori and is slower than the online algorithm. We first show that for the case of a single machine and special cost functions $g_j(t)=w_jg(t)$, with nonnegative nondecreasing $g(t)$, the highest-density-first rule is optimal for the generalized fractional completion time. We then extend this result by giving a speed-augmented competitive algorithm for the general nondecreasing cost functions $g_j(t)$ by utilizing a novel optimal control framework.
This approach provides a principled method for identifying dual variables in different settings of online job scheduling with general cost functions. Using this method, we also provide a speed-augmented competitive algorithm for multiple unrelated machines with convex functions $g_{ij}(t)$, where the competitive ratio depends on the curvature of cost functions $g_{ij}(t)$.

Submitted 14 August, 2021; v1 submitted 6 June, 2019; originally announced June 2019.

arXiv:1810.00135 [pdf, other]

A Simple Framework for Stability Analysis of State-Dependent Networks of Heterogeneous Agents

Authors: S. Rasoul Etesami

Abstract: Stability and analysis of multi-agent network systems with state-dependent switching topologies have been a fundamental and longstanding challenge in control, social sciences, and many other related fields. These already complex systems become further complicated once one accounts for asymmetry or heterogeneity of the underlying agents/dynamics. Despite extensive progress in analysis of conventional networked decision systems where the network evolution and state dynamics are driven by independent or weakly coupled processes, most of the existing results fail to address multi-agent systems where the network and state dynamics are highly coupled and evolve based on status of heterogeneous agents. Motivated by numerous applications of such dynamics in social sciences, in this paper we provide a new direction toward analysis of dynamic networks of heterogeneous agents under complex time-varying environments. As a result we show how Lyapunov stability of several challenging problems from opinion dynamics can be established using a simple application of our framework. Moreover, we introduce a new class of asymmetric opinion dynamics, namely nearest neighbor dynamics, and show how our approach can be used to analyze their behavior. In particular, we extend our results to game-theoretic settings and provide new insights toward analysis of complex networked multi-agent systems using the exciting field of sequential optimization.

Submitted 21 December, 2018; v1 submitted 28 September, 2018; originally announced October 2018.

arXiv:1709.06071 [pdf, ps, other]

Managing Price Uncertainty in Prosumer-Centric Energy Trading: A Prospect-Theoretic Stackelberg Game Approach

Authors: Georges El Rahi, S. Rasoul Etesami, Walid Saad, Narayan Mandayam, H. Vincent Poor

Abstract: In this paper, the problem of energy trading between smart grid prosumers, who can simultaneously consume and produce energy, and a grid power company is studied. The problem is formulated as a single-leader, multiple-follower Stackelberg game between the power company and multiple prosumers. In this game, the power company acts as a leader who determines the pricing strategy that maximizes its profits, while the prosumers act as followers who react by choosing the amount of energy to buy or sell so as to optimize their current and future profits. The proposed game accounts for each prosumer's subjective decision when faced with the uncertainty of profits, induced by the random future price. In particular, the framing effect, from the framework of prospect theory (PT), is used to account for each prosumer's valuation of its gains and losses with respect to an individual utility reference point. The reference point changes between prosumers and stems from their past experience and future aspirations of profits. The followers' noncooperative game is shown to admit a unique pure-strategy Nash equilibrium (NE) under classical game theory (CGT) which is obtained using a fully distributed algorithm. The results are extended to account for the case of PT using algorithmic solutions that can achieve an NE under certain conditions. Simulation results show that the total grid load varies significantly with the prosumers' reference point and their loss-aversion level.
In addition, it is shown that the power company's profits considerably decrease when it fails to account for the prosumers' subjective perceptions under PT.

Submitted 18 September, 2017; originally announced September 2017.

arXiv:1705.03805 [pdf, other]

Smart Routing of Electric Vehicles for Load Balancing in Smart Grids

Authors: S. Rasoul Etesami, Walid Saad, Narayan Mandayam, H. V. Poor

Abstract: Electric vehicles (EVs) are expected to be a major component of the smart grid. The rapid proliferation of EVs will introduce an unprecedented load on the existing electric grid due to the charging/discharging behavior of the EVs, thus motivating the need for novel approaches for routing EVs across the grid. In this paper, a novel game-theoretic framework for smart routing of EVs within the smart grid is proposed. The goal of this framework is to balance the electricity load across the grid while taking into account the traffic congestion and the waiting time at charging stations. The EV routing problem is formulated as a noncooperative game. For this game, it is shown that selfish behavior of EVs will result in a pure-strategy Nash equilibrium with the price of anarchy upper bounded by the variance of the ground load induced by the residential, industrial, or commercial users. Moreover, the results are extended to capture the stochastic nature of induced ground load as well as the subjective behavior of the owners of EVs as captured by using notions from the behavioral framework of prospect theory. Simulation results provide new insights on more efficient energy pricing at charging stations and under more realistic grid conditions.

Submitted 27 December, 2019; v1 submitted 10 May, 2017; originally announced May 2017.

arXiv:1610.02067 [pdf, other]

Stochastic Games for Smart Grid Energy Management with Prospect Prosumers

Authors: Seyed Rasoul Etesami, Walid Saad, Narayan Mandayam, H. Vincent Poor

Abstract: In this paper, the problem of smart grid energy management under stochastic dynamics is investigated. In the considered model, at the demand side, it is assumed that customers can act as prosumers who own renewable energy sources and can both produce and consume energy. Due to the coupling between the prosumers' decisions and the stochastic nature of renewable energy, the interaction among prosumers is formulated as a stochastic game, in which each prosumer seeks to maximize its payoff, in terms of revenues, by controlling its energy consumption and demand. In particular, the subjective behavior of prosumers is explicitly reflected into their payoff functions using prospect theory, a powerful framework that allows modeling real-life human choices. For this prospect-based stochastic game, it is shown that there always exists a stationary Nash equilibrium where the prosumers' trading policies in the equilibrium are independent of the time and their histories of the play. Moreover, a novel distributed algorithm with no information sharing among prosumers is proposed and shown to converge to an $ε$-Nash equilibrium. On the other hand, at the supply side, the interaction between the utility company and the prosumers is formulated as an online optimization problem in which the utility company's goal is to learn its optimal energy allocation rules.
For this case, it is shown that such an optimization problem admits a no-regret algorithm meaning that regardless of the actual outcome of the game among the prosumers, the utility company can follow a strategy that mitigates its allocation costs as if it knew the entire demand market a priori. Simulation results show the convergence of the proposed algorithms to their predicted outcomes and present new insights resulting from prospect theory that contribute toward more efficient energy management in the smart grids.

Submitted 7 August, 2017; v1 submitted 6 October, 2016; originally announced October 2016.

arXiv:1603.06083 [pdf, other]

doi 10.1007/s00530-016-0511-z

Towards Coordinated Bandwidth Adaptations for Hundred-Scale 3D Tele-Immersive Systems

Authors: Mohammad Hosseini, Gregorij Kurillo, Seyed Rasoul Etesami, Jiang Yu

Abstract: 3D tele-immersion improves the state of collaboration among geographically distributed participants. Unlike the traditional 2D videos, a 3D tele-immersive system employs multiple 3D cameras based in each physical site to cover a much larger field of view, generating a very large amount of stream data. One of the major challenges is how to efficiently transmit these bulky 3D streaming data to bandwidth-constrained sites. In this paper, we study an adaptive Human Visual System (HVS) -compliant bandwidth management framework for efficient delivery of hundred-scale streams produced from distributed 3D tele-immersive sites to a receiver site with limited bandwidth budget. Our adaptation framework exploits the semantics link of HVS with multiple 3D streams in the 3D tele-immersive environment. We developed TELEVIS, a visual simulation tool to showcase a HVS-aware tele-immersive system for realistic cases. Our evaluation results show that the proposed adaptation can improve the total quality per unit of bandwidth used to deliver streams in 3D tele-immersive systems.

Submitted 28 March, 2016; v1 submitted 19 March, 2016; originally announced March 2016.

Comments: Springer Multimedia Systems Journal, 14 pages, March 2016

arXiv:1506.04047 [pdf, other]

Approximation Algorithm for the Binary-Preference Capacitated Selfish Replication Game and a Tight Bound on its Price of Anarchy

Authors: Seyed Rasoul Etesami, Tamer Basar

Abstract: We consider the capacitated selfish replication (CSR) game with binary preferences, over general undirected networks. We first show that such games have an associated ordinary potential function, and hence always admit a pure-strategy Nash equilibrium (NE). Further, when the minimum degree of the network and the number of resources are of the same order, there exists an exact polynomial time algorithm which can find a NE. Following this, we study the price of anarchy of such games, and show that it is bounded above by 3; we further provide some instances for which the price of anarchy is at least 2. We develop a quasi-polynomial algorithm O(n^2D^{ln n}), where n is the number of players and D is the diameter of the network, which can find, in a distributed manner, an allocation profile that is within a constant factor of the optimal allocation, and hence of any pure-strategy NE of the game. Proof of this result uses a novel potential function.

Submitted 11 March, 2016; v1 submitted 12 June, 2015; originally announced June 2015.

arXiv:1504.01438 [pdf, other]

Convergence Time of Quantized Metropolis Consensus Over Time-Varying Networks

Authors: Tamer Basar, Seyed Rasoul Etesami, Alex Olshevsky

Abstract: We consider the quantized consensus problem on undirected time-varying connected graphs with n nodes, and devise a protocol with fast convergence time to the set of consensus points. Specifically, we show that when the edges of each network in a sequence of connected time-varying networks are activated based on Poisson processes with Metropolis rates, the expected convergence time to the set of consensus points is at most O(n^2 log^2 n), where each node performs a constant number of updates per unit time.

Submitted 2 February, 2016; v1 submitted 6 April, 2015; originally announced April 2015.

arXiv:1412.6546 [pdf, other]

doi 10.1109/TAC.2015.2394954

Game-Theoretic Analysis of the Hegselmann-Krause Model for Opinion Dynamics in Finite Dimensions

Authors: Seyed Rasoul Etesami, Tamer Basar

Abstract: We consider the Hegselmann-Krause model for opinion dynamics and study the evolution of the system under various settings. We first analyze the termination time of the synchronous Hegselmann-Krause dynamics in arbitrary finite dimensions and show that the termination time in general only depends on the number of agents involved in the dynamics. To the best of our knowledge, that is the sharpest bound for the termination time of such dynamics that removes dependency of the termination time from the dimension of the ambient space. This answers an open question in [1] on how to obtain a tighter upper bound for the termination time. Furthermore, we study the asynchronous Hegselmann-Krause model from a novel game-theoretic approach and show that the evolution of an asynchronous Hegselmann-Krause model is equivalent to a sequence of best response updates in a well-designed potential game. We then provide a polynomial upper bound for the expected time and expected number of switching topologies until the dynamic reaches an arbitrarily small neighborhood of its equilibrium points, provided that the agents update uniformly at random. This is a step toward analysis of heterogeneous Hegselmann-Krause dynamics. Finally, we consider the heterogeneous Hegselmann-Krause dynamics and provide a necessary condition for the finite termination time of such dynamics. In particular, we sketch some future directions toward more detailed analysis of the heterogeneous Hegselmann-Krause model.

Submitted 19 December, 2014; originally announced December 2014.

Comments: The paper is accepted in IEEE Transactions on Automatic Control and will appear soon

arXiv:1404.3442

Optimal versus Nash Equilibrium Computation for Networked Resource Allocation

Authors: S. Rasoul Etesami

Abstract: Motivated by emerging resource allocation and data placement problems such as web caches and peer-to-peer systems, we consider and study a class of resource allocation problems over a network of agents (nodes). In this model, nodes can store only a limited number of resources while accessing the remaining ones through their closest neighbors. We consider this problem under both optimization and game-theoretic frameworks. In the case of optimal resource allocation we will first show that when there are only k=2 resources, the optimal allocation can be found efficiently in O(n^2\log n) steps, where n denotes the total number of nodes. However, for k>2 this problem becomes NP-hard with no polynomial time approximation algorithm with a performance guarantee better than 1+1/102k^2, even under metric access costs. We then provide a 3-approximation algorithm for the optimal resource allocation which runs only in linear time O(n). Subsequently, we look at this problem under a selfish setting formulated as a noncooperative game and provide a 3-approximation algorithm for obtaining its pure Nash equilibria under metric access costs. We then establish an equivalence between the set of pure Nash equilibria and flip-optimal solutions of the Max-k-Cut problem over a specific weighted complete graph. Using this reduction, we show that finding the lexicographically smallest Nash equilibrium for k> 2 is NP-hard, and provide an algorithm to find it in O(n^3 2^n) steps. While the reduction to weighted Max-k-Cut suggests that finding a pure Nash equilibrium using best response dynamics might be PLS-hard, it allows us to use tools from quadratic programming to devise more systematic algorithms towards obtaining Nash equilibrium points.

Submitted 4 January, 2020; v1 submitted 13 April, 2014; originally announced April 2014.

Comments: A more general version of this arXiv version is already published

arXiv:1403.4109 [pdf, other]

doi 10.1109/TAC.2015.2440568

Convergence Time for Unbiased Quantized Consensus Over Static and Dynamic Networks

Authors: Seyed Rasoul Etesami, Tamer Basar

Abstract: In this paper, the question of expected time to convergence is addressed for unbiased quantized consensus on undirected connected graphs, and some strong results are obtained. The paper first provides a tight expression for the expected convergence time of the unbiased quantized consensus over general but fixed networks. It is shown that the maximum expected convergence time lies within a constant factor of the maximum hitting time of an appropriate lazy random walk, using the theory of harmonic functions for reversible Markov chains. Following this, and using electric resistance analogy of the reversible Markov chains, the paper provides a tight upper bound for the expected convergence time to consensus based on the parameters of the network. Moreover, the paper identifies a precise order of the maximum expected convergence time for some simple graphs such as line graph and cycle. Finally, the results are extended to bound the expected convergence time of the underlying dynamics in time-varying networks. Modeling such dynamics as the evolution of a time inhomogeneous Markov chain, the paper derives a tight upper bound for expected convergence time of the dynamics using the spectral representation of the networks. This upper bound is significantly better than earlier results for the quantized consensus problem over time-varying graphs.

Submitted 19 December, 2014; v1 submitted 17 March, 2014; originally announced March 2014.

Comments: The paper is accepted in IEEE Transactions on Automatic Control and will appear soon

arXiv:1403.3881 [pdf, other]

Complexity of Equilibrium in Diffusion Games on Social Networks

Authors: Seyed Rasoul Etesami, Tamer Basar

Abstract: In this paper, we consider the competitive diffusion game, and study the existence of its pure-strategy Nash equilibrium when defined over general undirected networks. We first determine the set of pure-strategy Nash equilibria for two special but well-known classes of networks, namely the lattice and the hypercube. Characterizing the utility of the players in terms of graphical distances of their initial seed placements to other nodes in the network, we show that in general networks the decision process on the existence of pure-strategy Nash equilibrium is an NP-hard problem. Following this, we provide some necessary conditions for a given profile to be a Nash equilibrium. Furthermore, we study players' utilities in the competitive diffusion game over Erdos-Renyi random graphs and show that as the size of the network grows, the utilities of the players are highly concentrated around their expectation, and are bounded below by some threshold based on the parameters of the network. Finally, we obtain a lower bound for the maximum social welfare of the game with two players, and study sub-modularity of the players' utilities.

Submitted 7 June, 2015; v1 submitted 16 March, 2014; originally announced March 2014.

Comments: A shorter version of this paper appeared in the 2014 American Control Conference (ACC2014)

Showing 1–38 of 38 results for author: Etesami, R