
Fully Adaptive Regret-Guaranteed Algorithm for Control of Linear Quadratic Systems

Authors: Jafar Abbaszadeh Chekan, Cedric Langbort

Abstract: The first algorithm for the Linear Quadratic (LQ) control problem with an unknown system model, featuring a regret of $\mathcal{O}(\sqrt{T})$, was introduced by Abbasi-Yadkori and Szepesvári (2011). Recognizing the computational complexity of this algorithm, subsequent efforts (see Cohen et al. (2019), Mania et al. (2019), Faradonbeh et al. (2020a), and Kargin et al. (2022)) have been dedicated to proposing algorithms that are computationally tractable while preserving this order of regret. Although successful, the existing works in the literature lack a fully adaptive exploration-exploitation trade-off adjustment and require a user-defined value, which can inflate the overall regret bound by some factors. In this work, noticing this gap, we propose the first fully adaptive algorithm that controls the number of policy updates (i.e., tunes the exploration-exploitation trade-off) and optimizes the upper bound on the regret adaptively. Our proposed algorithm builds on the SDP-based approach of Cohen et al. (2019) and relaxes its need for a horizon-dependent warm-up phase by appropriately tuning the regularization parameter and adding an adaptive input perturbation. We further show that through careful exploration-exploitation trade-off adjustment there is no need to commit to the widely-used notion of strong sequential stability, which is restrictive and can introduce complexities in initialization.

Submitted 11 June, 2024; originally announced June 2024.

arXiv:2406.00295 [pdf, ps, other]

On Network Congestion Reduction Using Public Signals Under Boundedly Rational User Equilibria (Full Version)

Authors: Olivier Massicot, Cédric Langbort

Abstract: Boundedly Rational User Equilibria (BRUE) capture situations where all agents on a transportation network are electing the fastest option up to some time indifference, and serve as a relaxation of User Equilibria (UE), where each agent exactly minimizes their travel time. We study how the social cost under BRUE departs from that of UE in the context of static demand and stochastic costs, along with the implications of BRUE on the optimal signaling scheme of a benevolent central planner. We show that the average excess time is sublinear in the maximum time indifference of the agents, though such an aggregate may hide disparity between populations and the sublinearity constant depends on the topology of the network. Regarding the design of public signals, even though in the limit where agents are totally indifferent, it is optimal to not reveal any information, there is in general no trend in how much information is optimally disclosed to agents. What is more, an increase in information disclosed may either harm or benefit agents as a whole.

Submitted 1 June, 2024; originally announced June 2024.

Comments: Extends the version submitted to CPHS'24

arXiv:2402.09639 [pdf, other]

Misinformation Regulation in the Presence of Competition between Social Media Platforms (Extended Version)

Authors: So Sasaki, Cédric Langbort

Abstract: Social media platforms have diverse content moderation policies, with many prominent actors hesitant to impose strict regulations. A key reason for this reluctance could be the competitive advantage that comes with lax regulation. A popular platform that starts enforcing content moderation rules may fear that it could lose users to less-regulated alternative platforms. Moreover, if users continue harmful activities on other platforms, regulation ends up being futile. This article examines the competitive aspect of content moderation by considering the motivations of all involved players (platformer, news source, and social media users), identifying the regulation policies sustained in equilibrium, and evaluating the information quality available on each platform. Applied to simple yet relevant social networks such as stochastic block models, our model reveals the conditions for a popular platform to enforce strict regulation without losing users. Effectiveness of regulation depends on the diffusive property of news posts, friend interaction qualities in social media, the sizes and cohesiveness of communities, and how much sympathizers appreciate surprising news from influencers.

Submitted 14 February, 2024; originally announced February 2024.

Comments: This version extends the article submitted to the IEEE Transactions on Control of Network Systems

arXiv:2312.06641 [pdf, other]

Online Decision Making with History-Average Dependent Costs (Extended)

Authors: Vijeth Hebbar, Cedric Langbort

Abstract: In many online sequential decision-making scenarios, a learner's choices affect not just their current costs but also the future ones. In this work, we look at one particular case of such a situation where the costs depend on the time average of past decisions over a history horizon. We first recast this problem with history dependent costs as a problem of decision making under stage-wise constraints. To tackle this, we then propose the novel Follow-The-Adaptively-Regularized-Leader (FTARL) algorithm. Our innovative algorithm incorporates adaptive regularizers that depend explicitly on past decisions, allowing us to enforce stage-wise constraints while simultaneously enabling us to establish tight regret bounds. We also discuss the implications of the length of history horizon on design of no-regret algorithms for our problem and present impossibility results when it is the full learning horizon.

Submitted 11 December, 2023; originally announced December 2023.

Comments: Submitted to L4DC 2024. This is an extended version including proofs and experimental results

arXiv:2306.13956 [pdf, other]

Pointwise-in-Time Explanation for Linear Temporal Logic Rules

Authors: Noel Brindise, Cedric Langbort

Abstract: The new field of Explainable Planning (XAIP) has produced a variety of approaches to explain and describe the behavior of autonomous agents to human observers. Many summarize agent behavior in terms of the constraints, or "rules," which the agent adheres to during its trajectories. In this work, we narrow the focus from summary to specific moments in individual trajectories, offering a "pointwise-in-time" view. Our novel framework, which we define on Linear Temporal Logic (LTL) rules, assigns an intuitive status to any rule in order to describe the trajectory progress at individual time steps; here, a rule is classified as active, satisfied, inactive, or violated. Given a trajectory, a user may query for the status of specific LTL rules at individual trajectory time steps. In this paper, we present this novel framework, named Rule Status Assessment (RSA), and provide an example of its implementation. We find that pointwise-in-time status assessment is useful as a post-hoc diagnostic, enabling a user to systematically track the agent's behavior with respect to a set of rules.

Submitted 1 October, 2023; v1 submitted 24 June, 2023; originally announced June 2023.

Comments: See related publication in Conference on Decision and Control (CDC) 2023

arXiv:2302.02270 [pdf, ps, other]

Regret-Guaranteed Safe Switching with Minimum Cost: LQR Setting with Unknown Dynamics

Authors: Jafar Abbaszadeh Chekan, Cedric Langbort

Abstract: Externally Forced Switched (EFS) systems represent a subset of switched systems where switches occur deliberately to meet an external requirement. However, fast switching can lead to instability, even when all closed-loop modes are stable. In this study, our focus is on an EFS scenario with unknown system dynamics, where the next mode to switch to is revealed by an external entity in real-time as the switch occurs. The challenge is to track the revealed sequence while (1) minimizing accumulated cost in a regretful sense and (2) ensuring that the norm of the system's state does not grow excessively, a property we refer to as 'the safety of switching.' Achieving the latter involves requiring the closed-loop system to remain in each revealed mode for some minimum dwell time, which must be learned online. We propose an algorithm based on the principles of Optimism in the Face of Uncertainty. This algorithm jointly establishes confidence sets for unknown parameters, devises a feedback policy, and estimates a minimum dwell time for each revealed mode from data. By precisely estimating dwell-time error, our strategy yields an expected regret of $\mathcal{O}(|M| \sqrt{ns})$, where $ns$ and $|M|$ denote the total number of switches and the number of modes, respectively. We benchmark this approach against scenarios with known parameters.

Submitted 19 December, 2023; v1 submitted 4 February, 2023; originally announced February 2023.

arXiv:2212.13619 [pdf, ps, other]

Almost-Bayesian Quadratic Persuasion (Extended Version)

Authors: Olivier Massicot, Cédric Langbort

Abstract: In this article, we relax the Bayesianity assumption in the now-traditional model of Bayesian Persuasion introduced by Kamenica & Gentzkow. Unlike preexisting approaches -- which have tackled the possibility of the receiver (Bob) being non-Bayesian by considering that his thought process is not Bayesian yet known to the sender (Alice), possibly up to a parameter -- we let Alice merely assume that Bob behaves 'almost like' a Bayesian agent, in some sense, without resorting to any specific model. Under this assumption, we study Alice's strategy when both utilities are quadratic and the prior is isotropic. We show that, contrary to the Bayesian case, Alice's optimal response may not be linear anymore. This fact is unfortunate as linear policies remain the only ones for which the induced belief distribution is known. What is more, evaluating linear policies proves difficult except in particular cases, let alone finding an optimal one. Nonetheless, we derive bounds that prove linear policies are near-optimal and allow Alice to compute a near-optimal linear policy numerically. With this solution in hand, we show that Alice shares less information with Bob as he departs more from Bayesianity, much to his detriment.

Submitted 1 March, 2024; v1 submitted 27 December, 2022; originally announced December 2022.

Comments: This version extends the article submitted to the IEEE Transactions on Automatic Control

arXiv:2210.01374 [pdf, ps, other]

Safety-Aware Learning-Based Control of Systems with Uncertainty Dependent Constraints (extended version)

Authors: Jafar Abbaszadeh Chekan, Cedric Langbort

Abstract: The problem of safely learning and controlling a dynamical system - i.e., of stabilizing an originally (partially) unknown system while ensuring that it does not leave a prescribed 'safe set' - has recently received tremendous attention in the controls community. Further complexities arise, however, when the structure of the safe set itself depends on the unknown part of the system's dynamics. In particular, a popular approach based on control Lyapunov functions (CLF), control barrier functions (CBF) and Gaussian processes (to build a confidence set around the unknown term), which has proved successful in the known-safe-set setting, becomes inefficient as-is, due to the introduction of higher-order terms to be estimated and bounded with high probability using only system state measurements. In this paper, we build on the recent literature on GPs and reproducing kernels to perform this latter task, and show how to correspondingly modify the CLF-CBF-based approach to obtain safety guarantees. Namely, we derive exponential CLF and second relative order exponential CBF constraints whose satisfaction guarantees stability and forward invariance of the partially unknown safe set with high probability. To overcome the intractability of verifying these conditions on the continuous domain, we discretize the state space and use Lipschitz continuity properties of the dynamics to derive equivalent CLF and CBF certificates in discrete state space. Finally, we present a control design algorithm that uses the derived certificates.

Submitted 8 October, 2023; v1 submitted 4 October, 2022; originally announced October 2022.

arXiv:2207.10827 [pdf, other]

Learn and Control while Switching: with Guaranteed Stability and Sublinear Regret

Authors: Jafar Abbaszadeh Chekan, Cédric Langbort

Abstract: Over-actuated systems often make it possible to achieve specific performances by switching between different subsets of actuators. However, when the system parameters are unknown, transferring authority to different subsets of actuators is challenging due to stability and performance efficiency concerns. This paper presents an efficient algorithm to tackle the so-called "learn and control while switching between different actuating modes" problem in the Linear Quadratic (LQ) setting. Our proposed strategy builds upon an Optimism in the Face of Uncertainty (OFU) based algorithm equipped with a projection toolbox to keep the algorithm efficient, regret-wise. Along the way, we derive an optimum duration for the warm-up phase, thanks to the existence of a stabilizing neighborhood. The stability of the switched system is also guaranteed by designing a minimum average dwell time. The proposed strategy is proved to have a regret bound of $\bar{\mathcal{O}}\big(\sqrt{T}\big)+\mathcal{O}\big(ns\sqrt{T}\big)$ in horizon $T$ with $ns$ switches, provably outperforming naively applying the basic OFU algorithm.

Submitted 2 October, 2023; v1 submitted 21 July, 2022; originally announced July 2022.

arXiv:2203.16660 [pdf, other]

On The Role of Social Identity in the Market for (Mis)information

Authors: Vijeth Hebbar, Cedric Langbort

Abstract: Motivated by recent works in the communication and psychology literature, we model and study the role social identity -- a person's sense of belonging to a group -- plays in human information consumption. A hallmark of Social Identity Theory (SIT) is the notion of 'status', i.e., an individual's desire to enhance their and their 'in-group's' utility relative to that of an 'out-group'. In the context of belief formation, this comes off as a desire to believe positive news about the in-group and negative news about the out-group, which has been empirically shown to support belief in misinformation and false news. We model this phenomenon as a Stackelberg game being played over an information channel between a news-source (sender) and news-consumer (receiver), with the receiver incorporating the 'status' associated with social identity in their utility, in addition to accuracy. We characterize the strategy that must be employed by the sender to ensure that its message is trusted by receivers of all identities while maximizing their overall quality of information. We show that, as a rule, this optimal quality of information at equilibrium decreases when a receiver's sense of identity increases. We further demonstrate how extensions of our model can be used to quantitatively estimate the level of importance given to identity in a population.

Submitted 1 April, 2022; v1 submitted 30 March, 2022; originally announced March 2022.

Comments: Submitted to CDC 2022. Reworded parts of section V and corrected typos throughout

arXiv:2105.14709 [pdf, other]

Joint Stabilization and Regret Minimization through Switching in Over-Actuated Systems (extended version)

Authors: Jafar Abbaszadeh Chekan, Kamyar Azizzadenesheli, Cedric Langbort

Abstract: Adaptively controlling and minimizing regret in unknown dynamical systems while controlling the growth of the system state is crucial in real-world applications. In this work, we study the problem of stabilization and regret minimization of linear over-actuated dynamical systems. We propose an optimism-based algorithm that leverages the possibility of switching between actuating modes in order to alleviate state explosion during initial time steps. We theoretically study the rate at which our algorithm learns a stabilizing controller and prove that it achieves a regret upper bound of $\mathcal{O}(\sqrt{T})$.

Submitted 9 February, 2022; v1 submitted 31 May, 2021; originally announced May 2021.

arXiv:2004.00241 [pdf, other]

Regret Bounds for LQ Adaptive Control Under Database Attacks (Extended Version)

Authors: Jafar Abbaszadeh Chekan, Cedric Langbort

Abstract: This paper is concerned with understanding and countering the effects of database attacks on a learning-based linear quadratic adaptive controller. This attack targets neither sensors nor actuators, but just poisons the learning algorithm and parameter estimator that is part of the regulation scheme. We focus on the adaptive optimal control algorithm introduced by Abbasi-Yadkori and Szepesvári and provide regret analysis in the presence of attacks as well as modifications that mitigate their effects. A core step of this algorithm is the self-regularized on-line least squares estimation, which determines a tight confidence set around the true parameters of the system with high probability. In the absence of malicious data injection, this set provides an appropriate estimate of parameters for control design. However, in the presence of an attack, this confidence set is not reliable anymore. Hence, we first tackle the question of how to adjust the confidence set so that it can compensate for the effect of the poisonous data. Then, we quantify the deleterious effect of this type of attack on the optimality of the control policy by bounding the regret of the closed-loop system under attack.

Submitted 5 November, 2022; v1 submitted 1 April, 2020; originally announced April 2020.

Comments: 15 pages

arXiv:2002.05346 [pdf, other]

Protecting Consumers Against Personalized Pricing: A Stopping Time Approach

Authors: Roy Dong, Erik Miehling, Cedric Langbort

Abstract: The widespread availability of behavioral data has led to the development of data-driven personalized pricing algorithms: sellers attempt to maximize their revenue by estimating the consumer's willingness-to-pay and pricing accordingly. Our objective is to develop algorithms that protect consumer interests against personalized pricing schemes. In this paper, we consider a consumer who learns more and more about a potential purchase across time, while simultaneously revealing more and more information about herself to a potential seller. We formalize a strategic consumer's purchasing decision when interacting with a seller who uses personalized pricing algorithms, and contextualize this problem among the existing literature in optimal stopping time theory and computational finance. We provide an algorithm that consumers can use to protect their own interests against personalized pricing algorithms. This algorithmic stopping method uses sample paths to train estimates of the optimal stopping time. To the best of our knowledge, this is one of the first works that provides computational methods for the consumer to maximize her utility when making decisions under surveillance. We demonstrate the efficacy of the algorithmic stopping method using a numerical simulation, where the seller uses a Kalman filter to approximate the consumer's valuation and sets prices based on myopic expected revenue maximization. Compared to a myopic purchasing strategy, we demonstrate increased payoffs for the consumer in expectation.

Submitted 11 February, 2020; originally announced February 2020.

arXiv:1909.06057 [pdf, other]

Strategic Inference with a Single Private Sample

Authors: Erik Miehling, Roy Dong, Cédric Langbort, Tamer Başar

Abstract: Motivated by applications in cyber security, we develop a simple game model for describing how a learning agent's private information influences an observing agent's inference process. The model describes a situation in which one of the agents (attacker) is deciding which of two targets to attack, one with a known reward and another with uncertain reward. The attacker receives a single private sample from the uncertain target's distribution and updates its belief of the target quality. The other agent (defender) knows the true rewards, but does not see the sample that the attacker has received. This leads to agents possessing asymmetric information: the attacker is uncertain over the parameter of the distribution, whereas the defender is uncertain about the observed sample. After the attacker updates its belief, both the attacker and the defender play a simultaneous move game based on their respective beliefs. We offer a characterization of the pure strategy equilibria of the game and explain how the players' decisions are influenced by their prior knowledge and the payoffs/costs.

Submitted 13 September, 2019; originally announced September 2019.

Comments: Accepted to 58th Conference on Decision and Control (2019)

arXiv:1810.04301 [pdf, ps, other]

Detection and Mitigation of Biasing Attacks on Distributed Estimation Networks

Authors: Mohammad Deghat, Valery Ugrinovskii, Iman Shames, Cedric Langbort

Abstract: The paper considers a problem of detecting and mitigating biasing attacks on networks of state observers targeting cooperative state estimation algorithms. The problem is cast within the recently developed framework of distributed estimation utilizing the vector dissipativity approach. The paper shows that a network of distributed observers can be endowed with an additional attack detection layer capable of detecting biasing attacks and correcting their effect on estimates produced by the network. An example is provided to illustrate the performance of the proposed distributed attack detector.

Submitted 9 October, 2018; originally announced October 2018.

Comments: Accepted for publication in Automatica

arXiv:1802.07126 [pdf, other]

On Estimating Multi-Attribute Choice Preferences using Private Signals and Matrix Factorization

Authors: Venkata Sriram Siddhardh Nadendla, Cedric Langbort

Abstract: Revealed preference theory studies the possibility of modeling an agent's revealed preferences and the construction of a consistent utility function. However, modeling an agent's choices over preference orderings is not always practical and demands strong assumptions on human rationality and data-acquisition abilities. Therefore, we propose a simple generative choice model where agents are assumed to generate the choice probabilities based on latent factor matrices that capture their choice evaluation across multiple attributes. Since the multi-attribute evaluation is typically hidden within the agent's psyche, we consider a signaling mechanism where agents are provided with choice information through private signals, so that the agent's choices provide more insight about his/her latent evaluation across multiple attributes. We estimate the choice model via a novel multi-stage matrix factorization algorithm that minimizes the average deviation of the factor estimates from choice data. Simulation results are presented to validate the estimation performance of our proposed algorithm.

Submitted 19 February, 2018; originally announced February 2018.

Comments: 6 pages, 2 figures, to be presented at CISS conference

arXiv:1711.02308 [pdf, ps, other]

Security Strategies of Both Players in Asymmetric Information Zero-Sum Stochastic Games with an Informed Controller

Authors: Lichun Li, Cedric Langbort, Jeff S. Shamma

Abstract: This paper considers a zero-sum two-player asymmetric information stochastic game where only one player knows the system state, and the transition law is controlled by the informed player only. For the informed player, it has been shown that the security strategy only depends on the belief and the current stage. We provide LP formulations whose size is only linear in the size of the uninformed player's action set to compute both history based and belief based security strategies. For the uninformed player, we focus on the regret, the difference between 0 and the future payoff guaranteed by the uninformed player in every possible state. Regret is a real vector of the same size as the belief, and depends only on the action of the informed player and the strategy of the uninformed player. This paper shows that the uninformed player has a security strategy that only depends on the regret and the current stage. LP formulations are then given to compute the history based security strategy, the regret at every stage, and the regret based security strategy. The size of the LP formulations is again linear in the size of the uninformed player's action set. Finally, an intrusion detection problem is studied to demonstrate the main results in this paper.

Submitted 7 November, 2017; originally announced November 2017.

Comments: submitted to special issue in the journal Dynamic Games and Applications

arXiv:1711.02303 [pdf, ps, other]

Iterative Computation of Security Strategies of Matrix Games with Growing Action Set

Authors: Lichun Li, Cedric Langbort

Abstract: This paper studies how to efficiently update the saddle-point strategy, or security strategy, of one player in a matrix game when the other player develops new actions in the game. It is well known that the saddle-point strategy of one player can be computed by solving a linear program. Developing a new action will add a new constraint to the existing LP. Therefore, our problem becomes how to solve the new LP with a new constraint efficiently. Considering the potentially huge number of constraints, which corresponds to the large size of the other player's action set, we use the shadow vertex simplex method, whose computational complexity is lower than linear with respect to the size of the constraints, as the basis of our iterative algorithm. We first rebuild the main theorems of the shadow vertex method with relaxed assumptions to make sure the method works well in our model, then analyze the probability that the old optimum remains optimal in the new LP, and finally provide the iterative shadow vertex method, whose computational complexity is shown to be strictly less than that of the shadow vertex method. The simulation results demonstrate our main results about the probability of re-computing the optimum and the computational complexity of the iterative shadow vertex method.

Submitted 7 November, 2017; originally announced November 2017.

Comments: submitted to special issue in the journal Dynamic Games and Applications
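
The observation in the abstract above, that a new opponent action only adds one constraint to the existing LP, can be illustrated on a toy case. The following is a minimal sketch in Python, not the shadow vertex method of the paper: it assumes a row player with exactly two actions and integer payoffs, and solves the tiny LP by enumerating candidate vertices. The names `security_strategy` and `still_optimal` are hypothetical helpers introduced only for this illustration.

```python
from fractions import Fraction
from itertools import combinations


def security_strategy(payoff):
    """Security strategy of a row player with two actions (assumed toy case).

    payoff is a 2 x n list of integer rows; the row player maximizes.
    Returns (p, value), where p is the probability of playing row 0.
    """
    top, bottom = payoff
    cols = list(zip(top, bottom))

    def guaranteed(p):
        # Worst-case expected payoff over all opponent columns.
        return min(p * a + (1 - p) * b for a, b in cols)

    # Candidate optima: the endpoints and every crossing of two column lines.
    candidates = {Fraction(0), Fraction(1)}
    for (a1, b1), (a2, b2) in combinations(cols, 2):
        denom = (a1 - b1) - (a2 - b2)
        if denom != 0:
            p = Fraction(b2 - b1, denom)
            if 0 <= p <= 1:
                candidates.add(p)
    best = max(sorted(candidates), key=guaranteed)
    return best, guaranteed(best)


def still_optimal(p, value, new_col):
    """True if the constraint from a new opponent column holds at the old optimum."""
    a, b = new_col
    return p * a + (1 - p) * b >= value


# Matching pennies: mix evenly, guaranteed value 0.
p_old, v_old = security_strategy([[1, -1], [-1, 1]])
# A harmless new column leaves the old optimum in place ...
keeps = still_optimal(p_old, v_old, (2, 2))
# ... while a damaging one forces a re-solve.
breaks = still_optimal(p_old, v_old, (0, -2))
p_new, v_new = security_strategy([[1, -1, 0], [-1, 1, -2]])
```

When `still_optimal` returns true, the old strategy can be kept, because adding a constraint can never raise the optimal value; only in the other case is a new solve needed.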

arXiv:1708.04956 [pdf, other]

Strategic Communication Between Prospect Theoretic Agents over a Gaussian Test Channel

Authors: Venkata Sriram Siddhardh Nadendla, Emrah Akyol, Cedric Langbort, Tamer Başar

Abstract: In this paper, we model a Stackelberg game in a simple Gaussian test channel where a human transmitter (leader) communicates a source message to a human receiver (follower). We model human decision making using prospect theory models proposed for continuous decision spaces. Assuming that the value function is the squared distortion at both the transmitter and the receiver, we analyze the effects of the weight functions at both the transmitter and the receiver on optimal communication strategies, namely encoding at the transmitter and decoding at the receiver, in the Stackelberg sense. We show that the optimal strategies for the behavioral agents in the Stackelberg sense are identical to those designed for unbiased agents. At the same time, we also show that the prospect-theoretic distortions at both the transmitter and the receiver are larger than the expected distortion, thus making behavioral agents less contented than unbiased agents. Consequently, the presence of cognitive biases increases the need for transmission power in order to achieve a given distortion at both transmitter and receiver.

Submitted 28 September, 2017; v1 submitted 14 August, 2017; originally announced August 2017.

Comments: 6 pages, 3 figures, Accepted to MILCOM-2017, Corrections made in the new version

arXiv:1706.01559 [pdf, ps, other]

Controller-jammer game models of Denial of Service in control systems operating over packet-dropping links

Authors: V. Ugrinovskii, C. Langbort

Abstract: The paper introduces a class of zero-sum games between the adversary and controller as a scenario for a 'denial of service' in a networked control system. The communication link is modeled as a set of transmission regimes controlled by a strategic jammer whose intention is to wage an attack on the plant by choosing a most damaging regime-switching strategy. We demonstrate that even in the one-step case, the introduced games admit a saddle-point equilibrium, at which the jammer's optimal policy is to randomize in a region of the plant's state space, thus requiring the controller to undertake a nontrivial response which is different from what one would expect in a standard stochastic control problem over a packet dropping link. The paper derives conditions for the introduced games to have such a saddle-point equilibrium. Furthermore, we show that in more general multi-stage games, these conditions provide 'greedy' jamming strategies for the adversary.

Submitted 5 June, 2017; originally announced June 2017.

Comments: Accepted for publication in Automatica

arXiv:1703.01957 [pdf, ps, other]

An LP Approach for Solving Two-Player Zero-Sum Repeated Bayesian Games

Authors: Lichun Li, Cedric Langbort, Jeff Shamma

Abstract: This paper studies two-player zero-sum repeated Bayesian games in which every player has a private type that is unknown to the other player, and the initial probability of the type of every player is publicly known. The types of players are independently chosen according to the initial probabilities, and are kept the same all through the game. At every stage, players simultaneously choose actions, and announce their actions publicly. For finite horizon cases, an explicit linear program is provided to compute players' security strategies. Moreover, based on the existing results in [1], this paper shows that a player's sufficient statistics, which is independent of the strategy of the other player, consists of the belief over the player's own type, the regret with respect to the other player's type, and the stage. Explicit linear programs are provided to compute the initial regrets, and the security strategies that only depend on the sufficient statistics. For discounted cases, following the same idea as in the finite horizon, this paper shows that a player's sufficient statistics consists of the belief of the player's own type and the anti-discounted regret with respect to the other player's type. Besides, an approximated security strategy depending on the sufficient statistics is provided, and an explicit linear program to compute the approximated security strategy is given. This paper also obtains a bound on the performance difference between the approximated security strategy and the security strategy.

Submitted 7 November, 2017; v1 submitted 6 March, 2017; originally announced March 2017.

Comments: submitted to TAC, under review

arXiv:1701.08058 [pdf, other]

Optimal Communication Strategies in Networked Cyber-Physical Systems with Adversarial Elements

Authors: Emrah Akyol, Kenneth Rose, Tamer Basar, Cedric Langbort

Abstract: This paper studies optimal communication and coordination strategies in cyber-physical systems for both defender and attacker within a game-theoretic framework. We model the communication network of a cyber-physical system as a sensor network which involves one single Gaussian source observed by many sensors, subject to additive independent Gaussian observation noises. The sensors communicate with the estimator over a coherent Gaussian multiple access channel. The aim of the receiver is to reconstruct the underlying source with minimum mean squared error. The scenario of interest here is one where some of the sensors are captured by the attacker and they act as the adversary (jammer): they strive to maximize distortion. The receiver (estimator) knows the captured sensors but still cannot simply ignore them due to the multiple access channel, i.e., the outputs of all sensors are summed to generate the estimator input. We show that the ability of transmitter sensors to secretly agree on a random event, that is "coordination", plays a key role in the analysis...

Submitted 27 January, 2017; originally announced January 2017.

Comments: submitted to IEEE Transactions on Signal and Information Processing over Networks, Special Issue on Distributed Signal Processing for Security and Privacy in Networked Cyber-Physical Systems

arXiv:1611.02329 [pdf, other]

Convergence Analysis of Iterated Best Response for a Trusted Computation Game

Authors: Shaunak D. Bopardikar, Alberto Speranzon, Cedric Langbort

Abstract: We introduce a game of trusted computation in which a sensor equipped with limited computing power leverages a central node to evaluate a specified function over a large dataset, collected over time. We assume that the central computer can be under attack and we propose a strategy where the sensor retains a limited amount of the data to counteract the effect of attack. We formulate the problem as a two player game in which the sensor (defender) chooses an optimal fusion strategy using both the non-trusted output from the central computer and locally stored trusted data. The attacker seeks to compromise the computation by influencing the fused value through malicious manipulation of the data stored on the central node. We first characterize all Nash equilibria of this game, which turn out to be dependent on parameters known to both players. Next we adopt an Iterated Best Response (IBR) scheme in which, at each iteration, the central computer reveals its output to the sensor, who then computes its best response based on a linear combination of its private local estimate and the untrusted third-party output. We characterize necessary and sufficient conditions for convergence of the IBR along with numerical results which show that the convergence conditions are relatively tight.

Submitted 7 November, 2016; originally announced November 2016.

Comments: Contains detailed proofs of all results as well as an additional section on "the case of equal means" (Section 5)

arXiv:1610.08210 [pdf, other]

Price of Transparency in Strategic Machine Learning

Authors: Emrah Akyol, Cedric Langbort, Tamer Basar

Abstract: Based on the observation that the transparency of an algorithm comes with a cost for the algorithm designer when the users (data providers) are strategic, this paper studies the impact of strategic intent of the users on the design and performance of transparent ML algorithms. We quantitatively study the price of transparency in the context of strategic classification algorithms, by modeling the problem as a nonzero-sum game between the users and the algorithm designer. The cost of having a transparent algorithm is measured by a quantity, named here as price of transparency, which is the ratio of the designer cost at the Stackelberg equilibrium, when the algorithm is transparent (which allows users to be strategic), to that of the setting where the algorithm is not transparent.

Submitted 26 October, 2016; originally announced October 2016.

Comments: 3rd Workshop on Fairness, Accountability, and Transparency in Machine Learning

arXiv:1609.05300 [pdf, ps, other]

Detection of Biasing Attacks on Distributed Estimation Networks

Authors: Mohammad Deghat, Valery Ugrinovskii, Iman Shames, Cedric Langbort

Abstract: The paper addresses the problem of detecting attacks on distributed estimator networks that aim to intentionally bias process estimates produced by the network. It provides a sufficient condition, in terms of the feasibility of certain linear matrix inequalities, which guarantees distributed input attack detection using an $H_\infty$ approach.

Submitted 17 September, 2016; originally announced September 2016.

Comments: The paper is to appear in Proceedings of the 55th IEEE Conference on Decision and Control, Las Vegas, December 2016

arXiv:1607.03273 [pdf, other]

Scalar Quadratic-Gaussian Soft Watermarking Games

Authors: Kivanc Mihcak, Emrah Akyol, Tamer Basar, Cedric Langbort

Abstract: We introduce the zero-sum game problem of soft watermarking: The hidden information (watermark) comes from a continuum and has a perceptual value; the receiver generates an estimate of the embedded watermark to minimize the expected estimation error (unlike the conventional watermarking schemes where both the hidden information and the receiver output are from a discrete finite set). Applications include embedding a multimedia content into another. We consider in this paper the scalar Gaussian case and use expected mean-squared distortion. We formulate the resulting problem as a zero-sum game between the encoder & receiver pair and the attacker. We show that for the linear encoder, the optimal attacker is Gaussian-affine, derive the optimal system parameters in that case, and discuss the corresponding system behavior. We also provide numerical results to gain further insight and understanding of the system behavior at optimality.

Submitted 12 July, 2016; originally announced July 2016.

Comments: submitted for publication

arXiv:1602.06406 [pdf, other]

On the Role of Side Information In Strategic Communication

Authors: Emrah Akyol, Cedric Langbort, Tamer Basar

Abstract: This paper analyzes the fundamental limits of strategic communication in network settings. Strategic communication differs from the conventional communication paradigms in information theory since it involves different objectives for the encoder and the decoder, which are aware of this mismatch and act accordingly. This leads to a Stackelberg game where both agents commit to their mappings ex-ante. Building on our prior work on the point-to-point setting, this paper studies the compression and communication problems with the receiver and/or transmitter side information setting. The equilibrium strategies and associated costs are characterized for the Gaussian variables with quadratic cost functions. Several questions on the benefit of side information in source and joint source-channel coding in such strategic settings are analyzed. Our analysis has uncovered an interesting result on optimality of uncoded mappings in strategic source-channel coding in networks.

Submitted 20 February, 2016; originally announced February 2016.

Comments: submitted to ISIT'16. arXiv admin note: text overlap with arXiv:1510.00764

arXiv:1510.03495 [pdf, other]

Privacy Constrained Information Processing

Authors: Emrah Akyol, Cedric Langbort, Tamer Basar

Abstract: This paper studies communication scenarios where the transmitter and the receiver have different objectives due to privacy concerns, in the context of a variation of the strategic information transfer (SIT) model of Sobel and Crawford. We first formulate the problem as the minimization of a common distortion by the transmitter and the receiver subject to a privacy constrained transmitter. We show the equivalence of this formulation to a Stackelberg equilibrium of the SIT problem. Assuming an entropy based privacy measure, a quadratic distortion measure and jointly Gaussian variables, we characterize the Stackelberg equilibrium. Next, we consider asymptotically optimal compression at the transmitter which inherently provides some level of privacy, and study equilibrium conditions. We finally analyze the impact of the presence of an average power constrained Gaussian communication channel between the transmitter and the receiver on the equilibrium conditions.

Submitted 12 October, 2015; originally announced October 2015.

Comments: will appear in CDC'15

arXiv:1510.00764 [pdf, other]

doi 10.1109/JPROC.2016.2575858

Information-Theoretic Approach to Strategic Communication as a Hierarchical Game

Authors: Emrah Akyol, Cedric Langbort, Tamer Basar

Abstract: This paper analyzes the information disclosure problems originated in economics through the lens of information theory. Such problems are radically different from the conventional communication paradigms in information theory since they involve different objectives for the encoder and the decoder, which are aware of this mismatch and act accordingly. This leads, in our setting, to a hierarchical communication game, where the transmitter announces an encoding strategy with full commitment, and its distortion measure depends on a private information sequence whose realization is available at the transmitter. The receiver decides on its decoding strategy that minimizes its own distortion based on the announced encoding map and the statistics. Three problem settings are considered, focusing on the quadratic distortion measures, and jointly Gaussian source and private information: compression, communication, and the simple equilibrium conditions without any compression or communication. The equilibrium strategies and associated costs are characterized. The analysis is then extended to the receiver side information setting and the major changes in the structure of optimal strategies are identified. Finally, an extension of the results to the broader context of decentralized stochastic control is presented.

Submitted 11 July, 2016; v1 submitted 2 October, 2015; originally announced October 2015.

Comments: in press, the Proceedings of the IEEE, Special Issue on Principles and Applications of Science of Information

arXiv:1404.1404 [pdf, other]

On the Existence of Optimal Policies for a Class of Static and Sequential Dynamic Teams

Authors: Abhishek Gupta, Serdar Yuksel, Tamer Basar, Cedric Langbort

Abstract: In this paper, we identify sufficient conditions under which static teams and a class of sequential dynamic teams admit team-optimal solutions. We first investigate the existence of optimal solutions in static teams where the observations of the decision makers are conditionally independent or satisfy certain regularity conditions. Building on these findings and the static reduction method of Witsenhausen, we then extend the analysis to sequential dynamic teams. In particular, we show that a large class of dynamic LQG team problems, including the vector version of the well-known Witsenhausen's counterexample and the Gaussian relay channel problem viewed as a dynamic team, admit team-optimal solutions. Results in this paper substantially broaden the class of stochastic control and team problems with non-classical information known to have optimal solutions.

Submitted 4 April, 2014; originally announced April 2014.

Comments: 38 pages, 2 figures

arXiv:1403.5641 [pdf, ps, other]

Control over adversarial packet-dropping communication networks revisited

Authors: V. Ugrinovskii, C. Langbort

Abstract: We revisit a one-step control problem over an adversarial packet-dropping link. The link is modeled as a set of binary channels controlled by a strategic jammer whose intention is to wage a 'denial of service' attack on the plant by choosing a most damaging channel-switching strategy. The paper introduces a class of zero-sum games between the jammer and controller as a scenario for such attack, and derives necessary and sufficient conditions for these games to have a nontrivial saddle-point equilibrium. At this equilibrium, the jammer's optimal policy is to randomize in a region of the plant's state space, thus requiring the controller to undertake a nontrivial response which is different from what one would expect in a standard stochastic control problem over a packet dropping channel.

Submitted 22 March, 2014; originally announced March 2014.

Comments: This paper has been accepted for presentation at the 2014 American Control Conference, Portland, Oregon

arXiv:1402.4031 [pdf, other]

Estimation with Strategic Sensors

Authors: Farhad Farokhi, Andre M. H. Teixeira, Cedric Langbort

Abstract: We introduce a model of estimation in the presence of strategic, self-interested sensors. We employ a game-theoretic setup to model the interaction between the sensors and the receiver. The cost function of the receiver is equal to the estimation error variance while the cost function of the sensor contains an extra term which is determined by its private information. We start with the single sensor case in which the receiver has access to a noisy but honest side information in addition to the message transmitted by a strategic sensor. We study both static and dynamic estimation problems. For both these problems, we characterize a family of equilibria in which the sensor and the receiver employ simple strategies. Interestingly, for the dynamic estimation problem, we find an equilibrium for which the strategic sensor uses a memory-less policy. We generalize the static estimation setup to multiple sensors with synchronous communication structure (i.e., all the sensors transmit their messages simultaneously). We prove the maybe surprising fact that, for the constructed equilibrium in affine strategies, the estimation quality degrades as the number of sensors increases. However, if the sensors are herding (i.e., copying each other's policies), the quality of the receiver's estimation improves as the number of sensors increases. Finally, we consider the asynchronous communication structure (i.e., the sensors transmit their messages sequentially).

Submitted 24 June, 2015; v1 submitted 17 February, 2014; originally announced February 2014.

Comments: Results are generalized, illustrative examples are added, and the literature review is improved

arXiv:1401.4786 [pdf, ps, other]

Common Information based Markov Perfect Equilibria for Linear-Gaussian Games with Asymmetric Information

Authors: Abhishek Gupta, Ashutosh Nayyar, Cedric Langbort, Tamer Basar

Abstract: We consider a class of two-player dynamic stochastic nonzero-sum games where the state transition and observation equations are linear, and the primitive random variables are Gaussian. Each controller acquires possibly different dynamic information about the state process and the other controller's past actions and observations. This leads to a dynamic game of asymmetric information among the controllers. Building on our earlier work on finite games with asymmetric information, we devise an algorithm to compute a Nash equilibrium by using the common information among the controllers. We call such equilibria common information based Markov perfect equilibria of the game, which can be viewed as a refinement of Nash equilibrium in games with asymmetric information. If the players' cost functions are quadratic, then we show that under certain conditions a unique common information based Markov perfect equilibrium exists. Furthermore, this equilibrium can be computed by solving a sequence of linear equations. We also show through an example that there could be other Nash equilibria in a game of asymmetric information, not corresponding to common information based Markov perfect equilibria.

Submitted 19 January, 2014; originally announced January 2014.

Comments: Submitted to SIAM Journal of Control and Optimization

arXiv:1401.3217 [pdf, ps, other]

On Endogenous Random Consensus and Averaging Dynamics

Authors: Behrouz Touri, Cedric Langbort

Abstract: Motivated by various random variations of Hegselmann-Krause model for opinion dynamics and gossip algorithm in an endogenously changing environment, we propose a general framework for the study of endogenously varying random averaging dynamics, i.e., an averaging dynamics whose evolution suffers from history dependent sources of randomness. We show that under general assumptions on the averaging dynamics, such dynamics is convergent almost surely. We also determine the limiting behavior of such dynamics and show such dynamics admit infinitely many time-varying Lyapunov functions.

Submitted 14 January, 2014; originally announced January 2014.
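
For readers unfamiliar with the Hegselmann-Krause model that motivates the entry above, the following is a minimal Python sketch of its classical deterministic form, in which each agent moves to the average of all opinions within a confidence radius of its own. It is not the random, endogenously varying framework of the paper; the function names, the radius, and the initial opinions are assumptions chosen only for illustration.

```python
def hk_step(opinions, radius):
    """One synchronous update of the classical Hegselmann-Krause model."""
    updated = []
    for x in opinions:
        # Each agent averages every opinion (its own included) within the radius.
        neighbours = [y for y in opinions if abs(y - x) <= radius]
        updated.append(sum(neighbours) / len(neighbours))
    return updated


def run_hk(opinions, radius, max_steps=100, tol=1e-12):
    """Iterate until no opinion moves by more than tol, or max_steps is reached."""
    current = list(opinions)
    for _ in range(max_steps):
        nxt = hk_step(current, radius)
        if max(abs(a - b) for a, b in zip(current, nxt)) < tol:
            return nxt
        current = nxt
    return current


# Two well-separated groups: each collapses onto its own cluster.
final = run_hk([0.0, 0.1, 0.2, 0.8, 0.9, 1.0], radius=0.25)
```

Because the interaction pattern depends on the current opinions, the averaging weights change with the state, which is the endogenous feature the abstract refers to.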

arXiv:1309.4372 [pdf, other]

doi 10.1109/TCNS.2015.2489318

Faithful Implementations of Distributed Algorithms and Control Laws

Authors: Takashi Tanaka, Farhad Farokhi, Cédric Langbort

Abstract: When a distributed algorithm must be executed by strategic agents with misaligned interests, a social leader needs to introduce an appropriate tax/subsidy mechanism to incentivize agents to faithfully implement the intended algorithm so that a correct outcome is obtained. We discuss the incentive issues of implementing economically efficient distributed algorithms using the framework of indirect mechanism design theory. In particular, we show that indirect Groves mechanisms are not only sufficient but also necessary to achieve incentive compatibility. This result can be viewed as a generalization of the Green-Laffont theorem to indirect mechanisms. Then we introduce the notion of asymptotic incentive compatibility as an appropriate solution concept to faithfully implement distributed and iterative optimization algorithms. We consider two special types of optimization algorithms: dual decomposition algorithms for resource allocation and average consensus algorithms.

Submitted 30 September, 2015; v1 submitted 17 September, 2013; originally announced September 2013.

Comments: This manuscript is the extended version of arXiv:1304.3063, which was presented at the 52nd IEEE Conference on Decision and Control. In addition to the previously covered material, this contains a complete discussion including proofs and new results

arXiv:1304.3063 [pdf, other]

A Faithful Distributed Implementation of Dual Decomposition and Average Consensus Algorithms

Authors: Takashi Tanaka, Farhad Farokhi, Cédric Langbort

Abstract: We consider large-scale cost allocation problems and consensus seeking problems for multiple agents, in which agents are asked to collaborate in a distributed algorithm to find a solution. If agents are strategic and minimize their own individual cost rather than the global social cost, they have an incentive not to follow the intended algorithm, unless the tax/subsidy mechanism is carefully designed. Inspired by the classical Vickrey-Clarke-Groves mechanism and more recent algorithmic mechanism design theory, we propose a tax mechanism that incentivises agents to faithfully implement the intended algorithm. In particular, a new notion of asymptotic incentive compatibility is introduced to characterize a desirable property of such a class of mechanisms. The proposed class of tax mechanisms provides a sequence of mechanisms that gives agents a diminishing incentive to deviate from the suggested algorithm.

Submitted 10 April, 2013; originally announced April 2013.

Comments: 8 pages
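
For orientation, the average consensus iteration named in the abstract above can be sketched as follows. This is only a minimal illustration of the underlying distributed algorithm, with no tax/subsidy mechanism and no strategic behaviour modelled; the function name, the ring topology, the step size, and the initial values are all hypothetical choices, not taken from the paper.

```python
def average_consensus(values, neighbors, step=0.3, iterations=200):
    """Run x_i <- x_i + step * sum_j (x_j - x_i) over each agent's neighbors.

    Assumes an undirected, connected graph and a step size small enough
    for the iteration to converge (an assumption of this sketch).
    """
    x = list(values)
    for _ in range(iterations):
        # Synchronous update: every agent uses the previous round's values.
        x = [xi + step * sum(x[j] - xi for j in neighbors[i])
             for i, xi in enumerate(x)]
    return x


# Hypothetical example: four agents on an undirected ring.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
initial = [1.0, 5.0, 3.0, 7.0]
result = average_consensus(initial, ring)
print(result)
```

Because the graph is undirected, each round preserves the sum of the values, so every agent's value approaches the mean of the initial values (4.0 in this example) when all agents follow the update faithfully.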

arXiv:1209.3549

Nash Equilibria for Stochastic Games with Asymmetric Information-Part 1: Finite Games

Authors: Ashutosh Nayyar, Abhishek Gupta, Cédric Langbort, Tamer Başar

Abstract: A model of stochastic games where multiple controllers jointly control the evolution of the state of a dynamic system but have access to different information about the state and action processes is considered. The asymmetry of information among the controllers makes it difficult to compute or characterize Nash equilibria. Using common information among the controllers, the game with asymmetric information is shown to be equivalent to another game with symmetric information. Further, under certain conditions, a Markov state is identified for the equivalent symmetric information game and its Markov perfect equilibria are characterized. This characterization provides a backward induction algorithm to find Nash equilibria of the original game with asymmetric information in pure or behavioral strategies. Each step of this algorithm involves finding Bayesian Nash equilibria of a one-stage Bayesian game. The class of Nash equilibria of the original game that can be characterized in this backward manner are named common information based Markov perfect equilibria.

Submitted 17 September, 2012; originally announced September 2012.

arXiv:1112.5032

Decentralized Disturbance Accommodation with Limited Plant Model Information

Authors: F. Farokhi, C. Langbort, K. H. Johansson

Abstract: The design of optimal disturbance accommodation and servomechanism controllers with limited plant model information is considered in this paper. Their closed-loop performance is compared using a performance metric called the competitive ratio, which is the worst-case ratio of the cost of a given control design strategy to the cost of the optimal control design with full model information. It was recently shown that when it comes to designing optimal centralized or partially structured decentralized state-feedback controllers with limited model information, the best control design strategy in terms of competitive ratio is a static one. This is true even though the optimal structured decentralized state-feedback controller with full model information is dynamic. In this paper, we show that, in contrast, the best limited model information control design strategy for the disturbance accommodation problem gives a dynamic controller. We find an explicit minimizer of the competitive ratio and we show that it is undominated, that is, there is no other control design strategy that performs better for all possible plants while having the same worst-case ratio. This optimal controller can be separated into a static feedback law and a dynamic disturbance observer. For constant disturbances, it is shown that this structure corresponds to proportional-integral control.

Submitted 21 December, 2011; originally announced December 2011.

Journal ref: SIAM Journal on Control and Optimization, 51(2), pp. 1543-1573, 2013
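
The competitive ratio described verbally in the abstract above can be written compactly as below. The notation is assumed for illustration and is not taken from the paper: $\Gamma$ denotes a control design strategy, $\mathcal{P}$ the set of admissible plants, $J_P(K)$ the closed-loop cost of controller $K$ on plant $P$, and $K^*(P)$ the optimal controller designed with full model information.

```latex
r_{\mathcal{P}}(\Gamma) \;=\; \sup_{P \in \mathcal{P}} \frac{J_P\big(\Gamma(P)\big)}{J_P\big(K^*(P)\big)}
```

Under this reading, $r_{\mathcal{P}}(\Gamma) \ge 1$ whenever $\Gamma(P)$ lies in the class over which $K^*(P)$ is optimal, and a strategy with a smaller ratio loses less, in the worst case, by designing with limited model information.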

arXiv:1112.4294

Optimal Disturbance Accommodation with Limited Model Information

Authors: F. Farokhi, C. Langbort, K. H. Johansson

Abstract: The design of optimal dynamic disturbance accommodation controllers with limited model information is considered. We adapt the family of limited model information control design strategies, defined earlier by the authors, to handle dynamic controllers. This family of limited model information design strategies constructs subcontrollers distributively by accessing only local plant model information. The closed-loop performance of the dynamic controllers that they can produce is studied using a performance metric called the competitive ratio, which is the worst-case ratio of the cost of a control design strategy to the cost of the optimal control design with full model information.

Submitted 13 March, 2012; v1 submitted 19 December, 2011; originally announced December 2011.

Comments: Fixed Typos, Updated Introduction and References. This manuscript is an early version of the results presented in arXiv:1112.5032 prepared for the presentation at the American Control Conference 2012

Journal ref: Proceedings of the American Control Conference, pp. 4757-4764, 2012

arXiv:1112.3839

doi 10.1016/j.automatica.2012.10.004

Optimal Structured Static State-Feedback Control Design with Limited Model Information for Fully-Actuated Systems

Authors: Farhad Farokhi, Cedric Langbort, Karl H. Johansson

Abstract: We introduce the family of limited model information control design methods, which construct controllers by accessing the plant's model in a constrained way, according to a given design graph. We investigate the closed-loop performance achievable by such control design methods for fully-actuated discrete-time linear time-invariant systems, under a separable quadratic cost. We restrict our study to control design methods which produce structured static state feedback controllers, where each subcontroller can at least access the state measurements of those subsystems that affect its corresponding subsystem. We compute the optimal control design strategy (in terms of the competitive ratio and domination metrics) when the control designer has access to the local model information and the global interconnection structure of the plant-to-be-controlled. Lastly, we study the trade-off between the amount of model information exploited by a control design method and the best closed-loop performance (in terms of the competitive ratio) of controllers it can produce.

Submitted 27 April, 2012; v1 submitted 16 December, 2011; originally announced December 2011.

Comments: Extension of this article's results for disturbance accommodation problem can be found in arXiv:1112.5032

Journal ref: Automatica, Volume 49, Issue 2, February 2013, Pages 326-337

arXiv:1010.1549

Task Release Control for Decision Making Queues

Authors: Vaibhav Srivastava, Ruggero Carli, Francesco Bullo, Cédric Langbort

Abstract: We consider the optimal duration allocation in a decision making queue. Decision making tasks arrive at a given rate to a human operator. The correctness of the decision made by the human evolves as a sigmoidal function of the duration allocated to the task. Each task in the queue loses its value continuously. We elucidate this trade-off and determine optimal policies for the human operator. We show that the optimal policy requires the human to drop some tasks. We present a receding horizon optimization strategy, and compare it with the greedy policy.

Submitted 7 October, 2010; originally announced October 2010.

Comments: 8 pages, Submitted to American Controls Conference, San Francisco, CA, June 2011

Showing 1–41 of 41 results for author: Langbort, C