Search | arXiv e-print repository

Generalizing Better Response Paths and Weakly Acyclic Games

Authors: Bora Yongacoglu, Gürdal Arslan, Lacra Pavel, Serdar Yüksel

Abstract: Weakly acyclic games generalize potential games and are fundamental to the study of game theoretic control. In this paper, we present a generalization of weakly acyclic games, and we observe its importance in multi-agent learning when agents employ experimental strategy updates in periods where they fail to best respond. While weak acyclicity is defined in terms of path connectivity properties of a game's better response graph, our generalization is defined using a generalized better response graph. We provide sufficient conditions for this notion of generalized weak acyclicity in both two-player games and $n$-player games. To demonstrate that our generalization is not trivial, we provide examples of games admitting a pure Nash equilibrium that are not generalized weakly acyclic. The generalization presented in this work is closely related to the recent theory of satisficing paths, and the counterexamples presented here constitute the first negative results in that theory.

Submitted 26 March, 2024; originally announced March 2024.

arXiv:2403.18079 [pdf, ps, other]

Paths to Equilibrium in Normal-Form Games

Authors: Bora Yongacoglu, Gürdal Arslan, Lacra Pavel, Serdar Yüksel

Abstract: In multi-agent reinforcement learning (MARL), agents repeatedly interact across time and revise their strategies as new data arrives, producing a sequence of strategy profiles. This paper studies sequences of strategies satisfying a pairwise constraint inspired by policy updating in reinforcement learning, where an agent who is best responding in period $t$ does not switch its strategy in the next period $t+1$. This constraint merely requires that optimizing agents do not switch strategies, but does not constrain the other non-optimizing agents in any way, and thus allows for exploration. Sequences with this property are called satisficing paths, and arise naturally in many MARL algorithms. A fundamental question about strategic dynamics is the following: for a given game and initial strategy profile, is it always possible to construct a satisficing path that terminates at an equilibrium strategy? The resolution of this question has implications about the capabilities or limitations of a class of MARL algorithms. We answer this question in the affirmative for mixed extensions of finite normal-form games.

Submitted 26 March, 2024; originally announced March 2024.

arXiv:2311.12609 [pdf, other]

Reinforcement Learning for Near-Optimal Design of Zero-Delay Codes for Markov Sources

Authors: Liam Cregg, Tamas Linder, Serdar Yuksel

Abstract: In the classical lossy source coding problem, one encodes long blocks of source symbols, which enables the distortion to approach the ultimate Shannon limit. Such a block-coding approach introduces large delays, which is undesirable in many delay-sensitive applications. We consider the zero-delay case, where the goal is to encode and decode a finite-alphabet Markov source without any delay. It has been shown that this problem lends itself to stochastic control techniques, which lead to existence, structural, and general structural approximation results. However, these techniques so far have resulted only in computationally prohibitive algorithmic implementations for code design. To address this problem, we present a reinforcement learning design algorithm and rigorously prove its asymptotic optimality. In particular, we show that a quantized Q-learning algorithm can be used to obtain a near-optimal coding policy for this problem. The proof builds on recent results on quantized Q-learning for weakly Feller controlled Markov chains whose application necessitates the development of supporting technical results on regularity and stability properties, and relating the optimal solutions for discounted and average cost infinite horizon criteria problems. These theoretical results are supported by simulations.

Submitted 17 June, 2024; v1 submitted 21 November, 2023; originally announced November 2023.

Comments: 15 pages, 3 figures; accepted for publication in IEEE Transactions on Information Theory

arXiv:2311.00123 [pdf, other]

Q-Learning for Stochastic Control under General Information Structures and Non-Markovian Environments

Authors: Ali Devran Kara, Serdar Yuksel

Abstract: As a primary contribution, we present a convergence theorem for stochastic iterations, and in particular, Q-learning iterates, under a general, possibly non-Markovian, stochastic environment. Our conditions for convergence involve an ergodicity and a positivity criterion. We provide a precise characterization of the limit of the iterates and conditions on the environment and initializations for convergence. As our second contribution, we discuss the implications and applications of this theorem to a variety of stochastic control problems with non-Markovian environments involving (i) quantized approximations of fully observed Markov Decision Processes (MDPs) with continuous spaces (where quantization breaks down the Markovian structure), (ii) quantized approximations of belief-MDP reduced partially observable MDPs (POMDPs) with weak Feller continuity and a mild version of filter stability (which requires the knowledge of the model by the controller), (iii) finite window approximations of POMDPs under a uniform controlled filter stability (which does not require the knowledge of the model), and (iv) for multi-agent models where convergence of learning dynamics to a new class of equilibria, subjective Q-learning equilibria, will be studied. In addition to the convergence theorem, some implications of the theorem above are new to the literature and others are interpreted as applications of the convergence theorem. Some open problems are noted.

Submitted 4 March, 2024; v1 submitted 31 October, 2023; originally announced November 2023.

Comments: 2 figures

arXiv:2310.06742 [pdf, other]

Reinforcement Learning for Optimal Transmission of Markov Sources: Belief Quantization vs Sliding Finite Window Codes

Authors: Liam Cregg, Fady Alajaji, Serdar Yuksel

Abstract: We study the problem of zero-delay coding for the transmission of a Markov source over a noisy channel with feedback and present a rigorous reinforcement theoretic solution which is guaranteed to achieve near-optimality. To this end, we formulate the problem as a Markov decision process (MDP) where the state is a probability-measure valued predictor/belief and the actions are quantizer maps. This MDP formulation has been used to show the optimality of certain classes of encoder policies in prior work. Despite such an analytical approach in determining optimal policies, their computation is prohibitively complex due to the uncountable nature of the constructed state space and the lack of minorization or strong ergodicity results which are commonly assumed for average cost optimal stochastic control. These challenges invite rigorous reinforcement learning methods, which entail several open questions addressed in our paper. We present two complementary approaches for this problem. In the first approach, we approximate the set of all beliefs by a finite set and use nearest-neighbor quantization to obtain a finite state MDP, whose optimal policies become near-optimal for the original MDP as the quantization becomes arbitrarily fine. In the second approach, a sliding finite window of channel outputs and quantizers together with a prior belief state serve as the state of the MDP. We then approximate this state by marginalizing over all possible beliefs, so that our policies only use the finite window term to encode the source. Under an appropriate notion of predictor stability, we show that such policies are near-optimal for the zero-delay coding problem as the window length increases. We give sufficient conditions for predictor stability to hold. Finally, we propose a reinforcement learning algorithm to compute near-optimal policies and provide a detailed comparison of the coding policies.

Submitted 1 July, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

Comments: Submitted to Journal of Machine Learning Research. 41 pages, 5 figures

arXiv:2309.08197 [pdf, other]

doi 10.1016/j.sigpro.2023.109248

Hyperspectral Image Denoising via Self-Modulating Convolutional Neural Networks

Authors: Orhan Torun, Seniha Esen Yuksel, Erkut Erdem, Nevrez Imamoglu, Aykut Erdem

Abstract: Compared to natural images, hyperspectral images (HSIs) consist of a large number of bands, with each band capturing different spectral information from a certain wavelength, even some beyond the visible spectrum. These characteristics of HSIs make them highly effective for remote sensing applications. That said, the existing hyperspectral imaging devices introduce severe degradation in HSIs. Hence, hyperspectral image denoising has attracted lots of attention by the community lately. While recent deep HSI denoising methods have provided effective solutions, their performance under real-life complex noise remains suboptimal, as they lack adaptability to new data. To overcome these limitations, in our work, we introduce a self-modulating convolutional neural network which we refer to as SM-CNN, which utilizes correlated spectral and spatial information. At the core of the model lies a novel block, which we call spectral self-modulating residual block (SSMRB), that allows the network to transform the features in an adaptive manner based on the adjacent spectral data, enhancing the network's ability to handle complex noise. In particular, the introduction of SSMRB transforms our denoising network into a dynamic network that adapts its predicted features while denoising every input HSI with respect to its spatio-spectral characteristics. Experimental analysis on both synthetic and real data shows that the proposed SM-CNN outperforms other state-of-the-art HSI denoising methods both quantitatively and qualitatively on public benchmark datasets.

Submitted 15 September, 2023; originally announced September 2023.

Journal ref: Signal Processing, Volume 214, January 2024, 109248

arXiv:2308.03239 [pdf, other]

Asynchronous Decentralized Q-Learning: Two Timescale Analysis By Persistence

Authors: Bora Yongacoglu, Gürdal Arslan, Serdar Yüksel

Abstract: Non-stationarity is a fundamental challenge in multi-agent reinforcement learning (MARL), where agents update their behaviour as they learn. Many theoretical advances in MARL avoid the challenge of non-stationarity by coordinating the policy updates of agents in various ways, including synchronizing times at which agents are allowed to revise their policies. Synchronization enables analysis of many MARL algorithms via multi-timescale methods, but such synchrony is infeasible in many decentralized applications. In this paper, we study an asynchronous variant of the decentralized Q-learning algorithm, a recent MARL algorithm for stochastic games. We provide sufficient conditions under which the asynchronous algorithm drives play to equilibrium with high probability. Our solution utilizes constant learning rates in the Q-factor update, which we show to be critical for relaxing the synchrony assumptions of earlier work. Our analysis also applies to asynchronous generalizations of a number of other algorithms from the regret testing tradition, whose performance is analyzed by multi-timescale methods that study Markov chains obtained via policy update dynamics. This work extends the applicability of the decentralized Q-learning algorithm and its relatives to settings in which parameters are selected in an independent manner, and tames non-stationarity without imposing the coordination assumptions of prior work.

Submitted 6 August, 2023; originally announced August 2023.

arXiv:2303.13539 [pdf, ps, other]

doi 10.23919/ACC55779.2023.10155828

Decentralized Multi-Agent Reinforcement Learning for Continuous-Space Stochastic Games

Authors: Awni Altabaa, Bora Yongacoglu, Serdar Yüksel

Abstract: Stochastic games are a popular framework for studying multi-agent reinforcement learning (MARL). Recent advances in MARL have focused primarily on games with finitely many states. In this work, we study multi-agent learning in stochastic games with general state spaces and an information structure in which agents do not observe each other's actions. In this context, we propose a decentralized MARL algorithm and we prove the near-optimality of its policy updates. Furthermore, we study the global policy-updating dynamics for a general class of best-reply based algorithms and derive a closed-form characterization of convergence probabilities over the joint policy space.

Submitted 16 March, 2023; originally announced March 2023.

Journal ref: 2023 American Control Conference (ACC), San Diego, CA, USA, 2023, pp. 72-77

arXiv:2210.07339 [pdf, ps, other]

Nash Equilibria for Exchangeable Team against Team Games, their Mean Field Limit, and Role of Common Randomness

Authors: Sina Sanjari, Naci Saldi, Serdar Yüksel

Abstract: We study stochastic mean-field games among a finite number of teams with a large finite as well as an infinite number of decision makers. For this class of games within static and dynamic settings, we establish the existence of a Nash equilibrium, and show that a Nash equilibrium exhibits exchangeability in the finite decision maker regime and symmetry in the infinite one. To arrive at these existence and structural theorems, we endow the set of randomized policies with a suitable topology under various decentralized information structures, which leads to the desired convexity and compactness of the set of randomized policies. Then, we establish the existence of a randomized Nash equilibrium that is exchangeable (not necessarily symmetric) among decision makers within each team for a general class of exchangeable stochastic games. As the number of decision makers within each team goes to infinity (that is, for the mean-field game among teams), using a de Finetti representation theorem, we show existence of a randomized Nash equilibrium that is symmetric (i.e., identical) among decision makers within each team and also independently randomized. Finally, we establish that a Nash equilibrium for a class of mean-field games among teams (which is symmetric) constitutes an approximate Nash equilibrium for the corresponding pre-limit (exchangeable) game among teams with a large but finite number of decision makers. We thus show that common randomness is not necessary for large team-against-team games, unlike the case with small sized teams.

Submitted 10 November, 2022; v1 submitted 13 October, 2022; originally announced October 2022.

arXiv:2209.09857 [pdf, other]

Fine-grained Classification of Solder Joints with α-skew Jensen-Shannon Divergence

Authors: Furkan Ulger, Seniha Esen Yuksel, Atila Yilmaz, Dincer Gokcen

Abstract: Solder joint inspection (SJI) is a critical process in the production of printed circuit boards (PCB). Detection of solder errors during SJI is quite challenging as the solder joints have very small sizes and can take various shapes. In this study, we first show that solders have low feature diversity, and that the SJI can be carried out as a fine-grained image classification task which focuses on hard-to-distinguish object classes. To improve the fine-grained classification accuracy, penalizing confident model predictions by maximizing entropy was found useful in the literature. In line with this information, we propose using the α-skew Jensen-Shannon divergence (α-JS) for penalizing the confidence in model predictions. We compare the α-JS regularization with both existing entropy-regularization based methods and the methods based on attention mechanism, segmentation techniques, transformer models, and specific loss functions for fine-grained image classification tasks. We show that the proposed approach achieves the highest F1-score and competitive accuracy for different models in the fine-grained solder joint classification task. Finally, we visualize the activation maps and show that with entropy-regularization, more precise class-discriminative regions are localized, which are also more resilient to noise. Code will be made available here upon acceptance.

Submitted 20 September, 2022; originally announced September 2022.

Comments: Submitted to IEEE Transactions on Components, Packaging and Manufacturing Technology

arXiv:2209.05703 [pdf, other]

Independent Learning in Mean-Field Games: Satisficing Paths and Convergence to Subjective Equilibria

Authors: Bora Yongacoglu, Gürdal Arslan, Serdar Yüksel

Abstract: Independent learners are agents that employ single-agent algorithms in multi-agent systems, intentionally ignoring the effect of other strategic agents. This paper studies mean-field games from a decentralized learning perspective, with two primary objectives: (i) to identify structure that can guide algorithm design, and (ii) to understand the emergent behaviour in systems of independent learners. We study a new model of partially observed mean-field games with finitely many players, local action observability, and a general observation channel for partial observations of the global state. Specific observation channels considered include (a) global observability, (b) local and mean-field observability, (c) local and compressed mean-field observability, and (d) only local observability. We establish conditions under which the control problem of a given agent is equivalent to a fully observed MDP, as well as conditions under which the control problem is equivalent only to a POMDP. Building on the connection to MDPs, we prove the existence of perfect equilibrium among memoryless stationary policies under mean-field observability. Leveraging the connection to POMDPs, we prove convergence of learning iterates obtained by independent learning agents under any of the aforementioned observation channels. We interpret the limiting values as subjective value functions, which an agent believes to be relevant to its control problem. These subjective value functions are then used to propose subjective Q-equilibrium, a new solution concept for partially observed n-player mean-field games, whose existence is proved under mean-field or global observability. We provide a decentralized learning algorithm for partially observed n-player mean-field games, and we show that it drives play to subjective Q-equilibrium by adapting the recently developed theory of satisficing paths to allow for subjectivity.

Submitted 23 November, 2023; v1 submitted 12 September, 2022; originally announced September 2022.

arXiv:2202.02841 [pdf, other]

An Asymptotically Optimal Two-Part Fixed-Rate Coding Scheme for Networked Control with Unbounded Noise

Authors: Jonathan Keeler, Tamás Linder, Serdar Yüksel

Abstract: It is known that under fixed-rate information constraints, adaptive quantizers can be used to stabilize an open-loop-unstable linear system on $\mathbb{R}^n$ driven by unbounded noise. These adaptive schemes can be designed so that they have near-optimal rate, and the resulting system will be stable in the sense of having an invariant probability measure, or ergodicity, as well as boundedness of the state second moment. Although structural results and information theoretic bounds of encoders have been studied, the performance of such adaptive fixed-rate quantizers beyond stabilization has not been addressed. In this paper, we propose a two-part adaptive (fixed-rate) coding scheme that achieves state second moment convergence to the classical optimum (i.e., for the fully observed setting) under mild moment conditions on the noise process. The first part, as in prior work, leads to ergodicity (via positive Harris recurrence) and the second part ensures that the state second moment converges to the classical optimum at high rates. These results are established using an intricate analysis which uses random-time state-dependent Lyapunov stochastic drift criteria as a core tool.

Submitted 11 January, 2023; v1 submitted 6 February, 2022; originally announced February 2022.

Comments: 37 pages, 4 figures. Replacement of prior work as this paper is a direct generalization (from the scalar Gaussian case to the multidimensional case with much more general noise)

arXiv:2111.06781 [pdf, ps, other]

Q-Learning for MDPs with General Spaces: Convergence and Near Optimality via Quantization under Weak Continuity

Authors: Ali Devran Kara, Naci Saldi, Serdar Yüksel

Abstract: Reinforcement learning algorithms often require finiteness of state and action spaces in Markov decision processes (MDPs) (also called controlled Markov chains) and various efforts have been made in the literature towards the applicability of such algorithms for continuous state and action spaces. In this paper, we show that under very mild regularity conditions (in particular, involving only weak continuity of the transition kernel of an MDP), Q-learning for standard Borel MDPs via quantization of states and actions (called Quantized Q-Learning) converges to a limit, and furthermore this limit satisfies an optimality equation which leads to near optimality with either explicit performance bounds or which are guaranteed to be asymptotically optimal. Our approach builds on (i) viewing quantization as a measurement kernel and thus a quantized MDP as a partially observed Markov decision process (POMDP), (ii) utilizing near optimality and convergence results of Q-learning for POMDPs, and (iii) finally, near-optimality of finite state model approximations for MDPs with weakly continuous kernels which we show to correspond to the fixed point of the constructed POMDP. Thus, our paper presents a very general convergence and approximation result for the applicability of Q-learning for continuous MDPs.

Submitted 7 September, 2023; v1 submitted 12 November, 2021; originally announced November 2021.

arXiv:2110.04638 [pdf, other]

doi 10.1137/22M1515112

Satisficing Paths and Independent Multi-Agent Reinforcement Learning in Stochastic Games

Authors: Bora Yongacoglu, Gürdal Arslan, Serdar Yüksel

Abstract: In multi-agent reinforcement learning (MARL), independent learners are those that do not observe the actions of other agents in the system. Due to the decentralization of information, it is challenging to design independent learners that drive play to equilibrium. This paper investigates the feasibility of using satisficing dynamics to guide independent learners to approximate equilibrium in stochastic games. For $ε\geq 0$, an $ε$-satisficing policy update rule is any rule that instructs the agent to not change its policy when it is $ε$-best-responding to the policies of the remaining players; $ε$-satisficing paths are defined to be sequences of joint policies obtained when each agent uses some $ε$-satisficing policy update rule to select its next policy. We establish structural results on the existence of $ε$-satisficing paths into $ε$-equilibrium in both symmetric $N$-player games and general stochastic games with two players. We then present an independent learning algorithm for $N$-player symmetric games and give high probability guarantees of convergence to $ε$-equilibrium under self-play. This guarantee is made using symmetry alone, leveraging the previously unexploited structure of $ε$-satisficing paths.

Submitted 19 February, 2023; v1 submitted 9 October, 2021; originally announced October 2021.

Journal ref: SIAM Journal on Mathematics of Data Science, vol 5, no. 3, pp. 745-773, Aug 2023
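
Illustrative note on the abstract above: the $ε$-satisficing condition can be checked mechanically on a toy example. The sketch below is not the paper's algorithm. It assumes a one-shot two-player normal-form game with pure actions (the paper treats policies in stochastic games), and the names `COORD`, `is_eps_best_responding`, and `is_satisficing_path` are hypothetical, introduced only for this illustration.

```python
from typing import Sequence, Tuple

Profile = Tuple[int, int]  # (action of player 0, action of player 1)
# payoffs[i][a0][a1] is the payoff to player i at the joint action (a0, a1).
Payoffs = Sequence[Sequence[Sequence[float]]]


def is_eps_best_responding(payoffs: Payoffs, player: int,
                           profile: Profile, eps: float = 0.0) -> bool:
    """True if `player` is within eps of its best unilateral payoff at `profile`."""
    a0, a1 = profile
    if player == 0:
        # Player 0 varies the row index while player 1's action stays fixed.
        candidates = [payoffs[0][d][a1] for d in range(len(payoffs[0]))]
    else:
        # Player 1 varies the column index while player 0's action stays fixed.
        candidates = [payoffs[1][a0][d] for d in range(len(payoffs[1][a0]))]
    return payoffs[player][a0][a1] >= max(candidates) - eps


def is_satisficing_path(payoffs: Payoffs, path: Sequence[Profile],
                        eps: float = 0.0) -> bool:
    """Check the pairwise constraint: an eps-best-responding player never switches.

    Players that are not eps-best-responding are unconstrained, which is what
    leaves room for exploratory updates.
    """
    for cur, nxt in zip(path, path[1:]):
        for player in (0, 1):
            satisfied = is_eps_best_responding(payoffs, player, cur, eps)
            if satisfied and nxt[player] != cur[player]:
                return False
    return True


# Assumed toy game: a 2x2 coordination game, both players get 1 when they match.
COORD: Payoffs = [
    [[1.0, 0.0], [0.0, 1.0]],  # player 0
    [[1.0, 0.0], [0.0, 1.0]],  # player 1
]
```

In this toy game, `[(0, 1), (1, 1)]` passes the check because neither player is best responding at `(0, 1)`, whereas `[(1, 1), (0, 1)]` fails because player 0 abandons an action that was already a best response.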

arXiv:2108.05240 [pdf, other]

Signaling Games in Multiple Dimensions: Geometric Properties of Equilibrium Solutions

Authors: Ertan Kazıklı, Sinan Gezici, Serdar Yüksel

Abstract: Signaling game problems investigate communication scenarios where encoder(s) and decoder(s) have misaligned objectives due to the fact that they either employ different cost functions or have inconsistent priors. This problem has been studied in the literature for scalar sources under various setups. In this paper, we consider multi-dimensional sources under quadratic criteria in the presence of a bias leading to a mismatch in the criteria, where we show that the generalization from the scalar setup is more than technical. We show that the Nash equilibrium solutions lead to structural richness due to the subtle geometric analysis the problem entails, with consequences in both system design, the presence of linear Nash equilibria, and an information theoretic problem formulation. We first provide a set of geometric conditions that must be satisfied in equilibrium considering any multi-dimensional source. Then, we consider independent and identically distributed sources and characterize necessary and sufficient conditions under which an informative linear Nash equilibrium exists. These conditions involve the bias vector that leads to misaligned costs. Depending on certain conditions related to the bias vector, the existence of linear Nash equilibria requires sources with a Gaussian or a symmetric density. Moreover, in the case of Gaussian sources, our results have a rate-distortion theoretic implication that achievable rates and distortions in the considered game theoretic setup can be obtained from its team theoretic counterpart.

Submitted 7 May, 2023; v1 submitted 11 August, 2021; originally announced August 2021.

Comments: 17 pages and 6 figures

arXiv:2104.11927 [pdf, other]

doi 10.1109/TCPMT.2021.3121265

Anomaly Detection for Solder Joints Using $β$-VAE

Authors: Furkan Ulger, Seniha Esen Yuksel, Atila Yilmaz

Abstract: In the assembly process of printed circuit boards (PCB), most of the errors are caused by solder joints in Surface Mount Devices (SMD). In the literature, traditional feature extraction based methods require designing hand-crafted features and rely on the tiered RGB illumination to detect solder joint errors, whereas the supervised Convolutional Neural Network (CNN) based approaches require a lot of labelled abnormal samples (defective solder joints) to achieve high accuracy. To solve the optical inspection problem in unrestricted environments with no special lighting and without the existence of error-free reference boards, we propose a new beta-Variational Autoencoders (beta-VAE) architecture for anomaly detection that can work on both IC and non-IC components. We show that the proposed model learns disentangled representation of data, leading to more independent features and improved latent space representations. We compare the activation and gradient-based representations that are used to characterize anomalies; and observe the effect of different beta parameters on accuracy and on untwining the feature representations in beta-VAE. Finally, we show that anomalies on solder joints can be detected with high accuracy via a model trained directly on normal samples without designated hardware or feature engineering.

Submitted 16 December, 2021; v1 submitted 24 April, 2021; originally announced April 2021.

Comments: Published in IEEE Transactions on Components, Packaging and Manufacturing Technology

Journal ref: in IEEE Transactions on Components, Packaging and Manufacturing Technology, vol. 11, no. 12, pp. 2214-2221, Dec. 2021

arXiv:2103.12158 [pdf, other]

Convergence of Finite Memory Q-Learning for POMDPs and Near Optimality of Learned Policies under Filter Stability

Authors: Ali Devran Kara, Serdar Yuksel

Abstract: In this paper, for POMDPs, we provide the convergence of a Q learning algorithm for control policies using a finite history of past observations and control actions, and, consequentially, we establish near optimality of such limit Q functions under explicit filter stability conditions. We present explicit error bounds relating the approximation error to the length of the finite history window. We establish the convergence of such Q-learning iterations under mild ergodicity assumptions on the state process during the exploration phase. We further show that the limit fixed point equation gives an optimal solution for an approximate belief-MDP. We then provide bounds on the performance of the policy obtained using the limit Q values compared to the performance of the optimal policy for the POMDP, where we also present explicit conditions using recent results on filter stability in controlled POMDPs. While there exist many experimental results, (i) the rigorous asymptotic convergence (to an approximate MDP value function) for such finite-memory Q-learning algorithms, and (ii) the near optimality with an explicit rate of convergence (in the memory size) are results that are new to the literature, to our knowledge.

Submitted 25 October, 2022; v1 submitted 22 March, 2021; originally announced March 2021.

arXiv:2103.10810 [pdf, ps, other]

Zero-Delay Lossy Coding of Linear Vector Markov Sources: Optimality of Stationary Codes and Near Optimality of Finite Memory Codes

Authors: Meysam Ghomi, Tamas Linder, Serdar Yuksel

Abstract: Optimal zero-delay coding (quantization) of $\mathbb{R}^d$-valued linearly generated Markov sources is studied under quadratic distortion. The structure and existence of deterministic and stationary coding policies that are optimal for the infinite horizon average cost (distortion) problem are established. Prior results studying the optimality of zero-delay codes for Markov sources for infinite horizons either considered finite alphabet sources or, for the $\mathbb{R}^d$-valued case, only showed the existence of deterministic and non-stationary Markov coding policies or those which are randomized. In addition to existence results, for finite blocklength (horizon) $T$ the performance of an optimal coding policy is shown to approach the infinite time horizon optimum at a rate $O(\frac{1}{T})$. This gives an explicit rate of convergence that quantifies the near-optimality of finite window (finite-memory) codes among all optimal zero-delay codes.

Submitted 13 January, 2022; v1 submitted 19 March, 2021; originally announced March 2021.

Comments: 15 pages

ACM Class: E.4

arXiv:2101.00799 [pdf, other]

Quadratic Signaling with Prior Mismatch at an Encoder and Decoder: Equilibria, Continuity and Robustness Properties

Authors: Ertan Kazıklı, Serkan Sarıtaş, Sinan Gezici, Serdar Yüksel

Abstract: We consider communications through a Gaussian noise channel between an encoder and a decoder which have subjective probabilistic models on the source distribution. Although they consider the same cost function, the induced expected costs are misaligned due to their prior mismatch, which requires a game theoretic approach. We consider two approaches: a Nash setup, with no prior commitment, and a Stackelberg solution concept, where the encoder is committed to a given announced policy a priori. We show that the Stackelberg equilibrium cost of the encoder is upper semi continuous, under the Wasserstein metric, as the encoder's prior approaches the decoder's prior, and it is also lower semi continuous with Gaussian priors. For the Stackelberg setup, the optimality of affine policies for Gaussian signaling no longer holds under prior mismatch, and thus team-theoretic optimality of linear/affine policies is not robust to perturbations. We provide conditions under which there exist informative Nash and Stackelberg equilibria with affine policies. Finally, we show existence of fully informative Nash and Stackelberg equilibria for the cheap talk problem under an absolute continuity condition.

Submitted 21 November, 2021; v1 submitted 4 January, 2021; originally announced January 2021.

Comments: 16 pages, 3 figures

arXiv:2012.08265 [pdf, other]

Signaling Games for Log-Concave Distributions: Number of Bins and Properties of Equilibria

Authors: Ertan Kazıklı, Serkan Sarıtaş, Sinan Gezici, Tamás Linder, Serdar Yüksel

Abstract: We investigate the equilibrium behavior for the decentralized cheap talk problem for real random variables and quadratic cost criteria in which an encoder and a decoder have misaligned objective functions. In prior work, it has been shown that the number of bins in any equilibrium has to be countable, generalizing a classical result due to Crawford and Sobel who considered sources with density supported on $[0,1]$. In this paper, we first refine this result in the context of log-concave sources. For sources with two-sided unbounded support, we prove that, for any finite number of bins, there exists a unique equilibrium. In contrast, for sources with semi-unbounded support, there may be a finite upper bound on the number of bins in equilibrium depending on certain conditions stated explicitly. Moreover, we prove that for log-concave sources, the expected costs of the encoder and the decoder in equilibrium decrease as the number of bins increases. Furthermore, for strictly log-concave sources with two-sided unbounded support, we prove convergence to the unique equilibrium under best response dynamics which starts with a given number of bins, making a connection with the classical theory of optimal quantization and convergence results of Lloyd's method. In addition, we consider more general sources which satisfy certain assumptions on the tail(s) of the distribution and we show that there exist equilibria with infinitely many bins for sources with two-sided unbounded support. Further explicit characterizations are provided for sources with exponential, Gaussian, and compactly-supported probability distributions.

Submitted 14 November, 2021; v1 submitted 15 December, 2020; originally announced December 2020.

Comments: 27 pages and 1 figure. arXiv admin note: text overlap with arXiv:1901.06738

arXiv:2010.07452 [pdf, other]

Near Optimality of Finite Memory Feedback Policies in Partially Observed Markov Decision Processes

Authors: Ali Devran Kara, Serdar Yuksel

Abstract: In the theory of Partially Observed Markov Decision Processes (POMDPs), existence of optimal policies has in general been established via converting the original partially observed stochastic control problem to a fully observed one on the belief space, leading to a belief-MDP. However, computing an optimal policy for this fully observed model, and so for the original POMDP, using classical dynamic or linear programming methods is challenging even if the original system has finite state and action spaces, since the state space of the fully observed belief-MDP model is always uncountable. Furthermore, there exist very few rigorous value function approximation and optimal policy approximation results, as regularity conditions needed often require a tedious study involving the spaces of probability measures leading to properties such as Feller continuity. In this paper, we study a planning problem for POMDPs where the system dynamics and measurement channel model are assumed to be known. We construct an approximate belief model by discretizing the belief space using only finite window information variables. We then find optimal policies for the approximate model and we rigorously establish near optimality of the constructed finite window control policies in POMDPs under mild non-linear filter stability conditions and the assumption that the measurement and action sets are finite (and the state space is real vector valued). We also establish a rate of convergence result which relates the finite window memory size and the approximation error bound, where the rate of convergence is exponential under explicit and testable exponential filter stability conditions. While there exist many experimental results and few rigorous asymptotic convergence results, an explicit rate of convergence result is new in the literature, to our knowledge.

Submitted 8 January, 2022; v1 submitted 14 October, 2020; originally announced October 2020.

arXiv:2005.05743 [pdf, other]

Quadratic Privacy-Signaling Games and the MMSE Information Bottleneck Problem for Gaussian Sources

Authors: Ertan Kazıklı, Sinan Gezici, Serdar Yüksel

Abstract: We investigate a privacy-signaling game problem in which a sender with privacy concerns observes a pair of correlated random vectors which are modeled as jointly Gaussian. The sender aims to hide one of these random vectors and convey the other one whereas the objective of the receiver is to accurately estimate both of the random vectors. We analyze these conflicting objectives in a game theoretic framework with quadratic costs where depending on the commitment conditions (of the sender), we consider Nash or Stackelberg (Bayesian persuasion) equilibria. We show that a payoff dominant Nash equilibrium among all admissible policies is attained by a set of explicitly characterized linear policies. We also show that a payoff dominant Nash equilibrium coincides with a Stackelberg equilibrium. We formulate the information bottleneck problem within our Stackelberg framework under the mean squared error distortion criterion where the information bottleneck setup has a further restriction that only one of the random variables is observed at the sender. We show that this MMSE Gaussian Information Bottleneck Problem admits a linear solution which is explicitly characterized in the paper. We provide explicit conditions on when the optimal solutions, or equilibrium solutions in the Nash setup, are informative or noninformative.

Submitted 4 March, 2022; v1 submitted 12 May, 2020; originally announced May 2020.

Comments: 16 pages, 6 figures

arXiv:1901.06738 [pdf, ps, other]

doi 10.1109/ISIT.2019.8849498

On the Number of Bins in Equilibria for Signaling Games

Authors: Serkan Sarıtaş, Philippe Furrer, Sinan Gezici, Tamás Linder, Serdar Yüksel

Abstract: We investigate the equilibrium behavior for the decentralized quadratic cheap talk problem in which an encoder and a decoder, viewed as two decision makers, have misaligned objective functions. In prior work, we have shown that the number of bins under any equilibrium has to be at most countable, generalizing a classical result due to Crawford and Sobel who considered sources with density supported on $[0,1]$. In this paper, we refine this result in the context of exponential and Gaussian sources. For exponential sources, a relation between the upper bound on the number of bins and the misalignment in the objective functions is derived, the equilibrium costs are compared, and it is shown that there also exist equilibria with infinitely many bins under certain parametric assumptions. For Gaussian sources, it is shown that there exist equilibria with infinitely many bins.

Submitted 23 January, 2019; v1 submitted 20 January, 2019; originally announced January 2019.

Comments: 25 pages, single column

arXiv:1901.02825 [pdf, other]

Invariance Properties of Controlled Stochastic Nonlinear Systems under Information Constraints

Authors: Christoph Kawan, Serdar Yüksel

Abstract: Given a stochastic nonlinear system controlled over a possibly noisy communication channel, the paper studies the largest class of channels for which there exist coding and control policies so that the closed-loop system is stochastically stable. The stability criterion considered is asymptotic mean stationarity (AMS). We develop a general method based on ergodic theory and probability to derive fundamental bounds on information transmission requirements leading to stabilization. Through this method we develop a new notion of entropy which is tailored to derive lower bounds for asymptotic mean stationarity for both noise-free and noisy channels. The bounds obtained through probabilistic and ergodic-theoretic analysis are more refined in comparison with the bounds obtained earlier via information-theoretic methods. Moreover, our approach is more versatile in view of the models considered and allows for finer lower bounds when the AMS measure is known to admit further properties such as moment bounds.

Submitted 4 May, 2020; v1 submitted 9 January, 2019; originally announced January 2019.

MSC Class: 93E15; 93C10; 37A35

arXiv:1711.06600 [pdf, ps, other]

On optimal coding of non-linear dynamical systems

Authors: Christoph Kawan, Serdar Yüksel

Abstract: We consider the problem of zero-delay coding of a dynamical system over a discrete noiseless channel under three estimation criteria concerned with the low-distortion regime. For these three criteria, formulated stochastically in terms of a probability distribution for the initial state, we characterize the smallest channel capacities above which the estimation objectives can be achieved. The results establish further connections between topological and metric entropy of dynamical systems and information theory.

Submitted 31 March, 2018; v1 submitted 17 November, 2017; originally announced November 2017.

MSC Class: 93E10; 37A35; 93E99; 94A17; 94A29

arXiv:1704.03816 [pdf, ps, other]

doi 10.1016/j.automatica.2020.108883

Dynamic Signaling Games with Quadratic Criteria under Nash and Stackelberg Equilibria

Authors: Serkan Sarıtaş, Serdar Yüksel, Sinan Gezici

Abstract: This paper considers dynamic (multi-stage) signaling games involving an encoder and a decoder who have subjective models on the cost functions. We consider both Nash (simultaneous-move) and Stackelberg (leader-follower) equilibria of dynamic signaling games under quadratic criteria. For the multi-stage scalar cheap talk, we show that the final stage equilibrium is always quantized and under further conditions the equilibria for all time stages must be quantized. In contrast, the Stackelberg equilibria are always fully revealing. In the multi-stage signaling game where the transmission of a Gauss-Markov source over a memoryless Gaussian channel is considered, affine policies constitute an invariant subspace under best response maps for Nash equilibria; whereas the Stackelberg equilibria always admit linear policies for scalar sources but such policies may be non-linear for multi-dimensional sources. We obtain an explicit recursion for optimal linear encoding policies for multi-dimensional sources, and derive conditions under which Stackelberg equilibria are informative.

Submitted 13 February, 2019; v1 submitted 12 April, 2017; originally announced April 2017.

Comments: 17 pages

arXiv:1612.00564 [pdf, other]

Metric and topological entropy bounds for optimal coding of stochastic dynamical systems

Authors: Christoph Kawan, Serdar Yüksel

Abstract: We consider the problem of optimal zero-delay coding and estimation of a stochastic dynamical system over a noisy communication channel under three estimation criteria concerned with the low-distortion regime. The criteria considered are (i) a strong and (ii) a weak form of almost sure stability of the estimation error as well as (iii) quadratic stability in expectation. For all three objectives, we derive lower bounds on the smallest channel capacity $C_0$ above which the objective can be achieved with an arbitrarily small error. We first obtain bounds through a dynamical systems approach by constructing an infinite-dimensional dynamical system and relating the capacity with the topological and the metric entropy of this dynamical system. We also consider information-theoretic and probability-theoretic approaches to address the different criteria. Finally, we prove that a memoryless noisy channel in general constitutes no obstruction to asymptotic almost sure state estimation with arbitrarily small errors, when there is no noise in the system. The results provide new solution methods for the criteria introduced (e.g., standard information-theoretic bounds cannot be applied for some of the criteria) and establish further connections between dynamical systems, networked control, and information theory, especially in the context of nonlinear stochastic systems.

Submitted 26 June, 2018; v1 submitted 1 December, 2016; originally announced December 2016.

arXiv:1606.09135 [pdf, ps, other]

Optimal Zero Delay Coding of Markov Sources: Stationary and Finite Memory Codes

Authors: Richard G. Wood, Tamás Linder, Serdar Yüksel

Abstract: The optimal zero delay coding of a finite state Markov source is considered. The existence and structure of optimal codes are studied using a stochastic control formulation. Prior results in the literature established the optimality of deterministic Markov (Walrand-Varaiya type) coding policies for the finite time horizon problem, and the optimality of both deterministic nonstationary and randomized stationary policies for the infinite time horizon problem. Our main result here shows that for any irreducible and aperiodic Markov source with a finite alphabet, deterministic and stationary Markov coding policies are optimal for the infinite horizon problem. In addition, the finite blocklength (time horizon) performance of an optimal (stationary and Markov) coding policy is shown to approach the infinite time horizon optimum at a rate $O(1/T)$. The results are extended to systems where zero delay communication takes place across a noisy channel with noiseless feedback.

Submitted 5 April, 2017; v1 submitted 29 June, 2016; originally announced June 2016.

Comments: 27 pages

MSC Class: 68P30

arXiv:1604.00299 [pdf, ps, other]

Stochastic Control Approach to Reputation Games

Authors: Nuh Aygün Dalkıran, Serdar Yüksel

Abstract: Through a stochastic control theoretic approach, we analyze reputation games where a strategic long-lived player acts in a sequential repeated game against a collection of short-lived players. The key assumption in our model is that the information of the short-lived players is nested in that of the long-lived player. This nested information structure is obtained through an appropriate monitoring structure. Under this monitoring structure, we show that, given mild assumptions, the set of Perfect Bayesian Equilibrium payoffs coincides with the set of Markov Perfect Equilibrium payoffs, and hence a dynamic programming formulation can be obtained for the computation of equilibrium strategies of the strategic long-lived player in the discounted setup. We also consider the undiscounted average-payoff setup where we obtain an optimal equilibrium strategy of the strategic long-lived player under further technical conditions. We then use this optimal strategy in the undiscounted setup as a tool to obtain a tight upper payoff bound for the arbitrarily patient long-lived player in the discounted setup. Finally, by using measure concentration techniques, we obtain a refined lower payoff bound on the value of reputation in the discounted setup. We also study the continuity of equilibrium payoffs in the prior beliefs.

Submitted 20 January, 2020; v1 submitted 1 April, 2016; originally announced April 2016.

Comments: To appear in IEEE Transactions on Automatic Control

arXiv:1506.07924 [pdf, ps, other]

Decentralized Q-Learning for Stochastic Teams and Games

Authors: Gürdal Arslan, Serdar Yüksel

Abstract: There are only a few learning algorithms applicable to stochastic dynamic teams and games which generalize Markov decision processes to decentralized stochastic control problems involving possibly self-interested decision makers. Learning in games is generally difficult because of the non-stationary environment in which each decision maker aims to learn its optimal decisions with minimal information in the presence of the other decision makers who are also learning. In stochastic dynamic games, learning is more challenging because, while learning, the decision makers alter the state of the system and hence the future cost. In this paper, we present decentralized Q-learning algorithms for stochastic games, and study their convergence for the weakly acyclic case which includes team problems as an important special case. The algorithm is decentralized in that each decision maker has access to only its local information, the state information, and the local cost realizations; furthermore, it is completely oblivious to the presence of other decision makers. We show that these algorithms converge to equilibrium policies almost surely in large classes of stochastic games.

Submitted 2 May, 2016; v1 submitted 25 June, 2015; originally announced June 2015.

Comments: To appear in IEEE Trans. Automatic Control

arXiv:1506.04013 [pdf, other]

Stationary and Ergodic Properties of Stochastic Non-Linear Systems Controlled over Communication Channels

Authors: Serdar Yüksel

Abstract: This paper is concerned with the following problem: Given a stochastic non-linear system controlled over a noisy channel, what is the largest class of channels for which there exist coding and control policies so that the closed loop system is stochastically stable? Stochastic stability notions considered are stationarity, ergodicity or asymptotic mean stationarity. We do not restrict the state space to be compact, for example systems considered can be driven by unbounded noise. Necessary and sufficient conditions are obtained for a large class of systems and channels. A generalization of Bode's Integral Formula for a large class of non-linear systems and information channels is obtained. The findings generalize existing results for linear systems.

Submitted 23 August, 2016; v1 submitted 12 June, 2015; originally announced June 2015.

Comments: To appear in SIAM Journal on Control and Optimization

arXiv:1503.04360 [pdf, ps, other]

doi 10.1109/TAC.2016.2578843

Quadratic Multi-Dimensional Signaling Games and Affine Equilibria

Authors: Serkan Sarıtaş, Serdar Yüksel, Sinan Gezici

Abstract: This paper studies the decentralized quadratic cheap talk and signaling game problems when an encoder and a decoder, viewed as two decision makers, have misaligned objective functions. The main contributions of this study are the extension of Crawford and Sobel's cheap talk formulation to multi-dimensional sources and to noisy channel setups. We consider both (simultaneous) Nash equilibria and (sequential) Stackelberg equilibria. We show that for arbitrary scalar sources, in the presence of misalignment, the quantized nature of all equilibrium policies holds for Nash equilibria in the sense that all Nash equilibria are equivalent to those achieved by quantized encoder policies. On the other hand, all Stackelberg equilibrium policies are fully informative. For multi-dimensional setups, unlike the scalar case, Nash equilibrium policies may be of non-quantized nature, and even linear. In the noisy setup, a Gaussian source is to be transmitted over an additive Gaussian channel. The goals of the encoder and the decoder are misaligned by a bias term and the encoder's cost also includes a penalty term on signal power. Conditions for the existence of affine Nash equilibria as well as general informative equilibria are presented. For the noisy setup, the only Stackelberg equilibrium is the linear equilibrium when the variables are scalar. Our findings provide further conditions on when affine policies may be optimal in decentralized multi-criteria control problems and lead to conditions for the presence of active information transmission in strategic environments.

Submitted 29 September, 2016; v1 submitted 14 March, 2015; originally announced March 2015.

Comments: 15 pages, 4 figures

arXiv:1411.5767 [pdf, other]

Output Constrained Lossy Source Coding with Limited Common Randomness

Authors: Naci Saldi, Tamás Linder, Serdar Yüksel

Abstract: This paper studies a Shannon-theoretic version of the generalized distribution preserving quantization problem where a stationary and memoryless source is encoded subject to a distortion constraint and the additional requirement that the reproduction also be stationary and memoryless with a given distribution. The encoder and decoder are stochastic and assumed to have access to independent common randomness. Recent work has characterized the minimum achievable coding rate at a given distortion level when unlimited common randomness is available. Here we consider the general case where the available common randomness may be rate limited. Our main result completely characterizes the set of achievable coding and common randomness rate pairs at any distortion level, thereby providing the optimal tradeoff between these two rate quantities. We also consider two variations of this problem where we investigate the effect of relaxing the strict output distribution constraint and the role of 'private randomness' used by the decoder on the rate region. Our results have strong connections with Cuff's recent work on distributed channel synthesis. In particular, our achievability proof combines a coupling argument with the approach developed by Cuff, where instead of explicitly constructing the encoder-decoder pair, a joint distribution is constructed from which a desired encoder-decoder pair is established. We show however that for our problem, the separated solution of first finding an optimal channel and then synthesizing this channel results in a suboptimal rate region.

Submitted 8 July, 2015; v1 submitted 20 November, 2014; originally announced November 2014.

Comments: 15 pages

arXiv:1309.2915 [pdf, ps, other]

doi 10.1109/TIT.2014.2373382

Randomized Quantization and Source Coding with Constrained Output Distribution

Authors: Naci Saldi, Tamás Linder, Serdar Yüksel

Abstract: This paper studies fixed-rate randomized vector quantization under the constraint that the quantizer's output has a given fixed probability distribution. A general representation of randomized quantizers that includes the common models in the literature is introduced via appropriate mixtures of joint probability measures on the product of the source and reproduction alphabets. Using this representation and results from optimal transport theory, the existence of an optimal (minimum distortion) randomized quantizer having a given output distribution is shown under various conditions. For sources with densities and the mean square distortion measure, it is shown that this optimum can be attained by randomizing quantizers having convex codecells. For stationary and memoryless source and output distributions a rate-distortion theorem is proved, providing a single-letter expression for the optimum distortion in the limit of large block-lengths.

Submitted 3 October, 2014; v1 submitted 11 September, 2013; originally announced September 2013.

Comments: To appear in the IEEE Transactions on Information Theory

MSC Class: 94A29

arXiv:1307.7533 [pdf, ps, other]

Stabilization of Linear Systems Over Gaussian Networks

Authors: Ali A. Zaidi, Tobias J. Oechtering, Serdar Yuksel, Mikael Skoglund

Abstract: The problem of remotely stabilizing a noisy linear time invariant plant over a Gaussian relay network is addressed. The network is comprised of a sensor node, a group of relay nodes and a remote controller. The sensor and the relay nodes operate subject to an average transmit power constraint and they can cooperate to communicate the observations of the plant's state to the remote controller. The communication links between all nodes are modeled as Gaussian channels. Necessary as well as sufficient conditions for mean-square stabilization over various network topologies are derived. The sufficient conditions are in general obtained using delay-free linear policies and the necessary conditions are obtained using information theoretic tools. Different settings where linear policies are optimal, asymptotically optimal (in certain parameters of the system) and suboptimal have been identified. For the case with noisy multi-dimensional sources controlled over scalar channels, it is shown that linear time varying policies lead to minimum capacity requirements, meeting the fundamental lower bound. For the case with noiseless sources and parallel channels, non-linear policies which meet the lower bound have been identified.

Submitted 29 July, 2013; originally announced July 2013.

arXiv:1307.0396 [pdf, ps, other]

doi 10.1109/TIT.2014.2346780

On Optimal Zero-Delay Coding of Vector Markov Sources

Authors: Tamás Linder, Serdar Yüksel

Abstract: Optimal zero-delay coding (quantization) of a vector-valued Markov source driven by a noise process is considered. Using a stochastic control problem formulation, the existence and structure of optimal quantization policies are studied. For a finite-horizon problem with bounded per-stage distortion measure, the existence of an optimal zero-delay quantization policy is shown provided that the quantizers allowed are ones with convex codecells. The bounded distortion assumption is relaxed to cover cases that include the linear quadratic Gaussian problem. For the infinite horizon problem and a stationary Markov source the optimality of deterministic Markov coding policies is shown. The existence of optimal stationary Markov quantization policies is also shown provided randomization that is shared by the encoder and the decoder is allowed.

Submitted 8 August, 2014; v1 submitted 1 July, 2013; originally announced July 2013.

Comments: IEEE Transactions on Information Theory, accepted for publication

MSC Class: 93E20; 94A29; 60J05

arXiv:1209.4365 [pdf, other]

Stochastic Stabilization of Partially Observed and Multi-Sensor Systems Driven by Gaussian Noise under Fixed-Rate Information Constraints

Authors: Andrew P. Johnston, Serdar Yüksel

Abstract: We investigate the stabilization of unstable multidimensional partially observed single-sensor and multi-sensor linear systems driven by unbounded noise and controlled over discrete noiseless channels under fixed-rate information constraints. Stability is achieved under fixed-rate communication requirements that are asymptotically tight in the limit of large sampling periods. Through the use of similarity transforms, sampling and random-time drift conditions we obtain a coding and control policy leading to the existence of a unique invariant distribution and finite second moment for the sampled state. We use a vector stabilization scheme in which all modes of the linear system visit a compact set together infinitely often. We prove tight necessary and sufficient conditions for the general multi-sensor case under an assumption related to the Jordan form structure of such systems. In the absence of this assumption, we give sufficient conditions for stabilization.

Submitted 19 September, 2012; originally announced September 2012.

Comments: 31 pages, 2 figures. This paper is to appear in part at the IEEE Conference on Decision and Control, Hawaii, 2012

arXiv:1204.3097 [pdf, ps, other]

Technical Report: Observability of a Linear System under Sparsity Constraints

Authors: Wei Dai, Serdar Yüksel

Abstract: Consider an n-dimensional linear system where it is known that there are at most k<n non-zero components in the initial state. The observability problem, that is the recovery of the initial state, for such a system is considered. We obtain sufficient conditions on the number of the available observations to be able to recover the initial state exactly for such a system. Both deterministic and stochastic setups are considered for system dynamics. In the former setting, the system matrices are known deterministically, whereas in the latter setting, all of the matrices are picked from a randomized class of matrices. The main message is that, one does not need to obtain full n observations to be able to uniquely identify the initial state of the linear system, even when the observations are picked randomly, when the initial condition is known to be sparse.

Submitted 13 April, 2012; originally announced April 2012.

arXiv:1201.5360 [pdf, ps, other]

Characterization of Information Channels for Asymptotic Mean Stationarity and Stochastic Stability of Non-stationary/Unstable Linear Systems

Authors: Serdar Yüksel

Abstract: Stabilization of non-stationary linear systems over noisy communication channels is considered. Stochastically stable sources, and unstable but noise-free or bounded-noise systems have been extensively studied in information theory and control theory literature since 1970s, with a renewed interest in the past decade. There have also been studies on non-causal and causal coding of unstable/non-stationary linear Gaussian sources. In this paper, tight necessary and sufficient conditions for stochastic stabilizability of unstable (non-stationary) possibly multi-dimensional linear systems driven by Gaussian noise over discrete channels (possibly with memory and feedback) are presented. Stochastic stability notions include recurrence, asymptotic mean stationarity and sample path ergodicity, and the existence of finite second moments. Our constructive proof uses random-time state-dependent stochastic drift criteria for stabilization of Markov chains. For asymptotic mean stationarity (and thus sample path ergodicity), it is sufficient that the capacity of a channel is (strictly) greater than the sum of the logarithms of the unstable pole magnitudes for memoryless channels and a class of channels with memory. This condition is also necessary under a mild technical condition. Sufficient conditions for the existence of finite average second moments for such systems driven by unbounded noise are provided.

Submitted 4 May, 2012; v1 submitted 25 January, 2012; originally announced January 2012.

Comments: To appear in IEEE Transactions on Information Theory

MSC Class: 15A15; 15A09; 15A23

arXiv:1201.4109 [pdf, ps, other]

On the Multiple Access Channel with Asymmetric Noisy State Information at the Encoders

Authors: Nevroz Şen, Fady Alajaji, Serdar Yüksel, Giacomo Como

Abstract: We consider the problem of reliable communication over multiple-access channels (MAC) where the channel is driven by an independent and identically distributed state process and the encoders and the decoder are provided with various degrees of asymmetric noisy channel state information (CSI). For the case where the encoders observe causal, asymmetric noisy CSI and the decoder observes complete CSI, we provide inner and outer bounds to the capacity region, which are tight for the sum-rate capacity. We then observe that, under a Markov assumption, similar capacity results also hold in the case where the receiver observes noisy CSI. Furthermore, we provide a single letter characterization for the capacity region when the CSI at the encoders are asymmetric deterministic functions of the CSI at the decoder and the encoders have non-causal noisy CSI (its causal version is recently solved in \cite{como-yuksel}). When the encoders observe asymmetric noisy CSI with asymmetric delays and the decoder observes complete CSI, we provide a single letter characterization for the capacity region. Finally, we consider a cooperative scenario with common and private messages, with asymmetric noisy CSI at the encoders and complete CSI at the decoder. We provide a single letter expression for the capacity region for such channels.
For the cooperative scenario, we also note that as soon as the common message encoder does not have access to CSI, then in any noisy setup, covering the cases where no CSI or noisy CSI at the decoder, it is possible to obtain a single letter characterization for the capacity region. The main component in these results is a generalization of a converse coding approach, recently introduced in [1] for the MAC with asymmetric quantized CSI at the encoders and herein considerably extended and adapted for the noisy CSI setup.

Submitted 19 January, 2012; originally announced January 2012.

Comments: Submitted to the IEEE Transactions on Information Theory

arXiv:1111.2451 [pdf, ps, other]

Unitary Precoding and Basis Dependency of MMSE Performance for Gaussian Erasure Channels

Authors: Ayça Özçelikkale, Serdar Yüksel, Haldun M. Ozaktas

Abstract: We consider the transmission of a Gaussian vector source over a multi-dimensional Gaussian channel where a random or a fixed subset of the channel outputs are erased. Within the setup where the only encoding operation allowed is a linear unitary transformation on the source, we investigate the MMSE performance, both in average, and also in terms of guarantees that hold with high probability as a function of the system parameters. Under the performance criterion of average MMSE, necessary conditions that should be satisfied by the optimal unitary encoders are established and explicit solutions for a class of settings are presented. For random sampling of signals that have a low number of degrees of freedom, we present MMSE bounds that hold with high probability. Our results illustrate how the spread of the eigenvalue distribution and the unitary transformation contribute to these performance guarantees. The performance of the discrete Fourier transform (DFT) is also investigated. As a benchmark, we investigate the equidistant sampling of circularly wide-sense stationary (c.w.s.s.) signals, and present the explicit error expression that quantifies the effects of the sampling rate and the eigenvalue distribution of the covariance matrix of the signal. These findings may be useful in understanding the geometric dependence of signal uncertainty in a stochastic process. In particular, unlike information theoretic measures such as entropy, we highlight the basis dependence of uncertainty in a signal with another perspective.
The unitary encoding space restriction exhibits the most and least favorable signal bases for estimation.

Submitted 13 September, 2014; v1 submitted 10 November, 2011; originally announced November 2011.

Comments: Accepted for publication in IEEE Transactions on Information Theory

arXiv:1103.3054 [pdf, ps, other]

On the Capacity of Memoryless Finite-State Multiple Access Channels with Asymmetric Noisy State Information at the Encoders

Authors: Nevroz Şen, Giacomo Como, Serdar Yüksel, Fady Alajaji

Abstract: We consider the capacity of memoryless finite-state multiple access channel (FS-MAC) with causal asymmetric noisy state information available at both transmitters and complete state information available at the receiver. Single letter inner and outer bounds are provided for the capacity of such channels when the state process is independent and identically distributed. The outer bound is attained by observing that the proposed inner bound is tight for the sum-rate capacity.

Submitted 1 June, 2011; v1 submitted 15 March, 2011; originally announced March 2011.

Comments: To be submitted Allerton Conference 2011

arXiv:1012.1912 [pdf, ps, other]

On the Capacity of Memoryless Finite-State Multiple-Access Channels with Asymmetric State Information at the Encoders

Authors: Giacomo Como, Serdar Yüksel

Abstract: A single-letter characterization is provided for the capacity region of finite-state multiple-access channels, when the channel state process is an independent and identically distributed sequence, the transmitters have access to partial (quantized) state information, and complete channel state information is available at the receiver. The partial channel state information is assumed to be asymmetric at the encoders. As a main contribution, a tight converse coding theorem is presented. The difficulties associated with the case when the channel state has memory are discussed and connections to decentralized stochastic control theory are presented.

Submitted 8 December, 2010; originally announced December 2010.

Comments: 8 pages, 1 figure, accepted for publication, in press

arXiv:1010.4824 [pdf, ps, other]

On Optimal Causal Coding of Partially Observed Markov Sources in Single and Multi-Terminal Settings

Authors: Serdar Yüksel

Abstract: The optimal causal coding of a partially observed Markov process is studied, where the cost to be minimized is a bounded, non-negative, additive, measurable single-letter function of the source and the receiver output. A structural result is obtained extending Witsenhausen's and Walrand-Varaiya's structural results on optimal real-time coders to a partially observed setting. The decentralized (multi-terminal) setup is also considered. For the case where the source is an i.i.d. process, it is shown that the optimal decentralized causal coding of correlated observations problem admits a solution which is memoryless. For Markov sources, a counterexample to a natural separation conjecture is presented.

Submitted 23 August, 2012; v1 submitted 22 October, 2010; originally announced October 2010.

Comments: To appear in IEEE Transactions on Information Theory

MSC Class: 94A15; 93E20

arXiv:1010.4820 [pdf, ps, other]

Random-Time, State-Dependent Stochastic Drift for Markov Chains and Application to Stochastic Stabilization Over Erasure Channels

Authors: Serdar Yüksel, Sean P. Meyn

Abstract: It is known that state-dependent, multi-step Lyapunov bounds lead to greatly simplified verification theorems for stability for large classes of Markov chain models. This is one component of the "fluid model" approach to stability of stochastic networks. In this paper we extend the general theory to randomized multi-step Lyapunov theory to obtain criteria for stability and steady-state performance bounds, such as finite moments. These results are applied to a remote stabilization problem, in which a controller receives measurements from an erasure channel with limited capacity. Based on the general results in the paper it is shown that stability of the closed loop system is assured provided that the channel capacity is greater than the logarithm of the unstable eigenvalue, plus an additional correction term. The existence of a finite second moment in steady-state is established under additional conditions.

Submitted 17 May, 2012; v1 submitted 22 October, 2010; originally announced October 2010.

Comments: To appear in IEEE Transactions on Automatic Control

MSC Class: 93E03; 94A15; 60J05

arXiv:1009.3824 [pdf, ps, other]

Optimization and Convergence of Observation Channels in Stochastic Control

Authors: Serdar Yüksel, Tamás Linder

Abstract: This paper studies the optimization of observation channels (stochastic kernels) in partially observed stochastic control problems. In particular, existence and continuity properties are investigated mostly (but not exclusively) concentrating on the single-stage case. Continuity properties of the optimal cost in channels are explored under total variation, setwise convergence, and weak convergence. Sufficient conditions for compactness of a class of channels under total variation and setwise convergence are presented and applications to quantization are explored.

Submitted 7 February, 2012; v1 submitted 20 September, 2010; originally announced September 2010.

Comments: 24 pages, to appear in the SIAM Journal on Control and Optimization

MSC Class: 15A15; 15A09; 15A23

arXiv:0707.2014 [pdf, ps, other]

On the error exponent of variable-length block-coding schemes over finite-state Markov channels with feedback

Authors: Giacomo Como, Serdar Yuksel, Sekhar Tatikonda

Abstract: The error exponent of Markov channels with feedback is studied in the variable-length block-coding setting. Burnashev's classic result is extended and a single letter characterization for the reliability function of finite-state Markov channels is presented, under the assumption that the channel state is causally observed both at the transmitter and at the receiver side. Tools from stochastic control theory are used in order to treat channels with intersymbol interference. In particular the convex analytical approach to Markov decision processes is adopted to handle problems with stopping time horizons arising from variable-length coding schemes.

Submitted 13 July, 2007; originally announced July 2007.

Showing 1–47 of 47 results for author: Yuksel, S