-
Robust Cooperative Multi-Agent Reinforcement Learning: A Mean-Field Type Game Perspective
Authors:
Muhammad Aneeq uz Zaman,
Mathieu Laurière,
Alec Koppel,
Tamer Başar
Abstract:
In this paper, we study the problem of robust cooperative multi-agent reinforcement learning (RL) where a large number of cooperative agents with distributed information aim to learn policies in the presence of \emph{stochastic} and \emph{non-stochastic} uncertainties whose distributions are respectively known and unknown. Focusing on policy optimization that accounts for both types of uncertainties, we formulate the problem in a worst-case (minimax) framework, which is intractable in general. Thus, we focus on the Linear Quadratic setting to derive benchmark solutions. First, since no standard theory exists for this problem due to the distributed information structure, we utilize the Mean-Field Type Game (MFTG) paradigm to establish guarantees on the solution quality in the sense of achieved Nash equilibrium of the MFTG. This in turn allows us to compare the performance against the corresponding original robust multi-agent control problem. Then, we propose a Receding-horizon Gradient Descent Ascent RL algorithm to find the MFTG Nash equilibrium and we prove a non-asymptotic rate of convergence. Finally, we provide numerical experiments to demonstrate the efficacy of our approach relative to a baseline algorithm.
Submitted 20 June, 2024;
originally announced June 2024.
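The abstract above mentions gradient descent ascent for a linear-quadratic minimax problem. As a rough illustration only, the sketch below runs plain simultaneous gradient descent ascent on a toy scalar saddle function f(x, y) = a*x^2/2 + b*x*y - c*y^2/2. This is not the paper's receding-horizon algorithm; the function, the parameter names (`a`, `b`, `c`, `lr`, `steps`) and their default values are assumptions chosen for illustration.

```python
def gradient_descent_ascent(x0, y0, a=1.0, b=1.0, c=1.0, lr=0.1, steps=500):
    """Simultaneous gradient descent ascent on a toy saddle function.

    f(x, y) = 0.5*a*x**2 + b*x*y - 0.5*c*y**2
    The minimizer controls x and the maximizer controls y.
    With a, c > 0 the unique saddle point is (0, 0).
    All parameter values here are illustrative assumptions.
    """
    x, y = x0, y0
    for _ in range(steps):
        grad_x = a * x + b * y  # partial derivative of f with respect to x
        grad_y = b * x - c * y  # partial derivative of f with respect to y
        # Descend in x, ascend in y, both using the previous iterate.
        x, y = x - lr * grad_x, y + lr * grad_y
    return x, y


if __name__ == "__main__":
    print(gradient_descent_ascent(3.0, -2.0))
```

With the default values the iteration contracts toward the saddle point at the origin; larger step sizes or a weaker curvature (smaller `a`, `c`) can make the same update oscillate or diverge.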
-
A Mean Field Game Model for Timely Computation in Edge Computing Systems
Authors:
Shubham Aggarwal,
Muhammad Aneeq uz Zaman,
Melih Bastopcu,
Sennur Ulukus,
Tamer Başar
Abstract:
We consider the problem of task offloading in multi-access edge computing (MEC) systems consisting of $N$ devices assisted by an edge server (ES), where the devices can split task execution between a local processor and the ES. Since the local task execution and communication with the ES both consume power, each device must judiciously choose between the two. We model the problem as a large-population non-cooperative game among the $N$ devices. Since computation of an equilibrium in this scenario is difficult due to the presence of a large number of devices, we employ the mean-field game framework to reduce the finite-agent game problem to a generic user's multi-objective optimization problem, with a coupled consistency condition. By leveraging the novel age of information (AoI) metric, we invoke techniques from stochastic hybrid systems (SHS) theory and study the tradeoffs between increasing information freshness and reducing power consumption. In numerical simulations, we validate that a higher load at the ES may lead devices to upload their tasks to the ES less often.
Submitted 3 April, 2024;
originally announced April 2024.
-
Diagnosis Of Takotsubo Syndrome By Robust Feature Selection From The Complex Latent Space Of DL-based Segmentation Network
Authors:
Fahim Ahmed Zaman,
Wahidul Alam,
Tarun Kanti Roy,
Amanda Chang,
Kan Liu,
Xiaodong Wu
Abstract:
Researchers have shown significant correlations among segmented objects in various medical imaging modalities and disease-related pathologies. Several studies showed that using handcrafted features for disease prediction neglects the immense possibility of using latent features from deep learning (DL) models, which may reduce the overall accuracy of differential diagnosis. However, directly using classification or segmentation models on medical images to learn latent features forgoes robust feature selection and may lead to overfitting. To fill this gap, we propose a novel feature selection technique using the latent space of a segmentation model that can aid diagnosis. We evaluated our method in differentiating a rare cardiac disease, Takotsubo Syndrome (TTS), from ST elevation myocardial infarction (STEMI) using echocardiogram videos (echo). TTS can mimic clinical features of STEMI in echo and is extremely hard to distinguish. Our approach shows promising results in differential diagnosis of TTS with 82% diagnosis accuracy, beating the previous state-of-the-art (SOTA) approach. Moreover, the robust feature selection technique using the LASSO algorithm shows great potential in reducing redundant features and creates a robust pipeline for short- and long-term disease prognoses in the downstream analysis.
Submitted 18 January, 2024; v1 submitted 19 December, 2023;
originally announced December 2023.
-
Surf-CDM: Score-Based Surface Cold-Diffusion Model For Medical Image Segmentation
Authors:
Fahim Ahmed Zaman,
Mathews Jacob,
Amanda Chang,
Kan Liu,
Milan Sonka,
Xiaodong Wu
Abstract:
Diffusion models have shown impressive performance for image generation, often outperforming other generative models. Since their introduction, researchers have extended the powerful noise-to-image denoising pipeline to discriminative tasks, including image segmentation. In this work we propose a conditional score-based generative modeling framework for medical image segmentation which relies on a parametric surface representation for the segmentation masks. The surface re-parameterization allows the direct application of standard diffusion theory, as opposed to when the mask is represented as a binary mask. Moreover, we adapted an extended variant of the diffusion technique known as "cold-diffusion", where the diffusion model can be constructed with deterministic perturbations instead of Gaussian noise, which facilitates significantly faster convergence in the reverse diffusion. We evaluated our method on the segmentation of the left ventricle from 65 transthoracic echocardiogram videos (2230 echo image frames) and compared its performance to the most popular and widely used image segmentation models. Our proposed model not only outperformed the compared methods in terms of segmentation accuracy, but also showed potential in estimating segmentation uncertainties for further downstream analyses due to its inherent generative nature.
Submitted 19 December, 2023;
originally announced December 2023.
-
Quantitative perfusion maps using a novelty spatiotemporal convolutional neural network
Authors:
Anbo Cao,
Pin-Yu Le,
Zhonghui Qie,
Haseeb Hassan,
Yingwei Guo,
Asim Zaman,
Jiaxi Lu,
Xueqiang Zeng,
Huihui Yang,
Xiaoqiang Miao,
Taiyu Han,
Guangtao Huang,
Yan Kang,
Yu Luo,
Jia Guo
Abstract:
Dynamic susceptibility contrast magnetic resonance imaging (DSC-MRI) is widely used to evaluate acute ischemic stroke to distinguish salvageable tissue and infarct core. For this purpose, traditional methods employ deconvolution techniques, like singular value decomposition, which are known to be vulnerable to noise, potentially distorting the derived perfusion parameters. Deep learning, however, can estimate clinical perfusion parameters accurately compared to traditional clinical approaches. Therefore, this study presents, for the first time, a perfusion parameter estimation network that considers both spatial and temporal information, the Spatiotemporal Network (ST-Net). The proposed network incorporates a purpose-designed physical loss function to further enhance model performance. The results indicate that the network can accurately estimate perfusion parameters, including cerebral blood volume (CBV), cerebral blood flow (CBF), and time to maximum of the residual function (Tmax). The structural similarity index (SSIM) mean values for CBV, CBF, and Tmax parameters were 0.952, 0.943, and 0.863, respectively. The DICE score for the hypo-perfused region reached 0.859, demonstrating high consistency. The proposed model also maintains time efficiency, closely approaching the performance of commercial gold-standard software.
Submitted 8 December, 2023;
originally announced December 2023.
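The abstract above reports a DICE score for the hypo-perfused region. For readers unfamiliar with the metric, here is a minimal sketch of the Dice coefficient between two flattened binary masks. It is not the authors' evaluation code; the function name `dice_score` and the convention of returning 1.0 when both masks are empty are assumptions.

```python
def dice_score(pred, target):
    """Dice coefficient 2*|A & B| / (|A| + |B|) for two flattened binary masks.

    `pred` and `target` are equal-length sequences of 0/1 (or False/True).
    Returning 1.0 for two empty masks is a convention assumed here.
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of elements")
    # Count positions that are foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total foreground count across both masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    print(dice_score([1, 1, 0, 0], [1, 0, 1, 0]))
```

A value of 1 means the two masks coincide and 0 means they share no foreground element.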
-
Trust, but Verify: Robust Image Segmentation using Deep Learning
Authors:
Fahim Ahmed Zaman,
Xiaodong Wu,
Weiyu Xu,
Milan Sonka,
Raghuraman Mudumbai
Abstract:
We describe a method for verifying the output of a deep neural network for medical image segmentation that is robust to several classes of random as well as worst-case perturbations, i.e., adversarial attacks. This method is based on a general approach recently developed by the authors called "Trust, but Verify", wherein an auxiliary verification network produces predictions about certain masked features in the input image using the segmentation as an input. A well-designed auxiliary network will produce high-quality predictions when the input segmentations are accurate, but will produce low-quality predictions when the segmentations are incorrect. Checking the predictions of such a network against the original image allows us to detect bad segmentations. However, to ensure the verification method is truly robust, we need a method for checking the quality of the predictions that does not itself rely on a black-box neural network. Indeed, we show that previous methods for segmentation evaluation that do use deep neural regression networks are vulnerable to false negatives, i.e., they can inaccurately label bad segmentations as good. We describe the design of a verification network that avoids such vulnerability and present results to demonstrate its robustness compared to previous methods.
Submitted 19 December, 2023; v1 submitted 25 October, 2023;
originally announced October 2023.
-
Ophthalmic Biomarker Detection Using Ensembled Vision Transformers -- Winning Solution to IEEE SPS VIP Cup 2023
Authors:
H. A. Z. Sameen Shahgir,
Khondker Salman Sayeed,
Tanjeem Azwad Zaman,
Md. Asif Haider,
Sheikh Saifur Rahman Jony,
M. Sohel Rahman
Abstract:
This report outlines our approach in the IEEE SPS VIP Cup 2023: Ophthalmic Biomarker Detection competition. Our primary objective in this competition was to identify biomarkers from Optical Coherence Tomography (OCT) images obtained from a diverse range of patients. Using robust augmentations and 5-fold cross-validation, we trained two vision transformer-based models: MaxViT and EVA-02, and ensembled them at inference time. We find MaxViT's use of convolution layers followed by strided attention to be better suited for the detection of local features while EVA-02's use of normal attention mechanism and knowledge distillation is better for detecting global features. Ours was the best-performing solution in the competition, achieving a patient-wise F1 score of 0.814 in the first phase and 0.8527 in the second and final phase of VIP Cup 2023, scoring 3.8% higher than the next-best solution.
Submitted 21 October, 2023;
originally announced October 2023.
-
Prosumers Participation in Markets: A Scalar-Parameterized Function Bidding Approach
Authors:
Abdullah Alawad,
Muhammad Aneeq uz Zaman,
Khaled Alshehri,
Tamer Başar
Abstract:
In uniform-price markets, suppliers compete to supply a resource to consumers, resulting in a single market price determined by their competition. For sufficient flexibility, producers and consumers prefer to commit to a function as their strategies, indicating their preferred quantity at any given market price. Producers and consumers may wish to act as both, i.e., prosumers. In this paper, we examine the behavior of profit-maximizing prosumers in a uniform-price market for resource allocation with the objective of maximizing the social welfare. We propose a scalar-parameterized function bidding mechanism for the prosumers, in which we establish the existence and uniqueness of Nash equilibrium. Furthermore, we provide an efficient way to compute the Nash equilibrium through the computation of the market allocation at the Nash equilibrium. Finally, we present a case study to illustrate the welfare loss under different variations of market parameters, such as the market's supply capacity and inelastic demand.
Submitted 14 March, 2024; v1 submitted 27 September, 2023;
originally announced September 2023.
-
Large Population Games on Constrained Unreliable Networks
Authors:
Shubham Aggarwal,
Muhammad Aneeq uz Zaman,
Melih Bastopcu,
Tamer Başar
Abstract:
This paper studies an $N$-agent cost-coupled game where the agents are connected via an unreliable capacity-constrained network. Each agent receives state information over that network, which loses packets with probability $p$. A base station (BS) actively schedules agent communications over the network by minimizing a weighted Age of Information (WAoI) based cost function under a capacity limit $\mathcal{C} < N$ on the number of transmission attempts at each instant. Under a standard information structure, we show that the problem can be decoupled into a scheduling problem for the BS and a game problem for the $N$ agents. Since the scheduling problem is an NP-hard combinatorial problem, we propose an approximately optimal solution which approaches the optimal solution as $N \rightarrow \infty$. In the process, we also provide some insights on the case without channel erasure. Next, to solve the large population game problem, we use the mean-field game framework to compute an approximate decentralized Nash equilibrium. Finally, we validate the theoretical results using a numerical example.
Submitted 16 March, 2023;
originally announced March 2023.
-
Weighted Age of Information based Scheduling for Large Population Games on Networks
Authors:
Shubham Aggarwal,
Muhammad Aneeq uz Zaman,
Melih Bastopcu,
Tamer Başar
Abstract:
In this paper, we consider a discrete-time multi-agent system involving $N$ cost-coupled networked rational agents solving a consensus problem and a central Base Station (BS) scheduling agent communications over a network. Due to a hard bandwidth constraint on the number of transmissions through the network, at most $R_d < N$ agents can concurrently access their state information through the network. Under standard assumptions on the information structure of the agents and the BS, we first show that the control actions of the agents are free of any dual effect, allowing for separation between estimation and control problems at each agent. Next, we propose a weighted age of information (WAoI) metric for the scheduling problem of the BS, where the weights depend on the estimation error of the agents. The BS aims to find the optimum scheduling policy that minimizes the WAoI, subject to the hard bandwidth constraint. Since this problem is NP-hard, we first relax the hard constraint to a soft update rate constraint, and then compute an optimal policy for the relaxed problem by reformulating it into a Markov Decision Process (MDP). This then inspires a sub-optimal policy for the bandwidth constrained problem, which is shown to approach the optimal policy as $N \rightarrow \infty$. Next, we solve the consensus problem using the mean-field game framework wherein we first design decentralized control policies for a limiting case of the $N$-agent system (as $N \rightarrow \infty$). By explicitly constructing the mean-field system, we prove the existence and uniqueness of the mean-field equilibrium. Consequently, we show that the obtained equilibrium policies constitute an $ε$-Nash equilibrium for the finite agent system. Finally, we validate the performance of both the scheduling and the control policies through numerical simulations.
Submitted 26 December, 2022; v1 submitted 26 September, 2022;
originally announced September 2022.
-
Applying wav2vec2 for Speech Recognition on Bengali Common Voices Dataset
Authors:
H. A. Z. Sameen Shahgir,
Khondker Salman Sayeed,
Tanjeem Azwad Zaman
Abstract:
Speech is inherently continuous, where discrete words, phonemes and other units are not clearly segmented, and so speech recognition has been an active research problem for decades. In this work we have fine-tuned wav2vec 2.0 to recognize and transcribe Bengali speech, training it on the Bengali Common Voice Speech Dataset. After training for 71 epochs on a training set consisting of 36,919 mp3 files, we achieved a training loss of 0.3172 and WER of 0.2524 on a validation set of size 7,747. Using a 5-gram language model, the Levenshtein Distance was 2.6446 on a test set of size 7,747. Then the training set and validation set were combined, shuffled and split in an 85-15 ratio. Training for 7 more epochs on this combined dataset yielded an improved Levenshtein Distance of 2.60753 on the test set. Our model was the best performing one, achieving a Levenshtein Distance of 6.234 on a hidden dataset, which was 1.1049 units lower than other competing submissions.
Submitted 11 September, 2022;
originally announced September 2022.
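The abstract above reports results as word error rate (WER) and Levenshtein distance. As a hedged illustration of how these two metrics relate, the sketch below computes the edit distance with the standard dynamic-programming recurrence and derives a word-level WER from it. It is not the authors' or the competition's scoring code; the function names and the whitespace tokenization are assumptions.

```python
def levenshtein(a, b):
    """Edit distance between two sequences (insert, delete, substitute: cost 1 each)."""
    # prev[j] holds the distance between the first i-1 items of a and the first j of b.
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # delete x from a
                curr[j - 1] + 1,         # insert y into a
                prev[j - 1] + (x != y),  # substitute (free if the items match)
            ))
        prev = curr
    return prev[-1]


def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words.

    Splitting on whitespace is an assumption made for this sketch.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return levenshtein(ref, hyp) / len(ref)


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
    print(word_error_rate("a b c d", "a x c"))
```

The same `levenshtein` function works on characters (strings) or on words (lists), which is why a character-level distance and a word-level WER can both be reported for one model.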
-
Observer-Based Consensus of Nonlinear Positive Multi-Agent Systems with Saturated Control Input
Authors:
Amirreza Zaman,
Wolfgang Birk,
Khalid Tourkey Atta
Abstract:
This paper presents a distributed pinning consensus solution for nonlinear positive multi-agent systems with nonlinear control input, obtained by applying observer-based control protocols. The network topology is taken to be directed and fully connected. Considering sector input nonlinearities and various forms of topologies, two kinds of state observers, a standard observer and a distributed pinning observer, are presented for each nonlinear agent through a novel analysis that deals directly with the nonlinear input and the nonlinear system dynamics. The first observer is built from the agent's measured local output, and the second from the corresponding outputs of its adjacent agents. Based on the observed states, a distributed pinning observer-based strategy is derived for the leader-follower non-negative global consensus of the nonlinear positive multi-agent system. Additionally, two multi-step algorithms are proposed to set up the observer gains and each protocol criterion. Performance evaluations are provided to confirm the proposed control method and illustrate the effectiveness of the derived non-negative consensus observer-based protocols.
Submitted 12 May, 2022;
originally announced May 2022.
-
Linear Quadratic Mean-Field Games with Communication Constraints
Authors:
Shubham Aggarwal,
Muhammad Aneeq uz Zaman,
Tamer Başar
Abstract:
In this paper, we study a large population game with heterogeneous dynamics and cost functions solving a consensus problem. Moreover, the agents have communication constraints which appear as: (1) an Additive White Gaussian Noise (AWGN) channel, and (2) asynchronous data transmission via a fixed scheduling policy. Since the complexity of solving the game increases with the number of agents, we use the Mean-Field Game (MFG) paradigm to solve it. Under standard assumptions on the information structure of the agents, we prove that the control of the agent in the MFG setting is free of the dual effect. This allows us to obtain an equilibrium control policy for the generic agent, which is a function of only the local observation of the agent. Furthermore, the equilibrium mean-field trajectory is shown to follow linear dynamics, hence making it computable. We show that in the finite population game, the equilibrium control policy prescribed by the MFG analysis constitutes an $ε$-Nash equilibrium, where $ε$ tends to zero as the number of agents goes to infinity. The paper concludes with simulations demonstrating the performance of the equilibrium control policy.
Submitted 25 August, 2022; v1 submitted 10 March, 2022;
originally announced March 2022.
-
Adversarial Linear-Quadratic Mean-Field Games over Multigraphs
Authors:
Muhammad Aneeq uz Zaman,
Sujay Bhatt,
Tamer Başar
Abstract:
In this paper, we propose a game between an exogenous adversary and a network of agents connected via a multigraph. The multigraph is composed of (1) a global graph structure, capturing the virtual interactions among the agents, and (2) a local graph structure, capturing physical/local interactions among the agents. The aim of each agent is to achieve consensus with the other agents in a decentralized manner by minimizing a local cost associated with its local graph and a global cost associated with the global graph. The exogenous adversary, on the other hand, aims to maximize the average cost incurred by all agents in the multigraph. We derive Nash equilibrium policies for the agents and the adversary in the Mean-Field Game setting, when the agent population in the global graph is arbitrarily large and the "homogeneous mixing" hypothesis holds on local graphs. This equilibrium is shown to be unique and the equilibrium Markov policies for each agent depend on the local state of the agent, as well as the influences on the agent by the local and global mean fields.
Submitted 3 October, 2021; v1 submitted 29 September, 2021;
originally announced September 2021.
-
Reinforcement Learning in Non-Stationary Discrete-Time Linear-Quadratic Mean-Field Games
Authors:
Muhammad Aneeq uz Zaman,
Kaiqing Zhang,
Erik Miehling,
Tamer Başar
Abstract:
In this paper, we study large population multi-agent reinforcement learning (RL) in the context of discrete-time linear-quadratic mean-field games (LQ-MFGs). Our setting differs from most existing work on RL for MFGs, in that we consider a non-stationary MFG over an infinite horizon. We propose an actor-critic algorithm to iteratively compute the mean-field equilibrium (MFE) of the LQ-MFG. There are two primary challenges: i) the non-stationarity of the MFG induces a linear-quadratic tracking problem, which requires solving a backwards-in-time (non-causal) equation that cannot be solved by standard (causal) RL algorithms; ii) many RL algorithms assume that the states are sampled from the stationary distribution of a Markov chain (MC), that is, the chain is already mixed, an assumption that is not satisfied for real data sources. We first identify that the mean-field trajectory follows linear dynamics, allowing the problem to be reformulated as a linear quadratic Gaussian problem. Under this reformulation, we propose an actor-critic algorithm that allows samples to be drawn from an unmixed MC. Finite-sample convergence guarantees for the algorithm are then provided. To characterize the performance of our algorithm in multi-agent RL, we have developed an error bound with respect to the Nash equilibrium of the finite-population game.
Submitted 1 October, 2020; v1 submitted 9 September, 2020;
originally announced September 2020.
-
Multi-agent Planning for thermalling gliders using multi level graph-search
Authors:
Muhammad Aneeq uz Zaman,
Aamer Iqbal Bhatti
Abstract:
This paper solves a path planning problem for a group of gliders. The gliders are tasked with visiting a set of interest points. The gliders have limited range but are able to increase their range by visiting special points called thermals. The problem addressed in this paper is that of path planning for the gliders such that the total number of interest points visited by the gliders is maximized. This is referred to as the multi-agent problem. The problem is solved by first decomposing it into several single-agent problems. In a single-agent problem, a set of interest points is allocated to a single glider. This problem is solved by planning a path which maximizes the number of visited interest points from the allocated set. This is achieved through a uniform cost graph search, as shown in our earlier work. The multi-agent problem now consists of determining the best allocation (of interest points) for each glider. Two ways of solving this problem are presented: a brute force search approach, as shown in earlier work, and a Branch&Bound type graph search. The Branch&Bound approach is the main contribution of the paper. This approach is proven to be optimal and shown to be faster than the brute force search using simulations.
Submitted 2 July, 2020;
originally announced July 2020.
-
Approximate Equilibrium Computation for Discrete-Time Linear-Quadratic Mean-Field Games
Authors:
Muhammad Aneeq uz Zaman,
Kaiqing Zhang,
Erik Miehling,
Tamer Başar
Abstract:
While the topic of mean-field games (MFGs) has a relatively long history, heretofore there has been limited work concerning algorithms for the computation of equilibrium control policies. In this paper, we develop a computable policy iteration algorithm for approximating the mean-field equilibrium in linear-quadratic MFGs with discounted cost. Given the mean-field, each agent faces a linear-quadratic tracking problem, the solution of which involves a dynamical system evolving in retrograde time. This makes the development of forward-in-time algorithm updates challenging. By identifying a structural property of the mean-field update operator, namely that it preserves sequences of a particular form, we develop a forward-in-time equilibrium computation algorithm. Bounds that quantify the accuracy of the computed mean-field equilibrium as a function of the algorithm's stopping condition are provided. The optimality of the computed equilibrium is validated numerically. In contrast to the most recent/concurrent results, our algorithm appears to be the first to study infinite-horizon MFGs with non-stationary mean-field equilibria, though with focus on the linear quadratic setting.
Submitted 6 April, 2020; v1 submitted 29 March, 2020;
originally announced March 2020.
-
AMP: Authentication of Media via Provenance
Authors:
Paul England,
Henrique S. Malvar,
Eric Horvitz,
Jack W. Stokes,
Cédric Fournet,
Rebecca Burke-Aguero,
Amaury Chamayou,
Sylvan Clebsch,
Manuel Costa,
John Deutscher,
Shabnam Erfani,
Matt Gaylor,
Andrew Jenks,
Kevin Kane,
Elissa Redmiles,
Alex Shamis,
Isha Sharma,
Sam Wenker,
Anika Zaman
Abstract:
Advances in graphics and machine learning have led to the general availability of easy-to-use tools for modifying and synthesizing media. The proliferation of these tools threatens to cast doubt on the veracity of all media. One approach to thwarting the flow of fake media is to detect modified or synthesized media through machine learning methods. While detection may help in the short term, we believe that it is destined to fail as the quality of fake media generation continues to improve. Soon, neither humans nor algorithms will be able to reliably distinguish fake versus real content. Thus, pipelines for assuring the source and integrity of media will be required, and increasingly relied upon. We propose AMP, a system that ensures the authentication of media via certifying provenance. AMP creates one or more publisher-signed manifests for a media instance uploaded by a content provider. These manifests are stored in a database allowing fast lookup from applications such as browsers. For reference, the manifests are also registered and signed by a permissioned ledger, implemented using the Confidential Consortium Framework (CCF). CCF employs both software and hardware techniques to ensure the integrity and transparency of all registered manifests. AMP, through its use of CCF, enables a consortium of media providers to govern the service while making all its operations auditable. The authenticity of the media can be communicated to the user via visual elements in the browser, indicating that an AMP manifest has been successfully located and verified.
Submitted 20 June, 2020; v1 submitted 22 January, 2020;
originally announced January 2020.