-
It's a Feature, Not a Bug: Measuring Creative Fluidity in Image Generators
Authors:
Aditi Ramaswamy,
Melane Navaratnarajah,
Hana Chockler
Abstract:
With the rise of freely available image generators, AI-generated art has become the centre of a series of heated debates, one of which concerns the concept of human creativity. Can an image generation AI exhibit "creativity" of the same type that artists do, and if so, how does that manifest? Our paper attempts to define and empirically measure one facet of creative behavior in AI, by conducting an experiment to quantify the "fluidity of prompt interpretation", or just "fluidity", in a series of selected popular image generators. To study fluidity, we (1) introduce a clear definition for it, (2) create chains of auto-generated prompts and images seeded with an initial "ground-truth" image, (3) measure these chains' breakage points using preexisting visual and semantic metrics, and (4) use both statistical tests and visual explanations to study these chains and determine whether the image generators used to produce them exhibit significant fluidity.
Submitted 3 June, 2024;
originally announced June 2024.
-
You Only Explain Once
Authors:
David A. Kelly,
Hana Chockler,
Daniel Kroening,
Nathan Blake,
Aditi Ramaswamy,
Melane Navaratnarajah,
Aaditya Shivakumar
Abstract:
In this paper, we propose a new black-box explainability algorithm and tool, YO-ReX, for efficient explanation of the outputs of object detectors. The new algorithm computes explanations for all objects detected in the image simultaneously. Hence, compared to the baseline, the new algorithm reduces the number of queries by a factor of 10X for the case of ten detected objects. The speedup increases further with the number of objects. Our experimental results demonstrate that YO-ReX can explain the outputs of YOLO with a negligible overhead over the running time of YOLO. We also demonstrate similar results for explaining SSD and Faster R-CNN. The speedup is achieved by avoiding backtracking by combining aggressive pruning with a causal analysis.
Submitted 23 November, 2023;
originally announced November 2023.
-
Control with EIT: High energy charged particle detection
Authors:
Aneesh Ramaswamy,
Svetlana A. Malinovskaya
Abstract:
The strong non-linear optical response of atomic systems in electromagnetically induced transparency (EIT) states is considered as a means to detect the presence of small perturbations to steady states. For the 3-level system, expressions for the group velocity and group velocity dispersion (GVD) were derived and a quantum control protocol was established to account for the change in the chirp spectrum of a probe pulse when the steady state was perturbed. This was applied to the propagation of slow Cherenkov polaritons in the medium due to the passage of a train of high-energy charged particles (high energy particles). The choice of the initial steady state with focus on the slow light condition and strong narrowly confined dispersion, equated to the continuous trapping of Cherenkov polaritons in the medium along a narrow group cone, allowing for non-trivial fields to accumulate. Considering another medium prepared for the detection of the radiation, sweeping of the control field and detuning parameters in the field-atom parameter space showed the presence of optimal regions to maximize the first order perturbation in the coherences creating changes in the optical responses that modify the chirp spectra of probe pulses.
Submitted 1 September, 2023;
originally announced September 2023.
-
Mirrorless lasing: a theoretical perspective
Authors:
Aneesh Ramaswamy,
Jabir Chathanathil,
Dimitra Kanta,
Emmanuel Klinger,
Aram Papoyan,
Svetlana Shmavonyan,
Aleksandr Khanbekyan,
Arne Wickenbrock,
Dmitry Budker,
Svetlana A. Malinovskaya
Abstract:
Mirrorless lasing has been a topic of particular interest for about a decade due to promising new horizons for quantum science and applications. In this work, we review first-principles theory that describes this phenomenon, and discuss degenerate mirrorless lasing in a vapor of Rb atoms, the mechanisms of amplification of light generated in the medium with population inversion between magnetic sublevels within the $D_2$ line, and challenges associated with experimental realization.
Submitted 15 August, 2023;
originally announced August 2023.
-
Chirped Fractional Stimulated Raman Adiabatic Passage
Authors:
Jabir Chathanathil,
Aneesh Ramaswamy,
Vladimir S. Malinovsky,
Dmitry Budker,
Svetlana A. Malinovskaya
Abstract:
Stimulated Raman Adiabatic Passage (STIRAP) is a widely used method for adiabatic population transfer in a multilevel system. In this work, we study STIRAP under novel conditions and focus on the fractional, F-STIRAP, which is known to create a superposition state with the maximum coherence. In both configurations, STIRAP and F-STIRAP, we implement pulse chirping aiming at a higher contrast, a broader range of parameters for adiabaticity, and enhanced spectral selectivity. Such goals target improvement of quantum imaging, sensing and metrology, and broaden the range of applications of quantum control techniques and protocols. In conventional STIRAP and F-STIRAP, two-photon resonance is required conceptually to satisfy the adiabaticity condition for dynamics within the dark state. Here, we account for a non-zero two-photon detuning and present control schemes to achieve the adiabatic conditions in STIRAP and F-STIRAP through a skillful compensation of the two-photon detuning by pulse chirping. We show that the chirped configuration - C-STIRAP - permits adiabatic passage to a predetermined state among two nearly degenerate final states, when conventional STIRAP fails to resolve them. We demonstrate such a selectivity within a broad range of parameters of the two-photon detuning and the chirp rate. In the C-F-STIRAP, chirping of the pump and the Stokes pulses with different time delays permits a complete compensation of the two-photon detuning and results in a selective maximum coherence of the initial and the target state with higher spectral resolution than in the conventional F-STIRAP.
Submitted 29 May, 2023;
originally announced May 2023.
-
A Framework for Provably Stable and Consistent Training of Deep Feedforward Networks
Authors:
Arunselvan Ramaswamy,
Shalabh Bhatnagar,
Naman Saxena
Abstract:
We present a novel algorithm for training deep neural networks in supervised (classification and regression) and unsupervised (reinforcement learning) scenarios. This algorithm combines the standard stochastic gradient descent and the gradient clipping method. The output layer is updated using clipped gradients, the rest of the neural network is updated using standard gradients. Updating the output layer using clipped gradients stabilizes it. We show that the remaining layers are automatically stabilized provided the neural network is only composed of squashing (compact range) activations. We also present a novel squashing activation function - it is obtained by modifying a Gaussian Error Linear Unit (GELU) to have compact range - we call it Truncated GELU (tGELU). Unlike other squashing activations, such as sigmoid, the range of tGELU can be explicitly specified. As a consequence, the problem of vanishing gradients that arises due to a small range, e.g., in the case of a sigmoid activation, is eliminated. We prove that a NN composed of squashing activations (tGELU, sigmoid, etc.), when updated using the algorithm presented herein, is numerically stable and has consistent performance (low variance). The theory is supported by extensive experiments. Within reinforcement learning, as a consequence of our study, we show that target networks in Deep Q-Learning can be omitted, greatly speeding up learning and alleviating memory requirements. Cross-entropy based classification algorithms that suffer from high variance issues are more consistent when trained using our framework. One symptom of numerical instability in training is the high variance of the neural network update values. We show, in theory and through experiments, that our algorithm updates have low variance, and the training loss reduces in a smooth manner.
Submitted 20 May, 2023;
originally announced May 2023.
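The update rule described in the abstract above (clipped gradients for the output layer, standard gradients for all other layers) can be sketched roughly as follows. This is only an illustration under stated assumptions: the norm-based clipping rule, the threshold `max_norm`, the learning rate, the layer names, and the function names are all hypothetical and are not taken from the paper.

```python
import math

def clip_by_norm(grad, max_norm):
    """Rescale grad so that its Euclidean norm is at most max_norm (assumed clipping rule)."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm <= max_norm or norm == 0.0:
        return list(grad)
    scale = max_norm / norm
    return [g * scale for g in grad]

def sgd_step(params, grads, lr=0.1, max_norm=1.0, output_layer="out"):
    """One SGD step. params and grads map a layer name to a list of floats.

    Only the layer named by output_layer has its gradient clipped;
    every other layer uses its raw gradient.
    """
    new_params = {}
    for name, weights in params.items():
        g = grads[name]
        if name == output_layer:
            g = clip_by_norm(g, max_norm)
        new_params[name] = [w - lr * gi for w, gi in zip(weights, g)]
    return new_params

# Hypothetical two-layer example: both layers receive the same raw gradient.
params = {"hidden": [0.5, -0.5], "out": [1.0, 1.0]}
grads = {"hidden": [3.0, 4.0], "out": [3.0, 4.0]}
updated = sgd_step(params, grads)
```

In this toy example the hidden layer moves by the full gradient, while the output layer's gradient (norm 5) is rescaled to norm 1 before the step.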
-
Stability and Convergence of Distributed Stochastic Approximations with large Unbounded Stochastic Information Delays
Authors:
Adrian Redder,
Arunselvan Ramaswamy,
Holger Karl
Abstract:
We generalize the Borkar-Meyn stability Theorem (BMT) to distributed stochastic approximations (SAs) with information delays that possess an arbitrary moment bound. To model the delays, we introduce Age of Information Processes (AoIPs): stochastic processes on the non-negative integers with a unit growth property. We show that AoIPs with an arbitrary moment bound cannot exceed any fraction of time infinitely often. In combination with a suitably chosen stepsize, this property turns out to be sufficient for the stability of distributed SAs. Compared to the BMT, our analysis requires crucial modifications and a new line of argument to handle the SA errors caused by AoI. In our analysis, we show that these SA errors satisfy a recursive inequality. To evaluate this recursion, we propose a new Gronwall-type inequality for time-varying lower limits of summations. As applications to our distributed BMT, we discuss distributed gradient-based optimization and a new approach to analyzing SAs with momentum.
Submitted 11 May, 2023;
originally announced May 2023.
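To make the "unit growth property" of an Age of Information process from the abstract above concrete, here is a small simulation sketch. It assumes a deliberately simple model that is not from the paper: each step, fresh data arrives independently with probability `p_success` and resets the age to zero; otherwise the age grows by one. The function names and parameters are hypothetical.

```python
import random

def simulate_aoi(steps, p_success, seed=0):
    """Simulate a toy age process on the non-negative integers.

    Assumed model: Bernoulli(p_success) arrivals reset the age to 0;
    without an arrival the age increases by exactly 1.
    """
    rng = random.Random(seed)
    age = 0
    path = [age]
    for _ in range(steps):
        if rng.random() < p_success:
            age = 0
        else:
            age += 1
        path.append(age)
    return path

def has_unit_growth(path):
    """Check that the age never goes negative and never grows by more than 1 per step."""
    return all(b >= 0 and b <= a + 1 for a, b in zip(path, path[1:]))

path = simulate_aoi(steps=1000, p_success=0.3)
unit_growth_ok = has_unit_growth(path)
```

The processes studied in the paper are more general than this independent-arrival model; the sketch only shows what the growth constraint means for a sample path.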
-
Distributed gradient-based optimization in the presence of dependent aperiodic communication
Authors:
Adrian Redder,
Arunselvan Ramaswamy,
Holger Karl
Abstract:
Iterative distributed optimization algorithms involve multiple agents that communicate with each other, over time, in order to minimize/maximize a global objective. In the presence of unreliable communication networks, the Age-of-Information (AoI), which measures the freshness of data received, may be large and hence hinder algorithmic convergence. In this paper, we study the convergence of general distributed gradient-based optimization algorithms in the presence of communication that neither happens periodically nor at stochastically independent points in time. We show that convergence is guaranteed provided the random variables associated with the AoI processes are stochastically dominated by a random variable with finite first moment. This improves on previous requirements of boundedness of more than the first moment. We then introduce stochastically strongly connected (SSC) networks, a new stochastic form of strong connectedness for time-varying networks. We show: If for any $p \ge 0$ the processes that describe the success of communication between agents in an SSC network are $α$-mixing with $n^{p-1}α(n)$ summable, then the associated AoI processes are stochastically dominated by a random variable with finite $p$-th moment. In combination with our first contribution, this implies that distributed stochastic gradient descent converges in the presence of AoI, if $α(n)$ is summable.
Submitted 27 January, 2022;
originally announced January 2022.
-
3DPG: Distributed Deep Deterministic Policy Gradient Algorithms for Networked Multi-Agent Systems
Authors:
Adrian Redder,
Arunselvan Ramaswamy,
Holger Karl
Abstract:
We present Distributed Deep Deterministic Policy Gradient (3DPG), a multi-agent actor-critic (MAAC) algorithm for Markov games. Unlike previous MAAC algorithms, 3DPG is fully distributed during both training and deployment. 3DPG agents calculate local policy gradients based on the most recently available local data (states, actions) and local policies of other agents. During training, this information is exchanged using a potentially lossy and delaying communication network. The network therefore induces Age of Information (AoI) for data and policies. We prove the asymptotic convergence of 3DPG even in the presence of potentially unbounded Age of Information (AoI). This provides an important step towards practical online and distributed multi-agent learning since 3DPG does not assume information to be available deterministically. We analyze 3DPG in the presence of policy and data transfer under mild practical assumptions. Our analysis shows that 3DPG agents converge to a local Nash equilibrium of Markov games in terms of utility functions expressed as the expected value of the agents' local approximate action-value functions (Q-functions). The expectations of the local Q-functions are with respect to limiting distributions over the global state-action space shaped by the agents' accumulated local experiences. Our results also shed light on the policies obtained by general MAAC algorithms. We show through a heuristic argument and numerical experiments that 3DPG improves convergence over previous MAAC algorithms that use old actions instead of old policies during training. Further, we show that 3DPG is robust to AoI; it learns competitive policies even with large AoI and low data availability.
Submitted 2 November, 2022; v1 submitted 3 January, 2022;
originally announced January 2022.
-
Practical sufficient conditions for convergence of distributed optimisation algorithms over communication networks with interference
Authors:
Adrian Redder,
Arunselvan Ramaswamy,
Holger Karl
Abstract:
Information exchange over networks can be affected by various forms of delay. This poses challenges for a multi-agent system that uses the network to solve a distributed optimisation problem. Distributed optimisation schemes, however, typically do not assume network models that are representative of real-world communication networks, since communication links are most of the time abstracted as lossless. Our objective is therefore to formulate a representative network model and provide practically verifiable network conditions that ensure convergence of distributed algorithms in the presence of interference and possibly unbounded delay. Our network is modelled by a sequence of directed graphs, where to each network link we associate a process for the instantaneous signal-to-interference-plus-noise ratio. We then formulate practical conditions that can be verified locally and show that the age of information (AoI) associated with data communicated over the network is in $\mathcal{O}(\sqrt{n})$. Under these conditions we show that a penalty-based gradient descent algorithm can be used to solve a rich class of stochastic, constrained, distributed optimisation problems. The strength of our result lies in the bridge between practical verifiable network conditions and an abstract optimisation theory. We illustrate numerically that our algorithm converges in an extreme scenario where the average AoI diverges.
Submitted 10 May, 2021;
originally announced May 2021.
-
Deep Q-Learning: Theoretical Insights from an Asymptotic Analysis
Authors:
Arunselvan Ramaswamy,
Eyke Hüllermeier
Abstract:
Deep Q-Learning is an important reinforcement learning algorithm, which involves training a deep neural network, called Deep Q-Network (DQN), to approximate the well-known Q-function. Although wildly successful under laboratory conditions, serious gaps between theory and practice as well as a lack of formal guarantees prevent its use in the real world. Adopting a dynamical systems perspective, we provide a theoretical analysis of a popular version of Deep Q-Learning under realistic and verifiable assumptions. More specifically, we prove an important result on the convergence of the algorithm, characterizing the asymptotic behavior of the learning process. Our result sheds light on hitherto unexplained properties of the algorithm and helps understand empirical observations, such as performance inconsistencies even after training. Unlike previous theories, our analysis accommodates state Markov processes with multiple stationary distributions. In spite of the focus on Deep Q-Learning, we believe that our theory may be applied to understand other deep learning algorithms.
Submitted 12 April, 2021; v1 submitted 25 August, 2020;
originally announced August 2020.
-
Optimization over time-varying networks with unbounded delays
Authors:
Arunselvan Ramaswamy,
Adrian Redder,
Daniel E. Quevedo
Abstract:
Solving optimization problems in multi-agent systems (MAS) involves information exchange between agents. These solutions must be robust to delays and errors that arise from an unreliable wireless network which typically connects the MAS. In today's large-scale dynamic Internet of Things-style multi-agent scenarios, the network topology changes and evolves over time. In this paper, we present a simple distributed gradient based optimization framework and an associated algorithm. Convergence to a minimum of a given objective is shown under mild conditions on the network topology and objective. Specifically, we only assume that a message sent by a sender reaches the intended receiver, possibly delayed, with some positive probability. To the best of our knowledge, ours is the first analysis under such weak general network conditions. We also discuss in detail the verifiability of all assumptions involved. This paper makes a significant technical contribution in terms of the allowed class of objective functions. Specifically, we present an analysis wherein the objective function is such that its sample-gradient is merely locally Lipschitz continuous. The theory developed herein is supported by numerical results. Another contribution of this paper is a consensus algorithm based on the main framework/algorithm. The long-term behavior of this consensus algorithm is a direct consequence of the theory presented. Again, we believe that ours is the first consensus algorithm to account for unbounded stochastic communication delays, in addition to time-varying networks.
Submitted 18 December, 2019; v1 submitted 15 December, 2019;
originally announced December 2019.
-
Deep reinforcement learning for scheduling in large-scale networked control systems
Authors:
Adrian Redder,
Arunselvan Ramaswamy,
Daniel E. Quevedo
Abstract:
This work considers the problem of control and resource scheduling in networked systems. We present DIRA, a Deep reinforcement learning based Iterative Resource Allocation algorithm, which is scalable and control-aware. Our algorithm is tailored towards large-scale problems where control and scheduling need to act jointly to optimize performance. DIRA can be used to schedule general time-domain optimization based controllers. In the present work, we focus on control designs based on suitably adapted linear quadratic regulators. We apply our algorithm to networked systems with correlated fading communication channels. Our simulations show that DIRA scales well to large scheduling problems.
Submitted 23 September, 2019; v1 submitted 15 May, 2019;
originally announced May 2019.
-
DSPG: Decentralized Simultaneous Perturbations Gradient Descent Scheme
Authors:
Arunselvan Ramaswamy
Abstract:
Distributed descent-based methods are an essential toolset for solving optimization problems in multi-agent system scenarios. Here the agents seek to optimize a global objective function through mutual cooperation. Oftentimes, cooperation is achieved over a wireless communication network that is prone to delays and errors. There are many scenarios wherein the objective function is either non-differentiable or merely observable. In this paper, we present a cross-entropy based distributed stochastic approximation algorithm (SA) that finds a minimum of the objective, using only samples. We call this algorithm Decentralized Simultaneous Perturbation Stochastic Gradient, with Constant Sensitivity Parameters (DSPG). This algorithm is a twofold improvement over the classic Simultaneous Perturbation Stochastic Approximations (SPSA) algorithm. Specifically, DSPG allows for (i) the use of old information from other agents and (ii) easy implementation through the use of simple hyper-parameter choices. We analyze the biases and variances that arise due to these two allowances. We show that the biases due to communication delays can be countered by a careful choice of algorithm hyper-parameters. The variance of the gradient estimator and its effect on the rate of convergence is studied. We present numerical results supporting our theory. Finally, we discuss an application to the stochastic consensus problem.
Submitted 27 August, 2019; v1 submitted 17 March, 2019;
originally announced March 2019.
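For readers unfamiliar with the classic SPSA baseline that the abstract above compares against, the following is a minimal sketch of the standard two-sided simultaneous perturbation gradient estimate with ±1 (Rademacher) perturbations. It is not DSPG: it has no agents, delays, or the paper's hyper-parameter choices. The function names, the perturbation size `c`, and the test objective are assumptions made for illustration.

```python
import random

def spsa_gradient(f, x, c=0.1, rng=None):
    """Classic two-sided SPSA estimate of the gradient of f at x.

    All coordinates are perturbed at once by c * delta, where each
    delta_i is +1 or -1 with equal probability, so only two function
    evaluations are needed regardless of the dimension of x.
    """
    if rng is None:
        rng = random.Random(0)
    delta = [rng.choice((-1.0, 1.0)) for _ in x]
    x_plus = [xi + c * di for xi, di in zip(x, delta)]
    x_minus = [xi - c * di for xi, di in zip(x, delta)]
    diff = f(x_plus) - f(x_minus)
    return [diff / (2.0 * c * di) for di in delta]

def quadratic(x):
    """Hypothetical test objective: sum of squares, with gradient 2x."""
    return sum(xi * xi for xi in x)

# In one dimension the estimate for a quadratic matches the true gradient 2x.
estimate = spsa_gradient(quadratic, [3.0])
```

In higher dimensions a single estimate is noisy, and it is only the average over many random perturbations that approaches the true gradient.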
-
Multi-Stage Reinforcement Learning For Object Detection
Authors:
Jonas Koenig,
Simon Malberg,
Martin Martens,
Sebastian Niehaus,
Artus Krohn-Grimberghe,
Arunselvan Ramaswamy
Abstract:
We present a reinforcement learning approach for detecting objects within an image. Our approach performs a step-wise deformation of a bounding box with the goal of tightly framing the object. It uses a hierarchical tree-like representation of predefined region candidates, which the agent can zoom in on. This reduces the number of region candidates that must be evaluated so that the agent can afford to compute new feature maps before each step to enhance detection quality. We compare an approach that is based purely on zoom actions with one that is extended by a second refinement stage to fine-tune the bounding box after each zoom step. We also improve the fitting ability by allowing for different aspect ratios of the bounding box. Finally, we propose different reward functions to lead to a better guidance of the agent while following its search trajectories. Experiments indicate that each of these extensions leads to more correct detections. The best performing approach comprises a zoom stage and a refinement stage, uses aspect-ratio modifying actions and is trained using a combination of three different reward metrics.
Submitted 26 October, 2018; v1 submitted 15 October, 2018;
originally announced October 2018.
-
Deep Reinforcement Learning for Wireless Sensor Scheduling in Cyber-Physical Systems
Authors:
Alex S. Leong,
Arunselvan Ramaswamy,
Daniel E. Quevedo,
Holger Karl,
Ling Shi
Abstract:
In many Cyber-Physical Systems, we encounter the problem of remote state estimation of geographically distributed and remote physical processes. This paper studies the scheduling of sensor transmissions to estimate the states of multiple remote, dynamic processes. Information from the different sensors has to be transmitted to a central gateway over a wireless network for monitoring purposes, where typically fewer wireless channels are available than there are processes to be monitored. For effective estimation at the gateway, the sensors need to be scheduled appropriately, i.e., at each time instant one needs to decide which sensors have network access and which ones do not. To address this scheduling problem, we formulate an associated Markov decision process (MDP). This MDP is then solved using a Deep Q-Network, a recent deep reinforcement learning algorithm that is at once scalable and model-free. We compare our scheduling algorithm to popular scheduling algorithms such as round-robin and reduced-waiting-time, among others. Our algorithm is shown to significantly outperform these algorithms for many example scenarios.
Submitted 27 May, 2020; v1 submitted 13 September, 2018;
originally announced September 2018.
-
DeepCAS: A Deep Reinforcement Learning Algorithm for Control-Aware Scheduling
Authors:
Burak Demirel,
Arunselvan Ramaswamy,
Daniel E. Quevedo,
Holger Karl
Abstract:
We consider networked control systems consisting of multiple independent controlled subsystems, operating over a shared communication network. Such systems are ubiquitous in cyber-physical systems, Internet of Things, and large-scale industrial systems. In many large-scale settings, the size of the communication network is smaller than the size of the system. In consequence, scheduling issues arise. The main contribution of this paper is to develop a deep reinforcement learning-based control-aware scheduling (DeepCAS) algorithm to tackle these issues. We use the following (optimal) design strategy: First, we synthesize an optimal controller for each subsystem; next, we design a learning algorithm that adapts to the chosen subsystems (plants) and controllers. As a consequence of this adaptation, our algorithm finds a schedule that minimizes the control loss. We present empirical results to show that DeepCAS finds schedules with better performance than periodic ones.
Submitted 13 June, 2018; v1 submitted 8 March, 2018;
originally announced March 2018.
-
Asynchronous stochastic approximations with asymptotically biased errors and deep multi-agent learning
Authors:
Arunselvan Ramaswamy,
Shalabh Bhatnagar,
Daniel E. Quevedo
Abstract:
Asynchronous stochastic approximations (SAs) are an important class of model-free algorithms, tools and techniques that are popular in multi-agent and distributed control scenarios. To counter Bellman's curse of dimensionality, such algorithms are coupled with function approximations. Although the learning/control problem becomes more tractable, function approximations affect stability and convergence. In this paper, we present verifiable sufficient conditions for stability and convergence of asynchronous SAs with biased approximation errors. The theory developed herein is used to analyze Policy Gradient methods and noisy Value Iteration schemes. Specifically, we analyze the asynchronous approximate counterparts of the policy gradient (A2PG) and value iteration (A2VI) schemes. It is shown that the stability of these algorithms is unaffected by biased approximation errors, provided they are asymptotically bounded. With respect to convergence (of A2VI and A2PG), a relationship between the limiting set and the approximation errors is established. Finally, experimental results are presented that support the theory.
Submitted 2 May, 2019; v1 submitted 22 February, 2018;
originally announced February 2018.
-
Analyzing Approximate Value Iteration Algorithms
Authors:
Arunselvan Ramaswamy,
Shalabh Bhatnagar
Abstract:
In this paper, we consider the stochastic iterative counterpart of the value iteration scheme wherein only noisy and possibly biased approximations of the Bellman operator are available. We call this counterpart the approximate value iteration (AVI) scheme. Neural networks are often used as function approximators, in order to counter Bellman's curse of dimensionality. In this paper, they are used to approximate the Bellman operator. Since neural networks are typically trained using sample data, errors and biases may be introduced. The design of AVI accounts for implementations with biased approximations of the Bellman operator and sampling errors. We present verifiable sufficient conditions under which AVI is stable (almost surely bounded) and converges to a fixed point of the approximate Bellman operator. To ensure the stability of AVI, we present three different yet related sets of sufficient conditions that are based on the existence of an appropriate Lyapunov function. These Lyapunov-function-based conditions are easily verifiable and new to the literature. The verifiability is enhanced by the fact that a recipe for the construction of the necessary Lyapunov function is also provided. We also show that the stability analysis of AVI can be readily extended to the general case of set-valued stochastic approximations. Finally, we show that AVI can also be used in more general circumstances, i.e., for finding fixed points of contractive set-valued maps.
Submitted 30 May, 2021; v1 submitted 14 September, 2017;
originally announced September 2017.
-
Analysis of gradient descent methods with non-diminishing, bounded errors
Authors:
Arunselvan Ramaswamy,
Shalabh Bhatnagar
Abstract:
The main aim of this paper is to provide an analysis of gradient descent (GD) algorithms with gradient errors that do not necessarily vanish asymptotically. In particular, sufficient conditions are presented for both stability (almost sure boundedness of the iterates) and convergence of GD with bounded, (possibly) non-diminishing gradient errors. In addition to ensuring stability, such an algorithm is shown to converge to a small neighborhood of the minimum set, which depends on the gradient errors. It is worth noting that the main result of this paper can be used to show that GD with asymptotically vanishing errors indeed converges to the minimum set. The results presented herein are not only more general when compared to previous results, but our analysis of GD with errors is new to the literature to the best of our knowledge. Our work extends the contributions of Mangasarian & Solodov, Bertsekas & Tsitsiklis and Tadic & Doucet. Using our framework, a simple yet effective implementation of GD using simultaneous perturbation stochastic approximations (SPSA), with constant sensitivity parameters, is presented. Another important improvement over many previous results is that there are no 'additional' restrictions imposed on the step-sizes. In machine learning applications where step-sizes are related to learning rates, our assumptions, unlike those of other papers, do not affect these learning rates. Finally, we present experimental results to validate our theory.
Submitted 18 September, 2017; v1 submitted 1 April, 2016;
originally announced April 2016.
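The claim in the abstract above, that GD with bounded, non-diminishing gradient errors stays stable and settles in a neighborhood of the minimum set whose size depends on the errors, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the objective f(x) = 0.5 * ||x||^2, the step-sizes a_n = 1/(n+2), the error model (a random direction with norm exactly eps at every step), and the function name `gd_with_bounded_errors`.

```python
import numpy as np


def gd_with_bounded_errors(x0, eps, n_steps, seed=0):
    """Run GD on f(x) = 0.5 * ||x||^2 with a gradient error of norm eps at every step.

    Illustrative assumptions (not from the paper): quadratic objective,
    step-sizes a_n = 1 / (n + 2), errors drawn as random directions scaled to eps.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for n in range(n_steps):
        step = 1.0 / (n + 2)                  # diminishing step-size a_n
        grad = x                              # exact gradient of 0.5 * ||x||^2
        d = rng.normal(size=x.shape)
        err = eps * d / np.linalg.norm(d)     # error never shrinks: ||err|| == eps
        x = x - step * (grad + err)           # GD update with the perturbed gradient
    return x


if __name__ == "__main__":
    x_final = gd_with_bounded_errors([5.0, -3.0], eps=0.5, n_steps=2000)
    # The iterate should end up within roughly eps of the minimiser at the origin.
    print("distance to minimum:", np.linalg.norm(x_final))
```

For this particular quadratic, the update gives ||x_{n+1}|| - eps <= (1 - a_n)(||x_n|| - eps), so the distance to the minimum eventually falls to about eps and stays there, even though the errors never vanish. This is only a toy instance of the behaviour described, not the paper's general result.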
-
A Nonlinear Boundary Condition for Continuum Models of Biomolecular Electrostatics
Authors:
J. P. Bardhan,
D. A. Tejani,
N. S. Wieckowski,
A. Ramaswamy,
M. G. Knepley
Abstract:
Understanding the behavior of biomolecules such as proteins requires understanding the critical influence of the surrounding fluid (solvent) environment--water with mobile salt ions such as sodium. Unfortunately, for many studies, fully atomistic simulations of biomolecules surrounded by thousands of water molecules and ions are too computationally slow. Continuum solvent models based on macroscopic dielectric theory (e.g. the Poisson equation) are popular alternatives, but their simplicity fails to capture well-known phenomena of functional significance. For example, standard theories predict that electrostatic response is symmetric with respect to the sign of an atomic charge, even though response is in fact strongly asymmetric if the charge is near the biomolecule surface. In this work, we present an asymmetric continuum theory that captures the essential physical mechanism--the finite size of solvent atoms--using a nonlinear boundary condition (NLBC) at the dielectric interface between the biomolecule and solvent. Numerical calculations using boundary-integral methods demonstrate that the new NLBC model reproduces a wide range of results computed by more realistic, and expensive, all-atom molecular-dynamics (MD) simulations in explicit water. We discuss model extensions such as modeling dilute-electrolyte solvents with Debye-Hückel theory (the linearized Poisson-Boltzmann equation) and opportunities for the electromagnetics community to contribute to research in this important area of molecular nanoscience and engineering.
Submitted 24 May, 2015;
originally announced May 2015.
-
Stability of Stochastic Approximations with `Controlled Markov' Noise and Temporal Difference Learning
Authors:
Arunselvan Ramaswamy,
Shalabh Bhatnagar
Abstract:
We are interested in understanding stability (almost sure boundedness) of stochastic approximation algorithms (SAs) driven by a 'controlled Markov' process. Analyzing this class of algorithms is important, since many reinforcement learning (RL) algorithms can be cast as SAs driven by a 'controlled Markov' process. In this paper, we present easily verifiable sufficient conditions for stability and convergence of SAs driven by a 'controlled Markov' process. Many RL applications involve continuous state spaces. While our analysis readily ensures stability for such continuous state applications, traditional analyses do not. Compared to the literature, our analysis presents a two-fold generalization: (a) the Markov process may evolve in a continuous state space and (b) the process need not be ergodic under any given stationary policy. Temporal difference learning (TD) is an important policy evaluation method in reinforcement learning. The theory developed herein is used to analyze generalized $TD(0)$, an important variant of TD. Our theory is also used to analyze a TD formulation of supervised learning for forecasting problems.
Submitted 17 May, 2018; v1 submitted 23 April, 2015;
originally announced April 2015.
-
Stochastic recursive inclusion in two timescales with an application to the Lagrangian dual problem
Authors:
Arunselvan Ramaswamy,
Shalabh Bhatnagar
Abstract:
In this paper we present a framework to analyze the asymptotic behavior of two timescale stochastic approximation algorithms including those with set-valued mean fields. This paper builds on the works of Borkar and Perkins & Leslie. The framework presented herein is more general than the synchronous two timescale framework of Perkins & Leslie; however, the assumptions involved are easily verifiable. As an application, we use this framework to analyze the two timescale stochastic approximation algorithm corresponding to the Lagrangian dual problem in optimization theory.
Submitted 9 October, 2015; v1 submitted 6 February, 2015;
originally announced February 2015.
-
A Generalization of the Borkar-Meyn Theorem for Stochastic Recursive Inclusions
Authors:
Arunselvan Ramaswamy,
Shalabh Bhatnagar
Abstract:
In this paper the stability theorem of Borkar and Meyn is extended to include the case when the mean field is a differential inclusion. Two different sets of sufficient conditions are presented that guarantee the stability and convergence of stochastic recursive inclusions. Our work builds on the works of Benaim, Hofbauer and Sorin as well as Borkar and Meyn. As a corollary to one of the main theorems, a natural generalization of the Borkar and Meyn Theorem follows. In addition, the original theorem of Borkar and Meyn is shown to hold under slightly relaxed assumptions. Finally, as an application of one of the main theorems, we discuss a solution to the approximate drift problem.
Submitted 27 September, 2016; v1 submitted 6 February, 2015;
originally announced February 2015.
-
Rainbow Connection Number of Graph Power and Graph Products
Authors:
Manu Basavaraju,
L. Sunil Chandran,
Deepak Rajendraprasad,
Arunselvan Ramaswamy
Abstract:
The rainbow connection number, rc(G), of a connected graph G is the minimum number of colors needed to color its edges so that every pair of vertices is connected by at least one path in which no two edges are colored the same (note that the coloring need not be proper). In this paper we study the rainbow connection number with respect to three important graph product operations (namely cartesian product, lexicographic product and strong product) and the operation of taking the power of a graph. In this direction, we show that if G is a graph obtained by applying any of the operations mentioned above on non-trivial graphs, then rc(G) <= 2r(G)+c, where r(G) denotes the radius of G and c \in {0,1,2}. In general the rainbow connection number of a bridgeless graph can be as high as the square of its radius [Basavaraju et al., 2010]. This is an attempt to identify some graph classes which have rainbow connection number very close to the obvious lower bound of diameter (and thus the radius). The bounds reported are tight up to additive constants. The proofs are constructive and hence yield polynomial time (2 + 2/r(G))-factor approximation algorithms.
Submitted 22 July, 2011; v1 submitted 21 April, 2011;
originally announced April 2011.
-
Rainbow Connection Number and Radius
Authors:
Manu Basavaraju,
L. Sunil Chandran,
Deepak Rajendraprasad,
Arunselvan Ramaswamy
Abstract:
The rainbow connection number, rc(G), of a connected graph G is the minimum number of colours needed to colour its edges, so that every pair of its vertices is connected by at least one path in which no two edges are coloured the same. In this note we show that for every bridgeless graph G with radius r, rc(G) <= r(r + 2). We demonstrate that this bound is the best possible for rc(G) as a function of r, not just for bridgeless graphs, but also for graphs of any stronger connectivity. It may be noted that for a general 1-connected graph G, rc(G) can be arbitrarily larger than its radius (a star graph, for instance). We further show that for every bridgeless graph G with radius r and chordality (size of a largest induced cycle) k, rc(G) <= rk.
It is known that computing rc(G) is NP-Hard [Chakraborty et al., 2009]. Here, we present an (r+3)-factor approximation algorithm which runs in O(nm) time and a (d+3)-factor approximation algorithm which runs in O(dm) time to rainbow colour any connected graph G on n vertices, with m edges, diameter d and radius r.
Submitted 11 September, 2012; v1 submitted 2 November, 2010;
originally announced November 2010.
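The definition of rc(G) used in the two abstracts above can be made concrete with a small brute-force sketch. It is not the approximation algorithm from either paper: it simply tries every edge colouring with k = 1, 2, ... colours and is exponential in the number of edges, so it is only usable on tiny graphs. The function names and the edge-list representation are assumptions chosen for illustration, and the graph is assumed to have at least one edge.

```python
from itertools import product


def is_rainbow_connected(n, edges, colouring):
    """Check that every pair of vertices is joined by a path with no repeated colour.

    n: number of vertices (labelled 0..n-1); edges: list of (u, v) pairs;
    colouring: sequence of colours, one per edge, in the same order as edges.
    """
    adj = {v: [] for v in range(n)}
    for (u, v), c in zip(edges, colouring):
        adj[u].append((v, c))
        adj[v].append((u, c))

    def reachable(src, dst):
        # Search over states (vertex, set of colours used so far).
        # A rainbow walk can always be shortened to a rainbow path.
        start = (src, frozenset())
        stack, seen = [start], {start}
        while stack:
            v, used = stack.pop()
            if v == dst:
                return True
            for w, c in adj[v]:
                if c not in used:
                    state = (w, used | {c})
                    if state not in seen:
                        seen.add(state)
                        stack.append(state)
        return False

    return all(reachable(u, v) for u in range(n) for v in range(u + 1, n))


def rainbow_connection_number(n, edges):
    """Smallest k admitting a rainbow-connecting colouring, or None if G is disconnected."""
    for k in range(1, len(edges) + 1):
        for colouring in product(range(k), repeat=len(edges)):
            if is_rainbow_connected(n, edges, colouring):
                return k
    return None


if __name__ == "__main__":
    cycle4 = [(0, 1), (1, 2), (2, 3), (3, 0)]   # bridgeless, radius 2
    star3 = [(0, 1), (0, 2), (0, 3)]            # radius 1, every edge is a bridge
    print(rainbow_connection_number(4, cycle4))
    print(rainbow_connection_number(4, star3))
```

The star example reflects the remark in the abstract: any two leaves are joined only by the two-edge path through the centre, so all edges need distinct colours and rc grows with the number of leaves while the radius stays 1.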