-
Approximation with Random Shallow ReLU Networks with Applications to Model Reference Adaptive Control
Authors:
Andrew Lamperski,
Tyler Lekang
Abstract:
Neural networks are regularly employed in adaptive control of nonlinear systems and related methods of reinforcement learning. A common architecture uses a neural network with a single hidden layer (i.e. a shallow network), in which the weights and biases are fixed in advance and only the output layer is trained. While classical results show that there exist neural networks of this type that can approximate arbitrary continuous functions over bounded regions, they are non-constructive, and the networks used in practice have no approximation guarantees. Thus, the approximation properties required for control with neural networks are assumed, rather than proved. In this paper, we aim to fill this gap by showing that for sufficiently smooth functions, ReLU networks with randomly generated weights and biases achieve $L_{\infty}$ error of $O(m^{-1/2})$ with high probability, where $m$ is the number of neurons. It suffices to generate the weights uniformly over a sphere and the biases uniformly over an interval. We show how the result can be used to get approximations of required accuracy in a model reference adaptive control application.
Submitted 16 April, 2024; v1 submitted 25 March, 2024;
originally announced March 2024.
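As a rough illustration of the architecture described in the abstract above, the following Python sketch fixes random hidden-layer parameters and trains only the output layer. It is not the authors' construction: it is restricted to scalar inputs (so the unit sphere reduces to {-1, +1}), and the target function, bias range, step size, and all function names are assumptions chosen for illustration. Plain gradient descent is used here as a stand-in training rule.

```python
import math
import random


def relu(z):
    return z if z > 0.0 else 0.0


def sample_hidden_layer(m, bias_range, rng):
    """Draw fixed hidden parameters for scalar inputs (illustrative assumption).

    In one dimension the unit sphere is {-1, +1}, so each weight is a random
    sign; each bias is uniform on [-bias_range, bias_range].
    """
    weights = [rng.choice((-1.0, 1.0)) for _ in range(m)]
    biases = [rng.uniform(-bias_range, bias_range) for _ in range(m)]
    return weights, biases


def hidden_features(x, weights, biases):
    """Evaluate the m fixed ReLU neurons at the scalar input x."""
    return [relu(w * x + b) for w, b in zip(weights, biases)]


def mse(coeffs, feats, ys):
    """Mean squared error of the linear output layer on the samples."""
    total = 0.0
    for phi, y in zip(feats, ys):
        pred = sum(c * p for c, p in zip(coeffs, phi))
        total += (pred - y) ** 2
    return total / len(ys)


def train_output_layer(feats, ys, steps, lr):
    """Gradient descent on the output coefficients only; hidden layer stays fixed."""
    m, n = len(feats[0]), len(ys)
    coeffs = [0.0] * m
    for _ in range(steps):
        grad = [0.0] * m
        for phi, y in zip(feats, ys):
            err = sum(c * p for c, p in zip(coeffs, phi)) - y
            for j in range(m):
                grad[j] += 2.0 * err * phi[j] / n
        coeffs = [c - lr * g for c, g in zip(coeffs, grad)]
    return coeffs


rng = random.Random(0)
m = 40  # number of neurons
weights, biases = sample_hidden_layer(m, 1.0, rng)
xs = [-1.0 + 2.0 * i / 50 for i in range(51)]
ys = [math.sin(math.pi * x) for x in xs]  # hypothetical smooth target
feats = [hidden_features(x, weights, biases) for x in xs]

loss_before = mse([0.0] * m, feats, ys)
coeffs = train_output_layer(feats, ys, steps=200, lr=0.002)
loss_after = mse(coeffs, feats, ys)
```

Because the hidden layer is fixed, the fit is linear in `coeffs`, so a least-squares solver could replace the gradient loop; the loop is used only to keep the sketch dependency-free.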
-
An algorithm for bilevel optimization with traffic equilibrium constraints: convergence rate analysis
Authors:
Akshit Goyal,
Andrew Lamperski
Abstract:
Bilevel optimization with traffic equilibrium constraints plays an important role in transportation planning and management problems such as traffic control, transport network design, and congestion pricing. In this paper, we consider a double-loop gradient-based algorithm to solve such bilevel problems and provide a non-asymptotic convergence guarantee of $\mathcal{O}(K^{-1})+\mathcal{O}(λ^D)$, where $K$ and $D$ are respectively the numbers of upper- and lower-level iterations, and $0<λ<1$ is a constant. Compared to the existing literature, which either provides asymptotic convergence or makes strong assumptions and requires a complex design of step sizes, we establish convergence with simple constant step sizes and under fewer assumptions. The analysis techniques in this paper use concepts from the field of robust control and can potentially serve as a guiding framework for analyzing more general bilevel optimization algorithms.
Submitted 25 June, 2023;
originally announced June 2023.
-
Function Approximation with Randomly Initialized Neural Networks for Approximate Model Reference Adaptive Control
Authors:
Tyler Lekang,
Andrew Lamperski
Abstract:
Classical results in neural network approximation theory show how arbitrary continuous functions can be approximated by networks with a single hidden layer, under mild assumptions on the activation function. However, the classical theory does not give a constructive means to generate the network parameters that achieve a desired accuracy. Recent results have demonstrated that for specialized activation functions, such as ReLUs and some classes of analytic functions, high accuracy can be achieved via linear combinations of randomly initialized activations. These recent works utilize specialized integral representations of target functions that depend on the specific activation functions used. This paper defines mollified integral representations, which provide a means to form integral representations of target functions using activations for which no direct integral representation is currently known. The new construction enables approximation guarantees for randomly initialized networks for a variety of widely used activation functions.
Submitted 5 April, 2023; v1 submitted 28 March, 2023;
originally announced March 2023.
-
Non-Asymptotic Pointwise and Worst-Case Bounds for Classical Spectrum Estimators
Authors:
Andrew Lamperski
Abstract:
Spectrum estimation is a fundamental methodology in the analysis of time-series data, with applications including medicine, speech analysis, and control design. The asymptotic theory of spectrum estimation is well understood, but the theory is limited when the number of samples is fixed and finite. This paper gives non-asymptotic error bounds for a broad class of spectral estimators, both pointwise (at specific frequencies) and in the worst case over all frequencies. The general method is used to derive error bounds for the classical Blackman-Tukey, Bartlett, and Welch estimators. In particular, these are the first non-asymptotic error bounds for the Bartlett and Welch estimators.
Submitted 14 August, 2023; v1 submitted 21 March, 2023;
originally announced March 2023.
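For readers unfamiliar with the estimators named in the abstract above, here is a minimal Python sketch of the Bartlett estimator: the signal is split into non-overlapping segments and the periodograms of the segments are averaged. The normalization (squared DFT magnitude divided by segment length), the test signal, and all function names are assumptions for illustration; the paper's own conventions may differ, and none of its error bounds are reproduced here.

```python
import cmath
import math


def periodogram(segment):
    """Squared DFT magnitude divided by the segment length (assumed normalization)."""
    n = len(segment)
    values = []
    for k in range(n):
        dft_k = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(segment)
        )
        values.append(abs(dft_k) ** 2 / n)
    return values


def bartlett_estimate(signal, segment_length):
    """Average the periodograms of non-overlapping, unwindowed segments."""
    num_segments = len(signal) // segment_length
    if num_segments == 0:
        raise ValueError("signal is shorter than one segment")
    estimate = [0.0] * segment_length
    for s in range(num_segments):
        segment = signal[s * segment_length:(s + 1) * segment_length]
        for k, p in enumerate(periodogram(segment)):
            estimate[k] += p / num_segments
    return estimate


# Hypothetical test signal: a cosine sitting exactly on DFT bin 4 of a 16-sample segment.
segment_length = 16
signal = [math.cos(2 * math.pi * 4 * t / segment_length) for t in range(64)]
spectrum = bartlett_estimate(signal, segment_length)

# Search the bins from 0 up to the Nyquist bin for the largest value.
peak_bin = max(range(segment_length // 2 + 1), key=lambda k: spectrum[k])
```

The Welch estimator follows the same averaging pattern but applies a window to each segment and allows the segments to overlap.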
-
Sufficient Conditions for Persistency of Excitation with Step and ReLU Activation Functions
Authors:
Tyler Lekang,
Andrew Lamperski
Abstract:
This paper defines geometric criteria which are then used to establish sufficient conditions for persistency of excitation with vector functions constructed from single hidden-layer neural networks with step or ReLU activation functions. We show that these conditions hold when employing reference system tracking, as is commonly done in adaptive control. We demonstrate the results numerically on a system with linearly parameterized activations of this type and show that the parameter estimates converge to the true values with the sufficient conditions met.
Submitted 14 September, 2022; v1 submitted 13 September, 2022;
originally announced September 2022.
-
Constrained Langevin Algorithms with L-mixing External Random Variables
Authors:
Yuping Zheng,
Andrew Lamperski
Abstract:
Langevin algorithms are gradient descent methods augmented with additive noise, and are widely used in Markov Chain Monte Carlo (MCMC) sampling, optimization, and machine learning. In recent years, the non-asymptotic analysis of Langevin algorithms for non-convex learning has been extensively explored. For constrained problems with non-convex losses over a compact convex domain with IID data variables, the projected Langevin algorithm achieves a deviation of $O(T^{-1/4} (\log T)^{1/2})$ from its target distribution [27] in $1$-Wasserstein distance.
In this paper, we obtain a deviation of $O(T^{-1/2} \log T)$ in $1$-Wasserstein distance for non-convex losses with $L$-mixing data variables and polyhedral constraints (which are not necessarily bounded). This improves on the previous bound for constrained problems and matches the best-known bound for unconstrained problems.
Submitted 7 January, 2023; v1 submitted 27 May, 2022;
originally announced May 2022.
-
Wasserstein Contraction Bounds on Closed Convex Domains with Applications to Stochastic Adaptive Control
Authors:
Tyler Lekang,
Andrew Lamperski
Abstract:
This paper is motivated by the problem of quantitatively bounding the convergence of adaptive control methods for stochastic systems to a stationary distribution. Such bounds are useful for analyzing statistics of trajectories and determining appropriate step sizes for simulations. To this end, we extend a methodology from (unconstrained) stochastic differential equations (SDEs) which provides contractions in a specially chosen Wasserstein distance. This theory focuses on unconstrained SDEs with fairly restrictive assumptions on the drift terms. Typical adaptive control schemes, however, place constraints on the learned parameters, and their update rules violate the drift conditions. We therefore extend the contraction theory to the case of constrained systems represented by reflected stochastic differential equations and generalize the allowable drifts. We show how the general theory can be used to derive quantitative contraction bounds on a nonlinear stochastic adaptive regulation problem.
Submitted 15 October, 2021; v1 submitted 24 September, 2021;
originally announced September 2021.
-
Projected Stochastic Gradient Langevin Algorithms for Constrained Sampling and Non-Convex Learning
Authors:
Andrew Lamperski
Abstract:
Langevin algorithms are gradient descent methods with additive noise. They have been used for decades in Markov chain Monte Carlo (MCMC) sampling, optimization, and learning. Their convergence properties for unconstrained non-convex optimization and learning problems have been studied widely in the last few years. Other work has examined projected Langevin algorithms for sampling from log-concave distributions restricted to convex compact sets. For learning and optimization, log-concave distributions correspond to convex losses. In this paper, we analyze the case of non-convex losses with compact convex constraint sets and IID external data variables. We term the resulting method the projected stochastic gradient Langevin algorithm (PSGLA). We show the algorithm achieves a deviation of $O(T^{-1/4}(\log T)^{1/2})$ from its target distribution in 1-Wasserstein distance. For optimization and learning, we show that the algorithm achieves $ε$-suboptimal solutions, on average, provided that it is run for a time that is polynomial in $ε^{-1}$ and slightly super-exponential in the problem dimension.
Submitted 22 December, 2020;
originally announced December 2020.
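The kind of iteration analyzed in the abstract above (a stochastic gradient step, additive Gaussian noise, then projection onto a compact convex set) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the paper's algorithm statement: the box constraint, the toy double-well loss, the noise scaling $\sqrt{2η/β}$ with step size $η$ and inverse temperature $β$, and all names are hypothetical choices.

```python
import math
import random


def project_onto_box(x, lower, upper):
    """Euclidean projection onto the box [lower, upper]^d (assumed constraint set)."""
    return [min(max(xi, lower), upper) for xi in x]


def stochastic_gradient(x, z):
    """Gradient in x of the toy non-convex loss sum_i (x_i^2 - 0.25)^2 + z_i * x_i."""
    return [4.0 * xi * (xi * xi - 0.25) + zi for xi, zi in zip(x, z)]


def projected_langevin(x0, step, beta, num_steps, rng, lower=-1.0, upper=1.0):
    """Run gradient step + Gaussian noise + projection; return all iterates."""
    x = project_onto_box(x0, lower, upper)
    path = [x]
    noise_scale = math.sqrt(2.0 * step / beta)
    for _ in range(num_steps):
        # IID external data variable entering the loss (hypothetical model).
        z = [rng.gauss(0.0, 0.1) for _ in x]
        grad = stochastic_gradient(x, z)
        proposal = [
            xi - step * gi + noise_scale * rng.gauss(0.0, 1.0)
            for xi, gi in zip(x, grad)
        ]
        x = project_onto_box(proposal, lower, upper)
        path.append(x)
    return path


rng = random.Random(0)
path = projected_langevin([0.9, -0.9], step=0.01, beta=5.0, num_steps=1000, rng=rng)
```

The projection keeps every iterate inside the constraint set, which is the feature that distinguishes this scheme from unconstrained Langevin updates.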
-
Differential Dynamic Programming for Nonlinear Dynamic Games
Authors:
Bolei Di,
Andrew Lamperski
Abstract:
Dynamic games arise when multiple agents with differing objectives choose control inputs to a dynamic system. Dynamic games model a wide variety of applications in economics, defense, and energy systems. However, compared to single-agent control problems, the computational methods for dynamic games are relatively limited. As in the single-agent case, only very specialized dynamic games can be solved exactly, and so approximation algorithms are required. This paper extends the differential dynamic programming algorithm from single-agent control to the case of non-zero sum full-information dynamic games. The method works by computing quadratic approximations to the dynamic programming equations. The approximation results in static quadratic games which are solved recursively. Convergence is proved by showing that the algorithm iterates sufficiently close to iterates of Newton's method to inherit its convergence properties. A numerical example is provided.
Submitted 21 September, 2018;
originally announced September 2018.
-
Moment Analysis of Stochastic Hybrid Systems Using Semidefinite Programming
Authors:
Khem Raj Ghusinga,
Andrew Lamperski,
Abhyudai Singh
Abstract:
This paper proposes a semidefinite programming based method for estimating moments of a stochastic hybrid system (SHS). For polynomial SHSs -- which consist of polynomial continuous vector fields, reset maps, and transition intensities -- the dynamics of moments evolve according to a system of linear ordinary differential equations. However, it is generally not possible to solve the system exactly since time evolution of a specific moment may depend upon moments of order higher than it. One way to overcome this problem is to employ so-called moment closure methods that give point approximations to moments, but these are limited in that accuracy of the estimations is unknown. We find lower and upper bounds on a moment of interest via a semidefinite program that includes linear constraints obtained from moment dynamics, along with semidefinite constraints that arise from the non-negativity of moment matrices. These bounds are further shown to improve as the size of semidefinite program is increased. The key insight in the method is a reduction from stochastic hybrid systems with multiple discrete modes to a single-mode hybrid system with algebraic constraints. We further extend the scope of the proposed method to a class of non-polynomial SHSs which can be recast to polynomial SHSs via augmentation of additional states. Finally, we illustrate the applicability of results via examples of SHSs drawn from different disciplines.
Submitted 1 February, 2018;
originally announced February 2018.
-
Estimating stationary characteristic functions of stochastic systems via semidefinite programming
Authors:
Khem Raj Ghusinga,
Andrew Lamperski,
Abhyudai Singh
Abstract:
This paper proposes a methodology to estimate characteristic functions of stochastic differential equations that are defined over polynomials and driven by Lévy noise. For such systems, the time evolution of the characteristic function is governed by a partial differential equation; consequently, the stationary characteristic function can be obtained by solving an ordinary differential equation (ODE). However, except for a few special cases such as linear systems, the solution to the ODE consists of unknown coefficients. These coefficients are closely related with the stationary moments of the process, and bounds on these can be obtained by utilizing the fact that the characteristic function is positive definite. These bounds can be further used to find bounds on other higher order stationary moments and also estimate the stationary characteristic function itself. The method is finally illustrated via examples.
Submitted 16 November, 2017;
originally announced November 2017.
-
Approximate moment dynamics for polynomial and trigonometric stochastic systems
Authors:
Khem Raj Ghusinga,
Mohammad Soltani,
Andrew Lamperski,
Sairaj Dhople,
Abhyudai Singh
Abstract:
Stochastic dynamical systems often contain nonlinearities which make it hard to compute probability density functions or statistical moments of these systems. For the moment computations, nonlinearities in the dynamics lead to unclosed moment dynamics; in particular, the time evolution of a moment of a specific order may depend both on moments of order higher than it and on some nonlinear function of other moments. Moment closure techniques are used to find an approximate, closed system of equations for the moment dynamics. In this work, we extend a moment closure technique based on derivative matching, originally proposed for polynomial stochastic systems with discrete states, to continuous-state stochastic differential equations with both polynomial and trigonometric nonlinearities. We validate the technique using two examples of nonlinear stochastic systems.
Submitted 26 March, 2017;
originally announced March 2017.
-
Analysis and Control of Stochastic Systems using Semidefinite Programming over Moments
Authors:
Andrew Lamperski,
Khem Raj Ghusinga,
Abhyudai Singh
Abstract:
This paper develops a unified methodology for probabilistic analysis and optimal control design for jump diffusion processes defined by polynomials. For such systems, the evolution of the moments of the state can be described via a system of linear ordinary differential equations. Typically, however, the moments are not closed and an infinite system of equations is required to compute statistical moments exactly. Existing methods for stochastic analysis, known as closure methods, focus on approximating this infinite system of equations with a finite dimensional system. This work develops an alternative approach in which the higher order terms, which are approximated in closure methods, are viewed as inputs to a finite-dimensional linear control system. Under this interpretation, upper and lower bounds of statistical moments can be computed via convex linear optimal control problems with semidefinite constraints. For analysis of steady-state distributions, this optimal control problem reduces to a static semidefinite program. These same optimization problems extend automatically to stochastic optimal control problems. For minimization problems, the methodology leads to guaranteed lower bounds on the true optimal value. Furthermore, we show how an approximate optimal control strategy can be constructed from the solution of the semidefinite program. The results are illustrated using numerous examples.
Submitted 1 February, 2017;
originally announced February 2017.
-
Stochastic optimal control using semidefinite programming for moment dynamics
Authors:
Andrew Lamperski,
Khem Raj Ghusinga,
Abhyudai Singh
Abstract:
This paper presents a method to approximately solve stochastic optimal control problems in which the cost function and the system dynamics are polynomial. For stochastic systems with polynomial dynamics, the moments of the state can be expressed as a, possibly infinite, system of deterministic linear ordinary differential equations. By casting the problem as a deterministic control problem in moment space, semidefinite programming is used to find a lower bound on the optimal solution. The constraints in the semidefinite program are imposed by the ordinary differential equations for moment dynamics and semidefiniteness of the outer product of moments. From the solution to the semidefinite program, an approximate optimal control strategy can be constructed using a least squares method. In the linear quadratic case, the method gives an exact solution to the optimal control problem. In more complex problems, an infinite number of moment differential equations would be required to compute the optimal control law. In this case, we give a procedure to increase the size of the semidefinite program, leading to increasingly accurate approximations to the true optimal control strategy.
Submitted 20 March, 2016;
originally announced March 2016.
-
Stability of Asynchronous Networked Control Systems with Probabilistic Clocks
Authors:
Andrew Lamperski
Abstract:
This paper studies the stability of sampled and networked control systems with sampling and communication times governed by probabilistic clocks. The clock models have few restrictions, and can be used to model numerous phenomena such as deterministic sampling, jitter, and transmission times of packet dropping networks. Moreover, the stability theory can be applied to an arbitrary number of clocks with different distributions, operating asynchronously. The paper gives Lyapunov-type sufficient conditions for stochastic stability of nonlinear networked systems. For linear systems, the paper gives necessary and sufficient conditions for exponential mean square stability, based on linear matrix inequalities. In both the linear and nonlinear cases, the Lyapunov inequalities are constructed from a simple linear combination of the classical inequalities from continuous and discrete time. Crucially, the stability theorems only depend on the mean sampling intervals. Thus, they can be applied with only limited statistical information about the clocks. The Lyapunov theorems are then applied to systems with multirate sampling, asynchronous communication, delays, and packet losses.
Submitted 8 October, 2014; v1 submitted 2 October, 2014;
originally announced October 2014.
-
Optimal Two Player LQR State Feedback With Varying Delay
Authors:
Nikolai Matni,
Andrew Lamperski,
John C. Doyle
Abstract:
This paper presents an explicit solution to a two player distributed LQR problem in which communication between controllers occurs across a communication link with varying delay. We extend known dynamic programming methods to accommodate this varying delay, and show that under suitable assumptions, the optimal control actions are linear in their information, and that the resulting controller has piecewise linear dynamics dictated by the current effective delay regime.
Submitted 30 March, 2014;
originally announced March 2014.
-
Optimal Control with Noisy Time
Authors:
Andrew Lamperski,
Noah J. Cowan
Abstract:
This paper examines stochastic optimal control problems in which the state is perfectly known, but the controller's measure of time is a stochastic process derived from a strictly increasing Lévy process. We provide dynamic programming results for continuous-time finite-horizon control and specialize these results to solve a noisy-time variant of the linear quadratic regulator problem and a portfolio optimization problem with random trade activity rates. For the linear quadratic case, the optimal controller is linear and can be computed from a generalization of the classical Riccati differential equation.
Submitted 31 December, 2013;
originally announced January 2014.
-
The H2 Control Problem for Quadratically Invariant Systems with Delays
Authors:
Andrew Lamperski,
John C. Doyle
Abstract:
This paper gives a new solution to the output feedback H2 problem for quadratically invariant communication delay patterns. A characterization of all stabilizing controllers satisfying the delay constraints is given and the decentralized H2 problem is cast as a convex model matching problem. The main result shows that the model matching problem can be reduced to a finite-dimensional quadratic program. A recursive state-space method for computing the optimal controller based on vectorization is given.
Submitted 7 October, 2014; v1 submitted 30 December, 2013;
originally announced December 2013.
-
Optimal Decentralized State-Feedback Control with Sparsity and Delays
Authors:
Andrew Lamperski,
Laurent Lessard
Abstract:
This work presents the solution to a class of decentralized linear quadratic state-feedback control problems, in which the plant and controller must satisfy the same combination of delay and sparsity constraints. Using a novel decomposition of the noise history, the control problem is split into independent subproblems that are solved using dynamic programming. The approach presented herein both unifies and generalizes many existing results.
Submitted 23 November, 2014; v1 submitted 31 May, 2013;
originally announced June 2013.
-
Output Feedback H_2 Model Matching for Decentralized Systems with Delays
Authors:
Andrew Lamperski,
John C. Doyle
Abstract:
This paper gives a new solution to the output feedback H_2 model matching problem for a large class of delayed information sharing patterns. Existing methods for such problems typically reduce the decentralized problem to a centralized problem of higher state dimension. In contrast, the controller given in this paper is constructed from the solutions to the centralized control and estimation Riccati equations for the original system. The problem is solved by decomposing the controller into two components. One is centralized, but delayed, while the other is decentralized with finite impulse response (FIR). It is then shown that the optimal controller can be constructed through a combination of centralized spectral factorization and quadratic programming.
Submitted 17 September, 2012;
originally announced September 2012.