Online Model-free Safety Verification for Markov Decision Processes Without Safety Violation

Authors: Abhijit Mazumdar, Rafal Wisniewski, Manuela L. Bujorianu

Abstract: In this paper, we consider the problem of safety assessment for Markov decision processes without explicit knowledge of the model. We aim to learn probabilistic safety specifications associated with a given policy without compromising the safety of the process. To accomplish our goal, we characterize a subset of the state-space called proxy set, which contains the states that are near in a probabilistic sense to the forbidden set consisting of all unsafe states. We compute the safety function using the single-step temporal difference method. To this end, we relate the safety function computation to that of the value function estimation using temporal difference learning. Since the given control policy could be unsafe, we use a safe baseline subpolicy to generate data for learning. We then use an off-policy temporal difference learning method with importance sampling to learn the safety function corresponding to the given policy. Finally, we demonstrate our results using a numerical example.

Submitted 8 December, 2023; originally announced December 2023.

arXiv:2310.08019

Robust 1-bit Compressed Sensing with Iterative Hard Thresholding

Authors: Namiko Matsumoto, Arya Mazumdar

Abstract: In 1-bit compressed sensing, the aim is to estimate a $k$-sparse unit vector $x\in S^{n-1}$ within an $ε$ error (in $\ell_2$) from a minimal number of linear measurements that are quantized to just their signs, i.e., from measurements of the form $y = \mathrm{Sign}(\langle a, x\rangle)$. In this paper, we study a noisy version where a fraction of the measurements can be flipped, potentially by an adversary. In particular, we analyze the Binary Iterative Hard Thresholding (BIHT) algorithm, a proximal gradient descent on a properly defined loss function used for 1-bit compressed sensing, in this noisy setting. It is known from recent results that, with $\tilde{O}(\frac{k}{ε})$ noiseless measurements, BIHT provides an estimate within $ε$ error. This result is optimal and universal, meaning one set of measurements works for all sparse vectors. In this paper, we show that BIHT also provides better results than all known methods for the noisy setting. We show that when up to a $τ$-fraction of the sign measurements are incorrect (adversarial error), with the same number of measurements as before, BIHT agnostically provides an estimate of $x$ within an $\tilde{O}(ε+τ)$ error, maintaining the universality of measurements. This establishes stability of iterative hard thresholding in the presence of measurement error. To obtain the result, we use the restricted approximate invertibility of Gaussian matrices, as well as a tight analysis of the high-dimensional geometry of the adversarially corrupted measurements.

Submitted 11 October, 2023; originally announced October 2023.

Comments: Accepted to appear in ACM-SIAM Symposium on Discrete Algorithms (SODA) 2024

arXiv:2207.03427

Binary Iterative Hard Thresholding Converges with Optimal Number of Measurements for 1-Bit Compressed Sensing

Authors: Namiko Matsumoto, Arya Mazumdar

Abstract: Compressed sensing has been a very successful high-dimensional signal acquisition and recovery technique that relies on linear operations. However, the actual measurements of signals have to be quantized before storing or processing. 1(One)-bit compressed sensing is a heavily quantized version of compressed sensing, where each linear measurement of a signal is reduced to just one bit: the sign of the measurement. Once enough of such measurements are collected, the recovery problem in 1-bit compressed sensing aims to find the original signal with as much accuracy as possible. The recovery problem is related to the traditional "halfspace-learning" problem in learning theory. For recovery of sparse vectors, a popular reconstruction method from 1-bit measurements is the binary iterative hard thresholding (BIHT) algorithm. The algorithm is a simple projected sub-gradient descent method, and is known to converge well empirically, despite the nonconvexity of the problem. The convergence property of BIHT was not theoretically justified, except with an exorbitantly large number of measurements (i.e., a number of measurements greater than $\max\{k^{10}, 24^{48}, k^{3.5}/ε\}$, where $k$ is the sparsity, $ε$ denotes the approximation error, and even this expression hides other factors). In this paper we show that the BIHT algorithm converges with only $\tilde{O}(\frac{k}{ε})$ measurements. Note that this dependence on $k$ and $ε$ is optimal for any recovery method in 1-bit compressed sensing.
With this result, to the best of our knowledge, BIHT is the only practical and efficient (polynomial time) algorithm that requires the optimal number of measurements in all parameters (both $k$ and $ε$). This is also an example of a gradient descent algorithm converging to the correct solution for a nonconvex problem, under suitable structural conditions.

Submitted 7 July, 2022; originally announced July 2022.

Comments: To appear in FOCS 2022
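
Illustration (not part of the listing): the abstract above describes BIHT as a projected sub-gradient step followed by hard thresholding. A minimal NumPy sketch of that iteration might look as follows. The function names `hard_threshold` and `biht`, the step size `1/m`, the zero initialization, and the renormalization to the unit sphere after each step are assumptions made for this sketch; they are not taken from the paper, whose exact step size and analysis may differ.

```python
import numpy as np


def hard_threshold(v, k):
    """Keep the k largest-magnitude entries of v; set the rest to zero."""
    out = np.zeros_like(v)
    keep = np.argpartition(np.abs(v), -k)[-k:]
    out[keep] = v[keep]
    return out


def biht(A, y, k, iters=50, step=None):
    """Estimate a k-sparse unit vector x from sign measurements y = sign(A x).

    Hypothetical sketch: step size and normalization are illustrative choices.
    """
    m, n = A.shape
    if step is None:
        step = 1.0 / m  # assumed step size, not the paper's
    x = np.zeros(n)
    for _ in range(iters):
        # The residual is nonzero only where the current estimate's
        # signs disagree with the observed signs.
        residual = y - np.sign(A @ x)
        # Sub-gradient step, then projection onto k-sparse vectors.
        x = hard_threshold(x + step * (A.T @ residual), k)
        # Project back onto the unit sphere (skip if the iterate is zero).
        norm = np.linalg.norm(x)
        if norm > 0:
            x /= norm
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, k, m = 50, 3, 2000
    x_true = np.zeros(n)
    x_true[[4, 17, 31]] = [0.6, -0.64, 0.48]  # unit-norm, 3-sparse
    A = rng.standard_normal((m, n))           # Gaussian measurement matrix
    y = np.sign(A @ x_true)                   # 1-bit (sign) measurements
    x_hat = biht(A, y, k)
    print("l2 error:", np.linalg.norm(x_hat - x_true))
```

Because the sign measurements carry no information about the magnitude of the signal, the sketch renormalizes each iterate and compares against a unit-norm ground truth.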

arXiv:2110.00044

doi 10.1109/TAES.2022.3218496

Trajectory Planning with Deep Reinforcement Learning in High-Level Action Spaces

Authors: Kyle R. Williams, Rachel Schlossman, Daniel Whitten, Joe Ingram, Srideep Musuvathy, Anirudh Patel, James Pagan, Kyle A. Williams, Sam Green, Anirban Mazumdar, Julie Parish

Abstract: This paper presents a technique for trajectory planning based on continuously parameterized high-level actions (motion primitives) of variable duration. This technique leverages deep reinforcement learning (Deep RL) to formulate a policy which is suitable for real-time implementation. There is no separation of motion primitive generation and trajectory planning: each individual short-horizon motion is formed during the Deep RL training to achieve the full-horizon objective. Effectiveness of the technique is demonstrated numerically on a well-studied trajectory generation problem and a planning problem on a known obstacle-rich map. This paper also develops a new loss function term for policy-gradient-based Deep RL, which is analogous to an anti-windup mechanism in feedback control. We demonstrate that the inclusion of this new term in the underlying optimization increases the average policy return in our numerical example.

Submitted 12 August, 2022; v1 submitted 30 September, 2021; originally announced October 2021.

Journal ref: IEEE Transactions on Aerospace and Electronic Systems, 59 (2023) 2513-2529

arXiv:2008.02748

On Passivity, Feedback Passivity, And Feedback Passivity Over Erasure Network: A Piecewise Affine Approximation Approach

Authors: Abhijit Mazumdar, Srinivasan Krishnaswamy, Somanath Majhi

Abstract: In this paper, we deal with the problem of passivity and feedback passification of smooth discrete-time nonlinear systems by considering their piecewise affine approximations. Sufficient conditions are derived for passivity and feedback passivity. These results are then extended to systems that operate over Gilbert-Elliott type communication channels. As a special case, results for feedback passivity of piecewise affine systems over a lossy channel are also derived.

Submitted 6 August, 2020; originally announced August 2020.

arXiv:1911.00668

$H_{\infty}$ Optimal Control of Jump Systems Over Multiple Lossy Communication Channels

Authors: Abhijit Mazumdar, Srinivasan Krishnaswamy, Somanath Majhi

Abstract: In this paper, we consider the $H_{\infty}$ optimal control problem for a Markovian jump linear system (MJLS) over a lossy communication network. It is assumed that the controller communicates with each actuator through a different communication channel. We solve the $H_{\infty}$ optimization problem for a Transmission Control Protocol (TCP) using the theory of dynamic games and obtain a state-feedback controller. The infinite horizon $H_{\infty}$ optimization problem is analyzed as a limiting case of the finite horizon optimization problem. Then, we obtain the corresponding state-feedback controller, and show that it stabilizes the closed-loop system in the face of random packet dropouts.

Submitted 2 November, 2019; originally announced November 2019.

Showing 1–6 of 6 results for author: Mazumdar, A