
Log Barriers for Safe Black-box Optimization with Application to Safe Reinforcement Learning

Authors: Ilnura Usmanova, Yarden As, Maryam Kamgarpour, Andreas Krause

Abstract: Optimizing noisy functions online, when evaluating the objective requires experiments on a deployed system, is a crucial task arising in manufacturing, robotics, and many other domains. Often, constraints on safe inputs are unknown ahead of time, and we only obtain noisy information indicating how close we are to violating the constraints. Yet, safety must be guaranteed at all times, not only for the final output of the algorithm. We introduce a general approach for seeking a stationary point in high-dimensional non-linear stochastic optimization problems in which maintaining safety during learning is crucial. Our approach, called LB-SGD, is based on applying stochastic gradient descent (SGD) with a carefully chosen adaptive step size to a logarithmic barrier approximation of the original problem. We provide a complete convergence analysis of non-convex, convex, and strongly-convex smooth constrained problems, with first-order and zeroth-order feedback. Our approach yields efficient updates and scales better with dimensionality compared to existing approaches. We empirically compare the sample complexity and the computational cost of our method with existing safe learning approaches. Beyond synthetic benchmarks, we demonstrate the effectiveness of our approach on minimizing constraint violation in policy search tasks in safe reinforcement learning (RL).

Submitted 2 June, 2023; v1 submitted 21 July, 2022; originally announced July 2022.

Comments: 36 pages, 9 pages of appendix
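
The idea in the abstract above, gradient steps on a logarithmic barrier with a step size chosen so that iterates stay feasible, can be illustrated with a minimal sketch. This is not the LB-SGD algorithm or its step-size rule from the paper: it assumes exact (noise-free) gradients and a known smoothness constant `M` for the constraints, and the names `barrier_grad`, `safe_step`, and `barrier_descent`, as well as the toy problem, are hypothetical.

```python
import math

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def barrier_grad(x, grad_f0, cons, grad_cons, eta):
    """Gradient of B(x) = f0(x) - eta * sum_i log(-f_i(x)) at a strictly feasible x."""
    g = list(grad_f0(x))
    for f_i, grad_f_i in zip(cons, grad_cons):
        scale = eta / (-f_i(x))  # grows as x nears the boundary f_i = 0
        g = [gj + scale * cj for gj, cj in zip(g, grad_f_i(x))]
    return g

def safe_step(x, g, cons, grad_cons, M, t_max):
    """Step t <= t_max such that f_i(x - t*g) <= f_i(x)/2 < 0 for every M-smooth f_i.

    Assumption for illustration: a plain smoothness bound, not the paper's rule.
    """
    t = t_max
    g_sq = dot(g, g)
    for f_i, grad_f_i in zip(cons, grad_cons):
        slack = -f_i(x)
        theta = abs(dot(grad_f_i(x), g))
        t = min(t, slack / (theta + math.sqrt(theta**2 + M * g_sq * slack) + 1e-12))
    return t

def barrier_descent(x0, grad_f0, cons, grad_cons, M, eta=0.05, t_max=0.002, iters=1000):
    """Run gradient descent on the barrier and return every iterate."""
    xs = [list(x0)]
    for _ in range(iters):
        x = xs[-1]
        g = barrier_grad(x, grad_f0, cons, grad_cons, eta)
        t = safe_step(x, g, cons, grad_cons, M, t_max)
        xs.append([xj - t * gj for xj, gj in zip(x, g)])
    return xs

# Toy problem: minimize (x0-2)^2 + (x1-2)^2 subject to x0^2 + x1^2 - 1 <= 0.
grad_f0 = lambda x: [2 * (x[0] - 2), 2 * (x[1] - 2)]
cons = [lambda x: x[0]**2 + x[1]**2 - 1]
grad_cons = [lambda x: [2 * x[0], 2 * x[1]]]
trajectory = barrier_descent([0.0, 0.0], grad_f0, cons, grad_cons, M=2.0)
```

In this toy run every iterate stays strictly inside the unit disc, and the final point settles slightly inside the boundary, near the constrained minimizer; shrinking `eta` would move it closer to the boundary at the cost of a stiffer barrier.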

arXiv:2201.09802 [pdf, other]

Constrained Policy Optimization via Bayesian World Models

Authors: Yarden As, Ilnura Usmanova, Sebastian Curi, Andreas Krause

Abstract: Improving sample-efficiency and safety are crucial challenges when deploying reinforcement learning in high-stakes real-world applications. We propose LAMBDA, a novel model-based approach for policy optimization in safety-critical tasks modeled via constrained Markov decision processes. Our approach utilizes Bayesian world models, and harnesses the resulting uncertainty to maximize optimistic upper bounds on the task objective, as well as pessimistic upper bounds on the safety constraints. We demonstrate LAMBDA's state-of-the-art performance on the Safety-Gym benchmark suite in terms of sample efficiency and constraint violation.

Submitted 6 February, 2022; v1 submitted 24 January, 2022; originally announced January 2022.

arXiv:2111.03637 [pdf, other]

Risk-averse Heteroscedastic Bayesian Optimization

Authors: Anastasiia Makarova, Ilnura Usmanova, Ilija Bogunovic, Andreas Krause

Abstract: Many black-box optimization tasks arising in high-stakes applications require risk-averse decisions. The standard Bayesian optimization (BO) paradigm, however, optimizes the expected value only. We generalize BO to trade off the mean and the input-dependent variance of the objective, both of which we assume to be unknown a priori. In particular, we propose a novel risk-averse heteroscedastic Bayesian optimization algorithm (RAHBO) that aims to identify a solution with high return and low noise variance, while learning the noise distribution on the fly. To this end, we model both expectation and variance as (unknown) RKHS functions, and propose a novel risk-aware acquisition function. We bound the regret for our approach and provide a robust rule to report the final decision point for applications where only a single solution must be identified. We demonstrate the effectiveness of RAHBO on synthetic benchmark functions and hyperparameter tuning tasks.

Submitted 5 November, 2021; originally announced November 2021.

Journal ref: Advances in Neural Information Processing Systems, 2021

arXiv:2109.09835 [pdf, ps, other]

Fast Projection Onto Convex Smooth Constraints

Authors: Ilnura Usmanova, Maryam Kamgarpour, Andreas Krause, Kfir Yehuda Levy

Abstract: The Euclidean projection onto a convex set is an important problem that arises in numerous constrained optimization tasks. Unfortunately, in many cases, computing projections is computationally demanding. In this work, we focus on projection problems where the constraints are smooth and the number of constraints is significantly smaller than the dimension. The runtime of existing approaches to solving such problems is either cubic in the dimension or polynomial in the inverse of the target accuracy. Conversely, we propose a simple and efficient primal-dual approach, with a runtime that scales only linearly with the dimension, and only logarithmically in the inverse of the target accuracy. We empirically demonstrate its performance, and compare it with standard baselines.

Submitted 20 September, 2021; originally announced September 2021.

arXiv:1912.09478 [pdf, other]

Log Barriers for Safe Non-convex Black-box Optimization

Authors: Ilnura Usmanova, Andreas Krause, Maryam Kamgarpour

Abstract: We address the problem of minimizing a smooth function $f^0(x)$ over a compact set $D$ defined by smooth functional constraints $f^i(x)\leq 0,~ i = 1,\ldots, m$ given noisy value measurements of $f^i(x)$. This problem arises in safety-critical applications, where certain parameters need to be adapted online in a data-driven fashion, such as in personalized medicine, robotics, manufacturing, etc. In such cases, it is important to ensure constraints are not violated while taking measurements and seeking the minimum of the cost function. We propose a new algorithm, s0-LBM, which provides provably feasible iterates with high probability and applies to the challenging case of an uncertain zeroth-order oracle. We also analyze the convergence rate of the algorithm, and empirically demonstrate its effectiveness.

Submitted 19 December, 2019; originally announced December 2019.

Comments: under review

arXiv:1912.09466 [pdf, other]

Safe non-smooth black-box optimization with application to policy search

Authors: Ilnura Usmanova, Andreas Krause, Maryam Kamgarpour

Abstract: For safety-critical black-box optimization tasks, observations of the constraints and the objective are often noisy and available only for the feasible points. We propose an approach based on log barriers to find a local solution of a non-convex non-smooth black-box optimization problem $\min f^0(x)$ subject to $f^i(x)\leq 0,~ i = 1,\ldots, m$, while at the same time guaranteeing constraint satisfaction during learning with high probability. Our proposed algorithm exploits noisy observations to iteratively improve on an initial safe point until convergence. We derive the convergence rate and prove safety of our algorithm. We demonstrate its performance in an application to an iterative control design problem.

Submitted 23 February, 2021; v1 submitted 19 December, 2019; originally announced December 2019.

arXiv:1903.04626 [pdf, other]

Safe Convex Learning under Uncertain Constraints

Authors: Ilnura Usmanova, Andreas Krause, Maryam Kamgarpour

Abstract: We address the problem of minimizing a convex smooth function $f(x)$ over a compact polyhedral set $D$ given a stochastic zeroth-order constraint feedback model. This problem arises in safety-critical machine learning applications, such as personalized medicine and robotics. In such cases, one needs to ensure constraints are satisfied while exploring the decision space to find the optimum of the loss function. We propose a new variant of the Frank-Wolfe algorithm, which applies to the case of uncertain linear constraints. Using robust optimization, we provide the convergence rate of the algorithm while guaranteeing feasibility of all iterates, with high probability.

Submitted 9 December, 2019; v1 submitted 11 March, 2019; originally announced March 2019.

Comments: 15 pages, 7 figures, AISTATS 2019

arXiv:1509.01679 [pdf]

Nonlinear stochastic multiarmed bandit problems with inexact oracle

Authors: Alexander Gasnikov, Ekaterina Krymova, Anastasia Lagunovskaya, Ilnura Usmanova, Fedor Fedorenko

Abstract: In this paper we consider one-point and two-point multiarmed bandit problems. In other words, we consider online stochastic convex optimization problems with an oracle that returns the value (realization) of the function at one point or at two points. We allow these values to be inexact, but the level of noise should be small enough. We generalize well-known results to the inexact-oracle case. We also generalize classical results to prox-structures that differ from the Euclidean one.

Submitted 17 March, 2017; v1 submitted 5 September, 2015; originally announced September 2015.

Comments: 12 pages, in Russian, Avtomatika i Telemekhanika. 2017
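
The two-point oracle mentioned in the abstract above is commonly paired with a finite-difference gradient estimate along a random direction. The sketch below shows that standard estimator with Euclidean-sphere randomization; it is not taken from the paper, it assumes exact (noise-free) function values, and the name `two_point_gradient` and the toy quadratic are hypothetical.

```python
import math
import random

def two_point_gradient(f, x, delta, rng):
    """Estimate grad f(x) from two function values along a random unit direction u."""
    d = len(x)
    # Normalizing a Gaussian vector gives a direction uniform on the unit sphere.
    u = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(ui * ui for ui in u))
    u = [ui / norm for ui in u]
    x_plus = [xi + delta * ui for xi, ui in zip(x, u)]
    x_minus = [xi - delta * ui for xi, ui in zip(x, u)]
    # Central difference approximates the directional derivative along u.
    slope = (f(x_plus) - f(x_minus)) / (2.0 * delta)
    # The factor d compensates for E[u u^T] = I / d.
    return [d * slope * ui for ui in u]

# Toy check on f(x) = 0.5 * ||x||^2, whose gradient at x is x itself.
f = lambda x: 0.5 * sum(xi * xi for xi in x)
rng = random.Random(0)
x = [1.0, -2.0, 3.0]
n = 20000
avg = [0.0] * len(x)
for _ in range(n):
    g = two_point_gradient(f, x, 0.01, rng)
    avg = [a + gj / n for a, gj in zip(avg, g)]
```

A single estimate is noisy, but averaging many of them recovers the gradient on this quadratic, which is why such estimators can stand in for gradients inside stochastic first-order methods.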

arXiv:1508.02182 [pdf]

About accelerated randomized methods

Authors: Alexander Gasnikov, Pavel Dvurechensky, Ilnura Usmanova

Abstract: We show how one can obtain the nonaccelerated randomized coordinate descent method (Yu. Nesterov, 2010) and the nonaccelerated method of randomization of a sum-type functional (Le Roux-Schmidt-Bach, 2012) from the optimal method for the stochastic optimization problem (SIGMA, Devolder-Glineur-Nesterov-Dvurechensky-Gasnikov, 2014). The main trick is a special restart technique. We consider this trick to be useful in other contexts. We consider only the strongly convex case. We show that obtaining accelerated variants of these methods seems to be nontrivial in this context. That is, it is hard (perhaps impossible) to obtain accelerated variants using the same trick. We also propose a new approach for accelerated coordinate descent methods. This approach is based on the coupling technique (Allen-Zhu-Orecchia, 2015) and allows us: to generalize accelerated coordinate descent methods for conditional optimization problems, to obtain the dual solution due to the primal-dual nature, to extend the Universal method (Yu. Nesterov, 2013) to accelerated coordinate descent methods, etc.

Submitted 26 May, 2018; v1 submitted 10 August, 2015; originally announced August 2015.

Comments: 41 pages, in Russian in TRUDY MIPT. 2016. V. 8. no. 2. P. 67-100

arXiv:1412.3890 [pdf]

Gradient-free prox-methods with inexact oracle for stochastic convex optimization problems on a simplex

Authors: Alexander Gasnikov, Anastasia Lagunovskaya, Ilnura Usmanova, Fedor Fedorenko

Abstract: In this paper we show that Euclidean randomization in some situations (i.e. for gradient-free methods on a simplex) can be as good as randomization on the unit sphere in the 1-norm. That is, using the simplex example, we show that for gradient-free methods the choice of the prox-structure and the choice of the way of randomization have to be connected to each other. We demonstrate how this can be done in an optimal way. It is important that we consider an inexact oracle.

Submitted 17 April, 2016; v1 submitted 12 December, 2014; originally announced December 2014.

Comments: 26 pages, in Russian, Avtomatika i Telemekhanika. 2016

Showing 1–10 of 10 results for author: Usmanova, I