Search | arXiv e-print repository

A Pontryagin Perspective on Reinforcement Learning

Authors: Onno Eberhard, Claire Vernade, Michael Muehlebach

Abstract: Reinforcement learning has traditionally focused on learning state-dependent policies to solve optimal control problems in a closed-loop fashion. In this work, we introduce the paradigm of open-loop reinforcement learning where a fixed action sequence is learned instead. We present three new algorithms: one robust model-based method and two sample-efficient model-free methods. Rather than basing our algorithms on Bellman's equation from dynamic programming, our work builds on Pontryagin's principle from the theory of open-loop optimal control. We provide convergence guarantees and evaluate all methods empirically on a pendulum swing-up task, as well as on two high-dimensional MuJoCo tasks, demonstrating remarkable performance compared to existing baselines.

Submitted 28 May, 2024; originally announced May 2024.

arXiv:2405.10618 [pdf, other]

Distributed Event-Based Learning via ADMM

Authors: Guner Dilsad Er, Sebastian Trimpe, Michael Muehlebach

Abstract: We consider a distributed learning problem, where agents minimize a global objective function by exchanging information over a network. Our approach has two distinct features: (i) It substantially reduces communication by triggering communication only when necessary, and (ii) it is agnostic to the data-distribution among the different agents. We can therefore guarantee convergence even if the local data-distributions of the agents are arbitrarily distinct. We analyze the convergence rate of the algorithm and derive accelerated convergence rates in a convex setting. We also characterize the effect of communication drops and demonstrate that our algorithm is robust to communication failures. The article concludes by presenting numerical results from a distributed LASSO problem, and distributed learning tasks on MNIST and CIFAR-10 datasets. The experiments underline communication savings of 50% or more due to the event-based communication strategy, show resilience towards heterogeneous data-distributions, and highlight that our approach outperforms common baselines such as FedAvg, FedProx, and FedADMM.

Submitted 17 May, 2024; originally announced May 2024.

Comments: 29 pages, 12 figures

arXiv:2404.04355 [pdf, other]

Gray-Box Nonlinear Feedback Optimization

Authors: Zhiyu He, Saverio Bolognani, Michael Muehlebach, Florian Dörfler

Abstract: Feedback optimization enables autonomous optimality seeking of a dynamical system through its closed-loop interconnection with iterative optimization algorithms. Among various iteration structures, model-based approaches require the input-output sensitivity of the system to construct gradients, whereas model-free approaches bypass this need by estimating gradients from real-time evaluations of the objective. These approaches offer complementary benefits in sample efficiency and accuracy against model mismatch, i.e., errors of sensitivities. To achieve the best of both worlds, we propose gray-box feedback optimization controllers, featuring systematic incorporation of approximate sensitivities into model-free updates via adaptive convex combination. We quantify conditions on the accuracy of the sensitivities that render the gray-box approach preferable. We elucidate how the closed-loop performance is determined by the number of iterations, the problem dimension, and the cumulative effect of inaccurate sensitivities. The proposed controller contributes to a balanced closed-loop behavior, which retains provable sample efficiency and optimality guarantees for nonconvex problems. We further develop a running gray-box controller to handle constrained time-varying problems with changing objectives and steady-state maps.

Submitted 5 April, 2024; originally announced April 2024.

arXiv:2403.12859 [pdf, other]

Primal Methods for Variational Inequality Problems with Functional Constraints

Authors: Liang Zhang, Niao He, Michael Muehlebach

Abstract: Constrained variational inequality problems are recognized for their broad applications across various fields including machine learning and operations research. First-order methods have emerged as the standard approach for solving these problems due to their simplicity and scalability. However, they typically rely on projection or linear minimization oracles to navigate the feasible set, which becomes computationally expensive in practical scenarios featuring multiple functional constraints. Existing efforts to tackle such functional constrained variational inequality problems have centered on primal-dual algorithms grounded in the Lagrangian function. These algorithms along with their theoretical analysis often require the existence and prior knowledge of the optimal Lagrange multipliers. In this work, we propose a simple primal method, termed Constrained Gradient Method (CGM), for addressing functional constrained variational inequality problems, without necessitating any information on the optimal Lagrange multipliers. We establish a non-asymptotic convergence analysis of the algorithm for variational inequality problems with monotone operators under smooth constraints. Remarkably, our algorithms match the complexity of projection-based methods in terms of operator queries for both monotone and strongly monotone settings, while utilizing significantly cheaper oracles based on quadratic programming. Furthermore, we provide several numerical examples to evaluate the efficacy of our algorithms.

Submitted 19 March, 2024; originally announced March 2024.

arXiv:2401.14029 [pdf, other]

doi 10.1109/LCSYS.2024.3406943

Towards a Systems Theory of Algorithms

Authors: Florian Dörfler, Zhiyu He, Giuseppe Belgioioso, Saverio Bolognani, John Lygeros, Michael Muehlebach

Abstract: Traditionally, numerical algorithms are seen as isolated pieces of code confined to an in silico existence. However, this perspective is not appropriate for many modern computational approaches in control, learning, or optimization, wherein in vivo algorithms interact with their environment. Examples of such open algorithms include various real-time optimization-based control strategies, reinforcement learning, decision-making architectures, online optimization, and many more. Further, even closed algorithms in learning or optimization are increasingly abstracted in block diagrams with interacting dynamic modules and pipelines. In this opinion paper, we state our vision on a to-be-cultivated systems theory of algorithms and argue in favor of viewing algorithms as open dynamical systems interacting with other algorithms, physical systems, humans, or databases. Remarkably, the manifold tools developed under the umbrella of systems theory are well suited for addressing a range of challenges in the algorithmic domain. We survey various instances where the principles of algorithmic systems theory are being developed and outline pertinent modeling, analysis, and design challenges.

Submitted 30 April, 2024; v1 submitted 25 January, 2024; originally announced January 2024.

arXiv:2306.03655 [pdf, other]

Online Learning under Adversarial Nonlinear Constraints

Authors: Pavel Kolev, Georg Martius, Michael Muehlebach

Abstract: In many applications, learning systems are required to process continuous non-stationary data streams. We study this problem in an online learning framework and propose an algorithm that can deal with adversarial time-varying and nonlinear constraints. As we show in our work, the algorithm called Constraint Violation Velocity Projection (CVV-Pro) achieves $\sqrt{T}$ regret and converges to the feasible set at a rate of $1/\sqrt{T}$, despite the fact that the feasible set is slowly time-varying and a priori unknown to the learner. CVV-Pro only relies on local sparse linear approximations of the feasible set and therefore avoids optimizing over the entire set at each iteration, which is in sharp contrast to projected gradients or Frank-Wolfe methods. We also empirically evaluate our algorithm on two-player games, where the players are subjected to a shared constraint.

Submitted 13 October, 2023; v1 submitted 6 June, 2023; originally announced June 2023.

Comments: NeurIPS 2023

arXiv:2305.08536 [pdf, other]

A Dynamical Systems Perspective on Discrete Optimization

Authors: Tong Guanchun, Michael Muehlebach

Abstract: We discuss a dynamical systems perspective on discrete optimization. Departing from the fact that many combinatorial optimization problems can be reformulated as finding low energy spin configurations in corresponding Ising models, we derive a penalized rank-two relaxation of the Ising formulation. It turns out that the associated gradient flow dynamics exactly correspond to a type of hardware solver termed oscillator-based Ising machines. We also analyze the advantage of adding angle penalties by leveraging random rounding techniques. Therefore, our work contributes to a rigorous understanding of oscillator-based Ising machines by drawing connections to the penalty method in constrained optimization and providing a rationale for the introduction of sub-harmonic injection locking. Furthermore, we characterize a class of coupling functions between oscillators, which ensures convergence to discrete solutions. This class of coupling functions avoids explicit penalty terms or rounding schemes, which are prevalent in other formulations.

Submitted 15 May, 2023; originally announced May 2023.

arXiv:2303.09261 [pdf, other]

Orthogonal Directions Constrained Gradient Method: from non-linear equality constraints to Stiefel manifold

Authors: Sholom Schechtman, Daniil Tiapkin, Michael Muehlebach, Eric Moulines

Abstract: We consider the problem of minimizing a non-convex function over a smooth manifold $\mathcal{M}$. We propose a novel algorithm, the Orthogonal Directions Constrained Gradient Method (ODCGM), which only requires computing a projection onto a vector space. ODCGM is infeasible but the iterates are constantly pulled towards the manifold, ensuring the convergence of ODCGM towards $\mathcal{M}$. ODCGM is much simpler to implement than the classical methods which require the computation of a retraction. Moreover, we show that ODCGM exhibits the near-optimal oracle complexities $\mathcal{O}(1/\varepsilon^2)$ and $\mathcal{O}(1/\varepsilon^4)$ in the deterministic and stochastic cases, respectively. Furthermore, we establish that, under an appropriate choice of the projection metric, our method recovers the landing algorithm of Ablin and Peyré (2022), a recently introduced algorithm for optimization over the Stiefel manifold. As a result, we significantly extend the analysis of Ablin and Peyré (2022), establishing near-optimal rates both in deterministic and stochastic frameworks. Finally, we perform numerical experiments which show the efficiency of ODCGM in a high-dimensional setting.

Submitted 16 March, 2023; originally announced March 2023.

arXiv:2302.00316 [pdf, other]

Accelerated First-Order Optimization under Nonlinear Constraints

Authors: Michael Muehlebach, Michael I. Jordan

Abstract: We exploit analogies between first-order algorithms for constrained optimization and non-smooth dynamical systems to design a new class of accelerated first-order algorithms for constrained optimization. Unlike Frank-Wolfe or projected gradients, these algorithms avoid optimization over the entire feasible set at each iteration. We prove convergence to stationary points even in a nonconvex setting and we derive accelerated rates for the convex setting both in continuous time, as well as in discrete time. An important property of these algorithms is that constraints are expressed in terms of velocities instead of positions, which naturally leads to sparse, local and convex approximations of the feasible set (even if the feasible set is nonconvex). Thus, the complexity tends to grow mildly in the number of decision variables and in the number of constraints, which makes the algorithms suitable for machine learning applications. We apply our algorithms to a compressed sensing and a sparse regression problem, showing that we can treat nonconvex $\ell^p$ constraints ($p<1$) efficiently, while recovering state-of-the-art performance for $p=1$.

Submitted 2 January, 2024; v1 submitted 1 February, 2023; originally announced February 2023.

Comments: 44 pages, 6 figures

arXiv:2206.02953 [pdf, other]

Sampling without Replacement Leads to Faster Rates in Finite-Sum Minimax Optimization

Authors: Aniket Das, Bernhard Schölkopf, Michael Muehlebach

Abstract: We analyze the convergence rates of stochastic gradient algorithms for smooth finite-sum minimax optimization and show that, for many such algorithms, sampling the data points without replacement leads to faster convergence compared to sampling with replacement. For the smooth and strongly convex-strongly concave setting, we consider gradient descent ascent and the proximal point method, and present a unified analysis of two popular without-replacement sampling strategies, namely Random Reshuffling (RR), which shuffles the data every epoch, and Single Shuffling or Shuffle Once (SO), which shuffles only at the beginning. We obtain tight convergence rates for RR and SO and demonstrate that these strategies lead to faster convergence than uniform sampling. Moving beyond convexity, we obtain similar results for smooth nonconvex-nonconcave objectives satisfying a two-sided Polyak-Łojasiewicz inequality. Finally, we demonstrate that our techniques are general enough to analyze the effect of data-ordering attacks, where an adversary manipulates the order in which data points are supplied to the optimizer. Our analysis also recovers tight rates for the incremental gradient method, where the data points are not shuffled at all.

Submitted 10 October, 2022; v1 submitted 6 June, 2022; originally announced June 2022.

Comments: 36th Conference on Neural Information Processing Systems (NeurIPS 2022)
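
Note: the sampling strategies named in the abstract above differ only in how the data indices are ordered in each epoch. The following is a minimal Python sketch of those orderings, not code from the paper; the function name `index_stream`, the strategy labels, and the `seed` parameter are illustrative assumptions.

```python
import random
from typing import Iterator, List


def index_stream(n: int, epochs: int, strategy: str, seed: int = 0) -> Iterator[List[int]]:
    """Yield one list of n data indices per epoch (hypothetical helper).

    strategy:
      "RR"      - Random Reshuffling: new permutation every epoch
      "SO"      - Shuffle Once: one permutation, reused every epoch
      "IG"      - incremental gradient ordering: never shuffled
      "uniform" - sampling with replacement (indices may repeat)
    """
    if strategy not in ("RR", "SO", "IG", "uniform"):
        raise ValueError(f"unknown strategy: {strategy}")
    rng = random.Random(seed)
    order = list(range(n))
    if strategy == "SO":
        rng.shuffle(order)  # shuffle only at the beginning
    for _ in range(epochs):
        if strategy == "RR":
            rng.shuffle(order)  # shuffle the data every epoch
        if strategy == "uniform":
            yield [rng.randrange(n) for _ in range(n)]
        else:
            yield list(order)  # copy so callers cannot mutate our state


if __name__ == "__main__":
    for name in ("RR", "SO", "IG", "uniform"):
        print(name, list(index_stream(5, 2, name)))
```

Under RR, SO, and IG every data point is visited exactly once per epoch; only uniform sampling can skip or repeat points within an epoch.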

arXiv:2107.08225 [pdf, other]

On Constraints in First-Order Optimization: A View from Non-Smooth Dynamical Systems

Authors: Michael Muehlebach, Michael I. Jordan

Abstract: We introduce a class of first-order methods for smooth constrained optimization that are based on an analogy to non-smooth dynamical systems. Two distinctive features of our approach are that (i) projections or optimizations over the entire feasible set are avoided, in stark contrast to projected gradient methods or the Frank-Wolfe method, and (ii) iterates are allowed to become infeasible, which differs from active set or feasible direction methods, where the descent motion stops as soon as a new constraint is encountered. The resulting algorithmic procedure is simple to implement even when constraints are nonlinear, and is suitable for large-scale constrained optimization problems in which the feasible set fails to have a simple structure. The key underlying idea is that constraints are expressed in terms of velocities instead of positions, which has the algorithmic consequence that optimizations over feasible sets at each iteration are replaced with optimizations over local, sparse convex approximations. In particular, this means that at each iteration only constraints that are violated are taken into account. The result is a simplified suite of algorithms and an expanded range of possible applications in machine learning.

Submitted 5 November, 2022; v1 submitted 17 July, 2021; originally announced July 2021.

Comments: 47 pages, 11 figures

arXiv:2002.12493 [pdf, other]

Optimization with Momentum: Dynamical, Control-Theoretic, and Symplectic Perspectives

Authors: Michael Muehlebach, Michael I. Jordan

Abstract: We analyze the convergence rate of various momentum-based optimization algorithms from a dynamical systems point of view. Our analysis exploits fundamental topological properties, such as the continuous dependence of iterates on their initial conditions, to provide a simple characterization of convergence rates. In many cases, closed-form expressions are obtained that relate algorithm parameters to the convergence rate. The analysis encompasses discrete time and continuous time, as well as time-invariant and time-variant formulations, and is not limited to a convex or Euclidean setting. In addition, the article rigorously establishes why symplectic discretization schemes are important for momentum-based optimization algorithms, and provides a characterization of algorithms that exhibit accelerated convergence.

Submitted 12 April, 2021; v1 submitted 27 February, 2020; originally announced February 2020.

Comments: 30 pages; 20 pages appendix and references

arXiv:2002.03546 [pdf, ps, other]

Continuous-time Lower Bounds for Gradient-based Algorithms

Authors: Michael Muehlebach, Michael I. Jordan

Abstract: This article derives lower bounds on the convergence rate of continuous-time gradient-based optimization algorithms. The algorithms are subjected to a time-normalization constraint that avoids a reparametrization of time in order to make the discussion of continuous-time convergence rates meaningful. We reduce the multi-dimensional problem to a single dimension, recover well-known lower bounds from the discrete-time setting, and provide insight into why these lower bounds occur. We present algorithms that achieve the proposed lower bounds, even when the function class under consideration includes certain nonconvex functions.

Submitted 3 August, 2020; v1 submitted 10 February, 2020; originally announced February 2020.

Comments: 13 pages

arXiv:1905.07436 [pdf, other]

A Dynamical Systems Perspective on Nesterov Acceleration

Authors: Michael Muehlebach, Michael I. Jordan

Abstract: We present a dynamical system framework for understanding Nesterov's accelerated gradient method. In contrast to earlier work, our derivation does not rely on a vanishing step size argument. We show that Nesterov acceleration arises from discretizing an ordinary differential equation with a semi-implicit Euler integration scheme. We analyze both the underlying differential equation as well as the discretization to obtain insights into the phenomenon of acceleration. The analysis suggests that a curvature-dependent damping term lies at the heart of the phenomenon. We further establish connections between the discretized and the continuous-time dynamics.

Submitted 17 May, 2019; originally announced May 2019.

Comments: 11 pages, 4 figures, to appear in the Proceedings of the 36th International Conference on Machine Learning

arXiv:1608.08823 [pdf, other]

Approximation of Continuous-Time Infinite-Horizon Optimal Control Problems Arising in Model Predictive Control - Supplementary Notes

Authors: Michael Muehlebach, Raffaello D'Andrea

Abstract: These notes present preliminary results regarding two different approximations of linear infinite-horizon optimal control problems arising in model predictive control. Input and state trajectories are parametrized with basis functions and a finite dimensional representation of the dynamics is obtained via a Galerkin approach. It is shown that the two approximations provide lower, respectively upper bounds on the optimal cost of the underlying infinite dimensional optimal control problem. These bounds get tighter as the number of basis functions is increased. In addition, conditions guaranteeing convergence to the cost of the underlying problem are provided.

Submitted 31 August, 2016; originally announced August 2016.

Comments: Supplementary notes, 10 pages

Showing 1–15 of 15 results for author: Muehlebach, M