Search | arXiv e-print repository

Physical Derivatives: Computing policy gradients by physical forward-propagation

Authors: Arash Mehrjou, Ashkan Soleymani, Stefan Bauer, Bernhard Schölkopf

Abstract: Model-free and model-based reinforcement learning are two ends of a spectrum. Learning a good policy without a dynamic model can be prohibitively expensive. Learning the dynamic model of a system can reduce the cost of learning the policy, but it can also introduce bias if it is not accurate. We propose a middle ground where instead of the transition model, the sensitivity of the trajectories with respect to the perturbation of the parameters is learned. This allows us to predict the local behavior of the physical system around a set of nominal policies without knowing the actual model. We assay our method on a custom-built physical robot in extensive experiments and show the feasibility of the approach in practice. We investigate potential challenges when applying our method to physical systems and propose solutions to each of them.

Submitted 15 January, 2022; originally announced January 2022.

arXiv:2107.03770 [pdf, other]

Federated Learning as a Mean-Field Game

Authors: Arash Mehrjou

Abstract: We establish a connection between federated learning, a concept from machine learning, and mean-field games, a concept from game theory and control theory. In this analogy, the local federated learners are considered as the players and the aggregation of the gradients in a central server is the mean-field effect. We present federated learning as a differential game and discuss the properties of the equilibrium of this game. We hope this novel view to federated learning brings together researchers from these two distinct areas to work on fundamental problems of large-scale distributed and privacy-preserving learning algorithms.

Submitted 8 July, 2021; originally announced July 2021.

arXiv:1910.14428

Kernel-Guided Training of Implicit Generative Models with Stability Guarantees

Authors: Arash Mehrjou, Wittawat Jitkrittum, Krikamol Muandet, Bernhard Schölkopf

Abstract: Modern implicit generative models such as generative adversarial networks (GANs) are generally known to suffer from issues such as instability, uninterpretability, and difficulty in assessing their performance. If we see these implicit models as dynamical systems, some of these issues are caused by being unable to control their behavior in a meaningful way during the course of training. In this work, we propose a theoretically grounded method to guide the training trajectories of GANs by augmenting the GAN loss function with a kernel-based regularization term that controls local and global discrepancies between the model and true distributions. This control signal allows us to inject prior knowledge into the model. We provide theoretical guarantees on the stability of the resulting dynamical system and demonstrate different aspects of it via a wide range of experiments.

Submitted 3 November, 2019; v1 submitted 29 October, 2019; originally announced October 2019.

Comments: There was a misunderstanding in how an article should be updated on arXiv. We have withdrawn this article from this link. The same article can be found at arXiv:1901.09206

arXiv:1901.08403 [pdf, other]

Deep Lyapunov Function: Automatic Stability Analysis for Dynamical Systems

Authors: Arash Mehrjou, Bernhard Schölkopf

Abstract: Stability analysis plays a crucial role in studying the behavior of dynamical systems with theoretical and engineering applications. Among various kinds of stability, the stability of equilibrium points is of the greatest importance which is mainly studied by Lyapunov's stability theory. This theory requires finding a function with specified properties. Except for a few simple examples, there is no straightforward constructive algorithm to find a Lyapunov function for an arbitrary dynamical system. The goal of this work is proposing a simple yet effective way to approximate this function using deep learning tools.

Submitted 24 January, 2019; originally announced January 2019.

arXiv:1805.10615 [pdf, other]

A Local Information Criterion for Dynamical Systems

Authors: Arash Mehrjou, Friedrich Solowjow, Sebastian Trimpe, Bernhard Schölkopf

Abstract: Encoding a sequence of observations is an essential task with many applications. The encoding can become highly efficient when the observations are generated by a dynamical system. A dynamical system imposes regularities on the observations that can be leveraged to achieve a more efficient code. We propose a method to encode a given or learned dynamical system. Apart from its application for encoding a sequence of observations, we propose to use the compression achieved by this encoding as a criterion for model selection. Given a dataset, different learning algorithms result in different models. But not all learned models are equally good. We show that the proposed encoding approach can be used to choose the learned model which is closer to the true underlying dynamics. We provide experiments for both encoding and model selection, and theoretical results that shed light on why the approach works.

Submitted 27 May, 2018; originally announced May 2018.

Showing 1–5 of 5 results for author: Mehrjou, A