Search | arXiv e-print repository

Learning Synthetic Environments and Reward Networks for Reinforcement Learning

Authors: Fabio Ferreira, Thomas Nierhoff, Andreas Saelinger, Frank Hutter

Abstract: We introduce Synthetic Environments (SEs) and Reward Networks (RNs), represented by neural networks, as proxy environment models for training Reinforcement Learning (RL) agents. We show that an agent, after being trained exclusively on the SE, is able to solve the corresponding real environment. While an SE acts as a full proxy to a real environment by learning about its state dynamics and rewards… ▽ More We introduce Synthetic Environments (SEs) and Reward Networks (RNs), represented by neural networks, as proxy environment models for training Reinforcement Learning (RL) agents. We show that an agent, after being trained exclusively on the SE, is able to solve the corresponding real environment. While an SE acts as a full proxy to a real environment by learning about its state dynamics and rewards, an RN is a partial proxy that learns to augment or replace rewards. We use bi-level optimization to evolve SEs and RNs: the inner loop trains the RL agent, and the outer loop trains the parameters of the SE / RN via an evolution strategy. We evaluate our proposed new concept on a broad range of RL algorithms and classic control environments. In a one-to-one comparison, learning an SE proxy requires more interactions with the real environment than training agents only on the real environment. However, once such an SE has been learned, we do not need any interactions with the real environment to train new agents. Moreover, the learned SE proxies allow us to train agents with fewer interactions while maintaining the original task performance. Our empirical results suggest that SEs achieve this result by learning informed representations that bias the agents towards relevant states. Moreover, we find that these proxies are robust against hyperparameter variation and can also transfer to unseen agents. △ Less

Submitted 6 February, 2022; originally announced February 2022.

Journal ref: International Conference on Learning Representations (ICLR 2022)

arXiv:1510.03362 [pdf, other]

On the Smoothness of Paging Algorithms

Authors: Jan Reineke, Alejandro Salinger

Abstract: We study the smoothness of paging algorithms. How much can the number of page faults increase due to a perturbation of the request sequence? We call a paging algorithm smooth if the maximal increase in page faults is proportional to the number of changes in the request sequence. We also introduce quantitative smoothness notions that measure the smoothness of an algorithm. We derive lower and upper… ▽ More We study the smoothness of paging algorithms. How much can the number of page faults increase due to a perturbation of the request sequence? We call a paging algorithm smooth if the maximal increase in page faults is proportional to the number of changes in the request sequence. We also introduce quantitative smoothness notions that measure the smoothness of an algorithm. We derive lower and upper bounds on the smoothness of deterministic and randomized demand-paging and competitive algorithms. Among strongly-competitive deterministic algorithms LRU matches the lower bound, while FIFO matches the upper bound. Well-known randomized algorithms like Partition, Equitable, or Mark are shown not to be smooth. We introduce two new randomized algorithms, called Smoothed-LRU and LRU-Random. Smoothed- LRU allows to sacrifice competitiveness for smoothness, where the trade-off is controlled by a parameter. LRU-Random is at least as competitive as any deterministic algorithm while smoother. △ Less

Submitted 12 October, 2015; originally announced October 2015.

Comments: Full version of paper presented at WAOA 2015

ACM Class: C.3; F.1.2; F.2.2

arXiv:1411.7359 [pdf, other]

Algorithms in the Ultra-Wide Word Model

Authors: Arash Farzan, Alejandro López-Ortiz, Patrick K. Nicholson, Alejandro Salinger

Abstract: The effective use of parallel computing resources to speed up algorithms in current multi-core parallel architectures remains a difficult challenge, with ease of programming playing a key role in the eventual success of various parallel architectures. In this paper we consider an alternative view of parallelism in the form of an ultra-wide word processor. We introduce the Ultra-Wide Word architect… ▽ More The effective use of parallel computing resources to speed up algorithms in current multi-core parallel architectures remains a difficult challenge, with ease of programming playing a key role in the eventual success of various parallel architectures. In this paper we consider an alternative view of parallelism in the form of an ultra-wide word processor. We introduce the Ultra-Wide Word architecture and model, an extension of the word-RAM model that allows for constant time operations on thousands of bits in parallel. Word parallelism as exploited by the word-RAM model does not suffer from the more difficult aspects of parallel programming, namely synchronization and concurrency. For the standard word-RAM algorithms, the speedups obtained are moderate, as they are limited by the word size. We argue that a large class of word-RAM algorithms can be implemented in the Ultra-Wide Word model, obtaining speedups comparable to multi-threaded computations while kee** the simplicity of programming of the sequential RAM model. We show that this is the case by describing implementations of Ultra-Wide Word algorithms for dynamic programming and string searching. In addition, we show that the Ultra-Wide Word model can be used to implement a nonstandard memory architecture, which enables the sidestep** of lower bounds of important data structure problems such as priority queues and dynamic prefix sums. While similar ideas about operating on large words have been mentioned before in the context of multimedia processors [Thorup 2003], it is only recently that an architecture like the one we propose has become feasible and that details can be worked out. △ Less

Submitted 7 December, 2014; v1 submitted 26 November, 2014; originally announced November 2014.

Comments: 28 pages, 5 figures; minor changes

ACM Class: F.1.1; F.1.2; F.2.2

arXiv:1205.3952 [pdf, other]

Automating embedded analysis capabilities and managing software complexity in multiphysics simulation part II: application to partial differential equations

Authors: Roger P. Pawlowski, Eric T. Phipps, Andrew G. Salinger, Steven J. Owen, Christopher M. Siefert, Matthew L. Staten

Abstract: A template-based generic programming approach was presented in a previous paper that separates the development effort of programming a physical model from that of computing additional quantities, such as derivatives, needed for embedded analysis algorithms. In this paper, we describe the implementation details for using the template-based generic programming approach for simulation and analysis of… ▽ More A template-based generic programming approach was presented in a previous paper that separates the development effort of programming a physical model from that of computing additional quantities, such as derivatives, needed for embedded analysis algorithms. In this paper, we describe the implementation details for using the template-based generic programming approach for simulation and analysis of partial differential equations (PDEs). We detail several of the hurdles that we have encountered, and some of the software infrastructure developed to overcome them. We end with a demonstration where we present shape optimization and uncertainty quantification results for a 3D PDE application. △ Less

Submitted 17 May, 2012; originally announced May 2012.

arXiv:1205.0790 [pdf, other]

Automating embedded analysis capabilities and managing software complexity in multiphysics simulation part I: template-based generic programming

Authors: Roger P. Pawlowski, Eric T. Phipps, Andrew G. Salinger

Abstract: An approach for incorporating embedded simulation and analysis capabilities in complex simulation codes through template-based generic programming is presented. This approach relies on templating and operator overloading within the C++ language to transform a given calculation into one that can compute a variety of additional quantities that are necessary for many state-of-the-art simulation and a… ▽ More An approach for incorporating embedded simulation and analysis capabilities in complex simulation codes through template-based generic programming is presented. This approach relies on templating and operator overloading within the C++ language to transform a given calculation into one that can compute a variety of additional quantities that are necessary for many state-of-the-art simulation and analysis algorithms. An approach for incorporating these ideas into complex simulation codes through general graph-based assembly is also presented. These ideas have been implemented within a set of packages in the Trilinos framework and are demonstrated on a simple problem from chemical engineering. △ Less

Submitted 15 May, 2012; v1 submitted 3 May, 2012; originally announced May 2012.

Showing 1–5 of 5 results for author: Saelinger, A