Showing 1–21 of 21 results for author: Genewein, T

  1. arXiv:2402.04494

    cs.LG cs.AI stat.ML

    Grandmaster-Level Chess Without Search

    Authors: Anian Ruoss, Grégoire Delétang, Sourabh Medapati, Jordi Grau-Moya, Li Kevin Wenliang, Elliot Catt, John Reid, Tim Genewein

    Abstract: The recent breakthrough successes in machine learning are mainly attributed to scale: namely large-scale attention-based architectures and datasets of unprecedented scale. This paper investigates the impact of training at scale for chess. Unlike traditional chess engines that rely on complex heuristics, explicit search, or a combination of both, we train a 270M parameter transformer model with sup…

    Submitted 6 February, 2024; originally announced February 2024.

  2. arXiv:2401.14953

    cs.LG cs.AI

    Learning Universal Predictors

    Authors: Jordi Grau-Moya, Tim Genewein, Marcus Hutter, Laurent Orseau, Grégoire Delétang, Elliot Catt, Anian Ruoss, Li Kevin Wenliang, Christopher Mattern, Matthew Aitchison, Joel Veness

    Abstract: Meta-learning has emerged as a powerful approach to train neural networks to learn new tasks quickly from limited data. Broad exposure to different tasks leads to versatile representations enabling general problem solving. But, what are the limits of meta-learning? In this work, we explore the potential of amortizing the most powerful universal predictor, namely Solomonoff Induction (SI), into neu…

    Submitted 26 January, 2024; originally announced January 2024.

    Comments: 32 pages, 11 figures

  3. arXiv:2309.10668

    cs.LG cs.AI cs.CL cs.IT

    Language Modeling Is Compression

    Authors: Grégoire Delétang, Anian Ruoss, Paul-Ambroise Duquenne, Elliot Catt, Tim Genewein, Christopher Mattern, Jordi Grau-Moya, Li Kevin Wenliang, Matthew Aitchison, Laurent Orseau, Marcus Hutter, Joel Veness

    Abstract: It has long been established that predictive models can be transformed into lossless compressors and vice versa. Incidentally, in recent years, the machine learning community has focused on training increasingly large and powerful self-supervised (language) models. Since these large language models exhibit impressive predictive capabilities, they are well-positioned to be strong compressors. In th…

    Submitted 18 March, 2024; v1 submitted 19 September, 2023; originally announced September 2023.

  4. arXiv:2305.16843

    cs.LG cs.AI cs.CL stat.ML

    Randomized Positional Encodings Boost Length Generalization of Transformers

    Authors: Anian Ruoss, Grégoire Delétang, Tim Genewein, Jordi Grau-Moya, Róbert Csordás, Mehdi Bennani, Shane Legg, Joel Veness

    Abstract: Transformers have impressive generalization capabilities on tasks with a fixed context length. However, they fail to generalize to sequences of arbitrary length, even for seemingly simple tasks such as duplicating a string. Moreover, simply training on longer sequences is inefficient due to the quadratic computation complexity of the global attention mechanism. In this work, we demonstrate that th…

    Submitted 26 May, 2023; originally announced May 2023.

  5. arXiv:2302.03067

    cs.LG cs.AI stat.ML

    Memory-Based Meta-Learning on Non-Stationary Distributions

    Authors: Tim Genewein, Grégoire Delétang, Anian Ruoss, Li Kevin Wenliang, Elliot Catt, Vincent Dutordoir, Jordi Grau-Moya, Laurent Orseau, Marcus Hutter, Joel Veness

    Abstract: Memory-based meta-learning is a technique for approximating Bayes-optimal predictors. Under fairly general conditions, minimizing sequential prediction error, measured by the log loss, leads to implicit meta-learning. The goal of this work is to investigate how far this interpretation can be realized by current sequence prediction models and training regimes. The focus is on piecewise stationary s…

    Submitted 25 May, 2023; v1 submitted 6 February, 2023; originally announced February 2023.

  6. arXiv:2209.15618

    cs.AI cs.LG

    Beyond Bayes-optimality: meta-learning what you know you don't know

    Authors: Jordi Grau-Moya, Grégoire Delétang, Markus Kunesch, Tim Genewein, Elliot Catt, Kevin Li, Anian Ruoss, Chris Cundy, Joel Veness, Jane Wang, Marcus Hutter, Christopher Summerfield, Shane Legg, Pedro Ortega

    Abstract: Meta-training agents with memory has been shown to culminate in Bayes-optimal agents, which casts Bayes-optimality as the implicit solution to a numerical optimization problem rather than an explicit modeling assumption. Bayes-optimal agents are risk-neutral, since they solely attune to the expected return, and ambiguity-neutral, since they act in new situations as if the uncertainty were known. T…

    Submitted 12 October, 2022; v1 submitted 30 September, 2022; originally announced September 2022.

    Comments: 33 pages, 8 figures, technical report

  7. arXiv:2207.02098

    cs.LG cs.AI cs.CL cs.FL

    Neural Networks and the Chomsky Hierarchy

    Authors: Grégoire Delétang, Anian Ruoss, Jordi Grau-Moya, Tim Genewein, Li Kevin Wenliang, Elliot Catt, Chris Cundy, Marcus Hutter, Shane Legg, Joel Veness, Pedro A. Ortega

    Abstract: Reliable generalization lies at the heart of safe ML and AI. However, understanding when and how neural networks generalize remains one of the most important unsolved problems in the field. In this work, we conduct an extensive empirical study (20'910 models, 15 tasks) to investigate whether insights from the theory of computation can predict the limits of neural network generalization in practice…

    Submitted 28 February, 2023; v1 submitted 5 July, 2022; originally announced July 2022.

  8. arXiv:2203.12592

    cs.LG stat.ML

    Your Policy Regularizer is Secretly an Adversary

    Authors: Rob Brekelmans, Tim Genewein, Jordi Grau-Moya, Grégoire Delétang, Markus Kunesch, Shane Legg, Pedro Ortega

    Abstract: Policy regularization methods such as maximum entropy regularization are widely used in reinforcement learning to improve the robustness of a learned policy. In this paper, we show how this robustness arises from hedging against worst-case perturbations of the reward function, which are chosen from a limited set by an imagined adversary. Using convex duality, we characterize this robust set of adv…

    Submitted 8 July, 2022; v1 submitted 23 March, 2022; originally announced March 2022.

    Comments: Transactions on Machine Learning Research

    Journal ref: TMLR (2022) https://openreview.net/forum?id=berNQMTYWZ

  9. arXiv:2111.02907

    cs.LG

    Model-Free Risk-Sensitive Reinforcement Learning

    Authors: Grégoire Delétang, Jordi Grau-Moya, Markus Kunesch, Tim Genewein, Rob Brekelmans, Shane Legg, Pedro A. Ortega

    Abstract: We extend temporal-difference (TD) learning in order to obtain risk-sensitive, model-free reinforcement learning algorithms. This extension can be regarded as modification of the Rescorla-Wagner rule, where the (sigmoidal) stimulus is taken to be either the event of over- or underestimating the TD target. As a result, one obtains a stochastic approximation rule for estimating the free energy from…

    Submitted 4 November, 2021; originally announced November 2021.

    Comments: DeepMind Tech Report: 13 pages, 4 figures

  10. arXiv:2110.10819

    cs.LG cs.AI

    Shaking the foundations: delusions in sequence models for interaction and control

    Authors: Pedro A. Ortega, Markus Kunesch, Grégoire Delétang, Tim Genewein, Jordi Grau-Moya, Joel Veness, Jonas Buchli, Jonas Degrave, Bilal Piot, Julien Perolat, Tom Everitt, Corentin Tallec, Emilio Parisotto, Tom Erez, Yutian Chen, Scott Reed, Marcus Hutter, Nando de Freitas, Shane Legg

    Abstract: The recent phenomenal success of language models has reinvigorated machine learning research, and large sequence models such as transformers are being applied to a variety of domains. One important problem class that has remained relatively elusive however is purposeful adaptive behavior. Currently there is a common perception that sequence models "lack the understanding of the cause and effect of…

    Submitted 20 October, 2021; originally announced October 2021.

    Comments: DeepMind Tech Report, 16 pages, 4 figures

  11. arXiv:2103.03938

    cs.AI cs.LG

    Causal Analysis of Agent Behavior for AI Safety

    Authors: Grégoire Delétang, Jordi Grau-Moya, Miljan Martic, Tim Genewein, Tom McGrath, Vladimir Mikulik, Markus Kunesch, Shane Legg, Pedro A. Ortega

    Abstract: As machine learning systems become more powerful they also become increasingly unpredictable and opaque. Yet, finding human-understandable explanations of how they work is essential for their safe deployment. This technical report illustrates a methodology for investigating the causal mechanisms that drive the behaviour of artificial agents. Six use cases are covered, each addressing a typical que…

    Submitted 5 March, 2021; originally announced March 2021.

    Comments: 16 pages, 16 figures, 6 tables

  12. arXiv:2010.12237

    cs.AI cs.LG

    Algorithms for Causal Reasoning in Probability Trees

    Authors: Tim Genewein, Tom McGrath, Grégoire Delétang, Vladimir Mikulik, Miljan Martic, Shane Legg, Pedro A. Ortega

    Abstract: Probability trees are one of the simplest models of causal generative processes. They possess clean semantics and -- unlike causal Bayesian networks -- they can represent context-specific causal dependencies, which are necessary for e.g. causal induction. Yet, they have received little attention from the AI and ML community. Here we present concrete algorithms for causal reasoning in discrete prob…

    Submitted 11 November, 2020; v1 submitted 23 October, 2020; originally announced October 2020.

    Comments: (2nd version with correction to algorithm) 11 pages, 8 figures, 5 algorithms. A companion Colaboratory tutorial is available at https://github.com/deepmind/deepmind-research/tree/master/causal_reasoning

  13. arXiv:2010.11223

    cs.AI cs.LG cs.NE

    Meta-trained agents implement Bayes-optimal agents

    Authors: Vladimir Mikulik, Grégoire Delétang, Tom McGrath, Tim Genewein, Miljan Martic, Shane Legg, Pedro A. Ortega

    Abstract: Memory-based meta-learning is a powerful technique to build agents that adapt fast to any task within a target distribution. A previous theoretical study has argued that this remarkable performance is because the meta-training protocol incentivises agents to behave Bayes-optimally. We empirically investigate this claim on a number of prediction and bandit tasks. Inspired by ideas from theoretical…

    Submitted 21 October, 2020; originally announced October 2020.

    Comments: Published at 34th Conference on Neural Information Processing Systems (NeurIPS 2020), Vancouver, Canada

  14. arXiv:1908.03463

    stat.ML cs.LG

    Group Pruning using a Bounded-Lp norm for Group Gating and Regularization

    Authors: Chaithanya Kumar Mummadi, Tim Genewein, Dan Zhang, Thomas Brox, Volker Fischer

    Abstract: Deep neural networks achieve state-of-the-art results on several tasks while increasing in complexity. It has been shown that neural networks can be pruned during training by imposing sparsity inducing regularizers. In this paper, we investigate two techniques for group-wise pruning during training in order to improve network efficiency. We propose a gating factor after every convolutional layer t…

    Submitted 9 August, 2019; originally announced August 2019.

    Comments: German Conference on Pattern Recognition (GCPR) 2019, 12 main pages, 3 pages of appendix, 4 figures, 2 tables

  15. arXiv:1905.03030

    cs.LG cs.AI stat.ML

    Meta-learning of Sequential Strategies

    Authors: Pedro A. Ortega, Jane X. Wang, Mark Rowland, Tim Genewein, Zeb Kurth-Nelson, Razvan Pascanu, Nicolas Heess, Joel Veness, Alex Pritzel, Pablo Sprechmann, Siddhant M. Jayakumar, Tom McGrath, Kevin Miller, Mohammad Azar, Ian Osband, Neil Rabinowitz, András György, Silvia Chiappa, Simon Osindero, Yee Whye Teh, Hado van Hasselt, Nando de Freitas, Matthew Botvinick, Shane Legg

    Abstract: In this report we review memory-based meta-learning as a tool for building sample-efficient strategies that learn from past experience to adapt to any task within a target class. Our goal is to equip the reader with the conceptual foundations of this tool for building new, scalable agents that operate on broad domains. To do so, we present basic algorithmic templates for building near-optimal pred…

    Submitted 18 July, 2019; v1 submitted 8 May, 2019; originally announced May 2019.

    Comments: DeepMind Technical Report (15 pages, 6 figures). Version V1.1

  16. arXiv:1810.01118

    cs.LG cs.CV stat.ML

    Sinkhorn AutoEncoders

    Authors: Giorgio Patrini, Rianne van den Berg, Patrick Forré, Marcello Carioni, Samarth Bhargav, Max Welling, Tim Genewein, Frank Nielsen

    Abstract: Optimal transport offers an alternative to maximum likelihood for learning generative autoencoding models. We show that minimizing the p-Wasserstein distance between the generator and the true data distribution is equivalent to the unconstrained min-min optimization of the p-Wasserstein distance between the encoder aggregated posterior and the prior in latent space, plus a reconstruction error. We…

    Submitted 15 July, 2019; v1 submitted 2 October, 2018; originally announced October 2018.

    Comments: Accepted for oral presentation at UAI19

  17. arXiv:1804.05906

    cs.AI

    An information-theoretic on-line update principle for perception-action coupling

    Authors: Zhen Peng, Tim Genewein, Felix Leibfried, Daniel A. Braun

    Abstract: Inspired by findings of sensorimotor coupling in humans and animals, there has recently been a growing interest in the interaction between action and perception in robotic systems [Bogh et al., 2016]. Here we consider perception and action as two serial information channels with limited information-processing capacity. We follow [Genewein et al., 2015] and formulate a constrained optimization prob…

    Submitted 16 April, 2018; originally announced April 2018.

    Comments: 8 pages, 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

  18. arXiv:1702.04267

    stat.ML cs.AI cs.CV cs.LG

    On Detecting Adversarial Perturbations

    Authors: Jan Hendrik Metzen, Tim Genewein, Volker Fischer, Bastian Bischoff

    Abstract: Machine learning and deep learning in particular has advanced tremendously on perceptual tasks in recent years. However, it remains vulnerable against adversarial perturbations of the input that have been crafted specifically to fool the system while being quasi-imperceptible to a human. In this work, we propose to augment deep neural networks with a small "detector" subnetwork which is trained on…

    Submitted 21 February, 2017; v1 submitted 14 February, 2017; originally announced February 2017.

    Comments: Final version for ICLR2017 (see https://openreview.net/forum?id=SJzCSf9xg&noteId=SJzCSf9xg)

  19. arXiv:1604.02080

    cs.AI eess.SY

    Planning with Information-Processing Constraints and Model Uncertainty in Markov Decision Processes

    Authors: Jordi Grau-Moya, Felix Leibfried, Tim Genewein, Daniel A. Braun

    Abstract: Information-theoretic principles for learning and acting have been proposed to solve particular classes of Markov Decision Problems. Mathematically, such approaches are governed by a variational free energy principle and allow solving MDP planning problems with information-processing constraints expressed in terms of a Kullback-Leibler divergence with respect to a reference distribution. Here we c…

    Submitted 7 April, 2016; originally announced April 2016.

    Comments: 16 pages, 3 figures

  20. arXiv:1312.4353

    cs.AI cs.IT stat.ML

    Abstraction in decision-makers with limited information processing capabilities

    Authors: Tim Genewein, Daniel A. Braun

    Abstract: A distinctive property of human and animal intelligence is the ability to form abstractions by neglecting irrelevant information which allows to separate structure from noise. From an information theoretic point of view abstractions are desirable because they allow for very efficient information processing. In artificial systems abstractions are often implemented through computationally costly for…

    Submitted 19 December, 2013; v1 submitted 16 December, 2013; originally announced December 2013.

    Comments: Presented at the NIPS 2013 Workshop on Planning with Information Constraints

  21. arXiv:1206.1898

    stat.ML cs.AI math.ST

    A Nonparametric Conjugate Prior Distribution for the Maximizing Argument of a Noisy Function

    Authors: Pedro A. Ortega, Jordi Grau-Moya, Tim Genewein, David Balduzzi, Daniel A. Braun

    Abstract: We propose a novel Bayesian approach to solve stochastic optimization problems that involve finding extrema of noisy, nonlinear functions. Previous work has focused on representing possible functions explicitly, which leads to a two-step procedure of first, doing inference over the function space and second, finding the extrema of these functions. Here we skip the representation step and directly…

    Submitted 10 November, 2012; v1 submitted 8 June, 2012; originally announced June 2012.

    Comments: 9 pages, 5 figures

    Journal ref: Neural Information Processing Systems (NIPS) 2012