Showing 1–18 of 18 results for author: Zintgraf, L

Searching in archive cs.
  1. arXiv:2301.08028  [pdf, other]

    cs.LG

    A Survey of Meta-Reinforcement Learning

    Authors: Jacob Beck, Risto Vuorio, Evan Zheran Liu, Zheng Xiong, Luisa Zintgraf, Chelsea Finn, Shimon Whiteson

    Abstract: While deep reinforcement learning (RL) has fueled multiple high-profile successes in machine learning, it is held back from more widespread adoption by its often poor data efficiency and the limited generality of the policies it produces. A promising approach for alleviating these limitations is to cast the development of better RL algorithms as a machine learning problem itself in a process calle…

    Submitted 19 January, 2023; originally announced January 2023.

  2. arXiv:2206.12765  [pdf, other]

    cs.AI cs.LG

    Generalized Beliefs for Cooperative AI

    Authors: Darius Muglich, Luisa Zintgraf, Christian Schroeder de Witt, Shimon Whiteson, Jakob Foerster

    Abstract: Self-play is a common paradigm for constructing solutions in Markov games that can yield optimal policies in collaborative settings. However, these policies often adopt highly-specialized conventions that make playing with a novel partner difficult. To address this, recent approaches rely on encoding symmetry and convention-awareness into policy training, but these require strong environmental ass…

    Submitted 25 June, 2022; originally announced June 2022.

  3. arXiv:2202.08132  [pdf, other]

    cs.LG

    Prospect Pruning: Finding Trainable Weights at Initialization using Meta-Gradients

    Authors: Milad Alizadeh, Shyam A. Tailor, Luisa M Zintgraf, Joost van Amersfoort, Sebastian Farquhar, Nicholas Donald Lane, Yarin Gal

    Abstract: Pruning neural networks at initialization would enable us to find sparse models that retain the accuracy of the original network while consuming fewer computational resources for training and inference. However, current methods are insufficient to enable this optimization and lead to a large degradation in model performance. In this paper, we identify a fundamental limitation in the formulation of…

    Submitted 5 April, 2022; v1 submitted 16 February, 2022; originally announced February 2022.

  4. arXiv:2112.00478  [pdf, other]

    cs.LG cs.AI stat.ML

    On the Practical Consistency of Meta-Reinforcement Learning Algorithms

    Authors: Zheng Xiong, Luisa Zintgraf, Jacob Beck, Risto Vuorio, Shimon Whiteson

    Abstract: Consistency is the theoretical property of a meta learning algorithm that ensures that, under certain assumptions, it can adapt to any task at test time. An open question is whether and how theoretical consistency translates into practice, in comparison to inconsistent algorithms. In this paper, we empirically investigate this question on a set of representative meta-RL algorithms. We find that th…

    Submitted 1 December, 2021; originally announced December 2021.

  5. arXiv:2107.08295  [pdf, other]

    cs.AI cs.MA

    Communicating via Markov Decision Processes

    Authors: Samuel Sokota, Christian Schroeder de Witt, Maximilian Igl, Luisa Zintgraf, Philip Torr, Martin Strohmeier, J. Zico Kolter, Shimon Whiteson, Jakob Foerster

    Abstract: We consider the problem of communicating exogenous information by means of Markov decision process trajectories. This setting, which we call a Markov coding game (MCG), generalizes both source coding and a large class of referential games. MCGs also isolate a problem that is important in decentralized control settings in which cheap-talk is not available -- namely, they require balancing communica…

    Submitted 12 June, 2022; v1 submitted 17 July, 2021; originally announced July 2021.

    Comments: ICML 2022

  6. arXiv:2106.12937  [pdf, other]

    cs.HC cs.AI

    Optimizing piano practice with a utility-based scaffold

    Authors: Alexandra Moringen, Sören Rüttgers, Luisa Zintgraf, Jason Friedman, Helge Ritter

    Abstract: A typical part of learning to play the piano is the progression through a series of practice units that focus on individual dimensions of the skill, such as hand coordination, correct posture, or correct timing. Ideally, a focus on a particular practice method should be made in a way to maximize the learner's progress in learning to play the piano. Because we each learn differently, and because th…

    Submitted 21 June, 2021; originally announced June 2021.

  7. arXiv:2104.08492  [pdf, other]

    cs.AI cs.LG

    A Self-Supervised Auxiliary Loss for Deep RL in Partially Observable Settings

    Authors: Eltayeb Ahmed, Luisa Zintgraf, Christian A. Schroeder de Witt, Nicolas Usunier

    Abstract: In this work we explore an auxiliary loss useful for reinforcement learning in environments where strong performing agents are required to be able to navigate a spatial environment. The auxiliary loss proposed is to minimize the classification error of a neural network classifier that predicts whether or not a pair of states sampled from the agent's current episode trajectory are in order. The clas…

    Submitted 17 April, 2021; originally announced April 2021.

  8. ORBIT: A Real-World Few-Shot Dataset for Teachable Object Recognition

    Authors: Daniela Massiceti, Luisa Zintgraf, John Bronskill, Lida Theodorou, Matthew Tobias Harris, Edward Cutrell, Cecily Morrison, Katja Hofmann, Simone Stumpf

    Abstract: Object recognition has made great advances in the last decade, but predominately still relies on many high-quality training examples per object category. In contrast, learning new objects from only a few examples could enable many impactful applications from robotics to user personalization. Most few-shot learning research, however, has been driven by benchmark datasets that lack the high variatio…

    Submitted 8 October, 2021; v1 submitted 8 April, 2021; originally announced April 2021.

    Comments: IEEE/CVF International Conference on Computer Vision (ICCV), 2021

  9. A Practical Guide to Multi-Objective Reinforcement Learning and Planning

    Authors: Conor F. Hayes, Roxana Rădulescu, Eugenio Bargiacchi, Johan Källström, Matthew Macfarlane, Mathieu Reymond, Timothy Verstraeten, Luisa M. Zintgraf, Richard Dazeley, Fredrik Heintz, Enda Howley, Athirai A. Irissappane, Patrick Mannion, Ann Nowé, Gabriel Ramos, Marcello Restelli, Peter Vamplew, Diederik M. Roijers

    Abstract: Real-world decision-making tasks are generally complex, requiring trade-offs between multiple, often conflicting, objectives. Despite this, the majority of research in reinforcement learning and decision-theoretic planning either assumes only a single objective, or that multiple objectives can be adequately handled via a simple linear combination. Such approaches may oversimplify the underlying pr…

    Submitted 17 March, 2021; originally announced March 2021.

    Journal ref: Auton Agent Multi-Agent Syst 36, 26 (2022)

  10. arXiv:2101.03864  [pdf, other]

    cs.LG cs.MA

    Deep Interactive Bayesian Reinforcement Learning via Meta-Learning

    Authors: Luisa Zintgraf, Sam Devlin, Kamil Ciosek, Shimon Whiteson, Katja Hofmann

    Abstract: Agents that interact with other agents often do not know a priori what the other agents' strategies are, but have to maximise their own online return while interacting with and learning about others. The optimal adaptive behaviour under uncertainty over the other agents' strategies w.r.t. some prior can in principle be computed using the Interactive Bayesian Reinforcement Learning framework. Unfor…

    Submitted 15 April, 2022; v1 submitted 11 January, 2021; originally announced January 2021.

    Comments: Published as an extended abstract at AAMAS 2021

  11. arXiv:2010.01062  [pdf, other]

    cs.LG cs.AI stat.ML

    Exploration in Approximate Hyper-State Space for Meta Reinforcement Learning

    Authors: Luisa Zintgraf, Leo Feng, Cong Lu, Maximilian Igl, Kristian Hartikainen, Katja Hofmann, Shimon Whiteson

    Abstract: To rapidly learn a new task, it is often essential for agents to explore efficiently -- especially when performance matters from the first timestep. One way to learn such behaviour is via meta-learning. Many existing methods however rely on dense rewards for meta-training, and can fail catastrophically if the rewards are sparse. Without a suitable reward signal, the need for exploration during met…

    Submitted 9 June, 2021; v1 submitted 2 October, 2020; originally announced October 2020.

    Comments: Published at the International Conference on Machine Learning (ICML) 2021

  12. arXiv:1911.13159  [pdf, other]

    cs.LG stat.ML

    VIABLE: Fast Adaptation via Backpropagating Learned Loss

    Authors: Leo Feng, Luisa Zintgraf, Bei Peng, Shimon Whiteson

    Abstract: In few-shot learning, typically, the loss function which is applied at test time is the one we are ultimately interested in minimising, such as the mean-squared-error loss for a regression problem. However, given that we have few samples at test time, we argue that the loss function that we are interested in minimising is not necessarily the loss function most suitable for computing gradients in a…

    Submitted 29 November, 2019; originally announced November 2019.

    Comments: Published at the 3rd Workshop on Meta-Learning at NeurIPS 2019

  13. arXiv:1910.08348  [pdf, other]

    cs.LG stat.ML

    VariBAD: A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning

    Authors: Luisa Zintgraf, Kyriacos Shiarlis, Maximilian Igl, Sebastian Schulze, Yarin Gal, Katja Hofmann, Shimon Whiteson

    Abstract: Trading off exploration and exploitation in an unknown environment is key to maximising expected return during learning. A Bayes-optimal policy, which does so optimally, conditions its actions not only on the environment state but on the agent's uncertainty about the environment. Computing a Bayes-optimal policy is however intractable for all but the smallest tasks. In this paper, we introduce var…

    Submitted 27 February, 2020; v1 submitted 18 October, 2019; originally announced October 2019.

    Comments: Published at ICLR 2020

  14. arXiv:1810.03642  [pdf, other]

    cs.LG stat.ML

    Fast Context Adaptation via Meta-Learning

    Authors: Luisa M Zintgraf, Kyriacos Shiarlis, Vitaly Kurin, Katja Hofmann, Shimon Whiteson

    Abstract: We propose CAVIA for meta-learning, a simple extension to MAML that is less prone to meta-overfitting, easier to parallelise, and more interpretable. CAVIA partitions the model parameters into two parts: context parameters that serve as additional input to the model and are adapted on individual tasks, and shared parameters that are meta-trained and shared across tasks. At test time, only the cont…

    Submitted 10 June, 2019; v1 submitted 8 October, 2018; originally announced October 2018.

    Comments: Published at the International Conference on Machine Learning (ICML) 2019

  15. arXiv:1806.02426  [pdf, other]

    cs.LG stat.ML

    Deep Variational Reinforcement Learning for POMDPs

    Authors: Maximilian Igl, Luisa Zintgraf, Tuan Anh Le, Frank Wood, Shimon Whiteson

    Abstract: Many real-world sequential decision making problems are partially observable by nature, and the environment model is typically unknown. Consequently, there is great need for reinforcement learning methods that can tackle such problems given only a stream of incomplete and noisy observations. In this paper, we propose deep variational reinforcement learning (DVRL), which introduces an inductive bia…

    Submitted 6 June, 2018; originally announced June 2018.

  16. arXiv:1802.07606  [pdf, other]

    cs.LG cs.AI stat.ML

    Ordered Preference Elicitation Strategies for Supporting Multi-Objective Decision Making

    Authors: Luisa M Zintgraf, Diederik M Roijers, Sjoerd Linders, Catholijn M Jonker, Ann Nowé

    Abstract: In multi-objective decision planning and learning, much attention is paid to producing optimal solution sets that contain an optimal policy for every possible user preference profile. We argue that the step that follows, i.e., determining which policy to execute by maximising the user's intrinsic utility function over this (possibly infinite) set, is under-studied. This paper aims to fill this gap.…

    Submitted 21 February, 2018; originally announced February 2018.

    Comments: AAMAS 2018, Source code at https://github.com/lmzintgraf/gp_pref_elicit

  17. arXiv:1702.04595  [pdf, other]

    cs.CV cs.AI

    Visualizing Deep Neural Network Decisions: Prediction Difference Analysis

    Authors: Luisa M Zintgraf, Taco S Cohen, Tameem Adel, Max Welling

    Abstract: This article presents the prediction difference analysis method for visualizing the response of a deep neural network to a specific input. When classifying images, the method highlights areas in a given input image that provide evidence for or against a certain class. It overcomes several shortcomings of previous methods and provides great additional insight into the decision making process of clas…

    Submitted 15 February, 2017; originally announced February 2017.

    Comments: ICLR2017

  18. arXiv:1603.02518  [pdf, other]

    cs.CV

    A New Method to Visualize Deep Neural Networks

    Authors: Luisa M. Zintgraf, Taco S. Cohen, Max Welling

    Abstract: We present a method for visualising the response of a deep neural network to a specific input. For image data, for instance, our method will highlight areas that provide evidence in favor of, and against, choosing a certain class. The method overcomes several shortcomings of previous methods and provides great additional insight into the decision making process of convolutional networks, which is imp…

    Submitted 12 June, 2017; v1 submitted 8 March, 2016; originally announced March 2016.

    Comments: Please note that this version of the article is outdated. The new version (published at ICLR2017) includes additional experiments on MRI scans and can be found at arXiv:1702.04595