Showing 1–23 of 23 results for author: Biedenkapp, A

  1. arXiv:2404.09521  [pdf, other]

    cs.LG cs.AI

    Inferring Behavior-Specific Context Improves Zero-Shot Generalization in Reinforcement Learning

    Authors: Tidiane Camaret Ndir, André Biedenkapp, Noor Awad

    Abstract: In this work, we address the challenge of zero-shot generalization (ZSG) in Reinforcement Learning (RL), where agents must adapt to entirely novel environments without additional training. We argue that understanding and utilizing contextual cues, such as the gravity level of the environment, is critical for robust generalization, and we propose to integrate the learning of context representations…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: https://github.com/tidiane-camaret/contextual_rl_zero_shot

  2. arXiv:2403.10967  [pdf, other]

    cs.LG cs.AI

    Dreaming of Many Worlds: Learning Contextual World Models Aids Zero-Shot Generalization

    Authors: Sai Prasanna, Karim Farid, Raghu Rajan, André Biedenkapp

    Abstract: Zero-shot generalization (ZSG) to unseen dynamics is a major challenge for creating generally capable embodied agents. To address the broader challenge, we start with the simpler setting of contextual reinforcement learning (cRL), assuming observability of the context values that parameterize the variation in the system's dynamics, such as the mass or dimensions of a robot, without making further…

    Submitted 16 March, 2024; originally announced March 2024.

    Comments: 33 pages

  3. arXiv:2402.06402  [pdf, other]

    cs.LG cs.AI

    Hierarchical Transformers are Efficient Meta-Reinforcement Learners

    Authors: Gresa Shala, André Biedenkapp, Josif Grabocka

    Abstract: We introduce Hierarchical Transformers for Meta-Reinforcement Learning (HTrMRL), a powerful online meta-reinforcement learning approach. HTrMRL aims to address the challenge of enabling reinforcement learning agents to perform effectively in previously unseen tasks. We demonstrate how past episodes serve as a rich source of information, which our model effectively distills and applies to new conte…

    Submitted 9 February, 2024; originally announced February 2024.

  4. arXiv:2206.03493  [pdf, other]

    cs.LG

    DeepCAVE: An Interactive Analysis Tool for Automated Machine Learning

    Authors: René Sass, Eddie Bergman, André Biedenkapp, Frank Hutter, Marius Lindauer

    Abstract: Automated Machine Learning (AutoML) is used more than ever before to support users in determining efficient hyperparameters, neural architectures, or even full machine learning pipelines. However, users tend to mistrust the optimization process and its results due to a lack of transparency, making manual tuning still widespread. We introduce DeepCAVE, an interactive framework to analyze and monito…

    Submitted 11 July, 2022; v1 submitted 7 June, 2022; originally announced June 2022.

    Comments: Workshop on Adaptive Experimental Design and Active Learning in the Real World (ReALML@ICML'22)

  5. arXiv:2205.13881  [pdf, other]

    cs.AI cs.LG cs.NE

    Automated Dynamic Algorithm Configuration

    Authors: Steven Adriaensen, André Biedenkapp, Gresa Shala, Noor Awad, Theresa Eimer, Marius Lindauer, Frank Hutter

    Abstract: The performance of an algorithm often critically depends on its parameter configuration. While a variety of automated algorithm configuration methods have been proposed to relieve users from the tedious and error-prone task of manually tuning parameters, there is still a lot of untapped potential as the learned configuration is static, i.e., parameter settings remain fixed throughout the run. Howe…

    Submitted 27 May, 2022; originally announced May 2022.

  6. arXiv:2202.04500  [pdf, other]

    cs.LG

    Contextualize Me -- The Case for Context in Reinforcement Learning

    Authors: Carolin Benjamins, Theresa Eimer, Frederik Schubert, Aditya Mohan, Sebastian Döhler, André Biedenkapp, Bodo Rosenhahn, Frank Hutter, Marius Lindauer

    Abstract: While Reinforcement Learning (RL) has made great strides towards solving increasingly complicated problems, many algorithms are still brittle to even slight environmental changes. Contextual Reinforcement Learning (cRL) provides a framework to model such changes in a principled manner, thereby enabling flexible, precise and interpretable task specification and generation. Our goal is to show how…

    Submitted 2 June, 2023; v1 submitted 9 February, 2022; originally announced February 2022.

    Comments: arXiv admin note: substantial text overlap with arXiv:2110.02102

  7. arXiv:2202.03259  [pdf, other]

    cs.NE cs.LG

    Theory-inspired Parameter Control Benchmarks for Dynamic Algorithm Configuration

    Authors: André Biedenkapp, Nguyen Dang, Martin S. Krejca, Frank Hutter, Carola Doerr

    Abstract: It has long been observed that the performance of evolutionary algorithms and other randomized search heuristics can benefit from a non-static choice of the parameters that steer their optimization behavior. Mechanisms that identify suitable configurations on the fly ("parameter control") or via a dedicated training process ("dynamic algorithm configuration") are therefore an important component o…

    Submitted 15 April, 2022; v1 submitted 7 February, 2022; originally announced February 2022.

  8. Automated Reinforcement Learning (AutoRL): A Survey and Open Problems

    Authors: Jack Parker-Holder, Raghu Rajan, Xingyou Song, André Biedenkapp, Yingjie Miao, Theresa Eimer, Baohe Zhang, Vu Nguyen, Roberto Calandra, Aleksandra Faust, Frank Hutter, Marius Lindauer

    Abstract: The combination of Reinforcement Learning (RL) with deep learning has led to a series of impressive feats, with many believing (deep) RL provides a path towards generally capable agents. However, the success of RL agents is often highly sensitive to design choices in the training process, which may require tedious and error-prone manual tuning. This makes it challenging to use RL for new problems,…

    Submitted 2 June, 2022; v1 submitted 11 January, 2022; originally announced January 2022.

    Comments: Published in JAIR. Co-first authors and co-last authors are listed in alphabetical order

    MSC Class: 68T01 ACM Class: I.2.6

    Journal ref: Journal of Artificial Intelligence Research 74 (2022) 517-568

  9. arXiv:2110.02102  [pdf, other]

    cs.LG

    CARL: A Benchmark for Contextual and Adaptive Reinforcement Learning

    Authors: Carolin Benjamins, Theresa Eimer, Frederik Schubert, André Biedenkapp, Bodo Rosenhahn, Frank Hutter, Marius Lindauer

    Abstract: While Reinforcement Learning has made great strides towards solving ever more complicated tasks, many algorithms are still brittle to even slight changes in their environment. This is a limiting factor for real-world applications of RL. Although the research community continuously aims at improving both robustness and generalization of RL algorithms, unfortunately it still lacks an open-source set…

    Submitted 11 October, 2021; v1 submitted 5 October, 2021; originally announced October 2021.

    Journal ref: Workshop on Ecological Theory of Reinforcement Learning, NeurIPS 2021

  10. arXiv:2109.09831  [pdf, other]

    cs.LG stat.ML

    SMAC3: A Versatile Bayesian Optimization Package for Hyperparameter Optimization

    Authors: Marius Lindauer, Katharina Eggensperger, Matthias Feurer, André Biedenkapp, Difan Deng, Carolin Benjamins, Tim Ruhopf, René Sass, Frank Hutter

    Abstract: Algorithm parameters, in particular hyperparameters of machine learning algorithms, can substantially impact their performance. To support users in determining well-performing hyperparameter configurations for their algorithms, datasets and applications at hand, SMAC3 offers a robust and flexible framework for Bayesian Optimization, which can improve performance within a few evaluations. It offers…

    Submitted 8 February, 2022; v1 submitted 20 September, 2021; originally announced September 2021.

    Journal ref: Journal of Machine Learning Research 23 (2022) 1-9

  11. arXiv:2106.05262  [pdf, other]

    cs.LG

    TempoRL: Learning When to Act

    Authors: André Biedenkapp, Raghu Rajan, Frank Hutter, Marius Lindauer

    Abstract: Reinforcement learning is a powerful approach to learn behaviour through interactions with an environment. However, behaviours are usually learned in a purely reactive fashion, where an appropriate action is selected based on an observation. In this form, it is challenging to learn when it is necessary to execute new decisions. This makes learning inefficient, especially in environments that need…

    Submitted 9 June, 2021; originally announced June 2021.

    Comments: Accepted at ICML'21

  12. arXiv:2106.05110  [pdf, other]

    cs.LG

    Self-Paced Context Evaluation for Contextual Reinforcement Learning

    Authors: Theresa Eimer, André Biedenkapp, Frank Hutter, Marius Lindauer

    Abstract: Reinforcement learning (RL) has made a lot of advances for solving a single problem in a given environment; but learning policies that generalize to unseen variations of a problem remains challenging. To improve sample efficiency for learning on such instances of a problem domain, we present Self-Paced Context Evaluation (SPaCE). Based on self-paced learning, SPaCE automatically generates task cur…

    Submitted 9 June, 2021; originally announced June 2021.

    Journal ref: Proceedings of Machine Learning Research 139 (ICML 2021)

  13. arXiv:2105.08541  [pdf, other]

    cs.AI

    DACBench: A Benchmark Library for Dynamic Algorithm Configuration

    Authors: Theresa Eimer, André Biedenkapp, Maximilian Reimer, Steven Adriaensen, Frank Hutter, Marius Lindauer

    Abstract: Dynamic Algorithm Configuration (DAC) aims to dynamically control a target algorithm's hyperparameters in order to improve its performance. Several theoretical and empirical results have demonstrated the benefits of dynamically controlling hyperparameters in domains like evolutionary computation, AI Planning or deep learning. Replicating these results, as well as studying new methods for DAC, howe…

    Submitted 18 May, 2021; originally announced May 2021.

    Comments: Accepted at IJCAI 2021

    Journal ref: 30th International Joint Conference on Artificial Intelligence (IJCAI 2021)

  14. arXiv:2105.01015  [pdf, other]

    cs.LG cs.AI stat.ML

    Bag of Baselines for Multi-objective Joint Neural Architecture Search and Hyperparameter Optimization

    Authors: Julia Guerrero-Viu, Sven Hauns, Sergio Izquierdo, Guilherme Miotto, Simon Schrodi, Andre Biedenkapp, Thomas Elsken, Difan Deng, Marius Lindauer, Frank Hutter

    Abstract: Neural architecture search (NAS) and hyperparameter optimization (HPO) make deep learning accessible to non-experts by automatically finding the architecture of the deep neural network to use and tuning the hyperparameters of the used training pipeline. While both NAS and HPO have been studied extensively in recent years, NAS methods typically assume fixed hyperparameters and vice versa - there ex…

    Submitted 3 May, 2021; originally announced May 2021.

  15. arXiv:2102.13651  [pdf, other]

    cs.LG cs.AI cs.NE eess.SY

    On the Importance of Hyperparameter Optimization for Model-based Reinforcement Learning

    Authors: Baohe Zhang, Raghu Rajan, Luis Pineda, Nathan Lambert, André Biedenkapp, Kurtland Chua, Frank Hutter, Roberto Calandra

    Abstract: Model-based Reinforcement Learning (MBRL) is a promising framework for learning control in a data-efficient manner. MBRL algorithms can be fairly complex due to the separate dynamics modeling and the subsequent planning algorithm, and as a result, they often possess tens of hyperparameters and architectural choices. For this reason, MBRL typically requires significant human expertise before it can…

    Submitted 26 February, 2021; originally announced February 2021.

    Comments: 19 pages, accepted by AISTATS 2021

  16. arXiv:2102.03275  [pdf, other]

    cs.LG

    In-Loop Meta-Learning with Gradient-Alignment Reward

    Authors: Samuel Müller, André Biedenkapp, Frank Hutter

    Abstract: At the heart of the standard deep learning training loop is a greedy gradient step minimizing a given loss. We propose to add a second step to maximize training generalization. To do this, we optimize the loss of the next training step. While computing the gradient for this generally is very expensive and many interesting applications consider non-differentiable parameters (e.g. due to hard sample…

    Submitted 5 February, 2021; originally announced February 2021.

    Comments: Accepted to Meta Learning Workshop at AAAI

  17. arXiv:2012.08180  [pdf, ps, other]

    cs.LG cs.NE stat.ML

    Squirrel: A Switching Hyperparameter Optimizer

    Authors: Noor Awad, Gresa Shala, Difan Deng, Neeratyoy Mallik, Matthias Feurer, Katharina Eggensperger, André Biedenkapp, Diederick Vermetten, Hao Wang, Carola Doerr, Marius Lindauer, Frank Hutter

    Abstract: In this short note, we describe our submission to the NeurIPS 2020 BBO challenge. Motivated by the fact that different optimizers work well on different problems, our approach switches between different optimizers. Since the team names on the competition's leaderboard were randomly generated "alliteration nicknames", consisting of an adjective and an animal with the same initial letter, we called…

    Submitted 16 December, 2020; v1 submitted 15 December, 2020; originally announced December 2020.

  18. arXiv:2009.01555  [pdf, other]

    cs.LG stat.ML

    Sample-Efficient Automated Deep Reinforcement Learning

    Authors: Jörg K. H. Franke, Gregor Köhler, André Biedenkapp, Frank Hutter

    Abstract: Despite significant progress in challenging problems across various domains, applying state-of-the-art deep reinforcement learning (RL) algorithms remains challenging due to their sensitivity to the choice of hyperparameters. This sensitivity can partly be attributed to the non-stationarity of the RL problem, potentially requiring different hyperparameter settings at various stages of the learning…

    Submitted 17 March, 2021; v1 submitted 3 September, 2020; originally announced September 2020.

    Comments: In Proceedings of the International Conference on Learning Representations (ICLR 2021), 2021

  19. arXiv:2006.08246  [pdf, other]

    cs.AI cs.LG

    Learning Heuristic Selection with Dynamic Algorithm Configuration

    Authors: David Speck, André Biedenkapp, Frank Hutter, Robert Mattmüller, Marius Lindauer

    Abstract: A key challenge in satisficing planning is to use multiple heuristics within one heuristic search. An aggregation of multiple heuristic estimates, for example by taking the maximum, has the disadvantage that bad estimates of a single heuristic can negatively affect the whole search. Since the performance of a heuristic varies from instance to instance, approaches such as algorithm selection can be…

    Submitted 12 April, 2021; v1 submitted 15 June, 2020; originally announced June 2020.

    Comments: Long version of the paper at the International Conference on Automated Planning and Scheduling (ICAPS) 2021

  20. arXiv:1909.07750  [pdf, other]

    cs.LG cs.AI stat.ML

    MDP Playground: An Analysis and Debug Testbed for Reinforcement Learning

    Authors: Raghu Rajan, Jessica Lizeth Borja Diaz, Suresh Guttikonda, Fabio Ferreira, André Biedenkapp, Jan Ole von Hartz, Frank Hutter

    Abstract: We present MDP Playground, a testbed for Reinforcement Learning (RL) agents with dimensions of hardness that can be controlled independently to challenge agents in different ways and obtain varying degrees of hardness in toy and complex RL environments. We consider and allow control over a wide variety of dimensions, including delayed rewards, sequence lengths, reward density, stochasticity, image…

    Submitted 14 July, 2023; v1 submitted 17 September, 2019; originally announced September 2019.

    Comments: Same version as the one published in JAIR Vol. 77 (2023)

    MSC Class: 68T01 ACM Class: I.2.6; I.2.8

    Journal ref: Journal of Artificial Intelligence Research 77 (2023) 821-890

  21. arXiv:1908.06756  [pdf, other]

    cs.LG cs.AI stat.ML

    BOAH: A Tool Suite for Multi-Fidelity Bayesian Optimization & Analysis of Hyperparameters

    Authors: Marius Lindauer, Katharina Eggensperger, Matthias Feurer, André Biedenkapp, Joshua Marben, Philipp Müller, Frank Hutter

    Abstract: Hyperparameter optimization and neural architecture search can become prohibitively expensive for regular black-box Bayesian optimization because the training and evaluation of a single model can easily take several hours. To overcome this, we introduce a comprehensive tool suite for effective multi-fidelity Bayesian optimization and the analysis of its runs. The suite, written in Python, provides…

    Submitted 16 August, 2019; originally announced August 2019.

  22. arXiv:1908.06674  [pdf, other]

    cs.LG cs.AI stat.ML

    Towards Assessing the Impact of Bayesian Optimization's Own Hyperparameters

    Authors: Marius Lindauer, Matthias Feurer, Katharina Eggensperger, André Biedenkapp, Frank Hutter

    Abstract: Bayesian Optimization (BO) is a common approach for hyperparameter optimization (HPO) in automated machine learning. Although it is well-accepted that HPO is crucial to obtain well-performing machine learning models, tuning BO's own hyperparameters is often neglected. In this paper, we empirically study the impact of optimizing BO's own hyperparameters and the transferability of the found settings…

    Submitted 19 August, 2019; originally announced August 2019.

    Comments: Accepted at DSO workshop (as part of IJCAI'19)

  23. arXiv:1906.07644  [pdf, other]

    cs.LG cs.AI eess.SY stat.ML

    Towards White-box Benchmarks for Algorithm Control

    Authors: André Biedenkapp, H. Furkan Bozkurt, Frank Hutter, Marius Lindauer

    Abstract: The performance of many algorithms in the fields of hard combinatorial problem solving, machine learning or AI in general depends on tuned hyperparameter configurations. Automated methods have been proposed to alleviate users from the tedious and error-prone task of manually searching for performance-optimized configurations across a set of problem instances. However there is still a lot of untapp…

    Submitted 22 August, 2019; v1 submitted 18 June, 2019; originally announced June 2019.

    Comments: 8 pages, 9 figures