Showing 1–8 of 8 results for author: Burgard, W

Searching in archive stat.
  1. arXiv:2303.10144  [pdf, other]

    cs.LG stat.ML

    Dynamic Update-to-Data Ratio: Minimizing World Model Overfitting

    Authors: Nicolai Dorka, Tim Welschehold, Wolfram Burgard

    Abstract: Early stopping based on the validation set performance is a popular approach to find the right balance between under- and overfitting in the context of supervised learning. However, in reinforcement learning, even for supervised sub-problems such as world model learning, early stopping is not applicable as the dataset is continually evolving. As a solution, we propose a new general method that dyn…

    Submitted 17 March, 2023; originally announced March 2023.

    Comments: ICLR 2023

  2. arXiv:2009.08169  [pdf]

    cs.LG cs.CV stat.ML

    Holistic Filter Pruning for Efficient Deep Neural Networks

    Authors: Lukas Enderich, Fabian Timm, Wolfram Burgard

    Abstract: Deep neural networks (DNNs) are usually over-parameterized to increase the likelihood of getting adequate initial weights by random initialization. Consequently, trained DNNs have many redundancies which can be pruned from the model to reduce complexity and improve the ability to generalize. Structural sparsity, as achieved by filter pruning, directly reduces the tensor sizes of weights and activa…

    Submitted 17 September, 2020; originally announced September 2020.

    Comments: preprint, accepted at WACV2021

  3. arXiv:2007.02701  [pdf, other]

    cs.LG cs.AI stat.ML

    Scaling Imitation Learning in Minecraft

    Authors: Artemij Amiranashvili, Nicolai Dorka, Wolfram Burgard, Vladlen Koltun, Thomas Brox

    Abstract: Imitation learning is a powerful family of techniques for learning sensorimotor coordination in immersive environments. We apply imitation learning to attain state-of-the-art performance on hard exploration problems in the Minecraft environment. We report experiments that highlight the influence of network architecture, loss function, and data augmentation. An early version of our approach reached…

    Submitted 6 July, 2020; originally announced July 2020.

  4. arXiv:2003.04046  [pdf, other]

    cs.LG eess.SP stat.ML

    Efficiency and Equity are Both Essential: A Generalized Traffic Signal Controller with Deep Reinforcement Learning

    Authors: Shengchao Yan, Jingwei Zhang, Daniel Büscher, Wolfram Burgard

    Abstract: Traffic signal controllers play an essential role in today's traffic system. However, the majority of them currently is not sufficiently flexible or adaptive to generate optimal traffic schedules. In this paper we present an approach to learning policies for signal controllers using deep reinforcement learning aiming for optimized traffic flow. Our method uses a novel formulation of the reward fun…

    Submitted 27 December, 2020; v1 submitted 9 March, 2020; originally announced March 2020.

    Comments: Published as a conference paper at IROS 2020

  5. arXiv:1909.01039  [pdf, other]

    cs.RO cs.HC cs.LG stat.ML

    Learning User Preferences for Trajectories from Brain Signals

    Authors: Henrich Kolkhorst, Wolfram Burgard, Michael Tangermann

    Abstract: Robot motions in the presence of humans should not only be feasible and safe, but also conform to human preferences. This, however, requires user feedback on the robot's behavior. In this work, we propose a novel approach to leverage the user's brain signals as a feedback modality in order to decode the judgment of robot trajectories and rank them according to the user's preferences. We show that…

    Submitted 20 December, 2019; v1 submitted 3 September, 2019; originally announced September 2019.

    Comments: The International Symposium on Robotics Research (ISRR), Hanoi, Vietnam, October 2019; reformatted to two-column layout

  6. arXiv:1903.07400  [pdf, other]

    cs.LG cs.AI cs.RO stat.ML

    Scheduled Intrinsic Drive: A Hierarchical Take on Intrinsically Motivated Exploration

    Authors: Jingwei Zhang, Niklas Wetzel, Nicolai Dorka, Joschka Boedecker, Wolfram Burgard

    Abstract: Exploration in sparse reward reinforcement learning remains an open challenge. Many state-of-the-art methods use intrinsic motivation to complement the sparse extrinsic reward signal, giving the agent more opportunities to receive feedback during exploration. Commonly these signals are added as bonus rewards, which results in a mixture policy that neither conducts exploration nor task fulfillment…

    Submitted 21 June, 2019; v1 submitted 18 March, 2019; originally announced March 2019.

    Comments: A video of our experimental results can be found at https://youtu.be/b0MbY3lUlEI

  7. arXiv:1805.01667  [pdf, other]

    cs.LG q-bio.NC q-bio.QM stat.ML

    Intracranial Error Detection via Deep Learning

    Authors: Martin Völker, Jiří Hammer, Robin T. Schirrmeister, Joos Behncke, Lukas D. J. Fiederer, Andreas Schulze-Bonhage, Petr Marusič, Wolfram Burgard, Tonio Ball

    Abstract: Deep learning techniques have revolutionized the field of machine learning and were recently successfully applied to various classification problems in noninvasive electroencephalography (EEG). However, these methods were so far only rarely evaluated for use in intracranial EEG. We employed convolutional neural networks (CNNs) to classify and characterize the error-related brain response as measur…

    Submitted 2 November, 2018; v1 submitted 4 May, 2018; originally announced May 2018.

    Comments: 8 pages, 6 figures. Accepted at the 2018 IEEE International Conference on Systems, Man, and Cybernetics (SMC2018)

    ACM Class: I.2.6; I.2.8; I.5.0; J.2; J.3

  8. arXiv:1604.03912  [pdf, other]

    cs.AI cs.LG eess.SY stat.ML

    Inverse Reinforcement Learning with Simultaneous Estimation of Rewards and Dynamics

    Authors: Michael Herman, Tobias Gindele, Jörg Wagner, Felix Schmitt, Wolfram Burgard

    Abstract: Inverse Reinforcement Learning (IRL) describes the problem of learning an unknown reward function of a Markov Decision Process (MDP) from observed behavior of an agent. Since the agent's behavior originates in its policy and MDP policies depend on both the stochastic system dynamics as well as the reward function, the solution of the inverse problem is significantly influenced by both. Current IRL…

    Submitted 13 April, 2016; originally announced April 2016.

    Comments: accepted to appear in AISTATS 2016