Showing 1–36 of 36 results for author: Hafner, D

Searching in archive cs.
  1. arXiv:2405.07515

    cs.RO cs.AI cs.LG

    OpenBot-Fleet: A System for Collective Learning with Real Robots

    Authors: Matthias Müller, Samarth Brahmbhatt, Ankur Deka, Quentin Leboutet, David Hafner, Vladlen Koltun

    Abstract: We introduce OpenBot-Fleet, a comprehensive open-source cloud robotics system for navigation. OpenBot-Fleet uses smartphones for sensing, local compute and communication, Google Firebase for secure cloud storage and off-board compute, and a robust yet low-cost wheeled robot to act in real-world environments. The robots collect task data and upload it to the cloud where navigation policies can be le…

    Submitted 13 May, 2024; originally announced May 2024.

    Comments: Accepted at ICRA'24

  2. arXiv:2311.12579

    physics.geo-ph cs.LG

    Machine-Guided Discovery of a Real-World Rogue Wave Model

    Authors: Dion Häfner, Johannes Gemmrich, Markus Jochum

    Abstract: Big data and large-scale machine learning have had a profound impact on science and engineering, particularly in fields focused on forecasting and prediction. Yet, it is still not clear how we can use the superior pattern matching abilities of machine learning models for scientific discovery. This is because the goals of machine learning and science are generally not aligned. In addition to being…

    Submitted 21 November, 2023; originally announced November 2023.

    Journal ref: Proceedings of the National Academy of Sciences (2023), 120(48), e2306275120

  3. arXiv:2308.01399

    cs.CL cs.AI cs.LG

    Learning to Model the World with Language

    Authors: Jessy Lin, Yuqing Du, Olivia Watkins, Danijar Hafner, Pieter Abbeel, Dan Klein, Anca Dragan

    Abstract: To interact with humans and act in the world, agents need to understand the range of language that people use and relate it to the visual world. While current agents can learn to execute simple language instructions, we aim to build agents that leverage diverse language -- language like "this button turns on the TV" or "I put the bowls away" -- that conveys general knowledge, describes the state o…

    Submitted 31 May, 2024; v1 submitted 31 July, 2023; originally announced August 2023.

    Comments: ICML 2024. Website: https://dynalang.github.io/

  4. arXiv:2305.14343

    cs.LG cs.AI cs.CV

    Video Prediction Models as Rewards for Reinforcement Learning

    Authors: Alejandro Escontrela, Ademi Adeniji, Wilson Yan, Ajay Jain, Xue Bin Peng, Ken Goldberg, Youngwoon Lee, Danijar Hafner, Pieter Abbeel

    Abstract: Specifying reward signals that allow agents to learn complex behaviors is a long-standing challenge in reinforcement learning. A promising approach is to extract preferences for behaviors from unlabeled videos, which are widely available on the internet. We present Video Prediction Rewards (VIPER), an algorithm that leverages pretrained video prediction models as action-free reward signals for rei…

    Submitted 30 May, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: 22 pages, 18 figures, 4 tables. under review

  5. arXiv:2301.04104

    cs.AI cs.LG stat.ML

    Mastering Diverse Domains through World Models

    Authors: Danijar Hafner, Jurgis Pasukonis, Jimmy Ba, Timothy Lillicrap

    Abstract: Developing a general algorithm that learns to solve tasks across a wide range of applications has been a fundamental challenge in artificial intelligence. Although current reinforcement learning algorithms can be readily applied to tasks similar to what they have been developed for, configuring them for new application domains requires significant human expertise and experimentation. We present Dr…

    Submitted 17 April, 2024; v1 submitted 10 January, 2023; originally announced January 2023.

    Comments: Website: https://danijar.com/dreamerv3

  6. arXiv:2210.13383

    cs.AI cs.LG

    Evaluating Long-Term Memory in 3D Mazes

    Authors: Jurgis Pasukonis, Timothy Lillicrap, Danijar Hafner

    Abstract: Intelligent agents need to remember salient information to reason in partially-observed environments. For example, agents with a first-person view should remember the positions of relevant objects even if they go out of view. Similarly, to effectively navigate through rooms agents need to remember the floor plan of how rooms are connected. However, most benchmark tasks in reinforcement learning do…

    Submitted 24 October, 2022; originally announced October 2022.

    Comments: Project website: https://github.com/jurgisp/memory-maze

  7. arXiv:2210.11698

    cs.LG cs.AI

    Learning Robust Dynamics through Variational Sparse Gating

    Authors: Arnav Kumar Jain, Shivakanth Sujit, Shruti Joshi, Vincent Michalski, Danijar Hafner, Samira Ebrahimi-Kahou

    Abstract: Learning world models from their sensory inputs enables agents to plan for actions by imagining their future outcomes. World models have previously been shown to improve sample-efficiency in simulated environments with few objects, but have not yet been applied successfully to environments with many objects. In environments with many objects, often only a small number of them are moving or interac…

    Submitted 20 October, 2022; originally announced October 2022.

  8. arXiv:2210.02396

    cs.CV cs.AI cs.LG

    Temporally Consistent Transformers for Video Generation

    Authors: Wilson Yan, Danijar Hafner, Stephen James, Pieter Abbeel

    Abstract: To generate accurate videos, algorithms have to understand the spatial and temporal dependencies in the world. Current algorithms enable accurate predictions over short horizons but tend to suffer from temporal inconsistencies. When generated content goes out of view and is later revisited, the model invents different content instead. Despite this severe limitation, no established benchmarks on co…

    Submitted 31 May, 2023; v1 submitted 5 October, 2022; originally announced October 2022.

    Comments: Project website: https://wilson1yan.github.io/teco

  9. arXiv:2206.14244

    cs.RO cs.AI cs.CV cs.LG

    Masked World Models for Visual Control

    Authors: Younggyo Seo, Danijar Hafner, Hao Liu, Fangchen Liu, Stephen James, Kimin Lee, Pieter Abbeel

    Abstract: Visual model-based reinforcement learning (RL) has the potential to enable sample-efficient robot learning from visual observations. Yet the current approaches typically train a single model end-to-end for learning both visual representations and dynamics, making it difficult to accurately model the interaction between robots and small objects. In this work, we introduce a visual model-based RL fr…

    Submitted 27 May, 2023; v1 submitted 28 June, 2022; originally announced June 2022.

    Comments: Project website: https://sites.google.com/view/mwm-rl. Accepted to CoRL 2022

  10. arXiv:2206.14176

    cs.RO cs.AI cs.LG

    DayDreamer: World Models for Physical Robot Learning

    Authors: Philipp Wu, Alejandro Escontrela, Danijar Hafner, Ken Goldberg, Pieter Abbeel

    Abstract: To solve tasks in complex environments, robots need to learn from experience. Deep reinforcement learning is a common approach to robot learning but requires a large amount of trial and error to learn, limiting its deployment in the physical world. As a consequence, many advances in robot learning rely on simulators. On the other hand, learning inside of simulators fails to capture the complexity…

    Submitted 28 June, 2022; originally announced June 2022.

    Comments: Website: https://danijar.com/daydreamer

  11. arXiv:2206.04114

    cs.AI cs.LG cs.RO stat.ML

    Deep Hierarchical Planning from Pixels

    Authors: Danijar Hafner, Kuang-Huei Lee, Ian Fischer, Pieter Abbeel

    Abstract: Intelligent agents need to select long sequences of actions to solve complex tasks. While humans easily break down tasks into subgoals and reach them through millions of muscle commands, current artificial intelligence is limited to tasks with horizons of a few hundred decisions, despite large compute budgets. Research on hierarchical reinforcement learning aims to overcome this limitation but has…

    Submitted 8 June, 2022; originally announced June 2022.

    Comments: Website: https://danijar.com/director

  12. arXiv:2112.03899

    cs.LG cs.AI

    Information is Power: Intrinsic Control via Information Capture

    Authors: Nicholas Rhinehart, Jenny Wang, Glen Berseth, John D. Co-Reyes, Danijar Hafner, Chelsea Finn, Sergey Levine

    Abstract: Humans and animals explore their environment and acquire useful skills even in the absence of clear goals, exhibiting intrinsic motivation. The study of intrinsic motivation in artificial agents is concerned with the following question: what is a good general-purpose objective for an agent? We study this question in dynamic partially-observed environments, and argue that a compact and general lear…

    Submitted 7 December, 2021; originally announced December 2021.

    Comments: NeurIPS 2021

  13. arXiv:2110.09514

    cs.LG cs.AI cs.CV cs.RO stat.ML

    Discovering and Achieving Goals via World Models

    Authors: Russell Mendonca, Oleh Rybkin, Kostas Daniilidis, Danijar Hafner, Deepak Pathak

    Abstract: How can artificial agents learn to solve many diverse tasks in complex visual environments in the absence of any supervision? We decompose this question into two problems: discovering new goals and learning to reliably achieve them. We introduce Latent Explorer Achiever (LEXA), a unified solution to these that learns a world model from image inputs and uses it to train an explorer and an achiever…

    Submitted 18 October, 2021; originally announced October 2021.

    Comments: NeurIPS 2021. First two authors contributed equally. Website at https://orybkin.github.io/lexa/

  14. arXiv:2109.06780

    cs.AI cs.LG

    Benchmarking the Spectrum of Agent Capabilities

    Authors: Danijar Hafner

    Abstract: Evaluating the general abilities of intelligent agents requires complex simulation environments. Existing benchmarks typically evaluate only one narrow task per environment, requiring researchers to perform expensive training runs on many different environments. We introduce Crafter, an open world survival game with visual inputs that evaluates a wide range of general abilities within a single env…

    Submitted 12 February, 2022; v1 submitted 14 September, 2021; originally announced September 2021.

    Comments: Published at ICLR 2022. Website: https://danijar.com/crafter

  15. arXiv:2105.08089

    cs.DL cs.AI

    A Measure of Research Taste

    Authors: Vladlen Koltun, David Hafner

    Abstract: Researchers are often evaluated by citation-based metrics. Such metrics can inform hiring, promotion, and funding decisions. Concerns have been expressed that popular citation-based metrics incentivize researchers to maximize the production of publications. Such incentives may not be optimal for scientific progress. Here we present a citation-based measure that rewards both productivity and taste:…

    Submitted 17 May, 2021; originally announced May 2021.

    Comments: Results can be explored at https://cap-measure.org/

  16. arXiv:2102.09532

    cs.CV cs.AI cs.LG

    Clockwork Variational Autoencoders

    Authors: Vaibhav Saxena, Jimmy Ba, Danijar Hafner

    Abstract: Deep learning has enabled algorithms to generate realistic images. However, accurately predicting long video sequences requires understanding long-term dependencies and remains an open challenge. While existing video prediction models succeed at generating sharp images, they tend to fail at accurately predicting far into the future. We introduce the Clockwork VAE (CW-VAE), a video prediction model…

    Submitted 20 February, 2021; v1 submitted 18 February, 2021; originally announced February 2021.

    Comments: 17 pages, 12 figures, 4 tables

  17. The h-index is no longer an effective correlate of scientific reputation

    Authors: Vladlen Koltun, David Hafner

    Abstract: The impact of individual scientists is commonly quantified using citation-based measures. The most common such measure is the h-index. A scientist's h-index affects hiring, promotion, and funding decisions, and thus shapes the progress of science. Here we report a large-scale study of scientometric measures, analyzing millions of articles and hundreds of millions of citations across four scientifi…

    Submitted 5 February, 2021; originally announced February 2021.

    Comments: An interactive visualization of our work can be found at https://h-frac.org

  18. arXiv:2012.11538

    cs.LG cs.AI cs.RO

    Evaluating Agents without Rewards

    Authors: Brendon Matusch, Jimmy Ba, Danijar Hafner

    Abstract: Reinforcement learning has enabled agents to solve challenging tasks in unknown environments. However, manually crafting reward functions can be time consuming, expensive, and error prone to human error. Competing objectives have been proposed for agents to learn without external supervision, but it has been unclear how well they reflect task rewards or human behavior. To accelerate the developmen…

    Submitted 9 February, 2021; v1 submitted 21 December, 2020; originally announced December 2020.

    Comments: 15 pages, 6 figures, 5 tables

  19. arXiv:2012.04603

    cs.LG

    Models, Pixels, and Rewards: Evaluating Design Trade-offs in Visual Model-Based Reinforcement Learning

    Authors: Mohammad Babaeizadeh, Mohammad Taghi Saffar, Danijar Hafner, Harini Kannan, Chelsea Finn, Sergey Levine, Dumitru Erhan

    Abstract: Model-based reinforcement learning (MBRL) methods have shown strong sample efficiency and performance across a variety of tasks, including when faced with high-dimensional visual observations. These methods learn to predict the environment dynamics and expected reward from interaction and use this predictive model to plan and perform the task. However, MBRL methods vary in their fundamental design…

    Submitted 8 December, 2020; originally announced December 2020.

  20. arXiv:2011.13897

    cs.LG cs.AI cs.RO stat.ML

    Latent Skill Planning for Exploration and Transfer

    Authors: Kevin Xie, Homanga Bharadhwaj, Danijar Hafner, Animesh Garg, Florian Shkurti

    Abstract: To quickly solve new tasks in complex environments, intelligent agents need to build up reusable knowledge. For example, a learned world model captures knowledge about the environment that applies to new tasks. Similarly, skills capture general behaviors that can apply to new tasks. In this paper, we investigate how these two approaches can be integrated into a single reinforcement learning agent.…

    Submitted 2 May, 2021; v1 submitted 27 November, 2020; originally announced November 2020.

    Comments: First two authors contributed equally. Published as a conference paper in ICLR 2021

  21. arXiv:2010.02193

    cs.LG cs.AI stat.ML

    Mastering Atari with Discrete World Models

    Authors: Danijar Hafner, Timothy Lillicrap, Mohammad Norouzi, Jimmy Ba

    Abstract: Intelligent agents need to generalize from past experience to achieve goals in complex environments. World models facilitate such generalization and allow learning behaviors from imagined outcomes to increase sample-efficiency. While learning world models from image inputs has recently become feasible for some tasks, modeling Atari games accurately enough to derive successful behaviors has remaine…

    Submitted 12 February, 2022; v1 submitted 5 October, 2020; originally announced October 2020.

    Comments: Published at ICLR 2021. Website: https://danijar.com/dreamerv2

  22. arXiv:2009.01791

    cs.AI cs.IT cs.LG stat.ML

    Action and Perception as Divergence Minimization

    Authors: Danijar Hafner, Pedro A. Ortega, Jimmy Ba, Thomas Parr, Karl Friston, Nicolas Heess

    Abstract: To learn directed behaviors in complex environments, intelligent agents need to optimize objective functions. Various objectives are known for designing artificial agents, including task rewards and intrinsic motivation. However, it is unclear how the known objectives relate to each other, which objectives remain yet to be discovered, and which objectives better describe the behavior of humans. We…

    Submitted 12 February, 2022; v1 submitted 3 September, 2020; originally announced September 2020.

    Comments: Website: https://danijar.com/apd

  23. arXiv:2006.04120

    q-bio.NC cs.AI

    Sophisticated Inference

    Authors: Karl Friston, Lancelot Da Costa, Danijar Hafner, Casper Hesp, Thomas Parr

    Abstract: Active inference offers a first principle account of sentient behaviour, from which special and important cases can be derived, e.g., reinforcement learning, active learning, Bayes optimal inference, Bayes optimal design, etc. Active inference resolves the exploitation-exploration dilemma in relation to prior preferences, by placing information gain on the same footing as reward or value. In brief…

    Submitted 7 June, 2020; originally announced June 2020.

  24. arXiv:2005.05960

    cs.LG cs.AI cs.CV cs.NE cs.RO stat.ML

    Planning to Explore via Self-Supervised World Models

    Authors: Ramanan Sekar, Oleh Rybkin, Kostas Daniilidis, Pieter Abbeel, Danijar Hafner, Deepak Pathak

    Abstract: Reinforcement learning allows solving complex tasks, however, the learning tends to be task-specific and the sample efficiency remains a challenge. We present Plan2Explore, a self-supervised reinforcement learning agent that tackles both these challenges through a new approach to self-supervised exploration and fast adaptation to new tasks, which need not be known during exploration. During explor…

    Submitted 30 June, 2020; v1 submitted 12 May, 2020; originally announced May 2020.

    Comments: Accepted at ICML 2020. Videos and code at https://ramanans1.github.io/plan2explore/

  25. arXiv:1912.01603

    cs.LG cs.AI cs.RO

    Dream to Control: Learning Behaviors by Latent Imagination

    Authors: Danijar Hafner, Timothy Lillicrap, Jimmy Ba, Mohammad Norouzi

    Abstract: Learned world models summarize an agent's experience to facilitate learning complex behaviors. While learning world models from high-dimensional sensory inputs is becoming feasible through deep learning, there are many potential ways for deriving behaviors from them. We present Dreamer, a reinforcement learning agent that solves long-horizon tasks from images purely by latent imagination. We effic…

    Submitted 17 March, 2020; v1 submitted 3 December, 2019; originally announced December 2019.

    Comments: 9 pages, 12 figures

  26. arXiv:1907.01341

    cs.CV

    Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer

    Authors: René Ranftl, Katrin Lasinger, David Hafner, Konrad Schindler, Vladlen Koltun

    Abstract: The success of monocular depth estimation relies on large and diverse training sets. Due to the challenges associated with acquiring dense ground-truth depth across different environments at scale, a number of datasets with distinct characteristics and biases have emerged. We develop tools that enable mixing multiple datasets during training, even if their annotations are incompatible. In particul…

    Submitted 25 August, 2020; v1 submitted 2 July, 2019; originally announced July 2019.

    Comments: To appear in TPAMI (accepted August 2020)

  27. arXiv:1812.03973

    cs.LG cs.PL stat.ML

    Bayesian Layers: A Module for Neural Network Uncertainty

    Authors: Dustin Tran, Michael W. Dusenberry, Mark van der Wilk, Danijar Hafner

    Abstract: We describe Bayesian Layers, a module designed for fast experimentation with neural network uncertainty. It extends neural network libraries with drop-in replacements for common layers. This enables composition via a unified abstraction over deterministic and stochastic functions and allows for scalability via the underlying system. These layers capture uncertainty over weights (Bayesian neural ne…

    Submitted 5 March, 2019; v1 submitted 10 December, 2018; originally announced December 2018.

    Comments: Code available at https://github.com/tensorflow/tensor2tensor

  28. arXiv:1812.00025

    cs.LG cs.AI

    Modulated Policy Hierarchies

    Authors: Alexander Pashevich, Danijar Hafner, James Davidson, Rahul Sukthankar, Cordelia Schmid

    Abstract: Solving tasks with sparse rewards is a main challenge in reinforcement learning. While hierarchical controllers are an intuitive approach to this problem, current methods often require manual reward shaping, alternating training phases, or manually defined sub tasks. We introduce modulated policy hierarchies (MPH), that can learn end-to-end to solve tasks from sparse rewards. To achieve this, we s…

    Submitted 30 November, 2018; originally announced December 2018.

    Comments: 8 pages, 5 figures

  29. arXiv:1811.04551

    cs.LG cs.AI stat.ML

    Learning Latent Dynamics for Planning from Pixels

    Authors: Danijar Hafner, Timothy Lillicrap, Ian Fischer, Ruben Villegas, David Ha, Honglak Lee, James Davidson

    Abstract: Planning has been very successful for control tasks with known environment dynamics. To leverage planning in unknown environments, the agent needs to learn the dynamics from interactions with the world. However, learning dynamics models that are accurate enough for planning has been a long-standing challenge, especially in image-based domains. We propose the Deep Planning Network (PlaNet), a purel…

    Submitted 4 June, 2019; v1 submitted 11 November, 2018; originally announced November 2018.

    Comments: 20 pages, 12 figures, 1 table

  30. arXiv:1807.09289

    stat.ML cs.LG

    Noise Contrastive Priors for Functional Uncertainty

    Authors: Danijar Hafner, Dustin Tran, Timothy Lillicrap, Alex Irpan, James Davidson

    Abstract: Obtaining reliable uncertainty estimates of neural network predictions is a long standing challenge. Bayesian neural networks have been proposed as a solution, but it remains open how to specify their prior. In particular, the common practice of an independent normal prior in weight space imposes relatively weak constraints on the function posterior, allowing it to generalize in unforeseen ways on…

    Submitted 30 June, 2019; v1 submitted 24 July, 2018; originally announced July 2018.

    Comments: 12 pages, 6 figures

  31. arXiv:1807.01675

    cs.LG cs.AI stat.ML

    Sample-Efficient Reinforcement Learning with Stochastic Ensemble Value Expansion

    Authors: Jacob Buckman, Danijar Hafner, George Tucker, Eugene Brevdo, Honglak Lee

    Abstract: Integrating model-free and model-based approaches in reinforcement learning has the potential to achieve the high performance of model-free algorithms with low sample complexity. However, this is difficult because an imperfect dynamics model can degrade the performance of the learning algorithm, and in sufficiently complex environments, the dynamics model will almost always be imperfect. As a resu…

    Submitted 7 June, 2019; v1 submitted 4 July, 2018; originally announced July 2018.

    Journal ref: Advances in Neural Information Processing Systems, 2019 (pp. 8224-8234)

  32. arXiv:1804.10332

    cs.RO cs.AI

    Sim-to-Real: Learning Agile Locomotion For Quadruped Robots

    Authors: Jie Tan, Tingnan Zhang, Erwin Coumans, Atil Iscen, Yunfei Bai, Danijar Hafner, Steven Bohez, Vincent Vanhoucke

    Abstract: Designing agile locomotion for quadruped robots often requires extensive expertise and tedious manual tuning. In this paper, we present a system to automate this process by leveraging deep reinforcement learning techniques. Our system can learn quadruped locomotion from scratch using simple reward signals. In addition, users can provide an open loop reference to guide the learning process when mor…

    Submitted 16 May, 2018; v1 submitted 26 April, 2018; originally announced April 2018.

    Comments: Accompanying video: https://www.youtube.com/watch?v=lUZUr7jxoqM

  33. arXiv:1711.10327

    cs.IR cs.CL cs.LG stat.ML

    Generative Interest Estimation for Document Recommendations

    Authors: Danijar Hafner, Alexander Immer, Willi Raschkowski, Fabian Windheuser

    Abstract: Learning distributed representations of documents has pushed the state-of-the-art in several natural language processing tasks and was successfully applied to the field of recommender systems recently. In this paper, we propose a novel content-based recommender system based on learned representations and a generative model of user interest. Our method works as follows: First, we learn representati…

    Submitted 28 November, 2017; originally announced November 2017.

  34. arXiv:1709.02878

    cs.LG cs.AI

    TensorFlow Agents: Efficient Batched Reinforcement Learning in TensorFlow

    Authors: Danijar Hafner, James Davidson, Vincent Vanhoucke

    Abstract: We introduce TensorFlow Agents, an efficient infrastructure paradigm for building parallel reinforcement learning algorithms in TensorFlow. We simulate multiple environments in parallel, and group them to perform the neural network computation on a batch rather than individual observations. This allows the TensorFlow execution engine to parallelize computation, without the need for manual synchron…

    Submitted 31 October, 2018; v1 submitted 8 September, 2017; originally announced September 2017.

    Comments: White paper, 7 pages

  35. arXiv:1706.05744

    cs.LG cs.AI

    Learning Hierarchical Information Flow with Recurrent Neural Modules

    Authors: Danijar Hafner, Alex Irpan, James Davidson, Nicolas Heess

    Abstract: We propose ThalNet, a deep learning model inspired by neocortical communication via the thalamus. Our model consists of recurrent neural modules that send features through a routing center, endowing the modules with the flexibility to share features over multiple time steps. We show that our model learns to route information hierarchically, processing input data by a chain of modules. We observe c…

    Submitted 3 November, 2017; v1 submitted 18 June, 2017; originally announced June 2017.

    Comments: NIPS 2017

  36. arXiv:1610.02164

    cs.LG cs.AI

    Deep Reinforcement Learning From Raw Pixels in Doom

    Authors: Danijar Hafner

    Abstract: Using current reinforcement learning methods, it has recently become possible to learn to play unknown 3D games from raw pixels. In this work, we study the challenges that arise in such complex environments, and summarize current methods to approach these. We choose a task within the Doom game, that has not been approached yet. The goal for the agent is to fight enemies in a 3D world consisting of…

    Submitted 7 October, 2016; originally announced October 2016.

    Comments: Bachelor's thesis