Showing 1–9 of 9 results for author: Talvitie, E

  1. arXiv:2406.16006

    cs.LG cs.AI

    Bounding-Box Inference for Error-Aware Model-Based Reinforcement Learning

    Authors: Erin J. Talvitie, Zilei Shao, Huiying Li, **ghan Hu, Jacob Boerma, Rory Zhao, Xintong Wang

    Abstract: In model-based reinforcement learning, simulated experiences from the learned model are often treated as equivalent to experience from the real environment. However, when the model is inaccurate, it can catastrophically interfere with policy learning. Alternatively, the agent might learn about the model's accuracy and selectively use it only when it can provide reliable predictions. We empirically…

    Submitted 23 June, 2024; originally announced June 2024.

    Comments: To appear: Reinforcement Learning Conference (RLC), 2024

  2. arXiv:2007.02418

    cs.LG cs.AI stat.ML

    Selective Dyna-style Planning Under Limited Model Capacity

    Authors: Zaheer Abbas, Samuel Sokota, Erin J. Talvitie, Martha White

    Abstract: In model-based reinforcement learning, planning with an imperfect model of the environment has the potential to harm learning progress. But even when a model is imperfect, it may still contain information that is useful for planning. In this paper, we investigate the idea of using an imperfect model selectively. The agent should plan in parts of the state space where the model would be helpful but…

    Submitted 7 March, 2021; v1 submitted 5 July, 2020; originally announced July 2020.

    Comments: Accepted at ICML 2020

  3. arXiv:2006.04363

    cs.LG cs.AI stat.ML

    Hallucinating Value: A Pitfall of Dyna-style Planning with Imperfect Environment Models

    Authors: Taher Jafferjee, Ehsan Imani, Erin Talvitie, Martha White, Michael Bowling

    Abstract: Dyna-style reinforcement learning (RL) agents improve sample efficiency over model-free RL agents by updating the value function with simulated experience generated by an environment model. However, it is often difficult to learn accurate models of environment dynamics, and even small errors may result in failure of Dyna agents. In this paper, we investigate one type of model error: hallucinated s…

    Submitted 8 June, 2020; originally announced June 2020.

    Comments: 9 pages, 7 figures

  4. arXiv:1806.01825

    cs.AI

    The Effect of Planning Shape on Dyna-style Planning in High-dimensional State Spaces

    Authors: G. Zacharias Holland, Erin J. Talvitie, Michael Bowling

    Abstract: Dyna is a fundamental approach to model-based reinforcement learning (MBRL) that interleaves planning, acting, and learning in an online setting. In the most typical application of Dyna, the dynamics model is used to generate one-step transitions from selected start states from the agent's history, which are used to update the agent's value function or policy as if they were real experiences. In t…

    Submitted 28 March, 2019; v1 submitted 5 June, 2018; originally announced June 2018.

  5. arXiv:1801.09624

    cs.LG

    Learning the Reward Function for a Misspecified Model

    Authors: Erik Talvitie

    Abstract: In model-based reinforcement learning it is typical to decouple the problems of learning the dynamics model and learning the reward function. However, when the dynamics model is flawed, it may generate erroneous states that would never occur in the true environment. It is not clear a priori what value the reward function should assign to such states. This paper presents a novel error bound that ac…

    Submitted 8 June, 2018; v1 submitted 29 January, 2018; originally announced January 2018.

    Comments: To appear at ICML 2018

    ACM Class: I.2.6; I.2.8

  6. arXiv:1709.06009

    cs.LG

    Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents

    Authors: Marlos C. Machado, Marc G. Bellemare, Erik Talvitie, Joel Veness, Matthew Hausknecht, Michael Bowling

    Abstract: The Arcade Learning Environment (ALE) is an evaluation platform that poses the challenge of building AI agents with general competency across dozens of Atari 2600 games. It supports a variety of different problem settings and it has been receiving increasing attention from the scientific community, leading to some high-profile success stories such as the much publicized Deep Q-Networks (DQN). In t…

    Submitted 30 November, 2017; v1 submitted 18 September, 2017; originally announced September 2017.

  7. arXiv:1612.06018

    cs.LG cs.AI

    Self-Correcting Models for Model-Based Reinforcement Learning

    Authors: Erik Talvitie

    Abstract: When an agent cannot represent a perfectly accurate model of its environment's dynamics, model-based reinforcement learning (MBRL) can fail catastrophically. Planning involves composing the predictions of the model; when flawed predictions are composed, even minor errors can compound and render the model useless for planning. Hallucinated Replay (Talvitie 2014) trains the model to "correct" itself…

    Submitted 26 July, 2017; v1 submitted 18 December, 2016; originally announced December 2016.

    Comments: Original paper appeared in Proceedings of the 31st AAAI Conference on Artificial Intelligence, 2017. This version incorporates the appendix into the document (rather than as supplementary material), corrects a minor error in Lemma 1, and fixes some typos

    ACM Class: I.2.6; I.2.8

    Journal ref: Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2597-2603 (2017)

  8. arXiv:1512.01563

    cs.LG

    State of the Art Control of Atari Games Using Shallow Reinforcement Learning

    Authors: Yitao Liang, Marlos C. Machado, Erik Talvitie, Michael Bowling

    Abstract: The recently introduced Deep Q-Networks (DQN) algorithm has gained attention as one of the first successful combinations of deep neural networks and reinforcement learning. Its promise was demonstrated in the Arcade Learning Environment (ALE), a challenging framework composed of dozens of Atari 2600 games used to evaluate general competency in AI. It achieved dramatically better results than earli…

    Submitted 21 April, 2016; v1 submitted 4 December, 2015; originally announced December 2015.

    Comments: A shorter version of this paper appears in the Proceedings of the 15th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2016)

  9. arXiv:1401.3870

    cs.LG cs.AI stat.ML

    Learning to Make Predictions In Partially Observable Environments Without a Generative Model

    Authors: Erik Talvitie, Satinder Singh

    Abstract: When faced with the problem of learning a model of a high-dimensional environment, a common approach is to limit the model to make only a restricted set of predictions, thereby simplifying the learning problem. These partial models may be directly useful for making decisions or may be combined together to form a more complete, structured model. However, in partially observable (non-Markov) environ…

    Submitted 16 January, 2014; originally announced January 2014.

    Journal ref: Journal Of Artificial Intelligence Research, Volume 42, pages 353-392, 2011