Showing 1–14 of 14 results for author: Janner, M

Searching in archive cs.
  1. arXiv:2312.02682

    cs.LG cs.AI cs.RO

    H-GAP: Humanoid Control with a Generalist Planner

    Authors: Zhengyao Jiang, Yingchen Xu, Nolan Wagener, Yicheng Luo, Michael Janner, Edward Grefenstette, Tim Rocktäschel, Yuandong Tian

    Abstract: Humanoid control is an important research challenge offering avenues for integration into human-centric infrastructures and enabling physics-driven humanoid animations. The daunting challenges in this field stem from the difficulty of optimizing in high-dimensional action spaces and the instability introduced by the bipedal morphology of humanoids. However, the extensive collection of human motion…

    Submitted 5 December, 2023; originally announced December 2023.

    Comments: 18 pages including appendix, 4 figures

  2. arXiv:2306.08810

    cs.LG cs.AI

    Deep Generative Models for Decision-Making and Control

    Authors: Michael Janner

    Abstract: Deep model-based reinforcement learning methods offer a conceptually simple approach to the decision-making and control problem: use learning for the purpose of estimating an approximate dynamics model, and offload the rest of the work to classical trajectory optimization. However, this combination has a number of empirical shortcomings, limiting the usefulness of model-based methods in practice.…

    Submitted 8 July, 2023; v1 submitted 14 June, 2023; originally announced June 2023.

    Comments: UC Berkeley PhD thesis; supersedes arXiv:2010.14496, arXiv:2106.02039, and arXiv:2205.09991

  3. arXiv:2305.13301

    cs.LG cs.AI cs.CV

    Training Diffusion Models with Reinforcement Learning

    Authors: Kevin Black, Michael Janner, Yilun Du, Ilya Kostrikov, Sergey Levine

    Abstract: Diffusion models are a class of flexible generative models trained with an approximation to the log-likelihood objective. However, most use cases of diffusion models are not concerned with likelihoods, but instead with downstream objectives such as human-perceived image quality or drug effectiveness. In this paper, we investigate reinforcement learning methods for directly optimizing diffusion mod…

    Submitted 4 January, 2024; v1 submitted 22 May, 2023; originally announced May 2023.

    Comments: 23 pages, 16 figures

  4. arXiv:2304.10573

    cs.LG cs.AI

    IDQL: Implicit Q-Learning as an Actor-Critic Method with Diffusion Policies

    Authors: Philippe Hansen-Estruch, Ilya Kostrikov, Michael Janner, Jakub Grudzien Kuba, Sergey Levine

    Abstract: Effective offline RL methods require properly handling out-of-distribution actions. Implicit Q-learning (IQL) addresses this by training a Q-function using only dataset actions through a modified Bellman backup. However, it is unclear which policy actually attains the values represented by this implicitly trained Q-function. In this paper, we reinterpret IQL as an actor-critic method by generalizi…

    Submitted 19 May, 2023; v1 submitted 20 April, 2023; originally announced April 2023.

    Comments: 9 Pages, 4 Figures, 3 Tables

  5. arXiv:2208.10291

    cs.LG

    Efficient Planning in a Compact Latent Action Space

    Authors: Zhengyao Jiang, Tianjun Zhang, Michael Janner, Yueying Li, Tim Rocktäschel, Edward Grefenstette, Yuandong Tian

    Abstract: Planning-based reinforcement learning has shown strong performance in tasks in discrete and low-dimensional continuous action spaces. However, planning usually brings significant computational overhead for decision-making, and scaling such methods to high-dimensional action spaces remains challenging. To advance efficient planning for high-dimensional continuous control, we propose Trajectory Auto…

    Submitted 24 January, 2023; v1 submitted 22 August, 2022; originally announced August 2022.

    Comments: Accepted by ICLR2023. Code available at https://github.com/ZhengyaoJiang/latentplan

  6. arXiv:2206.10524

    cs.LG eess.SY

    Lyapunov Density Models: Constraining Distribution Shift in Learning-Based Control

    Authors: Katie Kang, Paula Gradu, Jason Choi, Michael Janner, Claire Tomlin, Sergey Levine

    Abstract: Learned models and policies can generalize effectively when evaluated within the distribution of the training data, but can produce unpredictable and erroneous outputs on out-of-distribution inputs. In order to avoid distribution shift when deploying learning-based control algorithms, we seek a mechanism to constrain the agent to states and actions that resemble those that it was trained on. In co…

    Submitted 21 June, 2022; originally announced June 2022.

  7. arXiv:2205.09991

    cs.LG cs.AI

    Planning with Diffusion for Flexible Behavior Synthesis

    Authors: Michael Janner, Yilun Du, Joshua B. Tenenbaum, Sergey Levine

    Abstract: Model-based reinforcement learning methods often use learning only for the purpose of estimating an approximate dynamics model, offloading the rest of the decision-making work to classical trajectory optimizers. While conceptually simple, this combination has a number of empirical shortcomings, suggesting that learned models may not be well-suited to standard trajectory optimization. In this paper…

    Submitted 20 December, 2022; v1 submitted 20 May, 2022; originally announced May 2022.

    Comments: ICML 2022 (long talk). Project page and code at https://diffusion-planning.github.io/

  8. arXiv:2106.02039

    cs.LG cs.AI

    Offline Reinforcement Learning as One Big Sequence Modeling Problem

    Authors: Michael Janner, Qiyang Li, Sergey Levine

    Abstract: Reinforcement learning (RL) is typically concerned with estimating stationary policies or single-step models, leveraging the Markov property to factorize problems in time. However, we can also view RL as a generic sequence modeling problem, with the goal being to produce a sequence of actions that leads to a sequence of high rewards. Viewed in this way, it is tempting to consider whether high-capa…

    Submitted 28 November, 2021; v1 submitted 3 June, 2021; originally announced June 2021.

    Comments: NeurIPS 2021 (spotlight). Project page and code at: https://trajectory-transformer.github.io/

  9. arXiv:2010.14496

    cs.LG cs.AI

    Generative Temporal Difference Learning for Infinite-Horizon Prediction

    Authors: Michael Janner, Igor Mordatch, Sergey Levine

    Abstract: We introduce the $γ$-model, a predictive model of environment dynamics with an infinite probabilistic horizon. Replacing standard single-step models with $γ$-models leads to generalizations of the procedures central to model-based control, including the model rollout and model-based value estimation. The $γ$-model, trained with a generative reinterpretation of temporal difference learning, is a na…

    Submitted 28 November, 2021; v1 submitted 27 October, 2020; originally announced October 2020.

    Comments: NeurIPS 2020. Project page at: https://gammamodels.github.io/

  10. arXiv:1910.12827

    cs.LG cs.CV cs.NE stat.ML

    Entity Abstraction in Visual Model-Based Reinforcement Learning

    Authors: Rishi Veerapaneni, John D. Co-Reyes, Michael Chang, Michael Janner, Chelsea Finn, Jiajun Wu, Joshua B. Tenenbaum, Sergey Levine

    Abstract: This paper tests the hypothesis that modeling a scene in terms of entities and their local interactions, as opposed to modeling the scene globally, provides a significant benefit in generalizing to physical tasks in a combinatorial space the learner has not encountered before. We present object-centric perception, prediction, and planning (OP3), which to the best of our knowledge is the first full…

    Submitted 6 May, 2020; v1 submitted 28 October, 2019; originally announced October 2019.

    Comments: Accepted at CoRL 2019

  11. arXiv:1906.08253

    cs.LG cs.AI stat.ML

    When to Trust Your Model: Model-Based Policy Optimization

    Authors: Michael Janner, Justin Fu, Marvin Zhang, Sergey Levine

    Abstract: Designing effective model-based reinforcement learning algorithms is difficult because the ease of data generation must be weighed against the bias of model-generated data. In this paper, we study the role of model usage in policy optimization both theoretically and empirically. We first formulate and analyze a model-based reinforcement learning algorithm with a guarantee of monotonic improvement…

    Submitted 28 November, 2021; v1 submitted 19 June, 2019; originally announced June 2019.

    Comments: NeurIPS 2019. Code at https://github.com/JannerM/mbpo, project page at: https://jannerm.github.io/mbpo-www/

  12. arXiv:1812.10972

    cs.LG cs.AI cs.CV cs.RO stat.ML

    Reasoning About Physical Interactions with Object-Oriented Prediction and Planning

    Authors: Michael Janner, Sergey Levine, William T. Freeman, Joshua B. Tenenbaum, Chelsea Finn, Jiajun Wu

    Abstract: Object-based factorizations provide a useful level of abstraction for interacting with the world. Building explicit object representations, however, often requires supervisory signals that are difficult to obtain in practice. We present a paradigm for learning object-centric representations for physical scene understanding without direct supervision of object properties. Our model, Object-Oriented…

    Submitted 7 January, 2019; v1 submitted 28 December, 2018; originally announced December 2018.

    Comments: ICLR 2019, project page: https://people.eecs.berkeley.edu/~janner/o2p2/

  13. arXiv:1711.03678

    cs.CV cs.AI cs.GR cs.LG

    Self-Supervised Intrinsic Image Decomposition

    Authors: Michael Janner, Jiajun Wu, Tejas D. Kulkarni, Ilker Yildirim, Joshua B. Tenenbaum

    Abstract: Intrinsic decomposition from a single image is a highly challenging task, due to its inherent ambiguity and the scarcity of training data. In contrast to traditional fully supervised learning approaches, in this paper we propose learning intrinsic image decomposition by explaining the input image. Our model, the Rendered Intrinsics Network (RIN), joins together an image decomposition pipeline, whi…

    Submitted 5 February, 2018; v1 submitted 9 November, 2017; originally announced November 2017.

    Comments: NIPS 2017 camera-ready version, project page: http://rin.csail.mit.edu/

  14. arXiv:1707.03938

    cs.CL cs.AI cs.LG

    Representation Learning for Grounded Spatial Reasoning

    Authors: Michael Janner, Karthik Narasimhan, Regina Barzilay

    Abstract: The interpretation of spatial references is highly contextual, requiring joint inference over both language and the environment. We consider the task of spatial reasoning in a simulated environment, where an agent can act and receive rewards. The proposed model learns a representation of the world steered by instruction text. This design allows for precise alignment of local neighborhoods with cor…

    Submitted 10 November, 2017; v1 submitted 12 July, 2017; originally announced July 2017.

    Comments: Accepted to TACL 2017, code: https://github.com/jannerm/spatial-reasoning