Showing 1–50 of 62 results for author: Mordatch, I

  1. arXiv:2403.05530

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, et al. (1092 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February…

    Submitted 14 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  2. arXiv:2312.06585

    cs.LG

    Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models

    Authors: Avi Singh, John D. Co-Reyes, Rishabh Agarwal, Ankesh Anand, Piyush Patil, Xavier Garcia, Peter J. Liu, James Harrison, Jaehoon Lee, Kelvin Xu, Aaron Parisi, Abhishek Kumar, Alex Alemi, Alex Rizkowsky, Azade Nova, Ben Adlam, Bernd Bohnet, Gamaleldin Elsayed, Hanie Sedghi, Igor Mordatch, Isabelle Simpson, Izzeddin Gur, Jasper Snoek, Jeffrey Pennington, Jiri Hron, et al. (16 additional authors not shown)

    Abstract: Fine-tuning language models (LMs) on human-generated data remains a prevalent practice. However, the performance of such models is often limited by the quantity and diversity of high-quality human data. In this paper, we explore whether we can go beyond human data on tasks where we have access to scalar feedback, for example, on math problems where one can verify correctness. To do so, we investig…

    Submitted 17 April, 2024; v1 submitted 11 December, 2023; originally announced December 2023.

    Comments: Accepted to TMLR. Camera-ready version. First three authors contributed equally

  3. arXiv:2311.17894

    cond-mat.mes-hall cond-mat.mtrl-sci cs.LG

    Learning and Controlling Silicon Dopant Transitions in Graphene using Scanning Transmission Electron Microscopy

    Authors: Max Schwarzer, Jesse Farebrother, Joshua Greaves, Ekin Dogus Cubuk, Rishabh Agarwal, Aaron Courville, Marc G. Bellemare, Sergei Kalinin, Igor Mordatch, Pablo Samuel Castro, Kevin M. Roccapriore

    Abstract: We introduce a machine learning approach to determine the transition dynamics of silicon atoms on a single layer of carbon atoms, when stimulated by the electron beam of a scanning transmission electron microscope (STEM). Our method is data-centric, leveraging data collected on a STEM. The data samples are processed and filtered to produce symbolic representations, which we use to train a neural n…

    Submitted 21 November, 2023; originally announced November 2023.

  4. arXiv:2311.09235

    cs.LG cs.AI

    Scalable Diffusion for Materials Generation

    Authors: Sherry Yang, KwangHwan Cho, Amil Merchant, Pieter Abbeel, Dale Schuurmans, Igor Mordatch, Ekin Dogus Cubuk

    Abstract: Generative models trained on internet-scale data are capable of generating novel and realistic texts, images, and videos. A natural next question is whether these models can advance science, for example by generating novel stable materials. Traditionally, models with explicit structures (e.g., graphs) have been used in modeling structural relationships in scientific data (e.g., atoms and bonds in…

    Submitted 3 June, 2024; v1 submitted 18 October, 2023; originally announced November 2023.

    Comments: https://unified-materials.github.io/

  5. arXiv:2311.07587

    cs.CL cs.AI cs.CY cs.LG

    Frontier Language Models are not Robust to Adversarial Arithmetic, or "What do I need to say so you agree 2+2=5?"

    Authors: C. Daniel Freeman, Laura Culp, Aaron Parisi, Maxwell L Bileschi, Gamaleldin F Elsayed, Alex Rizkowsky, Isabelle Simpson, Alex Alemi, Azade Nova, Ben Adlam, Bernd Bohnet, Gaurav Mishra, Hanie Sedghi, Igor Mordatch, Izzeddin Gur, Jaehoon Lee, JD Co-Reyes, Jeffrey Pennington, Kelvin Xu, Kevin Swersky, Kshiteej Mahajan, Lechao Xiao, Rosanne Liu, Simon Kornblith, Noah Constant, et al. (5 additional authors not shown)

    Abstract: We introduce and study the problem of adversarial arithmetic, which provides a simple yet challenging testbed for language model alignment. This problem is comprised of arithmetic questions posed in natural language, with an arbitrary adversarial string inserted before the question is complete. Even in the simple setting of 1-digit addition problems, it is easy to find adversarial prompts that mak…

    Submitted 15 November, 2023; v1 submitted 8 November, 2023; originally announced November 2023.

  6. arXiv:2310.08864

    cs.RO

    Open X-Embodiment: Robotic Learning Datasets and RT-X Models

    Authors: Open X-Embodiment Collaboration, Abby O'Neill, Abdul Rehman, Abhinav Gupta, Abhiram Maddukuri, Abhishek Gupta, Abhishek Padalkar, Abraham Lee, Acorn Pooley, Agrim Gupta, Ajay Mandlekar, Ajinkya Jain, Albert Tung, Alex Bewley, Alex Herzog, Alex Irpan, Alexander Khazatsky, Anant Rai, Anchit Gupta, Andrew Wang, Andrey Kolobov, Anikait Singh, Animesh Garg, Aniruddha Kembhavi, Annie Xie, et al. (267 additional authors not shown)

    Abstract: Large, high-capacity models trained on diverse datasets have shown remarkable successes on efficiently tackling downstream applications. In domains from NLP to Computer Vision, this has led to a consolidation of pretrained models, with general pretrained backbones serving as a starting point for many applications. Can such a consolidation happen in robotics? Conventionally, robotic learning method…

    Submitted 1 June, 2024; v1 submitted 13 October, 2023; originally announced October 2023.

    Comments: Project website: https://robotics-transformer-x.github.io

  7. arXiv:2307.15818

    cs.RO cs.CL cs.CV cs.LG

    RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control

    Authors: Anthony Brohan, Noah Brown, Justice Carbajal, Yevgen Chebotar, Xi Chen, Krzysztof Choromanski, Tianli Ding, Danny Driess, Avinava Dubey, Chelsea Finn, Pete Florence, Chuyuan Fu, Montse Gonzalez Arenas, Keerthana Gopalakrishnan, Kehang Han, Karol Hausman, Alexander Herzog, Jasmine Hsu, Brian Ichter, Alex Irpan, Nikhil Joshi, Ryan Julian, Dmitry Kalashnikov, Yuheng Kuang, Isabel Leal, et al. (29 additional authors not shown)

    Abstract: We study how vision-language models trained on Internet-scale data can be incorporated directly into end-to-end robotic control to boost generalization and enable emergent semantic reasoning. Our goal is to enable a single end-to-end trained model to both learn to map robot observations to actions and enjoy the benefits of large-scale pretraining on language and vision-language data from the web.…

    Submitted 28 July, 2023; originally announced July 2023.

    Comments: Website: https://robotics-transformer.github.io/

  8. arXiv:2305.14325

    cs.CL cs.AI cs.CV cs.LG

    Improving Factuality and Reasoning in Language Models through Multiagent Debate

    Authors: Yilun Du, Shuang Li, Antonio Torralba, Joshua B. Tenenbaum, Igor Mordatch

    Abstract: Large language models (LLMs) have demonstrated remarkable capabilities in language generation, understanding, and few-shot learning in recent years. An extensive body of work has explored how their performance may be further improved through the tools of prompting, ranging from verification, self-consistency, or intermediate scratchpads. In this paper, we present a complementary approach to improv…

    Submitted 23 May, 2023; originally announced May 2023.

    Comments: Project Webpage and Code: https://composable-models.github.io/llm_debate/

  9. arXiv:2305.02968

    cs.LG cs.AI

    Masked Trajectory Models for Prediction, Representation, and Control

    Authors: Philipp Wu, Arjun Majumdar, Kevin Stone, Yixin Lin, Igor Mordatch, Pieter Abbeel, Aravind Rajeswaran

    Abstract: We introduce Masked Trajectory Models (MTM) as a generic abstraction for sequential decision making. MTM takes a trajectory, such as a state-action sequence, and aims to reconstruct the trajectory conditioned on random subsets of the same trajectory. By training with a highly randomized masking pattern, MTM learns versatile networks that can take on different roles or capabilities, by simply choos…

    Submitted 4 May, 2023; originally announced May 2023.

    Comments: Accepted for publication at ICML 2023. Project webpage: https://wuphilipp.github.io/mtm/

  10. arXiv:2303.14870

    cs.RO cs.AI cs.LG

    Bi-Manual Block Assembly via Sim-to-Real Reinforcement Learning

    Authors: Satoshi Kataoka, Youngseog Chung, Seyed Kamyar Seyed Ghasemipour, Pannag Sanketi, Shixiang Shane Gu, Igor Mordatch

    Abstract: Most successes in robotic manipulation have been restricted to single-arm gripper robots, whose low dexterity limits the range of solvable tasks to pick-and-place, insertion, and object rearrangement. More complex tasks such as assembly require dual and multi-arm platforms, but entail a suite of unique challenges such as bi-arm coordination and collision avoidance, robust grasping, and long-horiz…

    Submitted 26 March, 2023; originally announced March 2023.

    Comments: Our accompanying project webpage can be found at: https://sites.google.com/view/u-shape-block-assembly. arXiv admin note: substantial text overlap with arXiv:2203.08277

  11. arXiv:2303.03378

    cs.LG cs.AI cs.RO

    PaLM-E: An Embodied Multimodal Language Model

    Authors: Danny Driess, Fei Xia, Mehdi S. M. Sajjadi, Corey Lynch, Aakanksha Chowdhery, Brian Ichter, Ayzaan Wahid, Jonathan Tompson, Quan Vuong, Tianhe Yu, Wenlong Huang, Yevgen Chebotar, Pierre Sermanet, Daniel Duckworth, Sergey Levine, Vincent Vanhoucke, Karol Hausman, Marc Toussaint, Klaus Greff, Andy Zeng, Igor Mordatch, Pete Florence

    Abstract: Large language models excel at a wide range of complex tasks. However, enabling general inference in the real world, e.g., for robotics problems, raises the challenge of grounding. We propose embodied language models to directly incorporate real-world continuous sensor modalities into language models and thereby establish the link between words and percepts. Input to our embodied language model ar…

    Submitted 6 March, 2023; originally announced March 2023.

  12. arXiv:2303.00855

    cs.RO cs.AI cs.CL cs.CV cs.LG

    Grounded Decoding: Guiding Text Generation with Grounded Models for Embodied Agents

    Authors: Wenlong Huang, Fei Xia, Dhruv Shah, Danny Driess, Andy Zeng, Yao Lu, Pete Florence, Igor Mordatch, Sergey Levine, Karol Hausman, Brian Ichter

    Abstract: Recent progress in large language models (LLMs) has demonstrated the ability to learn and leverage Internet-scale knowledge through pre-training with autoregressive models. Unfortunately, applying such models to settings with embodied agents, such as robots, is challenging due to their lack of experience with the physical world, inability to parse non-language observations, and ignorance of reward…

    Submitted 11 December, 2023; v1 submitted 1 March, 2023; originally announced March 2023.

  13. arXiv:2212.06817

    cs.RO cs.AI cs.CL cs.CV cs.LG

    RT-1: Robotics Transformer for Real-World Control at Scale

    Authors: Anthony Brohan, Noah Brown, Justice Carbajal, Yevgen Chebotar, Joseph Dabis, Chelsea Finn, Keerthana Gopalakrishnan, Karol Hausman, Alex Herzog, Jasmine Hsu, Julian Ibarz, Brian Ichter, Alex Irpan, Tomas Jackson, Sally Jesmonth, Nikhil J Joshi, Ryan Julian, Dmitry Kalashnikov, Yuheng Kuang, Isabel Leal, Kuang-Huei Lee, Sergey Levine, Yao Lu, Utsav Malla, Deeksha Manjunath, et al. (26 additional authors not shown)

    Abstract: By transferring knowledge from large, diverse, task-agnostic datasets, modern machine learning models can solve specific downstream tasks either zero-shot or with small task-specific datasets to a high level of performance. While this capability has been demonstrated in other fields such as computer vision, natural language processing or speech recognition, it remains to be shown in robotics, wher…

    Submitted 11 August, 2023; v1 submitted 13 December, 2022; originally announced December 2022.

    Comments: See website at robotics-transformer1.github.io

  14. arXiv:2211.13746

    cs.MA cs.AI cs.GT cs.NE

    Melting Pot 2.0

    Authors: John P. Agapiou, Alexander Sasha Vezhnevets, Edgar A. Duéñez-Guzmán, Jayd Matyas, Yiran Mao, Peter Sunehag, Raphael Köster, Udari Madhushani, Kavya Kopparapu, Ramona Comanescu, DJ Strouse, Michael B. Johanson, Sukhdeep Singh, Julia Haas, Igor Mordatch, Dean Mobbs, Joel Z. Leibo

    Abstract: Multi-agent artificial intelligence research promises a path to develop intelligent technologies that are more human-like and more human-compatible than those produced by "solipsistic" approaches, which do not consider interactions between agents. Melting Pot is a research tool developed to facilitate work on multi-agent artificial intelligence, and provides an evaluation protocol that measures ge…

    Submitted 30 October, 2023; v1 submitted 24 November, 2022; originally announced November 2022.

    Comments: 69 pages, 54 figures. arXiv admin note: text overlap with arXiv:2107.06857

  15. arXiv:2211.13337

    cs.LG

    Multi-Environment Pretraining Enables Transfer to Action Limited Datasets

    Authors: David Venuto, Sherry Yang, Pieter Abbeel, Doina Precup, Igor Mordatch, Ofir Nachum

    Abstract: Using massive datasets to train large-scale models has emerged as a dominant approach for broad generalization in natural language and vision applications. In reinforcement learning, however, a key challenge is that available data of sequential decision making is often not annotated with actions - for example, videos of game-play are much more available than sequences of frames paired with their l…

    Submitted 5 December, 2022; v1 submitted 23 November, 2022; originally announced November 2022.

  16. arXiv:2211.09760

    cs.LG math.OC stat.ML

    VeLO: Training Versatile Learned Optimizers by Scaling Up

    Authors: Luke Metz, James Harrison, C. Daniel Freeman, Amil Merchant, Lucas Beyer, James Bradbury, Naman Agrawal, Ben Poole, Igor Mordatch, Adam Roberts, Jascha Sohl-Dickstein

    Abstract: While deep learning models have replaced hand-designed features across many domains, these models are still trained with hand-designed optimizers. In this work, we leverage the same scaling approach behind the success of deep learning to learn versatile optimizers. We train an optimizer for deep learning which is itself a small neural network that ingests gradients and outputs parameter updates. M…

    Submitted 17 November, 2022; originally announced November 2022.

  17. arXiv:2210.13432

    cs.CL

    Towards Better Few-Shot and Finetuning Performance with Forgetful Causal Language Models

    Authors: Hao Liu, Xinyang Geng, Lisa Lee, Igor Mordatch, Sergey Levine, Sharan Narang, Pieter Abbeel

    Abstract: Large language models (LLM) trained using the next-token-prediction objective, such as GPT3 and PaLM, have revolutionized natural language processing in recent years by showing impressive zero-shot and few-shot capabilities across a wide range of tasks. In this work, we propose a simple technique that significantly boosts the performance of LLMs without adding computational cost. Our key observati…

    Submitted 31 January, 2023; v1 submitted 24 October, 2022; originally announced October 2022.

    Comments: Added T-FCM and better FCM results

  18. arXiv:2210.12272

    stat.ML cs.LG cs.RO

    Implicit Offline Reinforcement Learning via Supervised Learning

    Authors: Alexandre Piche, Rafael Pardinas, David Vazquez, Igor Mordatch, Chris Pal

    Abstract: Offline Reinforcement Learning (RL) via Supervised Learning is a simple and effective way to learn robotic skills from a dataset collected by policies of different expertise levels. It is as simple as supervised learning and Behavior Cloning (BC), but takes advantage of return information. On datasets collected by policies of similar expertise, implicit BC has been shown to match or outperform exp…

    Submitted 21 October, 2022; originally announced October 2022.

  19. arXiv:2210.11522

    cs.CV cs.AI cs.LG

    Composing Ensembles of Pre-trained Models via Iterative Consensus

    Authors: Shuang Li, Yilun Du, Joshua B. Tenenbaum, Antonio Torralba, Igor Mordatch

    Abstract: Large pre-trained models exhibit distinct and complementary capabilities dependent on the data they are trained on. Language models such as GPT-3 are capable of textual reasoning but cannot understand visual information, while vision models such as DALL-E can generate photorealistic photos but fail to understand complex language descriptions. In this work, we propose a unified framework for compos…

    Submitted 20 October, 2022; originally announced October 2022.

  20. arXiv:2207.05608

    cs.RO cs.AI cs.CL cs.CV cs.LG

    Inner Monologue: Embodied Reasoning through Planning with Language Models

    Authors: Wenlong Huang, Fei Xia, Ted Xiao, Harris Chan, Jacky Liang, Pete Florence, Andy Zeng, Jonathan Tompson, Igor Mordatch, Yevgen Chebotar, Pierre Sermanet, Noah Brown, Tomas Jackson, Linda Luu, Sergey Levine, Karol Hausman, Brian Ichter

    Abstract: Recent works have shown how the reasoning capabilities of Large Language Models (LLMs) can be applied to domains beyond natural language processing, such as planning and interaction for robots. These embodied problems require an agent to understand many semantic aspects of the world: the repertoire of skills available, how these skills influence the world, and how changes to the world map back to…

    Submitted 12 July, 2022; originally announced July 2022.

    Comments: Project website: https://innermonologue.github.io

  21. arXiv:2206.15448

    cs.LG cs.AI

    Learning Iterative Reasoning through Energy Minimization

    Authors: Yilun Du, Shuang Li, Joshua B. Tenenbaum, Igor Mordatch

    Abstract: Deep learning has excelled on complex pattern recognition tasks such as image classification and object recognition. However, it struggles with tasks requiring nontrivial reasoning, such as algorithmic computation. Humans are able to solve such tasks through iterative reasoning -- spending more time thinking about harder tasks. Most existing neural networks, however, exhibit a fixed computational…

    Submitted 30 June, 2022; originally announced June 2022.

    Comments: ICML 2022. Website at https://energy-based-model.github.io/iterative-reasoning-as-energy-minimization/

  22. arXiv:2205.15241

    cs.AI cs.LG

    Multi-Game Decision Transformers

    Authors: Kuang-Huei Lee, Ofir Nachum, Mengjiao Yang, Lisa Lee, Daniel Freeman, Winnie Xu, Sergio Guadarrama, Ian Fischer, Eric Jang, Henryk Michalewski, Igor Mordatch

    Abstract: A longstanding goal of the field of AI is a method for learning a highly capable, generalist agent from diverse experience. In the subfields of vision and language, this was largely achieved by scaling up transformer-based models and training them on large, diverse datasets. Motivated by this progress, we investigate whether the same strategy can be used to produce generalist reinforcement learnin…

    Submitted 15 October, 2022; v1 submitted 30 May, 2022; originally announced May 2022.

    Comments: NeurIPS 2022. 24 pages, 16 figures. Additional information, videos and code can be seen at https://sites.google.com/view/multi-game-transformers

  23. arXiv:2203.13733

    cs.RO cs.LG

    Blocks Assemble! Learning to Assemble with Large-Scale Structured Reinforcement Learning

    Authors: Seyed Kamyar Seyed Ghasemipour, Daniel Freeman, Byron David, Shixiang Shane Gu, Satoshi Kataoka, Igor Mordatch

    Abstract: Assembly of multi-part physical structures is both a valuable end product for autonomous robotics, as well as a valuable diagnostic task for open-ended training of embodied intelligent agents. We introduce a naturalistic physics-based environment with a set of connectable magnet blocks inspired by children's toy kits. The objective is to assemble blocks into a succession of target blueprints. Desp…

    Submitted 12 April, 2022; v1 submitted 15 March, 2022; originally announced March 2022.

    Comments: Accompanying project webpage can be found at: https://sites.google.com/view/learning-direct-assembly

  24. arXiv:2203.08277

    cs.RO cs.AI cs.LG

    Bi-Manual Manipulation and Attachment via Sim-to-Real Reinforcement Learning

    Authors: Satoshi Kataoka, Seyed Kamyar Seyed Ghasemipour, Daniel Freeman, Igor Mordatch

    Abstract: Most successes in robotic manipulation have been restricted to single-arm robots, which limits the range of solvable tasks to pick-and-place, insertion, and objects rearrangement. In contrast, dual and multi arm robot platforms unlock a rich diversity of problems that can be tackled, such as laundry folding and executing cooking skills. However, developing controllers for multi-arm robots is compl…

    Submitted 15 March, 2022; originally announced March 2022.

    Comments: Our accompanying project webpage can be found at: https://sites.google.com/view/bimanual-attachment

  25. arXiv:2202.01771

    cs.LG cs.CL

    Pre-Trained Language Models for Interactive Decision-Making

    Authors: Shuang Li, Xavier Puig, Chris Paxton, Yilun Du, Clinton Wang, Linxi Fan, Tao Chen, De-An Huang, Ekin Akyürek, Anima Anandkumar, Jacob Andreas, Igor Mordatch, Antonio Torralba, Yuke Zhu

    Abstract: Language model (LM) pre-training is useful in many language processing tasks. But can pre-trained LMs be further leveraged for more general machine learning problems? We propose an approach for using LMs to scaffold learning and generalization in general sequential decision-making problems. In this approach, goals and observations are represented as a sequence of embeddings, and a policy network i…

    Submitted 29 October, 2022; v1 submitted 3 February, 2022; originally announced February 2022.

  26. arXiv:2201.07207

    cs.LG cs.AI cs.CL cs.CV cs.RO

    Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents

    Authors: Wenlong Huang, Pieter Abbeel, Deepak Pathak, Igor Mordatch

    Abstract: Can world knowledge learned by large language models (LLMs) be used to act in interactive environments? In this paper, we investigate the possibility of grounding high-level tasks, expressed in natural language (e.g. "make breakfast"), to a chosen set of actionable steps (e.g. "open fridge"). While prior work focused on learning from explicit step-by-step examples of how to act, we surprisingly fi…

    Submitted 8 March, 2022; v1 submitted 18 January, 2022; originally announced January 2022.

    Comments: Project website at https://huangwl18.github.io/language-planner

  27. arXiv:2111.03062

    cs.RO cs.AI cs.CV cs.LG eess.SY

    Generalization in Dexterous Manipulation via Geometry-Aware Multi-Task Learning

    Authors: Wenlong Huang, Igor Mordatch, Pieter Abbeel, Deepak Pathak

    Abstract: Dexterous manipulation of arbitrary objects, a fundamental daily task for humans, has been a grand challenge for autonomous robotic systems. Although data-driven approaches using reinforcement learning can develop specialist policies that discover behaviors to control a single object, they often exhibit poor generalization to unseen ones. In this work, we show that policies learned by existing rei…

    Submitted 4 November, 2021; originally announced November 2021.

    Comments: Website at https://huangwl18.github.io/geometry-dex

  28. arXiv:2111.03042

    cs.CV cs.AI cs.LG

    Unsupervised Learning of Compositional Energy Concepts

    Authors: Yilun Du, Shuang Li, Yash Sharma, Joshua B. Tenenbaum, Igor Mordatch

    Abstract: Humans are able to rapidly understand scenes by utilizing concepts extracted from prior experience. Such concepts are diverse, and include global scene descriptors, such as the weather or lighting, as well as local scene descriptors, such as the color or size of a particular object. So far, unsupervised discovery of concepts has focused on either modeling the global scene-level or the local object…

    Submitted 4 November, 2021; originally announced November 2021.

    Comments: NeurIPS 2021, website and code at https://energy-based-model.github.io/comet/

  29. arXiv:2110.07594

    cs.LG cs.AI cs.MA

    The Neural MMO Platform for Massively Multiagent Research

    Authors: Joseph Suarez, Yilun Du, Clare Zhu, Igor Mordatch, Phillip Isola

    Abstract: Neural MMO is a computationally accessible research platform that combines large agent populations, long time horizons, open-ended tasks, and modular game systems. Existing environments feature subsets of these properties, but Neural MMO is the first to combine them all. We present Neural MMO as free and open source software with active support, ongoing development, documentation, and additional t…

    Submitted 14 October, 2021; originally announced October 2021.

  30. arXiv:2109.00137

    cs.RO cs.CV cs.LG

    Implicit Behavioral Cloning

    Authors: Pete Florence, Corey Lynch, Andy Zeng, Oscar Ramirez, Ayzaan Wahid, Laura Downs, Adrian Wong, Johnny Lee, Igor Mordatch, Jonathan Tompson

    Abstract: We find that across a wide range of robot policy learning scenarios, treating supervised policy learning with an implicit model generally performs better, on average, than commonly used explicit models. We present extensive experiments on this finding, and we provide both intuitive insight and theoretical arguments distinguishing the properties of implicit models compared to their explicit counter…

    Submitted 31 August, 2021; originally announced September 2021.

  31. arXiv:2107.06857

    cs.MA cs.AI

    Scalable Evaluation of Multi-Agent Reinforcement Learning with Melting Pot

    Authors: Joel Z. Leibo, Edgar Duéñez-Guzmán, Alexander Sasha Vezhnevets, John P. Agapiou, Peter Sunehag, Raphael Koster, Jayd Matyas, Charles Beattie, Igor Mordatch, Thore Graepel

    Abstract: Existing evaluation suites for multi-agent reinforcement learning (MARL) do not assess generalization to novel situations as their primary objective (unlike supervised-learning benchmarks). Our contribution, Melting Pot, is a MARL evaluation suite that fills this gap, and uses reinforcement learning to reduce the human labor required to create novel test scenarios. This works because one agent's b…

    Submitted 14 July, 2021; originally announced July 2021.

    Comments: Accepted to ICML 2021 and presented as a long talk; 33 pages; 9 figures

    Journal ref: In International Conference on Machine Learning 2021 (pp. 6187-6199). PMLR

  32. arXiv:2106.13281

    cs.RO cs.AI

    Brax -- A Differentiable Physics Engine for Large Scale Rigid Body Simulation

    Authors: C. Daniel Freeman, Erik Frey, Anton Raichuk, Sertan Girgin, Igor Mordatch, Olivier Bachem

    Abstract: We present Brax, an open source library for rigid body simulation with a focus on performance and parallelism on accelerators, written in JAX. We present results on a suite of tasks inspired by the existing reinforcement learning literature, but remade in our engine. Additionally, we provide reimplementations of PPO, SAC, ES, and direct policy optimization in JAX that compile alongside our environ…

    Submitted 24 June, 2021; originally announced June 2021.

    Comments: 9 pages + 12 pages of appendices and references. In submission at NeurIPS 2021 Datasets and Benchmarks Track

  33. arXiv:2106.13229

    cs.LG cs.AI cs.RO

    Model-Based Reinforcement Learning via Latent-Space Collocation

    Authors: Oleh Rybkin, Chuning Zhu, Anusha Nagabandi, Kostas Daniilidis, Igor Mordatch, Sergey Levine

    Abstract: The ability to plan into the future while utilizing only raw high-dimensional observations, such as images, can provide autonomous agents with broad capabilities. Visual model-based reinforcement learning (RL) methods that plan future actions directly have shown impressive results on tasks that require only short-horizon reasoning, however, these methods struggle on temporally extended tasks. We a…

    Submitted 7 August, 2021; v1 submitted 24 June, 2021; originally announced June 2021.

    Comments: International Conference on Machine Learning (ICML), 2021. Videos and code at https://orybkin.github.io/latco/

  34. arXiv:2106.01345

    cs.LG cs.AI

    Decision Transformer: Reinforcement Learning via Sequence Modeling

    Authors: Lili Chen, Kevin Lu, Aravind Rajeswaran, Kimin Lee, Aditya Grover, Michael Laskin, Pieter Abbeel, Aravind Srinivas, Igor Mordatch

    Abstract: We introduce a framework that abstracts Reinforcement Learning (RL) as a sequence modeling problem. This allows us to draw upon the simplicity and scalability of the Transformer architecture, and associated advances in language modeling such as GPT-x and BERT. In particular, we present Decision Transformer, an architecture that casts the problem of RL as conditional sequence modeling. Unlike prior…

    Submitted 24 June, 2021; v1 submitted 2 June, 2021; originally announced June 2021.

    Comments: First two authors contributed equally. Last two authors advised equally

  35. arXiv:2103.05247

    cs.LG cs.AI

    Pretrained Transformers as Universal Computation Engines

    Authors: Kevin Lu, Aditya Grover, Pieter Abbeel, Igor Mordatch

    Abstract: We investigate the capability of a transformer pretrained on natural language to generalize to other modalities with minimal finetuning -- in particular, without finetuning of the self-attention and feedforward layers of the residual blocks. We consider such a model, which we call a Frozen Pretrained Transformer (FPT), and study finetuning it on a variety of sequence classification tasks spanning…

    Submitted 30 June, 2021; v1 submitted 9 March, 2021; originally announced March 2021.

  36. arXiv:2012.03548

    cs.LG cs.AI cs.RO

    Reset-Free Lifelong Learning with Skill-Space Planning

    Authors: Kevin Lu, Aditya Grover, Pieter Abbeel, Igor Mordatch

    Abstract: The objective of lifelong reinforcement learning (RL) is to optimize agents which can continuously adapt and interact in changing environments. However, current RL approaches fail drastically when environments are non-stationary and interactions are non-episodic. We propose Lifelong Skill Planning (LiSP), an algorithmic framework for non-episodic lifelong RL based on planning in an abstract space…

    Submitted 15 June, 2021; v1 submitted 7 December, 2020; originally announced December 2020.

    Comments: In the proceedings of the 7th International Conference on Learning Representations (ICLR), Virtual, April 2021

  37. arXiv:2012.01316  [pdf, other]

    cs.LG

    Improved Contrastive Divergence Training of Energy Based Models

    Authors: Yilun Du, Shuang Li, Joshua Tenenbaum, Igor Mordatch

    Abstract: Contrastive divergence is a popular method of training energy-based models, but is known to have difficulties with training stability. We propose an adaptation to improve contrastive divergence training by scrutinizing a gradient term that is difficult to calculate and is often left out for convenience. We show that this gradient term is numerically significant and in practice is important to avoi…

    Submitted 10 June, 2021; v1 submitted 2 December, 2020; originally announced December 2020.

    Comments: ICML 2021, Project webpage at https://energy-based-model.github.io/improved-contrastive-divergence

  38. arXiv:2011.12216  [pdf, other]

    cs.LG cs.AI stat.ML

    Energy-Based Models for Continual Learning

    Authors: Shuang Li, Yilun Du, Gido M. van de Ven, Igor Mordatch

    Abstract: We motivate Energy-Based Models (EBMs) as a promising model class for continual learning problems. Instead of tackling continual learning via the use of external memory, growing models, or regularization, EBMs change the underlying training objective to cause less interference with previously learned information. Our proposed version of EBMs for continual learning is simple, efficient, and outperf…

    Submitted 18 December, 2022; v1 submitted 24 November, 2020; originally announced November 2020.

    Comments: Project page: https://energy-based-model.github.io/Energy-Based-Models-for-Continual-Learning

    Journal ref: Proceedings of The 1st Conference on Lifelong Learning Agents (CoLLAs), 2022, pp. 1-22

  39. arXiv:2011.01975  [pdf, other]

    cs.AI cs.CV cs.LG cs.RO

    Rearrangement: A Challenge for Embodied AI

    Authors: Dhruv Batra, Angel X. Chang, Sonia Chernova, Andrew J. Davison, Jia Deng, Vladlen Koltun, Sergey Levine, Jitendra Malik, Igor Mordatch, Roozbeh Mottaghi, Manolis Savva, Hao Su

    Abstract: We describe a framework for research and evaluation in Embodied AI. Our proposal is based on a canonical task: Rearrangement. A standard task can focus the development of new techniques and serve as a source of trained models that can be transferred to other settings. In the rearrangement task, the goal is to bring a given physical environment into a specified state. The goal state can be specifie…

    Submitted 3 November, 2020; originally announced November 2020.

    Comments: Authors are listed in alphabetical order

  40. arXiv:2010.14496  [pdf, other]

    cs.LG cs.AI

    Generative Temporal Difference Learning for Infinite-Horizon Prediction

    Authors: Michael Janner, Igor Mordatch, Sergey Levine

    Abstract: We introduce the $γ$-model, a predictive model of environment dynamics with an infinite probabilistic horizon. Replacing standard single-step models with $γ$-models leads to generalizations of the procedures central to model-based control, including the model rollout and model-based value estimation. The $γ$-model, trained with a generative reinterpretation of temporal difference learning, is a na…

    Submitted 28 November, 2021; v1 submitted 27 October, 2020; originally announced October 2020.

    Comments: NeurIPS 2020. Project page at: https://gammamodels.github.io/

  41. arXiv:2007.04976  [pdf, other]

    cs.LG cs.CV stat.ML

    One Policy to Control Them All: Shared Modular Policies for Agent-Agnostic Control

    Authors: Wenlong Huang, Igor Mordatch, Deepak Pathak

    Abstract: Reinforcement learning is typically concerned with learning control policies tailored to a particular agent. We investigate whether there exists a single global policy that can generalize to control a wide variety of agent morphologies -- ones in which even dimensionality of state and action spaces changes. We propose to express this global policy as a collection of identical modular neural networ…

    Submitted 9 July, 2020; originally announced July 2020.

    Comments: Accepted at ICML 2020. Videos and code at https://huangwl18.github.io/modular-rl/

  42. arXiv:2004.07804  [pdf, other]

    cs.LG cs.AI cs.RO stat.ML

    A Game Theoretic Framework for Model Based Reinforcement Learning

    Authors: Aravind Rajeswaran, Igor Mordatch, Vikash Kumar

    Abstract: Model-based reinforcement learning (MBRL) has recently gained immense interest due to its potential for sample efficiency and ability to incorporate off-policy data. However, designing stable and efficient MBRL algorithms using rich function approximators has remained challenging. To help expose the practical challenges in MBRL and simplify algorithm design from the lens of abstraction, we develo…

    Submitted 11 March, 2021; v1 submitted 16 April, 2020; originally announced April 2020.

    Comments: ICML 2020. This version contains expanded discussion, hyperparameter configurations, and ablation studies

  43. arXiv:2004.06030  [pdf, other]

    cs.CV cs.LG stat.ML

    Compositional Visual Generation and Inference with Energy Based Models

    Authors: Yilun Du, Shuang Li, Igor Mordatch

    Abstract: A vital aspect of human intelligence is the ability to compose increasingly complex concepts out of simpler ideas, enabling both rapid learning and adaptation of knowledge. In this paper we show that energy-based models can exhibit this ability by directly combining probability distributions. Samples from the combined distribution correspond to compositions of concepts. For example, given a distri…

    Submitted 17 December, 2020; v1 submitted 13 April, 2020; originally announced April 2020.

    Comments: NeurIPS 2020 Spotlight; Website at https://energy-based-model.github.io/compositional-generation-inference/

  44. arXiv:2001.12004  [pdf, other]

    cs.LG cs.AI cs.MA stat.ML

    Neural MMO v1.3: A Massively Multiagent Game Environment for Training and Evaluating Neural Networks

    Authors: Joseph Suarez, Yilun Du, Igor Mordatch, Phillip Isola

    Abstract: Progress in multiagent intelligence research is fundamentally limited by the number and quality of environments available for study. In recent years, simulated games have become a dominant research platform within reinforcement learning, in part due to their accessibility and interpretability. Previous works have targeted and demonstrated success on arcade, first person shooter (FPS), real-time st…

    Submitted 16 April, 2020; v1 submitted 31 January, 2020; originally announced January 2020.

  45. arXiv:1912.01188  [pdf, other]

    cs.LG cs.AI cs.RO stat.ML

    Adaptive Online Planning for Continual Lifelong Learning

    Authors: Kevin Lu, Igor Mordatch, Pieter Abbeel

    Abstract: We study learning control in an online reset-free lifelong learning scenario, where mistakes can compound catastrophically into the future and the underlying dynamics of the environment may change. Traditional model-free policy learning methods have achieved successes in difficult tasks due to their broad flexibility, but struggle in this setting, as they can activate failure modes early in their…

    Submitted 27 June, 2020; v1 submitted 2 December, 2019; originally announced December 2019.

    Comments: Originally published in NeurIPS Deep RL 2019

  46. arXiv:1909.07528  [pdf, other]

    cs.LG cs.AI cs.MA stat.ML

    Emergent Tool Use From Multi-Agent Autocurricula

    Authors: Bowen Baker, Ingmar Kanitscheider, Todor Markov, Yi Wu, Glenn Powell, Bob McGrew, Igor Mordatch

    Abstract: Through multi-agent competition, the simple objective of hide-and-seek, and standard reinforcement learning algorithms at scale, we find that agents create a self-supervised autocurriculum inducing multiple distinct rounds of emergent strategy, many of which require sophisticated tool use and coordination. We find clear evidence of six emergent phases in agent strategy in our environment, each of…

    Submitted 10 February, 2020; v1 submitted 16 September, 2019; originally announced September 2019.

  47. arXiv:1909.06878  [pdf, other]

    cs.LG cs.RO stat.ML

    Model Based Planning with Energy Based Models

    Authors: Yilun Du, Toru Lin, Igor Mordatch

    Abstract: Model-based planning holds great promise for improving both sample efficiency and generalization in reinforcement learning (RL). We show that energy-based models (EBMs) are a promising class of models to use for model-based planning. EBMs naturally support inference of intermediate states given start and goal state distributions. We provide an online algorithm to train EBMs while interacting with…

    Submitted 8 March, 2021; v1 submitted 15 September, 2019; originally announced September 2019.

    Comments: CoRL 2019

  48. arXiv:1903.08689  [pdf, other]

    cs.LG cs.CV stat.ML

    Implicit Generation and Generalization in Energy-Based Models

    Authors: Yilun Du, Igor Mordatch

    Abstract: Energy based models (EBMs) are appealing due to their generality and simplicity in likelihood modeling, but have been traditionally difficult to train. We present techniques to scale MCMC based EBM training on continuous neural networks, and we show its success on the high-dimensional data domains of ImageNet32x32, ImageNet128x128, CIFAR-10, and robotic hand trajectories, achieving better samples…

    Submitted 29 June, 2020; v1 submitted 20 March, 2019; originally announced March 2019.

  49. arXiv:1903.00784  [pdf, other]

    cs.MA cs.LG stat.ML

    Neural MMO: A Massively Multiagent Game Environment for Training and Evaluating Intelligent Agents

    Authors: Joseph Suarez, Yilun Du, Phillip Isola, Igor Mordatch

    Abstract: The emergence of complex life on Earth is often attributed to the arms race that ensued from a huge number of organisms all competing for finite resources. We present an artificial intelligence research environment, inspired by the human game genre of MMORPGs (Massively Multiplayer Online Role-Playing Games, a.k.a. MMOs), that aims to simulate this setting in microcosm. As with MMORPGs and the rea…

    Submitted 2 March, 2019; originally announced March 2019.

  50. arXiv:1901.10251  [pdf, other]

    cs.LG stat.ML

    Multi-Agent Reinforcement Learning with Multi-Step Generative Models

    Authors: Orr Krupnik, Igor Mordatch, Aviv Tamar

    Abstract: We consider model-based reinforcement learning (MBRL) in 2-agent, high-fidelity continuous control problems -- an important domain for robots interacting with other agents in the same workspace. For non-trivial dynamical systems, MBRL typically suffers from accumulating errors. Several recent studies have addressed this problem by learning latent variable models for trajectory segments and optimiz…

    Submitted 1 November, 2019; v1 submitted 29 January, 2019; originally announced January 2019.