Showing 1–13 of 13 results for author: Pong, V

  1. arXiv:2303.08774  [pdf, other]

    cs.CL cs.AI

    GPT-4 Technical Report

    Authors: OpenAI, Josh Achiam, Steven Adler, Sandhini Agarwal, Lama Ahmad, Ilge Akkaya, Florencia Leoni Aleman, Diogo Almeida, Janko Altenschmidt, Sam Altman, Shyamal Anadkat, Red Avila, Igor Babuschkin, Suchir Balaji, Valerie Balcom, Paul Baltescu, Haiming Bao, Mohammad Bavarian, Jeff Belgum, Irwan Bello, Jake Berdine, Gabriel Bernadett-Shapiro, Christopher Berner, Lenny Bogdonoff, Oleg Boiko, et al. (256 additional authors not shown)

    Abstract: We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs. While less capable than humans in many real-world scenarios, GPT-4 exhibits human-level performance on various professional and academic benchmarks, including passing a simulated bar exam with a score around the top 10% of test takers. GPT-4 is a Transformer-based mo…

    Submitted 4 March, 2024; v1 submitted 15 March, 2023; originally announced March 2023.

    Comments: 100 pages; updated authors list; fixed author names and added citation

  2. arXiv:2107.07184  [pdf, other]

    cs.LG cs.RO

    MURAL: Meta-Learning Uncertainty-Aware Rewards for Outcome-Driven Reinforcement Learning

    Authors: Kevin Li, Abhishek Gupta, Ashwin Reddy, Vitchyr Pong, Aurick Zhou, Justin Yu, Sergey Levine

    Abstract: Exploration in reinforcement learning is a challenging problem: in the worst case, the agent must search for high-reward states that could be hidden anywhere in the state space. Can we define a more tractable class of RL problems, where the agent is provided with examples of successful outcomes? In this problem setting, the reward function can be obtained automatically by training a classifier to…

    Submitted 18 July, 2021; v1 submitted 15 July, 2021; originally announced July 2021.

    Comments: Accepted to ICML 2021. First two authors contributed equally

  3. arXiv:2107.03974  [pdf, other]

    cs.LG cs.AI cs.RO

    Offline Meta-Reinforcement Learning with Online Self-Supervision

    Authors: Vitchyr H. Pong, Ashvin Nair, Laura Smith, Catherine Huang, Sergey Levine

    Abstract: Meta-reinforcement learning (RL) methods can meta-train policies that adapt to new tasks with orders of magnitude less data than standard RL, but meta-training itself is costly and time-consuming. If we can meta-train on offline data, then we can reuse the same static dataset, labeled once with rewards for different tasks, to meta-train policies that adapt to a variety of new tasks at meta-test ti…

    Submitted 6 July, 2022; v1 submitted 8 July, 2021; originally announced July 2021.

    Comments: 8.5 pages, 6 figures, accepted to ICML 2022

  4. arXiv:2104.11707  [pdf, other]

    cs.LG cs.AI cs.RO

    DisCo RL: Distribution-Conditioned Reinforcement Learning for General-Purpose Policies

    Authors: Soroush Nasiriany, Vitchyr H. Pong, Ashvin Nair, Alexander Khazatsky, Glen Berseth, Sergey Levine

    Abstract: Can we use reinforcement learning to learn general-purpose policies that can perform a wide range of different tasks, resulting in flexible and reusable skills? Contextual policies provide this capability in principle, but the representation of the context determines the degree of generalization and expressivity. Categorical contexts preclude generalization to entirely new tasks. Goal-conditioned…

    Submitted 23 April, 2021; originally announced April 2021.

    Comments: ICRA 2021

  5. arXiv:2104.10190  [pdf, other]

    cs.LG cs.AI stat.ME stat.ML

    Outcome-Driven Reinforcement Learning via Variational Inference

    Authors: Tim G. J. Rudner, Vitchyr H. Pong, Rowan McAllister, Yarin Gal, Sergey Levine

    Abstract: While reinforcement learning algorithms provide automated acquisition of optimal policies, practical application of such methods requires a number of design decisions, such as manually designing reward functions that not only define the task, but also provide sufficient shaping to accomplish it. In this paper, we view reinforcement learning as inferring policies that achieve desired outcomes, rath…

    Submitted 28 December, 2022; v1 submitted 20 April, 2021; originally announced April 2021.

    Comments: Published in Advances in Neural Information Processing Systems 34 (NeurIPS 2021)

  6. arXiv:1911.08453  [pdf, other]

    cs.LG cs.AI cs.RO stat.ML

    Planning with Goal-Conditioned Policies

    Authors: Soroush Nasiriany, Vitchyr H. Pong, Steven Lin, Sergey Levine

    Abstract: Planning methods can solve temporally extended sequential decision making problems by composing simple behaviors. However, planning requires suitable abstractions for the states and transitions, which typically need to be designed by hand. In contrast, model-free reinforcement learning (RL) can acquire behaviors from low-level inputs directly, but often struggles with temporally extended tasks. Ca…

    Submitted 19 November, 2019; originally announced November 2019.

    Comments: In Advances in Neural Information Processing Systems, 2019

  7. arXiv:1910.11670  [pdf, other]

    cs.RO cs.CV cs.LG

    Contextual Imagined Goals for Self-Supervised Robotic Learning

    Authors: Ashvin Nair, Shikhar Bahl, Alexander Khazatsky, Vitchyr Pong, Glen Berseth, Sergey Levine

    Abstract: While reinforcement learning provides an appealing formalism for learning individual skills, a general-purpose robotic system must be able to master an extensive repertoire of behaviors. Instead of learning a large collection of skills individually, can we instead enable a robot to propose and practice its own behaviors automatically, learning about the affordances and behaviors that it can perfor…

    Submitted 23 October, 2019; originally announced October 2019.

    Comments: 12 pages, to be presented at Conference on Robot Learning (CoRL) 2019. Project website: https://ccrig.github.io/

  8. arXiv:1905.07447  [pdf, other]

    cs.RO cs.CV cs.LG

    REPLAB: A Reproducible Low-Cost Arm Benchmark Platform for Robotic Learning

    Authors: Brian Yang, Jesse Zhang, Vitchyr Pong, Sergey Levine, Dinesh Jayaraman

    Abstract: Standardized evaluation measures have aided in the progress of machine learning approaches in disciplines such as computer vision and machine translation. In this paper, we make the case that robotic learning would also benefit from benchmarking, and present the "REPLAB" platform for benchmarking vision-based manipulation tasks. REPLAB is a reproducible and self-contained hardware stack (robot arm…

    Submitted 17 May, 2019; originally announced May 2019.

    Comments: Extended version of paper accepted to ICRA 2019

  9. arXiv:1903.03698  [pdf, other]

    cs.LG cs.AI cs.RO stat.ML

    Skew-Fit: State-Covering Self-Supervised Reinforcement Learning

    Authors: Vitchyr H. Pong, Murtaza Dalal, Steven Lin, Ashvin Nair, Shikhar Bahl, Sergey Levine

    Abstract: Autonomous agents that must exhibit flexible and broad capabilities will need to be equipped with large repertoires of skills. Defining each skill with a manually-designed reward function limits this repertoire and imposes a manual engineering burden. Self-supervised agents that set their own goals can automate this process, but designing appropriate goal setting objectives can be difficult, and o…

    Submitted 4 August, 2020; v1 submitted 8 March, 2019; originally announced March 2019.

    Comments: ICML 2020. 8 pages, 8 figures; 9 pages appendix (6 additional figures)

  10. arXiv:1807.04742  [pdf, other]

    cs.LG cs.CV cs.RO stat.ML

    Visual Reinforcement Learning with Imagined Goals

    Authors: Ashvin Nair, Vitchyr Pong, Murtaza Dalal, Shikhar Bahl, Steven Lin, Sergey Levine

    Abstract: For an autonomous agent to fulfill a wide range of user-specified goals at test time, it must be able to learn broadly applicable and general-purpose skill repertoires. Furthermore, to provide the requisite level of generality, these skills must handle raw sensory input such as images. In this paper, we propose an algorithm that acquires such general-purpose skills by combining unsupervised repres…

    Submitted 4 December, 2018; v1 submitted 12 July, 2018; originally announced July 2018.

    Comments: 15 pages, NeurIPS 2018

  11. arXiv:1803.06773  [pdf, other]

    cs.LG cs.AI cs.RO stat.ML

    Composable Deep Reinforcement Learning for Robotic Manipulation

    Authors: Tuomas Haarnoja, Vitchyr Pong, Aurick Zhou, Murtaza Dalal, Pieter Abbeel, Sergey Levine

    Abstract: Model-free deep reinforcement learning has been shown to exhibit good performance in domains ranging from video games to simulated robotic manipulation and locomotion. However, model-free methods are known to perform poorly when the interaction time with the environment is limited, as is the case for most real-world robotic tasks. In this paper, we study how maximum entropy policies trained using…

    Submitted 18 March, 2018; originally announced March 2018.

    Comments: Videos: https://sites.google.com/view/composing-real-world-policies/

  12. arXiv:1802.09081  [pdf, other]

    cs.LG

    Temporal Difference Models: Model-Free Deep RL for Model-Based Control

    Authors: Vitchyr Pong, Shixiang Gu, Murtaza Dalal, Sergey Levine

    Abstract: Model-free reinforcement learning (RL) is a powerful, general tool for learning complex behaviors. However, its sample efficiency is often impractically large for solving challenging real-world problems, even with off-policy algorithms such as Q-learning. A limiting factor in classic model-free RL is that the learning signal consists only of scalar rewards, ignoring much of the rich information co…

    Submitted 24 February, 2020; v1 submitted 25 February, 2018; originally announced February 2018.

    Comments: Appeared in ICLR 2018; typos corrected

  13. arXiv:1702.01182  [pdf, other]

    cs.LG cs.RO

    Uncertainty-Aware Reinforcement Learning for Collision Avoidance

    Authors: Gregory Kahn, Adam Villaflor, Vitchyr Pong, Pieter Abbeel, Sergey Levine

    Abstract: Reinforcement learning can enable complex, adaptive behavior to be learned automatically for autonomous robotic platforms. However, practical deployment of reinforcement learning methods must contend with the fact that the training process itself can be unsafe for the robot. In this paper, we consider the specific case of a mobile robot learning to navigate an a priori unknown environment while av…

    Submitted 3 February, 2017; originally announced February 2017.