Showing 1–6 of 6 results for author: Pong, V H

  1. arXiv:2303.08774  [pdf, other]

    cs.CL cs.AI

    GPT-4 Technical Report

    Authors: OpenAI, Josh Achiam, Steven Adler, Sandhini Agarwal, Lama Ahmad, Ilge Akkaya, Florencia Leoni Aleman, Diogo Almeida, Janko Altenschmidt, Sam Altman, Shyamal Anadkat, Red Avila, Igor Babuschkin, Suchir Balaji, Valerie Balcom, Paul Baltescu, Haiming Bao, Mohammad Bavarian, Jeff Belgum, Irwan Bello, Jake Berdine, Gabriel Bernadett-Shapiro, Christopher Berner, Lenny Bogdonoff, Oleg Boiko, et al. (256 additional authors not shown)

    Abstract: We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs. While less capable than humans in many real-world scenarios, GPT-4 exhibits human-level performance on various professional and academic benchmarks, including passing a simulated bar exam with a score around the top 10% of test takers. GPT-4 is a Transformer-based mo…

    Submitted 4 March, 2024; v1 submitted 15 March, 2023; originally announced March 2023.

    Comments: 100 pages; updated authors list; fixed author names and added citation

  2. arXiv:2107.03974  [pdf, other]

    cs.LG cs.AI cs.RO

    Offline Meta-Reinforcement Learning with Online Self-Supervision

    Authors: Vitchyr H. Pong, Ashvin Nair, Laura Smith, Catherine Huang, Sergey Levine

    Abstract: Meta-reinforcement learning (RL) methods can meta-train policies that adapt to new tasks with orders of magnitude less data than standard RL, but meta-training itself is costly and time-consuming. If we can meta-train on offline data, then we can reuse the same static dataset, labeled once with rewards for different tasks, to meta-train policies that adapt to a variety of new tasks at meta-test ti…

    Submitted 6 July, 2022; v1 submitted 8 July, 2021; originally announced July 2021.

    Comments: 8.5 pages, 6 figures, accepted to ICML 2022

  3. arXiv:2104.11707  [pdf, other]

    cs.LG cs.AI cs.RO

    DisCo RL: Distribution-Conditioned Reinforcement Learning for General-Purpose Policies

    Authors: Soroush Nasiriany, Vitchyr H. Pong, Ashvin Nair, Alexander Khazatsky, Glen Berseth, Sergey Levine

    Abstract: Can we use reinforcement learning to learn general-purpose policies that can perform a wide range of different tasks, resulting in flexible and reusable skills? Contextual policies provide this capability in principle, but the representation of the context determines the degree of generalization and expressivity. Categorical contexts preclude generalization to entirely new tasks. Goal-conditioned…

    Submitted 23 April, 2021; originally announced April 2021.

    Comments: ICRA 2021

  4. arXiv:2104.10190  [pdf, other]

    cs.LG cs.AI stat.ME stat.ML

    Outcome-Driven Reinforcement Learning via Variational Inference

    Authors: Tim G. J. Rudner, Vitchyr H. Pong, Rowan McAllister, Yarin Gal, Sergey Levine

    Abstract: While reinforcement learning algorithms provide automated acquisition of optimal policies, practical application of such methods requires a number of design decisions, such as manually designing reward functions that not only define the task, but also provide sufficient shaping to accomplish it. In this paper, we view reinforcement learning as inferring policies that achieve desired outcomes, rath…

    Submitted 28 December, 2022; v1 submitted 20 April, 2021; originally announced April 2021.

    Comments: Published in Advances in Neural Information Processing Systems 34 (NeurIPS 2021)

  5. arXiv:1911.08453  [pdf, other]

    cs.LG cs.AI cs.RO stat.ML

    Planning with Goal-Conditioned Policies

    Authors: Soroush Nasiriany, Vitchyr H. Pong, Steven Lin, Sergey Levine

    Abstract: Planning methods can solve temporally extended sequential decision making problems by composing simple behaviors. However, planning requires suitable abstractions for the states and transitions, which typically need to be designed by hand. In contrast, model-free reinforcement learning (RL) can acquire behaviors from low-level inputs directly, but often struggles with temporally extended tasks. Ca…

    Submitted 19 November, 2019; originally announced November 2019.

    Comments: In Advances in Neural Information Processing Systems, 2019

  6. arXiv:1903.03698  [pdf, other]

    cs.LG cs.AI cs.RO stat.ML

    Skew-Fit: State-Covering Self-Supervised Reinforcement Learning

    Authors: Vitchyr H. Pong, Murtaza Dalal, Steven Lin, Ashvin Nair, Shikhar Bahl, Sergey Levine

    Abstract: Autonomous agents that must exhibit flexible and broad capabilities will need to be equipped with large repertoires of skills. Defining each skill with a manually-designed reward function limits this repertoire and imposes a manual engineering burden. Self-supervised agents that set their own goals can automate this process, but designing appropriate goal setting objectives can be difficult, and o…

    Submitted 4 August, 2020; v1 submitted 8 March, 2019; originally announced March 2019.

    Comments: ICML 2020. 8 pages, 8 figures; 9 pages appendix (6 additional figures)