Showing 1–6 of 6 results for author: Ngo, R

  1. arXiv:2402.08797  [pdf, other]

    cs.CY

    Computing Power and the Governance of Artificial Intelligence

    Authors: Girish Sastry, Lennart Heim, Haydn Belfield, Markus Anderljung, Miles Brundage, Julian Hazell, Cullen O'Keefe, Gillian K. Hadfield, Richard Ngo, Konstantin Pilz, George Gor, Emma Bluemke, Sarah Shoker, Janet Egan, Robert F. Trager, Shahar Avin, Adrian Weller, Yoshua Bengio, Diane Coyle

    Abstract: Computing power, or "compute," is crucial for the development and deployment of artificial intelligence (AI) capabilities. As a result, governments and companies have started to leverage compute as a means to govern AI. For example, governments are investing in domestic compute capacity, controlling the flow of compute to competing countries, and subsidizing compute access to certain sectors. Howe…

    Submitted 13 February, 2024; originally announced February 2024.

    Comments: Figures can be accessed at: https://github.com/lheim/CPGAI-Figures

  2. arXiv:2303.08774  [pdf, other]

    cs.CL cs.AI

    GPT-4 Technical Report

    Authors: OpenAI, Josh Achiam, Steven Adler, Sandhini Agarwal, Lama Ahmad, Ilge Akkaya, Florencia Leoni Aleman, Diogo Almeida, Janko Altenschmidt, Sam Altman, Shyamal Anadkat, Red Avila, Igor Babuschkin, Suchir Balaji, Valerie Balcom, Paul Baltescu, Haiming Bao, Mohammad Bavarian, Jeff Belgum, Irwan Bello, Jake Berdine, Gabriel Bernadett-Shapiro, Christopher Berner, Lenny Bogdonoff, Oleg Boiko, et al. (256 additional authors not shown)

    Abstract: We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs. While less capable than humans in many real-world scenarios, GPT-4 exhibits human-level performance on various professional and academic benchmarks, including passing a simulated bar exam with a score around the top 10% of test takers. GPT-4 is a Transformer-based mo…

    Submitted 4 March, 2024; v1 submitted 15 March, 2023; originally announced March 2023.

    Comments: 100 pages; updated authors list; fixed author names and added citation

  3. arXiv:2209.00626  [pdf, ps, other]

    cs.AI cs.LG

    The Alignment Problem from a Deep Learning Perspective

    Authors: Richard Ngo, Lawrence Chan, Sören Mindermann

    Abstract: In coming years or decades, artificial general intelligence (AGI) may surpass human capabilities at many critical tasks. We argue that, without substantial effort to prevent it, AGIs could learn to pursue goals that are in conflict (i.e. misaligned) with human interests. If trained like today's most capable models, AGIs could learn to act deceptively to receive higher reward, learn misaligned inte…

    Submitted 19 March, 2024; v1 submitted 29 August, 2022; originally announced September 2022.

    Comments: Published in ICLR 2024

  4. arXiv:2011.08827  [pdf, other]

    cs.LG cs.AI

    Avoiding Tampering Incentives in Deep RL via Decoupled Approval

    Authors: Jonathan Uesato, Ramana Kumar, Victoria Krakovna, Tom Everitt, Richard Ngo, Shane Legg

    Abstract: How can we design agents that pursue a given objective when all feedback mechanisms are influenceable by the agent? Standard RL algorithms assume a secure reward function, and can thus perform poorly in settings where agents can tamper with the reward-generating mechanism. We present a principled solution to the problem of learning from influenceable feedback, which combines approval with a decoup…

    Submitted 17 November, 2020; originally announced November 2020.

  5. arXiv:2011.08820  [pdf, other]

    cs.LG cs.AI

    REALab: An Embedded Perspective on Tampering

    Authors: Ramana Kumar, Jonathan Uesato, Richard Ngo, Tom Everitt, Victoria Krakovna, Shane Legg

    Abstract: This paper describes REALab, a platform for embedded agency research in reinforcement learning (RL). REALab is designed to model the structure of tampering problems that may arise in real-world deployments of RL. Standard Markov Decision Process (MDP) formulations of RL and simulated environments mirroring the MDP structure assume secure access to feedback (e.g., rewards). This may be unrealistic…

    Submitted 17 November, 2020; originally announced November 2020.

  6. arXiv:2010.07877  [pdf, other]

    cs.LG cs.AI

    Avoiding Side Effects By Considering Future Tasks

    Authors: Victoria Krakovna, Laurent Orseau, Richard Ngo, Miljan Martic, Shane Legg

    Abstract: Designing reward functions is difficult: the designer has to specify what to do (what it means to complete the task) as well as what not to do (side effects that should be avoided while completing the task). To alleviate the burden on the reward designer, we propose an algorithm to automatically generate an auxiliary reward function that penalizes side effects. This auxiliary objective rewards the…

    Submitted 15 October, 2020; originally announced October 2020.

    Comments: Published in NeurIPS 2020