-
Computing Power and the Governance of Artificial Intelligence
Authors:
Girish Sastry,
Lennart Heim,
Haydn Belfield,
Markus Anderljung,
Miles Brundage,
Julian Hazell,
Cullen O'Keefe,
Gillian K. Hadfield,
Richard Ngo,
Konstantin Pilz,
George Gor,
Emma Bluemke,
Sarah Shoker,
Janet Egan,
Robert F. Trager,
Shahar Avin,
Adrian Weller,
Yoshua Bengio,
Diane Coyle
Abstract:
Computing power, or "compute," is crucial for the development and deployment of artificial intelligence (AI) capabilities. As a result, governments and companies have started to leverage compute as a means to govern AI. For example, governments are investing in domestic compute capacity, controlling the flow of compute to competing countries, and subsidizing compute access to certain sectors. Howe…
▽ More
Computing power, or "compute," is crucial for the development and deployment of artificial intelligence (AI) capabilities. As a result, governments and companies have started to leverage compute as a means to govern AI. For example, governments are investing in domestic compute capacity, controlling the flow of compute to competing countries, and subsidizing compute access to certain sectors. However, these efforts only scratch the surface of how compute can be used to govern AI development and deployment. Relative to other key inputs to AI (data and algorithms), AI-relevant compute is a particularly effective point of intervention: it is detectable, excludable, and quantifiable, and is produced via an extremely concentrated supply chain. These characteristics, alongside the singular importance of compute for cutting-edge AI models, suggest that governing compute can contribute to achieving common policy objectives, such as ensuring the safety and beneficial use of AI. More precisely, policymakers could use compute to facilitate regulatory visibility of AI, allocate resources to promote beneficial outcomes, and enforce restrictions against irresponsible or malicious AI development and usage. However, while compute-based policies and technologies have the potential to assist in these areas, there is significant variation in their readiness for implementation. Some ideas are currently being piloted, while others are hindered by the need for fundamental research. Furthermore, naive or poorly scoped approaches to compute governance carry significant risks in areas like privacy, economic impacts, and centralization of power. We end by suggesting guardrails to minimize these risks from compute governance.
△ Less
Submitted 13 February, 2024;
originally announced February 2024.
-
AI Systems of Concern
Authors:
Kayla Matteucci,
Shahar Avin,
Fazl Barez,
Seán Ó hÉigeartaigh
Abstract:
Concerns around future dangers from advanced AI often centre on systems hypothesised to have intrinsic characteristics such as agent-like behaviour, strategic awareness, and long-range planning. We label this cluster of characteristics as "Property X". Most present AI systems are low in "Property X"; however, in the absence of deliberate steering, current research directions may rapidly lead to th…
▽ More
Concerns around future dangers from advanced AI often centre on systems hypothesised to have intrinsic characteristics such as agent-like behaviour, strategic awareness, and long-range planning. We label this cluster of characteristics as "Property X". Most present AI systems are low in "Property X"; however, in the absence of deliberate steering, current research directions may rapidly lead to the emergence of highly capable AI systems that are also high in "Property X". We argue that "Property X" characteristics are intrinsically dangerous, and when combined with greater capabilities will result in AI systems for which safety and control is difficult to guarantee. Drawing on several scholars' alternative frameworks for possible AI research trajectories, we argue that most of the proposed benefits of advanced AI can be obtained by systems designed to minimise this property. We then propose indicators and governance interventions to identify and limit the development of systems with risky "Property X" characteristics.
△ Less
Submitted 9 October, 2023;
originally announced October 2023.
-
Frontier AI Regulation: Managing Emerging Risks to Public Safety
Authors:
Markus Anderljung,
Joslyn Barnhart,
Anton Korinek,
Jade Leung,
Cullen O'Keefe,
Jess Whittlestone,
Shahar Avin,
Miles Brundage,
Justin Bullock,
Duncan Cass-Beggs,
Ben Chang,
Tantum Collins,
Tim Fist,
Gillian Hadfield,
Alan Hayes,
Lewis Ho,
Sara Hooker,
Eric Horvitz,
Noam Kolt,
Jonas Schuett,
Yonadav Shavit,
Divya Siddarth,
Robert Trager,
Kevin Wolf
Abstract:
Advanced AI models hold the promise of tremendous benefits for humanity, but society needs to proactively manage the accompanying risks. In this paper, we focus on what we term "frontier AI" models: highly capable foundation models that could possess dangerous capabilities sufficient to pose severe risks to public safety. Frontier AI models pose a distinct regulatory challenge: dangerous capabilit…
▽ More
Advanced AI models hold the promise of tremendous benefits for humanity, but society needs to proactively manage the accompanying risks. In this paper, we focus on what we term "frontier AI" models: highly capable foundation models that could possess dangerous capabilities sufficient to pose severe risks to public safety. Frontier AI models pose a distinct regulatory challenge: dangerous capabilities can arise unexpectedly; it is difficult to robustly prevent a deployed model from being misused; and, it is difficult to stop a model's capabilities from proliferating broadly. To address these challenges, at least three building blocks for the regulation of frontier models are needed: (1) standard-setting processes to identify appropriate requirements for frontier AI developers, (2) registration and reporting requirements to provide regulators with visibility into frontier AI development processes, and (3) mechanisms to ensure compliance with safety standards for the development and deployment of frontier AI models. Industry self-regulation is an important first step. However, wider societal discussions and government intervention will be needed to create standards and to ensure compliance with them. We consider several options to this end, including granting enforcement powers to supervisory authorities and licensure regimes for frontier AI models. Finally, we propose an initial set of safety standards. These include conducting pre-deployment risk assessments; external scrutiny of model behavior; using risk assessments to inform deployment decisions; and monitoring and responding to new information about model capabilities and uses post-deployment. We hope this discussion contributes to the broader conversation on how to balance public safety risks and innovation benefits from advances at the frontier of AI development.
△ Less
Submitted 7 November, 2023; v1 submitted 6 July, 2023;
originally announced July 2023.
-
Model evaluation for extreme risks
Authors:
Toby Shevlane,
Sebastian Farquhar,
Ben Garfinkel,
Mary Phuong,
Jess Whittlestone,
Jade Leung,
Daniel Kokotajlo,
Nahema Marchal,
Markus Anderljung,
Noam Kolt,
Lewis Ho,
Divya Siddarth,
Shahar Avin,
Will Hawkins,
Been Kim,
Iason Gabriel,
Vijay Bolina,
Jack Clark,
Yoshua Bengio,
Paul Christiano,
Allan Dafoe
Abstract:
Current approaches to building general-purpose AI systems tend to produce systems with both beneficial and harmful capabilities. Further progress in AI development could lead to capabilities that pose extreme risks, such as offensive cyber capabilities or strong manipulation skills. We explain why model evaluation is critical for addressing extreme risks. Developers must be able to identify danger…
▽ More
Current approaches to building general-purpose AI systems tend to produce systems with both beneficial and harmful capabilities. Further progress in AI development could lead to capabilities that pose extreme risks, such as offensive cyber capabilities or strong manipulation skills. We explain why model evaluation is critical for addressing extreme risks. Developers must be able to identify dangerous capabilities (through "dangerous capability evaluations") and the propensity of models to apply their capabilities for harm (through "alignment evaluations"). These evaluations will become critical for kee** policymakers and other stakeholders informed, and for making responsible decisions about model training, deployment, and security.
△ Less
Submitted 22 September, 2023; v1 submitted 24 May, 2023;
originally announced May 2023.
-
Filling gaps in trustworthy development of AI
Authors:
Shahar Avin,
Haydn Belfield,
Miles Brundage,
Gretchen Krueger,
Jasmine Wang,
Adrian Weller,
Markus Anderljung,
Igor Krawczuk,
David Krueger,
Jonathan Lebensold,
Tegan Maharaj,
Noa Zilberman
Abstract:
The range of application of artificial intelligence (AI) is vast, as is the potential for harm. Growing awareness of potential risks from AI systems has spurred action to address those risks, while eroding confidence in AI systems and the organizations that develop them. A 2019 study found over 80 organizations that published and adopted "AI ethics principles'', and more have joined since. But the…
▽ More
The range of application of artificial intelligence (AI) is vast, as is the potential for harm. Growing awareness of potential risks from AI systems has spurred action to address those risks, while eroding confidence in AI systems and the organizations that develop them. A 2019 study found over 80 organizations that published and adopted "AI ethics principles'', and more have joined since. But the principles often leave a gap between the "what" and the "how" of trustworthy AI development. Such gaps have enabled questionable or ethically dubious behavior, which casts doubts on the trustworthiness of specific organizations, and the field more broadly. There is thus an urgent need for concrete methods that both enable AI developers to prevent harm and allow them to demonstrate their trustworthiness through verifiable behavior. Below, we explore mechanisms (drawn from arXiv:2004.07213) for creating an ecosystem where AI developers can earn trust - if they are trustworthy. Better assessment of developer trustworthiness could inform user choice, employee actions, investment decisions, legal recourse, and emerging governance regimes.
△ Less
Submitted 14 December, 2021;
originally announced December 2021.
-
Toward Trustworthy AI Development: Mechanisms for Supporting Verifiable Claims
Authors:
Miles Brundage,
Shahar Avin,
Jasmine Wang,
Haydn Belfield,
Gretchen Krueger,
Gillian Hadfield,
Heidy Khlaaf,
**gying Yang,
Helen Toner,
Ruth Fong,
Tegan Maharaj,
Pang Wei Koh,
Sara Hooker,
Jade Leung,
Andrew Trask,
Emma Bluemke,
Jonathan Lebensold,
Cullen O'Keefe,
Mark Koren,
Théo Ryffel,
JB Rubinovitz,
Tamay Besiroglu,
Federica Carugati,
Jack Clark,
Peter Eckersley
, et al. (34 additional authors not shown)
Abstract:
With the recent wave of progress in artificial intelligence (AI) has come a growing awareness of the large-scale impacts of AI systems, and recognition that existing regulations and norms in industry and academia are insufficient to ensure responsible AI development. In order for AI developers to earn trust from system users, customers, civil society, governments, and other stakeholders that they…
▽ More
With the recent wave of progress in artificial intelligence (AI) has come a growing awareness of the large-scale impacts of AI systems, and recognition that existing regulations and norms in industry and academia are insufficient to ensure responsible AI development. In order for AI developers to earn trust from system users, customers, civil society, governments, and other stakeholders that they are building AI responsibly, they will need to make verifiable claims to which they can be held accountable. Those outside of a given organization also need effective means of scrutinizing such claims. This report suggests various steps that different stakeholders can take to improve the verifiability of claims made about AI systems and their associated development processes, with a focus on providing evidence about the safety, security, fairness, and privacy protection of AI systems. We analyze ten mechanisms for this purpose--spanning institutions, software, and hardware--and make recommendations aimed at implementing, exploring, or improving those mechanisms.
△ Less
Submitted 20 April, 2020; v1 submitted 15 April, 2020;
originally announced April 2020.
-
Exploring AI Futures Through Role Play
Authors:
Shahar Avin,
Ross Gruetzemacher,
James Fox
Abstract:
We present an innovative methodology for studying and teaching the impacts of AI through a role play game. The game serves two primary purposes: 1) training AI developers and AI policy professionals to reflect on and prepare for future social and ethical challenges related to AI and 2) exploring possible futures involving AI technology development, deployment, social impacts, and governance. While…
▽ More
We present an innovative methodology for studying and teaching the impacts of AI through a role play game. The game serves two primary purposes: 1) training AI developers and AI policy professionals to reflect on and prepare for future social and ethical challenges related to AI and 2) exploring possible futures involving AI technology development, deployment, social impacts, and governance. While the game currently focuses on the inter relations between short --, mid and long term impacts of AI, it has potential to be adapted for a broad range of scenarios, exploring in greater depths issues of AI policy research and affording training within organizations. The game presented here has undergone two years of development and has been tested through over 30 events involving between 3 and 70 participants. The game is under active development, but preliminary findings suggest that role play is a promising methodology for both exploring AI futures and training individuals and organizations in thinking about, and reflecting on, the impacts of AI and strategic mistakes that can be avoided today.
△ Less
Submitted 18 December, 2019;
originally announced December 2019.
-
Between Progress and Potential Impact of AI: the Neglected Dimensions
Authors:
Fernando Martínez-Plumed,
Shahar Avin,
Miles Brundage,
Allan Dafoe,
Sean Ó hÉigeartaigh,
José Hernández-Orallo
Abstract:
We reframe the analysis of progress in AI by incorporating into an overall framework both the task performance of a system, and the time and resource costs incurred in the development and deployment of the system. These costs include: data, expert knowledge, human oversight, software resources, computing cycles, hardware and network facilities, and (what kind of) time. These costs are distributed…
▽ More
We reframe the analysis of progress in AI by incorporating into an overall framework both the task performance of a system, and the time and resource costs incurred in the development and deployment of the system. These costs include: data, expert knowledge, human oversight, software resources, computing cycles, hardware and network facilities, and (what kind of) time. These costs are distributed over the life cycle of the system, and may place differing demands on different developers and users. The multidimensional performance and cost space we present can be collapsed to a single utility metric that measures the value of the system for different stakeholders. Even without a single utility function, AI advances can be generically assessed by whether they expand the Pareto surface. We label these types of costs as neglected dimensions of AI progress, and explore them using four case studies: Alpha* (Go, Chess, and other board games), ALE (Atari games), ImageNet (Image classification) and Virtual Personal Assistants (Siri, Alexa, Cortana, and Google Assistant). This broader model of progress in AI will lead to novel ways of estimating the potential societal use and impact of an AI system, and the establishment of milestones for future progress.
△ Less
Submitted 2 July, 2022; v1 submitted 2 June, 2018;
originally announced June 2018.
-
The Malicious Use of Artificial Intelligence: Forecasting, Prevention, and Mitigation
Authors:
Miles Brundage,
Shahar Avin,
Jack Clark,
Helen Toner,
Peter Eckersley,
Ben Garfinkel,
Allan Dafoe,
Paul Scharre,
Thomas Zeitzoff,
Bobby Filar,
Hyrum Anderson,
Heather Roff,
Gregory C. Allen,
Jacob Steinhardt,
Carrick Flynn,
Seán Ó hÉigeartaigh,
Simon Beard,
Haydn Belfield,
Sebastian Farquhar,
Clare Lyle,
Rebecca Crootof,
Owain Evans,
Michael Page,
Joanna Bryson,
Roman Yampolskiy
, et al. (1 additional authors not shown)
Abstract:
This report surveys the landscape of potential security threats from malicious uses of AI, and proposes ways to better forecast, prevent, and mitigate these threats. After analyzing the ways in which AI may influence the threat landscape in the digital, physical, and political domains, we make four high-level recommendations for AI researchers and other stakeholders. We also suggest several promis…
▽ More
This report surveys the landscape of potential security threats from malicious uses of AI, and proposes ways to better forecast, prevent, and mitigate these threats. After analyzing the ways in which AI may influence the threat landscape in the digital, physical, and political domains, we make four high-level recommendations for AI researchers and other stakeholders. We also suggest several promising areas for further research that could expand the portfolio of defenses, or make attacks less effective or harder to execute. Finally, we discuss, but do not conclusively resolve, the long-term equilibrium of attackers and defenders.
△ Less
Submitted 20 February, 2018;
originally announced February 2018.