Position Paper: Technical Research and Talent is Needed for Effective AI Governance

Authors: Anka Reuel, Lisa Soder, Ben Bucknall, Trond Arne Undheim

Abstract: In light of recent advancements in AI capabilities and the increasingly widespread integration of AI systems into society, governments worldwide are actively seeking to mitigate the potential harms and risks associated with these technologies through regulation and other governance tools. However, there exist significant gaps between governance aspirations and the current state of the technical tooling necessary for their realisation. In this position paper, we survey policy documents published by public-sector institutions in the EU, US, and China to highlight specific areas of disconnect between the technical requirements necessary for enacting proposed policy actions, and the current technical state of the art. Our analysis motivates a call for tighter integration of the AI/ML research community within AI governance in order to i) catalyse technical research aimed at bridging the gap between current and supposed technical underpinnings of regulatory action, as well as ii) increase the level of technical expertise within governing institutions so as to inform and guide effective governance of AI.

Submitted 11 June, 2024; originally announced June 2024.

Comments: 9 pages, 3 figures, Proceedings of the 41st International Conference on Machine Learning, Vienna, Austria. PMLR 235, 2024

arXiv:2406.04554 [pdf, ps, other]

Generative AI Needs Adaptive Governance

Authors: Anka Reuel, Trond Arne Undheim

Abstract: Because of the speed of its development, broad scope of application, and its ability to augment human performance, generative AI challenges the very notions of governance, trust, and human agency. The technology's capacity to mimic human knowledge work, feedback loops including significant uptick in users, research, investor, policy, and media attention, data and compute resources, all lead to rapidly increasing capabilities. For those reasons, adaptive governance, where AI governance and AI co-evolve, is essential for governing generative AI. In sharp contrast to traditional governance's regulatory regimes that are based on a mix of rigid one-and-done provisions for disclosure, registration and risk management, which in the case of AI carry the potential for regulatory misalignment, this paper argues that generative AI calls for adaptive governance. We define adaptive governance in the context of AI and outline an adaptive AI governance framework. We outline actors, roles, as well as both shared and actors-specific policy activities. We further provide examples of how the framework could be operationalized in practice. We then explain that the adaptive AI governance stance is not without its risks and limitations, such as insufficient oversight, insufficient depth, regulatory uncertainty, and regulatory capture, and provide potential approaches to fix these shortcomings.

Submitted 6 June, 2024; originally announced June 2024.

arXiv:2405.19522 [pdf]

Artificial Intelligence Index Report 2024

Authors: Nestor Maslej, Loredana Fattorini, Raymond Perrault, Vanessa Parli, Anka Reuel, Erik Brynjolfsson, John Etchemendy, Katrina Ligett, Terah Lyons, James Manyika, Juan Carlos Niebles, Yoav Shoham, Russell Wald, Jack Clark

Abstract: The 2024 Index is our most comprehensive to date and arrives at an important moment when AI's influence on society has never been more pronounced. This year, we have broadened our scope to more extensively cover essential trends such as technical advancements in AI, public perceptions of the technology, and the geopolitical dynamics surrounding its development. Featuring more original data than ever before, this edition introduces new estimates on AI training costs, detailed analyses of the responsible AI landscape, and an entirely new chapter dedicated to AI's impact on science and medicine. The AI Index report tracks, collates, distills, and visualizes data related to artificial intelligence (AI). Our mission is to provide unbiased, rigorously vetted, broadly sourced data in order for policymakers, researchers, executives, journalists, and the general public to develop a more thorough and nuanced understanding of the complex field of AI. The AI Index is recognized globally as one of the most credible and authoritative sources for data and insights on artificial intelligence. Previous editions have been cited in major newspapers, including The New York Times, Bloomberg, and The Guardian, have amassed hundreds of academic citations, and been referenced by high-level policymakers in the United States, the United Kingdom, and the European Union, among other places. This year's edition surpasses all previous ones in size, scale, and scope, reflecting the growing significance that AI is coming to hold in all of our lives.

Submitted 29 May, 2024; originally announced May 2024.

arXiv:2405.06909 [pdf, ps, other]

Fairness in Reinforcement Learning: A Survey

Authors: Anka Reuel, Devin Ma

Abstract: While our understanding of fairness in machine learning has significantly progressed, our understanding of fairness in reinforcement learning (RL) remains nascent. Most of the attention has been on fairness in one-shot classification tasks; however, real-world, RL-enabled systems (e.g., autonomous vehicles) are much more complicated in that agents operate in dynamic environments over a long period of time. To ensure the responsible development and deployment of these systems, we must better understand fairness in RL. In this paper, we survey the literature to provide the most up-to-date snapshot of the frontiers of fairness in RL. We start by reviewing where fairness considerations can arise in RL, then discuss the various definitions of fairness in RL that have been put forth thus far. We continue to highlight the methodologies researchers used to implement fairness in single- and multi-agent RL systems before showcasing the distinct application domains that fair RL has been investigated in. Finally, we critically examine gaps in the literature, such as understanding fairness in the context of RLHF, that still need to be addressed in future work to truly operationalize fair RL in real-world systems.

Submitted 11 May, 2024; originally announced May 2024.

Comments: 10 pages

ACM Class: A.1; I.2

arXiv:2401.03408 [pdf, other]

doi 10.1145/3630106.3658942

Escalation Risks from Language Models in Military and Diplomatic Decision-Making

Authors: Juan-Pablo Rivera, Gabriel Mukobi, Anka Reuel, Max Lamparth, Chandler Smith, Jacquelyn Schneider

Abstract: Governments are increasingly considering integrating autonomous AI agents in high-stakes military and foreign-policy decision-making, especially with the emergence of advanced generative AI models like GPT-4. Our work aims to scrutinize the behavior of multiple AI agents in simulated wargames, specifically focusing on their predilection to take escalatory actions that may exacerbate multilateral conflicts. Drawing on political science and international relations literature about escalation dynamics, we design a novel wargame simulation and scoring framework to assess the escalation risks of actions taken by these agents in different scenarios. Contrary to prior studies, our research provides both qualitative and quantitative insights and focuses on large language models (LLMs). We find that all five studied off-the-shelf LLMs show forms of escalation and difficult-to-predict escalation patterns. We observe that models tend to develop arms-race dynamics, leading to greater conflict, and in rare cases, even to the deployment of nuclear weapons. Qualitatively, we also collect the models' reported reasonings for chosen actions and observe worrying justifications based on deterrence and first-strike tactics. Given the high stakes of military and foreign-policy contexts, we recommend further examination and cautious consideration before deploying autonomous language model agents for strategic military or diplomatic decision-making.

Submitted 7 January, 2024; originally announced January 2024.

Comments: 10 pages body, 57 pages appendix, 46 figures, 11 tables

Journal ref: The 2024 ACM Conference on Fairness, Accountability, and Transparency (FAccT 24), June 3-6, 2024, Rio de Janeiro, Brazil

arXiv:2308.15514 [pdf, other]

International Governance of Civilian AI: A Jurisdictional Certification Approach

Authors: Robert Trager, Ben Harack, Anka Reuel, Allison Carnegie, Lennart Heim, Lewis Ho, Sarah Kreps, Ranjit Lall, Owen Larter, Seán Ó hÉigeartaigh, Simon Staffell, José Jaime Villalobos

Abstract: This report describes trade-offs in the design of international governance arrangements for civilian artificial intelligence (AI) and presents one approach in detail. This approach represents the extension of a standards, licensing, and liability regime to the global level. We propose that states establish an International AI Organization (IAIO) to certify state jurisdictions (not firms or AI projects) for compliance with international oversight standards. States can give force to these international standards by adopting regulations prohibiting the import of goods whose supply chains embody AI from non-IAIO-certified jurisdictions. This borrows attributes from models of existing international organizations, such as the International Civilian Aviation Organization (ICAO), the International Maritime Organization (IMO), and the Financial Action Task Force (FATF). States can also adopt multilateral controls on the export of AI product inputs, such as specialized hardware, to non-certified jurisdictions. Indeed, both the import and export standards could be required for certification. As international actors reach consensus on risks of and minimum standards for advanced AI, a jurisdictional certification regime could mitigate a broad range of potential harms, including threats to public safety.

Submitted 11 September, 2023; v1 submitted 29 August, 2023; originally announced August 2023.

arXiv:2304.07249 [pdf, other]

How to design an AI ethics board

Authors: Jonas Schuett, Anka Reuel, Alexis Carlier

Abstract: Organizations that develop and deploy artificial intelligence (AI) systems need to take measures to reduce the associated risks. In this paper, we examine how AI companies could design an AI ethics board in a way that reduces risks from AI. We identify five high-level design choices: (1) What responsibilities should the board have? (2) What should its legal structure be? (3) Who should sit on the board? (4) How should it make decisions and should its decisions be binding? (5) What resources does it need? We break down each of these questions into more specific sub-questions, list options, and discuss how different design choices affect the board's ability to reduce risks from AI. Several failures have shown that designing an AI ethics board can be challenging. This paper provides a toolbox that can help AI companies to overcome these challenges.

Submitted 14 April, 2023; originally announced April 2023.

Comments: 21 pages, 2 figures, 2 tables

arXiv:2302.12461 [pdf, other]

doi 10.1145/3630106.3659042

Analyzing And Editing Inner Mechanisms Of Backdoored Language Models

Authors: Max Lamparth, Anka Reuel

Abstract: Poisoning of data sets is a potential security threat to large language models that can lead to backdoored models. A description of the internal mechanisms of backdoored language models and how they process trigger inputs, e.g., when switching to toxic language, has yet to be found. In this work, we study the internal representations of transformer-based backdoored language models and determine early-layer MLP modules as most important for the backdoor mechanism in combination with the initial embedding projection. We use this knowledge to remove, insert, and modify backdoor mechanisms with engineered replacements that reduce the MLP module outputs to essentials for the backdoor mechanism. To this end, we introduce PCP ablation, where we replace transformer modules with low-rank matrices based on the principal components of their activations. We demonstrate our results on backdoored toy, backdoored large, and non-backdoored open-source models. We show that we can improve the backdoor robustness of large language models by locally constraining individual modules during fine-tuning on potentially poisonous data sets. Trigger warning: Offensive language.

Submitted 3 May, 2024; v1 submitted 24 February, 2023; originally announced February 2023.

Comments: Final version accepted at FAccT 24

Journal ref: The 2024 ACM Conference on Fairness, Accountability, and Transparency (FAccT 24), June 3-6, 2024, Rio de Janeiro, Brazil

Showing 1–8 of 8 results for author: Reuel, A