Search | arXiv e-print repository

Training Compute Thresholds: Features and Functions in AI Governance

Abstract: This paper examines the use of training compute thresholds as a tool for governing artificial intelligence (AI) systems. We argue that compute thresholds serve as a valuable trigger for further evaluation of AI models, rather than being the sole determinant of the regulation. Key advantages of compute thresholds include their correlation with model capabilities and risks, quantifiability, ease of… ▽ More This paper examines the use of training compute thresholds as a tool for governing artificial intelligence (AI) systems. We argue that compute thresholds serve as a valuable trigger for further evaluation of AI models, rather than being the sole determinant of the regulation. Key advantages of compute thresholds include their correlation with model capabilities and risks, quantifiability, ease of measurement, robustness to circumvention, knowability before model development and deployment, potential for external verification, and targeted scope. Compute thresholds provide a practical starting point for identifying potentially high-risk models and can be used as an initial filter in AI governance frameworks alongside other sector-specific regulations and broader governance measures. △ Less

Submitted 17 May, 2024; originally announced May 2024.

Comments: Working paper

arXiv:2405.10295 [pdf]

Societal Adaptation to Advanced AI

Authors: Jamie Bernardi, Gabriel Mukobi, Hilary Greaves, Lennart Heim, Markus Anderljung

Abstract: Existing strategies for managing risks from advanced AI systems often focus on affecting what AI systems are developed and how they diffuse. However, this approach becomes less feasible as the number of developers of advanced AI grows, and impedes beneficial use-cases as well as harmful ones. In response, we urge a complementary approach: increasing societal adaptation to advanced AI, that is, red… ▽ More Existing strategies for managing risks from advanced AI systems often focus on affecting what AI systems are developed and how they diffuse. However, this approach becomes less feasible as the number of developers of advanced AI grows, and impedes beneficial use-cases as well as harmful ones. In response, we urge a complementary approach: increasing societal adaptation to advanced AI, that is, reducing the expected negative impacts from a given level of diffusion of a given AI capability. We introduce a conceptual framework which helps identify adaptive interventions that avoid, defend against and remedy potentially harmful uses of AI systems, illustrated with examples in election manipulation, cyberterrorism, and loss of control to AI decision-makers. We discuss a three-step cycle that society can implement to adapt to AI. Increasing society's ability to implement this cycle builds its resilience to advanced AI. We conclude with concrete recommendations for governments, industry, and third-parties. △ Less

Submitted 16 May, 2024; originally announced May 2024.

arXiv:2404.02675 [pdf, other]

Responsible Reporting for Frontier AI Development

Authors: Noam Kolt, Markus Anderljung, Joslyn Barnhart, Asher Brass, Kevin Esvelt, Gillian K. Hadfield, Lennart Heim, Mikel Rodriguez, Jonas B. Sandbrink, Thomas Woodside

Abstract: Mitigating the risks from frontier AI systems requires up-to-date and reliable information about those systems. Organizations that develop and deploy frontier systems have significant access to such information. By reporting safety-critical information to actors in government, industry, and civil society, these organizations could improve visibility into new and emerging risks posed by frontier sy… ▽ More Mitigating the risks from frontier AI systems requires up-to-date and reliable information about those systems. Organizations that develop and deploy frontier systems have significant access to such information. By reporting safety-critical information to actors in government, industry, and civil society, these organizations could improve visibility into new and emerging risks posed by frontier systems. Equipped with this information, developers could make better informed decisions on risk management, while policymakers could design more targeted and robust regulatory infrastructure. We outline the key features of responsible reporting and propose mechanisms for implementing them in practice. △ Less

Submitted 3 April, 2024; originally announced April 2024.

arXiv:2403.08501 [pdf, other]

Governing Through the Cloud: The Intermediary Role of Compute Providers in AI Regulation

Authors: Lennart Heim, Tim Fist, Janet Egan, Sihao Huang, Stephen Zekany, Robert Trager, Michael A Osborne, Noa Zilberman

Abstract: As jurisdictions around the world take their first steps toward regulating the most powerful AI systems, such as the EU AI Act and the US Executive Order 14110, there is a growing need for effective enforcement mechanisms that can verify compliance and respond to violations. We argue that compute providers should have legal obligations and ethical responsibilities associated with AI development an… ▽ More As jurisdictions around the world take their first steps toward regulating the most powerful AI systems, such as the EU AI Act and the US Executive Order 14110, there is a growing need for effective enforcement mechanisms that can verify compliance and respond to violations. We argue that compute providers should have legal obligations and ethical responsibilities associated with AI development and deployment, both to provide secure infrastructure and to serve as intermediaries for AI regulation. Compute providers can play an essential role in a regulatory ecosystem via four key capacities: as securers, safeguarding AI systems and critical infrastructure; as record keepers, enhancing visibility for policymakers; as verifiers of customer activities, ensuring oversight; and as enforcers, taking actions against rule violations. We analyze the technical feasibility of performing these functions in a targeted and privacy-conscious manner and present a range of technical instruments. In particular, we describe how non-confidential information, to which compute providers largely already have access, can provide two key governance-relevant properties of a computational workload: its type-e.g., large-scale training or inference-and the amount of compute it has consumed. Using AI Executive Order 14110 as a case study, we outline how the US is beginning to implement record kee** requirements for compute providers. We also explore how verification and enforcement roles could be added to establish a comprehensive AI compute oversight scheme. We argue that internationalization will be key to effective implementation, and highlight the critical challenge of balancing confidentiality and privacy with risk mitigation as the role of compute providers in AI regulation expands. △ Less

Submitted 26 March, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

Comments: v2: Fixing affiliations, formatting errors, and vector graphics

arXiv:2402.08797 [pdf, other]

Computing Power and the Governance of Artificial Intelligence

Authors: Girish Sastry, Lennart Heim, Haydn Belfield, Markus Anderljung, Miles Brundage, Julian Hazell, Cullen O'Keefe, Gillian K. Hadfield, Richard Ngo, Konstantin Pilz, George Gor, Emma Bluemke, Sarah Shoker, Janet Egan, Robert F. Trager, Shahar Avin, Adrian Weller, Yoshua Bengio, Diane Coyle

Abstract: Computing power, or "compute," is crucial for the development and deployment of artificial intelligence (AI) capabilities. As a result, governments and companies have started to leverage compute as a means to govern AI. For example, governments are investing in domestic compute capacity, controlling the flow of compute to competing countries, and subsidizing compute access to certain sectors. Howe… ▽ More Computing power, or "compute," is crucial for the development and deployment of artificial intelligence (AI) capabilities. As a result, governments and companies have started to leverage compute as a means to govern AI. For example, governments are investing in domestic compute capacity, controlling the flow of compute to competing countries, and subsidizing compute access to certain sectors. However, these efforts only scratch the surface of how compute can be used to govern AI development and deployment. Relative to other key inputs to AI (data and algorithms), AI-relevant compute is a particularly effective point of intervention: it is detectable, excludable, and quantifiable, and is produced via an extremely concentrated supply chain. These characteristics, alongside the singular importance of compute for cutting-edge AI models, suggest that governing compute can contribute to achieving common policy objectives, such as ensuring the safety and beneficial use of AI. More precisely, policymakers could use compute to facilitate regulatory visibility of AI, allocate resources to promote beneficial outcomes, and enforce restrictions against irresponsible or malicious AI development and usage. However, while compute-based policies and technologies have the potential to assist in these areas, there is significant variation in their readiness for implementation. Some ideas are currently being piloted, while others are hindered by the need for fundamental research. Furthermore, naive or poorly scoped approaches to compute governance carry significant risks in areas like privacy, economic impacts, and centralization of power. We end by suggesting guardrails to minimize these risks from compute governance. △ Less

Submitted 13 February, 2024; originally announced February 2024.

Comments: Figures can be accessed at: https://github.com/lheim/CPGAI-Figures

arXiv:2401.13138 [pdf, other]

Visibility into AI Agents

Authors: Alan Chan, Carson Ezell, Max Kaufmann, Kevin Wei, Lewis Hammond, Herbie Bradley, Emma Bluemke, Nitarshan Rajkumar, David Krueger, Noam Kolt, Lennart Heim, Markus Anderljung

Abstract: Increased delegation of commercial, scientific, governmental, and personal activities to AI agents -- systems capable of pursuing complex goals with limited supervision -- may exacerbate existing societal risks and introduce new risks. Understanding and mitigating these risks involves critically evaluating existing governance structures, revising and adapting these structures where needed, and ens… ▽ More Increased delegation of commercial, scientific, governmental, and personal activities to AI agents -- systems capable of pursuing complex goals with limited supervision -- may exacerbate existing societal risks and introduce new risks. Understanding and mitigating these risks involves critically evaluating existing governance structures, revising and adapting these structures where needed, and ensuring accountability of key stakeholders. Information about where, why, how, and by whom certain AI agents are used, which we refer to as visibility, is critical to these objectives. In this paper, we assess three categories of measures to increase visibility into AI agents: agent identifiers, real-time monitoring, and activity logging. For each, we outline potential implementations that vary in intrusiveness and informativeness. We analyze how the measures apply across a spectrum of centralized through decentralized deployment contexts, accounting for various actors in the supply chain including hardware and software service providers. Finally, we discuss the implications of our measures for privacy and concentration of power. Further work into understanding the measures and mitigating their negative impacts can help to build a foundation for the governance of AI agents. △ Less

Submitted 17 May, 2024; v1 submitted 23 January, 2024; originally announced January 2024.

Comments: Accepted to ACM Conference on Fairness, Accountability, and Transparency (ACM FAccT 2024)

arXiv:2401.02452 [pdf, other]

The Compute Divide in Machine Learning: A Threat to Academic Contribution and Scrutiny?

Authors: Tamay Besiroglu, Sage Andrus Bergerson, Amelia Michael, Lennart Heim, Xueyun Luo, Neil Thompson

Abstract: There are pronounced differences in the extent to which industrial and academic AI labs use computing resources. We provide a data-driven survey of the role of the compute divide in sha** machine learning research. We show that a compute divide has coincided with a reduced representation of academic-only research teams in compute intensive research topics, especially foundation models. We argue… ▽ More There are pronounced differences in the extent to which industrial and academic AI labs use computing resources. We provide a data-driven survey of the role of the compute divide in sha** machine learning research. We show that a compute divide has coincided with a reduced representation of academic-only research teams in compute intensive research topics, especially foundation models. We argue that, academia will likely play a smaller role in advancing the associated techniques, providing critical evaluation and scrutiny, and in the diffusion of such models. Concurrent with this change in research focus, there is a noticeable shift in academic research towards embracing open source, pre-trained models developed within the industry. To address the challenges arising from this trend, especially reduced scrutiny of influential models, we recommend approaches aimed at thoughtfully expanding academic insights. Nationally-sponsored computing infrastructure coupled with open science initiatives could judiciously boost academic compute access, prioritizing research on interpretability, safety and security. Structured access programs and third-party auditing may also allow measured external evaluation of industry systems. △ Less

Submitted 8 January, 2024; v1 submitted 3 January, 2024; originally announced January 2024.

arXiv:2311.15377 [pdf, other]

Increased Compute Efficiency and the Diffusion of AI Capabilities

Authors: Konstantin Pilz, Lennart Heim, Nicholas Brown

Abstract: Training advanced AI models requires large investments in computational resources, or compute. Yet, as hardware innovation reduces the price of compute and algorithmic advances make its use more efficient, the cost of training an AI model to a given performance falls over time - a concept we describe as increasing compute efficiency. We find that while an access effect increases the number of acto… ▽ More Training advanced AI models requires large investments in computational resources, or compute. Yet, as hardware innovation reduces the price of compute and algorithmic advances make its use more efficient, the cost of training an AI model to a given performance falls over time - a concept we describe as increasing compute efficiency. We find that while an access effect increases the number of actors who can train models to a given performance over time, a performance effect simultaneously increases the performance available to each actor. This potentially enables large compute investors to pioneer new capabilities, maintaining a performance advantage even as capabilities diffuse. Since large compute investors tend to develop new capabilities first, it will be particularly important that they share information about their AI models, evaluate them for emerging risks, and, more generally, make responsible development and release decisions. Further, as compute efficiency increases, governments will need to prepare for a world where dangerous AI capabilities are widely available - for instance, by develo** defenses against harmful AI models or by actively intervening in the diffusion of particularly dangerous capabilities. △ Less

Submitted 13 February, 2024; v1 submitted 26 November, 2023; originally announced November 2023.

arXiv:2311.02651 [pdf]

Compute at Scale: A Broad Investigation into the Data Center Industry

Authors: Konstantin Pilz, Lennart Heim

Abstract: This report characterizes the data center industry and its importance for AI development. Data centers are industrial facilities that efficiently provide compute at scale and thus constitute the engine rooms of today's digital economy. As large-scale AI training and inference become increasingly computationally expensive, they are dominantly executed from this designated infrastructure. Key featur… ▽ More This report characterizes the data center industry and its importance for AI development. Data centers are industrial facilities that efficiently provide compute at scale and thus constitute the engine rooms of today's digital economy. As large-scale AI training and inference become increasingly computationally expensive, they are dominantly executed from this designated infrastructure. Key features of data centers include large-scale compute clusters that require extensive cooling and consume large amounts of power, the need for fast connectivity both within the data center and to the internet, and an emphasis on security and reliability. The global industry is valued at approximately $250B and is expected to double over the next seven years. There are likely about 500 large (above 10 MW) data centers globally, with the US, Europe, and China constituting the most important markets. The report further covers important actors, business models, main inputs, and typical locations of data centers. △ Less

Submitted 22 November, 2023; v1 submitted 5 November, 2023; originally announced November 2023.

arXiv:2310.13625 [pdf, other]

Oversight for Frontier AI through a Know-Your-Customer Scheme for Compute Providers

Authors: Janet Egan, Lennart Heim

Abstract: To address security and safety risks stemming from highly capable artificial intelligence (AI) models, we propose that the US government should ensure compute providers implement Know-Your-Customer (KYC) schemes. Compute - the computational power and infrastructure required to train and run these AI models - is emerging as a node for oversight. KYC, a standard developed by the banking sector to id… ▽ More To address security and safety risks stemming from highly capable artificial intelligence (AI) models, we propose that the US government should ensure compute providers implement Know-Your-Customer (KYC) schemes. Compute - the computational power and infrastructure required to train and run these AI models - is emerging as a node for oversight. KYC, a standard developed by the banking sector to identify and verify client identity, could provide a mechanism for greater public oversight of frontier AI development and close loopholes in existing export controls. Such a scheme has the potential to identify and warn stakeholders of potentially problematic and/or sudden advancements in AI capabilities, build government capacity for AI regulation, and allow for the development and implementation of more nuanced and targeted export controls. Unlike the strategy of limiting access to AI chip purchases, regulating the digital access to compute offers more precise controls, allowing regulatory control over compute quantities, as well as the flexibility to suspend access at any time. To enact a KYC scheme, the US government will need to work closely with industry to (1) establish a dynamic threshold of compute that effectively captures high-risk frontier model development, while minimizing imposition on developers not engaged in frontier AI; (2) set requirements and guidance for compute providers to keep records and report high-risk entities; (3) establish government capacity that allows for co-design, implementation, administration and enforcement of the scheme; and (4) engage internationally to promote international alignment with the scheme and support its long-term efficacy. While the scheme will not address all AI risks, it complements proposed solutions by allowing for a more precise and flexible approach to controlling the development of frontier AI models and unwanted AI proliferation. △ Less

Submitted 20 October, 2023; originally announced October 2023.

arXiv:2308.15514 [pdf, other]

International Governance of Civilian AI: A Jurisdictional Certification Approach

Authors: Robert Trager, Ben Harack, Anka Reuel, Allison Carnegie, Lennart Heim, Lewis Ho, Sarah Kreps, Ranjit Lall, Owen Larter, Seán Ó hÉigeartaigh, Simon Staffell, José Jaime Villalobos

Abstract: This report describes trade-offs in the design of international governance arrangements for civilian artificial intelligence (AI) and presents one approach in detail. This approach represents the extension of a standards, licensing, and liability regime to the global level. We propose that states establish an International AI Organization (IAIO) to certify state jurisdictions (not firms or AI proj… ▽ More This report describes trade-offs in the design of international governance arrangements for civilian artificial intelligence (AI) and presents one approach in detail. This approach represents the extension of a standards, licensing, and liability regime to the global level. We propose that states establish an International AI Organization (IAIO) to certify state jurisdictions (not firms or AI projects) for compliance with international oversight standards. States can give force to these international standards by adopting regulations prohibiting the import of goods whose supply chains embody AI from non-IAIO-certified jurisdictions. This borrows attributes from models of existing international organizations, such as the International Civilian Aviation Organization (ICAO), the International Maritime Organization (IMO), and the Financial Action Task Force (FATF). States can also adopt multilateral controls on the export of AI product inputs, such as specialized hardware, to non-certified jurisdictions. Indeed, both the import and export standards could be required for certification. As international actors reach consensus on risks of and minimum standards for advanced AI, a jurisdictional certification regime could mitigate a broad range of potential harms, including threats to public safety. △ Less

Submitted 11 September, 2023; v1 submitted 29 August, 2023; originally announced August 2023.

arXiv:2305.07153 [pdf, other]

Towards best practices in AGI safety and governance: A survey of expert opinion

Authors: Jonas Schuett, Noemi Dreksler, Markus Anderljung, David McCaffary, Lennart Heim, Emma Bluemke, Ben Garfinkel

Abstract: A number of leading AI companies, including OpenAI, Google DeepMind, and Anthropic, have the stated goal of building artificial general intelligence (AGI) - AI systems that achieve or exceed human performance across a wide range of cognitive tasks. In pursuing this goal, they may develop and deploy AI systems that pose particularly significant risks. While they have already taken some measures to… ▽ More A number of leading AI companies, including OpenAI, Google DeepMind, and Anthropic, have the stated goal of building artificial general intelligence (AGI) - AI systems that achieve or exceed human performance across a wide range of cognitive tasks. In pursuing this goal, they may develop and deploy AI systems that pose particularly significant risks. While they have already taken some measures to mitigate these risks, best practices have not yet emerged. To support the identification of best practices, we sent a survey to 92 leading experts from AGI labs, academia, and civil society and received 51 responses. Participants were asked how much they agreed with 50 statements about what AGI labs should do. Our main finding is that participants, on average, agreed with all of them. Many statements received extremely high levels of agreement. For example, 98% of respondents somewhat or strongly agreed that AGI labs should conduct pre-deployment risk assessments, dangerous capabilities evaluations, third-party model audits, safety restrictions on model usage, and red teaming. Ultimately, our list of statements may serve as a helpful foundation for efforts to develop best practices, standards, and regulations for AGI labs. △ Less

Submitted 11 May, 2023; originally announced May 2023.

Comments: 38 pages, 8 figures, 8 tables

arXiv:2211.04325 [pdf, other]

Will we run out of data? An analysis of the limits of scaling datasets in Machine Learning

Authors: Pablo Villalobos, Jaime Sevilla, Lennart Heim, Tamay Besiroglu, Marius Hobbhahn, Anson Ho

Abstract: We analyze the growth of dataset sizes used in machine learning for natural language processing and computer vision, and extrapolate these using two methods; using the historical growth rate and estimating the compute-optimal dataset size for future predicted compute budgets. We investigate the growth in data usage by estimating the total stock of unlabeled data available on the internet over the… ▽ More We analyze the growth of dataset sizes used in machine learning for natural language processing and computer vision, and extrapolate these using two methods; using the historical growth rate and estimating the compute-optimal dataset size for future predicted compute budgets. We investigate the growth in data usage by estimating the total stock of unlabeled data available on the internet over the coming decades. Our analysis indicates that the stock of high-quality language data will be exhausted soon; likely before 2026. By contrast, the stock of low-quality language data and image data will be exhausted only much later; between 2030 and 2050 (for low-quality language) and between 2030 and 2060 (for images). Our work suggests that the current trend of ever-growing ML models that rely on enormous datasets might slow down if data efficiency is not drastically improved or new sources of data become available. △ Less

Submitted 25 October, 2022; originally announced November 2022.

arXiv:2210.04610 [pdf, other]

Red-Teaming the Stable Diffusion Safety Filter

Authors: Javier Rando, Daniel Paleka, David Lindner, Lennart Heim, Florian Tramèr

Abstract: Stable Diffusion is a recent open-source image generation model comparable to proprietary models such as DALLE, Imagen, or Parti. Stable Diffusion comes with a safety filter that aims to prevent generating explicit images. Unfortunately, the filter is obfuscated and poorly documented. This makes it hard for users to prevent misuse in their applications, and to understand the filter's limitations a… ▽ More Stable Diffusion is a recent open-source image generation model comparable to proprietary models such as DALLE, Imagen, or Parti. Stable Diffusion comes with a safety filter that aims to prevent generating explicit images. Unfortunately, the filter is obfuscated and poorly documented. This makes it hard for users to prevent misuse in their applications, and to understand the filter's limitations and improve it. We first show that it is easy to generate disturbing content that bypasses the safety filter. We then reverse-engineer the filter and find that while it aims to prevent sexual content, it ignores violence, gore, and other similarly disturbing content. Based on our analysis, we argue safety measures in future model releases should strive to be fully open and properly documented to stimulate security contributions from the community. △ Less

Submitted 10 November, 2022; v1 submitted 3 October, 2022; originally announced October 2022.

Comments: ML Safety Workshop NeurIPS 2022

arXiv:2207.02852 [pdf, other]

Machine Learning Model Sizes and the Parameter Gap

Authors: Pablo Villalobos, Jaime Sevilla, Tamay Besiroglu, Lennart Heim, Anson Ho, Marius Hobbhahn

Abstract: We study trends in model size of notable machine learning systems over time using a curated dataset. From 1950 to 2018, model size in language models increased steadily by seven orders of magnitude. The trend then accelerated, with model size increasing by another five orders of magnitude in just 4 years from 2018 to 2022. Vision models grew at a more constant pace, totaling 7 orders of magnitude… ▽ More We study trends in model size of notable machine learning systems over time using a curated dataset. From 1950 to 2018, model size in language models increased steadily by seven orders of magnitude. The trend then accelerated, with model size increasing by another five orders of magnitude in just 4 years from 2018 to 2022. Vision models grew at a more constant pace, totaling 7 orders of magnitude of growth between 1950 and 2022. We also identify that, since 2020, there have been many language models below 20B parameters, many models above 70B parameters, but a scarcity of models in the 20-70B parameter range. We refer to that scarcity as the parameter gap. We provide some stylized facts about the parameter gap and propose a few hypotheses to explain it. The explanations we favor are: (a) increasing model size beyond 20B parameters requires adopting different parallelism techniques, which makes mid-sized models less cost-effective, (b) GPT-3 was one order of magnitude larger than previous language models, and researchers afterwards primarily experimented with bigger models to outperform it. While these dynamics likely exist, and we believe they play some role in generating the gap, we don't have high confidence that there are no other, more important dynamics at play. △ Less

Submitted 5 July, 2022; originally announced July 2022.

arXiv:2202.05924 [pdf, other]

doi 10.1109/IJCNN55064.2022.9891914

Compute Trends Across Three Eras of Machine Learning

Authors: Jaime Sevilla, Lennart Heim, Anson Ho, Tamay Besiroglu, Marius Hobbhahn, Pablo Villalobos

Abstract: Compute, data, and algorithmic advances are the three fundamental factors that guide the progress of modern Machine Learning (ML). In this paper we study trends in the most readily quantified factor - compute. We show that before 2010 training compute grew in line with Moore's law, doubling roughly every 20 months. Since the advent of Deep Learning in the early 2010s, the scaling of training compu… ▽ More Compute, data, and algorithmic advances are the three fundamental factors that guide the progress of modern Machine Learning (ML). In this paper we study trends in the most readily quantified factor - compute. We show that before 2010 training compute grew in line with Moore's law, doubling roughly every 20 months. Since the advent of Deep Learning in the early 2010s, the scaling of training compute has accelerated, doubling approximately every 6 months. In late 2015, a new trend emerged as firms developed large-scale ML models with 10 to 100-fold larger requirements in training compute. Based on these observations we split the history of compute in ML into three eras: the Pre Deep Learning Era, the Deep Learning Era and the Large-Scale Era. Overall, our work highlights the fast-growing compute requirements for training advanced ML systems. △ Less

Submitted 9 March, 2022; v1 submitted 11 February, 2022; originally announced February 2022.

Journal ref: 2022 International Joint Conference on Neural Networks (IJCNN), Padua, Italy, 2022, pp. 1-8

arXiv:2104.10645 [pdf, other]

Measuring what Really Matters: Optimizing Neural Networks for TinyML

Authors: Lennart Heim, Andreas Biri, Zhongnan Qu, Lothar Thiele

Abstract: With the surge of inexpensive computational and memory resources, neural networks (NNs) have experienced an unprecedented growth in architectural and computational complexity. Introducing NNs to resource-constrained devices enables cost-efficient deployments, widespread availability, and the preservation of sensitive data. This work addresses the challenges of bringing Machine Learning to MCUs, wh… ▽ More With the surge of inexpensive computational and memory resources, neural networks (NNs) have experienced an unprecedented growth in architectural and computational complexity. Introducing NNs to resource-constrained devices enables cost-efficient deployments, widespread availability, and the preservation of sensitive data. This work addresses the challenges of bringing Machine Learning to MCUs, where we focus on the ubiquitous ARM Cortex-M architecture. The detailed effects and trade-offs that optimization methods, software frameworks, and MCU hardware architecture have on key performance metrics such as inference latency and energy consumption have not been previously studied in depth for state-of-the-art frameworks such as TensorFlow Lite Micro. We find that empirical investigations which measure the perceptible metrics - performance as experienced by the user - are indispensable, as the impact of specialized instructions and layer types can be subtle. To this end, we propose an implementation-aware design as a cost-effective method for verification and benchmarking. Employing our developed toolchain, we demonstrate how existing NN deployments on resource-constrained devices can be improved by systematically optimizing NNs to their targeted application scenario. △ Less

Submitted 21 April, 2021; originally announced April 2021.

Showing 1–17 of 17 results for author: Heim, L