-
Transformers in Reinforcement Learning: A Survey
Authors:
Pranav Agarwal,
Aamer Abdul Rahman,
Pierre-Luc St-Charles,
Simon J. D. Prince,
Samira Ebrahimi Kahou
Abstract:
Transformers have significantly impacted domains like natural language processing, computer vision, and robotics, where they improve performance compared to other neural networks. This survey explores how transformers are used in reinforcement learning (RL), where they are seen as a promising solution for addressing challenges such as unstable training, credit assignment, lack of interpretability, and partial observability. We begin by providing a brief domain overview of RL, followed by a discussion on the challenges of classical RL algorithms. Next, we delve into the properties of the transformer and its variants and discuss the characteristics that make them well-suited to address the challenges inherent in RL. We examine the application of transformers to various aspects of RL, including representation learning, transition and reward function modeling, and policy optimization. We also discuss recent research that aims to enhance the interpretability and efficiency of transformers in RL, using visualization techniques and efficient training strategies. Often, the transformer architecture must be tailored to the specific needs of a given application. We present a broad overview of how transformers have been adapted for several applications, including robotics, medicine, language modeling, cloud computing, and combinatorial optimization. We conclude by discussing the limitations of using transformers in RL and assessing their potential for catalyzing future breakthroughs in this field.
Submitted 12 July, 2023;
originally announced July 2023.
-
Deep Learning and Ethics
Authors:
Travis LaCroix,
Simon J. D. Prince
Abstract:
This article appears as chapter 21 of Prince (2023, Understanding Deep Learning); a complete draft of the textbook is available here: http://udlbook.com. This chapter considers potential harms arising from the design and use of AI systems. These include algorithmic bias, lack of explainability, data privacy violations, militarization, fraud, and environmental concerns. The aim is not to provide advice on being more ethical. Instead, the goal is to express ideas and start conversations in key areas that have received attention in philosophy, political science, and the broader social sciences.
Submitted 20 June, 2023; v1 submitted 24 May, 2023;
originally announced May 2023.
-
PCaaD: Towards Automated Determination and Exploitation of Industrial Processes
Authors:
B. Green,
W. Knowles,
M. Krotofil,
R. Derbyshire,
D. Prince,
N. Suri
Abstract:
Over the last decade, Programmable Logic Controllers (PLCs) have been increasingly targeted by attackers to obtain control over industrial processes that support critical services. Such targeted attacks typically require detailed knowledge of system-specific attributes, including hardware configurations, adopted protocols, and PLC control-logic, i.e. process comprehension. The consensus from both academics and practitioners suggests stealthy process comprehension obtained from a PLC alone, to conduct targeted attacks, is impractical. In contrast, we assert that current PLC programming practices open the door to a new vulnerability class based on control-logic constructs. To support this, we propose the concept of Process Comprehension at a Distance (PCaaD), as a novel methodological and automatable approach for system-agnostic exploitation of PLC library functions, leading to the targeted exfiltration of operational data, manipulation of control-logic behavior, and establishment of covert command and control channels through unused memory. We validate PCaaD on widely used PLCs, by identification of practical attacks.
Submitted 19 February, 2021;
originally announced February 2021.
-
Optimizing Deeper Transformers on Small Datasets
Authors:
Peng Xu,
Dhruv Kumar,
Wei Yang,
Wenjie Zi,
Keyi Tang,
Chenyang Huang,
Jackie Chi Kit Cheung,
Simon J. D. Prince,
Yanshuai Cao
Abstract:
It is a common belief that training deep transformers from scratch requires large datasets. Consequently, for small datasets, people usually use shallow and simple additional layers on top of pre-trained models during fine-tuning. This work shows that this does not always need to be the case: with proper initialization and optimization, the benefits of very deep transformers can carry over to challenging tasks with small datasets, including Text-to-SQL semantic parsing and logical reading comprehension. In particular, we successfully train 48 layers of transformers, comprising 24 fine-tuned layers from pre-trained RoBERTa and 24 relation-aware layers trained from scratch. With fewer training steps and no task-specific pre-training, we obtain state-of-the-art performance on the challenging cross-domain Text-to-SQL parsing benchmark Spider. We achieve this by deriving a novel Data-dependent Transformer Fixed-update initialization scheme (DT-Fixup), inspired by the prior T-Fixup work. Further error analysis shows that increasing depth can help improve generalization on small datasets for hard cases that require reasoning and structural understanding.
Submitted 31 May, 2021; v1 submitted 30 December, 2020;
originally announced December 2020.
-
Normalizing Flows: An Introduction and Review of Current Methods
Authors:
Ivan Kobyzev,
Simon J. D. Prince,
Marcus A. Brubaker
Abstract:
Normalizing Flows are generative models which produce tractable distributions where both sampling and density evaluation can be efficient and exact. The goal of this survey article is to give a coherent and comprehensive review of the literature around the construction and use of Normalizing Flows for distribution learning. We aim to provide context and explanation of the models, review current state-of-the-art literature, and identify open questions and promising future directions.
Submitted 5 June, 2020; v1 submitted 25 August, 2019;
originally announced August 2019.
-
Risks and Transaction Costs of Distributed-Ledger Fintech: Boundary Effects and Consequences
Authors:
Kim Kaivanto,
Daniel Prince
Abstract:
Fintech business models based on distributed ledgers -- and their smart-contract variants in particular -- offer the prospect of democratizing access to faster, anywhere-accessible, lower cost, reliable-and-secure high-quality financial services. In addition to holding great, economically transformative promise, these business models pose new, little-studied risks and transaction costs. However, these risks and transaction costs are not evident during the demonstration and testing phases of development, when adopters and users are drawn from the community of developers themselves, as well as from among non-programmer fintech evangelists. Hence, when the new risks and transaction costs become manifest -- as the fintech business models are rolled out across the wider economy -- the consequences may also appear to be new and surprising. The present study represents an effort to get ahead of these developments by delineating risks and transaction costs inherent in distributed-ledger- and smart-contracts-based fintech business models. The analysis focuses on code risk and moral-hazard risk, as well as on mixed-economy risks and the unintended consequences of replicating bricks-and-mortar-generation contract forms within the ultra-low transaction-cost environment of fintech.
Submitted 27 February, 2017;
originally announced February 2017.