-
Transductive Active Learning: Theory and Applications
Authors:
Jonas Hübotter,
Bhavya Sukhija,
Lenart Treven,
Yarden As,
Andreas Krause
Abstract:
We generalize active learning to address real-world settings with concrete prediction targets where sampling is restricted to an accessible region of the domain, while prediction targets may lie outside this region. We analyze a family of decision rules that sample adaptively to minimize uncertainty about prediction targets. We are the first to show, under general regularity assumptions, that such decision rules converge uniformly to the smallest possible uncertainty obtainable from the accessible data. We demonstrate their strong sample efficiency in two key applications: Active few-shot fine-tuning of large neural networks and safe Bayesian optimization, where they improve significantly upon the state-of-the-art.
Submitted 22 May, 2024; v1 submitted 13 February, 2024;
originally announced February 2024.
-
Active Few-Shot Fine-Tuning
Authors:
Jonas Hübotter,
Bhavya Sukhija,
Lenart Treven,
Yarden As,
Andreas Krause
Abstract:
We study the question: How can we select the right data for fine-tuning to a specific task? We call this data selection problem active fine-tuning and show that it is an instance of transductive active learning, a novel generalization of classical active learning. We propose ITL, short for information-based transductive learning, an approach which samples adaptively to maximize information gained about the specified task. We are the first to show, under general regularity assumptions, that such decision rules converge uniformly to the smallest possible uncertainty obtainable from the accessible data. We apply ITL to the few-shot fine-tuning of large neural networks and show that fine-tuning with ITL learns the task with significantly fewer examples than the state-of-the-art.
Submitted 21 June, 2024; v1 submitted 13 February, 2024;
originally announced February 2024.
-
Efficient Exploration in Continuous-time Model-based Reinforcement Learning
Authors:
Lenart Treven,
Jonas Hübotter,
Bhavya Sukhija,
Florian Dörfler,
Andreas Krause
Abstract:
Reinforcement learning algorithms typically consider discrete-time dynamics, even though the underlying systems are often continuous in time. In this paper, we introduce a model-based reinforcement learning algorithm that represents continuous-time dynamics using nonlinear ordinary differential equations (ODEs). We capture epistemic uncertainty using well-calibrated probabilistic models, and use the optimistic principle for exploration. Our regret bounds surface the importance of the measurement selection strategy (MSS), since in continuous time we not only must decide how to explore, but also when to observe the underlying system. Our analysis demonstrates that the regret is sublinear when modeling ODEs with Gaussian Processes (GP) for common choices of MSS, such as equidistant sampling. Additionally, we propose an adaptive, data-dependent, practical MSS that, when combined with GP dynamics, also achieves sublinear regret with significantly fewer samples. We showcase the benefits of continuous-time modeling over its discrete-time counterpart, as well as our proposed adaptive MSS over standard baselines, on several applications.
Submitted 30 October, 2023;
originally announced October 2023.
-
Tuning Legged Locomotion Controllers via Safe Bayesian Optimization
Authors:
Daniel Widmer,
Dongho Kang,
Bhavya Sukhija,
Jonas Hübotter,
Andreas Krause,
Stelian Coros
Abstract:
This paper presents a data-driven strategy to streamline the deployment of model-based controllers in legged robotic hardware platforms. Our approach leverages a model-free safe learning algorithm to automate the tuning of control gains, addressing the mismatch between the simplified model used in the control formulation and the real system. This method substantially mitigates the risk of hazardous interactions with the robot by sample-efficiently optimizing parameters within a probably safe region. Additionally, we extend the applicability of our approach to incorporate the different gait parameters as contexts, leading to a safe, sample-efficient exploration algorithm capable of tuning a motion controller for diverse gait patterns. We validate our method through simulation and hardware experiments, where we demonstrate that the algorithm obtains superior performance on tuning a model-based motion controller for multiple gaits safely.
Submitted 25 October, 2023; v1 submitted 12 June, 2023;
originally announced June 2023.
-
A Cut-Matching Game for Constant-Hop Expanders
Authors:
Bernhard Haeupler,
Jonas Huebotter,
Mohsen Ghaffari
Abstract:
This paper provides a cut-strategy that produces constant-hop expanders in the well-known cut-matching game framework.
Constant-hop expanders strengthen expanders with constant conductance by guaranteeing that any demand can be (obliviously) routed along constant-hop paths - in contrast to the $\Omega(\log n)$-hop routes in expanders.
Cut-matching games for expanders are key tools for obtaining close-to-linear-time approximation algorithms for many hard problems, including finding (balanced or approximately-largest) sparse cuts, certifying the expansion of a graph by embedding an (explicit) expander, as well as computing expander decompositions, hierarchical cut decompositions, oblivious routings, multi-cuts, and multicommodity flows. The cut-matching game provided in this paper is crucial in extending this versatile and powerful machinery to constant-hop expanders. It is also a key ingredient towards close-to-linear-time algorithms for computing a constant approximation of multicommodity flows and multi-cuts - the approximation factor being a constant relies on the expanders being constant-hop.
Submitted 21 November, 2022;
originally announced November 2022.
-
Learning Policies for Continuous Control via Transition Models
Authors:
Justus Huebotter,
Serge Thill,
Marcel van Gerven,
Pablo Lanillos
Abstract:
It is doubtful that animals have perfect inverse models of their limbs (e.g., what muscle contraction must be applied to every joint to reach a particular location in space). However, in robot control, moving an arm's end-effector to a target position or along a target trajectory requires accurate forward and inverse models. Here we show that by learning the transition (forward) model from interaction, we can use it to drive the learning of an amortized policy. Hence, we revisit policy optimization in relation to the deep active inference framework and describe a modular neural network architecture that simultaneously learns the system dynamics from prediction errors and the stochastic policy that generates suitable continuous control commands to reach a desired reference position. We evaluated the model by comparing it against the baseline of a linear quadratic regulator, and conclude with additional steps to take toward human-like motor control.
Submitted 16 September, 2022;
originally announced September 2022.
-
Training Deep Spiking Auto-encoders without Bursting or Dying Neurons through Regularization
Authors:
Justus F. Hübotter,
Pablo Lanillos,
Jakub M. Tomczak
Abstract:
Spiking neural networks are a promising approach towards next-generation models of the brain in computational neuroscience. Moreover, compared to classic artificial neural networks, they could serve as an energy-efficient deployment of AI by enabling fast computation in specialized neuromorphic hardware. However, training deep spiking neural networks, especially in an unsupervised manner, is challenging and the performance of a spiking model is significantly hindered by dead or bursting neurons. Here, we apply end-to-end learning with membrane potential-based backpropagation to a spiking convolutional auto-encoder with multiple trainable layers of leaky integrate-and-fire neurons. We propose bio-inspired regularization methods to control the spike density in latent representations. In the experiments, we show that applying regularization on membrane potential and spiking output successfully avoids both dead and bursting neurons and significantly decreases the reconstruction error of the spiking auto-encoder. Training regularized networks on the MNIST dataset yields image reconstruction quality comparable to non-spiking baseline models (deterministic and variational auto-encoder) and indicates improvement upon earlier approaches. Importantly, we show that, unlike the variational auto-encoder, the spiking latent representations display structure associated with the image class.
Submitted 22 September, 2021;
originally announced September 2021.
-
Implementation of Algorithms for Right-Sizing Data Centers
Authors:
Jonas Hübotter
Abstract:
The energy consumption of data centers accounts for a significant fraction of the world's overall energy consumption. Most data centers are statically provisioned, leading to a very low average utilization of servers. In this work, we survey uni-dimensional and high-dimensional approaches for dynamically powering up and powering down servers to reduce the energy footprint of data centers while ensuring that incoming jobs are processed in time. We implement algorithms for smoothed online convex optimization and variations thereof where, in each round, the agent receives a convex cost function. The agent seeks to balance minimizing this cost and a movement cost associated with changing decisions between rounds. We implement the algorithms in their most general form, inviting future research on their performance in other application areas. We evaluate the algorithms for the application of right-sizing data centers using traces from Facebook, Microsoft, Alibaba, and Los Alamos National Lab. Our experiments show that the online algorithms perform close to the dynamic offline optimum in practice and promise a significant cost reduction compared to a static provisioning of servers. We discuss how features of the data center model and trace impact the performance. Finally, we investigate the practical use of predictions to achieve further cost reductions.
Submitted 21 August, 2021;
originally announced August 2021.