-
Bandits with Preference Feedback: A Stackelberg Game Perspective
Authors:
Barna Pásztor,
Parnian Kassraie,
Andreas Krause
Abstract:
Bandits with preference feedback present a powerful tool for optimizing unknown target functions when only pairwise comparisons are allowed instead of direct value queries. This model allows for incorporating human feedback into online inference and optimization and has been employed in systems for fine-tuning large language models. The problem is well understood in simplified settings with linear target functions or over finite small domains that limit practical interest. Taking the next step, we consider infinite domains and nonlinear (kernelized) rewards. In this setting, selecting a pair of actions is quite challenging and requires balancing exploration and exploitation at two levels: within the pair, and along the iterations of the algorithm. We propose MAXMINLCB, which emulates this trade-off as a zero-sum Stackelberg game, and chooses action pairs that are informative and yield favorable rewards. MAXMINLCB consistently outperforms existing algorithms and satisfies an anytime-valid rate-optimal regret guarantee. This is due to our novel preference-based confidence sequences for kernelized logistic estimators.
Submitted 24 June, 2024;
originally announced June 2024.
-
Using graph neural networks to reconstruct charged pion showers in the CMS High Granularity Calorimeter
Authors:
M. Aamir,
B. Acar,
G. Adamov,
T. Adams,
C. Adloff,
S. Afanasiev,
C. Agrawal,
C. Agrawal,
A. Ahmad,
H. A. Ahmed,
S. Akbar,
N. Akchurin,
B. Akgul,
B. Akgun,
R. O. Akpinar,
E. Aktas,
A. AlKadhim,
V. Alexakhin,
J. Alimena,
J. Alison,
A. Alpana,
W. Alshehri,
P. Alvarez Dominguez,
M. Alyari,
C. Amendola
, et al. (550 additional authors not shown)
Abstract:
A novel method to reconstruct the energy of hadronic showers in the CMS High Granularity Calorimeter (HGCAL) is presented. The HGCAL is a sampling calorimeter with very fine transverse and longitudinal granularity. The active media are silicon sensors and scintillator tiles read out by SiPMs, and the absorbers are a combination of lead and Cu/CuW in the electromagnetic section, and steel in the hadronic section. The shower reconstruction method is based on graph neural networks and makes use of a dynamic reduction network architecture. It is shown that the algorithm is able to capture and mitigate the main effects that normally hinder the reconstruction of hadronic showers using classical reconstruction methods, by compensating for fluctuations in the multiplicity, energy, and spatial distributions of the shower's constituents. The performance of the algorithm is evaluated using test beam data collected in 2018 with a prototype of the CMS HGCAL accompanied by a section of the CALICE AHCAL prototype. The capability of the method to mitigate the impact of energy leakage from the calorimeter is also demonstrated.
Submitted 30 June, 2024; v1 submitted 17 June, 2024;
originally announced June 2024.
-
Standardizing Structural Causal Models
Authors:
Weronika Ormaniec,
Scott Sussex,
Lars Lorch,
Bernhard Schölkopf,
Andreas Krause
Abstract:
Synthetic datasets generated by structural causal models (SCMs) are commonly used for benchmarking causal structure learning algorithms. However, the variances and pairwise correlations in SCM data tend to increase along the causal ordering. Several popular algorithms exploit these artifacts, possibly leading to conclusions that do not generalize to real-world settings. Existing metrics like $\operatorname{Var}$-sortability and $\operatorname{R^2}$-sortability quantify these patterns, but they do not provide tools to remedy them. To address this, we propose internally-standardized structural causal models (iSCMs), a modification of SCMs that introduces a standardization operation at each variable during the generative process. By construction, iSCMs are not $\operatorname{Var}$-sortable, and as we show experimentally, not $\operatorname{R^2}$-sortable either for commonly-used graph families. Moreover, contrary to the post-hoc standardization of data generated by standard SCMs, we prove that linear iSCMs are less identifiable from prior knowledge on the weights and do not collapse to deterministic relationships in large systems, which may make iSCMs a useful model in causal inference beyond the benchmarking problem studied here.
Submitted 17 June, 2024;
originally announced June 2024.
-
Breeding Programs Optimization with Reinforcement Learning
Authors:
Omar G. Younis,
Luca Corinzia,
Ioannis N. Athanasiadis,
Andreas Krause,
Joachim M. Buhmann,
Matteo Turchetta
Abstract:
Crop breeding is crucial in improving agricultural productivity while potentially decreasing land usage, greenhouse gas emissions, and water consumption. However, breeding programs are challenging due to long turnover times, high-dimensional decision spaces, long-term objectives, and the need to adapt to rapid climate change. This paper introduces the use of Reinforcement Learning (RL) to optimize simulated crop breeding programs. RL agents are trained to make optimal crop selection and cross-breeding decisions based on genetic information. To benchmark RL-based breeding algorithms, we introduce a suite of Gym environments. The study demonstrates the superiority of RL techniques over standard practices in terms of genetic gain when simulated in silico using real-world genomic maize data.
Submitted 6 June, 2024;
originally announced June 2024.
-
Stochastic Bilevel Optimization with Lower-Level Contextual Markov Decision Processes
Authors:
Vinzenz Thoma,
Barna Pasztor,
Andreas Krause,
Giorgia Ramponi,
Yifan Hu
Abstract:
In various applications, the optimal policy in a strategic decision-making problem depends both on the environmental configuration and exogenous events. For these settings, we introduce Bilevel Optimization with Contextual Markov Decision Processes (BO-CMDP), a stochastic bilevel decision-making model, where the lower level consists of solving a contextual Markov Decision Process (CMDP). BO-CMDP can be viewed as a Stackelberg Game where the leader and a random context beyond the leader's control together decide the setup of (many) MDPs that (potentially multiple) followers best respond to. This framework extends beyond traditional bilevel optimization and finds relevance in diverse fields such as model design for MDPs, tax design, reward shaping and dynamic mechanism design. We propose a stochastic Hyper Policy Gradient Descent (HPGD) algorithm to solve BO-CMDP, and demonstrate its convergence. Notably, HPGD only utilizes observations of the followers' trajectories. Therefore, it allows followers to use any training procedure and the leader to be agnostic of the specific algorithm used, which aligns with various real-world scenarios. We further consider the setting when the leader can influence the training of followers and propose an accelerated algorithm. We empirically demonstrate the performance of our algorithm.
Submitted 3 June, 2024;
originally announced June 2024.
-
NeoRL: Efficient Exploration for Nonepisodic RL
Authors:
Bhavya Sukhija,
Lenart Treven,
Florian Dörfler,
Stelian Coros,
Andreas Krause
Abstract:
We study the problem of nonepisodic reinforcement learning (RL) for nonlinear dynamical systems, where the system dynamics are unknown and the RL agent has to learn from a single trajectory, i.e., without resets. We propose Nonepisodic Optimistic RL (NeoRL), an approach based on the principle of optimism in the face of uncertainty. NeoRL uses well-calibrated probabilistic models and plans optimistically w.r.t. the epistemic uncertainty about the unknown dynamics. Under continuity and bounded energy assumptions on the system, we provide a first-of-its-kind regret bound of $\mathcal{O}(β_T \sqrt{T Γ_T})$ for general nonlinear systems with Gaussian process dynamics. We compare NeoRL to other baselines on several deep RL environments and empirically demonstrate that NeoRL achieves the optimal average cost while incurring the least regret.
Submitted 4 June, 2024; v1 submitted 3 June, 2024;
originally announced June 2024.
-
When to Sense and Control? A Time-adaptive Approach for Continuous-Time RL
Authors:
Lenart Treven,
Bhavya Sukhija,
Yarden As,
Florian Dörfler,
Andreas Krause
Abstract:
Reinforcement learning (RL) excels in optimizing policies for discrete-time Markov decision processes (MDP). However, various systems are inherently continuous in time, making discrete-time MDPs an inexact modeling choice. In many applications, such as greenhouse control or medical treatments, each interaction (measurement or switching of action) involves manual intervention and thus is inherently costly. Therefore, we generally prefer a time-adaptive approach with fewer interactions with the system. In this work, we formalize an RL framework, Time-adaptive Control & Sensing (TaCoS), that tackles this challenge by optimizing over policies that besides control predict the duration of its application. Our formulation results in an extended MDP that any standard RL algorithm can solve. We demonstrate that state-of-the-art RL algorithms trained on TaCoS drastically reduce the interaction amount over their discrete-time counterpart while retaining the same or improved performance, and exhibiting robustness over discretization frequency. Finally, we propose OTaCoS, an efficient model-based algorithm for our setting. We show that OTaCoS enjoys sublinear regret for systems with sufficiently smooth dynamics and empirically results in further sample-efficiency gains.
Submitted 4 June, 2024; v1 submitted 3 June, 2024;
originally announced June 2024.
-
Safe Exploration Using Bayesian World Models and Log-Barrier Optimization
Authors:
Yarden As,
Bhavya Sukhija,
Andreas Krause
Abstract:
A major challenge in deploying reinforcement learning in online tasks is ensuring that safety is maintained throughout the learning process. In this work, we propose CERL, a new method for solving constrained Markov decision processes while keeping the policy safe during learning. Our method leverages Bayesian world models and suggests policies that are pessimistic w.r.t. the model's epistemic uncertainty. This makes CERL robust towards model inaccuracies and leads to safe exploration during learning. In our experiments, we demonstrate that CERL outperforms the current state-of-the-art in terms of safety and optimality in solving CMDPs from image observations.
Submitted 9 May, 2024;
originally announced May 2024.
-
On the $K$-theory of $\mathbf{Z}/p^n$
Authors:
Benjamin Antieau,
Achim Krause,
Thomas Nikolaus
Abstract:
We give an explicit algebraic description, based on prismatic cohomology, of the algebraic K-groups of rings of the form $O_K/I$ where $K$ is a p-adic field and $I$ is a non-trivial ideal in the ring of integers $O_K$; this class includes the rings $\mathbf{Z}/p^n$ where $p$ is a prime.
The algebraic description allows us to describe a practical algorithm to compute individual K-groups as well as to obtain several theoretical results: the vanishing of the even K-groups in high degrees, the determination of the orders of the odd K-groups in high degrees, and the degree of nilpotence of $v_1$ acting on the mod $p$ syntomic cohomology of $\mathbf{Z}/p^n$.
Submitted 7 May, 2024;
originally announced May 2024.
-
GrINd: Grid Interpolation Network for Scattered Observations
Authors:
Andrzej Dulny,
Paul Heinisch,
Andreas Hotho,
Anna Krause
Abstract:
Predicting the evolution of spatiotemporal physical systems from sparse and scattered observational data poses a significant challenge in various scientific domains. Traditional methods rely on dense grid-structured data, limiting their applicability in scenarios with sparse observations. To address this challenge, we introduce GrINd (Grid Interpolation Network for Scattered Observations), a novel network architecture that leverages the high performance of grid-based models by mapping scattered observations onto a high-resolution grid using a Fourier Interpolation Layer. In the high-resolution space, a NeuralPDE-class model predicts the system's state at future timepoints using differentiable ODE solvers and fully convolutional neural networks parametrizing the system's dynamics. We empirically evaluate GrINd on the DynaBench benchmark dataset, comprising six different physical systems observed at scattered locations, demonstrating its state-of-the-art performance compared to existing models. GrINd offers a promising approach for forecasting physical systems from sparse, scattered observational data, extending the applicability of deep learning methods to real-world scenarios with limited data availability.
Submitted 28 March, 2024;
originally announced March 2024.
-
Global Vegetation Modeling with Pre-Trained Weather Transformers
Authors:
Pascal Janetzky,
Florian Gallusser,
Simon Hentschel,
Andreas Hotho,
Anna Krause
Abstract:
Accurate vegetation models can produce further insights into the complex interaction between vegetation activity and ecosystem processes. Previous research has established that long-term trends and short-term variability of temperature and precipitation affect vegetation activity. Motivated by the recent success of Transformer-based Deep Learning models for medium-range weather forecasting, we adapt the publicly available pre-trained FourCastNet to model vegetation activity while accounting for the short-term dynamics of climate variability. We investigate how the learned global representation of the atmosphere's state can be transferred to model the normalized difference vegetation index (NDVI). Our model globally estimates vegetation activity at a resolution of $0.25^\circ$ while relying only on meteorological data. We demonstrate that leveraging pre-trained weather models improves the NDVI estimates compared to learning an NDVI model from scratch. Additionally, we compare our results to other recent data-driven NDVI modeling approaches from machine learning and ecology literature. We further provide experimental evidence on how much data and training time is necessary to turn FourCastNet into an effective vegetation model. Code and models will be made available upon publication.
Submitted 27 March, 2024;
originally announced March 2024.
-
A PAC-Bayesian Framework for Optimal Control with Stability Guarantees
Authors:
Mahrokh Ghoddousi Boroujeni,
Clara Lucía Galimberti,
Andreas Krause,
Giancarlo Ferrari-Trecate
Abstract:
Stochastic Nonlinear Optimal Control (SNOC) involves minimizing a cost function that averages out the random uncertainties affecting the dynamics of nonlinear systems. For tractability reasons, this problem is typically addressed by minimizing an empirical cost, which represents the average cost across a finite dataset of sampled disturbances. However, this approach raises the challenge of quantifying the control performance against out-of-sample uncertainties. Particularly, in scenarios where the training dataset is small, SNOC policies are prone to overfitting, resulting in significant discrepancies between the empirical cost and the true cost, i.e., the average SNOC cost incurred during control deployment. Therefore, establishing generalization bounds on the true cost is crucial for ensuring reliability in real-world applications. In this paper, we introduce a novel approach that leverages PAC-Bayes theory to provide rigorous generalization bounds for SNOC. Based on these bounds, we propose a new method for designing optimal controllers, offering a principled way to incorporate prior knowledge into the synthesis process, which aids in improving the control policy and mitigating overfitting. Furthermore, by leveraging recent parametrizations of stabilizing controllers for nonlinear systems, our framework inherently ensures closed-loop stability. The effectiveness of our proposed method in incorporating prior knowledge and combating overfitting is shown by designing neural network controllers for tasks in cooperative robotics.
Submitted 26 March, 2024;
originally announced March 2024.
-
Bridging the Sim-to-Real Gap with Bayesian Inference
Authors:
Jonas Rothfuss,
Bhavya Sukhija,
Lenart Treven,
Florian Dörfler,
Stelian Coros,
Andreas Krause
Abstract:
We present SIM-FSVGD for learning robot dynamics from data. As opposed to traditional methods, SIM-FSVGD leverages low-fidelity physical priors, e.g., in the form of simulators, to regularize the training of neural network models. While learning accurate dynamics already in the low data regime, SIM-FSVGD scales and excels also when more data is available. We empirically show that learning with implicit physical priors results in accurate mean model estimation as well as precise uncertainty quantification. We demonstrate the effectiveness of SIM-FSVGD in bridging the sim-to-real gap on a high-performance RC racecar system. Using model-based RL, we demonstrate a highly dynamic parking maneuver with drifting, using less than half the data compared to the state of the art.
Submitted 25 March, 2024;
originally announced March 2024.
-
Sound Event Detection and Localization with Distance Estimation
Authors:
Daniel Aleksander Krause,
Archontis Politis,
Annamaria Mesaros
Abstract:
Sound Event Detection and Localization (SELD) is a combined task of identifying sound events and their corresponding direction-of-arrival (DOA). While this task has numerous applications and has been extensively researched in recent years, it fails to provide full information about the sound source position. In this paper, we overcome this problem by extending the task to Sound Event Detection, Localization with Distance Estimation (3D SELD). We study two ways of integrating distance estimation within the SELD core - a multi-task approach, in which the problem is tackled by a separate model output, and a single-task approach obtained by extending the multi-ACCDOA method to include distance information. We investigate both methods for the Ambisonic and binaural versions of STARSS23: Sony-TAU Realistic Spatial Soundscapes 2023. Moreover, our study involves experiments on the loss function related to the distance estimation part. Our results show that it is possible to perform 3D SELD without any degradation of performance in sound event detection and DOA estimation.
Submitted 12 June, 2024; v1 submitted 18 March, 2024;
originally announced March 2024.
-
Transductive Active Learning: Theory and Applications
Authors:
Jonas Hübotter,
Bhavya Sukhija,
Lenart Treven,
Yarden As,
Andreas Krause
Abstract:
We generalize active learning to address real-world settings with concrete prediction targets where sampling is restricted to an accessible region of the domain, while prediction targets may lie outside this region. We analyze a family of decision rules that sample adaptively to minimize uncertainty about prediction targets. We are the first to show, under general regularity assumptions, that such decision rules converge uniformly to the smallest possible uncertainty obtainable from the accessible data. We demonstrate their strong sample efficiency in two key applications: Active few-shot fine-tuning of large neural networks and safe Bayesian optimization, where they improve significantly upon the state-of-the-art.
Submitted 22 May, 2024; v1 submitted 13 February, 2024;
originally announced February 2024.
-
Active Few-Shot Fine-Tuning
Authors:
Jonas Hübotter,
Bhavya Sukhija,
Lenart Treven,
Yarden As,
Andreas Krause
Abstract:
We study the question: How can we select the right data for fine-tuning to a specific task? We call this data selection problem active fine-tuning and show that it is an instance of transductive active learning, a novel generalization of classical active learning. We propose ITL, short for information-based transductive learning, an approach which samples adaptively to maximize information gained about the specified task. We are the first to show, under general regularity assumptions, that such decision rules converge uniformly to the smallest possible uncertainty obtainable from the accessible data. We apply ITL to the few-shot fine-tuning of large neural networks and show that fine-tuning with ITL learns the task with significantly fewer examples than the state-of-the-art.
Submitted 21 June, 2024; v1 submitted 13 February, 2024;
originally announced February 2024.
-
Transition Constrained Bayesian Optimization via Markov Decision Processes
Authors:
Jose Pablo Folch,
Calvin Tsay,
Robert M Lee,
Behrang Shafei,
Weronika Ormaniec,
Andreas Krause,
Mark van der Wilk,
Ruth Misener,
Mojmír Mutný
Abstract:
Bayesian optimization is a methodology to optimize black-box functions. Traditionally, it focuses on the setting where you can arbitrarily query the search space. However, many real-life problems do not offer this flexibility; in particular, the search space of the next query may depend on previous ones. Example challenges arise in the physical sciences in the form of local movement constraints, required monotonicity in certain variables, and transitions influencing the accuracy of measurements. Altogether, such transition constraints necessitate a form of planning. This work extends classical Bayesian optimization via the framework of Markov Decision Processes. We iteratively solve a tractable linearization of our utility function using reinforcement learning to obtain a policy that plans ahead for the entire horizon. This is a parallel to the optimization of an acquisition function in policy space. The resulting policy is potentially history-dependent and non-Markovian. We showcase applications in chemical reactor optimization, informative path planning, machine calibration, and other synthetic examples.
Submitted 29 May, 2024; v1 submitted 13 February, 2024;
originally announced February 2024.
-
Safe Guaranteed Exploration for Non-linear Systems
Authors:
Manish Prajapat,
Johannes Köhler,
Matteo Turchetta,
Andreas Krause,
Melanie N. Zeilinger
Abstract:
Safely exploring environments with a-priori unknown constraints is a fundamental challenge that restricts the autonomy of robots. While safety is paramount, guarantees on sufficient exploration are also crucial for ensuring autonomous task completion. To address these challenges, we propose a novel safe guaranteed exploration framework using optimal control, which achieves first-of-its-kind results: guaranteed exploration for non-linear systems with finite time sample complexity bounds, while being provably safe with arbitrarily high probability. The framework is general and applicable to many real-world scenarios with complex non-linear dynamics and unknown domains. Based on this framework we propose an efficient algorithm, SageMPC, SAfe Guaranteed Exploration using Model Predictive Control. SageMPC improves efficiency by incorporating three techniques: i) exploiting a Lipschitz bound, ii) goal-directed exploration, and iii) receding horizon style re-planning, all while maintaining the desired sample complexity, safety and exploration guarantees of the framework. Lastly, we demonstrate safe efficient exploration in challenging unknown environments using SageMPC with a car model.
Submitted 9 February, 2024;
originally announced February 2024.
-
Model-Based RL for Mean-Field Games is not Statistically Harder than Single-Agent RL
Authors:
Jiawei Huang,
Niao He,
Andreas Krause
Abstract:
We study the sample complexity of reinforcement learning (RL) in Mean-Field Games (MFGs) with model-based function approximation that requires strategic exploration to find a Nash Equilibrium policy. We introduce the Partial Model-Based Eluder Dimension (P-MBED), a more effective notion to characterize the model class complexity. Notably, P-MBED measures the complexity of the single-agent model class converted from the given mean-field model class, and potentially, can be exponentially lower than the MBED proposed by \citet{huang2023statistical}. We contribute a model elimination algorithm featuring a novel exploration strategy and establish sample complexity results polynomial w.r.t.~P-MBED. Crucially, our results reveal that, under the basic realizability and Lipschitz continuity assumptions, \emph{learning Nash Equilibrium in MFGs is no more statistically challenging than solving a logarithmic number of single-agent RL problems}. We further extend our results to Multi-Type MFGs, generalizing from conventional MFGs and involving multiple types of agents. This extension implies statistical tractability of a broader class of Markov Games through the efficacy of mean-field approximation. Finally, inspired by our theoretical algorithm, we present a heuristic approach with improved computational efficiency and empirically demonstrate its effectiveness.
Submitted 3 June, 2024; v1 submitted 8 February, 2024;
originally announced February 2024.
-
Personalized Federated Learning of Probabilistic Models: A PAC-Bayesian Approach
Authors:
Mahrokh Ghoddousi Boroujeni,
Andreas Krause,
Giancarlo Ferrari Trecate
Abstract:
Federated learning aims to infer a shared model from private and decentralized data stored locally by multiple clients. Personalized federated learning (PFL) goes one step further by adapting the global model to each client, enhancing the model's fit for different clients. A significant level of personalization is required for highly heterogeneous clients, but can be challenging to achieve, especially when they have small datasets. To address this problem, we propose a PFL algorithm named PAC-PFL for learning probabilistic models within a PAC-Bayesian framework that utilizes differential privacy to handle data-dependent priors. Our algorithm collaboratively learns a shared hyper-posterior and regards each client's posterior inference as the personalization step. By establishing and minimizing a generalization bound on the average true risk of clients, PAC-PFL effectively combats over-fitting. PAC-PFL achieves accurate and well-calibrated predictions, supported by experiments on a dataset of photovoltaic panel power generation, the FEMNIST dataset (Caldas et al., 2019), and the Dirichlet-partitioned EMNIST dataset (Cohen et al., 2017).
Submitted 16 January, 2024;
originally announced January 2024.
-
Interdependent Total Factor Productivity in an Input-Output model
Authors:
Thomas M. Bombarde,
Andrew L. Krause
Abstract:
Industries learn productivity improvements from their suppliers. The observed empirical importance of these interactions, often omitted by input-output models, warrants greater attention. This article embeds interdependent total factor productivity (TFP) growth into a general non-parametric input-output model. TFP growth is assumed to be Cobb-Douglas in TFP-stocks of adjacent sectors, where elasticities are the input-output coefficients. Studying how the steady state of the system reacts to changes in research effort bears insight for policy and the input-output literature. First, industries higher in the supply chain see a greater multiplication of their productivity gains. Second, the presence of 'laggard' industries can bottleneck the rest of the economy. By deriving these insights formally, we review a canonical method for aggregating TFP -- Hulten's Theorem -- and show the potential importance of backward linkages.
Submitted 23 December, 2023;
originally announced December 2023.
-
Witt vectors with coefficients and TR
Authors:
Emanuele Dotto,
Achim Krause,
Thomas Nikolaus,
Irakli Patchkoria
Abstract:
We give a new construction of $p$-typical Witt vectors with coefficients in terms of ghost maps and show that this construction is isomorphic to the one defined in terms of formal power series from the authors' previous paper. We show that our construction recovers Kaledin's polynomial Witt vectors in the case of vector spaces over a perfect field of characteristic $p$. We then identify the components of the $p$-typical TR with coefficients, originally defined by Lindenstrauss and McCarthy and later reworked by the second and third authors in joint work with McCandless, with the $p$-typical Witt vectors with coefficients. This extends a celebrated result of Hesselholt and Hesselholt-Madsen relating the components of TR with the Witt vectors. As an application, we give an algebraic description of the components of the Hill-Hopkins-Ravenel norm for cyclic $p$-groups in terms of $p$-typical Witt vectors with coefficients.
Submitted 20 December, 2023;
originally announced December 2023.
-
EquiReact: An equivariant neural network for chemical reactions
Authors:
Puck van Gerwen,
Ksenia R. Briling,
Charlotte Bunne,
Vignesh Ram Somnath,
Ruben Laplaza,
Andreas Krause,
Clemence Corminboeuf
Abstract:
Equivariant neural networks have considerably improved the accuracy and data-efficiency of predictions of molecular properties. Building on this success, we introduce EquiReact, an equivariant neural network to infer properties of chemical reactions, built from three-dimensional structures of reactants and products. We illustrate its competitive performance on the prediction of activation barriers on the GDB7-22-TS, Cyclo-23-TS and Proparg-21-TS datasets with different regimes according to the inclusion of atom-mapping information. We show that, compared to state-of-the-art models for reaction property prediction, EquiReact offers: (i) a flexible model with reduced sensitivity between atom-mapping regimes, (ii) better extrapolation capabilities to unseen chemistries, (iii) impressive prediction errors for datasets exhibiting subtle variations in three-dimensional geometries of reactants/products, (iv) reduced sensitivity to geometry quality and (v) excellent data efficiency.
Submitted 13 December, 2023;
originally announced December 2023.
-
Sinkhorn Flow: A Continuous-Time Framework for Understanding and Generalizing the Sinkhorn Algorithm
Authors:
Mohammad Reza Karimi,
Ya-Ping Hsieh,
Andreas Krause
Abstract:
Many problems in machine learning can be formulated as solving entropy-regularized optimal transport on the space of probability measures. The canonical approach involves the Sinkhorn iterates, renowned for their rich mathematical properties. Recently, the Sinkhorn algorithm has been recast within the mirror descent framework, thus benefiting from classical optimization theory insights. Here, we build upon this result by introducing a continuous-time analogue of the Sinkhorn algorithm. This perspective allows us to derive novel variants of Sinkhorn schemes that are robust to noise and bias. Moreover, our continuous-time dynamics not only generalize but also offer a unified perspective on several recently discovered dynamics in machine learning and mathematics, such as the "Wasserstein mirror flow" of (Deb et al. 2023) or the "mean-field Schrödinger equation" of (Claisse et al. 2023).
Submitted 28 November, 2023;
originally announced November 2023.
-
Grain boundary migration in polycrystalline $α$-Fe
Authors:
Zipeng Xu,
Yu-Feng Shen,
S. Kiana Naghibzadeh,
Xiaoyao Peng,
Vivekanand Muralikrishnan,
Siddharth Maddali,
David Menasche,
Amanda R. Krause,
Kaushik Dayal,
Robert M. Suter,
Gregory S. Rohrer
Abstract:
High energy x-ray diffraction microscopy was used to image the microstructure of $α$-Fe before and after a 600 $^\circ$C anneal. These data were used to determine the areas, curvatures, energies, and velocities of approximately 40,000 grain boundaries. The measured grain boundary properties depend on the five macroscopic grain boundary parameters. The velocities are not correlated with the product of the mean boundary curvature and grain boundary energy, usually assumed to be the driving force. Boundary migration is made up of area changes (lateral motion) and translation (normal motion) and both contribute to the total migration. Through the lateral motion component of the migration, low energy boundaries tend to expand in area while high energy boundaries shrink, reducing the average energy through grain boundary replacement. The driving force for this process is not related to curvature and might disrupt the expected curvature-velocity relationship.
Submitted 18 November, 2023;
originally announced November 2023.
-
Data-Efficient Task Generalization via Probabilistic Model-based Meta Reinforcement Learning
Authors:
Arjun Bhardwaj,
Jonas Rothfuss,
Bhavya Sukhija,
Yarden As,
Marco Hutter,
Stelian Coros,
Andreas Krause
Abstract:
We introduce PACOH-RL, a novel model-based Meta-Reinforcement Learning (Meta-RL) algorithm designed to efficiently adapt control policies to changing dynamics. PACOH-RL meta-learns priors for the dynamics model, allowing swift adaptation to new dynamics with minimal interaction data. Existing Meta-RL methods require abundant meta-learning data, limiting their applicability in settings such as robotics, where data is costly to obtain. To address this, PACOH-RL incorporates regularization and epistemic uncertainty quantification in both the meta-learning and task adaptation stages. When facing new dynamics, we use these uncertainty estimates to effectively guide exploration and data collection. Overall, this enables positive transfer, even when access to data from prior tasks or dynamic settings is severely limited. Our experiment results demonstrate that PACOH-RL outperforms model-based RL and model-based Meta-RL baselines in adapting to new dynamic conditions. Finally, on a real robotic car, we showcase the potential for efficient RL policy adaptation in diverse, data-scarce conditions.
Submitted 6 February, 2024; v1 submitted 13 November, 2023;
originally announced November 2023.
-
Likelihood Ratio Confidence Sets for Sequential Decision Making
Authors:
Nicolas Emmenegger,
Mojmír Mutný,
Andreas Krause
Abstract:
Certifiable, adaptive uncertainty estimates for unknown quantities are an essential ingredient of sequential decision-making algorithms. Standard approaches rely on problem-dependent concentration results and are limited to a specific combination of parameterization, noise family, and estimator. In this paper, we revisit the likelihood-based inference principle and propose to use likelihood ratios to construct any-time valid confidence sequences without requiring specialized treatment in each application scenario. Our method is especially suitable for problems with well-specified likelihoods, and the resulting sets always maintain the prescribed coverage in a model-agnostic manner. The size of the sets depends on a choice of estimator sequence in the likelihood ratio. We discuss how to provably choose the best sequence of estimators and shed light on connections to online convex optimization with algorithms such as Follow-the-Regularized-Leader. To counteract the initially large bias of the estimators, we propose a reweighting scheme that also opens up deployment in non-parametric settings such as RKHS function classes. We provide a non-asymptotic analysis of the likelihood ratio confidence sets size for generalized linear models, using insights from convex duality and online learning. We showcase the practical strength of our method on generalized linear bandit problems, survival analysis, and bandits with various additive noise distributions.
Submitted 7 November, 2023;
originally announced November 2023.
-
Riemannian stochastic optimization methods avoid strict saddle points
Authors:
Ya-Ping Hsieh,
Mohammad Reza Karimi,
Andreas Krause,
Panayotis Mertikopoulos
Abstract:
Many modern machine learning applications - from online principal component analysis to covariance matrix identification and dictionary learning - can be formulated as minimization problems on Riemannian manifolds, and are typically solved with a Riemannian stochastic gradient method (or some variant thereof). However, in many cases of interest, the resulting minimization problem is not geodesically convex, so the convergence of the chosen solver to a desirable solution - i.e., a local minimizer - is by no means guaranteed. In this paper, we study precisely this question, that is, whether stochastic Riemannian optimization algorithms are guaranteed to avoid saddle points with probability 1. For generality, we study a family of retraction-based methods which, in addition to having a potentially much lower per-iteration cost relative to Riemannian gradient descent, include other widely used algorithms, such as natural policy gradient methods and mirror descent in ordinary convex spaces. In this general setting, we show that, under mild assumptions for the ambient manifold and the oracle providing gradient information, the policies under study avoid strict saddle points / submanifolds with probability 1, from any initial condition. This result provides an important sanity check for the use of gradient methods on manifolds as it shows that, almost always, the limit state of a stochastic Riemannian algorithm can only be a local minimizer.
Submitted 4 November, 2023;
originally announced November 2023.
-
Efficient Exploration in Continuous-time Model-based Reinforcement Learning
Authors:
Lenart Treven,
Jonas Hübotter,
Bhavya Sukhija,
Florian Dörfler,
Andreas Krause
Abstract:
Reinforcement learning algorithms typically consider discrete-time dynamics, even though the underlying systems are often continuous in time. In this paper, we introduce a model-based reinforcement learning algorithm that represents continuous-time dynamics using nonlinear ordinary differential equations (ODEs). We capture epistemic uncertainty using well-calibrated probabilistic models, and use the optimistic principle for exploration. Our regret bounds surface the importance of the measurement selection strategy (MSS), since in continuous time we must decide not only how to explore, but also when to observe the underlying system. Our analysis demonstrates that the regret is sublinear when modeling ODEs with Gaussian Processes (GP) for common choices of MSS, such as equidistant sampling. Additionally, we propose an adaptive, data-dependent, practical MSS that, when combined with GP dynamics, also achieves sublinear regret with significantly fewer samples. We showcase the benefits of continuous-time modeling over its discrete-time counterpart, as well as our proposed adaptive MSS over standard baselines, on several applications.
Submitted 30 October, 2023;
originally announced October 2023.
-
Implicit Manifold Gaussian Process Regression
Authors:
Bernardo Fichera,
Viacheslav Borovitskiy,
Andreas Krause,
Aude Billard
Abstract:
Gaussian process regression is widely used because of its ability to provide well-calibrated uncertainty estimates and handle small or sparse datasets. However, it struggles with high-dimensional data. One possible way to scale this technique to higher dimensions is to leverage the implicit low-dimensional manifold upon which the data actually lies, as postulated by the manifold hypothesis. Prior work, however, ordinarily requires the manifold structure to be explicitly provided, i.e., given by a mesh or known to be one of the well-known manifolds like the sphere. In contrast, in this paper we propose a Gaussian process regression technique capable of inferring implicit structure directly from data (labeled and unlabeled) in a fully differentiable way. For the resulting model, we discuss its convergence to the Matérn Gaussian process on the assumed manifold. Our technique scales up to hundreds of thousands of data points, and may improve the predictive performance and calibration of the standard Gaussian process regression in high-dimensional settings.
Submitted 1 February, 2024; v1 submitted 30 October, 2023;
originally announced October 2023.
-
Intrinsic Gaussian Vector Fields on Manifolds
Authors:
Daniel Robert-Nicoud,
Andreas Krause,
Viacheslav Borovitskiy
Abstract:
Various applications ranging from robotics to climate science require modeling signals on non-Euclidean domains, such as the sphere. Gaussian process models on manifolds have recently been proposed for such tasks, in particular when uncertainty quantification is needed. In the manifold setting, vector-valued signals can behave very differently from scalar-valued ones, with much of the progress so far focused on modeling the latter. The former, however, are crucial for many applications, such as modeling wind speeds or force fields of unknown dynamical systems. In this paper, we propose novel Gaussian process models for vector-valued signals on manifolds that are intrinsically defined and account for the geometry of the space in consideration. We provide computational primitives needed to deploy the resulting Hodge-Matérn Gaussian vector fields on the two-dimensional sphere and the hypertori. Further, we highlight two generalization directions: discrete two-dimensional meshes and "ideal" manifolds like hyperspheres, Lie groups, and homogeneous spaces. Finally, we show that our Gaussian vector fields constitute considerably more refined inductive biases than the extrinsic fields proposed before.
Submitted 31 March, 2024; v1 submitted 28 October, 2023;
originally announced October 2023.
-
Contextual Stochastic Bilevel Optimization
Authors:
Yifan Hu,
Jie Wang,
Yao Xie,
Andreas Krause,
Daniel Kuhn
Abstract:
We introduce contextual stochastic bilevel optimization (CSBO) -- a stochastic bilevel optimization framework with the lower-level problem minimizing an expectation conditioned on some contextual information and the upper-level decision variable. This framework extends classical stochastic bilevel optimization to settings where the lower-level decision maker responds optimally not only to the decision of the upper-level decision maker but also to some side information, and where there are multiple or even infinitely many followers. It captures important applications such as meta-learning, personalized federated learning, end-to-end learning, and Wasserstein distributionally robust optimization with side information (WDRO-SI). Due to the presence of contextual information, existing single-loop methods for classical stochastic bilevel optimization are unable to converge. To overcome this challenge, we introduce an efficient double-loop gradient method based on the Multilevel Monte-Carlo (MLMC) technique and establish its sample and computational complexities. When specialized to stochastic nonconvex optimization, our method matches existing lower bounds. For meta-learning, the complexity of our method does not depend on the number of tasks. Numerical experiments further validate our theoretical results.
Submitted 27 October, 2023;
originally announced October 2023.
-
Causal Modeling with Stationary Diffusions
Authors:
Lars Lorch,
Andreas Krause,
Bernhard Schölkopf
Abstract:
We develop a novel approach towards causal inference. Rather than structural equations over a causal graph, we learn stochastic differential equations (SDEs) whose stationary densities model a system's behavior under interventions. These stationary diffusion models do not require the formalism of causal graphs, let alone the common assumption of acyclicity. We show that in several cases, they generalize to unseen interventions on their variables, often better than classical approaches. Our inference method is based on a new theoretical result that expresses a stationarity condition on the diffusion's generator in a reproducing kernel Hilbert space. The resulting kernel deviation from stationarity (KDS) is an objective function of independent interest.
Submitted 16 March, 2024; v1 submitted 26 October, 2023;
originally announced October 2023.
-
Prismatic cohomology relative to $δ$-rings
Authors:
Benjamin Antieau,
Achim Krause,
Thomas Nikolaus
Abstract:
We develop prismatic and syntomic cohomology relative to a $δ$-ring. This simultaneously generalizes Bhatt and Scholze's absolute and relative prismatic cohomology and shows that the latter, which was defined relative to a prism, is in fact independent of the prism structure and only depends on the underlying $δ$-ring. We give several possible definitions of our new version of prismatic cohomology: a site theoretic definition, one using prismatic crystals, and a stack theoretic definition. These are equivalent under mild syntomicity hypotheses. As an application, we note how the theory of prismatic cohomology of filtered rings arises naturally in this context.
Submitted 19 October, 2023;
originally announced October 2023.
-
DockGame: Cooperative Games for Multimeric Rigid Protein Docking
Authors:
Vignesh Ram Somnath,
Pier Giuseppe Sessa,
Maria Rodriguez Martinez,
Andreas Krause
Abstract:
Protein interactions and assembly formation are fundamental to most biological processes. Predicting the assembly structure from constituent proteins -- referred to as the protein docking task -- is thus a crucial step in protein design applications. Most traditional and deep learning methods for docking have focused mainly on binary docking, following either a search-based, regression-based, or generative modeling paradigm. In this paper, we focus on the less-studied multimeric (i.e., two or more proteins) docking problem. We introduce DockGame, a novel game-theoretic framework for docking -- we view protein docking as a cooperative game between proteins, where the final assembly structure(s) constitute stable equilibria w.r.t. the underlying game potential. Since we do not have access to the true potential, we consider two approaches - i) learning a surrogate game potential guided by physics-based energy functions and computing equilibria by simultaneous gradient updates, and ii) sampling from the Gibbs distribution of the true potential by learning a diffusion generative model over the action spaces (rotations and translations) of all proteins. Empirically, on the Docking Benchmark 5.5 (DB5.5) dataset, DockGame has much faster runtimes than traditional docking methods, can generate multiple plausible assembly structures, and achieves comparable performance to existing binary docking baselines, despite solving the harder task of coordinating multiple protein chains.
Submitted 9 October, 2023;
originally announced October 2023.
-
Distributionally Robust Model-based Reinforcement Learning with Large State Spaces
Authors:
Shyam Sundhar Ramesh,
Pier Giuseppe Sessa,
Yifan Hu,
Andreas Krause,
Ilija Bogunovic
Abstract:
Three major challenges in reinforcement learning are the complex dynamical systems with large state spaces, the costly data acquisition processes, and the deviation of real-world dynamics from the training environment deployment. To overcome these issues, we study distributionally robust Markov decision processes with continuous state spaces under the widely used Kullback-Leibler, chi-square, and total variation uncertainty sets. We propose a model-based approach that utilizes Gaussian Processes and the maximum variance reduction algorithm to efficiently learn multi-output nominal transition dynamics, leveraging access to a generative model (i.e., simulator). We further demonstrate the statistical sample complexity of the proposed method for different uncertainty sets. These complexity bounds are independent of the number of states and extend beyond linear dynamics, ensuring the effectiveness of our approach in identifying near-optimal distributionally-robust policies. The proposed method can be further combined with other model-free distributionally robust reinforcement learning methods to obtain a near-optimal robust policy. Experimental results demonstrate the robustness of our algorithm to distributional shifts and its superior performance in terms of the number of samples needed.
Submitted 5 September, 2023;
originally announced September 2023.
-
Turing instabilities are not enough to ensure pattern formation
Authors:
Andrew L. Krause,
Eamonn A. Gaffney,
Thomas Jun Jewell,
Václav Klika,
Benjamin J. Walker
Abstract:
Symmetry-breaking instabilities play an important role in understanding the mechanisms underlying the diversity of patterns observed in nature, such as in Turing's reaction-diffusion theory, which connects cellular signalling and transport with the development of growth and form. Extensive literature focuses on the linear stability analysis of homogeneous equilibria in these systems, culminating in a set of conditions for transport-driven instabilities that are commonly presumed to initiate self-organisation. We demonstrate that a selection of simple, canonical transport models with only mild multistable non-linearities can satisfy the Turing instability conditions while also robustly exhibiting only transient patterns. Hence, a Turing-like instability is insufficient for the existence of a patterned state. While it is known that linear theory can fail to predict the formation of patterns, we demonstrate that such failures can appear robustly in systems with multiple stable homogeneous equilibria. Given that biological systems such as gene regulatory networks and spatially distributed ecosystems often exhibit a high degree of multistability and nonlinearity, this raises important questions of how to analyse prospective mechanisms for self-organisation.
Submitted 21 December, 2023; v1 submitted 29 August, 2023;
originally announced August 2023.
-
Digital Twin of the Radio Environment: A Novel Approach for Anomaly Detection in Wireless Networks
Authors:
Anton Krause,
Mohd Danish Khursheed,
Philipp Schulz,
Friedrich Burmeister,
Gerhard Fettweis
Abstract:
The increasing relevance of resilience in wireless connectivity for Industry 4.0 stems from the growing complexity and interconnectivity of industrial systems, where a single point of failure can disrupt the entire network, leading to significant downtime and productivity losses. It is thus essential to constantly monitor the network and identify any anomaly such as a jammer. Here, technologies envisioned to be integrated in 6G, in particular joint communications and sensing (JCAS) and accurate indoor positioning of transmitters, open up the possibility to build a digital twin (DT) of the radio environment. This paper proposes a new approach for anomaly detection in wireless networks enabled by such a DT, which allows contextual information on the network to be integrated into the anomaly detection procedure. The basic approach is to compare expected received signal strengths (RSSs) from the DT with measurements taken by distributed sensing units (SUs). Using simulations, different algorithms are compared regarding their ability to infer from this comparison the presence or absence of an anomaly, in particular a jammer. Overall, the feasibility of anomaly detection using the proposed approach is demonstrated, which contributes to the ongoing research on employing DTs for comprehensive monitoring of wireless networks.
Submitted 13 October, 2023; v1 submitted 14 August, 2023;
originally announced August 2023.
-
A note on quadratic forms
Authors:
Fabian Hebestreit,
Achim Krause,
Maxime Ramzi
Abstract:
For a field extension $L/K$ we consider maps that are quadratic over $L$ but whose polarisation is only bilinear over $K$. Our main result is that all such are automatically quadratic forms over $L$ in the usual sense if and only if $L/K$ is formally unramified. In particular, this shows that over finite and number fields, one of the axioms in the standard definition of quadratic forms is superfluous.
Submitted 6 February, 2024; v1 submitted 3 August, 2023;
originally announced August 2023.
-
Multitask Learning with No Regret: from Improved Confidence Bounds to Active Learning
Authors:
Pier Giuseppe Sessa,
Pierre Laforgue,
Nicolò Cesa-Bianchi,
Andreas Krause
Abstract:
Multitask learning is a powerful framework that enables one to simultaneously learn multiple related tasks by sharing information between them. Quantifying uncertainty in the estimated tasks is of pivotal importance for many downstream applications, such as online or active learning. In this work, we provide novel multitask confidence intervals in the challenging agnostic setting, i.e., when neither the similarity between tasks nor the tasks' features are available to the learner. The obtained intervals do not require i.i.d. data and can be directly applied to bound the regret in online learning. Through a refined analysis of the multitask information gain, we obtain new regret guarantees that, depending on a task similarity parameter, can significantly improve over treating tasks independently. We further propose a novel online learning algorithm that achieves such improved regret without knowing this parameter in advance, i.e., automatically adapting to task similarity. As a second key application of our results, we introduce a novel multitask active learning setup where several tasks must be simultaneously optimized, but only one of them can be queried for feedback by the learner at each round. For this problem, we design a no-regret algorithm that uses our confidence intervals to decide which task should be queried. Finally, we empirically validate our bounds and algorithms on synthetic and real-world (drug discovery) data.
Submitted 3 August, 2023;
originally announced August 2023.
-
VisualPDE: rapid interactive simulations of partial differential equations
Authors:
Benjamin J. Walker,
Adam K. Townsend,
Alexander K. Chudasama,
Andrew L. Krause
Abstract:
Computing has revolutionised the study of complex nonlinear systems, both by allowing us to solve previously intractable models and through the ability to visualise solutions in different ways. Using ubiquitous computing infrastructure, we provide a means to go one step further in using computers to understand complex models through instantaneous and interactive exploration. This ubiquitous infrastructure has enormous potential in education, outreach and research. Here, we present VisualPDE, an online, interactive solver for a broad class of 1D and 2D partial differential equation (PDE) systems. Abstract dynamical systems concepts such as symmetry-breaking instabilities, subcritical bifurcations and the role of initial data in multistable nonlinear models become much more intuitive when you can play with these models yourself, and immediately answer questions about how the system responds to changes in parameters, initial conditions, boundary conditions or even spatiotemporal forcing. Importantly, VisualPDE is freely available, open source and highly customisable. We give several examples in teaching, research and knowledge exchange, providing high-level discussions of how it may be employed in different settings. This includes designing web-based course materials structured around interactive simulations, or easily crafting specific simulations that can be shared with students or collaborators via a simple URL. We envisage VisualPDE becoming an invaluable resource for teaching and research in mathematical biology and beyond. We also hope that it inspires other efforts to make mathematics more interactive and accessible.
Submitted 16 October, 2023; v1 submitted 2 August, 2023;
originally announced August 2023.
-
Adversarial Causal Bayesian Optimization
Authors:
Scott Sussex,
Pier Giuseppe Sessa,
Anastasiia Makarova,
Andreas Krause
Abstract:
In Causal Bayesian Optimization (CBO), an agent intervenes on an unknown structural causal model to maximize a downstream reward variable. In this paper, we consider the generalization where other agents or external events also intervene on the system, which is key for enabling adaptiveness to non-stationarities such as weather changes, market forces, or adversaries. We formalize this generalization of CBO as Adversarial Causal Bayesian Optimization (ACBO) and introduce the first algorithm for ACBO with bounded regret: Causal Bayesian Optimization with Multiplicative Weights (CBO-MW). Our approach combines a classical online learning strategy with causal modeling of the rewards. To achieve this, it computes optimistic counterfactual reward estimates by propagating uncertainty through the causal graph. We derive regret bounds for CBO-MW that naturally depend on graph-related quantities. We further propose a scalable implementation for the case of combinatorial interventions and submodular rewards. Empirically, CBO-MW outperforms non-causal and non-adversarial Bayesian optimization methods on synthetic environments and environments based on real-world data. Our experiments include a realistic demonstration of how CBO-MW can be used to learn users' demand patterns in a shared mobility system and reposition vehicles in strategic areas.
Submitted 31 July, 2023;
originally announced July 2023.
-
Submodular Reinforcement Learning
Authors:
Manish Prajapat,
Mojmír Mutný,
Melanie N. Zeilinger,
Andreas Krause
Abstract:
In reinforcement learning (RL), rewards of states are typically considered additive, and following the Markov assumption, they are $\textit{independent}$ of states visited previously. In many important applications, such as coverage control, experiment design and informative path planning, rewards naturally have diminishing returns, i.e., their value decreases in light of similar states visited previously. To tackle this, we propose $\textit{submodular RL}$ (SubRL), a paradigm which seeks to optimize more general, non-additive (and history-dependent) rewards modelled via submodular set functions which capture diminishing returns. Unfortunately, in general, even in tabular settings, we show that the resulting optimization problem is hard to approximate. On the other hand, motivated by the success of greedy algorithms in classical submodular optimization, we propose SubPO, a simple policy gradient-based algorithm for SubRL that handles non-additive rewards by greedily maximizing marginal gains. Indeed, under some assumptions on the underlying Markov Decision Process (MDP), SubPO recovers optimal constant factor approximations of submodular bandits. Moreover, we derive a natural policy gradient approach for locally optimizing SubRL instances even in large state and action spaces. We showcase the versatility of our approach by applying SubPO to several applications, such as biodiversity monitoring, Bayesian experiment design, informative path planning, and coverage maximization. Our results demonstrate sample efficiency, as well as scalability to high-dimensional state-action spaces.
Submitted 24 May, 2024; v1 submitted 25 July, 2023;
originally announced July 2023.
-
Anytime Model Selection in Linear Bandits
Authors:
Parnian Kassraie,
Nicolas Emmenegger,
Andreas Krause,
Aldo Pacchiano
Abstract:
Model selection in the context of bandit optimization is a challenging problem, as it requires balancing exploration and exploitation not only for action selection, but also for model selection. One natural approach is to rely on online learning algorithms that treat different models as experts. Existing methods, however, scale poorly ($\text{poly}M$) with the number of models $M$ in terms of their regret. Our key insight is that, for model selection in linear bandits, we can emulate full-information feedback to the online learner with a favorable bias-variance trade-off. This allows us to develop ALEXP, which has an exponentially improved ($\log M$) dependence on $M$ for its regret. ALEXP has anytime guarantees on its regret, and neither requires knowledge of the horizon $n$, nor relies on an initial purely exploratory stage. Our approach utilizes a novel time-uniform analysis of the Lasso, establishing a new connection between online learning and high-dimensional statistics.
Submitted 12 November, 2023; v1 submitted 24 July, 2023;
originally announced July 2023.
-
Patterning of nonlocal transport models in biology: the impact of spatial dimension
Authors:
Thomas Jun Jewell,
Andrew L. Krause,
Philip K. Maini,
Eamonn A. Gaffney
Abstract:
Throughout developmental biology and ecology, transport can be driven by nonlocal interactions. Examples include cells that migrate based on contact with pseudopodia extended from other cells, and animals that move based on their vision of other animals. Nonlocal integro-PDE models have been used to investigate contact attraction and repulsion in cell populations in 1D. In this paper, we generalise the analysis of pattern formation in such a model from 1D to higher spatial dimensions. Numerical simulations in 2D demonstrate complex behaviour in the model, including spatio-temporal patterns, multi-stability, and the selection of spots or stripes heavily depending on interactions being attractive or repulsive. Through linear stability analysis in $N$ dimensions, we demonstrate how, unlike in local Turing reaction-diffusion models, the capacity for pattern formation fundamentally changes with dimensionality for this nonlocal model. Most notably, pattern formation is possible only in more than one spatial dimension for both the single species system with repulsive interactions, and the two species system with 'run-and-chase' interactions. The latter case may be relevant to zebrafish stripe formation, which has been shown to be driven by run-and-chase dynamics between melanophore and xanthophore pigment cells.
Submitted 6 July, 2023;
originally announced July 2023.
-
Safe Model-Based Multi-Agent Mean-Field Reinforcement Learning
Authors:
Matej Jusup,
Barna Pásztor,
Tadeusz Janik,
Kenan Zhang,
Francesco Corman,
Andreas Krause,
Ilija Bogunovic
Abstract:
Many applications, e.g., in shared mobility, require coordinating a large number of agents. Mean-field reinforcement learning addresses the resulting scalability challenge by optimizing the policy of a representative agent interacting with the infinite population of identical agents instead of considering individual pairwise interactions. In this paper, we address an important generalization where there exist global constraints on the distribution of agents (e.g., requiring capacity constraints or minimum coverage requirements to be met). We propose Safe-M$^3$-UCRL, the first model-based mean-field reinforcement learning algorithm that attains safe policies even in the case of unknown transitions. As a key ingredient, it uses epistemic uncertainty in the transition model within a log-barrier approach to ensure pessimistic constraints satisfaction with high probability. Beyond the synthetic swarm motion benchmark, we showcase Safe-M$^3$-UCRL on the vehicle repositioning problem faced by many shared mobility operators and evaluate its performance through simulations built on vehicle trajectory data from a service provider in Shenzhen. Our algorithm effectively meets the demand in critical areas while ensuring service accessibility in regions with low demand.
Submitted 27 December, 2023; v1 submitted 29 June, 2023;
originally announced June 2023.
-
TaylorPDENet: Learning PDEs from non-grid Data
Authors:
Paul Heinisch,
Andrzej Dulny,
Anna Krause,
Andreas Hotho
Abstract:
Modeling data obtained from dynamical systems has gained attention in recent years as a challenging task for machine learning models. Previous approaches assume the measurements to be distributed on a grid. However, for real-world applications like weather prediction, the observations are taken from arbitrary locations within the spatial domain. In this paper, we propose TaylorPDENet - a novel machine learning method that is designed to overcome this challenge. Our algorithm uses the multidimensional Taylor expansion of a dynamical system at each observation point to estimate the spatial derivatives to perform predictions. TaylorPDENet is able to accomplish two objectives simultaneously: accurately forecast the evolution of a complex dynamical system and explicitly reconstruct the underlying differential equation describing the system. We evaluate our model on a variety of advection-diffusion equations with different parameters and show that it performs similarly to equivalent approaches on grid-structured data while being able to process unstructured data as well.
Submitted 26 June, 2023;
originally announced June 2023.
-
Safe Risk-averse Bayesian Optimization for Controller Tuning
Authors:
Christopher Koenig,
Miks Ozols,
Anastasia Makarova,
Efe C. Balta,
Andreas Krause,
Alisa Rupenyan
Abstract:
Controller tuning and parameter optimization are crucial in system design to improve both the controller and underlying system performance. Bayesian optimization has been established as an efficient model-free method for controller tuning and adaptation. Standard methods, however, are not enough for high-precision systems to be robust with respect to unknown input-dependent noise and stable under safety constraints. In this work, we present a novel data-driven approach, RaGoOSE, for safe controller tuning in the presence of heteroscedastic noise, combining safe learning with risk-averse Bayesian optimization. We demonstrate the method on a synthetic benchmark and compare its performance to established BO-based tuning methods. We further evaluate RaGoOSE performance on a real precision-motion system utilized in semiconductor industry applications and compare it to the built-in auto-tuning routine.
Submitted 23 June, 2023;
originally announced June 2023.
-
Optimistic Active Exploration of Dynamical Systems
Authors:
Bhavya Sukhija,
Lenart Treven,
Cansu Sancaktar,
Sebastian Blaes,
Stelian Coros,
Andreas Krause
Abstract:
Reinforcement learning algorithms commonly seek to optimize policies for solving one particular task. How should we explore an unknown dynamical system such that the estimated model globally approximates the dynamics and allows us to solve multiple downstream tasks in a zero-shot manner? In this paper, we address this challenge by developing an algorithm -- OPAX -- for active exploration. OPAX uses well-calibrated probabilistic models to quantify the epistemic uncertainty about the unknown dynamics. It optimistically -- w.r.t. plausible dynamics -- maximizes the information gain between the unknown dynamics and state observations. We show how the resulting optimization problem can be reduced to an optimal control problem that can be solved at each episode using standard approaches. We analyze our algorithm for general models, and, in the case of Gaussian process dynamics, we give a first-of-its-kind sample complexity bound and show that the epistemic uncertainty converges to zero. In our experiments, we compare OPAX with other heuristic active exploration approaches on several environments. Our experiments show that OPAX is not only theoretically sound but also performs well for zero-shot planning on novel downstream tasks.
Submitted 30 October, 2023; v1 submitted 21 June, 2023;
originally announced June 2023.
-
"We've Disabled MFA for You": An Evaluation of the Security and Usability of Multi-Factor Authentication Recovery Deployments
Authors:
Sabrina Amft,
Sandra Höltervennhoff,
Nicolas Huaman,
Alexander Krause,
Lucy Simko,
Yasemin Acar,
Sascha Fahl
Abstract:
Multi-Factor Authentication is intended to strengthen the security of password-based authentication by adding another factor, such as hardware tokens or one-time passwords using mobile apps. However, this increased authentication security comes with potential drawbacks that can lead to account and asset loss. If users lose access to their additional authentication factors for any reason, they will be locked out of their accounts. Consequently, services that provide Multi-Factor Authentication should deploy procedures, both secure and easy to use, that allow their users to recover from losing access to their additional factor. In this work, we investigate the security and user experience of Multi-Factor Authentication recovery procedures, and compare their deployment to descriptions on help and support pages. We first evaluate the official help and support pages of 1,303 websites that provide Multi-Factor Authentication and collect documented information about their recovery procedures. Second, we select a subset of 71 websites, create accounts, set up Multi-Factor Authentication, and perform an in-depth investigation of their recovery procedure security and user experience. We find that many websites deploy insecure Multi-Factor Authentication recovery procedures that allowed us to circumvent and disable Multi-Factor Authentication when we had access to the accounts' associated email addresses. Furthermore, we commonly observed discrepancies between our in-depth analysis and the official help and support pages, implying that information meant to aid users is often either incorrect or outdated. Based on our findings, we provide recommendations for best practices regarding Multi-Factor Authentication recovery.
Submitted 19 September, 2023; v1 submitted 16 June, 2023;
originally announced June 2023.