Deterministic Uncertainty Propagation for Improved Model-Based Offline Reinforcement Learning

Authors: Abdullah Akgül, Manuel Haußmann, Melih Kandemir

Abstract: Current approaches to model-based offline Reinforcement Learning (RL) often incorporate uncertainty-based reward penalization to address the distributional shift problem. While these approaches have achieved some success, we argue that this penalization introduces excessive conservatism, potentially resulting in suboptimal policies through underestimation. We identify as an important cause of over-penalization the lack of a reliable uncertainty estimator capable of propagating uncertainties in the Bellman operator. The common approach to calculating the penalty term relies on sampling-based uncertainty estimation, resulting in high variance. To address this challenge, we propose a novel method termed Moment Matching Offline Model-Based Policy Optimization (MOMBO). MOMBO learns a Q-function using moment matching, which allows us to deterministically propagate uncertainties through the Q-function. We evaluate MOMBO's performance across various environments and demonstrate empirically that MOMBO is a more stable and sample-efficient approach.

Submitted 6 June, 2024; originally announced June 2024.

arXiv:2402.05758

Latent variable model for high-dimensional point process with structured missingness

Authors: Maksim Sinelnikov, Manuel Haussmann, Harri Lähdesmäki

Abstract: Longitudinal data are important in numerous fields, such as healthcare, sociology and seismology, but real-world datasets present notable challenges for practitioners because they can be high-dimensional, contain structured missingness patterns, and measurement time points can be governed by an unknown stochastic process. While various solutions have been suggested, the majority of them have been designed to account for only one of these challenges. In this work, we propose a flexible and efficient latent-variable model that is capable of addressing all these limitations. Our approach utilizes Gaussian processes to capture temporal correlations between samples and their associated missingness masks as well as to model the underlying point process. We construct our model as a variational autoencoder together with deep neural network parameterised encoder and decoder models, and develop a scalable amortised variational inference approach for efficient model training. We demonstrate competitive performance using both simulated and real datasets.

Submitted 28 June, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

arXiv:2311.03002

Estimating treatment effects from single-arm trials via latent-variable modeling

Authors: Manuel Haussmann, Tran Minh Son Le, Viivi Halla-aho, Samu Kurki, Jussi V. Leinonen, Miika Koskinen, Samuel Kaski, Harri Lähdesmäki

Abstract: Randomized controlled trials (RCTs) are the accepted standard for treatment effect estimation but they can be infeasible due to ethical reasons and prohibitive costs. Single-arm trials, where all patients belong to the treatment group, can be a viable alternative but require access to an external control group. We propose an identifiable deep latent-variable model for this scenario that can also account for missing covariate observations by modeling their structured missingness patterns. Our method uses amortized variational inference to learn both group-specific and identifiable shared latent representations, which can subsequently be used for (i) patient matching if treatment outcomes are not available for the treatment group, or for (ii) direct treatment effect estimation assuming outcomes are available for both groups. We evaluate the model on a public benchmark as well as on a data set consisting of a published RCT study and real-world electronic health records. Compared to previous methods, our results show improved performance both for direct treatment effect estimation as well as for effect estimation via patient matching.

Submitted 4 March, 2024; v1 submitted 6 November, 2023; originally announced November 2023.

Comments: Published at the 27th International Conference on Artificial Intelligence and Statistics (AISTATS) 2024

arXiv:2306.10915

Practical Equivariances via Relational Conditional Neural Processes

Authors: Daolang Huang, Manuel Haussmann, Ulpu Remes, ST John, Grégoire Clarté, Kevin Sebastian Luck, Samuel Kaski, Luigi Acerbi

Abstract: Conditional Neural Processes (CNPs) are a class of metalearning models popular for combining the runtime efficiency of amortized inference with reliable uncertainty quantification. Many relevant machine learning tasks, such as in spatio-temporal modeling, Bayesian Optimization and continuous control, inherently contain equivariances -- for example to translation -- which the model can exploit for maximal performance. However, prior attempts to include equivariances in CNPs do not scale effectively beyond two input dimensions. In this work, we propose Relational Conditional Neural Processes (RCNPs), an effective approach to incorporate equivariances into any neural process model. Our proposed method extends the applicability and impact of equivariant neural processes to higher dimensions. We empirically demonstrate the competitive performance of RCNPs on a large array of tasks naturally containing equivariances.

Submitted 5 November, 2023; v1 submitted 19 June, 2023; originally announced June 2023.

Comments: 38 pages, 8 figures. Accepted at the 37th Conference on Neural Information Processing Systems (NeurIPS 2023)

arXiv:2301.12776

PAC-Bayesian Soft Actor-Critic Learning

Authors: Bahareh Tasdighi, Abdullah Akgül, Manuel Haussmann, Kenny Kazimirzak Brink, Melih Kandemir

Abstract: Actor-critic algorithms address the dual goals of reinforcement learning (RL), policy evaluation and improvement via two separate function approximators. The practicality of this approach comes at the expense of training instability, caused mainly by the destructive effect of the approximation errors of the critic on the actor. We tackle this bottleneck by employing an existing Probably Approximately Correct (PAC) Bayesian bound for the first time as the critic training objective of the Soft Actor-Critic (SAC) algorithm. We further demonstrate that online learning performance improves significantly when a stochastic actor explores multiple futures by critic-guided random search. We observe our resulting algorithm to compare favorably against the state-of-the-art SAC implementation on multiple classical control and locomotion tasks in terms of both sample efficiency and regret.

Submitted 10 June, 2024; v1 submitted 30 January, 2023; originally announced January 2023.

Comments: 19 pages, 2 figures

arXiv:2106.01216

Evidential Turing Processes

Authors: Melih Kandemir, Abdullah Akgül, Manuel Haussmann, Gozde Unal

Abstract: A probabilistic classifier with reliable predictive uncertainties i) fits successfully to the target domain data, ii) provides calibrated class probabilities in difficult regions of the target domain (e.g., class overlap), and iii) accurately identifies queries coming out of the target domain and rejects them. We introduce an original combination of Evidential Deep Learning, Neural Processes, and Neural Turing Machines capable of providing all three essential properties mentioned above for total uncertainty quantification. We observe our method on five classification tasks to be the only one that can excel in all three aspects of total calibration with a single standalone predictor. Our unified solution delivers an implementation-friendly and compute-efficient recipe for safety clearance and provides intellectual economy to an investigation of algorithmic roots of epistemic awareness in deep neural nets.

Submitted 8 March, 2022; v1 submitted 2 June, 2021; originally announced June 2021.

Comments: accepted at ICLR2022; camera ready version

arXiv:2104.04543

doi 10.21468/SciPostPhys.13.1.003

Understanding Event-Generation Networks via Uncertainties

Authors: Marco Bellagente, Manuel Haußmann, Michel Luchmann, Tilman Plehn

Abstract: Following the growing success of generative neural networks in LHC simulations, the crucial question is how to control the networks and assign uncertainties to their event output. We show how Bayesian normalizing flow or invertible networks capture uncertainties from the training and turn them into an uncertainty on the event weight. Fundamentally, the interplay between density and uncertainty estimates indicates that these networks learn functions in analogy to parameter fits rather than binned event counts.

Submitted 1 October, 2021; v1 submitted 9 April, 2021; originally announced April 2021.

Comments: 24 pages

Journal ref: SciPost Phys. 13, 003 (2022)

arXiv:2006.09914

Learning Partially Known Stochastic Dynamics with Empirical PAC Bayes

Authors: Manuel Haussmann, Sebastian Gerwinn, Andreas Look, Barbara Rakitsch, Melih Kandemir

Abstract: Neural Stochastic Differential Equations model a dynamical environment with neural nets assigned to their drift and diffusion terms. The high expressive power of their nonlinearity comes at the expense of instability in the identification of the large set of free parameters. This paper presents a recipe to improve the prediction accuracy of such models in three steps: i) accounting for epistemic uncertainty by assuming probabilistic weights, ii) incorporation of partial knowledge on the state dynamics, and iii) training the resultant hybrid model by an objective derived from a PAC-Bayesian generalization bound. We observe in our experiments that this recipe effectively translates partial and noisy prior knowledge into an improved model fit.

Submitted 26 February, 2021; v1 submitted 17 June, 2020; originally announced June 2020.

Comments: Accepted at AISTATS 2021

arXiv:1906.11471

Deep Active Learning with Adaptive Acquisition

Authors: Manuel Haussmann, Fred A. Hamprecht, Melih Kandemir

Abstract: Model selection is treated as a standard performance boosting step in many machine learning applications. Once all other properties of a learning problem are fixed, the model is selected by grid search on a held-out validation set. This is strictly inapplicable to active learning. Within the standardized workflow, the acquisition function is chosen among available heuristics a priori, and its success is observed only after the labeling budget is already exhausted. More importantly, none of the earlier studies report a unique consistently successful acquisition heuristic to the extent to stand out as the unique best choice. We present a method to break this vicious circle by defining the acquisition function as a learning predictor and training it by reinforcement feedback collected from each labeling round. As active learning is a scarce data regime, we bootstrap from a well-known heuristic that filters the bulk of data points on which all heuristics would agree, and learn a policy to warp the top portion of this ranking in the most beneficial way for the character of a specific data distribution. Our system consists of a Bayesian neural net, the predictor, a bootstrap acquisition function, a probabilistic state definition, and another Bayesian policy network that can effectively incorporate this input distribution. We observe on three benchmark data sets that our method always manages to either invent a new superior acquisition function or to adapt itself to the a priori unknown best performing heuristic for each specific data set.

Submitted 27 June, 2019; originally announced June 2019.

Comments: Accepted at IJCAI 2019

arXiv:1906.00816

Bayesian Evidential Deep Learning with PAC Regularization

Authors: Manuel Haussmann, Sebastian Gerwinn, Melih Kandemir

Abstract: We propose a novel method for closed-form predictive distribution modeling with neural nets. In quantifying prediction uncertainty, we build on Evidential Deep Learning, which has been impactful as being both simple to implement and giving closed-form access to predictive uncertainty. We employ it to model aleatoric uncertainty and extend it to account also for epistemic uncertainty by converting it to a Bayesian Neural Net. While extending its uncertainty quantification capabilities, we maintain its analytically accessible predictive distribution model by performing progressive moment matching for the first time for approximate weight marginalization. The eventual model introduces a prohibitively large number of hyperparameters for stable training. We overcome this drawback by deriving a vacuous PAC bound that comprises the marginal likelihood of the predictor and a complexity penalty. We observe on regression, classification, and out-of-domain detection benchmarks that our method improves model fit and uncertainty quantification.

Submitted 21 January, 2021; v1 submitted 3 June, 2019; originally announced June 2019.

Comments: Presented at AABI 2020

arXiv:1805.07654

Sampling-Free Variational Inference of Bayesian Neural Networks by Variance Backpropagation

Authors: Manuel Haussmann, Fred A. Hamprecht, Melih Kandemir

Abstract: We propose a new Bayesian Neural Net formulation that affords variational inference for which the evidence lower bound is analytically tractable subject to a tight approximation. We achieve this tractability by (i) decomposing ReLU nonlinearities into the product of an identity and a Heaviside step function, (ii) introducing a separate path that decomposes the neural net expectation from its variance. We demonstrate formally that introducing separate latent binary variables to the activations allows representing the neural network likelihood as a chain of linear operations. Performing variational inference on this construction enables a sampling-free computation of the evidence lower bound which is a more effective approximation than the widely applied Monte Carlo sampling and CLT related techniques. We evaluate the model on a range of regression and classification tasks against BNN inference alternatives, showing competitive or improved performance over the current state-of-the-art.

Submitted 12 June, 2019; v1 submitted 19 May, 2018; originally announced May 2018.

Comments: Accepted at UAI 2019

Showing 1–11 of 11 results for author: Haußmann, M