-
Kalman Bayesian Neural Networks for Closed-form Online Learning
Authors:
Philipp Wagner,
Xinyang Wu,
Marco F. Huber
Abstract:
Compared to point estimates calculated by standard neural networks, Bayesian neural networks (BNN) provide probability distributions over the output predictions and model parameters, i.e., the weights. Training the weight distribution of a BNN, however, is more involved due to the intractability of the underlying Bayesian inference problem and thus requires efficient approximations. In this paper, we propose a novel approach for BNN learning via closed-form Bayesian inference. For this purpose, the calculation of the predictive distribution of the output and the update of the weight distribution are treated as Bayesian filtering and smoothing problems, where the weights are modeled as Gaussian random variables. This allows closed-form expressions for training the network's parameters in a sequential/online fashion without gradient descent. We demonstrate our method on several UCI datasets and compare it to the state of the art.
Submitted 30 November, 2022; v1 submitted 3 October, 2021;
originally announced October 2021.
-
Towards Measuring Bias in Image Classification
Authors:
Nina Schaaf,
Omar de Mitri,
Hang Beom Kim,
Alexander Windberger,
Marco F. Huber
Abstract:
Convolutional Neural Networks (CNNs) have become the de facto state of the art for the main computer vision tasks. However, due to their complex underlying structure, their decisions are hard to understand, which limits their use in some contexts of the industrial world. A common and hard-to-detect challenge in machine learning (ML) tasks is data bias. In this work, we present a systematic approach to uncover data bias by means of attribution maps. For this purpose, first an artificial dataset with a known bias is created and used to train intentionally biased CNNs. The networks' decisions are then inspected using attribution maps. Finally, meaningful metrics are used to measure the attribution maps' representativeness with respect to the known bias. The proposed study shows that some attribution map techniques highlight the presence of bias in the data better than others and that metrics can support the identification of bias.
Submitted 1 July, 2021;
originally announced July 2021.
-
A Survey on the Explainability of Supervised Machine Learning
Authors:
Nadia Burkart,
Marco F. Huber
Abstract:
Predictions obtained by, e.g., artificial neural networks have a high accuracy, but humans often perceive the models as black boxes. Insights about the decision making are mostly opaque for humans. Understanding the decision making is of paramount importance, particularly in highly sensitive areas such as healthcare or finance. The decision making behind the black boxes needs to be more transparent, accountable, and understandable for humans. This survey paper provides essential definitions and an overview of the different principles and methodologies of explainable Supervised Machine Learning (SML). We conduct a state-of-the-art survey that reviews past and recent explainable SML approaches and classifies them according to the introduced definitions. Finally, we illustrate principles by means of an explanatory case study and discuss important future directions.
Submitted 16 November, 2020;
originally announced November 2020.
-
Bayesian Perceptron: Towards fully Bayesian Neural Networks
Authors:
Marco F. Huber
Abstract:
Artificial neural networks (NNs) have become the de facto standard in machine learning. They allow learning highly nonlinear transformations in a plethora of applications. However, NNs usually only provide point estimates without systematically quantifying corresponding uncertainties. In this paper, a novel approach towards fully Bayesian NNs is proposed, where training and predictions of a perceptron are performed within the Bayesian inference framework in closed form. The weights and the predictions of the perceptron are considered Gaussian random variables. Analytical expressions for predicting the perceptron's output and for learning the weights are provided for commonly used activation functions like sigmoid or ReLU. This approach requires no computationally expensive gradient calculations and further allows sequential learning.
Submitted 10 September, 2020; v1 submitted 3 September, 2020;
originally announced September 2020.
-
Benchmark and Survey of Automated Machine Learning Frameworks
Authors:
Marc-André Zöller,
Marco F. Huber
Abstract:
Machine learning (ML) has become a vital part of many aspects of our daily life. However, building well-performing machine learning applications requires highly specialized data scientists and domain experts. Automated machine learning (AutoML) aims to reduce the demand for data scientists by enabling domain experts to build machine learning applications automatically without extensive knowledge of statistics and machine learning. This paper is a combination of a survey on current AutoML methods and a benchmark of popular AutoML frameworks on real data sets. Driven by the selected frameworks for evaluation, we summarize and review important AutoML techniques and methods concerning every step in building an ML pipeline. The selected AutoML frameworks are evaluated on 137 data sets from established AutoML benchmark suites.
Submitted 26 January, 2021; v1 submitted 26 April, 2019;
originally announced April 2019.
-
Enhancing Decision Tree based Interpretation of Deep Neural Networks through L1-Orthogonal Regularization
Authors:
Nina Schaaf,
Marco F. Huber,
Johannes Maucher
Abstract:
One obstacle that so far prevents the introduction of machine learning models primarily in critical areas is the lack of explainability. In this work, a practicable approach of gaining explainability of deep artificial neural networks (NN) using an interpretable surrogate model based on decision trees is presented. Simply fitting a decision tree to a trained NN usually leads to unsatisfactory results in terms of accuracy and fidelity. Using L1-orthogonal regularization during training, however, preserves the accuracy of the NN, while it can be closely approximated by small decision trees. Tests with different data sets confirm that L1-orthogonal regularization yields models of lower complexity and at the same time higher fidelity compared to other regularizers.
Submitted 3 October, 2019; v1 submitted 10 April, 2019;
originally announced April 2019.
-
On Multi-Step Sensor Scheduling via Convex Optimization
Authors:
Marco F. Huber
Abstract:
Effective sensor scheduling requires the consideration of long-term effects and thus optimization over long time horizons. Determining the optimal sensor schedule, however, is equivalent to solving a binary integer program, which is computationally demanding for long time horizons and many sensors. For linear Gaussian systems, two efficient multi-step sensor scheduling approaches are proposed in this paper. The first approach determines approximate but close-to-optimal sensor schedules via convex optimization. The second approach combines convex optimization with a branch-and-bound search for efficiently determining the optimal sensor schedule.
Submitted 30 March, 2012;
originally announced March 2012.
-
Adaptive Gaussian Mixture Filter Based on Statistical Linearization
Authors:
Marco F. Huber
Abstract:
Gaussian mixtures are a common density representation in nonlinear, non-Gaussian Bayesian state estimation. Selecting an appropriate number of Gaussian components, however, is difficult as one has to trade off computational complexity against estimation accuracy. In this paper, an adaptive Gaussian mixture filter based on statistical linearization is proposed. Depending on the nonlinearity of the considered estimation problem, this filter dynamically increases the number of components via splitting. For this purpose, a measure is introduced that allows for quantifying the locally induced linearization error at each Gaussian mixture component. The deviation between the nonlinear and the linearized state space model is evaluated for determining the splitting direction. The proposed approach is not restricted to a specific statistical linearization method. Simulations show the superior estimation performance compared to related approaches and common filtering algorithms.
Submitted 30 March, 2012;
originally announced March 2012.
-
Robust Filtering and Smoothing with Gaussian Processes
Authors:
Marc Peter Deisenroth,
Ryan Turner,
Marco F. Huber,
Uwe D. Hanebeck,
Carl Edward Rasmussen
Abstract:
We propose a principled algorithm for robust Bayesian filtering and smoothing in nonlinear stochastic dynamic systems when both the transition function and the measurement function are described by non-parametric Gaussian process (GP) models. GPs are gaining increasing importance in signal processing, machine learning, robotics, and control for representing unknown system functions by posterior probability distributions. This modern way of "system identification" is more robust than finding point estimates of a parametric function representation. In this article, we present a principled algorithm for robust analytic smoothing in GP dynamic systems, which are increasingly used in robotics and control. Our numerical evaluations demonstrate the robustness of the proposed approach in situations where other state-of-the-art Gaussian filters and smoothers can fail.
Submitted 20 March, 2012;
originally announced March 2012.