Showing 1–31 of 31 results for author: Fischer, A

Searching in archive stat.
  1. arXiv:2406.02868

    stat.AP

    Bayesian Adaptive Trials for Social Policy

    Authors: Sally Cripps, Anna Lopatnikova, Hadi Mohasel Afshar, Ben Gales, Roman Marchant, Gilad Francis, Catarina Moreira, Alex Fischer

    Abstract: This paper proposes Bayesian Adaptive Trials (BAT) as both an efficient method to conduct trials and a unifying framework for evaluating social policy interventions, addressing limitations inherent in traditional methods such as Randomized Controlled Trials (RCT). Recognizing the crucial need for evidence-based approaches in public policy, the proposal aims to lower barriers to the adoption of evi…

    Submitted 4 June, 2024; originally announced June 2024.

  2. arXiv:2311.01888

    stat.ML cs.LG

    Learning Sparse Codes with Entropy-Based ELBOs

    Authors: Dmytro Velychko, Simon Damm, Asja Fischer, Jörg Lücke

    Abstract: Standard probabilistic sparse coding assumes a Laplace prior, a linear mapping from latents to observables, and Gaussian observable distributions. We here derive a solely entropy-based learning objective for the parameters of standard sparse coding. The novel variational objective has the following features: (A) unlike MAP approximations, it uses non-trivial posterior approximations for probabilis…

    Submitted 9 April, 2024; v1 submitted 3 November, 2023; originally announced November 2023.

  3. arXiv:2206.10311

    cs.LG math.ST stat.ML

    Marginal Tail-Adaptive Normalizing Flows

    Authors: Mike Laszkiewicz, Johannes Lederer, Asja Fischer

    Abstract: Learning the tail behavior of a distribution is a notoriously difficult problem. By definition, the number of samples from the tail is small, and deep generative models, such as normalizing flows, tend to concentrate on learning the body of the distribution. In this paper, we focus on improving the ability of normalizing flows to correctly capture the tail behavior and, thus, form more accurate mo…

    Submitted 27 June, 2022; v1 submitted 21 June, 2022; originally announced June 2022.

    Comments: Accepted at ICML 2022, the Thirty-ninth International Conference on Machine Learning

  4. arXiv:2203.10775

    stat.ME math.PR

    Modified Method of Moments for Generalized Laplace Distribution

    Authors: Adrian Fischer, Robert E. Gaunt, Andrey Sarantsev

    Abstract: In this note, we consider the performance of the classic method of moments for parameter estimation of symmetric variance-gamma (generalized Laplace) distributions. We do this through both theoretical analysis (multivariate delta method) and a comprehensive simulation study with comparison to maximum likelihood estimation, finding performance is often unsatisfactory. In addition, we modify the met…

    Submitted 19 November, 2023; v1 submitted 21 March, 2022; originally announced March 2022.

    Comments: 18 pages

    MSC Class: 62F10; 62F12; 60E07

  5. arXiv:2112.07400

    eess.AS cs.LG cs.SD stat.ML

    Robustifying automatic speech recognition by extracting slowly varying features

    Authors: Matias Pizarro, Dorothea Kolossa, Asja Fischer

    Abstract: In the past few years, it has been shown that deep learning systems are highly vulnerable under attacks with adversarial examples. Neural-network-based automatic speech recognition (ASR) systems are no exception. Targeted and untargeted attacks can modify an audio input signal in such a way that humans still recognise the same words, while ASR systems are steered to predict a different transcripti…

    Submitted 14 December, 2021; originally announced December 2021.

  6. arXiv:2107.07352

    cs.LG cs.AI stat.ML

    Copula-Based Normalizing Flows

    Authors: Mike Laszkiewicz, Johannes Lederer, Asja Fischer

    Abstract: Normalizing flows, which learn a distribution by transforming the data to samples from a Gaussian base distribution, have proven powerful density approximations. But their expressive power is limited by this choice of the base distribution. We, therefore, propose to generalize the base distribution to a more elaborate copula distribution to capture the properties of the target distribution more ac…

    Submitted 15 July, 2021; originally announced July 2021.

    Comments: Accepted for presentation at the ICML 2021 Workshop on Invertible Neural Networks, Normalizing Flows, and Explicit Likelihood Models (INNF+ 2021)

  7. arXiv:2101.02726

    cs.LG stat.ML

    A Novel Regression Loss for Non-Parametric Uncertainty Optimization

    Authors: Joachim Sicking, Maram Akila, Maximilian Pintz, Tim Wirtz, Asja Fischer, Stefan Wrobel

    Abstract: Quantification of uncertainty is one of the most promising approaches to establish safe machine learning. Despite its importance, it is far from being generally solved, especially for neural networks. One of the most commonly used approaches so far is Monte Carlo dropout, which is computationally cheap and easy to apply in practice. However, it can underestimate the uncertainty. We propose a new o…

    Submitted 7 January, 2021; originally announced January 2021.

    Comments: Accepted at the 3rd Symposium on Advances in Approximate Bayesian Inference (AABI), code is available on: https://github.com/fraunhofer-iais/second-moment-loss. arXiv admin note: substantial text overlap with arXiv:2012.12687

  8. arXiv:2012.12687

    cs.LG stat.ML

    Wasserstein Dropout

    Authors: Joachim Sicking, Maram Akila, Maximilian Pintz, Tim Wirtz, Asja Fischer, Stefan Wrobel

    Abstract: Despite its importance for safe machine learning, uncertainty quantification for neural networks is far from being solved. State-of-the-art approaches to estimate neural uncertainties are often hybrid, combining parametric models with explicit or implicit (dropout-based) ensembling. We take another pathway and propose a novel approach to uncertainty quantification for regression tasks, Wasserst…

    Submitted 2 December, 2021; v1 submitted 23 December, 2020; originally announced December 2020.

  9. arXiv:2010.14860

    stat.ML cs.LG

    The ELBO of Variational Autoencoders Converges to a Sum of Three Entropies

    Authors: Simon Damm, Dennis Forster, Dmytro Velychko, Zhenwen Dai, Asja Fischer, Jörg Lücke

    Abstract: The central objective function of a variational autoencoder (VAE) is its variational lower bound (the ELBO). Here we show that for standard (i.e., Gaussian) VAEs the ELBO converges to a value given by the sum of three entropies: the (negative) entropy of the prior distribution, the expected (negative) entropy of the observable distribution, and the average entropy of the variational distributions…

    Submitted 20 April, 2023; v1 submitted 28 October, 2020; originally announced October 2020.

    Journal ref: Proceedings of the 26th International Conference on Artificial Intelligence and Statistics (AISTATS), PMLR 206:3931-3960, 2023

  10. arXiv:2008.03209

    cs.LG cs.AI stat.ML

    Investigating maximum likelihood based training of infinite mixtures for uncertainty quantification

    Authors: Sina Däubener, Asja Fischer

    Abstract: Uncertainty quantification in neural networks gained a lot of attention in the past years. The most popular approaches, Bayesian neural networks (BNNs), Monte Carlo dropout, and deep ensembles have one thing in common: they are all based on some kind of mixture model. While the BNNs build infinite mixture models and are derived via variational inference, the latter two build finite mixtures traine…

    Submitted 17 August, 2020; v1 submitted 7 August, 2020; originally announced August 2020.

    Journal ref: Presented at the uncertainty workshop of ECML PKDD 2020

  11. arXiv:2007.09668

    cs.LG stat.ML

    Improving the Long-Range Performance of Gated Graph Neural Networks

    Authors: Denis Lukovnikov, Jens Lehmann, Asja Fischer

    Abstract: Many popular variants of graph neural networks (GNNs) that are capable of handling multi-relational graphs may suffer from vanishing gradients. In this work, we propose a novel GNN architecture based on the Gated Graph Neural Network with an improved ability to handle long-range dependencies in multi-relational graphs. An experimental analysis on different synthetic tasks demonstrates that the pro…

    Submitted 19 July, 2020; originally announced July 2020.

  12. arXiv:2007.05434

    cs.LG math.ST stat.ML

    Characteristics of Monte Carlo Dropout in Wide Neural Networks

    Authors: Joachim Sicking, Maram Akila, Tim Wirtz, Sebastian Houben, Asja Fischer

    Abstract: Monte Carlo (MC) dropout is one of the state-of-the-art approaches for uncertainty estimation in neural networks (NNs). It has been interpreted as approximately performing Bayesian inference. Based on previous work on the approximation of Gaussian processes by wide and deep neural networks with random weights, we study the limiting distribution of wide untrained NNs under dropout more rigorously a…

    Submitted 10 July, 2020; originally announced July 2020.

    Comments: Accepted at the ICML 2020 workshop for Uncertainty and Robustness in Deep Learning

  13. arXiv:2006.14999

    stat.ML cs.LG

    On the convergence of the Metropolis algorithm with fixed-order updates for multivariate binary probability distributions

    Authors: Kai Brügge, Asja Fischer, Christian Igel

    Abstract: The Metropolis algorithm is arguably the most fundamental Markov chain Monte Carlo (MCMC) method. But the algorithm is not guaranteed to converge to the desired distribution in the case of multivariate binary distributions (e.g., Ising models or stochastic neural networks such as Boltzmann machines) if the variables (sites or neurons) are updated in a fixed order, a setting commonly used in practi…

    Submitted 26 June, 2020; originally announced June 2020.

  14. arXiv:2006.13365

    cs.LG cs.AI stat.ML

    Bringing Light Into the Dark: A Large-scale Evaluation of Knowledge Graph Embedding Models Under a Unified Framework

    Authors: Mehdi Ali, Max Berrendorf, Charles Tapley Hoyt, Laurent Vermue, Mikhail Galkin, Sahand Sharifzadeh, Asja Fischer, Volker Tresp, Jens Lehmann

    Abstract: The heterogeneity in recently published knowledge graph embedding models' implementations, training, and evaluation has made fair and thorough comparisons difficult. In order to assess the reproducibility of previously published results, we re-implemented and evaluated 21 interaction models in the PyKEEN software package. Here, we outline which results could be reproduced with their reported hyper…

    Submitted 1 November, 2021; v1 submitted 23 June, 2020; originally announced June 2020.

  15. arXiv:2005.14611

    eess.AS cs.CR cs.LG cs.SD stat.ML

    Detecting Adversarial Examples for Speech Recognition via Uncertainty Quantification

    Authors: Sina Däubener, Lea Schönherr, Asja Fischer, Dorothea Kolossa

    Abstract: Machine learning systems and also, specifically, automatic speech recognition (ASR) systems are vulnerable against adversarial attacks, where an attacker maliciously changes the input. In the case of ASR systems, the most interesting cases are targeted attacks, in which an attacker aims to force the system into recognizing given target transcriptions in an arbitrary audio sample. The increasing nu…

    Submitted 2 August, 2020; v1 submitted 24 May, 2020; originally announced May 2020.

  16. arXiv:2005.00466

    stat.ML cs.LG stat.ME

    Thresholded Adaptive Validation: Tuning the Graphical Lasso for Graph Recovery

    Authors: Mike Laszkiewicz, Asja Fischer, Johannes Lederer

    Abstract: Many Machine Learning algorithms are formulated as regularized optimization problems, but their performance hinges on a regularization parameter that needs to be calibrated to each application at hand. In this paper, we propose a general calibration scheme for regularized optimization problems and apply it to the graphical lasso, which is a method for Gaussian graphical modeling. The scheme is equ…

    Submitted 30 March, 2021; v1 submitted 1 May, 2020; originally announced May 2020.

    Comments: To appear in the proceedings of Artificial Intelligence and Statistics (AISTATS) 2021

  17. arXiv:1902.01080

    stat.ML cs.AI cs.LG

    Predictive Uncertainty Quantification with Compound Density Networks

    Authors: Agustinus Kristiadi, Sina Däubener, Asja Fischer

    Abstract: Despite the huge success of deep neural networks (NNs), finding good mechanisms for quantifying their prediction uncertainty is still an open problem. Bayesian neural networks are one of the most popular approaches to uncertainty quantification. On the other hand, it was recently shown that ensembles of NNs, which belong to the class of mixture models, can be used to quantify prediction uncertaint…

    Submitted 29 December, 2019; v1 submitted 4 February, 2019; originally announced February 2019.

    Comments: Bayesian deep learning workshop, NeurIPS 2019

  18. arXiv:1812.04356

    math.ST stat.ML

    Robust Bregman Clustering

    Authors: Aurélie Fischer, Clément Levrard, Claire Brécheteau

    Abstract: Using a trimming approach, we investigate a k-means type method based on Bregman divergences for clustering data possibly corrupted with clutter noise. The main interest of Bregman divergences is that the standard Lloyd algorithm adapts to these distortion measures, and they are well-suited for clustering data sampled according to mixture models from exponential families. We prove that there exist…

    Submitted 9 September, 2020; v1 submitted 11 December, 2018; originally announced December 2018.

    Comments: Annals of Statistics, Institute of Mathematical Statistics, In press

  19. arXiv:1811.01118

    cs.LG cs.AI stat.ML

    Learning to Rank Query Graphs for Complex Question Answering over Knowledge Graphs

    Authors: Gaurav Maheshwari, Priyansh Trivedi, Denis Lukovnikov, Nilesh Chakraborty, Asja Fischer, Jens Lehmann

    Abstract: In this paper, we conduct an empirical investigation of neural query graph ranking approaches for the task of complex question answering over knowledge graphs. We experiment with six different ranking models and propose a novel self-attention based slot matching model which exploits the inherent structure of query graphs, our logical form of choice. Our proposed model generally outperforms the oth…

    Submitted 2 November, 2018; originally announced November 2018.

  20. arXiv:1807.05031

    stat.ML cs.LG

    On the Relation Between the Sharpest Directions of DNN Loss and the SGD Step Length

    Authors: Stanisław Jastrzębski, Zachary Kenton, Nicolas Ballas, Asja Fischer, Yoshua Bengio, Amos Storkey

    Abstract: Stochastic Gradient Descent (SGD) based training of neural networks with a large learning rate or a small batch-size typically ends in well-generalizing, flat regions of the weight space, as indicated by small eigenvalues of the Hessian of the training loss. However, the curvature along the SGD trajectory is poorly understood. An empirical investigation shows that initially SGD visits increasingly…

    Submitted 23 December, 2019; v1 submitted 13 July, 2018; originally announced July 2018.

    Journal ref: International Conference on Learning Representations (ICLR) 2019

  21. arXiv:1804.04512

    cs.LG cs.CV stat.ML

    DLL: A Blazing Fast Deep Neural Network Library

    Authors: Baptiste Wicht, Jean Hennebert, Andreas Fischer

    Abstract: Deep Learning Library (DLL) is a new library for machine learning with deep neural networks that focuses on speed. It supports feed-forward neural networks such as fully-connected Artificial Neural Networks (ANNs) and Convolutional Neural Networks (CNNs). It also has very comprehensive support for Restricted Boltzmann Machines (RBMs) and Convolutional RBMs. Our main motivation for this work was to…

    Submitted 11 April, 2018; originally announced April 2018.

    Comments: 6 pages

  22. arXiv:1803.03166

    stat.ML stat.AP

    Aggregation using input-output trade-off

    Authors: Aurélie Fischer, Mathilde Mougeot

    Abstract: In this paper, we introduce a new learning strategy based on a seminal idea of Mojirsheibani (1999, 2000, 2002a, 2002b), who proposed a smart method for combining several classifiers, relying on a consensus notion. In many aggregation methods, the prediction for a new observation $x$ is computed by building a linear or convex combination over a collection of basic estimators $r_1(x), \dots, r_m(x)$ previ…

    Submitted 8 March, 2018; originally announced March 2018.

  23. arXiv:1802.00934

    cs.AI stat.ML

    Incorporating Literals into Knowledge Graph Embeddings

    Authors: Agustinus Kristiadi, Mohammad Asif Khan, Denis Lukovnikov, Jens Lehmann, Asja Fischer

    Abstract: Knowledge graphs, on top of entities and their relationships, contain other important elements: literals. Literals encode interesting properties (e.g. the height) of entities that are not captured by links between entities alone. Most of the existing work on embedding (or latent feature) based knowledge graph analysis focuses mainly on the relations between entities. In this work, we study the eff…

    Submitted 18 July, 2019; v1 submitted 3 February, 2018; originally announced February 2018.

    Comments: 9 pages, 2 figures, 6 tables

  24. arXiv:1711.04623

    cs.LG cs.AI cs.CV stat.ML

    Three Factors Influencing Minima in SGD

    Authors: Stanisław Jastrzębski, Zachary Kenton, Devansh Arpit, Nicolas Ballas, Asja Fischer, Yoshua Bengio, Amos Storkey

    Abstract: We investigate the dynamical and convergent properties of stochastic gradient descent (SGD) applied to Deep Neural Networks (DNNs). Characterizing the relation between learning rate, batch size and the properties of the final minima, such as width or generalization, remains an open question. In order to tackle this problem we investigate the previously proposed approximation of SGD by a stochastic…

    Submitted 13 September, 2018; v1 submitted 13 November, 2017; originally announced November 2017.

    Comments: First two authors contributed equally. Short version accepted into ICLR workshop. Accepted to Artificial Neural Networks and Machine Learning, ICANN 2018

  25. arXiv:1709.08894

    stat.ML cs.LG

    On the regularization of Wasserstein GANs

    Authors: Henning Petzka, Asja Fischer, Denis Lukovnicov

    Abstract: Since their invention, generative adversarial networks (GANs) have become a popular approach for learning to model a distribution of real (unlabeled) data. Convergence problems during training are overcome by Wasserstein GANs which minimize the distance between the model and the empirical distribution in terms of a different metric, but thereby introduce a Lipschitz constraint into the optimizatio…

    Submitted 5 March, 2018; v1 submitted 26 September, 2017; originally announced September 2017.

    Comments: Published as a conference paper at ICLR 2018. * Henning Petzka and Asja Fischer contributed equally to this work (11 pages + 13 pages appendix)

  26. arXiv:1706.05394

    stat.ML cs.LG

    A Closer Look at Memorization in Deep Networks

    Authors: Devansh Arpit, Stanisław Jastrzębski, Nicolas Ballas, David Krueger, Emmanuel Bengio, Maxinder S. Kanwal, Tegan Maharaj, Asja Fischer, Aaron Courville, Yoshua Bengio, Simon Lacoste-Julien

    Abstract: We examine the role of memorization in deep learning, drawing connections to capacity, generalization, and adversarial robustness. While deep networks are capable of memorizing noise data, our results suggest that they tend to prioritize learning simple patterns first. In our experiments, we expose qualitative differences in gradient-based optimization of deep neural networks (DNNs) on noise vs. r…

    Submitted 1 July, 2017; v1 submitted 16 June, 2017; originally announced June 2017.

    Comments: Appears in Proceedings of the 34th International Conference on Machine Learning (ICML 2017), Devansh Arpit, Stanisław Jastrzębski, Nicolas Ballas, and David Krueger contributed equally to this work

  27. arXiv:1610.01000

    stat.AP stat.ML

    Statistical learning for wind power : a modeling and stability study towards forecasting

    Authors: Aurélie Fischer, Lucie Montuelle, Mathilde Mougeot, Dominique Picard

    Abstract: We focus on wind power modeling using machine learning techniques. We show on real data provided by the wind energy company Maïa Eolis, that parametric models, even following closely the physical equation relating wind production to wind speed are outperformed by intelligent learning algorithms. In particular, the CART-Bagging algorithm gives very stable and promising results. Besides, as a step…

    Submitted 12 January, 2018; v1 submitted 4 October, 2016; originally announced October 2016.

    Journal ref: Wind Energy, Wiley, 2017, 20 (12), pp. 2037–2047

  28. arXiv:1510.01624

    cs.LG cs.NE stat.ML

    Population-Contrastive-Divergence: Does Consistency help with RBM training?

    Authors: Oswin Krause, Asja Fischer, Christian Igel

    Abstract: Estimating the log-likelihood gradient with respect to the parameters of a Restricted Boltzmann Machine (RBM) typically requires sampling using Markov Chain Monte Carlo (MCMC) techniques. To save computation time, the Markov chains are only run for a small number of steps, which leads to a biased estimate. This bias can cause RBM training algorithms such as Contrastive Divergence (CD) learning to…

    Submitted 28 June, 2017; v1 submitted 6 October, 2015; originally announced October 2015.

    Comments: An updated version is under review

  29. arXiv:1506.03877

    cs.LG stat.ML

    Bidirectional Helmholtz Machines

    Authors: Jorg Bornschein, Samira Shabanian, Asja Fischer, Yoshua Bengio

    Abstract: Efficient unsupervised training and inference in deep generative models remains a challenging problem. One basic approach, called Helmholtz machine, involves training a top-down directed generative model together with a bottom-up auxiliary model used for approximate inference. Recent results indicate that better generative models can be obtained with better approximate inference procedures. Instea…

    Submitted 24 May, 2016; v1 submitted 11 June, 2015; originally announced June 2015.

  30. arXiv:1311.1354

    stat.ML cs.LG

    How to Center Binary Deep Boltzmann Machines

    Authors: Jan Melchior, Asja Fischer, Laurenz Wiskott

    Abstract: This work analyzes centered binary Restricted Boltzmann Machines (RBMs) and binary Deep Boltzmann Machines (DBMs), where centering is done by subtracting offset values from visible and hidden variables. We show analytically that (i) centering results in a different but equivalent parameterization for artificial neural networks in general, (ii) the expected performance of centered binary RBMs/DBMs…

    Submitted 16 July, 2015; v1 submitted 6 November, 2013; originally announced November 2013.

    Comments: Author list in meta data corrected - 57 pages, 17 figures, 13 tables

    Journal ref: Journal of Machine Learning Research, 17(99), 2016, 1:61

  31. COBRA: A Combined Regression Strategy

    Authors: Gérard Biau, Aurélie Fischer, Benjamin Guedj, James Malley

    Abstract: A new method for combining several initial estimators of the regression function is introduced. Instead of building a linear or convex optimized combination over a collection of basic estimators $r_1,\dots,r_M$, we use them as a collective indicator of the proximity between the training data and a test observation. This local distance approach is model-free and very fast. More specifically, the re…

    Submitted 23 May, 2019; v1 submitted 9 March, 2013; originally announced March 2013.

    Comments: 42 pages

    Journal ref: Journal of Multivariate Analysis (2016), vol. 146, 18–28