Showing 1–25 of 25 results for author: Jacobsen, J

Searching in archive cs.
  1. arXiv:2405.08719

    stat.ML cs.LG stat.ME

    Addressing Misspecification in Simulation-based Inference through Data-driven Calibration

    Authors: Antoine Wehenkel, Juan L. Gamella, Ozan Sener, Jens Behrmann, Guillermo Sapiro, Marco Cuturi, Jörn-Henrik Jacobsen

    Abstract: Driven by steady progress in generative modeling, simulation-based inference (SBI) has enabled inference over stochastic simulators. However, recent work has demonstrated that model misspecification can harm SBI's reliability. This work introduces robust posterior estimation (ROPE), a framework that overcomes model misspecification with a small real-world calibration set of ground truth parameter…

    Submitted 14 May, 2024; originally announced May 2024.

  2. arXiv:2307.13918

    stat.ML cs.LG q-bio.QM

    Simulation-based Inference for Cardiovascular Models

    Authors: Antoine Wehenkel, Jens Behrmann, Andrew C. Miller, Guillermo Sapiro, Ozan Sener, Marco Cuturi, Jörn-Henrik Jacobsen

    Abstract: Over the past decades, hemodynamics simulators have steadily evolved and have become tools of choice for studying cardiovascular systems in-silico. While such tools are routinely used to simulate whole-body hemodynamics from physiological parameters, solving the corresponding inverse problem of mapping waveforms back to plausible physiological parameters remains both promising and challenging. Mot…

    Submitted 29 July, 2023; v1 submitted 25 July, 2023; originally announced July 2023.

  3. arXiv:2202.03881

    cs.LG stat.ML

    Robust Hybrid Learning With Expert Augmentation

    Authors: Antoine Wehenkel, Jens Behrmann, Hsiang Hsu, Guillermo Sapiro, Gilles Louppe, Jörn-Henrik Jacobsen

    Abstract: Hybrid modelling reduces the misspecification of expert models by combining them with machine learning (ML) components learned from data. Similarly to many ML algorithms, hybrid model performance guarantees are limited to the training distribution. Leveraging the insight that the expert model is usually valid even outside the training domain, we overcome this limitation by introducing a hybrid dat…

    Submitted 11 April, 2023; v1 submitted 8 February, 2022; originally announced February 2022.

    Journal ref: Transactions on Machine Learning Research, 2023

  4. arXiv:2112.00881

    cs.LG stat.ML

    Learning Invariant Representations with Missing Data

    Authors: Mark Goldstein, Jörn-Henrik Jacobsen, Olina Chau, Adriel Saporta, Aahlad Puli, Rajesh Ranganath, Andrew C. Miller

    Abstract: Spurious correlations allow flexible models to predict well during training but poorly on related test distributions. Recent work has shown that models that satisfy particular independencies involving correlation-inducing nuisance variables have guarantees on their test performance. Enforcing such independencies requires nuisances to be observed during training. However, nuisances, such a…

    Submitted 8 June, 2022; v1 submitted 1 December, 2021; originally announced December 2021.

    Comments: CLeaR (Causal Learning and Reasoning) 2022

  5. arXiv:2010.07249

    cs.LG cs.AI

    Environment Inference for Invariant Learning

    Authors: Elliot Creager, Jörn-Henrik Jacobsen, Richard Zemel

    Abstract: Learning models that gracefully handle distribution shifts is central to research on domain generalization, robust optimization, and fairness. A promising formulation is domain-invariant learning, which identifies the key issue of learning which features are domain-specific versus domain-invariant. An important assumption in this area is that the training examples are partitioned into "domains" or…

    Submitted 15 July, 2021; v1 submitted 14 October, 2020; originally announced October 2020.

  6. arXiv:2006.09347

    cs.LG stat.ML

    Understanding and Mitigating Exploding Inverses in Invertible Neural Networks

    Authors: Jens Behrmann, Paul Vicol, Kuan-Chieh Wang, Roger Grosse, Jörn-Henrik Jacobsen

    Abstract: Invertible neural networks (INNs) have been used to design generative models, implement memory-saving gradient computation, and solve inverse problems. In this work, we show that commonly-used INN architectures suffer from exploding inverses and are thus prone to becoming numerically non-invertible. Across a wide range of INN use-cases, we reveal failures including the non-applicability of the cha…

    Submitted 24 December, 2021; v1 submitted 16 June, 2020; originally announced June 2020.

    Comments: AISTATS 2021

  7. arXiv:2004.07780

    cs.CV cs.AI cs.LG q-bio.NC

    Shortcut Learning in Deep Neural Networks

    Authors: Robert Geirhos, Jörn-Henrik Jacobsen, Claudio Michaelis, Richard Zemel, Wieland Brendel, Matthias Bethge, Felix A. Wichmann

    Abstract: Deep learning has triggered the current rise of artificial intelligence and is the workhorse of today's machine intelligence. Numerous success stories have rapidly spread all over science, industry and society, but its limitations have only recently come into focus. In this perspective we seek to distill how many of deep learning's problems can be seen as different symptoms of the same underlying…

    Submitted 21 November, 2023; v1 submitted 16 April, 2020; originally announced April 2020.

    Comments: perspective article published at Nature Machine Intelligence (https://doi.org/10.1038/s42256-020-00257-z)

  8. arXiv:2003.00688

    cs.LG cs.AI cs.NE stat.ML

    Out-of-Distribution Generalization via Risk Extrapolation (REx)

    Authors: David Krueger, Ethan Caballero, Joern-Henrik Jacobsen, Amy Zhang, Jonathan Binas, Dinghuai Zhang, Remi Le Priol, Aaron Courville

    Abstract: Distributional shift is one of the major obstacles when transferring machine learning prediction systems from the lab to the real world. To tackle this problem, we assume that variation across training domains is representative of the variation we might encounter at test time, but also that shifts at test time may be more extreme in magnitude. In particular, we show that reducing differences in ri…

    Submitted 25 February, 2021; v1 submitted 2 March, 2020; originally announced March 2020.

  9. arXiv:2002.05616

    stat.ML cs.LG

    Learning the Stein Discrepancy for Training and Evaluating Energy-Based Models without Sampling

    Authors: Will Grathwohl, Kuan-Chieh Wang, Jorn-Henrik Jacobsen, David Duvenaud, Richard Zemel

    Abstract: We present a new method for evaluating and training unnormalized density models. Our approach only requires access to the gradient of the unnormalized model's log-density. We estimate the Stein discrepancy between the data density $p(x)$ and the model density $q(x)$ defined by a vector function of the data. We parameterize this function with a neural network and fit its parameters to maximize the…

    Submitted 14 August, 2020; v1 submitted 13 February, 2020; originally announced February 2020.

    Comments: ICML 2020

  10. arXiv:2002.04599

    cs.LG cs.CR cs.CV stat.ML

    Fundamental Tradeoffs between Invariance and Sensitivity to Adversarial Perturbations

    Authors: Florian Tramèr, Jens Behrmann, Nicholas Carlini, Nicolas Papernot, Jörn-Henrik Jacobsen

    Abstract: Adversarial examples are malicious inputs crafted to induce misclassification. Commonly studied sensitivity-based adversarial examples introduce semantically-small changes to an input that result in a different model prediction. This paper studies a complementary failure mode, invariance-based adversarial examples, that introduce minimal semantic changes that modify an input's true label yet prese…

    Submitted 4 August, 2020; v1 submitted 11 February, 2020; originally announced February 2020.

    Comments: ICML 2020 (Supersedes the workshop paper "Exploiting Excessive Invariance caused by Norm-Bounded Adversarial Robustness", arXiv:1903.10484)

  11. arXiv:2002.02798

    stat.ML cs.LG

    How to train your neural ODE: the world of Jacobian and kinetic regularization

    Authors: Chris Finlay, Jörn-Henrik Jacobsen, Levon Nurbekyan, Adam M Oberman

    Abstract: Training neural ODEs on large datasets has not been tractable due to the necessity of allowing the adaptive numerical ODE solver to refine its step size to very small values. In practice this leads to dynamics equivalent to many hundreds or even thousands of layers. In this paper, we overcome this apparent difficulty by introducing a theoretically-grounded combination of both optimal transport and…

    Submitted 23 June, 2020; v1 submitted 7 February, 2020; originally announced February 2020.

    Comments: Accepted to ICML 2020

  12. arXiv:1912.03263

    cs.LG cs.CV stat.ML

    Your Classifier is Secretly an Energy Based Model and You Should Treat it Like One

    Authors: Will Grathwohl, Kuan-Chieh Wang, Jörn-Henrik Jacobsen, David Duvenaud, Mohammad Norouzi, Kevin Swersky

    Abstract: We propose to reinterpret a standard discriminative classifier of p(y|x) as an energy based model for the joint distribution p(x,y). In this setting, the standard class probabilities can be easily computed as well as unnormalized values of p(x) and p(x|y). Within this framework, standard discriminative architectures may be used and the model can also be trained on unlabeled data. We demonstrate tha…

    Submitted 15 September, 2020; v1 submitted 6 December, 2019; originally announced December 2019.

  13. arXiv:1911.00937

    cs.LG stat.ML

    Preventing Gradient Attenuation in Lipschitz Constrained Convolutional Networks

    Authors: Qiyang Li, Saminul Haque, Cem Anil, James Lucas, Roger Grosse, Jörn-Henrik Jacobsen

    Abstract: Lipschitz constraints under L2 norm on deep neural networks are useful for provable adversarial robustness bounds, stable training, and Wasserstein distance estimation. While heuristic approaches such as the gradient penalty have seen much practical success, it is challenging to achieve similar practical performance while provably enforcing a Lipschitz constraint. In principle, one can design Lips…

    Submitted 9 November, 2019; v1 submitted 3 November, 2019; originally announced November 2019.

    Comments: 9 main pages, 31 pages total, 3 figures. Accepted at 33rd Conference on Neural Information Processing Systems (NeurIPS 2019)

  14. arXiv:1906.02735

    stat.ML cs.LG

    Residual Flows for Invertible Generative Modeling

    Authors: Ricky T. Q. Chen, Jens Behrmann, David Duvenaud, Jörn-Henrik Jacobsen

    Abstract: Flow-based generative models parameterize probability distributions through an invertible transformation and can be trained by maximum likelihood. Invertible residual networks provide a flexible family of transformations where only Lipschitz conditions rather than strict architectural constraints are needed for enforcing invertibility. However, prior work trained invertible residual networks for d…

    Submitted 23 July, 2020; v1 submitted 6 June, 2019; originally announced June 2019.

    Comments: NeurIPS 2019

  15. arXiv:1906.02589

    cs.LG cs.AI stat.ML

    Flexibly Fair Representation Learning by Disentanglement

    Authors: Elliot Creager, David Madras, Jörn-Henrik Jacobsen, Marissa A. Weis, Kevin Swersky, Toniann Pitassi, Richard Zemel

    Abstract: We consider the problem of learning representations that achieve group and subgroup fairness with respect to multiple sensitive attributes. Taking inspiration from the disentangled representation learning literature, we propose an algorithm for learning compact representations of datasets that are useful for reconstruction and prediction, but are also flexibly fair, meaning they can be easi…

    Submitted 6 June, 2019; originally announced June 2019.

    Journal ref: Proceedings of the International Conference on Machine Learning (ICML), 2019

  16. arXiv:1906.01171

    cs.LG stat.ML

    Understanding the Limitations of Conditional Generative Models

    Authors: Ethan Fetaya, Jörn-Henrik Jacobsen, Will Grathwohl, Richard Zemel

    Abstract: Class-conditional generative models hold promise to overcome the shortcomings of their discriminative counterparts. They are a natural choice to solve discriminative tasks in a robust manner as they jointly optimize for predictive performance and accurate modeling of the input distribution. In this work, we investigate robust classification with likelihood-based generative models from a theoretica…

    Submitted 17 February, 2020; v1 submitted 3 June, 2019; originally announced June 2019.

  17. arXiv:1903.10484

    cs.LG cs.CR cs.CV stat.ML

    Exploiting Excessive Invariance caused by Norm-Bounded Adversarial Robustness

    Authors: Jörn-Henrik Jacobsen, Jens Behrmann, Nicholas Carlini, Florian Tramèr, Nicolas Papernot

    Abstract: Adversarial examples are malicious inputs crafted to cause a model to misclassify them. Their most common instantiation, "perturbation-based" adversarial examples introduce changes to the input that leave its true label unchanged, yet result in a different model prediction. Conversely, "invariance-based" adversarial examples insert changes to the input that leave the model's prediction unaffected…

    Submitted 25 March, 2019; originally announced March 2019.

    Comments: Accepted at the ICLR 2019 SafeML Workshop

  18. arXiv:1811.00995

    cs.LG cs.AI cs.CV stat.ML

    Invertible Residual Networks

    Authors: Jens Behrmann, Will Grathwohl, Ricky T. Q. Chen, David Duvenaud, Jörn-Henrik Jacobsen

    Abstract: We show that standard ResNet architectures can be made invertible, allowing the same model to be used for classification, density estimation, and generation. Typically, enforcing invertibility requires partitioning dimensions or restricting network architectures. In contrast, our approach only requires adding a simple normalization step during training, already available in standard frameworks. In…

    Submitted 18 May, 2019; v1 submitted 2 November, 2018; originally announced November 2018.

    Journal ref: Proceedings of the International Conference on Machine Learning (ICML), 2019

  19. arXiv:1811.00401

    cs.LG cs.AI cs.CV stat.ML

    Excessive Invariance Causes Adversarial Vulnerability

    Authors: Jörn-Henrik Jacobsen, Jens Behrmann, Richard Zemel, Matthias Bethge

    Abstract: Despite their impressive performance, deep neural networks exhibit striking failures on out-of-distribution inputs. One core idea of adversarial example research is to reveal neural network errors under such distribution shifts. We decompose these errors into two complementary sources: sensitivity and invariance. We show deep networks are not only too sensitive to task-irrelevant changes of their…

    Submitted 12 July, 2020; v1 submitted 1 November, 2018; originally announced November 2018.

    Journal ref: Proceedings of the 7th International Conference on Learning Representations (ICLR), 2019

  20. arXiv:1802.07088

    cs.LG cs.CV stat.ML

    i-RevNet: Deep Invertible Networks

    Authors: Jörn-Henrik Jacobsen, Arnold Smeulders, Edouard Oyallon

    Abstract: It is widely believed that the success of deep convolutional networks is based on progressively discarding uninformative variability about the input with respect to the problem at hand. This is supported empirically by the difficulty of recovering images from their hidden representations, in most commonly used network architectures. In this paper we show via a one-to-one mapping that this loss of…

    Submitted 20 February, 2018; originally announced February 2018.

    Journal ref: ICLR 2018 - International Conference on Learning Representations, Apr 2018, Vancouver, Canada. 2018, https://iclr.cc/

  21. arXiv:1706.00598

    cs.CV stat.ML

    Dynamic Steerable Blocks in Deep Residual Networks

    Authors: Jörn-Henrik Jacobsen, Bert de Brabandere, Arnold W. M. Smeulders

    Abstract: Filters in convolutional networks are typically parameterized in a pixel basis, that does not take prior knowledge about the visual world into account. We investigate the generalized notion of frames designed with image properties in mind, as alternatives to this parametrization. We show that frame-based ResNets and Densenets can improve performance on Cifar-10+ consistently, while having addition…

    Submitted 19 July, 2017; v1 submitted 2 June, 2017; originally announced June 2017.

  22. arXiv:1703.04140

    cs.LG stat.ML

    Multiscale Hierarchical Convolutional Networks

    Authors: Jörn-Henrik Jacobsen, Edouard Oyallon, Stéphane Mallat, Arnold W. M. Smeulders

    Abstract: Deep neural network algorithms are difficult to analyze because they lack structure allowing to understand the properties of underlying transforms and invariants. Multiscale hierarchical convolutional networks are structured deep convolutional networks where layers are indexed by progressively higher dimensional attributes, which are learned from training data. Each new layer is computed with mult…

    Submitted 12 March, 2017; originally announced March 2017.

  23. arXiv:1605.02971

    cs.CV

    Structured Receptive Fields in CNNs

    Authors: Jörn-Henrik Jacobsen, Jan van Gemert, Zhongyu Lou, Arnold W. M. Smeulders

    Abstract: Learning powerful feature representations with CNNs is hard when training data are limited. Pre-training is one way to overcome this, but it requires large datasets sufficiently similar to the target domain. Another option is to design priors into the model, which can range from tuned hyperparameters to fully engineered representations like Scattering Networks. We combine these ideas into structur…

    Submitted 13 May, 2016; v1 submitted 10 May, 2016; originally announced May 2016.

    Comments: Reason for update: i) Fix Reference for "Deep roto-translation scattering for object classification" by Oyallon and Mallat. ii) Fixed two minor typos. iii) Removed implicit assumption in equation (4) where scale is represented with diffusion time and adapted to rest of paper where scale is represented with standard deviation, to avoid possible confusion

  24. The IceProd Framework: Distributed Data Processing for the IceCube Neutrino Observatory

    Authors: M. G. Aartsen, R. Abbasi, M. Ackermann, J. Adams, J. A. Aguilar, M. Ahlers, D. Altmann, C. Arguelles, J. Auffenberg, X. Bai, M. Baker, S. W. Barwick, V. Baum, R. Bay, J. J. Beatty, J. Becker Tjus, K. -H. Becker, S. BenZvi, P. Berghaus, D. Berley, E. Bernardini, A. Bernhard, D. Z. Besson, G. Binder, D. Bindig, et al. (262 additional authors not shown)

    Abstract: IceCube is a one-gigaton instrument located at the geographic South Pole, designed to detect cosmic neutrinos, identify the particle nature of dark matter, and study high-energy neutrinos themselves. Simulation of the IceCube detector and processing of data require a significant amount of computational resources. IceProd is a distributed management system based on Python, XML-RPC and GridFTP. It…

    Submitted 22 August, 2014; v1 submitted 22 November, 2013; originally announced November 2013.

    Journal ref: Journal of Parallel & Distributed Computing 75:198,2015

  25. arXiv:1003.4847

    math-ph cond-mat.stat-mech cs.DS math.CO

    A tree-decomposed transfer matrix for computing exact Potts model partition functions for arbitrary graphs, with applications to planar graph colourings

    Authors: Andrea Bedini, Jesper Lykke Jacobsen

    Abstract: Combining tree decomposition and transfer matrix techniques provides a very general algorithm for computing exact partition functions of statistical models defined on arbitrary graphs. The algorithm is particularly efficient in the case of planar graphs. We illustrate it by computing the Potts model partition functions and chromatic polynomials (the number of proper vertex colourings using Q colou…

    Submitted 6 August, 2010; v1 submitted 25 March, 2010; originally announced March 2010.

    Comments: 5 pages, 3 figures. Version 2 has been substantially expanded. Version 3 shows that the worst-case running time is sub-exponential in the number of vertices