Showing 1–47 of 47 results for author: Forré, P

  1. arXiv:2407.02134 [pdf, other]

    cs.IT

    Abstract Markov Random Fields

    Authors: Leon Lang, Clélia de Mulatier, Rick Quax, Patrick Forré

    Abstract: Markov random fields are known to be fully characterized by properties of their information diagrams, or I-diagrams. In particular, for Markov random fields, regions in the I-diagram corresponding to disconnected vertex sets in the graph vanish. Recently, I-diagrams have been generalized to F-diagrams, for a larger class of functions F satisfying the chain rule beyond Shannon entropy, such as Kull…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: 50 pages, 7 figures

  2. arXiv:2406.15753 [pdf, other]

    cs.LG cs.AI stat.ML

    The Perils of Optimizing Learned Reward Functions: Low Training Error Does Not Guarantee Low Regret

    Authors: Lukas Fluri, Leon Lang, Alessandro Abate, Patrick Forré, David Krueger, Joar Skalse

    Abstract: In reinforcement learning, specifying reward functions that capture the intended task can be very challenging. Reward learning aims to address this issue by learning the reward function. However, a learned reward model may have a low error on the training distribution, and yet subsequently produce a policy with large regret. We say that such a reward model has an error-regret mismatch. The main so…

    Submitted 22 June, 2024; originally announced June 2024.

    Comments: 58 pages, 1 figure

  3. arXiv:2406.04052 [pdf, ps, other]

    cs.LG cs.AI

    Multivector Neurons: Better and Faster O(n)-Equivariant Clifford Graph Neural Networks

    Authors: Cong Liu, David Ruhe, Patrick Forré

    Abstract: Most current deep learning models equivariant to $O(n)$ or $SO(n)$ either consider mostly scalar information such as distances and angles or have a very high computational complexity. In this work, we test a few novel message passing graph neural networks (GNNs) based on Clifford multivectors, structured similarly to other prevalent equivariant models in geometric deep learning. Our approach lever…

    Submitted 10 July, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

  4. arXiv:2402.14730 [pdf, other]

    cs.LG cs.AI

    Clifford-Steerable Convolutional Neural Networks

    Authors: Maksim Zhdanov, David Ruhe, Maurice Weiler, Ana Lucic, Johannes Brandstetter, Patrick Forré

    Abstract: We present Clifford-Steerable Convolutional Neural Networks (CS-CNNs), a novel class of $\mathrm{E}(p, q)$-equivariant CNNs. CS-CNNs process multivector fields on pseudo-Euclidean spaces $\mathbb{R}^{p,q}$. They cover, for instance, $\mathrm{E}(3)$-equivariance on $\mathbb{R}^3$ and Poincaré-equivariance on Minkowski spacetime $\mathbb{R}^{1,3}$. Our approach is based on an implicit parametrizatio…

    Submitted 6 July, 2024; v1 submitted 22 February, 2024; originally announced February 2024.

    Comments: accepted to ICML 2024

  5. arXiv:2402.10011 [pdf, other]

    cs.AI

    Clifford Group Equivariant Simplicial Message Passing Networks

    Authors: Cong Liu, David Ruhe, Floor Eijkelboom, Patrick Forré

    Abstract: We introduce Clifford Group Equivariant Simplicial Message Passing Networks, a method for steerable E(n)-equivariant message passing on simplicial complexes. Our method integrates the expressivity of Clifford group-equivariant layers with simplicial message passing, which is topologically more intricate than regular graph message passing. Clifford algebras include higher-order objects such as bive…

    Submitted 12 March, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

  6. arXiv:2311.12447 [pdf, other]

    cs.AI

    Designing Long-term Group Fair Policies in Dynamical Systems

    Authors: Miriam Rateike, Isabel Valera, Patrick Forré

    Abstract: Neglecting the effect that decisions have on individuals (and thus, on the underlying data distribution) when designing algorithmic decision-making policies may increase inequalities and unfairness in the long term - even if fairness considerations were taken in the policy design process. In this paper, we propose a novel framework for achieving long-term group fairness in dynamical systems, in wh…

    Submitted 21 November, 2023; originally announced November 2023.

  7. arXiv:2311.05931 [pdf, other]

    cs.LG cs.AI stat.ML

    Early-Exit Neural Networks with Nested Prediction Sets

    Authors: Metod Jazbec, Patrick Forré, Stephan Mandt, Dan Zhang, Eric Nalisnick

    Abstract: Early-exit neural networks (EENNs) enable adaptive and efficient inference by providing predictions at multiple stages during the forward pass. In safety-critical applications, these predictions are meaningful only when accompanied by reliable uncertainty estimates. A popular method for quantifying the uncertainty of predictive models is the use of prediction sets. However, we demonstrate that sta…

    Submitted 2 June, 2024; v1 submitted 10 November, 2023; originally announced November 2023.

    Comments: UAI 2024

  8. arXiv:2310.19384 [pdf, other]

    stat.ML cs.LG

    Deep anytime-valid hypothesis testing

    Authors: Teodora Pandeva, Patrick Forré, Aaditya Ramdas, Shubhanshu Shekhar

    Abstract: We propose a general framework for constructing powerful, sequential hypothesis tests for a large class of nonparametric testing problems. The null hypothesis for these problems is defined in an abstract form using the action of two known operators on the data distribution. This abstraction allows for a unified treatment of several classical tasks, such as two-sample testing, independence testing,…

    Submitted 30 October, 2023; originally announced October 2023.

  9. arXiv:2310.11366 [pdf, other]

    cs.LG stat.ML

    Lie Group Decompositions for Equivariant Neural Networks

    Authors: Mircea Mironenco, Patrick Forré

    Abstract: Invariance and equivariance to geometrical transformations have proven to be very useful inductive biases when training (convolutional) neural network models, especially in the low-data regime. Much work has focused on the case where the symmetry group employed is compact or abelian, or both. Recent work has explored enlarging the class of transformations used to the case of Lie groups, principall…

    Submitted 10 July, 2024; v1 submitted 17 October, 2023; originally announced October 2023.

    Comments: Published at ICLR 2024. Code is available at https://github.com/mirceamironenco/rgenn

  10. arXiv:2310.01808 [pdf, other]

    stat.ML cs.LG stat.CO

    Simulation-based Inference with the Generalized Kullback-Leibler Divergence

    Authors: Benjamin Kurt Miller, Marco Federici, Christoph Weniger, Patrick Forré

    Abstract: In Simulation-based Inference, the goal is to solve the inverse problem when the likelihood is only known implicitly. Neural Posterior Estimation commonly fits a normalized density estimator as a surrogate model for the posterior. This formulation cannot easily fit unnormalized surrogates because it optimizes the Kullback-Leibler divergence. We propose to optimize a generalized Kullback-Leibler di…

    Submitted 3 October, 2023; originally announced October 2023.

    Comments: Accepted at Synergy of Scientific and Machine Learning Modeling ICML 2023 Workshop https://syns-ml.github.io/2023/contributions/

  11. arXiv:2309.07200 [pdf, other]

    cs.LG cs.AI cs.IT

    Latent Representation and Simulation of Markov Processes via Time-Lagged Information Bottleneck

    Authors: Marco Federici, Patrick Forré, Ryota Tomioka, Bastiaan S. Veeling

    Abstract: Markov processes are widely used mathematical models for describing dynamic systems in various fields. However, accurately simulating large-scale systems at long time scales is computationally expensive due to the short time steps required for accurate integration. In this paper, we introduce an inference process that maps complex systems into a simplified representational space and models large j…

    Submitted 26 January, 2024; v1 submitted 13 September, 2023; originally announced September 2023.

    Comments: 10 pages, 15 figures, Accepted ICLR 2024

  12. arXiv:2306.08445 [pdf, other]

    cs.LG

    Deep Gaussian Markov Random Fields for Graph-Structured Dynamical Systems

    Authors: Fiona Lippert, Bart Kranstauber, E. Emiel van Loon, Patrick Forré

    Abstract: Probabilistic inference in high-dimensional state-space models is computationally challenging. For many spatiotemporal systems, however, prior knowledge about the dependency structure of state variables is available. We leverage this structure to develop a computationally efficient approach to state estimation and learning in graph-structured state-space models with (partially) unknown dynamics an…

    Submitted 27 October, 2023; v1 submitted 14 June, 2023; originally announced June 2023.

    Comments: NeurIPS 2023; camera-ready version

  13. arXiv:2306.00608 [pdf, other]

    stat.ML cs.IT cs.LG

    On the Effectiveness of Hybrid Mutual Information Estimation

    Authors: Marco Federici, David Ruhe, Patrick Forré

    Abstract: Estimating the mutual information from samples from a joint distribution is a challenging problem in both science and engineering. In this work, we realize a variational bound that generalizes both discriminative and generative approaches. Using this bound, we propose a hybrid method to mitigate their respective shortcomings. Further, we propose Predictive Quantization (PQ): a simple generative me…

    Submitted 2 June, 2023; v1 submitted 1 June, 2023; originally announced June 2023.

  14. arXiv:2305.11141 [pdf, other]

    cs.LG cs.AI

    Clifford Group Equivariant Neural Networks

    Authors: David Ruhe, Johannes Brandstetter, Patrick Forré

    Abstract: We introduce Clifford Group Equivariant Neural Networks: a novel approach for constructing $\mathrm{O}(n)$- and $\mathrm{E}(n)$-equivariant models. We identify and study the $\textit{Clifford group}$, a subgroup inside the Clifford algebra tailored to achieve several favorable properties. Primarily, the group's action forms an orthogonal automorphism that extends beyond the typical vector space to…

    Submitted 22 October, 2023; v1 submitted 18 May, 2023; originally announced May 2023.

    Comments: Published at NeurIPS 2023 (Oral)

  15. arXiv:2304.10978 [pdf, other]

    stat.ML cs.LG stat.ME

    Balancing Simulation-based Inference for Conservative Posteriors

    Authors: Arnaud Delaunoy, Benjamin Kurt Miller, Patrick Forré, Christoph Weniger, Gilles Louppe

    Abstract: Conservative inference is a major concern in simulation-based inference. It has been shown that commonly used algorithms can produce overconfident posterior approximations. Balancing has empirically proven to be an effective way to mitigate this issue. However, its application remains limited to neural ratio estimation. In this work, we extend balancing to any algorithm that provides a posterior d…

    Submitted 21 April, 2023; originally announced April 2023.

  16. arXiv:2211.09008 [pdf, other]

    astro-ph.IM astro-ph.HE cs.LG gr-qc

    Normalizing Flows for Hierarchical Bayesian Analysis: A Gravitational Wave Population Study

    Authors: David Ruhe, Kaze Wong, Miles Cranmer, Patrick Forré

    Abstract: We propose parameterizing the population distribution of the gravitational wave population modeling framework (Hierarchical Bayesian Analysis) with a normalizing flow. We first demonstrate the merit of this method on illustrative experiments and then analyze four parameters of the latest LIGO/Virgo data release: primary mass, secondary mass, redshift, and effective spin. Our results show that desp…

    Submitted 29 December, 2022; v1 submitted 15 November, 2022; originally announced November 2022.

  17. arXiv:2211.04539 [pdf, other]

    cs.LG

    Physics-informed inference of aerial animal movements from weather radar data

    Authors: Fiona Lippert, Bart Kranstauber, E. Emiel van Loon, Patrick Forré

    Abstract: Studying animal movements is essential for effective wildlife conservation and conflict mitigation. For aerial movements, operational weather radars have become an indispensable data source in this respect. However, partial measurements, incomplete spatial coverage, and poor understanding of animal behaviours make it difficult to reconstruct complete spatio-temporal movement patterns from availabl…

    Submitted 8 November, 2022; originally announced November 2022.

    Comments: NeurIPS 2022, AI4Science workshop

  18. arXiv:2210.13027 [pdf, other]

    stat.ME cs.LG stat.ML

    E-Valuating Classifier Two-Sample Tests

    Authors: Teodora Pandeva, Tim Bakker, Christian A. Naesseth, Patrick Forré

    Abstract: We introduce a powerful deep classifier two-sample test for high-dimensional data based on E-values, called E-value Classifier Two-Sample Test (E-C2ST). Our test combines ideas from existing work on split likelihood ratio tests and predictive independence tests. The resulting E-values are suitable for anytime-valid sequential two-sample tests. This feature allows for more effective use of data in…

    Submitted 30 April, 2024; v1 submitted 24 October, 2022; originally announced October 2022.

  19. arXiv:2210.06170 [pdf, other]

    stat.ML astro-ph.IM cs.LG hep-ph

    Contrastive Neural Ratio Estimation for Simulation-based Inference

    Authors: Benjamin Kurt Miller, Christoph Weniger, Patrick Forré

    Abstract: Likelihood-to-evidence ratio estimation is usually cast as either a binary (NRE-A) or a multiclass (NRE-B) classification task. In contrast to the binary classification framework, the current formulation of the multiclass version has an intrinsic and unknown bias term, making otherwise informative diagnostics unreliable. We propose a multiclass framework free from the bias inherent to NRE-B at opt…

    Submitted 4 July, 2024; v1 submitted 10 October, 2022; originally announced October 2022.

    Comments: 11 pages. 34 pages with references and supplemental material. Accepted at NeurIPS 2022. Updated version corrects code implementation error and all experiments. Code at https://github.com/bkmi/cnre

  20. arXiv:2210.05484 [pdf, other]

    cs.LG

    Equivariance-aware Architectural Optimization of Neural Networks

    Authors: Kaitlin Maile, Dennis G. Wilson, Patrick Forré

    Abstract: Incorporating equivariance to symmetry groups as a constraint during neural network training can improve performance and generalization for tasks exhibiting those symmetries, but such symmetries are often not perfectly nor explicitly present. This motivates algorithmically optimizing the architectural constraints imposed by equivariance. We propose the equivariance relaxation morphism, which prese…

    Submitted 7 February, 2023; v1 submitted 11 October, 2022; originally announced October 2022.

  21. arXiv:2210.02177 [pdf, other]

    cs.LG cs.NE

    Multi-objective optimization via equivariant deep hypervolume approximation

    Authors: Jim Boelrijk, Bernd Ensing, Patrick Forré

    Abstract: Optimizing multiple competing objectives is a common problem across science and industry. The inherent inextricable trade-off between those objectives leads one to the task of exploring their Pareto front. A meaningful quantity for the purpose of the latter is the hypervolume indicator, which is used in Bayesian Optimization (BO) and Evolutionary Algorithms (EAs). However, the computational comple…

    Submitted 23 October, 2023; v1 submitted 5 October, 2022; originally announced October 2022.

    Comments: Updated with camera-ready version. Accepted at ICLR 2023

  22. arXiv:2210.02083 [pdf, other]

    cs.LG eess.SP

    Multi-View Independent Component Analysis with Shared and Individual Sources

    Authors: Teodora Pandeva, Patrick Forré

    Abstract: Independent component analysis (ICA) is a blind source separation method for linear disentanglement of independent latent sources from observed data. We investigate the special setting of noisy linear ICA where the observations are split among different views, each receiving a mixture of shared and individual sources. We prove that the corresponding linear structure is identifiable, and the source…

    Submitted 3 March, 2023; v1 submitted 5 October, 2022; originally announced October 2022.

  23. arXiv:2202.09393 [pdf, other]

    cs.IT

    Information Decomposition Diagrams Applied beyond Shannon Entropy: A Generalization of Hu's Theorem

    Authors: Leon Lang, Pierre Baudot, Rick Quax, Patrick Forré

    Abstract: In information theory, one major goal is to find useful functions that summarize the amount of information contained in the interaction of several random variables. Specifically, one can ask how the classical Shannon entropy, mutual information, and higher interaction information relate to each other. This is answered by Hu's theorem, which is widely known in the form of information diagrams: it r…

    Submitted 1 March, 2024; v1 submitted 18 February, 2022; originally announced February 2022.

    Comments: 58 pages, 5 figures

  24. arXiv:2109.11631 [pdf, ps, other]

    math.PR math.CT math.FA math.ST

    Quasi-Measurable Spaces

    Authors: Patrick Forré

    Abstract: We introduce the categories of quasi-measurable spaces, which are slight generalizations of the category of quasi-Borel spaces, where we now allow for general sample spaces and less restrictive random variables, spaces and maps. We show that each category of quasi-measurable spaces is bi-complete and cartesian closed. We also introduce several different strong probability monads. Together these co…

    Submitted 14 September, 2021; originally announced September 2021.

    MSC Class: 2020 MSC: Primary: 60A05; 62A01; 18M05; Secondary: 62H22; 62D20; 28A50; 18M05

  25. arXiv:2107.13349 [pdf, other]

    cs.LG cs.AI

    Self-Supervised Inference in State-Space Models

    Authors: David Ruhe, Patrick Forré

    Abstract: We perform approximate inference in state-space models with nonlinear state transitions. Without parameterizing a generative model, we apply Bayesian update formulas using a local linearity approximation parameterized by neural networks. This comes accompanied by a maximum likelihood objective that requires no supervision via uncorrupt observations or ground truth latent states. The optimization b…

    Submitted 25 January, 2022; v1 submitted 28 July, 2021; originally announced July 2021.

  26. arXiv:2107.01214 [pdf, other]

    stat.ML astro-ph.IM cs.LG hep-ph

    Truncated Marginal Neural Ratio Estimation

    Authors: Benjamin Kurt Miller, Alex Cole, Patrick Forré, Gilles Louppe, Christoph Weniger

    Abstract: Parametric stochastic simulators are ubiquitous in science, often featuring high-dimensional input parameters and/or an intractable likelihood. Performing Bayesian parameter inference in this context can be challenging. We present a neural simulation-based inference algorithm which simultaneously offers simulation efficiency and fast empirical posterior testability, which is unique among modern al…

    Submitted 26 October, 2021; v1 submitted 2 July, 2021; originally announced July 2021.

    Comments: 10 pages. 27 pages with references and supplemental material. Implementation of experiments at https://github.com/bkmi/tmnre/. Ready-to-use implementation of underlying algorithm at https://github.com/undark-lab/swyft/. Accepted at NeurIPS 2021

  27. arXiv:2106.06020 [pdf, other]

    cs.LG cs.CG cs.CV stat.ML

    Coordinate Independent Convolutional Networks -- Isometry and Gauge Equivariant Convolutions on Riemannian Manifolds

    Authors: Maurice Weiler, Patrick Forré, Erik Verlinde, Max Welling

    Abstract: Motivated by the vast success of deep convolutional networks, there is a great interest in generalizing convolutions to non-Euclidean manifolds. A major complication in comparison to flat spaces is that it is unclear in which alignment a convolution kernel should be applied on a manifold. The underlying reason for this ambiguity is that general manifolds do not come with a canonical choice of refe…

    Submitted 10 June, 2021; originally announced June 2021.

    Comments: The implementation of orientation independent Möbius convolutions is publicly available at https://github.com/mauriceweiler/MobiusCNNs

  28. arXiv:2106.03783 [pdf, other]

    cs.LG cs.IT

    An Information-theoretic Approach to Distribution Shifts

    Authors: Marco Federici, Ryota Tomioka, Patrick Forré

    Abstract: Safely deploying machine learning models to the real world is often a challenging process. Models trained with data obtained from a specific geographic location tend to fail when queried with data obtained elsewhere, agents trained in a simulation can struggle to adapt when deployed in the real world or novel environments, and neural networks that are fit to a subset of the population might carry…

    Submitted 1 November, 2021; v1 submitted 7 June, 2021; originally announced June 2021.

  29. arXiv:2104.11547 [pdf, ps, other]

    math.ST math.PR stat.ML stat.OT

    Transitional Conditional Independence

    Authors: Patrick Forré

    Abstract: We develop the framework of transitional conditional independence. For this we introduce transition probability spaces and transitional random variables. These constructions will generalize, strengthen and unify previous notions of (conditional) random variables and non-stochastic variables, (extended) stochastic conditional independence and some form of functional conditional independence. Trans…

    Submitted 27 August, 2021; v1 submitted 23 April, 2021; originally announced April 2021.

    MSC Class: 62A99; 60A05

  30. arXiv:2103.15418 [pdf, other]

    astro-ph.IM

    Detecting Dispersed Radio Transients in Real Time Using Convolutional Neural Networks

    Authors: David Ruhe, Mark Kuiack, Antonia Rowlinson, Ralph Wijers, Patrick Forré

    Abstract: We present a methodology for automated real-time analysis of a radio image data stream with the goal to find transient sources. Contrary to previous works, the transients we are interested in occur on a time-scale where dispersion starts to play a role, so we must search a higher-dimensional data space and yet work fast enough to keep up with the data stream in real time. The approach consists of…

    Submitted 6 August, 2021; v1 submitted 29 March, 2021; originally announced March 2021.

  31. arXiv:2103.04786 [pdf, other]

    stat.ML cs.AI cs.LG stat.ME

    Combining Interventional and Observational Data Using Causal Reductions

    Authors: Maximilian Ilse, Patrick Forré, Max Welling, Joris M. Mooij

    Abstract: Unobserved confounding is one of the main challenges when estimating causal effects. We propose a causal reduction method that, given a causal model, replaces an arbitrary number of possibly high-dimensional latent confounders with a single latent confounder that takes values in the same space as the treatment variable, without changing the observational and interventional distributions the causal…

    Submitted 22 February, 2023; v1 submitted 8 March, 2021; originally announced March 2021.

  32. arXiv:2102.05379 [pdf, other]

    stat.ML cs.CL cs.LG

    Argmax Flows and Multinomial Diffusion: Learning Categorical Distributions

    Authors: Emiel Hoogeboom, Didrik Nielsen, Priyank Jaini, Patrick Forré, Max Welling

    Abstract: Generative flows and diffusion models have been predominantly trained on ordinal data, for example natural images. This paper introduces two extensions of flows and diffusion for categorical data such as language or image segmentation: Argmax Flows and Multinomial Diffusion. Argmax Flows are defined by a composition of a continuous distribution (such as a normalizing flow), and an argmax function.…

    Submitted 22 October, 2021; v1 submitted 10 February, 2021; originally announced February 2021.

    Comments: Accepted at Neural Information Processing Systems (NeurIPS 2021)

  33. arXiv:2011.07248 [pdf, other]

    cs.LG cs.NE stat.ML

    Self Normalizing Flows

    Authors: T. Anderson Keller, Jorn W. T. Peters, Priyank Jaini, Emiel Hoogeboom, Patrick Forré, Max Welling

    Abstract: Efficient gradient computation of the Jacobian determinant term is a core problem in many machine learning settings, and especially so in the normalizing flow framework. Most proposed flow models therefore either restrict to a function class with easy evaluation of the Jacobian determinant, or an efficient estimator thereof. However, these restrictions limit the performance of such density models,…

    Submitted 9 June, 2021; v1 submitted 14 November, 2020; originally announced November 2020.

  34. arXiv:2009.02594 [pdf, other]

    cs.LG stat.ML

    FlipOut: Uncovering Redundant Weights via Sign Flipping

    Authors: Andrei Apostol, Maarten Stol, Patrick Forré

    Abstract: Modern neural networks, although achieving state-of-the-art results on many tasks, tend to have a large number of parameters, which increases training time and resource usage. This problem can be alleviated by pruning. Existing methods, however, often require extensive parameter tuning or multiple cycles of pruning and retraining to convergence in order to obtain a favorable accuracy-sparsity trad…

    Submitted 5 September, 2020; originally announced September 2020.

  35. arXiv:2008.10880 [pdf, other]

    cs.LG cs.AI stat.ML

    Improving Fair Predictions Using Variational Inference In Causal Models

    Authors: Rik Helwegen, Christos Louizos, Patrick Forré

    Abstract: The importance of algorithmic fairness grows with the increasing impact machine learning has on people's lives. Recent work on fairness metrics shows the need for causal reasoning in fairness constraints. In this work, a practical method named FairTrade is proposed for creating flexible prediction models which integrate fairness constraints on sensitive causal paths. The method uses recent advance…

    Submitted 25 August, 2020; originally announced August 2020.

  36. arXiv:2006.06663 [pdf, other]

    stat.ML cs.LG

    Neural Ordinary Differential Equations on Manifolds

    Authors: Luca Falorsi, Patrick Forré

    Abstract: Normalizing flows are a powerful technique for obtaining reparameterizable samples from complex multimodal distributions. Unfortunately current approaches fall short when the underlying space has a non trivial topology, and are only available for the most basic geometries. Recently normalizing flows in Euclidean space based on Neural ODEs show great promise, yet suffer the same limitations. Using…

    Submitted 11 June, 2020; originally announced June 2020.

  37. arXiv:2006.00896 [pdf, other]

    cs.LG stat.ML

    Pruning via Iterative Ranking of Sensitivity Statistics

    Authors: Stijn Verdenius, Maarten Stol, Patrick Forré

    Abstract: With the introduction of SNIP [arXiv:1810.02340v2], it has been demonstrated that modern neural networks can effectively be pruned before training. Yet, its sensitivity criterion has since been criticized for not propagating training signal properly or even disconnecting layers. As a remedy, GraSP [arXiv:2002.07376v1] was introduced, compromising on simplicity. However, in this work we show that b…

    Submitted 14 June, 2020; v1 submitted 1 June, 2020; originally announced June 2020.

    Comments: 25 pages, 21 figures, 62 pictures, typos corrected, reference added

  38. arXiv:2005.01856 [pdf, other]

    stat.ML cs.CV cs.LG

    Selecting Data Augmentation for Simulating Interventions

    Authors: Maximilian Ilse, Jakub M. Tomczak, Patrick Forré

    Abstract: Machine learning models trained with purely observational data and the principle of empirical risk minimization (Vapnik, 1992) can fail to generalize to unseen domains. In this paper, we focus on the case where the problem arises through spurious correlation between the observed domains and the actual task labels. We find that many domain generalization methods do not explicitly ta…

    Submitted 26 October, 2020; v1 submitted 4 May, 2020; originally announced May 2020.

  39. arXiv:2002.07017 [pdf, other]

    cs.LG stat.ML

    Learning Robust Representations via Multi-View Information Bottleneck

    Authors: Marco Federici, Anjan Dutta, Patrick Forré, Nate Kushman, Zeynep Akata

    Abstract: The information bottleneck principle provides an information-theoretic method for representation learning, by training an encoder to retain all information which is relevant for predicting the label while minimizing the amount of other, excess information in the representation. The original formulation, however, requires labeled data to identify the superfluous information. In this work, we extend…

    Submitted 18 February, 2020; v1 submitted 17 February, 2020; originally announced February 2020.

  40. arXiv:1903.02958 [pdf, other]

    stat.ML cs.CG cs.LG math.PR math.RT

    Reparameterizing Distributions on Lie Groups

    Authors: Luca Falorsi, Pim de Haan, Tim R. Davidson, Patrick Forré

    Abstract: Reparameterizable densities are an important way to learn probability distributions in a deep learning setting. For many distributions it is possible to create low-variance gradient estimators by utilizing a 'reparameterization trick'. Due to the absence of a general reparameterization trick, much research has recently been devoted to extend the number of reparameterizable distributional families.…

    Submitted 7 March, 2019; originally announced March 2019.

    Comments: AISTATS (2019), code available at https://github.com/pimdh/relie

  41. arXiv:1901.00433 [pdf, ps, other]

    stat.ML cs.AI cs.LG stat.ME

    Causal Calculus in the Presence of Cycles, Latent Confounders and Selection Bias

    Authors: Patrick Forré, Joris M. Mooij

    Abstract: We prove the main rules of causal calculus (also called do-calculus) for i/o structural causal models (ioSCMs), a generalization of a recently proposed general class of non-/linear structural causal models that allow for cycles, latent confounders and arbitrary probability distributions. We also generalize adjustment criteria and formulas from the acyclic setting to the general one (i.e. ioSCMs).…

    Submitted 3 July, 2019; v1 submitted 2 January, 2019; originally announced January 2019.

    Comments: Accepted for publication in Conference on Uncertainty in Artificial Intelligence 2019 (UAI-2019)

    Journal ref: Proceedings of the 35th Annual Conference on Uncertainty in Artificial Intelligence, 2019

  42. arXiv:1810.01118 [pdf, other]

    cs.LG cs.CV stat.ML

    Sinkhorn AutoEncoders

    Authors: Giorgio Patrini, Rianne van den Berg, Patrick Forré, Marcello Carioni, Samarth Bhargav, Max Welling, Tim Genewein, Frank Nielsen

    Abstract: Optimal transport offers an alternative to maximum likelihood for learning generative autoencoding models. We show that minimizing the p-Wasserstein distance between the generator and the true data distribution is equivalent to the unconstrained min-min optimization of the p-Wasserstein distance between the encoder aggregated posterior and the prior in latent space, plus a reconstruction error. We…

    Submitted 15 July, 2019; v1 submitted 2 October, 2018; originally announced October 2018.

    Comments: Accepted for oral presentation at UAI19

  43. arXiv:1807.04689 [pdf, other]

    stat.ML cs.LG

    Explorations in Homeomorphic Variational Auto-Encoding

    Authors: Luca Falorsi, Pim de Haan, Tim R. Davidson, Nicola De Cao, Maurice Weiler, Patrick Forré, Taco S. Cohen

    Abstract: The manifold hypothesis states that many kinds of high-dimensional data are concentrated near a low-dimensional manifold. If the topology of this data manifold is non-trivial, a continuous encoder network cannot embed it in a one-to-one manner without creating holes of low density in the latent space. This is at odds with the Gaussian prior assumption typically made in Variational Auto-Encoders (V…

    Submitted 12 July, 2018; originally announced July 2018.

    Comments: 16 pages, 8 figures, ICML workshop on Theoretical Foundations and Applications of Deep Generative Models

  44. arXiv:1807.03024 [pdf, other]

    stat.ML cs.AI cs.LG

    Constraint-based Causal Discovery for Non-Linear Structural Causal Models with Cycles and Latent Confounders

    Authors: Patrick Forré, Joris M. Mooij

    Abstract: We address the problem of causal discovery from data, making use of the recently proposed causal modeling framework of modular structural causal models (mSCM) to handle cycles, latent confounders and non-linearities. We introduce σ-connection graphs (σ-CG), a new class of mixed graphs (containing undirected, bidirected and directed edges) with additional structure, and extend the concept of σ-sepa…

    Submitted 9 July, 2018; originally announced July 2018.

    Comments: Accepted for publication in Conference on Uncertainty in Artificial Intelligence 2018

    Journal ref: Proceedings of the 34th Annual Conference on Uncertainty in Artificial Intelligence (2018), 269-278

  45. arXiv:1710.08775 [pdf, ps, other]

    math.ST stat.ME stat.ML stat.OT

    Markov Properties for Graphical Models with Cycles and Latent Variables

    Authors: Patrick Forré, Joris M. Mooij

    Abstract: We investigate probabilistic graphical models that allow for both cycles and latent variables. For this we introduce directed graphs with hyperedges (HEDGes), generalizing and combining both marginalized directed acyclic graphs (mDAGs) that can model latent (dependent) variables, and directed mixed graphs (DMGs) that can model cycles. We define and analyse several different Markov properties that…

    Submitted 24 October, 2017; originally announced October 2017.

    Comments: 131 pages

  46. arXiv:1611.06221 [pdf, other]

    stat.ME cs.AI cs.LG

    Foundations of Structural Causal Models with Cycles and Latent Variables

    Authors: Stephan Bongers, Patrick Forré, Jonas Peters, Joris M. Mooij

    Abstract: Structural causal models (SCMs), also known as (nonparametric) structural equation models (SEMs), are widely used for causal modeling purposes. In particular, acyclic SCMs, also known as recursive SEMs, form a well-studied subclass of SCMs that generalize causal Bayesian networks to allow for latent confounders. In this paper, we investigate SCMs in a more general setting, allowing for the presenc…

    Submitted 22 November, 2021; v1 submitted 18 November, 2016; originally announced November 2016.

    Comments: 75 pages (including supplementary material)

    MSC Class: 62A09; 68T30 (Primary) 68T37 (Secondary)

    Journal ref: The Annals of Statistics 49(5), 2021, 2885-2915

  47. arXiv:1605.08344 [pdf, ps, other]

    math.NT math.AG

    Cohomological Hasse principle for schemes over valuation rings of higher dimensional local fields

    Authors: Patrick Forré

    Abstract: K. Kato's conjecture about the cohomological Hasse principle for regular connected schemes $\mathfrak X$ which are flat and proper over the complete discrete valuation rings $\mathcal O_N$ of higher local fields $F_N$ is proven. This generalizes the work of M. Kerz, S. Saito and U. Jannsen for finite fields to the case of all higher local fields. For that purpose a $p$-alteration theorem for the l…

    Submitted 26 May, 2016; originally announced May 2016.

    MSC Class: 11G45; 14E15; 14F42; 11G25