Showing 1–50 of 237 results for author: Schölkopf, B

  1. arXiv:2407.00529  [pdf, other]

    cs.LG cs.SD eess.AS math.ST stat.ML

    Detecting and Identifying Selection Structure in Sequential Data

    Authors: Yujia Zheng, Zeyu Tang, Yiwen Qiu, Bernhard Schölkopf, Kun Zhang

    Abstract: We argue that the selective inclusion of data points based on latent objectives is common in practical situations, such as music sequences. Since this selection process often distorts statistical analysis, previous work primarily views it as a bias to be corrected and proposes various methods to mitigate its effect. However, while controlling this bias is crucial, selection also offers an opportun…

    Submitted 29 June, 2024; originally announced July 2024.

    Comments: ICML 2024

  2. arXiv:2406.19049  [pdf, other]

    cs.LG cs.AI stat.ML

    Accuracy on the wrong line: On the pitfalls of noisy data for out-of-distribution generalisation

    Authors: Amartya Sanyal, Yaxi Hu, Yaodong Yu, Yian Ma, Yixin Wang, Bernhard Schölkopf

    Abstract: "Accuracy-on-the-line" is a widely observed phenomenon in machine learning, where a model's accuracy on in-distribution (ID) and out-of-distribution (OOD) data is positively correlated across different hyperparameters and data configurations. But when does this useful relationship break down? In this work, we explore its robustness. The key observation is that noisy data and the presence of nuisan…

    Submitted 27 June, 2024; originally announced June 2024.

  3. arXiv:2406.14302  [pdf, ps, other]

    stat.ML cs.AI cs.LG

    Identifiable Exchangeable Mechanisms for Causal Structure and Representation Learning

    Authors: Patrik Reizinger, Siyuan Guo, Ferenc Huszár, Bernhard Schölkopf, Wieland Brendel

    Abstract: Identifying latent representations or causal structures is important for good generalization and downstream task performance. However, both fields have been developed rather independently. We observe that several methods in both representation and causal structure learning rely on the same data-generating process (DGP), namely, exchangeable but not i.i.d. (independent and identically distributed)…

    Submitted 20 June, 2024; originally announced June 2024.

  4. arXiv:2406.11601  [pdf, other]

    cs.LG stat.ML

    Standardizing Structural Causal Models

    Authors: Weronika Ormaniec, Scott Sussex, Lars Lorch, Bernhard Schölkopf, Andreas Krause

    Abstract: Synthetic datasets generated by structural causal models (SCMs) are commonly used for benchmarking causal structure learning algorithms. However, the variances and pairwise correlations in SCM data tend to increase along the causal ordering. Several popular algorithms exploit these artifacts, possibly leading to conclusions that do not generalize to real-world settings. Existing metrics like…

    Submitted 17 June, 2024; originally announced June 2024.

  5. arXiv:2405.20318  [pdf, other]

    cs.CL cs.AI cs.LG stat.ML

    CausalQuest: Collecting Natural Causal Questions for AI Agents

    Authors: Roberto Ceraolo, Dmitrii Kharlapenko, Amélie Reymond, Rada Mihalcea, Mrinmaya Sachan, Bernhard Schölkopf, Zhijing Jin

    Abstract: Humans have an innate drive to seek out causality. Whether fuelled by curiosity or specific goals, we constantly question why things happen, how they are interconnected, and many other related phenomena. To develop AI agents capable of addressing this natural human quest for causality, we urgently need a comprehensive dataset of natural causal questions. Unfortunately, existing datasets either con…

    Submitted 30 May, 2024; originally announced May 2024.

  6. arXiv:2405.18836  [pdf, other]

    stat.ME cs.LG

    Do Finetti: On Causal Effects for Exchangeable Data

    Authors: Siyuan Guo, Chi Zhang, Karthika Mohan, Ferenc Huszár, Bernhard Schölkopf

    Abstract: We study causal effect estimation in a setting where the data are not i.i.d. (independent and identically distributed). We focus on exchangeable data satisfying an assumption of independent causal mechanisms. Traditional causal effect estimation frameworks, e.g., relying on structural causal models and do-calculus, are typically limited to i.i.d. data and do not extend to more general exchangeable…

    Submitted 29 May, 2024; originally announced May 2024.

  7. arXiv:2405.11633  [pdf, other]

    cs.LG stat.ML

    Geometry-Aware Instrumental Variable Regression

    Authors: Heiner Kremer, Bernhard Schölkopf

    Abstract: Instrumental variable (IV) regression can be approached through its formulation in terms of conditional moment restrictions (CMR). Building on variants of the generalized method of moments, most CMR estimators are implicitly based on approximating the population data distribution via reweightings of the empirical sample. While for large sample sizes, in the independent identically distributed (IID…

    Submitted 19 May, 2024; originally announced May 2024.

  8. arXiv:2403.13041  [pdf, other]

    cs.CR cs.AI cs.LG stat.ML

    Provable Privacy with Non-Private Pre-Processing

    Authors: Yaxi Hu, Amartya Sanyal, Bernhard Schölkopf

    Abstract: When analysing Differentially Private (DP) machine learning pipelines, the potential privacy cost of data-dependent pre-processing is frequently overlooked in privacy accounting. In this work, we propose a general framework to evaluate the additional privacy cost incurred by non-private data-dependent pre-processing algorithms. Our framework establishes upper bounds on the overall privacy guarante…

    Submitted 21 June, 2024; v1 submitted 19 March, 2024; originally announced March 2024.

  9. arXiv:2403.07379  [pdf, other]

    cs.LG cs.CL stat.ML

    Hallmarks of Optimization Trajectories in Neural Networks: Directional Exploration and Redundancy

    Authors: Sidak Pal Singh, Bobby He, Thomas Hofmann, Bernhard Schölkopf

    Abstract: We propose a fresh take on understanding the mechanisms of neural networks by analyzing the rich directional structure of optimization trajectories, represented by their pointwise parameters. Towards this end, we introduce some natural notions of the complexity of optimization trajectories, both qualitative and quantitative, which hallmark the directional nature of optimization in neural networks:…

    Submitted 24 June, 2024; v1 submitted 12 March, 2024; originally announced March 2024.

    Comments: Preprint, 57 pages

  10. arXiv:2402.09236  [pdf, other]

    cs.LG cs.AI math.ST stat.ML

    Learning Interpretable Concepts: Unifying Causal Representation Learning and Foundation Models

    Authors: Goutham Rajendran, Simon Buchholz, Bryon Aragam, Bernhard Schölkopf, Pradeep Ravikumar

    Abstract: To build intelligent machine learning systems, there are two broad approaches. One approach is to build inherently interpretable models, as endeavored by the growing field of causal representation learning. The other approach is to build highly-performant foundation models and then invest efforts into understanding how they work. In this work, we relate these two approaches and study how to learn…

    Submitted 14 February, 2024; originally announced February 2024.

    Comments: 36 pages

  11. arXiv:2402.01399  [pdf, other]

    cs.LG cs.AI stat.ML

    A Probabilistic Model behind Self-Supervised Learning

    Authors: Alice Bizeul, Bernhard Schölkopf, Carl Allen

    Abstract: In self-supervised learning (SSL), representations are learned via an auxiliary task without annotated labels. A common task is to classify augmentations or different modalities of the data, which share semantic content (e.g. an object in an image) but differ in style (e.g. the object's location). Many approaches to self-supervised learning have been proposed, e.g. SimCLR, CLIP, and VicREG, which…

    Submitted 4 June, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

  12. arXiv:2312.13438  [pdf, ps, other]

    stat.ML cs.LG

    Independent Mechanism Analysis and the Manifold Hypothesis

    Authors: Shubhangi Ghosh, Luigi Gresele, Julius von Kügelgen, Michel Besserve, Bernhard Schölkopf

    Abstract: Independent Mechanism Analysis (IMA) seeks to address non-identifiability in nonlinear Independent Component Analysis (ICA) by assuming that the Jacobian of the mixing function has orthogonal columns. As typical in ICA, previous work focused on the case with an equal number of latent components and observed mixtures. Here, we extend IMA to settings with a larger number of mixtures that reside on a…

    Submitted 20 December, 2023; originally announced December 2023.

    Comments: 6 pages, Accepted at Neurips Causal Representation Learning 2023

  13. arXiv:2311.18639  [pdf, other]

    stat.ML cs.LG

    Targeted Reduction of Causal Models

    Authors: Armin Kekić, Bernhard Schölkopf, Michel Besserve

    Abstract: Why does a phenomenon occur? Addressing this question is central to most scientific inquiries and often relies on simulations of scientific models. As models become more intricate, deciphering the causes behind phenomena in high-dimensional spaces of interconnected variables becomes increasingly challenging. Causal Representation Learning (CRL) offers a promising avenue to uncover interpretable ca…

    Submitted 3 June, 2024; v1 submitted 30 November, 2023; originally announced November 2023.

  14. arXiv:2311.08815  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    Self-Supervised Disentanglement by Leveraging Structure in Data Augmentations

    Authors: Cian Eastwood, Julius von Kügelgen, Linus Ericsson, Diane Bouchacourt, Pascal Vincent, Bernhard Schölkopf, Mark Ibrahim

    Abstract: Self-supervised representation learning often uses data augmentations to induce some invariance to "style" attributes of the data. However, with downstream tasks generally unknown at training time, it is difficult to deduce a priori which attributes of the data are indeed "style" and can be safely discarded. To address this, we introduce a more principled approach that seeks to disentangle style f…

    Submitted 15 November, 2023; originally announced November 2023.

  15. arXiv:2311.08743  [pdf, other]

    stat.ME

    Kernel-based independence tests for causal structure learning on functional data

    Authors: Felix Laumann, Julius von Kügelgen, Junhyung Park, Bernhard Schölkopf, Mauricio Barahona

    Abstract: Measurements of systems taken along a continuous functional dimension, such as time or space, are ubiquitous in many fields, from the physical and biological sciences to economics and engineering. Such measurements can be viewed as realisations of an underlying smooth process sampled over the continuum. However, traditional methods for independence testing and causal learning are not directly appli…

    Submitted 15 November, 2023; originally announced November 2023.

  16. arXiv:2310.07665  [pdf, other]

    cs.AI cs.LG stat.ML

    Deep Backtracking Counterfactuals for Causally Compliant Explanations

    Authors: Klaus-Rudolf Kladny, Julius von Kügelgen, Bernhard Schölkopf, Michael Muehlebach

    Abstract: Counterfactuals answer questions of what would have been observed under altered circumstances and can therefore offer valuable insights. Whereas the classical interventional interpretation of counterfactuals has been studied extensively, backtracking constitutes a less studied alternative where all causal laws are kept intact. In the present work, we introduce a practical method called deep backtr…

    Submitted 9 February, 2024; v1 submitted 11 October, 2023; originally announced October 2023.

  17. arXiv:2307.09933  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    Spuriosity Didn't Kill the Classifier: Using Invariant Predictions to Harness Spurious Features

    Authors: Cian Eastwood, Shashank Singh, Andrei Liviu Nicolicioiu, Marin Vlastelica, Julius von Kügelgen, Bernhard Schölkopf

    Abstract: To avoid failures on out-of-distribution data, recent works have sought to extract features that have an invariant or stable relationship with the label across domains, discarding "spurious" or unstable features whose relationship with the label changes across domains. However, unstable features often carry complementary information that could boost performance if used correctly in the test domain…

    Submitted 8 November, 2023; v1 submitted 19 July, 2023; originally announced July 2023.

    Comments: NeurIPS 2023 Camera-Ready

  18. arXiv:2306.06002  [pdf, other]

    stat.ME cs.AI

    Causal Effect Estimation from Observational and Interventional Data Through Matrix Weighted Linear Estimators

    Authors: Klaus-Rudolf Kladny, Julius von Kügelgen, Bernhard Schölkopf, Michael Muehlebach

    Abstract: We study causal effect estimation from a mixture of observational and interventional data in a confounded linear regression model with multivariate treatments. We show that the statistical efficiency in terms of expected squared error can be improved by combining estimators arising from both the observational and interventional setting. To this end, we derive methods based on matrix weighted linea…

    Submitted 9 June, 2023; originally announced June 2023.

    Journal ref: UAI 2023

  19. arXiv:2306.03968  [pdf, other]

    stat.ML cs.LG

    Stochastic Marginal Likelihood Gradients using Neural Tangent Kernels

    Authors: Alexander Immer, Tycho F. A. van der Ouderaa, Mark van der Wilk, Gunnar Rätsch, Bernhard Schölkopf

    Abstract: Selecting hyperparameters in deep learning greatly impacts its effectiveness but requires manual effort and expertise. Recent works show that Bayesian model selection with Laplace approximations can allow to optimize such hyperparameters just like standard neural network parameters using gradients and on the training data. However, estimating a single hyperparameter gradient requires a pass throug…

    Submitted 6 June, 2023; originally announced June 2023.

    Comments: ICML 2023

  20. arXiv:2306.02235  [pdf, other]

    cs.LG cs.AI math.ST stat.ME stat.ML

    Learning Linear Causal Representations from Interventions under General Nonlinear Mixing

    Authors: Simon Buchholz, Goutham Rajendran, Elan Rosenfeld, Bryon Aragam, Bernhard Schölkopf, Pradeep Ravikumar

    Abstract: We study the problem of learning causal representations from unknown, latent interventions in a general setting, where the latent distribution is Gaussian but the mixing function is completely general. We prove strong identifiability results given unknown single-node interventions, i.e., without having access to the intervention targets. This generalizes prior works which have focused on weaker cl…

    Submitted 18 December, 2023; v1 submitted 3 June, 2023; originally announced June 2023.

    Comments: Accepted as Oral paper at NeurIPS 2023

  21. arXiv:2306.00542  [pdf, other]

    stat.ML cs.AI cs.LG

    Nonparametric Identifiability of Causal Representations from Unknown Interventions

    Authors: Julius von Kügelgen, Michel Besserve, Liang Wendong, Luigi Gresele, Armin Kekić, Elias Bareinboim, David M. Blei, Bernhard Schölkopf

    Abstract: We study causal representation learning, the task of inferring latent causal variables and their causal relations from high-dimensional mixtures of the variables. Prior work relies on weak supervision, in the form of counterfactual pre- and post-intervention views or temporal structure; places restrictive assumptions, such as linearity, on the mixing function or latent causal model; or requires pa…

    Submitted 28 October, 2023; v1 submitted 1 June, 2023; originally announced June 2023.

    Comments: NeurIPS 2023 camera-ready version; 36 pages, 4 figures

    MSC Class: 68T05 ACM Class: I.2.6

  22. arXiv:2305.17225  [pdf, other]

    stat.ML cs.AI cs.LG

    Causal Component Analysis

    Authors: Liang Wendong, Armin Kekić, Julius von Kügelgen, Simon Buchholz, Michel Besserve, Luigi Gresele, Bernhard Schölkopf

    Abstract: Independent Component Analysis (ICA) aims to recover independent latent variables from observed mixtures thereof. Causal Representation Learning (CRL) aims instead to infer causally related (thus often statistically dependent) latent variables, together with the unknown graph encoding their causal relationships. We introduce an intermediate problem termed Causal Component Analysis (CauCA). CauCA c…

    Submitted 17 January, 2024; v1 submitted 26 May, 2023; originally announced May 2023.

    Comments: NeurIPS 2023 final camera-ready version

  23. arXiv:2305.10898  [pdf, other]

    cs.LG stat.ML

    Estimation Beyond Data Reweighting: Kernel Method of Moments

    Authors: Heiner Kremer, Yassine Nemmour, Bernhard Schölkopf, Jia-Jie Zhu

    Abstract: Moment restrictions and their conditional counterparts emerge in many areas of machine learning and statistics ranging from causal inference to reinforcement learning. Estimators for these tasks, generally called methods of moments, include the prominent generalized method of moments (GMM) which has recently gained attention in causal inference. GMM is a special case of the broader family of empir…

    Submitted 13 June, 2023; v1 submitted 18 May, 2023; originally announced May 2023.

  24. arXiv:2305.09088  [pdf, other]

    cs.LG stat.ML

    The Hessian perspective into the Nature of Convolutional Neural Networks

    Authors: Sidak Pal Singh, Thomas Hofmann, Bernhard Schölkopf

    Abstract: While Convolutional Neural Networks (CNNs) have long been investigated and applied, as well as theorized, we aim to provide a slightly different perspective into their nature -- through the perspective of their Hessian maps. The reason is that the loss Hessian captures the pairwise interaction of parameters and therefore forms a natural ground to probe how the architectural aspects of CNN get mani…

    Submitted 15 May, 2023; originally announced May 2023.

    Comments: ICML 2023 conference proceedings

  25. arXiv:2305.01764  [pdf, other]

    cs.CL cs.AI cs.LG stat.ME

    Psychologically-Inspired Causal Prompts

    Authors: Zhiheng Lyu, Zhijing Jin, Justus Mattern, Rada Mihalcea, Mrinmaya Sachan, Bernhard Schoelkopf

    Abstract: NLP datasets are richer than just input-output pairs; rather, they carry causal relations between the input and output variables. In this work, we take sentiment classification as an example and look into the causal relations between the review (X) and sentiment (Y). As psychology studies show that language can affect emotion, different psychological processes are evoked when a person first makes…

    Submitted 2 May, 2023; originally announced May 2023.

  26. arXiv:2304.07896  [pdf, other]

    cs.LG cs.AI stat.ML

    Out-of-Variable Generalization for Discriminative Models

    Authors: Siyuan Guo, Jonas Wildberger, Bernhard Schölkopf

    Abstract: The ability of an agent to do well in new environments is a critical aspect of intelligence. In machine learning, this ability is known as $\textit{strong}$ or $\textit{out-of-distribution}$ generalization. However, merely considering differences in data distributions is inadequate for fully capturing differences between learning environments. In the present paper, we investigate…

    Submitted 8 February, 2024; v1 submitted 16 April, 2023; originally announced April 2023.

    Comments: Accepted at ICLR 2024

  27. arXiv:2303.06484  [pdf, other]

    cs.LG cs.CV stat.ML

    Generalizing and Decoupling Neural Collapse via Hyperspherical Uniformity Gap

    Authors: Weiyang Liu, Longhui Yu, Adrian Weller, Bernhard Schölkopf

    Abstract: The neural collapse (NC) phenomenon describes an underlying geometric symmetry for deep neural networks, where both deeply learned features and classifiers converge to a simplex equiangular tight frame. It has been shown that both cross-entropy loss and mean square error can provably lead to NC. We remove NC's key assumption on the feature dimension and the number of classes, and then present a ge…

    Submitted 15 April, 2023; v1 submitted 11 March, 2023; originally announced March 2023.

    Comments: ICLR 2023 (v2: fixed typos)

  28. arXiv:2301.13724  [pdf, other]

    stat.ML astro-ph.IM cs.LG math-ph physics.data-an

    Towards fully covariant machine learning

    Authors: Soledad Villar, David W. Hogg, Weichi Yao, George A. Kevrekidis, Bernhard Schölkopf

    Abstract: Any representation of data involves arbitrary investigator choices. Because those choices are external to the data-generating process, each choice leads to an exact symmetry, corresponding to the group of transformations that takes one possible representation to another. These are the passive symmetries; they include coordinate freedom, gauge symmetry, and units covariance, all of which have led t…

    Submitted 28 June, 2023; v1 submitted 31 January, 2023; originally announced January 2023.

    Comments: substantial revision from v1; submitted to TMLR

  29. arXiv:2301.08544  [pdf, ps, other]

    quant-ph stat.ML

    Multi-Armed Bandits and Quantum Channel Oracles

    Authors: Simon Buchholz, Jonas M. Kübler, Bernhard Schölkopf

    Abstract: Multi-armed bandits are one of the theoretical pillars of reinforcement learning. Recently, the investigation of quantum algorithms for multi-armed bandit problems was started, and it was found that a quadratic speed-up (in query complexity) is possible when the arms and the randomness of the rewards of the arms can be queried in superposition. Here we introduce further bandit models where we only…

    Submitted 26 February, 2024; v1 submitted 20 January, 2023; originally announced January 2023.

    Comments: 47 pages

    MSC Class: 68Q12

  30. arXiv:2212.08498  [pdf, other]

    stat.AP cs.AI math.DS

    Evaluating vaccine allocation strategies using simulation-assisted causal modelling

    Authors: Armin Kekić, Jonas Dehning, Luigi Gresele, Julius von Kügelgen, Viola Priesemann, Bernhard Schölkopf

    Abstract: Early on during a pandemic, vaccine availability is limited, requiring prioritisation of different population groups. Evaluating vaccine allocation is therefore a crucial element of pandemics response. In the present work, we develop a model to retrospectively evaluate age-dependent counterfactual vaccine allocation strategies against the COVID-19 pandemic. To estimate the effect of allocation on…

    Submitted 14 December, 2022; originally announced December 2022.

  31. arXiv:2212.06925  [pdf, other]

    cs.LG stat.ME stat.ML

    On the Relationship Between Explanation and Prediction: A Causal View

    Authors: Amir-Hossein Karimi, Krikamol Muandet, Simon Kornblith, Bernhard Schölkopf, Been Kim

    Abstract: Being able to provide explanations for a model's decision has become a central requirement for the development, deployment, and adoption of machine learning models. However, we are yet to understand what explanation methods can and cannot do. How do upstream factors such as data, model prediction, hyperparameters, and random initialization influence downstream explanations? While previous work rai…

    Submitted 12 May, 2023; v1 submitted 13 December, 2022; originally announced December 2022.

  32. arXiv:2211.03846  [pdf, other]

    cs.LG cs.MA stat.ME

    Federated Causal Discovery From Interventions

    Authors: Amin Abyaneh, Nino Scherrer, Patrick Schwab, Stefan Bauer, Bernhard Schölkopf, Arash Mehrjou

    Abstract: Causal discovery serves a pivotal role in mitigating model uncertainty through recovering the underlying causal mechanisms among variables. In many practical domains, such as healthcare, access to the data gathered by individual entities is limited, primarily for privacy and regulatory constraints. However, the majority of existing causal discovery methods require the data to be available in a cen…

    Submitted 11 February, 2024; v1 submitted 7 November, 2022; originally announced November 2022.

  33. arXiv:2210.16525  [pdf, other]

    stat.ML cs.LG econ.EM

    Spectral Representation Learning for Conditional Moment Models

    Authors: Ziyu Wang, Yucen Luo, Yueru Li, Jun Zhu, Bernhard Schölkopf

    Abstract: Many problems in causal inference and economics can be formulated in the framework of conditional moment models, which characterize the target function through a collection of conditional moment restrictions. For nonparametric conditional moment models, efficient estimation often relies on preimposed conditions on various measures of ill-posedness of the hypothesis space, which are hard to validat…

    Submitted 28 December, 2022; v1 submitted 29 October, 2022; originally announced October 2022.

  34. arXiv:2210.09054  [pdf, other]

    stat.ML cs.AI cs.LG

    On the Identifiability and Estimation of Causal Location-Scale Noise Models

    Authors: Alexander Immer, Christoph Schultheiss, Julia E. Vogt, Bernhard Schölkopf, Peter Bühlmann, Alexander Marx

    Abstract: We study the class of location-scale or heteroscedastic noise models (LSNMs), in which the effect $Y$ can be written as a function of the cause $X$ and a noise source $N$ independent of $X$, which may be scaled by a positive function $g$ over the cause, i.e., $Y = f(X) + g(X)N$. Despite the generality of the model class, we show the causal direction is identifiable up to some pathological cases. T…

    Submitted 1 June, 2023; v1 submitted 13 October, 2022; originally announced October 2022.

    Comments: ICML 2023

  35. arXiv:2210.08031  [pdf, other]

    cs.LG cs.AI cs.CV cs.NE stat.ML

    Neural Attentive Circuits

    Authors: Nasim Rahaman, Martin Weiss, Francesco Locatello, Chris Pal, Yoshua Bengio, Bernhard Schölkopf, Li Erran Li, Nicolas Ballas

    Abstract: Recent work has seen the development of general purpose neural architectures that can be trained to perform tasks across diverse data modalities. General purpose models typically make few assumptions about the underlying data-structure and are known to perform well in the large-data regime. At the same time, there has been growing interest in modular neural architectures that represent the data us…

    Submitted 19 October, 2022; v1 submitted 14 October, 2022; originally announced October 2022.

    Comments: To appear at NeurIPS 2022

  36. arXiv:2210.00364  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    DCI-ES: An Extended Disentanglement Framework with Connections to Identifiability

    Authors: Cian Eastwood, Andrei Liviu Nicolicioiu, Julius von Kügelgen, Armin Kekić, Frederik Träuble, Andrea Dittadi, Bernhard Schölkopf

    Abstract: In representation learning, a common approach is to seek representations which disentangle the underlying factors of variation. Eastwood & Williams (2018) proposed three metrics for quantifying the quality of such disentangled representations: disentanglement (D), completeness (C) and informativeness (I). In this work, we first connect this DCI framework to two common notions of linear and nonline…

    Submitted 16 February, 2023; v1 submitted 1 October, 2022; originally announced October 2022.

    Comments: Accepted to ICLR 2023

  37. arXiv:2208.06406  [pdf, other]

    stat.ML cs.LG

    Function Classes for Identifiable Nonlinear Independent Component Analysis

    Authors: Simon Buchholz, Michel Besserve, Bernhard Schölkopf

    Abstract: Unsupervised learning of latent variable models (LVMs) is widely used to represent data in machine learning. When such models reflect the ground truth factors and the mechanisms mapping them to observations, there is reason to expect that they allow generalization in downstream tasks. It is however well known that such identifiability guaranties are typically not achievable without putting constra…

    Submitted 12 August, 2022; originally announced August 2022.

    Comments: 43 pages

    Journal ref: NeurIPS 2022

  38. arXiv:2208.01893  [pdf, other]

    cs.LG q-bio.QM stat.ML

    Flow Annealed Importance Sampling Bootstrap

    Authors: Laurence Illing Midgley, Vincent Stimper, Gregor N. C. Simm, Bernhard Schölkopf, José Miguel Hernández-Lobato

    Abstract: Normalizing flows are tractable density models that can approximate complicated target distributions, e.g. Boltzmann distributions of physical systems. However, current methods for training flows either suffer from mode-seeking behavior, use samples from the target generated beforehand by expensive MCMC methods, or use stochastic losses that have high variance. To avoid these problems, we augment…

    Submitted 7 March, 2023; v1 submitted 3 August, 2022; originally announced August 2022.

  39. arXiv:2207.12067  [pdf, other]

    cs.LG math.GR stat.ML

    Homomorphism Autoencoder -- Learning Group Structured Representations from Observed Transitions

    Authors: Hamza Keurti, Hsiao-Ru Pan, Michel Besserve, Benjamin F. Grewe, Bernhard Schölkopf

    Abstract: How can agents learn internal models that veridically represent interactions with the real world is a largely open question. As machine learning is moving towards representations containing not just observational but also interventional knowledge, we study this problem using tools from representation learning and group theory. We propose methods enabling an agent acting upon the world to learn int…

    Submitted 2 July, 2024; v1 submitted 25 July, 2022; originally announced July 2022.

    Comments: Accepted at ICML 2023; presented at the Symmetry and Geometry in Neural Representations Workshop (NeurReps) @ NeurIPS 2022; 26 pages, 17 figures

  40. arXiv:2207.09944  [pdf, other]

    stat.ML cs.AI cs.CV cs.LG

    Probable Domain Generalization via Quantile Risk Minimization

    Authors: Cian Eastwood, Alexander Robey, Shashank Singh, Julius von Kügelgen, Hamed Hassani, George J. Pappas, Bernhard Schölkopf

    Abstract: Domain generalization (DG) seeks predictors which perform well on unseen test distributions by leveraging data drawn from multiple related training distributions or domains. To achieve this, DG is commonly formulated as an average- or worst-case problem over the set of possible domains. However, predictors that perform well on average lack robustness while predictors that perform well in the worst…

    Submitted 22 August, 2023; v1 submitted 20 July, 2022; originally announced July 2022.

    Comments: NeurIPS 2022 camera-ready (+ minor corrections)

  41. arXiv:2207.09239  [pdf, other]

    cs.LG stat.ML

    Assaying Out-Of-Distribution Generalization in Transfer Learning

    Authors: Florian Wenzel, Andrea Dittadi, Peter Vincent Gehler, Carl-Johann Simon-Gabriel, Max Horn, Dominik Zietlow, David Kernert, Chris Russell, Thomas Brox, Bernt Schiele, Bernhard Schölkopf, Francesco Locatello

    Abstract: Since out-of-distribution generalization is a generally ill-posed problem, various proxy targets (e.g., calibration, adversarial robustness, algorithmic corruptions, invariance across shifts) were studied across different research programs resulting in different recommendations. While sharing the same aspirational goal, these approaches have never been tested under the same experimental conditions…

    Submitted 21 October, 2022; v1 submitted 19 July, 2022; originally announced July 2022.

  42. arXiv:2207.06137  [pdf, other]

    stat.ML cs.AI cs.LG

    Probing the Robustness of Independent Mechanism Analysis for Representation Learning

    Authors: Joanna Sliwa, Shubhangi Ghosh, Vincent Stimper, Luigi Gresele, Bernhard Schölkopf

    Abstract: One aim of representation learning is to recover the original latent code that generated the data, a task which requires additional information or inductive biases. A recently proposed approach termed Independent Mechanism Analysis (IMA) postulates that each latent source should influence the observed mixtures independently, complementing standard nonlinear independent component analysis, and taki…

    Submitted 13 July, 2022; originally announced July 2022.

    Comments: 10 pages, 14 figures, UAI CRL 2022 final camera-ready version

  43. arXiv:2207.04771  [pdf, other]

    cs.LG math.ST stat.ML

    Functional Generalized Empirical Likelihood Estimation for Conditional Moment Restrictions

    Authors: Heiner Kremer, Jia-Jie Zhu, Krikamol Muandet, Bernhard Schölkopf

    Abstract: Important problems in causal inference, economics, and, more generally, robust machine learning can be expressed as conditional moment restrictions, but estimation becomes challenging as it requires solving a continuum of unconditional moment restrictions. Previous works addressed this problem by extending the generalized method of moments (GMM) to continuum moment restrictions. In contrast, gener…

    Submitted 16 February, 2024; v1 submitted 11 July, 2022; originally announced July 2022.

  44. arXiv:2206.11131  [pdf, other]

    cs.LG stat.ME

    Variational Causal Dynamics: Discovering Modular World Models from Interventions

    Authors: Anson Lei, Bernhard Schölkopf, Ingmar Posner

    Abstract: Latent world models allow agents to reason about complex environments with high-dimensional observations. However, adapting to new environments and effectively leveraging previous knowledge remain significant challenges. We present variational causal dynamics (VCD), a structured world model that exploits the invariance of causal mechanisms across environments to achieve fast and modular adaptation…

    Submitted 22 June, 2022; originally announced June 2022.

  45. arXiv:2206.08843  [pdf, other]

    cs.LG stat.ML

    AutoML Two-Sample Test

    Authors: Jonas M. Kübler, Vincent Stimper, Simon Buchholz, Krikamol Muandet, Bernhard Schölkopf

    Abstract: Two-sample tests are important in statistics and machine learning, both as tools for scientific discovery as well as to detect distribution shifts. This led to the development of many sophisticated test procedures going beyond the standard supervised learning frameworks, whose usage can require specialized knowledge about two-sample testing. We use a simple test that takes the mean discrepancy of…

    Submitted 15 January, 2023; v1 submitted 17 June, 2022; originally announced June 2022.

    Comments: NeurIPS 2022

  46. arXiv:2206.02953  [pdf, other]

    math.OC cs.GT cs.LG stat.ML

    Sampling without Replacement Leads to Faster Rates in Finite-Sum Minimax Optimization

    Authors: Aniket Das, Bernhard Schölkopf, Michael Muehlebach

    Abstract: We analyze the convergence rates of stochastic gradient algorithms for smooth finite-sum minimax optimization and show that, for many such algorithms, sampling the data points without replacement leads to faster convergence compared to sampling with replacement. For the smooth and strongly convex-strongly concave setting, we consider gradient descent ascent and the proximal point method, and prese…

    Submitted 10 October, 2022; v1 submitted 6 June, 2022; originally announced June 2022.

    Comments: 36th Conference on Neural Information Processing Systems (NeurIPS 2022)

  47. arXiv:2206.02416  [pdf, other]

    stat.ML cs.AI cs.LG

    Embrace the Gap: VAEs Perform Independent Mechanism Analysis

    Authors: Patrik Reizinger, Luigi Gresele, Jack Brady, Julius von Kügelgen, Dominik Zietlow, Bernhard Schölkopf, Georg Martius, Wieland Brendel, Michel Besserve

    Abstract: Variational autoencoders (VAEs) are a popular framework for modeling complex data distributions; they can be efficiently trained via variational inference by maximizing the evidence lower bound (ELBO), at the expense of a gap to the exact (log-)marginal likelihood. While VAEs are commonly used for representation learning, it is unclear why ELBO maximization would yield useful representations, sinc…

    Submitted 27 January, 2023; v1 submitted 6 June, 2022; originally announced June 2022.

    Comments: NeurIPS 2022 final version

  48. arXiv:2206.02013  [pdf, other]

    cs.LG cs.AI stat.ME stat.ML

    Causal Discovery in Heterogeneous Environments Under the Sparse Mechanism Shift Hypothesis

    Authors: Ronan Perry, Julius von Kügelgen, Bernhard Schölkopf

    Abstract: Machine learning approaches commonly rely on the assumption of independent and identically distributed (i.i.d.) data. In reality, however, this assumption is almost always violated due to distribution shifts between environments. Although valuable learning signals can be provided by heterogeneous data from changing distributions, it is also known that learning under arbitrary (adversarial) changes…

    Submitted 15 October, 2022; v1 submitted 4 June, 2022; originally announced June 2022.

    Comments: NeurIPS 2022 camera-ready version. JvK and BS are shared last authors. 10 pages + Bibliography + Appendix (26 pages total)

  49. arXiv:2206.01665  [pdf, other]

    cs.LG stat.ME stat.ML

    BaCaDI: Bayesian Causal Discovery with Unknown Interventions

    Authors: Alexander Hägele, Jonas Rothfuss, Lars Lorch, Vignesh Ram Somnath, Bernhard Schölkopf, Andreas Krause

    Abstract: Inferring causal structures from experimentation is a central task in many domains. For example, in biology, recent advances allow us to obtain single-cell expression data under multiple interventions such as drugs or gene knockouts. However, the targets of the interventions are often uncertain or unknown and the number of observations limited. As a result, standard causal discovery methods can no…

    Submitted 23 February, 2023; v1 submitted 3 June, 2022; originally announced June 2022.

    Comments: Accepted to AISTATS 2023. 26 pages

  50. arXiv:2205.12934  [pdf, other]

    cs.LG stat.ML

    Amortized Inference for Causal Structure Learning

    Authors: Lars Lorch, Scott Sussex, Jonas Rothfuss, Andreas Krause, Bernhard Schölkopf

    Abstract: Inferring causal structure poses a combinatorial search problem that typically involves evaluating structures with a score or independence test. The resulting search is costly, and designing suitable scores or tests that capture prior knowledge is difficult. In this work, we propose to amortize causal structure learning. Rather than searching over structures, we train a variational inference model…

    Submitted 15 December, 2022; v1 submitted 25 May, 2022; originally announced May 2022.

    Comments: NeurIPS 2022, fixed formatting of Figure 5