Showing 1–48 of 48 results for author: Schölkopf, B

  1. arXiv:2407.00529  [pdf, other]

    cs.LG cs.SD eess.AS math.ST stat.ML

    Detecting and Identifying Selection Structure in Sequential Data

    Authors: Yujia Zheng, Zeyu Tang, Yiwen Qiu, Bernhard Schölkopf, Kun Zhang

    Abstract: We argue that the selective inclusion of data points based on latent objectives is common in practical situations, such as music sequences. Since this selection process often distorts statistical analysis, previous work primarily views it as a bias to be corrected and proposes various methods to mitigate its effect. However, while controlling this bias is crucial, selection also offers an opportun…

    Submitted 29 June, 2024; originally announced July 2024.

    Comments: ICML 2024

  2. arXiv:2406.00388  [pdf, ps, other]

    math.ST

    Products, Abstractions and Inclusions of Causal Spaces

    Authors: Simon Buchholz, Junhyung Park, Bernhard Schölkopf

    Abstract: Causal spaces have recently been introduced as a measure-theoretic framework to encode the notion of causality. While it has some advantages over established frameworks, such as structural causal models, the theory is so far only developed for single causal spaces. In many mathematical theories, not least the theory of probability spaces of which causal spaces are a direct extension, combinations…

    Submitted 6 June, 2024; v1 submitted 1 June, 2024; originally announced June 2024.

  3. arXiv:2402.09236  [pdf, other]

    cs.LG cs.AI math.ST stat.ML

    Learning Interpretable Concepts: Unifying Causal Representation Learning and Foundation Models

    Authors: Goutham Rajendran, Simon Buchholz, Bryon Aragam, Bernhard Schölkopf, Pradeep Ravikumar

    Abstract: To build intelligent machine learning systems, there are two broad approaches. One approach is to build inherently interpretable models, as endeavored by the growing field of causal representation learning. The other approach is to build highly-performant foundation models and then invest efforts into understanding how they work. In this work, we relate these two approaches and study how to learn…

    Submitted 14 February, 2024; originally announced February 2024.

    Comments: 36 pages

  4. arXiv:2306.02235  [pdf, other]

    cs.LG cs.AI math.ST stat.ME stat.ML

    Learning Linear Causal Representations from Interventions under General Nonlinear Mixing

    Authors: Simon Buchholz, Goutham Rajendran, Elan Rosenfeld, Bryon Aragam, Bernhard Schölkopf, Pradeep Ravikumar

    Abstract: We study the problem of learning causal representations from unknown, latent interventions in a general setting, where the latent distribution is Gaussian but the mixing function is completely general. We prove strong identifiability results given unknown single-node interventions, i.e., without having access to the intervention targets. This generalizes prior works which have focused on weaker cl…

    Submitted 18 December, 2023; v1 submitted 3 June, 2023; originally announced June 2023.

    Comments: Accepted as Oral paper at NeurIPS 2023

  5. arXiv:2305.17139  [pdf, other]

    cs.AI math.ST

    A Measure-Theoretic Axiomatisation of Causality

    Authors: Junhyung Park, Simon Buchholz, Bernhard Schölkopf, Krikamol Muandet

    Abstract: Causality is a central concept in a wide range of research areas, yet there is still no universally agreed axiomatisation of causality. We view causality both as an extension of probability theory and as a study of what happens when one intervenes on a system, and argue in favour of taking Kolmogorov's measure-theoretic axiomatisation of probability as the starting point towards an axioma…

    Submitted 6 June, 2024; v1 submitted 19 May, 2023; originally announced May 2023.

  6. arXiv:2212.08498  [pdf, other]

    stat.AP cs.AI math.DS

    Evaluating vaccine allocation strategies using simulation-assisted causal modelling

    Authors: Armin Kekić, Jonas Dehning, Luigi Gresele, Julius von Kügelgen, Viola Priesemann, Bernhard Schölkopf

    Abstract: Early on during a pandemic, vaccine availability is limited, requiring prioritisation of different population groups. Evaluating vaccine allocation is therefore a crucial element of pandemics response. In the present work, we develop a model to retrospectively evaluate age-dependent counterfactual vaccine allocation strategies against the COVID-19 pandemic. To estimate the effect of allocation on…

    Submitted 14 December, 2022; originally announced December 2022.

  7. arXiv:2207.12067  [pdf, other]

    cs.LG math.GR stat.ML

    Homomorphism Autoencoder -- Learning Group Structured Representations from Observed Transitions

    Authors: Hamza Keurti, Hsiao-Ru Pan, Michel Besserve, Benjamin F. Grewe, Bernhard Schölkopf

    Abstract: How can agents learn internal models that veridically represent interactions with the real world is a largely open question. As machine learning is moving towards representations containing not just observational but also interventional knowledge, we study this problem using tools from representation learning and group theory. We propose methods enabling an agent acting upon the world to learn int…

    Submitted 2 July, 2024; v1 submitted 25 July, 2022; originally announced July 2022.

    Comments: Accepted at ICML2023, Presented at the Symmetry and Geometry in Neural Representations Workshop (NeurReps) @ NeurIPS2022, 26 pages, 17 figures

  8. arXiv:2207.04771  [pdf, other]

    cs.LG math.ST stat.ML

    Functional Generalized Empirical Likelihood Estimation for Conditional Moment Restrictions

    Authors: Heiner Kremer, Jia-Jie Zhu, Krikamol Muandet, Bernhard Schölkopf

    Abstract: Important problems in causal inference, economics, and, more generally, robust machine learning can be expressed as conditional moment restrictions, but estimation becomes challenging as it requires solving a continuum of unconditional moment restrictions. Previous works addressed this problem by extending the generalized method of moments (GMM) to continuum moment restrictions. In contrast, gener…

    Submitted 16 February, 2024; v1 submitted 11 July, 2022; originally announced July 2022.

  9. arXiv:2206.02953  [pdf, other]

    math.OC cs.GT cs.LG stat.ML

    Sampling without Replacement Leads to Faster Rates in Finite-Sum Minimax Optimization

    Authors: Aniket Das, Bernhard Schölkopf, Michael Muehlebach

    Abstract: We analyze the convergence rates of stochastic gradient algorithms for smooth finite-sum minimax optimization and show that, for many such algorithms, sampling the data points without replacement leads to faster convergence compared to sampling with replacement. For the smooth and strongly convex-strongly concave setting, we consider gradient descent ascent and the proximal point method, and prese…

    Submitted 10 October, 2022; v1 submitted 6 June, 2022; originally announced June 2022.

    Comments: 36th Conference on Neural Information Processing Systems (NeurIPS 2022)

  10. arXiv:2204.11564  [pdf, other]

    math.OC eess.SY

    Maximum Mean Discrepancy Distributionally Robust Nonlinear Chance-Constrained Optimization with Finite-Sample Guarantee

    Authors: Yassine Nemmour, Heiner Kremer, Bernhard Schölkopf, Jia-Jie Zhu

    Abstract: This paper is motivated by addressing open questions in distributionally robust chance-constrained programs (DRCCP) using the popular Wasserstein ambiguity sets. Specifically, the computational techniques for those programs typically place restrictive assumptions on the constraint functions and the size of the Wasserstein ambiguity sets is often set using costly cross-validation (CV) procedures or…

    Submitted 25 April, 2022; originally announced April 2022.

  11. arXiv:2203.15756  [pdf, other]

    stat.ML cs.LG math.ST stat.ME

    Causal de Finetti: On the Identification of Invariant Causal Structure in Exchangeable Data

    Authors: Siyuan Guo, Viktor Tóth, Bernhard Schölkopf, Ferenc Huszár

    Abstract: Constraint-based causal discovery methods leverage conditional independence tests to infer causal relationships in a wide variety of applications. Just as the majority of machine learning methods, existing work focuses on studying independent and identically distributed data. However, it is known that even with infinite i.i.d. data, constraint-based methods can only identify causal…

    Submitted 24 May, 2024; v1 submitted 29 March, 2022; originally announced March 2022.

    Comments: camera-ready NeurIPS 2023

  12. arXiv:2201.05830  [pdf, other]

    cs.RO math.DS stat.ML

    Physical Derivatives: Computing policy gradients by physical forward-propagation

    Authors: Arash Mehrjou, Ashkan Soleymani, Stefan Bauer, Bernhard Schölkopf

    Abstract: Model-free and model-based reinforcement learning are two ends of a spectrum. Learning a good policy without a dynamic model can be prohibitively expensive. Learning the dynamic model of a system can reduce the cost of learning the policy, but it can also introduce bias if it is not accurate. We propose a middle ground where instead of the transition model, the sensitivity of the trajectories with…

    Submitted 15 January, 2022; originally announced January 2022.

  13. arXiv:2110.13588  [pdf, other]

    eess.SY math.OC

    Distributional Robustness Regularized Scenario Optimization with Application to Model Predictive Control

    Authors: Yassine Nemmour, Bernhard Schölkopf, Jia-Jie Zhu

    Abstract: We provide a functional view of distributional robustness motivated by robust statistics and functional analysis. This results in two practical computational approaches for approximate distributionally robust nonlinear optimization based on gradient norms and reproducing kernel Hilbert spaces. Our method can be applied to the settings of statistical learning with small sample size and test distrib…

    Submitted 26 October, 2021; originally announced October 2021.

    Journal ref: Proceedings of the 3rd Conference on Learning for Dynamics and Control, PMLR 144:1255-1269, 2021

  14. arXiv:2102.11834  [pdf, other]

    cs.GT econ.TH math.CO

    Finding Stable Matchings in PhD Markets with Consistent Preferences and Cooperative Partners

    Authors: Maximilian Mordig, Riccardo Della Vecchia, Nicolò Cesa-Bianchi, Bernhard Schölkopf

    Abstract: We introduce a new algorithm for finding stable matchings in multi-sided matching markets. Our setting is motivated by a PhD market of students, advisors, and co-advisors, and can be generalized to supply chain networks viewed as $n$-sided markets. In the three-sided PhD market, students primarily care about advisors and then about co-advisors (consistent preferences), while advisors and co-adviso…

    Submitted 6 July, 2021; v1 submitted 23 February, 2021; originally announced February 2021.

  15. arXiv:2102.08474  [pdf, other]

    cs.LG math.OC stat.ML

    Adversarially Robust Kernel Smoothing

    Authors: Jia-Jie Zhu, Christina Kouridi, Yassine Nemmour, Bernhard Schölkopf

    Abstract: We propose a scalable robust learning algorithm combining kernel smoothing and robust optimization. Our method is motivated by the convex analysis perspective of distributionally robust optimization based on probability metrics, such as the Wasserstein distance and the maximum mean discrepancy. We adapt the integral operator using supremal convolution in convex analysis to form a novel function ma…

    Submitted 19 February, 2022; v1 submitted 16 February, 2021; originally announced February 2021.

  16. arXiv:2101.12080  [pdf, other]

    cs.GT econ.TH math.CO

    Two-Sided Matching Markets in the ELLIS 2020 PhD Program

    Authors: Maximilian Mordig, Riccardo Della Vecchia, Nicolò Cesa-Bianchi, Bernhard Schölkopf

    Abstract: The ELLIS PhD program is a European initiative that supports excellent young researchers by connecting them to leading researchers in AI. In particular, PhD students are supervised by two advisors from different countries: an advisor and a co-advisor. In this work we summarize the procedure that, in its final step, matches students to advisors in the ELLIS 2020 PhD program. The steps of the proced…

    Submitted 11 March, 2021; v1 submitted 28 January, 2021; originally announced January 2021.

  17. arXiv:2007.02938  [pdf, other]

    stat.ML cs.LG math.ST

    Causal Feature Selection via Orthogonal Search

    Authors: Ashkan Soleymani, Anant Raj, Stefan Bauer, Bernhard Schölkopf, Michel Besserve

    Abstract: The problem of inferring the direct causal parents of a response variable among a large set of explanatory variables is of high practical importance in many disciplines. However, established approaches often scale at least exponentially with the number of explanatory variables, are difficult to extend to nonlinear relationships, and are difficult to extend to cyclic data. Inspired by Debiased…

    Submitted 16 September, 2022; v1 submitted 6 July, 2020; originally announced July 2020.

  18. arXiv:2006.09268  [pdf, ps, other]

    cs.LG math.PR math.ST stat.ML

    Metrizing Weak Convergence with Maximum Mean Discrepancies

    Authors: Carl-Johann Simon-Gabriel, Alessandro Barp, Bernhard Schölkopf, Lester Mackey

    Abstract: This paper characterizes the maximum mean discrepancies (MMD) that metrize the weak convergence of probability measures for a wide class of kernels. More precisely, we prove that, on a locally compact, non-compact, Hausdorff space, the MMD of a bounded continuous Borel measurable kernel k, whose reproducing kernel Hilbert space (RKHS) functions vanish at infinity, metrizes the weak convergence of…

    Submitted 3 September, 2021; v1 submitted 16 June, 2020; originally announced June 2020.

    Comments: 14 pages. Corrects in particular Thm.12 of Simon-Gabriel and Schölkopf, JMLR, 19(44):1-29, 2018. See http://jmlr.org/papers/v19/16-291.html

    MSC Class: 60B10 (Primary) 60F05; 60-08; 28-08 (Secondary) ACM Class: G.3; I.2.6; I.5.0

  19. arXiv:2006.06981  [pdf, other]

    math.OC cs.LG stat.ML

    Kernel Distributionally Robust Optimization

    Authors: Jia-Jie Zhu, Wittawat Jitkrittum, Moritz Diehl, Bernhard Schölkopf

    Abstract: We propose kernel distributionally robust optimization (Kernel DRO) using insights from the robust optimization theory and functional analysis. Our method uses reproducing kernel Hilbert spaces (RKHS) to construct a wide range of convex ambiguity sets, which can be generalized to sets based on integral probability metrics and finite-order moment bounds. This perspective unifies multiple existing r…

    Submitted 27 February, 2021; v1 submitted 12 June, 2020; originally announced June 2020.

    Journal ref: Proceedings of Machine Learning Research, PMLR 130:280-288, 2021

  20. arXiv:2005.06413  [pdf, ps, other]

    stat.ME cs.LG math.PR stat.ML

    Crackovid: Optimizing Group Testing

    Authors: Louis Abraham, Gary Bécigneul, Bernhard Schölkopf

    Abstract: We study the problem usually referred to as group testing in the context of COVID-19. Given $n$ samples taken from patients, how should we select mixtures of samples to be tested, so as to maximize information and minimize the number of tests? We consider both adaptive and non-adaptive strategies, and take a Bayesian approach with a prior both for infection of patients and test errors. We start by…

    Submitted 13 May, 2020; originally announced May 2020.

  21. arXiv:2004.00166  [pdf, other]

    math.OC cs.LG eess.SY

    Worst-Case Risk Quantification under Distributional Ambiguity using Kernel Mean Embedding in Moment Problem

    Authors: Jia-Jie Zhu, Wittawat Jitkrittum, Moritz Diehl, Bernhard Schölkopf

    Abstract: In order to anticipate rare and impactful events, we propose to quantify the worst-case risk under distributional ambiguity using a recent development in kernel methods -- the kernel mean embedding. Specifically, we formulate the generalized moment problem whose ambiguity set (i.e., the moment constraint) is described by constraints in the associated reproducing kernel Hilbert space in a nonparame…

    Submitted 6 September, 2020; v1 submitted 31 March, 2020; originally announced April 2020.

  22. arXiv:2003.02658  [pdf, other]

    cs.LG math.DS stat.ML

    SLEIPNIR: Deterministic and Provably Accurate Feature Expansion for Gaussian Process Regression with Derivatives

    Authors: Emmanouil Angelis, Philippe Wenk, Bernhard Schölkopf, Stefan Bauer, Andreas Krause

    Abstract: Gaussian processes are an important regression tool with excellent analytic properties which allow for direct integration of derivative observations. However, vanilla GP methods scale cubically in the amount of observations. In this work, we propose a novel approach for scaling GP regression with derivatives based on quadrature Fourier features. We then prove deterministic, non-asymptotic and expo…

    Submitted 5 March, 2020; originally announced March 2020.

  23. arXiv:2002.10271  [pdf, other]

    stat.ML cs.LG math.ST

    Testing Goodness of Fit of Conditional Density Models with Kernels

    Authors: Wittawat Jitkrittum, Heishiro Kanagawa, Bernhard Schölkopf

    Abstract: We propose two nonparametric statistical tests of goodness of fit for conditional distributions: given a conditional probability density function $p(y|x)$ and a joint sample, decide whether the sample is drawn from $p(y|x)r_x(x)$ for some density $r_x$. Our tests, formulated with a Stein operator, can be applied to any differentiable conditional density model, and require no knowledge of the norma…

    Submitted 30 June, 2020; v1 submitted 24 February, 2020; originally announced February 2020.

    Comments: In UAI 2020. http://auai.org/uai2020/accepted.php

    MSC Class: 46E22; 62G10 ACM Class: G.3; I.2.6

  24. arXiv:2001.10398  [pdf, other]

    math.OC cs.LG eess.SY

    A Kernel Mean Embedding Approach to Reducing Conservativeness in Stochastic Programming and Control

    Authors: Jia-Jie Zhu, Moritz Diehl, Bernhard Schölkopf

    Abstract: We apply kernel mean embedding methods to sample-based stochastic optimization and control. Specifically, we use the reduced-set expansion method as a way to discard sampled scenarios. The effect of such constraint removal is improved optimality and decreased conservativeness. This is achieved by solving a distributional-distance-regularized optimization problem. We demonstrated this optimization…

    Submitted 22 April, 2020; v1 submitted 28 January, 2020; originally announced January 2020.

  25. arXiv:1911.11082  [pdf, other]

    stat.ML cs.LG eess.SY math.OC

    A New Distribution-Free Concept for Representing, Comparing, and Propagating Uncertainty in Dynamical Systems with Kernel Probabilistic Programming

    Authors: Jia-Jie Zhu, Krikamol Muandet, Moritz Diehl, Bernhard Schölkopf

    Abstract: This work presents the concept of kernel mean embedding and kernel probabilistic programming in the context of stochastic systems. We propose formulations to represent, compare, and propagate uncertainties for fairly general stochastic dynamics in a distribution-free manner. The new tools enjoy sound theory rooted in functional analysis and wide applicability as demonstrated in distinct numerical…

    Submitted 4 May, 2020; v1 submitted 25 November, 2019; originally announced November 2019.

  26. arXiv:1910.14428   

    stat.ML cs.LG math.DS

    Kernel-Guided Training of Implicit Generative Models with Stability Guarantees

    Authors: Arash Mehrjou, Wittawat Jitkrittum, Krikamol Muandet, Bernhard Schölkopf

    Abstract: Modern implicit generative models such as generative adversarial networks (GANs) are generally known to suffer from issues such as instability, uninterpretability, and difficulty in assessing their performance. If we see these implicit models as dynamical systems, some of these issues are caused by being unable to control their behavior in a meaningful way during the course of training. In this wo…

    Submitted 3 November, 2019; v1 submitted 29 October, 2019; originally announced October 2019.

    Comments: There was a misunderstanding in how an article should be updated on arXiv. We have withdrawn this article from this link. The same article can be found at arXiv:1901.09206

  27. arXiv:1902.08480  [pdf, other]

    cs.LG math.DS stat.ML

    AReS and MaRS - Adversarial and MMD-Minimizing Regression for SDEs

    Authors: Gabriele Abbati, Philippe Wenk, Michael A Osborne, Andreas Krause, Bernhard Schölkopf, Stefan Bauer

    Abstract: Stochastic differential equations are an important modeling class in many disciplines. Consequently, there exist many methods relying on various discretization and numerical integration schemes. In this paper, we propose a novel, probabilistic model for estimating the drift and diffusion given noisy observations of the underlying stochastic system. Using state-of-the-art adversarial and moment mat…

    Submitted 28 May, 2019; v1 submitted 22 February, 2019; originally announced February 2019.

    Comments: Published at the Thirty-sixth International Conference on Machine Learning (ICML 2019)

  28. arXiv:1902.06278  [pdf, other]

    cs.LG math.DS stat.ML

    ODIN: ODE-Informed Regression for Parameter and State Inference in Time-Continuous Dynamical Systems

    Authors: Philippe Wenk, Gabriele Abbati, Michael A Osborne, Bernhard Schölkopf, Andreas Krause, Stefan Bauer

    Abstract: Parameter inference in ordinary differential equations is an important problem in many applied sciences and in engineering, especially in a data-scarce setting. In this work, we introduce a novel generative modeling approach based on constrained Gaussian processes and leverage it to build a computationally and data efficient algorithm for state and parameter inference. In an extensive set of exper…

    Submitted 5 December, 2019; v1 submitted 17 February, 2019; originally announced February 2019.

    Comments: Published at the Thirty-fourth AAAI Conference on Artificial Intelligence

  29. arXiv:1901.08403  [pdf, other]

    math.DS

    Deep Lyapunov Function: Automatic Stability Analysis for Dynamical Systems

    Authors: Arash Mehrjou, Bernhard Schölkopf

    Abstract: Stability analysis plays a crucial role in studying the behavior of dynamical systems with theoretical and engineering applications. Among various kinds of stability, the stability of equilibrium points is of the greatest importance which is mainly studied by Lyapunov's stability theory. This theory requires finding a function with specified properties. Except for a few simple examples, there is n…

    Submitted 24 January, 2019; originally announced January 2019.

  30. arXiv:1805.10615  [pdf, other]

    stat.ML cs.LG math.DS

    A Local Information Criterion for Dynamical Systems

    Authors: Arash Mehrjou, Friedrich Solowjow, Sebastian Trimpe, Bernhard Schölkopf

    Abstract: Encoding a sequence of observations is an essential task with many applications. The encoding can become highly efficient when the observations are generated by a dynamical system. A dynamical system imposes regularities on the observations that can be leveraged to achieve a more efficient code. We propose a method to encode a given or learned dynamical system. Apart from its application for encod…

    Submitted 27 May, 2018; originally announced May 2018.

  31. arXiv:1804.03911  [pdf, ps, other]

    math.ST

    Structural causal models for macro-variables in time-series

    Authors: Dominik Janzing, Paul Rubenstein, Bernhard Schölkopf

    Abstract: We consider a bivariate time series $(X_t,Y_t)$ that is given by a simple linear autoregressive model. Assuming that the equations describing each variable as a linear combination of past values are considered structural equations, there is a clear meaning of how intervening on one particular $X_t$ influences $Y_{t'}$ at later times $t'>t$. In the present work, we describe conditions under which o…

    Submitted 11 April, 2018; originally announced April 2018.

    Comments: 8 pages

  32. arXiv:1803.09539  [pdf, other]

    stat.ML cs.LG math.OC

    On Matching Pursuit and Coordinate Descent

    Authors: Francesco Locatello, Anant Raj, Sai Praneeth Karimireddy, Gunnar Rätsch, Bernhard Schölkopf, Sebastian U. Stich, Martin Jaggi

    Abstract: Two popular examples of first-order optimization methods over linear spaces are coordinate descent and matching pursuit algorithms, with their randomized variants. While the former targets the optimization by moving along coordinates, the latter considers a generalized notion of directions. Exploiting the connection between the two algorithms, we present a unified analysis of both, providing affin…

    Submitted 31 May, 2019; v1 submitted 26 March, 2018; originally announced March 2018.

    Journal ref: ICML 2018 - Proceedings of the 35th International Conference on Machine Learning

  33. arXiv:1705.02212  [pdf, other]

    stat.ML cs.AI cs.LG math.ST

    Group invariance principles for causal generative models

    Authors: Michel Besserve, Naji Shajarisales, Bernhard Schölkopf, Dominik Janzing

    Abstract: The postulate of independence of cause and mechanism (ICM) has recently led to several new causal discovery algorithms. The interpretation of independence and the way it is utilized, however, varies across these methods. Our aim in this paper is to propose a group theoretic framework for ICM to unify and generalize these approaches. In our setting, the cause-mechanism relationship is assessed by c…

    Submitted 5 May, 2017; originally announced May 2017.

    Comments: 16 pages, 6 figures

    ACM Class: I.2.6; I.2.10; G.3; I.5.3

  34. arXiv:1609.07478  [pdf, other]

    math.OC cs.LG stat.ML

    Screening Rules for Convex Problems

    Authors: Anant Raj, Jakob Olbrich, Bernd Gärtner, Bernhard Schölkopf, Martin Jaggi

    Abstract: We propose a new framework for deriving screening rules for convex optimization problems. Our approach covers a large class of constrained and penalized optimization formulations, and works in two steps. First, given any approximate point, the structure of the objective function and the duality gap is used to gather information on the optimal solution. In the second step, this information is used…

    Submitted 23 September, 2016; originally announced September 2016.

  35. arXiv:1604.05251  [pdf, ps, other]

    stat.ML math.FA math.PR

    Kernel Distribution Embeddings: Universal Kernels, Characteristic Kernels and Kernel Metrics on Distributions

    Authors: Carl-Johann Simon-Gabriel, Bernhard Schölkopf

    Abstract: Kernel mean embeddings have recently attracted the attention of the machine learning community. They map measures $μ$ from some set $M$ to functions in a reproducing kernel Hilbert space (RKHS) with kernel $k$. The RKHS distance of two mapped measures is a semi-metric $d_k$ over $M$. We study three questions. (I) For a given kernel, what sets $M$ can be embedded? (II) When is the embedding injecti…

    Submitted 17 December, 2019; v1 submitted 18 April, 2016; originally announced April 2016.

    Comments: Old and longer version of the JMLR paper with same title (published 2018). Please start with the JMLR version. 55 pages (33 pages main text, 22 pages appendix), 2 tables, 1 figure (in appendix)

    MSC Class: G.3 ACM Class: G.3

    Journal ref: Journal of Machine Learning Research, 19(44):1-29, 2018

  36. arXiv:1603.00784  [pdf, other]

    math.ST

    The Arrow of Time in Multivariate Time Series

    Authors: Stefan Bauer, Bernhard Schölkopf, Jonas Peters

    Abstract: We prove that a time series satisfying a (linear) multivariate autoregressive moving average (VARMA) model satisfies the same model assumption in the reversed time direction, too, if all innovations are normally distributed. This reversibility breaks down if the innovations are non-Gaussian. This means that under the assumption of a VARMA process with non-Gaussian noise, the arrow of time becomes…

    Submitted 2 March, 2016; originally announced March 2016.

  37. arXiv:1603.00285  [pdf, ps, other]

    math.ST stat.ML

    Kernel-based Tests for Joint Independence

    Authors: Niklas Pfister, Peter Bühlmann, Bernhard Schölkopf, Jonas Peters

    Abstract: We investigate the problem of testing whether $d$ random variables, which may or may not be continuous, are jointly (or mutually) independent. Our method builds on ideas of the two variable Hilbert-Schmidt independence criterion (HSIC) but allows for an arbitrary number of variables. We embed the $d$-dimensional joint distribution and the product of the marginals into a reproducing kernel Hilbert…

    Submitted 4 November, 2016; v1 submitted 1 March, 2016; originally announced March 2016.

    Comments: 67 pages

  38. arXiv:1512.02057  [pdf, other]

    cond-mat.stat-mech math.ST quant-ph

    Algorithmic independence of initial condition and dynamical law in thermodynamics and causal inference

    Authors: Dominik Janzing, Rafael Chaves, Bernhard Schoelkopf

    Abstract: We postulate a principle stating that the initial condition of a physical system is typically algorithmically independent of the dynamical law. We argue that this links thermodynamics and causal inference. On the one hand, it entails behaviour that is similar to the usual arrow of time. On the other hand, it motivates a statistical asymmetry between cause and effect that has recently postulated in…

    Submitted 7 December, 2015; originally announced December 2015.

    Comments: 7 pages, latex, 2 figures

    Journal ref: New J. Phys. 18, 093052 (2016)

  39. arXiv:1502.02398  [pdf, other]

    stat.ML math.PR math.ST

    Towards a Learning Theory of Cause-Effect Inference

    Authors: David Lopez-Paz, Krikamol Muandet, Bernhard Schölkopf, Ilya Tolstikhin

    Abstract: We pose causal inference as the problem of learning to classify probability distributions. In particular, we assume access to a collection $\{(S_i,l_i)\}_{i=1}^n$, where each $S_i$ is a sample drawn from the probability distribution of $X_i \times Y_i$, and $l_i$ is a binary label indicating whether "$X_i \to Y_i$" or "$X_i \leftarrow Y_i$". Given these data, we build a causal inference rule in tw…

    Submitted 18 May, 2015; v1 submitted 9 February, 2015; originally announced February 2015.

  40. arXiv:1411.0900  [pdf, ps, other]

    stat.ML math.ST

    Kernel Mean Estimation via Spectral Filtering

    Authors: Krikamol Muandet, Bharath Sriperumbudur, Bernhard Schölkopf

    Abstract: The problem of estimating the kernel mean in a reproducing kernel Hilbert space (RKHS) is central to kernel methods in that it is used by classical approaches (e.g., when centering a kernel PCA matrix), and it also forms the core inference step of modern kernel methods (e.g., kernel-based non-parametric tests) that rely on embedding probability distributions in RKHSs. Muandet et al. (2014) has sho…

    Submitted 4 November, 2014; originally announced November 2014.

    Comments: To appear at the 28th Annual Conference on Neural Information Processing Systems (NIPS 2014). 16 pages

  41. arXiv:1306.0842  [pdf, ps, other]

    stat.ML cs.LG math.ST

    Kernel Mean Estimation and Stein's Effect

    Authors: Krikamol Muandet, Kenji Fukumizu, Bharath Sriperumbudur, Arthur Gretton, Bernhard Schölkopf

    Abstract: A mean function in reproducing kernel Hilbert space, or a kernel mean, is an important part of many applications ranging from kernel principal component analysis to Hilbert-space embedding of distributions. Given finite samples, an empirical average is the standard estimate for the true kernel mean. We show that this estimator can be improved via a well-known phenomenon in statistics called Stein'…

    Submitted 6 June, 2013; v1 submitted 4 June, 2013; originally announced June 2013.

    Comments: first draft

  42. arXiv:1205.1928  [pdf, ps, other]

    math.FA cs.LG

    The representer theorem for Hilbert spaces: a necessary and sufficient condition

    Authors: Francesco Dinuzzo, Bernhard Schölkopf

    Abstract: A family of regularization functionals is said to admit a linear representer theorem if every member of the family admits minimizers that lie in a fixed finite dimensional subspace. A recent characterization states that a general class of regularization functionals with differentiable regularizer admits a linear representer theorem if and only if the regularization term is a non-decreasing functio…

    Submitted 17 July, 2012; v1 submitted 9 May, 2012; originally announced May 2012.

  43. Quantifying causal influences

    Authors: Dominik Janzing, David Balduzzi, Moritz Grosse-Wentrup, Bernhard Schölkopf

    Abstract: Many methods for causal inference generate directed acyclic graphs (DAGs) that formalize causal relations between $n$ variables. Given the joint distribution on all these variables, the DAG contains all information about how intervening on one variable changes the distribution of the other $n-1$ variables. However, quantifying the causal influence of one variable on another one remains a nontrivia…

    Submitted 28 January, 2014; v1 submitted 29 March, 2012; originally announced March 2012.

    Comments: Published at http://dx.doi.org/10.1214/13-AOS1145 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOS-AOS1145

    Journal ref: Annals of Statistics 2013, Vol. 41, No. 5, 2324-2358

  44. arXiv:0907.5309  [pdf, ps, other]

    stat.ML math.ST

    Hilbert space embeddings and metrics on probability measures

    Authors: Bharath K. Sriperumbudur, Arthur Gretton, Kenji Fukumizu, Bernhard Schölkopf, Gert R. G. Lanckriet

    Abstract: A Hilbert space embedding for probability measures has recently been proposed, with applications including dimensionality reduction, homogeneity testing, and independence testing. This embedding represents any probability measure as a mean element in a reproducing kernel Hilbert space (RKHS). A pseudometric on the space of probability measures can be defined as the distance between distribution…

    Submitted 29 January, 2010; v1 submitted 30 July, 2009; originally announced July 2009.

    Comments: 48 pages

  45. arXiv:0810.4752  [pdf, other]

    stat.ML math.ST

    Statistical Learning Theory: Models, Concepts, and Results

    Authors: Ulrike von Luxburg, Bernhard Schoelkopf

    Abstract: Statistical learning theory provides the theoretical basis for many of today's machine learning algorithms. In this article we attempt to give a gentle, non-technical overview of the key ideas and insights of statistical learning theory. We target a broad audience, not necessarily machine learning researchers. This paper can serve as a starting point for people who want to get an overview o…

    Submitted 27 October, 2008; originally announced October 2008.

  46. arXiv:0804.3678  [pdf, ps, other]

    math.ST cs.IT stat.ML

    Causal inference using the algorithmic Markov condition

    Authors: Dominik Janzing, Bernhard Schoelkopf

    Abstract: Inferring the causal structure that links n observables is usually based upon detecting statistical dependences and choosing simple graphs that make the joint measure Markovian. Here we argue why causal inference is also possible when only single observations are present. We develop a theory of how to generate causal graphs explaining similarities between single objects. To this end, we replace t…

    Submitted 23 April, 2008; originally announced April 2008.

    Comments: 16 figures

    MSC Class: 62A01

  47. Kernel methods in machine learning

    Authors: Thomas Hofmann, Bernhard Schölkopf, Alexander J. Smola

    Abstract: We review machine learning methods employing positive definite kernels. These methods formulate learning and estimation problems in a reproducing kernel Hilbert space (RKHS) of functions defined on the data domain, expanded in terms of a kernel. Working in linear spaces of functions has the benefit of facilitating the construction and analysis of learning algorithms while at the same time allowin…

    Submitted 1 July, 2008; v1 submitted 30 January, 2007; originally announced January 2007.

    Comments: Published at http://dx.doi.org/10.1214/009053607000000677 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOS-AOS0290

    MSC Class: 30C40 (Primary) 68T05 (Secondary)

    Journal ref: Annals of Statistics 2008, Vol. 36, No. 3, 1171-1220

  48. Comment on "Support Vector Machines with Applications"

    Authors: Olivier Bousquet, Bernhard Schölkopf

    Abstract: Comment on "Support Vector Machines with Applications" [math.ST/0612817]

    Submitted 28 December, 2006; originally announced December 2006.

    Comments: Published at http://dx.doi.org/10.1214/088342306000000484 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-STS-STS153D

    Journal ref: Statistical Science 2006, Vol. 21, No. 3, 337-340