Showing 1–12 of 12 results for author: Buchholz, S

Searching in archive cs.
  1. arXiv:2402.09236  [pdf, other]

    cs.LG cs.AI math.ST stat.ML

    Learning Interpretable Concepts: Unifying Causal Representation Learning and Foundation Models

    Authors: Goutham Rajendran, Simon Buchholz, Bryon Aragam, Bernhard Schölkopf, Pradeep Ravikumar

    Abstract: To build intelligent machine learning systems, there are two broad approaches. One approach is to build inherently interpretable models, as endeavored by the growing field of causal representation learning. The other approach is to build highly-performant foundation models and then invest efforts into understanding how they work. In this work, we relate these two approaches and study how to learn…

    Submitted 14 February, 2024; originally announced February 2024.

    Comments: 36 pages

  2. arXiv:2306.02235  [pdf, other]

    cs.LG cs.AI math.ST stat.ME stat.ML

    Learning Linear Causal Representations from Interventions under General Nonlinear Mixing

    Authors: Simon Buchholz, Goutham Rajendran, Elan Rosenfeld, Bryon Aragam, Bernhard Schölkopf, Pradeep Ravikumar

    Abstract: We study the problem of learning causal representations from unknown, latent interventions in a general setting, where the latent distribution is Gaussian but the mixing function is completely general. We prove strong identifiability results given unknown single-node interventions, i.e., without having access to the intervention targets. This generalizes prior works which have focused on weaker cl…

    Submitted 18 December, 2023; v1 submitted 3 June, 2023; originally announced June 2023.

    Comments: Accepted as Oral paper at NeurIPS 2023

  3. arXiv:2305.17225  [pdf, other]

    stat.ML cs.AI cs.LG

    Causal Component Analysis

    Authors: Liang Wendong, Armin Kekić, Julius von Kügelgen, Simon Buchholz, Michel Besserve, Luigi Gresele, Bernhard Schölkopf

    Abstract: Independent Component Analysis (ICA) aims to recover independent latent variables from observed mixtures thereof. Causal Representation Learning (CRL) aims instead to infer causally related (thus often statistically dependent) latent variables, together with the unknown graph encoding their causal relationships. We introduce an intermediate problem termed Causal Component Analysis (CauCA). CauCA c…

    Submitted 17 January, 2024; v1 submitted 26 May, 2023; originally announced May 2023.

    Comments: NeurIPS 2023 final camera-ready version

  4. arXiv:2305.17161  [pdf, other]

    cs.LG

    Flow Matching for Scalable Simulation-Based Inference

    Authors: Maximilian Dax, Jonas Wildberger, Simon Buchholz, Stephen R. Green, Jakob H. Macke, Bernhard Schölkopf

    Abstract: Neural posterior estimation methods based on discrete normalizing flows have become established tools for simulation-based inference (SBI), but scaling them to high-dimensional problems can be challenging. Building on recent advances in generative modeling, we here present flow matching posterior estimation (FMPE), a technique for SBI using continuous normalizing flows. Like diffusion models, and…

    Submitted 27 October, 2023; v1 submitted 26 May, 2023; originally announced May 2023.

    Comments: NeurIPS 2023. Code available at https://github.com/dingo-gw/flow-matching-posterior-estimation

  5. arXiv:2305.17139  [pdf, other]

    cs.AI math.ST

    A Measure-Theoretic Axiomatisation of Causality

    Authors: Junhyung Park, Simon Buchholz, Bernhard Schölkopf, Krikamol Muandet

    Abstract: Causality is a central concept in a wide range of research areas, yet there is still no universally agreed axiomatisation of causality. We view causality both as an extension of probability theory and as a study of \textit{what happens when one intervenes on a system}, and argue in favour of taking Kolmogorov's measure-theoretic axiomatisation of probability as the starting point towards an axioma…

    Submitted 6 June, 2024; v1 submitted 19 May, 2023; originally announced May 2023.

  6. arXiv:2208.06406  [pdf, other]

    stat.ML cs.LG

    Function Classes for Identifiable Nonlinear Independent Component Analysis

    Authors: Simon Buchholz, Michel Besserve, Bernhard Schölkopf

    Abstract: Unsupervised learning of latent variable models (LVMs) is widely used to represent data in machine learning. When such models reflect the ground truth factors and the mechanisms mapping them to observations, there is reason to expect that they allow generalization in downstream tasks. It is however well known that such identifiability guaranties are typically not achievable without putting constra…

    Submitted 12 August, 2022; originally announced August 2022.

    Comments: 43 pages

    Journal ref: NeurIPS 2022

  7. arXiv:2206.08843  [pdf, other]

    cs.LG stat.ML

    AutoML Two-Sample Test

    Authors: Jonas M. Kübler, Vincent Stimper, Simon Buchholz, Krikamol Muandet, Bernhard Schölkopf

    Abstract: Two-sample tests are important in statistics and machine learning, both as tools for scientific discovery as well as to detect distribution shifts. This led to the development of many sophisticated test procedures going beyond the standard supervised learning frameworks, whose usage can require specialized knowledge about two-sample testing. We use a simple test that takes the mean discrepancy of…

    Submitted 15 January, 2023; v1 submitted 17 June, 2022; originally announced June 2022.

    Comments: NeurIPS 2022

  8. arXiv:2012.10244  [pdf, other]

    eess.SY cs.CE

    Time Aggregation Techniques Applied to a Capacity Expansion Model for Real-Life Sector Coupled Energy Systems

    Authors: Mette Gamst, Stefanie Buchholz, David Pisinger

    Abstract: Simulating energy systems is vital for energy planning to understand the effects of fluctuating renewable energy sources and integration of multiple energy sectors. Capacity expansion is a powerful tool for energy analysts and consists of simulating energy systems with the option of investing in new energy sources. In this paper, we apply clustering based aggregation techniques from the literature…

    Submitted 17 December, 2020; originally announced December 2020.

    ACM Class: G.4; G.2.3

  9. arXiv:2010.13158  [pdf, other]

    physics.geo-ph cs.SD eess.AS

    A "DIY" data acquisition system for acoustic field measurements under harsh conditions

    Authors: Steffen Büchholz, Mathias Lemke, Julius Reiss, Jörn Sesterhenn

    Abstract: Monitoring active volcanos is an ongoing and important task helping to understand and predict volcanic eruptions. In recent years, analysing the acoustic properties of eruptions became more relevant. We present an inexpensive, lightweight, portable, easy to use and modular acoustic data acquisition system for field measurements that can record data with up to 100 kHz. The system is based on a Rasp…

    Submitted 25 October, 2020; originally announced October 2020.

    Comments: 9 figures at the end

  10. arXiv:cs/0009008  [pdf, ps, other]

    cs.CL

    Introduction to the CoNLL-2000 Shared Task: Chunking

    Authors: Erik F. Tjong Kim Sang, Sabine Buchholz

    Abstract: We describe the CoNLL-2000 shared task: dividing text into syntactically related non-overlapping groups of words, so-called text chunking. We give background information on the data sets, present a general overview of the systems that have taken part in the shared task and briefly discuss their performance.

    Submitted 18 September, 2000; originally announced September 2000.

    Comments: 6 pages

    ACM Class: I.2.7

    Journal ref: Proceedings of CoNLL-2000 and LLL-2000, Lisbon, Portugal

  11. arXiv:cs/9906005  [pdf, ps, other]

    cs.CL cs.LG

    Memory-Based Shallow Parsing

    Authors: Walter Daelemans, Sabine Buchholz, Jorn Veenstra

    Abstract: We present a memory-based learning (MBL) approach to shallow parsing in which POS tagging, chunking, and identification of syntactic relations are formulated as memory-based modules. The experiments reported in this paper show competitive results, the F-value for the Wall Street Journal (WSJ) treebank is: 93.8% for NP chunking, 94.7% for VP chunking, 77.1% for subject detection and 79.0% for obj…

    Submitted 2 June, 1999; originally announced June 1999.

    Comments: 8 pages, to appear in: Proceedings of the EACL'99 workshop on Computational Natural Language Learning (CoNLL-99), Bergen, Norway, June 1999

    Report number: ILK-9907

    ACM Class: I.6.2; I.7.1

  12. arXiv:cs/9906004  [pdf, ps, other]

    cs.CL cs.LG

    Cascaded Grammatical Relation Assignment

    Authors: Sabine Buchholz, Jorn Veenstra, Walter Daelemans

    Abstract: In this paper we discuss cascaded Memory-Based grammatical relations assignment. In the first stages of the cascade, we find chunks of several types (NP,VP,ADJP,ADVP,PP) and label them with their adverbial function (e.g. local, temporal). In the last stage, we assign grammatical relations to pairs of chunks. We studied the effect of adding several levels to this cascaded classifier and we found…

    Submitted 2 June, 1999; originally announced June 1999.

    Comments: 8 pages, to appear in: proceedings of EMNLP/VLC-99, University of Maryland, USA, June 21-22, 1999

    Report number: ILK-9908

    ACM Class: I.6.2; I.7.1