Search | arXiv e-print repository

doi 10.3847/1538-3881/ac2c02

The HETDEX Instrumentation: Hobby-Eberly Telescope Wide Field Upgrade and VIRUS

Authors: Gary J. Hill, Hanshin Lee, Phillip J. MacQueen, Andreas Kelz, Niv Drory, Brian L. Vattiat, John M. Good, Jason Ramsey, Herman Kriel, Trent Peterson, D. L. DePoy, Karl Gebhardt, J. L. Marshall, Sarah E. Tuttle, Svend M. Bauer, Taylor S. Chonis, Maximilian H. Fabricius, Cynthia Froning, Marco Haeuser, Briana L. Indahl, Thomas Jahn, Martin Landriau, Ron Leck, Francesco Montesano, Travis Prochaska, et al. (24 additional authors not shown)

Abstract: The Hobby-Eberly Telescope (HET) Dark Energy Experiment (HETDEX) is undertaking a blind wide-field low-resolution spectroscopic survey of 540 square degrees of sky to identify and derive redshifts for a million Lyman-alpha emitting galaxies (LAEs) in the redshift range 1.9 < z < 3.5. The ultimate goal is to measure the expansion rate of the Universe at this epoch, to sharply constrain cosmological parameters and thus the nature of dark energy. A major multi-year wide field upgrade (WFU) of the HET was completed in 2016 that substantially increased the field of view to 22 arcminutes diameter and the pupil to 10 meters, by replacing the optical corrector, tracker, and prime focus instrument package and by developing a new telescope control system. The new, wide-field HET now feeds the Visible Integral-field Replicable Unit Spectrograph (VIRUS), a new low-resolution integral field spectrograph (LRS2), and the Habitable Zone Planet Finder (HPF), a precision near-infrared radial velocity spectrograph. VIRUS consists of 156 identical spectrographs fed by almost 35,000 fibers in 78 integral field units arrayed at the focus of the upgraded HET. VIRUS operates in a bandpass of 3500-5500 Angstroms with resolving power R~800. VIRUS is the first example of large scale replication applied to instrumentation in optical astronomy to achieve spectroscopic surveys of very large areas of sky.
This paper presents technical details of the HET WFU and VIRUS, as flowed-down from the HETDEX science requirements, along with experience from commissioning this major telescope upgrade and the innovative instrumentation suite for HETDEX.

Submitted 7 December, 2021; v1 submitted 7 October, 2021; originally announced October 2021.

Comments: 65 pages, 25 figures, published in the Astronomical Journal; replaced with final published version

Journal ref: AJ 162 298 (2021)

arXiv:2110.03628 [pdf, other]

Boxhead: A Dataset for Learning Hierarchical Representations

Authors: Yukun Chen, Andrea Dittadi, Frederik Träuble, Stefan Bauer, Bernhard Schölkopf

Abstract: Disentanglement is hypothesized to be beneficial towards a number of downstream tasks. However, a common assumption in learning disentangled representations is that the data generative factors are statistically independent. As current methods are almost solely evaluated on toy datasets where this ideal assumption holds, we investigate their performance in hierarchical settings, a relevant feature of real-world data. In this work, we introduce Boxhead, a dataset with hierarchically structured ground-truth generative factors. We use this novel dataset to evaluate the performance of state-of-the-art autoencoder-based disentanglement models and observe that hierarchical models generally outperform single-layer VAEs in terms of disentanglement of hierarchically arranged factors.

Submitted 6 December, 2021; v1 submitted 7 October, 2021; originally announced October 2021.

Comments: NeurIPS 2021 Workshop on Shared Visual Representations in Human and Machine Intelligence (SVRHM 2021)

arXiv:2110.01554 [pdf, other]

The winter dilemma

Authors: Sebastian Contreras, Philipp Dönges, Joel Wagner, Simon Bauer, Sebastian B. Mohr, Emil N. Iftekhar, Mirjam Kretzschmar, Michael Maes, Kai Nagel, André Calero Valdez, Viola Priesemann

Abstract: With winter coming in the northern hemisphere, disadvantageous seasonality of SARS-CoV-2 requires high immunity levels in the population or increasing non-pharmaceutical interventions (NPIs), compared to summer. Otherwise intensive care units (ICUs) might fill up. However, compliance with mandatory NPIs, vaccine uptake, and individual protective measures depend on individuals' opinions and behavior. Opinions, in turn, depend on information, e.g., about vaccine safety or current infection levels. Therefore, understanding how information about the pandemic affects its spread through the modulation of voluntary protection-seeking behaviors is crucial for better preparedness this winter and for future crises.

Submitted 15 October, 2021; v1 submitted 4 October, 2021; originally announced October 2021.

Comments: Estimation of COVID-19 case numbers for the coming winter

arXiv:2109.10957 [pdf, other]

Real Robot Challenge: A Robotics Competition in the Cloud

Authors: Stefan Bauer, Felix Widmaier, Manuel Wüthrich, Annika Buchholz, Sebastian Stark, Anirudh Goyal, Thomas Steinbrenner, Joel Akpo, Shruti Joshi, Vincent Berenz, Vaibhav Agrawal, Niklas Funk, Julen Urain De Jesus, Jan Peters, Joe Watson, Claire Chen, Krishnan Srinivasan, Junwu Zhang, Jeffrey Zhang, Matthew R. Walter, Rishabh Madan, Charles Schaff, Takahiro Maeda, Takuma Yoneda, Denis Yarats, et al. (17 additional authors not shown)

Abstract: Dexterous manipulation remains an open problem in robotics. To coordinate efforts of the research community towards tackling this problem, we propose a shared benchmark. We designed and built robotic platforms that are hosted at MPI for Intelligent Systems and can be accessed remotely. Each platform consists of three robotic fingers that are capable of dexterous object manipulation. Users are able to control the platforms remotely by submitting code that is executed automatically, akin to a computational cluster. Using this setup, i) we host robotics competitions, where teams from anywhere in the world access our platforms to tackle challenging tasks ii) we publish the datasets collected during these competitions (consisting of hundreds of robot hours), and iii) we give researchers access to these platforms for their own projects.

Submitted 10 June, 2022; v1 submitted 22 September, 2021; originally announced September 2021.

arXiv:2109.02429 [pdf, other]

Learning Neural Causal Models with Active Interventions

Authors: Nino Scherrer, Olexa Bilaniuk, Yashas Annadani, Anirudh Goyal, Patrick Schwab, Bernhard Schölkopf, Michael C. Mozer, Yoshua Bengio, Stefan Bauer, Nan Rosemary Ke

Abstract: Discovering causal structures from data is a challenging inference problem of fundamental importance in all areas of science. The appealing properties of neural networks have recently led to a surge of interest in differentiable neural network-based methods for learning causal structures from data. So far, differentiable causal discovery has focused on static datasets of observational or fixed interventional origin. In this work, we introduce an active intervention targeting (AIT) method which enables a quick identification of the underlying causal structure of the data-generating process. Our method significantly reduces the required number of interactions compared with random intervention targeting and is applicable for both discrete and continuous optimization formulations of learning the underlying directed acyclic graph (DAG) from data. We examine the proposed method across multiple frameworks in a wide range of settings and demonstrate superior performance on multiple benchmarks from simulated to real-world data.

Submitted 5 March, 2022; v1 submitted 6 September, 2021; originally announced September 2021.

arXiv:2108.10018 [pdf, other]

Mutational signatures and transmissibility of SARS-CoV-2 Gamma and Lambda variants

Authors: Karen Y. Oróstica, Sebastian Contreras, Sebastian B. Mohr, Jonas Dehning, Simon Bauer, David Medina-Ortiz, Emil N. Iftekhar, Karen Mujica, Paulo C. Covarrubias, Soledad Ulloa, Andrés E. Castillo, Ricardo A. Verdugo, Jorge Fernández, Álvaro Olivera-Nappa, Viola Priesemann

Abstract: The emergence of SARS-CoV-2 variants of concern endangers the long-term control of COVID-19, especially in countries with limited genomic surveillance. In this work, we explored genomic drivers of contagion in Chile. We sequenced 3443 SARS-CoV-2 genomes collected between January and July 2021, where the Gamma (P.1), Lambda (C.37), Alpha (B.1.1.7), B.1.1.348, and B.1.1 lineages were predominant. Using a Bayesian model tailored for limited genomic surveillance, we found that Lambda and Gamma variants' reproduction numbers were about 5% and 16% larger than Alpha's, respectively. We observed an overabundance of mutations in the Spike gene, strongly correlated with the variant's transmissibility. Furthermore, the variants' mutational signatures featured a breakpoint concurrent with the beginning of vaccination (mostly CoronaVac, an inactivated virus vaccine), indicating an additional putative selective pressure. Thus, our work provides a reliable method for quantifying novel variants' transmissibility under subsampling (as newly-reported Delta, B.1.617.2) and highlights the importance of continuous genomic surveillance.

Submitted 23 August, 2021; originally announced August 2021.

arXiv:2108.09779 [pdf, other]

doi 10.1109/IROS47612.2022.9981458

Transferring Dexterous Manipulation from GPU Simulation to a Remote Real-World TriFinger

Authors: Arthur Allshire, Mayank Mittal, Varun Lodaya, Viktor Makoviychuk, Denys Makoviichuk, Felix Widmaier, Manuel Wüthrich, Stefan Bauer, Ankur Handa, Animesh Garg

Abstract: We present a system for learning a challenging dexterous manipulation task involving moving a cube to an arbitrary 6-DoF pose with only 3-fingers trained with NVIDIA's IsaacGym simulator. We show empirical benefits, both in simulation and sim-to-real transfer, of using keypoints as opposed to position+quaternion representations for the object pose in 6-DoF for policy observations and in reward calculation to train a model-free reinforcement learning agent. By utilizing domain randomization strategies along with the keypoint representation of the pose of the manipulated object, we achieve a high success rate of 83% on a remote TriFinger system maintained by the organizers of the Real Robot Challenge. With the aim of assisting further research in learning in-hand manipulation, we make the codebase of our system, along with trained checkpoints that come with billions of steps of experience available, at https://s2r2-ig.github.io

Submitted 20 October, 2022; v1 submitted 22 August, 2021; originally announced August 2021.

Comments: International Conference on Intelligent Robots and Systems (IROS 2022)

arXiv:2108.06639 [pdf]

On the carrier transport and radiative recombination mechanisms in tunneling injection quantum dot lasers

Authors: V. Mikhelashvili, S. Bauer, I. Khanonkin, O. Eyal, G. Seri, L. Gal, J. P. Reithmaier, G. Eisenstein

Abstract: We report temperature-dependent current-voltage (I - V - T) and output light power-voltage or current (P - V - T) or (P - I - T) characteristics of 1550 nm tunneling injection quantum dot (TI QD) laser diodes. Experimental data is accompanied by physical models that distinguish between different current flow and light emission mechanisms for different applied voltages and temperature ranges. Three exponential regimes in the I - V characteristics were identified for low bias levels where no optical radiation takes place. At the lowest bias levels, the diffusion-recombination mechanism based on the classical Shockley-Read-Hall theory dominates. This is followed, at low and near room temperature, by a combination of weak tunneling and generation-recombination, respectively. In the third exponential region, for all temperatures carrier transport is dictated by strong tunneling, which is characterized by a temperature-independent slope of the I - V curves and a variable ideality factor. The I - V results were compared to a conventional QD laser in which the current flow mechanisms of the first and third types are absent, which clearly demonstrates the key role played by the TI layer. In the post-exponential voltage range, when the diodes are in the high injection regime, the characteristics of the two types of diodes are identical. The typical behavior at the threshold current, where the output power increases fast, has a clear signature in the I - V characteristics.
Finally, an analytical quantitative relationship is established between the light output power and the applied voltage and current as well as the carrier density participating in radiative recombination.

Submitted 14 August, 2021; originally announced August 2021.

arXiv:2107.13371 [pdf, other]

Thin-Film InGaAs Metamorphic Buffer for telecom C-band InAs Quantum Dots and Optical Resonators on GaAs Platform

Authors: Robert Sittig, Cornelius Nawrath, Sascha Kolatschek, Stephanie Bauer, Richard Schaber, Jiasheng Huang, Ponraj Vijayan, Simone Luca Portalupi, Michael Jetter, Peter Michler

Abstract: The GaAs-based material system is well-known for the implementation of InAs quantum dots (QDs) with outstanding optical properties. However, these dots typically emit at a wavelength of around 900nm. The insertion of a metamorphic buffer (MMB) can shift the emission to the technologically attractive telecom C-band range centered at 1550nm. However, the thickness of common MMB designs limits their compatibility with most photonic resonator types. Here we report on the MOVPE growth of a novel InGaAs MMB with a non-linear indium content grading profile designed to maximize plastic relaxation within minimal layer thickness. Single-photon emission at 1550nm from InAs QDs deposited on top of this thin-film MMB is demonstrated. The strength of the new design is proven by integrating it into a bullseye cavity via nano-structuring techniques. The presented advances in the epitaxial growth of QD/MMB structures form the basis for the fabrication of high-quality telecom non-classical light sources as a key component of photonic quantum technologies.

Submitted 2 August, 2021; v1 submitted 28 July, 2021; originally announced July 2021.

arXiv:2107.05686 [pdf, other]

The Role of Pretrained Representations for the OOD Generalization of Reinforcement Learning Agents

Authors: Andrea Dittadi, Frederik Träuble, Manuel Wüthrich, Felix Widmaier, Peter Gehler, Ole Winther, Francesco Locatello, Olivier Bachem, Bernhard Schölkopf, Stefan Bauer

Abstract: Building sample-efficient agents that generalize out-of-distribution (OOD) in real-world settings remains a fundamental unsolved problem on the path towards achieving higher-level cognition. One particularly promising approach is to begin with low-dimensional, pretrained representations of our world, which should facilitate efficient downstream learning and generalization. By training 240 representations and over 10,000 reinforcement learning (RL) policies on a simulated robotic setup, we evaluate to what extent different properties of pretrained VAE-based representations affect the OOD generalization of downstream agents. We observe that many agents are surprisingly robust to realistic distribution shifts, including the challenging sim-to-real case. In addition, we find that the generalization performance of a simple downstream proxy task reliably predicts the generalization performance of our RL agents under a wide range of OOD settings. Such proxy tasks can thus be used to select pretrained representations that will lead to agents that generalize.

Submitted 16 April, 2022; v1 submitted 12 July, 2021; originally announced July 2021.

Comments: Published at ICLR 2022

arXiv:2107.03316 [pdf, other]

doi 10.1021/acs.nanolett.1c02647

Bright Purcell enhanced single-photon source in the telecom O-band based on a quantum dot in a circular Bragg grating

Authors: Sascha Kolatschek, Cornelius Nawrath, Stephanie Bauer, Jiasheng Huang, Julius Fischer, Robert Sittig, Michael Jetter, Simone L. Portalupi, Peter Michler

Abstract: The combination of semiconductor quantum dots (QDs) with photonic cavities is a promising way to realize non-classical light sources with state-of-the-art performances in terms of brightness, indistinguishability and repetition rate. In the present work we demonstrate the coupling of an InGaAs/GaAs QD emitting in the telecom O-band to a circular Bragg grating cavity. We demonstrate a broadband geometric extraction efficiency enhancement by investigating two emission lines under above-band excitation, inside and detuned from the cavity mode, respectively. In the first case, a Purcell enhancement of 4 is attained. For the latter case, an end-to-end brightness of 1.4% with a brightness at the first lens of 23% is achieved. Using p-shell pumping, a combination of high count rate with pure single-photon emission (g(2)(0) = 0.01 in saturation) is achieved. Finally a good single-photon purity (g(2)(0) = 0.13) together with a high detector count rate of 191kcps is demonstrated for a temperature of up to 77K.

Submitted 7 July, 2021; originally announced July 2021.

arXiv:2107.01670 [pdf]

doi 10.1016/j.lanepe.2021.100185

A look into the future of the COVID-19 pandemic in Europe: an expert consultation

Authors: Emil Nafis Iftekhar, Viola Priesemann, Rudi Balling, Simon Bauer, Philippe Beutels, André Calero Valdez, Sarah Cuschieri, Thomas Czypionka, Uga Dumpis, Enrico Glaab, Eva Grill, Claudia Hanson, Pirta Hotulainen, Peter Klimek, Mirjam Kretzschmar, Tyll Krüger, Jenny Krutzinna, Nicola Low, Helena Machado, Carlos Martins, Martin McKee, Sebastian Bernd Mohr, Armin Nassehi, Matjaž Perc, Elena Petelos, et al. (9 additional authors not shown)

Abstract: How will the coronavirus disease 2019 (COVID-19) pandemic develop in the coming months and years? Based on an expert survey, we examine key aspects that are likely to influence COVID-19 in Europe. The future challenges and developments will strongly depend on the progress of national and global vaccination programs, the emergence and spread of variants of concern (VOCs), and public responses to nonpharmaceutical interventions (NPIs). In the short term, many people are still unvaccinated, VOCs continue to emerge and spread, and mobility and population mixing are expected to increase over the summer. Therefore, policies that lift restrictions too much and too early risk another damaging wave. This challenge remains despite the reduced opportunities for transmission due to vaccination progress and reduced indoor mixing in the summer. In autumn 2021, increased indoor activity might accelerate the spread again, but a necessary reintroduction of NPIs might be too slow. The incidence may strongly rise again, possibly filling intensive care units, if vaccination levels are not high enough. A moderate, adaptive level of NPIs will thus remain necessary. These epidemiological aspects are put into perspective with the economic, social, and health-related consequences and thereby provide a holistic perspective on the future of COVID-19.

Submitted 23 July, 2021; v1 submitted 4 July, 2021; originally announced July 2021.

Comments: Manuscript is accepted by The Lancet Regional Health - Europe as a Viewpoint article. Supplementary material can be accessed here: https://owncloud.gwdg.de/index.php/f/1439962756

Journal ref: Lancet Reg. Health Eur. 8, 100185 (2021)

arXiv:2107.00848 [pdf, other]

Systematic Evaluation of Causal Discovery in Visual Model Based Reinforcement Learning

Authors: Nan Rosemary Ke, Aniket Didolkar, Sarthak Mittal, Anirudh Goyal, Guillaume Lajoie, Stefan Bauer, Danilo Rezende, Yoshua Bengio, Michael Mozer, Christopher Pal

Abstract: Inducing causal relationships from observations is a classic problem in machine learning. Most work in causality starts from the premise that the causal variables themselves are observed. However, for AI agents such as robots trying to make sense of their environment, the only observables are low-level variables like pixels in images. To generalize well, an agent must induce high-level variables, particularly those which are causal or are affected by causal variables. A central goal for AI and causality is thus the joint discovery of abstract representations and causal structure. However, we note that existing environments for studying causal induction are poorly suited for this objective because they have complicated task-specific causal graphs which are impossible to manipulate parametrically (e.g., number of nodes, sparsity, causal chain length, etc.). In this work, our goal is to facilitate research in learning representations of high-level variables as well as causal structures among them. In order to systematically probe the ability of methods to identify these variables and structures, we design a suite of benchmarking RL environments. We evaluate various representation learning algorithms from the literature and find that explicitly incorporating structure and modularity in models can help causal induction in model-based reinforcement learning.

Submitted 2 July, 2021; originally announced July 2021.

arXiv:2106.16091 [pdf, other]

Exploring the Latent Space of Autoencoders with Interventional Assays

Authors: Felix Leeb, Stefan Bauer, Michel Besserve, Bernhard Schölkopf

Abstract: Autoencoders exhibit impressive abilities to embed the data manifold into a low-dimensional latent space, making them a staple of representation learning methods. However, without explicit supervision, which is often unavailable, the representation is usually uninterpretable, making analysis and principled progress challenging. We propose a framework, called latent responses, which exploits the locally contractive behavior exhibited by variational autoencoders to explore the learned manifold. More specifically, we develop tools to probe the representation using interventions in the latent space to quantify the relationships between latent variables. We extend the notion of disentanglement to take the learned generative process into account and consequently avoid the limitations of existing metrics that may rely on spurious correlations. Our analyses underscore the importance of studying the causal structure of the representation to improve performance on downstream tasks such as generation, interpolation, and inference of the factors of variation.

Submitted 11 January, 2023; v1 submitted 30 June, 2021; originally announced June 2021.

Comments: Published in NeurIPS 2022 Conference Proceedings

arXiv:2106.07635 [pdf, other]

Variational Causal Networks: Approximate Bayesian Inference over Causal Structures

Authors: Yashas Annadani, Jonas Rothfuss, Alexandre Lacoste, Nino Scherrer, Anirudh Goyal, Yoshua Bengio, Stefan Bauer

Abstract: Learning the causal structure that underlies data is a crucial step towards robust real-world decision making. The majority of existing work in causal inference focuses on determining a single directed acyclic graph (DAG) or a Markov equivalence class thereof. However, acting intelligently upon knowledge about causal structure which has been inferred from finite data demands reasoning about its uncertainty. For instance, planning interventions to find out more about the causal mechanisms that govern our data requires quantifying epistemic uncertainty over DAGs. While Bayesian causal inference allows one to do so, the posterior over DAGs becomes intractable even for a small number of variables. Aiming to overcome this issue, we propose a form of variational inference over the graphs of Structural Causal Models (SCMs). To this end, we introduce a parametric variational family modelled by an autoregressive distribution over the space of discrete DAGs. Its number of parameters does not grow exponentially with the number of variables and can be tractably learned by maximising an Evidence Lower Bound (ELBO). In our experiments, we demonstrate that the proposed variational posterior is able to provide a good approximation of the true posterior.

Submitted 14 June, 2021; originally announced June 2021.

Comments: 10 pages, 6 figures

arXiv:2105.14257 [pdf, other]

Diffusion-Based Representation Learning

Authors: Korbinian Abstreiter, Sarthak Mittal, Stefan Bauer, Bernhard Schölkopf, Arash Mehrjou

Abstract: Diffusion-based methods represented as stochastic differential equations on a continuous-time domain have recently proven successful as a non-adversarial generative model. Training such models relies on denoising score matching, which can be seen as multi-scale denoising autoencoders. Here, we augment the denoising score matching framework to enable representation learning without any supervised signal. GANs and VAEs learn representations by directly transforming latent codes to data samples. In contrast, the introduced diffusion-based representation learning relies on a new formulation of the denoising score matching objective and thus encodes the information needed for denoising. We illustrate how this difference allows for manual control of the level of details encoded in the representation. Using the same approach, we propose to learn an infinite-dimensional latent code that achieves improvements of state-of-the-art models on semi-supervised image classification. We also compare the quality of learned representations of diffusion score matching with other methods like autoencoder and contrastively trained systems through their performances on downstream tasks.

Submitted 1 August, 2022; v1 submitted 29 May, 2021; originally announced May 2021.

arXiv:2105.02087 [pdf, other]

doi 10.1109/LRA.2021.3129139

Benchmarking Structured Policies and Policy Optimization for Real-World Dexterous Object Manipulation

Authors: Niklas Funk, Charles Schaff, Rishabh Madan, Takuma Yoneda, Julen Urain De Jesus, Joe Watson, Ethan K. Gordon, Felix Widmaier, Stefan Bauer, Siddhartha S. Srinivasa, Tapomayukh Bhattacharjee, Matthew R. Walter, Jan Peters

Abstract: Dexterous manipulation is a challenging and important problem in robotics. While data-driven methods are a promising approach, current benchmarks require simulation or extensive engineering support due to the sample inefficiency of popular methods. We present benchmarks for the TriFinger system, an open-source robotic platform for dexterous manipulation and the focus of the 2020 Real Robot Challenge. The benchmarked methods, which were successful in the challenge, can be generally described as structured policies, as they combine elements of classical robotics and modern policy optimization. This inclusion of inductive biases facilitates sample efficiency, interpretability, reliability and high performance. The key aspects of this benchmarking is validation of the baselines across both simulation and the real system, thorough ablation study over the core features of each solution, and a retrospective analysis of the challenge as a manipulation benchmark. The code and demo videos for this work can be found on our website (https://sites.google.com/view/benchmark-rrc).

Submitted 8 December, 2021; v1 submitted 5 May, 2021; originally announced May 2021.

Journal ref: IEEE Robotics and Automation Letters 7 (2022) 478-485

arXiv:2103.15561 [pdf, other]

Pyfectious: An individual-level simulator to discover optimal containment polices for epidemic diseases

Authors: Arash Mehrjou, Ashkan Soleymani, Amin Abyaneh, Samir Bhatt, Bernhard Schölkopf, Stefan Bauer

Abstract: Simulating the spread of infectious diseases in human communities is critical for predicting the trajectory of an epidemic and verifying various policies to control the devastating impacts of the outbreak. Many existing simulators are based on compartment models that divide people into a few subsets and simulate the dynamics among those subsets using hypothesized differential equations. However, these models lack the requisite granularity to study the effect of intelligent policies that influence every individual in a particular way. In this work, we introduce a simulator software capable of modeling a population structure and controlling the disease's propagation at an individualistic level. In order to estimate the confidence of the conclusions drawn from the simulator, we employ a comprehensive probabilistic approach where the entire population is constructed as a hierarchical random variable. This approach makes the inferred conclusions more robust against sampling artifacts and gives confidence bounds for decisions based on the simulation results. To showcase potential applications, the simulator parameters are set based on the formal statistics of the COVID-19 pandemic, and the outcome of a wide range of control measures is investigated. Furthermore, the simulator is used as the environment of a reinforcement learning problem to find the optimal policies to control the pandemic.
The obtained experimental results indicate the simulator's adaptability and capacity in making sound predictions and a successful policy derivation example based on real-world data. As an exemplary application, our results show that the proposed policy discovery method can lead to control measures that produce significantly fewer infected individuals in the population and protect the health system against saturation.

Submitted 20 April, 2021; v1 submitted 24 March, 2021; originally announced March 2021.

arXiv:2103.11175 [pdf, other]

NCoRE: Neural Counterfactual Representation Learning for Combinations of Treatments

Authors: Sonali Parbhoo, Stefan Bauer, Patrick Schwab

Abstract: Estimating an individual's potential response to interventions from observational data is of high practical relevance for many domains, such as healthcare, public policy or economics. In this setting, it is often the case that combinations of interventions may be applied simultaneously, for example, multiple prescriptions in healthcare or different fiscal and monetary measures in economics. However, existing methods for counterfactual inference are limited to settings in which actions are not used simultaneously. Here, we present Neural Counterfactual Relation Estimation (NCoRE), a new method for learning counterfactual representations in the combination treatment setting that explicitly models cross-treatment interactions. NCoRE is based on a novel branched conditional neural representation that includes learnt treatment interaction modulators to infer the potential causal generative process underlying the combination of multiple treatments. Our experiments show that NCoRE significantly outperforms existing state-of-the-art methods for counterfactual treatment effect estimation that do not account for the effects of combining multiple treatments across several synthetic, semi-synthetic and real-world benchmarks.

Submitted 20 March, 2021; originally announced March 2021.

arXiv:2103.08877 [pdf, other]

Spatial Dependency Networks: Neural Layers for Improved Generative Image Modeling

Authors: Đorđe Miladinović, Aleksandar Stanić, Stefan Bauer, Jürgen Schmidhuber, Joachim M. Buhmann

Abstract: How to improve generative modeling by better exploiting spatial regularities and coherence in images? We introduce a novel neural network for building image generators (decoders) and apply it to variational autoencoders (VAEs). In our spatial dependency networks (SDNs), feature maps at each level of a deep neural net are computed in a spatially coherent way, using a sequential gating-based mechanism that distributes contextual information across 2-D space. We show that augmenting the decoder of a hierarchical VAE by spatial dependency layers considerably improves density estimation over baseline convolutional architectures and the state-of-the-art among the models within the same class. Furthermore, we demonstrate that SDN can be applied to large images by synthesizing samples of high quality and coherence. In a vanilla VAE setting, we find that a powerful SDN decoder also improves learning disentangled representations, indicating that neural architectures play an important role in this task. Our results suggest favoring spatial dependency over convolutional layers in various VAE settings. The accompanying source code is given at https://github.com/djordjemila/sdn.

Submitted 16 March, 2021; originally announced March 2021.

Journal ref: International Conference on Learning Representations (2021)

arXiv:2103.06228 [pdf, other]

doi 10.1371/journal.pcbi.1009288

Relaxing restrictions at the pace of vaccination increases freedom and guards against further COVID-19 waves

Authors: Simon Bauer, Sebastian Contreras, Jonas Dehning, Matthias Linden, Emil Iftekhar, Sebastian B. Mohr, Álvaro Olivera-Nappa, Viola Priesemann

Abstract: Mass vaccination offers a promising exit strategy for the COVID-19 pandemic. However, as vaccination progresses, demands to lift restrictions increase, despite most of the population remaining susceptible. Using our age-stratified SEIRD-ICU compartmental model and curated epidemiological and vaccination data, we quantified the rate (relative to vaccination progress) at which countries can lift non-pharmaceutical interventions without overwhelming their healthcare systems. We analyzed scenarios ranging from immediately lifting restrictions (accepting high mortality and morbidity) to reducing case numbers to a level where test-trace-and-isolate (TTI) programs efficiently compensate for local spreading events. In general, the age-dependent vaccination roll-out implies a transient decrease of more than ten years in the average age of ICU patients and deceased. The pace of vaccination determines the speed of lifting restrictions; taking the European Union (EU) as an example case, all considered scenarios allow for steadily increasing contacts starting in May 2021 and relaxing most restrictions by autumn 2021. Throughout summer 2021, only mild contact restrictions will remain necessary. However, only high vaccine uptake can prevent further severe waves. Across EU countries, seroprevalence impacts the long-term success of vaccination campaigns more strongly than age demographics.
In addition, we highlight the need for preventive measures to reduce contagion in school settings throughout the year 2021, where children might be drivers of contagion because of them remaining susceptible...

Submitted 15 July, 2021; v1 submitted 10 March, 2021; originally announced March 2021.

Journal ref: PLoS Comput Biol 17(9): e1009288 (2021)

arXiv:2103.04755 [pdf, other]

doi 10.1088/1748-0221/16/08/T08015

The Design, Construction, and Commissioning of the KATRIN Experiment

Authors: M. Aker, K. Altenmüller, J. F. Amsbaugh, M. Arenz, M. Babutzka, J. Bast, S. Bauer, H. Bechtler, M. Beck, A. Beglarian, J. Behrens, B. Bender, R. Berendes, A. Berlev, U. Besserer, C. Bettin, B. Bieringer, K. Blaum, F. Block, S. Bobien, J. Bohn, K. Bokeloh, H. Bolz, B. Bornschein, L. Bornschein , et al. (204 additional authors not shown)

Abstract: The KArlsruhe TRItium Neutrino (KATRIN) experiment, which aims to make a direct and model-independent determination of the absolute neutrino mass scale, is a complex experiment with many components. More than 15 years ago, we published a technical design report (TDR) [https://publikationen.bibliothek.kit.edu/270060419] to describe the hardware design and requirements to achieve our sensitivity goal of 0.2 eV at 90% C.L. on the neutrino mass. Since then there has been considerable progress, culminating in the publication of first neutrino mass results with the entire beamline operating [arXiv:1909.06048]. In this paper, we document the current state of all completed beamline components (as of the first neutrino mass measurement campaign), demonstrate our ability to reliably and stably control them over long times, and present details on their respective commissioning campaigns.

Submitted 11 June, 2021; v1 submitted 5 March, 2021; originally announced March 2021.

Comments: Added missing acknowledgement, corrected performance statement in chapter 4.2.5, updated author list and references

arXiv:2102.11107 [pdf, other]

Towards Causal Representation Learning

Authors: Bernhard Schölkopf, Francesco Locatello, Stefan Bauer, Nan Rosemary Ke, Nal Kalchbrenner, Anirudh Goyal, Yoshua Bengio

Abstract: The two fields of machine learning and graphical causality arose and developed separately. However, there is now cross-pollination and increasing interest in both fields to benefit from the advances of the other. In the present paper, we review fundamental concepts of causal inference and relate them to crucial open problems of machine learning, including transfer and generalization, thereby assaying how causality can contribute to modern machine learning research. This also applies in the opposite direction: we note that most work in causality starts from the premise that the causal variables are given. A central problem for AI and causality is, thus, causal representation learning, the discovery of high-level causal variables from low-level observations. Finally, we delineate some implications of causality for machine learning and propose key research areas at the intersection of both communities.

Submitted 22 February, 2021; originally announced February 2021.

Comments: Special Issue of Proceedings of the IEEE - Advances in Machine Learning and Deep Neural Networks

arXiv:2102.11024 [pdf]

doi 10.1038/s41524-020-00472-7

Convolutional neural network-assisted recognition of nanoscale L12 ordered structures in face-centred cubic alloys

Authors: Yue Li, Xuyang Zhou, Timoteo Colnaghi, Ye Wei, Andreas Marek, Hongxiang Li, Stefan Bauer, Markus Rampp, Leigh Stephenson

Abstract: Nanoscale L12-type ordered structures are widely used in face-centred cubic (FCC) alloys to exploit their hardening capacity and thereby improve mechanical properties. These fine-scale particles are typically fully coherent with matrix with the same atomic configuration disregarding chemical species, which makes them challenging to be characterized. Spatial distribution maps (SDMs) are used to probe local order by interrogating the three-dimensional (3D) distribution of atoms within reconstructed atom probe tomography (APT) data. However, it is almost impossible to manually analyse the complete point cloud ($>10$ million) in search for the partial crystallographic information retained within the data. Here, we proposed an intelligent L12-ordered structure recognition method based on convolutional neural networks (CNNs). The SDMs of a simulated L12-ordered structure and the FCC matrix were firstly generated. These simulated images combined with a small amount of experimental data were used to train a CNN-based L12-ordered structure recognition model. Finally, the approach was successfully applied to reveal the 3D distribution of L12-type $δ^\prime$-Al3(LiMg) nanoparticles with an average radius of 2.54 nm in a FCC Al-Li-Mg system. The minimum radius of detectable nanodomain is even down to 5 Å. The proposed CNN-APT method is promising to be extended to recognize other nanoscale ordered structures and even more-challenging short-range ordered phenomena in the near future.

Submitted 16 February, 2021; originally announced February 2021.

Comments: 41 pages, 7 figures, 5 supplementary figures

Journal ref: NPJ Computational Materials 7, 8 (2021)

arXiv:2102.07257 [pdf, other]

An Algorithm for Reconstructing the Orphan Stream Progenitor with MilkyWay@home Volunteer Computing

Authors: Siddhartha Shelton, Heidi Jo Newberg, Jake Weiss, Jacob S. Bauer, Matthew Arsenault, Larry Widrow, Clayton Rayment, Travis Desell, Roland Judd, Malik Magdon-Ismail, Eric Mendelsohn, Matthew Newby, Colin Rice, Boleslaw K. Szymanski, Jeffery M. Thompson, Carlos Varela, Benjamin Willett, Steve Ulin, Lee Newberg

Abstract: We have developed a method for estimating the properties of the progenitor dwarf galaxy from the tidal stream of stars that were ripped from it as it fell into the Milky Way. In particular, we show that the mass and radial profile of a progenitor dwarf galaxy evolved along the orbit of the Orphan Stream, including the stellar and dark matter components, can be reconstructed from the distribution of stars in the tidal stream it produced. We use MilkyWay@home, a PetaFLOPS-scale distributed supercomputer, to optimize our dwarf galaxy parameters until we arrive at best-fit parameters. The algorithm fits the dark matter mass, dark matter radius, stellar mass, radial profile of stars, and orbital time. The parameters are recovered even though the dark matter component extends well past the half light radius of the dwarf galaxy progenitor, proving that we are able to extract information about the dark matter halos of dwarf galaxies from the tidal debris. Our simulations assumed that the Milky Way potential, dwarf galaxy orbit, and the form of the density model for the dwarf galaxy were known exactly; more work is required to evaluate the sources of systematic error in fitting real data. This method can be used to estimate the dark matter content in dwarf galaxies without the assumption of virial equilibrium that is required to estimate the mass using line-of-sight velocities. This demonstration is a first step towards building an infrastructure that will fit the Milky Way potential using multiple tidal streams.

Submitted 14 February, 2021; originally announced February 2021.

Comments: 25 pages, 5 figures, to be submitted to ApJS

arXiv:2012.03769 [pdf, other]

Overcoming Barriers to Data Sharing with Medical Image Generation: A Comprehensive Evaluation

Authors: August DuMont Schütte, Jürgen Hetzel, Sergios Gatidis, Tobias Hepp, Benedikt Dietz, Stefan Bauer, Patrick Schwab

Abstract: Privacy concerns around sharing personally identifiable information are a major practical barrier to data sharing in medical research. However, in many cases, researchers have no interest in a particular individual's information but rather aim to derive insights at the level of cohorts. Here, we utilize Generative Adversarial Networks (GANs) to create derived medical imaging datasets consisting entirely of synthetic patient data. The synthetic images ideally have, in aggregate, similar statistical properties to those of a source dataset but do not contain sensitive personal information. We assess the quality of synthetic data generated by two GAN models for chest radiographs with 14 different radiology findings and brain computed tomography (CT) scans with six types of intracranial hemorrhages. We measure the synthetic image quality by the performance difference of predictive models trained on either the synthetic or the real dataset. We find that synthetic data performance disproportionately benefits from a reduced number of unique label combinations. Our open-source benchmark also indicates that at low number of samples per class, label overfitting effects start to dominate GAN training. We additionally conducted a reader study in which trained radiologists do not perform better than random on discriminating between synthetic and real medical images for intermediate levels of resolutions. In accordance with our benchmark results, the classification accuracy of radiologists increases at higher spatial resolution levels.
Our study offers valuable guidelines and outlines practical conditions under which insights derived from synthetic medical images are similar to those that would have been derived from real imaging data. Our results indicate that synthetic data sharing may be an attractive and privacy-preserving alternative to sharing real patient-level data in the right settings.

Submitted 16 August, 2021; v1 submitted 29 November, 2020; originally announced December 2020.

arXiv:2011.11413 [pdf, other]

doi 10.1126/sciadv.abg2243

Low case numbers enable long-term stable pandemic control without lockdowns

Authors: Sebastian Contreras, Jonas Dehning, Sebastian B. Mohr, Simon Bauer, F. Paul Spitzner, Viola Priesemann

Abstract: The traditional long-term solutions for epidemic control involve eradication or population immunity. Here, we analytically derive the existence of a third viable solution: a stable equilibrium at low case numbers, where test-trace-and-isolate policies partially compensate for local spreading events, and only moderate restrictions remain necessary. In this equilibrium, daily cases stabilize around ten new infections per million people or less. However, stability is endangered if restrictions are relaxed or case numbers grow too high. The latter destabilization marks a tipping point beyond which the spread self-accelerates. We show that a lockdown can reestablish control and that recurring lockdowns are not necessary given sustained, moderate contact reduction. We illustrate how this strategy profits from vaccination and helps mitigate variants of concern. This strategy reduces cumulative cases (and fatalities) 4x more than strategies that only avoid hospital collapse. In the long term, immunization, large-scale testing, and international coordination will further facilitate control.

Submitted 14 October, 2021; v1 submitted 23 November, 2020; originally announced November 2020.

Comments: Final version

Journal ref: Sci. Adv. 7, eabg2243 (2021)

arXiv:2010.14766 [pdf, other]

A Sober Look at the Unsupervised Learning of Disentangled Representations and their Evaluation

Authors: Francesco Locatello, Stefan Bauer, Mario Lucic, Gunnar Rätsch, Sylvain Gelly, Bernhard Schölkopf, Olivier Bachem

Abstract: The idea behind the \emph{unsupervised} learning of \emph{disentangled} representations is that real-world data is generated by a few explanatory factors of variation which can be recovered by unsupervised learning algorithms. In this paper, we provide a sober look at recent progress in the field and challenge some common assumptions. We first theoretically show that the unsupervised learning of disentangled representations is fundamentally impossible without inductive biases on both the models and the data. Then, we train over $14000$ models covering most prominent methods and evaluation metrics in a reproducible large-scale experimental study on eight data sets. We observe that while the different methods successfully enforce properties "encouraged" by the corresponding losses, well-disentangled models seemingly cannot be identified without supervision. Furthermore, different evaluation metrics do not always agree on what should be considered "disentangled" and exhibit systematic differences in the estimation. Finally, increased disentanglement does not seem to necessarily lead to a decreased sample complexity of learning for downstream tasks. Our results suggest that future work on disentanglement learning should be explicit about the role of inductive biases and (implicit) supervision, investigate concrete benefits of enforcing disentanglement of the learned representations, and consider a reproducible experimental setup covering several data sets.

Submitted 27 October, 2020; originally announced October 2020.

Comments: arXiv admin note: substantial text overlap with arXiv:1811.12359

Journal ref: Journal of Machine Learning Research 2020, Volume 21, Number 209

arXiv:2010.14407 [pdf, other]

On the Transfer of Disentangled Representations in Realistic Settings

Authors: Andrea Dittadi, Frederik Träuble, Francesco Locatello, Manuel Wüthrich, Vaibhav Agrawal, Ole Winther, Stefan Bauer, Bernhard Schölkopf

Abstract: Learning meaningful representations that disentangle the underlying structure of the data generating process is considered to be of key importance in machine learning. While disentangled representations were found to be useful for diverse tasks such as abstract reasoning and fair classification, their scalability and real-world impact remain questionable. We introduce a new high-resolution dataset with 1M simulated images and over 1,800 annotated real-world images of the same setup. In contrast to previous work, this new dataset exhibits correlations, a complex underlying structure, and allows to evaluate transfer to unseen simulated and real-world settings where the encoder i) remains in distribution or ii) is out of distribution. We propose new architectures in order to scale disentangled representation learning to realistic high-resolution settings and conduct a large-scale empirical study of disentangled representations on this dataset. We observe that disentanglement is a good predictor for out-of-distribution (OOD) task performance.

Submitted 11 March, 2021; v1 submitted 27 October, 2020; originally announced October 2020.

Comments: Published at ICLR 2021

arXiv:2010.12737 [pdf, other]

doi 10.1038/s41467-021-26721-x

Real-time Non-line-of-Sight imaging of dynamic scenes

Authors: Ji Hyun Nam, Eric Brandt, Sebastian Bauer, Xiaochun Liu, Eftychios Sifakis, Andreas Velten

Abstract: Non-Line-of-Sight (NLOS) imaging aims at recovering the 3D geometry of objects that are hidden from the direct line of sight. In the past, this method has suffered from the weak available multibounce signal limiting scene size, capture speed, and reconstruction quality. While algorithms capable of reconstructing scenes at several frames per second have been demonstrated, real-time NLOS video has only been demonstrated for retro-reflective objects where the NLOS signal strength is enhanced by 4 orders of magnitude or more. Furthermore, it has also been noted that the signal-to-noise ratio of reconstructions in NLOS methods drops quickly with distance and past reconstructions, therefore, have been limited to small scenes with depths of few meters. Actual models of noise and resolution in the scene have been simplistic, ignoring many of the complexities of the problem. We show that SPAD (Single-Photon Avalanche Diode) array detectors with a total of just 28 pixels combined with a specifically extended Phasor Field reconstruction algorithm can reconstruct live real-time videos of non-retro-reflective NLOS scenes. We provide an analysis of the Signal-to-Noise-Ratio (SNR) of our reconstructions and show that for our method it is possible to reconstruct the scene such that SNR, motion blur, angular resolution, and depth resolution are all independent of scene size suggesting that reconstruction of very large scenes may be possible. In the future, the light efficiency for NLOS imaging systems can be improved further by adding more pixels to the sensor array.

Submitted 23 October, 2020; originally announced October 2020.

Journal ref: Nature Communications 12, 6526 (2021)

arXiv:2010.07093 [pdf, other]

Function Contrastive Learning of Transferable Meta-Representations

Authors: Muhammad Waleed Gondal, Shruti Joshi, Nasim Rahaman, Stefan Bauer, Manuel Wüthrich, Bernhard Schölkopf

Abstract: Meta-learning algorithms adapt quickly to new tasks that are drawn from the same task distribution as the training tasks. The mechanism leading to fast adaptation is the conditioning of a downstream predictive model on the inferred representation of the task's underlying data generative process, or \emph{function}. This \emph{meta-representation}, which is computed from a few observed examples of the underlying function, is learned jointly with the predictive model. In this work, we study the implications of this joint training on the transferability of the meta-representations. Our goal is to learn meta-representations that are robust to noise in the data and facilitate solving a wide range of downstream tasks that share the same underlying functions. To this end, we propose a decoupled encoder-decoder approach to supervised meta-learning, where the encoder is trained with a contrastive objective to find a good representation of the underlying function. In particular, our training scheme is driven by the self-supervision signal indicating whether two sets of examples stem from the same function. Our experiments on a number of synthetic and real-world datasets show that the representations we obtain outperform strong baselines in terms of downstream performance and noise robustness, even when these baselines are trained in an end-to-end manner.

Submitted 22 July, 2021; v1 submitted 14 October, 2020; originally announced October 2020.

Comments: ICML 2021

arXiv:2010.04296 [pdf, other]

CausalWorld: A Robotic Manipulation Benchmark for Causal Structure and Transfer Learning

Authors: Ossama Ahmed, Frederik Träuble, Anirudh Goyal, Alexander Neitz, Yoshua Bengio, Bernhard Schölkopf, Manuel Wüthrich, Stefan Bauer

Abstract: Despite recent successes of reinforcement learning (RL), it remains a challenge for agents to transfer learned skills to related environments. To facilitate research addressing this problem, we propose CausalWorld, a benchmark for causal structure and transfer learning in a robotic manipulation environment. The environment is a simulation of an open-source robotic platform, hence offering the possibility of sim-to-real transfer. Tasks consist of constructing 3D shapes from a given set of blocks - inspired by how children learn to build complex structures. The key strength of CausalWorld is that it provides a combinatorial family of such tasks with common causal structure and underlying factors (including, e.g., robot and object masses, colors, sizes). The user (or the agent) may intervene on all causal variables, which allows for fine-grained control over how similar different tasks (or task distributions) are. One can thus easily define training and evaluation distributions of a desired difficulty level, targeting a specific form of generalization (e.g., only changes in appearance or object mass). Further, this common parametrization facilitates defining curricula by interpolating between an initial and a target task. While users may define their own task distributions, we present eight meaningful distributions as concrete benchmarks, ranging from simple to very challenging, all of which require long-horizon planning as well as precise low-level motor control. Finally, we provide baseline results for a subset of these tasks on distinct training curricula and corresponding evaluation protocols, verifying the feasibility of the tasks in this benchmark.

Submitted 24 November, 2020; v1 submitted 8 October, 2020; originally announced October 2020.

Comments: The first two authors contributed equally, the last two authors advised jointly

arXiv:2009.11148 [pdf, other]

doi 10.1109/TVCG.2020.3030388

Visualization of Human Spine Biomechanics for Spinal Surgery

Authors: Pepe Eulzer, Sabine Bauer, Francis Kilian, Kai Lawonn

Abstract: We propose a visualization application, designed for the exploration of human spine simulation data. Our goal is to support research in biomechanical spine simulation and advance efforts to implement simulation-backed analysis in surgical applications. Biomechanical simulation is a state-of-the-art technique for analyzing load distributions of spinal structures. Through the inclusion of patient-specific data, such simulations may facilitate personalized treatment and customized surgical interventions. Difficulties in spine modelling and simulation can be partly attributed to poor result representation, which may also be a hindrance when introducing such techniques into a clinical environment. Comparisons of measurements across multiple similar anatomical structures and the integration of temporal data make commonly available diagrams and charts insufficient for an intuitive and systematic display of results. Therefore, we facilitate methods such as multiple coordinated views, abstraction and focus and context to display simulation outcomes in a dedicated tool. By linking the result data with patient-specific anatomy, we make relevant parameters tangible for clinicians. Furthermore, we introduce new concepts to show the directions of impact force vectors, which were not accessible before. We integrated our toolset into a spine segmentation and simulation pipeline and evaluated our methods with both surgeons and biomechanical researchers. When comparing our methods against standard representations that are currently in use, we found increases in accuracy and speed in data exploration tasks. In a qualitative review, domain experts deemed the tool highly useful when dealing with simulation result data, which typically combines time-dependent patient movement and the resulting force distributions on spinal structures.

Submitted 23 September, 2020; originally announced September 2020.

Comments: 9+2 pages, 11 figures, to be published in IEEE Transactions on Visualization and Computer Graphics

arXiv:2008.13412 [pdf, other]

doi 10.1038/s41467-020-20816-7

Real-time Prediction of COVID-19 related Mortality using Electronic Health Records

Authors: Patrick Schwab, Arash Mehrjou, Sonali Parbhoo, Leo Anthony Celi, Jürgen Hetzel, Markus Hofer, Bernhard Schölkopf, Stefan Bauer

Abstract: Coronavirus Disease 2019 (COVID-19) is an emerging respiratory disease caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) with rapid human-to-human transmission and a high case fatality rate particularly in older patients. Due to the exponential growth of infections, many healthcare systems across the world are under pressure to care for increasing amounts of at-risk patients. Given the high number of infected patients, identifying patients with the highest mortality risk early is critical to enable effective intervention and optimal prioritisation of care. Here, we present the COVID-19 Early Warning System (CovEWS), a clinical risk scoring system for assessing COVID-19 related mortality risk. CovEWS provides continuous real-time risk scores for individual patients with clinically meaningful predictive performance up to 192 hours (8 days) in advance, and is automatically derived from patients' electronic health records (EHRs) using machine learning. We trained and evaluated CovEWS using de-identified data from a cohort of 66430 COVID-19 positive patients seen at over 69 healthcare institutions in the United States (US), Australia, Malaysia and India amounting to an aggregated total of over 2863 years of patient observation time. On an external test cohort of 5005 patients, CovEWS predicts COVID-19 related mortality from $78.8\%$ ($95\%$ confidence interval [CI]: $76.0$, $84.7\%$) to $69.4\%$ ($95\%$ CI: $57.6, 75.2\%$) specificity at a sensitivity greater than $95\%$ between respectively 1 and 192 hours prior to observed mortality events - significantly outperforming existing generic and COVID-19 specific clinical risk scores. CovEWS could enable clinicians to intervene at an earlier stage, and may therefore help in preventing or mitigating COVID-19 related mortality.

Submitted 31 August, 2020; originally announced August 2020.

arXiv:2008.03596 [pdf, other]

TriFinger: An Open-Source Robot for Learning Dexterity

Authors: Manuel Wüthrich, Felix Widmaier, Felix Grimminger, Joel Akpo, Shruti Joshi, Vaibhav Agrawal, Bilal Hammoud, Majid Khadiv, Miroslav Bogdanovic, Vincent Berenz, Julian Viereck, Maximilien Naveau, Ludovic Righetti, Bernhard Schölkopf, Stefan Bauer

Abstract: Dexterous object manipulation remains an open problem in robotics, despite the rapid progress in machine learning during the past decade. We argue that a hindrance is the high cost of experimentation on real systems, in terms of both time and money. We address this problem by proposing an open-source robotic platform which can safely operate without human supervision. The hardware is inexpensive (about $5000) yet highly dynamic, robust, and capable of complex interaction with external objects. The software operates at 1-kilohertz and performs safety checks to prevent the hardware from breaking. The easy-to-use front-end (in C++ and Python) is suitable for real-time control as well as deep reinforcement learning. In addition, the software framework is largely robot-agnostic and can hence be used independently of the hardware proposed herein. Finally, we illustrate the potential of the proposed platform through a number of experiments, including real-time optimal control, deep reinforcement learning from scratch, throwing, and writing.

Submitted 21 January, 2021; v1 submitted 8 August, 2020; originally announced August 2020.

arXiv:2007.14184 [pdf, other]

A Commentary on the Unsupervised Learning of Disentangled Representations

Authors: Francesco Locatello, Stefan Bauer, Mario Lucic, Gunnar Rätsch, Sylvain Gelly, Bernhard Schölkopf, Olivier Bachem

Abstract: The goal of the unsupervised learning of disentangled representations is to separate the independent explanatory factors of variation in the data without access to supervision. In this paper, we summarize the results of Locatello et al., 2019, and focus on their implications for practitioners. We discuss the theoretical result showing that the unsupervised learning of disentangled representations is fundamentally impossible without inductive biases and the practical challenges it entails. Finally, we comment on our experimental findings, highlighting the limitations of state-of-the-art approaches and directions for future research.

Submitted 28 July, 2020; originally announced July 2020.

Journal ref: The Thirty-Fourth AAAI Conference on Artificial Intelligence 2020 (AAAI-20)

arXiv:2007.06533 [pdf, other]

S2RMs: Spatially Structured Recurrent Modules

Authors: Nasim Rahaman, Anirudh Goyal, Muhammad Waleed Gondal, Manuel Wuthrich, Stefan Bauer, Yash Sharma, Yoshua Bengio, Bernhard Schölkopf

Abstract: Capturing the structure of a data-generating process by means of appropriate inductive biases can help in learning models that generalize well and are robust to changes in the input distribution. While methods that harness spatial and temporal structures find broad application, recent work has demonstrated the potential of models that leverage sparse and modular structure using an ensemble of sparingly interacting modules. In this work, we take a step towards dynamic models that are capable of simultaneously exploiting both modular and spatiotemporal structures. We accomplish this by abstracting the modeled dynamical system as a collection of autonomous but sparsely interacting sub-systems. The sub-systems interact according to a topology that is learned, but also informed by the spatial structure of the underlying real-world system. This results in a class of models that are well suited for modeling the dynamics of systems that only offer local views into their state, along with corresponding spatial locations of those views. On the tasks of video prediction from cropped frames and multi-agent world modeling from partial observations in the challenging Starcraft2 domain, we find our models to be more robust to the number of available views and better capable of generalization to novel tasks without additional training, even when compared against strong baselines that perform equally well or better on the training distribution.

Submitted 13 July, 2020; originally announced July 2020.

arXiv:2007.02938 [pdf, other]

Causal Feature Selection via Orthogonal Search

Authors: Ashkan Soleymani, Anant Raj, Stefan Bauer, Bernhard Schölkopf, Michel Besserve

Abstract: The problem of inferring the direct causal parents of a response variable among a large set of explanatory variables is of high practical importance in many disciplines. However, established approaches often scale at least exponentially with the number of explanatory variables, are difficult to extend to nonlinear relationships, and are difficult to extend to cyclic data. Inspired by {\em Debiased} machine learning methods, we study a one-vs.-the-rest feature selection approach to discover the direct causal parent of the response. We propose an algorithm that works for purely observational data while also offering theoretical guarantees, including the case of partially nonlinear relationships possibly under the presence of cycles. As it requires only one estimation for each variable, our approach is applicable even to large graphs. We demonstrate significant improvements compared to established approaches.

Submitted 16 September, 2022; v1 submitted 6 July, 2020; originally announced July 2020.

arXiv:2006.09885 [pdf, other]

Staging Epileptogenesis with Deep Neural Networks

Authors: Diyuan Lu, Sebastian Bauer, Valentin Neubert, Lara Sophie Costard, Felix Rosenow, Jochen Triesch

Abstract: Epilepsy is a common neurological disorder characterized by recurrent seizures accompanied by excessive synchronous brain activity. The process of structural and functional brain alterations leading to increased seizure susceptibility and eventually spontaneous seizures is called epileptogenesis (EPG) and can span months or even years. Detecting and monitoring the progression of EPG could allow for targeted early interventions that could slow down disease progression or even halt its development. Here, we propose an approach for staging EPG using deep neural networks and identify potential electroencephalography (EEG) biomarkers to distinguish different phases of EPG. Specifically, continuous intracranial EEG recordings were collected from a rodent model where epilepsy is induced by electrical perforant pathway stimulation (PPS). A deep neural network (DNN) is trained to distinguish EEG signals from before stimulation (baseline), shortly after the PPS and long after the PPS but before the first spontaneous seizure (FSS). Experimental results show that our proposed method can classify EEG signals from the three phases with an average area under the curve (AUC) of 0.93, 0.89, and 0.86. To the best of our knowledge, this represents the first successful attempt to stage EPG prior to the FSS using DNNs.

Submitted 17 June, 2020; originally announced June 2020.

arXiv:2006.07886 [pdf, other]

On Disentangled Representations Learned From Correlated Data

Authors: Frederik Träuble, Elliot Creager, Niki Kilbertus, Francesco Locatello, Andrea Dittadi, Anirudh Goyal, Bernhard Schölkopf, Stefan Bauer

Abstract: The focus of disentanglement approaches has been on identifying independent factors of variation in data. However, the causal variables underlying real-world observations are often not statistically independent. In this work, we bridge the gap to real-world scenarios by analyzing the behavior of the most prominent disentanglement approaches on correlated data in a large-scale empirical study (including 4260 models). We show and quantify that systematically induced correlations in the dataset are being learned and reflected in the latent representations, which has implications for downstream applications of disentanglement such as fairness. We also demonstrate how to resolve these latent correlations, either using weak supervision during training or by post-hoc correcting a pre-trained model with a small number of labels.

Submitted 16 July, 2021; v1 submitted 14 June, 2020; originally announced June 2020.

Comments: Published at the 38th International Conference on Machine Learning (ICML 2021)

arXiv:2006.07796 [pdf, other]

Structure by Architecture: Structured Representations without Regularization

Authors: Felix Leeb, Guilia Lanzillotta, Yashas Annadani, Michel Besserve, Stefan Bauer, Bernhard Schölkopf

Abstract: We study the problem of self-supervised structured representation learning using autoencoders for downstream tasks such as generative modeling. Unlike most methods which rely on matching an arbitrary, relatively unstructured, prior distribution for sampling, we propose a sampling technique that relies solely on the independence of latent variables, thereby avoiding the trade-off between reconstruction quality and generative performance typically observed in VAEs. We design a novel autoencoder architecture capable of learning a structured representation without the need for aggressive regularization. Our structural decoders learn a hierarchy of latent variables, thereby ordering the information without any additional regularization or supervision. We demonstrate how these models learn a representation that improves results in a variety of downstream tasks including generation, disentanglement, and extrapolation using several challenging and natural image datasets.

Submitted 15 February, 2024; v1 submitted 14 June, 2020; originally announced June 2020.

Comments: Published at ICLR 2023

arXiv:2006.06675 [pdf, other]

Towards Early Diagnosis of Epilepsy from EEG Data

Authors: Diyuan Lu, Sebastian Bauer, Valentin Neubert, Lara Sophie Costard, Felix Rosenow, Jochen Triesch

Abstract: Epilepsy is one of the most common neurological disorders, affecting about 1% of the population at all ages. Detecting the development of epilepsy, i.e., epileptogenesis (EPG), before any seizures occur could allow for early interventions and potentially more effective treatments. Here, we investigate if modern machine learning (ML) techniques can detect EPG from intra-cranial electroencephalography (EEG) recordings prior to the occurrence of any seizures. For this we use a rodent model of epilepsy where EPG is triggered by electrical stimulation of the brain. We propose a ML framework for EPG identification, which combines a deep convolutional neural network (CNN) with a prediction aggregation method to obtain the final classification decision. Specifically, the neural network is trained to distinguish five second segments of EEG recordings taken from either the pre-stimulation period or the post-stimulation period. Due to the gradual development of epilepsy, there is enormous overlap of the EEG patterns before and after the stimulation. Hence, a prediction aggregation process is introduced, which pools predictions over a longer period. By aggregating predictions over one hour, our approach achieves an area under the curve (AUC) of 0.99 on the EPG detection task. This demonstrates the feasibility of EPG prediction from EEG recordings.

Submitted 17 June, 2020; v1 submitted 11 June, 2020; originally announced June 2020.

Comments: Machine Learning for Healthcare conference 2020

arXiv:2006.02600

Phasor field waves: A statistical treatment for the case of a partially coherent optical carrier

Authors: Syed Azer Reza, Sebastian Bauer, Andreas Velten

Abstract: This paper presents a statistical treatment of phasor fields (P-fields) - a wave-like quantity denoting the slow temporal variations in time-averaged irradiance (which was recently introduced to model and describe non-line-of-sight (NLoS) imaging as well as imaging through diffuse or scattering apertures) - and quantifies the magnitude of a spurious signal which emerges due to a partial spatial coherence of the underlying optical carrier. This spurious signal is not described by the Huygens-like P-field imaging integral which assumes optical incoherence as a necessary condition to describe P-field imaging completely (as was shown by Reza et al. recently). In this paper, we estimate the relationship between the expected magnitude of this spurious signal and the degree of partial roughness within the P-field imaging system. The treatment allows us to determine the accuracy of the estimate provided by the P-field integral for varying degrees of partial coherence and allows us to define a P-field signal-to-noise ratio as a figure-of-merit for the case of a partially coherent optical carrier. The study of partial coherence also enables us to better relate aperture roughness to P-field noise.

Submitted 18 September, 2020; v1 submitted 3 June, 2020; originally announced June 2020.

Comments: The model proposed has various deficiencies which need more work to address

arXiv:2005.10210 [pdf, other]

doi 10.1038/s43588-022-00382-2

A machine learning route between band mapping and band structure

Authors: Rui Patrick Xian, Vincent Stimper, Marios Zacharias, Maciej Dendzik, Shuo Dong, Samuel Beaulieu, Bernhard Schölkopf, Martin Wolf, Laurenz Rettig, Christian Carbogno, Stefan Bauer, Ralph Ernstorfer

Abstract: Electronic band structure (BS) and crystal structure are the two complementary identifiers of solid state materials. While convenient instruments and reconstruction algorithms have made large, empirical, crystal structure databases possible, extracting quasiparticle dispersion (closely related to BS) from photoemission band mapping data is currently limited by the available computational methods. To cope with the growing size and scale of photoemission data, we develop a pipeline including probabilistic machine learning and the associated data processing, optimization and evaluation methods for band structure reconstruction, leveraging theoretical calculations. The pipeline reconstructs all 14 valence bands of a semiconductor and shows excellent performance on benchmarks and other materials datasets. The reconstruction uncovers previously inaccessible momentum-space structural information on both global and local scales, while realizing a path towards integration with materials science databases. Our approach illustrates the potential of combining machine learning and domain knowledge for scalable feature extraction in multidimensional data.

Submitted 15 November, 2022; v1 submitted 20 May, 2020; originally announced May 2020.

arXiv:2005.08302 [pdf, other]

Clinical Predictive Models for COVID-19: Systematic Study

Authors: Patrick Schwab, August DuMont Schütte, Benedikt Dietz, Stefan Bauer

Abstract: Coronavirus Disease 2019 (COVID-19) is a rapidly emerging respiratory disease caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Due to the rapid human-to-human transmission of SARS-CoV-2, many healthcare systems are at risk of exceeding their healthcare capacities, in particular in terms of SARS-CoV-2 tests, hospital and intensive care unit (ICU) beds and mechanical ventilators. Predictive algorithms could potentially ease the strain on healthcare systems by identifying those who are most likely to receive a positive SARS-CoV-2 test, be hospitalised or admitted to the ICU. Here, we study clinical predictive models that estimate, using machine learning and based on routinely collected clinical data, which patients are likely to receive a positive SARS-CoV-2 test, require hospitalisation or intensive care. To evaluate the predictive performance of our models, we perform a retrospective evaluation on clinical and blood analysis data from a cohort of 5644 patients. Our experimental results indicate that our predictive models identify (i) patients that test positive for SARS-CoV-2 a priori at a sensitivity of 75% (95% CI: 67%, 81%) and a specificity of 49% (95% CI: 46%, 51%), (ii) SARS-CoV-2 positive patients that require hospitalisation with 0.92 AUC (95% CI: 0.81, 0.98), and (iii) SARS-CoV-2 positive patients that require critical care with 0.98 AUC (95% CI: 0.95, 1.00). In addition, we determine which clinical features are predictive to what degree for each of the aforementioned clinical tasks. Our results indicate that predictive models trained on routinely collected clinical data could be used to predict clinical pathways for COVID-19, and therefore help inform care and prioritise resources.

Submitted 29 November, 2020; v1 submitted 17 May, 2020; originally announced May 2020.

arXiv:2003.02658 [pdf, other]

SLEIPNIR: Deterministic and Provably Accurate Feature Expansion for Gaussian Process Regression with Derivatives

Authors: Emmanouil Angelis, Philippe Wenk, Bernhard Schölkopf, Stefan Bauer, Andreas Krause

Abstract: Gaussian processes are an important regression tool with excellent analytic properties which allow for direct integration of derivative observations. However, vanilla GP methods scale cubically in the amount of observations. In this work, we propose a novel approach for scaling GP regression with derivatives based on quadrature Fourier features. We then prove deterministic, non-asymptotic and exponentially fast decaying error bounds which apply for both the approximated kernel as well as the approximated posterior. To furthermore illustrate the practical applicability of our method, we then apply it to ODIN, a recently developed algorithm for ODE parameter inference. In an extensive experiments section, all results are empirically validated, demonstrating the speed, accuracy, and practical applicability of this approach.

Submitted 5 March, 2020; originally announced March 2020.

arXiv:2001.06208 [pdf, other]

Causal models for dynamical systems

Authors: Jonas Peters, Stefan Bauer, Niklas Pfister

Abstract: A probabilistic model describes a system in its observational state. In many situations, however, we are interested in the system's response under interventions. The class of structural causal models provides a language that allows us to model the behaviour under interventions. It can be taken as a starting point to answer a plethora of causal questions, including the identification of causal effects or causal structure learning. In this chapter, we provide a natural and straight-forward extension of this concept to dynamical systems, focusing on continuous time models. In particular, we introduce two types of causal kinetic models that differ in how the randomness enters into the model: it may either be considered as observational noise or as systematic driving noise. In both cases, we define interventions and therefore provide a possible starting point for causal inference. In this sense, the book chapter provides more questions than answers. The focus of the proposed causal kinetic models lies on the dynamics themselves rather than corresponding stationary distributions, for example. We believe that this is beneficial when the aim is to model the full time evolution of the system and data are measured at different time points. Under this focus, it is natural to consider interventions in the differential equations themselves.

Submitted 17 January, 2020; originally announced January 2020.

arXiv:1910.01075 [pdf, other]

Learning Neural Causal Models from Unknown Interventions

Authors: Nan Rosemary Ke, Olexa Bilaniuk, Anirudh Goyal, Stefan Bauer, Hugo Larochelle, Bernhard Schölkopf, Michael C. Mozer, Chris Pal, Yoshua Bengio

Abstract: Promising results have driven a recent surge of interest in continuous optimization methods for Bayesian network structure learning from observational data. However, there are theoretical limitations on the identifiability of underlying structures obtained from observational data alone. Interventional data provides much richer information about the underlying data-generating process. However, the extension and application of methods designed for observational data to include interventions is not straightforward and remains an open problem. In this paper we provide a general framework based on continuous optimization and neural networks to create models for the combination of observational and interventional data. The proposed method is even applicable in the challenging and realistic case that the identity of the intervened upon variable is unknown. We examine the proposed method in the setting of graph recovery both de novo and from a partially-known edge set. We establish strong benchmark results on several structure learning tasks, including structure recovery of both synthetic graphs as well as standard graphs from the Bayesian Network Repository.

Submitted 23 August, 2020; v1 submitted 2 October, 2019; originally announced October 2019.

arXiv:1909.06048 [pdf, other]

doi 10.1103/PhysRevLett.123.221802

An improved upper limit on the neutrino mass from a direct kinematic method by KATRIN

Authors: M. Aker, K. Altenmüller, M. Arenz, M. Babutzka, J. Barrett, S. Bauer, M. Beck, A. Beglarian, J. Behrens, T. Bergmann, U. Besserer, K. Blaum, F. Block, S. Bobien, K. Bokeloh, J. Bonn, B. Bornschein, L. Bornschein, H. Bouquet, T. Brunst, T. S. Caldwell, L. La Cascio, S. Chilingaryan, W. Choi, T. J. Corona , et al. (184 additional authors not shown)

Abstract: We report on the neutrino mass measurement result from the first four-week science run of the Karlsruhe Tritium Neutrino experiment KATRIN in spring 2019. Beta-decay electrons from a high-purity gaseous molecular tritium source are energy analyzed by a high-resolution MAC-E filter. A fit of the integrated electron spectrum over a narrow interval around the kinematic endpoint at 18.57 keV gives an effective neutrino mass square value of $(-1.0^{+0.9}_{-1.1})$ eV$^2$. From this we derive an upper limit of 1.1 eV (90$\%$ confidence level) on the absolute mass scale of neutrinos. This value coincides with the KATRIN sensitivity. It improves upon previous mass limits from kinematic measurements by almost a factor of two and provides model-independent input to cosmological studies of structure formation.

Submitted 13 September, 2019; originally announced September 2019.

Journal ref: Phys. Rev. Lett. 123, 221802 (2019)

arXiv:1908.05472 [pdf, other]

doi 10.1007/s42979-020-0087-8

Playing a Strategy Game with Knowledge-Based Reinforcement Learning

Authors: Viktor Voss, Liudmyla Nechepurenko, Dr. Rudi Schaefer, Steffen Bauer

Abstract: This paper presents Knowledge-Based Reinforcement Learning (KB-RL) as a method that combines a knowledge-based approach and a reinforcement learning (RL) technique into one method for intelligent problem solving. The proposed approach focuses on multi-expert knowledge acquisition, with the reinforcement learning being applied as a conflict resolution strategy aimed at integrating the knowledge of multiple experts into one knowledge base. The article describes the KB-RL approach in detail and applies the reported method to one of the most challenging problems of current Artificial Intelligence (AI) research, namely playing a strategy game. The results show that the KB-RL system is able to play and complete the full FreeCiv game, and to win against the computer players in various game settings. Moreover, with more games played, the system improves the gameplay by shortening the number of rounds that it takes to win the game. Overall, the reported experiment supports the idea that, based on human knowledge and empowered by reinforcement learning, the KB-RL system can deliver a strong solution to complex, multi-strategic problems and, above all, improve the solution with increased experience.

Submitted 15 August, 2019; originally announced August 2019.

Comments: preprint

Showing 51–100 of 242 results for author: Bauer, S