-
ANDES, the high-resolution spectrograph for the ELT: RIZ Spectrograph preliminary design
Authors:
Bruno Chazelas,
Yevgeniy Ivanisenko,
Audrey Lanotte,
Pablo Santos Diaz,
Ludovic Genolet,
Michael Sordet,
Ian Hughes,
Christophe Lovis,
Tobias M. Schmidt,
Manuel Amate,
José Peñate Castro,
Afrodisio Vega Moreno,
Fabio Tenegi,
Roberto Simoes,
Jonay I. González Hernández,
María Rosa Zapatero Osorio,
Javier Piqueras,
Tomás Belenguer Dávila,
Rocío Calvo Ortega,
Roberto Varas González,
Luis Miguel González Fernández,
Pedro J. Amado,
Jonathan Kern,
Frank Dionies,
Svend-Marian Bauer
, et al. (22 additional authors not shown)
Abstract:
We present here the preliminary design of the RIZ module, one of the visible spectrographs of the ANDES instrument. It is a fiber-fed, high-resolution, high-stability spectrograph. Its design follows the guidelines of successful predecessors such as HARPS and ESPRESSO. In this paper we present the status of the spectrograph at the preliminary design stage. The spectrograph will be a warm, vacuum-operated, thermally controlled and fiber-fed echelle spectrograph. Following the phase A design, the huge etendue of the telescope will be reformed in the instrument with a long slit made of smaller fibers. We discuss the system design of the spectrograph system.
Submitted 26 June, 2024;
originally announced June 2024.
-
Challenges and Considerations in the Evaluation of Bayesian Causal Discovery
Authors:
Amir Mohammad Karimi Mamaghan,
Panagiotis Tigas,
Karl Henrik Johansson,
Yarin Gal,
Yashas Annadani,
Stefan Bauer
Abstract:
Representing uncertainty in causal discovery is a crucial component for experimental design, and more broadly, for safe and reliable causal decision making. Bayesian Causal Discovery (BCD) offers a principled approach to encapsulating this uncertainty. Unlike non-Bayesian causal discovery, which relies on a single estimated causal graph and model parameters for assessment, evaluating BCD presents challenges due to the nature of its inferred quantity - the posterior distribution. As a result, the research community has proposed various metrics to assess the quality of the approximate posterior. However, there is, to date, no consensus on the most suitable metric(s) for evaluation. In this work, we reexamine this question by dissecting various metrics and understanding their limitations. Through extensive empirical evaluation, we find that many existing metrics fail to exhibit a strong correlation with the quality of approximation to the true posterior, especially in scenarios with low sample sizes where BCD is most desirable. We highlight the suitability (or lack thereof) of these metrics under two distinct factors: the identifiability of the underlying causal model and the quantity of available data. Both factors affect the entropy of the true posterior, indicating that the current metrics are less fitting in settings of higher entropy. Our findings underline the importance of a more nuanced evaluation of new methods by taking into account the nature of the true posterior, as well as guide and motivate the development of new evaluation procedures for this challenge.
Submitted 5 June, 2024;
originally announced June 2024.
-
Amortized Active Causal Induction with Deep Reinforcement Learning
Authors:
Yashas Annadani,
Panagiotis Tigas,
Stefan Bauer,
Adam Foster
Abstract:
We present Causal Amortized Active Structure Learning (CAASL), an active intervention design policy that can select interventions that are adaptive, real-time and that does not require access to the likelihood. This policy, an amortized network based on the transformer, is trained with reinforcement learning on a simulator of the design environment, and a reward function that measures how close the true causal graph is to a causal graph posterior inferred from the gathered data. On synthetic data and a single-cell gene expression simulator, we demonstrate empirically that the data acquired through our policy results in a better estimate of the underlying causal graph than alternative strategies. Our design policy successfully achieves amortized intervention design on the distribution of the training environment while also generalizing well to distribution shifts in test-time design environments. Further, our policy also demonstrates excellent zero-shot generalization to design environments with dimensionality higher than that during training, and to intervention types that it has not been trained on.
Submitted 26 May, 2024;
originally announced May 2024.
-
Opportunities for machine learning in scientific discovery
Authors:
Ricardo Vinuesa,
Jean Rabault,
Hossein Azizpour,
Stefan Bauer,
Bingni W. Brunton,
Arne Elofsson,
Elias Jarlebring,
Hedvig Kjellstrom,
Stefano Markidis,
David Marlevi,
Paola Cinnella,
Steven L. Brunton
Abstract:
Technological advancements have substantially increased computational power and data availability, enabling the application of powerful machine-learning (ML) techniques across various fields. However, our ability to leverage ML methods for scientific discovery, i.e. to obtain fundamental and formalized knowledge about natural processes, is still in its infancy. In this review, we explore how the scientific community can increasingly leverage ML techniques to achieve scientific discoveries. We observe that the applicability and opportunity of ML depends strongly on the nature of the problem domain, and whether we have full (e.g., turbulence), partial (e.g., computational biochemistry), or no (e.g., neuroscience) a priori knowledge about the governing equations and physical properties of the system. Although challenges remain, principled use of ML is opening up new avenues for fundamental scientific discoveries. Throughout these diverse fields, there is a theme that ML is enabling researchers to embrace complexity in observational data that was previously intractable to classic analysis and numerical investigations.
Submitted 7 May, 2024;
originally announced May 2024.
-
Derivative-free tree optimization for complex systems
Authors:
Ye Wei,
Bo Peng,
Ruiwen Xie,
Yangtao Chen,
Yu Qin,
Peng Wen,
Stefan Bauer,
Po-Yen Tung
Abstract:
A tremendous range of design tasks in materials, physics, and biology can be formulated as finding the optimum of an objective function depending on many parameters without knowing its closed-form expression or the derivative. Traditional derivative-free optimization techniques often rely on strong assumptions about objective functions, thereby failing at optimizing non-convex systems beyond 100 dimensions. Here, we present a tree search method for derivative-free optimization that enables accelerated optimal design of high-dimensional complex systems. Specifically, we introduce stochastic tree expansion, a dynamic upper confidence bound, and a short-range backpropagation mechanism to evade local optima, iteratively approximating the global optimum using machine learning models. This development effectively confronts dimensionally challenging problems, achieving convergence to global optima across various benchmark functions up to 2,000 dimensions, surpassing the existing methods by 10- to 20-fold. Our method demonstrates applicability to a wide range of real-world complex systems spanning materials, physics, and biology, considerably outperforming state-of-the-art algorithms. This enables efficient autonomous knowledge discovery and facilitates self-driving virtual laboratories. Although we focus on problems within the realm of natural science, the advancements in optimization techniques achieved herein are applicable to a broader spectrum of challenges across all quantitative disciplines.
Submitted 5 April, 2024;
originally announced April 2024.
-
Roadmap on Data-Centric Materials Science
Authors:
Stefan Bauer,
Peter Benner,
Tristan Bereau,
Volker Blum,
Mario Boley,
Christian Carbogno,
C. Richard A. Catlow,
Gerhard Dehm,
Sebastian Eibl,
Ralph Ernstorfer,
Ádám Fekete,
Lucas Foppa,
Peter Fratzl,
Christoph Freysoldt,
Baptiste Gault,
Luca M. Ghiringhelli,
Sajal K. Giri,
Anton Gladyshev,
Pawan Goyal,
Jason Hattrick-Simpers,
Lara Kabalan,
Petr Karpov,
Mohammad S. Khorrami,
Christoph Koch,
Sebastian Kokott
, et al. (36 additional authors not shown)
Abstract:
Science is and always has been based on data, but the terms "data-centric" and the "4th paradigm of" materials research indicate a radical change in how information is retrieved, handled and research is performed. It signifies a transformative shift towards managing vast data collections, digital repositories, and innovative data analytics methods. The integration of Artificial Intelligence (AI) and its subset Machine Learning (ML), has become pivotal in addressing all these challenges. This Roadmap on Data-Centric Materials Science explores fundamental concepts and methodologies, illustrating diverse applications in electronic-structure theory, soft matter theory, microstructure research, and experimental techniques like photoemission, atom probe tomography, and electron microscopy. While the roadmap delves into specific areas within the broad interdisciplinary field of materials science, the provided examples elucidate key concepts applicable to a wider range of topics. The discussed instances offer insights into addressing the multifaceted challenges encountered in contemporary materials research.
Submitted 1 May, 2024; v1 submitted 1 February, 2024;
originally announced February 2024.
-
The Essential Role of Causality in Foundation World Models for Embodied AI
Authors:
Tarun Gupta,
Wenbo Gong,
Chao Ma,
Nick Pawlowski,
Agrin Hilmkil,
Meyer Scetbon,
Marc Rigter,
Ade Famoti,
Ashley Juan Llorens,
Jianfeng Gao,
Stefan Bauer,
Danica Kragic,
Bernhard Schölkopf,
Cheng Zhang
Abstract:
Recent advances in foundation models, especially in large multi-modal models and conversational agents, have ignited interest in the potential of generally capable embodied agents. Such agents will require the ability to perform new tasks in many different real-world environments. However, current foundation models fail to accurately model physical interactions and are therefore insufficient for Embodied AI. The study of causality lends itself to the construction of veridical world models, which are crucial for accurately predicting the outcomes of possible interactions. This paper focuses on the prospects of building foundation world models for the upcoming generation of embodied agents and presents a novel viewpoint on the significance of causality within these. We posit that integrating causal considerations is vital to facilitating meaningful physical interactions with the world. Finally, we demystify misconceptions about causality in this context and present our outlook for future research.
Submitted 29 April, 2024; v1 submitted 6 February, 2024;
originally announced February 2024.
-
3D Vertebrae Measurements: Assessing Vertebral Dimensions in Human Spine Mesh Models Using Local Anatomical Vertebral Axes
Authors:
Ivanna Kramer,
Vinzent Rittel,
Lara Blomenkamp,
Sabine Bauer,
Dietrich Paulus
Abstract:
Vertebral morphological measurements are important across various disciplines, including spinal biomechanics and clinical applications, pre- and post-operatively. These measurements also play a crucial role in anthropological longitudinal studies, where spinal metrics are repeatedly documented over extended periods. Traditionally, such measurements have been manually conducted, a process that is time-consuming. In this study, we introduce a novel, fully automated method for measuring vertebral morphology using 3D meshes of lumbar and thoracic spine models. Our experimental results demonstrate the method's capability to accurately measure low-resolution patient-specific vertebral meshes with a mean absolute error (MAE) of 1.09 mm and those derived from artificially created lumbar spines, where the average MAE value was 0.7 mm. Our qualitative analysis indicates that measurements obtained using our method on 3D spine models can be accurately reprojected back onto the original medical images if these images are available.
Submitted 2 February, 2024;
originally announced February 2024.
-
Molecular causality in the advent of foundation models
Authors:
Sebastian Lobentanzer,
Pablo Rodriguez-Mier,
Stefan Bauer,
Julio Saez-Rodriguez
Abstract:
Correlation is not causation. As simple as this widely agreed-upon statement may seem, scientifically defining causality and using it to drive our modern biomedical research is immensely challenging. In this perspective, we attempt to synergise the partly disparate fields of systems biology, causal reasoning, and machine learning, to inform future approaches in the field of systems biology and molecular networks.
Submitted 17 January, 2024;
originally announced January 2024.
-
DiscoBAX: Discovery of Optimal Intervention Sets in Genomic Experiment Design
Authors:
Clare Lyle,
Arash Mehrjou,
Pascal Notin,
Andrew Jesson,
Stefan Bauer,
Yarin Gal,
Patrick Schwab
Abstract:
The discovery of therapeutics to treat genetically-driven pathologies relies on identifying genes involved in the underlying disease mechanisms. Existing approaches search over the billions of potential interventions to maximize the expected influence on the target phenotype. However, to reduce the risk of failure in future stages of trials, practical experiment design aims to find a set of interventions that maximally change a target phenotype via diverse mechanisms. We propose DiscoBAX, a sample-efficient method for maximizing the rate of significant discoveries per experiment while simultaneously probing for a wide range of diverse mechanisms during a genomic experiment campaign. We provide theoretical guarantees of approximate optimality under standard assumptions, and conduct a comprehensive experimental evaluation covering both synthetic as well as real-world experimental design tasks. DiscoBAX outperforms existing state-of-the-art methods for experimental design, selecting effective and diverse perturbations in biological systems.
Submitted 7 December, 2023;
originally announced December 2023.
-
Doubly Robust Structure Identification from Temporal Data
Authors:
Emmanouil Angelis,
Francesco Quinzan,
Ashkan Soleymani,
Patrick Jaillet,
Stefan Bauer
Abstract:
Learning the causes of time-series data is a fundamental task in many applications, spanning from finance to earth sciences or bio-medical applications. Common approaches for this task are based on vector auto-regression, and they do not take into account unknown confounding between potential causes. However, in settings with many potential causes and noisy data, these approaches may be substantially biased. Furthermore, potential causes may be correlated in practical applications. Moreover, existing algorithms often do not work with cyclic data. To address these challenges, we propose a new doubly robust method for Structure Identification from Temporal Data (SITD). We provide theoretical guarantees, showing that our method asymptotically recovers the true underlying causal structure. Our analysis extends to cases where the potential causes have cycles and they may be confounded. We further perform extensive experiments to showcase the superior performance of our method.
Submitted 10 November, 2023;
originally announced November 2023.
-
Diffusion Based Causal Representation Learning
Authors:
Amir Mohammad Karimi Mamaghan,
Andrea Dittadi,
Stefan Bauer,
Karl Henrik Johansson,
Francesco Quinzan
Abstract:
Causal reasoning can be considered a cornerstone of intelligent systems. Having access to an underlying causal graph comes with the promise of cause-effect estimation and the identification of efficient and safe interventions. However, learning causal representations remains a major challenge, due to the complexity of many real-world systems. Previous works on causal representation learning have mostly focused on Variational Auto-Encoders (VAE). These methods only provide representations from a point estimate, and they are unsuitable for handling high dimensions. To overcome these problems, we propose a new Diffusion-based Causal Representation Learning (DCRL) algorithm. This algorithm uses diffusion-based representations for causal discovery. DCRL offers access to infinite dimensional latent codes, which encode different levels of information in the latent code. In a first proof of principle, we investigate the use of DCRL for causal representation learning. We further demonstrate experimentally that this approach performs comparably well in identifying the causal structure and causal variables.
Submitted 9 November, 2023;
originally announced November 2023.
-
Triggered telecom C-band single-photon source with high brightness, high indistinguishability and sub-GHz spectral linewidth
Authors:
Raphael Joos,
Stephanie Bauer,
Christian Rupp,
Sascha Kolatschek,
Wolfgang Fischer,
Cornelius Nawrath,
Ponraj Vijayan,
Robert Sittig,
Michael Jetter,
Simone L. Portalupi,
Peter Michler
Abstract:
Long-range, terrestrial quantum networks will require high brightness single-photon sources emitting in the telecom C-band for maximum transmission rate. Many applications additionally demand triggered operation with high indistinguishability and narrow spectral linewidth. This would enable the efficient implementation of photonic gate operations and photon storage in quantum memories, as for instance required for a quantum repeater. In particular, semiconductor quantum dots (QDs) have shown these properties in the near-infrared regime. However, the simultaneous demonstration of all these properties in the telecom C-band has been elusive. Here, we present a coherently (incoherently) optically-pumped narrow-band (0.8 GHz) triggered single-photon source in the telecom C-band. The source shows simultaneously high single-photon purity with $g^{(2)}(0) = 0.026$ ($g^{(2)}(0) = 0.014$), high two-photon interference visibility of 0.508 (0.664) and high application-ready rates of 0.75 MHz (1.45 MHz) of polarized photons. The source is based on a QD coupled to a circular Bragg grating cavity combined with spectral filtering. Coherent (incoherent) operation is performed via the novel SUPER scheme (phonon-assisted excitation).
Submitted 31 October, 2023;
originally announced October 2023.
-
Causal machine learning for single-cell genomics
Authors:
Alejandro Tejada-Lapuerta,
Paul Bertin,
Stefan Bauer,
Hananeh Aliee,
Yoshua Bengio,
Fabian J. Theis
Abstract:
Advances in single-cell omics allow for unprecedented insights into the transcription profiles of individual cells. When combined with large-scale perturbation screens, through which specific biological mechanisms can be targeted, these technologies allow for measuring the effect of targeted perturbations on the whole transcriptome. These advances provide an opportunity to better understand the causative role of genes in complex biological processes such as gene regulation, disease progression or cellular development. However, the high-dimensional nature of the data, coupled with the intricate complexity of biological systems renders this task nontrivial. Within the machine learning community, there has been a recent increase of interest in causality, with a focus on adapting established causal techniques and algorithms to handle high-dimensional data. In this perspective, we delineate the application of these methodologies within the realm of single-cell genomics and their challenges. We first present the model that underlies most of current causal approaches to single-cell biology and discuss and challenge the assumptions it entails from the biological point of view. We then identify open problems in the application of causal approaches to single-cell data: generalising to unseen environments, learning interpretable models, and learning causal models of dynamics. For each problem, we discuss how various research directions - including the development of computational approaches and the adaptation of experimental protocols - may offer ways forward, or on the contrary pose some difficulties. With the advent of single cell atlases and increasing perturbation data, we expect causal models to become a crucial tool for informed experimental design.
Submitted 23 October, 2023;
originally announced October 2023.
-
Highly indistinguishable single photons from droplet-etched GaAs quantum dots integrated in single-mode waveguides and beamsplitters
Authors:
Florian Hornung,
Ulrich Pfister,
Stephanie Bauer,
Dee Rocking Cyrlyson's,
Dongze Wang,
Ponraj Vijayan,
Ailton J. Garcia Jr,
Saimon Filipe Covre da Silva,
Michael Jetter,
Simone L. Portalupi,
Armando Rastelli,
Peter Michler
Abstract:
The integration of on-demand quantum emitters into photonic integrated circuits (PICs) has drawn much attention in recent years, as it promises a scalable implementation of quantum information schemes. A central property for several applications is the indistinguishability of the emitted photons. In this regard, GaAs quantum dots (QDs) obtained by droplet etching epitaxy show excellent performance with visibilities close to one for both individual and remote emitters. Therefore, the realization of these QDs into PICs is highly appealing. Here, we show the first implementation in this direction, realizing the key passive elements needed in PICs, i.e. single-mode waveguides (WGs) with integrated GaAs-QDs, which can be coherently controlled, as well as beamsplitters. We study both the statistical distribution of wavelength, linewidth and decay times of the excitonic line of multiple QDs, as well as the quantum optical properties of individual emitters under resonant excitation. Here, we achieve single-photon purities as high as $1-\text{g}^{(2)}(0)=0.929\pm0.009$ as well as two-photon interference visibilities of up to V$_{\text{TPI}}=0.939\pm0.004$ for two consecutively emitted photons.
Submitted 18 October, 2023;
originally announced October 2023.
-
HealthWalk: Promoting Health and Mobility through Sensor-Based Rollator Walker Assistance
Authors:
Ivanna Kramer,
Kevin Weirauch,
Sabine Bauer,
Mark Oliver Mints,
Peer Neubert
Abstract:
Rollator walkers allow people with physical limitations to increase their mobility and give them the confidence and independence to participate in society for longer. However, rollator walker users often have poor posture, leading to further health problems and, in the worst case, falls. Integrating sensors into rollator walker designs can help to address this problem and results in a platform that allows several other interesting use cases. This paper briefly overviews existing systems and the current research directions and challenges in this field. We also present our early HealthWalk rollator walker prototype for data collection with older people, rheumatism, multiple sclerosis and Parkinson patients, and individuals with visual impairments.
Submitted 11 October, 2023;
originally announced October 2023.
-
Evaluating the Benefits: Quantifying the Effects of TCP Options, QUIC, and CDNs on Throughput
Authors:
Simon Bauer,
Patrick Sattler,
Johannes Zirngibl,
Christoph Schwarzenberg,
Georg Carle
Abstract:
To keep up with increasing demands on quality of experience, assessing and understanding the performance of network connections is crucial for web service providers. While different measures, like TCP options, alternative transport layer protocols like QUIC, or the hosting of services in CDNs, are expected to improve connection performance, no studies are quantifying such impacts on connections on the Internet.
This paper introduces an active Internet measurement approach to assess the impacts of mentioned measures on connection performance. We conduct downloads from public web servers considering different vantage points, extract performance indicators like throughput, RTT, and retransmission rate, and survey speed-ups due to TCP option usage. Further, we compare the performance of QUIC-based downloads to TCP-based downloads considering different option configurations.
Next to significant throughput improvements due to TCP option usage, in particular TCP window scaling, and QUIC, our study shows significantly increased performance for connections to domains hosted by different giant CDNs.
Submitted 19 September, 2023;
originally announced September 2023.
-
High-rate intercity quantum key distribution with a semiconductor single-photon source
Authors:
Jingzhong Yang,
Zenghui Jiang,
Frederik Benthin,
Joscha Hanel,
Tom Fandrich,
Raphael Joos,
Stephanie Bauer,
Sascha Kolatschek,
Ali Hreibi,
Eddy Patrick Rugeramigabo,
Michael Jetter,
Simone Luca Portalupi,
Michael Zopf,
Peter Michler,
Stefan Kück,
Fei Ding
Abstract:
Quantum key distribution (QKD) enables the transmission of information that is secure against general attacks by eavesdroppers. The use of on-demand quantum light sources in QKD protocols is expected to help improve security and maximum tolerable loss. Semiconductor quantum dots (QDs) are a promising building block for quantum communication applications because of the deterministic emission of single photons with high brightness and low multiphoton contribution. Here we report on the first intercity QKD experiment using a bright deterministic single photon source. A BB84 protocol based on polarisation encoding is realised using the high-rate single photons in the telecommunication C-band emitted from a semiconductor QD embedded in a circular Bragg grating structure. Utilising the 79 km long link with 25.49 dB loss (equivalent to 130 km for the direct-connected optical fibre) between the German cities of Hannover and Braunschweig, a record-high secret key bits per pulse of 4.8 * 10^{-5} with an average quantum bit error ratio of ~ 0.65 % are demonstrated. An asymptotic maximum tolerable loss of 28.11 dB is found, corresponding to a length of 144 km of standard telecommunication fibre. Deterministic semiconductor sources therefore challenge state-of-the-art QKD protocols and have the potential to excel in measurement device independent protocols and quantum repeater applications.
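The two fibre-length equivalents quoted above come from dividing a loss in dB by a fibre attenuation coefficient. The abstract does not state that coefficient, so the sketch below back-computes it from the quoted pairs (25.49 dB and 130 km, 28.11 dB and 144 km) and also converts the link loss to a transmittance. The function names are illustrative.

```python
def implied_attenuation_db_per_km(loss_db: float, fibre_km: float) -> float:
    """Attenuation coefficient implied by a loss and its fibre-length equivalent."""
    return loss_db / fibre_km


def transmittance(loss_db: float) -> float:
    """Fraction of photons surviving a channel with the given loss in dB."""
    return 10 ** (-loss_db / 10)


# Figures quoted in the abstract.
link = implied_attenuation_db_per_km(25.49, 130)   # deployed link
limit = implied_attenuation_db_per_km(28.11, 144)  # asymptotic maximum tolerable loss

print(f"implied attenuation (link):  {link:.4f} dB/km")
print(f"implied attenuation (limit): {limit:.4f} dB/km")
print(f"link transmittance: {transmittance(25.49):.2e}")
```

Both pairs imply a coefficient slightly below 0.2 dB/km, which is in the usual range for standard telecom fibre in the C-band.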
Submitted 2 July, 2024; v1 submitted 30 August, 2023;
originally announced August 2023.
-
Real Robot Challenge 2022: Learning Dexterous Manipulation from Offline Data in the Real World
Authors:
Nico Gürtler,
Felix Widmaier,
Cansu Sancaktar,
Sebastian Blaes,
Pavel Kolev,
Stefan Bauer,
Manuel Wüthrich,
Markus Wulfmeier,
Martin Riedmiller,
Arthur Allshire,
Qiang Wang,
Robert McCarthy,
Hangyeol Kim,
Jongchan Baek,
Wookyong Kwon,
Shanliang Qian,
Yasunori Toshimitsu,
Mike Yan Michelis,
Amirhossein Kazemipour,
Arman Raayatsanati,
Hehui Zheng,
Barnabas Gavin Cangan,
Bernhard Schölkopf,
Georg Martius
Abstract:
Experimentation on real robots is demanding in terms of time and costs. For this reason, a large part of the reinforcement learning (RL) community uses simulators to develop and benchmark algorithms. However, insights gained in simulation do not necessarily translate to real robots, in particular for tasks involving complex interactions with the environment. The Real Robot Challenge 2022 therefore served as a bridge between the RL and robotics communities by allowing participants to experiment remotely with a real robot - as easily as in simulation.
In recent years, offline reinforcement learning has matured into a promising paradigm for learning from pre-collected datasets, alleviating the reliance on expensive online interactions. We therefore asked the participants to learn two dexterous manipulation tasks involving pushing, grasping, and in-hand orientation from provided real-robot datasets. Extensive software documentation and an initial stage based on a simulation of the real set-up made the competition particularly accessible. By giving each team plenty of access budget to evaluate their offline-learned policies on a cluster of seven identical real TriFinger platforms, we organized an exciting competition for machine learners and roboticists alike.
In this work we state the rules of the competition, present the methods used by the winning teams and compare their results with a benchmark of state-of-the-art offline RL algorithms on the challenge datasets.
Submitted 24 November, 2023; v1 submitted 15 August, 2023;
originally announced August 2023.
-
Benchmarking Offline Reinforcement Learning on Real-Robot Hardware
Authors:
Nico Gürtler,
Sebastian Blaes,
Pavel Kolev,
Felix Widmaier,
Manuel Wüthrich,
Stefan Bauer,
Bernhard Schölkopf,
Georg Martius
Abstract:
Learning policies from previously recorded data is a promising direction for real-world robotics tasks, as online learning is often infeasible. Dexterous manipulation in particular remains an open problem in its general form. The combination of offline reinforcement learning with large diverse datasets, however, has the potential to lead to a breakthrough in this challenging domain analogously to the rapid progress made in supervised learning in recent years. To coordinate the efforts of the research community toward tackling this problem, we propose a benchmark including: i) a large collection of data for offline learning from a dexterous manipulation platform on two tasks, obtained with capable RL agents trained in simulation; ii) the option to execute learned policies on a real-world robotic system and a simulation for efficient debugging. We evaluate prominent open-sourced offline reinforcement learning algorithms on the datasets and provide a reproducible experimental setup for offline reinforcement learning on real systems.
Submitted 28 July, 2023;
originally announced July 2023.
-
BayesDAG: Gradient-Based Posterior Inference for Causal Discovery
Authors:
Yashas Annadani,
Nick Pawlowski,
Joel Jennings,
Stefan Bauer,
Cheng Zhang,
Wenbo Gong
Abstract:
Bayesian causal discovery aims to infer the posterior distribution over causal models from observed data, quantifying epistemic uncertainty and benefiting downstream tasks. However, computational challenges arise due to joint inference over the combinatorial space of Directed Acyclic Graphs (DAGs) and nonlinear functions. Despite recent progress towards efficient posterior inference over DAGs, existing methods are either limited to variational inference on node permutation matrices for linear causal models, leading to compromised inference accuracy, or rely on continuous relaxation of adjacency matrices constrained by a DAG regularizer, which cannot ensure that the resulting graphs are DAGs. In this work, we introduce a scalable Bayesian causal discovery framework based on a combination of stochastic gradient Markov Chain Monte Carlo (SG-MCMC) and Variational Inference (VI) that overcomes these limitations. Our approach directly samples DAGs from the posterior without requiring any DAG regularization, simultaneously draws function parameter samples and is applicable to both linear and nonlinear causal models. To enable our approach, we derive a novel equivalence to the permutation-based DAG learning, which opens up possibilities of using any relaxed gradient estimator defined over permutations. To our knowledge, this is the first framework applying gradient-based MCMC sampling for causal discovery. Empirical evaluations on synthetic and real-world datasets demonstrate our approach's effectiveness compared to state-of-the-art baselines.
Submitted 8 December, 2023; v1 submitted 25 July, 2023;
originally announced July 2023.
-
Benchmarking Bayesian Causal Discovery Methods for Downstream Treatment Effect Estimation
Authors:
Chris Chinenye Emezue,
Alexandre Drouin,
Tristan Deleu,
Stefan Bauer,
Yoshua Bengio
Abstract:
The practical utility of causality in decision-making is widespread and brought about by the intertwining of causal discovery and causal inference. Nevertheless, a notable gap exists in the evaluation of causal discovery methods, where insufficient emphasis is placed on downstream inference. To address this gap, we evaluate seven established baseline causal discovery methods including a newly proposed method based on GFlowNets, on the downstream task of treatment effect estimation. Through the implementation of a distribution-level evaluation, we offer valuable and unique insights into the efficacy of these causal discovery methods for treatment effect estimation, considering both synthetic and real-world scenarios, as well as low-data scenarios. The results of our study demonstrate that some of the algorithms studied are able to effectively capture a wide range of useful and diverse ATE modes, while some tend to learn many low-probability modes which impacts the (unrelaxed) recall and precision.
Submitted 30 July, 2023; v1 submitted 10 July, 2023;
originally announced July 2023.
-
DRCFS: Doubly Robust Causal Feature Selection
Authors:
Francesco Quinzan,
Ashkan Soleymani,
Patrick Jaillet,
Cristian R. Rojas,
Stefan Bauer
Abstract:
Knowing the features of a complex system that are highly relevant to a particular target variable is of fundamental interest in many areas of science. Existing approaches are often limited to linear settings, sometimes lack guarantees, and in most cases, do not scale to the problem at hand, in particular to images. We propose DRCFS, a doubly robust feature selection method for identifying the causal features even in nonlinear and high dimensional settings. We provide theoretical guarantees, illustrate necessary conditions for our assumptions, and perform extensive experiments across a wide range of simulated and semi-synthetic datasets. DRCFS significantly outperforms existing state-of-the-art methods, selecting robust features even in challenging highly non-linear and high-dimensional problems.
Submitted 5 July, 2023; v1 submitted 12 June, 2023;
originally announced June 2023.
-
Societal feedback induces complex and chaotic dynamics in endemic infectious diseases
Authors:
Joel Wagner,
Simon Bauer,
Sebastian Contreras,
Luk Fleddermann,
Ulrich Parlitz,
Viola Priesemann
Abstract:
Classically, endemic diseases are expected to display relatively stable, predictable infection dynamics. Indeed, diseases like influenza show yearly recurring infection waves that can be anticipated accurately enough to develop and distribute new vaccines. In contrast, newly emerging diseases may cause more complex, unpredictable dynamics, as COVID-19 has demonstrated. Here we show that complex infection dynamics can also occur in the endemic state of seasonal diseases when including human behaviour. We implement human behaviour as a feedback between incidence and disease mitigation and study the system as an epidemiological oscillator driven by seasonality. When behaviour and seasonality have a comparable impact, we find a rich structure in parameter and state space with Arnold tongues, co-existing attractors, and chaos. Moreover, we demonstrate that if a disease requires active mitigation, balancing costs of mitigation and infections can lead societies right into this complex regime. We observe indications of this when comparing past COVID-19 and influenza data to model simulations. Our results challenge the intuition that endemicity implies predictability and seasonal waves, and show that complex dynamics can dominate even in the endemic phase.
Submitted 18 May, 2023;
originally announced May 2023.
-
How charges separate when surfaces are dewetted
Authors:
Aaron D. Ratschow,
Lisa S. Bauer,
Pravash Bista,
Stefan A. L. Weber,
Hans-Jürgen Butt,
Steffen Hardt
Abstract:
Charge separation at moving three-phase contact lines is observed in nature as well as technological processes. Despite the growing number of experimental investigations in recent years, the physical mechanism behind the charging remains obscure. Here we identify the origin of charge separation as the dewetting of the bound surface charge within the electric double layer by the receding contact line. This charge depends strongly on the local electric double layer structure close to the contact line, which is affected by the gas-liquid interface and the internal flow of the liquid. We summarize the charge separation mechanism in an analytical model that captures parametric dependencies in agreement with our experiments and numerical simulations. Charge separation increases with increasing contact angle and decreases with increasing dewetting velocity. Our findings reveal the universal mechanism of charge separation at receding contact lines, relevant to many dynamic wetting scenarios, and provide a theoretical foundation for both fundamental questions, like contact angle hysteresis, and practical applications.
Submitted 3 May, 2023;
originally announced May 2023.
-
Understanding Causality with Large Language Models: Feasibility and Opportunities
Authors:
Cheng Zhang,
Stefan Bauer,
Paul Bennett,
Jiangfeng Gao,
Wenbo Gong,
Agrin Hilmkil,
Joel Jennings,
Chao Ma,
Tom Minka,
Nick Pawlowski,
James Vaughan
Abstract:
We assess the ability of large language models (LLMs) to answer causal questions by analyzing their strengths and weaknesses against three types of causal question. We believe that current LLMs can answer causal questions with existing causal knowledge as combined domain experts. However, they are not yet able to provide satisfactory answers for discovering new knowledge or for high-stakes decision-making tasks with high precision. We discuss possible future directions and opportunities, such as enabling explicit and implicit causal modules as well as deep causal-aware LLMs. These will not only enable LLMs to answer many different types of causal questions for greater impact but also enable LLMs to be more trustworthy and efficient in general.
Submitted 11 April, 2023;
originally announced April 2023.
-
Differentiable Multi-Target Causal Bayesian Experimental Design
Authors:
Yashas Annadani,
Panagiotis Tigas,
Desi R. Ivanova,
Andrew Jesson,
Yarin Gal,
Adam Foster,
Stefan Bauer
Abstract:
We introduce a gradient-based approach for the problem of Bayesian optimal experimental design to learn causal models in a batch setting -- a critical component for causal discovery from finite data where interventions can be costly or risky. Existing methods rely on greedy approximations to construct a batch of experiments while using black-box methods to optimize over a single target-state pair to intervene with. In this work, we completely dispose of the black-box optimization techniques and greedy heuristics and instead propose a conceptually simple end-to-end gradient-based optimization procedure to acquire a set of optimal intervention target-state pairs. Such a procedure enables parameterization of the design space to efficiently optimize over a batch of multi-target-state interventions, a setting which has hitherto not been explored due to its complexity. We demonstrate that our proposed method outperforms baselines and existing acquisition strategies in both single-target and multi-target settings across a number of synthetic datasets.
Submitted 2 June, 2023; v1 submitted 21 February, 2023;
originally announced February 2023.
-
Trust Your $\nabla$: Gradient-based Intervention Targeting for Causal Discovery
Authors:
Mateusz Olko,
Michał Zając,
Aleksandra Nowak,
Nino Scherrer,
Yashas Annadani,
Stefan Bauer,
Łukasz Kuciński,
Piotr Miłoś
Abstract:
Inferring causal structure from data is a challenging task of fundamental importance in science. Observational data are often insufficient to identify a system's causal structure uniquely. While conducting interventions (i.e., experiments) can improve the identifiability, such samples are usually challenging and expensive to obtain. Hence, experimental design approaches for causal discovery aim to minimize the number of interventions by estimating the most informative intervention target. In this work, we propose a novel Gradient-based Intervention Targeting method, abbreviated GIT, that 'trusts' the gradient estimator of a gradient-based causal discovery framework to provide signals for the intervention acquisition function. We provide extensive experiments in simulated and real-world datasets and demonstrate that GIT performs on par with competitive baselines, surpassing them in the low-data regime.
Submitted 3 April, 2024; v1 submitted 24 November, 2022;
originally announced November 2022.
-
Federated Causal Discovery From Interventions
Authors:
Amin Abyaneh,
Nino Scherrer,
Patrick Schwab,
Stefan Bauer,
Bernhard Schölkopf,
Arash Mehrjou
Abstract:
Causal discovery plays a pivotal role in mitigating model uncertainty by recovering the underlying causal mechanisms among variables. In many practical domains, such as healthcare, access to the data gathered by individual entities is limited, primarily because of privacy and regulatory constraints. However, the majority of existing causal discovery methods require the data to be available in a centralized location. In response, researchers have introduced federated causal discovery. While previous federated methods consider distributed observational data, the integration of interventional data remains largely unexplored. We propose FedCDI, a federated framework for inferring causal structures from distributed data containing interventional samples. In line with the federated learning framework, FedCDI improves privacy by exchanging belief updates rather than raw samples. Additionally, it introduces a novel intervention-aware method for aggregating individual updates. We analyze scenarios with shared or disjoint intervened covariates, and mitigate the adverse effects of interventional data heterogeneity. The performance and scalability of FedCDI are rigorously tested across a variety of synthetic and real-world graphs.
Submitted 11 February, 2024; v1 submitted 7 November, 2022;
originally announced November 2022.
-
From Points to Functions: Infinite-dimensional Representations in Diffusion Models
Authors:
Sarthak Mittal,
Guillaume Lajoie,
Stefan Bauer,
Arash Mehrjou
Abstract:
Diffusion-based generative models learn to iteratively transfer unstructured noise to a complex target distribution, as opposed to Generative Adversarial Networks (GANs) or the decoder of Variational Autoencoders (VAEs), which produce samples from the target distribution in a single step. Thus, in diffusion models every sample is naturally connected to a random trajectory which is a solution to a learned stochastic differential equation (SDE). Generative models are only concerned with the final state of this trajectory that delivers samples from the desired distribution. Abstreiter et al. showed that these stochastic trajectories can be seen as continuous filters that wash out information along the way. Consequently, it is reasonable to ask if there is an intermediate time step at which the preserved information is optimal for a given downstream task. In this work, we show that a combination of information content from different time steps gives a strictly better representation for the downstream task. We introduce attention- and recurrence-based modules that "learn to mix" information content of various time steps such that the resultant representation leads to superior performance in downstream tasks.
Submitted 25 October, 2022;
originally announced October 2022.
-
Learning Latent Structural Causal Models
Authors:
Jithendaraa Subramanian,
Yashas Annadani,
Ivaxi Sheth,
Nan Rosemary Ke,
Tristan Deleu,
Stefan Bauer,
Derek Nowrouzezahrai,
Samira Ebrahimi Kahou
Abstract:
Causal learning has long concerned itself with the accurate recovery of underlying causal mechanisms. Such causal modelling enables better explanations of out-of-distribution data. Prior works on causal learning assume that the high-level causal variables are given. However, in machine learning tasks, one often operates on low-level data like image pixels or high-dimensional vectors. In such settings, the entire Structural Causal Model (SCM) -- structure, parameters, and high-level causal variables -- is unobserved and needs to be learnt from low-level data. We treat this problem as Bayesian inference of the latent SCM, given low-level data. For linear Gaussian additive noise SCMs, we present a tractable approximate inference method which performs joint inference over the causal variables, structure and parameters of the latent SCM from random, known interventions. Experiments are performed on synthetic datasets and a causally generated image dataset to demonstrate the efficacy of our approach. We also perform image generation from unseen interventions, thereby verifying out-of-distribution generalization for the proposed causal model.
Submitted 24 October, 2022;
originally announced October 2022.
-
Generalised Gillespie Algorithms for Simulations in a Rule-Based Epidemiological Model Framework
Authors:
David Alonso,
Steffen Bauer,
Markus Kirkilionis,
Lisa Maria Kreusser,
Luca Sbano
Abstract:
Rule-based models have been successfully used to represent different aspects of the COVID-19 pandemic, including age, testing, hospitalisation, lockdowns, immunity, infectivity, behaviour, mobility and vaccination of individuals. These rule-based approaches are motivated by chemical reaction rules, which are traditionally solved numerically with the standard Gillespie algorithm proposed in the context of molecular dynamics. When applying reaction-system approaches to epidemiology, generalisations of the Gillespie algorithm are required due to the time-dependency of the problems. In this article, we present different generalisations of the standard Gillespie algorithm which address discrete subtypes (e.g., incorporating the age structure of the population), time-discrete updates (e.g., incorporating daily imposed changes of rates for lockdowns) and deterministic delays (e.g., a given waiting time until a specific change in types, such as release from isolation, occurs). These algorithms are complemented by relevant examples in the context of the COVID-19 pandemic and numerical results.
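For orientation, the standard Gillespie algorithm that these generalisations start from can be sketched on a plain SIR model with two rules, infection and recovery. This is not the paper's generalised method: the model, the function name `gillespie_sir`, and the rate values are illustrative assumptions.

```python
import random


def gillespie_sir(s, i, r, beta, gamma, t_max, seed=0):
    """Standard Gillespie (direct method) simulation of a stochastic SIR model.

    Rules: S + I -> 2I at rate beta * s * i / n, and I -> R at rate gamma * i.
    Returns a list of (time, s, i, r) tuples, one per event plus the start state.
    """
    if gamma <= 0 or beta < 0:
        raise ValueError("gamma must be positive and beta non-negative")
    rng = random.Random(seed)
    n = s + i + r
    t = 0.0
    history = [(t, s, i, r)]
    while i > 0:
        rate_infect = beta * s * i / n
        rate_recover = gamma * i
        total = rate_infect + rate_recover
        # Waiting time to the next event is exponential with the total rate.
        t += rng.expovariate(total)
        if t > t_max:
            break
        # Pick which rule fires, with probability proportional to its rate.
        if rng.random() * total < rate_infect:
            s, i = s - 1, i + 1
        else:
            i, r = i - 1, r + 1
        history.append((t, s, i, r))
    return history


# Hypothetical parameters: 1000 individuals, 10 initially infected.
trajectory = gillespie_sir(990, 10, 0, beta=0.3, gamma=0.1, t_max=50.0)
print(f"{len(trajectory) - 1} events, final state {trajectory[-1][1:]}")
```

The rates here are constant between events. The time-discrete updates and deterministic delays described in the abstract violate that assumption, which is why the standard scheme has to be generalised.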
Submitted 24 October, 2022; v1 submitted 17 October, 2022;
originally announced October 2022.
-
From Static to Dynamic Structures: Improving Binding Affinity Prediction with a Graph-Based Deep Learning Model
Authors:
Yaosen Min,
Ye Wei,
Peizhuo Wang,
Xiaoting Wang,
Han Li,
Nian Wu,
Stefan Bauer,
Shuxin Zheng,
Yu Shi,
Yingheng Wang,
Ji Wu,
Dan Zhao,
Jianyang Zeng
Abstract:
Accurate prediction of protein-ligand binding affinities is an essential challenge in structure-based drug design. Despite recent advances in data-driven methods for affinity prediction, their accuracy is still limited, partially because they only take advantage of static crystal structures while the actual binding affinities are generally depicted by the thermodynamic ensembles between proteins and ligands. One effective way to approximate such a thermodynamic ensemble is to use molecular dynamics (MD) simulation. Here, we curated an MD dataset containing 3,218 different protein-ligand complexes, and further developed Dynaformer, which is a graph-based deep learning model. Dynaformer was able to accurately predict the binding affinities by learning the geometric characteristics of the protein-ligand interactions from the MD trajectories. In silico experiments demonstrated that our model exhibits state-of-the-art scoring and ranking power on the CASF-2016 benchmark dataset, outperforming the methods hitherto reported. Moreover, we performed a virtual screening on the heat shock protein 90 (HSP90) using Dynaformer that identified 20 candidates and further experimentally validated their binding affinities. We demonstrated that our approach is more efficient, identifying 12 hit compounds (two in the submicromolar range), including several newly discovered scaffolds. We anticipate this new synergy between large-scale MD datasets and deep learning models will provide a new route toward accelerating the early drug discovery process.
Submitted 3 June, 2023; v1 submitted 19 August, 2022;
originally announced August 2022.
-
High emission rate from a Purcell-enhanced, triggered source of pure single photons in the telecom C-band
Authors:
Cornelius Nawrath,
Raphael Joos,
Sascha Kolatschek,
Stephanie Bauer,
Pascal Pruy,
Florian Hornung,
Julius Fischer,
Jiasheng Huang,
Ponraj Vijayan,
Robert Sittig,
Michael Jetter,
Simone Luca Portalupi,
Peter Michler
Abstract:
Several emission features mark semiconductor quantum dots as promising non-classical light sources for prospective quantum implementations. For long-distance transmission [1] and Si-based on-chip processing [2, 3], the possibility to match the telecom C-band [4] stands out, while source brightness and high single-photon purity are key features in virtually any quantum implementation [5, 6]. Here we present an InAs/InGaAs/GaAs quantum dot emitting in the telecom C-band coupled to a circular Bragg grating. The Purcell enhancement of the emission enables a simultaneously high brightness with a fiber-coupled single-photon count rate of 13.9 MHz for an excitation repetition rate of 228 MHz (first-lens collection efficiency ca. 17% for NA = 0.6), while maintaining a low multi-photon contribution of g(2)(0) = 0.0052. Moreover, the compatibility with temperatures of up to 40 K, attainable with compact cryocoolers, further underlines the suitability for out-of-the-lab implementations.
Submitted 26 July, 2022;
originally announced July 2022.
-
Latent Variable Models for Bayesian Causal Discovery
Authors:
Jithendaraa Subramanian,
Yashas Annadani,
Ivaxi Sheth,
Stefan Bauer,
Derek Nowrouzezahrai,
Samira Ebrahimi Kahou
Abstract:
Learning predictors that do not rely on spurious correlations involves building causal representations. However, learning such a representation is very challenging. We, therefore, formulate the problem of learning a causal representation from high dimensional data and study causal recovery with synthetic data. This work introduces a latent variable decoder model, Decoder BCD, for Bayesian causal discovery and performs experiments in mildly supervised and unsupervised settings. We present a series of synthetic experiments to characterize important factors for causal discovery and show that using known intervention targets as labels helps in unsupervised Bayesian inference over structure and parameters of linear Gaussian additive noise latent structural causal models.
Submitted 10 August, 2022; v1 submitted 12 July, 2022;
originally announced July 2022.
-
Invariant Causal Mechanisms through Distribution Matching
Authors:
Mathieu Chevalley,
Charlotte Bunne,
Andreas Krause,
Stefan Bauer
Abstract:
Learning representations that capture the underlying data generating process is a key problem for data efficient and robust use of neural networks. One key property for robustness which the learned representation should capture and which recently received a lot of attention is described by the notion of invariance. In this work we provide a causal perspective and new algorithm for learning invariant representations. Empirically we show that this algorithm works well on a diverse set of tasks and in particular we observe state-of-the-art performance on domain generalization, where we are able to significantly boost the score of existing models.
Submitted 23 June, 2022;
originally announced June 2022.
-
Diffusion Models for Video Prediction and Infilling
Authors:
Tobias Höppe,
Arash Mehrjou,
Stefan Bauer,
Didrik Nielsen,
Andrea Dittadi
Abstract:
Predicting and anticipating future outcomes or reasoning about missing information in a sequence are critical skills for agents to be able to make intelligent decisions. This requires strong, temporally coherent generative capabilities. Diffusion models have shown remarkable success in several generative tasks, but have not been extensively explored in the video domain. We present Random-Mask Video Diffusion (RaMViD), which extends image diffusion models to videos using 3D convolutions, and introduces a new conditioning technique during training. By varying the mask we condition on, the model is able to perform video prediction, infilling, and upsampling. Due to our simple conditioning scheme, we can utilize the same architecture as used for unconditional training, which allows us to train the model in a conditional and unconditional fashion at the same time. We evaluate RaMViD on two benchmark datasets for video prediction, on which we achieve state-of-the-art results, and one for video generation. High-resolution videos are provided at https://sites.google.com/view/video-diffusion-prediction.
Submitted 14 November, 2022; v1 submitted 15 June, 2022;
originally announced June 2022.
-
On the Generalization and Adaption Performance of Causal Models
Authors:
Nino Scherrer,
Anirudh Goyal,
Stefan Bauer,
Yoshua Bengio,
Nan Rosemary Ke
Abstract:
Learning models that offer robust out-of-distribution generalization and fast adaptation is a key challenge in modern machine learning. Modelling causal structure into neural networks holds the promise to accomplish robust zero and few-shot adaptation. Recent advances in differentiable causal discovery have proposed to factorize the data generating process into a set of modules, i.e. one module for the conditional distribution of every variable where only causal parents are used as predictors. Such a modular decomposition of knowledge enables adaptation to distribution shifts by only updating a subset of parameters. In this work, we systematically study the generalization and adaption performance of such modular neural causal models by comparing them to monolithic models and structured models where the set of predictors is not constrained to causal parents. Our analysis shows that the modular neural causal models outperform other models on both zero and few-shot adaptation in low data regimes and offer robust generalization. We also found that the effects are more significant for sparser graphs as compared to denser graphs.
Submitted 9 June, 2022;
originally announced June 2022.
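The modular factorization described in the abstract above can be written out as a worked formula. This is a sketch under stated assumptions rather than the authors' notation: the symbols $x_i$ (variables), $\mathrm{pa}(x_i)$ (causal parents of $x_i$), $d$ (number of variables) and $\theta_i$ (parameters of module $i$) are introduced here for illustration.

$$p_\theta(x_1, \dots, x_d) = \prod_{i=1}^{d} p_{\theta_i}\bigl(x_i \mid \mathrm{pa}(x_i)\bigr)$$

Under this reading, a shift that affects only the mechanism of one variable $x_j$ would call for updating $\theta_j$ alone, which is the sense in which only a subset of parameters needs to adapt.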
-
Dexterous Robotic Manipulation using Deep Reinforcement Learning and Knowledge Transfer for Complex Sparse Reward-based Tasks
Authors:
Qiang Wang,
Francisco Roldan Sanchez,
Robert McCarthy,
David Cordova Bulens,
Kevin McGuinness,
Noel O'Connor,
Manuel Wüthrich,
Felix Widmaier,
Stefan Bauer,
Stephen J. Redmond
Abstract:
This paper describes a deep reinforcement learning (DRL) approach that won Phase 1 of the Real Robot Challenge (RRC) 2021, and then extends this method to a more difficult manipulation task. The RRC consisted of using a TriFinger robot to manipulate a cube along a specified positional trajectory, but with no requirement for the cube to have any specific orientation. We used a relatively simple reward function, a combination of goal-based sparse reward and distance reward, in conjunction with Hindsight Experience Replay (HER) to guide the learning of the DRL agent (Deep Deterministic Policy Gradient (DDPG)). Our approach allowed our agents to acquire dexterous robotic manipulation strategies in simulation. These strategies were then applied to the real robot and outperformed all other competition submissions, including those using more traditional robotic control techniques, in the final evaluation stage of the RRC. Here we extend this method, by modifying the task of Phase 1 of the RRC to require the robot to maintain the cube in a particular orientation, while the cube is moved along the required positional trajectory. The requirement to also orient the cube makes the agent unable to learn the task through blind exploration due to increased problem complexity. To circumvent this issue, we make novel use of a Knowledge Transfer (KT) technique that allows the strategies learned by the agent in the original task (which was agnostic to cube orientation) to be transferred to this task (where orientation matters). KT allowed the agent to learn and perform the extended task in the simulator, which improved the average positional deviation from 0.134 m to 0.02 m, and average orientation deviation from 142° to 76° during evaluation. This KT concept shows good generalisation properties and could be applied to any actor-critic learning algorithm.
Submitted 27 January, 2023; v1 submitted 19 May, 2022;
originally announced May 2022.
-
Good practice guide on the graphene-based AC-QHE realization of the farad
Authors:
Luca Callegaro,
Stephan Bauer,
Blaise Jeanneret,
Mattias Kruskopf,
Martina Marzano,
Massimo Ortolano,
Frederic Overney
Abstract:
This Good Practice Guide provides information for the realization of the farad from the quantum Hall resistance in graphene devices by using digital impedance bridges. The fabrication and characterization of graphene quantum Hall effect devices, the cryogenic environment required to achieve the quantization conditions, the digital impedance bridges and calibration procedures are reported.
The guide is a deliverable of the Joint Research Project EMPIR 18SIB07 GIQS: Graphene Impedance Quantum Standard. This project received funding from the European Metrology Programme for Innovation and Research (EMPIR) co-financed by the Participating States and from the European Union's Horizon 2020 research and innovation programme. Funder ID: 10.13039/100014132, Grant no: 18SIB07.
Submitted 10 May, 2022;
originally announced May 2022.
-
Federated Learning in Multi-Center Critical Care Research: A Systematic Case Study using the eICU Database
Authors:
Arash Mehrjou,
Ashkan Soleymani,
Annika Buchholz,
Jürgen Hetzel,
Patrick Schwab,
Stefan Bauer
Abstract:
Federated learning (FL) has been proposed as a method to train a model on different units without exchanging data. This offers great opportunities in the healthcare sector, where large datasets are available but cannot be shared to ensure patient privacy. We systematically investigate the effectiveness of FL on the publicly available eICU dataset for predicting the survival of each ICU stay. We employ Federated Averaging as the main practical algorithm for FL and show how its performance changes by altering three key hyper-parameters, taking into account that clients can significantly vary in size. We find that in many settings, a large number of local training epochs improves the performance while at the same time reducing communication costs. Furthermore, we outline in which settings it is possible to have only a low number of hospitals participating in each federated update round. When many hospitals with low patient counts are involved, the effect of overfitting can be avoided by decreasing the batch size. This study thus contributes toward identifying suitable settings for running distributed algorithms such as FL on clinical datasets.
Submitted 20 April, 2022;
originally announced April 2022.
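The aggregation step of Federated Averaging, named in the abstract above, can be sketched as a size-weighted mean of client parameters. This is a minimal illustration, not the study's implementation: the function name `federated_average`, the flat-list parameter representation, and the example hospital sizes are all assumptions made here.

```python
from typing import List, Sequence


def federated_average(
    client_weights: Sequence[Sequence[float]],
    client_sizes: Sequence[int],
) -> List[float]:
    """Combine client parameter vectors into one global vector.

    Each client's parameters are weighted by its share of the total
    number of training samples, so larger clients contribute more.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one size per client and at least one client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of samples must be positive")

    dim = len(client_weights[0])
    averaged = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all clients must share the same parameter shape")
        for i, w in enumerate(weights):
            averaged[i] += w * size / total
    return averaged


# Hypothetical round with two hospitals of very different size.
global_params = federated_average([[1.0, 0.0], [3.0, 4.0]], [1, 3])
```

Weighting by sample count is one way to account for clients that vary in size; in a full training loop this step would follow several local epochs on each participating client.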
-
Interventions, Where and How? Experimental Design for Causal Models at Scale
Authors:
Panagiotis Tigas,
Yashas Annadani,
Andrew Jesson,
Bernhard Schölkopf,
Yarin Gal,
Stefan Bauer
Abstract:
Causal discovery from observational and interventional data is challenging due to limited data and non-identifiability: factors that introduce uncertainty in estimating the underlying structural causal model (SCM). Selecting experiments (interventions) based on the uncertainty arising from both factors can expedite the identification of the SCM. Existing methods in experimental design for causal discovery from limited data either rely on linear assumptions for the SCM or select only the intervention target. This work incorporates recent advances in Bayesian causal discovery into the Bayesian optimal experimental design framework, allowing for active causal discovery of large, nonlinear SCMs while selecting both the interventional target and the value. We demonstrate the performance of the proposed method on synthetic graphs (Erdős-Rényi, Scale Free) for both linear and nonlinear SCMs as well as on the in silico single-cell gene regulatory network dataset, DREAM.
Submitted 21 October, 2022; v1 submitted 3 March, 2022;
originally announced March 2022.
-
Bayesian Structure Learning with Generative Flow Networks
Authors:
Tristan Deleu,
António Góis,
Chris Emezue,
Mansi Rankawat,
Simon Lacoste-Julien,
Stefan Bauer,
Yoshua Bengio
Abstract:
In Bayesian structure learning, we are interested in inferring a distribution over the directed acyclic graph (DAG) structure of Bayesian networks, from data. Defining such a distribution is very challenging, due to the combinatorially large sample space, and approximations based on MCMC are often required. Recently, a novel class of probabilistic models, called Generative Flow Networks (GFlowNets), has been introduced as a general framework for generative modeling of discrete and composite objects, such as graphs. In this work, we propose to use a GFlowNet as an alternative to MCMC for approximating the posterior distribution over the structure of Bayesian networks, given a dataset of observations. Generating a sample DAG from this approximate distribution is viewed as a sequential decision problem, where the graph is constructed one edge at a time, based on learned transition probabilities. Through evaluation on both simulated and real data, we show that our approach, called DAG-GFlowNet, provides an accurate approximation of the posterior over DAGs, and it compares favorably against other methods based on MCMC or variational inference.
Submitted 28 June, 2022; v1 submitted 28 February, 2022;
originally announced February 2022.
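The idea of building a DAG one edge at a time, as described in the abstract above, can be sketched as a sequential sampler that only ever adds edges preserving acyclicity. This is not DAG-GFlowNet: the learned transition probabilities are replaced here by a uniform choice over valid edges, and the names `creates_cycle`, `sample_dag` and the `stop_prob` parameter are hypothetical, introduced only for illustration.

```python
import random
from typing import Optional, Set, Tuple

Edge = Tuple[int, int]


def creates_cycle(edges: Set[Edge], new_edge: Edge) -> bool:
    """Return True if adding new_edge (u -> v) would close a directed cycle."""
    u, v = new_edge
    # A cycle appears exactly when u is already reachable from v.
    stack, seen = [v], set()
    while stack:
        node = stack.pop()
        if node == u:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(b for a, b in edges if a == node)
    return False


def sample_dag(
    num_nodes: int,
    stop_prob: float = 0.2,
    rng: Optional[random.Random] = None,
) -> Set[Edge]:
    """Grow a DAG edge by edge, stopping at random or when no edge is valid.

    Assumption: a uniform choice stands in for a learned policy.
    """
    rng = rng or random.Random()
    edges: Set[Edge] = set()
    while True:
        candidates = [
            (u, v)
            for u in range(num_nodes)
            for v in range(num_nodes)
            if u != v and (u, v) not in edges and not creates_cycle(edges, (u, v))
        ]
        if not candidates or rng.random() < stop_prob:
            return edges
        edges.add(rng.choice(candidates))


dag = sample_dag(4, rng=random.Random(0))
```

Masking out cycle-creating edges at every step means each partial graph along the trajectory is itself a valid DAG, so the sampler can stop at any point.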
-
Machine learning-enabled high-entropy alloy discovery
Authors:
Ziyuan Rao,
PoYen Tung,
Ruiwen Xie,
Ye Wei,
Hongbin Zhang,
Alberto Ferrari,
T. P. C. Klaver,
Fritz Körmann,
Prithiv Thoudden Sukumar,
Alisson Kwiatkowski da Silva,
Yao Chen,
Zhiming Li,
Dirk Ponge,
Jörg Neugebauer,
Oliver Gutfleisch,
Stefan Bauer,
Dierk Raabe
Abstract:
High-entropy alloys are solid solutions of multiple principal elements, capable of reaching composition and feature regimes inaccessible for dilute materials. Discovering those with valuable properties, however, relies on serendipity, as thermodynamic alloy design rules alone often fail in high-dimensional composition spaces. Here, we propose an active-learning strategy to accelerate the design of novel high-entropy Invar alloys in a practically infinite compositional space, based on very sparse data. Our approach works as a closed-loop, integrating machine learning with density-functional theory, thermodynamic calculations, and experiments. After processing and characterizing 17 new alloys (out of millions of possible compositions), we identified 2 high-entropy Invar alloys with extremely low thermal expansion coefficients around $2 \times 10^{-6}$ K$^{-1}$ at 300 K. Our study thus opens a new pathway for the fast and automated discovery of high-entropy alloys with optimal thermal, magnetic and electrical properties.
Submitted 28 February, 2022;
originally announced February 2022.
-
Compositional Multi-Object Reinforcement Learning with Linear Relation Networks
Authors:
Davide Mambelli,
Frederik Träuble,
Stefan Bauer,
Bernhard Schölkopf,
Francesco Locatello
Abstract:
Although reinforcement learning has seen remarkable progress over the last years, solving robust dexterous object-manipulation tasks in multi-object settings remains a challenge. In this paper, we focus on models that can learn manipulation tasks in fixed multi-object settings and extrapolate this skill zero-shot without any drop in performance when the number of objects changes. We consider the generic task of bringing a specific cube out of a set to a goal position. We find that previous approaches, which primarily leverage attention and graph neural network-based architectures, do not generalize their skills when the number of input objects changes while scaling as $K^2$. We propose an alternative plug-and-play module based on relational inductive biases to overcome these limitations. Besides exceeding performances in their training environment, we show that our approach, which scales linearly in $K$, allows agents to extrapolate and generalize zero-shot to any new object number.
Submitted 31 January, 2022;
originally announced January 2022.
-
Conditional Generation of Medical Time Series for Extrapolation to Underrepresented Populations
Authors:
Simon Bing,
Andrea Dittadi,
Stefan Bauer,
Patrick Schwab
Abstract:
The widespread adoption of electronic health records (EHRs) and subsequent increased availability of longitudinal healthcare data has led to significant advances in our understanding of health and disease with direct and immediate impact on the development of new diagnostics and therapeutic treatment options. However, access to EHRs is often restricted due to their perceived sensitive nature and associated legal concerns, and the cohorts therein typically are those seen at a specific hospital or network of hospitals and therefore not representative of the wider population of patients. Here, we present HealthGen, a new approach for the conditional generation of synthetic EHRs that maintains an accurate representation of real patient characteristics, temporal information and missingness patterns. We demonstrate experimentally that HealthGen generates synthetic cohorts that are significantly more faithful to real patient EHRs than the current state-of-the-art, and that augmenting real data sets with conditionally generated cohorts of underrepresented subpopulations of patients can significantly enhance the generalisability of models derived from these data sets to different patient populations. Synthetic conditionally generated EHRs could help increase the accessibility of longitudinal healthcare data sets and improve the generalisability of inferences made from these data sets to underrepresented populations.
Submitted 20 January, 2022;
originally announced January 2022.
-
Physical Derivatives: Computing policy gradients by physical forward-propagation
Authors:
Arash Mehrjou,
Ashkan Soleymani,
Stefan Bauer,
Bernhard Schölkopf
Abstract:
Model-free and model-based reinforcement learning are two ends of a spectrum. Learning a good policy without a dynamic model can be prohibitively expensive. Learning the dynamic model of a system can reduce the cost of learning the policy, but it can also introduce bias if it is not accurate. We propose a middle ground where instead of the transition model, the sensitivity of the trajectories with respect to the perturbation of the parameters is learned. This allows us to predict the local behavior of the physical system around a set of nominal policies without knowing the actual model. We assay our method on a custom-built physical robot in extensive experiments and show the feasibility of the approach in practice. We investigate potential challenges when applying our method to physical systems and propose solutions to each of them.
Submitted 15 January, 2022;
originally announced January 2022.
-
Interplay between risk perception, behaviour, and COVID-19 spread
Authors:
Philipp Dönges,
Joel Wagner,
Sebastian Contreras,
Emil Iftekhar,
Simon Bauer,
Sebastian B. Mohr,
Jonas Dehning,
André Calero Valdez,
Mirjam Kretzschmar,
Michael Mäs,
Kai Nagel,
Viola Priesemann
Abstract:
Pharmaceutical and non-pharmaceutical interventions (NPIs) have been crucial for controlling COVID-19. They are complemented by voluntary health-protective behaviour, building a complex interplay between risk perception, behaviour, and disease spread. We studied how voluntary health-protective behaviour and vaccination willingness impact the long-term dynamics. We analysed how different levels of mandatory NPIs determine how individuals use their leeway for voluntary actions. If mandatory NPIs are too weak, COVID-19 incidence will surge, implying high morbidity and mortality before individuals react; if they are too strong, one expects a rebound wave once restrictions are lifted, challenging the transition to endemicity. Conversely, moderate mandatory NPIs give individuals time and room to adapt their level of caution, mitigating disease spread effectively. When complemented with high vaccination rates, this also offers a robust way to limit the impacts of the Omicron variant of concern. Altogether, our work highlights the importance of appropriate mandatory NPIs to maximise the impact of individual voluntary actions in pandemic control.
Submitted 11 March, 2022; v1 submitted 22 December, 2021;
originally announced December 2021.
-
A Rule-Based Epidemiological Modelling Framework
Authors:
David Alonso,
Steffen Bauer,
Markus Kirkilionis,
Lisa Maria Kreusser,
Luca Sbano
Abstract:
Motivated by chemical reaction rules, we introduce a rule-based epidemiological framework for the systematic mathematical modelling of future pandemics. Here we stress that we do not have a specific model in mind, but a whole collection of models which can be transformed into each other, or represent different aspects of a pandemic, and these aspects can change during the course of the emergency, as happened during the Covid-19 pandemic. As conditions for outbreaks in the modern world change on different time-scales, some rapidly, epidemiology has few 'laws', besides perhaps the fundamental infection process described by Kermack-McKendrick. Each model in this collection, which we call the framework, is based on a mathematical formulation that we call a rule-based system. Rule-based systems have several advantages; for example, they can be interpreted both stochastically and deterministically without changing the model structure. They should also be easier to communicate to non-specialists than differential equations. Due to their combinatorial nature, the rule-based model framework we propose is ideal for systematic mathematical modelling, systematic links to statistics, data analysis in general and also machine learning leading to artificial intelligence.
Submitted 22 May, 2024; v1 submitted 14 November, 2021;
originally announced November 2021.
-
GeneDisco: A Benchmark for Experimental Design in Drug Discovery
Authors:
Arash Mehrjou,
Ashkan Soleymani,
Andrew Jesson,
Pascal Notin,
Yarin Gal,
Stefan Bauer,
Patrick Schwab
Abstract:
In vitro cellular experimentation with genetic interventions, using for example CRISPR technologies, is an essential step in early-stage drug discovery and target validation that serves to assess initial hypotheses about causal associations between biological mechanisms and disease pathologies. With billions of potential hypotheses to test, the experimental design space for in vitro genetic experiments is extremely vast, and the available experimental capacity - even at the largest research institutions in the world - pales in relation to the size of this biological hypothesis space. Machine learning methods, such as active and reinforcement learning, could aid in optimally exploring the vast biological space by integrating prior knowledge from various information sources as well as extrapolating to yet unexplored areas of the experimental design space based on available data. However, there exist no standardised benchmarks and data sets for this challenging task and little research has been conducted in this area to date. Here, we introduce GeneDisco, a benchmark suite for evaluating active learning algorithms for experimental design in drug discovery. GeneDisco contains a curated set of multiple publicly available experimental data sets as well as open-source implementations of state-of-the-art active learning policies for experimental design and exploration.
Submitted 22 October, 2021;
originally announced October 2021.