Showing 1–31 of 31 results for author: Theis, F

  1. arXiv:2401.08868  [pdf, other]

    cs.CV

    B-Cos Aligned Transformers Learn Human-Interpretable Features

    Authors: Manuel Tran, Amal Lahiani, Yashin Dicente Cid, Melanie Boxberg, Peter Lienemann, Christian Matek, Sophia J. Wagner, Fabian J. Theis, Eldad Klaiman, Tingying Peng

    Abstract: Vision Transformers (ViTs) and Swin Transformers (Swin) are currently state-of-the-art in computational pathology. However, domain experts are still reluctant to use these models due to their lack of interpretability. This is not surprising, as critical decisions need to be transparent and understandable. The most common approach to understanding transformers is to visualize their attention. Howev…

    Submitted 18 January, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

    Comments: Accepted at MICCAI 2023 (oral). Camera-ready available at https://doi.org/10.1007/978-3-031-43993-3_50

  2. arXiv:2311.15100  [pdf, other]

    cs.CV cs.AI cs.LG

    Unbalancedness in Neural Monge Maps Improves Unpaired Domain Translation

    Authors: Luca Eyring, Dominik Klein, Théo Uscidda, Giovanni Palla, Niki Kilbertus, Zeynep Akata, Fabian Theis

    Abstract: In optimal transport (OT), a Monge map is known as a mapping that transports a source distribution to a target distribution in the most cost-efficient way. Recently, multiple neural estimators for Monge maps have been developed and applied in diverse unpaired domain translation tasks, e.g. in single-cell biology and computer vision. However, the classic OT framework enforces mass conservation, whi…

    Submitted 11 March, 2024; v1 submitted 25 November, 2023; originally announced November 2023.

    Comments: ICLR 2024

  3. arXiv:2311.07621  [pdf, other]

    q-bio.GN cs.LG

    To Transformers and Beyond: Large Language Models for the Genome

    Authors: Micaela E. Consens, Cameron Dufault, Michael Wainberg, Duncan Forster, Mehran Karimzadeh, Hani Goodarzi, Fabian J. Theis, Alan Moses, Bo Wang

    Abstract: In the rapidly evolving landscape of genomics, deep learning has emerged as a useful tool for tackling complex computational challenges. This review focuses on the transformative role of Large Language Models (LLMs), which are mostly based on the transformer architecture, in genomics. Building on the foundation of traditional convolutional neural networks and recurrent neural networks, we explore…

    Submitted 12 November, 2023; originally announced November 2023.

  4. arXiv:2311.02455  [pdf, other]

    cs.LG q-bio.GN q-bio.QM stat.AP

    Mixed Models with Multiple Instance Learning

    Authors: Jan P. Engelmann, Alessandro Palma, Jakub M. Tomczak, Fabian J. Theis, Francesco Paolo Casale

    Abstract: Predicting patient features from single-cell data can help identify cellular states implicated in health and disease. Linear models and average cell type expressions are typically favored for this task for their efficiency and robustness, but they overlook the rich cell heterogeneity inherent in single-cell data. To address this gap, we introduce MixMIL, a framework integrating Generalized Linear…

    Submitted 8 March, 2024; v1 submitted 4 November, 2023; originally announced November 2023.

    Comments: AISTATS 2024 Oral, Code: https://github.com/AIH-SGML/MixMIL

  5. arXiv:2310.14935  [pdf]

    cs.LG q-bio.GN

    Causal machine learning for single-cell genomics

    Authors: Alejandro Tejada-Lapuerta, Paul Bertin, Stefan Bauer, Hananeh Aliee, Yoshua Bengio, Fabian J. Theis

    Abstract: Advances in single-cell omics allow for unprecedented insights into the transcription profiles of individual cells. When combined with large-scale perturbation screens, through which specific biological mechanisms can be targeted, these technologies allow for measuring the effect of targeted perturbations on the whole transcriptome. These advances provide an opportunity to better understand the ca…

    Submitted 23 October, 2023; originally announced October 2023.

    Comments: 35 pages, 7 figures, 3 tables, 1 box

  6. arXiv:2310.09254  [pdf, other]

    stat.ML cs.LG

    Entropic (Gromov) Wasserstein Flow Matching with GENOT

    Authors: Dominik Klein, Théo Uscidda, Fabian Theis, Marco Cuturi

    Abstract: Optimal transport (OT) theory has reshaped the field of generative modeling: Combined with neural networks, recent \textit{Neural OT} (N-OT) solvers use OT as an inductive bias, to focus on ``thrifty'' mappings that minimize average displacement costs. This core principle has fueled the successful application of N-OT solvers to high-stakes scientific challenges, notably single-cell genomics. N-OT…

    Submitted 12 March, 2024; v1 submitted 13 October, 2023; originally announced October 2023.

  7. arXiv:2307.00558  [pdf, other]

    cs.LG q-bio.QM

    Conditionally Invariant Representation Learning for Disentangling Cellular Heterogeneity

    Authors: Hananeh Aliee, Ferdinand Kapl, Soroor Hediyeh-Zadeh, Fabian J. Theis

    Abstract: This paper presents a novel approach that leverages domain variability to learn representations that are conditionally invariant to unwanted variability or distractors. Our approach identifies both spurious and invariant latent features necessary for achieving accurate reconstruction by placing distinct conditional priors on latent features. The invariant signals are disentangled from noise by enf…

    Submitted 2 July, 2023; originally announced July 2023.

  8. arXiv:2306.17246  [pdf, other]

    cs.LG q-bio.BM stat.AP

    The power of motifs as inductive bias for learning molecular distributions

    Authors: Johanna Sommer, Leon Hetzel, David Lüdke, Fabian Theis, Stephan Günnemann

    Abstract: Machine learning for molecules holds great potential for efficiently exploring the vast chemical space and thus streamlining the drug discovery process by facilitating the design of new therapeutic molecules. Deep generative models have shown promising results for molecule generation, but the benefits of specific inductive biases for learning distributions over small graphs are unclear. Our study…

    Submitted 4 April, 2023; originally announced June 2023.

    Comments: Accepted for publication at the MLDD workshop, ICLR 2023

  9. arXiv:2305.19303  [pdf, other]

    physics.chem-ph cs.LG

    MAGNet: Motif-Agnostic Generation of Molecules from Shapes

    Authors: Leon Hetzel, Johanna Sommer, Bastian Rieck, Fabian Theis, Stephan Günnemann

    Abstract: Recent advances in machine learning for molecules exhibit great potential for facilitating drug discovery from in silico predictions. Most models for molecule generation rely on the decomposition of molecules into frequently occurring substructures (motifs), from which they generate novel compounds. While motif representations greatly aid in learning molecular distributions, such methods struggle…

    Submitted 7 November, 2023; v1 submitted 30 May, 2023; originally announced May 2023.

  10. arXiv:2305.14243  [pdf, other]

    cs.AI cs.CV

    Training Transitive and Commutative Multimodal Transformers with LoReTTa

    Authors: Manuel Tran, Yashin Dicente Cid, Amal Lahiani, Fabian J. Theis, Tingying Peng, Eldad Klaiman

    Abstract: Training multimodal foundation models is challenging due to the limited availability of multimodal datasets. While many public datasets pair images with text, few combine images with audio or text with audio. Even rarer are datasets that align all three modalities at once. Critical domains such as healthcare, infrastructure, or transportation are particularly affected by missing modalities. This m…

    Submitted 16 January, 2024; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: Accepted at NeurIPS 2023 (poster). Camera-ready version

  11. arXiv:2211.03793  [pdf, other]

    q-bio.GN cs.LG q-bio.QM stat.AP

    Uncertainty Quantification for Atlas-Level Cell Type Transfer

    Authors: Jan Engelmann, Leon Hetzel, Giovanni Palla, Lisa Sikkema, Malte Luecken, Fabian Theis

    Abstract: Single-cell reference atlases are large-scale, cell-level maps that capture cellular heterogeneity within an organ using single cell genomics. Given their size and cellular diversity, these atlases serve as high-quality training data for the transfer of cell type labels to new datasets. Such label transfer, however, must be robust to domain shifts in gene expression due to measurement technique, l…

    Submitted 7 November, 2022; originally announced November 2022.

    Comments: Workshop paper at the 2022 ICML Workshop on Computational Biology

  12. arXiv:2210.14672  [pdf, other]

    cs.LG

    Sparsity in Continuous-Depth Neural Networks

    Authors: Hananeh Aliee, Till Richter, Mikhail Solonin, Ignacio Ibarra, Fabian Theis, Niki Kilbertus

    Abstract: Neural Ordinary Differential Equations (NODEs) have proven successful in learning dynamical systems in terms of accurately recovering the observed trajectories. While different types of sparsity have been proposed to improve robustness, the generalization properties of NODEs for dynamical systems beyond the observed data are underexplored. We systematically study the influence of weight and featur…

    Submitted 26 October, 2022; originally announced October 2022.

    Comments: NeurIPS 2022

  13. Noise transfer for unsupervised domain adaptation of retinal OCT images

    Authors: Valentin Koch, Olle Holmberg, Hannah Spitzer, Johannes Schiefelbein, Ben Asani, Michael Hafner, Fabian J Theis

    Abstract: Optical coherence tomography (OCT) imaging from different camera devices causes challenging domain shifts and can cause a severe drop in accuracy for machine learning models. In this work, we introduce a minimal noise adaptation method based on a singular value decomposition (SVDNA) to overcome the domain gap between target domains from three different device manufacturers in retinal OCT imaging.…

    Submitted 16 September, 2022; originally announced September 2022.

    Comments: published at MICCAI 2022

  14. arXiv:2205.11096  [pdf, other]

    eess.IV cs.CV cs.LG

    FedNorm: Modality-Based Normalization in Federated Learning for Multi-Modal Liver Segmentation

    Authors: Tobias Bernecker, Annette Peters, Christopher L. Schlett, Fabian Bamberg, Fabian Theis, Daniel Rueckert, Jakob Weiß, Shadi Albarqouni

    Abstract: Given the high incidence and effective treatment options for liver diseases, they are of great socioeconomic importance. One of the most common methods for analyzing CT and MRI images for diagnosis and follow-up treatment is liver segmentation. Recent advances in deep learning have demonstrated encouraging results for automatic liver segmentation. Despite this, their success depends primarily on t…

    Submitted 23 May, 2022; originally announced May 2022.

    Comments: Under Review

  15. arXiv:2205.07110  [pdf, other]

    cs.LG q-bio.QM

    SystemMatch: optimizing preclinical drug models to human clinical outcomes via generative latent-space matching

    Authors: Scott Gigante, Varsha G. Raghavan, Amanda M. Robinson, Robert A. Barton, Adeeb H. Rahman, Drausin F. Wulsin, Jacques Banchereau, Noam Solomon, Luis F. Voloch, Fabian J. Theis

    Abstract: Translating the relevance of preclinical models ($\textit{in vitro}$, animal models, or organoids) to their relevance in humans presents an important challenge during drug development. The rising abundance of single-cell genomic data from human tumors and tissue offers a new opportunity to optimize model systems by their similarity to targeted human cell types in disease. In this work, we introduc…

    Submitted 14 May, 2022; originally announced May 2022.

    Comments: Published at the MLDD workshop, ICLR 2022

  16. arXiv:2204.13545  [pdf, other]

    cs.LG q-bio.GN stat.AP stat.ML

    Predicting Cellular Responses to Novel Drug Perturbations at a Single-Cell Resolution

    Authors: Leon Hetzel, Simon Böhm, Niki Kilbertus, Stephan Günnemann, Mohammad Lotfollahi, Fabian Theis

    Abstract: Single-cell transcriptomics enabled the study of cellular heterogeneity in response to perturbations at the resolution of individual cells. However, scaling high-throughput screens (HTSs) to measure cellular responses for many drugs remains a challenge due to technical limitations and, more importantly, the cost of such multiplexed experiments. Thus, transferring information from routinely perform…

    Submitted 30 December, 2022; v1 submitted 28 April, 2022; originally announced April 2022.

    Comments: 10 pages. NeurIPS 2022 conference paper

  17. arXiv:2106.12430  [pdf, other]

    cs.LG cs.AI

    Beyond Predictions in Neural ODEs: Identification and Interventions

    Authors: Hananeh Aliee, Fabian J. Theis, Niki Kilbertus

    Abstract: Spurred by tremendous success in pattern matching and prediction tasks, researchers increasingly resort to machine learning to aid original scientific discovery. Given large amounts of observational data about a system, can we uncover the rules that govern its evolution? Solving this task holds the great promise of fully understanding the causal interactions and being able to make reliable predict…

    Submitted 23 June, 2021; originally announced June 2021.

  18. arXiv:2104.11364  [pdf]

    q-bio.OT cs.CY

    A field guide to cultivating computational biology

    Authors: Anne E Carpenter, Casey S Greene, Piero Carninci, Benilton S Carvalho, Michiel de Hoon, Stacey Finley, Kim-Anh Le Cao, Jerry SH Lee, Luigi Marchionni, Suzanne Sindi, Fabian J Theis, Gregory P Way, Jean YH Yang, Elana J Fertig

    Abstract: Biomedical research centers can empower basic discovery and novel therapeutic strategies by leveraging their large-scale datasets from experiments and patients. This data, together with new technologies to create and analyze it, has ushered in an era of data-driven discovery which requires moving beyond the traditional individual, single-discipline investigator research model. This interdisciplina…

    Submitted 22 April, 2021; originally announced April 2021.

  19. arXiv:1910.01791  [pdf, other]

    cs.LG eess.IV q-bio.CB q-bio.GN stat.ML

    Conditional out-of-sample generation for unpaired data using trVAE

    Authors: Mohammad Lotfollahi, Mohsen Naghipourfar, Fabian J. Theis, F. Alexander Wolf

    Abstract: While generative models have shown great success in generating high-dimensional samples conditional on low-dimensional descriptors (learning e.g. stroke thickness in MNIST, hair color in CelebA, or speaker identity in Wavenet), their generation out-of-sample poses fundamental problems. The conditional variational autoencoder (CVAE) as a simple conditional generative model does not explicitly relat…

    Submitted 30 October, 2019; v1 submitted 3 October, 2019; originally announced October 2019.

    Comments: Added reference to Johansson et al. (2016) and removed sentences from Lopez et al. (2018) in the background section (see acknowledgements)

  20. arXiv:1909.12550  [pdf]

    q-bio.GN q-bio.MN q-bio.PE

    Single-cell eQTLGen Consortium: a personalized understanding of disease

    Authors: Monique G. P. van der Wijst, Dylan H. de Vries, Hilde E. Groot, Gosia Trynka, Chung-Chau Hon, Martijn C. Nawijn, Youssef Idaghdour, Pim van der Harst, Chun J. Ye, Joseph Powell, Fabian J. Theis, Ahmed Mahfouz, Matthias Heinig, Lude Franke

    Abstract: In recent years, functional genomics approaches combining genetic information with bulk RNA-sequencing data have identified the downstream expression effects of disease-associated genetic risk factors through so-called expression quantitative trait locus (eQTL) analysis. Single-cell RNA-sequencing creates enormous opportunities for mapping eQTLs across different cell types and in dynamic processes…

    Submitted 27 September, 2019; originally announced September 2019.

    Comments: 26 pages, 5 figures, position paper of sc-eQTLGen consortium

  21. arXiv:1810.04281  [pdf, other]

    stat.AP q-bio.QM

    Fully integrative data analysis of NMR metabolic fingerprints with comprehensive patient data: a case report based on the German Chronic Kidney Disease (GCKD) study

    Authors: Helena U. Zacharias, Michael Altenbuchinger, Stefan Solbrig, Andreas Schäfer, Mustafa Buyukozkan, Ulla T. Schultheiß, Fruzsina Kotsis, Anna Köttgen, Jan Krumsiek, Fabian J. Theis, Rainer Spang, Peter J. Oefner, Wolfram Gronwald, GCKD study investigators

    Abstract: Omics data facilitate the gain of novel insights into the pathophysiology of diseases and, consequently, their diagnosis, treatment, and prevention. To that end, it is necessary to integrate omics data with other data types such as clinical, phenotypic, and demographic parameters of categorical or continuous nature. Here, we exemplify this data integration issue for a study on chronic kidney disea…

    Submitted 8 October, 2018; originally announced October 2018.

  22. arXiv:1608.07616  [pdf]

    cs.CV

    Mitosis Detection in Intestinal Crypt Images with Hough Forest and Conditional Random Fields

    Authors: Gerda Bortsova, Michael Sterr, Lichao Wang, Fausto Milletari, Nassir Navab, Anika Böttcher, Heiko Lickert, Fabian Theis, Tingying Peng

    Abstract: Intestinal enteroendocrine cells secrete hormones that are vital for the regulation of glucose metabolism but their differentiation from intestinal stem cells is not fully understood. Asymmetric stem cell divisions have been linked to intestinal stem cell homeostasis and secretory fate commitment. We monitored cell divisions using 4D live cell imaging of cultured intestinal crypts to characterize…

    Submitted 26 August, 2016; originally announced August 2016.

    Comments: Accepted at the 7th International Conference on Machine Learning in Medical Imaging

  23. arXiv:1511.01658  [pdf, other]

    math.OC q-bio.MN

    A simulation-based approach for solving optimisation problems with ODE-type steady state constraints

    Authors: Anna Fiedler, Fabian J. Theis, Jan Hasenauer

    Abstract: Ordinary differential equations (ODEs) are widely used to model biological, (bio-)chemical and technical processes. The parameters of these ODEs are often estimated from experimental data using ODE-constrained optimisation. This article proposes a simple simulation-based approach for solving optimisation problems with steady state constraints relying on an ODE. This simulation-based optimisation m…

    Submitted 5 November, 2015; originally announced November 2015.

    Comments: 11 pages, 3 figures

  24. arXiv:1506.06392  [pdf, other]

    q-bio.MN q-bio.QM

    Data-driven modelling of biological multi-scale processes

    Authors: Jan Hasenauer, Nick Jagiella, Sabrina Hross, Fabian J. Theis

    Abstract: Biological processes involve a variety of spatial and temporal scales. A holistic understanding of many biological processes therefore requires multi-scale models which capture the relevant properties on all these scales. In this manuscript we review mathematical modelling approaches used to describe the individual spatial scales and how they are integrated into holistic models. We discuss the rel…

    Submitted 21 June, 2015; originally announced June 2015.

    Comments: This manuscript will appear in the Journal of Coupled Systems and Multiscale Dynamics (American Scientific Publishers)

    MSC Class: 92Bxx; 93A30

  25. arXiv:1407.2112  [pdf]

    cs.GR cs.HC q-bio.QM

    MCA: Multiresolution Correlation Analysis, a graphical tool for subpopulation identification in single-cell gene expression data

    Authors: Justin Feigelman, Fabian J. Theis, Carsten Marr

    Abstract: Background: Biological data often originate from samples containing mixtures of subpopulations, corresponding e.g. to distinct cellular phenotypes. However, identification of distinct subpopulations may be difficult if biological measurements yield distributions that are not easily separable. Results: We present Multiresolution Correlation Analysis (MCA), a method for visually identifying subpopul…

    Submitted 8 July, 2014; originally announced July 2014.

    Comments: BioVis 2014 conference

  26. Separation of uncorrelated stationary time series using autocovariance matrices

    Authors: Jari Miettinen, Katrin Illner, Klaus Nordhausen, Hannu Oja, Sara Taskinen, Fabian J. Theis

    Abstract: Blind source separation (BSS) is a signal processing tool, which is widely used in various fields. Examples include biomedical signal separation, brain imaging and economic time series applications. In BSS, one assumes that the observed $p$ time series are linear combinations of $p$ latent uncorrelated weakly stationary time series. The aim is then to find an estimate for an unmixing matrix, which…

    Submitted 14 May, 2014; originally announced May 2014.

    MSC Class: 62H05; 62H10

    Journal ref: Journal of Time Series Analysis, Vol 37, 337-354 (2016)

  27. arXiv:1202.4605  [pdf, other]

    physics.data-an

    Joining Forces of Bayesian and Frequentist Methodology: A Study for Inference in the Presence of Non-Identifiability

    Authors: Andreas Raue, Clemens Kreutz, Fabian Joachim Theis, Jens Timmer

    Abstract: Increasingly complex applications involve large datasets in combination with non-linear and high dimensional mathematical models. In this context, statistical inference is a challenging issue that calls for pragmatic approaches that take advantage of both Bayesian and frequentist methods. The elegance of Bayesian methodology is founded in the propagation of information content provided by experime…

    Submitted 21 February, 2012; originally announced February 2012.

    Comments: Article to appear in Phil. Trans. Roy. Soc. A

    Journal ref: Phil. Trans. R. Soc. A. 371, 20110544, 2013

  28. Stability and multi-attractor dynamics of a toggle switch based on a two-stage model of stochastic gene expression

    Authors: Michael K. Strasser, Fabian J. Theis, Carsten Marr

    Abstract: A toggle switch consists of two genes that mutually repress each other. This regulatory motif is active during cell differentiation and is thought to act as a memory device, being able to choose and maintain cell fate decisions. In this contribution, we study the stability and dynamics of a two-stage gene expression switch within a probabilistic framework inspired by the properties of the Pu/Gata…

    Submitted 1 December, 2011; originally announced December 2011.

    Comments: to appear in the Biophysical Journal

  29. arXiv:1104.1200  [pdf, other]

    physics.soc-ph cs.SI

    Modularity maximization and tree clustering: Novel ways to determine effective geographic borders

    Authors: Daniel Grady, Rafael Brune, Christian Thiemann, Fabian Theis, Dirk Brockmann

    Abstract: Territorial subdivisions and geographic borders are essential for understanding phenomena in sociology, political science, history, and economics. They influence the interregional flow of information and cross-border trade and affect the diffusion of innovation and technology. However, most existing administrative borders were determined by a variety of historic and political circumstances along w…

    Submitted 6 April, 2011; originally announced April 2011.

  30. Patterns of subnet usage reveal distinct scales of regulation in the transcriptional regulatory network of Escherichia coli

    Authors: Carsten Marr, Fabian J. Theis, Larry S. Liebovitch, Marc-Thorsten Hütt

    Abstract: The set of regulatory interactions between genes, mediated by transcription factors, forms a species' transcriptional regulatory network (TRN). By comparing this network with measured gene expression data one can identify functional properties of the TRN and gain general insight into transcriptional control. We define the subnet of a node as the subgraph consisting of all nodes topologically downs…

    Submitted 24 May, 2010; originally announced May 2010.

    Comments: 14 pages, 8 figures, to be published in PLoS Computational Biology

  31. arXiv:1001.0943  [pdf]

    physics.soc-ph cond-mat.stat-mech

    The structure of borders in a small world

    Authors: C. Thiemann, F. Theis, D. Grady, R. Brune, D. Brockmann

    Abstract: Geographic borders are not only essential for the effective functioning of government, the distribution of administrative responsibilities and the allocation of public resources, they also influence the interregional flow of information, cross-border trade operations, the diffusion of innovation and technology, and the spatial spread of infectious diseases. However, as growing interactions and m…

    Submitted 6 January, 2010; originally announced January 2010.

    Comments: 9 pages