CausalPlayground: Addressing Data-Generation Requirements in Cutting-Edge Causality Research

Authors: Andreas W M Sauter, Erman Acar, Aske Plaat

Abstract: Research on causal effects often relies on synthetic data due to the scarcity of real-world datasets with ground-truth effects. Since current data-generating tools do not always meet all requirements for state-of-the-art research, ad-hoc methods are often employed. This leads to heterogeneity among datasets and delays research progress. We address the shortcomings of current data-generating libraries by introducing CausalPlayground, a Python library that provides a standardized platform for generating, sampling, and sharing structural causal models (SCMs). CausalPlayground offers fine-grained control over SCMs, interventions, and the generation of datasets of SCMs for learning and quantitative research. Furthermore, by integrating with Gymnasium, the standard framework for reinforcement learning (RL) environments, we enable online interaction with the SCMs. Overall, by introducing CausalPlayground we aim to foster more efficient and comparable research in the field. All code and API documentation is available at https://github.com/sa-and/CausalPlayground.

Submitted 21 May, 2024; originally announced May 2024.

arXiv:2404.11208 [pdf, other]

CAGE: Causality-Aware Shapley Value for Global Explanations

Authors: Nils Ole Breuer, Andreas Sauter, Majid Mohammadi, Erman Acar

Abstract: As Artificial Intelligence (AI) is having more influence on our everyday lives, it becomes important that AI-based decisions are transparent and explainable. As a consequence, the field of eXplainable AI (or XAI) has become popular in recent years. One way to explain AI models is to elucidate the predictive importance of the input features for the AI model in general, also referred to as global explanations. Inspired by cooperative game theory, Shapley values offer a convenient way of quantifying feature importance as explanations. However, many methods based on Shapley values are built on the assumption of feature independence and often overlook causal relations among the features, which could impact their importance for the ML model. Inspired by studies of explanations at the local level, we propose CAGE (Causally-Aware Shapley Values for Global Explanations). In particular, we introduce a novel sampling procedure for out-coalition features that respects the causal relations of the input features. We derive a practical approach that incorporates causal knowledge into global explanation and offers the possibility to interpret the predictive feature importance considering their causal relation. We evaluate our method on synthetic data and real-world data. The results suggest that the explanations from our approach are not only more intuitive but also more faithful compared to previous global explanation methods.

Submitted 17 April, 2024; originally announced April 2024.

arXiv:2403.04499 [pdf, ps, other]

Pressure-improved Scott-Vogelius type elements

Authors: Nis-Erik Bohne, Benedikt Gräßle, Stefan A. Sauter

Abstract: The Scott-Vogelius element is a popular finite element for the discretization of the Stokes equations which enjoys inf-sup stability and gives divergence-free velocity approximation. However, it is well known that the convergence rates for the discrete pressure deteriorate in the presence of certain $critical$ $vertices$ in a triangulation of the domain. Modifications of the Scott-Vogelius element such as the recently introduced pressure-wired Stokes element also suffer from this effect. In this paper we introduce a simple modification strategy for these pressure spaces that preserves the inf-sup stability while the pressure converges at an optimal rate.

Submitted 7 March, 2024; originally announced March 2024.

Comments: 26 pages, 4 figures

MSC Class: 65N30; 65N12; 76D07

arXiv:2401.16974 [pdf, other]

CORE: Towards Scalable and Efficient Causal Discovery with Reinforcement Learning

Authors: Andreas W. M. Sauter, Nicolò Botteghi, Erman Acar, Aske Plaat

Abstract: Causal discovery is the challenging task of inferring causal structure from data. Motivated by Pearl's Causal Hierarchy (PCH), which tells us that passive observations alone are not enough to distinguish correlation from causation, there has been a recent push to incorporate interventions into machine learning research. Reinforcement learning provides a convenient framework for such an active approach to learning. This paper presents CORE, a deep reinforcement learning-based approach for causal discovery and intervention planning. CORE learns to sequentially reconstruct causal graphs from data while learning to perform informative interventions. Our results demonstrate that CORE generalizes to unseen graphs and efficiently uncovers causal structures. Furthermore, CORE scales to larger graphs with up to 10 variables and outperforms existing approaches in structure estimation accuracy and sample efficiency. All relevant code and supplementary material can be found at https://github.com/sa-and/CORE

Submitted 30 January, 2024; originally announced January 2024.

Comments: To be published in Proc. of the 23rd International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2024), Auckland, New Zealand, May 6 - 10, 2024, IFAAMAS

ACM Class: I.2.6; I.2.8

arXiv:2311.10590 [pdf, other]

EduGym: An Environment and Notebook Suite for Reinforcement Learning Education

Authors: Thomas M. Moerland, Matthias Müller-Brockhausen, Zhao Yang, Andrius Bernatavicius, Koen Ponse, Tom Kouwenhoven, Andreas Sauter, Michiel van der Meer, Bram Renting, Aske Plaat

Abstract: Due to the empirical success of reinforcement learning, an increasing number of students study the subject. However, from our practical teaching experience, we see students entering the field (bachelor, master and early PhD) often struggle. On the one hand, textbooks and (online) lectures provide the fundamentals, but students find it hard to translate between equations and code. On the other hand, public codebases do provide practical examples, but the implemented algorithms tend to be complex, and the underlying test environments contain multiple reinforcement learning challenges at once. Although this is realistic from a research perspective, it often hinders educational conceptual understanding. To solve this issue we introduce EduGym, a set of educational reinforcement learning environments and associated interactive notebooks tailored for education. Each EduGym environment is specifically designed to illustrate a certain aspect/challenge of reinforcement learning (e.g., exploration, partial observability, stochasticity, etc.), while the associated interactive notebook explains the challenge and its possible solution approaches, connecting equations and code in a single document. An evaluation among RL students and researchers shows 86% of them think EduGym is a useful tool for reinforcement learning education. All notebooks are available from https://www.edugym.org/, while the full software package can be installed from https://github.com/RLG-Leiden/edugym.

Submitted 22 February, 2024; v1 submitted 17 November, 2023; originally announced November 2023.

arXiv:2311.06815 [pdf]

doi 10.2196/50865

Evaluation of GPT-4 for chest X-ray impression generation: A reader study on performance and perception

Authors: Sebastian Ziegelmayer, Alexander W. Marka, Nicolas Lenhart, Nadja Nehls, Stefan Reischl, Felix Harder, Andreas Sauter, Marcus Makowski, Markus Graf, Joshua Gawlitza

Abstract: The remarkable generative capabilities of multimodal foundation models are currently being explored for a variety of applications. Generating radiological impressions is a challenging task that could significantly reduce the workload of radiologists. In our study we explored and analyzed the generative abilities of GPT-4 for chest X-ray impression generation. To generate and evaluate impressions of chest X-rays based on different input modalities (image, text, text and image), a blinded radiological report was written for 25 cases of the publicly available NIH dataset. GPT-4 was given the image, the findings section, or both sequentially to generate an input-dependent impression. In a blind randomized reading, 4 radiologists rated the impressions and were asked to classify the impression origin (Human, AI), providing justification for their decision. Lastly, text model evaluation metrics and their correlation with the radiological score (summation of the 4 dimensions) were assessed. According to the radiological score, the human-written impression was rated highest, although not significantly different from text-based impressions. The automated evaluation metrics showed moderate to substantial correlations with the radiological score for the image impressions; however, individual scores were highly divergent among inputs, indicating insufficient representation of radiological quality. Detection of AI-generated impressions varied by input and was 61% for text-based impressions. Impressions classified as AI-generated had significantly worse radiological scores even when written by a radiologist, indicating potential bias. Our study revealed significant discrepancies between a radiological assessment and common automatic evaluation metrics depending on the model input. The detection of AI-generated findings is subject to the bias that highly rated impressions are perceived as human-written.

Submitted 12 November, 2023; originally announced November 2023.

Journal ref: J Med Internet Res 2023;25:e50865

arXiv:2310.10410 [pdf, other]

Loci-Segmented: Improving Scene Segmentation Learning

Authors: Manuel Traub, Frederic Becker, Adrian Sauter, Sebastian Otte, Martin V. Butz

Abstract: Current slot-oriented approaches for compositional scene segmentation from images and videos rely on provided background information or slot assignments. We present a segmented location and identity tracking system, Loci-Segmented (Loci-s), which requires neither. It learns to dynamically segment scenes into interpretable background and slot-based object encodings, separating rgb, mask, location, and depth information for each. The results reveal largely superior video decomposition performance in the MOVi datasets and in another established dataset collection targeting scene segmentation. The system's well-interpretable, compositional latent encodings may serve as a foundation model for downstream tasks.

Submitted 6 February, 2024; v1 submitted 16 October, 2023; originally announced October 2023.

arXiv:2307.15506 [pdf, other]

doi 10.1186/s41747-024-00450-4

Improving image quality of sparse-view lung tumor CT images with U-Net

Authors: Annika Ries, Tina Dorosti, Johannes Thalhammer, Daniel Sasse, Andreas Sauter, Felix Meurer, Ashley Benne, Tobias Lasser, Franz Pfeiffer, Florian Schaff, Daniela Pfeiffer

Abstract: Background: We aimed at improving image quality (IQ) of sparse-view computed tomography (CT) images using a U-Net for lung metastasis detection and determining the best tradeoff between number of views, IQ, and diagnostic confidence. Methods: CT images from 41 subjects aged 62.8 $\pm$ 10.6 years (mean $\pm$ standard deviation), 23 men, 34 with lung metastasis, 7 healthy, were retrospectively selected (2016-2018) and forward projected onto 2,048-view sinograms. Six corresponding sparse-view CT data subsets at varying levels of undersampling were reconstructed from sinograms using filtered backprojection with 16, 32, 64, 128, 256, and 512 views. A dual-frame U-Net was trained and evaluated for each subsampling level on 8,658 images from 22 diseased subjects. A representative image per scan was selected from 19 subjects (12 diseased, 7 healthy) for a single-blinded multireader study. These slices, for all levels of subsampling, with and without U-Net postprocessing, were presented to three readers. IQ and diagnostic confidence were ranked using predefined scales. Subjective nodule segmentation was evaluated using sensitivity and Dice similarity coefficient (DSC); clustered Wilcoxon signed-rank test was used. Results: The 64-projection sparse-view images resulted in 0.89 sensitivity and 0.81 DSC, while their counterparts, postprocessed with the U-Net, had improved metrics (0.94 sensitivity and 0.85 DSC) (p = 0.400). Fewer views led to insufficient IQ for diagnosis. For increased views, no substantial discrepancies were noted between sparse-view and postprocessed images. Conclusions: Projection views can be reduced from 2,048 to 64 while maintaining IQ and the confidence of the radiologists on a satisfactory level.

Submitted 14 February, 2024; v1 submitted 28 July, 2023; originally announced July 2023.

Journal ref: Eur Radiol Exp 8, 54 (2024)

arXiv:2305.00959 [pdf, ps, other]

Skeleton Integral Equations for Acoustic Transmission Problems with Varying Coefficients

Authors: Francesco Florian, Ralf Hiptmair, Stefan A. Sauter

Abstract: In this paper we will derive a non-local (``integral'') equation which transforms a three-dimensional acoustic transmission problem with \emph{variable} coefficients, non-zero absorption, and mixed boundary conditions to a non-local equation on a ``skeleton'' of the domain $Ω\subset\mathbb{R}^{3}$, where ``skeleton'' stands for the union of the interfaces and boundaries of a Lipschitz partition of $Ω$. To that end, we introduce and analyze abstract layer potentials as solutions of auxiliary coercive full space variational problems and derive jump conditions across domain interfaces. This allows us to formulate the non-local skeleton equation as a \emph{direct method} for the unknown Cauchy data of the solution of the original partial differential equation. We establish coercivity and continuity of the variational form of the skeleton equation based on auxiliary full space variational problems. Explicit expressions for Green's functions are not required and all our estimates are \emph{explicit} in the complex wave number.

Submitted 24 February, 2024; v1 submitted 1 May, 2023; originally announced May 2023.

arXiv:2212.09673 [pdf, other]

The pressure-wired Stokes element: a mesh-robust version of the Scott-Vogelius element

Authors: Benedikt Gräßle, Nis-Erik Bohne, Stefan A. Sauter

Abstract: The Scott-Vogelius finite element pair for the numerical discretization of the stationary Stokes equation in 2D is a popular element which is based on a continuous velocity approximation of polynomial order $k$ and a discontinuous pressure approximation of order $k-1$. It employs a "singular distance" (measured by some geometric mesh quantity $ Θ\left( \mathbf{z}\right) \geq 0$ for triangle vertices $\mathbf{z}$) and imposes a local side condition on the pressure space associated to vertices $\mathbf{z}$ with $Θ\left( \mathbf{z}\right) =0$. The method is inf-sup stable for any fixed regular triangulation and $k\geq 4$. However, the inf-sup constant deteriorates if the triangulation contains nearly singular vertices $0<Θ\left( \mathbf{z}\right) \ll 1$. In this paper, we introduce a very simple parameter-dependent modification of the Scott-Vogelius element such that the inf-sup constant is independent of nearly-singular vertices. We will show by analysis and also by numerical experiments that the effect on the divergence-free condition for the discrete velocity is negligibly small.

Submitted 18 March, 2024; v1 submitted 19 December, 2022; originally announced December 2022.

MSC Class: 65N12; 65N30; 76D07

arXiv:2208.05868 [pdf]

doi 10.1148/ryai.230024

TotalSegmentator: robust segmentation of 104 anatomical structures in CT images

Authors: Jakob Wasserthal, Hanns-Christian Breit, Manfred T. Meyer, Maurice Pradella, Daniel Hinck, Alexander W. Sauter, Tobias Heye, Daniel Boll, Joshy Cyriac, Shan Yang, Michael Bach, Martin Segeroth

Abstract: We present a deep learning segmentation model that can automatically and robustly segment all major anatomical structures in body CT images. In this retrospective study, 1204 CT examinations (from the years 2012, 2016, and 2020) were used to segment 104 anatomical structures (27 organs, 59 bones, 10 muscles, 8 vessels) relevant for use cases such as organ volumetry, disease characterization, and surgical or radiotherapy planning. The CT images were randomly sampled from routine clinical studies and thus represent a real-world dataset (different ages, pathologies, scanners, body parts, sequences, and sites). The authors trained an nnU-Net segmentation algorithm on this dataset and calculated Dice similarity coefficients (Dice) to evaluate the model's performance. The trained algorithm was applied to a second dataset of 4004 whole-body CT examinations to investigate age dependent volume and attenuation changes. The proposed model showed a high Dice score (0.943) on the test set, which included a wide range of clinical data with major pathologies. The model significantly outperformed another publicly available segmentation model on a separate dataset (Dice score, 0.932 versus 0.871, respectively). The aging study demonstrated significant correlations between age and volume and mean attenuation for a variety of organ groups (e.g., age and aortic volume; age and mean attenuation of the autochthonous dorsal musculature). The developed model enables robust and accurate segmentation of 104 anatomical structures. The annotated dataset (https://doi.org/10.5281/zenodo.6802613) and toolkit (https://www.github.com/wasserth/TotalSegmentator) are publicly available.

Submitted 16 June, 2023; v1 submitted 11 August, 2022; originally announced August 2022.

Comments: Accepted at Radiology: Artificial Intelligence

Journal ref: Radiol Artif Intell 2023;5(5):e230024

arXiv:2207.08457 [pdf, other]

A Meta-Reinforcement Learning Algorithm for Causal Discovery

Authors: Andreas Sauter, Erman Acar, Vincent François-Lavet

Abstract: Causal discovery is a major task with the utmost importance for machine learning since causal structures can enable models to go beyond pure correlation-based inference and significantly boost their performance. However, finding causal structures from data poses a significant challenge both in computational effort and accuracy, let alone its impossibility without interventions in general. In this paper, we develop a meta-reinforcement learning algorithm that performs causal discovery by learning to perform interventions such that it can construct an explicit causal graph. Apart from being useful for possible downstream applications, the estimated causal graph also provides an explanation for the data-generating process. In this article, we show that our algorithm estimates a good graph compared to the SOTA approaches, even in environments whose underlying causal structure is previously unseen. Further, we conduct an ablation study that shows how learning interventions contributes to the overall performance of our approach. We conclude that interventions indeed help boost the performance, efficiently yielding an accurate estimate of the causal structure of a possibly unseen environment.

Submitted 21 February, 2023; v1 submitted 18 July, 2022; originally announced July 2022.

Comments: Camera-ready version for CLEAR23

arXiv:2204.01270 [pdf, other]

The inf-sup constant for $hp$-Crouzeix-Raviart triangular elements

Authors: S. Sauter

Abstract: In this paper, we consider the discretization of the two-dimensional stationary Stokes equation by Crouzeix-Raviart elements for the velocity of polynomial order $k\geq1$ on conforming triangulations and discontinuous pressure approximations of order $k-1$. We will bound the inf-sup constant from below independent of the mesh size and show that it depends only logarithmically on $k$. Our assumptions on the mesh are very mild: for odd $k$ we require that the triangulations contain at least one inner vertex while for even $k$ we assume that the triangulations consist of more than a single triangle.

Submitted 21 December, 2022; v1 submitted 4 April, 2022; originally announced April 2022.

Comments: 46 pages, 6 figures

arXiv:2201.02602 [pdf, other]

Wavenumber-explicit hp-FEM analysis for Maxwell's equations with impedance boundary conditions

Authors: Jens M. Melenk, Stefan A. Sauter

Abstract: The time-harmonic Maxwell equations at high wavenumber k in domains with an analytic boundary and impedance boundary conditions are considered. A wavenumber-explicit stability and regularity theory is developed that decomposes the solution into a part with finite Sobolev regularity that is controlled uniformly in k and an analytic part. Using this regularity, quasi-optimality of the Galerkin discretization based on Nedelec elements of order p on a mesh with mesh size h is shown under the k-explicit scale resolution condition that a) kh/p is sufficiently small and b) p/\ln k is bounded from below.

Submitted 15 August, 2023; v1 submitted 7 January, 2022; originally announced January 2022.

Comments: 91 pages, 6 figures

MSC Class: 35J05 65N12 65N30

arXiv:2106.05763 [pdf, other]

A Deep Variational Approach to Clustering Survival Data

Authors: Laura Manduchi, Ričards Marcinkevičs, Michela C. Massi, Thomas Weikert, Alexander Sauter, Verena Gotta, Timothy Müller, Flavio Vasella, Marian C. Neidert, Marc Pfister, Bram Stieltjes, Julia E. Vogt

Abstract: In this work, we study the problem of clustering survival data $-$ a challenging and so far under-explored task. We introduce a novel semi-supervised probabilistic approach to cluster survival data by leveraging recent advances in stochastic gradient variational inference. In contrast to previous work, our proposed method employs a deep generative model to uncover the underlying distribution of both the explanatory variables and censored survival times. We compare our model to the related work on clustering and mixture models for survival data in comprehensive experiments on a wide range of synthetic, semi-synthetic, and real-world datasets, including medical imaging data. Our method performs better at identifying clusters and is competitive at predicting survival times. Relying on novel generative assumptions, the proposed model offers a holistic perspective on clustering survival data and holds a promise of discovering subpopulations whose survival is regulated by different generative mechanisms.

Submitted 10 March, 2022; v1 submitted 10 June, 2021; originally announced June 2021.

Comments: ICLR 2022

arXiv:2006.04998 [pdf]

Machine Learning Automatically Detects COVID-19 using Chest CTs in a Large Multicenter Cohort

Authors: Eduardo Jose Mortani Barbosa Jr., Bogdan Georgescu, Shikha Chaganti, Gorka Bastarrika Aleman, Jordi Broncano Cabrero, Guillaume Chabin, Thomas Flohr, Philippe Grenier, Sasa Grbic, Nakul Gupta, François Mellot, Savvas Nicolaou, Thomas Re, Pina Sanelli, Alexander W. Sauter, Youngjin Yoo, Valentin Ziebandt, Dorin Comaniciu

Abstract: Objectives: To investigate machine-learning classifiers and interpretable models using chest CT for detection of COVID-19 and differentiation from other pneumonias, ILD and normal CTs. Methods: Our retrospective multi-institutional study obtained 2096 chest CTs from 16 institutions (including 1077 COVID-19 patients). Training/testing cohorts included 927/100 COVID-19, 388/33 ILD, 189/33 other pneumonias, and 559/34 normal (no pathologies) CTs. A metric-based approach for classification of COVID-19 used interpretable features, relying on logistic regression and random forests. A deep learning-based classifier differentiated COVID-19 via 3D features extracted directly from CT attenuation and probability distribution of airspace opacities. Results: The most discriminative features of COVID-19 are percentage of airspace opacity and peripheral and basal predominant opacities, concordant with the typical characterization of COVID-19 in the literature. Unsupervised hierarchical clustering compares feature distribution across COVID-19 and control cohorts. The metrics-based classifier achieved AUC=0.83, sensitivity=0.74, and specificity=0.79, versus 0.93, 0.90, and 0.83, respectively, for the DL-based classifier. Most of the ambiguity comes from non-COVID-19 pneumonia with manifestations that overlap with COVID-19, as well as mild COVID-19 cases. Non-COVID-19 classification performance is 91% for ILD, 64% for other pneumonias and 94% for no pathologies, which demonstrates the robustness of our method against different compositions of control groups. Conclusions: Our new method accurately discriminates COVID-19 from other types of pneumonia, ILD, and no pathologies CTs, using quantitative imaging features derived from chest CT, while balancing interpretability of results and classification performance, and therefore may be useful to facilitate diagnosis of COVID-19.

Submitted 9 October, 2020; v1 submitted 8 June, 2020; originally announced June 2020.

arXiv:2004.01279 [pdf]

doi 10.1148/ryai.2020200048

Automated Quantification of CT Patterns Associated with COVID-19 from Chest CT

Authors: Shikha Chaganti, Abishek Balachandran, Guillaume Chabin, Stuart Cohen, Thomas Flohr, Bogdan Georgescu, Philippe Grenier, Sasa Grbic, Siqi Liu, François Mellot, Nicolas Murray, Savvas Nicolaou, William Parker, Thomas Re, Pina Sanelli, Alexander W. Sauter, Zhoubing Xu, Youngjin Yoo, Valentin Ziebandt, Dorin Comaniciu

Abstract: Purpose: To present a method that automatically segments and quantifies abnormal CT patterns commonly present in coronavirus disease 2019 (COVID-19), namely ground glass opacities and consolidations. Materials and Methods: In this retrospective study, the proposed method takes as input a non-contrasted chest CT and segments the lesions, lungs, and lobes in three dimensions, based on a dataset of 9749 chest CT volumes. The method outputs two combined measures of the severity of lung and lobe involvement, quantifying both the extent of COVID-19 abnormalities and presence of high opacities, based on deep learning and deep reinforcement learning. The first measure of (PO, PHO) is global, while the second of (LSS, LHOS) is lobewise. Evaluation of the algorithm is reported on CTs of 200 participants (100 COVID-19 confirmed patients and 100 healthy controls) from institutions from Canada, Europe and the United States collected between 2002-Present (April, 2020). Ground truth is established by manual annotations of lesions, lungs, and lobes. Correlation and regression analyses were performed to compare the prediction to the ground truth. Results: Pearson correlation coefficient between method prediction and ground truth for COVID-19 cases was calculated as 0.92 for PO (P < .001), 0.97 for PHO (P < .001), 0.91 for LSS (P < .001), 0.90 for LHOS (P < .001). 98 of 100 healthy controls had a predicted PO of less than 1%, 2 had between 1-2%.
Automated processing time to compute the severity scores was 10 seconds per case compared to 30 minutes required for manual annotations. Conclusion: A new method segments regions of CT abnormalities associated with COVID-19 and computes (PO, PHO), as well as (LSS, LHOS) severity scores.

Submitted 18 November, 2020; v1 submitted 2 April, 2020; originally announced April 2020.

Journal ref: Radiology: Artificial Intelligence, Vol. 2, No. 4, 2020

arXiv:1907.01738 [pdf, other]

A Stable Boundary Integral Formulation of an Acoustic Wave Transmission Problem with Mixed Boundary Conditions

Authors: Sarah Eberle, Francesco Florian, Ralf Hiptmair, Stefan A. Sauter

Abstract: In this paper, we consider an acoustic wave transmission problem with mixed boundary conditions of Dirichlet, Neumann, and impedance type. The transmission interfaces may join the domain boundary in a general way independent of the location of the boundary conditions. We will derive a formulation as a \textit{direct}, \textit{space-time retarded boundary integral equation}, where both Cauchy data are kept as unknowns on the impedance part of the boundary. This requires the definition of single-trace spaces which incorporate homogeneous Dirichlet and Neumann conditions on the corresponding parts on the boundary. We prove the continuity and coercivity of the formulation by employing the technique of operational calculus in the Laplace domain.

Submitted 6 October, 2020; v1 submitted 3 July, 2019; originally announced July 2019.

Comments: 15 pages, 1 figure

arXiv:1904.00207 [pdf, ps, other]

Wave number-Explicit Analysis for Galerkin Discretizations of Lossy Helmholtz Problems

Authors: Jens M. Melenk, Stefan A. Sauter, Céline Torres

Abstract: We present a stability and convergence theory for the lossy Helmholtz equation and its Galerkin discretization. The boundary conditions are of Robin type. All estimates are explicit with respect to the real and imaginary part of the complex wave number $ζ\in\mathbb{C}$, $\operatorname{Re}ζ\geq0$, $\left\vert ζ\right\vert \geq1$. For the extreme cases $ζ\in\operatorname*{i}\mathbb{R}$ and $ζ\in\mathbb{R}_{\geq0}$, the estimates coincide with the existing estimates in the literature and exhibit a seamless transition between these cases in the right complex half plane.

Submitted 30 March, 2019; originally announced April 2019.

Comments: 29 pages, 1 figure

arXiv:1803.00966 [pdf, ps, other]

Stability and error analysis for the Helmholtz equation with variable coefficients

Authors: I. G. Graham, S. A. Sauter

Abstract: We discuss the stability theory and numerical analysis of the Helmholtz equation with variable and possibly non-smooth or oscillatory coefficients. Using the unique continuation principle and the Fredholm alternative, we first give an existence-uniqueness result for this problem, which holds under rather general conditions on the coefficients and on the domain. Under additional assumptions, we derive estimates for the stability constant (i.e., the norm of the solution operator) in terms of the data (i.e., PDE coefficients and frequency), and we apply these estimates to obtain a new finite element error analysis for the Helmholtz equation which is valid at high frequency and with variable wave speed. The central role played by the stability constant in this theory leads us to investigate its behaviour with respect to coefficient variation in detail. We give, via a 1D analysis, an a priori bound with stability constant growing exponentially in the variance of the coefficients (wave speed and/or diffusion coefficient). Then, by means of a family of analytic examples (supplemented by numerical experiments), we show that this estimate is sharp.

Submitted 17 April, 2019; v1 submitted 2 March, 2018; originally announced March 2018.

MSC Class: 65N12 65N15 65N30

arXiv:1711.05430 [pdf, other]

doi 10.1007/s00033-018-1031-9

Stability estimate for the Helmholtz equation with rapidly jumping coefficients

Authors: Stefan Sauter, Celine Torres

Abstract: The goal of this paper is to investigate the stability of the Helmholtz equation in the high-frequency regime with non-smooth and rapidly oscillating coefficients on bounded domains. Existence and uniqueness of the problem can be proved using the unique continuation principle in Fredholm's alternative. However, this approach does not directly give a coefficient-explicit energy estimate. We present a new theoretical approach for the one-dimensional problem and find that for a new class of coefficients, including coefficients with an arbitrary number of discontinuities, the stability constant (i.e., the norm of the solution operator) is bounded by a term independent of the number of jumps. We emphasize that no periodicity of the coefficients is required. By selecting the wave speed function in a certain "resonant" way, we construct a class of oscillatory configurations, such that the stability constant grows exponentially in the frequency. This shows that our estimates are sharp.

Submitted 30 August, 2018; v1 submitted 15 November, 2017; originally announced November 2017.

Comments: a) Added references, b) rewritten the introduction with a summary of the results/techniques of the paper, c) Corrected typos

arXiv:1704.01890 [pdf, ps, other]

A Posteriori Modelling-Discretization Error Estimate for Elliptic Problems with $L^\infty$-Coefficients

Authors: M. Weymuth, S. Sauter, S. Repin

Abstract: We consider elliptic problems with complicated, discontinuous diffusion tensor $A_{\scriptscriptstyle 0}$. One of the standard approaches to numerically treat such problems is to simplify the coefficient by some approximation, say $A_{\varepsilon}$, and to use standard finite elements. In \cite{Repin2012} a combined modelling-discretization strategy has been proposed which estimates the discretization and modelling errors by a posteriori estimates of functional type. This strategy allows one to balance these two errors in a problem-adapted way. However, the estimate of the modelling error is derived under the assumption that the difference $A_{\scriptscriptstyle 0} -A_{\varepsilon}$ is bounded in the $L^{\infty}$-norm, which requires that the approximation of the coefficient matches the discontinuities of the original coefficient. Therefore this theory is not appropriate for applications with discontinuous coefficients along \textit{complicated, curved} interfaces. Based on bounds for $A_{\scriptscriptstyle 0} -A_{\varepsilon}$ in an $L^{q}$-norm with $q<\infty$ we generalize the combined modelling-discretization strategy to a larger class of coefficients.

Submitted 6 April, 2017; originally announced April 2017.

arXiv:1703.07965 [pdf, ps, other]

Convergence analysis of energy conserving explicit local time-stepping methods for the wave equation

Authors: Marcus J. Grote, Michaela Mehlin, Stefan Sauter

Abstract: Local adaptivity and mesh refinement are key to the efficient simulation of wave phenomena in heterogeneous media or complex geometry. Locally refined meshes, however, dictate a small time-step everywhere with a crippling effect on any explicit time-marching method. In [18] a leap-frog (LF) based explicit local time-stepping (LTS) method was proposed, which overcomes the severe bottleneck due to a few small elements by taking small time-steps in the locally refined region and larger steps elsewhere. Here a rigorous convergence proof is presented for the fully-discrete LTS-LF method when combined with a standard conforming finite element method (FEM) in space. Numerical results further illustrate the usefulness of the LTS-LF Galerkin FEM in the presence of corner singularities.

Submitted 23 March, 2017; originally announced March 2017.

MSC Class: 65M12; 65M20; 65M60; 65L06; 65L20

arXiv:1703.03224 [pdf, ps, other]

A Family of Crouzeix-Raviart Finite Elements in 3D

Authors: Patrick Ciarlet Jr., Charles F. Dunkl, Stefan A. Sauter

Abstract: In this paper we will develop a family of non-conforming "Crouzeix-Raviart" type finite elements in three dimensions. They consist of local polynomials of maximal degree $p\in\mathbb{N}$ on simplicial finite element meshes while certain jump conditions are imposed across adjacent simplices. We will prove optimal a priori estimates for these finite elements. The characterization of this space via jump conditions is implicit and the derivation of a local basis requires some deeper theoretical tools from orthogonal polynomials on triangles and their representation. We will derive these tools for this purpose. These results allow us to give explicit representations of the local basis functions. Finally we will analyze the linear independence of these sets of functions and discuss the question whether they span the whole non-conforming space.

Submitted 9 March, 2017; originally announced March 2017.

Comments: 29 figures

MSC Class: Primary 33C45; 33C50; 65N12; 65N30; Secondary 33C80

arXiv:1612.01285 [pdf, ps, other]

A Fully Discrete Galerkin Method for Abel-type Integral Equations

Authors: Urs Vögeli, Khadijeh Nedaiasl, Stefan A. Sauter

Abstract: In this paper, we present a Galerkin method for Abel-type integral equations with a general class of kernels. Stability and quasi-optimal convergence estimates are derived in fractional-order Sobolev norms. The fully-discrete Galerkin method is defined by employing simple tensor-Gauss quadrature. We develop a corresponding perturbation analysis which allows one to keep the number of quadrature points small. Numerical experiments have been performed which illustrate the sharpness of the theoretical estimates and the sensitivity of the solution with respect to some parameters in the equation.

Submitted 7 March, 2018; v1 submitted 5 December, 2016; originally announced December 2016.

Comments: 28 pages, 7 figures

MSC Class: 45E10; 65R20; 65D32

arXiv:1310.8493 [pdf, ps, other]

Functional Estimates for Derivatives of the Modified Bessel Function $K_{0}$ and related Exponential Functions

Authors: Silvia Falletta, Stefan A. Sauter

Abstract: Let $K_{0}$ denote the modified Bessel function of the second kind and zeroth order. In this paper we will study the function $\tilde{ω}_{n}\left( x\right) :=\frac{\left( -x\right) ^{n}K_{0}^{\left( n\right) }\left( x\right) }{n!}$ for positive argument. The function $\tilde{ω}_{n}$ plays an important role for the formulation of the wave equation in two spatial dimensions as a retarded potential integral equation. We will prove that the growth of the derivatives $\tilde{ω}_{n}^{\left( m\right) }$ with respect to $n$ can be bounded by $O\left( \left( n+1\right) ^{m/2}\right) $ while for small and large arguments $x$ the growth even becomes independent of $n$. These estimates are based on an integral representation of $K_{0}$ which involves the function $g_{n}\left( t\right) =\frac{t^{n}}{n!}\exp\left( -t\right) $ and its derivatives. The estimates then rely on a subtle analysis of $g_{n}$ and its derivatives which we will also present in this paper.

Submitted 31 October, 2013; originally announced October 2013.

arXiv:1307.8429 [pdf, ps, other]

doi 10.1090/S0025-5718-2014-02910-4

The intersection of bivariate orthogonal polynomials on triangle patches

Authors: Tom H. Koornwinder, Stefan A. Sauter

Abstract: In this paper, the intersection of bivariate orthogonal polynomials on triangle patches will be investigated. The result is interesting in its own right but also has important applications in the theory of a posteriori error estimation for finite element discretizations with $p$-refinement, i.e., if the local polynomial degree of the test and trial functions is increased to improve the accuracy. A triangle patch is a set of disjoint open triangles whose closed union covers a neighborhood of the common triangle vertex. On each triangle we consider the space of orthogonal polynomials of degree $n$ with respect to the weight function which is the product of the barycentric coordinates. We show that the intersection of these polynomial spaces is the null space. The analysis requires the derivation of subtle representations of orthogonal polynomials on triangles. Up to four triangles have to be considered to identify that the intersection is trivial.

Submitted 14 November, 2013; v1 submitted 31 July, 2013; originally announced July 2013.

Comments: 20 pages, 7 figures. v3: minor corrections and additions; accepted by Mathematics of Computation

MSC Class: 65N15; 65N30; 65N50; 33C45; 33C50

Journal ref: Math. Comp. 84 (2015), 1795-1812

Showing 1–27 of 27 results for author: Sauter, A