Search | arXiv e-print repository

arXiv:2405.19224 [pdf, other]

A study on the adequacy of common IQA measures for medical images

Authors: Anna Breger, Clemens Karner, Ian Selby, Janek Gröhl, Sören Dittmer, Edward Lilley, Judith Babar, Jake Beckford, Timothy J Sadler, Shahab Shahipasand, Arthikkaa Thavakumar, Michael Roberts, Carola-Bibiane Schönlieb

Abstract: Image quality assessment (IQA) is standard practice in the development stage of novel machine learning algorithms that operate on images. The most commonly used IQA measures have been developed and tested for natural images, but not in the medical setting. Reported inconsistencies arising in medical images are not surprising, as they have different properties than natural images. In this study, we test the applicability of common IQA measures for medical image data by comparing their assessment to manually rated chest X-ray (5 experts) and photoacoustic image data (1 expert). Moreover, we include supplementary studies on grayscale natural images and accelerated brain MRI data. The results of all experiments show a similar outcome in line with previous findings for medical imaging: PSNR and SSIM in the default setting are in the lower range of the result list and HaarPSI outperforms the other tested measures in the overall performance. Also among the top performers in our medical experiments are the full reference measures DISTS, FSIM, LPIPS and MS-SSIM. Generally, the results on natural images yield considerably higher correlations, suggesting that the additional employment of tailored IQA measures for medical imaging algorithms is needed.

Submitted 29 May, 2024; originally announced May 2024.

arXiv:2405.19097 [pdf, other]

A study of why we need to reassess full reference image quality assessment with medical images

Authors: Anna Breger, Ander Biguri, Malena Sabaté Landman, Ian Selby, Nicole Amberg, Elisabeth Brunner, Janek Gröhl, Sepideh Hatamikia, Clemens Karner, Lipeng Ning, Sören Dittmer, Michael Roberts, AIX-COVNET Collaboration, Carola-Bibiane Schönlieb

Abstract: Image quality assessment (IQA) is not just indispensable in clinical practice to ensure high standards, but also in the development stage of novel algorithms that operate on medical images with reference data. This paper provides a structured and comprehensive collection of examples where the two most common full reference (FR) image quality measures prove to be unsuitable for the assessment of novel algorithms using different kinds of medical images, including real-world MRI, CT, OCT, X-Ray, digital pathology and photoacoustic imaging data. In particular, the FR-IQA measures PSNR and SSIM are known and tested for working successfully in many natural imaging tasks, but discrepancies in medical scenarios have been noted in the literature. Inconsistencies arising in medical images are not surprising, as they have very different properties than natural images which have not been targeted nor tested in the development of the mentioned measures, and therefore might imply wrong judgement of novel methods for medical images. Therefore, improvement is urgently needed in particular in this era of AI to increase explainability, reproducibility and generalizability in machine learning for medical imaging and beyond. On top of the pitfalls we will provide ideas for future research as well as suggesting guidelines for the usage of FR-IQA measures applied to medical images.

Submitted 29 May, 2024; originally announced May 2024.

arXiv:2405.19000 [pdf, other]

FedMAP: Unlocking Potential in Personalized Federated Learning through Bi-Level MAP Optimization

Authors: Fan Zhang, Carlos Esteve-Yagüe, Sören Dittmer, Carola-Bibiane Schönlieb, Michael Roberts

Abstract: Federated Learning (FL) enables collaborative training of machine learning models on decentralized data while preserving data privacy. However, data across clients often differs significantly due to class imbalance, feature distribution skew, sample size imbalance, and other phenomena. Leveraging information from these not identically distributed (non-IID) datasets poses substantial challenges. FL methods based on a single global model cannot effectively capture the variations in client data and underperform in non-IID settings. Consequently, Personalized FL (PFL) approaches that adapt to each client's data distribution but leverage other clients' data are essential but currently underexplored. We propose a novel Bayesian PFL framework using bi-level optimization to tackle the data heterogeneity challenges. Our proposed framework utilizes the global model as a prior distribution within a Maximum A Posteriori (MAP) estimation of personalized client models. This approach facilitates PFL by integrating shared knowledge from the prior, thereby enhancing local model performance, generalization ability, and communication efficiency. We extensively evaluated our bi-level optimization approach on real-world and synthetic datasets, demonstrating significant improvements in model accuracy compared to existing methods while reducing communication overhead. This study contributes to PFL by establishing a solid theoretical foundation for the proposed method and offering a robust, ready-to-use framework that effectively addresses the challenges posed by non-IID data in FL.

Submitted 29 May, 2024; originally announced May 2024.

arXiv:2312.16188 [pdf, other]

The curious case of the test set AUROC

Authors: Michael Roberts, Alon Hazan, Sören Dittmer, James H. F. Rudd, Carola-Bibiane Schönlieb

Abstract: Whilst the size and complexity of ML models have rapidly and significantly increased over the past decade, the methods for assessing their performance have not kept pace. In particular, among the many potential performance metrics, the ML community stubbornly continues to use (a) the area under the receiver operating characteristic curve (AUROC) for a validation and test cohort (distinct from training data) or (b) the sensitivity and specificity for the test data at an optimal threshold determined from the validation ROC. However, we argue that considering scores derived from the test ROC curve alone gives only a narrow insight into how a model performs and its ability to generalise.

Submitted 19 December, 2023; originally announced December 2023.

Comments: 3 pages, 4 figures

arXiv:2310.02874 [pdf, other]

Recent Methodological Advances in Federated Learning for Healthcare

Authors: Fan Zhang, Daniel Kreuter, Yichen Chen, Sören Dittmer, Samuel Tull, Tolou Shadbahr, BloodCounts! Collaboration, Jacobus Preller, James H. F. Rudd, John A. D. Aston, Carola-Bibiane Schönlieb, Nicholas Gleadall, Michael Roberts

Abstract: For healthcare datasets, it is often not possible to combine data samples from multiple sites due to ethical, privacy or logistical concerns. Federated learning allows for the utilisation of powerful machine learning algorithms without requiring the pooling of data. Healthcare data has many simultaneous challenges which require new methodologies to address, such as highly-siloed data, class imbalance, missing data, distribution shifts and non-standardised variables. Federated learning adds significant methodological complexity to conventional centralised machine learning, requiring distributed optimisation, communication between nodes, aggregation of models and redistribution of models. In this systematic review, we consider all papers on Scopus that were published between January 2015 and February 2023 and which describe new federated learning methodologies for addressing challenges with healthcare data. We performed a detailed review of the 89 papers which fulfilled these criteria. Significant systemic issues were identified throughout the literature which compromise the methodologies in many of the papers reviewed. We give detailed recommendations to help improve the quality of the methodology development for federated learning in healthcare.

Submitted 4 October, 2023; originally announced October 2023.

Comments: Supplementary table of extracted data at the end of the document

arXiv:2307.13579 [pdf, other]

Reinterpreting survival analysis in the universal approximator age

Authors: Sören Dittmer, Michael Roberts, Jacobus Preller, AIX COVNET, James H. F. Rudd, John A. D. Aston, Carola-Bibiane Schönlieb

Abstract: Survival analysis is an integral part of the statistical toolbox. However, while most domains of classical statistics have embraced deep learning, survival analysis only recently gained some minor attention from the deep learning community. This recent development is likely in part motivated by the COVID-19 pandemic. We aim to provide the tools needed to fully harness the potential of survival analysis in deep learning. On the one hand, we discuss how survival analysis connects to classification and regression. On the other hand, we provide technical tools. We provide a new loss function, evaluation metrics, and the first universal approximating network that provably produces survival curves without numeric integration. We show that the loss function and model outperform other approaches using a large numerical study.

Submitted 25 July, 2023; originally announced July 2023.

arXiv:2307.10431 [pdf, other]

Bayesian view on the training of invertible residual networks for solving linear inverse problems

Authors: Clemens Arndt, Sören Dittmer, Nick Heilenkötter, Meira Iske, Tobias Kluth, Judith Nickel

Abstract: Learning-based methods for inverse problems, adapting to the data's inherent structure, have become ubiquitous in the last decade. Besides empirical investigations of their often remarkable performance, an increasing number of works addresses the issue of theoretical guarantees. Recently, [3] exploited invertible residual networks (iResNets) to learn provably convergent regularizations given reasonable assumptions. They enforced these guarantees by approximating the linear forward operator with an iResNet. Supervised training on relevant samples introduces data dependency into the approach. An open question in this context is to which extent the data's inherent structure influences the training outcome, i.e., the learned reconstruction scheme. Here we address this delicate interplay of training design and data dependency from a Bayesian perspective and shed light on opportunities and limitations. We resolve these limitations by analyzing reconstruction-based training of the inverses of iResNets, where we show that this optimization strategy introduces a level of data-dependency that cannot be achieved by approximation training. We further provide and discuss a series of numerical experiments underpinning and extending the theoretical findings.

Submitted 19 July, 2023; originally announced July 2023.

arXiv:2306.01335 [pdf, other]

doi 10.1088/1361-6420/ad0660

Invertible residual networks in the context of regularization theory for linear inverse problems

Authors: Clemens Arndt, Alexander Denker, Sören Dittmer, Nick Heilenkötter, Meira Iske, Tobias Kluth, Peter Maass, Judith Nickel

Abstract: Learned inverse problem solvers exhibit remarkable performance in applications like image reconstruction tasks. These data-driven reconstruction methods often follow a two-step scheme. First, one trains the often neural network-based reconstruction scheme via a dataset. Second, one applies the scheme to new measurements to obtain reconstructions. We follow these steps but parameterize the reconstruction scheme with invertible residual networks (iResNets). We demonstrate that the invertibility enables investigating the influence of the training and architecture choices on the resulting reconstruction scheme. For example, assuming local approximation properties of the network, we show that these schemes become convergent regularizations. In addition, the investigations reveal a formal link to the linear regularization theory of linear inverse problems and provide a nonlinear spectral regularization for particular architecture classes. On the numerical side, we investigate the local approximation property of selected trained architectures and present a series of experiments on the MNIST dataset that underpin and extend our theoretical findings.

Submitted 20 December, 2023; v1 submitted 2 June, 2023; originally announced June 2023.

Journal ref: Inverse Problems 39 125018 (2023)

arXiv:2210.13191 [pdf, other]

doi 10.1038/s42256-023-00665-x

Navigating the challenges in creating complex data systems: a development philosophy

Authors: Sören Dittmer, Michael Roberts, Julian Gilbey, Ander Biguri, AIX-COVNET Collaboration, Jacobus Preller, James H. F. Rudd, John A. D. Aston, Carola-Bibiane Schönlieb

Abstract: In this perspective, we argue that despite the democratization of powerful tools for data science and machine learning over the last decade, developing the code for a trustworthy and effective data science system (DSS) is getting harder. Perverse incentives and a lack of widespread software engineering (SE) skills are among many root causes we identify that naturally give rise to the current systemic crisis in reproducibility of DSSs. We analyze why SE and building large complex systems is, in general, hard. Based on these insights, we identify how SE addresses those difficulties and how we can apply and generalize SE methods to construct DSSs that are fit for purpose. We advocate two key development philosophies, namely that one should incrementally grow -- not biphasically plan and build -- DSSs, and one should always employ two types of feedback loops during development: one which tests the code's correctness and another that evaluates the code's efficacy.

Submitted 21 October, 2022; originally announced October 2022.

arXiv:2209.05098 [pdf, other]

SELTO: Sample-Efficient Learned Topology Optimization

Authors: Sören Dittmer, David Erzmann, Henrik Harms, Peter Maass

Abstract: Recent developments in Deep Learning (DL) suggest a vast potential for Topology Optimization (TO). However, while there are some promising attempts, the subfield still lacks a firm footing regarding basic methods and datasets. We aim to address both points. First, we explore physics-based preprocessing and equivariant networks to create sample-efficient components for TO DL pipelines. We evaluate them in a large-scale ablation study using end-to-end supervised training. The results demonstrate a drastic improvement in sample efficiency and the predictions' physical correctness. Second, to improve comparability and future progress, we publish the two first TO datasets containing problems and corresponding ground truth solutions.

Submitted 6 June, 2023; v1 submitted 12 September, 2022; originally announced September 2022.

Comments: 25 pages, 10 figures, submitted to the International Journal for Numerical Methods in Engineering

MSC Class: 68U05; 68U07; 65N21; 68T01

arXiv:2206.08478 [pdf, other]

doi 10.1038/s43856-023-00356-z

Classification of datasets with imputed missing values: does imputation quality matter?

Authors: Tolou Shadbahr, Michael Roberts, Jan Stanczuk, Julian Gilbey, Philip Teare, Sören Dittmer, Matthew Thorpe, Ramon Vinas Torne, Evis Sala, Pietro Lio, Mishal Patel, AIX-COVNET Collaboration, James H. F. Rudd, Tuomas Mirtti, Antti Rannikko, John A. D. Aston, Jing Tang, Carola-Bibiane Schönlieb

Abstract: Classifying samples in incomplete datasets is a common aim for machine learning practitioners, but is non-trivial. Missing data is found in most real-world datasets and these missing values are typically imputed using established methods, followed by classification of the now complete, imputed, samples. The focus of the machine learning researcher is then to optimise the downstream classification performance. In this study, we highlight that it is imperative to consider the quality of the imputation. We demonstrate how the commonly used measures for assessing quality are flawed and propose a new class of discrepancy scores which focus on how well the method recreates the overall distribution of the data. To conclude, we highlight the compromised interpretability of classifier models trained using poorly imputed data.

Submitted 16 June, 2022; originally announced June 2022.

Comments: 17 pages, 10 figures, 30 supplementary pages

arXiv:2206.04406 [pdf, other]

Unsupervised Learning of the Total Variation Flow

Authors: Tamara G. Grossmann, Sören Dittmer, Yury Korolev, Carola-Bibiane Schönlieb

Abstract: The total variation (TV) flow generates a scale-space representation of an image based on the TV functional. This gradient flow observes desirable features for images, such as sharp edges and enables spectral, scale, and texture analysis. Solving the TV flow is challenging; one reason is the non-uniqueness of the subgradients. The standard numerical approach for TV flow requires solving multiple non-smooth optimisation problems. Even with state-of-the-art convex optimisation techniques, this is often prohibitively expensive and strongly motivates the use of alternative, faster approaches. Inspired by and extending the framework of physics-informed neural networks (PINNs), we propose the TVflowNET, an unsupervised neural network approach, to approximate the solution of the TV flow given an initial image and a time instance. The TVflowNET requires no ground truth data but rather makes use of the PDE for optimisation of the network parameters. We circumvent the challenges related to the non-uniqueness of the subgradients by additionally learning the related diffusivity term. Our approach significantly speeds up the computation time and we show that the TVflowNET approximates the TV flow solution with high fidelity for different image sizes and image types. Additionally, we give a full comparison of different network architecture designs as well as training regimes to underscore the effectiveness of our approach.

Submitted 22 April, 2024; v1 submitted 9 June, 2022; originally announced June 2022.

arXiv:2012.13053 [pdf, other]

Function Secret Sharing for PSI-CA: With Applications to Private Contact Tracing

Authors: Samuel Dittmer, Yuval Ishai, Steve Lu, Rafail Ostrovsky, Mohamed Elsabagh, Nikolaos Kiourtis, Brian Schulte, Angelos Stavrou

Abstract: In this work we describe a token-based solution to Contact Tracing via Distributed Point Functions (DPF) and, more generally, Function Secret Sharing (FSS). The key idea behind the solution is that FSS natively supports secure keyword search on raw sets of keywords without a need for processing the keyword sets via a data structure for set membership. Furthermore, the FSS functionality enables adding up numerical payloads associated with multiple matches without additional interaction. These features make FSS an attractive tool for lightweight privacy-preserving searching on a database of tokens belonging to infected individuals.

Submitted 23 December, 2020; originally announced December 2020.

arXiv:2008.02839 [pdf, other]

Learned convex regularizers for inverse problems

Authors: Subhadip Mukherjee, Sören Dittmer, Zakhar Shumaylov, Sebastian Lunz, Ozan Öktem, Carola-Bibiane Schönlieb

Abstract: We consider the variational reconstruction framework for inverse problems and propose to learn a data-adaptive input-convex neural network (ICNN) as the regularization functional. The ICNN-based convex regularizer is trained adversarially to discern ground-truth images from unregularized reconstructions. Convexity of the regularizer is desirable since (i) one can establish analytical convergence guarantees for the corresponding variational reconstruction problem and (ii) devise efficient and provable algorithms for reconstruction. In particular, we show that the optimal solution to the variational problem converges to the ground-truth if the penalty parameter decays sub-linearly with respect to the norm of the noise. Further, we prove the existence of a sub-gradient-based algorithm that leads to a monotonically decreasing error in the parameter space with iterations. To demonstrate the performance of our approach for solving inverse problems, we consider the tasks of deblurring natural images and reconstructing images in computed tomography (CT), and show that the proposed convex regularizer is at least competitive with and sometimes superior to state-of-the-art data-driven techniques for inverse problems.

Submitted 1 March, 2021; v1 submitted 6 August, 2020; originally announced August 2020.

arXiv:2007.01593 [pdf, other]

Deep image prior for 3D magnetic particle imaging: A quantitative comparison of regularization techniques on Open MPI dataset

Authors: Sören Dittmer, Tobias Kluth, Mads Thorstein Roar Henriksen, Peter Maass

Abstract: Magnetic particle imaging (MPI) is an imaging modality exploiting the nonlinear magnetization behavior of (super-)paramagnetic nanoparticles to obtain a space- and often also time-dependent concentration of a tracer consisting of these nanoparticles. MPI has a continuously increasing number of potential medical applications. One prerequisite for successful performance in these applications is a proper solution to the image reconstruction problem. More classical methods from inverse problems theory, as well as novel approaches from the field of machine learning, have the potential to deliver high-quality reconstructions in MPI. We investigate a novel reconstruction approach based on a deep image prior, which builds on representing the solution by a deep neural network. Novel approaches, as well as variational and iterative regularization techniques, are compared quantitatively in terms of peak signal-to-noise ratios and structural similarity indices on the publicly available Open MPI dataset.

Submitted 3 July, 2020; originally announced July 2020.

arXiv:2007.01575 [pdf, other]

Ground Truth Free Denoising by Optimal Transport

Authors: Sören Dittmer, Carola-Bibiane Schönlieb, Peter Maass

Abstract: We present a learned unsupervised denoising method for arbitrary types of data, which we explore on images and one-dimensional signals. The training is solely based on samples of noisy data and examples of noise, which -- critically -- do not need to come in pairs. We only need the assumption that the noise is independent and additive (although we describe how this can be extended). The method rests on a Wasserstein Generative Adversarial Network setting, which utilizes two critics and one generator.

Submitted 3 July, 2020; originally announced July 2020.

arXiv:1907.04675 [pdf, other]

A Projectional Ansatz to Reconstruction

Authors: Sören Dittmer, Peter Maass

Abstract: Recently the field of inverse problems has seen a growing usage of mathematically only partially understood learned and non-learned priors. Based on first principles, we develop a projectional approach to inverse problems that addresses the incorporation of these priors, while still guaranteeing data consistency. We implement this projectional method (PM) on the one hand via very general Plug-and-Play priors and on the other hand, via an end-to-end training approach. To this end, we introduce a novel alternating neural architecture, allowing for the incorporation of highly customized priors from data in a principled manner. We also show how the recent success of Regularization by Denoising (RED) can, at least to some extent, be explained as an approximation of the PM. Furthermore, we demonstrate how the idea can be applied to stop the degradation of Deep Image Prior (DIP) reconstructions over time.

Submitted 6 August, 2019; v1 submitted 10 July, 2019; originally announced July 2019.

arXiv:1903.08743 [pdf, other]

Phase transition in random contingency tables with non-uniform margins

Authors: Sam Dittmer, Hanbaek Lyu, Igor Pak

Abstract: For parameters $n,δ,B,$ and $C$, let $X=(X_{k\ell})$ be the random uniform contingency table whose first $\lfloor n^δ \rfloor $ rows and columns have margin $\lfloor BCn \rfloor$ and the last $n$ rows and columns have margin $\lfloor Cn \rfloor$. For every $0<δ<1$, we establish a sharp phase transition of the limiting distribution of each entry of $X$ at the critical value $B_{c}=1+\sqrt{1+1/C}$. In particular, for $1/2<δ<1$, we show that the distribution of each entry converges to a geometric distribution in total variation distance, whose mean depends sensitively on whether $B<B_{c}$ or $B>B_{c}$. Our main result shows that $\mathbb{E}[X_{11}]$ is uniformly bounded for $B<B_{c}$, but has sharp asymptotic $C(B-B_{c}) n^{1-δ}$ for $B>B_{c}$. We also establish a strong law of large numbers for the row sums in top right and top left blocks.

Submitted 11 September, 2020; v1 submitted 20 March, 2019; originally announced March 2019.

Comments: 24 pages, 4 figures. Earlier version is accepted for publication in Transactions of the AMS. This version contains an appendix on asymptotic independence of the entries

arXiv:1812.03889 [pdf, other]

doi 10.1007/s10851-019-00923-x

Regularization by architecture: A deep prior approach for inverse problems

Authors: Sören Dittmer, Tobias Kluth, Peter Maass, Daniel Otero Baguer

Abstract: The present paper studies so-called deep image prior (DIP) techniques in the context of ill-posed inverse problems. DIP networks have been recently introduced for applications in image processing; first experimental results for applying DIP to inverse problems have also been reported. This paper aims to discuss different interpretations of DIP and to obtain analytic results for specific network designs and linear operators. The main contribution is to introduce the idea of viewing these approaches as the optimization of Tikhonov functionals rather than optimizing networks. Besides theoretical results, we present numerical verifications.

Submitted 18 March, 2020; v1 submitted 10 December, 2018; originally announced December 2018.

Journal ref: Journal of Mathematical Imaging and Vision (2019)

arXiv:1812.02566 [pdf, other]

Singular Values for ReLU Layers

Authors: Sören Dittmer, Emily J. King, Peter Maass

Abstract: Despite their prevalence in neural networks we still lack a thorough theoretical characterization of ReLU layers. This paper aims to further our understanding of ReLU layers by studying how the activation function ReLU interacts with the linear component of the layer and what role this interaction plays in the success of the neural network in achieving its intended task. To this end, we introduce two new tools: ReLU singular values of operators and the Gaussian mean width of operators. By presenting on the one hand theoretical justifications, results, and interpretations of these two concepts and on the other hand numerical experiments and results of the ReLU singular values and the Gaussian mean width being applied to trained neural networks, we hope to give a comprehensive, singular-value-centric view of ReLU layers. We find that ReLU singular values and the Gaussian mean width do not only enable theoretical insights, but also provide one with metrics which seem promising for practical applications. In particular, these measures can be used to distinguish correctly and incorrectly classified data as it traverses the network. We conclude by introducing two tools based on our findings: double-layers and harmonic pruning.

Submitted 12 August, 2019; v1 submitted 6 December, 2018; originally announced December 2018.

arXiv:1806.09730 [pdf, other]

Analysis of Invariance and Robustness via Invertibility of ReLU-Networks

Authors: Jens Behrmann, Sören Dittmer, Pascal Fernsel, Peter Maaß

Abstract: Studying the invertibility of deep neural networks (DNNs) provides a principled approach to better understand the behavior of these powerful models. Despite being a promising diagnostic tool, a consistent theory on their invertibility is still lacking. We derive a theoretically motivated approach to explore the preimages of ReLU-layers and mechanisms affecting the stability of the inverse. Using the developed theory, we numerically show how this approach uncovers characteristic properties of the network.

Submitted 27 June, 2018; v1 submitted 25 June, 2018; originally announced June 2018.

arXiv:1802.06312 [pdf, ps, other]

Counting linear extensions of restricted posets

Authors: Samuel Dittmer, Igor Pak

Abstract: The classical 1991 result by Brightwell and Winkler states that the number of linear extensions of a poset is #P-complete. We extend this result to posets with certain restrictions. First, we prove that the number of linear extensions for posets of height two is #P-complete. Furthermore, we prove that this holds for incidence posets of graphs. Finally, we prove that the number of linear extensions for posets of dimension two is #P-complete.

Submitted 17 February, 2018; originally announced February 2018.

arXiv:1706.00222 [pdf, other]

doi 10.1088/1748-0221/12/05/P05022

Test Beam Performance Measurements for the Phase I Upgrade of the CMS Pixel Detector

Authors: M. Dragicevic, M. Friedl, J. Hrubec, H. Steininger, A. Gädda, J. Härkönen, T. Lampén, P. Luukka, T. Peltola, E. Tuominen, E. Tuovinen, A. Winkler, P. Eerola, T. Tuuva, G. Baulieu, G. Boudoul, L. Caponetto, C. Combaret, D. Contardo, T. Dupasquier, G. Gallbit, N. Lumb, L. Mirabito, S. Perries, M. Vander Donckt, et al. (462 additional authors not shown)

Abstract: A new pixel detector for the CMS experiment was built in order to cope with the instantaneous luminosities anticipated for the Phase I Upgrade of the LHC. The new CMS pixel detector provides four-hit tracking with a reduced material budget as well as new cooling and powering schemes. A new front-end readout chip mitigates buffering and bandwidth limitations, and allows operation at low comparator thresholds. In this paper, comprehensive test beam studies are presented, which have been conducted to verify the design and to quantify the performance of the new detector assemblies in terms of tracking efficiency and spatial resolution. Under optimal conditions, the tracking efficiency is $99.95\pm0.05\,\%$, while the intrinsic spatial resolutions are $4.80\pm0.25\,μ\mathrm{m}$ and $7.99\pm0.21\,μ\mathrm{m}$ along the $100\,μ\mathrm{m}$ and $150\,μ\mathrm{m}$ pixel pitch, respectively. The findings are compared to a detailed Monte Carlo simulation of the pixel detector and good agreement is found.

Submitted 1 June, 2017; originally announced June 2017.

Report number: CMS-NOTE-2017-002

arXiv:1505.01824 [pdf, other]

doi 10.1088/1748-0221/11/04/P04023

Trapping in irradiated p-on-n silicon sensors at fluences anticipated at the HL-LHC outer tracker

Authors: W. Adam, T. Bergauer, M. Dragicevic, M. Friedl, R. Fruehwirth, M. Hoch, J. Hrubec, M. Krammer, W. Treberspurg, W. Waltenberger, S. Alderweireldt, W. Beaumont, X. Janssen, S. Luyckx, P. Van Mechelen, N. Van Remortel, A. Van Spilbeeck, P. Barria, C. Caillol, B. Clerbaux, G. De Lentdecker, D. Dobur, L. Favart, A. Grebenyuk, Th. Lenzi, et al. (663 additional authors not shown)

Abstract: The degradation of signal in silicon sensors is studied under conditions expected at the CERN High-Luminosity LHC. 200 $μ$m thick n-type silicon sensors are irradiated with protons of different energies to fluences of up to $3 \cdot 10^{15}$ neq/cm$^2$. Pulsed red laser light with a wavelength of 672 nm is used to generate electron-hole pairs in the sensors. The induced signals are used to determine the charge collection efficiencies separately for electrons and holes drifting through the sensor. The effective trapping rates are extracted by comparing the results to simulation. The electric field is simulated using Synopsys device simulation assuming two effective defects. The generation and drift of charge carriers are simulated in an independent simulation based on PixelAV. The effective trapping rates are determined from the measured charge collection efficiencies and the simulated and measured time-resolved current pulses are compared. The effective trapping rates determined for both electrons and holes are about 50% smaller than those obtained using standard extrapolations of studies at low fluences and suggest an improved tracker performance over initial expectations.

Submitted 7 May, 2015; originally announced May 2015.

Journal ref: 2016 JINST 11 P04023

arXiv:1411.4413 [pdf, other]

doi 10.1038/nature14474

Observation of the rare $B^0_s\toμ^+μ^-$ decay from the combined analysis of CMS and LHCb data

Authors: The CMS, LHCb Collaborations: V. Khachatryan, A. M. Sirunyan, A. Tumasyan, W. Adam, T. Bergauer, M. Dragicevic, J. Erö, M. Friedl, R. Frühwirth, V. M. Ghete, C. Hartl, N. Hörmann, J. Hrubec, M. Jeitler, W. Kiesenhofer, V. Knünz, M. Krammer, I. Krätschmer, D. Liko, I. Mikulec, D. Rabady, B. Rahbaran, et al. (2807 additional authors not shown)

Abstract: A joint measurement is presented of the branching fractions $B^0_s\toμ^+μ^-$ and $B^0\toμ^+μ^-$ in proton-proton collisions at the LHC by the CMS and LHCb experiments. The data samples were collected in 2011 at a centre-of-mass energy of 7 TeV, and in 2012 at 8 TeV. The combined analysis produces the first observation of the $B^0_s\toμ^+μ^-$ decay, with a statistical significance exceeding six standard deviations, and the best measurement of its branching fraction so far. Furthermore, evidence for the $B^0\toμ^+μ^-$ decay is obtained with a statistical significance of three standard deviations. The branching fraction measurements are statistically compatible with SM predictions and impose stringent constraints on several theories beyond the SM.

Submitted 17 August, 2015; v1 submitted 17 November, 2014; originally announced November 2014.

Comments: Correspondence should be addressed to [email protected]

Report number: CERN-PH-EP-2014-220, CMS-BPH-13-007, LHCb-PAPER-2014-049

Journal ref: Nature 522, 68-72 (04 June 2015)

arXiv:1302.6824 [pdf]

From Influence Diagrams to Junction Trees

Authors: Frank Jensen, Finn Verner Jensen, Soren L. Dittmer

Abstract: We present an approach to the solution of decision problems formulated as influence diagrams. This approach involves a special triangulation of the underlying graph, the construction of a junction tree with special properties, and a message passing algorithm operating on the junction tree for computation of expected utilities and optimal decision policies.

Submitted 27 February, 2013; originally announced February 2013.

Comments: Appears in Proceedings of the Tenth Conference on Uncertainty in Artificial Intelligence (UAI1994)

Report number: UAI-P-1994-PG-367-373

arXiv:1302.1535 [pdf]

Myopic Value of Information in Influence Diagrams

Authors: Soren L. Dittmer, Finn Verner Jensen

Abstract: We present a method for calculation of myopic value of information in influence diagrams (Howard & Matheson, 1981) based on the strong junction tree framework (Jensen, Jensen & Dittmer, 1994). The difference in instantiation order in the influence diagrams is reflected in the corresponding junction trees by the order in which the chance nodes are marginalized. This order of marginalization can be changed by table expansion and in effect the same junction tree with expanded tables may be used for calculating the expected utility for scenarios with different instantiation order. We also compare our method to the classic method of modeling different instantiation orders in the same influence diagram.

Submitted 6 February, 2013; originally announced February 2013.

Comments: Appears in Proceedings of the Thirteenth Conference on Uncertainty in Artificial Intelligence (UAI1997)

Report number: UAI-P-1997-PG-142-149

Showing 1–27 of 27 results for author: Dittmer, S