-
Calibrating the Dice loss to handle neural network overconfidence for biomedical image segmentation
Authors:
Michael Yeung,
Leonardo Rundo,
Yang Nan,
Evis Sala,
Carola-Bibiane Schönlieb,
Guang Yang
Abstract:
The Dice similarity coefficient (DSC) is both a widely used metric and loss function for biomedical image segmentation due to its robustness to class imbalance. However, it is well known that the DSC loss is poorly calibrated, resulting in overconfident predictions that cannot be usefully interpreted in biomedical and clinical practice. Performance is often the only metric used to evaluate segmentations produced by deep neural networks, and calibration is often neglected. However, calibration is important for translation into biomedical and clinical practice, providing crucial contextual information to model predictions for interpretation by scientists and clinicians. In this study, we provide a simple yet effective extension of the DSC loss, named the DSC++ loss, that selectively modulates the penalty associated with overconfident, incorrect predictions. As a standalone loss function, the DSC++ loss achieves significantly improved calibration over the conventional DSC loss across six well-validated open-source biomedical imaging datasets, including both 2D binary and 3D multi-class segmentation tasks. Similarly, we observe significantly improved calibration when integrating the DSC++ loss into four DSC-based loss functions. Finally, we use softmax thresholding to illustrate that well calibrated outputs enable tailoring of recall-precision bias, which is an important post-processing technique to adapt the model predictions to suit the biomedical or clinical task. The DSC++ loss overcomes the major limitation of the DSC loss, providing a suitable loss function for training deep learning segmentation models for use in biomedical and clinical practice. Source code is available at: https://github.com/mlyg/DicePlusPlus.
Submitted 1 November, 2022; v1 submitted 31 October, 2021;
originally announced November 2021.
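For readers unfamiliar with the quantities in the abstract above, the following is a minimal sketch of the standard Dice similarity coefficient and of thresholding predicted foreground probabilities at different cut-offs. It is not the DSC++ loss and is not taken from the authors' repository; the function names, the flat-list mask representation, and the toy values are assumptions made for illustration only.

```python
# Illustrative sketch only: standard Dice coefficient on flat binary masks,
# plus probability thresholding. Names and data are hypothetical.

def dice_coefficient(pred, target, eps=1e-7):
    """Dice similarity coefficient 2|A∩B| / (|A| + |B|) for binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # eps avoids division by zero when both masks are empty
    return (2.0 * intersection + eps) / (total + eps)


def threshold(probs, tau=0.5):
    """Convert foreground probabilities to a binary mask at cut-off tau."""
    return [1 if p >= tau else 0 for p in probs]


def recall_precision(pred, target):
    """Return (recall, precision) for binary masks; 0.0 if undefined."""
    tp = sum(p * t for p, t in zip(pred, target))
    recall = tp / sum(target) if sum(target) else 0.0
    precision = tp / sum(pred) if sum(pred) else 0.0
    return recall, precision


target = [1, 1, 0, 0]
probs = [0.9, 0.4, 0.6, 0.1]  # toy foreground probabilities

default_mask = threshold(probs, tau=0.5)  # [1, 0, 1, 0]
lenient_mask = threshold(probs, tau=0.3)  # [1, 1, 1, 0]
```

Lowering the cut-off in this toy case raises recall at the cost of admitting a false positive, which is the kind of recall-precision trade-off that thresholding is meant to expose; the trade-off is only meaningful when the probabilities themselves are well calibrated.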
-
Learning convex regularizers satisfying the variational source condition for inverse problems
Authors:
Subhadip Mukherjee,
Carola-Bibiane Schönlieb,
Martin Burger
Abstract:
Variational regularization has remained one of the most successful approaches for reconstruction in imaging inverse problems for several decades. With the emergence and astonishing success of deep learning in recent years, a considerable amount of research has gone into data-driven modeling of the regularizer in the variational setting. Our work extends a recently proposed method, referred to as adversarial convex regularization (ACR), that seeks to learn data-driven convex regularizers via adversarial training in an attempt to combine the power of data with the classical convex regularization theory. Specifically, we leverage the variational source condition (SC) during training to enforce that the ground-truth images minimize the variational loss corresponding to the learned convex regularizer. This is achieved by adding an appropriate penalty term to the ACR training objective. The resulting regularizer (abbreviated as ACR-SC) performs on par with the ACR, but unlike ACR, comes with a quantitative convergence rate estimate.
Submitted 24 October, 2021;
originally announced October 2021.
-
Stochastic Primal-Dual Deep Unrolling
Authors:
Junqi Tang,
Subhadip Mukherjee,
Carola-Bibiane Schönlieb
Abstract:
We propose a new type of efficient deep-unrolling networks for solving imaging inverse problems. Conventional deep-unrolling methods require full forward operator and its adjoint across each layer, and hence can be significantly more expensive computationally as compared with other end-to-end methods that are based on post-processing of model-based reconstructions, especially for 3D image reconstruction tasks. We develop a stochastic (ordered-subsets) variant of the classical learned primal-dual (LPD), which is a state-of-the-art unrolling network for tomographic image reconstruction. The proposed learned stochastic primal-dual (LSPD) network only uses subsets of the forward and adjoint operators and offers considerable computational efficiency. We provide theoretical analysis of a special case of our LSPD framework, suggesting that it has the potential to achieve image reconstruction quality competitive with the full-batch LPD while requiring only a fraction of the computation. The numerical results for two different X-ray computed tomography (CT) imaging tasks (namely, low-dose and sparse-view CT) corroborate this theoretical finding, demonstrating the promise of LSPD networks for large-scale imaging problems.
Submitted 15 February, 2022; v1 submitted 19 October, 2021;
originally announced October 2021.
-
StyleGAN-induced data-driven regularization for inverse problems
Authors:
Arthur Conmy,
Subhadip Mukherjee,
Carola-Bibiane Schönlieb
Abstract:
Recent advances in generative adversarial networks (GANs) have opened up the possibility of generating high-resolution photo-realistic images that were impossible to produce previously. The ability of GANs to sample from high-dimensional distributions has naturally motivated researchers to leverage their power for modeling the image prior in inverse problems. We extend this line of research by developing a Bayesian image reconstruction framework that utilizes the full potential of a pre-trained StyleGAN2 generator, which is the currently dominant GAN architecture, for constructing the prior distribution on the underlying image. Our proposed approach, which we refer to as learned Bayesian reconstruction with generative models (L-BRGM), entails joint optimization over the style-code and the input latent code, and enhances the expressive power of a pre-trained StyleGAN2 generator by allowing the style-codes to be different for different generator layers. Considering the inverse problems of image inpainting and super-resolution, we demonstrate that the proposed approach is competitive with, and sometimes superior to, state-of-the-art GAN-based image reconstruction methods.
Submitted 7 October, 2021;
originally announced October 2021.
-
Predicting isocitrate dehydrogenase mutation status in glioma using structural brain networks and graph neural networks
Authors:
Yiran Wei,
Yonghao Li,
Xi Chen,
Carola-Bibiane Schönlieb,
Chao Li,
Stephen J. Price
Abstract:
Glioma is a common malignant brain tumor with distinct survival among patients. The isocitrate dehydrogenase (IDH) gene mutation provides critical diagnostic and prognostic value for glioma. It is of crucial significance to non-invasively predict IDH mutation based on pre-treatment MRI. Machine learning/deep learning models show reasonable performance in predicting IDH mutation using MRI. However, most models neglect the systematic brain alterations caused by tumor invasion, where widespread infiltration along white matter tracts is a hallmark of glioma. Structural brain network provides an effective tool to characterize brain organisation, which could be captured by the graph neural networks (GNN) to more accurately predict IDH mutation.
Here we propose a method to predict IDH mutation using GNN, based on the structural brain network of patients. Specifically, we firstly construct a network template of healthy subjects, consisting of atlases of edges (white matter tracts) and nodes (cortical/subcortical brain regions) to provide regions of interest (ROIs). Next, we employ autoencoders to extract the latent multi-modal MRI features from the ROIs of edges and nodes in patients, to train a GNN architecture for predicting IDH mutation. The results show that the proposed method outperforms the baseline models using the 3D-CNN and 3D-DenseNet. In addition, model interpretation suggests its ability to identify the tracts infiltrated by tumor, corresponding to clinical prior knowledge. In conclusion, integrating brain networks with GNN offers a new avenue to study brain lesions using computational neuroscience and computer vision approaches.
Submitted 10 September, 2021; v1 submitted 4 September, 2021;
originally announced September 2021.
-
Adaptive unsupervised learning with enhanced feature representation for intra-tumor partitioning and survival prediction for glioblastoma
Authors:
Yifan Li,
Chao Li,
Yiran Wei,
Stephen Price,
Carola-Bibiane Schönlieb,
Xi Chen
Abstract:
Glioblastoma is profoundly heterogeneous in regional microstructure and vasculature. Characterizing the spatial heterogeneity of glioblastoma could lead to more precise treatment. With unsupervised learning techniques, glioblastoma MRI-derived radiomic features have been widely utilized for tumor sub-region segmentation and survival prediction. However, the reliability of algorithm outcomes is often challenged by both ambiguous intermediate process and instability introduced by the randomness of clustering algorithms, especially for data from heterogeneous patients.
In this paper, we propose an adaptive unsupervised learning approach for efficient MRI intra-tumor partitioning and glioblastoma survival prediction. A novel and problem-specific Feature-enhanced Auto-Encoder (FAE) is developed to enhance the representation of pairwise clinical modalities and therefore improve clustering stability of unsupervised learning algorithms such as K-means. Moreover, the entire process is modelled by the Bayesian optimization (BO) technique with a custom loss function, so that the hyper-parameters can be adaptively optimized in reasonably few steps. The results demonstrate that the proposed approach can produce robust and clinically relevant MRI sub-regions and statistically significant survival predictions.
Submitted 20 August, 2021;
originally announced August 2021.
-
Image reconstruction in light-sheet microscopy: spatially varying deconvolution and mixed noise
Authors:
Bogdan Toader,
Jerome Boulanger,
Yury Korolev,
Martin O. Lenz,
James Manton,
Carola-Bibiane Schonlieb,
Leila Muresan
Abstract:
We study the problem of deconvolution for light-sheet microscopy, where the data is corrupted by spatially varying blur and a combination of Poisson and Gaussian noise. The spatial variation of the point spread function (PSF) of a light-sheet microscope is determined by the interaction between the excitation sheet and the detection objective PSF. First, we introduce a model of the image formation process that incorporates this interaction, therefore capturing the main characteristics of this imaging modality. Then, we formulate a variational model that accounts for the combination of Poisson and Gaussian noise through a data fidelity term consisting of the infimal convolution of the single noise fidelities, first introduced in L. Calatroni et al. "Infimal convolution of data discrepancies for mixed noise removal", SIAM Journal on Imaging Sciences 10.3 (2017), 1196-1233. We establish convergence rates in a Bregman distance under a source condition for the infimal convolution fidelity and a discrepancy principle for choosing the value of the regularisation parameter. The inverse problem is solved by applying the primal-dual hybrid gradient (PDHG) algorithm in a novel way. Finally, numerical experiments performed on both simulated and real data show superior reconstruction results in comparison with other methods.
Submitted 8 August, 2021;
originally announced August 2021.
-
LaplaceNet: A Hybrid Graph-Energy Neural Network for Deep Semi-Supervised Classification
Authors:
Philip Sellars,
Angelica I. Aviles-Rivero,
Carola-Bibiane Schönlieb
Abstract:
Semi-supervised learning has received a lot of recent attention as it alleviates the need for large amounts of labelled data, which can often be expensive, require expert knowledge and be time-consuming to collect. Recent developments in deep semi-supervised classification have reached unprecedented performance and the gap between supervised and semi-supervised learning is ever-decreasing. This improvement in performance has been based on the inclusion of numerous technical tricks, strong augmentation techniques and costly optimisation schemes with multi-term loss functions. We propose a new framework, LaplaceNet, for deep semi-supervised classification that has a greatly reduced model complexity. We utilise a hybrid approach where pseudo-labels are produced by minimising the Laplacian energy on a graph. These pseudo-labels are then used to iteratively train a neural-network backbone. Our model outperforms state-of-the-art methods for deep semi-supervised classification over several benchmark datasets. Furthermore, we consider the application of strong augmentations to neural networks theoretically and justify the use of a multi-sampling approach for semi-supervised learning. We demonstrate, through rigorous experimentation, that a multi-sampling augmentation approach improves generalisation and reduces the sensitivity of the network to augmentation.
Submitted 28 September, 2022; v1 submitted 8 June, 2021;
originally announced June 2021.
-
HERS Superpixels: Deep Affinity Learning for Hierarchical Entropy Rate Segmentation
Authors:
Hankui Peng,
Angelica I. Aviles-Rivero,
Carola-Bibiane Schönlieb
Abstract:
Superpixels serve as a powerful preprocessing tool in numerous computer vision tasks. By using superpixel representation, the number of image primitives can be reduced by orders of magnitude. With the rise of deep learning in recent years, a few works have attempted to feed deeply learned features / graphs into existing classical superpixel techniques. However, none of them are able to produce superpixels in near real-time, which is crucial to the applicability of superpixels in practice. In this work, we propose a two-stage graph-based framework for superpixel segmentation. In the first stage, we introduce an efficient Deep Affinity Learning (DAL) network that learns pairwise pixel affinities by aggregating multi-scale information. In the second stage, we propose a highly efficient superpixel method called Hierarchical Entropy Rate Segmentation (HERS). Using the learned affinities from the first stage, HERS builds a hierarchical tree structure that can produce any number of highly adaptive superpixels instantaneously. We demonstrate, through visual and numerical experiments, the effectiveness and efficiency of our method compared to various state-of-the-art superpixel methods.
Submitted 18 November, 2021; v1 submitted 7 June, 2021;
originally announced June 2021.
-
End-to-end reconstruction meets data-driven regularization for inverse problems
Authors:
Subhadip Mukherjee,
Marcello Carioni,
Ozan Öktem,
Carola-Bibiane Schönlieb
Abstract:
We propose an unsupervised approach for learning end-to-end reconstruction operators for ill-posed inverse problems. The proposed method combines the classical variational framework with iterative unrolling, which essentially seeks to minimize a weighted combination of the expected distortion in the measurement space and the Wasserstein-1 distance between the distributions of the reconstruction and ground-truth. More specifically, the regularizer in the variational setting is parametrized by a deep neural network and learned simultaneously with the unrolled reconstruction operator. The variational problem is then initialized with the reconstruction of the unrolled operator and solved iteratively till convergence. Notably, it takes significantly fewer iterations to converge, thanks to the excellent initialization obtained via the unrolled operator. The resulting approach combines the computational efficiency of end-to-end unrolled reconstruction with the well-posedness and noise-stability guarantees of the variational setting. Moreover, we demonstrate with the example of X-ray computed tomography (CT) that our approach outperforms state-of-the-art unsupervised methods, and that it outperforms or is on par with state-of-the-art supervised learned reconstruction approaches.
Submitted 7 June, 2021;
originally announced June 2021.
-
CAFLOW: Conditional Autoregressive Flows
Authors:
Georgios Batzolis,
Marcello Carioni,
Christian Etmann,
Soroosh Afyouni,
Zoe Kourtzi,
Carola Bibiane Schönlieb
Abstract:
We introduce CAFLOW, a new diverse image-to-image translation model that simultaneously leverages the power of auto-regressive modeling and the modeling efficiency of conditional normalizing flows. We transform the conditioning image into a sequence of latent encodings using a multi-scale normalizing flow and repeat the process for the conditioned image. We model the conditional distribution of the latent encodings by modeling the auto-regressive distributions with an efficient multi-scale normalizing flow, where each conditioning factor affects image synthesis at its respective resolution scale. Our proposed framework performs well on a range of image-to-image translation tasks. It outperforms former designs of conditional flows because of its expressive auto-regressive structure.
Submitted 4 June, 2021;
originally announced June 2021.
-
Focus U-Net: A novel dual attention-gated CNN for polyp segmentation during colonoscopy
Authors:
Michael Yeung,
Evis Sala,
Carola-Bibiane Schönlieb,
Leonardo Rundo
Abstract:
Background: Colonoscopy remains the gold-standard screening for colorectal cancer. However, significant miss rates for polyps have been reported, particularly when there are multiple small adenomas. This presents an opportunity to leverage computer-aided systems to support clinicians and reduce the number of polyps missed.
Method: In this work we introduce the Focus U-Net, a novel dual attention-gated deep neural network, which combines efficient spatial and channel-based attention into a single Focus Gate module to encourage selective learning of polyp features. The Focus U-Net further incorporates short-range skip connections and deep supervision. Furthermore, we introduce the Hybrid Focal loss, a new compound loss function based on the Focal loss and Focal Tversky loss, to handle class-imbalanced image segmentation. For our experiments, we selected five public datasets containing images of polyps obtained during optical colonoscopy: CVC-ClinicDB, Kvasir-SEG, CVC-ColonDB, ETIS-Larib PolypDB and EndoScene test set. To evaluate model performance, we use the Dice similarity coefficient (DSC) and Intersection over Union (IoU) metrics.
Results: Our model achieves state-of-the-art results for both CVC-ClinicDB and Kvasir-SEG, with a mean DSC of 0.941 and 0.910, respectively. When evaluated on a combination of five public polyp datasets, our model similarly achieves state-of-the-art results with a mean DSC of 0.878 and mean IoU of 0.809, a 14% and 15% improvement over the previous state-of-the-art results of 0.768 and 0.702, respectively.
Conclusions: This study shows the potential for deep learning to provide fast and accurate polyp segmentation results for use during colonoscopy. The Focus U-Net may be adapted for future use in newer non-invasive screening and more broadly to other biomedical image segmentation tasks involving class imbalance and requiring efficiency.
Submitted 22 June, 2021; v1 submitted 16 May, 2021;
originally announced May 2021.
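The percentage improvements quoted in the Results of the abstract above appear to be relative rather than absolute gains. A small sketch of that arithmetic, using only the four figures reported in the abstract (the helper name is hypothetical):

```python
# Illustrative check of the relative improvements quoted in the abstract.
# Only the four reported scores are used; the helper name is hypothetical.

def relative_improvement(new, old):
    """Relative gain of `new` over `old`, as a fraction of `old`."""
    return (new - old) / old


dsc_gain = relative_improvement(0.878, 0.768)  # mean DSC, new vs previous
iou_gain = relative_improvement(0.809, 0.702)  # mean IoU, new vs previous

print(f"DSC: {dsc_gain:.1%}  IoU: {iou_gain:.1%}")
```

Under this reading the gains round to 14% and 15%, matching the abstract; the absolute differences are 0.110 in DSC and 0.107 in IoU.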
-
An end-to-end Optical Character Recognition approach for ultra-low-resolution printed text images
Authors:
Julian D. Gilbey,
Carola-Bibiane Schönlieb
Abstract:
Some historical and more recent printed documents have been scanned or stored at very low resolutions, such as 60 dpi. Though such scans are relatively easy for humans to read, they still present significant challenges for optical character recognition (OCR) systems. The current state of the art is to use super-resolution to reconstruct an approximation of the original high-resolution image and to feed this into a standard OCR system. Our novel end-to-end method bypasses the super-resolution step and produces better OCR results. This approach is inspired by our understanding of the human visual system, and builds on established neural networks for performing OCR.
Our experiments have shown that it is possible to perform OCR on 60 dpi scanned images of English text, which is a significantly lower resolution than the state-of-the-art, and we achieved a mean character level accuracy (CLA) of 99.7% and word level accuracy (WLA) of 98.9% across a set of about 1000 pages of 60 dpi text in a wide range of fonts. For 75 dpi images, the mean CLA was 99.9% and the mean WLA was 99.4% on the same sample of texts. We make our code and data (including a set of low-resolution images with their ground truths) publicly available as a benchmark for future work in this field.
Submitted 10 May, 2021;
originally announced May 2021.
-
Semi-supervised Superpixel-based Multi-Feature Graph Learning for Hyperspectral Image Data
Authors:
Madeleine Kotzagiannidis,
Carola-Bibiane Schönlieb
Abstract:
Graphs naturally lend themselves to model the complexities of Hyperspectral Image (HSI) data as well as to serve as semi-supervised classifiers by propagating given labels among nearest neighbours. In this work, we present a novel framework for the classification of HSI data in light of a very limited amount of labelled data, inspired by multi-view graph learning and graph signal processing. Given an a priori superpixel-segmented hyperspectral image, we seek a robust and efficient graph construction and label propagation method to conduct semi-supervised learning (SSL). Since the graph is paramount to the success of the subsequent classification task, particularly in light of the intrinsic complexity of HSI data, we consider the problem of finding the optimal graph to model such data. Our contribution is two-fold: firstly, we propose a multi-stage edge-efficient semi-supervised graph learning framework for HSI data which exploits given label information through pseudo-label features embedded in the graph construction. Secondly, we examine and enhance the contribution of multiple superpixel features embedded in the graph on the basis of pseudo-labels in an extension of the previous framework, which is less reliant on excessive parameter tuning. Ultimately, we demonstrate the superiority of our approaches in comparison with state-of-the-art methods through extensive numerical experiments.
△ Less
Submitted 27 April, 2021;
originally announced April 2021.
-
Adversarially learned iterative reconstruction for imaging inverse problems
Authors:
Subhadip Mukherjee,
Ozan Öktem,
Carola-Bibiane Schönlieb
Abstract:
In numerous practical applications, especially in medical image reconstruction, it is often infeasible to obtain a large ensemble of ground-truth/measurement pairs for supervised learning. Therefore, it is imperative to develop unsupervised learning protocols that are competitive with supervised approaches in performance. Motivated by the maximum-likelihood principle, we propose an unsupervised learning framework for solving ill-posed inverse problems. Instead of seeking pixel-wise proximity between the reconstructed and the ground-truth images, the proposed approach learns an iterative reconstruction network whose output matches the ground-truth in distribution. Considering tomographic reconstruction as an application, we demonstrate that the proposed unsupervised approach not only performs on par with its supervised variant in terms of objective quality measures but also successfully circumvents the issue of over-smoothing that supervised approaches tend to suffer from. The improvement in reconstruction quality comes at the expense of higher training complexity, but, once trained, the reconstruction time remains the same as its supervised counterpart.
△ Less
Submitted 30 March, 2021;
originally announced March 2021.
-
Efficient Global Optimization of Non-differentiable, Symmetric Objectives for Multi Camera Placement
Authors:
Maria L. Hänel,
Carola-B. Schönlieb
Abstract:
We propose a novel iterative method for optimally placing and orienting multiple cameras in a 3D scene. Sample applications include improving the accuracy of 3D reconstruction, maximizing the covered area for surveillance, or improving the coverage in multi-viewpoint pedestrian tracking. Our algorithm is based on a block-coordinate ascent combined with a surrogate function and an exclusion area technique. This allows flexible handling of difficult objective functions that are often expensive and quantized or non-differentiable. The solver is globally convergent and easily parallelizable. We show how to accelerate the optimization by exploiting special properties of the objective function, such as symmetry. Additionally, we discuss the trade-off between non-optimal stationary points and the cost reduction when optimizing the viewpoints consecutively.
△ Less
Submitted 20 March, 2021;
originally announced March 2021.
-
BrainNetGAN: Data augmentation of brain connectivity using generative adversarial network for dementia classification
Authors:
Chao Li,
Yiran Wei,
Xi Chen,
Carola-Bibiane Schonlieb
Abstract:
Alzheimer's disease (AD) is the most common age-related dementia. It remains a challenge to identify the individuals at risk of dementia for precise management. Brain MRI offers a noninvasive biomarker to detect brain aging. Previous evidence shows that the brain structural change detected by diffusion MRI is associated with dementia. A growing number of studies have conceptualised the brain as a complex network, an approach that has proven useful in characterising various neurological and psychiatric disorders. Therefore, the structural connectivity shows promise in dementia classification. The proposed BrainNetGAN is a generative adversarial network variant to augment the brain structural connectivity matrices for binary dementia classification tasks. Structural connectivity matrices between separated brain regions are constructed using tractography on diffusion MRI data. The BrainNetGAN model is trained to generate fake brain connectivity matrices, which are expected to reflect the latent distribution of the real brain network data. Finally, a convolutional neural network classifier is proposed for binary dementia classification. Numerical results show that the binary classification performance in the testing set was improved using the BrainNetGAN augmented dataset. The proposed methodology allows quick synthesis of an arbitrary number of augmented connectivity matrices and can be easily transferred to similar classification tasks.
Submitted 23 July, 2021; v1 submitted 10 March, 2021;
originally announced March 2021.
-
Wasserstein GANs Work Because They Fail (to Approximate the Wasserstein Distance)
Authors:
Jan Stanczuk,
Christian Etmann,
Lisa Maria Kreusser,
Carola-Bibiane Schönlieb
Abstract:
Wasserstein GANs are based on the idea of minimising the Wasserstein distance between a real and a generated distribution. We provide an in-depth mathematical analysis of differences between the theoretical setup and the reality of training Wasserstein GANs. In this work, we gather both theoretical and empirical evidence that the WGAN loss is not a meaningful approximation of the Wasserstein distance. Moreover, we argue that the Wasserstein distance is not even a desirable loss function for deep generative models, and conclude that the success of Wasserstein GANs can in truth be attributed to a failure to approximate the Wasserstein distance.
Submitted 5 October, 2021; v1 submitted 2 March, 2021;
originally announced March 2021.
-
Equivariant neural networks for inverse problems
Authors:
Elena Celledoni,
Matthias J. Ehrhardt,
Christian Etmann,
Brynjulf Owren,
Carola-Bibiane Schönlieb,
Ferdia Sherry
Abstract:
In recent years the use of convolutional layers to encode an inductive bias (translational equivariance) in neural networks has proven to be a very fruitful idea. The successes of this approach have motivated a line of research into incorporating other symmetries into deep learning methods, in the form of group equivariant convolutional neural networks. Much of this work has been focused on roto-translational symmetry of $\mathbf R^d$, but other examples are the scaling symmetry of $\mathbf R^d$ and rotational symmetry of the sphere. In this work, we demonstrate that group equivariant convolutional operations can naturally be incorporated into learned reconstruction methods for inverse problems that are motivated by the variational regularisation approach. Indeed, if the regularisation functional is invariant under a group symmetry, the corresponding proximal operator will satisfy an equivariance property with respect to the same group symmetry. As a result of this observation, we design learned iterative methods in which the proximal operators are modelled as group equivariant convolutional neural networks. We use roto-translationally equivariant operations in the proposed methodology and apply it to the problems of low-dose computerised tomography reconstruction and subsampled magnetic resonance imaging reconstruction. The proposed methodology is demonstrated to improve the reconstruction quality of a learned reconstruction method with little extra computational cost at training time and without any extra cost at test time.
Submitted 23 February, 2021;
originally announced February 2021.
-
Depthwise Separable Convolutions Allow for Fast and Memory-Efficient Spectral Normalization
Authors:
Christina Runkel,
Christian Etmann,
Michael Möller,
Carola-Bibiane Schönlieb
Abstract:
An increasing number of models require the control of the spectral norm of convolutional layers of a neural network. While there is an abundance of methods for estimating and enforcing upper bounds on those during training, they are typically costly in either memory or time. In this work, we introduce a very simple method for spectral normalization of depthwise separable convolutions, which introduces negligible computational and memory overhead. We demonstrate the effectiveness of our method on image classification tasks using standard architectures like MobileNetV2.
Submitted 12 February, 2021;
originally announced February 2021.
-
Unified Focal loss: Generalising Dice and cross entropy-based losses to handle class imbalanced medical image segmentation
Authors:
Michael Yeung,
Evis Sala,
Carola-Bibiane Schönlieb,
Leonardo Rundo
Abstract:
Automatic segmentation methods are an important advancement in medical image analysis. Machine learning techniques, and deep neural networks in particular, are the state-of-the-art for most medical image segmentation tasks. Issues with class imbalance pose a significant challenge in medical datasets, with lesions often occupying a considerably smaller volume relative to the background. Loss functions used in the training of deep learning algorithms differ in their robustness to class imbalance, with direct consequences for model convergence. The most commonly used loss functions for segmentation are based on either the cross entropy loss, Dice loss or a combination of the two. We propose the Unified Focal loss, a new hierarchical framework that generalises Dice and cross entropy-based losses for handling class imbalance. We evaluate our proposed loss function on five publicly available, class imbalanced medical imaging datasets: CVC-ClinicDB, Digital Retinal Images for Vessel Extraction (DRIVE), Breast Ultrasound 2017 (BUS2017), Brain Tumour Segmentation 2020 (BraTS20) and Kidney Tumour Segmentation 2019 (KiTS19). We compare our loss function performance against six Dice or cross entropy-based loss functions, across 2D binary, 3D binary and 3D multiclass segmentation tasks, demonstrating that our proposed loss function is robust to class imbalance and consistently outperforms the other loss functions. Source code is available at: https://github.com/mlyg/unified-focal-loss
Submitted 24 November, 2021; v1 submitted 8 February, 2021;
originally announced February 2021.
-
Expectation-Maximization Regularized Deep Learning for Weakly Supervised Tumor Segmentation for Glioblastoma
Authors:
Chao Li,
Wenjian Huang,
Xi Chen,
Yiran Wei,
Stephen J. Price,
Carola-Bibiane Schönlieb
Abstract:
We present an Expectation-Maximization (EM) Regularized Deep Learning (EMReDL) model for weakly supervised tumor segmentation. The proposed framework is tailored to glioblastoma, a type of malignant tumor characterized by its diffuse infiltration into the surrounding brain tissue, which poses a significant challenge to treatment target and tumor burden estimation using conventional structural MRI. Although physiological MRI provides more specific information regarding tumor infiltration, the relatively low resolution hinders a precise full annotation. This has motivated us to develop a weakly supervised deep learning solution that exploits the partially labelled tumor regions.
EMReDL contains two components: a physiological prior prediction model and an EM-regularized segmentation model. The physiological prior prediction model exploits the physiological MRI by training a classifier to generate a physiological prior map. This map is passed to the segmentation model for regularization using the EM algorithm. We evaluated the model on a glioblastoma dataset with the pre-operative multiparametric and recurrence MRI available. EMReDL was shown to effectively segment the infiltrated tumor from the partially labelled region of potential infiltration. The segmented core tumor and infiltrated tumor demonstrated high consistency with the tumor burden labelled by experts. The performance comparisons showed that EMReDL achieved higher accuracy than published state-of-the-art models. On MR spectroscopy, the segmented region displayed more aggressive features than other partially labelled regions. The proposed model can be generalized to other segmentation tasks that rely on partial labels, with the CNN architecture flexible in the framework.
Submitted 15 July, 2021; v1 submitted 21 January, 2021;
originally announced January 2021.
-
Beyond Fine-tuning: Classifying High Resolution Mammograms using Function-Preserving Transformations
Authors:
Tao Wei,
Angelica I Aviles-Rivero,
Shuo Wang,
Yuan Huang,
Fiona J Gilbert,
Carola-Bibiane Schönlieb,
Chang Wen Chen
Abstract:
The task of classifying mammograms is very challenging because the lesion is usually small in the high resolution image. The current state-of-the-art approaches for medical image classification rely on using the de-facto method for ConvNets - fine-tuning. However, there are fundamental differences between natural images and medical images, which, based on existing evidence from the literature, limit the overall performance gain when designed with algorithmic approaches. In this paper, we propose to go beyond fine-tuning by introducing a novel framework called MorphHR, in which we highlight a new transfer learning scheme. The idea behind the proposed framework is to integrate function-preserving transformations, for any continuous non-linear activation neurons, to internally regularise the network for improving mammogram classification. The proposed solution offers two major advantages over the existing techniques. Firstly, and unlike fine-tuning, the proposed approach allows for modifying not only the last few layers but also several of the first ones on a deep ConvNet. By doing this, we can design the network front to be suitable for learning domain specific features. Secondly, the proposed scheme is scalable to hardware. Therefore, one can fit high resolution images on standard GPU memory. We show that by using high resolution images, one prevents losing relevant information. We demonstrate, through numerical and visual experiments, that the proposed approach yields a significant improvement in the classification performance over state-of-the-art techniques, and is indeed on a par with radiology experts. Moreover, and for generalisation purposes, we show the effectiveness of the proposed learning scheme on another large dataset, the ChestX-ray14, surpassing current state-of-the-art techniques.
Submitted 19 January, 2021;
originally announced January 2021.
-
TFPnP: Tuning-free Plug-and-Play Proximal Algorithm with Applications to Inverse Imaging Problems
Authors:
Kaixuan Wei,
Angelica Aviles-Rivero,
Jingwei Liang,
Ying Fu,
Hua Huang,
Carola-Bibiane Schönlieb
Abstract:
Plug-and-Play (PnP) is a non-convex optimization framework that combines proximal algorithms, for example, the alternating direction method of multipliers (ADMM), with advanced denoising priors. Over the past few years, great empirical success has been obtained by PnP algorithms, especially for the ones that integrate deep learning-based denoisers. However, a key challenge of PnP approaches is the need for manual parameter tweaking as it is essential to obtain high-quality results across the high discrepancy in imaging conditions and varying scene content. In this work, we present a class of tuning-free PnP proximal algorithms that can determine parameters such as denoising strength, termination time, and other optimization-specific parameters automatically. A core part of our approach is a policy network for automated parameter search which can be effectively learned via a mixture of model-free and model-based deep reinforcement learning strategies. We demonstrate, through rigorous numerical and visual experiments, that the learned policy can customize parameters to different settings, and is often more efficient and effective than existing handcrafted criteria. Moreover, we discuss several practical considerations of PnP denoisers, which together with our learned policy yield state-of-the-art results. This advanced performance is prevalent on both linear and nonlinear exemplar inverse imaging problems, and in particular shows promising results on compressed sensing MRI, sparse-view CT, single-photon imaging, and phase retrieval.
Submitted 18 September, 2021; v1 submitted 18 November, 2020;
originally announced December 2020.
-
Bayesian optimization assisted unsupervised learning for efficient intra-tumor partitioning in MRI and survival prediction for glioblastoma patients
Authors:
Yifan Li,
Chao Li,
Stephen Price,
Carola-Bibiane Schönlieb,
Xi Chen
Abstract:
Glioblastoma is profoundly heterogeneous in microstructure and vasculature, which may lead to tumor regional diversity and distinct treatment response. Although successful in tumor sub-region segmentation and survival prediction, radiomics based on machine learning algorithms is challenged by its robustness, due to the vague intermediate process and track changes. Also, the weak interpretability of the model poses challenges to clinical application. Here we propose a machine learning framework to semi-automatically fine-tune the clustering algorithms and quantitatively identify stable sub-regions for reliable clinical survival prediction. Hyper-parameters are automatically determined by the global minimum of the trained Gaussian Process (GP) surrogate model through Bayesian optimization (BO) to alleviate the difficulty of tuning parameters for clinical researchers. To enhance the interpretability of the survival prediction model, we incorporated the prior knowledge of intra-tumoral heterogeneity, by segmenting tumor sub-regions and extracting sub-regional features. The results demonstrated that the global minimum of the trained GP surrogate can be used as sub-optimal hyper-parameter solutions for efficient. The sub-regions segmented based on physiological MRI can be applied to predict patient survival, which could enhance the clinical interpretability of the machine learning model.
Submitted 29 September, 2021; v1 submitted 5 December, 2020;
originally announced December 2020.
-
A Three-Stage Self-Training Framework for Semi-Supervised Semantic Segmentation
Authors:
Rihuan Ke,
Angelica Aviles-Rivero,
Saurabh Pandey,
Saikumar Reddy,
Carola-Bibiane Schönlieb
Abstract:
Semantic segmentation has been widely investigated in the community, in which the state-of-the-art techniques are based on supervised models. Those models have reported unprecedented performance at the cost of requiring a large set of high quality segmentation masks. Obtaining such annotations is highly expensive and time consuming, in particular in semantic segmentation, where pixel-level annotations are required. In this work, we address this problem by proposing a holistic solution framed as a three-stage self-training framework for semi-supervised semantic segmentation. The key idea of our technique is the extraction of the pseudo-masks' statistical information to decrease uncertainty in the predicted probability whilst enforcing segmentation consistency in a multi-task fashion. We achieve this through a three-stage solution. Firstly, we train a segmentation network to produce rough pseudo-masks whose predicted probability is highly uncertain. Secondly, we decrease the uncertainty of the pseudo-masks using a multi-task model that enforces consistency whilst exploiting the rich statistical information of the data. We compare our approach with existing methods for semi-supervised semantic segmentation and demonstrate its state-of-the-art performance with extensive experiments.
Submitted 9 January, 2022; v1 submitted 1 December, 2020;
originally announced December 2020.
-
Contrastive Registration for Unsupervised Medical Image Segmentation
Authors:
Lihao Liu,
Angelica I Aviles-Rivero,
Carola-Bibiane Schönlieb
Abstract:
Medical image segmentation is a relevant task as it serves as the first step for several diagnosis processes, and is thus indispensable in clinical usage. Whilst major success has been reported using supervised techniques, they assume a large and well-representative labelled set. This is a strong assumption in the medical domain where annotations are expensive, time-consuming, and subject to human bias. To address this problem, unsupervised techniques have been proposed in the literature, yet it is still an open problem due to the difficulty of learning any transformation pattern. In this work, we present a novel optimisation model framed into a new CNN-based contrastive registration architecture for unsupervised medical image segmentation. The core of our approach is to exploit image-level registration and feature-level from a contrastive learning mechanism, to perform registration-based segmentation. Firstly, we propose an architecture to capture the image-to-image transformation pattern via registration for unsupervised medical image segmentation. Secondly, we embed a contrastive learning mechanism into the registration architecture to enhance the discriminating capacity of the network at the feature level. We show that our proposed technique mitigates the major drawbacks of existing unsupervised techniques. We demonstrate, through numerical and visual experiments, that our technique substantially outperforms the current state-of-the-art unsupervised segmentation methods on two major medical image datasets.
Submitted 20 July, 2022; v1 submitted 17 November, 2020;
originally announced November 2020.
-
Regularized Compression of MRI Data: Modular Optimization of Joint Reconstruction and Coding
Authors:
Veronica Corona,
Yehuda Dar,
Guy Williams,
Carola-Bibiane Schönlieb
Abstract:
The Magnetic Resonance Imaging (MRI) processing chain starts with a critical acquisition stage that provides raw data for reconstruction of images for medical diagnosis. This flow usually includes a near-lossless data compression stage that enables digital storage and/or transmission in binary formats. In this work we propose a framework for joint optimization of the MRI reconstruction and lossy compression, producing compressed representations of medical images that achieve improved trade-offs between quality and bit-rate. Moreover, we demonstrate that lossy compression can even improve the reconstruction quality compared to settings based on lossless compression. Our method has a modular optimization structure, implemented using the alternating direction method of multipliers (ADMM) technique and the state-of-the-art image compression technique (BPG) as a black-box module iteratively applied. This establishes a medical data compression approach compatible with a lossy compression standard of choice. A main novelty of the proposed algorithm is in the total-variation regularization added to the modular compression process, leading to decompressed images of higher quality without any additional processing at/after the decompression stage. Our experiments show that our regularization-based approach for joint MRI reconstruction and compression often achieves significant PSNR gains of between 4 and 9 dB at high bit-rates compared to non-regularized solutions of the joint task. Compared to regularization-based solutions, our optimization method provides PSNR gains of between 0.5 and 1 dB at high bit-rates, which is the range of interest for medical image compression.
Submitted 9 November, 2020; v1 submitted 8 October, 2020;
originally announced October 2020.
-
GraphXCOVID: Explainable Deep Graph Diffusion Pseudo-Labelling for Identifying COVID-19 on Chest X-rays
Authors:
Angelica I Aviles-Rivero,
Philip Sellars,
Carola-Bibiane Schönlieb,
Nicolas Papadakis
Abstract:
Can one learn to diagnose COVID-19 under extreme minimal supervision? Since the outbreak of the novel COVID-19 there has been a rush to develop Artificial Intelligence techniques for expert-level disease identification on Chest X-ray data. In particular, the use of deep supervised learning has become the go-to paradigm. However, the performance of such models is heavily dependent on the availability of a large and representative labelled dataset. Creating such a dataset is an expensive and time-consuming task, and poses a particular challenge for a novel disease. Semi-supervised learning has shown the ability to match the incredible performance of supervised models whilst requiring a small fraction of the labelled examples. This makes the semi-supervised paradigm an attractive option for identifying COVID-19. In this work, we introduce a graph based deep semi-supervised framework for classifying COVID-19 from chest X-rays. Our framework introduces an optimisation model for graph diffusion that reinforces the natural relation among the tiny labelled set and the vast unlabelled data. We then connect the diffusion prediction output as pseudo-labels that are used in an iterative scheme in a deep net. We demonstrate, through our experiments, that our model is able to outperform the current leading supervised model with a tiny fraction of the labelled examples. Finally, we provide attention maps to accommodate the radiologist's mental model, better fitting their perceptual and cognitive abilities. These visualisations aim to assist the radiologist in judging whether the diagnosis is correct or not, and in consequence to accelerate the decision.
Submitted 4 July, 2021; v1 submitted 30 September, 2020;
originally announced October 2020.
-
A Linear Transportation $\mathrm{L}^p$ Distance for Pattern Recognition
Authors:
Oliver M. Crook,
Mihai Cucuringu,
Tim Hurst,
Carola-Bibiane Schönlieb,
Matthew Thorpe,
Konstantinos C. Zygalakis
Abstract:
The transportation $\mathrm{L}^p$ distance, denoted $\mathrm{TL}^p$, has been proposed as a generalisation of Wasserstein $\mathrm{W}^p$ distances motivated by the property that it can be applied directly to colour or multi-channelled images, as well as multivariate time-series without normalisation or mass constraints. These distances, as with $\mathrm{W}^p$, are powerful tools in modelling data with spatial or temporal perturbations. However, their computational cost can make them infeasible to apply to even moderate pattern recognition tasks. We propose linear versions of these distances and show that the linear $\mathrm{TL}^p$ distance significantly improves over the linear $\mathrm{W}^p$ distance on signal processing tasks, whilst being several orders of magnitude faster to compute than the $\mathrm{TL}^p$ distance.
Submitted 23 September, 2020;
originally announced September 2020.
-
Common pitfalls and recommendations for using machine learning to detect and prognosticate for COVID-19 using chest radiographs and CT scans
Authors:
Michael Roberts,
Derek Driggs,
Matthew Thorpe,
Julian Gilbey,
Michael Yeung,
Stephan Ursprung,
Angelica I. Aviles-Rivero,
Christian Etmann,
Cathal McCague,
Lucian Beer,
Jonathan R. Weir-McCall,
Zhongzhao Teng,
Effrossyni Gkrania-Klotsas,
James H. F. Rudd,
Evis Sala,
Carola-Bibiane Schönlieb
Abstract:
Machine learning methods offer great promise for fast and accurate detection and prognostication of COVID-19 from standard-of-care chest radiographs (CXR) and computed tomography (CT) images. Many articles have been published in 2020 describing new machine learning-based models for both of these tasks, but it is unclear which are of potential clinical utility. In this systematic review, we search EMBASE via OVID, MEDLINE via PubMed, bioRxiv, medRxiv and arXiv for published papers and preprints uploaded from January 1, 2020 to October 3, 2020 which describe new machine learning models for the diagnosis or prognosis of COVID-19 from CXR or CT images. Our search identified 2,212 studies, of which 415 were included after initial screening and, after quality screening, 61 studies were included in this systematic review. Our review finds that none of the models identified are of potential clinical use due to methodological flaws and/or underlying biases. This is a major weakness, given the urgency with which validated COVID-19 models are needed. To address this, we give many recommendations which, if followed, will solve these issues and lead to higher quality model development and well documented manuscripts.
Submitted 5 January, 2021; v1 submitted 14 August, 2020;
originally announced August 2020.
-
Unsupervised Image Restoration Using Partially Linear Denoisers
Authors:
Rihuan Ke,
Carola-Bibiane Schönlieb
Abstract:
Deep neural network based methods are the state of the art in various image restoration problems. Standard supervised learning frameworks require a set of noisy measurement and clean image pairs for which a distance between the output of the restoration model and the ground truth, clean images is minimized. The ground truth images, however, are often unavailable or very expensive to acquire in real-world applications. We circumvent this problem by proposing a class of structured denoisers that can be decomposed as the sum of a nonlinear image-dependent mapping, a linear noise-dependent term and a small residual term. We show that these denoisers can be trained with only noisy images under the condition that the noise has zero mean and known variance. The exact distribution of the noise, however, is not assumed to be known. We show the superiority of our approach for image denoising, and demonstrate its extension to solving other restoration problems such as blind deblurring where the ground truth is not available. Our method outperforms some recent unsupervised and self-supervised deep denoising models that do not require clean images for their training. For blind deblurring problems, the method, using only one noisy and blurry observation per image, reaches a quality not far away from its fully supervised counterparts on a benchmark dataset.
Submitted 30 March, 2021; v1 submitted 13 August, 2020;
originally announced August 2020.
-
Scanning electron diffraction tomography of strain
Authors:
Robert Tovey,
Duncan N. Johnstone,
Sean M. Collins,
William R. B. Lionheart,
Paul A. Midgley,
Martin Benning,
Carola-Bibiane Schönlieb
Abstract:
Strain engineering is used to obtain desirable materials properties in a range of modern technologies. Direct nanoscale measurement of the three-dimensional strain tensor field within these materials has however been limited by a lack of suitable experimental techniques and data analysis tools. Scanning electron diffraction has emerged as a powerful tool for obtaining two-dimensional maps of strain components perpendicular to the incident electron beam direction. Extension of this method to recover the full three-dimensional strain tensor field has been restricted though by the absence of a formal framework for tensor tomography using such data. Here, we show that it is possible to reconstruct the full non-symmetric strain tensor field as the solution to an ill-posed tensor tomography inverse problem. We then demonstrate the properties of this tomography problem both analytically and computationally, highlighting why incorporating precession to perform scanning precession electron diffraction may be important. We establish a general framework for non-symmetric tensor tomography and demonstrate computationally its applicability for achieving strain tomography with scanning precession electron diffraction data.
Submitted 10 November, 2020; v1 submitted 3 August, 2020;
originally announced August 2020.
-
Learned convex regularizers for inverse problems
Authors:
Subhadip Mukherjee,
Sören Dittmer,
Zakhar Shumaylov,
Sebastian Lunz,
Ozan Öktem,
Carola-Bibiane Schönlieb
Abstract:
We consider the variational reconstruction framework for inverse problems and propose to learn a data-adaptive input-convex neural network (ICNN) as the regularization functional. The ICNN-based convex regularizer is trained adversarially to discern ground-truth images from unregularized reconstructions. Convexity of the regularizer is desirable since (i) one can establish analytical convergence guarantees for the corresponding variational reconstruction problem and (ii) devise efficient and provable algorithms for reconstruction. In particular, we show that the optimal solution to the variational problem converges to the ground-truth if the penalty parameter decays sub-linearly with respect to the norm of the noise. Further, we prove the existence of a sub-gradient-based algorithm that leads to a monotonically decreasing error in the parameter space with iterations. To demonstrate the performance of our approach for solving inverse problems, we consider the tasks of deblurring natural images and reconstructing images in computed tomography (CT), and show that the proposed convex regularizer is at least competitive with and sometimes superior to state-of-the-art data-driven techniques for inverse problems.
Submitted 1 March, 2021; v1 submitted 6 August, 2020;
originally announced August 2020.
-
Image reconstruction in dynamic inverse problems with temporal models
Authors:
Andreas Hauptmann,
Ozan Öktem,
Carola Schönlieb
Abstract:
The paper surveys variational approaches for image reconstruction in dynamic inverse problems. Emphasis is on methods that rely on parametrised temporal models. These are here encoded as diffeomorphic deformations with time dependent parameters, or as motion constrained reconstruction where the motion model is given by a partial differential equation. The survey also includes recent development in integrating deep learning for solving these computationally demanding variational methods. Examples are given for 2D dynamic tomography, but methods apply to general inverse problems.
Submitted 20 July, 2020;
originally announced July 2020.
-
Art Speaks Maths, Maths Speaks Art
Authors:
Ninetta Leone,
Simone Parisotto,
Kasia Targonska-Hadzibabic,
Spike Bucklow,
Alessandro Launaro,
Suzanne Reynolds,
Carola-Bibiane Schönlieb
Abstract:
Our interdisciplinary team Mathematics for Applications in Cultural Heritage (MACH) aims to use mathematical research for the benefit of the arts and humanities. Our ultimate goal is to create user-friendly software toolkits for artists, art conservators and archaeologists. In order for their underlying mathematical engines and functionality to be optimised for the needs of the end users, we pursue an iterative approach based on a continuous communication between the mathematicians and the cultural-heritage members of our team. Our paper illustrates how maths can speak art, but only if first art speaks maths.
Submitted 17 July, 2020;
originally announced July 2020.
-
Ground Truth Free Denoising by Optimal Transport
Authors:
Sören Dittmer,
Carola-Bibiane Schönlieb,
Peter Maass
Abstract:
We present a learned unsupervised denoising method for arbitrary types of data, which we explore on images and one-dimensional signals. The training is solely based on samples of noisy data and examples of noise, which -- critically -- do not need to come in pairs. We only need the assumption that the noise is independent and additive (although we describe how this can be extended). The method rests on a Wasserstein Generative Adversarial Network setting, which utilizes two critics and one generator.
Submitted 3 July, 2020;
originally announced July 2020.
-
Deeply Learned Spectral Total Variation Decomposition
Authors:
Tamara G. Grossmann,
Yury Korolev,
Guy Gilboa,
Carola-Bibiane Schönlieb
Abstract:
Non-linear spectral decompositions of images based on one-homogeneous functionals such as total variation have gained considerable attention in the last few years. Due to their ability to extract spectral components corresponding to objects of different size and contrast, such decompositions enable filtering, feature transfer, image fusion and other applications. However, obtaining this decomposition involves solving multiple non-smooth optimisation problems and is therefore computationally highly intensive. In this paper, we present a neural network approximation of a non-linear spectral decomposition. We report up to four orders of magnitude ($\times 10,000$) speedup in processing of mega-pixel size images, compared to classical GPU implementations. Our proposed network, TVSpecNET, is able to implicitly learn the underlying PDE and, despite being entirely data driven, inherits invariances of the model based transform. To the best of our knowledge, this is the first approach towards learning a non-linear spectral decomposition of images. Not only do we gain a staggering computational advantage, but this approach can also be seen as a step towards studying neural networks that can decompose an image into spectral components defined by a user rather than a handcrafted functional.
Submitted 21 October, 2020; v1 submitted 17 June, 2020;
originally announced June 2020.
-
SLIC-UAV: A Method for monitoring recovery in tropical restoration projects through identification of signature species using UAVs
Authors:
Jonathan Williams,
Carola-Bibiane Schönlieb,
Tom Swinfield,
Bambang Irawan,
Eva Achmad,
Muhammad Zudhi,
Habibi,
Elva Gemita,
David A. Coomes
Abstract:
Logged forests cover four million square kilometres of the tropics and restoring these forests is essential if we are to avoid the worst impacts of climate change, yet monitoring recovery is challenging. Tracking the abundance of visually identifiable, early-successional species enables successional status and thereby restoration progress to be evaluated. Here we present a new pipeline, SLIC-UAV, for processing Unmanned Aerial Vehicle (UAV) imagery to map early-successional species in tropical forests. The pipeline is novel because it comprises: (a) a time-efficient approach for labelling crowns from UAV imagery; (b) machine learning of species based on spectral and textural features within individual tree crowns, and (c) automatic segmentation of orthomosaiced UAV imagery into 'superpixels', using Simple Linear Iterative Clustering (SLIC). Creating superpixels reduces the dataset's dimensionality and focuses prediction onto clusters of pixels, greatly improving accuracy. To demonstrate SLIC-UAV, support vector machines and random forests were used to predict the species of hand-labelled crowns in a restoration concession in Indonesia. Random forests were most accurate at discriminating species for whole crowns, with accuracy ranging from 79.3% when mapping five common species, to 90.5% when mapping the three most visually-distinctive species. In contrast, support vector machines proved better for labelling automatically segmented superpixels, with accuracy ranging from 74.3% to 91.7% for the same species. Models were extended to map species across 100 hectares of forest. The study demonstrates the power of SLIC-UAV for mapping characteristic early-successional tree species as an indicator of successional stage within tropical forest restoration areas. Continued effort is needed to develop easy-to-implement and low-cost technology to improve the affordability of project management.
Submitted 11 June, 2020;
originally announced June 2020.
-
Structure preserving deep learning
Authors:
Elena Celledoni,
Matthias J. Ehrhardt,
Christian Etmann,
Robert I McLachlan,
Brynjulf Owren,
Carola-Bibiane Schönlieb,
Ferdia Sherry
Abstract:
Over the past few years, deep learning has risen to the foreground as a topic of massive interest, mainly as a result of successes obtained in solving large-scale image processing tasks. There are multiple challenging mathematical problems involved in applying deep learning: most deep learning methods require the solution of hard optimisation problems, and a good understanding of the tradeoff between computational effort, amount of data and model complexity is required to successfully design a deep learning approach for a given problem. A large amount of progress made in deep learning has been based on heuristic explorations, but there is a growing effort to mathematically understand the structure in existing deep learning methods and to systematically design new deep learning methods to preserve certain types of structure in deep learning. In this article, we review a number of these directions: some deep neural networks can be understood as discretisations of dynamical systems, neural networks can be designed to have desirable properties such as invertibility or group equivariance, and new algorithmic frameworks based on conformal Hamiltonian systems and Riemannian manifolds to solve the optimisation problems have been proposed. We conclude our review of each of these topics by discussing some open problems that we consider to be interesting directions for future research.
Submitted 5 June, 2020;
originally announced June 2020.
-
Unsupervised clustering of Roman pottery profiles from their SSAE representation
Authors:
Simone Parisotto,
Alessandro Launaro,
Ninetta Leone,
Carola-Bibiane Schönlieb
Abstract:
In this paper we introduce the ROman COmmonware POTtery (ROCOPOT) database, which comprises more than 2000 black and white imaging profiles of pottery shapes extracted from 11 Roman catalogues and related to different excavation sites. The partiality and the handcrafted variance of the shape fragments within this new database make their unsupervised clustering a very challenging problem: profile similarities are thus explored via the hierarchical clustering of non-linear features learned in the latent representation space of a stacked sparse autoencoder (SSAE) network, unveiling new profile matches. Results are discussed from both a mathematical and an archaeological perspective so as to unlock new research directions in the respective communities.
Submitted 4 June, 2020;
originally announced June 2020.
-
Variational regularisation for inverse problems with imperfect forward operators and general noise models
Authors:
Leon Bungert,
Martin Burger,
Yury Korolev,
Carola-Bibiane Schoenlieb
Abstract:
We study variational regularisation methods for inverse problems with imperfect forward operators whose errors can be modelled by order intervals in a partial order of a Banach lattice. We carry out analysis with respect to existence and convex duality for general data fidelity terms and regularisation functionals. Both for a-priori and a-posteriori parameter choice rules, we obtain convergence rates of the regularized solutions in terms of Bregman distances. Our results apply to fidelity terms such as Wasserstein distances, f-divergences, norms, as well as sums and infimal convolutions of those.
Submitted 23 October, 2020; v1 submitted 28 May, 2020;
originally announced May 2020.
-
Multi-task deep learning for image segmentation using recursive approximation tasks
Authors:
Rihuan Ke,
Aurélie Bugeau,
Nicolas Papadakis,
Mark Kirkland,
Peter Schuetz,
Carola-Bibiane Schönlieb
Abstract:
Fully supervised deep neural networks for segmentation usually require a massive amount of pixel-level labels which are expensive to create manually. In this work, we develop a multi-task learning method to relax this constraint. We regard the segmentation problem as a sequence of approximation subproblems that are recursively defined and in increasing levels of approximation accuracy. The subproblems are handled by a framework that consists of 1) a segmentation task that learns from pixel-level ground truth segmentation masks of a small fraction of the images, 2) a recursive approximation task that conducts partial object regions learning and data-driven mask evolution starting from partial masks of each object instance, and 3) other problem oriented auxiliary tasks that are trained with sparse annotations and promote the learning of dedicated features. Most training images are only labeled by (rough) partial masks, which do not contain exact object boundaries, rather than by their full segmentation masks. During the training phase, the approximation task learns the statistics of these partial masks, and the partial regions are recursively increased towards object boundaries aided by the learned information from the segmentation task in a fully data-driven fashion. The network is trained on an extremely small amount of precisely segmented images and a large set of coarse labels. Annotations can thus be obtained in a cheap way. We demonstrate the efficiency of our approach in three applications with microscopy images and ultrasound images.
Submitted 26 May, 2020;
originally announced May 2020.
-
3D deformable registration of longitudinal abdominopelvic CT images using unsupervised deep learning
Authors:
Maureen van Eijnatten,
Leonardo Rundo,
K. Joost Batenburg,
Felix Lucka,
Emma Beddowes,
Carlos Caldas,
Ferdia A. Gallagher,
Evis Sala,
Carola-Bibiane Schönlieb,
Ramona Woitek
Abstract:
This study investigates the use of the unsupervised deep learning framework VoxelMorph for deformable registration of longitudinal abdominopelvic CT images acquired in patients with bone metastases from breast cancer. The CT images were refined prior to registration by automatically removing the CT table and all other extra-corporeal components. To improve the learning capabilities of VoxelMorph when only a limited amount of training data is available, a novel incremental training strategy is proposed based on simulated deformations of consecutive CT images. In a 4-fold cross-validation scheme, the incremental training strategy achieved significantly better registration performance compared to training on a single volume. Although our deformable image registration method did not outperform iterative registration using NiftyReg (considered as a benchmark) in terms of registration quality, the registrations were approximately 300 times faster. This study showed the feasibility of deep learning based deformable registration of longitudinal abdominopelvic CT images via a novel incremental training strategy based on simulated deformations.
Submitted 15 May, 2020;
originally announced May 2020.
-
On Learned Operator Correction in Inverse Problems
Authors:
Sebastian Lunz,
Andreas Hauptmann,
Tanja Tarvainen,
Carola-Bibiane Schönlieb,
Simon Arridge
Abstract:
We discuss the possibility of learning a data-driven explicit model correction for inverse problems and whether such a model correction can be used within a variational framework to obtain regularised reconstructions. This paper discusses the conceptual difficulty of learning such a forward model correction and proceeds to present a possible solution as forward-adjoint correction that explicitly corrects in both data and solution spaces. We then derive conditions under which solutions to the variational problem with a learned correction converge to solutions obtained with the correct operator. The proposed approach is evaluated on an application to limited view photoacoustic tomography and compared to the established framework of the Bayesian approximation error method.
Submitted 21 October, 2020; v1 submitted 14 May, 2020;
originally announced May 2020.
-
iUNets: Fully invertible U-Nets with Learnable Up- and Downsampling
Authors:
Christian Etmann,
Rihuan Ke,
Carola-Bibiane Schönlieb
Abstract:
U-Nets have been established as a standard architecture for image-to-image learning problems such as segmentation and inverse problems in imaging. For large-scale data, as arises, for example, in 3D medical imaging, the U-Net, however, has prohibitive memory requirements. Here, we present a new fully-invertible U-Net-based architecture called the iUNet, which employs novel learnable and invertible up- and downsampling operations, thereby making the use of memory-efficient backpropagation possible. This allows us to train deeper and larger networks in practice, under the same GPU memory restrictions. Due to its invertibility, the iUNet can furthermore be used for constructing normalizing flows.
Submitted 30 June, 2020; v1 submitted 11 May, 2020;
originally announced May 2020.
-
Learning Optical Flow for Fast MRI Reconstruction
Authors:
T. Schmoderer,
A. I Aviles-Rivero,
V. Corona,
N. Debroux,
C-B. Schönlieb
Abstract:
Reconstructing high-quality magnetic resonance images (MRI) from undersampled raw data is of great interest from both technical and clinical points of view. To date, however, it is still a mathematically and computationally challenging problem due to its severe ill-posedness, resulting from the highly undersampled data leading to significantly missing information. Whilst a number of techniques have been presented to improve image reconstruction, they only account for spatio-temporal regularisation, which shows its limitations in several relevant scenarios including dynamic data. In this work, we propose a new mathematical model for the reconstruction of high-quality medical MRI from few measurements. Our proposed approach combines - \textit{in a multi-task and hybrid model} - the traditional compressed sensing formulation for the reconstruction of dynamic MRI with motion compensation by learning an optical flow approximation. More precisely, we propose to encode the dynamics in the form of an optical flow model that is sparsely represented over a learned dictionary. This has the advantage that ground truth data is not required in the training of the optical flow term. Furthermore, we present an efficient optimisation scheme to tackle the non-convex problem based on an alternating splitting method. We demonstrate the potential of our approach through an extensive set of numerical results using different datasets and acceleration factors. Our combined approach reaches and outperforms several state of the art techniques. Finally, we show the ability of our technique to transfer phantom based knowledge to real datasets.
Submitted 26 October, 2020; v1 submitted 22 April, 2020;
originally announced April 2020.
-
SPRING: A fast stochastic proximal alternating method for non-smooth non-convex optimization
Authors:
Derek Driggs,
Junqi Tang,
Jingwei Liang,
Mike Davies,
Carola-Bibiane Schönlieb
Abstract:
We introduce SPRING, a novel stochastic proximal alternating linearized minimization algorithm for solving a class of non-smooth and non-convex optimization problems. Large-scale imaging problems are becoming increasingly prevalent due to advances in data acquisition and computational capabilities. Motivated by the success of stochastic optimization methods, we propose a stochastic variant of the proximal alternating linearized minimization (PALM) algorithm \cite{bolte2014proximal}. We provide global convergence guarantees, demonstrating that our proposed method with variance-reduced stochastic gradient estimators, such as SAGA \cite{SAGA} and SARAH \cite{sarah}, achieves state-of-the-art oracle complexities. We also demonstrate the efficacy of our algorithm via several numerical examples including sparse non-negative matrix factorization, sparse principal component analysis, and blind image deconvolution.
Submitted 19 January, 2021; v1 submitted 27 February, 2020;
originally announced February 2020.
-
Tuning-free Plug-and-Play Proximal Algorithm for Inverse Imaging Problems
Authors:
Kaixuan Wei,
Angelica Aviles-Rivero,
Jingwei Liang,
Ying Fu,
Carola-Bibiane Schönlieb,
Hua Huang
Abstract:
Plug-and-play (PnP) is a non-convex framework that combines ADMM or other proximal algorithms with advanced denoiser priors. Recently, PnP has achieved great empirical success, especially with the integration of deep learning-based denoisers. However, a key problem of PnP based approaches is that they require manual parameter tweaking. It is necessary to obtain high-quality results across the high discrepancy in terms of imaging conditions and varying scene content. In this work, we present a tuning-free PnP proximal algorithm, which can automatically determine the internal parameters including the penalty parameter, the denoising strength and the terminal time. A key part of our approach is to develop a policy network for automatic search of parameters, which can be effectively learned via mixed model-free and model-based deep reinforcement learning. We demonstrate, through numerical and visual experiments, that the learned policy can customize different parameters for different states, and is often more efficient and effective than existing handcrafted criteria. Moreover, we discuss the practical considerations of the plugged denoisers, which together with our learned policy yield state-of-the-art results. This is prevalent on both linear and nonlinear exemplary inverse imaging problems, and in particular, we show promising results on Compressed Sensing MRI and phase retrieval.
Submitted 18 November, 2020; v1 submitted 21 February, 2020;
originally announced February 2020.
-
CycleCluster: Modernising Clustering Regularisation for Deep Semi-Supervised Classification
Authors:
Philip Sellars,
Angelica Aviles-Rivero,
Carola Bibiane Schönlieb
Abstract:
Given the potential difficulties in obtaining large quantities of labelled data, many works have explored the use of deep semi-supervised learning, which uses both labelled and unlabelled data to train a neural network architecture. The vast majority of SSL approaches focus on implementing the low-density separation assumption or consistency assumption, the idea that decision boundaries should lie in low density regions. However, they have implemented this assumption by making local changes to the decision boundary at each data point, ignoring the global structure of the data. In this work, we explore an alternative approach using the global information present in the clustered data to update our decision boundaries. We propose a novel framework, CycleCluster, for deep semi-supervised classification. Our core optimisation is driven by a new clustering based regularisation along with graph based pseudo-labels and a shared deep network, demonstrating that direct implementation of the cluster assumption is a viable alternative to the popular consistency based regularisation. We demonstrate the predictive capability of our technique through a careful set of numerical results.
Submitted 1 September, 2021; v1 submitted 15 January, 2020;
originally announced January 2020.