-
A Guide to Stochastic Optimisation for Large-Scale Inverse Problems
Authors:
Matthias J. Ehrhardt,
Zeljko Kereta,
Jingwei Liang,
Junqi Tang
Abstract:
Stochastic optimisation algorithms are the de facto standard for machine learning with large amounts of data. Handling only a subset of available data in each optimisation step dramatically reduces the per-iteration computational costs, while still ensuring significant progress towards the solution. Driven by the need to solve large-scale optimisation problems as efficiently as possible, the last decade has witnessed an explosion of research in this area. Leveraging the parallels between machine learning and inverse problems has allowed harnessing the power of this research wave for solving inverse problems. In this survey, we provide a comprehensive account of the state-of-the-art in stochastic optimisation from the viewpoint of inverse problems. We present algorithms with diverse modalities of problem randomisation and discuss the roles of variance reduction, acceleration, higher-order methods, and other algorithmic modifications, and compare theoretical results with practical behaviour. We focus on the potential and the challenges for stochastic optimisation that are unique to inverse imaging problems and are not commonly encountered in machine learning. We conclude the survey with illustrative examples from imaging problems to examine the advantages and disadvantages that this new generation of algorithms brings to the field of inverse problems.
Submitted 10 June, 2024;
originally announced June 2024.
-
Learning Preconditioners for Inverse Problems
Authors:
Matthias J. Ehrhardt,
Patrick Fahy,
Mohammad Golbabaee
Abstract:
We explore the application of preconditioning in optimisation algorithms, specifically those appearing in Inverse Problems in imaging. Such problems often contain an ill-posed forward operator and are large-scale. Therefore, computationally efficient algorithms which converge quickly are desirable. To remedy these issues, learning-to-optimise leverages training data to accelerate solving particular optimisation problems. Many traditional optimisation methods use scalar hyperparameters, significantly limiting their convergence speed when applied to ill-conditioned problems. In contrast, we propose a novel approach that replaces these scalar quantities with matrices learned using data. Often, preconditioning considers only symmetric positive-definite preconditioners. However, we consider multiple parametrisations of the preconditioner, which do not require symmetry or positive-definiteness. These parametrisations include using full matrices, diagonal matrices, and convolutions. We analyse the convergence properties of these methods and compare their performance against classical optimisation algorithms. Generalisation performance of these methods is also considered, both for in-distribution and out-of-distribution data.
Submitted 31 May, 2024;
originally announced June 2024.
-
Coupling Analysis of the Asymptotic Behaviour of a Primal-Dual Langevin Algorithm
Authors:
Martin Burger,
Matthias J. Ehrhardt,
Lorenz Kuger,
Lukas Weigand
Abstract:
We analyze a recently proposed algorithm for the problem of sampling from probability distributions $μ^\ast$ in $\mathbb{R}^d$ with a Lebesgue density of the form $μ^\ast(x) \propto \exp(-f(Kx)-g(x))$, where $K$ is a linear operator and $f,g$ convex and non-smooth. The algorithm is a generalization of the primal-dual hybrid gradient (PDHG) convex optimization algorithm to a sampling scheme. We analyze the method's continuous time limit, an SDE in the joint primal-dual variable. We give mild conditions under which the corresponding Fokker-Planck equation converges to a unique stationary state, which however does not concentrate in the dual variable and consequently does not have $μ^\ast$ as its primal marginal. Under a smoothness assumption on $f$, we show that the scheme converges to the purely primal overdamped Langevin diffusion in the limit of small primal and large dual step sizes. We further prove that the target can never be the primal marginal of the invariant solution for any modification of the SDE with space-homogeneous diffusion coefficient. A correction with inhomogeneous diffusion coefficient and the correct invariant solution is proposed, but the scheme requires the same smoothness assumptions on $f$ and is numerically inferior to overdamped Langevin diffusion. We demonstrate our findings numerically, first on small-scale examples in which we can exactly verify the theoretical results, and subsequently on typical examples of larger scale from Bayesian imaging inverse problems.
Submitted 29 May, 2024; v1 submitted 28 May, 2024;
originally announced May 2024.
-
Practical Acceleration of the Condat-Vũ Algorithm
Authors:
Derek Driggs,
Matthias J. Ehrhardt,
Carola-Bibiane Schönlieb,
Junqi Tang
Abstract:
The Condat-Vũ algorithm is a widely used primal-dual method for optimizing composite objectives of three functions. Several algorithms for optimizing composite objectives of two functions are special cases of Condat-Vũ, including proximal gradient descent (PGD). It is well known that PGD exhibits suboptimal performance, and that a simple adjustment to PGD can accelerate its convergence rate from $\mathcal{O}(1/T)$ to $\mathcal{O}(1/T^2)$ on convex objectives; this accelerated rate is optimal. In this work, we show that a simple adjustment to the Condat-Vũ algorithm allows it to recover accelerated PGD (APGD) as a special case, instead of PGD. We prove that this accelerated Condat-Vũ algorithm achieves optimal convergence rates and significantly outperforms the traditional Condat-Vũ algorithm in regimes where the Condat-Vũ algorithm approximates the dynamics of PGD. We demonstrate the effectiveness of our approach in various applications in machine learning and computational imaging.
Submitted 25 March, 2024;
originally announced March 2024.
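The PGD versus APGD comparison mentioned in this abstract can be illustrated with a minimal sketch. It is not the accelerated Condat-Vũ method from the paper; it only contrasts plain proximal gradient descent with a FISTA-style momentum variant on a small synthetic lasso problem. The problem sizes, the regularisation weight `lam`, and all function names are assumptions made for illustration.

```python
import numpy as np


def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (element-wise soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def lasso_objective(A, b, lam, x):
    """Objective 0.5 * ||Ax - b||^2 + lam * ||x||_1."""
    return 0.5 * np.sum((A @ x - b) ** 2) + lam * np.sum(np.abs(x))


def pgd(A, b, lam, n_iter, accelerate=False):
    """Proximal gradient descent for the lasso.

    With accelerate=True a FISTA-style momentum step is added (illustrative choice).
    """
    L = np.linalg.norm(A, 2) ** 2  # Lipschitz constant of the smooth part's gradient
    x = np.zeros(A.shape[1])
    y = x.copy()  # extrapolated point (equals x when not accelerating)
    t = 1.0
    for _ in range(n_iter):
        grad = A.T @ (A @ y - b)
        x_new = soft_threshold(y - grad / L, lam / L)
        if accelerate:
            t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
            y = x_new + ((t - 1.0) / t_new) * (x_new - x)
            t = t_new
        else:
            y = x_new
        x = x_new
    return x


# Synthetic test problem (hypothetical data, fixed seed for reproducibility).
rng = np.random.default_rng(0)
A = rng.standard_normal((40, 100))
x_true = np.zeros(100)
x_true[:5] = 1.0
b = A @ x_true + 0.01 * rng.standard_normal(40)
lam = 0.1

f_start = lasso_objective(A, b, lam, np.zeros(100))
f_pgd = lasso_objective(A, b, lam, pgd(A, b, lam, 200))
f_apgd = lasso_objective(A, b, lam, pgd(A, b, lam, 200, accelerate=True))
print(f_start, f_pgd, f_apgd)
```

Printing the three objective values allows the two variants to be compared after the same number of iterations; how large the gap is depends on the conditioning of `A` and on `lam`.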
-
An adaptively inexact first-order method for bilevel optimization with application to hyperparameter learning
Authors:
Mohammad Sadegh Salehi,
Subhadip Mukherjee,
Lindon Roberts,
Matthias J. Ehrhardt
Abstract:
Various tasks in data science are modeled utilizing the variational regularization approach, where manually selecting regularization parameters presents a challenge. The difficulty is exacerbated when employing regularizers involving a large number of hyperparameters. To overcome this challenge, bilevel learning can be employed to learn such parameters from data. However, neither exact function values nor exact gradients with respect to the hyperparameters are attainable, necessitating methods that only rely on inexact evaluation of such quantities. State-of-the-art inexact gradient-based methods select a sequence of the required accuracies a priori and cannot identify an appropriate step size since the Lipschitz constant of the hypergradient is unknown. In this work, we propose an algorithm with backtracking line search that only relies on inexact function evaluations and hypergradients and show convergence to a stationary point. Furthermore, the proposed algorithm determines the required accuracy dynamically rather than requiring it to be selected manually before the run. Our numerical experiments demonstrate the efficiency and feasibility of our approach for hyperparameter estimation on a range of relevant problems in imaging and data science such as total variation and field of experts denoising and multinomial logistic regression. In particular, the results show that the algorithm is robust to its own hyperparameters such as the initial accuracies and step size.
Submitted 10 April, 2024; v1 submitted 19 August, 2023;
originally announced August 2023.
-
Proximal Langevin Sampling With Inexact Proximal Mapping
Authors:
Matthias J. Ehrhardt,
Lorenz Kuger,
Carola-Bibiane Schönlieb
Abstract:
In order to solve tasks like uncertainty quantification or hypothesis tests in Bayesian imaging inverse problems, we often have to draw samples from the arising posterior distribution. For the usually log-concave but high-dimensional posteriors, Markov chain Monte Carlo methods based on time discretizations of Langevin diffusion are a popular tool. If the potential defining the distribution is non-smooth, these discretizations are usually of an implicit form leading to Langevin sampling algorithms that require the evaluation of proximal operators. For some of the potentials relevant in imaging problems this is only possible approximately using an iterative scheme. We investigate the behaviour of a proximal Langevin algorithm under the presence of errors in the evaluation of proximal mappings. We generalize existing non-asymptotic and asymptotic convergence results of the exact algorithm to our inexact setting and quantify the bias between the target and the algorithm's stationary distribution due to the errors. We show that the additional bias stays bounded for bounded errors and converges to zero for decaying errors in a strongly convex setting. We apply the inexact algorithm to sample numerically from the posterior of typical imaging inverse problems in which we can only approximate the proximal operator by an iterative scheme and validate our theoretical convergence results.
Submitted 13 May, 2024; v1 submitted 30 June, 2023;
originally announced June 2023.
-
Designing Stable Neural Networks using Convex Analysis and ODEs
Authors:
Ferdia Sherry,
Elena Celledoni,
Matthias J. Ehrhardt,
Davide Murari,
Brynjulf Owren,
Carola-Bibiane Schönlieb
Abstract:
Motivated by classical work on the numerical integration of ordinary differential equations we present a ResNet-styled neural network architecture that encodes non-expansive (1-Lipschitz) operators, as long as the spectral norms of the weights are appropriately constrained. This is to be contrasted with the ordinary ResNet architecture which, even if the spectral norms of the weights are constrained, has a Lipschitz constant that, in the worst case, grows exponentially with the depth of the network. Further analysis of the proposed architecture shows that the spectral norms of the weights can be further constrained to ensure that the network is an averaged operator, making it a natural candidate for a learned denoiser in Plug-and-Play algorithms. Using a novel adaptive way of enforcing the spectral norm constraints, we show that, even with these constraints, it is possible to train performant networks. The proposed architecture is applied to the problem of adversarially robust image classification, to image denoising, and finally to the inverse problem of deblurring.
Submitted 18 April, 2024; v1 submitted 29 June, 2023;
originally announced June 2023.
-
On Optimal Regularization Parameters via Bilevel Learning
Authors:
Matthias J. Ehrhardt,
Silvia Gazzola,
Sebastian J. Scott
Abstract:
Variational regularization is commonly used to solve linear inverse problems, and involves augmenting a data fidelity by a regularizer. The regularizer is used to promote a priori information and is weighted by a regularization parameter. Selection of an appropriate regularization parameter is critical, with various choices leading to very different reconstructions. Classical strategies used to determine a suitable parameter value include the discrepancy principle and the L-curve criterion, and in recent years a supervised machine learning approach called bilevel learning has been employed. Bilevel learning is a powerful framework to determine optimal parameters and involves solving a nested optimization problem. While previous strategies enjoy various theoretical results, the well-posedness of bilevel learning in this setting is still an open question. In particular, a necessary property is positivity of the determined regularization parameter. In this work, we provide a new condition that better characterizes positivity of optimal regularization parameters than the existing theory. Numerical results verify and explore this new condition for both small and high-dimensional problems.
Submitted 22 January, 2024; v1 submitted 28 May, 2023;
originally announced May 2023.
-
Analyzing Inexact Hypergradients for Bilevel Learning
Authors:
Matthias J. Ehrhardt,
Lindon Roberts
Abstract:
Estimating hyperparameters has been a long-standing problem in machine learning. We consider the case where the task at hand is modeled as the solution to an optimization problem. Here the exact gradient with respect to the hyperparameters cannot be feasibly computed and approximate strategies are required. We introduce a unified framework for computing hypergradients that generalizes existing methods based on the implicit function theorem and automatic differentiation/backpropagation, showing that these two seemingly disparate approaches are actually tightly connected. Our framework is extremely flexible, allowing its subproblems to be solved with any suitable method, to any degree of accuracy. We derive a priori and computable a posteriori error bounds for all our methods, and numerically show that our a posteriori bounds are usually more accurate. Our numerical results also show that, surprisingly, for efficient bilevel optimization, the choice of hypergradient algorithm is at least as important as the choice of lower-level solver.
Submitted 14 November, 2023; v1 submitted 11 January, 2023;
originally announced January 2023.
-
Stochastic Primal Dual Hybrid Gradient Algorithm with Adaptive Step-Sizes
Authors:
Antonin Chambolle,
Claire Delplancke,
Matthias J. Ehrhardt,
Carola-Bibiane Schönlieb,
Junqi Tang
Abstract:
In this work we propose a new primal-dual algorithm with adaptive step-sizes. The stochastic primal-dual hybrid gradient (SPDHG) algorithm with constant step-sizes has become widely applied in large-scale convex optimization across many scientific fields due to its scalability. While the product of the primal and dual step-sizes is subject to an upper bound in order to ensure convergence, the selection of the ratio of the step-sizes is critical in applications. Up to now, there has been no systematic and successful way of selecting the primal and dual step-sizes for SPDHG. In this work, we propose a general class of adaptive SPDHG (A-SPDHG) algorithms, and prove their convergence under weak assumptions. We also propose concrete parameter-updating strategies which satisfy the assumptions of our theory and thereby lead to convergent algorithms. Numerical examples on computed tomography demonstrate the effectiveness of the proposed schemes.
Submitted 4 December, 2023; v1 submitted 6 January, 2023;
originally announced January 2023.
-
Uncertainty Quantification in CT pulmonary angiography
Authors:
Adwaye M Rambojun,
Hend Komber,
Jennifer Rossdale,
Jay Suntharalingam,
Jonathan C L Rodrigues,
Matthias J Ehrhardt,
Audrey Repetti
Abstract:
Computed tomography (CT) imaging of the thorax is widely used for the detection and monitoring of pulmonary embolism (PE). However, CT images can contain artifacts due to the acquisition or the processes involved in image reconstruction. Radiologists often have to distinguish between such artifacts and actual PEs. Our main contribution comes in the form of a scalable hypothesis testing method for CT, to enable quantifying uncertainty of possible PEs. In particular, we introduce a Bayesian framework to quantify the uncertainty of an observed compact structure that can be identified as a PE. We assess the ability of the method to operate in high-noise environments and with insufficient data.
Submitted 6 January, 2023;
originally announced January 2023.
-
Compressed Sensing MRI Reconstruction Regularized by VAEs with Structured Image Covariance
Authors:
Margaret Duff,
Ivor J. A. Simpson,
Matthias J. Ehrhardt,
Neill D. F. Campbell
Abstract:
Objective: This paper investigates how generative models, trained on ground-truth images, can be used as priors for inverse problems, penalizing reconstructions far from images the generator can produce. The aim is that learned regularization will provide complex data-driven priors to inverse problems while still retaining the control and insight of a variational regularization method. Moreover, unsupervised learning, without paired training data, allows the learned regularizer to remain flexible to changes in the forward problem such as noise level, sampling pattern or coil sensitivities in MRI.
Approach: We utilize variational autoencoders (VAEs) that generate not only an image but also a covariance uncertainty matrix for each image. The covariance can model changing uncertainty dependencies caused by structure in the image, such as edges or objects, and provides a new distance metric from the manifold of learned images.
Main results: We evaluate these novel generative regularizers on retrospectively sub-sampled real-valued MRI measurements from the fastMRI dataset. We compare our proposed learned regularization against other unlearned regularization approaches and unsupervised and supervised deep learning methods.
Significance: Our results show that the proposed method is competitive with other state-of-the-art methods and behaves consistently with changing sampling patterns and noise levels.
Submitted 16 June, 2023; v1 submitted 26 October, 2022;
originally announced October 2022.
-
Imaging with Equivariant Deep Learning
Authors:
Dongdong Chen,
Mike Davies,
Matthias J. Ehrhardt,
Carola-Bibiane Schönlieb,
Ferdia Sherry,
Julián Tachella
Abstract:
From early image processing to modern computational imaging, successful models and algorithms have relied on a fundamental property of natural signals: symmetry. Here symmetry refers to the invariance property of signal sets to transformations such as translation, rotation or scaling. Symmetry can also be incorporated into deep neural networks in the form of equivariance, allowing for more data-efficient learning. While there have been important advances in the design of end-to-end equivariant networks for image classification in recent years, computational imaging introduces unique challenges for equivariant network solutions since we typically only observe the image through some noisy ill-conditioned forward operator that itself may not be equivariant. We review the emerging field of equivariant imaging and show how it can provide improved generalization and new imaging opportunities. Along the way we show the interplay between the acquisition physics and group actions and links to iterative reconstruction, blind compressed sensing and self-supervised learning.
Submitted 4 September, 2022;
originally announced September 2022.
-
On the convergence and sampling of randomized primal-dual algorithms and their application to parallel MRI reconstruction
Authors:
Eric B Gutierrez,
Claire Delplancke,
Matthias J Ehrhardt
Abstract:
Stochastic Primal-Dual Hybrid Gradient (SPDHG) is an algorithm proposed by Chambolle et al. (2018) to efficiently solve a wide class of nonsmooth large-scale optimization problems. In this paper we contribute to its theoretical foundations and prove its almost sure convergence for convex but neither necessarily strongly convex nor smooth functionals, as well as for any random sampling. In addition, we study SPDHG for parallel Magnetic Resonance Imaging reconstruction, where data from different coils are randomly selected at each iteration. We apply SPDHG using a wide range of random sampling methods and compare its performance across a range of settings, including mini-batch size and step size parameters. We show that the sampling can significantly affect the convergence speed of SPDHG and for many cases an optimal sampling can be identified.
Submitted 24 November, 2023; v1 submitted 25 July, 2022;
originally announced July 2022.
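To make the role of random sampling in SPDHG concrete, the following is a minimal sketch of the SPDHG iteration (in the form proposed by Chambolle et al., 2018) on a toy ridge-regression problem. It is not the parallel MRI setup of the paper. The quadratic data terms, the uniform sampling probabilities `p`, the safety factor `gamma`, the step-size rule, and all names are assumptions chosen for illustration; the rule is picked so that `sigma[i] * tau * ||A_i||^2 < p[i]` holds for every block.

```python
import numpy as np


def spdhg_ridge(A_blocks, b_blocks, lam, n_iter, seed=0):
    """SPDHG sketch for min_x sum_i 0.5*||A_i x - b_i||^2 + 0.5*lam*||x||^2."""
    rng = np.random.default_rng(seed)
    n = len(A_blocks)
    p = np.full(n, 1.0 / n)  # uniform sampling probabilities (assumption)
    norms = [np.linalg.norm(Ai, 2) for Ai in A_blocks]
    gamma = 0.99
    sigma = [gamma / nrm for nrm in norms]  # dual step sizes
    tau = gamma * min(p[i] / norms[i] for i in range(n))  # primal step size
    # These choices give sigma_i * tau * ||A_i||^2 <= gamma^2 * p_i < p_i.

    x = np.zeros(A_blocks[0].shape[1])
    y = [np.zeros(Ai.shape[0]) for Ai in A_blocks]
    z = np.zeros_like(x)      # z tracks sum_i A_i^T y_i
    z_bar = np.zeros_like(x)  # extrapolated version of z
    for _ in range(n_iter):
        # Primal step: the prox of tau * 0.5*lam*||.||^2 is a rescaling.
        x = (x - tau * z_bar) / (1.0 + tau * lam)
        # Dual step on one randomly selected block.
        i = rng.choice(n, p=p)
        v = y[i] + sigma[i] * (A_blocks[i] @ x)
        # Prox of sigma * f_i^*, where f_i^*(y) = 0.5*||y||^2 + <y, b_i>.
        y_new = (v - sigma[i] * b_blocks[i]) / (1.0 + sigma[i])
        delta = A_blocks[i].T @ (y_new - y[i])
        y[i] = y_new
        z = z + delta
        z_bar = z + delta / p[i]
    return x


# Toy usage: compare against the closed-form ridge solution.
rng = np.random.default_rng(1)
A = rng.standard_normal((20, 5))
b = rng.standard_normal(20)
lam = 1.0
A_blocks = np.split(A, 4)  # four blocks of five rows each
b_blocks = np.split(b, 4)
x_spdhg = spdhg_ridge(A_blocks, b_blocks, lam, n_iter=20000)
x_ref = np.linalg.solve(A.T @ A + lam * np.eye(5), A.T @ b)
print(np.linalg.norm(x_spdhg - x_ref))
```

Replacing the uniform `p` with non-uniform probabilities (and adjusting the step sizes so the same inequality holds) is one way to experiment with the sampling strategies the abstract refers to.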
-
Regularization of Inverse Problems: Deep Equilibrium Models versus Bilevel Learning
Authors:
Danilo Riccio,
Matthias J. Ehrhardt,
Martin Benning
Abstract:
Variational regularization methods are commonly used to approximate solutions of inverse problems. In recent years, model-based variational regularization methods have often been replaced with data-driven ones such as the fields-of-expert model (Roth and Black, 2009). Training the parameters of such data-driven methods can be formulated as a bilevel optimization problem. In this paper, we compare the framework of bilevel learning for the training of data-driven variational regularization models with the novel framework of deep equilibrium models (Bai, Kolter, and Koltun, 2019) that has recently been introduced in the context of inverse problems (Gilton, Ongie, and Willett, 2021). We show that computing the lower-level optimization problem within the bilevel formulation with a fixed point iteration is a special case of the deep equilibrium framework. We compare both approaches computationally, with a variety of numerical examples for the inverse problems of denoising, inpainting and deconvolution.
Submitted 15 January, 2024; v1 submitted 27 June, 2022;
originally announced June 2022.
-
Regularising Inverse Problems with Generative Machine Learning Models
Authors:
Margaret Duff,
Neill D. F. Campbell,
Matthias J. Ehrhardt
Abstract:
Deep neural network approaches to inverse imaging problems have produced impressive results in the last few years. In this paper, we consider the use of generative models in a variational regularisation approach to inverse problems. The considered regularisers penalise images that are far from the range of a generative model that has learned to produce images similar to a training dataset. We name this family \textit{generative regularisers}. The success of generative regularisers depends on the quality of the generative model and so we propose a set of desired criteria to assess generative models and guide future research. In our numerical experiments, we evaluate three common generative models, autoencoders, variational autoencoders and generative adversarial networks, against our desired criteria. We also test three different generative regularisers on the inverse problems of deblurring, deconvolution, and tomography. We show that restricting solutions of the inverse problem to lie exactly in the range of a generative model can give good results but that allowing small deviations from the range of the generator produces more consistent results.
Submitted 18 June, 2022; v1 submitted 22 July, 2021;
originally announced July 2021.
-
Equivariant neural networks for inverse problems
Authors:
Elena Celledoni,
Matthias J. Ehrhardt,
Christian Etmann,
Brynjulf Owren,
Carola-Bibiane Schönlieb,
Ferdia Sherry
Abstract:
In recent years the use of convolutional layers to encode an inductive bias (translational equivariance) in neural networks has proven to be a very fruitful idea. The successes of this approach have motivated a line of research into incorporating other symmetries into deep learning methods, in the form of group equivariant convolutional neural networks. Much of this work has been focused on roto-translational symmetry of $\mathbf R^d$, but other examples are the scaling symmetry of $\mathbf R^d$ and rotational symmetry of the sphere. In this work, we demonstrate that group equivariant convolutional operations can naturally be incorporated into learned reconstruction methods for inverse problems that are motivated by the variational regularisation approach. Indeed, if the regularisation functional is invariant under a group symmetry, the corresponding proximal operator will satisfy an equivariance property with respect to the same group symmetry. As a result of this observation, we design learned iterative methods in which the proximal operators are modelled as group equivariant convolutional neural networks. We use roto-translationally equivariant operations in the proposed methodology and apply it to the problems of low-dose computerised tomography reconstruction and subsampled magnetic resonance imaging reconstruction. The proposed methodology is demonstrated to improve the reconstruction quality of a learned reconstruction method with a little extra computational cost at training time but without any extra cost at test time.
Submitted 23 February, 2021;
originally announced February 2021.
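The statement in this abstract that an invariant regulariser yields an equivariant proximal operator can be checked in a few lines. The sketch below assumes, beyond what the abstract states, that each group element $g$ acts through a linear isometry $T_g$ (for example a rotation or a periodic translation), that the regulariser $R$ satisfies $R(T_g x) = R(x)$, and that $R$ is proper, convex and lower semicontinuous so that $\operatorname{prox}_R$ is single-valued. The symbols $T_g$ and $R$ are notation introduced here for illustration.

```latex
\begin{align*}
\operatorname{prox}_R(T_g y)
  &= \operatorname*{arg\,min}_{x} \tfrac12 \|x - T_g y\|^2 + R(x) \\
  &= T_g \operatorname*{arg\,min}_{z} \tfrac12 \|T_g z - T_g y\|^2 + R(T_g z)
     && \text{(substituting } x = T_g z\text{)} \\
  &= T_g \operatorname*{arg\,min}_{z} \tfrac12 \|z - y\|^2 + R(z)
     && \text{(isometry of } T_g \text{ and invariance of } R\text{)} \\
  &= T_g \operatorname{prox}_R(y).
\end{align*}
```

The isometry assumption is what allows the quadratic term to be carried through unchanged; for group actions that are not norm-preserving, this short argument does not apply as written.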
-
Synergistic Multi-spectral CT Reconstruction with Directional Total Variation
Authors:
Evelyn Cueva,
Alexander Meaney,
Samuli Siltanen,
Matthias J. Ehrhardt
Abstract:
This work considers synergistic multi-spectral CT reconstruction where information from all available energy channels is combined to improve the reconstruction of each individual channel. We propose to fuse this available data (represented by a single sinogram) to obtain a polyenergetic image which keeps structural information shared by the energy channels with increased signal-to-noise ratio. This new image is used as prior information during a channel-by-channel minimization process through the directional total variation. We analyze the use of directional total variation within variational regularization and iterative regularization. Our numerical results on simulated and experimental data show improvements in terms of image quality and in computational speed.
Submitted 22 April, 2021; v1 submitted 5 January, 2021;
originally announced January 2021.
-
Convergence Properties of a Randomized Primal-Dual Algorithm with Applications to Parallel MRI
Authors:
Eric B. Gutierrez,
Claire Delplancke,
Matthias J. Ehrhardt
Abstract:
The Stochastic Primal-Dual Hybrid Gradient (SPDHG) was proposed by Chambolle et al. (2018) and is an efficient algorithm to solve some nonsmooth large-scale optimization problems. In this paper we prove its almost sure convergence for convex but not necessarily strongly convex functionals. We also look into its application to parallel Magnetic Resonance Imaging reconstruction in order to test performance of SPDHG. Our numerical results show that for a range of settings SPDHG converges significantly faster than its deterministic counterpart.
Submitted 31 March, 2021; v1 submitted 2 December, 2020;
originally announced December 2020.
-
Efficient Hyperparameter Tuning with Dynamic Accuracy Derivative-Free Optimization
Authors:
Matthias J. Ehrhardt,
Lindon Roberts
Abstract:
Many machine learning solutions are framed as optimization problems which rely on good hyperparameters. Algorithms for tuning these hyperparameters usually assume access to exact solutions to the underlying learning problem, which is typically not practical. Here, we apply a recent dynamic accuracy derivative-free optimization method to hyperparameter tuning, which allows inexact evaluations of the learning problem while retaining convergence guarantees. We test the method on the problem of learning elastic net weights for a logistic classifier, and demonstrate its robustness and efficiency compared to a fixed accuracy approach. This demonstrates a promising approach for hyperparameter tuning, with both convergence guarantees and practical performance.
Submitted 5 November, 2020;
originally announced November 2020.
-
A temporal multiscale approach for MR Fingerprinting
Authors:
Samuel Cortinhas,
Mohammad Golbabaee,
Matthias J. Ehrhardt
Abstract:
Quantitative MRI (qMRI) is becoming increasingly important for research and clinical applications; however, state-of-the-art reconstruction methods for qMRI are computationally prohibitive. We propose a temporal multiscale approach to reduce computation times in qMRI. Instead of computing exact gradients of the qMRI likelihood, we propose a novel approximation relying on the temporal smoothness of the data. These gradients are then used in a coarse-to-fine (C2F) approach, for example using coordinate descent. The C2F approach was also found to improve the accuracy of solutions, compared to similar methods where no multiscaling was used.
Submitted 30 March, 2021; v1 submitted 7 October, 2020;
originally announced October 2020.
-
Multi-modality imaging with structure-promoting regularisers
Authors:
Matthias J. Ehrhardt
Abstract:
Imaging with multiple modalities or multiple channels is becoming increasingly important for our modern society. A key tool for understanding and early diagnosis of cancer and dementia is PET-MR, a combined positron emission tomography and magnetic resonance imaging scanner which can simultaneously acquire functional and anatomical data. Similarly in remote sensing, while hyperspectral sensors may allow materials to be characterised and distinguished, digital cameras offer high spatial resolution to delineate objects. In both of these examples, the imaging modalities can be considered individually or jointly. In this chapter we discuss mathematical approaches which allow information from several imaging modalities to be combined so that multi-modality imaging can be more than just the sum of its components.
Submitted 22 July, 2020;
originally announced July 2020.
-
Inexact Derivative-Free Optimization for Bilevel Learning
Authors:
Matthias J. Ehrhardt,
Lindon Roberts
Abstract:
Variational regularization techniques are dominant in the field of mathematical imaging. A drawback of these techniques is that they are dependent on a number of parameters which have to be set by the user. A by now common strategy to resolve this issue is to learn these parameters from data. While mathematically appealing, this strategy leads to a nested optimization problem (known as bilevel optimization) which is computationally very difficult to handle. It is common when solving the upper-level problem to assume access to exact solutions of the lower-level problem, which is practically infeasible. In this work we propose to solve these problems using inexact derivative-free optimization algorithms which never require exact lower-level problem solutions, but instead assume access to approximate solutions with controllable accuracy, which is achievable in practice. We prove global convergence and a worst-case complexity bound for our approach. We test our proposed framework on ROF denoising and learning MRI sampling patterns. Dynamically adjusting the lower-level accuracy yields learned parameters with similar reconstruction quality as high-accuracy evaluations but with dramatic reductions in computational work (up to 100 times faster in some cases).
Submitted 8 December, 2020; v1 submitted 22 June, 2020;
originally announced June 2020.
-
Structure preserving deep learning
Authors:
Elena Celledoni,
Matthias J. Ehrhardt,
Christian Etmann,
Robert I McLachlan,
Brynjulf Owren,
Carola-Bibiane Schönlieb,
Ferdia Sherry
Abstract:
Over the past few years, deep learning has risen to the foreground as a topic of massive interest, mainly as a result of successes obtained in solving large-scale image processing tasks. There are multiple challenging mathematical problems involved in applying deep learning: most deep learning methods require the solution of hard optimisation problems, and a good understanding of the tradeoff between computational effort, amount of data and model complexity is required to successfully design a deep learning approach for a given problem. A large amount of progress made in deep learning has been based on heuristic explorations, but there is a growing effort to mathematically understand the structure in existing deep learning methods and to systematically design new deep learning methods to preserve certain types of structure in deep learning. In this article, we review a number of these directions: some deep neural networks can be understood as discretisations of dynamical systems, neural networks can be designed to have desirable properties such as invertibility or group equivariance, and new algorithmic frameworks based on conformal Hamiltonian systems and Riemannian manifolds to solve the optimisation problems have been proposed. We conclude our review of each of these topics by discussing some open problems that we consider to be interesting directions for future research.
Submitted 5 June, 2020;
originally announced June 2020.
-
Robust Image Reconstruction with Misaligned Structural Information
Authors:
Leon Bungert,
Matthias J. Ehrhardt
Abstract:
Multi-modality (or multi-channel) imaging is becoming increasingly important and more widely available, e.g. hyperspectral imaging in remote sensing, spectral CT in material sciences as well as multi-contrast MRI and PET-MR in medicine. Research in the last decades resulted in a plethora of mathematical methods to combine data from several modalities. State-of-the-art methods, often formulated as variational regularization, have been shown to significantly improve image reconstruction both quantitatively and qualitatively. Almost all of these models rely on the assumption that the modalities are perfectly registered, which is not the case in most real world applications. We propose a variational framework which jointly performs reconstruction and registration, thereby overcoming this hurdle. Our approach is the first to achieve this for different modalities and outranks established approaches in terms of accuracy of both reconstruction and registration. Numerical results on simulated and real data show the potential of the proposed strategy for various applications in multi-contrast MRI, PET-MR, and hyperspectral imaging: typical misalignments between modalities such as rotations, translations, zooms can be effectively corrected during the reconstruction process. Therefore the proposed framework allows the robust exploitation of shared information across multiple modalities under real conditions.
Submitted 24 December, 2020; v1 submitted 1 April, 2020;
originally announced April 2020.
-
Accelerating Variance-Reduced Stochastic Gradient Methods
Authors:
Derek Driggs,
Matthias J. Ehrhardt,
Carola-Bibiane Schönlieb
Abstract:
Variance reduction is a crucial tool for improving the slow convergence of stochastic gradient descent. Only a few variance-reduced methods, however, have yet been shown to directly benefit from Nesterov's acceleration techniques to match the convergence rates of accelerated gradient methods. Such approaches rely on "negative momentum", a technique for further variance reduction that is generally specific to the SVRG gradient estimator. In this work, we show that negative momentum is unnecessary for acceleration and develop a universal acceleration framework that allows all popular variance-reduced methods to achieve accelerated convergence rates. The constants appearing in these rates, including their dependence on the number of functions $n$, scale with the mean-squared-error and bias of the gradient estimator. In a series of numerical experiments, we demonstrate that versions of SAGA, SVRG, SARAH, and SARGE using our framework significantly outperform non-accelerated versions and compare favourably with algorithms using negative momentum.
Submitted 29 October, 2020; v1 submitted 21 October, 2019;
originally announced October 2019.
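For readers unfamiliar with the variance-reduced estimators named in the abstract above, the following is a minimal sketch of the standard SVRG gradient estimator on a least-squares finite sum. It does not implement the paper's acceleration framework; the problem, the function names and the variable names are assumptions chosen for illustration.

```python
import numpy as np

def grad_i(A, b, x, i):
    """Gradient of the i-th term f_i(x) = 0.5 * (a_i^T x - b_i)^2."""
    return A[i] * (A[i] @ x - b[i])

def full_grad(A, b, x):
    """Gradient of f(x) = (1/n) * sum_i f_i(x)."""
    return A.T @ (A @ x - b) / A.shape[0]

def svrg_estimator(A, b, x, snapshot, snapshot_grad, i):
    """SVRG estimate of the full gradient at x using only term i.

    snapshot_grad is the full gradient at the snapshot point, computed
    once per outer loop and reused for many inner iterations.
    """
    return grad_i(A, b, x, i) - grad_i(A, b, snapshot, i) + snapshot_grad

rng = np.random.default_rng(0)
A = rng.standard_normal((30, 4))
b = rng.standard_normal(30)
snapshot = rng.standard_normal(4)
snapshot_grad = full_grad(A, b, snapshot)
x = snapshot + 0.1 * rng.standard_normal(4)

# Averaging the estimator over all indices recovers the full gradient,
# i.e. the estimator is unbiased under uniform sampling of i.
mean_estimate = np.mean(
    [svrg_estimator(A, b, x, snapshot, snapshot_grad, i) for i in range(30)], axis=0
)
```

At `x == snapshot` every single-index estimate equals the full gradient exactly, which is the sense in which the variance vanishes as the iterates and the snapshot approach the same point.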
-
Learning the Sampling Pattern for MRI
Authors:
Ferdia Sherry,
Martin Benning,
Juan Carlos De los Reyes,
Martin J. Graves,
Georg Maierhofer,
Guy Williams,
Carola-Bibiane Schönlieb,
Matthias J. Ehrhardt
Abstract:
The discovery of the theory of compressed sensing brought the realisation that many inverse problems can be solved even when measurements are "incomplete". This is particularly interesting in magnetic resonance imaging (MRI), where long acquisition times can limit its use. In this work, we consider the problem of learning a sparse sampling pattern that can be used to optimally balance acquisition time versus quality of the reconstructed image. We use a supervised learning approach, making the assumption that our training data is representative enough of new data acquisitions. We demonstrate that this is indeed the case, even if the training data consists of just 7 training pairs of measurements and ground-truth images; with a training set of brain images of size 192 by 192, for instance, one of the learned patterns samples only 35% of k-space, yet results in reconstructions with mean SSIM 0.914 on a test set of similar images. The proposed framework is general enough to learn arbitrary sampling patterns, including common patterns such as Cartesian, spiral and radial sampling.
Submitted 21 June, 2020; v1 submitted 20 June, 2019;
originally announced June 2019.
-
Deep learning as optimal control problems: models and numerical methods
Authors:
Martin Benning,
Elena Celledoni,
Matthias J. Ehrhardt,
Brynjulf Owren,
Carola-Bibiane Schönlieb
Abstract:
We consider recent work of Haber and Ruthotto 2017 and Chang et al. 2018, where deep learning neural networks have been interpreted as discretisations of an optimal control problem subject to an ordinary differential equation constraint. We review the first order conditions for optimality, and the conditions ensuring optimality after discretisation. This leads to a class of algorithms for solving the discrete optimal control problem which guarantee that the corresponding discrete necessary conditions for optimality are fulfilled. The differential equation setting lends itself to learning additional parameters such as the time discretisation. We explore this extension alongside natural constraints (e.g. time steps lie in a simplex). We compare these deep learning algorithms numerically in terms of induced flow and generalisation ability.
Submitted 30 September, 2019; v1 submitted 11 April, 2019;
originally announced April 2019.
-
Faster PET Reconstruction with Non-Smooth Priors by Randomization and Preconditioning
Authors:
Matthias J. Ehrhardt,
Pawel Markiewicz,
Carola-Bibiane Schönlieb
Abstract:
Uncompressed clinical data from modern positron emission tomography (PET) scanners are very large, exceeding 350 million data points (projection bins). The last decades have seen tremendous advancements in mathematical imaging tools many of which lead to non-smooth (i.e. non-differentiable) optimization problems which are much harder to solve than smooth optimization problems. Most of these tools have not been translated to clinical PET data, as the state-of-the-art algorithms for non-smooth problems do not scale well to large data. In this work, inspired by big data machine learning applications, we use advanced randomized optimization algorithms to solve the PET reconstruction problem for a very large class of non-smooth priors which includes for example total variation, total generalized variation, directional total variation and various different physical constraints. The proposed algorithm randomly uses subsets of the data and only updates the variables associated with these. While this idea often leads to divergent algorithms, we show that the proposed algorithm does indeed converge for any proper subset selection. Numerically, we show on real PET data (FDG and florbetapir) from a Siemens Biograph mMR that about ten projections and backprojections are sufficient to solve the MAP optimisation problem related to many popular non-smooth priors; thus showing that the proposed algorithm is fast enough to bring these models into routine clinical practice.
Submitted 2 August, 2019; v1 submitted 21 August, 2018;
originally announced August 2018.
-
A geometric integration approach to nonsmooth, nonconvex optimisation
Authors:
Erlend S. Riis,
Matthias J. Ehrhardt,
G. R. W. Quispel,
Carola-Bibiane Schönlieb
Abstract:
The optimisation of nonsmooth, nonconvex functions without access to gradients is a particularly challenging problem that is frequently encountered, for example in model parameter optimisation problems. Bilevel optimisation of parameters is a standard setting in areas such as variational regularisation problems and supervised machine learning. We present efficient and robust derivative-free methods called randomised Itoh--Abe methods. These are generalisations of the Itoh--Abe discrete gradient method, a well-known scheme from geometric integration, which has previously only been considered in the smooth setting. We demonstrate that the method and its favourable energy dissipation properties are well-defined in the nonsmooth setting. Furthermore, we prove that whenever the objective function is locally Lipschitz continuous, the iterates almost surely converge to a connected set of Clarke stationary points. We present an implementation of the methods, and apply it to various test problems. The numerical results indicate that the randomised Itoh--Abe methods are superior to state-of-the-art derivative-free optimisation methods in solving nonsmooth problems while remaining competitive in terms of efficiency.
Submitted 19 July, 2018;
originally announced July 2018.
-
Enhancing joint reconstruction and segmentation with non-convex Bregman iteration
Authors:
Veronica Corona,
Martin Benning,
Matthias J. Ehrhardt,
Lynn F. Gladden,
Richard Mair,
Andi Reci,
Andrew J. Sederman,
Stefanie Reichelt,
Carola-Bibiane Schoenlieb
Abstract:
All imaging modalities such as computed tomography (CT), emission tomography and magnetic resonance imaging (MRI) require a reconstruction approach to produce an image. A common image processing task for applications that utilise those modalities is image segmentation, typically performed posterior to the reconstruction. We explore a new approach that combines reconstruction and segmentation in a unified framework. We derive a variational model that consists of a total variation regularised reconstruction from undersampled measurements and a Chan-Vese based segmentation. We extend the variational regularisation scheme to a Bregman iteration framework to improve the reconstruction and therefore the segmentation. We develop a novel alternating minimisation scheme that solves the non-convex optimisation problem with provable convergence guarantees. Our results for synthetic and real data show that both reconstruction and segmentation are improved compared to the classical sequential approach.
Submitted 4 March, 2019; v1 submitted 4 July, 2018;
originally announced July 2018.
-
A geometric integration approach to smooth optimisation: Foundations of the discrete gradient method
Authors:
Matthias J. Ehrhardt,
Erlend S. Riis,
Torbjørn Ringholm,
Carola-Bibiane Schönlieb
Abstract:
Discrete gradient methods are geometric integration techniques that can preserve the dissipative structure of gradient flows. Due to the monotonic decay of the function values, they are well suited for general convex and nonconvex optimization problems. Both zero- and first-order algorithms can be derived from the discrete gradient method by selecting different discrete gradients. In this paper, we present a comprehensive analysis of the discrete gradient method for optimisation which provides a solid theoretical foundation. We show that the discrete gradient method is well-posed by proving the existence and uniqueness of iterates for any positive time step, and propose an efficient method for solving the associated discrete gradient equation. Moreover, we establish an $O(1/k)$ convergence rate for convex objectives and prove linear convergence if instead the Polyak-Lojasiewicz inequality is satisfied. The analysis is carried out for three discrete gradients - the Gonzalez discrete gradient, the mean value discrete gradient, and the Itoh-Abe discrete gradient - as well as for a randomised Itoh-Abe method. Our theoretical results are illustrated with a variety of numerical experiments, and we furthermore demonstrate that the methods are robust with respect to stiffness.
Submitted 11 February, 2020; v1 submitted 16 May, 2018;
originally announced May 2018.
-
Choose your path wisely: gradient descent in a Bregman distance framework
Authors:
Martin Benning,
Marta M. Betcke,
Matthias J. Ehrhardt,
Carola-Bibiane Schönlieb
Abstract:
We propose an extension of a special form of gradient descent -- in the literature known as linearised Bregman iteration -- to a larger class of non-convex functions. We replace the classical (squared) two norm metric in the gradient descent setting with a generalised Bregman distance, based on a proper, convex and lower semi-continuous function. The algorithm's global convergence is proven for functions that satisfy the Kurdyka-Łojasiewicz property. Examples illustrate that features of different scale are being introduced throughout the iteration, transitioning from coarse to fine. This coarse-to-fine approach with respect to scale allows the recovery of solutions of non-convex optimisation problems that are superior to those obtained with conventional gradient descent, or even projected and proximal gradient descent. The effectiveness of the linearised Bregman iteration in combination with early stopping is illustrated for the applications of parallel magnetic resonance imaging, blind deconvolution as well as image classification with neural networks.
Submitted 25 May, 2021; v1 submitted 11 December, 2017;
originally announced December 2017.
-
Blind Image Fusion for Hyperspectral Imaging with the Directional Total Variation
Authors:
Leon Bungert,
David A. Coomes,
Matthias J. Ehrhardt,
Jennifer Rasch,
Rafael Reisenhofer,
Carola-Bibiane Schönlieb
Abstract:
Hyperspectral imaging is a cutting-edge type of remote sensing used for mapping vegetation properties, rock minerals and other materials. A major drawback of hyperspectral imaging devices is their intrinsic low spatial resolution. In this paper, we propose a method for increasing the spatial resolution of a hyperspectral image by fusing it with an image of higher spatial resolution that was obtained with a different imaging modality. This is accomplished by solving a variational problem in which the regularization functional is the directional total variation. To accommodate possible mis-registrations between the two images, we consider a non-convex blind super-resolution problem where both a fused image and the corresponding convolution kernel are estimated. Using this approach, our model can realign the given images if needed. Our experimental results indicate that the non-convexity is negligible in practice and that reliable solutions can be computed using a variety of different optimization algorithms. Numerical results on real remote sensing data from plant sciences and urban monitoring show the potential of the proposed method and suggest that it is robust with respect to the regularization parameters, mis-registration and the shape of the kernel.
Submitted 9 April, 2018; v1 submitted 4 October, 2017;
originally announced October 2017.
-
Stochastic Primal-Dual Hybrid Gradient Algorithm with Arbitrary Sampling and Imaging Applications
Authors:
Antonin Chambolle,
Matthias J. Ehrhardt,
Peter Richtárik,
Carola-Bibiane Schönlieb
Abstract:
We propose a stochastic extension of the primal-dual hybrid gradient algorithm studied by Chambolle and Pock in 2011 to solve saddle point problems that are separable in the dual variable. The analysis is carried out for general convex-concave saddle point problems and problems that are either partially smooth / strongly convex or fully smooth / strongly convex. We perform the analysis for arbitrary samplings of dual variables, and obtain known deterministic results as a special case. Several variants of our stochastic method significantly outperform the deterministic variant on a variety of imaging tasks.
Submitted 10 April, 2018; v1 submitted 15 June, 2017;
originally announced June 2017.
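The abstract above does not spell out the iteration, so the following is only a sketch of how a stochastic primal-dual hybrid gradient loop is commonly written, specialised to a toy problem: min_x sum_i 0.5 * ||A_i x - b_i||^2 with no primal regulariser. Uniform serial sampling (one dual block per iteration), the step-size rule sigma_i * tau * ||A_i||^2 < p_i, the extrapolation parameter theta = 1 and all names are assumptions made for this illustration rather than details taken from the listing.

```python
import numpy as np

def spdhg_least_squares(blocks, data, n_iter, gamma=0.99, seed=0):
    """Sketch of SPDHG for min_x sum_i 0.5 * ||A_i x - b_i||^2 (g = 0)."""
    rng = np.random.default_rng(seed)
    n = len(blocks)
    p = np.full(n, 1.0 / n)                          # uniform serial sampling
    norms = [np.linalg.norm(A, 2) for A in blocks]   # spectral norms ||A_i||
    sigma = [gamma / nrm for nrm in norms]
    tau = gamma * min(p[i] / norms[i] for i in range(n))
    # With gamma < 1 this gives sigma_i * tau * ||A_i||^2 <= gamma^2 * p_i < p_i.

    x = np.zeros(blocks[0].shape[1])
    y = [np.zeros(A.shape[0]) for A in blocks]       # one dual variable per block
    z = np.zeros_like(x)                             # z = sum_i A_i^T y_i
    zbar = z.copy()                                  # extrapolated version of z
    for _ in range(n_iter):
        x = x - tau * zbar                           # prox of g = 0 is the identity
        i = rng.choice(n, p=p)                       # pick one dual block at random
        # prox of sigma_i * f_i^* for f_i(v) = 0.5 * ||v - b_i||^2
        y_new = (y[i] + sigma[i] * (blocks[i] @ x - data[i])) / (1.0 + sigma[i])
        delta = blocks[i].T @ (y_new - y[i])
        y[i] = y_new
        z = z + delta
        zbar = z + delta / p[i]                      # extrapolation with theta = 1
    return x

rng = np.random.default_rng(1)
A = rng.standard_normal((40, 5))
b = A @ rng.standard_normal(5) + 0.01 * rng.standard_normal(40)
blocks = [A[k:k + 10] for k in range(0, 40, 10)]     # four row blocks
data = [b[k:k + 10] for k in range(0, 40, 10)]

x_spdhg = spdhg_least_squares(blocks, data, n_iter=5000)
x_star = np.linalg.lstsq(A, b, rcond=None)[0]        # reference solution
```

Each iteration touches only one block `A_i` and its transpose, which is the source of the per-iteration savings over the deterministic algorithm that applies every block in every step.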
-
Gradient descent in a generalised Bregman distance framework
Authors:
Martin Benning,
Marta M. Betcke,
Matthias J. Ehrhardt,
Carola-Bibiane Schönlieb
Abstract:
We discuss a special form of gradient descent that in the literature has become known as the so-called linearised Bregman iteration. The idea is to replace the classical (squared) two norm metric in the gradient descent setting with a generalised Bregman distance, based on a more general proper, convex and lower semi-continuous functional. Gradient descent as well as the entropic mirror descent by Nemirovsky and Yudin are special cases, as is a specific form of non-linear Landweber iteration introduced by Bachmayr and Burger. We are going to analyse the linearised Bregman iteration in a setting where the functional we want to minimise is neither necessarily Lipschitz-continuous (in the classical sense) nor necessarily convex, and establish a global convergence result under the additional assumption that the functional we wish to minimise satisfies the so-called Kurdyka-Łojasiewicz property.
Submitted 27 December, 2016; v1 submitted 7 December, 2016;
originally announced December 2016.
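The claim in the abstract above that gradient descent is a special case can be made concrete with a short worked equation. The notation ($E$ for the objective, $J$ for the functional generating the Bregman distance, $\tau$ for the step size, $p^k$ for a subgradient) is chosen here for illustration and is not fixed by the abstract.

```latex
% Linearised Bregman step with generalised Bregman distance D_J:
x^{k+1} = \arg\min_x \left\{ \tau \langle \nabla E(x^k), x - x^k \rangle
          + D_J^{p^k}(x, x^k) \right\},
\qquad
D_J^{p^k}(x, x^k) = J(x) - J(x^k) - \langle p^k, x - x^k \rangle,
\quad p^k \in \partial J(x^k).

% Special case J(x) = \tfrac{1}{2}\|x\|_2^2: then p^k = x^k and
% D_J^{p^k}(x, x^k) = \tfrac{1}{2}\|x - x^k\|_2^2, so the optimality
% condition \tau \nabla E(x^k) + x - x^k = 0 gives plain gradient descent:
x^{k+1} = x^k - \tau \nabla E(x^k).
```

Other choices of $J$ change the geometry of the step; the negative entropy, for example, leads to the entropic mirror descent mentioned in the abstract.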
-
Multi-Contrast MRI Reconstruction with Structure-Guided Total Variation
Authors:
Matthias J. Ehrhardt,
Marta M. Betcke
Abstract:
Magnetic resonance imaging (MRI) is a versatile imaging technique that allows different contrasts depending on the acquisition parameters. Many clinical imaging studies acquire MRI data for more than one of these contrasts---such as for instance T1 and T2 weighted images---which makes the overall scanning procedure very time consuming. As all of these images show the same underlying anatomy one can try to omit unnecessary measurements by taking the similarity into account during reconstruction. We will discuss two modifications of total variation---based on i) location and ii) direction---that take structural a priori knowledge into account and reduce to total variation in the degenerate case when no structural knowledge is available. We solve the resulting convex minimization problem with the alternating direction method of multipliers that separates the forward operator from the prior. For both priors the corresponding proximal operator can be implemented as an extension of the fast gradient projection method on the dual problem for total variation. We tested the priors on six data sets that are based on phantoms and real MRI images. In all test cases exploiting the structural information from the other contrast yields better results than separate reconstruction with total variation in terms of standard metrics like peak signal-to-noise ratio and structural similarity index. Furthermore, we found that exploiting the two dimensional directional information results in images with well defined edges, superior to those reconstructed solely using a priori information about the edge location.
Submitted 20 November, 2015;
originally announced November 2015.
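The direction-based modification of total variation described in the abstract above can be sketched as follows. This is one common way to write a directional total variation and is not claimed to match the paper's exact discretisation: the forward-difference gradient, the smoothing parameter `eta` and the function names are assumptions. The sketch also checks the degenerate case noted in the abstract, where the prior reduces to plain total variation when the side information carries no structure.

```python
import numpy as np

def forward_gradient(img):
    """Forward differences with zero difference at the last row/column; shape (2, H, W)."""
    g = np.zeros((2,) + img.shape)
    g[0, :-1, :] = img[1:, :] - img[:-1, :]
    g[1, :, :-1] = img[:, 1:] - img[:, :-1]
    return g

def total_variation(u):
    """Isotropic total variation: sum of pointwise gradient magnitudes."""
    return np.sum(np.sqrt(np.sum(forward_gradient(u) ** 2, axis=0)))

def directional_total_variation(u, v, eta=1e-2):
    """Total variation of u after removing the gradient component along grad v.

    xi is the gradient direction of the side information v, smoothed by eta
    so that it is well defined (and of length below 1) where grad v vanishes.
    """
    gu = forward_gradient(u)
    gv = forward_gradient(v)
    xi = gv / np.sqrt(np.sum(gv ** 2, axis=0) + eta ** 2)
    projected = gu - np.sum(xi * gu, axis=0) * xi    # (I - xi xi^T) grad u
    return np.sum(np.sqrt(np.sum(projected ** 2, axis=0)))

rng = np.random.default_rng(0)
u = rng.standard_normal((12, 12))
flat_side_info = np.ones((12, 12))    # no structural information at all

tv_u = total_variation(u)
dtv_flat = directional_total_variation(u, flat_side_info)   # equals tv_u
dtv_self = directional_total_variation(u, u)                # edges aligned with the side information
```

Because gradients parallel to those of the side image are penalised less, edges that the two images share are cheap, while edges appearing in only one image keep their full total variation cost.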