-
FlowSDF: Flow Matching for Medical Image Segmentation Using Distance Transforms
Authors:
Lea Bogensperger,
Dominik Narnhofer,
Alexander Falk,
Konrad Schindler,
Thomas Pock
Abstract:
Medical image segmentation is a crucial task that relies on the ability to accurately identify and isolate regions of interest in medical images. Generative approaches make it possible to capture the statistical properties of segmentation masks that are dependent on the respective structures. In this work we propose FlowSDF, an image-guided conditional flow matching framework to represent the signed distance function (SDF) leading to an implicit distribution of segmentation masks. The advantage of leveraging the SDF is a more natural distortion when compared to that of binary masks. By learning a vector field that is directly related to the probability path of a conditional distribution of SDFs, we can accurately sample from the distribution of segmentation masks, allowing for the evaluation of statistical quantities. Thus, this probabilistic representation allows for the generation of uncertainty maps represented by the variance, which can aid in further analysis and enhance the predictive robustness. We qualitatively and quantitatively illustrate competitive performance of the proposed method on a public nuclei and gland segmentation data set, highlighting its utility in medical image segmentation applications.
Submitted 28 May, 2024;
originally announced May 2024.
-
Selective, Interpretable, and Motion Consistent Privacy Attribute Obfuscation for Action Recognition
Authors:
Filip Ilic,
He Zhao,
Thomas Pock,
Richard P. Wildes
Abstract:
Concerns for the privacy of individuals captured in public imagery have led to privacy-preserving action recognition. Existing approaches often suffer from issues arising through obfuscation being applied globally and a lack of interpretability. Global obfuscation hides privacy-sensitive regions, but also contextual regions important for action recognition. Lack of interpretability erodes trust in these new technologies. We highlight the limitations of current paradigms and propose a solution: human-selected privacy templates that yield interpretability by design, and an obfuscation scheme that selectively hides attributes and also induces temporal consistency, which is important in action recognition. Our approach is architecture agnostic and directly modifies input imagery, while existing approaches generally require architecture training. Our approach offers more flexibility, as no retraining is required, and outperforms alternatives on three widely used datasets.
Submitted 19 March, 2024;
originally announced March 2024.
-
Diffusion-based generation of Histopathological Whole Slide Images at a Gigapixel scale
Authors:
Robert Harb,
Thomas Pock,
Heimo Müller
Abstract:
We present a novel diffusion-based approach to generate synthetic histopathological Whole Slide Images (WSIs) at an unprecedented gigapixel scale. Synthetic WSIs have many potential applications: They can augment training datasets to enhance the performance of many computational pathology applications. They allow the creation of synthesized copies of datasets that can be shared without violating privacy regulations. They can also facilitate learning representations of WSIs without requiring data annotations. Despite this variety of applications, no existing deep-learning-based method generates WSIs at their typically high resolutions, mainly due to the high computational complexity. Therefore, we propose a novel coarse-to-fine sampling scheme to tackle image generation of high-resolution WSIs. In this scheme, we increase the resolution of an initial low-resolution image to a high-resolution WSI. In particular, a diffusion model sequentially adds fine details to images and increases their resolution. In our experiments, we train our method with WSIs from the TCGA-BRCA dataset. In addition to quantitative evaluations, we performed a user study with pathologists. The study results suggest that our generated WSIs resemble the structure of real WSIs.
Submitted 14 November, 2023;
originally announced November 2023.
-
Product of Gaussian Mixture Diffusion Models
Authors:
Martin Zach,
Erich Kobler,
Antonin Chambolle,
Thomas Pock
Abstract:
In this work we tackle the problem of estimating the density $ f_X $ of a random variable $ X $ by successive smoothing, such that the smoothed random variable $ Y $ fulfills the diffusion partial differential equation $ (\partial_t - \Delta_1)f_Y(\,\cdot\,, t) = 0 $ with initial condition $ f_Y(\,\cdot\,, 0) = f_X $. We propose a product-of-experts-type model utilizing Gaussian mixture experts and study configurations that admit an analytic expression for $ f_Y (\,\cdot\,, t) $. In particular, with a focus on image processing, we derive conditions for models acting on filter-, wavelet-, and shearlet responses. Our construction naturally allows the model to be trained simultaneously over the entire diffusion horizon using empirical Bayes. We show numerical results for image denoising where our models are competitive while being tractable, interpretable, and having only a small number of learnable parameters. As a byproduct, our models can be used for reliable noise level estimation, allowing blind denoising of images corrupted by heteroscedastic noise.
Submitted 9 January, 2024; v1 submitted 19 October, 2023;
originally announced October 2023.
-
Subgradient Langevin Methods for Sampling from Non-smooth Potentials
Authors:
Andreas Habring,
Martin Holler,
Thomas Pock
Abstract:
This paper is concerned with sampling from probability distributions $\pi$ on $\mathbb{R}^d$ admitting a density of the form $\pi(x) \propto e^{-U(x)}$, where $U(x)=F(x)+G(Kx)$ with $K$ being a linear operator and $G$ being non-differentiable. Two different methods are proposed, both employing a subgradient step with respect to $G\circ K$, but, depending on the regularity of $F$, either an explicit or an implicit gradient step with respect to $F$ can be implemented. For both methods, non-asymptotic convergence proofs are provided, with improved convergence results for more regular $F$. Further, numerical experiments are conducted for simple 2D examples, illustrating the convergence rates, and for examples of Bayesian imaging, showing the practical feasibility of the proposed methods for high dimensional data.
Submitted 25 May, 2024; v1 submitted 2 August, 2023;
originally announced August 2023.
-
On the Relationship Between RNN Hidden State Vectors and Semantic Ground Truth
Authors:
Edi Muškardin,
Martin Tappler,
Ingo Pill,
Bernhard K. Aichernig,
Thomas Pock
Abstract:
We examine the assumption that the hidden-state vectors of recurrent neural networks (RNNs) tend to form clusters of semantically similar vectors, which we dub the clustering hypothesis. While this hypothesis has been assumed in the analysis of RNNs in recent years, its validity has not been studied thoroughly on modern neural network architectures. We examine the clustering hypothesis in the context of RNNs that were trained to recognize regular languages. This enables us to draw on perfect ground-truth automata in our evaluation, against which we can compare the RNN's accuracy and the distribution of the hidden-state vectors.
We start with examining the (piecewise linear) separability of an RNN's hidden-state vectors into semantically different classes. We continue the analysis by computing clusters over the hidden-state vector space with multiple state-of-the-art unsupervised clustering approaches. We formally analyze the accuracy of computed clustering functions and the validity of the clustering hypothesis by determining whether clusters group semantically similar vectors to the same state in the ground-truth model.
Our evaluation supports the validity of the clustering hypothesis in the majority of examined cases. We observed that the hidden-state vectors of well-trained RNNs are separable, and that the unsupervised clustering techniques succeed in finding clusters of similar state vectors.
Submitted 29 June, 2023;
originally announced June 2023.
-
Non-Log-Concave and Nonsmooth Sampling via Langevin Monte Carlo Algorithms
Authors:
Tim Tsz-Kit Lau,
Han Liu,
Thomas Pock
Abstract:
We study the problem of approximate sampling from non-log-concave distributions, e.g., Gaussian mixtures, which is often challenging even in low dimensions due to their multimodality. We focus on performing this task via Markov chain Monte Carlo (MCMC) methods derived from discretizations of the overdamped Langevin diffusions, which are commonly known as Langevin Monte Carlo algorithms. Furthermore, we are also interested in two nonsmooth cases for which a large class of proximal MCMC methods have been developed: (i) a nonsmooth prior is considered with a Gaussian mixture likelihood; (ii) a Laplacian mixture distribution. Such nonsmooth and non-log-concave sampling tasks arise from a wide range of applications to Bayesian inference and imaging inverse problems such as image deconvolution. We perform numerical simulations to compare the performance of the most commonly used Langevin Monte Carlo algorithms.
Submitted 29 May, 2024; v1 submitted 25 May, 2023;
originally announced May 2023.
-
Learned Discretization Schemes for the Second-Order Total Generalized Variation
Authors:
Lea Bogensperger,
Antonin Chambolle,
Alexander Effland,
Thomas Pock
Abstract:
The total generalized variation extends the total variation by incorporating higher-order smoothness. Thus, it can also suffer from similar discretization issues related to isotropy. Inspired by the success of novel discretization schemes of the total variation, there has been recent work to improve the second-order total generalized variation discretization, based on the same design idea. In this work, we propose to extend this to a general discretization scheme based on interpolation filters, for which we prove variational consistency. We then describe how to learn these interpolation filters to optimize the discretization for various imaging applications. We illustrate the performance of the method on a synthetic data set as well as for natural image denoising.
Submitted 16 March, 2023;
originally announced March 2023.
-
Score-Based Generative Models for Medical Image Segmentation using Signed Distance Functions
Authors:
Lea Bogensperger,
Dominik Narnhofer,
Filip Ilic,
Thomas Pock
Abstract:
Medical image segmentation is a crucial task that relies on the ability to accurately identify and isolate regions of interest in medical images. Generative approaches make it possible to capture the statistical properties of segmentation masks that are dependent on the respective structures. In this work we propose a conditional score-based generative modeling framework to represent the signed distance function (SDF) leading to an implicit distribution of segmentation masks. The advantage of leveraging the SDF is a more natural distortion when compared to that of binary masks. By learning the score function of the conditional distribution of SDFs, we can accurately sample from the distribution of segmentation masks, allowing for the evaluation of statistical quantities. Thus, this probabilistic representation allows for the generation of uncertainty maps represented by the variance, which can aid in further analysis and enhance the predictive robustness. We qualitatively and quantitatively illustrate competitive performance of the proposed method on a public nuclei and gland segmentation data set, highlighting its potential utility in medical image segmentation applications.
Submitted 21 July, 2023; v1 submitted 10 March, 2023;
originally announced March 2023.
-
Learning Gradually Non-convex Image Priors Using Score Matching
Authors:
Erich Kobler,
Thomas Pock
Abstract:
In this paper, we propose a unified framework of denoising score-based models in the context of graduated non-convex energy minimization. We show that for sufficiently large noise variance, the associated negative log density -- the energy -- becomes convex. Consequently, denoising score-based models essentially follow a graduated non-convexity heuristic. We apply this framework to learning generalized Fields of Experts image priors that approximate the joint density of noisy images and their associated variances. These priors can be easily incorporated into existing optimization algorithms for solving inverse problems and naturally implement a fast and robust graduated non-convexity mechanism.
Submitted 21 February, 2023;
originally announced February 2023.
-
Bilevel learning of regularization models and their discretization for image deblurring and super-resolution
Authors:
Tatiana A. Bubba,
Luca Calatroni,
Ambra Catozzi,
Serena Crisci,
Thomas Pock,
Monica Pragliola,
Siiri Rautio,
Danilo Riccio,
Andrea Sebastiani
Abstract:
Bilevel learning is a powerful optimization technique that has been extensively employed in recent years to bridge the world of model-driven variational approaches with data-driven methods. Upon suitable parametrization of the desired quantities of interest (e.g., regularization terms or discretization filters), such an approach computes optimal parameter values by solving a nested optimization problem where the variational model acts as a constraint. In this work, we consider two different use cases of bilevel learning for the problem of image restoration. First, we focus on learning scalar weights and convolutional filters defining a Field of Experts regularizer to restore natural images degraded by blur and noise. For improving the practical performance, the lower-level problem is solved by means of a gradient descent scheme combined with a line-search strategy based on the Barzilai-Borwein rule. As a second application, the bilevel setup is employed for learning a discretization of the popular total variation regularizer for solving image restoration problems (in particular, deblurring and super-resolution). Numerical results show the effectiveness of the approach and its generalization to multiple tasks.
Submitted 27 October, 2023; v1 submitted 20 February, 2023;
originally announced February 2023.
-
Explicit Diffusion of Gaussian Mixture Model Based Image Priors
Authors:
Martin Zach,
Thomas Pock,
Erich Kobler,
Antonin Chambolle
Abstract:
In this work we tackle the problem of estimating the density $f_X$ of a random variable $X$ by successive smoothing, such that the smoothed random variable $Y$ fulfills $(\partial_t - \Delta_1)f_Y(\,\cdot\,, t) = 0$, $f_Y(\,\cdot\,, 0) = f_X$. With a focus on image processing, we propose a product/fields of experts model with Gaussian mixture experts that admits an analytic expression for $f_Y (\,\cdot\,, t)$ under an orthogonality constraint on the filters. This construction naturally allows the model to be trained simultaneously over the entire diffusion horizon using empirical Bayes. We show preliminary results on image denoising where our model leads to competitive results while being tractable, interpretable, and having only a small number of learnable parameters. As a byproduct, our model can be used for reliable noise estimation, allowing blind denoising of images corrupted by heteroscedastic noise.
Submitted 16 February, 2023;
originally announced February 2023.
-
Quantum Transport in Open Spin Chains using Neural-Network Quantum States
Authors:
Johannes Mellak,
Enrico Arrigoni,
Thomas Pock,
Wolfgang von der Linden
Abstract:
In this work we study the treatment of asymmetric open quantum systems with neural networks based on the restricted Boltzmann machine. In particular, we are interested in the non-equilibrium steady state current in the boundary-driven (anisotropic) Heisenberg spin chain. We address previously published difficulties in treating asymmetric dissipative systems with neural-network quantum states and Monte-Carlo sampling and present an optimization method and a sampling technique that can be used to obtain high-fidelity steady state approximations of such systems. We point out some inherent symmetries of the Lindblad operator under consideration and exploit them during sampling. We show that local observables are not always a good indicator of the quality of the approximation and finally present results for the spin current that are in agreement with known results of simple open Heisenberg chains.
Submitted 18 January, 2023; v1 submitted 27 December, 2022;
originally announced December 2022.
-
Posterior-Variance-Based Error Quantification for Inverse Problems in Imaging
Authors:
Dominik Narnhofer,
Andreas Habring,
Martin Holler,
Thomas Pock
Abstract:
In this work, a method for obtaining pixel-wise error bounds in Bayesian regularization of inverse imaging problems is introduced. The proposed method employs estimates of the posterior variance together with techniques from conformal prediction in order to obtain coverage guarantees for the error bounds, without making any assumption on the underlying data distribution. It is generally applicable to Bayesian regularization approaches, independent, e.g., of the concrete choice of the prior. Furthermore, the coverage guarantees can also be obtained in case only approximate sampling from the posterior is possible. In particular, this enables the proposed framework to incorporate any learned prior in a black-box manner. Guaranteed coverage without assumptions on the underlying distributions is only achievable since the magnitude of the error bounds is, in general, unknown in advance. Nevertheless, experiments with multiple regularization approaches presented in the paper confirm that in practice, the obtained error bounds are rather tight. To realize the numerical experiments, a novel primal-dual Langevin algorithm for sampling from non-smooth distributions is also introduced in this work.
Submitted 23 December, 2022;
originally announced December 2022.
-
Stable Deep MRI Reconstruction using Generative Priors
Authors:
Martin Zach,
Florian Knoll,
Thomas Pock
Abstract:
Data-driven approaches recently achieved remarkable success in magnetic resonance imaging (MRI) reconstruction, but integration into clinical routine remains challenging due to a lack of generalizability and interpretability. In this paper, we address these challenges in a unified framework based on generative image priors. We propose a novel deep neural network based regularizer which is trained in a generative setting on reference magnitude images only. After training, the regularizer encodes higher-level domain statistics which we demonstrate by synthesizing images without data. Embedding the trained model in a classical variational approach yields high-quality reconstructions irrespective of the sub-sampling pattern. In addition, the model shows stable behavior when confronted with out-of-distribution data in the form of contrast variation. Furthermore, a probabilistic interpretation provides a distribution of reconstructions and hence allows uncertainty quantification. To reconstruct parallel MRI, we propose a fast algorithm to jointly estimate the image and the sensitivity maps. The results demonstrate competitive performance, on par with state-of-the-art end-to-end deep learning methods, while preserving the flexibility with respect to sub-sampling patterns and allowing for uncertainty quantification.
Submitted 15 June, 2023; v1 submitted 25 October, 2022;
originally announced October 2022.
-
Is Appearance Free Action Recognition Possible?
Authors:
Filip Ilic,
Thomas Pock,
Richard P. Wildes
Abstract:
Intuition might suggest that motion and dynamic information are key to video-based action recognition. In contrast, there is evidence that state-of-the-art deep-learning video understanding architectures are biased toward static information available in single frames. Presently, a methodology and corresponding dataset to isolate the effects of dynamic information in video are missing. Their absence makes it difficult to understand how well contemporary architectures capitalize on dynamic vs. static information. We respond with a novel Appearance Free Dataset (AFD) for action recognition. AFD is devoid of static information relevant to action recognition in a single frame. Modeling of the dynamics is necessary for solving the task, as the action is only apparent through consideration of the temporal dimension. We evaluated 11 contemporary action recognition architectures on AFD as well as its related RGB video. Our results show a notable decrease in performance for all architectures on AFD compared to RGB. We also conducted a complementary study with humans that shows their recognition accuracy on AFD and RGB is very similar and much better than the evaluated architectures on AFD. Our results motivate a novel architecture that revives explicit recovery of optical flow, within a contemporary design for best performance on AFD and RGB.
Submitted 13 July, 2022;
originally announced July 2022.
-
Computed Tomography Reconstruction using Generative Energy-Based Priors
Authors:
Martin Zach,
Erich Kobler,
Thomas Pock
Abstract:
In the past decades, Computed Tomography (CT) has established itself as one of the most important imaging techniques in medicine. Today, the applicability of CT is only limited by the deposited radiation dose, reduction of which manifests in noisy or incomplete measurements. Thus, the need for robust reconstruction algorithms arises. In this work, we learn a parametric regularizer with a global receptive field by maximizing its likelihood on reference CT data. Due to this unsupervised learning strategy, our trained regularizer truly represents higher-level domain statistics, which we empirically demonstrate by synthesizing CT images. Moreover, this regularizer can easily be applied to different CT reconstruction problems by embedding it in a variational framework, which increases flexibility and interpretability compared to feed-forward learning-based approaches. In addition, the accompanying probabilistic perspective enables experts to explore the full posterior distribution and may quantify uncertainty of the reconstruction approach. We apply the regularizer to limited-angle and few-view CT reconstruction problems, where it outperforms traditional reconstruction algorithms by a large margin.
Submitted 23 March, 2022;
originally announced March 2022.
-
Learning atrial fiber orientations and conductivity tensors from intracardiac maps using physics-informed neural networks
Authors:
Thomas Grandits,
Simone Pezzuto,
Francisco Sahli Costabal,
Paris Perdikaris,
Thomas Pock,
Gernot Plank,
Rolf Krause
Abstract:
Electroanatomical maps are a key tool in the diagnosis and treatment of atrial fibrillation. Current approaches focus on the activation times recorded. However, more information can be extracted from the available data. The fibers in cardiac tissue conduct the electrical wave faster, and their direction could be inferred from activation times. In this work, we employ a recently developed approach, called physics-informed neural networks, to learn the fiber orientations from electroanatomical maps, taking into account the physics of the electrical wave propagation. In particular, we train the neural network to weakly satisfy the anisotropic eikonal equation and to predict the measured activation times. We use a local basis for the anisotropic conductivity tensor, which encodes the fiber orientation. The methodology is tested both in a synthetic example and for patient data. Our approach shows good agreement in both cases, with an RMSE of 2.2 ms on the in-silico data and outperforming a state-of-the-art method on the patient data. The results show a first step towards learning the fiber orientations from electroanatomical maps with physics-informed neural networks.
Submitted 6 May, 2021; v1 submitted 22 February, 2021;
originally announced February 2021.
-
GEASI: Geodesic-based Earliest Activation Sites Identification in cardiac models
Authors:
Thomas Grandits,
Alexander Effland,
Thomas Pock,
Rolf Krause,
Gernot Plank,
Simone Pezzuto
Abstract:
The identification of the initial ventricular activation sequence is a critical step for the correct personalization of patient-specific cardiac models. In healthy conditions, the Purkinje network is the main source of the electrical activation, but under pathological conditions the so-called earliest activation sites (EASs) are possibly sparser and more localized. Yet, their number, location and timing may not be easily inferred from remote recordings, such as the epicardial activation or the 12-lead electrocardiogram (ECG), due to the underlying complexity of the model. In this work, we introduce GEASI (Geodesic-based Earliest Activation Sites Identification) as a novel approach to simultaneously identify all EASs. To this end, we start from the anisotropic eikonal equation modeling cardiac electrical activation and exploit its Hamilton--Jacobi formulation to minimize a given objective function, e.g. the quadratic mismatch to given activation measurements. This versatile approach can be extended to estimate the number of activation sites by means of the topological gradient, or fitting a given ECG. We conducted various experiments in 2D and 3D for in-silico models and an in-vivo intracardiac recording collected from a patient undergoing cardiac resynchronization therapy. The results demonstrate the clinical applicability of GEASI for potential future personalized models and clinical intervention.
Submitted 30 June, 2021; v1 submitted 19 February, 2021;
originally announced February 2021.
-
Bayesian Uncertainty Estimation of Learned Variational MRI Reconstruction
Authors:
Dominik Narnhofer,
Alexander Effland,
Erich Kobler,
Kerstin Hammernik,
Florian Knoll,
Thomas Pock
Abstract:
Recent deep learning approaches focus on improving quantitative scores of dedicated benchmarks, and therefore only reduce the observation-related (aleatoric) uncertainty. However, the model-immanent (epistemic) uncertainty is less frequently systematically analyzed. In this work, we introduce a Bayesian variational framework to quantify the epistemic uncertainty. To this end, we solve the linear inverse problem of undersampled MRI reconstruction in a variational setting. The associated energy functional is composed of a data fidelity term and the total deep variation (TDV) as a learned parametric regularizer. To estimate the epistemic uncertainty we draw the parameters of the TDV regularizer from a multivariate Gaussian distribution, whose mean and covariance matrix are learned in a stochastic optimal control problem. In several numerical experiments, we demonstrate that our approach yields competitive results for undersampled MRI reconstruction. Moreover, we can accurately quantify the pixelwise epistemic uncertainty, which can serve radiologists as an additional resource to visualize reconstruction reliability.
Submitted 22 October, 2021; v1 submitted 12 February, 2021;
originally announced February 2021.
-
One-sided Frank-Wolfe algorithms for saddle problems
Authors:
Vladimir Kolmogorov,
Thomas Pock
Abstract:
We study a class of convex-concave saddle-point problems of the form $\min_x\max_y \langle Kx,y\rangle+f_{\cal{P}}(x)-h^\ast(y)$ where $K$ is a linear operator, $f_{\cal{P}}$ is the sum of a convex function $f$ with a Lipschitz-continuous gradient and the indicator function of a bounded convex polytope $\cal{P}$, and $h^\ast$ is a convex (possibly nonsmooth) function. Such problem arises, for example, as a Lagrangian relaxation of various discrete optimization problems. Our main assumptions are the existence of an efficient linear minimization oracle ($lmo$) for $f_{\cal{P}}$ and an efficient proximal map for $h^*$ which motivate the solution via a blend of proximal primal-dual algorithms and Frank-Wolfe algorithms. In case $h^*$ is the indicator function of a linear constraint and function $f$ is quadratic, we show a $O(1/n^2)$ convergence rate on the dual objective, requiring $O(n \log n)$ calls of $lmo$. If the problem comes from the constrained optimization problem $\min_{x\in\mathbb R^d}\{f_{\cal{P}}(x)\:|\:Ax-b=0\}$ then we additionally get bound $O(1/n^2)$ both on the primal gap and on the infeasibility gap. In the most general case, we show a $O(1/n)$ convergence rate of the primal-dual gap again requiring $O(n\log n)$ calls of $lmo$. To the best of our knowledge, this improves on the known convergence rates for the considered class of saddle-point problems. We show applications to labeling problems frequently appearing in machine learning and computer vision.
Submitted 4 June, 2021; v1 submitted 29 January, 2021;
originally announced January 2021.
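The linear minimization oracle ($lmo$) that the abstract above relies on can be illustrated with a minimal sketch. This is classical Frank-Wolfe on a smooth quadratic, not the one-sided primal-dual scheme of the paper; the names `lmo_simplex` and `frank_wolfe`, the choice of the probability simplex as the polytope, and the step size $2/(k+2)$ are assumptions made here for illustration.

```python
import numpy as np

def lmo_simplex(grad):
    """Linear minimization oracle over the probability simplex:
    argmin_{s in simplex} <grad, s> is the vertex at the smallest gradient entry."""
    s = np.zeros_like(grad)
    s[np.argmin(grad)] = 1.0
    return s

def frank_wolfe(grad_f, x0, n_iters=500):
    """Classical Frank-Wolfe iteration with the standard 2/(k+2) step size."""
    x = x0.copy()
    for k in range(n_iters):
        s = lmo_simplex(grad_f(x))         # one oracle call per iteration
        gamma = 2.0 / (k + 2.0)
        x = (1.0 - gamma) * x + gamma * s  # convex combination stays feasible
    return x

# Hypothetical example: minimize f(x) = 0.5 * ||x - b||^2 over the simplex.
# b already lies in the simplex, so the minimizer is b itself.
b = np.array([0.2, 0.5, 0.3])
x = frank_wolfe(lambda x: x - b, x0=np.array([1.0, 0.0, 0.0]))
```

Each iterate is a convex combination of simplex vertices, so feasibility is maintained without any projection; only the oracle touches the constraint set.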
-
Shared Prior Learning of Energy-Based Models for Image Reconstruction
Authors:
Thomas Pinetz,
Erich Kobler,
Thomas Pock,
Alexander Effland
Abstract:
We propose a novel learning-based framework for image reconstruction particularly designed for training without ground truth data, which has three major building blocks: energy-based learning, a patch-based Wasserstein loss functional, and shared prior learning. In energy-based learning, the parameters of an energy functional composed of a learned data fidelity term and a data-driven regularizer are computed in a mean-field optimal control problem. In the absence of ground truth data, we change the loss functional to a patch-based Wasserstein functional, in which local statistics of the output images are compared to uncorrupted reference patches. Finally, in shared prior learning, both aforementioned optimal control problems are optimized simultaneously with shared learned parameters of the regularizer to further enhance unsupervised image reconstruction. We derive several time discretization schemes of the gradient flow and verify their consistency in terms of Mosco convergence. In numerous numerical experiments, we demonstrate that the proposed method generates state-of-the-art results for various image reconstruction applications--even if no ground truth images are available for training.
Submitted 13 November, 2020; v1 submitted 12 November, 2020;
originally announced November 2020.
-
BP-MVSNet: Belief-Propagation-Layers for Multi-View-Stereo
Authors:
Christian Sormann,
Patrick Knöbelreiter,
Andreas Kuhn,
Mattia Rossi,
Thomas Pock,
Friedrich Fraundorfer
Abstract:
In this work, we propose BP-MVSNet, a convolutional neural network (CNN)-based Multi-View-Stereo (MVS) method that uses a differentiable Conditional Random Field (CRF) layer for regularization. To this end, we propose to extend the BP layer and add what is necessary to successfully use it in the MVS setting. We therefore show how we can calculate a normalization based on the expected 3D error, which we can then use to normalize the label jumps in the CRF. This is required to make the BP layer invariant to different scales in the MVS setting. In order to also enable fractional label jumps, we propose a differentiable interpolation step, which we embed into the computation of the pairwise term. These extensions allow us to integrate the BP layer into a multi-scale MVS network, where we continuously improve a rough initial estimate until we get high quality depth maps as a result. We evaluate the proposed BP-MVSNet in an ablation study and conduct extensive experiments on the DTU, Tanks and Temples and ETH3D data sets. The experiments show that we can significantly outperform the baseline and achieve state-of-the-art results.
Submitted 23 October, 2020;
originally announced October 2020.
-
PIEMAP: Personalized Inverse Eikonal Model from cardiac Electro-Anatomical Maps
Authors:
Thomas Grandits,
Simone Pezzuto,
Jolijn M. Lubrecht,
Thomas Pock,
Gernot Plank,
Rolf Krause
Abstract:
Electroanatomical mapping, a keystone diagnostic tool in cardiac electrophysiology studies, can provide high-density maps of the local electric properties of the tissue. It is therefore tempting to use such data to better individualize current patient-specific models of the heart through a data assimilation procedure and to extract potentially insightful information such as conduction properties. Parameter identification for state-of-the-art cardiac models is however a challenging task. In this work, we introduce a novel inverse problem for inferring the anisotropic structure of the conductivity tensor, that is fiber orientation and conduction velocity along and across fibers, of an eikonal model for cardiac activation. The proposed method, named PIEMAP, performed robustly with synthetic data and showed promising results with clinical data. These results suggest that PIEMAP could be a useful supplement in future clinical workflows of personalized therapies.
Submitted 27 October, 2020; v1 submitted 24 August, 2020;
originally announced August 2020.
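For orientation, the anisotropic eikonal model that the abstract above refers to is commonly written as follows. The notation is chosen here and not quoted from the paper: $\phi$ is the activation time, $D(x)$ is a symmetric positive definite tensor encoding fiber orientation and squared conduction velocities, $\Omega$ is the domain, and $x_i$, $t_i$ are the earliest activation sites and their onset times.

```latex
\sqrt{\nabla\phi(x)^{\top} D(x)\, \nabla\phi(x)} = 1
\quad \text{for } x \in \Omega,
\qquad
\phi(x_i) = t_i .
```

In the isotropic special case $D = v^2 I$ this reduces to $\|\nabla\phi\| = 1/v$, so the activation front travels with speed $v$ in every direction.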
-
Total Deep Variation: A Stable Regularizer for Inverse Problems
Authors:
Erich Kobler,
Alexander Effland,
Karl Kunisch,
Thomas Pock
Abstract:
Various problems in computer vision and medical imaging can be cast as inverse problems. A frequent method for solving inverse problems is the variational approach, which amounts to minimizing an energy composed of a data fidelity term and a regularizer. Classically, handcrafted regularizers are used, which are commonly outperformed by state-of-the-art deep learning approaches. In this work, we combine the variational formulation of inverse problems with deep learning by introducing the data-driven general-purpose total deep variation regularizer. In its core, a convolutional neural network extracts local features on multiple scales and in successive blocks. This combination allows for a rigorous mathematical analysis including an optimal control formulation of the training problem in a mean-field setting and a stability analysis with respect to the initial values and the parameters of the regularizer. In addition, we experimentally verify the robustness against adversarial attacks and numerically derive upper bounds for the generalization error. Finally, we achieve state-of-the-art results for numerous imaging tasks.
Submitted 15 June, 2020;
originally announced June 2020.
-
Belief Propagation Reloaded: Learning BP-Layers for Labeling Problems
Authors:
Patrick Knöbelreiter,
Christian Sormann,
Alexander Shekhovtsov,
Friedrich Fraundorfer,
Thomas Pock
Abstract:
It has been proposed by many researchers that combining deep neural networks with graphical models can create more efficient and better regularized composite models. The main difficulties in implementing this in practice are associated with a discrepancy in suitable learning objectives as well as with the necessity of approximations for the inference. In this work we take one of the simplest inference methods, a truncated max-product Belief Propagation, and add what is necessary to make it a proper component of a deep learning model: We connect it to learning formulations with losses on marginals and compute the backprop operation. This BP-Layer can be used as the final or an intermediate block in convolutional neural networks (CNNs), allowing us to design a hierarchical model composing BP inference and CNNs at different scale levels. The model is applicable to a range of dense prediction problems, is well-trainable and provides parameter-efficient and robust solutions in stereo, optical flow and semantic segmentation.
Submitted 13 March, 2020;
originally announced March 2020.
-
Total Deep Variation for Linear Inverse Problems
Authors:
Erich Kobler,
Alexander Effland,
Karl Kunisch,
Thomas Pock
Abstract:
Diverse inverse problems in imaging can be cast as variational problems composed of a task-specific data fidelity term and a regularization term. In this paper, we propose a novel learnable general-purpose regularizer exploiting recent architectural design patterns from deep learning. We cast the learning problem as a discrete sampled optimal control problem, for which we derive the adjoint state equations and an optimality condition. By exploiting the variational structure of our approach, we perform a sensitivity analysis with respect to the learned parameters obtained from different training datasets. Moreover, we carry out a nonlinear eigenfunction analysis, which reveals interesting properties of the learned regularizer. We show state-of-the-art performance for classical image restoration and medical image reconstruction problems.
Submitted 17 February, 2020; v1 submitted 14 January, 2020;
originally announced January 2020.
-
Improving Optical Flow on a Pyramid Level
Authors:
Markus Hofinger,
Samuel Rota Bulò,
Lorenzo Porzi,
Arno Knapitsch,
Thomas Pock,
Peter Kontschieder
Abstract:
In this work we review the coarse-to-fine spatial feature pyramid concept, which is used in state-of-the-art optical flow estimation networks to make exploration of the pixel flow search space computationally tractable and efficient. Within an individual pyramid level, we improve the cost volume construction process by departing from a warping- to a sampling-based strategy, which avoids ghosting and hence enables us to better preserve fine flow details. We further amplify the positive effects through a level-specific, loss max-pooling strategy that adaptively shifts the focus of the learning process on under-performing predictions. Our second contribution revises the gradient flow across pyramid levels. The typical operations performed at each pyramid level can lead to noisy, or even contradicting gradients across levels. We show and discuss how properly blocking some of these gradient components leads to improved convergence and ultimately better performance. Finally, we introduce a distillation concept to counteract the issue of catastrophic forgetting and thus preserving knowledge over models sequentially trained on multiple datasets. Our findings are conceptually simple and easy to implement, yet result in compelling improvements on relevant error measures that we demonstrate via exhaustive ablations on datasets like Flying Chairs2, Flying Things, Sintel and KITTI. We establish new state-of-the-art results on the challenging Sintel and KITTI 2012 test datasets, and even show the portability of our findings to different optical flow and depth from stereo approaches.
Submitted 18 July, 2020; v1 submitted 23 December, 2019;
originally announced December 2019.
-
Image Morphing in Deep Feature Spaces: Theory and Applications
Authors:
Alexander Effland,
Erich Kobler,
Thomas Pock,
Marko Rajković,
Martin Rumpf
Abstract:
This paper combines image metamorphosis with deep features. To this end, images are considered as maps into a high-dimensional feature space and a structure-sensitive, anisotropic flow regularization is incorporated in the metamorphosis model proposed by Miller, Trouvé, Younes and coworkers. For this model a variational time discretization of the Riemannian path energy is presented and the existence of discrete geodesic paths minimizing this energy is demonstrated. Furthermore, convergence of discrete geodesic paths to geodesic paths in the time continuous model is investigated. The spatial discretization is based on a finite difference approximation in image space and a stable spline approximation in deformation space, the fully discrete model is optimized using the iPALM algorithm. Numerical experiments indicate that the incorporation of semantic deep features is superior to intensity-based approaches.
Submitted 2 July, 2020; v1 submitted 28 October, 2019;
originally announced October 2019.
-
On the estimation of the Wasserstein distance in generative models
Authors:
Thomas Pinetz,
Daniel Soukup,
Thomas Pock
Abstract:
Generative Adversarial Networks (GANs) have been used to model the underlying probability distribution of sample based datasets. GANs are notorious for training difficulties and their dependence on arbitrary hyperparameters. One recent improvement in GAN literature is to use the Wasserstein distance as loss function leading to Wasserstein Generative Adversarial Networks (WGANs). Using this as a basis, we show various ways in which the Wasserstein distance is estimated for the task of generative modelling. Additionally, the secrets in training such models are shown and summarized at the end of this work. Where applicable, we extend current works to different algorithms, different cost functions, and different regularization schemes to improve generative models.
Submitted 2 October, 2019;
originally announced October 2019.
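To make the quantity in the abstract above concrete, the sketch below computes the Wasserstein-1 distance in the one setting where it has a simple closed form: two one-dimensional empirical distributions with the same number of equally weighted samples. This is not one of the estimators studied in the paper, which target high-dimensional data; the function name `wasserstein1_1d` is an assumption made here.

```python
import numpy as np

def wasserstein1_1d(a, b):
    """Wasserstein-1 distance between two 1D empirical distributions with
    the same number of equally weighted samples.

    In 1D the optimal transport plan matches the i-th smallest sample of `a`
    to the i-th smallest sample of `b`, so sorting suffices.
    """
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    if a.shape != b.shape:
        raise ValueError("this sketch assumes equal sample counts")
    return float(np.mean(np.abs(a - b)))

# Shifting every sample by 5 moves each unit of mass a distance of 5.
d = wasserstein1_1d([0.0, 1.0, 3.0], [5.0, 6.0, 8.0])
```

In higher dimensions no such sorting shortcut exists, which is why the distance has to be estimated in the generative-modelling setting.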
-
Learned Collaborative Stereo Refinement
Authors:
Patrick Knöbelreiter,
Thomas Pock
Abstract:
In this work, we propose a learning-based method to denoise and refine disparity maps of a given stereo method. The proposed variational network arises naturally from unrolling the iterates of a proximal gradient method applied to a variational energy defined in a joint disparity, color, and confidence image space. Our method allows to learn a robust collaborative regularizer leveraging the joint statistics of the color image, the confidence map and the disparity map. Due to the variational structure of our method, the individual steps can be easily visualized, thus enabling interpretability of the method. We can therefore provide interesting insights into how our method refines and denoises disparity maps. The efficiency of our method is demonstrated by the publicly available stereo benchmarks Middlebury 2014 and Kitti 2015.
Submitted 31 July, 2019;
originally announced July 2019.
-
Self-Supervised Learning for Stereo Reconstruction on Aerial Images
Authors:
Patrick Knöbelreiter,
Christoph Vogel,
Thomas Pock
Abstract:
Recent developments established deep learning as an inevitable tool to boost the performance of dense matching and stereo estimation. On the downside, learning these networks requires a substantial amount of training data to be successful. Consequently, the application of these models outside of the laboratory is far from straight forward. In this work we propose a self-supervised training procedure that allows us to adapt our network to the specific (imaging) characteristics of the dataset at hand, without the requirement of external ground truth data. We instead generate interim training data by running our intermediate network on the whole dataset, followed by conservative outlier filtering. Bootstrapped from a pre-trained version of our hybrid CNN-CRF model, we alternate the generation of training data and network training. With this simple concept we are able to lift the completeness and accuracy of the pre-trained version significantly. We also show that our final model compares favorably to other popular stereo estimation algorithms on an aerial dataset.
Submitted 29 July, 2019;
originally announced July 2019.
-
An Optimal Control Approach to Early Stopping Variational Methods for Image Restoration
Authors:
Alexander Effland,
Erich Kobler,
Karl Kunisch,
Thomas Pock
Abstract:
We investigate a well-known phenomenon of variational approaches in image processing, where typically the best image quality is achieved when the gradient flow process is stopped before converging to a stationary point. This paradox originates from a tradeoff between optimization and modelling errors of the underlying variational model and holds true even if deep learning methods are used to learn highly expressive regularizers from data. In this paper, we take advantage of this paradox and introduce an optimal stopping time into the gradient flow process, which in turn is learned from data by means of an optimal control approach. As a result, we obtain highly efficient numerical schemes that achieve competitive results for image denoising and image deblurring. A nonlinear spectral analysis of the gradient of the learned regularizer gives enlightening insights about the different regularization properties.
Submitted 19 July, 2019;
originally announced July 2019.
-
Fast Decomposable Submodular Function Minimization using Constrained Total Variation
Authors:
K S Sesh Kumar,
Francis Bach,
Thomas Pock
Abstract:
We consider the problem of minimizing the sum of submodular set functions assuming minimization oracles of each summand function. Most existing approaches reformulate the problem as the convex minimization of the sum of the corresponding Lovász extensions and the squared Euclidean norm, leading to algorithms requiring total variation oracles of the summand functions; without further assumptions, these more complex oracles require many calls to the simpler minimization oracles often available in practice. In this paper, we consider a modified convex problem requiring constrained version of the total variation oracles that can be solved with significantly fewer calls to the simple minimization oracles. We support our claims by showing results on graph cuts for 2D and 3D graphs.
Submitted 27 May, 2019;
originally announced May 2019.
-
Convex-Concave Backtracking for Inertial Bregman Proximal Gradient Algorithms in Non-Convex Optimization
Authors:
Mahesh Chandra Mukkamala,
Peter Ochs,
Thomas Pock,
Shoham Sabach
Abstract:
Backtracking line-search is an old yet powerful strategy for finding better step sizes to be used in proximal gradient algorithms. The main principle is to locally find a simple convex upper bound of the objective function, which in turn controls the step size that is used. In case of inertial proximal gradient algorithms, the situation becomes much more difficult and usually leads to very restrictive rules on the extrapolation parameter. In this paper, we show that the extrapolation parameter can be controlled by locally finding also a simple concave lower bound of the objective function. This gives rise to a double convex-concave backtracking procedure which allows for an adaptive choice of both the step size and extrapolation parameters. We apply this procedure to the class of inertial Bregman proximal gradient methods, and prove that any sequence generated by these algorithms converges globally to a critical point of the function at hand. Numerical experiments on a number of challenging non-convex problems in image processing and machine learning were conducted and show the power of combining inertial step and double backtracking strategy in achieving improved performances.
Submitted 5 November, 2019; v1 submitted 6 April, 2019;
originally announced April 2019.
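The "simple convex upper bound" principle described in the abstract above can be sketched for the classical case. The code below is plain (non-inertial, Euclidean) proximal gradient with backtracking on a convex lasso problem, not the double convex-concave Bregman procedure of the paper; the function names, the test problem, and the parameters `L0` and `eta` are assumptions made here for illustration.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal map of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def prox_grad_backtracking(A, b, lam, n_iters=200, L0=1.0, eta=2.0):
    """Proximal gradient for 0.5*||Ax - b||^2 + lam*||x||_1.

    The step size is 1/L, where L is increased by the factor eta until the
    quadratic model at x is an upper bound of the smooth part at the new point.
    """
    f = lambda x: 0.5 * np.sum((A @ x - b) ** 2)
    x = np.zeros(A.shape[1])
    L = L0
    for _ in range(n_iters):
        g = A.T @ (A @ x - b)  # gradient of the smooth part
        while True:
            x_new = soft_threshold(x - g / L, lam / L)
            d = x_new - x
            # Accept once f(x_new) <= f(x) + <g, d> + (L/2) * ||d||^2.
            if f(x_new) <= f(x) + g @ d + 0.5 * L * (d @ d) + 1e-12:
                break
            L *= eta
        x = x_new
    return x

# Hypothetical check: with A = I the minimizer is soft_threshold(b, lam).
x_hat = prox_grad_backtracking(np.eye(3), np.array([3.0, -0.5, 1.0]), lam=1.0)
```

The inner loop always terminates, because the acceptance test holds as soon as `L` reaches the largest eigenvalue of `A.T @ A`.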
-
Deep Learning Methods for Parallel Magnetic Resonance Image Reconstruction
Authors:
Florian Knoll,
Kerstin Hammernik,
Chi Zhang,
Steen Moeller,
Thomas Pock,
Daniel K. Sodickson,
Mehmet Akcakaya
Abstract:
Following the success of deep learning in a wide range of applications, neural network-based machine learning techniques have received interest as a means of accelerating magnetic resonance imaging (MRI). A number of ideas inspired by deep learning techniques from computer vision and image processing have been successfully applied to non-linear image reconstruction in the spirit of compressed sensing for both low dose computed tomography and accelerated MRI. The additional integration of multi-coil information to recover missing k-space lines in the MRI reconstruction process, is still studied less frequently, even though it is the de-facto standard for currently used accelerated MR acquisitions. This manuscript provides an overview of the recent machine learning approaches that have been proposed specifically for improving parallel imaging. A general background introduction to parallel MRI is given that is structured around the classical view of image space and k-space based methods. Both linear and non-linear methods are covered, followed by a discussion of recent efforts to further improve parallel imaging using machine learning, and specifically using artificial neural networks. Image-domain based techniques that introduce improved regularizers are covered as well as k-space based methods, where the focus is on better interpolation strategies using neural networks. Issues and open problems are discussed as well as recent efforts for producing open datasets and benchmarks for the community.
Submitted 1 April, 2019;
originally announced April 2019.
-
A convex variational model for learning convolutional image atoms from incomplete data
Authors:
Antonin Chambolle,
Martin Holler,
Thomas Pock
Abstract:
A variational model for learning convolutional image atoms from corrupted and/or incomplete data is introduced and analyzed both in function space and numerically. Building on lifting and relaxation strategies, the proposed approach is convex and allows for simultaneous image reconstruction and atom-learning in a general, inverse problems context. Further, motivated by an improved numerical performance, also a semi-convex variant is included in the analysis and the experiments of the paper. For both settings, fundamental analytical properties allowing in particular to ensure well-posedness and stability results for inverse problems are proven in a continuous setting. Exploiting convexity, globally optimal solutions are further computed numerically for applications with incomplete, noisy and blurry data and numerical results are shown.
Submitted 7 December, 2018;
originally announced December 2018.
-
Learning Energy Based Inpainting for Optical Flow
Authors:
Christoph Vogel,
Patrick Knöbelreiter,
Thomas Pock
Abstract:
Modern optical flow methods are often composed of a cascade of many independent steps or formulated as a black box neural network that is hard to interpret and analyze. In this work we seek a plain, interpretable, but learnable solution. We propose a novel inpainting based algorithm that approaches the problem in three steps: feature selection and matching, selection of supporting points and energy based inpainting. To facilitate the inference we propose an optimization layer that allows backpropagation through 10K iterations of a first-order method without any numerical or memory problems. Compared to recent state-of-the-art networks, our modular CNN is very lightweight and competitive with other, more involved, inpainting based methods.
Submitted 8 November, 2018;
originally announced November 2018.
-
3D Fluid Flow Estimation with Integrated Particle Reconstruction
Authors:
Katrin Lasinger,
Christoph Vogel,
Thomas Pock,
Konrad Schindler
Abstract:
The standard approach to densely reconstruct the motion in a volume of fluid is to inject high-contrast tracer particles and record their motion with multiple high-speed cameras. Almost all existing work processes the acquired multi-view video in two separate steps, utilizing either a pure Eulerian or pure Lagrangian approach. Eulerian methods perform a voxel-based reconstruction of particles per time step, followed by 3D motion estimation, with some form of dense matching between the precomputed voxel grids from different time steps. In this sequential procedure, the first step cannot use temporal consistency considerations to support the reconstruction, while the second step has no access to the original, high-resolution image data. Alternatively, Lagrangian methods reconstruct an explicit, sparse set of particles and track the individual particles over time. Physical constraints can only be incorporated in a post-processing step when interpolating the particle tracks to a dense motion field. We show, for the first time, how to jointly reconstruct both the individual tracer particles and a dense 3D fluid motion field from the image data, using an integrated energy minimization. Our hybrid Lagrangian/Eulerian model reconstructs individual particles, and at the same time recovers a dense 3D motion field in the entire domain. Making particles explicit greatly reduces the memory consumption and allows one to use the high-res input images for matching, whereas the dense motion field makes it possible to include physical a-priori constraints and account for the incompressibility and viscosity of the fluid. The method exhibits greatly (~70%) improved results over our recently published baseline with two separate steps for 3D reconstruction and motion estimation. Our results with only two time steps are comparable to those of state-of-the-art tracking-based methods that require much longer sequences.
Submitted 21 November, 2019; v1 submitted 9 April, 2018;
originally announced April 2018.
-
Variational 3D-PIV with Sparse Descriptors
Authors:
Katrin Lasinger,
Christoph Vogel,
Thomas Pock,
Konrad Schindler
Abstract:
3D Particle Imaging Velocimetry (3D-PIV) aims to recover the flow field in a volume of fluid, which has been seeded with tracer particles and observed from multiple camera viewpoints. The first step of 3D-PIV is to reconstruct the 3D locations of the tracer particles from synchronous views of the volume. We propose a new method for iterative particle reconstruction (IPR), in which the locations and intensities of all particles are inferred in one joint energy minimization. The energy function is designed to penalize deviations between the reconstructed 3D particles and the image evidence, while at the same time aiming for a sparse set of particles. We find that the new method, without any post-processing, achieves significantly cleaner particle volumes than a conventional, tomographic MART reconstruction, and can handle a wide range of particle densities. The second step of 3D-PIV is to then recover the dense motion field from two consecutive particle reconstructions. We propose a variational model, which makes it possible to directly include physical properties, such as incompressibility and viscosity, in the estimation of the motion field. To further exploit the sparse nature of the input data, we propose a novel, compact descriptor of the local particle layout. Hence, we avoid the memory-intensive storage of high-resolution intensity volumes. Our framework is generic and allows for a variety of different data costs (correlation measures) and regularizers. We quantitatively evaluate it with both the sum of squared differences (SSD) and the normalized cross-correlation (NCC), each with both a hard and a soft version of the incompressibility constraint.
Submitted 9 April, 2018;
originally announced April 2018.
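The two data costs named in this abstract, SSD and NCC, can be illustrated with a minimal sketch on flat lists of intensity samples. The function names `ssd` and `ncc`, the list-based patch representation, and the convention of returning 0.0 for a constant patch are assumptions made for illustration; the paper's actual descriptors and cost definitions may differ.

```python
import math


def ssd(a, b):
    """Sum of squared differences between two equally long sample lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def ncc(a, b):
    """Zero-mean normalized cross-correlation in [-1, 1].

    Returns 0.0 when either input has zero variance (an assumed
    convention; the correlation is undefined in that case).
    """
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(da, db)) / denom
```

SSD is lowest (zero) for identical patches and is sensitive to brightness changes, whereas NCC is unchanged by a positive affine intensity change such as `b = 2 * a + 5`, which is one common reason for offering both.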
-
Robust Deformation Estimation in Wood-Composite Materials using Variational Optical Flow
Authors:
Markus Hofinger,
Thomas Pock,
Thomas Moosbrugger
Abstract:
Wood-composite materials are widely used today as they homogenize humidity-related directional deformations. Quantification of these deformations as coefficients is important for construction and engineering and is a topic of current research, but it is still a manual process.
This work introduces a novel computer vision approach that automatically extracts these properties directly from scans of the wooden specimens, taken at different humidity levels during the long-lasting humidity conditioning process. These scans are used to compute a humidity-dependent deformation field for each pixel, from which the desired coefficients can easily be calculated.
The overall method includes automated registration of the wooden blocks, numerical optimization to compute a variational optical flow field which is further used to calculate dense strain fields, and finally the engineering coefficients and their variance throughout the wooden blocks. The method's regularization is fully parameterizable, which allows modeling and suppressing artifacts due to surface appearance changes of the specimens from mold, cracks, etc. that typically arise in the conditioning process.
Submitted 13 February, 2018;
originally announced February 2018.
-
Adaptive FISTA for Non-convex Optimization
Authors:
Peter Ochs,
Thomas Pock
Abstract:
In this paper we propose an adaptively extrapolated proximal gradient method, which is based on the accelerated proximal gradient method (also known as FISTA); however, we locally optimize the extrapolation parameter by carrying out an exact (or inexact) line search. It turns out that in some situations, the proposed algorithm is equivalent to a class of SR1 (identity minus rank 1) proximal quasi-Newton methods. Convergence is proved in a general non-convex setting, and hence, as a byproduct, we also obtain new convergence guarantees for proximal quasi-Newton methods. The efficiency of the new method is shown in numerical experiments on a sparsity regularized non-linear inverse problem.
Submitted 30 June, 2019; v1 submitted 12 November, 2017;
originally announced November 2017.
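As a point of reference for the extrapolated proximal gradient scheme mentioned in this abstract, here is a minimal sketch of plain FISTA with the standard fixed extrapolation rule, applied to an l1-regularized least-squares problem. It is not the adaptive variant proposed in the paper, since there is no line search over the extrapolation parameter. The function names, the convex test problem, and the requirement that the caller supplies a Lipschitz constant `L` of the smooth part are assumptions made for illustration.

```python
import math


def soft_threshold(v, t):
    """Proximal map of t * ||.||_1, applied element-wise."""
    return [math.copysign(max(abs(x) - t, 0.0), x) for x in v]


def matvec(A, x):
    """Compute A x for a dense matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def rmatvec(A, r):
    """Compute A^T r for a dense matrix stored as a list of rows."""
    return [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(len(A[0]))]


def fista(A, b, lam, L, iters=200):
    """Minimize 0.5 * ||A x - b||^2 + lam * ||x||_1 with standard FISTA.

    L must be an upper bound on the largest eigenvalue of A^T A.
    """
    n = len(A[0])
    x = [0.0] * n
    y = list(x)  # extrapolated point
    t = 1.0
    for _ in range(iters):
        # Gradient of the smooth part at the extrapolated point.
        residual = [p - q for p, q in zip(matvec(A, y), b)]
        grad = rmatvec(A, residual)
        # Proximal gradient step.
        x_new = soft_threshold([yi - gi / L for yi, gi in zip(y, grad)], lam / L)
        # Fixed extrapolation rule (the paper instead optimizes this locally).
        t_new = (1.0 + math.sqrt(1.0 + 4.0 * t * t)) / 2.0
        beta = (t - 1.0) / t_new
        y = [xn + beta * (xn - xo) for xn, xo in zip(x_new, x)]
        x, t = x_new, t_new
    return x
```

For a diagonal `A` the minimizer has a closed form, which makes the sketch easy to check: with `A = [[2, 0], [0, 1]]`, `b = [4, 0.5]`, `lam = 1` and `L = 4`, the solution is `[1.75, 0.0]`.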
-
Semantic 3D Reconstruction with Finite Element Bases
Authors:
Audrey Richard,
Christoph Vogel,
Maros Blaha,
Thomas Pock,
Konrad Schindler
Abstract:
We propose a novel framework for the discretisation of multi-label problems on arbitrary, continuous domains. Our work bridges the gap between general FEM discretisations, and labeling problems that arise in a variety of computer vision tasks, including for instance those derived from the generalised Potts model. Starting from the popular formulation of labeling as a convex relaxation by functional lifting, we show that FEM discretisation is valid for the most general case, where the regulariser is anisotropic and non-metric. While our findings are generic and applicable to different vision problems, we demonstrate their practical implementation in the context of semantic 3D reconstruction, where such regularisers have proved particularly beneficial. The proposed FEM approach leads to a smaller memory footprint as well as faster computation, and it constitutes a very simple way to enable variable, adaptive resolution within the same model.
Submitted 4 October, 2017;
originally announced October 2017.
-
Total Roto-Translational Variation
Authors:
Antonin Chambolle,
Thomas Pock
Abstract:
We consider curvature-dependent variational models for image regularization, such as Euler's elastica. These models are known to provide strong priors for the continuity of edges and hence have important applications in shape and image processing. We consider a lifted convex representation of these models in the roto-translation space: In this space, curvature-dependent variational energies are represented by means of a convex functional defined on divergence-free vector fields. The line energies are then easily extended to any scalar function. This yields a natural generalization of the total variation to the roto-translation space. As our main result, we show that the proposed convex representation is tight for characteristic functions of smooth shapes. We also discuss cases where this representation fails. For the numerical solution, we propose a staggered grid discretization based on an averaged Raviart-Thomas finite element approximation. This discretization is consistent, up to minor details, with the underlying continuous model. The resulting non-smooth convex optimization problem is solved using a first-order primal-dual algorithm. We illustrate the results of our numerical algorithm on various problems from shape and image processing.
Submitted 29 July, 2018; v1 submitted 28 September, 2017;
originally announced September 2017.
-
Scalable Full Flow with Learned Binary Descriptors
Authors:
Gottfried Munda,
Alexander Shekhovtsov,
Patrick Knöbelreiter,
Thomas Pock
Abstract:
We propose a method for large displacement optical flow in which local matching costs are learned by a convolutional neural network (CNN) and a smoothness prior is imposed by a conditional random field (CRF). We tackle the computation- and memory-intensive operations on the 4D cost volume by a min-projection, which reduces memory complexity from quadratic to linear, and by binary descriptors for efficient matching. This enables evaluation of the cost on the fly and allows learning and CRF inference to be performed on high resolution images without ever storing the 4D cost volume. To address the problem of learning binary descriptors we propose a new hybrid learning scheme. In contrast to current state-of-the-art approaches for learning binary CNNs, we can compute the exact non-zero gradient within our model. We compare several methods for training binary descriptors and show results on publicly available benchmarks.
Submitted 20 July, 2017;
originally announced July 2017.
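The memory argument in this abstract can be sketched for a single pixel. Under the assumption that the min-projection keeps, for each horizontal displacement u, the minimum cost over all vertical displacements v (and vice versa), the per-pixel storage drops from one value per (u, v) pair to one value per u plus one per v. The names `hamming`, `min_projection` and the callable `cost` are hypothetical, and the sketch ignores the CNN and CRF parts of the method entirely.

```python
def hamming(a, b):
    """Hamming distance between two binary descriptors stored as ints."""
    return bin(a ^ b).count("1")


def min_projection(cost, displacements):
    """Project a per-pixel 2D cost slice onto its two displacement axes.

    cost(u, v) is evaluated on the fly, so the full |U| x |V| slice is
    never stored; only two arrays of size |U| and |V| are kept.
    """
    best_u = {u: float("inf") for u in displacements}
    best_v = {v: float("inf") for v in displacements}
    for u in displacements:
        for v in displacements:
            c = cost(u, v)
            if c < best_u[u]:
                best_u[u] = c
            if c < best_v[v]:
                best_v[v] = c
    return best_u, best_v
```

In a matching setting, `cost(u, v)` would typically be the Hamming distance between the descriptor at a pixel and the descriptor at the displaced pixel in the second image; the number of cost evaluations stays quadratic in the search range, only the storage becomes linear.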
-
Learning a Variational Network for Reconstruction of Accelerated MRI Data
Authors:
Kerstin Hammernik,
Teresa Klatzer,
Erich Kobler,
Michael P Recht,
Daniel K Sodickson,
Thomas Pock,
Florian Knoll
Abstract:
Purpose: To allow fast and high-quality reconstruction of clinical accelerated multi-coil MR data by learning a variational network that combines the mathematical structure of variational models with deep learning.
Theory and Methods: Generalized compressed sensing reconstruction formulated as a variational model is embedded in an unrolled gradient descent scheme. All parameters of this formulation, including the prior model defined by filter kernels and activation functions as well as the data term weights, are learned during an offline training procedure. The learned model can then be applied online to previously unseen data.
Results: The variational network approach is evaluated on a clinical knee imaging protocol. The variational network reconstructions outperform standard reconstruction algorithms in terms of image quality and residual artifacts for all tested acceleration factors and sampling patterns.
Conclusion: Variational network reconstructions preserve the natural appearance of MR images as well as pathologies that were not included in the training data set. Due to its high computational performance, i.e., reconstruction time of 193 ms on a single graphics card, and the omission of parameter tuning once the network is trained, this new approach to image reconstruction can easily be integrated into clinical workflow.
Submitted 3 April, 2017;
originally announced April 2017.
-
Real-Time Panoramic Tracking for Event Cameras
Authors:
Christian Reinbacher,
Gottfried Munda,
Thomas Pock
Abstract:
Event cameras are a paradigm shift in camera technology. Instead of full frames, the sensor captures a sparse set of events caused by intensity changes. Since only the changes are transferred, those cameras are able to capture quick movements of objects in the scene or of the camera itself. In this work we propose a novel method to perform camera tracking of event cameras in a panoramic setting with three degrees of freedom. We propose a direct camera tracking formulation, similar to the state of the art in visual odometry. We show that the minimal information needed for simultaneous tracking and mapping is the spatial position of events, without using the appearance of the imaged scene point. We verify the robustness to fast camera movements and dynamic objects in the scene on a recently proposed dataset and self-recorded sequences.
Submitted 21 March, 2017; v1 submitted 15 March, 2017;
originally announced March 2017.
-
Inertial Proximal Alternating Linearized Minimization (iPALM) for Nonconvex and Nonsmooth Problems
Authors:
Thomas Pock,
Shoham Sabach
Abstract:
In this paper we study nonconvex and nonsmooth optimization problems with semi-algebraic data, where the variables vector is split into several blocks of variables. The problem consists of one smooth function of the entire variables vector and the sum of nonsmooth functions for each block separately. We analyze an inertial version of the Proximal Alternating Linearized Minimization (PALM) algorithm and prove its global convergence to a critical point of the objective function at hand. We illustrate our theoretical findings by presenting numerical experiments on blind image deconvolution, on sparse non-negative matrix factorization and on dictionary learning, which demonstrate the viability and effectiveness of the proposed method.
Submitted 8 February, 2017;
originally announced February 2017.
-
End-to-End Training of Hybrid CNN-CRF Models for Stereo
Authors:
Patrick Knöbelreiter,
Christian Reinbacher,
Alexander Shekhovtsov,
Thomas Pock
Abstract:
We propose a novel and principled hybrid CNN+CRF model for stereo estimation. Our model allows exploiting the advantages of both convolutional neural networks (CNNs) and conditional random fields (CRFs) in a unified approach. The CNNs compute expressive features for matching and distinctive color edges, which in turn are used to compute the unary and binary costs of the CRF. For inference, we apply a recently proposed highly parallel dual block descent algorithm which only needs a small fixed number of iterations to compute a high-quality approximate minimizer. As the main contribution of the paper, we propose a theoretically sound method based on the structured output support vector machine (SSVM) to train the hybrid CNN+CRF model on large-scale data end-to-end. Our trained models perform very well despite the fact that we are using shallow CNNs and do not apply any kind of post-processing to the final output of the CRF. We evaluate our combined models on challenging stereo benchmarks such as Middlebury 2014 and KITTI 2015 and also investigate the performance of each individual component.
Submitted 3 May, 2017; v1 submitted 30 November, 2016;
originally announced November 2016.
-
A first-order primal-dual algorithm with linesearch
Authors:
Yura Malitsky,
Thomas Pock
Abstract:
The paper proposes a linesearch for a primal-dual method. Each iteration of the linesearch requires updating only the dual (or primal) variable. For many problems, in particular for regularized least squares, the linesearch does not require any additional matrix-vector multiplications. We prove convergence of the proposed method under standard assumptions. We also show an ergodic $O(1/N)$ rate of convergence for our method. In case one or both of the prox-functions are strongly convex, we modify our basic method to get a better convergence rate. Finally, we propose a linesearch for a saddle point problem with an additional smooth term. Several numerical experiments confirm the efficiency of our proposed methods.
Submitted 23 March, 2018; v1 submitted 31 August, 2016;
originally announced August 2016.
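For orientation, the basic first-order primal-dual iteration that such a linesearch builds on can be sketched with fixed step sizes. The example below applies it to 1D total variation denoising, min_x 0.5 * ||x - f||^2 + lam * ||D x||_1, where D is the forward difference operator. The choice of test problem, the function name `tv_denoise_1d`, and the fixed steps `tau = sigma = 0.49` (which satisfy tau * sigma * ||D||^2 < 1 because ||D||^2 <= 4 in 1D) are assumptions made for illustration; the linesearch from the paper, which would adapt the steps, is not implemented here.

```python
def tv_denoise_1d(f, lam, iters=300, tau=0.49, sigma=0.49):
    """Fixed-step primal-dual iteration for 1D TV denoising.

    Solves min_x 0.5 * ||x - f||^2 + lam * sum_i |x[i+1] - x[i]|.
    """
    n = len(f)
    x = list(f)
    x_bar = list(f)          # extrapolated primal variable
    y = [0.0] * (n - 1)      # dual variable, one entry per difference
    for _ in range(iters):
        # Dual step: ascent along D x_bar, then projection onto [-lam, lam].
        y = [
            max(-lam, min(lam, y[i] + sigma * (x_bar[i + 1] - x_bar[i])))
            for i in range(n - 1)
        ]
        # Adjoint of the forward difference: (D^T y)[j] = y[j-1] - y[j].
        dty = [
            (y[j - 1] if j > 0 else 0.0) - (y[j] if j < n - 1 else 0.0)
            for j in range(n)
        ]
        # Primal step: proximal map of the quadratic data term.
        x_new = [(x[j] - tau * dty[j] + tau * f[j]) / (1.0 + tau) for j in range(n)]
        # Extrapolation with theta = 1.
        x_bar = [2.0 * xn - xo for xn, xo in zip(x_new, x)]
        x = x_new
    return x
```

On the two-sample signal `[0.0, 1.0]` the minimizer is known in closed form: a large weight such as `lam = 10` flattens the signal to its mean `[0.5, 0.5]`, while `lam = 0.1` shrinks the jump by `2 * lam`, giving `[0.1, 0.9]`.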