Time-Harmonic Optical Flow with Applications in Elastography

Authors: Oleh Melnyk, Michael Quellmalz, Gabriele Steidl, Noah Jaitner, Jakob Jordan, Ingolf Sack

Abstract: In this paper, we propose mathematical models for reconstructing the optical flow in time-harmonic elastography. In this image acquisition technique, the object undergoes a special time-harmonic oscillation with known frequency so that only the spatially varying amplitude of the velocity field has to be determined. This allows for a simpler multi-frame optical flow analysis using Fourier analytic tools in time. We propose three variational optical flow models and show how their minimization can be tackled via Fourier transform in time. Numerical examples with synthetic as well as real-world data demonstrate the benefits of our approach. Keywords: optical flow, elastography, Fourier transform, iteratively reweighted least squares, Horn--Schunck method

Submitted 24 May, 2024; originally announced May 2024.

Comments: 29 pages, 8 figures

arXiv:2403.18705 [pdf, other]

Conditional Wasserstein Distances with Applications in Bayesian OT Flow Matching

Authors: Jannis Chemseddine, Paul Hagemann, Gabriele Steidl, Christian Wald

Abstract: In inverse problems, many conditional generative models approximate the posterior measure by minimizing a distance between the joint measure and its learned approximation. While this approach also controls the distance between the posterior measures in the case of the Kullback--Leibler divergence, this does not hold true in general for the Wasserstein distance. In this paper, we introduce a conditional Wasserstein distance via a set of restricted couplings that equals the expected Wasserstein distance of the posteriors. Interestingly, the dual formulation of the conditional Wasserstein-1 flow resembles losses in the conditional Wasserstein GAN literature in a quite natural way. We derive theoretical properties of the conditional Wasserstein distance, characterize the corresponding geodesics and velocity fields as well as the flow ODEs. Subsequently, we propose to approximate the velocity fields by relaxing the conditional Wasserstein distance. Based on this, we propose an extension of OT Flow Matching for solving Bayesian inverse problems and demonstrate its numerical advantages on an inverse problem and class-conditional image generation.

Submitted 5 June, 2024; v1 submitted 27 March, 2024; originally announced March 2024.

Comments: This paper supersedes arXiv:2310.13433

arXiv:2403.15563 [pdf, other]

Sparse additive function decompositions facing basis transforms

Authors: Fatima Antarou Ba, Oleh Melnyk, Christian Wald, Gabriele Steidl

Abstract: High-dimensional real-world systems can often be well characterized by a small number of simultaneous low-complexity interactions. The analysis of variance (ANOVA) decomposition and the anchored decomposition are typical techniques to find sparse additive decompositions of functions. In this paper, we are interested in a setting where these decompositions are not directly sparse, but become so after an appropriate basis transform. Noting that the sparsity of those additive function decompositions is equivalent to the fact that most of the function's mixed partial derivatives vanish, we can exploit a connection to the underlying function graphs to determine an orthogonal transform that realizes the appropriate basis change. This is done in three steps: we apply singular value decomposition to minimize the number of vertices of the function graph, and joint block diagonalization techniques of families of matrices followed by sparse minimization based on relaxations of the zero "norm" for minimizing the number of edges. For the latter one, we propose and analyze minimization techniques over the manifold of special orthogonal matrices. Various numerical examples illustrate the reliability of our approach for functions having, after a basis transform, a sparse additive decomposition into summands with at most two variables.

Submitted 28 March, 2024; v1 submitted 22 March, 2024; originally announced March 2024.

Comments: 46 pages, 10 figures, 8 tables

MSC Class: 26Bxx; 33F05; 58C05; 90C26; 65Kxx; 65F25; 15B99

arXiv:2402.08425 [pdf, other]

Transfer Operators from Batches of Unpaired Points via Entropic Transport Kernels

Authors: Florian Beier, Hancheng Bi, Clément Sarrazin, Bernhard Schmitzer, Gabriele Steidl

Abstract: In this paper, we are concerned with estimating the joint probability of random variables $X$ and $Y$, given $N$ independent observation blocks $(\boldsymbol{x}^i,\boldsymbol{y}^i)$, $i=1,\ldots,N$, each of $M$ samples $(\boldsymbol{x}^i,\boldsymbol{y}^i) = \bigl((x^i_j, y^i_{\sigma^i(j)}) \bigr)_{j=1}^M$, where $\sigma^i$ denotes an unknown permutation of i.i.d. sampled pairs $(x^i_j,y_j^i)$, $j=1,\ldots,M$. This means that the internal ordering of the $M$ samples within an observation block is not known. We derive a maximum-likelihood inference functional, propose a computationally tractable approximation and analyze their properties. In particular, we prove a $\Gamma$-convergence result showing that we can recover the true density from empirical approximations as the number $N$ of blocks goes to infinity. Using entropic optimal transport kernels, we model a class of hypothesis spaces of density functions over which the inference functional can be minimized. This hypothesis class is particularly suited for approximate inference of transfer operators from data. We solve the resulting discrete minimization problem by a modification of the EMML algorithm to take additional transition probability constraints into account and prove the convergence of this algorithm. Proof-of-concept examples demonstrate the potential of our method.

Submitted 13 February, 2024; originally announced February 2024.

MSC Class: 37A30; 62G07

arXiv:2402.04613 [pdf, other]

Wasserstein Gradient Flows for Moreau Envelopes of f-Divergences in Reproducing Kernel Hilbert Spaces

Authors: Sebastian Neumayer, Viktor Stein, Gabriele Steidl, Nicolaj Rux

Abstract: Most commonly used $f$-divergences of measures, e.g., the Kullback-Leibler divergence, are subject to limitations regarding the support of the involved measures. A remedy consists of regularizing the $f$-divergence by a squared maximum mean discrepancy (MMD) associated with a characteristic kernel $K$. In this paper, we use the so-called kernel mean embedding to show that the corresponding regularization can be rewritten as the Moreau envelope of some function in the reproducing kernel Hilbert space associated with $K$. Then, we exploit well-known results on Moreau envelopes in Hilbert spaces to prove properties of the MMD-regularized $f$-divergences and, in particular, their gradients. Subsequently, we use our findings to analyze Wasserstein gradient flows of MMD-regularized $f$-divergences. Finally, we consider Wasserstein gradient flows starting from empirical measures. We provide proof-of-concept numerical examples for $f$-divergences with both infinite and finite recession constant.

Submitted 9 March, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

Comments: 46 pages, 13 figures

MSC Class: 46N10 (Primary) 46E22; 94A15 (Secondary)

arXiv:2401.16896 [pdf, other]

Parallelly Sliced Optimal Transport on Spheres and on the Rotation Group

Authors: Michael Quellmalz, Léo Buecher, Gabriele Steidl

Abstract: Sliced optimal transport, which is basically a Radon transform followed by one-dimensional optimal transport, has become popular in various applications due to its efficient computation. In this paper, we deal with sliced optimal transport on the sphere $\mathbb{S}^{d-1}$ and on the rotation group SO(3). We propose a parallel slicing procedure of the sphere which again requires only optimal transport on the line. We analyze the properties of the corresponding parallelly sliced optimal transport, which provides in particular a rotationally invariant metric on the spherical probability measures. For SO(3), we introduce a new two-dimensional Radon transform and develop its singular value decomposition. Based on this, we propose a sliced optimal transport on SO(3). As Wasserstein distances were extensively used in barycenter computations, we derive algorithms to compute the barycenters with respect to our new sliced Wasserstein distances and provide synthetic numerical examples on the 2-sphere that demonstrate their behavior for both the free and fixed support setting of discrete spherical measures. In terms of computational speed, they outperform the existing methods for semicircular slicing as well as the regularized Wasserstein barycenters.

Submitted 22 May, 2024; v1 submitted 30 January, 2024; originally announced January 2024.

Comments: 40 pages, 11 figures

arXiv:2401.14381 [pdf, other]

Manifold GCN: Diffusion-based Convolutional Neural Network for Manifold-valued Graphs

Authors: Martin Hanik, Gabriele Steidl, Christoph von Tycowicz

Abstract: We propose two graph neural network layers for graphs with features in a Riemannian manifold. First, based on a manifold-valued graph diffusion equation, we construct a diffusion layer that can be applied to an arbitrary number of nodes and graph connectivity patterns. Second, we model a tangent multilayer perceptron by transferring ideas from the vector neuron framework to our general setting. Both layers are equivariant with respect to node permutations and isometries of the feature manifold. These properties have been shown to lead to a beneficial inductive bias in many deep learning tasks. Numerical examples on synthetic data as well as on triangle meshes of the right hippocampus to classify Alzheimer's disease demonstrate the very good performance of our layers.

Submitted 25 January, 2024; originally announced January 2024.

MSC Class: 53Z50

ACM Class: I.2.4

arXiv:2312.16611 [pdf, other]

Learning from small data sets: Patch-based regularizers in inverse problems for image reconstruction

Authors: Moritz Piening, Fabian Altekrüger, Johannes Hertrich, Paul Hagemann, Andrea Walther, Gabriele Steidl

Abstract: The solution of inverse problems is of fundamental interest in medical and astronomical imaging, geophysics as well as engineering and life sciences. Recent advances were made by using methods from machine learning, in particular deep neural networks. Most of these methods require a huge amount of (paired) data and computer capacity to train the networks, which often may not be available. Our paper addresses the issue of learning from small data sets by taking patches of very few images into account. We focus on the combination of model-based and data-driven methods by approximating just the image prior, also known as the regularizer in the variational model. We review two methodically different approaches, namely optimizing the maximum log-likelihood of the patch distribution, and penalizing Wasserstein-like discrepancies of whole empirical patch distributions. From the point of view of Bayesian inverse problems, we show how we can achieve uncertainty quantification by approximating the posterior using Langevin Monte Carlo methods. We demonstrate the power of the methods in computed tomography, image super-resolution, and inpainting. Indeed, the approach also provides high-quality results in zero-shot super-resolution, where only a low-resolution image is available. The paper is accompanied by a GitHub repository containing implementations of all methods as well as data examples so that the reader can get their own insight into the performance.

Submitted 27 December, 2023; originally announced December 2023.

arXiv:2310.03054 [pdf, other]

Posterior Sampling Based on Gradient Flows of the MMD with Negative Distance Kernel

Authors: Paul Hagemann, Johannes Hertrich, Fabian Altekrüger, Robert Beinert, Jannis Chemseddine, Gabriele Steidl

Abstract: We propose conditional flows of the maximum mean discrepancy (MMD) with the negative distance kernel for posterior sampling and conditional generative modeling. This MMD, which is also known as energy distance, has several advantageous properties like efficient computation via slicing and sorting. We approximate the joint distribution of the ground truth and the observations using discrete Wasserstein gradient flows and establish an error bound for the posterior distributions. Further, we prove that our particle flow is indeed a Wasserstein gradient flow of an appropriate functional. The power of our method is demonstrated by numerical examples including conditional image generation and inverse problems like superresolution, inpainting and computed tomography in low-dose and limited-angle settings.

Submitted 21 March, 2024; v1 submitted 4 October, 2023; originally announced October 2023.

Comments: Published as a conference paper at ICLR 2024

arXiv:2308.11314 [pdf, other]

A Study of Particle Motion in the Presence of Clusters

Authors: Christian Wald, Gabriele Steidl

Abstract: The motivation for this study came from the task of analysing the kinetic behavior of single molecules in a living cell based on Single Molecule Localization Microscopy. Given measurements of both the motion of clusters and molecules, the main task consists in detecting whether a molecule belongs to a cluster. While the exact size of the clusters is usually unknown, upper bounds are available. In this study, we simulate the cluster movement by a Brownian motion and that of the particles by a Gaussian mixture model with two modes depending on the position of the particle within or outside a cluster. We propose various variational models to detect whether a particle lies within a cluster based on the Wasserstein and maximum mean discrepancy distances between measures. We compare the performance of the proposed models for simulated data.

Submitted 22 August, 2023; originally announced August 2023.

arXiv:2307.10980 [pdf, other]

Denoising of Sphere- and SO(3)-Valued Data by Relaxed Tikhonov Regularization

Authors: Robert Beinert, Jonas Bresch, Gabriele Steidl

Abstract: Manifold-valued signal and image processing has received attention due to modern image acquisition techniques. Recently, a convex relaxation of the otherwise nonconvex Tikhonov regularization for denoising circle-valued data has been proposed by Condat (2022). The circle constraints are here encoded in a series of low-dimensional, positive semi-definite matrices. Using Schur complement arguments, we show that the resulting variational model can be simplified while leading to the same solution. The simplified model can be generalized to higher-dimensional spheres and to SO(3)-valued data, where we rely on the quaternion representation of the latter. Standard algorithms from convex analysis can be applied to solve the resulting convex minimization problem. As a proof of concept, we use the alternating direction method of multipliers to demonstrate the denoising behavior of the proposed method. In a series of experiments, we demonstrate the numerical convergence of the signal or image values to the underlying manifold.

Submitted 16 May, 2024; v1 submitted 20 July, 2023; originally announced July 2023.

MSC Class: 94A08; 94A12; 65J22; 90C22; 90C25

arXiv:2305.07071 [pdf, other]

Generalized Iterative Scaling for Regularized Optimal Transport with Affine Constraints: Application Examples

Authors: Johannes von Lindheim, Gabriele Steidl

Abstract: We demonstrate the relevance of an algorithm called generalized iterative scaling (GIS) or simultaneous multiplicative algebraic reconstruction technique (SMART) and its rescaled block-iterative version (RBI-SMART) in the field of optimal transport (OT). Many OT problems can be tackled through the use of entropic regularization by solving the Schrödinger problem, which is an information projection problem, that is, a projection with respect to the Kullback--Leibler divergence. Here we consider problems that have several affine constraints. It is well known that cyclic information projections onto the individual affine sets converge to the solution. In practice, however, even these individual projections are not explicitly available in general. In this paper, we exchange them for one GIS iteration. If this is done for every affine set, we obtain RBI-SMART. We provide a convergence proof using an interpretation of these iterations as two-step affine projections in an equivalent problem. This is done in a slightly more general setting than RBI-SMART, since we use a mix of explicitly known information projections and GIS iterations. We proceed to specialize this algorithm to several OT applications. First, we find the measure that minimizes the regularized OT divergence to a given measure under moment constraints. Second and third, the proposed framework yields an algorithm for solving a regularized martingale OT problem, as well as a relaxed version of the barycentric weak OT problem. Finally, we show an approach from the literature for unbalanced OT problems.

Submitted 11 May, 2023; originally announced May 2023.

arXiv:2304.09092 [pdf, other]

doi 10.1088/1361-6420/acf156

Sliced Optimal Transport on the Sphere

Authors: Michael Quellmalz, Robert Beinert, Gabriele Steidl

Abstract: Sliced optimal transport reduces optimal transport on multi-dimensional domains to transport on the line. More precisely, sliced optimal transport is the concatenation of the well-known Radon transform and the cumulative density transform, which analytically yields the solutions of the reduced transport problems. Inspired by this concept, we propose two adaptations for optimal transport on the 2-sphere. Firstly, as counterpart to the Radon transform, we introduce the vertical slice transform, which integrates along all circles orthogonal to a given direction. Secondly, we introduce a semicircle transform, which integrates along all half great circles with an appropriate weight function. Both transforms are generalized to arbitrary measures on the sphere. While the vertical slice transform can be combined with optimal transport on the interval and leads to a sliced Wasserstein distance restricted to even probability measures, the semicircle transform is related to optimal transport on the circle and results in a different sliced Wasserstein distance for arbitrary probability measures. The applicability of both novel sliced optimal transport concepts on the sphere is demonstrated by proof-of-concept examples dealing with the interpolation and classification of spherical probability measures. The numerical implementation relies on the singular value decompositions of both transforms and fast Fourier techniques. For the inversion with respect to probability measures, we propose the minimization of an entropy-regularized Kullback--Leibler divergence, which can be numerically realized using a primal-dual proximal splitting algorithm.

Submitted 2 August, 2023; v1 submitted 18 April, 2023; originally announced April 2023.

Comments: 39 pages, 6 figures

Journal ref: Inverse Problems 39(10), article number 105005, 2023

arXiv:2303.15845 [pdf, other]

Conditional Generative Models are Provably Robust: Pointwise Guarantees for Bayesian Inverse Problems

Authors: Fabian Altekrüger, Paul Hagemann, Gabriele Steidl

Abstract: Conditional generative models have become a very powerful tool to sample from Bayesian inverse problem posteriors. It is well known in classical Bayesian literature that posterior measures are quite robust with respect to perturbations of both the prior measure and the negative log-likelihood, which includes perturbations of the observations. However, to the best of our knowledge, the robustness of conditional generative models with respect to perturbations of the observations has not been investigated yet. In this paper, we prove for the first time that appropriately learned conditional generative models provide robust results for single observations.

Submitted 23 October, 2023; v1 submitted 28 March, 2023; originally announced March 2023.

Comments: Accepted and published in Transactions on Machine Learning Research (07/2023)

Journal ref: Transactions on Machine Learning Research (TMLR), 2023

arXiv:2303.04772 [pdf, other]

Multilevel Diffusion: Infinite Dimensional Score-Based Diffusion Models for Image Generation

Authors: Paul Hagemann, Sophie Mildenberger, Lars Ruthotto, Gabriele Steidl, Nicole Tianjiao Yang

Abstract: Score-based diffusion models (SBDM) have recently emerged as state-of-the-art approaches for image generation. Existing SBDMs are typically formulated in a finite-dimensional setting, where images are considered as tensors of finite size. This paper develops SBDMs in the infinite-dimensional setting, that is, we model the training data as functions supported on a rectangular domain. Besides the quest for generating images at ever higher resolution, our primary motivation is to create a well-posed infinite-dimensional learning problem so that we can discretize it consistently on multiple resolution levels. We thereby intend to obtain diffusion models that generalize across different resolution levels and improve the efficiency of the training process. We demonstrate how to overcome two shortcomings of current SBDM approaches in the infinite-dimensional setting. First, we modify the forward process to ensure that the latent distribution is well-defined in the infinite-dimensional setting using the notion of trace class operators. We derive the reverse processes for finite approximations. Second, we illustrate that approximating the score function with an operator network is beneficial for multilevel training. After deriving the convergence of the discretization and the approximation of multilevel training, we implement an infinite-dimensional SBDM approach and show the first promising results on MNIST and Fashion-MNIST, underlining our developed theory.

Submitted 4 November, 2023; v1 submitted 8 March, 2023; originally announced March 2023.

MSC Class: 60H10; 65D18

arXiv:2301.11624 [pdf, other]

Neural Wasserstein Gradient Flows for Maximum Mean Discrepancies with Riesz Kernels

Authors: Fabian Altekrüger, Johannes Hertrich, Gabriele Steidl

Abstract: Wasserstein gradient flows of maximum mean discrepancy (MMD) functionals with non-smooth Riesz kernels show a rich structure as singular measures can become absolutely continuous ones and conversely. In this paper we contribute to the understanding of such flows. We propose to approximate the backward scheme of Jordan, Kinderlehrer and Otto for computing such Wasserstein gradient flows as well as a forward scheme for so-called Wasserstein steepest descent flows by neural networks (NNs). Since we cannot restrict ourselves to absolutely continuous measures, we have to deal with transport plans and velocity plans instead of usual transport maps and velocity fields. Indeed, we approximate the disintegration of both plans by generative NNs which are learned with respect to appropriate loss functions. In order to evaluate the quality of both neural schemes, we benchmark them on the interaction energy. Here we provide analytic formulas for Wasserstein schemes starting at a Dirac measure and show their convergence as the time step size tends to zero. Finally, we illustrate our neural MMD flows by numerical examples.

Submitted 21 March, 2024; v1 submitted 27 January, 2023; originally announced January 2023.

Comments: Accepted at ICML 2023

Journal ref: Proceedings of the 40th International Conference on Machine Learning, PMLR 202:664-690, 2023

arXiv:2301.04441 [pdf, other]

Wasserstein Gradient Flows of the Discrepancy with Distance Kernel on the Line

Authors: Johannes Hertrich, Robert Beinert, Manuel Gräf, Gabriele Steidl

Abstract: This paper provides results on Wasserstein gradient flows between measures on the real line. Utilizing the isometric embedding of the Wasserstein space $\mathcal P_2(\mathbb R)$ into the Hilbert space $L_2((0,1))$, Wasserstein gradient flows of functionals on $\mathcal P_2(\mathbb R)$ can be characterized as subgradient flows of associated functionals on $L_2((0,1))$. For the maximum mean discrepancy functional $\mathcal F_\nu := \mathcal D^2_K(\cdot, \nu)$ with the non-smooth negative distance kernel $K(x,y) = -|x-y|$, we deduce a formula for the associated functional. This functional appears to be convex, and we show that $\mathcal F_\nu$ is convex along (generalized) geodesics. For the Dirac measure $\nu = \delta_q$, $q \in \mathbb R$ as end point of the flow, this enables us to determine the Wasserstein gradient flows analytically. Various examples of Wasserstein gradient flows are given for illustration.

Submitted 11 January, 2023; originally announced January 2023.

Comments: arXiv admin note: text overlap with arXiv:2211.01804

arXiv:2211.01804 [pdf, other]

doi 10.1016/j.jmaa.2023.127829

Wasserstein Steepest Descent Flows of Discrepancies with Riesz Kernels

Authors: Johannes Hertrich, Manuel Gräf, Robert Beinert, Gabriele Steidl

Abstract: The aim of this paper is twofold. Based on the geometric Wasserstein tangent space, we first introduce Wasserstein steepest descent flows. These are locally absolutely continuous curves in the Wasserstein space whose tangent vectors point into a steepest descent direction of a given functional. This allows the use of Euler forward schemes instead of Jordan--Kinderlehrer--Otto schemes. For $λ$-convex functionals, we show that Wasserstein steepest descent flows are an equivalent characterization of Wasserstein gradient flows. The second aim is to study Wasserstein flows of the maximum mean discrepancy with respect to certain Riesz kernels. The crucial part is hereby the treatment of the interaction energy. Although it is not $λ$-convex along generalized geodesics, we give analytic expressions for Wasserstein steepest descent flows of the interaction energy starting at Dirac measures. In contrast to smooth kernels, the particle may explode, i.e., a Dirac measure becomes a non-Dirac one. The computation of steepest descent flows amounts to finding equilibrium measures with external fields, which nicely links Wasserstein flows of interaction energies with potential theory. Finally, we provide numerical simulations of Wasserstein steepest descent flows of discrepancies.

Submitted 4 October, 2023; v1 submitted 2 November, 2022; originally announced November 2022.

Journal ref: Journal of Mathematical Analysis and Applications, vol. 531, 127829, 2024

arXiv:2209.08086 [pdf, other]

doi 10.1090/mcom/3869

Motion Detection in Diffraction Tomography by Common Circle Methods

Authors: Michael Quellmalz, Peter Elbau, Otmar Scherzer, Gabriele Steidl

Abstract: The method of common lines is a well-established reconstruction technique in cryogenic electron microscopy (cryo-EM), which can be used to extract the relative orientations of an object given tomographic projection images from different directions. In this paper, we deal with an analogous problem in optical diffraction tomography. Based on the Fourier diffraction theorem, we show that rigid motions of the object, i.e., rotations and translations, can be determined by detecting common circles in the Fourier-transformed data. We introduce two methods to identify common circles. The first one is motivated by the common line approach for projection images and detects the relative orientation by parameterizing the common circles in the two images. The second one assumes a smooth motion over time and calculates the angular velocity of the rotational motion via an infinitesimal version of the common circle method. Interestingly, using the stereographic projection, both methods can be reformulated as common line methods, but these lines are, in contrast to those used in cryo-EM, not confined to pass through the origin and allow for a full reconstruction of the relative orientations. Numerical proof-of-concept examples demonstrate the performance of our reconstruction methods.

Submitted 8 May, 2023; v1 submitted 16 September, 2022; originally announced September 2022.

Comments: 35 pages, 13 figures

Journal ref: Mathematics of Computation 93(346), pages 747-784, March 2024

arXiv:2205.12021 [pdf, other]

doi 10.1088/1361-6420/acce5e

PatchNR: Learning from Very Few Images by Patch Normalizing Flow Regularization

Authors: Fabian Altekrüger, Alexander Denker, Paul Hagemann, Johannes Hertrich, Peter Maass, Gabriele Steidl

Abstract: Learning neural networks using only little available information is an important ongoing research topic with tremendous potential for applications. In this paper, we introduce a powerful regularizer for the variational modeling of inverse problems in imaging. Our regularizer, called patch normalizing flow regularizer (patchNR), involves a normalizing flow learned on small patches of very few images. In particular, the training is independent of the considered inverse problem such that the same regularizer can be applied for different forward operators acting on the same class of images. By investigating the distribution of patches versus those of the whole image class, we prove that our model is indeed a MAP approach. Numerical examples for low-dose and limited-angle computed tomography (CT) as well as superresolution of material images demonstrate that our method provides very high quality results. The training set consists of just six images for CT and one image for superresolution. Finally, we combine our patchNR with ideas from internal learning for performing superresolution of natural images directly from the low-resolution observation without knowledge of any high-resolution image.

Submitted 21 November, 2022; v1 submitted 24 May, 2022; originally announced May 2022.

Journal ref: Inverse Problems, Volume 39, Number 6, 2023

arXiv:2205.09006 [pdf, ps, other]

On Assignment Problems Related to Gromov-Wasserstein Distances on the Real Line

Authors: Robert Beinert, Cosmas Heiss, Gabriele Steidl

Abstract: Let $x_1 < \dots < x_n$ and $y_1 < \dots < y_n$, $n \in \mathbb N$, be real numbers. We show by an example that the assignment problem $$ \max_{σ\in S_n} F_σ(x,y) := \frac12 \sum_{i,k=1}^n |x_i - x_k|^α\, |y_{σ(i)} - y_{σ(k)}|^α, \quad α>0, $$ is in general neither solved by the identical permutation (id) nor the anti-identical permutation (a-id) if $n > 2 +2^α$. Indeed, the above maximum can be, depending on the number of points, arbitrarily far away from $F_\text{id}(x,y)$ and $F_\text{a-id}(x,y)$. The motivation to deal with such assignment problems came from their relation to Gromov-Wasserstein divergences, which have recently attracted a lot of attention.

Submitted 18 May, 2022; originally announced May 2022.

arXiv:2205.06725 [pdf, other]

Multi-Marginal Gromov-Wasserstein Transport and Barycenters

Authors: Florian Beier, Robert Beinert, Gabriele Steidl

Abstract: Gromov-Wasserstein (GW) distances are combinations of Gromov-Hausdorff and Wasserstein distances that allow the comparison of two different metric measure spaces (mm-spaces). Due to their invariance under measure- and distance-preserving transformations, they are well suited for many applications in graph and shape analysis. In this paper, we introduce the concept of multi-marginal GW transport between a set of mm-spaces as well as its regularized and unbalanced versions. As a special case, we discuss multi-marginal fused variants, which combine the structure information of an mm-space with label information from an additional label space. To tackle the new formulations numerically, we consider the bi-convex relaxation of the multi-marginal GW problem, which is tight in the balanced case if the cost function is conditionally negative definite. The relaxed model can be solved by an alternating minimization, where each step can be performed by a multi-marginal Sinkhorn scheme. We show relations of our multi-marginal GW problem to (unbalanced, fused) GW barycenters and present various numerical results, which indicate the potential of the concept.

Submitted 13 July, 2023; v1 submitted 13 May, 2022; originally announced May 2022.

MSC Class: 65K10; 49M20; 28A35; 28A33

arXiv:2203.04851 [pdf, ps, other]

Quasi $α$-Firmly Nonexpansive Mappings in Wasserstein Spaces

Authors: Arian Bërdëllima, Gabriele Steidl

Abstract: This paper introduces the concept of quasi $α$-firmly nonexpansive mappings in Wasserstein spaces over $\mathbb R^d$ and analyzes properties of these mappings. We prove that for quasi $α$-firmly nonexpansive mappings satisfying a certain quadratic growth condition, the fixed point iterations converge in the narrow topology. As a byproduct, we will get the known convergence of the proximal point algorithm in Wasserstein spaces. We apply our results to show for the first time that cyclic proximal point algorithms for minimizing the sum of certain functionals on Wasserstein spaces converge under appropriate assumptions.

Submitted 1 September, 2022; v1 submitted 9 March, 2022; originally announced March 2022.

MSC Class: 46T99; 47H10; 47J26; 28A33

arXiv:2112.11964 [pdf, other]

doi 10.1109/TIP.2022.3221286

On a linear Gromov-Wasserstein distance

Authors: Florian Beier, Robert Beinert, Gabriele Steidl

Abstract: Gromov-Wasserstein distances are generalizations of Wasserstein distances, which are invariant under distance-preserving transformations. Although a simplified version of optimal transport in Wasserstein spaces, called linear optimal transport (LOT), was successfully used in practice, there does not exist a notion of linear Gromov-Wasserstein distances so far. In this paper, we propose a definition of linear Gromov-Wasserstein distances. We motivate our approach by a generalized LOT model, which is based on barycentric projection maps of transport plans. Numerical examples illustrate that the linear Gromov-Wasserstein distances, similarly to LOT, can replace the expensive computation of pairwise Gromov-Wasserstein distances in applications like shape classification.

Submitted 5 July, 2022; v1 submitted 22 December, 2021; originally announced December 2021.

MSC Class: 65K10; 28A33; 28A50; 68W25

arXiv:2111.12506 [pdf, other]

doi 10.1017/9781009331012

Generalized Normalizing Flows via Markov Chains

Authors: Paul Hagemann, Johannes Hertrich, Gabriele Steidl

Abstract: Normalizing flows, diffusion normalizing flows and variational autoencoders are powerful generative models. This chapter provides a unified framework to handle these approaches via Markov chains. We consider stochastic normalizing flows as a pair of Markov chains fulfilling some properties and show how many state-of-the-art models for data generation fit into this framework. Indeed, numerical simulations show that including stochastic layers improves the expressivity of the network and allows for generating multimodal distributions from unimodal ones. The Markov chains point of view enables us to couple both deterministic layers as invertible neural networks and stochastic layers as Metropolis-Hastings layers, Langevin layers, variational autoencoders and diffusion normalizing flows in a mathematically sound way. Our framework establishes a useful mathematical tool to combine the various approaches.

Submitted 20 July, 2022; v1 submitted 24 November, 2021; originally announced November 2021.

Comments: arXiv admin note: text overlap with arXiv:2109.11375

arXiv:2109.11375 [pdf, other]

doi 10.1137/21M1450604

Stochastic Normalizing Flows for Inverse Problems: a Markov Chains Viewpoint

Authors: Paul Hagemann, Johannes Hertrich, Gabriele Steidl

Abstract: To overcome topological constraints and improve the expressiveness of normalizing flow architectures, Wu, Köhler and Noé introduced stochastic normalizing flows, which combine deterministic, learnable flow transformations with stochastic sampling methods. In this paper, we consider stochastic normalizing flows from a Markov chain point of view. In particular, we replace transition densities by general Markov kernels and establish proofs via Radon-Nikodym derivatives, which allows us to incorporate distributions without densities in a sound way. Further, we generalize the results for sampling from posterior distributions as required in inverse problems. The performance of the proposed conditional stochastic normalizing flow is demonstrated by numerical examples.

Submitted 7 February, 2022; v1 submitted 23 September, 2021; originally announced September 2021.

Journal ref: SIAM/ASA Journal on Uncertainty Quantification, vol. 10 (3), pp. 1162-1190, 2022

arXiv:2108.00227 [pdf, other]

On the Dynamical System of Principal Curves in $\mathbb R^d$

Authors: Robert Beinert, Arian Bërdëllima, Manuel Gräf, Gabriele Steidl

Abstract: Principal curves are natural generalizations of principal lines arising as first principal components in the Principal Component Analysis. They can be characterized from a stochastic point of view as so-called self-consistent curves based on the conditional expectation and from the variational-calculus point of view as saddle points of the expected difference of a random variable and its projection onto some curve, where the current curve acts as argument of the energy functional. Beyond that, Duchamp and Stützle (1993, 1996) showed that planar curves can be computed as solutions of a system of ordinary differential equations. The aim of this paper is to generalize this characterization of principal curves to $\mathbb R^d$ with $d \ge 3$. Having derived such a dynamical system, we provide several examples for principal curves related to uniform distribution on certain domains in $\mathbb R^3$.

Submitted 31 July, 2021; originally announced August 2021.

arXiv:2106.05645 [pdf, other]

An Image Registration Model in Electron Backscatter Diffraction

Authors: Manuel Gräf, Sebastian Neumayer, Ralf Hielscher, Gabriele Steidl, Moritz Liesegang, Tilman Beck

Abstract: Recently, variational methods were successfully applied for computing the optical flow in gray and RGB-valued image sequences. A crucial assumption in these models is that pixel-values do not change under transformations. Nowadays, modern image acquisition techniques such as electron backscatter diffraction (EBSD), which is used in material sciences, can capture images with values in nonlinear spaces. Here, the image values belong to the quotient space $\text{SO}(3)/ \mathcal S$ of the special orthogonal group modulo the discrete symmetry group of the crystal. For such data, the assumption that pixel-values remain unchanged under transformations appears to be no longer valid. Hence, we propose a variational model for determining the optical flow in $\text{SO}(3)/\mathcal S$-valued image sequences, taking the dependence of pixel-values on the transformation into account. More precisely, the data is transformed according to the rotation part in the polar decomposition of the Jacobian of the transformation. To model non-smooth transformations without obtaining so-called staircasing effects, we propose to use a total generalized variation-like prior. Then, we prove existence of a minimizer for our model and explain how it can be discretized and minimized by a primal-dual algorithm. Numerical examples illustrate the performance of our method.

Submitted 3 September, 2021; v1 submitted 10 June, 2021; originally announced June 2021.

arXiv:2105.14893 [pdf, other]

doi 10.1553/etna_vol55s142

Sparse Mixture Models inspired by ANOVA Decompositions

Authors: Johannes Hertrich, Fatima Antarou Ba, Gabriele Steidl

Abstract: Inspired by the analysis of variance (ANOVA) decomposition of functions we propose a Gaussian-Uniform mixture model on the high-dimensional torus which relies on the assumption that the function we wish to approximate can be well explained by limited variable interactions. We consider three approaches, namely wrapped Gaussians, diagonal wrapped Gaussians and products of von Mises distributions. The sparsity of the mixture model is ensured by the fact that its summands are products of Gaussian-like density functions acting on low dimensional spaces and uniform probability densities defined on the remaining directions. To learn such a sparse mixture model from given samples, we propose an objective function consisting of the negative log-likelihood function of the mixture model and a regularizer that penalizes the number of its summands. For minimizing this functional we combine the Expectation Maximization algorithm with a proximal step that takes the regularizer into account. To decide which summands of the mixture model are important, we apply a Kolmogorov-Smirnov test. Numerical examples demonstrate the performance of our approach.

Submitted 1 October, 2021; v1 submitted 31 May, 2021; originally announced May 2021.

arXiv:2104.07990 [pdf, other]

doi 10.1088/1361-6420/ac2749

Fourier reconstruction for diffraction tomography of an object rotated into arbitrary orientations

Authors: Clemens Kirisits, Michael Quellmalz, Monika Ritsch-Marte, Otmar Scherzer, Eric Setterqvist, Gabriele Steidl

Abstract: In this paper, we study the mathematical imaging problem of optical diffraction tomography (ODT) for the scenario of a microscopic rigid particle rotating in a trap created, for instance, by acoustic or optical forces. Under the influence of the inhomogeneous forces the particle carries out a time-dependent smooth, but complicated motion described by a set of affine transformations. The rotation of the particle enables one to record optical images from a wide range of angles, which largely eliminates the "missing cone problem" in optics. This advantage, however, comes at the price that the rotation axis in this scenario is not fixed, but continuously undergoes some variations, and that the rotation angles are not equally spaced, which is in contrast to standard tomographic reconstruction assumptions. In the present work, we assume that the time-dependent motion parameters are known, and that the particle's scattering potential is compatible with making the first order Born or Rytov approximation. We prove a Fourier diffraction theorem and derive novel backprojection formulae for the reconstruction of the scattering potential, which depends on the refractive index distribution inside the object, taking its complicated motion into account. This provides the basis for solving the ODT problem with an efficient non-uniform discrete Fourier transform.

Submitted 16 April, 2021; originally announced April 2021.

Journal ref: Inverse Problems 37 (2021) 115002

arXiv:2104.05304 [pdf, ps, other]

On $α$-Firmly Nonexpansive Operators in $r$-Uniformly Convex Spaces

Authors: Arian Bërdëllima, Gabriele Steidl

Abstract: We introduce the class of $α$-firmly nonexpansive and quasi $α$-firmly nonexpansive operators on $r$-uniformly convex Banach spaces. This extends the existing notion from Hilbert spaces, where $α$-firmly nonexpansive operators coincide with so-called $α$-averaged operators. For our more general setting, we show that $α$-averaged operators form a subset of $α$-firmly nonexpansive operators. We develop some basic calculus rules for (quasi) $α$-firmly nonexpansive operators. In particular, we show that their compositions and convex combinations are again (quasi) $α$-firmly nonexpansive. Moreover, we will see that quasi $α$-firmly nonexpansive operators enjoy the asymptotic regularity property. Then, based on Browder's demiclosedness principle, we prove for $r$-uniformly convex Banach spaces that the weak cluster points of the iterates $x_{n+1}:=Tx_{n}$ belong to the fixed point set $\text{Fix} T$ whenever the operator $T$ is nonexpansive and quasi $α$-firmly nonexpansive. If additionally the space has a Fréchet differentiable norm or satisfies Opial's property, then these iterates converge weakly to some element in $\text{Fix} T$. Further, the projections $P_{\text{Fix} T}x_n$ converge strongly to this weak limit point. Finally, we give three illustrative examples, where our theory can be applied, namely from infinite dimensional neural networks, semigroup theory, and contractive projections in $L_p$, $p \in (1,\infty) \backslash \{2\}$ spaces on probability measure spaces.

Submitted 15 April, 2021; v1 submitted 12 April, 2021; originally announced April 2021.

MSC Class: 46B25; 46B20; 47H10; 47J26; 47H09

arXiv:2103.10854 [pdf, other]

doi 10.1007/s10851-022-01126-7

Unbalanced Multi-Marginal Optimal Transport

Authors: Florian Beier, Johannes von Lindheim, Sebastian Neumayer, Gabriele Steidl

Abstract: Entropy regularized optimal transport and its multi-marginal generalization have attracted increasing attention in various applications, in particular due to efficient Sinkhorn-like algorithms for computing optimal transport plans. However, it is often desirable that the marginals of the optimal transport plan do not match the given measures exactly, which led to the introduction of the so-called unbalanced optimal transport. Since unbalanced methods were not examined for the multi-marginal setting so far, we address this topic in the present paper. More precisely, we introduce the unbalanced multi-marginal optimal transport problem and its dual, and show that a unique optimal transport plan exists under mild assumptions. Further, we generalize the Sinkhorn algorithm for regularized unbalanced optimal transport to the multi-marginal setting and prove its convergence. If the cost function decouples according to a tree, the iterates can be computed efficiently. At the end, we discuss three applications of our framework, namely two barycenter problems and a transfer operator approach, where we establish a relation between the barycenter problem and the multi-marginal optimal transport with an appropriate tree-structured cost function.

Submitted 29 September, 2022; v1 submitted 19 March, 2021; originally announced March 2021.

arXiv:2103.07686 [pdf, ps, other]

On approximate operator representations of sequences in Banach spaces

Authors: Ole Christensen, Marzieh Hasannasab, Gabriele Steidl

Abstract: Generalizing results by Halperin et al., Grivaux recently showed that any linearly independent sequence $\{f_k\}_{k=1}^\infty$ in a separable Banach space $X$ can be represented as a suborbit $\{T^{α(k)}\varphi\}_{k=1}^\infty$ of some bounded operator $T: X\to X.$ In general, the operator $T$ and the powers $α(k)$ are not known explicitly. In this paper we consider approximate representations $\{f_k\}_{k=1}^\infty \approx \{T^{α(k)}\varphi\}_{k=1}^\infty$ of certain types of sequences $\{f_k\}_{k=1}^\infty.$ In contrast to the results in the literature we are able to be very explicit about the operator $T$ and suitable powers $α(k),$ and we do not need to assume that the sequences are linearly independent. The exact meaning of approximation is defined in a way such that $\{T^{α(k)}\varphi\}_{k=1}^\infty$ keeps essential features of $\{f_k\}_{k=1}^\infty,$ e.g., in the setting of atomic decompositions and Banach frames. We will present two different approaches. The first approach is universal, in the sense that it applies in general Banach spaces; the technical conditions are typically easy to verify in sequence spaces, but are more complicated in function spaces. For this reason we present a second approach, directly tailored to the setting of Banach function spaces. A number of examples prove that the results apply in arbitrary weighted $\ell^p$-spaces and $L^p$-spaces.

Submitted 16 March, 2021; v1 submitted 13 March, 2021; originally announced March 2021.

MSC Class: 42C40

arXiv:2101.11544 [pdf, other]

Super-Resolution for Doubly-Dispersive Channel Estimation

Authors: Robert Beinert, Peter Jung, Gabriele Steidl, Tom Szollmann

Abstract: In this work we consider the problem of identification and reconstruction of doubly-dispersive channel operators which are given by finite linear combinations of time-frequency shifts. Such operators arise as time-varying linear systems, for example in radar and wireless communications. In particular, for information transmission in highly non-stationary environments the channel needs to be estimated quickly with identification signals of short duration, and for vehicular applications simultaneous high-resolution radar is desired as well. We consider the time-continuous setting and prove an exact resampling reformulation of the involved channel operator when applied to a trigonometric polynomial as identifier in terms of sparse linear combinations of real-valued atoms. Motivated by recent works of Heckel et al., we present an exact approach for off-the-grid superresolution which allows one to perform the identification with realizable signals having compact support. Then we show how an alternating descent conditional gradient algorithm can be adapted to solve the reformulated problem. Numerical examples demonstrate the performance of this algorithm, in particular in comparison with a simple adaptive grid refinement strategy and an orthogonal matching pursuit algorithm.

Submitted 27 January, 2021; originally announced January 2021.

MSC Class: 47A62; 65R30; 65T99; 94A20

arXiv:2012.13460 [pdf, ps, other]

Continuous Wavelet Frames on the Sphere: The Group-Theoretic Approach Revisited

Authors: S. Dahlke, F. De Mari, E. De Vito, M. Hansen, M. Hasannasab, M. Quellmalz, G. Steidl, G. Teschke

Abstract: In \cite{AV99}, Antoine and Vandergheynst propose a group-theoretic approach to continuous wavelet frames on the sphere. The frame is constructed from a single so-called admissible function by applying the unitary operators associated to a representation of the Lorentz group, which is square-integrable modulo the nilpotent factor of the Iwasawa decomposition. We prove necessary and sufficient conditions for functions on the sphere, which ensure that the corresponding system is a frame. We strengthen a similar result in \cite{AV99} by providing a complete and detailed proof.

Submitted 24 December, 2020; originally announced December 2020.

arXiv:2011.02281 [pdf, other]

Convolutional Proximal Neural Networks and Plug-and-Play Algorithms

Authors: Johannes Hertrich, Sebastian Neumayer, Gabriele Steidl

Abstract: In this paper, we introduce convolutional proximal neural networks (cPNNs), which are by construction averaged operators. For filters of full length, we propose a stochastic gradient descent algorithm on a submanifold of the Stiefel manifold to train cPNNs. In case of filters with limited length, we design algorithms for minimizing functionals that approximate the orthogonality constraints imposed on the operators by penalizing the least squares distance to the identity operator. Then, we investigate how scaled cPNNs with a prescribed Lipschitz constant can be used for denoising signals and images, where the achieved quality depends on the Lipschitz constant. Finally, we apply cPNN-based denoisers within a Plug-and-Play (PnP) framework and provide convergence results for the corresponding PnP forward-backward splitting algorithm based on an oracle construction.

Submitted 4 November, 2020; originally announced November 2020.

arXiv:2009.07520 [pdf, other]

doi 10.3934/ipi.2021053

PCA Reduced Gaussian Mixture Models with Applications in Superresolution

Authors: Johannes Hertrich, Dang Phuong Lan Nguyen, Jean-Francois Aujol, Dominique Bernard, Yannick Berthoumieu, Abdellatif Saadaldin, Gabriele Steidl

Abstract: Despite the rapid development of computational hardware, the treatment of large and high dimensional data sets is still a challenging problem. This paper provides a twofold contribution to the topic. First, we propose a Gaussian Mixture Model in conjunction with a reduction of the dimensionality of the data in each component of the model by principal component analysis, called PCA-GMM. To learn the (low dimensional) parameters of the mixture model we propose an EM algorithm whose M-step requires the solution of constrained optimization problems. Fortunately, these constrained problems do not depend on the usually large number of samples and can be solved efficiently by an (inertial) proximal alternating linearized minimization algorithm. Second, we apply our PCA-GMM for the superresolution of 2D and 3D material images based on the approach of Sandeep and Jacob. Numerical results confirm the moderate influence of the dimensionality reduction on the overall superresolution result.

Submitted 6 May, 2021; v1 submitted 16 September, 2020; originally announced September 2020.

Journal ref: Inverse Problems and Imaging, vol. 16, pp. 341-366, 2022

arXiv:2006.16085 [pdf, other]

doi 10.1016/j.physd.2021.132980

Transfer Operators from Optimal Transport Plans for Coherent Set Detection

Authors: Péter Koltai, Johannes von Lindheim, Sebastian Neumayer, Gabriele Steidl

Abstract: The topic of this study lies at the intersection of two fields. One is concerned with analyzing transport phenomena in complicated flows. For this purpose, we use so-called coherent sets: non-dispersing, possibly moving regions in the flow's domain. The other is concerned with reconstructing a flow field from observing its action on a measure, which we address by optimal transport. We show that the framework of optimal transport is well suited for delivering the formal requirements on which a coherent-set analysis can be based. The necessary noise-robustness requirement of coherence can be matched by the computationally efficient concept of unbalanced regularized optimal transport. Moreover, the applied regularization can be interpreted as an optimal way of retrieving the full dynamics given the extremely restricted information of an initial and a final distribution of particles moving according to Brownian motion.

Submitted 29 April, 2021; v1 submitted 29 June, 2020; originally announced June 2020.

arXiv:2005.05449 [pdf, other]

doi 10.1007/s10851-021-01019-1

Robust PCA via Regularized REAPER with a Matrix-Free Proximal Algorithm

Authors: Robert Beinert, Gabriele Steidl

Abstract: Principal component analysis (PCA) is known to be sensitive to outliers, so that various robust PCA variants were proposed in the literature. A recent model, called REAPER, aims to find the principal components by solving a convex optimization problem. Usually the number of principal components must be determined in advance and the minimization is performed over symmetric positive semi-definite matrices having the size of the data, although the number of principal components is substantially smaller. This prohibits its use if the dimension of the data is large, which is often the case in image processing. In this paper, we propose a regularized version of REAPER which enforces the sparsity of the number of principal components by penalizing the nuclear norm of the corresponding orthogonal projector. This has the advantage that only an upper bound on the number of principal components is required. Our second contribution is a matrix-free algorithm to find a minimizer of the regularized REAPER which is also suited for high dimensional data. The algorithm couples a primal-dual minimization approach with a thick-restarted Lanczos process. As a side result, we discuss the topic of the bias in robust PCA. Numerical examples demonstrate the performance of our algorithm.

Submitted 11 May, 2020; originally announced May 2020.

MSC Class: 58C05; 62H25; 65K10

Journal ref: J Math Imaging Vis, 2021

arXiv:2005.02204 [pdf, other]

doi 10.1007/s43670-022-00021-x

Inertial Stochastic PALM (iSPALM) and Applications in Machine Learning

Authors: Johannes Hertrich, Gabriele Steidl

Abstract: Inertial algorithms for minimizing nonsmooth and nonconvex functions, such as the inertial proximal alternating linearized minimization algorithm (iPALM), have demonstrated their superiority with respect to computation time over their non-inertial variants. In many problems in imaging and machine learning, the objective functions have a special form involving huge data, which encourages the application of stochastic algorithms. While algorithms based on stochastic gradient descent are still used in the majority of applications, recently also stochastic algorithms for minimizing nonsmooth and nonconvex functions were proposed. In this paper, we derive an inertial variant of a stochastic PALM algorithm with variance-reduced gradient estimator, called iSPALM, and prove linear convergence of the algorithm under certain assumptions. Our inertial approach can be seen as a generalization of momentum methods, widely used to speed up and stabilize optimization algorithms, in particular in machine learning, to nonsmooth problems. Numerical experiments for learning the weights of a so-called proximal neural network and the parameters of Student-t mixture models show that our new algorithm outperforms both stochastic PALM and its deterministic counterparts.

Submitted 21 December, 2020; v1 submitted 5 May, 2020; originally announced May 2020.

Journal ref: Sampling Theory, Signal Processing, and Data Analysis, vol. 20, no. 4, 2022

arXiv:2004.07330 [pdf, other]

A New Constrained Optimization Model for Solving the Nonsymmetric Stochastic Inverse Eigenvalue Problem

Authors: Gabriele Steidl, Maximilian Winkler

Abstract: The stochastic inverse eigenvalue problem aims to reconstruct a stochastic matrix from its spectrum. While there exists a large literature on the existence of solutions for special settings, there are only a few numerical solution methods available so far. Recently, Zhao et al. (2016) proposed a constrained optimization model on the manifold of so-called isospectral matrices and adapted a modified Polak-Ribière-Polyak conjugate gradient method to the geometry of this manifold. However, not every stochastic matrix is an isospectral one and the model from Zhao et al. is based on the assumption that for each stochastic matrix there exists a (possibly different) isospectral, stochastic matrix with the same spectrum. We are not aware of such a result in the literature, but will see that the claim is at least true for $3 \times 3$ matrices. In this paper, we suggest extending the above model by considering matrices which differ from isospectral ones only by multiplication with a block diagonal matrix with $2 \times 2$ blocks from the special linear group $SL(2)$, where the number of blocks is given by the number of pairs of complex-conjugate eigenvalues. Every stochastic matrix can be written in such a form, which was not the case for the form of the isospectral matrices. We prove that our model has a minimizer and show how the Polak-Ribière-Polyak conjugate gradient method works on the corresponding more general manifold. We demonstrate by numerical examples that the new, more general method performs similarly to the one from Zhao et al.

Submitted 15 April, 2020; originally announced April 2020.

arXiv:2002.01189 [pdf, other]

From Optimal Transport to Discrepancy

Authors: Sebastian Neumayer, Gabriele Steidl

Abstract: A common way to quantify the "distance" between measures is via their discrepancy, also known as maximum mean discrepancy (MMD). Discrepancies are related to Sinkhorn divergences $S_\varepsilon$ with appropriate cost functions as $\varepsilon \to \infty$. In the opposite direction, if $\varepsilon \to 0$, Sinkhorn divergences approach another important distance between measures, namely the Wasserstein distance or more generally optimal transport "distance". In this chapter, we investigate the limiting process for arbitrary measures on compact sets and Lipschitz continuous cost functions. In particular, we are interested in the behavior of the corresponding optimal potentials $\hat \varphi_\varepsilon$, $\hat \psi_\varepsilon$ and $\hat \varphi_K$ appearing in the dual formulation of the Sinkhorn divergences and discrepancies, respectively. While part of the results are known, we provide rigorous proofs for some relations which we have not found in this generality in the literature. Finally, we demonstrate the limiting process by numerical examples and show the behavior of the distances when used for the approximation of measures by point measures in a process called dithering.

Submitted 24 August, 2020; v1 submitted 4 February, 2020; originally announced February 2020.

arXiv:1912.10480 [pdf, other]

doi 10.1007/s00041-020-09761-7

Parseval Proximal Neural Networks

Authors: Marzieh Hasannasab, Johannes Hertrich, Sebastian Neumayer, Gerlind Plonka, Simon Setzer, Gabriele Steidl

Abstract: The aim of this paper is twofold. First, we show that a certain concatenation of a proximity operator with an affine operator is again a proximity operator on a suitable Hilbert space. Second, we use our findings to establish so-called proximal neural networks (PNNs) and stable tight frame proximal neural networks. Let $\mathcal H$ and $\mathcal K$ be real Hilbert spaces, $b\in\mathcal K$ and $T\in\mathcal{B}(\mathcal H,\mathcal K)$ have closed range and Moore-Penrose inverse $T^\dagger$. Based on the well-known characterization of proximity operators by Moreau, we prove that for any proximity operator $\text{Prox}\colon\mathcal K\to\mathcal K$ the operator $T^\dagger\,\text{Prox} (T\cdot +b)$ is a proximity operator on $\mathcal H$ equipped with a suitable norm. In particular, it follows for the frequently applied soft shrinkage operator $\text{Prox} = S_\lambda\colon\ell_2 \rightarrow\ell_2$ and any frame analysis operator $T\colon\mathcal H\to\ell_2$ that the frame shrinkage operator $T^\dagger\, S_\lambda\,T$ is a proximity operator on a suitable Hilbert space. The concatenation of proximity operators on $\mathbb R^d$ equipped with different norms establishes a PNN. If the network arises from tight frame analysis or synthesis operators, then it forms an averaged operator. Hence, it has Lipschitz constant 1 and belongs to the class of so-called Lipschitz networks, which were recently applied to defend against adversarial attacks. Moreover, due to their averaging property, PNNs can be used within so-called Plug-and-Play algorithms with convergence guarantee. In case of Parseval frames, we call the networks Parseval proximal neural networks (PPNNs). Then, the involved linear operators are in a Stiefel manifold and corresponding minimization methods can be applied for training. Finally, some proof-of-concept examples demonstrate the performance of PPNNs.

Submitted 17 April, 2020; v1 submitted 19 December, 2019; originally announced December 2019.

Comments: arXiv admin note: text overlap with arXiv:1910.02843

Journal ref: J. Fourier Anal. Appl. 26 (59) (2020)

arXiv:1910.06623 [pdf, other]

doi 10.1007/s11075-020-00959-w

Alternatives to the EM Algorithm for ML-Estimation of Location, Scatter Matrix and Degree of Freedom of the Student-$t$ Distribution

Authors: Marzieh Hasannasab, Johannes Hertrich, Friederike Laus, Gabriele Steidl

Abstract: In this paper, we consider maximum likelihood estimations of the degree of freedom parameter $\nu$, the location parameter $\mu$ and the scatter matrix $\Sigma$ of the multivariate Student-$t$ distribution. In particular, we are interested in estimating the degree of freedom parameter $\nu$ that determines the tails of the corresponding probability density function and was rarely considered in detail in the literature so far. We prove that under certain assumptions a minimizer of the negative log-likelihood function exists, where we have to take special care of the case $\nu\rightarrow \infty$, for which the Student-$t$ distribution approaches the Gaussian distribution. As alternatives to the classical EM algorithm we propose three other algorithms which cannot be interpreted as EM algorithms. For fixed $\nu$, the first algorithm is an accelerated EM algorithm known from the literature. However, since we do not fix $\nu$, we cannot apply standard convergence results for the EM algorithm. The other two algorithms differ from this algorithm in the iteration step for $\nu$. We show how the objective function behaves for the different updates of $\nu$ and prove for all three algorithms that it decreases in each iteration step. We compare the algorithms as well as some accelerated versions by numerical simulation and apply one of them for estimating the degree of freedom parameter in images corrupted by Student-$t$ noise.

Submitted 23 March, 2020; v1 submitted 15 October, 2019; originally announced October 2019.

Journal ref: Numerical Algorithms, vol. 87, pp. 77-118, 2021

arXiv:1910.06124 [pdf, other]

Curve Based Approximation of Measures on Manifolds by Discrepancy Minimization

Authors: Martin Ehler, Manuel Gräf, Sebastian Neumayer, Gabriele Steidl

Abstract: The approximation of probability measures on compact metric spaces and in particular on Riemannian manifolds by atomic or empirical ones is a classical task in approximation and complexity theory with a wide range of applications. Instead of point measures we are concerned with the approximation by measures supported on Lipschitz curves. Special attention is paid to push-forward measures of Lebesgue measures on the interval by such curves. Using the discrepancy as distance between measures, we prove optimal approximation rates in terms of Lipschitz constants of curves. Having established the theoretical convergence rates, we are interested in the numerical minimization of the discrepancy between a given probability measure and the set of push-forward measures of Lebesgue measures on the interval by Lipschitz curves. We present numerical examples for measures on the 2- and 3-dimensional torus, the 2-sphere, the rotation group on $\mathbb R^3$ and the Grassmannian of all 2-dimensional linear subspaces of $\mathbb{R}^4$. Our algorithm of choice is a conjugate gradient method on these manifolds which incorporates second-order information. For efficiently computing the gradients and the Hessians within the algorithm, we approximate the given measures by truncated Fourier series and use fast Fourier transform techniques on these manifolds.

Submitted 11 January, 2021; v1 submitted 14 October, 2019; originally announced October 2019.

arXiv:1910.02843 [pdf, ps, other]

Frame Soft Shrinkage as Proximity Operator

Authors: Marzieh Hasannasab, Sebastian Neumayer, Gerlind Plonka, Simon Setzer, Gabriele Steidl, Jakob Alexander Geppert

Abstract: Let $\mathcal H$ and $\mathcal K$ be real Hilbert spaces and $T \in \mathcal{B} (\mathcal H,\mathcal K)$ an injective operator with closed range and Moore-Penrose inverse $T^\dagger$. Based on the well-known characterization of proximity operators by Moreau, we prove that for any proximity operator $\text{Prox} \colon \mathcal K \to \mathcal K$ the operator $T^\dagger \, \text{Prox} \, T$ is a proximity operator on the linear space $\mathcal H$ equipped with a suitable norm. In particular, it follows for the frequently applied soft shrinkage operator $\text{Prox} = S_\lambda\colon \ell_2 \rightarrow \ell_2$ and any frame analysis operator $T\colon \mathcal H \to \ell_2$ that the frame shrinkage operator $T^\dagger\, S_\lambda\, T$ is a proximity operator in a suitable Hilbert space.
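Illustration (not part of the arXiv record): the frame shrinkage operator $T^\dagger\, S_\lambda\, T$ named in this abstract can be sketched in a few lines. The following is a minimal sketch under stated assumptions, not the authors' code: the frame is a hypothetical example (the Mercedes-Benz frame in $\mathbb R^2$, scaled to be Parseval so that $T^\dagger = T^{\mathrm T}$), and the function names `soft_shrink`, `analysis`, `synthesis` and `frame_shrink` are made up for this example.

```python
import math


def soft_shrink(values, lam):
    """Componentwise soft shrinkage S_lam(t) = sign(t) * max(|t| - lam, 0)."""
    return [math.copysign(max(abs(t) - lam, 0.0), t) for t in values]


# Hypothetical example frame: three Mercedes-Benz vectors in R^2, scaled by
# sqrt(2/3) so that T^T T = I (Parseval frame). For such a frame the
# Moore-Penrose inverse of the analysis operator T is simply its transpose.
_S = math.sqrt(2.0 / 3.0)
FRAME = [
    (0.0, _S),
    (-_S * math.sqrt(3.0) / 2.0, -_S / 2.0),
    (_S * math.sqrt(3.0) / 2.0, -_S / 2.0),
]


def analysis(x):
    """Analysis operator T: inner products of x with every frame vector."""
    return [f[0] * x[0] + f[1] * x[1] for f in FRAME]


def synthesis(coeffs):
    """Synthesis operator T^T, which equals the pseudoinverse of T here."""
    return [sum(c * f[i] for c, f in zip(coeffs, FRAME)) for i in range(2)]


def frame_shrink(x, lam):
    """Frame shrinkage: analyse, soft-shrink the coefficients, synthesise."""
    return synthesis(soft_shrink(analysis(x), lam))
```

With `lam = 0` the operator reduces to the identity, because $T^{\mathrm T} T = I$ for a Parseval frame; for a general frame one would have to replace `synthesis` by an actual pseudoinverse.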

Submitted 16 October, 2019; v1 submitted 7 October, 2019; originally announced October 2019.

arXiv:1903.04873 [pdf, other]

Minimal Lipschitz and $\infty$-Harmonic Extensions of Vector-Valued Functions on Finite Graphs

Authors: Miroslav Bačák, Johannes Hertrich, Sebastian Neumayer, Gabriele Steidl

Abstract: This paper deals with extensions of vector-valued functions on finite graphs fulfilling distinguished minimality properties. We show that so-called lex and L-lex minimal extensions are actually the same and call them minimal Lipschitz extensions. Then we prove that the solutions of the graph $p$-Laplacians converge to these extensions as $p\to \infty$. Furthermore, we examine the relation between minimal Lipschitz extensions and iterated weighted midrange filters and address their connection to $\infty$-Laplacians for scalar-valued functions. A convergence proof for an iterative algorithm proposed by Elmoataz et al. (2014) for finding the zero of the $\infty$-Laplacian is given. Finally, we present applications in image inpainting.

Submitted 12 March, 2019; originally announced March 2019.

arXiv:1902.04292 [pdf, other]

On the Robust PCA and Weiszfeld's Algorithm

Authors: Sebastian Neumayer, Max Nimmer, Simon Setzer, Gabriele Steidl

Abstract: Principal component analysis (PCA) is a powerful standard tool for reducing the dimensionality of data. Unfortunately, it is sensitive to outliers so that various robust PCA variants were proposed in the literature. This paper addresses the robust PCA by successively determining the directions of lines having minimal Euclidean distances from the data points. The corresponding energy functional is not differentiable at a finite number of directions, which we call anchor directions. We derive a Weiszfeld-like algorithm for minimizing the energy functional which has several advantages over existing algorithms. Special attention is paid to the careful handling of the anchor directions, where we take the relation between local minima and one-sided derivatives of Lipschitz continuous functions on submanifolds of $\mathbb R^d$ into account. Using ideas for stabilizing the classical Weiszfeld algorithm at anchor points and the Kurdyka-Łojasiewicz property of the energy functional, we prove global convergence of the whole sequence of iterates generated by the algorithm to a critical point of the energy functional. Numerical examples demonstrate the very good performance of our algorithm.
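Illustration (not part of the arXiv record): for orientation, the classical Weiszfeld algorithm mentioned in this abstract computes the geometric median of a point set by repeated reweighted averaging. The sketch below shows only that classical iteration, not the Weiszfeld-like method for line directions derived in the paper. The function name `weiszfeld`, the parameter `eps` and the iteration count are assumptions made for this example, and clamping distances by `eps` is a crude safeguard near anchor points rather than the stabilization the abstract refers to.

```python
import math


def weiszfeld(points, start, iterations=200, eps=1e-12):
    """Classical Weiszfeld iteration for the geometric median of `points`.

    Each step replaces the iterate x by the weighted mean of the data
    points with weights 1 / ||p - x||. Distances are clamped from below
    by `eps` so that an iterate hitting a data point (an anchor point)
    does not cause a division by zero.
    """
    x = list(start)
    dim = len(x)
    for _ in range(iterations):
        weights = [1.0 / max(math.dist(p, x), eps) for p in points]
        total = sum(weights)
        x = [
            sum(w * p[i] for w, p in zip(weights, points)) / total
            for i in range(dim)
        ]
    return x
```

As a usage example, for the four corners of the square $[0,2]^2$ the geometric median is the centre $(1,1)$ by symmetry, and `weiszfeld([(0, 0), (2, 0), (0, 2), (2, 2)], (0.3, 0.2))` converges to it.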

Submitted 12 February, 2019; originally announced February 2019.

arXiv:1902.03840 [pdf, other]

On the Rotational Invariant $L_1$-Norm PCA

Authors: Sebastian Neumayer, Max Nimmer, Simon Setzer, Gabriele Steidl

Abstract: Principal component analysis (PCA) is a powerful tool for dimensionality reduction. Unfortunately, it is sensitive to outliers, so that various robust PCA variants were proposed in the literature. Among them the so-called rotational invariant $L_1$-norm PCA is rather popular. In this paper, we reinterpret this robust method as a conditional gradient algorithm and show moreover that it coincides with a gradient descent algorithm on Grassmannian manifolds. Based on this point of view, we prove for the first time convergence of the whole series of iterates to a critical point using the Kurdyka-Łojasiewicz property of the energy functional.

Submitted 24 May, 2019; v1 submitted 11 February, 2019; originally announced February 2019.

arXiv:1812.08540 [pdf, other]

Recent Advances in Denoising of Manifold-Valued Images

Authors: Ronny Bergmann, Friederike Laus, Johannes Persch, Gabriele Steidl

Abstract: Modern signal and image acquisition systems are able to capture data that is no longer real-valued, but may take values on a manifold. However, whenever measurements are taken, no matter whether manifold-valued or not, there occur tiny inaccuracies, which result in noisy data. In this chapter, we review recent advances in denoising of manifold-valued signals and images, where we restrict our attention to variational models and appropriate minimization algorithms. The algorithms are either classical, such as the subgradient algorithm, or generalizations of the half-quadratic minimization method, the cyclic proximal point algorithm, and the Douglas-Rachford algorithm to manifolds. An important aspect when dealing with real-world data is the practical implementation. Here several groups provide software and toolboxes such as the Manifold Optimization (Manopt) package and the manifold-valued image restoration toolbox (MVIRT).

Submitted 20 December, 2018; originally announced December 2018.

Showing 1–50 of 75 results for author: Steidl, G