-
Nonlinear Inverse Optimal Transport: Identifiability of the Transport Cost from its Marginals and Optimal Values
Authors:
Alberto González-Sanz,
Michel Groppe,
Axel Munk
Abstract:
The inverse optimal transport problem is to find the underlying cost function from the knowledge of optimal transport plans. While this amounts to solving a linear inverse problem, in this work we will be concerned with the nonlinear inverse problem to identify the cost function when only a set of marginals and their corresponding optimal values are given. We focus on absolutely continuous probability distributions with respect to the $d$-dimensional Lebesgue measure and classes of concave and convex cost functions. Our main result implies that the cost function is uniquely determined from the union of the ranges of the gradients of the optimal potentials. Since, in general, the optimal potentials may not be observed, we derive sufficient conditions for their identifiability: if an open set of marginals is observed, the optimal potentials are then identified via the value of the optimal costs. We conclude with a more in-depth study of this problem in the univariate case, where an explicit representation of the transport plan is available. Here, we link the notion of identifiability of the cost function with that of statistical completeness.
Submitted 20 June, 2024; v1 submitted 10 December, 2023;
originally announced December 2023.
-
Empirical Optimal Transport under Estimated Costs: Distributional Limits and Statistical Applications
Authors:
Shayan Hundrieser,
Gilles Mordant,
Christoph Alexander Weitkamp,
Axel Munk
Abstract:
Optimal transport (OT) based data analysis is often faced with the issue that the underlying cost function is (partially) unknown. This paper is concerned with the derivation of distributional limits for the empirical OT value when the cost function and the measures are estimated from data. For statistical inference purposes, but also from the viewpoint of a stability analysis, understanding the fluctuation of such quantities is paramount. Our results find direct application in the problem of goodness-of-fit testing for group families, in machine learning applications where invariant transport costs arise, in the problem of estimating the distance between mixtures of distributions, and for the analysis of empirical sliced OT quantities.
The established distributional limits assume either weak convergence of the cost process in uniform norm or that the cost is determined by an optimization problem of the OT value over a fixed parameter space. For the first setting we rely on careful lower and upper bounds for the OT value in terms of the measures and the cost in conjunction with a Skorokhod representation. The second setting is based on a functional delta method for the OT value process over the parameter space. The proof techniques might be of independent interest.
Submitted 4 January, 2023; v1 submitted 3 January, 2023;
originally announced January 2023.
-
Towards quantitative super-resolution microscopy: Molecular maps with statistical guarantees
Authors:
Katharina Proksch,
Frank Werner,
Jan Keller-Findeisen,
Haisen Ta,
Axel Munk
Abstract:
Quantifying the number of molecules from fluorescence microscopy measurements is an important topic in cell biology and medical research. In this work, we present a consecutive algorithm for super-resolution (STED) scanning microscopy that provides molecule counts in automatically generated image segments and offers statistical guarantees in the form of asymptotic confidence intervals. To this end, we first apply a multiscale scanning procedure on STED microscopy measurements of the sample to obtain a system of significant regions, each of which contains at least one molecule with prescribed uniform probability. This system of regions will typically be highly redundant and consists of rectangular building blocks. To choose an informative but non-redundant subset of more naturally shaped regions, we hybridize our system with the result of a generic segmentation algorithm. The diameter of the segments can be of the order of the resolution of the microscope. Using multiple photon coincidence measurements of the same sample in confocal mode, we are then able to estimate the brightness and number of the molecules and give uniform confidence intervals on the molecule counts for each previously constructed segment. In other words, we establish a so-called molecular map with uniform error control. The performance of the algorithm is investigated on simulated and real data.
Submitted 2 October, 2023; v1 submitted 27 July, 2022;
originally announced July 2022.
-
Statistical analysis of random objects via metric measure Laplacians
Authors:
Gilles Mordant,
Axel Munk
Abstract:
In this paper, we consider a certain convolutional Laplacian for metric measure spaces and investigate its potential for the statistical analysis of complex objects. The spectrum of that Laplacian serves as a signature of the space under consideration and the eigenvectors provide the principal directions of the shape, its harmonics. These concepts are used to assess the similarity of objects or understand their most important features in a principled way which is illustrated in various examples. Adopting a statistical point of view, we define a mean spectral measure and its empirical counterpart. The corresponding limiting process of interest is derived and statistical applications are discussed.
Submitted 13 April, 2022;
originally announced April 2022.
-
A Unifying Approach to Distributional Limits for Empirical Optimal Transport
Authors:
Shayan Hundrieser,
Marcel Klatt,
Thomas Staudt,
Axel Munk
Abstract:
We provide a unifying approach to central limit type theorems for empirical optimal transport (OT). In general, the limit distributions are characterized as suprema of Gaussian processes. We explicitly characterize when the limit distribution is centered normal or degenerates to a Dirac measure. Moreover, in contrast to recent contributions on distributional limit laws for empirical OT on Euclidean spaces which require centering around its expectation, the distributional limits obtained here are centered around the population quantity, which is well-suited for statistical applications.
At the heart of our theory is Kantorovich duality representing OT as a supremum over a function class $\mathcal{F}_{c}$ for an underlying sufficiently regular cost function $c$. In this regard, OT is considered as a functional defined on $\ell^{\infty}(\mathcal{F}_{c})$, the Banach space of bounded functionals from $\mathcal{F}_{c}$ to $\mathbb{R}$ equipped with the uniform norm. We prove the OT functional to be Hadamard directionally differentiable and conclude distributional convergence via a functional delta method that necessitates weak convergence of an underlying empirical process in $\ell^{\infty}(\mathcal{F}_{c})$. The latter can be dealt with using empirical process theory and requires $\mathcal{F}_{c}$ to be a Donsker class. We give sufficient conditions depending on the dimension of the ground space, the underlying cost function and the probability measures under consideration to guarantee the Donsker property. Overall, our approach reveals a noteworthy trade-off inherent in central limit theorems for empirical OT: Kantorovich duality requires $\mathcal{F}_{c}$ to be sufficiently rich, while the empirical process only converges weakly if $\mathcal{F}_{c}$ is not too complex.
Submitted 25 February, 2022;
originally announced February 2022.
-
Empirical Optimal Transport between Different Measures Adapts to Lower Complexity
Authors:
Shayan Hundrieser,
Thomas Staudt,
Axel Munk
Abstract:
The empirical optimal transport (OT) cost between two probability measures from random data is a fundamental quantity in transport based data analysis. In this work, we derive novel guarantees for its convergence rate when the involved measures are different, possibly supported on different spaces. Our central observation is that the statistical performance of the empirical OT cost is determined by the less complex measure, a phenomenon we refer to as lower complexity adaptation of empirical OT. For instance, under Lipschitz ground costs, we find that the empirical OT cost based on $n$ observations converges at least with rate $n^{-1/d}$ to the population quantity if one of the two measures is concentrated on a $d$-dimensional manifold, while the other can be arbitrary. For semi-concave ground costs, we show that the upper bound for the rate improves to $n^{-2/d}$. Similarly, our theory establishes the general convergence rate $n^{-1/2}$ for semi-discrete OT. All of these results are valid in the two-sample case as well, meaning that the convergence rate is still governed by the simpler of the two measures. On a conceptual level, our findings therefore suggest that the curse of dimensionality only affects the estimation of the OT cost when both measures exhibit a high intrinsic dimension. Our proofs are based on the dual formulation of OT as a maximization over a suitable function class $\mathcal{F}_c$ and the observation that the $c$-transform of $\mathcal{F}_c$ under bounded costs has the same uniform metric entropy as $\mathcal{F}_c$ itself.
Submitted 21 February, 2022;
originally announced February 2022.
-
On the Uniqueness of Kantorovich Potentials
Authors:
Thomas Staudt,
Shayan Hundrieser,
Axel Munk
Abstract:
Kantorovich potentials denote the dual solutions of the renowned optimal transportation problem. Uniqueness of these solutions is relevant from both a theoretical and an algorithmic point of view, and has recently emerged as a necessary condition for asymptotic results in the context of statistical and entropic optimal transport. In this work, we challenge the common perception that uniqueness in continuous settings is reliant on the connectedness of the support of at least one of the involved measures, and we provide mild sufficient conditions for uniqueness even when both measures have disconnected support. Since our main finding builds upon the uniqueness of Kantorovich potentials on connected components, we revisit the corresponding arguments and provide generalizations of well-known results. Several auxiliary findings regarding the continuity of Kantorovich potentials, for example in geodesic spaces, are established along the way.
Submitted 20 January, 2022;
originally announced January 2022.
-
Minimax detection of localized signals in statistical inverse problems
Authors:
Markus Pohlmann,
Frank Werner,
Axel Munk
Abstract:
We investigate minimax testing for detecting local signals or linear combinations of such signals when only indirect data is available. Naturally, in the presence of noise, signals that are too small cannot be reliably detected. In a Gaussian white noise model, we discuss upper and lower bounds for the minimal size of the signal such that testing with small error probabilities is possible. In certain situations we are able to characterize the asymptotic minimax detection boundary. Our results are applied to inverse problems such as numerical differentiation, deconvolution and the inversion of the Radon transform.
Submitted 20 February, 2023; v1 submitted 10 December, 2021;
originally announced December 2021.
-
Kantorovich-Rubinstein distance and barycenter for finitely supported measures: Foundations and Algorithms
Authors:
Florian Heinemann,
Marcel Klatt,
Axel Munk
Abstract:
The purpose of this paper is to provide a systematic discussion of a generalized barycenter based on a variant of unbalanced optimal transport (UOT) that defines a distance between general non-negative, finitely supported measures by allowing for mass creation and destruction modeled by some cost parameter. These are referred to as the Kantorovich-Rubinstein (KR) barycenter and distance. In particular, we detail the influence of the cost parameter on structural properties of the KR barycenter and the KR distance. For the latter we highlight a closed form solution on ultra-metric trees. The support of such KR barycenters of finitely supported measures turns out to be finite in general and its structure to be explicitly specified by the support of the input measures. Additionally, we prove the existence of sparse KR barycenters and discuss potential computational approaches. The performance of the KR barycenter is compared to the OT barycenter on a multitude of synthetic datasets. We also consider barycenters based on the recently introduced Gaussian Hellinger-Kantorovich and Wasserstein-Fisher-Rao distances.
Submitted 25 August, 2022; v1 submitted 7 December, 2021;
originally announced December 2021.
-
A Variational View on Statistical Multiscale Estimation
Authors:
Markus Haltmeier,
Housen Li,
Axel Munk
Abstract:
We present a unifying view on various statistical estimation techniques including penalization, variational and thresholding methods. These estimators will be analyzed in the context of statistical linear inverse problems including nonparametric and change point regression, and high dimensional linear models as examples. Our approach reveals many seemingly unrelated estimation schemes as special instances of a general class of variational multiscale estimators, named MIND (MultIscale Nemirovskii--Dantzig). These estimators result from minimizing certain regularization functionals under convex constraints that can be seen as multiple statistical tests for local hypotheses.
For computational purposes, we recast MIND in terms of simpler unconstrained optimization problems via Lagrangian penalization as well as Fenchel duality. The performance of several MINDs is demonstrated on numerical examples.
Submitted 10 June, 2021;
originally announced June 2021.
-
Transport Dependency: Optimal Transport Based Dependency Measures
Authors:
Thomas Giacomo Nies,
Thomas Staudt,
Axel Munk
Abstract:
Finding meaningful ways to measure the statistical dependency between random variables $ξ$ and $ζ$ is a timeless statistical endeavor. In recent years, several novel concepts, like the distance covariance, have extended classical notions of dependency to more general settings. In this article, we propose and study an alternative framework that is based on optimal transport. The transport dependency $τ\ge 0$ applies to general Polish spaces and intrinsically respects metric properties. For suitable ground costs, independence is fully characterized by $τ= 0$. Via proper normalization of $τ$, three transport correlations $ρ_α$, $ρ_\infty$, and $ρ_*$ with values in $[0, 1]$ are defined. They attain the value $1$ if and only if $ζ= \varphi(ξ)$, where $\varphi$ is an $α$-Lipschitz function for $ρ_α$, a measurable function for $ρ_\infty$, or a multiple of an isometry for $ρ_*$. The transport dependency can be estimated consistently by an empirical plug-in approach, but alternative estimators with the same convergence rate but significantly reduced computational costs are also proposed. Numerical results suggest that $τ$ robustly recovers dependency between data sets with different internal metric structures. The usage for inferential tasks, like transport dependency based independence testing, is illustrated on a data set from a cancer study.
Submitted 20 March, 2023; v1 submitted 5 May, 2021;
originally announced May 2021.
-
Limit Distributions and Sensitivity Analysis for Empirical Entropic Optimal Transport on Countable Spaces
Authors:
Shayan Hundrieser,
Marcel Klatt,
Axel Munk
Abstract:
For probability measures on countable spaces we derive distributional limits for empirical entropic optimal transport quantities. More precisely, we show that the empirical optimal transport plan weakly converges to a centered Gaussian process and that the empirical entropic optimal transport value is asymptotically normal. The results are valid for a large class of cost functions and generalize distributional limits for empirical entropic optimal transport quantities on finite spaces. Our proofs are based on a sensitivity analysis with respect to norms induced by suitable function classes, which arise from novel quantitative bounds for primal and dual optimizers, that are related to the exponential penalty term in the dual formulation. The distributional limits then follow from the functional delta method together with weak convergence of the empirical process in that respective norm, for which we provide sharp conditions on the underlying measures. As a byproduct of our proof technique, consistency of the bootstrap for statistical applications is shown.
Submitted 25 December, 2022; v1 submitted 30 April, 2021;
originally announced May 2021.
-
The ultrametric Gromov-Wasserstein distance
Authors:
Facundo Mémoli,
Axel Munk,
Zhengchao Wan,
Christoph Weitkamp
Abstract:
In this paper, we investigate compact ultrametric measure spaces which form a subset $\mathcal{U}^w$ of the collection of all metric measure spaces $\mathcal{M}^w$. Similarly to the ultrametric Gromov-Hausdorff distance on the collection of ultrametric spaces $\mathcal{U}$, we define ultrametric versions of two metrics on $\mathcal{U}^w$, namely of Sturm's distance of order $p$ and of the Gromov-Wasserstein distance of order $p$. We study the basic topological and geometric properties of these distances as well as their relation and derive for $p=\infty$ a polynomial time algorithm for their calculation. Further, several lower bounds for both distances are derived and some of our results are generalized to the case of finite ultra-dissimilarity spaces.
Submitted 1 July, 2021; v1 submitted 14 January, 2021;
originally announced January 2021.
-
Variational Multiscale Nonparametric Regression: Algorithms and Implementation
Authors:
Miguel del Alamo,
Housen Li,
Axel Munk,
Frank Werner
Abstract:
Many modern statistically efficient methods come with tremendous computational challenges, often leading to large-scale optimisation problems. In this work, we examine such computational issues for recently developed estimation methods in nonparametric regression with a specific view on image denoising. We consider in particular certain variational multiscale estimators which are statistically optimal in the minimax sense, yet computationally intensive. Such an estimator is computed as the minimiser of a smoothness functional (e.g., TV norm) over the class of all estimators such that none of its coefficients with respect to a given multiscale dictionary is statistically significant. The resulting multiscale Nemirowski-Dantzig estimator (MIND) can incorporate any convex smoothness functional and combine it with a proper dictionary including wavelets, curvelets and shearlets. The computation of MIND in general requires solving a high-dimensional constrained convex optimisation problem with a specific structure of the constraints induced by the statistical multiscale testing criterion. To solve this explicitly, we discuss three different algorithmic approaches: the Chambolle-Pock, ADMM and semismooth Newton algorithms. Algorithmic details and an explicit implementation are presented and the solutions are then compared numerically in a simulation study and on various test images. We thereby recommend the Chambolle-Pock algorithm in most cases for its fast convergence. We stress that our analysis can also be transferred to signal recovery and other denoising problems to recover more general objects whenever it is possible to borrow statistical strength from data patches of similar object structure.
Submitted 13 November, 2020; v1 submitted 20 October, 2020;
originally announced October 2020.
-
Optimistic search: Change point estimation for large-scale data via adaptive logarithmic queries
Authors:
Solt Kovács,
Housen Li,
Lorenz Haubner,
Axel Munk,
Peter Bühlmann
Abstract:
Change point estimation is often formulated as a search for the maximum of a gain function describing improved fits when segmenting the data. Searching through all candidates requires $O(n)$ evaluations of the gain function for an interval with $n$ observations. If each evaluation is computationally demanding (e.g. in high-dimensional models), this can become infeasible. Instead, we propose optimistic search methods with $O(\log n)$ evaluations exploiting specific structure of the gain function.
Towards a solid understanding of our strategy, we investigate in detail the $p$-dimensional Gaussian changing means setup, including high-dimensional scenarios. For some of our proposals, we prove asymptotic minimax optimality for detecting change points and derive their asymptotic localization rate. These rates (up to a possible log factor) are optimal for the univariate and multivariate scenarios, and are by far the fastest in the literature under the weakest possible detection condition on the signal-to-noise ratio in the high-dimensional scenario. Computationally, our proposed methodology has a worst-case complexity of $O(np)$, which can be improved to be sublinear in $n$ if some a-priori knowledge on the length of the shortest segment is available.
Our search strategies generalize far beyond the theoretically analyzed setup. We illustrate, as an example, massive computational speedup in change point detection for high-dimensional Gaussian graphical models.
Submitted 29 November, 2022; v1 submitted 20 October, 2020;
originally announced October 2020.
-
Limit Laws for Empirical Optimal Solutions in Stochastic Linear Programs
Authors:
Marcel Klatt,
Axel Munk,
Yoav Zemel
Abstract:
We consider a general linear program in standard form whose right-hand side constraint vector is subject to random perturbations. This defines a stochastic linear program for which, under general conditions, we characterize the fluctuations of the corresponding empirical optimal solution by a central limit-type theorem. Our approach relies on the combinatorial nature and the concept of degeneracy inherent in linear programming, in strong contrast to well-known results for smooth stochastic optimization programs. In particular, if the corresponding dual linear program is degenerate the asymptotic limit law might not be unique and is determined from the way the empirical optimal solution is chosen. Furthermore, we establish consistency and convergence rates of the Hausdorff distance between the empirical and the true optimality sets. As a consequence, we deduce a limit law for the empirical optimal value characterized by the set of all dual optimal solutions which turns out to be a simple consequence of our general proof techniques.
Our analysis is motivated from recent findings in statistical optimal transport that will be of special focus here. In addition to the asymptotic limit laws for optimal transport solutions, we obtain results linking degeneracy of the dual transport problem to geometric properties of the underlying ground space, and prove almost sure uniqueness statements that may be of independent interest.
Submitted 27 July, 2020;
originally announced July 2020.
-
Gromov-Wasserstein Distance based Object Matching: Asymptotic Inference
Authors:
Christoph Alexander Weitkamp,
Katharina Proksch,
Carla Tameling,
Axel Munk
Abstract:
In this paper, we aim to provide a statistical theory for object matching based on the Gromov-Wasserstein distance. To this end, we model general objects as metric measure spaces. Based on this, we propose a simple and efficiently computable asymptotic statistical test for pose invariant object discrimination. This is based on an empirical version of a $β$-trimmed lower bound of the Gromov-Wasserstein distance. We derive for $β\in[0,1/2)$ distributional limits of this test statistic. To this end, we introduce a novel $U$-type process indexed in $β$ and show its weak convergence. Finally, the theory developed is investigated in Monte Carlo simulations and applied to structural protein comparisons.
Submitted 24 June, 2020; v1 submitted 22 June, 2020;
originally announced June 2020.
-
What is resolution? A statistical minimax testing perspective on super-resolution microscopy
Authors:
Gytis Kulaitis,
Axel Munk,
Frank Werner
Abstract:
As a general rule of thumb the resolution of a light microscope (i.e. the ability to discern objects) is predominantly described by the full width at half maximum (FWHM) of its point spread function (psf), that is, the diameter of the blurring density at half of its maximum. Classical wave optics suggests a linear relationship between FWHM and resolution, also manifested in the well known Abbe and Rayleigh criteria, dating back to the end of the 19th century. However, during the last two decades conventional light microscopy has undergone a shift from microscopic scales to nanoscales. This increase in resolution comes with the need to incorporate the random nature of observations (light photons) and challenges the classical view of discernability, as we argue in this paper. Instead, we suggest a statistical description of resolution obtained from such random data. Our notion of discernability is based on statistical testing whether one or two objects with the same total intensity are present. For Poisson measurements we get linear dependence of the (minimax) detection boundary on the FWHM, whereas for a homogeneous Gaussian model the dependence of resolution is nonlinear. Hence, at small physical scales modeling by homogeneous Gaussians is inadequate, although often implicitly assumed in many reconstruction algorithms. In contrast, the Poisson model and its variance stabilized Gaussian approximation seem to provide a statistically sound description of resolution at the nanoscale. Our theory is also applicable to other imaging setups, such as telescopes.
Submitted 22 October, 2020; v1 submitted 15 May, 2020;
originally announced May 2020.
-
Bump detection in the presence of dependency: Does it ease or does it load?
Authors:
Farida Enikeeva,
Axel Munk,
Markus Pohlmann,
Frank Werner
Abstract:
We provide the asymptotic minimax detection boundary for a bump, i.e. an abrupt change, in the mean function of a stationary Gaussian process. This will be characterized in terms of the asymptotic behavior of the bump length and height as well as the dependency structure of the process. A major finding is that the asymptotic minimax detection boundary is generically determined by the value of its spectral density at zero. Finally, our asymptotic analysis is complemented by non-asymptotic results for AR($p$) processes and confirmed to serve as a good proxy for finite sample scenarios in a simulation study. Our proofs are based on laws of large numbers for non-independent and non-identically distributed arrays of random variables and the asymptotically sharp analysis of the precision matrix of the process.
Submitted 6 April, 2020; v1 submitted 19 June, 2019;
originally announced June 2019.
-
Total variation multiscale estimators for linear inverse problems
Authors:
Miguel del Álamo,
Axel Munk
Abstract:
Even though the statistical theory of linear inverse problems is a well-studied topic, certain relevant cases remain open. Among these is the estimation of functions of bounded variation ($BV$), meaning $L^1$ functions on a $d$-dimensional domain whose weak first derivatives are finite Radon measures. The estimation of $BV$ functions is relevant in many applications, since it involves minimal smoothness assumptions and gives simplified, interpretable cartoonized reconstructions. In this paper we propose a novel technique for estimating $BV$ functions in an inverse problem setting, and provide theoretical guarantees by showing that the proposed estimator is minimax optimal up to logarithms with respect to the $L^q$-risk, for any $q\in[1,\infty)$. This is to the best of our knowledge the first convergence result for $BV$ functions in inverse problems in dimension $d\geq 2$, and it extends the results by Donoho (Appl. Comput. Harmon. Anal., 2(2):101--126, 1995) in $d=1$. Furthermore, our analysis unravels a novel regime for large $q$ in which the minimax rate is slower than $n^{-1/(d+2β+2)}$, where $β$ is the degree of ill-posedness: our analysis shows that this slower rate arises from the low smoothness of $BV$ functions. The proposed estimator combines variational regularization techniques with the wavelet-vaguelette decomposition of operators.
Submitted 21 May, 2019;
originally announced May 2019.
-
Empirical Regularized Optimal Transport: Statistical Theory and Applications
Authors:
Marcel Klatt,
Carla Tameling,
Axel Munk
Abstract:
We derive limit distributions for certain empirical regularized optimal transport distances between probability distributions supported on a finite metric space and show consistency of the (naive) bootstrap. In particular, we prove that the empirical regularized transport plan itself asymptotically follows a Gaussian law. The theory includes the Boltzmann-Shannon entropy regularization and hence a limit law for the widely applied Sinkhorn divergence. Our approach is based on an application of the implicit function theorem to necessary and sufficient optimality conditions for the regularized transport problem. The asymptotic results are investigated in Monte Carlo simulations. We further discuss computational and statistical applications, e.g. confidence bands for colocalization analysis of protein interaction networks based on regularized optimal transport.
Submitted 1 May, 2019; v1 submitted 23 October, 2018;
originally announced October 2018.
-
Posterior Consistency in the Binomial $(n,p)$ Model with Unknown $n$ and $p$: A Numerical Study
Authors:
Laura Fee Schneider,
Thomas Staudt,
Axel Munk
Abstract:
Estimating the parameters from $k$ independent Bin$(n,p)$ random variables, when both parameters $n$ and $p$ are unknown, is relevant to a variety of applications. It is particularly difficult if $n$ is large and $p$ is small. Over the past decades, several articles have proposed Bayesian approaches to estimate $n$ in this setting, but asymptotic results could only be established recently in \cite{Schneider}. There, posterior contraction for $n$ is proven in the problematic parameter regime where $n\rightarrow\infty$ and $p\rightarrow0$ at certain rates. In this article, we study numerically how far the theoretical upper bound on $n$ can be relaxed in simulations without losing posterior consistency.
Submitted 7 September, 2018;
originally announced September 2018.
-
Posterior analysis of $n$ in the binomial $(n,p)$ problem with both parameters unknown -- with applications to quantitative nanoscopy
Authors:
Johannes Schmidt-Hieber,
Laura Fee Schneider,
Thomas Staudt,
Andrea Krajina,
Timo Aspelmeier,
Axel Munk
Abstract:
Estimation of the population size $n$ from $k$ i.i.d. binomial observations with unknown success probability $p$ is relevant to a multitude of applications and has a long history. Without additional prior information this is a notoriously difficult task when $p$ becomes small, and the Bayesian approach becomes particularly useful. For a large class of priors, we establish posterior contraction and a Bernstein-von Mises type theorem in a setting where $p\rightarrow0$ and $n\rightarrow\infty$ as $k\to\infty$. Furthermore, we suggest a new class of Bayesian estimators for $n$ and provide a comprehensive simulation study in which we investigate their performance. To showcase the advantages of a Bayesian approach on real data, we also benchmark our estimators in a novel application from super-resolution microscopy.
Submitted 16 November, 2020; v1 submitted 7 September, 2018;
originally announced September 2018.
-
Frame-constrained Total Variation Regularization for White Noise Regression
Authors:
Miguel del Álamo,
Housen Li,
Axel Munk
Abstract:
Despite the popularity and practical success of total variation (TV) regularization for function estimation, surprisingly little is known about its theoretical performance in a statistical setting. While TV regularization has been known for quite some time to be minimax optimal for denoising one-dimensional signals, for higher dimensions this remains elusive until today. In this paper we consider frame-constrained TV estimators including many well-known (overcomplete) frames in a white noise regression model, and prove their minimax optimality w.r.t. $L^q$-risk ($1\leq q<\infty$) up to a logarithmic factor in any dimension $d\geq 1$. Overcomplete frames are an established tool in mathematical imaging and signal recovery, and their combination with TV regularization has been shown to give excellent results in practice, which our theory now confirms. Our results rely on a novel connection between frame-constraints and certain Besov norms, and on an interpolation inequality to relate them to the risk functional.
Submitted 9 May, 2019; v1 submitted 5 July, 2018;
originally announced July 2018.
-
Maximum likelihood estimation in hidden Markov models with inhomogeneous noise
Authors:
Manuel Diehn,
Axel Munk,
Daniel Rudolf
Abstract:
We consider parameter estimation in finite hidden state space Markov models with time-dependent inhomogeneous noise, where the inhomogeneity vanishes sufficiently fast. Based on the concept of asymptotic mean stationary processes we prove that the maximum likelihood and a quasi-maximum likelihood estimator (QMLE) are strongly consistent. The computation of the QMLE ignores the inhomogeneity and hence is much simpler and more robust. The theory is motivated by an example from biophysics and applied to a Poisson and a linear Gaussian model.
Submitted 29 September, 2018; v1 submitted 11 April, 2018;
originally announced April 2018.
-
Multidimensional multiscale scanning in Exponential Families: Limit theory and statistical consequences
Authors:
Claudia König,
Axel Munk,
Frank Werner
Abstract:
We consider the problem of finding anomalies in a $d$-dimensional field of independent random variables $\{Y_i\}_{i \in \left\{1,...,n\right\}^d}$, each distributed according to a one-dimensional natural exponential family $\mathcal F = \left\{F_θ\right\}_{θ \in Θ}$. Given some baseline parameter $θ_0 \in Θ$, the field is scanned using local likelihood ratio tests to detect from a (large) given system of regions $\mathcal{R}$ those regions $R \subset \left\{1,...,n\right\}^d$ with $θ_i \neq θ_0$ for some $i \in R$. We provide a unified methodology which controls the overall family-wise error rate (FWER) of making a wrong detection at a given level.
Fundamental to our method is a Gaussian approximation of the distribution of the underlying multiscale test statistic with explicit rate of convergence. From this, we obtain a weak limit theorem which can be seen as a generalized weak invariance principle for non-identically distributed data and is of independent interest. Furthermore, we give an asymptotic expansion of the procedure's power, which yields minimax optimality in the case of Gaussian observations.
Submitted 24 March, 2019; v1 submitted 22 February, 2018;
originally announced February 2018.
-
Minimax estimation in linear models with unknown design over finite alphabets
Authors:
Merle Behr,
Axel Munk
Abstract:
We provide a minimax optimal estimation procedure for F and W in matrix valued linear models Y = F W + Z where the parameter matrix W and the design matrix F are unknown but the latter takes values in a known finite set. The proposed finite alphabet linear model is justified in a variety of applications, ranging from signal processing to cancer genetics. We show that this allows one to separate F and W uniquely under weak identifiability conditions, a task which is not possible in general. To this end we quantify in the noiseless case, that is, Z = 0, the perturbation range of Y in order to obtain stable recovery of F and W. Based on this, we derive an iterative Lloyd's type estimation procedure that attains minimax estimation rates for W and F for Gaussian error matrix Z. In contrast to the least squares solution the estimation procedure can be computed efficiently and scales linearly with the total number of observations. We confirm our theoretical results in a simulation study and illustrate them with a genetic sequencing data example.
Submitted 18 February, 2021; v1 submitted 11 November, 2017;
originally announced November 2017.
-
Multiscale Change-point Segmentation: Beyond Step Functions
Authors:
Housen Li,
Qinghai Guo,
Axel Munk
Abstract:
Modern multiscale type segmentation methods are known to detect multiple change-points with high statistical accuracy, while allowing for fast computation. Underpinning theory has been developed mainly for models that assume the signal as a piecewise constant function. In this paper this will be extended to certain function classes beyond such step functions in a nonparametric regression setting, revealing certain multiscale segmentation methods as robust to deviation from such piecewise constant functions. Our main finding is the adaptation over such function classes for a universal thresholding, which includes bounded variation functions, and (piecewise) Hölder functions of smoothness order $0 < α \le 1$ as special cases. From this we derive statistical guarantees on feature detection in terms of jumps and modes. Another key finding is that these multiscale segmentation methods perform nearly (up to a log-factor) as well as the oracle piecewise constant segmentation estimator (with known jump locations), and the best piecewise constant approximants of the (unknown) true signal. Theoretical findings are examined by various numerical simulations.
Submitted 21 January, 2019; v1 submitted 13 August, 2017;
originally announced August 2017.
-
Empirical optimal transport on countable metric spaces: Distributional limits and statistical applications
Authors:
Carla Tameling,
Max Sommerfeld,
Axel Munk
Abstract:
We derive distributional limits for empirical transport distances between probability measures supported on countable sets. Our approach is based on sensitivity analysis of optimal values of infinite dimensional mathematical programs and a delta method for non-linear derivatives. A careful calibration of the norm on the space of probability measures is needed in order to combine differentiability and weak convergence of the underlying empirical process. Based on this we provide a sufficient and necessary condition for the underlying distribution on the countable metric space for such a distributional limit to hold. We give an explicit form of the limiting distribution for ultra-metric spaces. Finally, we apply our findings to optimal transport based inference in large scale problems. An application to nanoscale microscopy is given.
Submitted 17 September, 2018; v1 submitted 4 July, 2017;
originally announced July 2017.
-
Kernel partial least squares for stationary data
Authors:
Marco Singer,
Tatyana Krivobokova,
Axel Munk
Abstract:
We consider the kernel partial least squares algorithm for non-parametric regression with stationary dependent data. Probabilistic convergence rates of the kernel partial least squares estimator to the true regression function are established under a source and an effective dimensionality condition. It is shown both theoretically and in simulations that long range dependence results in slower convergence rates. A protein dynamics example shows high predictive power of kernel partial least squares.
Submitted 12 June, 2017;
originally announced June 2017.
-
Mini-Flash Crashes, Model Risk, and Optimal Execution
Authors:
Erhan Bayraktar,
Alexander Munk
Abstract:
Oft-cited causes of mini-flash crashes include human errors, endogenous feedback loops, the nature of modern liquidity provision, fundamental value shocks, and market fragmentation. We develop a mathematical model which captures aspects of the first three explanations. Empirical features of recent mini-flash crashes are present in our framework. For example, there are periods when no such events will occur. If they do, even just before their onset, market participants may not know with certainty that a disruption will unfold. Our mini-flash crashes can materialize in both low and high trading volume environments and may be accompanied by a partial synchronization in order submission.
Instead of adopting a classically-inspired equilibrium approach, we borrow ideas from the optimal execution literature. Each of our agents begins with beliefs about how his own trades impact prices and how prices would move in his absence. They, along with other market participants, then submit orders which are executed at a common venue. Naturally, this leads us to explicitly distinguish between how prices actually evolve and our agents' opinions. In particular, every agent's beliefs will be expressly incorrect.
Submitted 12 August, 2018; v1 submitted 27 May, 2017;
originally announced May 2017.
-
The Essential Histogram
Authors:
Housen Li,
Axel Munk,
Hannes Sieling,
Guenther Walther
Abstract:
The histogram is widely used as a simple, exploratory display of data, but it is usually not clear how to choose the number and size of bins. We construct a confidence set of distribution functions that optimally address the two main tasks of the histogram: estimating probabilities and detecting features such as increases and modes in the distribution. We define the essential histogram as the histogram in the confidence set with the fewest bins. Thus the essential histogram is the simplest visualization of the data that optimally achieves the main tasks of the histogram. The only assumption we make is that the data are independent and identically distributed. We provide a fast algorithm for the essential histogram, and illustrate our methodology with examples. An R-package is available on CRAN.
Submitted 28 May, 2019; v1 submitted 21 December, 2016;
originally announced December 2016.
-
Multiscale scanning in inverse problems
Authors:
Katharina Proksch,
Frank Werner,
Axel Munk
Abstract:
In this paper we propose a multiscale scanning method to determine active components of a quantity $f$ w.r.t. a dictionary $\mathcal{U}$ from observations $Y$ in an inverse regression model $Y=Tf+ξ$ with linear operator $T$ and general random error $ξ$. To this end, we provide uniform confidence statements for the coefficients $\langle \varphi, f\rangle$, $\varphi \in \mathcal U$, under the assumption that $(T^*)^{-1} \left(\mathcal U\right)$ is of wavelet-type. Based on this we obtain a multiple test that allows one to identify the active components of $\mathcal{U}$, i.e. $\left\langle f, \varphi\right\rangle \neq 0$, $\varphi \in \mathcal U$, at a controlled family-wise error rate. Our results rely on a Gaussian approximation of the underlying multiscale statistic with a novel scale penalty adapted to the ill-posedness of the problem. The scale penalty furthermore ensures weak convergence of the statistic's distribution towards a Gumbel limit under reasonable assumptions. The important special cases of tomography and deconvolution are discussed in detail. Further, the regression case, when $T = \text{id}$ and the dictionary consists of moving windows of various sizes (scales), is included, generalizing previous results for this setting. We show that our method obeys an oracle optimality, i.e. it attains the same asymptotic power as a single-scale testing procedure at the correct scale. Simulations support our theory and we illustrate the potential of the method as an inferential tool for imaging. As a particular application we discuss super-resolution microscopy and analyze experimental STED data to locate single DNA origami.
Submitted 27 June, 2017; v1 submitted 14 November, 2016;
originally announced November 2016.
-
High-Roller Impact: A Large Generalized Game Model of Parimutuel Wagering
Authors:
Erhan Bayraktar,
Alexander Munk
Abstract:
How do large-scale participants in parimutuel wagering events affect the house and ordinary bettors? A standard narrative suggests that they may temporarily benefit the former at the expense of the latter. To approach this problem, we begin by developing a model based on the theory of large generalized games. Constrained only by their budgets, a continuum of diffuse (ordinary) players and a single atomic (large-scale) player simultaneously wager to maximize their expected profits according to their individual beliefs. Our main theoretical result gives necessary and sufficient conditions for the existence and uniqueness of a pure-strategy Nash equilibrium. Using this framework, we analyze our question in concrete scenarios. First, we study a situation in which both predicted effects are observed. Neither is always observed in our remaining examples, suggesting the need for a more nuanced view of large-scale participants.
Submitted 28 March, 2017; v1 submitted 11 May, 2016;
originally announced May 2016.
-
Variational Multiscale Nonparametric Regression: Smooth Functions
Authors:
Markus Grasmair,
Housen Li,
Axel Munk
Abstract:
For the problem of nonparametric regression of smooth functions, we reconsider and analyze a constrained variational approach, which we call the MultIscale Nemirovski-Dantzig (MIND) estimator. This can be viewed as a multiscale extension of the Dantzig selector (\emph{Ann. Statist.}, 35(6): 2313--51, 2009) based on early ideas of Nemirovski (\emph{J. Comput. System Sci.}, 23:1--11, 1986). MIND minimizes a homogeneous Sobolev norm under the constraint that the multiresolution norm of the residual is bounded by a universal threshold. The main contribution of this paper is the derivation of convergence rates of MIND with respect to $L^q$-loss, $1 \le q \le \infty$, both almost surely and in expectation. To this end, we introduce the method of approximate source conditions. For a one-dimensional signal, these can be translated into approximation properties of $B$-splines. A remarkable consequence is that MIND attains almost minimax optimal rates simultaneously for a large range of Sobolev and Besov classes, which provides certain adaptation. Complementary to the asymptotic analysis, we examine the finite sample performance of MIND by numerical simulations.
Submitted 3 December, 2015;
originally announced December 2015.
-
Partial least squares for dependent data
Authors:
Marco Singer,
Tatyana Krivobokova,
Bert L. de Groot,
Axel Munk
Abstract:
The partial least squares algorithm for dependent data realisations is considered. Consequences of ignoring the dependence for the algorithm performance are studied both theoretically and in simulations. It is shown that ignoring certain non-stationary dependence structures leads to inconsistent estimation. A simple modification of the partial least squares algorithm for dependent data is proposed and consistency of corresponding estimators is shown. A real-data example on protein dynamics illustrates a superior predictive power of the method and the practical relevance of the problem.
Submitted 3 March, 2016; v1 submitted 16 October, 2015;
originally announced October 2015.
-
Limit laws of the empirical Wasserstein distance: Gaussian distributions
Authors:
Thomas Rippl,
Axel Munk,
Anja Sturm
Abstract:
We derive central limit theorems for the Wasserstein distance between the empirical distributions of Gaussian samples. The cases are distinguished according to whether the underlying laws are the same or different. Results are based on the (quadratic) Fréchet differentiability of the Wasserstein distance in the Gaussian case. Extensions to elliptically symmetric distributions are discussed as well as several applications such as bootstrap and statistical testing.
Submitted 19 February, 2016; v1 submitted 15 July, 2015;
originally announced July 2015.
-
Heterogeneous Change Point Inference
Authors:
Florian Pein,
Hannes Sieling,
Axel Munk
Abstract:
We propose HSMUCE (heterogeneous simultaneous multiscale change-point estimator) for the detection of multiple change-points of the signal in a heterogeneous Gaussian regression model. A piecewise constant function is estimated by minimizing the number of change-points over the acceptance region of a multiscale test which locally adapts to changes in the variance. The multiscale test is a combination of local likelihood ratio tests which are properly calibrated by scale dependent critical values in order to keep a global nominal level alpha, even for finite samples. We show that HSMUCE controls the error of over- and underestimation of the number of change-points. To this end, new deviation bounds for F-type statistics are derived. Moreover, we obtain confidence sets for the whole signal. All results are non-asymptotic and uniform over a large class of heterogeneous change-point models. HSMUCE is fast to compute, achieves the optimal detection rate and estimates the number of change-points at almost optimal accuracy for vanishing signals, while still being robust. We compare HSMUCE with several state-of-the-art methods in simulations and analyse current recordings of a transmembrane protein in the bacterial outer membrane with pronounced heterogeneity for its states. An R-package is available online.
Submitted 5 February, 2016; v1 submitted 19 May, 2015;
originally announced May 2015.
-
Bump detection in heterogeneous Gaussian regression
Authors:
Farida Enikeeva,
Axel Munk,
Frank Werner
Abstract:
We analyze the effect of a heterogeneous variance on bump detection in a Gaussian regression model. To this end we allow for a simultaneous bump in the variance and specify its impact on the difficulty to detect the null signal against a single bump with known signal strength. This is done by calculating lower and upper bounds, both based on the likelihood ratio. Lower and upper bounds together lead to explicit characterizations of the detection boundary in several subregimes depending on the asymptotic behavior of the bump heights in mean and variance. In particular, we explicitly identify those regimes, where the additional information about a simultaneous bump in variance eases the detection problem for the signal. This effect is made explicit in the constant and / or the rate, appearing in the detection boundary. We also discuss the case of an unknown bump height and provide an adaptive test and some upper bounds in that case.
Submitted 3 May, 2016; v1 submitted 28 April, 2015;
originally announced April 2015.
-
FDR-Control in Multiscale Change-point Segmentation
Authors:
Housen Li,
Axel Munk,
Hannes Sieling
Abstract:
Fast multiple change-point segmentation methods, which additionally provide faithful statistical statements on the number, locations and sizes of the segments, have recently received great attention. In this paper, we propose a multiscale segmentation method, FDRSeg, which controls the false discovery rate (FDR) in the sense that the number of false jumps is bounded linearly by the number of true jumps. In this way, it adapts the detection power to the number of true jumps. We prove a non-asymptotic upper bound for its FDR in a Gaussian setting, which allows one to calibrate the only parameter of FDRSeg properly. Change-point locations, as well as the signal, are shown to be estimated in a uniform sense at optimal minimax convergence rates up to a log-factor. The latter is w.r.t. $L^p$-risk, $p \ge 1$, over classes of step functions with bounded jump sizes and either bounded, or possibly increasing, number of change-points. FDRSeg can be efficiently computed by an accelerated dynamic program; its computational complexity is shown to be linear in the number of observations when there are many change-points. The performance of the proposed method is examined by comparisons with some state-of-the-art methods on both simulated and real datasets. An R-package is available online.
Submitted 25 October, 2015; v1 submitted 18 December, 2014;
originally announced December 2014.
-
An $α$-stable limit theorem under sublinear expectation
Authors:
Erhan Bayraktar,
Alexander Munk
Abstract:
For $α\in (1,2)$, we present a generalized central limit theorem for $α$-stable random variables under sublinear expectation. The foundation of our proof is an interior regularity estimate for partial integro-differential equations (PIDEs). A classical generalized central limit theorem is recovered as a special case, provided a mild but natural additional condition holds. Our approach contrasts with previous arguments for the result in the linear setting which have typically relied upon tools that are non-existent in the sublinear framework, for example, characteristic functions.
Submitted 27 June, 2016; v1 submitted 28 September, 2014;
originally announced September 2014.
-
Comparing the $G$-Normal Distribution to its Classical Counterpart
Authors:
Erhan Bayraktar,
Alexander Munk
Abstract:
In one dimension, the theory of the $G$-normal distribution is well-developed, and many results from the classical setting have a nonlinear counterpart. Significant challenges remain in multiple dimensions, and some of what has already been discovered is quite nonintuitive. By answering several classically-inspired questions concerning independence, covariance uncertainty, and behavior under certain linear operations, we continue to highlight the fascinating range of unexpected attributes of the multidimensional $G$-normal distribution.
Submitted 2 December, 2014; v1 submitted 18 July, 2014;
originally announced July 2014.
-
Persistence Barcodes versus Kolmogorov Signatures: Detecting Modes of One-Dimensional Signals
Authors:
Ulrich Bauer,
Axel Munk,
Hannes Sieling,
Max Wardetzky
Abstract:
We investigate the problem of estimating the number of modes (i.e., local maxima) - a well known question in statistical inference - and we show how to do so without presmoothing the data. To this end, we modify the ideas of persistence barcodes by first relating persistence values in dimension one to distances (with respect to the supremum norm) to the sets of functions with a given number of modes, and subsequently working with norms different from the supremum norm. As a particular case we investigate the Kolmogorov norm. We argue that this modification has certain statistical advantages. We offer confidence bands for the attendant Kolmogorov signatures, thereby allowing for the selection of relevant signatures with a statistically controllable error. As a result of independent interest, we show that taut strings minimize the number of critical points for a very general class of functions. We illustrate our results by several numerical examples.
Submitted 27 January, 2015; v1 submitted 4 April, 2014;
originally announced April 2014.
-
Aggregated motion estimation for real-time MRI reconstruction
Authors:
Housen Li,
Markus Haltmeier,
Shuo Zhang,
Jens Frahm,
Axel Munk
Abstract:
Real-time magnetic resonance imaging (MRI) methods generally shorten the measuring time by acquiring less data than needed according to the sampling theorem. In order to obtain a proper image from such undersampled data, the reconstruction is commonly defined as the solution of an inverse problem, which is regularized by a priori assumptions about the object. While practical realizations have hitherto been surprisingly successful, strong assumptions about the continuity of image features may affect the temporal fidelity of the estimated images. Here we propose a novel approach for the reconstruction of serial real-time MRI data which integrates the deformations between nearby frames into the data consistency term. The method is not required to be affine or rigid and does not need additional measurements. Moreover, it handles multi-channel MRI data by simultaneously determining the image and its coil sensitivity profiles in a nonlinear formulation which also adapts to non-Cartesian (e.g., radial) sampling schemes. Experimental results of a motion phantom with controlled speed and in vivo measurements of rapid tongue movements demonstrate image improvements in preserving temporal fidelity and removing residual artifacts.
Submitted 4 December, 2013; v1 submitted 18 April, 2013;
originally announced April 2013.
-
Multiscale Change-Point Inference
Authors:
Klaus Frick,
Axel Munk,
Hannes Sieling
Abstract:
We introduce a new estimator SMUCE (simultaneous multiscale change-point estimator) for the change-point problem in exponential family regression. An unknown step function is estimated by minimizing the number of change-points over the acceptance region of a multiscale test at a level α. The probability of overestimating the true number of change-points K is controlled by the asymptotic null distribution of the multiscale test statistic. Further, we derive exponential bounds for the probability of underestimating K. By balancing these quantities, α will be chosen such that the probability of correctly estimating K is maximized. All results are even non-asymptotic for the normal case. Based on the aforementioned bounds, we construct asymptotically honest confidence sets for the unknown step function and its change-points. At the same time, we obtain exponential bounds for estimating the change-point locations which for example yield the minimax rate O(1/n) up to a log term. Finally, SMUCE asymptotically achieves the optimal detection rate of vanishing signals. We illustrate how dynamic programming techniques can be employed for efficient computation of estimators and confidence regions. The performance of the proposed multiscale approach is illustrated by simulations and in two cutting-edge applications from genetic engineering and photoemission spectroscopy.
Submitted 12 August, 2013; v1 submitted 30 January, 2013;
originally announced January 2013.
-
Extreme Value Analysis of Empirical Frame Coefficients and Implications for Denoising by Soft-Thresholding
Authors:
Markus Haltmeier,
Axel Munk
Abstract:
Denoising by frame thresholding is one of the most basic and efficient methods for recovering a discrete signal or image from data that are corrupted by additive Gaussian white noise. The basic idea is to select a frame of analyzing elements that separates the data in few large coefficients due to the signal and many small coefficients mainly due to the noise ε_n. Removing all data coefficients being in magnitude below a certain threshold yields a reconstruction of the original signal. In order to properly balance the amount of noise to be removed and the relevant signal features to be kept, a precise understanding of the statistical properties of thresholding is important. For that purpose we derive the asymptotic distribution of max_{ω\in Ω_n} |<φ_ω^n,ε_n>| for a wide class of redundant frames {φ_ω^n: ω\in Ω_n}. Based on our theoretical results we give a rationale for universal extreme value thresholding techniques yielding asymptotically sharp confidence regions and smoothness estimates corresponding to prescribed significance levels. The results cover many frames used in imaging and signal recovery applications, such as redundant wavelet systems, curvelet frames, or unions of bases. We show that `generically' a standard Gumbel law results as it is known from the case of orthonormal wavelet bases. However, for specific highly redundant frames other limiting laws may occur. We indeed verify that the translation invariant wavelet transform shows a different asymptotic behaviour.
Submitted 2 August, 2013; v1 submitted 10 May, 2012;
originally announced May 2012.
-
Multiscale Methods for Shape Constraints in Deconvolution: Confidence Statements for Qualitative Features
Authors:
Johannes Schmidt-Hieber,
Axel Munk,
Lutz Duembgen
Abstract:
We derive multiscale statistics for deconvolution in order to detect qualitative features of the unknown density. An important example covered within this framework is to test for local monotonicity on all scales simultaneously. We investigate the moderately ill-posed setting, where the Fourier transform of the error density in the deconvolution model is of polynomial decay. For multiscale testing, we consider a calibration, motivated by the modulus of continuity of Brownian motion. We investigate the performance of our results from both the theoretical and simulation based point of view. A major consequence of our work is that the detection of qualitative features of a density in a deconvolution problem is a doable task although the minimax rates for pointwise estimation are very slow.
Submitted 17 December, 2012; v1 submitted 7 July, 2011;
originally announced July 2011.
-
Statistical Multiresolution Dantzig Estimation in Imaging: Fundamental Concepts and Algorithmic Framework
Authors:
Klaus Frick,
Philipp Marnitz,
Axel Munk
Abstract:
In this paper we are concerned with fully automatic and locally adaptive estimation of functions in a "signal + noise"-model where the regression function may additionally be blurred by a linear operator, e.g. by a convolution. To this end, we introduce a general class of statistical multiresolution estimators and develop an algorithmic framework for computing those. By this we mean estimators that are defined as solutions of convex optimization problems with supremum-type constraints. We employ a combination of the alternating direction method of multipliers with Dykstra's algorithm for computing orthogonal projections onto intersections of convex sets and prove numerical convergence. The capability of the proposed method is illustrated by various examples from imaging and signal detection.
Submitted 1 February, 2012; v1 submitted 23 January, 2011;
originally announced January 2011.
-
Möbius deconvolution on the hyperbolic plane with application to impedance density estimation
Authors:
Stephan F. Huckemann,
Peter T. Kim,
Ja-Yong Koo,
Axel Munk
Abstract:
In this paper we consider a novel statistical inverse problem on the Poincaré, or Lobachevsky, upper (complex) half plane. Here the Riemannian structure is hyperbolic and a transitive group action comes from the space of $2\times2$ real matrices of determinant one via Möbius transformations. Our approach is based on a deconvolution technique which relies on the Helgason--Fourier calculus adapted to this hyperbolic space. This gives a minimax nonparametric density estimator of a hyperbolic density that is corrupted by a random Möbius transform. A motivation for this work comes from the reconstruction of impedances of capacitors where the above scenario on the Poincaré plane exactly describes the physical system that is of statistical interest.
Submitted 20 October, 2010;
originally announced October 2010.
-
Adaptive wavelet estimation of the diffusion coefficient under additive error measurements
Authors:
Marc Hoffmann,
Axel Munk,
Johannes Schmidt-Hieber
Abstract:
We study nonparametric estimation of the diffusion coefficient from discrete data, when the observations are blurred by additional noise. Such issues have been developed over the last 10 years in several application fields and in particular in high frequency financial data modelling, however mainly from a parametric and semiparametric point of view. This paper addresses the nonparametric estimation of the path of the (possibly stochastic) diffusion coefficient in a relatively general setting. By developing pre-averaging techniques combined with wavelet thresholding, we construct adaptive estimators that achieve a nearly optimal rate within a large scale of smoothness constraints of Besov type. Since the diffusion coefficient is usually genuinely random, we propose a new criterion to assess the quality of estimation; we retrieve the usual minimax theory when this approach is restricted to a deterministic diffusion coefficient. In particular, we take advantage of recent results of Reiss [33] of asymptotic equivalence between a Gaussian diffusion with additive noise and Gaussian white noise model, in order to prove a sharp lower bound.
Submitted 29 December, 2011; v1 submitted 27 July, 2010;
originally announced July 2010.