-
Statistics for Phylogenetic Trees in the Presence of Stickiness
Authors:
Lars Lammers,
Tom M. W. Nye,
Stephan F. Huckemann
Abstract:
Samples of phylogenetic trees arise in a variety of evolutionary and biomedical applications, and the Fréchet mean in Billera-Holmes-Vogtmann tree space is a summary tree shown to have advantages over other mean or consensus trees. However, use of the Fréchet mean raises computational and statistical issues which we explore in this paper. The Fréchet sample mean is known often to contain fewer internal edges than the trees in the sample, and in this circumstance calculating the mean by iterative schemes can be problematic due to slow convergence. We present new methods for identifying edges which must lie in the Fréchet sample mean and apply these to a data set of gene trees relating organisms from the apicomplexa which cause a variety of parasitic infections. When a sample of trees contains a significant level of heterogeneity in the branching patterns, or topologies, displayed by the trees, the Fréchet mean is often a star tree, lacking any internal edges. In this situation, among others, the population Fréchet mean is affected by a non-Euclidean phenomenon called stickiness, which affects asymptotics, and we examine two data sets for which the mean tree is a star tree. The first consists of trees representing the physical shape of artery structures in a sample of medical images of human brains in which the branching patterns are very diverse. The second consists of gene trees from a population of baboons in which there is evidence of substantial hybridization. We develop hypothesis tests which work in the presence of stickiness. The first is a test for the presence of a given edge in the Fréchet population mean; the second is a two-sample test for differences in two distributions which share the same sticky population mean.
Submitted 4 July, 2024;
originally announced July 2024.
-
A Lower Bound for Estimating Fréchet Means
Authors:
Shayan Hundrieser,
Benjamin Eltzner,
Stephan F. Huckemann
Abstract:
Fréchet means, conceptually appealing, generalize the Euclidean expectation to general metric spaces. We explore how well Fréchet means can be estimated from independent and identically distributed samples and uncover a fundamental limitation: In the vicinity of a probability distribution $P$ with nonunique means, independent of sample size, it is not possible to uniformly estimate Fréchet means below a precision determined by the diameter of the set of Fréchet means of $P$. Implications were previously identified for empirical plug-in estimators as part of the phenomenon \emph{finite sample smeariness}. Our findings thus confirm inevitable statistical challenges in the estimation of Fréchet means on metric spaces for which there exist distributions with nonunique means. Illustrating the relevance of our lower bound, examples of extrinsic, intrinsic, Procrustes, diffusion and Wasserstein means showcase either deteriorating constants or slow convergence rates of empirical Fréchet means for samples near the regime of nonunique means.
Submitted 19 February, 2024;
originally announced February 2024.
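For orientation, the standard definition underlying this abstract can be written as follows. The notation ($M$, $d$, $X$, $E$, $E_n$) is an assumption chosen here for illustration and is not taken from the paper. For a random variable $X$ on a metric space $(M, d)$ and an i.i.d. sample $X_1, \dots, X_n$, the sets of population and sample Fréchet means are

$$E = \operatorname*{arg\,min}_{p \in M} \mathbb{E}\big[d(p, X)^2\big], \qquad E_n = \operatorname*{arg\,min}_{p \in M} \frac{1}{n} \sum_{i=1}^{n} d(p, X_i)^2 .$$

In $\mathbb{R}^m$ with the Euclidean distance, $E$ reduces to the expectation $\mathbb{E}[X]$ and $E_n$ to the arithmetic mean. The lower bound described above concerns distributions for which $E$ contains more than one point, so that its diameter is positive.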
-
Sticky Flavors
Authors:
Lars Lammers,
Do Tran Van,
Stephan F. Huckemann
Abstract:
The Fréchet mean, a generalization to a metric space of the expectation of a random variable in a vector space, can exhibit unexpected behavior for a wide class of random variables. For instance, it can stick to a point (more generally to a closed set) under resampling: sample stickiness. It can stick to a point for topologically nearby distributions: topological stickiness, such as total variation or Wasserstein stickiness. It can stick to a point for slight but arbitrary perturbations: perturbation stickiness. Here, we explore these and various other flavors of stickiness and their relationship in varying scenarios, for instance on CAT($κ$) spaces, $κ\in \mathbb{R}$. Interestingly, modulation stickiness (faster asymptotic rate than $\sqrt{n}$) and directional stickiness (a generalization of moment stickiness from the literature) allow for the development of new statistical methods building on an asymptotic fluctuation, where, due to stickiness, the mean itself features no asymptotic fluctuation. Also, we rule out sticky flavors on manifolds in scenarios with curvature bounds.
Submitted 15 November, 2023;
originally announced November 2023.
-
Exploring Uniform Finite Sample Stickiness
Authors:
Susanne Ulmer,
Do Tran Van,
Stephan F. Huckemann
Abstract:
It is well known that Fréchet means on non-Euclidean spaces may exhibit nonstandard asymptotic rates depending on curvature. Even for distributions featuring standard asymptotic rates, there are non-Euclidean effects, altering finite sampling rates up to considerable sample sizes. These effects can be measured by the variance modulation function proposed by Pennec (2019). Among others, in view of statistical inference, it is important to bound this function on intervals of sample sizes. As a first step in this direction, for the special case of a K-spider we give such an interval, based only on folded moments and total probabilities of spider legs, and illustrate the method by simulations.
Submitted 17 May, 2023;
originally announced May 2023.
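To make the role of folded moments on a K-spider concrete, here is a minimal sketch of the sample Fréchet mean on a spider with K half-line legs glued at the origin. It is not the paper's method for bounding the variance modulation function. The function name `spider_frechet_mean` and the encoding of points as (leg index, distance from origin) are assumptions made for illustration. The sketch uses the fact that the distance between points on different legs is the sum of their radii, so the Fréchet function restricted to leg k is minimized at the positive part of the folded mean of that leg, and at most one leg can have a positive folded mean.

```python
from typing import List, Optional, Tuple


def spider_frechet_mean(
    legs: List[int], radii: List[float], num_legs: int
) -> Tuple[Optional[int], float]:
    """Sample Fréchet mean on a K-spider (hypothetical helper).

    Each data point is given by its leg index in range(num_legs) and its
    distance from the origin. Returns (leg, position) if the mean lies
    strictly inside a leg, or (None, 0.0) if the mean is the origin.
    """
    n = len(radii)
    if n == 0 or len(legs) != n:
        raise ValueError("legs and radii must be non-empty and of equal length")

    total = sum(radii)
    for k in range(num_legs):
        # Folded mean of leg k: points on leg k count positively,
        # points on all other legs count negatively.
        on_leg = sum(r for leg, r in zip(legs, radii) if leg == k)
        folded_mean = (on_leg - (total - on_leg)) / n
        if folded_mean > 0:
            # At most one leg has a positive folded mean; the mean sits there.
            return k, folded_mean
    # No leg pulls strongly enough: the mean sticks to the origin.
    return None, 0.0
```

For a balanced sample such as one point at distance 1 on each of three legs, every folded mean is negative and the function returns the origin. Small perturbations of the radii leave the result unchanged, which is the sticky behavior in its simplest form.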
-
Types of Stickiness in BHV Phylogenetic Tree Spaces and Their Degree
Authors:
Lars Lammers,
Do Tran Van,
Tom M. W. Nye,
Stephan F. Huckemann
Abstract:
It has been observed that the sample mean of certain probability distributions in Billera-Holmes-Vogtmann (BHV) phylogenetic spaces is confined to a lower-dimensional subspace for large enough sample size. This non-standard behavior has been called stickiness and poses difficulties in statistical applications when comparing samples of sticky distributions. We extend previous results on stickiness to show the equivalence of this sampling behavior to topological conditions in the special case of BHV spaces. Furthermore, we propose to ease the statistical comparison of sticky distributions by including the directional derivatives of the Fréchet function: the degree of stickiness.
Submitted 11 April, 2023;
originally announced April 2023.
-
Foundations of the Wald Space for Phylogenetic Trees
Authors:
Jonas Lueg,
Maryam K. Garba,
Tom M. W. Nye,
Stephan F. Huckemann
Abstract:
Evolutionary relationships between species are represented by phylogenetic trees, but these relationships are subject to uncertainty due to the random nature of evolution. A geometry for the space of phylogenetic trees is necessary in order to properly quantify this uncertainty during the statistical analysis of collections of possible evolutionary trees inferred from biological data. Recently, the wald space has been introduced: a length space for trees which is a certain subset of the manifold of symmetric positive definite matrices. In this work, the wald space is introduced formally and its topology and structure are studied in detail. In particular, we show that wald space has the topology of a disjoint union of open cubes, it is contractible, and by careful characterization of cube boundaries, we demonstrate that wald space is a Whitney stratified space of type (A). Imposing the metric induced by the affine invariant metric on symmetric positive definite matrices, we prove that wald space is a geodesic Riemann stratified space. A new numerical method is proposed and investigated for construction of geodesics, computation of Fréchet means and calculation of curvature in wald space. This work is intended to serve as a mathematical foundation for further geometric and statistical research on this space.
Submitted 12 September, 2022;
originally announced September 2022.
-
Diffusion Means in Geometric Spaces
Authors:
Benjamin Eltzner,
Pernille Hansen,
Stephan F. Huckemann,
Stefan Sommer
Abstract:
We introduce a location statistic for distributions on non-linear geometric spaces, the diffusion mean, serving as an extension and an alternative to the Fréchet mean. The diffusion mean arises as the generalization of Gaussian maximum likelihood analysis to non-linear spaces by maximizing the likelihood of a Brownian motion. The diffusion mean depends on a time parameter $t$, which admits the interpretation of the allowed variance of the diffusion. The diffusion $t$-mean of a distribution $X$ is the most likely origin of a Brownian motion at time $t$, given the end-point distribution $X$. We give a detailed description of the asymptotic behavior of the diffusion estimator and provide sufficient conditions for the diffusion estimator to be strongly consistent. Particularly, we present a smeary central limit theorem for diffusion means and we show that joint estimation of the mean and diffusion variance rules out smeariness in all directions simultaneously in general situations. Furthermore, we investigate properties of the diffusion mean for distributions on the sphere $\mathbb S^n$. Experimentally, we consider simulated data and data from magnetic pole reversals, all indicating similar or improved convergence rate compared to the Fréchet mean. Here, we additionally estimate $t$ and consider its effects on smeariness and uniqueness of the diffusion mean for distributions on the sphere.
Submitted 4 December, 2022; v1 submitted 25 May, 2021;
originally announced May 2021.
-
Clustering Schemes on the Torus with Application to RNA Clashes
Authors:
Henrik Wiechers,
Benjamin Eltzner,
Stephan F. Huckemann,
Kanti V. Mardia
Abstract:
Molecular structures of RNA molecules reconstructed from X-ray crystallography frequently contain errors. Motivated by this problem we examine clustering on a torus since RNA shapes can be described by dihedral angles. A previously developed clustering method for torus data involves two tuning parameters and we assess clustering results for different parameter values in relation to the problem of so-called RNA clashes. This clustering problem is part of the dynamically evolving field of statistics on manifolds. Statistical problems on the torus highlight general challenges for statistics on manifolds. Therefore, the torus PCA and clustering methods we propose make an important contribution to directional statistics and statistics on manifolds in general.
Submitted 28 February, 2021;
originally announced April 2021.
-
Finite Sample Smeariness on Spheres
Authors:
Benjamin Eltzner,
Shayan Hundrieser,
Stephan F. Huckemann
Abstract:
Finite Sample Smeariness (FSS) has been recently discovered. It means that the distribution of sample Fréchet means of underlying rather unsuspicious random variables can behave as if it were smeary for quite large regimes of finite sample sizes. In effect, classical quantile-based statistical testing procedures do not preserve nominal size; they reject too often under the null hypothesis. Suitably designed bootstrap tests, however, correct for FSS. On the circle it has been known that arbitrarily sized FSS is possible, and that all distributions with a nonvanishing density feature FSS. These results are extended to spheres of arbitrary dimension. In particular, all rotationally symmetric distributions, not necessarily supported on the entire sphere, feature FSS of Type I. While on the circle there is also FSS of Type II, it is conjectured that this is not possible on higher-dimensional spheres.
Submitted 28 February, 2021;
originally announced March 2021.
-
Generalized Intersection Algorithms with Fixpoints for Image Decomposition Learning
Authors:
Robin Richter,
Duy H. Thai,
Stephan F. Huckemann
Abstract:
In image processing, classical methods minimize a suitable functional that balances between computational feasibility (convexity of the functional is ideal) and suitable penalties reflecting the desired image decomposition. The fact that algorithms derived from such minimization problems can be used to construct (deep) learning architectures has spurred the development of algorithms that can be trained for a specifically desired image decomposition, e.g. into cartoon and texture. While many such methods are very successful, theoretical guarantees are only scarcely available. To this end, in this contribution, we formalize a general class of intersection point problems encompassing a wide range of (learned) image decomposition models, and we give an existence result for a large subclass of such problems, i.e. giving the existence of a fixpoint of the corresponding algorithm. This class generalizes classical model-based variational problems, such as the TV-l2 model or the more general TV-Hilbert model. To illustrate the potential for learned algorithms, novel (non-learned) choices within our class show comparable results in denoising and texture removal.
Submitted 16 October, 2020;
originally announced October 2020.
-
Finite Sample Smeariness of Fréchet Means and Application to Climate
Authors:
Shayan Hundrieser,
Benjamin Eltzner,
Stephan F. Huckemann
Abstract:
Fréchet means on non-Euclidean spaces may exhibit nonstandard asymptotic rates rendering quantile-based asymptotic inference inapplicable. We show here that this affects, among others, all circular distributions whose support exceeds a half circle. We exhaustively describe this phenomenon and introduce a new concept which we call finite sample smeariness (FSS). In the presence of FSS, it turns out that quantile-based tests for equality of Fréchet means systematically feature effective levels higher than their nominal level, which persists asymptotically in case of Type I FSS. In contrast, suitable bootstrap-based tests correct for FSS and asymptotically attain the correct level. For illustration of the relevance of FSS in real data, we apply our method to directional wind data from two European cities. It turns out that quantile-based tests, not correcting for FSS, find a multitude of significant wind changes. This multitude condenses to a few years featuring significant wind changes when our bootstrap tests, which correct for FSS, are applied.
Submitted 26 July, 2021; v1 submitted 5 May, 2020;
originally announced May 2020.
-
Information geometry for phylogenetic trees
Authors:
Maryam K. Garba,
Tom M. W. Nye,
Jonas Lueg,
Stephan F. Huckemann
Abstract:
We propose a new space of phylogenetic trees which we call wald space. The motivation is to develop a space suitable for statistical analysis of phylogenies, but with a geometry based on more biologically principled assumptions than existing spaces: in wald space, trees are close if they induce similar distributions on genetic sequence data. As a point set, wald space contains the previously developed Billera-Holmes-Vogtmann (BHV) tree space; it also contains disconnected forests, like the edge-product (EP) space but without certain singularities of the EP space. We investigate two related geometries on wald space. The first is the geometry of the Fisher information metric of character distributions induced by the two-state symmetric Markov substitution process on each tree. Infinitesimally, the metric is proportional to the Kullback-Leibler divergence, or equivalently, as we show, to any $f$-divergence. The second geometry is obtained analogously but using a related continuous-valued Gaussian process on each tree, and it can be viewed as the trace metric of the affine-invariant metric for covariance matrices. We derive a gradient descent algorithm to project from the ambient space of covariance matrices to wald space. For both geometries we derive computational methods to compute geodesics in polynomial time and show numerically that the two information geometries (discrete and continuous) are very similar. In particular geodesics are approximated extrinsically. Comparison with the BHV geometry shows that our canonical and biologically motivated space is substantially different.
Submitted 17 September, 2020; v1 submitted 29 March, 2020;
originally announced March 2020.
-
Confidence Tubes for Curves on SO(3) and Identification of Subject-Specific Gait Change after Kneeling
Authors:
Fabian J. E. Telschow,
Michael R. Pierrynowski,
Stephan F. Huckemann
Abstract:
In order to identify changes of gait patterns, e.g. due to prolonged occupational kneeling, which is believed to be a major risk factor, among others, for the development of knee osteoarthritis, we develop confidence tubes for curves following a Gaussian perturbation model on SO(3). These are based on an application of the Gaussian kinematic formula to a process of Hotelling statistics and we approximate them by a computable version, for which we show convergence. Simulations endorse our method, which in application to gait curves from eight volunteers undergoing kneeling tasks, identifies phases of the gait cycle that have changed due to kneeling tasks. We find that after kneeling, deviation from normal gait is stronger, in particular for older aged male volunteers. Notably our method adjusts for different walking speeds and marker replacement at different visits.
Submitted 14 September, 2019;
originally announced September 2019.
-
Stability of the Cut Locus and a Central Limit Theorem for Fréchet Means of Riemannian Manifolds
Authors:
Benjamin Eltzner,
Fernando Galaz-Garcia,
Stephan F. Huckemann,
Wilderich Tuschmann
Abstract:
We obtain a Central Limit Theorem for closed Riemannian manifolds, clarifying along the way the geometric meaning of some of the hypotheses in Bhattacharya and Lin's Omnibus Central Limit Theorem for Fréchet means. We obtain our CLT assuming a certain stability hypothesis for the cut locus, which always holds when the manifold is compact but may not be satisfied in the non-compact case.
Submitted 4 September, 2019; v1 submitted 1 September, 2019;
originally announced September 2019.
-
A Smeary Central Limit Theorem for Manifolds with Application to High Dimensional Spheres
Authors:
Benjamin Eltzner,
Stephan F. Huckemann
Abstract:
The central limit theorems (CLTs) for generalized Fréchet means (data descriptors assuming values in stratified spaces, such as intrinsic means, geodesics, etc.) on manifolds from the literature are only valid if a certain empirical process of Hessians of the Fréchet function converges suitably, as in the proof of the prototypical BP-CLT (Bhattacharya and Patrangenaru (2005)). This is not valid in many realistic scenarios and we provide a new, very general CLT. In particular this includes scenarios where, in a suitable chart, the sample mean fluctuates asymptotically at a scale $n^α$ with exponents $α < 1/2$ with a non-normal distribution. As the BP-CLT yields only fluctuations that are, rescaled with $n^{1/2}$, asymptotically normal, just as the classical CLT for random vectors, these lower rates, somewhat loosely called smeariness, had to date been observed only on the circle (Hotz and Huckemann (2015)). We make the concept of smeariness on manifolds precise, give an example for two-smeariness on spheres of arbitrary dimension, and show that smeariness, although "almost never" occurring, may have serious statistical implications on a continuum of sample scenarios nearby. In fact, this effect increases with dimension, striking in particular in high dimension low sample size scenarios.
Submitted 19 January, 2018;
originally announced January 2018.
-
Detecting Anisotropy in Fingerprint Growth
Authors:
Karla Markert,
Karolin Krehl,
Carsten Gottschlich,
Stephan F. Huckemann
Abstract:
From infancy to adulthood, human growth is anisotropic, much more along the proximal-distal axis (height) than along the medial-lateral axis (width), particularly at extremities. Detecting and modeling the rate of anisotropy in fingerprint growth, and possibly other growth patterns as well, facilitates the use of children's fingerprints for long-term biometric identification. Using standard fingerprint scanners, anisotropic growth is highly overshadowed by the varying distortions created by each imprint, and it seems that this difficulty has hampered to date the development of suitable methods for detecting anisotropy, let alone designing models. We provide a tool chain to statistically detect, with a given confidence, anisotropic growth in fingerprints and its preferred axis, where we only require a standard fingerprint scanner and a minutiae matcher. We build on a perturbation model, a new Procrustes-type algorithm, use and develop several parametric and non-parametric tests for different hypotheses, in particular for neighborhood hypotheses to detect the axis of anisotropy, where the latter tests are tunable to measurement accuracy. Taking into account realistic distortions caused by pressing fingers on scanners, our simulations based on real data indicate that, for example, already in rather small samples (56 matches) we can significantly detect proximal-distal growth if it exceeds medial-lateral growth by only around 5 percent. Our method is well applicable to future datasets of children's fingerprint time series and we provide an implementation of our algorithms and tests with matched minutiae pattern data.
Submitted 19 January, 2018;
originally announced January 2018.
-
Functional Inference on Rotational Curves and Identification of Human Gait at the Knee Joint
Authors:
Fabian J. E. Telschow,
Stephan F. Huckemann,
Michael R. Pierrynowski
Abstract:
We extend Gaussian perturbation models in classical functional data analysis to the three-dimensional rotational group where a zero-mean Gaussian process in the Lie algebra under the Lie exponential spreads multiplicatively around a central curve. As an estimator, we introduce point-wise extrinsic mean curves which feature strong perturbation consistency, and which are asymptotically a.s. unique and differentiable, if the model is so. Further, we consider the group action of time warping and that of spatial isometries that are connected to the identity. The latter can be asymptotically consistently estimated if lifted to the unit quaternions. Introducing a generic loss for Lie groups, the former can be estimated, and based on curve length, due to asymptotic differentiability, we propose two-sample permutation tests involving various combinations of the group actions. This methodology allows inference on gait patterns due to the rotational motion of the lower leg with respect to the upper leg. This was previously not possible because, among others, the usual analysis of separate Euler angles is not independent of marker placement, even if performed by trained specialists.
Submitted 11 November, 2016;
originally announced November 2016.
-
Backward Nested Descriptors Asymptotics with Inference on Stem Cell Differentiation
Authors:
Stephan F. Huckemann,
Benjamin Eltzner
Abstract:
For sequences of random backward nested subspaces as occur, say, in dimension reduction for manifold or stratified space valued data, asymptotic results are derived. In fact, we formulate our results more generally for backward nested families of descriptors (BNFD). Under rather general conditions, asymptotic strong consistency holds. Under additional, still rather general hypotheses, among them existence of a.s. local twice differentiable charts, asymptotic joint normality of a BNFD can be shown. If charts factor suitably, this leads to individual asymptotic normality for the last element, a principal nested mean or a principal nested geodesic, say. It turns out that these results pertain to principal nested spheres (PNS) and principal nested great subsphere (PNGS) analysis by Jung et al. (2010) as well as to the intrinsic mean on a first geodesic principal component (IMo1GPC) for manifolds and Kendall's shape spaces. A nested bootstrap two-sample test is derived and illustrated with simulations. In a study on real data, PNGS is applied to track early human mesenchymal stem cell differentiation over a coarse time grid and, among others, to locate a change point with direct consequences for the design of further studies.
Submitted 3 September, 2016;
originally announced September 2016.
-
Möbius deconvolution on the hyperbolic plane with application to impedance density estimation
Authors:
Stephan F. Huckemann,
Peter T. Kim,
Ja-Yong Koo,
Axel Munk
Abstract:
In this paper we consider a novel statistical inverse problem on the Poincaré, or Lobachevsky, upper (complex) half plane. Here the Riemannian structure is hyperbolic and a transitive group action comes from the space of $2\times2$ real matrices of determinant one via Möbius transformations. Our approach is based on a deconvolution technique which relies on the Helgason--Fourier calculus adapted to this hyperbolic space. This gives a minimax nonparametric density estimator of a hyperbolic density that is corrupted by a random Möbius transform. A motivation for this work comes from the reconstruction of impedances of capacitors where the above scenario on the Poincaré plane exactly describes the physical system that is of statistical interest.
Submitted 20 October, 2010;
originally announced October 2010.