-
Rapid and Precise Topological Comparison with Merge Tree Neural Networks
Authors:
Yu Qin,
Brittany Terese Fasy,
Carola Wenk,
Brian Summa
Abstract:
Merge trees are a valuable tool in scientific visualization of scalar fields; however, current methods for merge tree comparisons are computationally expensive, primarily due to the exhaustive matching between tree nodes. To address this challenge, we introduce the merge tree neural networks (MTNN), a learned neural network model designed for merge tree comparison. The MTNN enables rapid and high-quality similarity computation. We first demonstrate how graph neural networks (GNNs), which emerged as an effective encoder for graphs, can be trained to produce embeddings of merge trees in vector spaces that enable efficient similarity comparison. Next, we formulate the novel MTNN model that further improves the similarity comparisons by integrating the tree and node embeddings with a new topological attention mechanism. We demonstrate the effectiveness of our model on real-world data in different domains and examine our model's generalizability across various datasets. Our experimental analysis demonstrates our approach's superiority in accuracy and efficiency. In particular, we speed up the prior state-of-the-art by more than 100x on the benchmark datasets while maintaining an error rate below 0.1%.
Submitted 8 April, 2024;
originally announced April 2024.
-
How Small Can Faithful Sets Be? Ordering Topological Descriptors
Authors:
Brittany Terese Fasy,
David L. Millman,
Anna Schenfisch
Abstract:
Recent developments in shape reconstruction and comparison call for the use of many different (topological) descriptor types, such as persistence diagrams and Euler characteristic functions. We establish a framework to quantitatively compare the strength of different descriptor types, setting up a theory that allows for future comparisons and analysis of descriptor types and that can inform choices made in applications. We use this framework to partially order a set of six common descriptor types. We then give lower bounds on the size of sets of descriptors that uniquely correspond to simplicial complexes, giving insight into the advantages of using verbose rather than concise topological descriptors.
Submitted 8 July, 2024; v1 submitted 21 February, 2024;
originally announced February 2024.
-
The Manifold Density Function: An Intrinsic Method for the Validation of Manifold Learning
Authors:
Benjamin Holmgren,
Eli Quist,
Jordan Schupbach,
Brittany Terese Fasy,
Bastian Rieck
Abstract:
We introduce the manifold density function, which is an intrinsic method to validate manifold learning techniques. Our approach adapts and extends Ripley's $K$-function, and categorizes in an unsupervised setting the extent to which an output of a manifold learning algorithm captures the structure of a latent manifold. Our manifold density function generalizes to broad classes of Riemannian manifolds. In particular, we extend the manifold density function to general two-manifolds using the Gauss-Bonnet theorem, and demonstrate that the manifold density function for hypersurfaces is well approximated using the first Laplacian eigenvalue. We prove desirable convergence and robustness properties.
Submitted 14 February, 2024;
originally announced February 2024.
-
Visualizing Topological Importance: A Class-Driven Approach
Authors:
Yu Qin,
Brittany Terese Fasy,
Carola Wenk,
Brian Summa
Abstract:
This paper presents the first approach to visualize the importance of topological features that define classes of data. Topological features, with their ability to abstract the fundamental structure of complex data, are an integral component of visualization and analysis pipelines. However, not all topological features present in data are of equal importance. To date, the default definition of feature importance is often assumed and fixed. This work shows how proven explainable deep learning approaches can be adapted for use in topological classification. In doing so, it provides the first technique that illuminates what topological structures are important in each dataset with regard to their class label. In particular, the approach uses a learned metric classifier with a density estimator of the points of a persistence diagram as input. This metric learns how to reweigh this density such that classification accuracy is high. By extracting this weight, an importance field on persistent point density can be created. This provides an intuitive representation of persistence point importance that can be used to drive new visualizations. This work provides two examples: visualization on each diagram directly and, in the case of sublevel set filtrations on images, directly on the images themselves. This work highlights real-world examples of this approach visualizing the important topological features in graph, 3D shape, and medical image data.
Submitted 22 September, 2023;
originally announced September 2023.
-
From Curves to Words and Back Again: Geometric Computation of Minimum-Area Homotopy
Authors:
Hsien-Chih Chang,
Brittany Terese Fasy,
Bradley McCoy,
David L. Millman,
Carola Wenk
Abstract:
Let $γ$ be a generic closed curve in the plane. Samuel Blank, in his 1967 Ph.D. thesis, determined if $γ$ is self-overlapping by geometrically constructing a combinatorial word from $γ$. More recently, Zipei Nie, in an unpublished manuscript, computed the minimum homotopy area of $γ$ by constructing a combinatorial word algebraically. We provide a unified framework for working with both words and determine the settings under which Blank's word and Nie's word are equivalent. Using this equivalence, we give a new geometric proof for the correctness of Nie's algorithm. Unlike previous work, our proof is constructive, which allows us to naturally compute the actual homotopy that realizes the minimum area. Furthermore, we contribute to the theory of self-overlapping curves by providing the first polynomial-time algorithm to compute a self-overlapping decomposition of any closed curve $γ$ with minimum area.
Submitted 5 September, 2023;
originally announced September 2023.
-
Metric and Path-Connectedness Properties of the Frechet Distance for Paths and Graphs
Authors:
Erin Chambers,
Brittany Fasy,
Benjamin Holmgren,
Sushovan Majhi,
Carola Wenk
Abstract:
The Frechet distance is often used to measure distances between paths, with applications in areas ranging from map matching to GPS trajectory analysis to handwriting recognition. More recently, the Frechet distance has been generalized to a distance between two copies of the same graph embedded or immersed in a metric space; this more general setting opens up a wide range of more complex applications in graph analysis. In this paper, we initiate a study of some of the fundamental topological properties of spaces of paths and of graphs mapped to R^n under the Frechet distance, in an effort to lay the theoretical groundwork for understanding how these distances can be used in practice. In particular, we prove whether or not these spaces, and the metric balls therein, are path-connected.
Submitted 1 August, 2023;
originally announced August 2023.
-
The Weighted Euler Characteristic Transform for Image Shape Classification
Authors:
Jessi Cisewski-Kehe,
Brittany Terese Fasy,
Dhanush Giriyan,
Eli Quist
Abstract:
The weighted Euler characteristic transform (WECT) is a new tool for extracting shape information from data equipped with a weight function. Image data may benefit from the WECT where the intensity of the pixels is used to define the weight function. In this work, an empirical assessment of the WECT's ability to distinguish shapes on images with different pixel intensity distributions is considered, along with visualization techniques to improve the intuition and understanding of what is captured by the WECT. Additionally, the expected weighted Euler characteristic and the expected WECT are derived.
Submitted 25 July, 2023;
originally announced July 2023.
-
Efficient Graph Reconstruction and Representation Using Augmented Persistence Diagrams
Authors:
Brittany Terese Fasy,
Samuel Micka,
David L. Millman,
Anna Schenfisch,
Lucia Williams
Abstract:
Persistent homology is a tool that can be employed to summarize the shape of data by quantifying homological features. When the data is an object in $\mathbb{R}^d$, the (augmented) persistent homology transform ((A)PHT) is a family of persistence diagrams, parameterized by directions in the ambient space. A recent advance in understanding the PHT used the framework of reconstruction in order to find a finite set of directions to faithfully represent the shape, a result that is of both theoretical and practical interest. In this paper, we improve upon this result and present an improved algorithm for graph -- and, more generally, one-skeleton -- reconstruction. The improvement comes in reconstructing the edges, where we use a radial binary (multi-)search. The binary search employed takes advantage of the fact that the edges can be ordered radially with respect to a reference plane, a feature unique to graphs.
Submitted 26 December, 2022;
originally announced December 2022.
-
Combinatorial Persistent Homology Transform
Authors:
Brittany Terese Fasy,
Amit Patel
Abstract:
The combinatorial interpretation of the persistence diagram as a Möbius inversion was recently shown to be functorial. We employ this discovery to recast the Persistent Homology Transform of a geometric complex as a representation of a cellulation on $\mathbb{S}^n$ to the category of combinatorial persistence diagrams. Detailed examples are provided. We hope this recasting of the PH transform will allow for the adoption of existing methods from algebraic and topological combinatorics to the study of shapes.
Submitted 15 May, 2024; v1 submitted 10 August, 2022;
originally announced August 2022.
-
Differentiating small-scale subhalo distributions in CDM and WDM models using persistent homology
Authors:
Jessi Cisewski-Kehe,
Brittany Terese Fasy,
Wojciech Hellwing,
Mark R. Lovell,
Pawel Drozda,
Mike Wu
Abstract:
The spatial distribution of galaxies at sufficiently small scales will encode information about the identity of the dark matter. We develop a novel description of the halo distribution using persistent homology summaries, in which collections of points are decomposed into clusters, loops and voids. We apply these methods, together with a set of hypothesis tests, to dark matter haloes in MW-analog environment regions of the cold dark matter (CDM) and warm dark matter (WDM) Copernicus Complexio $N$-body cosmological simulations. The results of the hypothesis tests find statistically significant differences (p-values $\leq$ 0.001) between the CDM and WDM structures, and the functional summaries of persistence diagrams detect differences at scales that are distinct from the comparison spatial point process functional summaries considered (including the two-point correlation function). The differences between the models are driven most strongly at filtration scales $\sim100$~kpc, where CDM generates larger numbers of unconnected halo clusters while WDM instead generates loops. This study was conducted on dark matter haloes generally; future work will involve applying the same methods to realistic galaxy catalogues.
Submitted 1 April, 2022;
originally announced April 2022.
-
Extremal Event Graphs: A (Stable) Tool for Analyzing Noisy Time Series Data
Authors:
Robin Belton,
Bree Cummins,
Brittany Terese Fasy,
Tomáš Gedeon
Abstract:
Local maxima and minima, or extremal events, in experimental time series can be used as a coarse summary to characterize data. However, the discrete sampling in recording experimental measurements introduces uncertainty about the true timing of extrema during the experiment. This in turn gives uncertainty in the timing order of extrema within the time series. Motivated by applications in genomic time series and biological network analysis, we use techniques from persistent homology to construct a weighted directed acyclic graph (DAG), called an extremal event DAG, that is robust to measurement noise. Furthermore, we define a distance between extremal event DAGs based on the edit distance between strings. We prove several properties including local stability for the extremal event DAG distance with respect to pairwise $L_{\infty}$ distances between functions in the time series data. Lastly, we provide algorithms, freely available software, and implementations for extremal event DAG construction and comparison.
Submitted 23 August, 2022; v1 submitted 17 March, 2022;
originally announced March 2022.
-
Combinatorial Conditions for Directed Collapsing
Authors:
Robin Belton,
Robyn Brooks,
Stefania Ebli,
Lisbeth Fajstrup,
Brittany Terese Fasy,
Nicole Sanderson,
Elizabeth Vidaurre
Abstract:
The purpose of this article is to study directed collapsibility of directed Euclidean cubical complexes. One application of this is in the nontrivial task of verifying the execution of concurrent programs. The classical definition of collapsibility involves certain conditions on a pair of cubes of the complex. The direction of the space can be taken into account by requiring that the past links of vertices remain homotopy equivalent after collapsing. We call this type of collapse a link-preserving directed collapse. In this paper, we give combinatorially equivalent conditions for preserving the topology of the links, allowing for the implementation of an algorithm for collapsing a directed Euclidean cubical complex. Furthermore, we give conditions for when link-preserving directed collapses preserve the contractibility and connectedness of directed path spaces, as well as examples when link-preserving directed collapses do not preserve the number of connected components of the path space between the minimum and a given vertex.
Submitted 25 May, 2022; v1 submitted 2 June, 2021;
originally announced June 2021.
-
A Domain-Oblivious Approach for Learning Concise Representations of Filtered Topological Spaces for Clustering
Authors:
Yu Qin,
Brittany Terese Fasy,
Carola Wenk,
Brian Summa
Abstract:
Persistence diagrams have been widely used to quantify the underlying features of filtered topological spaces in data visualization. In many applications, computing distances between diagrams is essential; however, computing these distances has been challenging due to the computational cost. In this paper, we propose a persistence diagram hashing framework that learns a binary code representation of persistence diagrams, which allows for fast computation of distances. This framework is built upon a generative adversarial network (GAN) with a diagram distance loss function to steer the learning process. Instead of using standard representations, we hash diagrams into binary codes, which have natural advantages in large-scale tasks. The training of this model is domain-oblivious in that it can be computed purely from synthetic, randomly created diagrams. As a consequence, our proposed method is directly applicable to various datasets without the need for retraining the model. These binary codes, when compared using fast Hamming distance, better maintain topological similarity properties between datasets than other vectorized representations. To evaluate this method, we apply our framework to the problem of diagram clustering and we compare the quality and performance of our approach to the state-of-the-art. In addition, we show the scalability of our approach on a dataset with 10k persistence diagrams, which is not possible with current techniques. Moreover, our experimental results demonstrate that our method is significantly faster with the potential of less memory usage, while retaining comparable or better quality comparisons.
Submitted 10 August, 2021; v1 submitted 25 May, 2021;
originally announced May 2021.
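The abstract above compares binary codes with the Hamming distance. As a minimal sketch of that comparison step only (the learned hashing model is not reproduced here), the following assumes each diagram has already been hashed to a fixed-length bit string stored as a Python integer; the function names and the example codes are hypothetical.

```python
def hamming_distance(code_a: int, code_b: int) -> int:
    """Number of bit positions in which two binary codes differ."""
    # XOR leaves a 1 exactly where the codes disagree; count those bits.
    return bin(code_a ^ code_b).count("1")


def nearest_code(query: int, codes: list[int]) -> int:
    """Index of the stored code closest to the query in Hamming distance."""
    return min(range(len(codes)), key=lambda i: hamming_distance(query, codes[i]))


# Hypothetical 8-bit codes standing in for hashed persistence diagrams.
stored_codes = [0b10110010, 0b00011100, 0b10110110]
query_code = 0b10110111
closest_index = nearest_code(query_code, stored_codes)
```

Because the distance reduces to an XOR and a bit count, comparing two codes costs time proportional to the code length, independent of the number of points in the original diagrams.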
-
If You Must Choose Among Your Children, Pick the Right One
Authors:
Benjamin Holmgren,
Bradley McCoy,
Brittany Fasy,
David Millman
Abstract:
Given a simplicial complex $K$ and an injective function $f$ from the vertices of $K$ to $\mathbb{R}$, we consider algorithms that extend $f$ to a discrete Morse function on $K$. We show that an algorithm of King, Knudson and Mramor can be described on the directed Hasse diagram of $K$. Our description has a faster runtime for high dimensional data with no increase in space.
Submitted 22 March, 2021;
originally announced March 2021.
-
ANAPT: Additive Noise Analysis for Persistence Thresholding
Authors:
Audun D. Myers,
Firas A. Khasawneh,
Brittany T. Fasy
Abstract:
We introduce a novel method for Additive Noise Analysis for Persistence Thresholding (ANAPT) which separates significant features in the sublevel set persistence diagram of a time series based on a statistical analysis of the persistence of a noise distribution. Specifically, we consider an additive noise model and leverage the statistical analysis to provide a noise cutoff or confidence interval in the persistence diagram for the observed time series. This analysis is done for several common noise models including Gaussian, uniform, exponential and Rayleigh distributions. ANAPT is computationally efficient, does not require any signal pre-filtering, is widely applicable, and has open-source software available. We demonstrate the functionality of ANAPT with both numerically simulated examples and an experimental data set. Additionally, we provide an efficient $Θ(n\log(n))$ algorithm for calculating the zero-dimensional sublevel set persistent homology.
Submitted 11 February, 2022; v1 submitted 7 December, 2020;
originally announced December 2020.
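The ANAPT abstract mentions computing zero-dimensional sublevel set persistence of a time series. The sketch below shows one standard way to do this, sorting the samples and merging components with a union-find structure under the elder rule. It is an illustration under that assumption, not necessarily the algorithm from the paper, and the function name is hypothetical.

```python
import math


def sublevel_persistence_0d(values):
    """Zero-dimensional sublevel set persistence pairs of a 1D sequence.

    Samples are added in increasing order of value. When two components
    meet, the one with the higher minimum dies (elder rule).
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    parent = [-1] * n   # -1 marks samples not yet added
    birth = [0.0] * n   # birth value, meaningful at component roots

    def find(i):
        # Path halving keeps the trees shallow.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    pairs = []
    for i in order:
        parent[i] = i
        birth[i] = values[i]
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] != -1:
                ri, rj = find(i), find(j)
                if ri == rj:
                    continue
                young, old = (ri, rj) if birth[ri] > birth[rj] else (rj, ri)
                # Skip zero-persistence pairs created by non-extremal samples.
                if birth[young] < values[i]:
                    pairs.append((birth[young], values[i]))
                parent[young] = old
    if n:
        # The component of the global minimum never dies.
        pairs.append((min(values), math.inf))
    return sorted(pairs)


diagram = sublevel_persistence_0d([0.0, 2.0, 1.0, 3.0, 0.5, 4.0])
```

In this sketch the sort dominates the running time; a noise cutoff of the kind the abstract describes would then be applied to the finite lifetimes `death - birth`.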
-
A Faithful Discretization of the Verbose Persistent Homology Transform
Authors:
Brittany Terese Fasy,
Samuel Micka,
David L. Millman,
Anna Schenfisch,
Lucia Williams
Abstract:
The persistent homology transform (PHT) represents a shape with a multiset of persistence diagrams parameterized by the sphere of directions in the ambient space. In this work, we describe a finite set of diagrams that discretize the PHT such that it faithfully represents the underlying shape. We provide a discretization that is exponential in the dimension of the shape. Moreover, we show that this discretization is stable with respect to various perturbations and we provide an algorithm for computing the discretization. Our approach relies only on knowing the heights and dimensions of topological events, which means that it can be adapted to provide discretizations of other dimension-returning topological transforms, including the Betti function transform. With mild alterations, we also adapt our methods to faithfully discretize the Euler characteristic function transform.
Submitted 13 February, 2024; v1 submitted 29 December, 2019;
originally announced December 2019.
-
Reconstructing Embedded Graphs from Persistence Diagrams
Authors:
Robin Lynne Belton,
Brittany Terese Fasy,
Rostik Mertz,
Samuel Micka,
David L. Millman,
Daniel Salinas,
Anna Schenfisch,
Jordan Schupbach,
Lucia Williams
Abstract:
The persistence diagram (PD) is an increasingly popular topological descriptor. By encoding the size and prominence of topological features at varying scales, the PD provides important geometric and topological information about a space. Recent work has shown that well-chosen (finite) sets of PDs can differentiate between geometric simplicial complexes, providing a method for representing complex shapes using a finite set of descriptors. A related inverse problem is the following: given a set of PDs (or an oracle we can query for persistence diagrams), what is the underlying geometric simplicial complex? In this paper, we present an algorithm for reconstructing embedded graphs in $\mathbb{R}^d$ (plane graphs in $\mathbb{R}^2$) with $n$ vertices from $n^2 - n + d + 1$ directional (augmented) PDs. Additionally, we empirically validate the correctness and time-complexity of our algorithm in $\mathbb{R}^2$ on randomly generated plane graphs using our implementation, and explain the numerical limitations of implementing our algorithm.
Submitted 18 June, 2020; v1 submitted 18 December, 2019;
originally announced December 2019.
-
Topological and Geometric Reconstruction of Metric Graphs in $\mathbb{R}^n$
Authors:
Brittany Terese Fasy,
Rafal Komendarczyk,
Sushovan Majhi,
Carola Wenk
Abstract:
We propose an algorithm to estimate the topology of an embedded metric graph from a well-sampled finite subset of the underlying graph.
Submitted 6 December, 2019;
originally announced December 2019.
-
Threshold-Based Graph Reconstruction Using Discrete Morse Theory
Authors:
Brittany Terese Fasy,
Sushovan Majhi,
Carola Wenk
Abstract:
Discrete Morse theory has recently been applied in metric graph reconstruction from a given density function concentrated around an (unknown) underlying embedded graph. We propose a new noise model for the density function to reconstruct a connected graph both topologically and geometrically.
Submitted 28 November, 2019;
originally announced November 2019.
-
Moduli Spaces of Morse Functions for Persistence
Authors:
Michael J. Catanzaro,
Justin Curry,
Brittany Terese Fasy,
Jānis Lazovskis,
Greg Malen,
Hans Riess,
Bei Wang,
Matthew Zabka
Abstract:
We consider different notions of equivalence for Morse functions on the sphere in the context of persistent homology, and introduce new invariants to study these equivalence classes. These new invariants are as simple as, but more discerning than, existing topological invariants, such as persistence barcodes and Reeb graphs. We give a method to relate any two Morse--Smale vector fields on the sphere by a sequence of fundamental moves by considering graph-equivalent Morse functions. We also explore the combinatorially rich world of height-equivalent Morse functions, considered as height functions of embedded spheres in $\mathbf R^3$. Their level-set invariant, a poset generated by nested disks and annuli from level sets, gives insight into the moduli space of Morse functions sharing the same persistence barcode.
Submitted 30 June, 2020; v1 submitted 23 September, 2019;
originally announced September 2019.
-
Towards Directed Collapsibility
Authors:
Robin Belton,
Robyn Brooks,
Stefania Ebli,
Lisbeth Fajstrup,
Brittany Terese Fasy,
Catherine Ray,
Nicole Sanderson,
Elizabeth Vidaurre
Abstract:
In the directed setting, the spaces of directed paths between fixed initial and terminal points are the defining feature for distinguishing different directed spaces. The simplest case is when the space of directed paths is homotopy equivalent to that of a single path; we call this the trivial space of directed paths. Directed spaces that are topologically trivial may have non-trivial spaces of directed paths, which means that information is lost when the direction of these topological spaces is ignored. We define a notion of directed collapsibility in the setting of a directed Euclidean cubical complex using the spaces of directed paths of the underlying directed topological space relative to an initial or a final vertex. In addition, we give sufficient conditions for a directed Euclidean cubical complex to have a contractible or a connected space of directed paths from a fixed initial vertex. We also give sufficient conditions for the path space between two vertices in a Euclidean cubical complex to be disconnected. Our results have applications to speeding up the verification process of concurrent programming and to understanding partial executions in concurrent programs.
Submitted 17 July, 2019; v1 submitted 4 February, 2019;
originally announced February 2019.
-
Approximate Nearest Neighbors in the Space of Persistence Diagrams
Authors:
Brittany Terese Fasy,
Xiaozhou He,
Zhihui Liu,
Samuel Micka,
David L. Millman,
Binhai Zhu
Abstract:
Persistence diagrams are important tools in the field of topological data analysis that describe the presence and magnitude of features in a filtered topological space. However, current approaches for comparing a persistence diagram to a set of other persistence diagrams are linear in the number of diagrams or do not offer performance guarantees. In this paper, we apply concepts from locality-sensitive hashing to support approximate nearest neighbor search in the space of persistence diagrams. Given a set $Γ$ of $n$ $(M,m)$-bounded persistence diagrams, each with at most $m$ points, we snap-round the points of each diagram to points on a cubical lattice and produce a key for each possible snap-rounding. Specifically, we fix a grid over each diagram at several resolutions and consider the snap-roundings of each diagram to the four nearest lattice points. Then, we propose a data structure with $τ$ levels $\mathbb{D}_τ$ that stores all snap-roundings of each persistence diagram in $Γ$ at each resolution. This data structure has size $O(n5^mτ)$ to account for varying lattice resolutions as well as snap-roundings and the deletion of points with low persistence. To search for a persistence diagram, we compute a key for a query diagram by snapping each point to a lattice and deleting points of low persistence. Furthermore, as the lattice parameter decreases, searching our data structure yields a six-approximation of the nearest diagram in $Γ$ in $O((m\log{n}+m^2)\logτ)$ time and a constant factor approximation of the $k$th nearest diagram in $O((m\log{n}+m^2+k)\logτ)$ time.
Submitted 22 March, 2021; v1 submitted 28 December, 2018;
originally announced December 2018.
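The snap-rounding step described in the abstract above can be illustrated with a minimal sketch. It assumes a square lattice of spacing `delta` anchored at the origin and lists the four lattice corners of the cell containing a diagram point, which are the four lattice points nearest to it. The function name `snap_candidates` and the lattice layout are illustrative assumptions, not the paper's data structure or keying scheme.

```python
import math

def snap_candidates(point, delta):
    """Return the four lattice corners of the grid cell containing `point`.

    Hypothetical helper: assumes a square lattice with spacing `delta`
    anchored at the origin. `point` is a (birth, death) pair.
    """
    b, d = point
    i = math.floor(b / delta)  # column index of the cell
    j = math.floor(d / delta)  # row index of the cell
    return [((i + di) * delta, (j + dj) * delta)
            for di in (0, 1) for dj in (0, 1)]
```

Each diagram point has four candidate roundings under this assumption; how the paper combines them into keys across resolutions is not reproduced here.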
-
Challenges in Reconstructing Shapes from Euler Characteristic Curves
Authors:
Brittany Terese Fasy,
Samuel Micka,
David L. Millman,
Anna Schenfisch,
Lucia Williams
Abstract:
Shape recognition and classification is a problem with a wide variety of applications. Several recent works have demonstrated that topological descriptors can be used as summaries of shapes and utilized to compute distances. In this abstract, we explore the use of a finite number of Euler Characteristic Curves (ECC) to reconstruct plane graphs. We highlight difficulties that occur when attempting to adopt approaches for reconstruction with persistence diagrams to reconstruction with ECCs. Furthermore, we highlight specific arrangements of vertices that create problems for reconstruction and present several observations about how they affect the ECC-based reconstruction. Finally, we show that plane graphs without degree two vertices can be reconstructed using a finite number of ECCs.
Submitted 27 November, 2018;
originally announced November 2018.
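As a rough illustration of the descriptor discussed in the abstract above, the following sketch evaluates an Euler characteristic curve of a plane graph in one direction. It assumes the usual convention that a vertex enters the filtration at its height in the chosen direction and an edge enters at the height of its higher endpoint, so the Euler characteristic at a threshold is the number of vertices present minus the number of edges present. The function name and input layout are illustrative assumptions, not taken from the paper.

```python
def euler_characteristic_curve(vertices, edges, direction, thresholds):
    """Euler characteristic (V - E) of the sublevel graph at each threshold.

    vertices:   dict mapping a vertex name to its (x, y) coordinates
    edges:      list of (u, v) vertex-name pairs
    direction:  (dx, dy) vector defining the height function
    thresholds: iterable of height values at which to sample the curve
    """
    # Height of each vertex is its dot product with the direction.
    heights = {v: x * direction[0] + y * direction[1]
               for v, (x, y) in vertices.items()}
    curve = []
    for t in thresholds:
        n_vertices = sum(1 for h in heights.values() if h <= t)
        # An edge is present once both of its endpoints are present.
        n_edges = sum(1 for u, v in edges
                      if max(heights[u], heights[v]) <= t)
        curve.append(n_vertices - n_edges)
    return curve
```

For a triangle with two vertices at height 0 and one at height 1, the curve sampled at -1, 0, and 1 is 0, 1, 0: nothing, then one edge with its two endpoints, then the full cycle.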
-
On the Reconstruction of Geodesic Subspaces of $\mathbb{R}^N$
Authors:
Brittany Terese Fasy,
Rafal Komendarczyk,
Sushovan Majhi,
Carola Wenk
Abstract:
We consider the topological and geometric reconstruction of a geodesic subspace of $\mathbb{R}^N$ both from the Čech and Vietoris-Rips filtrations on a finite, Hausdorff-close, Euclidean sample. Our reconstruction technique leverages the intrinsic length metric induced by the geodesics on the subspace. We consider the distortion and convexity radius as our sampling parameters for a successful reconstruction. For a geodesic subspace with finite distortion and positive convexity radius, we guarantee a correct computation of its homotopy and homology groups from the sample. For geodesic subspaces of $\mathbb{R}^2$, we also devise an algorithm to output a homotopy equivalent geometric complex that has a very small Hausdorff distance to the unknown shape of interest.
Submitted 23 September, 2022; v1 submitted 23 October, 2018;
originally announced October 2018.
-
Learning Simplicial Complexes from Persistence Diagrams
Authors:
Robin Lynne Belton,
Brittany Terese Fasy,
Rostik Mertz,
Samuel Micka,
David L. Millman,
Daniel Salinas,
Anna Schenfisch,
Jordan Schupbach,
Lucia Williams
Abstract:
Topological Data Analysis (TDA) studies the shape of data. A common topological descriptor is the persistence diagram, which encodes topological features in a topological space at different scales. Turner, Mukherjee, and Boyer showed that one can reconstruct a simplicial complex embedded in R^3 using persistence diagrams generated from all possible height filtrations (an uncountably infinite number of directions). In this paper, we present an algorithm for reconstructing plane graphs K=(V,E) in R^2, i.e., planar graphs with vertices in general position and a straight-line embedding, from a quadratic number of height filtrations and their respective persistence diagrams.
Submitted 31 July, 2018; v1 submitted 27 May, 2018;
originally announced May 2018.
-
Functional Summaries of Persistence Diagrams
Authors:
Eric Berry,
Yen-Chi Chen,
Jessi Cisewski-Kehe,
Brittany Terese Fasy
Abstract:
One of the primary areas of interest in applied algebraic topology is persistent homology, and, more specifically, the persistence diagram. Persistence diagrams have also become objects of interest in topological data analysis. However, persistence diagrams do not naturally lend themselves to statistical goals, such as inferring certain population characteristics, because their complicated structure makes common algebraic operations, such as addition, division, and multiplication, challenging (e.g., the mean might not be unique). To bypass these issues, several functional summaries of persistence diagrams have been proposed in the literature (e.g., landscape and silhouette functions). The problem of analyzing a set of persistence diagrams then becomes the problem of analyzing a set of functions, which is a topic that has been studied for decades in statistics. First, we review the various functional summaries in the literature and propose a unified framework for the functional summaries. Then, we generalize the definition of persistence landscape functions, establish several theoretical properties of the persistence functional summaries, and demonstrate and discuss their performance in the context of classification using simulated prostate cancer histology data, and two-sample hypothesis tests comparing human and monkey fibrin images, after developing a simulation study using a new data generator we call the Pickup Sticks Simulator (STIX).
Submitted 4 April, 2018;
originally announced April 2018.
-
On Minimum Area Homotopies of Normal Curves in the Plane
Authors:
Brittany Terese Fasy,
Selcuk Karakoc,
Carola Wenk
Abstract:
In this paper, we study the problem of computing a homotopy from a planar curve $C$ to a point that minimizes the area swept. The existence of such a minimum homotopy is a direct result of the solution of Plateau's problem. Chambers and Wang studied the special case that $C$ is the concatenation of two simple curves, and they gave a polynomial-time algorithm for computing a minimum homotopy in this setting. We study the general case of a normal curve $C$ in the plane, and provide structural properties of minimum homotopies that lead to an algorithm. In particular, we prove that for any normal curve there exists a minimum homotopy that consists entirely of contractions of self-overlapping sub-curves (i.e., consists of contracting a collection of boundaries of immersed disks).
Submitted 7 July, 2017;
originally announced July 2017.
-
Approximating Nearest Neighbor Distances
Authors:
Michael B. Cohen,
Brittany Terese Fasy,
Gary L. Miller,
Amir Nayyeri,
Donald R. Sheehy,
Ameya Velingker
Abstract:
Several researchers proposed using non-Euclidean metrics on point sets in Euclidean space for clustering noisy data. Almost always, a distance function is desired that recognizes the closeness of the points in the same cluster, even if the Euclidean cluster diameter is large. Therefore, it is preferred to assign smaller costs to the paths that stay close to the input points.
In this paper, we consider the most natural metric with this property, which we call the nearest neighbor metric. Given a point set P and a path $γ$, our metric charges each point of $γ$ with its distance to P. The total charge along $γ$ determines its nearest neighbor length, which is formally defined as the integral of the distance to the input points along the curve. We describe a $(3+\varepsilon)$-approximation algorithm and a $(1+\varepsilon)$-approximation algorithm to compute the nearest neighbor metric. Both approximation algorithms work in near-linear time. The former uses shortest paths on a sparse graph using only the input points. The latter uses a sparse sample of the ambient space, to find good approximate geodesic paths.
Submitted 27 February, 2015;
originally announced February 2015.
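The nearest neighbor length defined in the abstract above, the integral of the distance to the input points along a curve, can be approximated numerically for a polyline path. The sketch below uses a plain midpoint rule on each segment; it is a brute-force quadrature for illustration only, not either of the paper's approximation algorithms, and the function name `nn_length` and the `steps` parameter are assumptions.

```python
import math

def nn_length(path, points, steps=1000):
    """Approximate the integral of dist(x, P) along a polyline.

    path:   list of (x, y) vertices of the polyline
    points: list of (x, y) input points P
    steps:  number of midpoint-rule samples per segment (assumed parameter)
    """
    total = 0.0
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        seg_len = math.hypot(x1 - x0, y1 - y0)
        for s in range(steps):
            u = (s + 0.5) / steps  # midpoint of the s-th sub-interval
            x, y = x0 + u * (x1 - x0), y0 + u * (y1 - y0)
            # Charge this sample with its distance to the nearest input point.
            nearest = min(math.hypot(x - px, y - py) for px, py in points)
            total += nearest * seg_len / steps
    return total
```

With a single input point at the origin and a straight path from (1, 0) to (2, 0), the distance along the path equals the x-coordinate, so the result is the integral of x from 1 to 2, which is 1.5.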
-
Robust Topological Inference: Distance To a Measure and Kernel Distance
Authors:
Frédéric Chazal,
Brittany T. Fasy,
Fabrizio Lecci,
Bertrand Michel,
Alessandro Rinaldo,
Larry Wasserman
Abstract:
Let P be a distribution with support S. The salient features of S can be quantified with persistent homology, which summarizes topological features of the sublevel sets of the distance function (the distance of any point x to S). Given a sample from P we can infer the persistent homology using an empirical version of the distance function. However, the empirical distance function is highly non-robust to noise and outliers. Even one outlier is deadly. The distance-to-a-measure (DTM), introduced by Chazal et al. (2011), and the kernel distance, introduced by Phillips et al. (2014), are smooth functions that provide useful topological information but are robust to noise and outliers. Chazal et al. (2014) derived concentration bounds for DTM. Building on these results, we derive limiting distributions and confidence sets, and we propose a method for choosing tuning parameters.
Submitted 22 December, 2014;
originally announced December 2014.
-
Introduction to the R package TDA
Authors:
Brittany Terese Fasy,
Jisu Kim,
Fabrizio Lecci,
Clément Maria
Abstract:
We present a short tutorial and introduction to using the R package TDA, which provides some tools for Topological Data Analysis. In particular, it includes implementations of functions that, given some data, provide topological information about the underlying space, such as the distance function, the distance to a measure, the kNN density estimator, the kernel density estimator, and the kernel distance. The salient topological features of the sublevel sets (or superlevel sets) of these functions can be quantified with persistent homology. We provide an R interface for the efficient algorithms of the C++ libraries GUDHI, Dionysus and PHAT, including a function for the persistent homology of the Rips filtration, and one for the persistent homology of sublevel sets (or superlevel sets) of arbitrary functions evaluated over a grid of points. The significance of the features in the resulting persistence diagrams can be analyzed with functions that implement recently developed statistical methods. The R package TDA also includes the implementation of an algorithm for density clustering, which allows us to identify the spatial organization of the probability mass associated to a density function and visualize it by means of a dendrogram, the cluster tree.
Submitted 29 January, 2015; v1 submitted 7 November, 2014;
originally announced November 2014.
-
Subsampling Methods for Persistent Homology
Authors:
Frédéric Chazal,
Brittany Terese Fasy,
Fabrizio Lecci,
Bertrand Michel,
Alessandro Rinaldo,
Larry Wasserman
Abstract:
Persistent homology is a multiscale method for analyzing the shape of sets and functions from point cloud data arising from an unknown distribution supported on those sets. When the size of the sample is large, direct computation of the persistent homology is prohibitive due to the combinatorial nature of the existing algorithms. We propose to compute the persistent homology of several subsamples of the data and then combine the resulting estimates. We study the risk of two estimators and we prove that the subsampling approach carries stable topological information while achieving a great reduction in computational complexity.
Submitted 7 June, 2014;
originally announced June 2014.
-
Stochastic Convergence of Persistence Landscapes and Silhouettes
Authors:
Frédéric Chazal,
Brittany Terese Fasy,
Fabrizio Lecci,
Alessandro Rinaldo,
Larry Wasserman
Abstract:
Persistent homology is a widely used tool in Topological Data Analysis that encodes multiscale topological information as a multi-set of points in the plane called a persistence diagram. It is difficult to apply statistical theory directly to a random sample of diagrams. Instead, we can summarize the persistent homology with the persistence landscape, introduced by Bubenik, which converts a diagram into a well-behaved real-valued function. We investigate the statistical properties of landscapes, such as weak convergence of the average landscapes and convergence of the bootstrap. In addition, we introduce an alternate functional summary of persistent homology, which we call the silhouette, and derive an analogous statistical theory.
Submitted 1 December, 2013;
originally announced December 2013.
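The two functional summaries named in the abstract above can be sketched in a few lines. The code assumes the standard definitions: each diagram point (b, d) contributes a tent function max(0, min(t - b, d - t)), the k-th landscape at t is the k-th largest tent value, and the power-weighted silhouette is the average of the tents weighted by (d - b)^p. The function names and the list-of-pairs input are illustrative assumptions.

```python
def tent(b, d, t):
    """Tent function of the diagram point (b, d) evaluated at t."""
    return max(0.0, min(t - b, d - t))

def landscape(diagram, k, t):
    """k-th persistence landscape (k = 1, 2, ...) at t.

    diagram: list of (birth, death) pairs with birth <= death.
    """
    values = sorted((tent(b, d, t) for b, d in diagram), reverse=True)
    return values[k - 1] if k <= len(values) else 0.0

def silhouette(diagram, t, p=1.0):
    """Power-weighted silhouette at t, with weights (death - birth) ** p."""
    weights = [(d - b) ** p for b, d in diagram]
    total = sum(weights)
    if total == 0:
        return 0.0  # empty diagram or only zero-persistence points
    return sum(w * tent(b, d, t) for w, (b, d) in zip(weights, diagram)) / total
```

For the diagram [(0, 4), (1, 3)] at t = 2, the tents are 2 and 1, so the first and second landscapes are 2 and 1, and the silhouette with p = 1 is (4·2 + 2·1) / 6 = 5/3.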
-
On the Bootstrap for Persistence Diagrams and Landscapes
Authors:
Frédéric Chazal,
Brittany Terese Fasy,
Fabrizio Lecci,
Alessandro Rinaldo,
Aarti Singh,
Larry Wasserman
Abstract:
Persistent homology probes topological properties from point clouds and functions. By looking at multiple scales simultaneously, one can record the births and deaths of topological features as the scale varies. In this paper we use a statistical technique, the empirical bootstrap, to separate topological signal from topological noise. In particular, we derive confidence sets for persistence diagrams and confidence bands for persistence landscapes.
Submitted 22 January, 2014; v1 submitted 2 November, 2013;
originally announced November 2013.
-
Path-Based Distance for Street Map Comparison
Authors:
Mahmuda Ahmed,
Brittany Terese Fasy,
Kyle S. Hickmann,
Carola Wenk
Abstract:
Comparing two geometric graphs embedded in space is important in the field of transportation network analysis. Given street maps of the same city collected from different sources, researchers often need to know how and where they differ. However, the majority of current graph comparison algorithms are based on structural properties of graphs, such as their degree distribution or their local connectivity properties, and do not consider their spatial embedding. This ignores a key property of road networks since similarity of travel over two road networks is intimately tied to the specific spatial embedding. Likewise, many current street map comparison algorithms focus on the spatial embeddings only and do not take structural properties into account, which makes these algorithms insensitive to local connectivity properties and shortest path similarities. We propose a new path-based distance measure to compare two planar geometric graphs embedded in the plane. Our distance measure takes structural as well as spatial properties into account by imposing a distance measure between two road networks based on the Hausdorff distance between the two sets of travel paths they represent. We show that this distance can be approximated in polynomial time and that it preserves structural and spatial properties of the graphs.
Submitted 13 February, 2015; v1 submitted 24 September, 2013;
originally announced September 2013.
-
Confidence sets for persistence diagrams
Authors:
Brittany Terese Fasy,
Fabrizio Lecci,
Alessandro Rinaldo,
Larry Wasserman,
Sivaraman Balakrishnan,
Aarti Singh
Abstract:
Persistent homology is a method for probing topological properties of point clouds and functions. The method involves tracking the birth and death of topological features (2000) as one varies a tuning parameter. Features with short lifetimes are informally considered to be "topological noise," and those with a long lifetime are considered to be "topological signal." In this paper, we bring some statistical ideas to persistent homology. In particular, we derive confidence sets that allow us to separate topological signal from topological noise.
Submitted 20 November, 2014; v1 submitted 28 March, 2013;
originally announced March 2013.
-
Persistence Diagrams and the Heat Equation Homotopy
Authors:
Brittany Terese Fasy
Abstract:
Persistent homology is a tool used to measure topological features that are present in data sets and functions. Persistence pairs births and deaths of these features as we iterate through the sublevel sets of the data or function of interest. I am concerned with using persistence to characterize the difference between two functions f, g : M -> R, where M is a topological space. Furthermore, I formulate a homotopy from g to f by applying the heat equation to the difference function g-f. By stacking the persistence diagrams associated with this homotopy, we create a vineyard of curves that connect the points in the diagram for f with the points in the diagram for g. I look at the diagrams where M is a square, a sphere, a torus, and a Klein bottle. Looking at these four topologies, we notice trends (and differences) as the persistence diagrams change with respect to time.
Submitted 9 February, 2010;
originally announced February 2010.