Search | arXiv e-print repository

arXiv:2310.19960 [pdf, other]

doi 10.1109/BigData52589.2021.9671525

Topological Learning for Motion Data via Mixed Coordinates

Authors: Hengrui Luo, Jisu Kim, Alice Patania, Mikael Vejdemo-Johansson

Abstract: Topology can extract the structural information in a dataset efficiently. In this paper, we attempt to incorporate topological information into a multiple output Gaussian process model for transfer learning purposes. To achieve this goal, we extend the framework of circular coordinates into a novel framework of mixed valued coordinates to take linear trends in the time series into consideration. One of the major challenges to learn from multiple time series effectively via a multiple output Gaussian process model is constructing a functional kernel. We propose to use topologically induced clustering to construct a cluster based kernel in a multiple output Gaussian process model. This kernel not only incorporates the topological structural information, but also allows us to put forward a unified framework using topological information in time and motion series.

Submitted 30 October, 2023; originally announced October 2023.

Comments: 7 pages, 4 figures

Journal ref: 2021 IEEE International Conference on Big Data (Big Data)

arXiv:2209.15574 [pdf, other]

An improved algorithm for Generalized Čech complex construction

Authors: Jie Chu, Mikael Vejdemo-Johansson, Ping Ji

Abstract: In this paper, we present an algorithm that computes the generalized Čech complex for a finite set of disks in 2D space, where each disk may have a different radius. An extension of this algorithm is also proposed for a set of balls in 3D space with different radii. To compute a $k$-simplex, we leverage the computation performed in the round of $(k-1)$-simplices such that we can reduce the number of potential candidates to verify to improve the efficiency. An efficient verification method is proposed to confirm if a $k$-simplex can be constructed on the basis of the $(k-1)$-simplices. We demonstrate the performance with a comparison to some closely related algorithms.

Submitted 30 September, 2022; originally announced September 2022.

MSC Class: 68U05; 57-08 ACM Class: F.2.2; I.3.5

arXiv:2006.02554 [pdf, other]

doi 10.3934/fods.2021024

Generalized Penalty for Circular Coordinate Representation

Authors: Hengrui Luo, Alice Patania, Jisu Kim, Mikael Vejdemo-Johansson

Abstract: Topological Data Analysis (TDA) provides novel approaches that allow us to analyze the geometrical shapes and topological structures of a dataset. As one important application, TDA can be used for data visualization and dimension reduction. We follow the framework of circular coordinate representation, which allows us to perform dimension reduction and visualization for high-dimensional datasets on a torus using persistent cohomology. In this paper, we propose a method to adapt the circular coordinate framework to take into account the roughness of circular coordinates in change-point and high-dimensional applications. We use a generalized penalty function instead of an $L_{2}$ penalty in the traditional circular coordinate algorithm. We provide simulation experiments and real data analysis to support our claim that circular coordinates with generalized penalty will detect the change in high-dimensional datasets under different sampling schemes while preserving the topological structures.

Submitted 23 November, 2021; v1 submitted 3 June, 2020; originally announced June 2020.

Comments: 39 pages

MSC Class: 55N31; 62R40; 68T09

Journal ref: Foundations of Data Science, 2021

arXiv:1812.06491 [pdf, other]

Multiple testing with persistent homology

Authors: Mikael Vejdemo-Johansson, Sayan Mukherjee

Abstract: In this paper we propose a computationally efficient multiple hypothesis testing procedure for persistent homology. The computational efficiency of our procedure is based on the observation that one can empirically simulate a null distribution that is universal across many hypothesis testing applications involving persistent homology. Our observation suggests that one can simulate the null distribution efficiently based on a small number of summaries of the collected data and use this null in the same way that p-value tables were used in classical statistics. To illustrate the efficiency and utility of the null distribution we provide procedures for rejecting acyclicity with both control of the Family-Wise Error Rate (FWER) and the False Discovery Rate (FDR). We will argue that the empirical null we propose is very general conditional on a few summaries of the data based on simulations and limit theorems for persistent homology for point processes.

Submitted 25 August, 2022; v1 submitted 16 December, 2018; originally announced December 2018.

Comments: 43 pages, 16 figures

MSC Class: 55N31

arXiv:1808.09933 [pdf, other]

Certified Mapper: Repeated testing for acyclicity and obstructions to the nerve lemma

Authors: Mikael Vejdemo-Johansson, Alisa Leshchenko

Abstract: The Mapper algorithm does not include a check for whether the cover produced conforms to the requirements of the nerve lemma. To perform a check for obstructions to the nerve lemma, statistical considerations of multiple testing quickly arise. In this paper, we propose several statistical approaches to finding obstructions: through a persistent nerve lemma, through simulation testing, and using a parametric refinement of simulation tests. We suggest Certified Mapper -- a method built from these approaches to generate certificates of non-obstruction, or identify specific obstructions to the nerve lemma -- and we give recommendations for which statistical approaches are most appropriate for the task.

Submitted 29 August, 2018; originally announced August 2018.

Comments: 16 pages, submitted to the proceedings of the Abel symposium

arXiv:1803.00384 [pdf, other]

Fibres of Failure: Classifying errors in predictive processes

Authors: Leo Carlsson, Gunnar Carlsson, Mikael Vejdemo-Johansson

Abstract: We describe Fibres of Failure (FiFa), a method to classify failure modes of predictive processes using the Mapper algorithm from Topological Data Analysis. Our method uses Mapper to build a graph model of input data stratified by prediction error. Groupings found in high-error regions of the Mapper model then provide distinct failure modes of the predictive process. We demonstrate FiFa on misclassifications of MNIST images with added noise, and demonstrate two ways to use the failure mode classification: either to produce a correction layer that adjusts predictions by similarity to the failure modes; or to inspect members of the failure modes to illustrate and investigate what characterizes each failure mode.

Submitted 9 February, 2018; originally announced March 2018.

Comments: 10 pages, submitted to ICML 2018

arXiv:1410.2320 [pdf, other]

Computing minimum area homologies

Authors: Erin Wolf Chambers, Mikael Vejdemo-Johansson

Abstract: Calculating and categorizing the similarity of curves is a fundamental problem which has generated much recent interest. However, to date there are no implementations of these algorithms for curves on surfaces with provable guarantees on the quality of the measure. In this paper, we present a similarity measure for any two cycles that are homologous, where we calculate the minimum area of any homology (or connected bounding chain) between the two cycles. The minimum area homology exists for broader classes of cycles than previous measures which are based on homotopy. It is also much easier to compute than previously defined measures, yielding an efficient implementation that is based on linear algebra tools. We demonstrate our algorithm on a range of inputs, showing examples which highlight the feasibility of this similarity measure.

Submitted 8 October, 2014; originally announced October 2014.

Comments: To appear in Computer Graphics Forum

arXiv:1409.3762 [pdf, other]

Aspects of an internal logic for persistence

Authors: João Pita Costa, Primož Škraba, Mikael Vejdemo-Johansson

Abstract: The foundational character of certain algebraic structures such as Boolean algebras and Heyting algebras is rooted in their potential to model classical and constructive logic, respectively. In this paper we discuss the contributions of algebraic logic to the study of persistence based on a new operation on the ordered structure of the input diagram of vector spaces and linear maps given by a filtration. Within the context of persistence theory, we give an analysis of the underlying algebra, derive universal properties and discuss new applications. We highlight the definition of the implication operation within this construction, as well as interpret its meaning within persistent homology, multidimensional persistence and zig-zag persistence.

Submitted 15 September, 2014; v1 submitted 12 September, 2014; originally announced September 2014.

MSC Class: 03G10

arXiv:1401.8242 [pdf, other]

More ties than we thought

Authors: Dan Hirsch, Ingemar Markström, Meredith L Patterson, Anders Sandberg, Mikael Vejdemo-Johansson

Abstract: We extend the existing enumeration of neck tie-knots to include tie-knots with a textured front, tied with the narrow end of a tie. These tie-knots have gained popularity in recent years, based on reconstructions of a costume detail from The Matrix Reloaded, and are explicitly ruled out in the enumeration by Fink and Mao (2000). We show that the relaxed tie-knot description language that comprehensively describes these extended tie-knot classes is context free. It has a regular sub-language that covers all the knots that originally inspired the work. From the full language, we enumerate 266 682 distinct tie-knots that seem tie-able with a normal neck-tie. Out of these 266 682, we also enumerate 24 882 tie-knots that belong to the regular sub-language.

Submitted 6 May, 2015; v1 submitted 31 January, 2014; originally announced January 2014.

Comments: Accepted at PeerJ Computer Science. 12 pages, 6 color photographs

arXiv:1312.2482 [pdf, other]

Automatic recognition and tagging of topologically different regimes in dynamical systems

Authors: Jesse Berwald, Marian Gidea, Mikael Vejdemo-Johansson

Abstract: Complex systems are commonly modeled using nonlinear dynamical systems. These models are often high-dimensional and chaotic. An important goal in studying physical systems through the lens of mathematical models is to determine when the system undergoes changes in qualitative behavior. A detailed description of the dynamics can be difficult or impossible to obtain for high-dimensional and chaotic systems. Therefore, a more sensible goal is to recognize and mark transitions of a system between qualitatively different regimes of behavior. In practice, one is interested in developing techniques for detection of such transitions from sparse observations, possibly contaminated by noise. In this paper we develop a framework to accurately tag different regimes of complex systems based on topological features. In particular, our framework works with a high degree of success in picking out a cyclically orbiting regime from a stationary equilibrium regime in high-dimensional stochastic dynamical systems.

Submitted 24 March, 2014; v1 submitted 9 December, 2013; originally announced December 2013.

MSC Class: 37M10; 55U99; 37M20; 68U05

arXiv:1302.2015 [pdf, ps, other]

Persistence modules: Algebra and algorithms

Authors: Primoz Skraba, Mikael Vejdemo-Johansson

Abstract: Persistent homology was shown by Carlsson and Zomorodian to be homology of graded chain complexes with coefficients in the graded ring $\kk[t]$. As such, the behavior of persistence modules -- graded modules over $\kk[t]$ -- is an important part in the analysis and computation of persistent homology. In this paper we present a number of facts about persistence modules, ranging from the well-known but under-utilized to the reconstruction of techniques to work in a purely algebraic approach to persistent homology. In particular, the results we present give concrete algorithms to compute the persistent homology of a simplicial complex with torsion in the chain complex.

Submitted 15 February, 2013; v1 submitted 8 February, 2013; originally announced February 2013.

Comments: 28 pages, submitted to Mathematics of Computation

MSC Class: 13P10; 55N35 ACM Class: I.1.2; I.3.5

arXiv:1212.5398 [pdf, other]

Sketches of a platypus: persistent homology and its algebraic foundations

Authors: Mikael Vejdemo-Johansson

Abstract: The subject of persistent homology has vitalized applications of algebraic topology to point cloud data and to application fields far outside the realm of pure mathematics. The area has seen several fundamentally important results that are rooted in choosing a particular algebraic foundational theory to describe persistent homology, and applying results from that theory to prove useful and important results. In this survey paper, we shall examine the various choices in use, and what they allow us to prove. We shall also discuss the inherent differences between the choices people use, and speculate on potential directions of research to resolve these differences.

Submitted 10 November, 2013; v1 submitted 21 December, 2012; originally announced December 2012.

Comments: 22 pages, 4 figures, accepted for publication in an upcoming volume of AMS Contemporary Mathematics

MSC Class: 55N35; 13C60

arXiv:1210.7913 [pdf, ps, other]

Interleaved equivalence of categories of persistence modules

Authors: Mikael Vejdemo-Johansson

Abstract: We demonstrate that an equivalence of categories using $\varepsilon$-interleavings as a fundamental component exists between the model of persistence modules as graded modules over a polynomial ring and the model of persistence modules as modules over the total order of the real numbers.

Submitted 30 October, 2012; originally announced October 2012.

Comments: 9 pages

MSC Class: 16D90; 06A05

arXiv:1112.1245 [pdf, other]

A spectral sequence for parallelized persistence

Authors: David Lipsky, Primoz Skraba, Mikael Vejdemo-Johansson

Abstract: We approach the problem of the computation of persistent homology for large datasets by a divide-and-conquer strategy. Dividing the total space into separate but overlapping components, we are able to limit the total memory residency for any part of the computation, while not degrading the overall complexity much. Locally computed persistence information is then merged from the components and their intersections using a spectral sequence generalizing the Mayer-Vietoris long exact sequence. We describe the Mayer-Vietoris spectral sequence and give details on how to compute with it. This allows us to merge local homological data into the global persistent homology. Furthermore, we detail how the classical topology constructions inherent in the spectral sequence adapt to a persistence perspective, as well as describe the techniques from computational commutative algebra necessary for this extension. The resulting computational scheme suggests a parallelization scheme, and we discuss the communication steps involved in this scheme. Furthermore, the computational scheme can also serve as a guideline for which parts of the boundary matrix manipulation need to co-exist in primary memory at any given time, allowing for stratified memory access in single-core computation. The spectral sequence viewpoint also provides easy proofs of a homology nerve lemma as well as a persistent homology nerve lemma. In addition, the algebraic tools we develop to approach persistent homology provide a purely algebraic formulation of kernel, image and cokernel persistence (D. Cohen-Steiner, H. Edelsbrunner, J. Harer, and D. Morozov. Persistent homology for kernels, images, and cokernels. In Proceedings of the twentieth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 1011-1020. Society for Industrial and Applied Mathematics, 2009.)

Submitted 6 December, 2011; originally announced December 2011.

Comments: 15 pages, 10 figures, submitted to the ACM Symposium on Computational Geometry

MSC Class: 55-04; 55T99; 55U10 ACM Class: D.1.3; G.4; I.1.2; J.2

arXiv:1107.5665 [pdf, other]

doi 10.1088/0266-5611/27/12/124003

Dualities in persistent (co)homology

Authors: Vin de Silva, Dmitriy Morozov, Mikael Vejdemo-Johansson

Abstract: We consider sequences of absolute and relative homology and cohomology groups that arise naturally for a filtered cell complex. We establish algebraic relationships between their persistence modules, and show that they contain equivalent information. We explain how one can use the existing algorithm for persistent homology to process any of the four modules, and relate it to a recently introduced persistent cohomology algorithm. We present experimental evidence for the practical efficiency of the latter algorithm.

Submitted 28 July, 2011; originally announced July 2011.

Comments: 16 pages, 3 figures, submitted to the Inverse Problems special issue on Topological Data Analysis

arXiv:1105.6305 [pdf, ps, other]

Interleaved computation for persistent homology

Authors: Mikael Vejdemo-Johansson

Abstract: We describe an approach to bounded-memory computation of persistent homology and Betti barcodes, in which a computational state is maintained with updates introducing new edges to the underlying neighbourhood graph and percolating the resulting changes into the simplex stream feeding the persistence algorithm. We further discuss the memory consumption and resulting speed and complexity behaviours of the resulting algorithm.

Submitted 31 May, 2011; originally announced May 2011.

Comments: 5 pages, draft version

Report number: Mittag-Leffler-2011spring

arXiv:1105.5509 [pdf, other]

A parallel Buchberger algorithm for multigraded ideals

Authors: Mikael Vejdemo-Johansson, Emil Sköldberg, Jason Dusek

Abstract: We demonstrate a method to parallelize the computation of a Gröbner basis for a homogeneous ideal in a multigraded polynomial ring. Our method uses anti-chains in the lattice $\mathbb N^k$ to separate mutually independent S-polynomials for reduction.

Submitted 27 May, 2011; originally announced May 2011.

Comments: 8 pages, 6 figures

Report number: Mittag-Leffler-2011spring MSC Class: 13-04; 13P10

arXiv:0909.4950 [pdf, ps, other]

Implementing Gröbner bases for operads

Authors: Vladimir Dotsenko, Mikael Vejdemo-Johansson

Abstract: We present an implementation of the algorithm for computing Gröbner bases for operads due to the first author and A. Khoroshkin. We discuss the actual algorithms, the choices made for the implementation platform and the data representation, and strengths and weaknesses of our approach.

Submitted 26 August, 2010; v1 submitted 27 September, 2009; originally announced September 2009.

Comments: 18 pages, 6 figures

arXiv:0905.4887 [pdf, ps, other]

doi 10.1145/1542362.1542406

Persistent Cohomology and Circular Coordinates

Authors: Vin de Silva, Mikael Vejdemo-Johansson

Abstract: Nonlinear dimensionality reduction (NLDR) algorithms such as Isomap, LLE and Laplacian Eigenmaps address the problem of representing high-dimensional nonlinear data in terms of low-dimensional coordinates which represent the intrinsic structure of the data. This paradigm incorporates the assumption that real-valued coordinates provide a rich enough class of functions to represent the data faithfully and efficiently. On the other hand, there are simple structures which challenge this assumption: the circle, for example, is one-dimensional but its faithful representation requires two real coordinates. In this work, we present a strategy for constructing circle-valued functions on a statistical data set. We develop a machinery of persistent cohomology to identify candidates for significant circle-structures in the data, and we use harmonic smoothing and integration to obtain the circle-valued coordinate functions themselves. We suggest that this enriched class of coordinate functions permits a precise NLDR analysis of a broader range of realistic data sets.

Submitted 29 May, 2009; originally announced May 2009.

Comments: 10 pages, 7 figures. To appear in the proceedings of the ACM Symposium on Computational Geometry 2009

MSC Class: 55-04; 62H25

Journal ref: SCG '09: Proceedings of the 25th annual symposium on Computational geometry (2009) pp 227--236

Showing 1–19 of 19 results for author: Vejdemo-Johansson, M