-
TTK is Getting MPI-Ready
Authors:
Eve Le Guillou,
Michael Will,
Pierre Guillou,
Jonas Lukasczyk,
Pierre Fortin,
Christoph Garth,
Julien Tierny
Abstract:
This system paper documents the technical foundations for the extension of the Topology ToolKit (TTK) to distributed-memory parallelism with the Message Passing Interface (MPI). While several recent papers introduced topology-based approaches for distributed-memory environments, these were reporting experiments obtained with tailored, mono-algorithm implementations. In contrast, we describe in thi…
▽ More
This system paper documents the technical foundations for the extension of the Topology ToolKit (TTK) to distributed-memory parallelism with the Message Passing Interface (MPI). While several recent papers introduced topology-based approaches for distributed-memory environments, these were reporting experiments obtained with tailored, mono-algorithm implementations. In contrast, we describe in this paper a versatile approach (supporting both triangulated domains and regular grids) for the support of topological analysis pipelines, i.e. a sequence of topological algorithms interacting together. While develo** this extension, we faced several algorithmic and software engineering challenges, which we document in this paper. We describe an MPI extension of TTK's data structure for triangulation representation and traversal, a central component to the global performance and generality of TTK's topological implementations. We also introduce an intermediate interface between TTK and MPI, both at the global pipeline level, and at the fine-grain algorithmic level. We provide a taxonomy for the distributed-memory topological algorithms supported by TTK, depending on their communication needs and provide examples of hybrid MPI+thread parallelizations. Performance analyses show that parallel efficiencies range from 20% to 80% (depending on the algorithms), and that the MPI-specific preconditioning introduced by our framework induces a negligible computation time overhead. We illustrate the new distributed-memory capabilities of TTK with an example of advanced analysis pipeline, combining multiple algorithms, run on the largest publicly available dataset we have found (120 billion vertices) on a cluster with 64 nodes (for a total of 1536 cores). Finally, we provide a roadmap for the completion of TTK's MPI extension, along with generic recommendations for each algorithm communication category.
△ Less
Submitted 15 April, 2024; v1 submitted 12 October, 2023;
originally announced October 2023.
-
Merge Tree Geodesics and Barycenters with Path Map**s
Authors:
Florian Wetzels,
Mathieu Pont,
Julien Tierny,
Christoph Garth
Abstract:
Comparative visualization of scalar fields is often facilitated using similarity measures such as edit distances. In this paper, we describe a novel approach for similarity analysis of scalar fields that combines two recently introduced techniques: Wasserstein geodesics/barycenters as well as path map**s, a branch decomposition-independent edit distance. Effectively, we are able to leverage the…
▽ More
Comparative visualization of scalar fields is often facilitated using similarity measures such as edit distances. In this paper, we describe a novel approach for similarity analysis of scalar fields that combines two recently introduced techniques: Wasserstein geodesics/barycenters as well as path map**s, a branch decomposition-independent edit distance. Effectively, we are able to leverage the reduced susceptibility of path map**s to small perturbations in the data when compared with the original Wasserstein distance. Our approach therefore exhibits superior performance and quality in typical tasks such as ensemble summarization, ensemble clustering, and temporal reduction of time series, while retaining practically feasible runtimes. Beyond studying theoretical properties of our approach and discussing implementation aspects, we describe a number of case studies that provide empirical insights into its utility for comparative visualization, and demonstrate the advantages of our method in both synthetic and real-world scenarios. We supply a C++ implementation that can be used to reproduce our results.
△ Less
Submitted 7 August, 2023;
originally announced August 2023.
-
Wasserstein Auto-Encoders of Merge Trees (and Persistence Diagrams)
Authors:
Mahieu Pont,
Julien Tierny
Abstract:
This paper presents a computational framework for the Wasserstein auto-encoding of merge trees (MT-WAE), a novel extension of the classical auto-encoder neural network architecture to the Wasserstein metric space of merge trees. In contrast to traditional auto-encoders which operate on vectorized data, our formulation explicitly manipulates merge trees on their associated metric space at each laye…
▽ More
This paper presents a computational framework for the Wasserstein auto-encoding of merge trees (MT-WAE), a novel extension of the classical auto-encoder neural network architecture to the Wasserstein metric space of merge trees. In contrast to traditional auto-encoders which operate on vectorized data, our formulation explicitly manipulates merge trees on their associated metric space at each layer of the network, resulting in superior accuracy and interpretability. Our novel neural network approach can be interpreted as a non-linear generalization of previous linear attempts [79] at merge tree encoding. It also trivially extends to persistence diagrams. Extensive experiments on public ensembles demonstrate the efficiency of our algorithms, with MT-WAE computations in the orders of minutes on average. We show the utility of our contributions in two applications adapted from previous work on merge tree encoding [79]. First, we apply MT-WAE to merge tree compression, by concisely representing them with their coordinates in the final layer of our auto-encoder. Second, we document an application to dimensionality reduction, by exploiting the latent space of our auto-encoder, for the visual analysis of ensemble data. We illustrate the versatility of our framework by introducing two penalty terms, to help preserve in the latent space both the Wasserstein distances between merge trees, as well as their clusters. In both applications, quantitative experiments assess the relevance of our framework. Finally, we provide a C++ implementation that can be used for reproducibility.
△ Less
Submitted 9 November, 2023; v1 submitted 5 July, 2023;
originally announced July 2023.
-
Wasserstein Dictionaries of Persistence Diagrams
Authors:
Keanu Sisouk,
Julie Delon,
Julien Tierny
Abstract:
This paper presents a computational framework for the concise encoding of an ensemble of persistence diagrams, in the form of weighted Wasserstein barycenters [100], [102] of a dictionary of atom diagrams. We introduce a multi-scale gradient descent approach for the efficient resolution of the corresponding minimization problem, which interleaves the optimization of the barycenter weights with the…
▽ More
This paper presents a computational framework for the concise encoding of an ensemble of persistence diagrams, in the form of weighted Wasserstein barycenters [100], [102] of a dictionary of atom diagrams. We introduce a multi-scale gradient descent approach for the efficient resolution of the corresponding minimization problem, which interleaves the optimization of the barycenter weights with the optimization of the atom diagrams. Our approach leverages the analytic expressions for the gradient of both sub-problems to ensure fast iterations and it additionally exploits shared-memory parallelism. Extensive experiments on public ensembles demonstrate the efficiency of our approach, with Wasserstein dictionary computations in the orders of minutes for the largest examples. We show the utility of our contributions in two applications. First, we apply Wassserstein dictionaries to data reduction and reliably compress persistence diagrams by concisely representing them with their weights in the dictionary. Second, we present a dimensionality reduction framework based on a Wasserstein dictionary defined with a small number of atoms (typically three) and encode the dictionary as a low dimensional simplex embedded in a visual space (typically in 2D). In both applications, quantitative experiments assess the relevance of our framework. Finally, we provide a C++ implementation that can be used to reproduce our results.
△ Less
Submitted 15 September, 2023; v1 submitted 28 April, 2023;
originally announced April 2023.
-
Parallel Computation of Piecewise Linear Morse-Smale Segmentations
Authors:
Robin G. C. Maack,
Jonas Lukasczyk,
Julien Tierny,
Hans Hagen,
Ross Maciejewski,
Christoph Garth
Abstract:
This paper presents a well-scaling parallel algorithm for the computation of Morse-Smale (MS) segmentations, including the region separators and region boundaries. The segmentation of the domain into ascending and descending manifolds, solely defined on the vertices, improves the computational time using path compression and fully segments the border region. Region boundaries and region separators…
▽ More
This paper presents a well-scaling parallel algorithm for the computation of Morse-Smale (MS) segmentations, including the region separators and region boundaries. The segmentation of the domain into ascending and descending manifolds, solely defined on the vertices, improves the computational time using path compression and fully segments the border region. Region boundaries and region separators are generated using a multi-label marching tetrahedra algorithm. This enables a fast and simple solution to find optimal parameter settings in preliminary exploration steps by generating an MS complex preview. It also poses a rapid option to generate a fast visual representation of the region geometries for immediate utilization. Two experiments demonstrate the performance of our approach with speedups of over an order of magnitude in comparison to two publicly available implementations. The example section shows the similarity to the MS complex, the useability of the approach, and the benefits of this method with respect to the presented datasets. We provide our implementation with the paper.
△ Less
Submitted 27 March, 2023;
originally announced March 2023.
-
Topological data analysis of vortices in the magnetically-induced current density in LiH molecule
Authors:
Małgorzata Olejniczak,
Julien Tierny
Abstract:
A novel strategy for extracting axial (AV) and toroidal (TV) vortices in the magnetically-induced current density (MICD) in molecular systems is introduced, and its pilot application to LiH molecule is demonstrated. It exploits differences in the topologies of AV and TV cores and involves two key steps: selecting a scalar function that can describe vortex cores in MICD and its subsequent topologic…
▽ More
A novel strategy for extracting axial (AV) and toroidal (TV) vortices in the magnetically-induced current density (MICD) in molecular systems is introduced, and its pilot application to LiH molecule is demonstrated. It exploits differences in the topologies of AV and TV cores and involves two key steps: selecting a scalar function that can describe vortex cores in MICD and its subsequent topological analysis. The scalar function of choice is $Ω^{B_α}$ based on the velocity-gradient $Ω$ method known in research on classical flows. The Topological Data Analysis (TDA) is then used to analyze the $Ω^{B_α}$ scalar field. In particular, TDA robustly assigns distinct topological features of this field to different vortex types in LiH: AV to saddle-maximum separatrices which connect maxima to 2-saddles located on the domain's boundary, TV to a 1-cycle of the super-level sets of the input data. Both are extracted as the most \emph{persistent} features of the topologically-simplified $Ω^{B_α}$ scalar field.
△ Less
Submitted 20 January, 2023; v1 submitted 16 December, 2022;
originally announced December 2022.
-
Topological Analysis of Ensembles of Hydrodynamic Turbulent Flows -- An Experimental Study
Authors:
Florent Nauleau,
Fabien Vivodtzev,
Thibault Bridel-Bertomeu,
Heloise Beaugendre,
Julien Tierny
Abstract:
This application paper presents a comprehensive experimental evaluation of the suitability of Topological Data Analysis (TDA) for the quantitative comparison of turbulent flows. Specifically, our study documents the usage of the persistence diagram of the maxima of flow enstrophy (an established vorticity indicator), for the topological representation of 180 ensemble members, generated by a coarse…
▽ More
This application paper presents a comprehensive experimental evaluation of the suitability of Topological Data Analysis (TDA) for the quantitative comparison of turbulent flows. Specifically, our study documents the usage of the persistence diagram of the maxima of flow enstrophy (an established vorticity indicator), for the topological representation of 180 ensemble members, generated by a coarse sampling of the parameter space of five numerical solvers. We document five main hypotheses reported by domain experts, describing their expectations regarding the variability of the flows generated by the distinct solver configurations. We contribute three evaluation protocols to assess the validation of the above hypotheses by two comparison measures: (i) a standard distance used in scientific imaging (the L2 norm) and (ii) an established topological distance between persistence diagrams (the L2-Wasserstein metric). Extensive experiments on the input ensemble demonstrate the superiority of the topological distance (ii) to report as close to each other flows which are expected to be similar by domain experts, due to the configuration of their vortices. Overall, the insights reported by our study bring an experimental evidence of the suitability of TDA for representing and comparing turbulent flows, thereby providing to the fluid dynamics community confidence for its usage in future work. Also, our flow data and evaluation protocols provide to the TDA community an application-approved benchmark for the evaluation and design of further topological distances.
△ Less
Submitted 28 July, 2022;
originally announced July 2022.
-
Principal Geodesic Analysis of Merge Trees (and Persistence Diagrams)
Authors:
Mathieu Pont,
Jules Vidal,
Julien Tierny
Abstract:
This paper presents a computational framework for the Principal Geodesic Analysis of merge trees (MT-PGA), a novel adaptation of the celebrated Principal Component Analysis (PCA) framework [87] to the Wasserstein metric space of merge trees [92]. We formulate MT-PGA computation as a constrained optimization problem, aiming at adjusting a basis of orthogonal geodesic axes, while minimizing a fittin…
▽ More
This paper presents a computational framework for the Principal Geodesic Analysis of merge trees (MT-PGA), a novel adaptation of the celebrated Principal Component Analysis (PCA) framework [87] to the Wasserstein metric space of merge trees [92]. We formulate MT-PGA computation as a constrained optimization problem, aiming at adjusting a basis of orthogonal geodesic axes, while minimizing a fitting energy. We introduce an efficient, iterative algorithm which exploits shared-memory parallelism, as well as an analytic expression of the fitting energy gradient, to ensure fast iterations. Our approach also trivially extends to extremum persistence diagrams. Extensive experiments on public ensembles demonstrate the efficiency of our approach - with MT-PGA computations in the orders of minutes for the largest examples. We show the utility of our contributions by extending to merge trees two typical PCA applications. First, we apply MT-PGA to data reduction and reliably compress merge trees by concisely representing them by their first coordinates in the MT-PGA basis. Second, we present a dimensionality reduction framework exploiting the first two directions of the MT-PGA basis to generate two-dimensional layouts of the ensemble. We augment these layouts with persistence correlation views, enabling global and local visual inspections of the feature variability in the ensemble. In both applications, quantitative experiments assess the relevance of our framework. Finally, we provide a C++ implementation that can be used to reproduce our results.
△ Less
Submitted 2 December, 2022; v1 submitted 22 July, 2022;
originally announced July 2022.
-
Discrete Morse Sandwich: Fast Computation of Persistence Diagrams for Scalar Data -- An Algorithm and A Benchmark
Authors:
Pierre Guillou,
Jules Vidal,
Julien Tierny
Abstract:
This paper introduces an efficient algorithm for persistence diagram computation, given an input piecewise linear scalar field $f$ defined on a $d$-dimensional simplicial complex $K$, with $d \leq 3$. Our work revisits the seminal algorithm "PairSimplices" [31], [103] with discrete Morse theory (DMT) [34], [80], which greatly reduces the number of input simplices to consider. Further, we also exte…
▽ More
This paper introduces an efficient algorithm for persistence diagram computation, given an input piecewise linear scalar field $f$ defined on a $d$-dimensional simplicial complex $K$, with $d \leq 3$. Our work revisits the seminal algorithm "PairSimplices" [31], [103] with discrete Morse theory (DMT) [34], [80], which greatly reduces the number of input simplices to consider. Further, we also extend to DMT and accelerate the stratification strategy described in "PairSimplices" for the fast computation of the $0^{th}$ and $(d - 1)^{th}$ diagrams, noted $D_0(f)$ and $D_{d-1}(f)$. Minima-saddle persistence pairs ($D_0(f)$) and saddle-maximum persistence pairs ($D_{d-1}(f)$) are efficiently computed by processing, with a Union-Find, the unstable sets of $1$-saddles and the stable sets of $(d - 1)$-saddles. This fast pre-computation for the dimensions $0$ and $(d - 1)$ enables an aggressive specialization of [4] to the 3D case, which results in a drastic reduction of the number of input simplices for the computation of $D_1(f)$, the intermediate layer of the sandwich. Finally, we document several performance improvements via shared-memory parallelism. We provide an open-source implementation of our algorithm for reproducibility purposes. We also contribute a reproducible benchmark package, which exploits three-dimensional data from a public repository and compares our algorithm to a variety of publicly available implementations. Extensive experiments indicate that our algorithm improves by two orders of magnitude the time performance of the seminal "PairSimplices" algorithm it extends. Moreover, it also improves memory footprint and time performance over a selection of 14 competing approaches, with a substantial gain over the fastest available approaches, while producing a strictly identical output.
△ Less
Submitted 13 January, 2023; v1 submitted 27 June, 2022;
originally announced June 2022.
-
Utilising urgent computing to tackle the spread of mosquito-borne diseases
Authors:
Nick Brown,
Rupert Nash,
Piero Poletti,
Giorgio Guzzetta,
Mattia Manica,
Agnese Zardini,
Markus Flatken,
Jules Vidal,
Charles Gueunet,
Evgenij Belikov,
Julien Tierny,
Artur Podobas,
Wei Der Chien,
Stefano Markidis,
Andreas Gerndt
Abstract:
It is estimated that around 80\% of the world's population live in areas susceptible to at-least one major vector borne disease, and approximately 20% of global communicable diseases are spread by mosquitoes. Furthermore, the outbreaks of such diseases are becoming more common and widespread, with much of this driven in recent years by socio-demographic and climatic factors. These trends are causi…
▽ More
It is estimated that around 80\% of the world's population live in areas susceptible to at-least one major vector borne disease, and approximately 20% of global communicable diseases are spread by mosquitoes. Furthermore, the outbreaks of such diseases are becoming more common and widespread, with much of this driven in recent years by socio-demographic and climatic factors. These trends are causing significant worry to global health organisations, including the CDC and WHO, and-so an important question is the role that technology can play in addressing them.
In this work we describe the integration of an epidemiology model, which simulates the spread of mosquito-borne diseases, with the VESTEC urgent computing ecosystem. The intention of this work is to empower human health professionals to exploit this model and more easily explore the progression of mosquito-borne diseases. Traditionally in the domain of the few research scientists, by leveraging state of the art visualisation and analytics techniques, all supported by running the computational workloads on HPC machines in a seamless fashion, we demonstrate the significant advantages that such an integration can provide. Furthermore we demonstrate the benefits of using an ecosystem such as VESTEC, which provides a framework for urgent computing, in supporting the easy adoption of these technologies by the epidemiologists and disaster response professionals more widely.
△ Less
Submitted 10 November, 2021;
originally announced November 2021.
-
Fast Approximation of Persistence Diagrams with Guarantees
Authors:
Jules Vidal,
Julien Tierny
Abstract:
This paper presents an algorithm for the efficient approximation of the saddle-extremum persistence diagram of a scalar field. Vidal et al. introduced recently a fast algorithm for such an approximation (by interrupting a progressive computation framework). However, no theoretical guarantee was provided regarding its approximation quality. In this work, we revisit the progressive framework of Vida…
▽ More
This paper presents an algorithm for the efficient approximation of the saddle-extremum persistence diagram of a scalar field. Vidal et al. introduced recently a fast algorithm for such an approximation (by interrupting a progressive computation framework). However, no theoretical guarantee was provided regarding its approximation quality. In this work, we revisit the progressive framework of Vidal et al. and we introduce in contrast a novel approximation algorithm, with a user controlled approximation error, specifically, on the Bottleneck distance to the exact solution. Our approach is based on a hierarchical representation of the input data, and relies on local simplifications of the scalar field to accelerate the computation, while maintaining a controlled bound on the output error. The locality of our approach enables further speedups thanks to shared memory parallelism. Experiments conducted on real life datasets show that for a mild error tolerance (5% relative Bottleneck distance), our approach improves runtime performance by 18% on average (and up to 48% on large, noisy datasets) in comparison to standard, exact, publicly available implementations. In addition to the strong guarantees on its approximation error, we show that our algorithm also provides in practice outputs which are on average 5 times more accurate (in terms of the L2-Wasserstein distance) than a naive approximation baseline. We illustrate the utility of our approach for interactive data exploration and we document visualization strategies for conveying the uncertainty related to our approximations.
△ Less
Submitted 12 August, 2021;
originally announced August 2021.
-
Wasserstein Distances, Geodesics and Barycenters of Merge Trees
Authors:
Mathieu Pont,
Jules Vidal,
Julie Delon,
Julien Tierny
Abstract:
This paper presents a unified computational framework for the estimation of distances, geodesics and barycenters of merge trees. We extend recent work on the edit distance [106] and introduce a new metric, called the Wasserstein distance between merge trees, which is purposely designed to enable efficient computations of geodesics and barycenters. Specifically, our new distance is strictly equival…
▽ More
This paper presents a unified computational framework for the estimation of distances, geodesics and barycenters of merge trees. We extend recent work on the edit distance [106] and introduce a new metric, called the Wasserstein distance between merge trees, which is purposely designed to enable efficient computations of geodesics and barycenters. Specifically, our new distance is strictly equivalent to the L2-Wasserstein distance between extremum persistence diagrams, but it is restricted to a smaller solution space, namely, the space of rooted partial isomorphisms between branch decomposition trees. This enables a simple extension of existing optimization frameworks [112] for geodesics and barycenters from persistence diagrams to merge trees. We introduce a task-based algorithm which can be generically applied to distance, geodesic, barycenter or cluster computation. The task-based nature of our approach enables further accelerations with shared-memory parallelism. Extensive experiments on public ensembles and SciVis contest benchmarks demonstrate the efficiency of our approach -- with barycenter computations in the orders of minutes for the largest examples -- as well as its qualitative ability to generate representative barycenter merge trees, visually summarizing the features of interest found in the ensemble. We show the utility of our contributions with dedicated visualization applications: feature tracking, temporal reduction and ensemble clustering. We provide a lightweight C++ implementation that can be used to reproduce our results.
△ Less
Submitted 20 September, 2021; v1 submitted 16 July, 2021;
originally announced July 2021.
-
TopoMap: A 0-dimensional Homology Preserving Projection of High-Dimensional Data
Authors:
Harish Doraiswamy,
Julien Tierny,
Paulo J. S. Silva,
Luis Gustavo Nonato,
Claudio Silva
Abstract:
Multidimensional Projection is a fundamental tool for high-dimensional data analytics and visualization. With very few exceptions, projection techniques are designed to map data from a high-dimensional space to a visual space so as to preserve some dissimilarity (similarity) measure, such as the Euclidean distance for example. In fact, although adopting distinct mathematical formulations designed…
▽ More
Multidimensional Projection is a fundamental tool for high-dimensional data analytics and visualization. With very few exceptions, projection techniques are designed to map data from a high-dimensional space to a visual space so as to preserve some dissimilarity (similarity) measure, such as the Euclidean distance for example. In fact, although adopting distinct mathematical formulations designed to favor different aspects of the data, most multidimensional projection methods strive to preserve dissimilarity measures that encapsulate geometric properties such as distances or the proximity relation between data objects. However, geometric relations are not the only interesting property to be preserved in a projection. For instance, the analysis of particular structures such as clusters and outliers could be more reliably performed if the map** process gives some guarantee as to topological invariants such as connected components and loops. This paper introduces TopoMap, a novel projection technique which provides topological guarantees during the map** process. In particular, the proposed method performs the map** from a high-dimensional space to a visual space, while preserving the 0-dimensional persistence diagram of the Rips filtration of the high-dimensional data, ensuring that the filtrations generate the same connected components when applied to the original as well as projected data. The presented case studies show that the topological guarantee provided by TopoMap not only brings confidence to the visual analytic process but also can be used to assist in the assessment of other projection methods.
△ Less
Submitted 3 September, 2020;
originally announced September 2020.
-
Localized Topological Simplification of Scalar Data
Authors:
Jonas Lukasczyk,
Christoph Garth,
Ross Maciejewski,
Julien Tierny
Abstract:
This paper describes a localized algorithm for the topological simplification of scalar data, an essential pre-processing step of topological data analysis (TDA). Given a scalar field f and a selection of extrema to preserve, the proposed localized topological simplification (LTS) derives a function g that is close to f and only exhibits the selected set of extrema. Specifically, sub- and superlev…
▽ More
This paper describes a localized algorithm for the topological simplification of scalar data, an essential pre-processing step of topological data analysis (TDA). Given a scalar field f and a selection of extrema to preserve, the proposed localized topological simplification (LTS) derives a function g that is close to f and only exhibits the selected set of extrema. Specifically, sub- and superlevel set components associated with undesired extrema are first locally flattened and then correctly embedded into the global scalar field, such that these regions are guaranteed -- from a combinatorial perspective -- to no longer contain any undesired extrema. In contrast to previous global approaches, LTS only and independently processes regions of the domain that actually need to be simplified, which already results in a noticeable speedup. Moreover, due to the localized nature of the algorithm, LTS can utilize shared-memory parallelism to simplify regions simultaneously with a high parallel efficiency (70%). Hence, LTS significantly improves interactivity for the exploration of simplification parameters and their effect on subsequent topological analysis. For such exploration tasks, LTS brings the overall execution time of a plethora of TDA pipelines from minutes down to seconds, with an average observed speedup over state-of-the-art techniques of up to x36. Furthermore, in the special case where preserved extrema are selected based on topological persistence, an adapted version of LTS partially computes the persistence diagram and simultaneously simplifies features below a predefined persistence threshold. The effectiveness of LTS, its parallel efficiency, and its resulting benefits for TDA are demonstrated on several simulated and acquired datasets from different application domains, including physics, chemistry, and biomedical imaging.
△ Less
Submitted 31 August, 2020;
originally announced September 2020.
-
A Progressive Approach to Scalar Field Topology
Authors:
Jules Vidal,
Pierre Guillou,
Julien Tierny
Abstract:
This paper introduces progressive algorithms for the topological analysis of scalar data. Our approach is based on a hierarchical representation of the input data and the fast identification of topologically invariant vertices, which are vertices that have no impact on the topological description of the data and for which we show that no computation is required as they are introduced in the hierar…
▽ More
This paper introduces progressive algorithms for the topological analysis of scalar data. Our approach is based on a hierarchical representation of the input data and the fast identification of topologically invariant vertices, which are vertices that have no impact on the topological description of the data and for which we show that no computation is required as they are introduced in the hierarchy. This enables the definition of efficient coarse-to-fine topological algorithms, which leverage fast update mechanisms for ordinary vertices and avoid computation for the topologically invariant ones. We demonstrate our approach with two examples of topological algorithms (critical point extraction and persistence diagram computation), which generate interpretable outputs upon interruption requests and which progressively refine them otherwise. Experiments on real-life datasets illustrate that our progressive strategy, in addition to the continuous visual feedback it provides, even improves run time performance with regard to non-progressive algorithms and we describe further accelerations with shared-memory parallelism. We illustrate the utility of our approach in batch-mode and interactive setups, where it respectively enables the control of the execution time of complete topological pipelines as well as previews of the topological features found in a dataset, with progressive updates delivered within interactive times.
△ Less
Submitted 17 February, 2021; v1 submitted 29 July, 2020;
originally announced July 2020.
-
Statistical Parameter Selection for Clustering Persistence Diagrams
Authors:
Max Kontak,
Jules Vidal,
Julien Tierny
Abstract:
In urgent decision making applications, ensemble simulations are an important way to determine different outcome scenarios based on currently available data. In this paper, we will analyze the output of ensemble simulations by considering so-called persistence diagrams, which are reduced representations of the original data, motivated by the extraction of topological features. Based on a recently…
▽ More
In urgent decision making applications, ensemble simulations are an important way to determine different outcome scenarios based on currently available data. In this paper, we will analyze the output of ensemble simulations by considering so-called persistence diagrams, which are reduced representations of the original data, motivated by the extraction of topological features. Based on a recently published progressive algorithm for the clustering of persistence diagrams, we determine the optimal number of clusters, and therefore the number of significantly different outcome scenarios, by the minimization of established statistical score functions. Furthermore, we present a proof-of-concept prototype implementation of the statistical selection of the number of clusters and provide the results of an experimental study, where this implementation has been applied to real-world ensemble data sets.
△ Less
Submitted 17 October, 2019;
originally announced October 2019.
-
Ranking Viscous Finger Simulations to an Acquired Ground Truth with Topology-aware Matchings
Authors:
Maxime Soler,
Martin Petitfrere,
Gilles Darche,
Melanie Plainchault,
Bruno Conche,
Julien Tierny
Abstract:
This application paper presents a novel framework based on topological data analysis for the automatic evaluation and ranking of viscous finger simulation runs in an ensemble with respect to a reference acquisition. Individual fingers in a given time-step are associated with critical point pairs in the distance field to the injection point, forming persistence diagrams. Different metrics, based on…
▽ More
This application paper presents a novel framework based on topological data analysis for the automatic evaluation and ranking of viscous finger simulation runs in an ensemble with respect to a reference acquisition. Individual fingers in a given time-step are associated with critical point pairs in the distance field to the injection point, forming persistence diagrams. Different metrics, based on optimal transport, for comparing time-varying persistence diagrams in this specific applicative case are introduced. We evaluate the relevance of the rankings obtained with these metrics, both qualitatively thanks to a lightweight web visual interface, and quantitatively by studying the deviation from a reference ranking suggested by experts. Extensive experiments show the quantitative superiority of our approach compared to traditional alternatives. Our web interface allows experts to conveniently explore the produced rankings. We show a complete viscous fingering case study demonstrating the utility of our approach in the context of porous media fluid flow, where our framework can be used to automatically discard physically-irrelevant simulation runs from the ensemble and rank the most plausible ones. We document an in-situ implementation to lighten I/O and performance constraints arising in the context of parametric studies.
△ Less
Submitted 20 August, 2019;
originally announced August 2019.
-
Progressive Wasserstein Barycenters of Persistence Diagrams
Authors:
Jules Vidal,
Joseph Budin,
Julien Tierny
Abstract:
This paper presents an efficient algorithm for the progressive approximation of Wasserstein barycenters of persistence diagrams, with applications to the visual analysis of ensemble data. Given a set of scalar fields, our approach enables the computation of a persistence diagram which is representative of the set, and which visually conveys the number, data ranges and saliences of the main feature…
▽ More
This paper presents an efficient algorithm for the progressive approximation of Wasserstein barycenters of persistence diagrams, with applications to the visual analysis of ensemble data. Given a set of scalar fields, our approach enables the computation of a persistence diagram which is representative of the set, and which visually conveys the number, data ranges and saliences of the main features of interest found in the set. Such representative diagrams are obtained by computing explicitly the discrete Wasserstein barycenter of the set of persistence diagrams, a notoriously computationally intensive task. In particular, we revisit efficient algorithms for Wasserstein distance approximation [12,51] to extend previous work on barycenter estimation [94]. We present a new fast algorithm, which progressively approximates the barycenter by iteratively increasing the computation accuracy as well as the number of persistent features in the output diagram. Such a progressivity drastically improves convergence in practice and allows to design an interruptible algorithm, capable of respecting computation time constraints. This enables the approximation of Wasserstein barycenters within interactive times. We present an application to ensemble clustering where we revisit the k-means algorithm to exploit our barycenters and compute, within execution time constraints, meaningful clusters of ensemble data along with their barycenter diagram. Extensive experiments on synthetic and real-life data sets report that our algorithm converges to barycenters that are qualitatively meaningful with regard to the applications, and quantitatively comparable to previous techniques, while offering an order of magnitude speedup when run until convergence (without time constraint). Our algorithm can be trivially parallelized to provide additional speedups in practice on standard workstations. [...]
△ Less
Submitted 9 October, 2019; v1 submitted 10 July, 2019;
originally announced July 2019.
-
Task-based Augmented Contour Trees with Fibonacci Heaps
Authors:
Charles Gueunet,
P. Fortin,
J Jomier,
J Tierny
Abstract:
This paper presents a new algorithm for the fast, shared memory, multi-core computation of augmented contour trees on triangulations. In contrast to most existing parallel algorithms our technique computes augmented trees, enabling the full extent of contour tree based applications including data segmentation. Our approach completely revisits the traditional, sequential contour tree algorithm to r…
▽ More
This paper presents a new algorithm for the fast, shared memory, multi-core computation of augmented contour trees on triangulations. In contrast to most existing parallel algorithms our technique computes augmented trees, enabling the full extent of contour tree based applications including data segmentation. Our approach completely revisits the traditional, sequential contour tree algorithm to re-formulate all the steps of the computation as a set of independent local tasks. This includes a new computation procedure based on Fibonacci heaps for the join and split trees, two intermediate data structures used to compute the contour tree, whose constructions are efficiently carried out concurrently thanks to the dynamic scheduling of task parallelism. We also introduce a new parallel algorithm for the combination of these two trees into the output global contour tree. Overall, this results in superior time performance in practice, both in sequential and in parallel thanks to the OpenMP task runtime. We report performance numbers that compare our approach to reference sequential and multi-threaded implementations for the computation of augmented merge and contour trees. These experiments demonstrate the run-time efficiency of our approach and its scalability on common workstations. We demonstrate the utility of our approach in data segmentation applications.
△ Less
Submitted 13 February, 2019;
originally announced February 2019.
-
Lifted Wasserstein Matcher for Fast and Robust Topology Tracking
Authors:
Maxime Soler,
Mélanie Plainchault,
Bruno Conche,
Julien Tierny
Abstract:
This paper presents a robust and efficient method for tracking topological features in time-varying scalar data. Structures are tracked based on the optimal matching between persistence diagrams with respect to the Wasserstein metric. This fundamentally relies on solving the assignment problem, a special case of optimal transport, for all consecutive timesteps. Our approach relies on two main cont…
▽ More
This paper presents a robust and efficient method for tracking topological features in time-varying scalar data. Structures are tracked based on the optimal matching between persistence diagrams with respect to the Wasserstein metric. This fundamentally relies on solving the assignment problem, a special case of optimal transport, for all consecutive timesteps. Our approach relies on two main contributions. First, we revisit the seminal assignment algorithm by Kuhn and Munkres which we specifically adapt to the problem of matching persistence diagrams in an efficient way. Second, we propose an extension of the Wasserstein metric that significantly improves the geometrical stability of the matching of domain-embedded persistence pairs. We show that this geometrical lifting has the additional positive side-effect of improving the assignment matrix sparsity and therefore computing time. The global framework implements a coarse-grained parallelism by computing persistence diagrams and finding optimal matchings in parallel for every couple of consecutive timesteps. Critical trajectories are constructed by associating successively matched persistence pairs over time. Merging and splitting events are detected with a geometrical threshold in a post-processing stage. Extensive experiments on real-life datasets show that our matching approach is an order of magnitude faster than the seminal Munkres algorithm. Moreover, compared to a modern approximation method, our method provides competitive runtimes while yielding exact results. We demonstrate the utility of our global framework by extracting critical point trajectories from various simulated time-varying datasets and compare it to the existing methods based on associated overlaps of volumes. Robustness to noise and temporal resolution downsampling is empirically demonstrated.
△ Less
Submitted 2 January, 2019; v1 submitted 17 August, 2018;
originally announced August 2018.
-
Persistence Atlas for Critical Point Variability in Ensembles
Authors:
Guillaume Favelier,
Noura Faraj,
Brian Summa,
Julien Tierny
Abstract:
This paper presents a new approach for the visualization and analysis of the spatial variability of features of interest represented by critical points in ensemble data. Our framework, called Persistence Atlas, enables the visualization of the dominant spatial patterns of critical points, along with statistics regarding their occurrence in the ensemble. The persistence atlas represents in the geom…
▽ More
This paper presents a new approach for the visualization and analysis of the spatial variability of features of interest represented by critical points in ensemble data. Our framework, called Persistence Atlas, enables the visualization of the dominant spatial patterns of critical points, along with statistics regarding their occurrence in the ensemble. The persistence atlas represents in the geometrical domain each dominant pattern in the form of a confidence map for the appearance of critical points. As a by-product, our method also provides 2-dimensional layouts of the entire ensemble, highlighting the main trends at a global level. Our approach is based on the new notion of Persistence Map, a measure of the geometrical density in critical points which leverages the robustness to noise of topological persistence to better emphasize salient features. We show how to leverage spectral embedding to represent the ensemble members as points in a low-dimensional Euclidean space, where distances between points measure the dissimilarities between critical point layouts and where statistical tasks, such as clustering, can be easily carried out. Further, we show how the notion of mandatory critical point can be leveraged to evaluate for each cluster confidence regions for the appearance of critical points. Most of the steps of this framework can be trivially parallelized and we show how to efficiently implement them. Extensive experiments demonstrate the relevance of our approach. The accuracy of the confidence regions provided by the persistence atlas is quantitatively evaluated and compared to a baseline strategy using an off-the-shelf clustering approach. We illustrate the importance of the persistence atlas in a variety of real-life datasets, where clear trends in feature layouts are identified and analyzed.
△ Less
Submitted 30 July, 2018;
originally announced July 2018.
-
Topological Data Analysis Made Easy with the Topology ToolKit
Authors:
Guillaume Favelier,
Charles Gueunet,
Attila Gyulassy,
Julien Kitware,
Joshua Levine,
Jonas Lukasczyk,
Daisuke Sakurai,
Maxime Soler,
Julien Tierny,
Will Usher,
Qi Wu
Abstract:
This tutorial presents topological methods for the analysis and visualization of scientific data from a user's perspective, with the Topology ToolKit (TTK), a recently released open-source library for topological data analysis. Topological methods have gained considerably in popularity and maturity over the last twenty years and success stories of established methods have been documented in a wide…
▽ More
This tutorial presents topological methods for the analysis and visualization of scientific data from a user's perspective, with the Topology ToolKit (TTK), a recently released open-source library for topological data analysis. Topological methods have gained considerably in popularity and maturity over the last twenty years and success stories of established methods have been documented in a wide range of applications (combustion, chemistry, astrophysics, material sciences, etc.) with both acquired and simulated data, in both post-hoc and in-situ contexts. While reference textbooks have been published on the topic, no tutorial at IEEE VIS has covered this area in recent years, and never at a software level and from a user's point-of-view. This tutorial fills this gap by providing a beginner's introduction to topological methods for practitioners, researchers, students, and lecturers. In particular, instead of focusing on theoretical aspects and algorithmic details, this tutorial focuses on how topological methods can be useful in practice for concrete data analysis tasks such as segmentation, feature extraction or tracking. The tutorial describes in detail how to achieve these tasks with TTK. First, after an introduction to topological methods and their application in data analysis, a brief overview of TTK's main entry point for end users, namely ParaView, will be presented. Second, an overview of TTK's main features will be given. A running example will be described in detail, showcasing how to access TTK's features via ParaView, Python, VTK/C++, and C++. Third, hands-on sessions will concretely show how to use TTK in ParaView for multiple, representative data analysis tasks. Fourth, the usage of TTK will be presented for developers, in particular by describing several examples of visualization and data analysis projects that were built on top of TTK. Finally, some feedback regarding the usage of TTK as a teaching platform for topological analysis will be given. Presenters of this tutorial include experts in topological methods, core authors of TTK as well as active users, coming from academia, labs, or industry. A large part of the tutorial will be dedicated to hands-on exercises and a rich material package (including TTK pre-installs in virtual machines, code, data, demos, video tutorials, etc.) will be provided to the participants. This tutorial mostly targets students, practitioners and researchers who are not experts in topological methods but who are interested in using them in their daily tasks. We also target researchers already familiar to topological methods and who are interested in using or contributing to TTK.
△ Less
Submitted 21 June, 2018;
originally announced June 2018.
-
The Topology ToolKit
Authors:
Julien Tierny,
Guillaume Favelier,
Joshua A. Levine,
Charles Gueunet,
Michael Michaux
Abstract:
This system paper presents the Topology ToolKit (TTK), a software platform designed for topological data analysis in scientific visualization. TTK provides a unified, generic, efficient, and robust implementation of key algorithms for the topological analysis of scalar data, including: critical points, integral lines, persistence diagrams, persistence curves, merge trees, contour trees, Morse-Smal…
▽ More
This system paper presents the Topology ToolKit (TTK), a software platform designed for topological data analysis in scientific visualization. TTK provides a unified, generic, efficient, and robust implementation of key algorithms for the topological analysis of scalar data, including: critical points, integral lines, persistence diagrams, persistence curves, merge trees, contour trees, Morse-Smale complexes, fiber surfaces, continuous scatterplots, Jacobi sets, Reeb spaces, and more. TTK is easily accessible to end users due to a tight integration with ParaView. It is also easily accessible to developers through a variety of bindings (Python, VTK/C++) for fast prototy** or through direct, dependence-free, C++, to ease integration into pre-existing complex systems. While develo** TTK, we faced several algorithmic and software engineering challenges, which we document in this paper. In particular, we present an algorithm for the construction of a discrete gradient that complies to the critical points extracted in the piecewise-linear setting. This algorithm guarantees a combinatorial consistency across the topological abstractions supported by TTK, and importantly, a unified implementation of topological data simplification for multi-scale exploration and analysis. We also present a cached triangulation data structure, that supports time efficient and generic traversals, which self-adjusts its memory usage on demand for input simplicial meshes and which implicitly emulates a triangulation for regular grids with no memory overhead. Finally, we describe an original software architecture, which guarantees memory efficient and direct accesses to TTK features, while still allowing for researchers powerful and easy bindings and extensions. TTK is open source (BSD license) and its code, online documentation and video tutorials are available on TTK's website.
△ Less
Submitted 16 July, 2019; v1 submitted 22 May, 2018;
originally announced May 2018.
-
Topologically Controlled Lossy Compression
Authors:
Maxime Soler,
Melanie Plainchault,
Bruno Conche,
Julien Tierny
Abstract:
This paper presents a new algorithm for the lossy compression of scalar data defined on 2D or 3D regular grids, with topological control. Certain techniques allow users to control the pointwise error induced by the compression. However, in many scenarios it is desirable to control in a similar way the preservation of higher-level notions, such as topological features , in order to provide guarante…
▽ More
This paper presents a new algorithm for the lossy compression of scalar data defined on 2D or 3D regular grids, with topological control. Certain techniques allow users to control the pointwise error induced by the compression. However, in many scenarios it is desirable to control in a similar way the preservation of higher-level notions, such as topological features , in order to provide guarantees on the outcome of post-hoc data analyses. This paper presents the first compression technique for scalar data which supports a strictly controlled loss of topological features. It provides users with specific guarantees both on the preservation of the important features and on the size of the smaller features destroyed during compression. In particular, we present a simple compression strategy based on a topologically adaptive quantization of the range. Our algorithm provides strong guarantees on the bottleneck distance between persistence diagrams of the input and decompressed data, specifically those associated with extrema. A simple extension of our strategy additionally enables a control on the pointwise error. We also show how to combine our approach with state-of-the-art compressors, to further improve the geometrical reconstruction. Extensive experiments, for comparable compression rates, demonstrate the superiority of our algorithm in terms of the preservation of topological features. We show the utility of our approach by illustrating the compatibility between the output of post-hoc topological data analysis pipelines, executed on the input and decompressed data, for simulated or acquired data sets. We also provide a lightweight VTK-based C++ implementation of our approach for reproduction purposes.
△ Less
Submitted 8 February, 2018;
originally announced February 2018.
-
Jacobians and Hessians of Mean Value Coordinates for Closed Triangular Meshes
Authors:
Jean-Marc Thiery,
Julien Tierny,
Tamy Boubekeur
Abstract:
In this technical note, we present the formulae of the derivatives of the Mean Value Coordinates based transformations, using an enclosing triangle mesh, acting as a cage for the deformation of an interior object.
In this technical note, we present the formulae of the derivatives of the Mean Value Coordinates based transformations, using an enclosing triangle mesh, acting as a cage for the deformation of an interior object.
△ Less
Submitted 24 October, 2011; v1 submitted 9 September, 2011;
originally announced September 2011.