-
giotto-ph: A Python Library for High-Performance Computation of Persistent Homology of Vietoris-Rips Filtrations
Authors:
Julián Burella Pérez,
Sydney Hauke,
Umberto Lupo,
Matteo Caorsi,
Alberto Dassatti
Abstract:
We introduce giotto-ph, a high-performance, open-source software package for the computation of Vietoris-Rips barcodes. giotto-ph is based on Morozov and Nigmetov's lockfree (multicore) implementation of Ulrich Bauer's Ripser package. It also contains a re-working of the GUDHI library's implementation of Boissonnat and Pritam's Edge Collapser, which can be used as a pre-processing step to dramatic…
▽ More
We introduce giotto-ph, a high-performance, open-source software package for the computation of Vietoris-Rips barcodes. giotto-ph is based on Morozov and Nigmetov's lockfree (multicore) implementation of Ulrich Bauer's Ripser package. It also contains a re-working of the GUDHI library's implementation of Boissonnat and Pritam's Edge Collapser, which can be used as a pre-processing step to dramatically reduce overall run-times in certain scenarios. Our contribution is twofold: on the one hand, we integrate existing state-of-the-art ideas coherently in a single library and provide Python bindings to the C++ code. On the other hand, we increase parallelization opportunities and improve overall performance by adopting more efficient data structures. Our persistent homology backend establishes a new state of the art, surpassing even GPU-accelerated implementations such as Ripser++ when using as few as 5-10 CPU cores. Furthermore, our implementation of Edge Collapser has fewer software dependencies and improved run-times relative to GUDHI's original implementation.
△ Less
Submitted 2 August, 2021; v1 submitted 12 July, 2021;
originally announced July 2021.
-
giotto-tda: A Topological Data Analysis Toolkit for Machine Learning and Data Exploration
Authors:
Guillaume Tauzin,
Umberto Lupo,
Lewis Tunstall,
Julian Burella Pérez,
Matteo Caorsi,
Wojciech Reise,
Anibal Medina-Mardones,
Alberto Dassatti,
Kathryn Hess
Abstract:
We introduce giotto-tda, a Python library that integrates high-performance topological data analysis with machine learning via a scikit-learn-compatible API and state-of-the-art C++ implementations. The library's ability to handle various types of data is rooted in a wide range of preprocessing techniques, and its strong focus on data exploration and interpretability is aided by an intuitive plott…
▽ More
We introduce giotto-tda, a Python library that integrates high-performance topological data analysis with machine learning via a scikit-learn-compatible API and state-of-the-art C++ implementations. The library's ability to handle various types of data is rooted in a wide range of preprocessing techniques, and its strong focus on data exploration and interpretability is aided by an intuitive plotting API. Source code, binaries, examples, and documentation can be found at https://github.com/giotto-ai/giotto-tda.
△ Less
Submitted 5 March, 2021; v1 submitted 6 April, 2020;
originally announced April 2020.
-
Transparent Live Code Offloading on FPGA
Authors:
Roberto Rigamonti,
Baptiste Delporte,
Anthony Convers,
Alberto Dassatti
Abstract:
Even though it seems that FPGAs have finally made the transition from research labs to the consumer devices' market, programming them remains challenging. Despite the improvements made by High-Level Synthesis (HLS), which removed the language and paradigm barriers that prevented many computer scientists from working with them, producing a new design typically requires at least several hours, makin…
▽ More
Even though it seems that FPGAs have finally made the transition from research labs to the consumer devices' market, programming them remains challenging. Despite the improvements made by High-Level Synthesis (HLS), which removed the language and paradigm barriers that prevented many computer scientists from working with them, producing a new design typically requires at least several hours, making data- and context-dependent adaptations virtually impossible.
In this paper we present a new framework that off-loads, on-the-fly and transparently to both the user and the developer, computationally-intensive code fragments to FPGAs. While the performance should not surpass that of hand-crafted HDL code, or even code produced by HLS, our results come with no additional development costs and do not require producing and deploying a new bit-stream to the FPGA each time a change is made. Moreover, since optimizations are made at run-time, they may fit particular datasets or usage scenarios, something which is rarely foreseeable at design or compile time.
Our proposal revolves around an overlay architecture that is pre-programmed on the FPGA and dynamically reconfigured by our framework to execute code fragments extracted from the Data Flow Graph (DFG) of computational intensive routines. We validated our solution using standard benchmarks and proved we are able to off-load to FPGAs without developer's intervention.
△ Less
Submitted 1 September, 2016;
originally announced September 2016.
-
HPA: An Opportunistic Approach to Embedded Energy Efficiency
Authors:
Baptiste Delporte,
Roberto Rigamonti,
Alberto Dassatti
Abstract:
Reducing energy consumption is a challenge that is faced on a daily basis by teams from the High-Performance Computing as well as the Embedded domain. This issue is mostly attacked from an hardware perspective, by devising architectures that put energy efficiency as a primary target, often at the cost of processing power. Lately, computing platforms have become more and more heterogeneous, but the…
▽ More
Reducing energy consumption is a challenge that is faced on a daily basis by teams from the High-Performance Computing as well as the Embedded domain. This issue is mostly attacked from an hardware perspective, by devising architectures that put energy efficiency as a primary target, often at the cost of processing power. Lately, computing platforms have become more and more heterogeneous, but the exploitation of these additional capabilities is so complex from the application developer's perspective that they are left unused most of the time, resulting therefore in a supplemental waste of energy rather than in faster processing times.
In this paper we present a transparent, on-the-fly optimization scheme that allows a generic application to automatically exploit the available computing units to partition its computational load. We have called our approach Heterogeneous Platform Accelerator (HPA). The idea is to use profiling to automatically select a computing-intensive candidate for acceleration, and then distribute the computations to the different units by off-loading blocks of code to them.
Using an NVIDIA Jetson TK1 board, we demonstrate that not only HPA results in faster processing speed, but also in a considerable reduction in the total energy absorbed.
△ Less
Submitted 27 November, 2015;
originally announced November 2015.
-
Toward Transparent Heterogeneous Systems
Authors:
Baptiste Delporte,
Roberto Rigamonti,
Alberto Dassatti
Abstract:
Heterogeneous parallel systems are widely spread nowadays. Despite their availability, their usage and adoption are still limited, and even more rarely they are used to full power. Indeed, compelling new technologies are constantly developed and keep changing the technological landscape, but each of them targets a limited sub-set of supported devices, and nearly all of them require new programming…
▽ More
Heterogeneous parallel systems are widely spread nowadays. Despite their availability, their usage and adoption are still limited, and even more rarely they are used to full power. Indeed, compelling new technologies are constantly developed and keep changing the technological landscape, but each of them targets a limited sub-set of supported devices, and nearly all of them require new programming paradigms and specific toolsets. Software, however, can hardly keep the pace with the growing number of computational capabilities, and developers are less and less motivated in learning skills that could quickly become obsolete. In this paper we present our effort in the direction of a transparent system optimization based on automatic code profiling and Just-In-Time compilation, that resulted in a fully-working embedded prototype capable of dynamically detect computing-intensive code blocks and automatically dispatch them to different computation units. Experimental results show that our system allows gains up to 32x in performance --- after an initial warm-up phase --- without requiring any human intervention.
△ Less
Submitted 20 November, 2015; v1 submitted 18 November, 2015;
originally announced November 2015.