-
An MPI-based Algorithm for Map** Complex Networks onto Hierarchical Architectures
Authors:
Maria Predari,
Charilaos Tzovas,
Christian Schulz,
Henning Meyerhenke
Abstract:
Processing massive application graphs on distributed memory systems requires to map the graphs onto the system's processing elements (PEs). This task becomes all the more important when PEs have non-uniform communication costs or the input is highly irregular. Typically, map** is addressed using partitioning, in a two-step approach or an integrated one. Parallel partitioning tools do exist; yet,…
▽ More
Processing massive application graphs on distributed memory systems requires to map the graphs onto the system's processing elements (PEs). This task becomes all the more important when PEs have non-uniform communication costs or the input is highly irregular. Typically, map** is addressed using partitioning, in a two-step approach or an integrated one. Parallel partitioning tools do exist; yet, corresponding map** algorithms or their public implementations all have major sequential parts or other severe scaling limitations. In this paper, we propose a parallel algorithm that maps graphs onto the PEs of a hierarchical system. Our solution integrates partitioning and map**; it models the system hierarchy in a concise way as an implicit labeled tree. The vertices of the application graph are labeled as well, and these vertex labels induce the map**. The map** optimization follows the basic idea of parallel label propagation, but we tailor the gain computations of label changes to quickly account for the induced communication costs. Our MPI-based code is the first public implementation of a parallel graph map** algorithm; to this end, we extend the partitioning library ParHIP. To evaluate our algorithm's implementation, we perform comparative experiments with complex networks in the million- and billion-scale range. In general our map** tool shows good scalability on up to a few thousand PEs. Compared to other MPI-based competitors, our algorithm achieves the best speed to quality trade-off and our quality results are even better than non-parallel map** tools.
△ Less
Submitted 6 July, 2021;
originally announced July 2021.
-
Distributing Sparse Matrix/Graph Applications in Heterogeneous Clusters -- an Experimental Study
Authors:
Charilaos Tzovas,
Maria Predari,
Henning Meyerhenke
Abstract:
Many problems in scientific and engineering applications contain sparse matrices or graphs as main input objects, e.g. numerical simulations on meshes. Large inputs are abundant these days and require parallel processing for memory size and speed. To optimize the execution of such simulations on cluster systems, the input problem needs to be distributed suitably onto the processing units (PUs). Mo…
▽ More
Many problems in scientific and engineering applications contain sparse matrices or graphs as main input objects, e.g. numerical simulations on meshes. Large inputs are abundant these days and require parallel processing for memory size and speed. To optimize the execution of such simulations on cluster systems, the input problem needs to be distributed suitably onto the processing units (PUs). More and more frequently, such clusters contain different CPUs or a combination of CPUs and GPUs. This heterogeneity makes the load distribution problem quite challenging. Our study is motivated by the observation that established partitioning tools do not handle such heterogeneous distribution problems as well as homogeneous ones.
In this paper, we first formulate the problem of balanced load distribution for heterogeneous architectures as a multi-objective, single-constraint optimization problem. We then split the problem into two phases and propose a greedy approach to determine optimal block sizes for each PU. These block sizes are then fed into numerous existing graph partitioners, for us to examine how well they handle the above problem. One of the tools we consider is an extension of our own previous work (von Looz et al, ICPP'18) called Geographer. Our experiments on well-known benchmark meshes indicate that only two tools under consideration are able to yield good quality. These two are Parmetis (both the geometric and the combinatorial variant) and Geographer. While Parmetis is faster, Geographer yields better quality on average.
△ Less
Submitted 20 November, 2020; v1 submitted 3 November, 2020;
originally announced November 2020.
-
Guidelines for Experimental Algorithmics in Network Analysis
Authors:
Eugenio Angriman,
Alexander van der Grinten,
Moritz von Looz,
Henning Meyerhenke,
Martin Nöllenburg,
Maria Predari,
Charilaos Tzovas
Abstract:
The field of network science is a highly interdisciplinary area; for the empirical analysis of network data, it draws algorithmic methodologies from several research fields. Hence, research procedures and descriptions of the technical results often differ, sometimes widely. In this paper we focus on methodologies for the experimental part of algorithm engineering for network analysis -- an importa…
▽ More
The field of network science is a highly interdisciplinary area; for the empirical analysis of network data, it draws algorithmic methodologies from several research fields. Hence, research procedures and descriptions of the technical results often differ, sometimes widely. In this paper we focus on methodologies for the experimental part of algorithm engineering for network analysis -- an important ingredient for a research area with empirical focus. More precisely, we unify and adapt existing recommendations from different fields and propose universal guidelines -- including statistical analyses -- for the systematic evaluation of network analysis algorithms. This way, the behavior of newly proposed algorithms can be properly assessed and comparisons to existing solutions become meaningful. Moreover, as the main technical contribution, we provide SimexPal, a highly automated tool to perform and analyze experiments following our guidelines. To illustrate the merits of SimexPal and our guidelines, we apply them in a case study: we design, perform, visualize and evaluate experiments of a recent algorithm for approximating betweenness centrality, an important problem in network analysis. In summary, both our guidelines and SimexPal shall modernize and complement previous efforts in experimental algorithmics; they are not only useful for network analysis, but also in related contexts.
△ Less
Submitted 25 March, 2019;
originally announced April 2019.
-
Balanced k-means for Parallel Geometric Partitioning
Authors:
Moritz von Looz,
Charilaos Tzovas,
Henning Meyerhenke
Abstract:
Mesh partitioning is an indispensable tool for efficient parallel numerical simulations. Its goal is to minimize communication between the processes of a simulation while achieving load balance. Established graph-based partitioning tools yield a high solution quality; however, their scalability is limited. Geometric approaches usually scale better, but their solution quality may be unsatisfactory…
▽ More
Mesh partitioning is an indispensable tool for efficient parallel numerical simulations. Its goal is to minimize communication between the processes of a simulation while achieving load balance. Established graph-based partitioning tools yield a high solution quality; however, their scalability is limited. Geometric approaches usually scale better, but their solution quality may be unsatisfactory for `non-trivial' mesh topologies.
In this paper, we present a scalable version of $k$-means that is adapted to yield balanced clusters. Balanced $k$-means constitutes the core of our new partitioning algorithm Geographer. Bootstrap** of initial centers is performed with space-filling curves, leading to fast convergence of the subsequent balanced k-means algorithm.
Our experiments with up to 16384 MPI processes on numerous benchmark meshes show the following: (i) Geographer produces partitions with a lower communication volume than state-of-the-art geometric partitioners from the Zoltan package; (ii) Geographer scales well on large inputs; (iii) a Delaunay mesh with a few billion vertices and edges can be partitioned in a few seconds.
△ Less
Submitted 3 May, 2018;
originally announced May 2018.