-
Greedy Optimization of Resistance-based Graph Robustness with Global and Local Edge Insertions
Authors:
Maria Predari,
Lukas Berner,
Robert Kooij,
Henning Meyerhenke
Abstract:
The total effective resistance, also called the Kirchhoff index, provides a robustness measure for a graph $G$. We consider two optimization problems of adding $k$ new edges to $G$ such that the resulting graph has minimal total effective resistance (i.e., is most robust) -- one where the new edges can be anywhere in the graph and one where the new edges need to be incident to a specified focus node. The total effective resistance and effective resistances between nodes can be computed using the pseudoinverse of the graph Laplacian. The pseudoinverse may be computed explicitly via pseudoinversion; yet, this takes cubic time in practice and quadratic space. We instead exploit combinatorial and algebraic connections to speed up gain computations in an established generic greedy heuristic. Moreover, we leverage existing randomized techniques to boost the performance of our approaches by introducing a sub-sampling step. Our different graph- and matrix-based approaches are indeed significantly faster than the state-of-the-art greedy algorithm, while their quality remains reasonably high and is often quite close. Our experiments show that we can now process larger graphs for which the application of the state-of-the-art greedy approach was impractical before.
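To make the quantities in this abstract concrete, here is a minimal sketch of the generic greedy step in its brute-force form. It is not the authors' accelerated method: it recomputes the pseudoinverse from scratch for every candidate edge, which is exactly the cost the paper aims to avoid. It relies on two standard identities for a connected graph with $n$ nodes: $L^\dagger = (L + J/n)^{-1} - J/n$ (with $J$ the all-ones matrix) and total effective resistance $= n \cdot \mathrm{tr}(L^\dagger)$. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def laplacian(n, edges):
    """Graph Laplacian of an unweighted, undirected graph on nodes 0..n-1."""
    L = [[Fraction(0)] * n for _ in range(n)]
    for u, v in edges:
        L[u][u] += 1
        L[v][v] += 1
        L[u][v] -= 1
        L[v][u] -= 1
    return L

def inverse(A):
    """Gauss-Jordan inverse with exact rational arithmetic (cubic time)."""
    n = len(A)
    M = [row[:] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [x / p for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]

def pseudoinverse(L):
    """L^+ = (L + J/n)^{-1} - J/n; assumes the graph is connected."""
    shift = Fraction(1, len(L))
    inv = inverse([[x + shift for x in row] for row in L])
    return [[x - shift for x in row] for row in inv]

def effective_resistance(Lp, u, v):
    """r(u, v) = L^+[u,u] + L^+[v,v] - 2 L^+[u,v]."""
    return Lp[u][u] + Lp[v][v] - 2 * Lp[u][v]

def total_resistance(n, edges):
    """Kirchhoff index: n times the trace of the pseudoinverse."""
    Lp = pseudoinverse(laplacian(n, edges))
    return n * sum(Lp[i][i] for i in range(n))

def greedy_step(n, edges):
    """Pick the absent edge whose insertion minimizes total resistance."""
    present = {frozenset(e) for e in edges}
    candidates = [e for e in combinations(range(n), 2)
                  if frozenset(e) not in present]
    return min(candidates, key=lambda e: total_resistance(n, edges + [e]))

path = [(0, 1), (1, 2), (2, 3)]      # path graph on 4 nodes
best = greedy_step(4, path)          # closes the path into a 4-cycle
```

On the 4-node path, the total effective resistance is 10; inserting the edge between the two endpoints halves it to 5, whereas either of the other two candidate edges only reduces it to 19/3. Repeating `greedy_step` $k$ times gives the global variant; restricting `candidates` to edges incident to a focus node gives the local one.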
Submitted 15 September, 2023;
originally announced September 2023.
-
An MPI-based Algorithm for Mapping Complex Networks onto Hierarchical Architectures
Authors:
Maria Predari,
Charilaos Tzovas,
Christian Schulz,
Henning Meyerhenke
Abstract:
Processing massive application graphs on distributed memory systems requires mapping the graphs onto the system's processing elements (PEs). This task becomes all the more important when PEs have non-uniform communication costs or the input is highly irregular. Typically, mapping is addressed using partitioning, in a two-step approach or an integrated one. Parallel partitioning tools do exist; yet, corresponding mapping algorithms or their public implementations all have major sequential parts or other severe scaling limitations. In this paper, we propose a parallel algorithm that maps graphs onto the PEs of a hierarchical system. Our solution integrates partitioning and mapping; it models the system hierarchy in a concise way as an implicit labeled tree. The vertices of the application graph are labeled as well, and these vertex labels induce the mapping. The mapping optimization follows the basic idea of parallel label propagation, but we tailor the gain computations of label changes to quickly account for the induced communication costs. Our MPI-based code is the first public implementation of a parallel graph mapping algorithm; to this end, we extend the partitioning library ParHIP. To evaluate our algorithm's implementation, we perform comparative experiments with complex networks in the million- and billion-scale range. In general, our mapping tool shows good scalability on up to a few thousand PEs. Compared to other MPI-based competitors, our algorithm achieves the best speed-to-quality trade-off, and our quality results are even better than those of non-parallel mapping tools.
Submitted 6 July, 2021;
originally announced July 2021.
-
New Approximation Algorithms for Forest Closeness Centrality -- for Individual Vertices and Vertex Groups
Authors:
Alexander van der Grinten,
Eugenio Angriman,
Maria Predari,
Henning Meyerhenke
Abstract:
The emergence of massive graph data sets requires fast mining algorithms. Centrality measures to identify important vertices belong to the most popular analysis methods in graph mining. A measure that is gaining attention is forest closeness centrality; it is closely related to electrical measures using current flow but can also handle disconnected graphs. Recently, [** et al., ICDM'19] proposed an algorithm to approximate this measure probabilistically. Their algorithm processes small inputs quickly, but does not scale well beyond hundreds of thousands of vertices.
In this paper, we first propose a different approximation algorithm; it is up to two orders of magnitude faster and more accurate in practice. Our method exploits the strong connection between uniform spanning trees and forest distances by adapting and extending recent approximation algorithms for related single-vertex problems. This results in a nearly-linear time algorithm with an absolute probabilistic error guarantee. In addition, we are the first to consider the problem of finding an optimal group of vertices w.r.t. forest closeness. We prove that this latter problem is NP-hard; to approximate it, we adapt a greedy algorithm by [Li et al., WWW'19], which is based on (partial) matrix inversion. Moreover, our experiments show that on disconnected graphs, group forest closeness outperforms existing centrality measures in the context of semi-supervised vertex classification.
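As a point of reference for the measure itself, the following sketch computes forest closeness exactly by dense inversion on a tiny disconnected graph. It is not the approximation algorithm proposed in the paper. It assumes the commonly used definitions: forest matrix $\Omega = (I + L)^{-1}$, forest distance $\rho(u, v) = \Omega_{uu} + \Omega_{vv} - 2\,\Omega_{uv}$, and forest closeness of $u$ equal to $n / \sum_{v} \rho(u, v)$. All function names are hypothetical.

```python
from fractions import Fraction

def inverse(A):
    """Gauss-Jordan inverse with exact rational arithmetic (cubic time)."""
    n = len(A)
    M = [row[:] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [x / p for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]

def forest_matrix(n, edges):
    """Omega = (I + L)^{-1}; I + L is invertible even if the graph is disconnected."""
    A = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for u, v in edges:
        A[u][u] += 1
        A[v][v] += 1
        A[u][v] -= 1
        A[v][u] -= 1
    return inverse(A)

def forest_closeness(n, edges):
    """Exact forest closeness of every vertex (assumed definition, see above)."""
    omega = forest_matrix(n, edges)
    scores = []
    for u in range(n):
        farness = sum(omega[u][u] + omega[v][v] - 2 * omega[u][v]
                      for v in range(n))
        scores.append(n / farness)
    return scores

# One edge (0, 1) plus an isolated vertex 2: the graph is disconnected.
scores = forest_closeness(3, [(0, 1)])
```

Here the two connected vertices receive closeness 9/7 and the isolated vertex 9/10, so all scores stay finite even though the graph is disconnected, which is the property the abstract highlights.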
Submitted 15 January, 2021;
originally announced January 2021.
-
Distributing Sparse Matrix/Graph Applications in Heterogeneous Clusters -- an Experimental Study
Authors:
Charilaos Tzovas,
Maria Predari,
Henning Meyerhenke
Abstract:
Many problems in scientific and engineering applications contain sparse matrices or graphs as main input objects, e.g. numerical simulations on meshes. Large inputs are abundant these days and require parallel processing for memory size and speed. To optimize the execution of such simulations on cluster systems, the input problem needs to be distributed suitably onto the processing units (PUs). More and more frequently, such clusters contain different CPUs or a combination of CPUs and GPUs. This heterogeneity makes the load distribution problem quite challenging. Our study is motivated by the observation that established partitioning tools do not handle such heterogeneous distribution problems as well as homogeneous ones.
In this paper, we first formulate the problem of balanced load distribution for heterogeneous architectures as a multi-objective, single-constraint optimization problem. We then split the problem into two phases and propose a greedy approach to determine optimal block sizes for each PU. These block sizes are then fed into numerous existing graph partitioners, so that we can examine how well they handle the above problem. One of the tools we consider is an extension of our own previous work (von Looz et al., ICPP'18) called Geographer. Our experiments on well-known benchmark meshes indicate that only two of the tools under consideration yield good quality: Parmetis (both the geometric and the combinatorial variant) and Geographer. While Parmetis is faster, Geographer yields better quality on average.
Submitted 20 November, 2020; v1 submitted 3 November, 2020;
originally announced November 2020.
-
Approximation of the Diagonal of a Laplacian's Pseudoinverse for Complex Network Analysis
Authors:
Eugenio Angriman,
Maria Predari,
Alexander van der Grinten,
Henning Meyerhenke
Abstract:
The ubiquity of massive graph data sets in numerous applications requires fast algorithms for extracting knowledge from these data. We are motivated here by three electrical measures for the analysis of large small-world graphs $G = (V, E)$ -- i.e., graphs with diameter in $O(\log |V|)$, which are abundant in complex network analysis. From a computational point of view, the three measures have in common that their crucial component is the diagonal of the graph Laplacian's pseudoinverse, $L^\dagger$. Computing diag$(L^\dagger)$ exactly by pseudoinversion, however, is as expensive as dense matrix multiplication -- and the standard tools in practice even require cubic time. Moreover, the pseudoinverse requires quadratic space -- hardly feasible for large graphs. Resorting to approximation by, e.g., using the Johnson-Lindenstrauss transform, requires the solution of $O(\log |V| / ε^2)$ Laplacian linear systems to guarantee a relative error, which is still very expensive for large inputs.
In this paper, we present a novel approximation algorithm that requires the solution of only one Laplacian linear system. The remaining parts are purely combinatorial -- mainly sampling uniform spanning trees, which we relate to diag$(L^\dagger)$ via effective resistances. For small-world networks, our algorithm obtains a $\pm ε$-approximation with high probability, in a time that is nearly-linear in $|E|$ and quadratic in $1 / ε$. Another positive aspect of our algorithm is its parallel nature due to independent sampling. We thus provide two parallel implementations of our algorithm: one using OpenMP, one MPI + OpenMP. In our experiments against the state of the art, our algorithm (i) yields more accurate results, (ii) is much faster and more memory-efficient, and (iii) obtains good parallel speedups, in particular in the distributed setting.
Submitted 8 February, 2021; v1 submitted 24 June, 2020;
originally announced June 2020.
-
Guidelines for Experimental Algorithmics in Network Analysis
Authors:
Eugenio Angriman,
Alexander van der Grinten,
Moritz von Looz,
Henning Meyerhenke,
Martin Nöllenburg,
Maria Predari,
Charilaos Tzovas
Abstract:
The field of network science is a highly interdisciplinary area; for the empirical analysis of network data, it draws algorithmic methodologies from several research fields. Hence, research procedures and descriptions of the technical results often differ, sometimes widely. In this paper we focus on methodologies for the experimental part of algorithm engineering for network analysis -- an important ingredient for a research area with empirical focus. More precisely, we unify and adapt existing recommendations from different fields and propose universal guidelines -- including statistical analyses -- for the systematic evaluation of network analysis algorithms. This way, the behavior of newly proposed algorithms can be properly assessed and comparisons to existing solutions become meaningful. Moreover, as the main technical contribution, we provide SimexPal, a highly automated tool to perform and analyze experiments following our guidelines. To illustrate the merits of SimexPal and our guidelines, we apply them in a case study: we design, perform, visualize and evaluate experiments of a recent algorithm for approximating betweenness centrality, an important problem in network analysis. In summary, both our guidelines and SimexPal shall modernize and complement previous efforts in experimental algorithmics; they are not only useful for network analysis, but also in related contexts.
Submitted 25 March, 2019;
originally announced April 2019.
-
Topology-induced Enhancement of Mappings
Authors:
Roland Glantz,
Maria Predari,
Henning Meyerhenke
Abstract:
In this paper we propose a new method to enhance a mapping $μ(\cdot)$ of a parallel application's computational tasks to the processing elements (PEs) of a parallel computer.
The idea behind our method \mswap is to enhance such a mapping by drawing on the observation that many topologies take the form of a partial cube.
This class of graphs includes all rectangular and cubic meshes, any such torus with even extensions in each dimension, all hypercubes, and all trees.
Following previous work, we represent the parallel application and the parallel computer by graphs $G_a = (V_a, E_a)$ and $G_p = (V_p, E_p)$.
$G_p$ being a partial cube allows us to label its vertices, the PEs, by bitvectors such that the cost of exchanging one unit of information between two vertices $u_p$ and $v_p$ of $G_p$ amounts to the Hamming distance between the labels of $u_p$ and $v_p$.
By transferring these bitvectors from $V_p$ to $V_a$ via $μ^{-1}(\cdot)$ and extending them to be unique on $V_a$, we can enhance $μ(\cdot)$ by swapping labels of $V_a$ in a new way.
Pairs of swapped labels are local w.r.t. the PEs, but not w.r.t. $G_a$. Moreover, permutations of the bitvectors' entries give rise to a plethora of hierarchies on the PEs. Through these hierarchies we turn \mswap into a hierarchical method for improving $μ(\cdot)$ that is complementary to state-of-the-art methods for computing $μ(\cdot)$ in the first place.
In our experiments we use \mswap to enhance mappings of complex networks onto rectangular meshes and tori with 256 and 512 nodes, as well as hypercubes with 256 nodes. It turns out that common quality measures of mappings derived from state-of-the-art algorithms can be improved considerably.
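The partial-cube property used in this abstract can be checked on a small example. The sketch below labels the vertices of a rectangular mesh with bitvectors and verifies by breadth-first search that the Hamming distance between two labels equals the hop distance between the corresponding PEs. The "thermometer" encoding per dimension is one possible labeling chosen here for illustration, not necessarily the one the authors construct, and all function names are hypothetical.

```python
from collections import deque
from itertools import product

def thermometer(i, length):
    """Label position i on a path with `length` edges: i ones, then zeros."""
    return (1,) * i + (0,) * (length - i)

def mesh_labels(rows, cols):
    """Bitvector label per mesh vertex: row code concatenated with column code."""
    return {(r, c): thermometer(r, rows - 1) + thermometer(c, cols - 1)
            for r, c in product(range(rows), range(cols))}

def hamming(a, b):
    """Number of positions in which two equal-length bitvectors differ."""
    return sum(x != y for x, y in zip(a, b))

def bfs_distances(rows, cols, source):
    """Hop distances from `source` to every vertex of the rows x cols mesh."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in dist:
                dist[(nr, nc)] = dist[(r, c)] + 1
                queue.append((nr, nc))
    return dist

def labels_are_isometric(rows, cols):
    """True if Hamming distance equals hop distance for every vertex pair."""
    labels = mesh_labels(rows, cols)
    return all(hamming(labels[s], labels[t]) == d
               for s in labels
               for t, d in bfs_distances(rows, cols, s).items())

labels = mesh_labels(3, 4)
corner_to_corner = hamming(labels[(0, 0)], labels[(2, 3)])
```

For the 3 x 4 mesh, the two opposite corners are 5 hops apart and their labels differ in exactly 5 bits. With such labels, the communication cost between two PEs reduces to a bitwise comparison, which is what makes label swaps cheap to evaluate.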
Submitted 19 April, 2018;
originally announced April 2018.