-
Insights into $(k,ρ)$-shortcutting algorithms
Authors:
Alexander Leonhardt,
Ulrich Meyer,
Manuel Penschuck
Abstract:
A graph is called a $(k,ρ)$-graph iff every node can reach $ρ$ of its nearest neighbors in at most $k$ hops. This property proved useful in the analysis and design of parallel shortest-path algorithms. Any graph can be transformed into a $(k,ρ)$-graph by adding shortcuts. Formally, the $(k,ρ)$-Minimum-Shortcut problem asks to find an appropriate shortcut set of minimal cardinality.
We show that the $(k,ρ)$-Minimum-Shortcut problem is NP-complete in the practical regime of $k \ge 3$ and $ρ= Θ(n^ε)$ for $ε> 0$. With a related construction, we bound the approximation factor of known $(k,ρ)$-Minimum-Shortcut problem heuristics from below and propose algorithmic countermeasures improving the approximation quality. Further, we describe an integer linear program (ILP) solving the $(k,ρ)$-Minimum-Shortcut problem optimally. Finally, we compare the practical performance and quality of all algorithms in an empirical campaign.
Submitted 12 February, 2024;
originally announced February 2024.
-
Engineering Shared-Memory Parallel Shuffling to Generate Random Permutations In-Place
Authors:
Manuel Penschuck
Abstract:
Shuffling is the process of rearranging a sequence of elements into a random order such that any permutation occurs with equal probability. It is an important building block in a plethora of techniques used in virtually all scientific areas. Consequently, considerable work has been devoted to the design and implementation of shuffling algorithms.
We engineer, to the best of our knowledge for the first time, a practically fast parallel shuffling algorithm with $O(\sqrt{n}\log n)$ parallel depth that requires only poly-logarithmic auxiliary memory. Our reference implementations in Rust are freely available, easy to include in other projects, and can process large data sets approaching the size of the system's memory. In an empirical evaluation, we compare our implementations with a number of existing solutions on various computer architectures. Our algorithms consistently achieve the highest throughput on all machines. Further, we demonstrate that the runtime of our parallel algorithm is comparable to the time that other algorithms may take to acquire the memory from the operating system to copy the input.
Submitted 7 February, 2023;
originally announced February 2023.
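For orientation, the definition of shuffling given in this abstract (every permutation equally likely) is met by the classic sequential Fisher-Yates shuffle. The sketch below is only that textbook baseline, written in Python for brevity; it is not the parallel in-place algorithm of the paper, and the function name `fisher_yates_shuffle` is chosen here for illustration.

```python
import random


def fisher_yates_shuffle(items, rng=None):
    """Shuffle `items` in place; every permutation is equally likely.

    Sequential textbook baseline, not the paper's parallel algorithm.
    """
    if rng is None:
        rng = random.Random()
    # Walk from the back; position i receives a uniform pick from items[0..i].
    for i in range(len(items) - 1, 0, -1):
        j = rng.randint(0, i)  # randint is inclusive on both ends
        items[i], items[j] = items[j], items[i]
    return items


if __name__ == "__main__":
    print(fisher_yates_shuffle(list(range(10)), random.Random(42)))
```

The loop needs only constant auxiliary memory but is inherently sequential, which is the limitation a parallel design has to work around.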
-
Parallel and I/O-Efficient Algorithms for Non-Linear Preferential Attachment
Authors:
Daniel Allendorf,
Ulrich Meyer,
Manuel Penschuck,
Hung Tran
Abstract:
Preferential attachment lies at the heart of many network models aiming to replicate features of real-world networks. To simulate the attachment process, conduct statistical tests, or obtain input data for benchmarks, efficient algorithms are required that are capable of generating large graphs according to these models.
Existing graph generators are optimized for the simplest model, where new nodes that arrive in the network are connected to earlier nodes with a probability $P(h) \propto d$ that depends linearly on the degree $d$ of the earlier node $h$. Yet, some networks are better explained by a more general attachment probability $P(h) \propto f(d)$ for some function $f \colon \mathbb N \to \mathbb R$. Here, the polynomial case $f(d) = d^α$ where $α\in \mathbb R_{>0}$ is of particular interest.
In this paper, we present efficient algorithms that generate graphs according to the more general models. We first design a simple yet optimal sequential algorithm for the polynomial model. We then parallelize the algorithm by identifying batches of independent samples and obtain a near-optimal speedup when adding many nodes. In addition, we present an I/O-efficient algorithm that can even be used for the fully general model. To showcase the efficiency and scalability of our algorithms, we conduct an experimental study and compare their performance to existing solutions.
Submitted 17 January, 2023; v1 submitted 13 November, 2022;
originally announced November 2022.
-
Algorithms for Large-scale Network Analysis and the NetworKit Toolkit
Authors:
Eugenio Angriman,
Alexander van der Grinten,
Michael Hamann,
Henning Meyerhenke,
Manuel Penschuck
Abstract:
The abundance of massive network data in a plethora of applications makes scalable analysis algorithms and software tools necessary to generate knowledge from such data in reasonable time. Addressing scalability as well as other requirements such as good usability and a rich feature set, the open-source software NetworKit has established itself as a popular tool for large-scale network analysis. This chapter provides a brief overview of the contributions to NetworKit made by the DFG Priority Programme SPP 1736 Algorithms for Big Data. Algorithmic contributions in the areas of centrality computations, community detection, and sparsification are in the focus, but we also mention several other aspects -- such as current software engineering principles of the project and ways to visualize network data within a NetworKit-based workflow.
Submitted 20 September, 2022;
originally announced September 2022.
-
Parallel Global Edge Switching for the Uniform Sampling of Simple Graphs with Prescribed Degrees
Authors:
Daniel Allendorf,
Ulrich Meyer,
Manuel Penschuck,
Hung Tran
Abstract:
The uniform sampling of simple graphs matching a prescribed degree sequence is an important tool in network science, e.g. to construct graph generators or null-models. Here, the Edge Switching Markov Chain (ES-MC) is a common choice. Given an arbitrary simple graph with the required degree sequence, ES-MC carries out a large number of small changes, called edge switches, to eventually obtain a uniform sample. In practice, reasonably short runs efficiently yield approximate uniform samples.
In this work, we study the problem of executing edge switches in parallel. We discuss parallelizations of ES-MC, but find that this approach suffers from complex dependencies between edge switches. For this reason, we propose the Global Edge Switching Markov Chain (G-ES-MC), an ES-MC variant with simpler dependencies. We show that G-ES-MC converges to the uniform distribution and design shared-memory parallel algorithms for ES-MC and G-ES-MC. In an empirical evaluation, we provide evidence that G-ES-MC requires no more switches than ES-MC (and often fewer), and demonstrate the efficiency and scalability of our parallel G-ES-MC implementation.
Submitted 15 February, 2023; v1 submitted 4 November, 2021;
originally announced November 2021.
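The edge switches mentioned in this abstract can be illustrated with a small sequential sketch. It follows the common textbook form of the chain: pick two edges uniformly at random, exchange one endpoint of each, and reject the move if it would create a self-loop or a parallel edge. It is not the parallel G-ES-MC implementation of the paper, and the helper names `edge_switch_step` and `randomize` are hypothetical.

```python
import random


def edge_switch_step(edges, edge_set, rng):
    """Try one edge switch on a simple undirected graph; return True if applied.

    `edges` holds normalized pairs (min, max), `edge_set` the same pairs as a set.
    Requires at least two edges.
    """
    i, j = rng.sample(range(len(edges)), 2)
    u, v = edges[i]
    x, y = edges[j]
    if rng.random() < 0.5:
        x, y = y, x  # choose uniformly which endpoints are exchanged
    if u == y or x == v:
        return False  # the switch would create a self-loop
    e1 = (min(u, y), max(u, y))
    e2 = (min(x, v), max(x, v))
    if e1 in edge_set or e2 in edge_set:
        return False  # the switch would create a parallel edge
    edge_set.remove(edges[i])
    edge_set.remove(edges[j])
    edges[i], edges[j] = e1, e2
    edge_set.add(e1)
    edge_set.add(e2)
    return True


def randomize(edges, num_switches, seed=0):
    """Run `num_switches` switch attempts and return the new edge list."""
    rng = random.Random(seed)
    edges = [(min(u, v), max(u, v)) for u, v in edges]
    edge_set = set(edges)
    for _ in range(num_switches):
        edge_switch_step(edges, edge_set, rng)
    return edges


if __name__ == "__main__":
    cycle = [(k, (k + 1) % 8) for k in range(8)]
    print(randomize(cycle, 100))
```

Each accepted switch reads and writes two edges and two set entries, so two switches that touch a common edge or target a common new edge cannot be executed independently.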
-
Engineering Uniform Sampling of Graphs with a Prescribed Power-law Degree Sequence
Authors:
Daniel Allendorf,
Ulrich Meyer,
Manuel Penschuck,
Hung Tran,
Nick Wormald
Abstract:
We consider the following common network analysis problem: given a degree sequence $\mathbf{d} = (d_1, \dots, d_n) \in \mathbb N^n$, return a uniform sample from the ensemble of all simple graphs with matching degrees. In practice, the problem is typically solved using Markov Chain Monte Carlo approaches, such as Edge-Switching or Curveball, even if no practically useful rigorous bounds are known on their mixing times. In contrast, Arman et al. sketch Inc-Powerlaw, a novel and much more involved algorithm capable of generating graphs for power-law bounded degree sequences with $γ\gtrapprox 2.88$ in expected linear time.
For the first time, we give a complete description of the algorithm and add novel switchings. To the best of our knowledge, our open-source implementation of Inc-Powerlaw is the first practical generator with rigorous uniformity guarantees for the aforementioned degree sequences. In an empirical investigation, we find that for small average-degrees Inc-Powerlaw is very efficient and generates graphs with one million nodes in less than a second. For larger average-degrees, parallelism can partially mitigate the increased running-time.
Submitted 28 October, 2021;
originally announced October 2021.
-
Efficient and accurate group testing via Belief Propagation: an empirical study
Authors:
Amin Coja-Oghlan,
Max Hahn-Klimroth,
Philipp Loick,
Manuel Penschuck
Abstract:
The group testing problem asks for efficient pooling schemes and algorithms that allow screening moderately large numbers of samples for rare infections. The goal is to accurately identify the infected samples while conducting the least possible number of tests. Exploring the use of techniques centred around the Belief Propagation message passing algorithm, we suggest a new test design that significantly increases the accuracy of the results. The new design comes with Belief Propagation as an efficient inference algorithm. Aiming for results on practical rather than asymptotic problem sizes, we conduct an experimental study.
Submitted 13 May, 2021;
originally announced May 2021.
-
Simulating Population Protocols in Sub-Constant Time per Interaction
Authors:
Petra Berenbrink,
David Hammer,
Dominik Kaaser,
Ulrich Meyer,
Manuel Penschuck,
Hung Tran
Abstract:
We consider the problem of efficiently simulating population protocols. In the population model, we are given a distributed system of $n$ agents modeled as identical finite-state machines. In each time step, a pair of agents is selected uniformly at random to interact. In an interaction, agents update their states according to a common transition function. We empirically and analytically analyze two classes of simulators for this model.
First, we consider sequential simulators executing one interaction after the other. Key to the performance of these simulators is the data structure storing the agents' states. For our analysis, we consider plain arrays, binary search trees, and a novel Dynamic Alias Table data structure.
Secondly, we consider batch processing to efficiently update the states of multiple independent agents in one step. For many protocols considered in the literature, our simulator requires amortized sub-constant time per interaction and is fast in practice: given a fixed time budget, the implementation of our batched simulator is able to simulate population protocols several orders of magnitude larger compared to the sequential competitors, and can carry out $2^{50}$ interactions among the same number of agents in less than 400s.
Submitted 7 May, 2020;
originally announced May 2020.
-
Near optimal sparsity-constrained group testing: improved bounds and algorithms
Authors:
Oliver Gebhard,
Max Hahn-Klimroth,
Olaf Parczyk,
Manuel Penschuck,
Maurice Rolvien,
Jonathan Scarlett,
Nelvin Tan
Abstract:
Recent advances in noiseless non-adaptive group testing have led to a precise asymptotic characterization of the number of tests required for high-probability recovery in the sublinear regime $k = n^θ$ (with $θ\in (0,1)$), with $n$ individuals among which $k$ are infected. However, the required number of tests may increase substantially under real-world practical constraints, notably including bounds on the maximum number $Δ$ of tests an individual can be placed in, or the maximum number $Γ$ of individuals in a given test. While previous works have given recovery guarantees for these settings, significant gaps remain between the achievability and converse bounds. In this paper, we substantially or completely close several of the most prominent gaps. In the case of $Δ$-divisible items, we show that the definite defectives (DD) algorithm coupled with a random regular design is asymptotically optimal in dense scaling regimes, and optimal to within a factor of $e$ more generally; we establish this by strengthening both the best known achievability and converse bounds. In the case of $Γ$-sized tests, we provide a comprehensive analysis of the regime $Γ= Θ(1)$, and again establish a precise threshold proving the asymptotic optimality of SCOMP (a slight refinement of DD) equipped with a tailored pooling scheme. Finally, for each of these two settings, we provide near-optimal adaptive algorithms based on sequential splitting, and provably demonstrate gaps between the performance of optimal adaptive and non-adaptive algorithms.
Submitted 22 December, 2021; v1 submitted 24 April, 2020;
originally announced April 2020.
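The definite defectives (DD) decoder named in this abstract is commonly described in two passes: every item that appears in a negative test is cleared, and an item is declared infected only if it is the sole uncleared item in some positive test. The sketch below follows that standard description for the noiseless setting. The data layout (a list of sets plus a list of booleans) and the function name are assumptions made for illustration, and the pooling design itself is not modeled.

```python
def definite_defectives(tests, outcomes):
    """Noiseless DD decoding.

    tests:    list of sets of item indices, one set per pooled test
    outcomes: list of booleans, True if the corresponding test was positive
    Returns the set of items that are certainly infected.
    """
    # Pass 1: every item in a negative test is certainly healthy.
    cleared = set()
    for pool, positive in zip(tests, outcomes):
        if not positive:
            cleared |= pool
    # Pass 2: a positive test with one remaining candidate pins that item down.
    declared = set()
    for pool, positive in zip(tests, outcomes):
        if positive:
            candidates = pool - cleared
            if len(candidates) == 1:
                declared |= candidates
    return declared


if __name__ == "__main__":
    pools = [{0, 1}, {1, 2}, {2, 3}, {3, 4}]
    results = [False, True, True, False]
    print(definite_defectives(pools, results))
```

In the noiseless setting this decoder never declares a healthy item infected, but it can miss infected items that never become the only candidate in a positive test.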
-
Recent Advances in Scalable Network Generation
Authors:
Manuel Penschuck,
Ulrik Brandes,
Michael Hamann,
Sebastian Lamm,
Ulrich Meyer,
Ilya Safro,
Peter Sanders,
Christian Schulz
Abstract:
Random graph models are frequently used as a controllable and versatile data source for experimental campaigns in various research fields. Generating such data-sets at scale is a non-trivial task as it requires design decisions typically spanning multiple areas of expertise. Challenges begin with the identification of relevant domain-specific network features, continue with the question of how to compile such features into a tractable model, and culminate in algorithmic details arising while implementing the pertaining model.
In the present survey, we explore crucial aspects of random graph models with known scalable generators. We begin by briefly introducing network features considered by such models, and then discuss random graphs alongside generation algorithms. Our focus lies on modelling techniques and algorithmic primitives that have proven successful in obtaining massive graphs. We consider concepts and graph models for various domains (such as social networks, infrastructure, ecology, and numerical simulations), and discuss generators for different models of computation (including shared-memory parallelism, massive-parallel GPUs, and distributed systems).
Submitted 2 March, 2020;
originally announced March 2020.
-
The random 2-SAT partition function
Authors:
Dimitris Achlioptas,
Amin Coja-Oghlan,
Max Hahn-Klimroth,
Joon Lee,
Noela Müller,
Manuel Penschuck,
Guangyan Zhou
Abstract:
We show that throughout the satisfiable phase the normalised number of satisfying assignments of a random $2$-SAT formula converges in probability to an expression predicted by the cavity method from statistical physics. The proof is based on showing that the Belief Propagation algorithm renders the correct marginal probability that a variable is set to `true' under a uniformly random satisfying assignment.
Submitted 10 February, 2020;
originally announced February 2020.
-
Bidirectional Text Compression in External Memory
Authors:
Patrick Dinklage,
Jonas Ellert,
Johannes Fischer,
Dominik Köppl,
Manuel Penschuck
Abstract:
Bidirectional compression algorithms work by substituting repeated substrings with references that, unlike in the famous LZ77 scheme, can point in either direction. We present such an algorithm that is particularly suited for an external memory implementation. We evaluate it experimentally on large data sets of size up to 128 GiB (using only 16 GiB of RAM) and show that it is significantly faster than all known LZ77 compressors, while producing a roughly similar number of factors. We also introduce an external memory decompressor for texts compressed with any uni- or bidirectional compression scheme.
Submitted 3 December, 2019; v1 submitted 7 July, 2019;
originally announced July 2019.
-
Efficiently Generating Geometric Inhomogeneous and Hyperbolic Random Graphs
Authors:
Thomas Bläsius,
Tobias Friedrich,
Maximilian Katzmann,
Ulrich Meyer,
Manuel Penschuck,
Christopher Weyand
Abstract:
Hyperbolic random graphs (HRG) and geometric inhomogeneous random graphs (GIRG) are two similar generative network models that were designed to resemble complex real world networks. In particular, they have a power-law degree distribution with controllable exponent $β$, and high clustering that can be controlled via the temperature $T$.
We present the first implementation of an efficient GIRG generator running in expected linear time. Besides varying temperatures, it also supports underlying geometries of higher dimensions. It is capable of generating graphs with ten million edges in under a second on commodity hardware. The algorithm can be adapted to HRGs. Our resulting implementation is the fastest sequential HRG generator, despite the fact that we support non-zero temperatures. Though non-zero temperatures are crucial for many applications, most existing generators are restricted to $T = 0$. Our generators support parallelization, although this is not the focus of this paper. We note that our generators draw from the correct probability distribution, i.e., they involve no approximation.
Besides the generators themselves, we also provide an efficient algorithm to determine the non-trivial dependency between the average degree of the resulting graph and the input parameters of the GIRG model. This makes it possible to specify the expected average degree as input.
Moreover, we investigate the differences between HRGs and GIRGs, shedding new light on the nature of the relation between the two models. Although HRGs represent, in a certain sense, a special case of the GIRG model, we find that a straight-forward inclusion does not hold in practice. However, the difference is negligible for most use cases.
Submitted 23 August, 2019; v1 submitted 16 May, 2019;
originally announced May 2019.
-
Fragile Complexity of Comparison-Based Algorithms
Authors:
Peyman Afshani,
Rolf Fagerberg,
David Hammer,
Riko Jacob,
Irina Kostitsyna,
Ulrich Meyer,
Manuel Penschuck,
Nodari Sitchinava
Abstract:
We initiate a study of algorithms with a focus on the computational complexity of individual elements, and introduce the fragile complexity of comparison-based algorithms as the maximal number of comparisons any individual element takes part in. We give a number of upper and lower bounds on the fragile complexity for fundamental problems, including Minimum, Selection, Sorting and Heap Construction. The results include both deterministic and randomized upper and lower bounds, and demonstrate a separation between the two settings for a number of problems. The depth of a comparator network is a straight-forward upper bound on the worst case fragile complexity of the corresponding fragile algorithm. We prove that fragile complexity is a different and strictly easier property than the depth of comparator networks, in the sense that for some problems a fragile complexity equal to the best network depth can be achieved with less total work and that with randomization, even a lower fragile complexity is possible.
Submitted 3 September, 2019; v1 submitted 9 January, 2019;
originally announced January 2019.
-
Parallel and I/O-efficient Randomisation of Massive Networks using Global Curveball Trades
Authors:
Corrie Jacobien Carstens,
Michael Hamann,
Ulrich Meyer,
Manuel Penschuck,
Hung Tran,
Dorothea Wagner
Abstract:
Graph randomisation is a crucial task in the analysis and synthesis of networks. It is typically implemented as an edge switching process (ESMC) repeatedly swapping the nodes of random edge pairs while maintaining the degrees involved. Curveball is a novel approach that instead considers the whole neighbourhoods of randomly drawn node pairs. Its Markov chain converges to a uniform distribution, and experiments suggest that it requires fewer steps than the established ESMC.
Since trades are more expensive, however, we study Curveball's practical runtime by introducing the first efficient Curveball algorithms: the I/O-efficient EM-CB for simple undirected graphs and its internal memory counterpart IM-CB. Further, we investigate global trades processing every node in a graph during a single super step, and show that undirected global trades converge to a uniform distribution and perform better in practice. We then discuss EM-GCB and EM-PGCB for global trades and give experimental evidence that EM-PGCB achieves the quality of the state-of-the-art ESMC algorithm EM-ES nearly one order of magnitude faster.
Submitted 17 August, 2018; v1 submitted 23 April, 2018;
originally announced April 2018.
-
Communication-free Massively Distributed Graph Generation
Authors:
Daniel Funke,
Sebastian Lamm,
Ulrich Meyer,
Peter Sanders,
Manuel Penschuck,
Christian Schulz,
Darren Strash,
Moritz von Looz
Abstract:
Analyzing massive complex networks yields promising insights about our everyday lives. Building scalable algorithms to do so is a challenging task that requires a careful analysis and an extensive evaluation. However, engineering such algorithms is often hindered by the scarcity of publicly available datasets.
Network generators serve as a tool to alleviate this problem by providing synthetic instances with controllable parameters. However, many network generators fail to provide instances on a massive scale due to their sequential nature or resource constraints. Additionally, truly scalable network generators are few and often limited in their realism.
In this work, we present novel generators for a variety of network models that are frequently used as benchmarks. By making use of pseudorandomization and divide-and-conquer schemes, our generators follow a communication-free paradigm. The resulting generators are thus embarrassingly parallel and have a near optimal scaling behavior. This allows us to generate instances of up to $2^{43}$ vertices and $2^{47}$ edges in less than 22 minutes on 32768 cores. Therefore, our generators allow new graph families to be used on an unprecedented scale.
Submitted 18 March, 2019; v1 submitted 20 October, 2017;
originally announced October 2017.
-
Challenges in QCD matter physics - The Compressed Baryonic Matter experiment at FAIR
Authors:
CBM Collaboration,
T. Ablyazimov,
A. Abuhoza,
R. P. Adak,
M. Adamczyk,
K. Agarwal,
M. M. Aggarwal,
Z. Ahammed,
F. Ahmad,
N. Ahmad,
S. Ahmad,
A. Akindinov,
P. Akishin,
E. Akishina,
T. Akishina,
V. Akishina,
A. Akram,
M. Al-Turany,
I. Alekseev,
E. Alexandrov,
I. Alexandrov,
S. Amar-Youcef,
M. Anđelić,
O. Andreeva,
C. Andrei
, et al. (563 additional authors not shown)
Abstract:
Substantial experimental and theoretical efforts worldwide are devoted to exploring the phase diagram of strongly interacting matter. At LHC and top RHIC energies, QCD matter is studied at very high temperatures and nearly vanishing net-baryon densities. There is evidence that a Quark-Gluon-Plasma (QGP) was created at experiments at RHIC and LHC. The transition from the QGP back to the hadron gas is found to be a smooth crossover. For larger net-baryon densities and lower temperatures, it is expected that the QCD phase diagram exhibits a rich structure, such as a first-order phase transition between hadronic and partonic matter which terminates in a critical point, or exotic phases like quarkyonic matter. The discovery of these landmarks would be a breakthrough in our understanding of the strong interaction and is therefore the focus of various high-energy heavy-ion research programs. The Compressed Baryonic Matter (CBM) experiment at FAIR will play a unique role in the exploration of the QCD phase diagram in the region of high net-baryon densities, because it is designed to run at unprecedented interaction rates. High-rate operation is the key prerequisite for high-precision measurements of multi-differential observables and of rare diagnostic probes which are sensitive to the dense phase of the nuclear fireball. The goal of the CBM experiment at SIS100 ($\sqrt{s_{NN}} = 2.7 - 4.9$ GeV) is to discover fundamental properties of QCD matter: the phase structure at large baryon-chemical potentials ($\mu_B > 500$ MeV), effects of chiral symmetry, and the equation-of-state at high density as it is expected to occur in the core of neutron stars. In this article, we review the motivation for and the physics programme of CBM, including activities before the start of data taking in 2022, in the context of the worldwide efforts to explore high-density QCD matter.
Submitted 29 March, 2017; v1 submitted 6 July, 2016;
originally announced July 2016.
-
I/O-Efficient Generation of Massive Graphs Following the LFR Benchmark
Authors:
Michael Hamann,
Ulrich Meyer,
Manuel Penschuck,
Hung Tran,
Dorothea Wagner
Abstract:
LFR is a popular benchmark graph generator used to evaluate community detection algorithms. We present EM-LFR, the first external memory algorithm able to generate massive complex networks following the LFR benchmark. Its most expensive component is the generation of random graphs with prescribed degree sequences, which can be divided into two steps: the graphs are first materialized deterministically using the Havel-Hakimi algorithm, and then randomized. Our main contributions are EM-HH and EM-ES, two I/O-efficient external memory algorithms for these two steps. We also propose EM-CM/ES, an alternative sampling scheme using the Configuration Model and rewiring steps to obtain a random simple graph. In an experimental evaluation we demonstrate their performance; our implementation is able to handle graphs with more than 37 billion edges on a single machine, is competitive with a massively parallel distributed algorithm, and is faster than a state-of-the-art internal memory implementation even on instances fitting in main memory. EM-LFR's implementation is capable of generating large graph instances orders of magnitude faster than the original implementation. We give evidence that both implementations yield graphs with matching properties by applying clustering algorithms to generated instances. Similarly, we analyse the evolution of graph properties as EM-ES is executed on networks obtained with EM-CM/ES and find that the alternative approach can accelerate the sampling process.
Submitted 14 June, 2017; v1 submitted 29 April, 2016;
originally announced April 2016.
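The deterministic first step named in this abstract, materializing a degree sequence with the Havel-Hakimi algorithm, can be sketched in a few lines for the in-memory case. This is the plain textbook procedure with repeated sorting, not the I/O-efficient EM-HH variant; the function name and the choice to return `None` for non-graphical input are assumptions made for this example.

```python
def havel_hakimi(degrees):
    """Build a simple graph with the given non-negative degrees.

    Returns a list of edges (u, v), or None if the sequence is not graphical.
    """
    remaining = [[d, v] for v, d in enumerate(degrees)]
    edges = []
    while remaining:
        remaining.sort(reverse=True)  # largest remaining degree first
        d, v = remaining[0]
        if d == 0:
            break  # all demands are satisfied
        if d > len(remaining) - 1:
            return None  # not enough other nodes
        # Connect v to the d nodes with the next-largest remaining degrees.
        for k in range(1, d + 1):
            if remaining[k][0] == 0:
                return None  # a partner has no capacity left
            remaining[k][0] -= 1
            edges.append((v, remaining[k][1]))
        remaining[0][0] = 0
    return edges


if __name__ == "__main__":
    print(havel_hakimi([3, 3, 2, 2, 2]))
```

The result is one fixed graph for a given sequence, which is why a randomization step such as edge switching has to follow before the output can serve as a random sample.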