-
Open Problems in (Hyper)Graph Decomposition
Authors:
Deepak Ajwani,
Rob H. Bisseling,
Katrin Casel,
Ümit V. Çatalyürek,
Cédric Chevalier,
Florian Chudigiewitsch,
Marcelo Fonseca Faraj,
Michael Fellows,
Lars Gottesbüren,
Tobias Heuer,
George Karypis,
Kamer Kaya,
Jakub Lacki,
Johannes Langguth,
Xiaoye Sherry Li,
Ruben Mayer,
Johannes Meintrup,
Yosuke Mizutani,
François Pellegrini,
Fabrizio Petrini,
Frances Rosamond,
Ilya Safro,
Sebastian Schlag,
Christian Schulz,
Roohani Sharma
, et al. (4 additional authors not shown)
Abstract:
Large networks are useful in a wide range of applications. Sometimes problem instances are composed of billions of entities. Decomposing and analyzing these structures helps us gain new insights about our surroundings. Even if the final application concerns a different problem (such as traversal, finding paths, trees, and flows), decomposing large graphs is often an important subproblem for complexity reduction or parallelization. This report is a summary of discussions that happened at Dagstuhl seminar 23331 on "Recent Trends in Graph Decomposition" and presents currently open problems and future directions in the area of (hyper)graph decomposition.
Submitted 18 October, 2023;
originally announced October 2023.
-
Harmful Conspiracies in Temporal Interaction Networks: Understanding the Dynamics of Digital Wildfires through Phase Transitions
Authors:
Kaspara Skovli Gåsvær,
Pedro G. Lind,
Johannes Langguth,
Morten Hjorth-Jensen,
Michael Kreil,
Daniel Thilo Schroeder
Abstract:
Shortly after the first COVID-19 cases became apparent in December 2019, rumors spread on social media suggesting a connection between the virus and the 5G radiation emanating from the recently deployed telecommunications network. In the course of the following weeks, this idea gained increasing popularity, and various alleged explanations for how such a connection manifests emerged. Ultimately, after being amplified by prominent conspiracy theorists, a series of arson attacks on telecommunication equipment followed, concluding with the kidnapping of telecommunication technicians in Peru. In this paper, we study the spread of content related to a conspiracy theory with harmful consequences, a so-called digital wildfire. In particular, we investigate the 5G and COVID-19 misinformation event on Twitter before, during, and after its peak in April and May 2020. For this purpose, we examine the community dynamics in complex temporal interaction networks underlying Twitter user activity. We assess the evolution of such digital wildfires by appropriately defining the temporal dynamics of communication in communities within social networks. We show that, for this specific misinformation event, the number of interactions of the users participating in a digital wildfire, as well as the size of the engaged communities, both follow a power-law distribution. Moreover, our research elucidates the possibility of quantifying the phases of a digital wildfire, as per established literature. We identify one such phase as a critical transition, marked by a shift from sporadic tweets to a global spread event, highlighting the dramatic scaling of misinformation propagation.
Submitted 9 October, 2023;
originally announced October 2023.
-
On the Two Sides of Redundancy in Graph Neural Networks
Authors:
Franka Bause,
Samir Moustafa,
Johannes Langguth,
Wilfried N. Gansterer,
Nils M. Kriege
Abstract:
Message passing neural networks iteratively generate node embeddings by aggregating information from neighboring nodes. With increasing depth, information from more distant nodes is included. However, node embeddings may be unable to represent the growing node neighborhoods accurately and the influence of distant nodes may vanish, a problem referred to as oversquashing. Information redundancy in message passing, i.e., the repetitive exchange and encoding of identical information, amplifies oversquashing. We develop a novel aggregation scheme based on neighborhood trees, which allows for controlling redundancy by pruning redundant branches of unfolding trees underlying standard message passing. While the regular structure of unfolding trees allows the reuse of intermediate results in a straightforward way, the use of neighborhood trees poses computational challenges. We propose compact representations of neighborhood trees and merge them, exploiting computational redundancy by identifying isomorphic subtrees. From this, node and graph embeddings are computed via a neural architecture inspired by tree canonization techniques. Our method is less susceptible to oversquashing than traditional message passing neural networks and can improve the accuracy on widely used benchmark datasets.
Submitted 28 March, 2024; v1 submitted 6 October, 2023;
originally announced October 2023.
-
Social media in the Global South: A Network Dataset of the Malian Twittersphere
Authors:
Daniel Thilo Schroeder,
Mirjam de Bruijn,
Luca Bruls,
Mulatu Alemayehu Moges,
Samba Dialimpa Badji,
Noëmie Fritz,
Modibo Galy Cisse,
Johannes Langguth,
Bruce Mutsvairo,
Kristin Skare Orgeret
Abstract:
With the expansion of mobile communications infrastructure, social media usage in the Global South is surging. Compared to the Global North, populations of the Global South have had less prior experience with social media from stationary computers and wired Internet. Many countries are experiencing violent conflicts that have a profound effect on their societies. As a result, social networks develop under different conditions than elsewhere, and our goal is to provide data for studying this phenomenon. In this dataset paper, we present a data collection of a national Twittersphere in a conflict-affected West African country. While not the largest social network in terms of users, Twitter is an important platform where people engage in public discussion. The focus is on Mali, a country beset by conflict since 2012 that has recently had a relatively precarious media ecology. The dataset consists of tweets and Twitter users in Mali and was collected in June 2022, when the Malian conflict became more violent, both internally and towards external and international actors. In a preliminary analysis, we assume that the conflictual context influences how people access social media and, therefore, the shape of the Twittersphere and its characteristics. The primary aim of this paper is to invite researchers from various disciplines, including complex networks and social sciences scholars, to explore the data at hand further. We collected the dataset using a scraping strategy of the follower network and the identification of characteristics of a Malian Twitter user. The given snapshot of the Malian Twitter follower network contains around seven million accounts, of which 56,000 are clearly identifiable as Malian. In addition, we present the tweets. The dataset is available at: https://osf.io/mj2qt/
Submitted 24 October, 2023; v1 submitted 25 April, 2023;
originally announced April 2023.
-
Space Efficient Sequence Alignment for SRAM-Based Computing: X-Drop on the Graphcore IPU
Authors:
Luk Burchard,
Max Xiaohang Zhao,
Johannes Langguth,
Aydın Buluç,
Giulia Guidi
Abstract:
Dedicated accelerator hardware has become essential for processing AI-based workloads, leading to the rise of novel accelerator architectures. Furthermore, fundamental differences in memory architecture and parallelism have made these accelerators targets for scientific computing.
The sequence alignment problem is fundamental in bioinformatics; we have implemented the $X$-Drop algorithm, a heuristic method for pairwise alignment that reduces search space, on the Graphcore Intelligence Processing Unit (IPU) accelerator. The $X$-Drop algorithm has an irregular computational pattern, which makes it difficult to accelerate due to load balancing.
Here, we introduce a graph-based partitioning and queue-based batch system to improve load balancing. Our implementation achieves $10\times$ speedup over a state-of-the-art GPU implementation and up to $4.65\times$ compared to CPU. In addition, we introduce a memory-restricted $X$-Drop algorithm that reduces memory footprint by $55\times$ and efficiently uses the IPU's limited low-latency SRAM. This optimization further improves the strong scaling performance by $3.6\times$.
Submitted 17 April, 2023;
originally announced April 2023.
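The termination rule that gives $X$-Drop its name can be illustrated with a minimal sketch. This is a simplified, ungapped extension along a single diagonal, not the gapped, memory-restricted IPU implementation described in the abstract; the function name `xdrop_extend` and the match/mismatch scores are assumptions chosen for illustration.

```python
def xdrop_extend(a, b, x, match=1, mismatch=-1):
    """Ungapped X-drop extension of two sequences from their start.

    Hypothetical helper: scores a[k] against b[k] position by position and
    stops once the running score falls more than `x` below the best score
    seen so far. Returns (best_score, length_of_best_extension).
    """
    best = 0        # best score seen so far
    best_len = 0    # extension length at which `best` was reached
    score = 0       # running score of the current extension
    for k in range(min(len(a), len(b))):
        score += match if a[k] == b[k] else mismatch
        if score > best:
            best, best_len = score, k + 1
        elif best - score > x:
            # The score dropped by more than X: abandon the extension.
            break
    return best, best_len


if __name__ == "__main__":
    print(xdrop_extend("ACGTACGT", "ACGTTTTT", x=2))
```

A larger `x` explores further past mismatches before giving up, which is the search-space versus sensitivity trade-off the heuristic exposes.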
-
A Newcomer In The PGAS World -- UPC++ vs UPC: A Comparative Study
Authors:
Jérémie Lagravière,
Johannes Langguth,
Martina Prugger,
Phuong H. Ha,
Xing Cai
Abstract:
A newcomer in the Partitioned Global Address Space (PGAS) 'world' has arrived in its version 1.0: Unified Parallel C++ (UPC++). UPC++ targets distributed data structures where communication is irregular or fine-grained. The key abstractions are global pointers, asynchronous programming via RPC, futures and promises. The UPC++ APIs for moving non-contiguous data and handling memories with different optimal access methods resemble those used in modern C++. In this study we provide two kernels implemented in UPC++: a sparse-matrix vector multiplication (SpMV) as part of a Partial-Differential Equation solver, and an implementation of the Heat Equation on a 2D-domain. Code listings of these two kernels are available in the article in order to show the differences in programming style between UPC and UPC++. We provide a performance comparison between UPC and UPC++ using single-node, multi-node hardware and many-core hardware (Intel Xeon Phi Knight's Landing).
Submitted 6 February, 2021;
originally announced February 2021.
-
Load-Balanced Bottleneck Objectives in Process Mapping
Authors:
Johannes Langguth,
Sebastian Schlag,
Christian Schulz
Abstract:
We propose a new problem formulation for graph partitioning that is tailored to the needs of time-critical simulations on modern heterogeneous supercomputers.
Submitted 27 January, 2020;
originally announced January 2020.
-
Performance optimization and modeling of fine-grained irregular communication in UPC
Authors:
Jérémie Lagravière,
Johannes Langguth,
Martina Prugger,
Lukas Einkemmer,
Phuong H. Ha,
Xing Cai
Abstract:
The UPC programming language offers parallelism via logically partitioned shared memory, which typically spans physically disjoint memory sub-systems. One convenient feature of UPC is its ability to automatically execute between-thread data movement, such that the entire content of a shared data array appears to be freely accessible by all the threads. The programmer friendliness, however, can come at the cost of substantial performance penalties. This is especially true when indirectly indexing the elements of a shared array, for which the induced between-thread data communication can be irregular and have a fine-grained pattern. In this paper we study performance enhancement strategies specifically targeting such fine-grained irregular communication in UPC. Starting from explicit thread privatization, continuing with block-wise communication, and arriving at message condensing and consolidation, we obtained considerable performance improvement of UPC programs that originally require fine-grained irregular communication. Besides the performance enhancement strategies, the main contribution of the present paper is to propose performance models for the different scenarios, in the form of quantifiable formulas that hinge on the actual volumes of various data movements plus a small number of easily obtainable hardware characteristic parameters. These performance models help to verify the enhancements obtained, while also providing insightful predictions of similar parallel implementations, not limited to UPC, that also involve between-thread or between-process irregular communication. As a further validation, we also apply our performance modeling methodology and hardware characteristic parameters to an existing UPC code for solving a 2D heat equation on a uniform mesh.
Submitted 29 December, 2019;
originally announced December 2019.
-
On the Performance and Energy Efficiency of the PGAS Programming Model on Multicore Architectures
Authors:
Jérémie Lagravière,
Johannes Langguth,
Mohammed Sourouri,
Phuong H. Ha,
Xing Cai
Abstract:
Using large-scale multicore systems to get the maximum performance and energy efficiency with manageable programmability is a major challenge. The partitioned global address space (PGAS) programming model enhances programmability by providing a global address space over large-scale computing systems. However, so far the performance and energy efficiency of the PGAS model on multicore-based parallel architectures have not been investigated thoroughly. In this paper we use a set of selected kernels from the well-known NAS Parallel Benchmarks to evaluate the performance and energy efficiency of the UPC programming language, which is a widely used implementation of the PGAS model. In addition, the MPI and OpenMP versions of the same parallel kernels are used for comparison with their UPC counterparts. The investigated hardware platforms are based on multicore CPUs, both within a single 16-core node and across multiple nodes involving up to 1024 physical cores. On the multi-node platform we used the hardware measurement solution called High definition Energy Efficiency Monitoring tool in order to measure energy. On the single-node system we used the hybrid measurement solution. To understand the observed performance differences, we use the Intel Performance Counter Monitor to quantify in detail the communication time, cache hit/miss ratio and memory usage. Our experiments show that UPC is competitive with OpenMP and MPI on single and multiple nodes, with respect to both the performance and energy efficiency.
Submitted 29 December, 2019;
originally announced December 2019.
-
Multi-Modal Machine Learning for Flood Detection in News, Social Media and Satellite Sequences
Authors:
Kashif Ahmad,
Konstantin Pogorelov,
Mohib Ullah,
Michael Riegler,
Nicola Conci,
Johannes Langguth,
Ala Al-Fuqaha
Abstract:
In this paper we present our methods for the MediaEval 2019 Multimedia Satellite Task, which is aiming to extract complementary information associated with adverse events from Social Media and satellites. For the first challenge, we propose a framework jointly utilizing colour, object and scene-level information to predict whether the topic of an article containing an image is a flood event or not. Visual features are combined using early and late fusion techniques achieving an average F1-score of 82.63, 82.40, 81.40 and 76.77. For the multi-modal flood level estimation, we rely on both visual and textual information achieving an average F1-score of 58.48 and 46.03, respectively. Finally, for the flooding detection in time-based satellite image sequences we used a combination of classical computer-vision and machine learning approaches achieving an average F1-score of 58.82%.
Submitted 7 October, 2019;
originally announced October 2019.
-
A Distributed-Memory Algorithm for Computing a Heavy-Weight Perfect Matching on Bipartite Graphs
Authors:
Ariful Azad,
Aydın Buluc,
Xiaoye S. Li,
Xinliang Wang,
Johannes Langguth
Abstract:
We design and implement an efficient parallel algorithm for finding a perfect matching in a weighted bipartite graph such that weights on the edges of the matching are large. This problem differs from the maximum weight matching problem, for which scalable approximation algorithms are known. It is primarily motivated by finding good pivots in scalable sparse direct solvers before factorization. Due to the lack of scalable alternatives, distributed solvers use sequential implementations of maximum weight perfect matching algorithms, such as those available in MC64. To overcome this limitation, we propose a fully parallel distributed memory algorithm that first generates a perfect matching and then iteratively improves the weight of the perfect matching by searching for weight-increasing cycles of length four in parallel. For most practical problems the weights of the perfect matchings generated by our algorithm are very close to the optimum. An efficient implementation of the algorithm scales up to 256 nodes (17,408 cores) on a Cray XC40 supercomputer and can solve instances that are too large to be handled by a single node using the sequential algorithm.
Submitted 4 September, 2020; v1 submitted 29 January, 2018;
originally announced January 2018.
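The improvement step mentioned in the abstract, searching for weight-increasing cycles of length four, can be sketched sequentially as follows. This is only an illustration of the idea under stated assumptions, not the authors' distributed-memory algorithm: the graph is given as a dense weight matrix with `-math.inf` marking absent edges, an initial perfect matching is already available, and the names `improve_by_four_cycles` and `matching_weight` are hypothetical.

```python
import math


def improve_by_four_cycles(weights, match):
    """Greedily raise the weight of a perfect matching via 4-cycle swaps.

    weights[i][j] is the weight of edge (row i, column j), or -math.inf if
    the edge is absent. match[i] is the column matched to row i. For rows
    i and j, the cycle i - match[i] - j - match[j] - i is weight-increasing
    if exchanging the two columns yields a larger sum.
    """
    match = list(match)  # work on a copy, leave the input untouched
    n = len(match)
    improved = True
    while improved:
        improved = False
        for i in range(n):
            for j in range(i + 1, n):
                ci, cj = match[i], match[j]
                current = weights[i][ci] + weights[j][cj]
                swapped = weights[i][cj] + weights[j][ci]
                if swapped > current:
                    # Swap the partners; the matching stays perfect.
                    match[i], match[j] = cj, ci
                    improved = True
    return match


def matching_weight(weights, match):
    """Total weight of the matching described by match."""
    return sum(weights[i][c] for i, c in enumerate(match))


if __name__ == "__main__":
    w = [[4, 1, 3], [2, 0, 5], [3, 2, 2]]
    m = improve_by_four_cycles(w, [0, 1, 2])
    print(m, matching_weight(w, m))
```

Each accepted swap strictly increases the total weight, so the loop terminates, but it can stop at a local optimum: improvements that require longer cycles are not found by this step.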