-
The LDBC Social Network Benchmark Interactive workload v2: A transactional graph query benchmark with deep delete operations
Authors:
David Püroja,
Jack Waudby,
Peter Boncz,
Gábor Szárnyas
Abstract:
The LDBC Social Network Benchmark's Interactive workload captures an OLTP scenario operating on a correlated social network graph. It consists of complex graph queries executed concurrently with a stream of update operations. Since its initial release in 2015, the Interactive workload has become the de facto industry standard for benchmarking transactional graph data management systems. As graph systems have matured and the community's understanding of graph processing features has evolved, we initiated the renewal of this benchmark. This paper describes the draft Interactive v2 workload with several new features: delete operations, a cheapest path-finding query, support for larger data sets, and a novel temporal parameter curation algorithm that ensures stable runtimes for path queries.
Submitted 17 August, 2023; v1 submitted 10 July, 2023;
originally announced July 2023.
-
The Linked Data Benchmark Council (LDBC): Driving competition and collaboration in the graph data management space
Authors:
Gábor Szárnyas,
Brad Bebee,
Altan Birler,
Alin Deutsch,
George Fletcher,
Henry A. Gabb,
Denise Gosnell,
Alastair Green,
Zhihui Guo,
Keith W. Hare,
Jan Hidders,
Alexandru Iosup,
Atanas Kiryakov,
Tomas Kovatchev,
Xinsheng Li,
Leonid Libkin,
Heng Lin,
Xiaojian Luo,
Arnau Prat-Pérez,
David Püroja,
Shipeng Qi,
Oskar van Rest,
Benjamin A. Steer,
Dávid Szakállas,
Bing Tong, et al. (8 additional authors not shown)
Abstract:
Graph data management is instrumental for several use cases such as recommendation, root cause analysis, financial fraud detection, and enterprise knowledge representation. Efficiently supporting these use cases yields a number of unique requirements, including the need for a concise query language and graph-aware query optimization techniques. The goal of the Linked Data Benchmark Council (LDBC) is to design a set of standard benchmarks that capture representative categories of graph data management problems, making the performance of systems comparable and facilitating competition among vendors. LDBC also conducts research on graph schemas and graph query languages. This paper introduces the LDBC organization and its work over the last decade.
Submitted 17 August, 2023; v1 submitted 10 July, 2023;
originally announced July 2023.
-
The LDBC Financial Benchmark
Authors:
Shipeng Qi,
Heng Lin,
Zhihui Guo,
Gábor Szárnyas,
Bing Tong,
Yan Zhou,
Bin Yang,
Jiansong Zhang,
Zheng Wang,
Youren Shen,
Changyuan Wang,
Parviz Peiravi,
Henry Gabb,
Ben Steer
Abstract:
The Linked Data Benchmark Council's Financial Benchmark (LDBC FinBench) is a new effort that defines a graph database benchmark targeting financial scenarios such as anti-fraud and risk control. The benchmark currently has one workload, the Transaction Workload. It captures an OLTP scenario with complex read queries, simple read queries, and write queries that continuously insert or delete data in the graph. Compared to the LDBC SNB, the LDBC FinBench differs in application scenarios, data patterns, and query patterns. This document contains a detailed explanation of the data used in the LDBC FinBench, the definition of the Transaction Workload, a detailed description of all queries, and instructions on how to use the benchmark suite.
Submitted 30 June, 2023; v1 submitted 28 June, 2023;
originally announced June 2023.
-
LAGraph: Linear Algebra, Network Analysis Libraries, and the Study of Graph Algorithms
Authors:
Gábor Szárnyas,
David A. Bader,
Timothy A. Davis,
James Kitchen,
Timothy G. Mattson,
Scott McMillan,
Erik Welch
Abstract:
Graph algorithms can be expressed in terms of linear algebra. GraphBLAS is a library of low-level building blocks for such algorithms that targets algorithm developers. LAGraph builds on top of the GraphBLAS to target users of graph algorithms with high-level algorithms common in network analysis. In this paper, we describe the first release of the LAGraph library, the design decisions behind the library, and performance using the GAP benchmark suite. LAGraph, however, is much more than a library. It is also a project to document and analyze the full range of algorithms enabled by the GraphBLAS. To that end, we have developed a compact and intuitive notation for describing these algorithms. In this paper, we present that notation with examples from the GAP benchmark suite.
Submitted 4 April, 2021;
originally announced April 2021.
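As a rough illustration of the abstract's opening claim that graph algorithms can be expressed in terms of linear algebra, the sketch below runs breadth-first search as repeated Boolean matrix-vector products. It is plain Python and does not use the GraphBLAS or LAGraph APIs; the function name `bfs_levels` and the dense adjacency-matrix representation are assumptions made only for this example.

```python
def bfs_levels(adj, source):
    """Return the BFS level of every vertex, or -1 if unreachable.

    adj is a dense Boolean adjacency matrix: adj[i][j] is True
    when there is an edge from vertex i to vertex j.
    """
    n = len(adj)
    levels = [-1] * n
    frontier = [False] * n
    frontier[source] = True
    level = 0
    while any(frontier):
        # Record the level of every vertex in the current frontier.
        for v in range(n):
            if frontier[v]:
                levels[v] = level
        # One step is a matrix-vector product over the (OR, AND) semiring,
        # masked so that already visited vertices are not revisited.
        frontier = [
            levels[j] == -1 and any(frontier[i] and adj[i][j] for i in range(n))
            for j in range(n)
        ]
        level += 1
    return levels


# Hypothetical 4-vertex graph: edges 0->1, 0->2, 1->2; vertex 3 is isolated.
example_adj = [
    [False, True, True, False],
    [False, False, True, False],
    [False, False, False, False],
    [False, False, False, False],
]
example_levels = bfs_levels(example_adj, 0)
```

A real linear-algebra library would store the matrix sparsely and apply the mask inside the multiplication; the dense loops here only show the shape of the computation.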
-
The Future is Big Graphs! A Community View on Graph Processing Systems
Authors:
Sherif Sakr,
Angela Bonifati,
Hannes Voigt,
Alexandru Iosup,
Khaled Ammar,
Renzo Angles,
Walid Aref,
Marcelo Arenas,
Maciej Besta,
Peter A. Boncz,
Khuzaima Daudjee,
Emanuele Della Valle,
Stefania Dumbrava,
Olaf Hartig,
Bernhard Haslhofer,
Tim Hegeman,
Jan Hidders,
Katja Hose,
Adriana Iamnitchi,
Vasiliki Kalavri,
Hugo Kapp,
Wim Martens,
M. Tamer Özsu,
Eric Peukert,
Stefan Plantikow, et al. (16 additional authors not shown)
Abstract:
Graphs are by nature unifying abstractions that can leverage interconnectedness to represent, explore, predict, and explain real- and digital-world phenomena. Although real users and consumers of graph instances and graph workloads understand these abstractions, future problems will require new abstractions and systems. What needs to happen in the next decade for big graph processing to continue to succeed?
Submitted 11 December, 2020;
originally announced December 2020.
-
The LDBC Graphalytics Benchmark
Authors:
Alexandru Iosup,
Ahmed Musaafir,
Alexandru Uta,
Arnau Prat Pérez,
Gábor Szárnyas,
Hassan Chafi,
Ilie Gabriel Tănase,
Lifeng Nai,
Michael Anderson,
Mihai Capotă,
Narayanan Sundaram,
Peter Boncz,
Siegfried Depner,
Stijn Heldens,
Thomas Manhardt,
Tim Hegeman,
Wing Lung Ngai,
Yinglong Xia
Abstract:
In this document, we describe LDBC Graphalytics, an industrial-grade benchmark for graph analysis platforms. The main goal of Graphalytics is to enable the fair and objective comparison of graph analysis platforms. Due to the diversity of bottlenecks and performance issues such platforms need to address, Graphalytics consists of a set of selected deterministic algorithms for full-graph analysis, standard graph datasets, synthetic dataset generators, and reference output for validation purposes. Its test harness produces deep metrics that quantify multiple kinds of system scalability (weak and strong) and robustness (such as failures and performance variability). The benchmark also balances comprehensiveness with the runtime necessary to obtain the deep metrics. The benchmark comes with open-source software for generating performance data, for validating algorithm results, for monitoring and sharing performance data, and for obtaining the final benchmark result as a standard performance report.
Submitted 6 April, 2023; v1 submitted 30 November, 2020;
originally announced November 2020.
-
An analysis of the SIGMOD 2014 Programming Contest: Complex queries on the LDBC social network graph
Authors:
Márton Elekes,
János Benjamin Antal,
Gábor Szárnyas
Abstract:
This report contains an analysis of the queries defined in the SIGMOD 2014 Programming Contest. We first describe the data set, then present the queries, providing graphical illustrations for them and pointing out their caveats. Our intention is to document our lessons learnt and simplify the work of those who will attempt to create a solution to this contest. We also demonstrate the influence of this contest by listing follow-up works that used these queries as inspiration to design better algorithms or to define interesting graph queries.
Submitted 24 March, 2022; v1 submitted 23 October, 2020;
originally announced October 2020.
-
Graphs and matrices: A translation of "Graphok és matrixok" by Dénes Kőnig (1931)
Authors:
Gábor Szárnyas
Abstract:
This paper, originally written in Hungarian by Dénes Kőnig in 1931, proves that in a bipartite graph, the minimum vertex cover and the maximum matching have the same size. This statement is now known as Kőnig's theorem. The paper also discusses the connection of graphs and matrices, then makes some observations about the combinatorial properties of the latter.
Submitted 5 September, 2020;
originally announced September 2020.
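The statement of Kőnig's theorem in the abstract above can be checked on a small example. The sketch below is not taken from the paper: it computes a maximum matching with a standard augmenting-path search and a minimum vertex cover by brute force, then compares the two sizes. The function names and the example graph are hypothetical, and the brute-force cover search is only practical for tiny graphs.

```python
from itertools import combinations


def max_matching_size(edges, left):
    """Size of a maximum matching in a bipartite graph.

    edges is a list of (left_vertex, right_vertex) pairs; left lists the
    left-side vertices. Left and right labels are assumed to be distinct.
    """
    adj = {u: [] for u in left}
    for u, v in edges:
        adj[u].append(v)
    match_of_right = {}  # right vertex -> left vertex it is matched to

    def try_augment(u, seen):
        # Look for an augmenting path starting at left vertex u.
        for v in adj[u]:
            if v in seen:
                continue
            seen.add(v)
            if v not in match_of_right or try_augment(match_of_right[v], seen):
                match_of_right[v] = u
                return True
        return False

    return sum(try_augment(u, set()) for u in left)


def min_vertex_cover_size(edges):
    """Size of a minimum vertex cover, found by trying all vertex subsets."""
    vertices = sorted({x for e in edges for x in e})
    for k in range(len(vertices) + 1):
        for cover in combinations(vertices, k):
            chosen = set(cover)
            if all(u in chosen or v in chosen for u, v in edges):
                return k
    return len(vertices)


# Hypothetical bipartite graph with left side {a, b, c} and right side {x, y}.
example_left = ["a", "b", "c"]
example_edges = [("a", "x"), ("a", "y"), ("b", "x"), ("c", "x")]
matching = max_matching_size(example_edges, example_left)
cover = min_vertex_cover_size(example_edges)
```

For this graph both quantities equal 2: for example, the matching {a-y, b-x} and the cover {a, x}.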
-
The LDBC Social Network Benchmark
Authors:
Renzo Angles,
János Benjamin Antal,
Alex Averbuch,
Altan Birler,
Peter Boncz,
Márton Búr,
Orri Erling,
Andrey Gubichev,
Vlad Haprian,
Moritz Kaufmann,
Josep Lluís Larriba Pey,
Norbert Martínez,
József Marton,
Marcus Paradies,
Minh-Duc Pham,
Arnau Prat-Pérez,
David Püroja,
Mirko Spasić,
Benjamin A. Steer,
Dávid Szakállas,
Gábor Szárnyas,
Jack Waudby,
Mingxi Wu,
Yuchen Zhang
Abstract:
The Linked Data Benchmark Council's Social Network Benchmark (LDBC SNB) is an effort intended to test various functionalities of systems used for graph-like data management. For this, LDBC SNB uses the recognizable scenario of operating a social network, characterized by its graph-shaped data. LDBC SNB consists of two workloads that focus on different functionalities: the Interactive workload (interactive transactional queries) and the Business Intelligence workload (analytical queries). This document contains the definition of both workloads. This includes a detailed explanation of the data used in the LDBC SNB, a detailed description of all queries, and instructions on how to generate the data and run the benchmark with the provided software.
Submitted 14 January, 2024; v1 submitted 7 January, 2020;
originally announced January 2020.
-
Reducing Property Graph Queries to Relational Algebra for Incremental View Maintenance
Authors:
Gábor Szárnyas,
József Marton,
János Maginecz,
Dániel Varró
Abstract:
The property graph data model of modern graph database systems is increasingly adapted for storing and processing heterogeneous datasets like networks. Many challenging applications with near real-time requirements -- e.g. financial fraud detection, recommendation systems, and on-the-fly validation -- can be captured with graph queries, which are evaluated repeatedly. To ensure quick response time for a changing data set, these applications would benefit from applying incremental view maintenance (IVM) techniques, which can perform continuous evaluation of queries and calculate the changes in the result set upon updates. However, currently, no graph databases provide support for incremental views. While IVM problems have been studied extensively over relational databases, views on property graph queries require operators outside the scope of standard relational algebra. Hence, tackling this problem requires the integration of numerous existing IVM techniques and possibly further extensions. In this paper, we present an approach to perform IVM on property graphs, using a nested relational algebraic representation for property graphs and graph operations. Then we define a chain of transformations to reduce most property graph queries to flat relational algebra and use techniques from discrimination networks (used in rule-based expert systems) to evaluate them. We demonstrate the approach using our prototype tool, ingraph, which uses openCypher, an open graph query language specified as part of an industry initiative. However, several aspects of our approach can be generalised to other graph query languages such as G-CORE and PGQL.
Submitted 19 June, 2018;
originally announced June 2018.
-
Incremental View Maintenance for Property Graph Queries
Authors:
Gábor Szárnyas
Abstract:
This paper discusses the challenges of incremental view maintenance for property graph queries. We select a subset of property graph queries and present an approach that uses nested relational algebra to allow incremental evaluation.
Submitted 11 December, 2017;
originally announced December 2017.
-
Formalising openCypher Graph Queries in Relational Algebra
Authors:
József Marton,
Gábor Szárnyas,
Dániel Varró
Abstract:
Graph database systems are increasingly adapted for storing and processing heterogeneous network-like datasets. However, due to the novelty of such systems, no standard data model or query language has yet emerged. Consequently, migrating datasets or applications even between related technologies often requires a large amount of manual work or ad-hoc solutions, thus subjecting the users to the possibility of vendor lock-in. To avoid this threat, vendors are working on supporting existing standard languages (e.g. SQL) or creating standardised languages.
In this paper, we present a formal specification for openCypher, a high-level declarative graph query language with an ongoing standardisation effort. We introduce relational graph algebra, which extends relational operators by adapting graph-specific operators and define a mapping from core openCypher constructs to this algebra. We propose an algorithm that allows systematic compilation of openCypher queries.
Submitted 22 September, 2017; v1 submitted 8 May, 2017;
originally announced May 2017.