The Lothbrok approach for SPARQL Query Optimization over Decentralized Knowledge Graphs
Authors:
Christian Aebeloe,
Gabriela Montoya,
Katja Hose
Abstract:
While the Web of Data in principle offers access to a wide range of interlinked data, the architecture of the Semantic Web today relies mostly on the data providers to maintain access to their data through SPARQL endpoints. Several studies, however, have shown that such endpoints often experience downtime, meaning that the data they maintain becomes inaccessible. While decentralized systems based on Peer-to-Peer (P2P) technology have previously been shown to increase the availability of knowledge graphs, even when a large proportion of the nodes fail, processing queries in such a setup can be expensive, since the data necessary to answer a single query might be distributed over multiple nodes. In this paper, we therefore propose Lothbrok, an approach to optimizing SPARQL queries over decentralized knowledge graphs. While there are potentially many aspects to consider when optimizing such queries, we focus on three: cardinality estimation, locality awareness, and data fragmentation. We empirically show that Lothbrok achieves significantly faster query processing than the state of the art, both when processing challenging queries and when the network is under high load.
Submitted 31 August, 2022;
originally announced August 2022.
Star Pattern Fragments: Accessing Knowledge Graphs through Star Patterns
Authors:
Christian Aebeloe,
Ilkcan Keles,
Gabriela Montoya,
Katja Hose
Abstract:
The Semantic Web offers access to a vast Web of interlinked information accessible via SPARQL endpoints. While such endpoints offer a well-defined interface for efficiently retrieving results for complex SPARQL queries, complex query loads can easily overload or crash endpoints, as the computational load of answering the queries resides entirely with the server hosting the endpoint. Recently proposed interfaces, such as Triple Pattern Fragments, have therefore shifted some of the query processing load from the server to the client, at the expense of increased network traffic in the case of non-selective triple patterns. This paper therefore proposes Star Pattern Fragments (SPF), an RDF interface that enables better load balancing between server and client by decomposing SPARQL queries into star-shaped subqueries and evaluating them on the server side. Experiments using synthetic data (WatDiv) as well as real data (DBpedia) show that SPF not only significantly reduces network traffic but is also up to two orders of magnitude faster than state-of-the-art interfaces under high query load.
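The decomposition into star-shaped subqueries mentioned above can be illustrated with a minimal sketch. It assumes that a star pattern is the set of triple patterns sharing the same subject, and it represents triple patterns as plain string tuples. The function name `decompose_into_stars` and the example query are hypothetical and are not taken from the paper or any SPF implementation.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# A triple pattern as (subject, predicate, object); variables start with "?".
Triple = Tuple[str, str, str]


def decompose_into_stars(patterns: List[Triple]) -> Dict[str, List[Triple]]:
    """Group triple patterns by subject (assumed definition of a star)."""
    stars: Dict[str, List[Triple]] = defaultdict(list)
    for subject, predicate, obj in patterns:
        stars[subject].append((subject, predicate, obj))
    # Convert to a plain dict; insertion order of subjects is preserved.
    return dict(stars)


# Hypothetical basic graph pattern with two subjects, hence two stars.
bgp: List[Triple] = [
    ("?film", "dbo:director", "?director"),
    ("?film", "rdfs:label", "?title"),
    ("?director", "dbo:birthPlace", "?place"),
]

stars = decompose_into_stars(bgp)
for subject, star in stars.items():
    print(subject, len(star))
```

In this sketch each group would be sent to the server as one subquery, while joining the groups (here on `?director`) would remain the client's task.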
Submitted 9 November, 2021; v1 submitted 21 February, 2020;
originally announced February 2020.