-
Efficient Analytical Queries on Semantic Web Data Cubes
Authors:
Lorena Etcheverry,
Alejandro A. Vaisman
Abstract:
The amount of multidimensional data published on the semantic web (SW) is constantly increasing, due to initiatives such as Open Data and Open Government Data, among others. Models, languages, and tools that allow valuable information to be obtained efficiently are thus required. Multidimensional data are typically represented as data cubes and exploited using Online Analytical Processing (OLAP) techniques. The RDF Data Cube Vocabulary, also denoted QB, is the current W3C standard to represent statistical data on the SW. Since QB does not include key features needed for OLAP analysis, in previous work we proposed an extension, denoted QB4OLAP, to overcome this problem without the need to modify already published data. Once data cubes are represented on the SW, we need tools to analyze them. However, writing efficient analytical queries over SW cubes demands a deep knowledge of RDF and SPARQL. These skills are not common among typical analytical users. Also, OLAP languages like MDX are far from being easily understood by the final user. The lack of friendly tools to exploit multidimensional data on the SW is a barrier that needs to be broken to promote the publication of such data. We address this problem in this paper. Our approach is based on allowing analytical users to write queries using OLAP operations over cubes, without dealing with SW standards. For this, we devised CQL (standing for Cube Query Language), a simple, high-level query language that operates over cubes. Using the metadata provided by QB4OLAP, we translate CQL queries into SPARQL. Then, we propose query improvement strategies to produce efficient SPARQL queries, adapting SPARQL query optimization techniques. We evaluate our approach using the Star-Schema benchmark, showing that our proposal outperforms others. A web application that allows querying SW data cubes using CQL completes our contributions.
Submitted 21 March, 2017;
originally announced March 2017.
-
Modeling and Querying Data Cubes on the Semantic Web
Authors:
Lorena Etcheverry,
Silvia Gomez,
Alejandro Vaisman
Abstract:
The web is changing the way in which data warehouses are designed, used, and queried. With the advent of initiatives such as Open Data and Open Government, organizations want to share their multidimensional data cubes and make them available to be queried online. The RDF data cube vocabulary (QB), the W3C standard to publish statistical data in RDF, presents several limitations to fully support the multidimensional model. The QB4OLAP vocabulary extends QB to overcome these limitations, allowing the implementation of the typical OLAP operations, such as rollup, slice, dice, and drill-across, using standard SPARQL queries. In this paper we introduce a formal data model where the main object is the data cube, and define OLAP operations using this model, independent of the underlying representation of the cube. We then show that a cube expressed using our model can be represented using the QB4OLAP vocabulary, and finally we provide a SPARQL implementation of OLAP operations over data cubes in QB4OLAP.
Submitted 18 December, 2015;
originally announced December 2015.
-
Views over RDF Datasets: A State-of-the-Art and Open Challenges
Authors:
Lorena Etcheverry,
Alejandro A. Vaisman
Abstract:
Views on RDF datasets have been discussed in several works; nevertheless, there is no consensus on their definition or on the requirements they should fulfill. In traditional data management systems, views have proved to be useful in different application scenarios such as data integration, query answering, data security, and query modularization.
In this work we have reviewed existing work on views over RDF datasets, and discussed the application of existing view definition mechanisms to four scenarios in which views have proved to be useful in traditional (relational) data management systems. To give a framework for the discussion we provided a definition of views over RDF datasets, an issue over which there is no consensus so far. We finally chose the three proposals closest to this definition, and analyzed them with respect to four selected goals.
Submitted 1 November, 2012;
originally announced November 2012.