-
A Taxonomy of Schema Changes for NoSQL Databases
Authors:
Alberto Hernández Chillón,
Meike Klettke,
Diego Sevilla Ruiz,
Jesús García Molina
Abstract:
Schema evolution is a crucial aspect of database management. The proposed taxonomies of schema changes have neglected the operations that involve relationships between entity types (aggregation and references), as well as the possible existence of structural variations for schema types, given that most NoSQL systems are schemaless. The distinction between entity types and relationship types, which is typical of graph schemas, is also not taken into account in published works. Moreover, NoSQL schema evolution poses the challenge of dealing with different data models, for which no standard specification exists. In this paper, a generic approach for evolving NoSQL and relational schemas is presented, based on the U-Schema unified data model, which includes aggregation and reference relationships as well as structural variations. For this data model, we introduce a taxonomy of schema changes covering all U-Schema elements, which is implemented by means of Orion, a database-independent language. We show how Orion can be used to automatically generate evolution scripts for a set of NoSQL databases, and we analyze the feasibility of each schema operation through the performance results obtained. The taxonomy has been formally validated by means of Alloy, and two case studies show the application of Orion.
Submitted 23 May, 2022;
originally announced May 2022.
-
SkiQL: A Unified Schema Query Language
Authors:
Carlos Javier Fernández Candel,
Jesús Joaquín García Molina,
Diego Sevilla Ruiz
Abstract:
Most NoSQL systems are schema-on-read: data can be stored without first declaring a schema that imposes a structure. This schemaless feature offers flexibility to evolve data-intensive applications when data change frequently. However, not having to declare schemas does not mean they are absent, but rather that they are implicit in data and code. Therefore, diagramming tools similar to those available for relational systems are also needed to help developers and administrators understand NoSQL schemas.
Visualizing diagrams is not practical if schemas contain hundreds of database entities; exploration or query facilities are then needed. In schemaless NoSQL stores, data of the same entity can be stored with different structures, which makes readable diagrams harder to produce.
NoSQL schema management tools should therefore have three main components: schema extraction, schema visualization, and schema query. Since there are four main NoSQL data models, it is convenient to build such tools on a generic data model that provides platform independence for querying and visualizing schemas. With the aim of favoring the creation of generic database tools, the authors of this paper defined the U-Schema unified data model, which integrates the four main NoSQL data models and the relational model.
This paper focuses on querying NoSQL and relational schemas represented as U-Schema models. We present the SkiQL language, designed on U-Schema to achieve a platform-independent schema query service. SkiQL provides two constructs: schema-query and relationship-query. The former retrieves information about entity or relationship types, and the latter about aggregations or references (relations among types). We show how SkiQL was evaluated by calculating well-known metrics for languages and by means of a survey.
Submitted 19 April, 2022; v1 submitted 13 April, 2022;
originally announced April 2022.
-
A Unified Metamodel for NoSQL and Relational Databases
Authors:
Carlos J. Fernández Candel,
Diego Sevilla Ruiz,
Jesús J. García-Molina
Abstract:
The Database field is undergoing significant changes. Although relational systems are still predominant, the interest in NoSQL systems is continuously increasing. In this scenario, polyglot persistence is envisioned as the database architecture to be prevalent in the future.
Multi-model database tools normally use a generic or unified metamodel to represent schemas of the data models that they support. Such metamodels facilitate developing utilities, as they can be built on a common representation. Also, the number of mappings required to migrate databases from one data model to another is reduced, and integrability is favored.
In this paper, we present the U-Schema unified metamodel, which is able to represent logical schemas for the four most popular NoSQL paradigms (columnar, document, key-value, and graph) as well as relational schemas. We formally define the mappings between U-Schema and the data model defined for each paradigm. We discuss how these mappings have been implemented and validated, and show some applications of U-Schema.
To achieve flexibility to respond to data changes, most NoSQL systems are "schema-on-read," and the declaration of schemas is not required. Such an absence of schema declaration makes structural variability possible, i.e., stored data of the same entity type can have different structures. Moreover, the data relationships supported by each data model are different. We show how all these issues have been tackled in our approach.
Our metamodel goes beyond the existing proposals by distinguishing entity types and relationship types, representing aggregation and reference relationships, and including the notion of structural variability. Our contributions also include developing schema extraction strategies for schemaless systems of each NoSQL data model, and tackling performance and scalability in the implementation for each store.
Submitted 17 May, 2021; v1 submitted 13 May, 2021;
originally announced May 2021.