Validation of Modern JSON Schema: Formalization and Complexity

Authors: Lyes Attouche, Mohamed-Amine Baazizi, Dario Colazzo, Giorgio Ghelli, Carlo Sartiani, Stefanie Scherzinger

Abstract: JSON Schema is the de-facto standard schema language for JSON data. The language went through many minor revisions, but the most recent versions of the language added two novel features, dynamic references and annotation-dependent validation, that change the evaluation model. Modern JSON Schema is the name used to indicate all versions from Draft 2019-09, which are characterized by these new features, while Classical JSON Schema is used to indicate the previous versions. These new "modern" features make the schema language quite difficult to understand, and have generated many discussions about the correct interpretation of their official specifications; for this reason we undertook the task of their formalization. During this process, we also analyzed the complexity of data validation in Modern JSON Schema, with the idea of confirming the PTIME complexity of Classical JSON Schema validation, and we were surprised to discover a completely different truth: data validation, that is expected to be an extremely efficient process, acquires, with Modern JSON Schema features, a PSPACE complexity. In this paper, we give the first formal description of Modern JSON Schema, which we consider a central contribution of the work that we present here. We then prove that its data validation problem is PSPACE-complete. We prove that the origin of the problem lies in dynamic references, and not in annotation-dependent validation. We study the schema and data complexities, showing that the problem is PSPACE-complete with respect to the schema size even with a fixed instance, but is in PTIME when the schema is fixed and only the instance size is allowed to vary. Finally, we run experiments that show that there are families of schemas where the difference in asymptotic complexity between dynamic and static references is extremely visible, even with small schemas.

Submitted 1 February, 2024; v1 submitted 19 July, 2023; originally announced July 2023.

arXiv:2306.07085 [pdf, other]

Extracting JSON Schemas with Tagged Unions

Authors: Stefan Klessinger, Meike Klettke, Uta Störl, Stefanie Scherzinger

Abstract: With data lakes and schema-free NoSQL document stores, extracting a descriptive schema from JSON data collections is an acute challenge. In this paper, we target the discovery of tagged unions, a JSON Schema design pattern where the value of one property of an object (the tag) conditionally implies subschemas for sibling properties. We formalize these implications as conditional functional dependencies and capture them using the JSON Schema operators if-then-else. We further motivate our heuristics to avoid overfitting. Experiments with our prototype implementation are promising, and show that this form of tagged unions can successfully be detected in real-world GeoJSON and TopoJSON datasets. In discussing future work, we outline how our approach can be extended further.

Submitted 12 June, 2023; originally announced June 2023.

arXiv:2306.02890 [pdf, other]

A Plaque Test for Redundancies in Relational Data

Authors: Christoph Köhnen, Stefan Klessinger, Jens Zumbrägel, Stefanie Scherzinger

Abstract: Inspired by the visualization of dental plaque at the dentist's office, this article proposes a novel visualization technique for identifying redundancies in relational data. Our approach builds upon an established information-theoretic framework that, despite being well-principled, remains unexplored in practical applications. In this framework, we calculate the information content (or entropy) of each cell in a relation instance, given a set of functional dependencies. The entropy value represents the likelihood of inferring the cell's value based on the dependencies and the remaining tuples. By highlighting cells with lower entropy, we effectively visualize redundancies in the data. We present an initial prototype implementation and demonstrate that a straightforward approach is insufficient for handling practical problem sizes. To address this limitation, we propose several optimizations, which we prove to be correct. Additionally, we present a Monte Carlo approximation technique with a known error, enabling computationally tractable computations. Using a real-world dataset of modest size, we illustrate the potential of our visualization technique. Our vision is to support domain experts with data profiling and data cleaning tasks, akin to the functionality of a plaque test at the dentist's.

Submitted 5 June, 2023; originally announced June 2023.

arXiv:2303.12440 [pdf, other]

Learning Human-Inspired Force Strategies for Robotic Assembly

Authors: Stefan Scherzinger, Arne Roennau, Rüdiger Dillmann

Abstract: The programming of robotic assembly tasks is a key component in manufacturing and automation. Force-sensitive assembly, however, often requires reactive strategies to handle slight changes in positioning and unforeseen part jamming. Learning such strategies from human performance is a promising approach, but faces two common challenges: the handling of low part clearances which is difficult to capture from demonstrations and learning intuitive strategies offline without access to the real hardware. We address these two challenges by learning probabilistic force strategies from data that are easily acquired offline in a robot-less simulation from human demonstrations with a joystick. We combine a Long Short Term Memory (LSTM) and a Mixture Density Network (MDN) to model human-inspired behavior in such a way that the learned strategies transfer easily onto real hardware. The experiments show a UR10e robot that completes a plastic assembly with clearances of less than 100 micrometers whose strategies were solely demonstrated in simulation.

Submitted 22 March, 2023; originally announced March 2023.

Comments: 8 pages, 8 figures. Submitted to the IEEE International Conference on Automation Science and Engineering (CASE) 2023

arXiv:2203.10217 [pdf, other]

A Walking Space Robot for On-Orbit Satellite Servicing: The ReCoBot

Authors: Stefan Scherzinger, Jakob Weinland, Robert Wilbrandt, Pascal Becker, Arne Roennau, Rüdiger Dillmann

Abstract: A key factor in the economic efficiency of satellites is their availability in orbit. Replacing standardized building blocks, such as empty fuel tanks or outdated electronic modules, could greatly extend the satellites' lifetime. This, however, requires flexible robots that can locomote on the surface of these satellites for optimal accessibility and manipulation. This paper introduces ReCoBot, a 7-axis walking space manipulator for locomotion and manipulation. The robot can connect to compatible structures with its symmetric ends and provides interfaces for manual teleoperation and motion planning with a constantly changing base and tip. We build on open-source robotics software and easily available components to evaluate the overall concept with an early stage demonstrator. The proposed manipulator has a length of 1.20 m and a weight of 10.4 kg and successfully locomotes over a satellite mockup in our lab environment.

Submitted 18 March, 2022; originally announced March 2022.

Comments: 7 pages, 9 figures, submitted to the 18th IEEE International Conference on Automation Science and Engineering (CASE)

arXiv:2203.06289 [pdf, other]

Peel $\mid$ Pile? Cross-Framework Portability of Quantum Software

Authors: Manuel Schönberger, Maja Franz, Stefanie Scherzinger, Wolfgang Mauerer

Abstract: In recent years, various vendors have made quantum software frameworks available. Yet with vendor-specific frameworks, code portability seems at risk, especially in a field where hardware and software libraries have not yet reached a consolidated state, and even foundational aspects of the technologies are still in flux. Accordingly, the development of vendor-independent quantum programming languages and frameworks is often suggested. This follows the established architectural pattern of introducing additional levels of abstraction into software stacks, thereby piling on layers of abstraction. Yet software architecture also provides seemingly less abstract alternatives, namely to focus on hardware-specific formulations of problems that peel off unnecessary layers. In this article, we quantitatively and experimentally explore these strategic alternatives, and compare popular quantum frameworks from the software implementation perspective. We find that for several specific, yet generalisable problems, the mathematical formulation of the problem to be solved is not just sufficiently abstract and serves as precise description, but is likewise concrete enough to allow for deriving framework-specific implementations with little effort. Additionally, we argue, based on analysing dozens of existing quantum codes, that porting between frameworks is actually low-effort, since the quantum- and framework-specific portions are very manageable in terms of size, commonly in the order of mere hundreds of lines of code. Given the current state-of-the-art in quantum programming practice, this leads us to argue in favour of peeling off unnecessary abstraction levels.

Submitted 11 March, 2022; originally announced March 2022.

Journal ref: QSA@ICSA2022

arXiv:2203.05283 [pdf, other]

Beyond the Badge: Reproducibility Engineering as a Lifetime Skill

Authors: Wolfgang Mauerer, Stefan Klessinger, Stefanie Scherzinger

Abstract: Ascertaining reproducibility of scientific experiments is receiving increased attention across disciplines. We argue that the necessary skills are important beyond pure scientific utility, and that they should be taught as part of software engineering (SWE) education. They serve a dual purpose: Apart from acquiring the coveted badges assigned to reproducible research, reproducibility engineering is a lifetime skill for a professional industrial career in computer science. SWE curricula seem an ideal fit for conveying such capabilities, yet they require some extensions, especially given that even at flagship conferences like ICSE, only slightly more than one-third of the technical papers (at the 2021 edition) receive recognition for artefact reusability. Knowledge and capabilities in setting up engineering environments that allow for reproducing artefacts and results over decades (a standard requirement in many traditional engineering disciplines), writing semi-literate commit messages that document crucial steps of a decision-making process and that are tightly coupled with code, or sustainably taming dynamic, quickly changing software dependencies, to name a few: They all contribute to solving the scientific reproducibility crisis, and enable software engineers to build sustainable, long-term maintainable, software-intensive, industrial systems. We propose to teach these skills at the undergraduate level, on par with traditional SWE topics.

Submitted 10 March, 2022; originally announced March 2022.

arXiv:2202.13434 [pdf, ps, other]

Negation-Closure for JSON Schema

Authors: Mohamed-Amine Baazizi, Dario Colazzo, Giorgio Ghelli, Carlo Sartiani, Stefanie Scherzinger

Abstract: JSON Schema is an evolving standard for describing families of JSON documents. It is a logical language, based on a set of assertions that describe features of the JSON value under analysis and on logical or structural combinators for these assertions, including a negation operator. Most logical languages with negation enjoy negation closure, that is, for every operator they have a negation dual that expresses its negation. We show that this is not the case for JSON Schema, we study how that changed with the latest versions of the Draft, and we discuss how the language may be enriched accordingly. In the process, we define an algebraic reformulation of JSON Schema, which we successfully employed in a prototype system for generating schema witnesses.

Submitted 27 February, 2022; originally announced February 2022.

arXiv:2202.12849 [pdf, other]

Witness Generation for JSON Schema

Authors: Lyes Attouche, Mohamed-Amine Baazizi, Dario Colazzo, Giorgio Ghelli, Carlo Sartiani, Stefanie Scherzinger

Abstract: JSON Schema is an important, evolving standard schema language for families of JSON documents. It is based on a complex combination of structural and Boolean assertions, and features negation and recursion. The static analysis of JSON Schema documents comprises practically relevant problems, including schema satisfiability, inclusion, and equivalence. These three problems can be reduced to witness generation: given a schema, generate an element of the schema, if it exists, and report failure otherwise. Schema satisfiability, inclusion, and equivalence have been shown to be decidable, by reduction to reachability in alternating tree automata. However, no witness generation algorithm has yet been formally described. We contribute a first, direct algorithm for JSON Schema witness generation. We study its effectiveness and efficiency, in experiments over several schema collections, including thousands of real-world schemas. Our focus is on the completeness of the language, where we only exclude the uniqueItems operator, and on the ability of the algorithm to run in a reasonable time on a large set of real-world examples, despite the exponential complexity of the underlying problem.

Submitted 16 July, 2022; v1 submitted 25 February, 2022; originally announced February 2022.

arXiv:2202.09221 [pdf, other]

Motion Macro Programming on Assistive Robotic Manipulators: Three Skill Types for Everyday Tasks

Authors: Stefan Scherzinger, Pascal Becker, Arne Roennau, Rüdiger Dillmann

Abstract: Assistive robotic manipulators are becoming increasingly important for people with disabilities. Teleoperating the manipulator in mundane tasks is part of their daily lives. Instead of steering the robot through all actions, applying self-recorded motion macros could greatly facilitate repetitive tasks. Dynamic Movement Primitives (DMP) are a powerful method for skill learning via teleoperation. For this use case, however, they need simple heuristics to specify where to start, stop, and parameterize a skill without a background in computer science and academic sensor setups for autonomous perception. To achieve this goal, this paper provides the concept of local, global, and hybrid skills that form a modular basis for composing single-handed tasks of daily living. These skills are specified implicitly and can easily be programmed by users themselves, requiring only their basic robotic manipulator. The paper contributes all details for robot-agnostic implementations. Experiments validate the developed methods for exemplary tasks, such as scratching an itchy spot, sorting objects on a desk, and feeding a piggy bank with coins. The paper is accompanied by an open-source implementation at https://github.com/fzi-forschungszentrum-informatik/ArNe

Submitted 12 May, 2023; v1 submitted 18 February, 2022; originally announced February 2022.

Comments: 8 pages, 10 figures, accepted to the IEEE 20th International Conference on Ubiquitous Robots (UR 2023), Honolulu, USA

arXiv:2201.12031 [pdf, other]

1-2-3 Reproducibility for Quantum Software Experiments

Authors: Wolfgang Mauerer, Stefanie Scherzinger

Abstract: Various fields of science face a reproducibility crisis. For quantum software engineering as an emerging field, it is therefore imminent to focus on proper reproducibility engineering from the start. Yet the provision of reproduction packages is almost universally lacking. Actionable advice on how to build such packages is rare, particularly unfortunate in a field with many contributions from researchers with backgrounds outside computer science. In this article, we argue how to rectify this deficiency by proposing a 1-2-3 approach to reproducibility engineering for quantum software experiments: Using a meta-generation mechanism, we generate DOI-safe, long-term functioning and dependency-free reproduction packages. They are designed to satisfy the requirements of professional and learned societies solely on the basis of project-specific research artefacts (source code, measurement and configuration data), and require little temporal investment by researchers. Our scheme ascertains long-term traceability even when the quantum processor itself is no longer accessible. By drastically lowering the technical bar, we foster the proliferation of reproduction packages in quantum software experiments and ease the inclusion of non-CS researchers entering the field.

Submitted 28 January, 2022; originally announced January 2022.

Comments: Q-SANER@SANER 2022 (to appear)

arXiv:2111.01086 [pdf, other]

AutoShard -- Declaratively Managing Hot Spot Data Objects in NoSQL Document Stores

Authors: Stefanie Scherzinger, Andreas Thor

Abstract: NoSQL document stores are becoming increasingly popular as backends in web development. Not only do they scale out to large volumes of data, many systems are even custom-tailored for this domain: NoSQL document stores like Google Cloud Datastore have been designed to support massively parallel reads, and even guarantee strong consistency in updating single data objects. However, strongly consistent updates cannot be implemented arbitrarily fast in large-scale distributed systems. Consequently, data objects that experience high-frequent writes can turn into severe performance bottlenecks. In this paper, we present AutoShard, a ready-to-use object mapper for Java applications running against NoSQL document stores. AutoShard's unique feature is its capability to gracefully shard hot spot data objects to avoid write contention. Using AutoShard, developers can easily handle hot spot data objects by adding minimally intrusive annotations to their application code. Our experiments show the significant impact of sharding on both the write throughput and the execution time.

Submitted 1 November, 2021; originally announced November 2021.

Comments: Published at WebDB 2014

Journal ref: WebDB 2014

arXiv:2107.11607 [pdf, other]

Tell-Tale Tail Latencies: Pitfalls and Perils in Database Benchmarking

Authors: Michael Fruth, Stefanie Scherzinger, Wolfgang Mauerer, Ralf Ramsauer

Abstract: The performance of database systems is usually characterised by their average-case (i.e., throughput) behaviour in standardised or de-facto standard benchmarks like TPC-X or YCSB. While tails of the latency (i.e., response time) distribution receive considerably less attention, they have been identified as a threat to the overall system performance: In large-scale systems, even a fraction of requests delayed can build up into delays perceivable by end users. To eradicate large tail latencies from database systems, the ability to faithfully record them, and likewise pinpoint them to the root causes, is imminently required. In this paper, we address the challenge of measuring tail latencies using standard benchmarks, and identify subtle perils and pitfalls. In particular, we demonstrate how Java-based benchmarking approaches can substantially distort tail latency observations, and discuss how the discovery of such problems is inhibited by the common focus on throughput performance. We make a case for purposefully re-designing database benchmarking harnesses based on these observations to arrive at faithful characterisations of database performance from multiple important angles.

Submitted 24 July, 2021; originally announced July 2021.

arXiv:2107.08677 [pdf, ps, other]

An Empirical Study on the "Usage of Not" in Real-World JSON Schema Documents (Long Version)

Authors: Mohamed-Amine Baazizi, Dario Colazzo, Giorgio Ghelli, Carlo Sartiani, Stefanie Scherzinger

Abstract: In this paper, we study the usage of negation in JSON Schema data modeling. Negation is a logical operator that is rarely present in type systems and schema description languages, since it complicates decision problems. As a consequence, many software tools, but also formal frameworks for working with JSON Schema, do not fully support negation. As of today, the question whether covering negation is practically relevant, or a mainly theoretical exercise (albeit challenging), is open. This motivates us to study whether negation is really used in practice, for which aims, and whether it could be - in principle - replaced by simpler operators. We have collected the most diverse corpus of JSON Schema documents analyzed so far, based on a crawl of 90k open source schemas hosted on GitHub. We perform a systematic analysis, quantify usage patterns of negation, and also qualitatively analyze schemas. We show that negation is indeed used, following a stable set of patterns, with the potential to mature into design patterns.

Submitted 19 July, 2021; originally announced July 2021.

arXiv:2104.14828 [pdf, ps, other]

Not Elimination and Witness Generation for JSON Schema

Authors: Mohamed-Amine Baazizi, Dario Colazzo, Giorgio Ghelli, Carlo Sartiani, Stefanie Scherzinger

Abstract: JSON Schema is an evolving standard for the description of families of JSON documents. JSON Schema is a logical language, based on a set of assertions that describe features of the JSON value under analysis and on logical or structural combinators for these assertions. As for any logical language, problems like satisfaction, not-elimination, schema satisfiability, schema inclusion and equivalence, as well as witness generation, have both theoretical and practical interest. While satisfaction is trivial, all other problems are quite difficult, due to the combined presence of negation, recursion, and complex assertions in JSON Schema. To make things even more complex and interesting, JSON Schema is not algebraic, since we have both syntactic and semantic interactions between different keywords in the same schema object. With such motivations, we present in this paper an algebraic characterization of JSON Schema, obtained by adding opportune operators, and by mirroring existing ones. We present then algebra-based approaches for dealing with not-elimination and witness generation problems, which play a central role as they lead to solutions for the other mentioned complex problems.

Submitted 7 May, 2021; v1 submitted 30 April, 2021; originally announced April 2021.

arXiv:2104.11787 [pdf, other]

MigCast in Monte Carlo: The Impact of Data Model Evolution in NoSQL Databases

Authors: Andrea Hillenbrand, Uta Störl, Shamil Nabiyev, Stefanie Scherzinger

Abstract: During the development of NoSQL-backed software, the data model evolves naturally alongside the application code. Especially in agile development, new application releases are deployed frequently causing schema changes. Eventually, decisions have to be made regarding the migration of versioned legacy data which is persisted in the cloud-hosted production database. We solve this schema evolution problem and present the results of near-exhaustive calculations by means of which software project stakeholders can manage the operative costs for data model evolution and adapt their software release strategy accordingly in order to comply with service-level agreements regarding the competing metrics of migration costs and latency. We clarify conclusively how data model evolution in NoSQL databases impacts the metrics while taking all relevant characteristics of migration scenarios into account. As calculating all possible combinatorics in the search space of migration scenarios would by far exceed computational means, we used a probabilistic Monte Carlo method of repeated sampling, serving as a well-established means to bring the complexity of data model evolution under control. Our experiments show the qualitative and quantitative impact on the performance of migration strategies with respect to intensity and distribution of data entity accesses, the kinds of schema changes, and the characteristics of the underlying data model.

Submitted 23 April, 2021; originally announced April 2021.

Comments: 16 pages, 15 figures

arXiv:2102.06219 [pdf, other]

Silentium! Run-Analyse-Eradicate the Noise out of the DB/OS Stack

Authors: Wolfgang Mauerer, Ralf Ramsauer, Edson R. F. Lucas, Stefanie Scherzinger

Abstract: When multiple tenants compete for resources, database performance tends to suffer. Yet there are scenarios where guaranteed sub-millisecond latencies are crucial, such as in real-time data processing, IoT devices, or when operating in safety-critical environments. In this paper, we study how to make query latencies deterministic in the face of noise (whether caused by other tenants or unrelated operating system tasks). We perform controlled experiments with an in-memory database engine in a multi-tenant setting, where we successively eradicate noisy interference from within the system software stack, to the point where the engine runs close to bare-metal on the underlying hardware. We show that we can achieve query latencies comparable to the database engine running as the sole tenant, but without noticeably impacting the workload of competing tenants. We discuss these results in the context of ongoing efforts to build custom operating systems for database workloads, and point out that for certain use cases, the margin for improvement is rather narrow. In fact, for scenarios like ours, existing operating systems might just be good enough, provided that they are expertly configured. We then critically discuss these findings in the light of a broader family of database systems (e.g., including disk-based), and how to extend the approach of this paper accordingly.

Submitted 25 February, 2021; v1 submitted 11 February, 2021; originally announced February 2021.

arXiv:2009.11888 [pdf, other]

Virtual Forward Dynamics Models for Cartesian Robot Control

Authors: Stefan Scherzinger, Arne Roennau, Rüdiger Dillmann

Abstract: In industrial context, admittance control represents an important scheme in programming robots for interaction tasks with their environments. Those robots usually implement high-gain disturbance rejection on joint-level and hide direct access to the actuators behind velocity or position controlled interfaces. Using wrist force-torque sensors to add compliance to these systems, force-resolved control laws must map the control signals from Cartesian space to joint motion. Although forward dynamics algorithms would perfectly fit to that task description, their application to Cartesian robot control is not well researched. This paper proposes a general concept of virtual forward dynamics models for Cartesian robot control and investigates how the forward mapping behaves in comparison to well-established alternatives. Through decreasing the virtual system's link masses in comparison to the end effector, the virtual system becomes linear in the operational space dynamics. Experiments focus on stability and manipulability, particularly in singular configurations. Our results show that through this trick, forward dynamics can combine both benefits of the Jacobian inverse and the Jacobian transpose and, in this regard, outperforms the Damped Least Squares method.

Submitted 24 September, 2020; originally announced September 2020.

Comments: Preprint of submission to Journal of Intelligent & Robotic Systems (JINT), 16 pages, 13 figures

arXiv:2008.10925 [pdf, ps, other]

Replicability and Reproducibility of a Schema Evolution Study in Embedded Databases

Authors: Dimitri Braininger, Wolfgang Mauerer, Stefanie Scherzinger

Abstract: Ascertaining the feasibility of independent falsification or repetition of published results is vital to the scientific process, and replication or reproduction experiments are routinely performed in many disciplines. Unfortunately, such studies are only scarcely available in database research, with few papers dedicated to re-evaluating published results. In this paper, we conduct a case study on replicating and reproducing a study on schema evolution in embedded databases. We obtain exact results for one out of four database applications studied, and come close in two further cases. By reporting results, efforts, and obstacles encountered, we hope to increase appreciation for the substantial efforts required to ensure reproducibility. By discussing minutiae details required for reproducible work, we argue that such important, but often ignored components of scientific work should receive more credit in the evaluation of future research.

Submitted 9 September, 2020; v1 submitted 25 August, 2020; originally announced August 2020.

arXiv:2003.00054 [pdf, other]

An Empirical Study on the Design and Evolution of NoSQL Database Schemas

Authors: Stefanie Scherzinger, Sebastian Sidortschuck

Abstract: We study how software engineers design and evolve their domain model when building applications against NoSQL data stores. Specifically, we target Java projects that use object-NoSQL mappers to interface with schema-free NoSQL data stores. Given the source code of ten real-world database applications, we extract the implicit NoSQL database schema. We capture the sizes of the schemas, and investigate whether the schema is denormalized, as is recommended practice in data modeling for NoSQL data stores. Further, we analyze the entire project history, and with it, the evolution history of the NoSQL database schema. In doing so, we conduct the so far largest empirical study on NoSQL schema design and evolution.

Submitted 28 February, 2020; originally announced March 2020.

arXiv:1908.06272 [pdf, other]

doi 10.1109/IROS40897.2019.8967523

Contact Skill Imitation Learning for Robot-Independent Assembly Programming

Authors: Stefan Scherzinger, Arne Roennau, Rüdiger Dillmann

Abstract: Robotic automation is a key driver for the advancement of technology. The skills of human workers, however, are difficult to program and seem currently unmatched by technical systems. In this work we present a data-driven approach to extract and learn robot-independent contact skills from human demonstrations in simulation environments, using a Long Short Term Memory (LSTM) network. Our model learns to generate error-correcting sequences of forces and torques in task space from object-relative motion, which industrial robots carry out through a Cartesian force control scheme on the real setup. This scheme uses forward dynamics computation of a virtually conditioned twin of the manipulator to solve the inverse kinematics problem. We evaluate our methods with an assembly experiment, in which our algorithm handles part tilting and jamming in order to succeed. The results show that the skill is robust towards localization uncertainty in task space and across different joint configurations of the robot. With our approach, non-experts can easily program force-sensitive assembly tasks in a robot-independent way.

Submitted 5 February, 2020; v1 submitted 17 August, 2019; originally announced August 2019.

Comments: 8 pages, 12 figures, presented at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2019

arXiv:1908.06252 [pdf, other]

doi 10.1109/ICAR46387.2019.8981554

Inverse Kinematics with Forward Dynamics Solvers for Sampled Motion Tracking

Authors: Stefan Scherzinger, Arne Roennau, Rüdiger Dillmann

Abstract: Tracking Cartesian motion with end effectors is a fundamental task in robot control. For motion that is not known in advance, the solvers must find fast solutions to the inverse kinematics (IK) problem for discretely sampled target poses. On joint control level, however, the robot's actuators operate in a continuous domain, requiring smooth transitions between individual states. In this work, we present a boost to the well-known Jacobian transpose method to address this goal, using the mass matrix of a virtually conditioned twin of the manipulator. Results on the UR10 show superior convergence and quality of our dynamics-based solver against the plain Jacobian method. Our algorithm is straightforward to implement as a controller, using common robotics libraries.

Submitted 12 May, 2023; v1 submitted 17 August, 2019; originally announced August 2019.

Comments: 7 pages, 8 figures, presented at the 19th International Conference on Advanced Robotics (ICAR) 2019, Belo Horizonte, Brazil

arXiv:1308.0514 [pdf, other]

Managing Schema Evolution in NoSQL Data Stores

Authors: Stefanie Scherzinger, Meike Klettke, Uta Störl

Abstract: NoSQL data stores are commonly schema-less, providing no means for globally defining or managing the schema. While this offers great flexibility in early stages of application development, developers soon can experience the heavy burden of dealing with increasingly heterogeneous data. This paper targets schema evolution for NoSQL data stores, the complex task of adapting and changing the implicit structure of the data stored. We discuss the recommendations of the developer community on handling schema changes, and introduce a simple, declarative schema evolution language. With our language, software developers and architects can systematically manage the evolution of their production data and perform typical schema maintenance tasks. We further provide a holistic NoSQL database programming language to define the semantics of our schema evolution language. Our solution does not require any modifications to the NoSQL data store, treating the data store as a black box. Thus, we want to address application developers that use NoSQL systems

Submitted 2 August, 2013; originally announced August 2013.

Comments: Proceedings of the 14th International Symposium on Database Programming Languages (DBPL 2013), August 30, 2013, Riva del Garda, Trento, Italy

arXiv:cs/0406016 [pdf, ps, other]

Schema-based Scheduling of Event Processors and Buffer Minimization for Queries on Structured Data Streams

Authors: Christoph Koch, Stefanie Scherzinger, Nicole Schweikardt, Bernhard Stegmaier

Abstract: We introduce an extension of the XQuery language, FluX, that supports event-based query processing and the conscious handling of main memory buffers. Purely event-based queries of this language can be executed on streaming XML data in a very direct way. We then develop an algorithm that allows to efficiently rewrite XQueries into the event-based FluX language. This algorithm uses order constraints from a DTD to schedule event handlers and to thus minimize the amount of buffering required for evaluating a query. We discuss the various technical aspects of query optimization and query evaluation within our framework. This is complemented with an experimental evaluation of our approach.

Submitted 7 June, 2004; originally announced June 2004.

Comments: 14 pages, 4 figures, to appear in Proc. 30th VLDB 2004, Toronto, Canada. Extended version

ACM Class: H.2.3, H.2.4

Showing 1–24 of 24 results for author: Scherzinger, S