Search | arXiv e-print repository

Paparazzi: A Deep Dive into the Capabilities of Language and Vision Models for Grounding Viewpoint Descriptions

Authors: Henrik Voigt, Jan Hombeck, Monique Meuschke, Kai Lawonn, Sina Zarrieß

Abstract: Existing language and vision models achieve impressive performance in image-text understanding. Yet, it is an open question to what extent they can be used for language understanding in 3D environments and whether they implicitly acquire 3D object knowledge, e.g. about different views of an object. In this paper, we investigate whether a state-of-the-art language and vision model, CLIP, is able to ground perspective descriptions of a 3D object and identify canonical views of common objects based on text queries. We present an evaluation framework that uses a circling camera around a 3D object to generate images from different viewpoints and evaluate them in terms of their similarity to natural language descriptions. We find that a pre-trained CLIP model performs poorly on most canonical views and that fine-tuning using hard negative sampling and random contrasting yields good results even under conditions with little available training data.

Submitted 13 February, 2023; originally announced February 2023.

arXiv:2211.10962 [pdf, ps, other]

doi 10.1145/3589778

PG-Schema: Schemas for Property Graphs

Authors: Renzo Angles, Angela Bonifati, Stefania Dumbrava, George Fletcher, Alastair Green, Jan Hidders, Bei Li, Leonid Libkin, Victor Marsault, Wim Martens, Filip Murlak, Stefan Plantikow, Ognjen Savković, Michael Schmidt, Juan Sequeda, Sławek Staworko, Dominik Tomaszuk, Hannes Voigt, Domagoj Vrgoč, Mingxi Wu, Dušan Živković

Abstract: Property graphs have reached a high level of maturity, witnessed by multiple robust graph database systems as well as the ongoing ISO standardization effort aiming at creating a new standard Graph Query Language (GQL). Yet, despite documented demand, schema support is limited both in existing systems and in the first version of the GQL Standard. It is anticipated that the second version of the GQL Standard will include a rich DDL. Aiming to inspire the development of GQL and enhance the capabilities of graph database systems, we propose PG-Schema, a simple yet powerful formalism for specifying property graph schemas. It features PG-Types with flexible type definitions supporting multi-inheritance, as well as expressive constraints based on the recently proposed PG-Keys formalism. We provide the formal syntax and semantics of PG-Schema, which meet principled design requirements grounded in contemporary property graph management scenarios, and offer a detailed comparison of its features with those of existing schema languages and graph database systems.

Submitted 8 July, 2023; v1 submitted 20 November, 2022; originally announced November 2022.

Comments: 26 pages

Journal ref: Proc. ACM Manag. Data (2023)

arXiv:2112.06217 [pdf, ps, other]

Graph Pattern Matching in GQL and SQL/PGQ

Authors: Alin Deutsch, Nadime Francis, Alastair Green, Keith Hare, Bei Li, Leonid Libkin, Tobias Lindaaker, Victor Marsault, Wim Martens, Jan Michels, Filip Murlak, Stefan Plantikow, Petra Selmer, Hannes Voigt, Oskar van Rest, Domagoj Vrgoč, Mingxi Wu, Fred Zemke

Abstract: As graph databases become widespread, JTC1 -- the committee in joint charge of information technology standards for the International Organization for Standardization (ISO), and International Electrotechnical Commission (IEC) -- has approved a project to create GQL, a standard property graph query language. This complements a project to extend SQL with a new part, SQL/PGQ, which specifies how to define graph views over an SQL tabular schema, and to run read-only queries against them. Both projects have been assigned to the ISO/IEC JTC1 SC32 working group for Database Languages, WG3, which continues to maintain and enhance SQL as a whole. This common responsibility helps enforce a policy that the identical core of both PGQ and GQL is a graph pattern matching sub-language, here termed GPML. The WG3 design process is also analyzed by an academic working group, part of the Linked Data Benchmark Council (LDBC), whose task is to produce a formal semantics of these graph data languages, which complements their standard specifications. This paper, written by members of WG3 and LDBC, presents the key elements of the GPML of SQL/PGQ and GQL in advance of the publication of these new standards.

Submitted 12 December, 2021; originally announced December 2021.

ACM Class: H.2.3

arXiv:2111.09228 [pdf, other]

Semantic Foundations of Seraph Continuous Graph Query Language

Authors: Emanuele Falzone, Riccardo Tommasini, Emanuele Della Valle, Petra Selmer, Stefan Plantikow, Hannes Voigt, Keith Hare, Ljubica Lazarevic, Tobias Lindaaker

Abstract: The scientific community has been studying graph data models for decades. Their high expressiveness and elasticity led the scientific community to design a variety of graph data models and graph query languages, and the practitioners to use them to model real-world cases and extract useful information. Recently, property graphs and, in particular, Cypher 9 (the first open version of the well-known Neo4j Inc.'s language) are gaining popularity. Practitioners find Cypher useful and applicable in many scenarios. However, we are living in a streaming world where data continuously flows. A growing number of Cypher's users show interest in continuously querying graph data to act in a timely fashion. Indeed, Cypher lacks the features for dealing with streams of (graph) data and continuous query evaluation. In this work, we propose Seraph, an extension of Cypher, as a first attempt to introduce streaming features in the context of property graph query languages. Specifically, we define Seraph semantics, we propose a first version of Seraph syntax, and we discuss the potential impacts from a user perspective.

Submitted 17 November, 2021; originally announced November 2021.

arXiv:2012.06171 [pdf, other]

doi 10.1145/3434642

The Future is Big Graphs! A Community View on Graph Processing Systems

Authors: Sherif Sakr, Angela Bonifati, Hannes Voigt, Alexandru Iosup, Khaled Ammar, Renzo Angles, Walid Aref, Marcelo Arenas, Maciej Besta, Peter A. Boncz, Khuzaima Daudjee, Emanuele Della Valle, Stefania Dumbrava, Olaf Hartig, Bernhard Haslhofer, Tim Hegeman, Jan Hidders, Katja Hose, Adriana Iamnitchi, Vasiliki Kalavri, Hugo Kapp, Wim Martens, M. Tamer Özsu, Eric Peukert, Stefan Plantikow, et al. (16 additional authors not shown)

Abstract: Graphs are by nature unifying abstractions that can leverage interconnectedness to represent, explore, predict, and explain real- and digital-world phenomena. Although real users and consumers of graph instances and graph workloads understand these abstractions, future problems will require new abstractions and systems. What needs to happen in the next decade for big graph processing to continue to succeed?

Submitted 11 December, 2020; originally announced December 2020.

Comments: 12 pages, 3 figures, collaboration between the large-scale systems and data management communities, work started at the Dagstuhl Seminar 19491 on Big Graph Processing Systems, to be published in the Communications of the ACM

ACM Class: C.3; E.0; H.2; J.0

arXiv:1905.09848 [pdf, other]

Conjunctive Queries with Theta Joins Under Updates

Authors: Muhammad Idris, Martín Ugarte, Stijn Vansummeren, Hannes Voigt, Wolfgang Lehner

Abstract: Modern application domains such as Composite Event Recognition (CER) and real-time Analytics require the ability to dynamically refresh query results under high update rates. Traditional approaches to this problem are based either on the materialization of subresults (to avoid their recomputation) or on the recomputation of subresults (to avoid the space overhead of materialization). Both techniques have recently been shown suboptimal: instead of materializing results and subresults, one can maintain a data structure that supports efficient maintenance under updates and can quickly enumerate the full query output, as well as the changes produced under single updates. Unfortunately, these data structures have been developed only for aggregate-join queries composed of equi-joins, limiting their applicability in domains such as CER where temporal joins are commonplace. In this paper, we present a new approach for dynamically evaluating queries with multi-way theta-joins under updates that is effective in avoiding both materialization and recomputation of results, while supporting a wide range of applications. To do this we generalize Dynamic Yannakakis, an algorithm for dynamically processing acyclic equi-join queries. In tandem, and of independent interest, we generalize the notions of acyclicity and free-connexity to arbitrary theta-joins and show how to compute corresponding join trees. We instantiate our framework to the case where theta-joins are only composed of equalities and inequalities and experimentally compare our algorithm to state of the art CER systems as well as incremental view maintenance engines. Our approach performs consistently better than the competitor systems with up to two orders of magnitude improvements in both time and memory consumption.

Submitted 23 May, 2019; originally announced May 2019.

arXiv:1902.06427 [pdf, ps, other]

Schema Validation and Evolution for Graph Databases

Authors: Angela Bonifati, Peter Furniss, Alastair Green, Russ Harmer, Eugenia Oshurko, Hannes Voigt

Abstract: Despite the maturity of commercial graph databases, little consensus has been reached so far on the standardization of data definition languages (DDLs) for property graphs (PG). The discussion on the characteristics of PG schemas is ongoing in many standardization and community groups. Although some basic aspects of a schema are already present in Neo4j 3.5, like in most commercial graph databases, full support is missing allowing to constraint property graphs with more or less flexibility. In this paper, we focus on two different perspectives from which a PG schema should be considered, as being descriptive or prescriptive, and we show how it would be possible to switch from one to another as the application under development gains more stability. Apart from proposing concise schema DDL inspired by Cypher syntax, we show how schema validation can be enforced through homomorphisms between PG schemas and PG instances; and how schema evolution can be described through the use of graph rewriting operations. Our prototypical implementation demonstrates feasibility and shows the need of offering high-level query primitives to accommodate flexible graph schema requirements as showcased in our work.

Submitted 18 February, 2019; originally announced February 2019.

Comments: 36 pages, 9 figures

arXiv:1712.01550 [pdf, other]

G-CORE: A Core for Future Graph Query Languages

Authors: Renzo Angles, Marcelo Arenas, Pablo Barceló, Peter Boncz, George H. L. Fletcher, Claudio Gutierrez, Tobias Lindaaker, Marcus Paradies, Stefan Plantikow, Juan Sequeda, Oskar van Rest, Hannes Voigt

Abstract: We report on a community effort between industry and academia to shape the future of graph query languages. We argue that existing graph database management systems should consider supporting a query language with two key characteristics. First, it should be composable, meaning, that graphs are the input and the output of queries. Second, the graph query language should treat paths as first-class citizens. Our result is G-CORE, a powerful graph query language design that fulfills these goals, and strikes a careful balance between path query expressivity and evaluation complexity.

Submitted 6 December, 2017; v1 submitted 5 December, 2017; originally announced December 2017.

arXiv:1608.05564 [pdf, other]

doi 10.1145/3035918.3064046

Living in Parallel Realities -- Co-Existing Schema Versions with a Bidirectional Database Evolution Language

Authors: Kai Herrmann, Hannes Voigt, Andreas Behrend, Jonas Rausch, Wolfgang Lehner

Abstract: We introduce end-to-end support of co-existing schema versions within one database. While it is state of the art to run multiple versions of a continuously developed application concurrently, it is hard to do the same for databases. In order to keep multiple co-existing schema versions alive; which are all accessing the same data set; developers usually employ handwritten delta code (e.g. views and triggers in SQL). This delta code is hard to write and hard to maintain: if a database administrator decides to adapt the physical table schema, all handwritten delta code needs to be adapted as well, which is expensive and error-prone in practice. In this paper, we present InVerDa: developers use the simple bidirectional database evolution language BiDEL, which carries enough information to generate all delta code automatically. Without additional effort, new schema versions become immediately accessible and data changes in any version are visible in all schema versions at the same time. InVerDa also allows for easily changing the physical table design without affecting the availability of co-existing schema versions. This greatly increases robustness (orders of magnitude less lines of code) and allows for significant performance optimization. A main contribution is the formal evaluation that each schema version acts like a common full-fledged database schema independently of the chosen physical table design.

Submitted 19 September, 2017; v1 submitted 19 August, 2016; originally announced August 2016.

Showing 1–9 of 9 results for author: Voigt, H