Showing 1–16 of 16 results for author: Boncz, P

  1. DuckDB-SGX2: The Good, The Bad and The Ugly within Confidential Analytical Query Processing

    Authors: Ilaria Battiston, Lotte Felius, Sam Ansmink, Laurens Kuiper, Peter Boncz

    Abstract: We provide an evaluation of an analytical workload in a confidential computing environment, combining DuckDB with two technologies: modular columnar encryption in Parquet files (data at rest) and the newest version of the Intel SGX Trusted Execution Environment (TEE), providing a hardware enclave where data in flight can be (more) securely decrypted and processed. One finding is that the "performa…

    Submitted 30 May, 2024; v1 submitted 20 May, 2024; originally announced May 2024.

  2. OpenIVM: a SQL-to-SQL Compiler for Incremental Computations

    Authors: Ilaria Battiston, Kriti Kathuria, Peter Boncz

    Abstract: This demonstration presents a new Open Source SQL-to-SQL compiler for Incremental View Maintenance (IVM). While previous systems, such as DBToaster, implemented computational functionality for IVM in a separate system, the core principle of OpenIVM is to make use of existing SQL query processing engines and perform all IVM computations via SQL. This approach enables the integration of IVM in these…

    Submitted 25 April, 2024; originally announced April 2024.

  3. arXiv:2312.12923

    cs.DB

    Improving Data Minimization through Decentralized Data Architectures

    Authors: Ilaria Battiston, Peter Boncz

    Abstract: In this research project, we investigate an alternative to the standard cloud-centralized data architecture. Specifically, we aim to leave part of the application data under the control of the individual data owners in decentralized personal data stores. Our primary goal is to increase data minimization, i.e., enabling more sensitive personal data to be under the control of its owners while provi…

    Submitted 20 December, 2023; originally announced December 2023.

    Comments: 4 pages, 1 figure

    Journal ref: CEUR Workshop Proceedings 2023

  4. arXiv:2307.04820

    cs.DB

    The LDBC Social Network Benchmark Interactive workload v2: A transactional graph query benchmark with deep delete operations

    Authors: David Püroja, Jack Waudby, Peter Boncz, Gábor Szárnyas

    Abstract: The LDBC Social Network Benchmark's Interactive workload captures an OLTP scenario operating on a correlated social network graph. It consists of complex graph queries executed concurrently with a stream of update operations. Since its initial release in 2015, the Interactive workload has become the de facto industry standard for benchmarking transactional graph data management systems. As graph s…

    Submitted 17 August, 2023; v1 submitted 10 July, 2023; originally announced July 2023.

    ACM Class: H.2.4

  5. arXiv:2307.04350

    cs.DB

    The Linked Data Benchmark Council (LDBC): Driving competition and collaboration in the graph data management space

    Authors: Gábor Szárnyas, Brad Bebee, Altan Birler, Alin Deutsch, George Fletcher, Henry A. Gabb, Denise Gosnell, Alastair Green, Zhihui Guo, Keith W. Hare, Jan Hidders, Alexandru Iosup, Atanas Kiryakov, Tomas Kovatchev, Xinsheng Li, Leonid Libkin, Heng Lin, Xiaojian Luo, Arnau Prat-Pérez, David Püroja, Shipeng Qi, Oskar van Rest, Benjamin A. Steer, Dávid Szakállas, Bing Tong, et al. (8 additional authors not shown)

    Abstract: Graph data management is instrumental for several use cases such as recommendation, root cause analysis, financial fraud detection, and enterprise knowledge representation. Efficiently supporting these use cases yields a number of unique requirements, including the need for a concise query language and graph-aware query optimization techniques. The goal of the Linked Data Benchmark Council (LDBC)…

    Submitted 17 August, 2023; v1 submitted 10 July, 2023; originally announced July 2023.

    ACM Class: H.2.4

  6. arXiv:2112.06280

    cs.DC

    In-Memory Indexed Caching for Distributed Data Processing

    Authors: Alexandru Uta, Bogdan Ghit, Ankur Dave, Jan Rellermeyer, Peter Boncz

    Abstract: Powerful abstractions such as dataframes are only as efficient as their underlying runtime system. The de-facto distributed data processing framework, Apache Spark, is poorly suited for the modern cloud-based data-science workloads due to its outdated assumptions: static datasets analyzed using coarse-grained transformations. In this paper, we introduce the Indexed DataFrame, an in-memory cache th…

    Submitted 8 February, 2022; v1 submitted 12 December, 2021; originally announced December 2021.

    Comments: Accepted for publication at IEEE IPDPS 2022

  7. arXiv:2105.15111

    cs.CY physics.soc-ph

    An Epidemiological Model for contact tracing with the Dutch CoronaMelder App

    Authors: Peter Boncz

    Abstract: We present an epidemiological model for the effectiveness of CoronaMelder, the Dutch digital contact tracing app developed on top of the Google/Apple Exposure Notification framework. We compare the effectiveness of CoronaMelder with manual contact tracing on a number of metrics. CoronaMelder turns out to have a small but noticeable positive influence in slowing down the COVID-19 pandemic, an effe…

    Submitted 10 June, 2021; v1 submitted 18 May, 2021; originally announced May 2021.

    Comments: This is a first and preliminary draft. Future updates are expected

  8. arXiv:2012.06171

    cs.DC cs.DB

    The Future is Big Graphs! A Community View on Graph Processing Systems

    Authors: Sherif Sakr, Angela Bonifati, Hannes Voigt, Alexandru Iosup, Khaled Ammar, Renzo Angles, Walid Aref, Marcelo Arenas, Maciej Besta, Peter A. Boncz, Khuzaima Daudjee, Emanuele Della Valle, Stefania Dumbrava, Olaf Hartig, Bernhard Haslhofer, Tim Hegeman, Jan Hidders, Katja Hose, Adriana Iamnitchi, Vasiliki Kalavri, Hugo Kapp, Wim Martens, M. Tamer Özsu, Eric Peukert, Stefan Plantikow, et al. (16 additional authors not shown)

    Abstract: Graphs are by nature unifying abstractions that can leverage interconnectedness to represent, explore, predict, and explain real- and digital-world phenomena. Although real users and consumers of graph instances and graph workloads understand these abstractions, future problems will require new abstractions and systems. What needs to happen in the next decade for big graph processing to continue t…

    Submitted 11 December, 2020; originally announced December 2020.

    Comments: 12 pages, 3 figures, collaboration between the large-scale systems and data management communities, work started at the Dagstuhl Seminar 19491 on Big Graph Processing Systems, to be published in the Communications of the ACM

    ACM Class: C.3; E.0; H.2; J.0

  9. arXiv:2011.15028

    cs.DC cs.DB

    The LDBC Graphalytics Benchmark

    Authors: Alexandru Iosup, Ahmed Musaafir, Alexandru Uta, Arnau Prat Pérez, Gábor Szárnyas, Hassan Chafi, Ilie Gabriel Tănase, Lifeng Nai, Michael Anderson, Mihai Capotă, Narayanan Sundaram, Peter Boncz, Siegfried Depner, Stijn Heldens, Thomas Manhardt, Tim Hegeman, Wing Lung Ngai, Yinglong Xia

    Abstract: In this document, we describe LDBC Graphalytics, an industrial-grade benchmark for graph analysis platforms. The main goal of Graphalytics is to enable the fair and objective comparison of graph analysis platforms. Due to the diversity of bottlenecks and performance issues such platforms need to address, Graphalytics consists of a set of selected deterministic algorithms for full-graph analysis, s…

    Submitted 6 April, 2023; v1 submitted 30 November, 2020; originally announced November 2020.

    ACM Class: C.4; H.2.4

  10. arXiv:2001.02299

    cs.DB cs.PF cs.SI

    The LDBC Social Network Benchmark

    Authors: Renzo Angles, János Benjamin Antal, Alex Averbuch, Altan Birler, Peter Boncz, Márton Búr, Orri Erling, Andrey Gubichev, Vlad Haprian, Moritz Kaufmann, Josep Lluís Larriba Pey, Norbert Martínez, József Marton, Marcus Paradies, Minh-Duc Pham, Arnau Prat-Pérez, David Püroja, Mirko Spasić, Benjamin A. Steer, Dávid Szakállas, Gábor Szárnyas, Jack Waudby, Mingxi Wu, Yuchen Zhang

    Abstract: The Linked Data Benchmark Council's Social Network Benchmark (LDBC SNB) is an effort intended to test various functionalities of systems used for graph-like data management. For this, LDBC SNB uses the recognizable scenario of operating a social network, characterized by its graph-shaped data. LDBC SNB consists of two workloads that focus on different functionalities: the Interactive workload (int…

    Submitted 14 January, 2024; v1 submitted 7 January, 2020; originally announced January 2020.

    Comments: For the repository containing the source code of this technical report, see https://github.com/ldbc/ldbc_snb_docs

    ACM Class: H.2.4

  11. arXiv:1907.00083

    cs.IR cs.DB

    Extracting Novel Facts from Tables for Knowledge Graph Completion (Extended version)

    Authors: Benno Kruit, Peter Boncz, Jacopo Urbani

    Abstract: We propose a new end-to-end method for extending a Knowledge Graph (KG) from tables. Existing techniques tend to interpret tables by focusing on information that is already in the KG, and therefore tend to extract many redundant facts. Our method aims to find more novel facts. We introduce a new technique for table interpretation based on a scalable graphical model using entity similarities. Our m…

    Submitted 15 July, 2019; v1 submitted 28 June, 2019; originally announced July 2019.

  12. arXiv:1904.08223

    cs.DB

    Estimating Cardinalities with Deep Sketches

    Authors: Andreas Kipf, Dimitri Vorona, Jonas Müller, Thomas Kipf, Bernhard Radke, Viktor Leis, Peter Boncz, Thomas Neumann, Alfons Kemper

    Abstract: We introduce Deep Sketches, which are compact models of databases that allow us to estimate the result sizes of SQL queries. Deep Sketches are powered by a new deep learning approach to cardinality estimation that can capture correlations between columns, even across tables. Our demonstration allows users to define such sketches on the TPC-H and IMDb datasets, monitor the training process, and run…

    Submitted 17 April, 2019; originally announced April 2019.

    Comments: To appear in SIGMOD'19

  13. arXiv:1809.00677

    cs.DB

    Learned Cardinalities: Estimating Correlated Joins with Deep Learning

    Authors: Andreas Kipf, Thomas Kipf, Bernhard Radke, Viktor Leis, Peter Boncz, Alfons Kemper

    Abstract: We describe a new deep learning approach to cardinality estimation. MSCN is a multi-set convolutional network, tailored to representing relational query plans, that employs set semantics to capture query features and true cardinalities. MSCN builds on sampling-based estimation, addressing its weaknesses when no sampled tuples qualify a predicate, and in capturing join-crossing correlations. Our ev…

    Submitted 18 December, 2018; v1 submitted 3 September, 2018; originally announced September 2018.

    Comments: CIDR 2019. https://github.com/andreaskipf/learnedcardinalities

  14. arXiv:1802.09488

    cs.DB

    Adaptive Geospatial Joins for Modern Hardware

    Authors: Andreas Kipf, Harald Lang, Varun Pandey, Raul Alexandru Persa, Peter Boncz, Thomas Neumann, Alfons Kemper

    Abstract: Geospatial joins are a core building block of connected mobility applications. An especially challenging problem is the join between streaming points and static polygons. Since points are not known beforehand, they cannot be indexed. Nevertheless, points need to be mapped to polygons with low latencies to enable real-time feedback. We present an adaptive geospatial join that uses true hit filterin…

    Submitted 26 February, 2018; originally announced February 2018.

  15. arXiv:1712.01550

    cs.DB

    G-CORE: A Core for Future Graph Query Languages

    Authors: Renzo Angles, Marcelo Arenas, Pablo Barceló, Peter Boncz, George H. L. Fletcher, Claudio Gutierrez, Tobias Lindaaker, Marcus Paradies, Stefan Plantikow, Juan Sequeda, Oskar van Rest, Hannes Voigt

    Abstract: We report on a community effort between industry and academia to shape the future of graph query languages. We argue that existing graph database management systems should consider supporting a query language with two key characteristics. First, it should be composable, meaning that graphs are the input and the output of queries. Second, the graph query language should treat paths as first-class…

    Submitted 6 December, 2017; v1 submitted 5 December, 2017; originally announced December 2017.

  16. arXiv:1208.4170

    cs.DB

    From Cooperative Scans to Predictive Buffer Management

    Authors: Michał Świtakowski, Peter Boncz, Marcin Żukowski

    Abstract: In analytical applications, database systems often need to sustain workloads with multiple concurrent scans hitting the same table. The Cooperative Scans (CScans) framework, which introduces an Active Buffer Manager (ABM) component into the database architecture, has been the most effective and elaborate response to this problem, and was initially developed in the X100 research prototype. We now r…

    Submitted 20 August, 2012; originally announced August 2012.

    Comments: VLDB2012

    Journal ref: Proceedings of the VLDB Endowment (PVLDB), Vol. 5, No. 12, pp. 1759-1770 (2012)