Showing 1–2 of 2 results for author: Vincon, T

Search v0.5.6 released 2020-02-24

arXiv:1910.08023 [pdf, other]

cs.DB

MV-PBT: Multi-Version Index for Large Datasets and HTAP Workloads

Authors: Christian Riegger, Tobias Vincon, Robert Gottstein, Ilia Petrov

Abstract: Modern mixed (HTAP) workloads execute fast update-transactions and long-running analytical queries on the same dataset and system. In multi-version (MVCC) systems, such workloads result in many short-lived versions and long version-chains as well as in increased and frequent maintenance overhead. Consequently, the index pressure increases significantly. Firstly, the frequent modifications cause fr… ▽ More Modern mixed (HTAP) workloads execute fast update-transactions and long-running analytical queries on the same dataset and system. In multi-version (MVCC) systems, such workloads result in many short-lived versions and long version-chains as well as in increased and frequent maintenance overhead. Consequently, the index pressure increases significantly. Firstly, the frequent modifications cause frequent creation of new versions, yielding a surge in index maintenance overhead. Secondly and more importantly, index-scans incur extra I/O overhead to determine, which of the resulting tuple-versions are visible to the executing transaction (visibility-check) as current designs only store version/timestamp information in the base table -- not in the index. Such index-only visibility-check is critical for HTAP workloads on large datasets. In this paper we propose the Multi-Version Partitioned B-Tree (MV-PBT) as a version-aware index structure, supporting index-only visibility checks and flash-friendly I/O patterns. The experimental evaluation indicates a 2x improvement for analytical queries and 15% higher transactional throughput under HTAP workloads (CH-Benchmark). MV-PBT offers 40% higher transactional throughput compared to WiredTiger's LSM-Tree implementation under YCSB. △ Less

Submitted 17 October, 2019; originally announced October 2019.
arXiv:1905.04767 [pdf, other]

cs.DB

Moving Processing to Data: On the Influence of Processing in Memory on Data Management

Authors: Tobias Vincon, Andreas Koch, Ilia Petrov

Abstract: Near-Data Processing refers to an architectural hardware and software paradigm, based on the co-location of storage and compute units. Ideally, it will allow to execute application-defined data- or compute-intensive operations in-situ, i.e. within (or close to) the physical data storage. Thus, Near-Data Processing seeks to minimize expensive data movement, improving performance, scalability, and r… ▽ More Near-Data Processing refers to an architectural hardware and software paradigm, based on the co-location of storage and compute units. Ideally, it will allow to execute application-defined data- or compute-intensive operations in-situ, i.e. within (or close to) the physical data storage. Thus, Near-Data Processing seeks to minimize expensive data movement, improving performance, scalability, and resource-efficiency. Processing-in-Memory is a sub-class of Near-Data processing that targets data processing directly within memory (DRAM) chips. The effective use of Near-Data Processing mandates new architectures, algorithms, interfaces, and development toolchains. △ Less

Submitted 12 May, 2019; originally announced May 2019.

Search v0.5.6 released 2020-02-24