Showing 1–4 of 4 results for author: Polyntsov, M

  1. arXiv:2308.08702  [pdf, other]

    cs.DB cs.PF

    Finding a Second Wind: Speeding Up Graph Traversal Queries in RDBMSs Using Column-Oriented Processing

    Authors: Mikhail Firsov, Michael Polyntsov, Kirill Smirnov, George Chernishev

    Abstract: Recursive queries and recursive derived tables constitute an important part of the SQL standard. Their efficient processing is important for many real-life applications that rely on graph or hierarchy traversal. Position-enabled column-stores offer a novel opportunity to improve run times for this type of queries. Such systems allow the engine to explicitly use data positions (row ids) inside its…

    Submitted 16 August, 2023; originally announced August 2023.

    ACM Class: H.2.4; E.2

  2. arXiv:2307.14935  [pdf, ps, other]

    cs.DB cs.AI cs.CE cs.LG

    Solving Data Quality Problems with Desbordante: a Demo

    Authors: George Chernishev, Michael Polyntsov, Anton Chizhov, Kirill Stupakov, Ilya Shchuckin, Alexander Smirnov, Maxim Strutovsky, Alexey Shlyonskikh, Mikhail Firsov, Stepan Manannikov, Nikita Bobrov, Daniil Goncharov, Ilia Barutkin, Vladislav Shalnev, Kirill Muraviev, Anna Rakhmukova, Dmitriy Shcheka, Anton Chernikov, Mikhail Vyrodov, Yaroslav Kurbatov, Maxim Fofanov, Sergei Belokonnyi, Pavel Anosov, Arthur Saliou, Eduard Gaisin, et al. (1 additional author not shown)

    Abstract: Data profiling is an essential process in modern data-driven industries. One of its critical components is the discovery and validation of complex statistics, including functional dependencies, data constraints, association rules, and others. However, most existing data profiling systems that focus on complex statistics do not provide proper integration with the tools used by contemporary data s…

    Submitted 28 July, 2023; v1 submitted 27 July, 2023; originally announced July 2023.

    ACM Class: H.3; I.5; J.0

  3. arXiv:2301.05965  [pdf, other]

    cs.DB cs.AI cs.LG

    Desbordante: from benchmarking suite to high-performance science-intensive data profiler (preprint)

    Authors: George Chernishev, Michael Polyntsov, Anton Chizhov, Kirill Stupakov, Ilya Shchuckin, Alexander Smirnov, Maxim Strutovsky, Alexey Shlyonskikh, Mikhail Firsov, Stepan Manannikov, Nikita Bobrov, Daniil Goncharov, Ilia Barutkin, Vladislav Shalnev, Kirill Muraviev, Anna Rakhmukova, Dmitriy Shcheka, Anton Chernikov, Dmitrii Mandelshtam, Mikhail Vyrodov, Arthur Saliou, Eduard Gaisin, Kirill Smirnov

    Abstract: Pioneering data profiling systems such as Metanome and OpenClean brought public attention to science-intensive data profiling. This type of profiling aims to extract complex patterns (primitives) such as functional dependencies, data constraints, association rules, and others. However, these tools are research prototypes rather than production-ready systems. The following work presents Desbordan…

    Submitted 14 January, 2023; originally announced January 2023.

    ACM Class: H.3; I.5; J.0

  4. arXiv:2207.12713  [pdf, other]

    cs.DB cs.DS cs.PF

    Implementing the Comparison-Based External Sort

    Authors: Michael Polyntsov, Valentin Grigorev, Kirill Smirnov, George Chernishev

    Abstract: In the age of big data, sorting is an indispensable operation for DBMSes and similar systems. Having data sorted can help produce query plans with significantly lower run times. It also can provide other benefits like having non-blocking operators which will produce data steadily (without bursts), or operators with reduced memory footprint. Sorting may be required on any step of query processing…

    Submitted 26 July, 2022; originally announced July 2022.

    ACM Class: H.2; E.5