Showing 1–43 of 43 results for author: Olteanu, D

  1. arXiv:2404.17679

    cs.DB cs.DS

    Recent Increments in Incremental View Maintenance

    Authors: Dan Olteanu

    Abstract: We overview recent progress on the longstanding problem of incremental view maintenance (IVM), with a focus on the fine-grained complexity and optimality of IVM for classes of conjunctive queries. This theoretical progress guided the development of IVM engines that reported practical benefits in academic papers and industrial settings. When taken in isolation, each of the reported advancements is…

    Submitted 26 April, 2024; originally announced April 2024.

    Comments: 18 pages, 7 figures, Gems of PODS 2024

  2. arXiv:2404.16224

    cs.DB

    Tractable Conjunctive Queries over Static and Dynamic Relations

    Authors: Ahmet Kara, Zheng Luo, Milos Nikolic, Dan Olteanu, Haozhe Zhang

    Abstract: We investigate the evaluation of conjunctive queries over static and dynamic relations. While static relations are given as input and do not change, dynamic relations are subject to inserts and deletes. We characterise syntactically three classes of queries that admit constant update time and constant enumeration delay. We call such queries tractable. Depending on the class, the preprocessing ti…

    Submitted 24 April, 2024; originally announced April 2024.

    ACM Class: H.2.4

  3. arXiv:2312.09331

    cs.DB

    Insert-Only versus Insert-Delete in Dynamic Query Evaluation

    Authors: Mahmoud Abo Khamis, Ahmet Kara, Dan Olteanu, Dan Suciu

    Abstract: We study the dynamic query evaluation problem: Given a join query Q and a sequence of updates, we would like to construct a data structure that supports constant-delay enumeration of the query output after each update. We show that a sequence of N insert-only updates (to an initially empty database) can be executed in total time O(N^{w(Q)}), where w(Q) is the fractional hypertree width of Q. Thi…

    Submitted 8 June, 2024; v1 submitted 14 December, 2023; originally announced December 2023.

  4. arXiv:2308.05588

    cs.DB

    Banzhaf Values for Facts in Query Answering

    Authors: Omer Abramovich, Daniel Deutch, Nave Frost, Ahmet Kara, Dan Olteanu

    Abstract: Quantifying the contribution of database facts to query answers has been studied as means of explanation. The Banzhaf value, originally developed in Game Theory, is a natural measure of fact contribution, yet its efficient computation for select-project-join-union queries is challenging. In this paper, we introduce three algorithms to compute the Banzhaf value of database facts: an exact algorithm…

    Submitted 10 August, 2023; originally announced August 2023.

  5. arXiv:2307.16540

    cs.DB

    ADOPT: Adaptively Optimizing Attribute Orders for Worst-Case Optimal Join Algorithms via Reinforcement Learning

    Authors: Junxiong Wang, Immanuel Trummer, Ahmet Kara, Dan Olteanu

    Abstract: The performance of worst-case optimal join algorithms depends on the order in which the join attributes are processed. Selecting good orders before query execution is hard, due to the large space of possible orders and unreliable execution cost estimates in case of data skew or data correlation. We propose ADOPT, a query engine that combines adaptive query processing with a worst-case optimal join…

    Submitted 31 July, 2023; originally announced July 2023.

    ACM Class: H.3

  6. arXiv:2306.14211

    cs.DB cs.CC cs.LO

    From Shapley Value to Model Counting and Back

    Authors: Ahmet Kara, Dan Olteanu, Dan Suciu

    Abstract: In this paper we investigate the problem of quantifying the contribution of each variable to the satisfying assignments of a Boolean function based on the Shapley value. Our main result is a polynomial-time equivalence between computing Shapley values and model counting for any class of Boolean functions that are closed under substitutions of variables with disjunctions of fresh variables. This…

    Submitted 25 June, 2023; originally announced June 2023.

    Comments: 22 pages

    ACM Class: F.4.1; F.2; H.2

  7. arXiv:2306.14075

    cs.DB cs.IT

    Join Size Bounds using Lp-Norms on Degree Sequences

    Authors: Mahmoud Abo Khamis, Vasileios Nakos, Dan Olteanu, Dan Suciu

    Abstract: Estimating the output size of a query is a fundamental yet longstanding problem in database query processing. Traditional cardinality estimators used by database systems can routinely underestimate the true output size by orders of magnitude, which leads to significant system performance penalty. Recently, upper bounds have been proposed that are based on information inequalities and incorporate s…

    Submitted 5 June, 2024; v1 submitted 24 June, 2023; originally announced June 2023.

  8. arXiv:2306.09610

    cs.DB cs.LG

    CHORUS: Foundation Models for Unified Data Discovery and Exploration

    Authors: Moe Kayali, Anton Lykov, Ilias Fountalis, Nikolaos Vasiloglou, Dan Olteanu, Dan Suciu

    Abstract: We apply foundation models to data discovery and exploration tasks. Foundation models include large language models (LLMs) that show promising performance on a range of diverse tasks unrelated to their training. We show that these models are highly applicable to the data discovery and data exploration domain. When carefully used, they have superior capability on three representative tasks: table-c…

    Submitted 5 April, 2024; v1 submitted 15 June, 2023; originally announced June 2023.

    Comments: To appear in VLDB 2024

  9. F-IVM: Analytics over Relational Databases under Updates

    Authors: Ahmet Kara, Milos Nikolic, Dan Olteanu, Haozhe Zhang

    Abstract: This article describes F-IVM, a unified approach for maintaining analytics over changing relational data. We exemplify its versatility in four disciplines: processing queries with group-by aggregates and joins; learning linear regression models using the covariance matrix of the input features; building Chow-Liu trees using pairwise mutual information of the input features; and matrix chain multip…

    Submitted 29 January, 2024; v1 submitted 15 March, 2023; originally announced March 2023.

  10. arXiv:2206.09032

    cs.DB

    Conjunctive Queries with Free Access Patterns under Updates

    Authors: Ahmet Kara, Milos Nikolic, Dan Olteanu, Haozhe Zhang

    Abstract: We study the problem of answering conjunctive queries with free access patterns (CQAP) under updates. A free access pattern is a partition of the free variables of the query into input and output. The query returns tuples over the output variables given a tuple of values over the input variables. We introduce a fully dynamic evaluation approach for CQAP queries. We also give a syntactic characte…

    Submitted 14 February, 2024; v1 submitted 17 June, 2022; originally announced June 2022.

    Comments: Extended and polished version. Added new Section 11 on the dynamic evaluation of conjunctive queries with free access patterns over probabilistic databases

    ACM Class: H.2.4

  11. arXiv:2204.00525

    cs.DB

    Givens Rotations for QR Decomposition, SVD and PCA over Database Joins

    Authors: Dan Olteanu, Nils Vortmeier, Đorđe Živanović

    Abstract: This article introduces Figaro, an algorithm for computing the upper-triangular matrix in the QR decomposition of the matrix defined by the natural join over relational data. Figaro's main novelty is that it pushes the QR decomposition past the join. This leads to several desirable properties. For acyclic joins, it takes time linear in the database size and independent of the join size. Its execut…

    Submitted 16 October, 2023; v1 submitted 1 April, 2022; originally announced April 2022.

  12. arXiv:2107.13923

    cs.DB

    Machine Learning over Static and Dynamic Relational Data

    Authors: Ahmet Kara, Milos Nikolic, Dan Olteanu, Haozhe Zhang

    Abstract: This tutorial overviews principles behind recent works on training and maintaining machine learning models over relational data, with an emphasis on the exploitation of the relational data structure to improve the runtime performance of the learning task. The tutorial has the following parts: 1) Database research for data science 2) Three main ideas to achieve performance improvements 2.1)…

    Submitted 29 July, 2021; originally announced July 2021.

    Comments: arXiv admin note: text overlap with arXiv:2008.07864

  13. arXiv:2106.13342

    cs.DB cs.DS

    The Complexity of Boolean Conjunctive Queries with Intersection Joins

    Authors: Mahmoud Abo Khamis, George Chichirim, Antonia Kormpa, Dan Olteanu

    Abstract: Intersection joins over interval data are relevant in spatial and temporal data settings. A set of intervals join if their intersection is non-empty. In case of point intervals, the intersection join becomes the standard equality join. We establish the complexity of Boolean conjunctive queries with intersection joins by a many-one equivalence to disjunctions of Boolean conjunctive queries with e…

    Submitted 14 April, 2022; v1 submitted 24 June, 2021; originally announced June 2021.

  14. arXiv:2103.06376

    cs.PL cs.DB cs.LG

    Functional Collection Programming with Semi-Ring Dictionaries

    Authors: Amir Shaikhha, Mathieu Huot, Jaclyn Smith, Dan Olteanu

    Abstract: This paper introduces semi-ring dictionaries, a powerful class of compositional and purely functional collections that subsume other collection types such as sets, multisets, arrays, vectors, and matrices. We developed SDQL, a statically typed language that can express relational algebra with aggregations, linear algebra, and functional collections over data such as relations and matrices using se…

    Submitted 22 March, 2022; v1 submitted 10 March, 2021; originally announced March 2021.

  15. arXiv:2008.08657

    cs.DB

    LMFAO: An Engine for Batches of Group-By Aggregates

    Authors: Maximilian Schleich, Dan Olteanu

    Abstract: LMFAO is an in-memory optimization and execution engine for large batches of group-by aggregates over joins. Such database workloads capture the data-intensive computation of a variety of data science applications. We demonstrate LMFAO for three popular models: ridge linear regression with batch gradient descent, decision trees with CART, and clustering with Rk-means.

    Submitted 19 August, 2020; originally announced August 2020.

    Comments: 4 pages, 4 figures

  16. arXiv:2008.07864

    cs.DB cs.LG

    The Relational Data Borg is Learning

    Authors: Dan Olteanu

    Abstract: This paper overviews an approach that addresses machine learning over relational data as a database problem. This is justified by two observations. First, the input to the learning task is commonly the result of a feature extraction query over the relational data. Second, the learning task requires the computation of group-by aggregates. This approach has been already investigated for a number o…

    Submitted 18 August, 2020; originally announced August 2020.

    Comments: 14 pages, 11 figures, VLDB 2020 keynote

  17. arXiv:2006.00694

    cs.DB

    F-IVM: Learning over Fast-Evolving Relational Data

    Authors: Milos Nikolic, Haozhe Zhang, Ahmet Kara, Dan Olteanu

    Abstract: F-IVM is a system for real-time analytics such as machine learning applications over training datasets defined by queries over fast-evolving relational databases. We will demonstrate F-IVM for three such applications: model selection, Chow-Liu trees, and ridge linear regression.

    Submitted 31 May, 2020; originally announced June 2020.

    Comments: SIGMOD DEMO 2020, 5 pages

  18. arXiv:2004.03716

    cs.DB

    Maintaining Triangle Queries under Updates

    Authors: Ahmet Kara, Milos Nikolic, Hung Q. Ngo, Dan Olteanu, Haozhe Zhang

    Abstract: We consider the problem of incrementally maintaining the triangle queries with arbitrary free variables under single-tuple updates to the input relations. We introduce an approach called IVM$^ε$ that exhibits a trade-off between the update time, the space, and the delay for the enumeration of the query result, such that the update time ranges from the square root to linear in the database size whi…

    Submitted 7 April, 2020; originally announced April 2020.

    Comments: 47 pages, 18 figures

    ACM Class: H.2.4

  19. arXiv:2001.03541

    cs.PL cs.DB cs.LG

    Multi-layer Optimizations for End-to-End Data Analytics

    Authors: Amir Shaikhha, Maximilian Schleich, Alexandru Ghita, Dan Olteanu

    Abstract: We consider the problem of training machine learning models over multi-relational data. The mainstream approach is to first construct the training dataset using a feature extraction query over input database and then use a statistical software package of choice to train the model. In this paper we introduce Iterative Functional Aggregate Queries (IFAQ), a framework that realizes an alternative app…

    Submitted 10 January, 2020; originally announced January 2020.

  20. arXiv:1912.11098

    cs.DB

    Towards Deterministic Decomposable Circuits for Safe Queries

    Authors: Mikaël Monet, Dan Olteanu

    Abstract: There exist two approaches for exact probabilistic inference of UCQs on tuple-independent databases. In the extensional approach, query evaluation is performed within a DBMS by exploiting the structure of the query. In the intensional approach, one first builds a representation of the lineage of the query on the database, then computes the probability of the lineage. In this paper we propose a new…

    Submitted 23 December, 2019; originally announced December 2019.

    Comments: 10 pages. Appeared in the workshop AMW'18

  21. arXiv:1911.06577

    cs.DB

    Learning Models over Relational Data: A Brief Tutorial

    Authors: Maximilian Schleich, Dan Olteanu, Mahmoud Abo-Khamis, Hung Q. Ngo, XuanLong Nguyen

    Abstract: This tutorial overviews the state of the art in learning models over relational databases and makes the case for a first-principles approach that exploits recent developments in database research. The input to learning classification and regression models is a training dataset defined by feature extraction queries over relational databases. The mainstream approach to learning over relational dat…

    Submitted 15 November, 2019; originally announced November 2019.

    Comments: 10 pages, 1 figure

    ACM Class: H.2.4; I.2.6

  22. arXiv:1910.04939

    cs.LG cs.DB stat.ML

    Rk-means: Fast Clustering for Relational Data

    Authors: Ryan Curtin, Ben Moseley, Hung Q. Ngo, XuanLong Nguyen, Dan Olteanu, Maximilian Schleich

    Abstract: Conventional machine learning algorithms cannot be applied until a data matrix is available to process. When the data matrix needs to be obtained from a relational database via a feature extraction query, the computation cost can be prohibitive, as the data matrix may be (much) larger than the total input relation size. This paper introduces Rk-means, or relational k-means algorithm, for clusteri…

    Submitted 10 October, 2019; originally announced October 2019.

  23. Trade-offs in Static and Dynamic Evaluation of Hierarchical Queries

    Authors: Ahmet Kara, Milos Nikolic, Dan Olteanu, Haozhe Zhang

    Abstract: We investigate trade-offs in static and dynamic evaluation of hierarchical queries with arbitrary free variables. In the static setting, the trade-off is between the time to partially compute the query result and the delay needed to enumerate its tuples. In the dynamic setting, we additionally consider the time needed to update the query result under single-tuple inserts or deletes to the database…

    Submitted 8 August, 2023; v1 submitted 3 July, 2019; originally announced July 2019.

    Journal ref: Logical Methods in Computer Science, Volume 19, Issue 3 (August 9, 2023) lmcs:10035

  24. arXiv:1906.08687

    cs.DB

    A Layered Aggregate Engine for Analytics Workloads

    Authors: Maximilian Schleich, Dan Olteanu, Mahmoud Abo Khamis, Hung Q. Ngo, XuanLong Nguyen

    Abstract: This paper introduces LMFAO (Layered Multiple Functional Aggregate Optimization), an in-memory optimization and execution engine for batches of aggregates over the input database. The primary motivation for this work stems from the observation that for a variety of analytics over databases, their data-intensive tasks can be decomposed into group-by aggregates over the join of the input database re…

    Submitted 20 June, 2019; originally announced June 2019.

    Comments: 18 pages, 7 figures, 4 tables

    ACM Class: H.2.4; I.2.6

  25. arXiv:1902.00585

    cs.DB

    Incremental Techniques for Large-Scale Dynamic Query Processing

    Authors: Iman Elghandour, Ahmet Kara, Dan Olteanu, Stijn Vansummeren

    Abstract: Many applications from various disciplines are now required to analyze fast evolving big data in real time. Various approaches for incremental processing of queries have been proposed over the years. Traditional approaches rely on updating the results of a query when updates are streamed rather than re-computing these queries, and therefore, higher execution performance is expected. However, they…

    Submitted 1 February, 2019; originally announced February 2019.

  26. arXiv:1812.09526

    cs.DB cs.DS cs.IT cs.LG

    Functional Aggregate Queries with Additive Inequalities

    Authors: Mahmoud Abo Khamis, Ryan R. Curtin, Benjamin Moseley, Hung Q. Ngo, XuanLong Nguyen, Dan Olteanu, Maximilian Schleich

    Abstract: Motivated by fundamental applications in databases and relational machine learning, we formulate and study the problem of answering functional aggregate queries (FAQ) in which some of the input factors are defined by a collection of additive inequalities between variables. We refer to these queries as FAQ-AI for short. To answer FAQ-AI in the Boolean semiring, we define relaxed tree decompositio…

    Submitted 15 September, 2020; v1 submitted 22 December, 2018; originally announced December 2018.

  27. arXiv:1804.02780

    cs.DB

    Counting Triangles under Updates in Worst-Case Optimal Time

    Authors: Ahmet Kara, Hung Q. Ngo, Milos Nikolic, Dan Olteanu, Haozhe Zhang

    Abstract: We consider the problem of incrementally maintaining the triangle count query under single-tuple updates to the input relations. We introduce an approach that exhibits a space-time tradeoff such that the space-time product is quadratic in the size of the input database and the update time can be as low as the square root of this size. This lowest update time is worst-case optimal conditioned on th…

    Submitted 25 March, 2019; v1 submitted 8 April, 2018; originally announced April 2018.

    Comments: simplified notation; incremental maintenance of full triangle query, 4-path count query, count queries with three relations added; improved the space complexity of the dynamic algorithm maintaining the triangle count query

    ACM Class: H.2.4

  28. arXiv:1803.07480

    cs.DB

    AC/DC: In-Database Learning Thunderstruck

    Authors: Mahmoud Abo Khamis, Hung Q. Ngo, XuanLong Nguyen, Dan Olteanu, Maximilian Schleich

    Abstract: We report on the design and implementation of the AC/DC gradient descent solver for a class of optimization problems over normalized databases. AC/DC decomposes an optimization problem into a set of aggregates over the join of the database relations. It then uses the answers to these aggregates to iteratively improve the solution to the problem until it converges. The challenges faced by AC/DC a…

    Submitted 15 June, 2018; v1 submitted 20 March, 2018; originally announced March 2018.

    Comments: 10 pages, 3 figures

    ACM Class: H.2.4; I.2.6

  29. arXiv:1712.07445

    cs.DB cs.DM math.CO

    Boolean Tensor Decomposition for Conjunctive Queries with Negation

    Authors: Mahmoud Abo Khamis, Hung Q. Ngo, Dan Olteanu, Dan Suciu

    Abstract: We propose an algorithm for answering conjunctive queries with negation, where the negated relations have bounded degree. Its data complexity matches that of the best known algorithms for the positive subquery of the input query and is expressed in terms of the fractional hypertree width and the submodular width. The query complexity depends on the structure of the negated subquery; in general it…

    Submitted 27 January, 2019; v1 submitted 20 December, 2017; originally announced December 2017.

  30. arXiv:1709.01600

    cs.DB

    Covers of Query Results

    Authors: Ahmet Kara, Dan Olteanu

    Abstract: We introduce succinct lossless representations of query results called covers. They are subsets of the query results that correspond to minimal edge covers in the hypergraphs of these results. We first study covers whose structures are given by fractional hypertree decompositions of join queries. For any decomposition of a query, we give asymptotically tight size bounds for the covers of the que…

    Submitted 10 January, 2018; v1 submitted 5 September, 2017; originally announced September 2017.

    Comments: 33 pages. Notation simplified

    MSC Class: 68P15

    ACM Class: H.2.1

  31. arXiv:1703.07484

    cs.DB

    Incremental View Maintenance with Triple Lock Factorization Benefits

    Authors: Milos Nikolic, Dan Olteanu

    Abstract: We introduce F-IVM, a unified incremental view maintenance (IVM) approach for a variety of tasks, including gradient computation for learning linear regression models over joins, matrix chain multiplication, and factorized evaluation of conjunctive queries. F-IVM is a higher-order IVM algorithm that reduces the maintenance of the given task to the maintenance of a hierarchy of increasingly simpl…

    Submitted 28 February, 2018; v1 submitted 21 March, 2017; originally announced March 2017.

    Comments: 27 pages, 13 figures, a shorter version appeared in SIGMOD 2018

    ACM Class: H.2.4

  32. arXiv:1703.04780

    cs.DB

    Learning Models over Relational Data using Sparse Tensors and Functional Dependencies

    Authors: Mahmoud Abo Khamis, Hung Q. Ngo, XuanLong Nguyen, Dan Olteanu, Maximilian Schleich

    Abstract: Integrated solutions for analytics over relational databases are of great practical importance as they avoid the costly repeated loop data scientists have to deal with on a daily basis: select features from data residing in relational databases using feature extraction queries involving joins, projections, and aggregations; export the training dataset defined by such queries; convert this dataset…

    Submitted 6 February, 2020; v1 submitted 14 March, 2017; originally announced March 2017.

    Comments: 61 pages, 9 figures, 2 tables

    ACM Class: H.2.4; I.2.6

  33. arXiv:1412.2221

    cs.DB cs.AI cs.PL

    Declarative Statistical Modeling with Datalog

    Authors: Vince Barany, Balder ten Cate, Benny Kimelfeld, Dan Olteanu, Zografoula Vagena

    Abstract: Formalisms for specifying statistical models, such as probabilistic-programming languages, typically consist of two components: a specification of a stochastic process (the prior), and a specification of observations that restrict the probability space to a conditional subspace (the posterior). Use cases of such formalisms include the development of algorithms in machine learning and artificial in…

    Submitted 5 January, 2015; v1 submitted 6 December, 2014; originally announced December 2014.

    Comments: 14 pages, 4 figures

    ACM Class: F.1.2; G.3; H.2.3; H.2.4; H.2.8; I.2.3

  34. arXiv:1309.0373

    cs.DB

    ENFrame: A Platform for Processing Probabilistic Data

    Authors: Sebastiaan J. van Schaik, Dan Olteanu, Robert Fink

    Abstract: This paper introduces ENFrame, a unified data processing platform for querying and mining probabilistic data. Using ENFrame, users can write programs in a fragment of Python with constructs such as bounded-range loops, list comprehension, aggregate operations on lists, and calls to external database engines. The program is then interpreted probabilistically by ENFrame. The realisation of ENFrame…

    Submitted 2 September, 2013; originally announced September 2013.

    Comments: 12 pages

    ACM Class: H.2.4; H.2.8; H.3.5

  35. arXiv:1307.0441

    cs.DB cs.DS

    Aggregation and Ordering in Factorised Databases

    Authors: Nurzhan Bakibayev, Tomáš Kočiský, Dan Olteanu, Jakub Závodný

    Abstract: A common approach to data analysis involves understanding and manipulating succinct representations of data. In earlier work, we put forward a succinct representation system for relational data called factorised databases and reported on the main-memory query engine FDB for select-project-join queries on such databases. In this paper, we extend FDB to support a larger class of practical queries…

    Submitted 1 July, 2013; originally announced July 2013.

    Comments: 12 pages, 8 figures

  36. arXiv:1203.2672

    cs.DB cs.DS

    FDB: A Query Engine for Factorised Relational Databases

    Authors: Nurzhan Bakibayev, Dan Olteanu, Jakub Závodný

    Abstract: Factorised databases are relational databases that use compact factorised representations at the physical layer to reduce data redundancy and boost query performance. This paper introduces FDB, an in-memory query engine for select-project-join queries on factorised databases. Key components of FDB are novel algorithms for query optimisation and evaluation that exploit the succinctness brought by d…

    Submitted 12 March, 2012; originally announced March 2012.

    Comments: 12 pages, 9 figures

  37. arXiv:1201.6569

    cs.DB

    Aggregation in Probabilistic Databases via Knowledge Compilation

    Authors: Robert Fink, Larisa Han, Dan Olteanu

    Abstract: This paper presents a query evaluation technique for positive relational algebra queries with aggregates on a representation system for probabilistic data based on the algebraic structures of semiring and semimodule. The core of our evaluation technique is a procedure that compiles semimodule and semiring expressions into so-called decomposition trees, for which the computation of the probability…

    Submitted 31 January, 2012; originally announced January 2012.

    Comments: VLDB2012

    Journal ref: Proceedings of the VLDB Endowment (PVLDB), Vol. 5, No. 5, pp. 490-501 (2012)

  38. arXiv:1104.0867

    cs.DB cs.DS

    Factorised Representations of Query Results

    Authors: Dan Olteanu, Jakub Zavodny

    Abstract: Query tractability has been traditionally defined as a function of input database and query sizes, or of both input and output sizes, where the query result is represented as a bag of tuples. In this report, we introduce a framework that allows to investigate tractability beyond this setting. The key insight is that, although the cardinality of a query result can be exponential, its structure can…

    Submitted 5 April, 2011; originally announced April 2011.

    Comments: 44 pages, 13 figures

    ACM Class: H.2.3; H.2.4

  39. A semi-empirical simulation of the extragalactic radio continuum sky for next generation radio telescopes

    Authors: R. J. Wilman, L. Miller, M. J. Jarvis, T. Mauch, F. Levrier, F. B. Abdalla, S. Rawlings, H. -R. Kloeckner, D. Obreschkow, D. Olteanu, S. Young

    Abstract: We have developed a semi-empirical simulation of the extragalactic radio continuum sky suitable for aiding the design of next generation radio interferometers such as the Square Kilometre Array (SKA). The emphasis is on modelling the large-scale cosmological distribution of radio sources rather than the internal details of individual galaxies. Here we provide a description of the simulation to a…

    Submitted 22 May, 2008; originally announced May 2008.

    Comments: 15 pages; to appear in MNRAS

    Journal ref: Mon.Not.Roy.Astron.Soc.388:1335-1348,2008

  40. arXiv:0803.2212

    cs.DB cs.AI

    Conditioning Probabilistic Databases

    Authors: Christoph Koch, Dan Olteanu

    Abstract: Past research on probabilistic databases has studied the problem of answering queries on a static database. Application scenarios of probabilistic databases however often involve the conditioning of a database using additional information in the form of new evidence. The conditioning problem is thus to transform a probabilistic database of priors into a posterior probabilistic database which is…

    Submitted 16 June, 2008; v1 submitted 14 March, 2008; originally announced March 2008.

    Comments: 13 pages, 13 figures

    ACM Class: H.2.1; H.2.4

  41. arXiv:0707.1644

    cs.DB cs.PF

    Fast and Simple Relational Processing of Uncertain Data

    Authors: Lyublena Antova, Thomas Jansen, Christoph Koch, Dan Olteanu

    Abstract: This paper introduces U-relations, a succinct and purely relational representation system for uncertain databases. U-relations support attribute-level uncertainty using vertical partitioning. If we consider positive relational algebra extended by an operation for computing possible answers, a query on the logical level can be translated into, and evaluated as, a single relational algebra query o…

    Submitted 11 July, 2007; originally announced July 2007.

    Comments: 12 pages, 14 figures

    ACM Class: H.2.1; H.2.4

  42. arXiv:0705.4442

    cs.DB

    World-set Decompositions: Expressiveness and Efficient Algorithms

    Authors: Dan Olteanu, Christoph Koch, Lyublena Antova

    Abstract: Uncertain information is commonplace in real-world data management scenarios. The ability to represent large sets of possible instances (worlds) while supporting efficient storage and processing is an important challenge in this context. The recent formalism of world-set decompositions (WSDs) provides a space-efficient representation for uncertain data that also supports scalable processing. WSD…

    Submitted 9 January, 2008; v1 submitted 30 May, 2007; originally announced May 2007.

    Comments: 34 pages, 13 figures, extended version of ICDT'07 paper

    ACM Class: H.2.1; H.2.4

  43. arXiv:cs/0606075

    cs.DB

    10^(10^6) Worlds and Beyond: Efficient Representation and Processing of Incomplete Information

    Authors: Lyublena Antova, Christoph Koch, Dan Olteanu

    Abstract: Current systems and formalisms for representing incomplete information generally suffer from at least one of two weaknesses. Either they are not strong enough for representing results of simple queries, or the handling and processing of the data, e.g. for query evaluation, is intractable. In this paper, we present a decomposition-based approach to addressing this problem. We introduce world-se…

    Submitted 13 February, 2008; v1 submitted 16 June, 2006; originally announced June 2006.

    Comments: 17 pages, 24 figures

    ACM Class: H.2.1; H.2.4