Showing 1–7 of 7 results for author: Garofalakis, M

  1. arXiv:2309.06051  [pdf, other]

    cs.DB

    OmniSketch: Efficient Multi-Dimensional High-Velocity Stream Analytics with Arbitrary Predicates

    Authors: Wieger R. Punter, Odysseas Papapetrou, Minos Garofalakis

    Abstract: A key need in different disciplines is to perform analytics over fast-paced data streams, similar in nature to the traditional OLAP analytics in relational databases, i.e., with filters and aggregates. Storing unbounded streams, however, is not a realistic or desired approach due to the high storage requirements and the delays introduced when storing massive data. Accordingly, many synopses/sketc…

    Submitted 12 September, 2023; originally announced September 2023.

  2. arXiv:2111.08784  [pdf, other]

    cs.CR cs.DB cs.DS

    Improved Pan-Private Stream Density Estimation

    Authors: Vassilis Digalakis Jr, George N. Karystinos, Minos N. Garofalakis

    Abstract: Differential privacy is a rigorous definition for privacy that guarantees that any analysis performed on a sensitive dataset leaks no information about the individuals whose data are contained therein. In this work, we develop new differentially private algorithms to analyze streaming data. Specifically, we consider the problem of estimating the density of a stream of users (or, more generally, el…

    Submitted 16 November, 2021; originally announced November 2021.

  3. arXiv:1403.7729  [pdf, ps, other]

    cs.DB

    Multi-Resource Parallel Query Scheduling and Optimization

    Authors: Minos Garofalakis, Yannis Ioannidis

    Abstract: Scheduling query execution plans is a particularly complex problem in shared-nothing parallel systems, where each site consists of a collection of local time-shared (e.g., CPU(s) or disk(s)) and space-shared (e.g., memory) resources and communicates with remote sites by message-passing. Earlier work on parallel query scheduling employs either (a) one-dimensional models of parallel task scheduling,…

    Submitted 30 March, 2014; originally announced March 2014.

    Comments: 50 pages; Conference version of the paper has appeared in the Proceedings of the 23rd International Conference on Very Large Databases (VLDB'1997), Athens, Greece, August 1997

  4. arXiv:1301.2267  [pdf]

    cs.AI cs.DS

    Efficient Stepwise Selection in Decomposable Models

    Authors: Amol Deshpande, Minos Garofalakis, Michael I. Jordan

    Abstract: In this paper, we present an efficient way of performing stepwise selection in the class of decomposable models. The main contribution of the paper is a simple characterization of the edges that can be added to a decomposable model while keeping the resulting model decomposable and an efficient algorithm for enumerating all such edges for a given model in essentially O(1) time per edge. We also dis…

    Submitted 10 January, 2013; originally announced January 2013.

    Comments: Appears in Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence (UAI2001)

    Report number: UAI-P-2001-PG-128-135

  5. arXiv:1207.0139  [pdf, other]

    cs.DB

    Sketch-based Querying of Distributed Sliding-Window Data Streams

    Authors: Odysseas Papapetrou, Minos Garofalakis, Antonios Deligiannakis

    Abstract: While traditional data-management systems focus on evaluating single, ad-hoc queries over static data sets in a centralized setting, several emerging applications require (possibly, continuous) answers to queries on dynamic data that is widely distributed and constantly updated. Furthermore, such query answers often need to discount data that is "stale", and operate solely on a sliding window of r…

    Submitted 30 June, 2012; originally announced July 2012.

    Comments: VLDB2012

    Journal ref: Proceedings of the VLDB Endowment (PVLDB), Vol. 5, No. 10, pp. 992-1003 (2012)

  6. arXiv:1103.2410  [pdf]

    cs.DB

    Large-Scale Collective Entity Matching

    Authors: Vibhor Rastogi, Nilesh Dalvi, Minos Garofalakis

    Abstract: There have been several recent advancements in the Machine Learning community on the Entity Matching (EM) problem. However, their lack of scalability has prevented them from being applied in practical settings on large real-life datasets. Towards this end, we propose a principled framework to scale any generic EM algorithm. Our technique consists of running multiple instances of the EM algorithm on sm…

    Submitted 11 March, 2011; originally announced March 2011.

    Comments: VLDB2011

    Journal ref: Proceedings of the VLDB Endowment (PVLDB), Vol. 4, No. 4, pp. 208-218 (2011)

  7. arXiv:0806.1071  [pdf, other]

    cs.DB

    Histograms and Wavelets on Probabilistic Data

    Authors: Graham Cormode, Minos Garofalakis

    Abstract: There is a growing realization that uncertain information is a first-class citizen in modern database management. As such, we need techniques to correctly and efficiently process uncertain data in database systems. In particular, data reduction techniques that can produce concise, accurate synopses of large probabilistic relations are crucial. Similar to their deterministic relation counterparts…

    Submitted 5 June, 2008; originally announced June 2008.