Distributed Dependency Discovery

Saxena, Hemant; Golab, Lukasz; Ilyas, Ihab F.

arXiv:1903.05228 (cs)

[Submitted on 12 Mar 2019]

Abstract: We analyze the problem of discovering dependencies from distributed big data. Existing (non-distributed) algorithms focus on minimizing computation by pruning the search space of possible dependencies. However, distributed algorithms must also optimize communication costs, especially in shared-nothing settings, leading to a more complex optimization space. To understand this space, we introduce six primitives shared by existing dependency discovery algorithms, corresponding to data processing steps separated by communication barriers. Through case studies, we show how the primitives allow us to analyze the design space and develop communication-optimized implementations. Finally, we support our analysis with an experimental evaluation on real datasets.

Subjects:	Databases (cs.DB)
Cite as:	arXiv:1903.05228 [cs.DB]
	(or arXiv:1903.05228v1 [cs.DB] for this version)
	https://doi.org/10.48550/arXiv.1903.05228

Submission history

From: Hemant Saxena
[v1] Tue, 12 Mar 2019 21:29:08 UTC (865 KB)
