Trevor: Automatic configuration and scaling of stream processing pipelines

Bansal, Manu; Cidon, Eyal; Balasingam, Arjun; Gudipati, Aditya; Kozyrakis, Christos; Katti, Sachin


arXiv:1812.09442 (cs)

[Submitted on 22 Dec 2018]

Abstract: Operating a distributed data stream processing workload efficiently at scale is hard. The operator must parallelize the workload's tasks and place them on resources that meet the requirements of the target data rate. The challenge is that neither the operator nor the programmer typically knows how the workload scales as a function of resources. Instead, an operator manually searches for a safe operating point that can handle the predicted peak load and deploys with ample headroom to absorb unpredictable spikes. Such empirical, static over-provisioning wastes both compute and human resources. We show that precise performance models can be learned automatically for distributed stream processing systems, and that these models can predict the execution performance of a job even before deployment. Further, the models can be used to optimally schedule logically specified jobs onto available physical hardware. Finally, the models and the derived execution schedules can be refined online to adapt dynamically to unpredictable changes in the runtime environment or to auto-scale with variations in job load.
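To make the cost of static over-provisioning concrete, the sizing arithmetic the abstract criticizes can be sketched as below. This is only an illustration under strong assumptions and is not Trevor's performance model: it assumes throughput scales linearly with the number of task instances, and the function name, the per-instance rate, and the headroom fraction are all hypothetical.

```python
import math


def instances_needed(target_rate, per_instance_rate, headroom=0.0):
    """Return the number of parallel task instances needed for a target rate.

    Assumes (hypothetically) that each instance sustains `per_instance_rate`
    events/s and that throughput scales linearly with instance count.
    `headroom` is the fractional safety margin added on top of the target.
    """
    if per_instance_rate <= 0:
        raise ValueError("per_instance_rate must be positive")
    if target_rate < 0 or headroom < 0:
        raise ValueError("target_rate and headroom must be non-negative")
    return math.ceil(target_rate * (1.0 + headroom) / per_instance_rate)


# Hypothetical numbers: 100k events/s predicted peak, 12k events/s per instance.
exact = instances_needed(100_000, 12_000)                 # sized to the peak only
padded = instances_needed(100_000, 12_000, headroom=0.5)  # 50% static headroom
print(exact, padded, padded - exact)
```

With these made-up numbers, the 50% headroom adds four instances beyond what the predicted peak requires. The abstract's argument is that a learned model, refined online, could reduce this kind of fixed margin.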

Subjects:	Distributed, Parallel, and Cluster Computing (cs.DC)
Cite as:	arXiv:1812.09442 [cs.DC]
	(or arXiv:1812.09442v1 [cs.DC] for this version)
	https://doi.org/10.48550/arXiv.1812.09442

Submission history

From: Eyal Cidon
[v1] Sat, 22 Dec 2018 03:15:41 UTC (1,637 KB)
