Cascade: A Platform for Delay-Sensitive Edge Intelligence
Authors: Weijia Song, Thiago Garrett, Yuting Yang, Mingzhao Liu, Edward Tremel, Lorenzo Rosa, Andrea Merlina, Roman Vitenberg, Ken Birman
Abstract:
Interactive intelligent computing applications are increasingly prevalent, creating a need for AI/ML platforms optimized to reduce per-event latency while maintaining high throughput and efficient resource management. Yet many intelligent applications run on AI/ML platforms that optimize for high throughput even at the cost of high tail latency. Cascade is a new AI/ML hosting platform intended to untangle this puzzle. Innovations include a legacy-friendly storage layer that moves data with minimal copying and a "fast path" that collocates data and computation to maximize responsiveness. Our evaluation shows that Cascade reduces latency by orders of magnitude with no loss of throughput.
Submitted 28 November, 2023; originally announced November 2023.