Showing 1–7 of 7 results for author: Birman, K

  1. arXiv:2402.17652  [pdf, other]

    cs.DC cs.AI

    Compass: A Decentralized Scheduler for Latency-Sensitive ML Workflows

    Authors: Yuting Yang, Andrea Merlina, Weijia Song, Tiancheng Yuan, Ken Birman, Roman Vitenberg

    Abstract: We consider ML query processing in distributed systems where GPU-enabled workers coordinate to execute complex queries: a computing style often seen in applications that interact with users in support of image processing and natural language processing. In such systems, coscheduling of GPU memory management and task placement represents a promising opportunity. We propose Compass, a novel framewor…

    Submitted 28 February, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

  2. arXiv:2312.11488  [pdf, other]

    cs.DC cs.AI

    Low-Latency ML Inference by Grouping Correlated Data Objects and Computation

    Authors: Thiago Garrett, Weijia Song, Roman Vitenberg, Ken Birman

    Abstract: ML inference workflows often require low latency and high throughput, yet we lack good options for addressing this need. Techniques that reduce latency in other streaming settings (such as caching and optimization-driven scheduling) are of limited value because ML data dependencies are often very large and can change dramatically depending on the triggering event. In this work, we propose a novel…

    Submitted 30 November, 2023; originally announced December 2023.

  3. arXiv:2311.17329  [pdf, other]

    cs.OS cs.AI

    Cascade: A Platform for Delay-Sensitive Edge Intelligence

    Authors: Weijia Song, Thiago Garrett, Yuting Yang, Mingzhao Liu, Edward Tremel, Lorenzo Rosa, Andrea Merlina, Roman Vitenberg, Ken Birman

    Abstract: Interactive intelligent computing applications are increasingly prevalent, creating a need for AI/ML platforms optimized to reduce per-event latency while maintaining high throughput and efficient resource management. Yet many intelligent applications run on AI/ML platforms that optimize for high throughput even at the cost of high tail-latency. Cascade is a new AI/ML hosting platform intended to…

    Submitted 28 November, 2023; originally announced November 2023.

    Comments: 14 pages, 12 Figures

  4. Spindle: Techniques for Optimizing Atomic Multicast on RDMA

    Authors: Sagar Jha, Lorenzo Rosa, Ken Birman

    Abstract: Leveraging one-sided RDMA for applications that replicate small data objects can be surprisingly difficult: such uses amplify any protocol overheads. Spindle is a set of optimization techniques for systematically tackling this class of challenges for atomic multicast over RDMA. These include memory polling optimizations using novel sender and receiver batching techniques, null-message send logic,…

    Submitted 2 October, 2021; originally announced October 2021.

    Comments: 14 pages including 1 page for references. To be submitted to an IEEE journal potentially

  5. Cache Serializability: Reducing Inconsistency in Edge Transactions

    Authors: Ittay Eyal, Ken Birman, Robbert van Renesse

    Abstract: Read-only caches are widely used in cloud infrastructures to reduce access latency and load on backend databases. Operators view coherent caches as impractical at genuinely large scale and many client-facing caches are updated in an asynchronous manner with best-effort pipelines. Existing solutions that support cache consistency are inapplicable to this scenario since they require a round trip to…

    Submitted 26 April, 2015; v1 submitted 29 September, 2014; originally announced September 2014.

    Comments: Ittay Eyal, Ken Birman, Robbert van Renesse, "Cache Serializability: Reducing Inconsistency in Edge Transactions," Distributed Computing Systems (ICDCS), IEEE 35th International Conference on, June 29 - July 2, 2015

  6. arXiv:1404.6719  [pdf, other]

    cs.DC

    Practical Experience Report: The Performance of Paxos in the Cloud

    Authors: Parisa Jalili Marandi, Samuel Benz, Fernando Pedone, Ken Birman

    Abstract: This experience report presents the results of an extensive performance evaluation conducted using four open-source implementations of Paxos deployed in Amazon's EC2. Paxos is a fundamental algorithm for building fault-tolerant services, at the core of state-machine replication. Implementations of Paxos are currently used in many prototypes and production systems in both academia and industry. Alt…

    Submitted 27 April, 2014; originally announced April 2014.

  7. arXiv:cs/9809006  [pdf, ps]

    cs.OS cs.DC

    The Design and Architecture of the Microsoft Cluster Service -- A Practical Approach to High-Availability and Scalability

    Authors: Werner Vogels, Dan Dumitriu, Ken Birman, Rod Gamache, Mike Massa, Rob Short, John Vert, Joe Barrera

    Abstract: Microsoft Cluster Service (MSCS) extends the Windows NT operating system to support high-availability services. The goal is to offer an execution environment where off-the-shelf server applications can continue to operate, even in the presence of node failures. Later versions of MSCS will provide scalability via a node and application management system that allows applications to scale to hund…

    Submitted 2 September, 1998; originally announced September 1998.

    Comments: Original document at: http://research.microsoft.com/~gray/MSCS_FTCS98.doc

    Report number: Microsoft Research MSR-TR-98-16

    ACM Class: C.4; C.5; D.4.5

    Journal ref: Proceedings of FTCS'98, June 23-25, 1998 in Munich, Germany