Showing 1–2 of 2 results for author: Raghavan, D

  1. arXiv:2403.04311 [pdf, other]

    cs.AI cs.CL cs.DC cs.IR

    ALTO: An Efficient Network Orchestrator for Compound AI Systems

    Authors: Keshav Santhanam, Deepti Raghavan, Muhammad Shahir Rahman, Thejas Venkatesh, Neha Kunjal, Pratiksha Thaker, Philip Levis, Matei Zaharia

    Abstract: We present ALTO, a network orchestrator for efficiently serving compound AI systems such as pipelines of language models. ALTO achieves high throughput and low latency by taking advantage of an optimization opportunity specific to generative language models: streaming intermediate outputs. As language models produce outputs token by token, ALTO exposes opportunities to stream intermediate outputs…

    Submitted 7 March, 2024; originally announced March 2024.

  2. arXiv:2003.01668 [pdf, other]

    cs.AI cs.LG

    Model Assertions for Monitoring and Improving ML Models

    Authors: Daniel Kang, Deepti Raghavan, Peter Bailis, Matei Zaharia

    Abstract: ML models are increasingly deployed in settings with real world interactions such as vehicles, but unfortunately, these models can fail in systematic ways. To prevent errors, ML engineering teams monitor and continuously improve these models. We propose a new abstraction, model assertions, that adapts the classical use of program assertions as a way to monitor and improve ML models. Model assertio…

    Submitted 11 March, 2020; v1 submitted 3 March, 2020; originally announced March 2020.

    Journal ref: MLSys 2020