Showing 1–12 of 12 results for author: Yadwadkar, N

Searching in archive cs.
  1. arXiv:2401.08859  [pdf, other]

    cs.DC cs.LG

    Shabari: Delayed Decision-Making for Faster and Efficient Serverless Functions

    Authors: Prasoon Sinha, Kostis Kaffes, Neeraja J. Yadwadkar

    Abstract: Serverless computing relieves developers from the burden of resource management, thus providing ease-of-use to the users and the opportunity to optimize resource utilization for the providers. However, today's serverless systems lack performance guarantees for function invocations, thus limiting support for performance-critical applications: we observed severe performance variability (up to 6x). P…

    Submitted 25 January, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

    Comments: 17 pages, 14 figures, update typo in manually entered arxiv title

  2. arXiv:2401.07957  [pdf, ps, other]

    eess.IV cs.CV cs.LG cs.SD eess.AS

    Machine Perceptual Quality: Evaluating the Impact of Severe Lossy Compression on Audio and Image Models

    Authors: Dan Jacobellis, Daniel Cummings, Neeraja J. Yadwadkar

    Abstract: In the field of neural data compression, the prevailing focus has been on optimizing algorithms for either classical distortion metrics, such as PSNR or SSIM, or human perceptual quality. With increasing amounts of data consumed by machines rather than humans, a new paradigm of machine-oriented compression–which prioritizes the retention of features salient for machine perception o…

    Submitted 15 January, 2024; originally announced January 2024.

    Comments: 10 pages; abridged version published in IEEE Data Compression Conference 2024

  3. arXiv:2310.18481  [pdf, other]

    cs.LG cs.AI cs.OS

    MOSEL: Inference Serving Using Dynamic Modality Selection

    Authors: Bodun Hu, Le Xu, Jeongyoon Moon, Neeraja J. Yadwadkar, Aditya Akella

    Abstract: Rapid advancements over the years have helped machine learning models reach previously hard-to-achieve goals, sometimes even exceeding human capabilities. However, to attain the desired accuracy, the model sizes and in turn their computational requirements have increased drastically. Thus, serving predictions from these models to meet any target latency and cost requirements of applications remain…

    Submitted 27 October, 2023; originally announced October 2023.

  4. arXiv:2308.03615  [pdf, other]

    cs.DC cs.DB

    Dirigo: Self-scaling Stateful Actors For Serverless Real-time Data Processing

    Authors: Le Xu, Divyanshu Saxena, Neeraja J. Yadwadkar, Aditya Akella, Indranil Gupta

    Abstract: We propose Dirigo, a distributed stream processing service built atop virtual actors. Dirigo achieves both a high level of resource efficiency and performance isolation driven by user intent (SLO). To improve resource efficiency, Dirigo adopts a serverless architecture that enables time-sharing of compute resources among streaming operators, both within and across applications. Meanwhile, Dirigo i…

    Submitted 7 August, 2023; originally announced August 2023.

  5. arXiv:2307.08635  [pdf, other]

    cs.AR

    Lightweight ML-based Runtime Prefetcher Selection on Many-core Platforms

    Authors: Erika S. Alcorta, Mahesh Madhav, Scott Tetrick, Neeraja J. Yadwadkar, Andreas Gerstlauer

    Abstract: Modern computer designs support composite prefetching, where multiple individual prefetcher components are used to target different memory access patterns. However, multiple prefetchers competing for resources can drastically hurt performance, especially in many-core systems where cache and other resources are shared and very limited. Prior work has proposed mitigating this issue by selectively en…

    Submitted 17 July, 2023; originally announced July 2023.

    Comments: 5 pages, 7 figures, 1 table, presented at the 2023 ML for Computer Architecture and Systems Workshop (Co-Located with ASSYST)

  6. arXiv:2306.15792  [pdf, other]

    cs.DC cs.AR cs.PF

    Sidecars on the Central Lane: Impact of Network Proxies on Microservices

    Authors: Prateek Sahu, Lucy Zheng, Marco Bueso, Shijia Wei, Neeraja J. Yadwadkar, Mohit Tiwari

    Abstract: Cloud applications are moving away from monolithic model towards loosely-coupled microservices designs. Service meshes are widely used for implementing microservices applications mainly because they provide a modular architecture for modern applications by separating operational features from application business logic. Sidecar proxies in service meshes enable this modularity by applying security,…

    Submitted 17 October, 2023; v1 submitted 27 June, 2023; originally announced June 2023.

    Comments: Presented at HotInfra 2023 (co-located with ISCA 2023, Orlando, FL)

  7. arXiv:2201.10477  [pdf, other]

    cs.OS

    SOL: Safe On-Node Learning in Cloud Platforms

    Authors: Yawen Wang, Daniel Crankshaw, Neeraja J. Yadwadkar, Daniel Berger, Christos Kozyrakis, Ricardo Bianchini

    Abstract: Cloud platforms run many software agents on each server node. These agents manage all aspects of node operation, and in some cases frequently collect data and make decisions. Unfortunately, their behavior is typically based on pre-defined static heuristics or offline analysis; they do not leverage on-node machine learning (ML). In this paper, we first characterize the spectrum of node agents in Az…

    Submitted 25 January, 2022; originally announced January 2022.

  8. arXiv:2111.07226  [pdf, other]

    cs.DC

    Practical Scheduling for Real-World Serverless Computing

    Authors: Kostis Kaffes, Neeraja J. Yadwadkar, Christos Kozyrakis

    Abstract: Serverless computing has seen rapid growth due to the ease-of-use and cost-efficiency it provides. However, function scheduling, a critical component of serverless systems, has been overlooked. In this paper, we take a first-principles approach toward designing a scheduler that caters to the unique characteristics of serverless functions as seen in real-world deployments. We first create a taxonom…

    Submitted 13 November, 2021; originally announced November 2021.

  9. arXiv:2104.13869  [pdf, other]

    cs.DC

    Faa$T: A Transparent Auto-Scaling Cache for Serverless Applications

    Authors: Francisco Romero, Gohar Irfan Chaudhry, Íñigo Goiri, Pragna Gopa, Paul Batum, Neeraja J. Yadwadkar, Rodrigo Fonseca, Christos Kozyrakis, Ricardo Bianchini

    Abstract: Function-as-a-Service (FaaS) has become an increasingly popular way for users to deploy their applications without the burden of managing the underlying infrastructure. However, existing FaaS platforms rely on remote storage to maintain state, limiting the set of applications that can be run efficiently. Recent caching work for FaaS platforms has tried to address this problem, but has fallen short…

    Submitted 28 April, 2021; originally announced April 2021.

    Comments: 18 pages, 15 figures

  10. arXiv:2102.01887  [pdf, other]

    cs.DC

    Llama: A Heterogeneous & Serverless Framework for Auto-Tuning Video Analytics Pipelines

    Authors: Francisco Romero, Mark Zhao, Neeraja J. Yadwadkar, Christos Kozyrakis

    Abstract: The proliferation of camera-enabled devices and large video repositories has led to a diverse set of video analytics applications. These applications rely on video pipelines, represented as DAGs of operations, to transform videos, process extracted metadata, and answer questions like, "Is this intersection congested?" The latency and resource efficiency of pipelines can be optimized using configur…

    Submitted 28 May, 2021; v1 submitted 3 February, 2021; originally announced February 2021.

  11. arXiv:1905.13348  [pdf, other]

    cs.DC cs.LG

    INFaaS: A Model-less and Managed Inference Serving System

    Authors: Francisco Romero, Qian Li, Neeraja J. Yadwadkar, Christos Kozyrakis

    Abstract: Despite existing work in machine learning inference serving, ease-of-use and cost efficiency remain challenges at large scales. Developers must manually search through thousands of model-variants -- versions of already-trained models that differ in hardware, resource footprints, latencies, costs, and accuracies -- to meet the diverse application requirements. Since requirements, query load, and ap…

    Submitted 15 December, 2020; v1 submitted 30 May, 2019; originally announced May 2019.

    Report number: https://www.usenix.org/system/files/atc21-romero.pdf

  12. arXiv:1902.03383  [pdf, ps, other]

    cs.OS

    Cloud Programming Simplified: A Berkeley View on Serverless Computing

    Authors: Eric Jonas, Johann Schleier-Smith, Vikram Sreekanti, Chia-Che Tsai, Anurag Khandelwal, Qifan Pu, Vaishaal Shankar, Joao Carreira, Karl Krauth, Neeraja Yadwadkar, Joseph E. Gonzalez, Raluca Ada Popa, Ion Stoica, David A. Patterson

    Abstract: Serverless cloud computing handles virtually all the system administration operations needed to make it easier for programmers to use the cloud. It provides an interface that greatly simplifies cloud programming, and represents an evolution that parallels the transition from assembly language to high-level programming languages. This paper gives a quick history of cloud computing, including an acc…

    Submitted 9 February, 2019; originally announced February 2019.