Showing 1–2 of 2 results for author: Kulkarni, S G

  1. arXiv:2304.13541

    cs.DC cs.PF eess.SY

    D-STACK: High Throughput DNN Inference by Effective Multiplexing and Spatio-Temporal Scheduling of GPUs

    Authors: Aditya Dhakal, Sameer G. Kulkarni, K. K. Ramakrishnan

    Abstract: Hardware accelerators such as GPUs are required for real-time, low-latency inference with Deep Neural Networks (DNNs). However, due to the inherent limits to the parallelism they can exploit, DNNs often under-utilize the capacity of today's high-end accelerators. Although spatial multiplexing of the GPU leads to higher GPU utilization and higher inference throughput, there remain a number of chall…

    Submitted 31 March, 2023; originally announced April 2023.

  2. arXiv:2008.03602

    cs.NE cs.DC eess.SY

    Spatial Sharing of GPU for Autotuning DNN models

    Authors: Aditya Dhakal, Junguk Cho, Sameer G. Kulkarni, K. K. Ramakrishnan, Puneet Sharma

    Abstract: GPUs are used for training, inference, and tuning of machine learning models. However, Deep Neural Networks (DNNs) vary widely in their ability to exploit the full power of high-performance GPUs. Spatial sharing of the GPU enables multiplexing several DNNs on the GPU and can improve GPU utilization, thus improving throughput and lowering latency. DNN models given just the right amount of GPU resources…

    Submitted 8 August, 2020; originally announced August 2020.