Showing 1–1 of 1 results for author: Salpekar, O

  1. arXiv:2006.15704  [pdf, other]

    cs.DC cs.LG

    PyTorch Distributed: Experiences on Accelerating Data Parallel Training

    Authors: Shen Li, Yanli Zhao, Rohan Varma, Omkar Salpekar, Pieter Noordhuis, Teng Li, Adam Paszke, Jeff Smith, Brian Vaughan, Pritam Damania, Soumith Chintala

    Abstract: This paper presents the design, implementation, and evaluation of the PyTorch distributed data parallel module. PyTorch is a widely-adopted scientific computing package used in deep learning research and applications. Recent advances in deep learning argue for the value of large datasets and large models, which necessitates the ability to scale out model training to more computational resources. D…

    Submitted 28 June, 2020; originally announced June 2020.

    Comments: To appear in VLDB 2020