Showing 1–1 of 1 results for author: Karandikar, S

  1. arXiv:2003.03477  [pdf, other]

    cs.LG cs.DC stat.ML

    ShadowSync: Performing Synchronization in the Background for Highly Scalable Distributed Training

    Authors: Qinqing Zheng, Bor-Yiing Su, Jiyan Yang, Alisson Azzolini, Qiang Wu, Ou Jin, Shri Karandikar, Hagay Lupesko, Liang Xiong, Eric Zhou

    Abstract: Recommendation systems are often trained with a tremendous amount of data, and distributed training is the workhorse to shorten the training time. While the training throughput can be increased by simply adding more workers, it is also increasingly challenging to preserve the model quality. In this paper, we present ShadowSync, a distributed framework specifically tailored to modern scale recomme…

    Submitted 23 February, 2021; v1 submitted 6 March, 2020; originally announced March 2020.