Showing 1–1 of 1 results for author: Gilmer, J M

  1. arXiv:2102.06356  [pdf, other]

    cs.LG stat.ML

    A Large Batch Optimizer Reality Check: Traditional, Generic Optimizers Suffice Across Batch Sizes

    Authors: Zachary Nado, Justin M. Gilmer, Christopher J. Shallue, Rohan Anil, George E. Dahl

    Abstract: Recently the LARS and LAMB optimizers have been proposed for training neural networks faster using large batch sizes. LARS and LAMB add layer-wise normalization to the update rules of Heavy-ball momentum and Adam, respectively, and have become popular in prominent benchmarks and deep learning libraries. However, without fair comparisons to standard optimizers, it remains an open question whether L…

    Submitted 9 June, 2021; v1 submitted 12 February, 2021; originally announced February 2021.