Showing 1–1 of 1 results for author: Littwin, E

  1. arXiv:2308.01814  [pdf, other]

    cs.LG cond-mat.dis-nn cs.NE math.PR

    Tensor Programs IVb: Adaptive Optimization in the Infinite-Width Limit

    Authors: Greg Yang, Etai Littwin

    Abstract: Going beyond stochastic gradient descent (SGD), what new phenomena emerge in wide neural networks trained by adaptive optimizers like Adam? Here we show: The same dichotomy between feature learning and kernel behaviors (as in SGD) holds for general optimizers as well, including Adam -- albeit with a nonlinear notion of "kernel." We derive the corresponding "neural tangent" and "maximal update" lim…

    Submitted 7 August, 2023; v1 submitted 3 August, 2023; originally announced August 2023.

    Comments: This is the complete version of "Adaptive Optimization in the Infinite-Width Limit" in ICLR 2023, https://openreview.net/forum?id=zgVDqw9ZUES