Skip to main content

Showing 1–1 of 1 results for author: Doshi, D

Searching in archive math. Search in all archives.
.
  1. arXiv:2406.03495  [pdf, other

    cs.LG cond-mat.dis-nn hep-th math.NT stat.ML

    Grokking Modular Polynomials

    Authors: Darshil Doshi, Tianyu He, Aritra Das, Andrey Gromov

    Abstract: Neural networks readily learn a subset of the modular arithmetic tasks, while failing to generalize on the rest. This limitation remains unmoved by the choice of architecture and training strategies. On the other hand, an analytical solution for the weights of Multi-layer Perceptron (MLP) networks that generalize on the modular addition task is known in the literature. In this work, we (i) extend… ▽ More

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: 7+4 pages, 3 figures, 2 tables