Showing 1–1 of 1 results for author: Sully, J

  1. arXiv:2210.16859  [pdf, other]

    cs.LG hep-th stat.ML

    A Solvable Model of Neural Scaling Laws

    Authors: Alexander Maloney, Daniel A. Roberts, James Sully

    Abstract: Large language models with a huge number of parameters, when trained on near internet-sized number of tokens, have been empirically shown to obey neural scaling laws: specifically, their performance behaves predictably as a power law in either parameters or dataset size until bottlenecked by the other resource. To understand this better, we first identify the necessary properties allowing such sca…

    Submitted 30 October, 2022; originally announced October 2022.

    Comments: 73 + 23 pages, 14 + 5 figures

    Report number: MIT-CTP/5463