Frederik Kunstner is qualified to endorse.

Heavy-Tailed Class Imbalance and Why Adam Outperforms Gradient Descent on Language Models

Frederik Kunstner is registered as an author of this paper and can endorse for cs.LG.

Robin Yadav, Alan Milligan, Mark Schmidt and Alberto Bietti are not registered as owners of this paper.