Skip to main content

Showing 1–1 of 1 results for author: Lunati, M

Searching in archive cs. Search in all archives.
.
  1. arXiv:2405.00332  [pdf, other

    cs.CL cs.AI cs.LG

    A Careful Examination of Large Language Model Performance on Grade School Arithmetic

    Authors: Hugh Zhang, Jeff Da, Dean Lee, Vaughn Robinson, Catherine Wu, Will Song, Tiffany Zhao, Pranav Raja, Dylan Slack, Qin Lyu, Sean Hendryx, Russell Kaplan, Michele Lunati, Summer Yue

    Abstract: Large language models (LLMs) have achieved impressive success on many benchmarks for mathematical reasoning. However, there is growing concern that some of this performance actually reflects dataset contamination, where data closely resembling benchmark questions leaks into the training data, instead of true reasoning ability. To investigate this claim rigorously, we commission Grade School Math 1… ▽ More

    Submitted 3 May, 2024; v1 submitted 1 May, 2024; originally announced May 2024.