Showing 1–1 of 1 results for author: Ladsaria, D

  1. arXiv:2406.06555  [pdf, other]

    cs.LG cs.AI cs.CL cs.PL

    An Evaluation Benchmark for Autoformalization in Lean4

    Authors: Aryan Gulati, Devanshu Ladsaria, Shubhra Mishra, Jasdeep Sidhu, Brando Miranda

    Abstract: Large Language Models (LLMs) hold the potential to revolutionize autoformalization. The introduction of Lean4, a mathematical programming language, presents an unprecedented opportunity to rigorously assess the autoformalization capabilities of LLMs. This paper introduces a novel evaluation benchmark designed for Lean4, applying it to test the abilities of state-of-the-art LLMs, including GPT-3.5,…

    Submitted 1 June, 2024; originally announced June 2024.

    Comments: To appear at ICLR 2024 as part of the Tiny Papers track