Showing 1–1 of 1 results for author: Loomba, A R

  1. arXiv:2307.10635  [pdf, other]

    cs.CL cs.AI cs.LG

    SciBench: Evaluating College-Level Scientific Problem-Solving Abilities of Large Language Models

    Authors: Xiaoxuan Wang, Ziniu Hu, Pan Lu, Yanqiao Zhu, Jieyu Zhang, Satyen Subramaniam, Arjun R. Loomba, Shichang Zhang, Yizhou Sun, Wei Wang

    Abstract: Most of the existing Large Language Model (LLM) benchmarks on scientific problem reasoning focus on problems grounded in high-school subjects and are confined to elementary algebraic operations. To systematically examine the reasoning capabilities required for solving complex scientific problems, we introduce an expansive benchmark suite SciBench for LLMs. SciBench contains a carefully curated dat…

    Submitted 28 June, 2024; v1 submitted 20 July, 2023; originally announced July 2023.

    Comments: To appear at ICML 2024