Showing 1–1 of 1 results for author: Gritsevskaya, I

  1. arXiv:2401.05604  [pdf, other]

    cs.CL cs.AI cs.CV cs.CY

    REBUS: A Robust Evaluation Benchmark of Understanding Symbols

    Authors: Andrew Gritsevskiy, Arjun Panickssery, Aaron Kirtland, Derik Kauffman, Hans Gundlach, Irina Gritsevskaya, Joe Cavanagh, Jonathan Chiang, Lydia La Roux, Michelle Hung

    Abstract: We propose a new benchmark evaluating the performance of multimodal large language models on rebus puzzles. The dataset covers 333 original examples of image-based wordplay, cluing 13 categories such as movies, composers, major cities, and food. To achieve good performance on the benchmark of identifying the clued word or phrase, models must combine image recognition and string manipulation with h…

    Submitted 3 June, 2024; v1 submitted 10 January, 2024; originally announced January 2024.

    Comments: 20 pages, 5 figures. For code, see http://github.com/cvndsh/rebus