Showing 1–4 of 4 results for author: Efrat, A

  1. arXiv:2305.14196  [pdf, other]

    cs.CL cs.AI cs.LG stat.ML

    ZeroSCROLLS: A Zero-Shot Benchmark for Long Text Understanding

    Authors: Uri Shaham, Maor Ivgi, Avia Efrat, Jonathan Berant, Omer Levy

    Abstract: We introduce ZeroSCROLLS, a zero-shot benchmark for natural language understanding over long texts, which contains only test and small validation sets, without training data. We adapt six tasks from the SCROLLS benchmark, and add four new datasets, including two novel information fusing tasks, such as aggregating the percentage of positive reviews. Using ZeroSCROLLS, we conduct a comprehensive eva…

    Submitted 17 December, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: Findings of EMNLP 2023

  2. arXiv:2206.04615  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG stat.ML

    Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

    Authors: Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, et al. (426 additional authors not shown)

    Abstract: Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-futur…

    Submitted 12 June, 2023; v1 submitted 9 June, 2022; originally announced June 2022.

    Comments: 27 pages, 17 figures + references and appendices, repo: https://github.com/google/BIG-bench

    Journal ref: Transactions on Machine Learning Research, May/2022, https://openreview.net/forum?id=uyTL5Bvosj

  3. arXiv:2201.03533  [pdf, other]

    cs.CL cs.AI cs.LG stat.ML

    SCROLLS: Standardized CompaRison Over Long Language Sequences

    Authors: Uri Shaham, Elad Segal, Maor Ivgi, Avia Efrat, Ori Yoran, Adi Haviv, Ankit Gupta, Wenhan Xiong, Mor Geva, Jonathan Berant, Omer Levy

    Abstract: NLP benchmarks have largely focused on short texts, such as sentences and paragraphs, even though long texts comprise a considerable amount of natural language in the wild. We introduce SCROLLS, a suite of tasks that require reasoning over long texts. We examine existing long-text datasets, and handpick ones where the text is naturally long, while prioritizing tasks that involve synthesizing infor…

    Submitted 11 October, 2022; v1 submitted 10 January, 2022; originally announced January 2022.

    Comments: EMNLP 2022

  4. arXiv:2103.01242  [pdf, other]

    cs.CL cs.AI cs.LG stat.ML

    Cryptonite: A Cryptic Crossword Benchmark for Extreme Ambiguity in Language

    Authors: Avia Efrat, Uri Shaham, Dan Kilman, Omer Levy

    Abstract: Current NLP datasets targeting ambiguity can be solved by a native speaker with relative ease. We present Cryptonite, a large-scale dataset based on cryptic crosswords, which is both linguistically complex and naturally sourced. Each example in Cryptonite is a cryptic clue, a short phrase or sentence with a misleading surface reading, whose solving requires disambiguating semantic, syntactic, and…

    Submitted 1 November, 2021; v1 submitted 1 March, 2021; originally announced March 2021.

    Comments: EMNLP 2021