Showing 1–4 of 4 results for author: Magar, I

  1. arXiv:2307.06908  [pdf, other]

    cs.CL cs.AI

    Generating Benchmarks for Factuality Evaluation of Language Models

    Authors: Dor Muhlgay, Ori Ram, Inbal Magar, Yoav Levine, Nir Ratner, Yonatan Belinkov, Omri Abend, Kevin Leyton-Brown, Amnon Shashua, Yoav Shoham

    Abstract: Before deploying a language model (LM) within a given domain, it is important to measure its tendency to generate factually incorrect information in that domain. Existing methods for factuality evaluation of LLM generation focus on facts sampled from the LM itself, and thus do not control the set of evaluated facts and might under-represent domain specific or rare facts. We propose FACTOR: Factual…

    Submitted 4 February, 2024; v1 submitted 13 July, 2023; originally announced July 2023.

  2. arXiv:2212.10947  [pdf, other]

    cs.CL

    Parallel Context Windows for Large Language Models

    Authors: Nir Ratner, Yoav Levine, Yonatan Belinkov, Ori Ram, Inbal Magar, Omri Abend, Ehud Karpas, Amnon Shashua, Kevin Leyton-Brown, Yoav Shoham

    Abstract: When applied to processing long text, Large Language Models (LLMs) are limited by their context window. Existing efforts to address this limitation involve training specialized architectures, and cannot be easily applied to off-the-shelf LLMs. We present Parallel Context Windows (PCW), a method that alleviates the context window restriction for any off-the-shelf LLM without further training. The k…

    Submitted 1 August, 2023; v1 submitted 21 December, 2022; originally announced December 2022.

    Comments: The 61st Annual Meeting of the Association for Computational Linguistics (ACL 2023)

  3. arXiv:2206.09860  [pdf, other]

    cs.CL

    Fewer Errors, but More Stereotypes? The Effect of Model Size on Gender Bias

    Authors: Yarden Tal, Inbal Magar, Roy Schwartz

    Abstract: The size of pretrained models is increasing, and so is their performance on a variety of NLP tasks. However, as their memorization capacity grows, they might pick up more social biases. In this work, we examine the connection between model size and its gender bias (specifically, occupational gender bias). We measure bias in three masked language model families (RoBERTa, DeBERTa, and T5) in two set…

    Submitted 20 June, 2022; originally announced June 2022.

  4. arXiv:2203.08242  [pdf, other]

    cs.CL cs.LG

    Data Contamination: From Memorization to Exploitation

    Authors: Inbal Magar, Roy Schwartz

    Abstract: Pretrained language models are typically trained on massive web-based datasets, which are often "contaminated" with downstream test sets. It is not clear to what extent models exploit the contaminated data for downstream tasks. We present a principled method to study this question. We pretrain BERT models on joint corpora of Wikipedia and labeled downstream datasets, and fine-tune them on the rele…

    Submitted 15 March, 2022; originally announced March 2022.

    Comments: Accepted to ACL 2022