
Showing 1–7 of 7 results for author: Lenz, B

  1. arXiv:2403.19887  [pdf, other]

    cs.CL cs.LG

    Jamba: A Hybrid Transformer-Mamba Language Model

    Authors: Opher Lieber, Barak Lenz, Hofit Bata, Gal Cohen, Jhonathan Osin, Itay Dalmedigos, Erez Safahi, Shaked Meirom, Yonatan Belinkov, Shai Shalev-Shwartz, Omri Abend, Raz Alon, Tomer Asida, Amir Bergman, Roman Glozman, Michael Gokhman, Avashalom Manevich, Nir Ratner, Noam Rozen, Erez Shwartz, Mor Zusman, Yoav Shoham

    Abstract: We present Jamba, a new base large language model based on a novel hybrid Transformer-Mamba mixture-of-experts (MoE) architecture. Specifically, Jamba interleaves blocks of Transformer and Mamba layers, enjoying the benefits of both model families. MoE is added in some of these layers to increase model capacity while keeping active parameter usage manageable. This flexible architecture allows reso…

    Submitted 3 July, 2024; v1 submitted 28 March, 2024; originally announced March 2024.

    Comments: Webpage: https://www.ai21.com/jamba

  2. arXiv:2305.20010  [pdf, other]

    cs.AI cs.CL cs.CY cs.HC

    Human or Not? A Gamified Approach to the Turing Test

    Authors: Daniel Jannai, Amos Meron, Barak Lenz, Yoav Levine, Yoav Shoham

    Abstract: We present "Human or Not?", an online game inspired by the Turing test, that measures the capability of AI chatbots to mimic humans in dialog, and of humans to tell bots from other humans. Over the course of a month, the game was played by over 1.5 million users who engaged in anonymous two-minute chat sessions with either another human or an AI language model which was prompted to behave like hum…

    Submitted 31 May, 2023; originally announced May 2023.

    Comments: 11 pages, 6 figures

    MSC Class: 68T50; ACM Class: I.2.7

  3. arXiv:2205.00445  [pdf, other]

    cs.CL cs.AI

    MRKL Systems: A modular, neuro-symbolic architecture that combines large language models, external knowledge sources and discrete reasoning

    Authors: Ehud Karpas, Omri Abend, Yonatan Belinkov, Barak Lenz, Opher Lieber, Nir Ratner, Yoav Shoham, Hofit Bata, Yoav Levine, Kevin Leyton-Brown, Dor Muhlgay, Noam Rozen, Erez Schwartz, Gal Shachaf, Shai Shalev-Shwartz, Amnon Shashua, Moshe Tenenholtz

    Abstract: Huge language models (LMs) have ushered in a new era for AI, serving as a gateway to natural-language-based knowledge tasks. Although an essential element of modern AI, LMs are also inherently limited in a number of ways. We discuss these limitations and how they can be avoided by adopting a systems approach. Conceptualizing the challenge as one that involves knowledge and reasoning in addition to…

    Submitted 1 May, 2022; originally announced May 2022.

  4. arXiv:2204.10019  [pdf, other]

    cs.CL cs.AI

    Standing on the Shoulders of Giant Frozen Language Models

    Authors: Yoav Levine, Itay Dalmedigos, Ori Ram, Yoel Zeldes, Daniel Jannai, Dor Muhlgay, Yoni Osin, Opher Lieber, Barak Lenz, Shai Shalev-Shwartz, Amnon Shashua, Kevin Leyton-Brown, Yoav Shoham

    Abstract: Huge pretrained language models (LMs) have demonstrated surprisingly good zero-shot capabilities on a wide variety of tasks. This gives rise to the appealing vision of a single, versatile model with a wide range of functionalities across disparate applications. However, current leading techniques for leveraging a "frozen" LM -- i.e., leaving its weights untouched -- still often underperform fine-t…

    Submitted 21 April, 2022; originally announced April 2022.

  5. arXiv:2011.01285  [pdf, other]

    cs.LG cs.CL

    Exemplar Guided Active Learning

    Authors: Jason Hartford, Kevin Leyton-Brown, Hadas Raviv, Dan Padnos, Shahar Lev, Barak Lenz

    Abstract: We consider the problem of wisely using a limited budget to label a small subset of a large unlabeled dataset. We are motivated by the NLP problem of word sense disambiguation. For any word, we have a set of candidate labels from a knowledge base, but the label set is not necessarily representative of what occurs in the data: there may exist labels in the knowledge base that very rarely occur in t…

    Submitted 2 November, 2020; originally announced November 2020.

    Comments: Published at NeurIPS 2020

  6. arXiv:2010.01825  [pdf, other]

    cs.LG cs.CL stat.ML

    PMI-Masking: Principled masking of correlated spans

    Authors: Yoav Levine, Barak Lenz, Opher Lieber, Omri Abend, Kevin Leyton-Brown, Moshe Tennenholtz, Yoav Shoham

    Abstract: Masking tokens uniformly at random constitutes a common flaw in the pretraining of Masked Language Models (MLMs) such as BERT. We show that such uniform masking allows an MLM to minimize its training objective by latching onto shallow local signals, leading to pretraining inefficiency and suboptimal downstream performance. To address this flaw, we propose PMI-Masking, a principled masking strategy…

    Submitted 5 October, 2020; originally announced October 2020.

  7. arXiv:1908.05646  [pdf, other]

    cs.CL cs.LG

    SenseBERT: Driving Some Sense into BERT

    Authors: Yoav Levine, Barak Lenz, Or Dagan, Ori Ram, Dan Padnos, Or Sharir, Shai Shalev-Shwartz, Amnon Shashua, Yoav Shoham

    Abstract: The ability to learn from large unlabeled corpora has allowed neural language models to advance the frontier in natural language understanding. However, existing self-supervision techniques operate at the word form level, which serves as a surrogate for the underlying semantic content. This paper proposes a method to employ weak-supervision directly at the word sense level. Our model, named SenseB…

    Submitted 18 May, 2020; v1 submitted 15 August, 2019; originally announced August 2019.

    Comments: Accepted to ACL 2020