Showing 1–2 of 2 results for author: Asida, T

  1. arXiv:2403.19887  [pdf, other]

    cs.CL cs.LG

    Jamba: A Hybrid Transformer-Mamba Language Model

    Authors: Opher Lieber, Barak Lenz, Hofit Bata, Gal Cohen, Jhonathan Osin, Itay Dalmedigos, Erez Safahi, Shaked Meirom, Yonatan Belinkov, Shai Shalev-Shwartz, Omri Abend, Raz Alon, Tomer Asida, Amir Bergman, Roman Glozman, Michael Gokhman, Avashalom Manevich, Nir Ratner, Noam Rozen, Erez Shwartz, Mor Zusman, Yoav Shoham

    Abstract: We present Jamba, a new base large language model based on a novel hybrid Transformer-Mamba mixture-of-experts (MoE) architecture. Specifically, Jamba interleaves blocks of Transformer and Mamba layers, enjoying the benefits of both model families. MoE is added in some of these layers to increase model capacity while keeping active parameter usage manageable. This flexible architecture allows reso…

    Submitted 3 July, 2024; v1 submitted 28 March, 2024; originally announced March 2024.

    Comments: Webpage: https://www.ai21.com/jamba

  2. arXiv:1901.07608  [pdf, other]

    cond-mat.stat-mech

    Large fluctuations of a Kardar-Parisi-Zhang interface on a half-line: the height statistics at a shifted point

    Authors: Tomer Asida, Eli Livne, Baruch Meerson

    Abstract: We consider a stochastic interface $h(x,t)$, described by the $1+1$ Kardar-Parisi-Zhang (KPZ) equation on the half-line $x\geq0$ with the reflecting boundary at $x=0$. The interface is initially flat, $h(x,t=0)=0$. We focus on the short-time probability distribution $\mathcal{P}\left(H,L,t\right)$ of the height $H$ of the interface at point $x=L$. Using the optimal fluctuation method, we determine…

    Submitted 31 March, 2019; v1 submitted 22 January, 2019; originally announced January 2019.

    Comments: 15 pages, 13 figures

    Journal ref: Phys. Rev. E 99, 042132 (2019)