Showing 1–1 of 1 results for author: Yarden, N

  1. arXiv:2401.06104  [pdf, other]

    cs.CL

    Transformers are Multi-State RNNs

    Authors: Matanel Oren, Michael Hassid, Nir Yarden, Yossi Adi, Roy Schwartz

    Abstract: Transformers are considered conceptually different from the previous generation of state-of-the-art NLP models - recurrent neural networks (RNNs). In this work, we demonstrate that decoder-only transformers can in fact be conceptualized as unbounded multi-state RNNs - an RNN variant with unlimited hidden state size. We further show that transformers can be converted into $\textit{bounded}$ multi-s…

    Submitted 18 June, 2024; v1 submitted 11 January, 2024; originally announced January 2024.

    Comments: preprint