Showing 1–1 of 1 results for author: Sarrof, Y

Search v0.5.6 released 2020-02-24

arXiv:2405.17394

cs.CL cs.FL cs.LG

The Expressive Capacity of State Space Models: A Formal Language Perspective

Authors: Yash Sarrof, Yana Veitsman, Michael Hahn

Abstract: Recently, recurrent models based on linear state space models (SSMs) have shown promising performance in language modeling (LM), competitive with transformers. However, there is little understanding of the in-principle abilities of such models, which could provide useful guidance to the search for better LM architectures. We present a comprehensive theoretical study of the capacity of such SSMs as it compares to that of transformers and traditional RNNs. We find that SSMs and transformers have overlapping but distinct strengths. In star-free state tracking, SSMs implement straightforward and exact solutions to problems that transformers struggle to represent exactly. They can also model bounded hierarchical structure with optimal memory even without simulating a stack. On the other hand, we identify a design choice in current SSMs that limits their expressive power. We discuss implications for SSM and LM research, and verify results empirically on a recent SSM, Mamba.

Submitted 2 June, 2024; v1 submitted 27 May, 2024; originally announced May 2024.
