Showing 1–8 of 8 results for author: Hassid, M

  1. arXiv:2404.00725  [pdf, other]

    cs.SE cs.AI cs.CL cs.LG

    The Larger the Better? Improved LLM Code-Generation via Budget Reallocation

    Authors: Michael Hassid, Tal Remez, Jonas Gehring, Roy Schwartz, Yossi Adi

    Abstract: It is a common belief that large language models (LLMs) are better than smaller-sized ones. However, larger models also require significantly more time and compute during inference. This begs the question: what happens when both models operate under the same budget? (e.g., compute, run-time). To address this question, we analyze code generation LLMs of various sizes and make comparisons such as ru…

    Submitted 31 March, 2024; originally announced April 2024.

  2. arXiv:2401.06104  [pdf, other]

    cs.CL

    Transformers are Multi-State RNNs

    Authors: Matanel Oren, Michael Hassid, Nir Yarden, Yossi Adi, Roy Schwartz

    Abstract: Transformers are considered conceptually different from the previous generation of state-of-the-art NLP models - recurrent neural networks (RNNs). In this work, we demonstrate that decoder-only transformers can in fact be conceptualized as unbounded multi-state RNNs - an RNN variant with unlimited hidden state size. We further show that transformers can be converted into $\textit{bounded}$ multi-s…

    Submitted 18 June, 2024; v1 submitted 11 January, 2024; originally announced January 2024.

    Comments: preprint

  3. arXiv:2308.05725  [pdf, ps, other]

    cs.CL cs.LG cs.SD eess.AS

    EXPRESSO: A Benchmark and Analysis of Discrete Expressive Speech Resynthesis

    Authors: Tu Anh Nguyen, Wei-Ning Hsu, Antony D'Avirro, Bowen Shi, Itai Gat, Maryam Fazel-Zarani, Tal Remez, Jade Copet, Gabriel Synnaeve, Michael Hassid, Felix Kreuk, Yossi Adi, Emmanuel Dupoux

    Abstract: Recent work has shown that it is possible to resynthesize high-quality speech based, not on text, but on low bitrate discrete units that have been learned in a self-supervised fashion and can therefore capture expressive aspects of speech that are hard to transcribe (prosody, voice styles, non-verbal vocalization). The adoption of these methods is still limited by the fact that most speech synthes…

    Submitted 10 August, 2023; originally announced August 2023.

  4. arXiv:2306.02307  [pdf, other]

    cs.CL cs.AI cs.LG

    Finding the SWEET Spot: Analysis and Improvement of Adaptive Inference in Low Resource Settings

    Authors: Daniel Rotem, Michael Hassid, Jonathan Mamou, Roy Schwartz

    Abstract: Adaptive inference is a simple method for reducing inference costs. The method works by maintaining multiple classifiers of different capacities, and allocating resources to each test instance according to its difficulty. In this work, we compare the two main approaches for adaptive inference, Early-Exit and Multi-Model, when training data is limited. First, we observe that for models with the sam…

    Submitted 4 June, 2023; originally announced June 2023.

    Comments: Proceedings of ACL 2023

  5. arXiv:2305.13009  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    Textually Pretrained Speech Language Models

    Authors: Michael Hassid, Tal Remez, Tu Anh Nguyen, Itai Gat, Alexis Conneau, Felix Kreuk, Jade Copet, Alexandre Defossez, Gabriel Synnaeve, Emmanuel Dupoux, Roy Schwartz, Yossi Adi

    Abstract: Speech language models (SpeechLMs) process and generate acoustic data only, without textual supervision. In this work, we propose TWIST, a method for training SpeechLMs using a warm-start from a pretrained textual language model. We show using both automatic and human evaluations that TWIST outperforms a cold-start SpeechLM across the board. We empirically analyze the effect of different model de…

    Submitted 30 January, 2024; v1 submitted 22 May, 2023; originally announced May 2023.

    Comments: NeurIPS 2023

  6. arXiv:2211.03495  [pdf, other]

    cs.CL cs.LG

    How Much Does Attention Actually Attend? Questioning the Importance of Attention in Pretrained Transformers

    Authors: Michael Hassid, Hao Peng, Daniel Rotem, Jungo Kasai, Ivan Montero, Noah A. Smith, Roy Schwartz

    Abstract: The attention mechanism is considered the backbone of the widely-used Transformer architecture. It contextualizes the input by computing input-specific attention matrices. We find that this mechanism, while powerful and elegant, is not as important as typically thought for pretrained language models. We introduce PAPA, a new probing method that replaces the input-dependent attention matrices with…

    Submitted 7 November, 2022; originally announced November 2022.

    Comments: Findings of EMNLP 2022

  7. arXiv:2209.00099  [pdf, other]

    cs.CL

    Efficient Methods for Natural Language Processing: A Survey

    Authors: Marcos Treviso, Ji-Ung Lee, Tianchu Ji, Betty van Aken, Qingqing Cao, Manuel R. Ciosici, Michael Hassid, Kenneth Heafield, Sara Hooker, Colin Raffel, Pedro H. Martins, André F. T. Martins, Jessica Zosa Forde, Peter Milder, Edwin Simpson, Noam Slonim, Jesse Dodge, Emma Strubell, Niranjan Balasubramanian, Leon Derczynski, Iryna Gurevych, Roy Schwartz

    Abstract: Recent work in natural language processing (NLP) has yielded appealing results from scaling model parameters and training data; however, using only scale to improve performance means that resource consumption also grows. Such resources include data, time, storage, or energy, all of which are naturally limited and unevenly distributed. This motivates research into efficient methods that require few…

    Submitted 24 March, 2023; v1 submitted 31 August, 2022; originally announced September 2022.

    Comments: Accepted at TACL, pre-publication version

  8. arXiv:2111.10139  [pdf, other]

    cs.CV cs.CL

    More than Words: In-the-Wild Visually-Driven Prosody for Text-to-Speech

    Authors: Michael Hassid, Michelle Tadmor Ramanovich, Brendan Shillingford, Miaosen Wang, Ye Jia, Tal Remez

    Abstract: In this paper we present VDTTS, a Visually-Driven Text-to-Speech model. Motivated by dubbing, VDTTS takes advantage of video frames as an additional input alongside text, and generates speech that matches the video signal. We demonstrate how this allows VDTTS to, unlike plain TTS models, generate speech that not only has prosodic variations like natural pauses and pitch, but is also synchronized t…

    Submitted 23 March, 2022; v1 submitted 19 November, 2021; originally announced November 2021.