Showing 1–7 of 7 results for author: Cappellazzo, U

Searching in archive eess.
  1. arXiv:2402.10427  [pdf, other]

    cs.CL cs.AI cs.SD eess.AS

    Evaluating and Improving Continual Learning in Spoken Language Understanding

    Authors: Muqiao Yang, Xiang Li, Umberto Cappellazzo, Shinji Watanabe, Bhiksha Raj

    Abstract: Continual learning has emerged as an increasingly important challenge across various tasks, including Spoken Language Understanding (SLU). In SLU, its objective is to effectively handle the emergence of new concepts and evolving environments. The evaluation of continual learning algorithms typically involves assessing the model's stability, plasticity, and generalizability as fundamental aspects o…

    Submitted 15 February, 2024; originally announced February 2024.

  2. arXiv:2402.00828  [pdf, other]

    eess.AS cs.AI

    Efficient Fine-tuning of Audio Spectrogram Transformers via Soft Mixture of Adapters

    Authors: Umberto Cappellazzo, Daniele Falavigna, Alessio Brutti

    Abstract: Mixture of Experts (MoE) architectures have recently started burgeoning due to their ability to scale model's capacity while maintaining the computational cost affordable. Furthermore, they can be applied to both Transformers and State Space Models, the current state-of-the-art models in numerous fields. While MoE has been mostly investigated for the pre-training stage, its use in parameter-effici…

    Submitted 4 June, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

    Comments: Accepted at INTERSPEECH 2024. The code is publicly available at: https://github.com/umbertocappellazzo/PETL_AST

  3. arXiv:2312.03694  [pdf, other]

    eess.AS

    Parameter-Efficient Transfer Learning of Audio Spectrogram Transformers

    Authors: Umberto Cappellazzo, Daniele Falavigna, Alessio Brutti, Mirco Ravanelli

    Abstract: The common modus operandi of fine-tuning large pre-trained Transformer models entails the adaptation of all their parameters (i.e., full fine-tuning). While achieving striking results on multiple tasks, this approach becomes unfeasible as the model size and the number of downstream tasks increase. In natural language processing and computer vision, parameter-efficient approaches like prompt-tuning…

    Submitted 11 January, 2024; v1 submitted 6 December, 2023; originally announced December 2023.

    Comments: The code is available at: https://github.com/umbertocappellazzo/PETL_AST

  4. arXiv:2310.02699  [pdf, other]

    eess.AS cs.AI

    Continual Contrastive Spoken Language Understanding

    Authors: Umberto Cappellazzo, Enrico Fini, Muqiao Yang, Daniele Falavigna, Alessio Brutti, Bhiksha Raj

    Abstract: Recently, neural networks have shown impressive progress across diverse fields, with speech processing being no exception. However, recent breakthroughs in this area require extensive offline training using large datasets and tremendous computing resources. Unfortunately, these models struggle to retain their previously acquired knowledge when learning new tasks continually, and retraining from sc…

    Submitted 4 June, 2024; v1 submitted 4 October, 2023; originally announced October 2023.

    Comments: Accepted to ACL Findings 2024

  5. arXiv:2309.09546  [pdf, other]

    eess.AS cs.CL cs.SD

    Training dynamic models using early exits for automatic speech recognition on resource-constrained devices

    Authors: George August Wright, Umberto Cappellazzo, Salah Zaiem, Desh Raj, Lucas Ondel Yang, Daniele Falavigna, Mohamed Nabih Ali, Alessio Brutti

    Abstract: The ability to dynamically adjust the computational load of neural models during inference is crucial for on-device processing scenarios characterised by limited and time-varying computational resources. A promising solution is presented by early-exit architectures, in which additional exit branches are appended to intermediate layers of the encoder. In self-attention models for automatic speech r…

    Submitted 22 February, 2024; v1 submitted 18 September, 2023; originally announced September 2023.

    Comments: Accepted at the ICASSP Workshop Self-supervision in Audio, Speech and Beyond 2024

  6. arXiv:2305.13899  [pdf, other]

    eess.AS cs.CL

    Sequence-Level Knowledge Distillation for Class-Incremental End-to-End Spoken Language Understanding

    Authors: Umberto Cappellazzo, Muqiao Yang, Daniele Falavigna, Alessio Brutti

    Abstract: The ability to learn new concepts sequentially is a major weakness for modern neural networks, which hinders their use in non-stationary environments. Their propensity to fit the current data distribution to the detriment of the past acquired knowledge leads to the catastrophic forgetting issue. In this work we tackle the problem of Spoken Language Understanding applied to a continual learning set…

    Submitted 31 July, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: Accepted at INTERSPEECH 2023. Code (will be) available at https://github.com/umbertocappellazzo/SLURP-SeqKD

  7. arXiv:2211.08161  [pdf, other]

    eess.AS cs.LG

    An Investigation of the Combination of Rehearsal and Knowledge Distillation in Continual Learning for Spoken Language Understanding

    Authors: Umberto Cappellazzo, Daniele Falavigna, Alessio Brutti

    Abstract: Continual learning refers to a dynamical framework in which a model receives a stream of non-stationary data over time and must adapt to new data while preserving previously acquired knowledge. Unluckily, neural networks fail to meet these two desiderata, incurring the so-called catastrophic forgetting phenomenon. Whereas a vast array of strategies have been proposed to attenuate forgetting in the…

    Submitted 23 May, 2023; v1 submitted 15 November, 2022; originally announced November 2022.

    Comments: Accepted at INTERSPEECH 2023. Code available here: https://github.com/umbertocappellazzo/CL_SLU