Showing 1–2 of 2 results for author: Asami, T

Searching in archive cs.
  1. arXiv:2401.17632  [pdf, other]

    cs.CL cs.SD eess.AS

    What Do Self-Supervised Speech and Speaker Models Learn? New Findings From a Cross Model Layer-Wise Analysis

    Authors: Takanori Ashihara, Marc Delcroix, Takafumi Moriya, Kohei Matsuura, Taichi Asami, Yusuke Ijima

    Abstract: Self-supervised learning (SSL) has attracted increased attention for learning meaningful speech representations. Speech SSL models, such as WavLM, employ masked prediction training to encode general-purpose representations. In contrast, speaker SSL models, exemplified by DINO-based models, adopt utterance-level training objectives primarily for speaker representation. Understanding how these model…

    Submitted 31 January, 2024; originally announced January 2024.

    Comments: Accepted at ICASSP 2024

  2. arXiv:2306.08374  [pdf, other]

    cs.CL cs.SD eess.AS

    SpeechGLUE: How Well Can Self-Supervised Speech Models Capture Linguistic Knowledge?

    Authors: Takanori Ashihara, Takafumi Moriya, Kohei Matsuura, Tomohiro Tanaka, Yusuke Ijima, Taichi Asami, Marc Delcroix, Yukinori Honma

    Abstract: Self-supervised learning (SSL) for speech representation has been successfully applied in various downstream tasks, such as speech and speaker recognition. More recently, speech SSL models have also been shown to be beneficial in advancing spoken language understanding tasks, implying that the SSL models have the potential to learn not only acoustic but also linguistic information. In this paper,…

    Submitted 14 June, 2023; originally announced June 2023.

    Comments: Accepted at INTERSPEECH 2023