Skip to main content

Showing 1–9 of 9 results for author: Takashima, Y

Searching in archive cs. Search in all archives.
.
  1. arXiv:2404.18852  [pdf, other

    cs.PL cs.SE

    VERT: Verified Equivalent Rust Transpilation with Large Language Models as Few-Shot Learners

    Authors: Aidan Z. H. Yang, Yoshiki Takashima, Brandon Paulsen, Josiah Dodds, Daniel Kroening

    Abstract: Rust is a programming language that combines memory safety and low-level control, providing C-like performance while guaranteeing the absence of undefined behaviors by default. Rust's growing popularity has prompted research on safe and correct transpiling of existing code-bases to Rust. Existing work falls into two categories: rule-based and large language model (LLM)-based. While rule-based appr… ▽ More

    Submitted 25 May, 2024; v1 submitted 29 April, 2024; originally announced April 2024.

  2. arXiv:2210.03459  [pdf, other

    eess.AS cs.CL cs.SD

    Mutual Learning of Single- and Multi-Channel End-to-End Neural Diarization

    Authors: Shota Horiguchi, Yuki Takashima, Shinji Watanabe, Paola Garcia

    Abstract: Due to the high performance of multi-channel speech processing, we can use the outputs from a multi-channel model as teacher labels when training a single-channel model with knowledge distillation. To the contrary, it is also known that single-channel speech data can benefit multi-channel models by mixing it with multi-channel speech data during training or by using it for model pretraining. This… ▽ More

    Submitted 7 October, 2022; originally announced October 2022.

    Comments: Accepted to IEEE SLT 2022

  3. arXiv:2206.02432  [pdf, other

    eess.AS cs.CL cs.SD

    Online Neural Diarization of Unlimited Numbers of Speakers Using Global and Local Attractors

    Authors: Shota Horiguchi, Shinji Watanabe, Paola Garcia, Yuki Takashima, Yohei Kawaguchi

    Abstract: A method to perform offline and online speaker diarization for an unlimited number of speakers is described in this paper. End-to-end neural diarization (EEND) has achieved overlap-aware speaker diarization by formulating it as a multi-label classification problem. It has also been extended for a flexible number of speakers by introducing speaker-wise attractors. However, the output number of spea… ▽ More

    Submitted 22 December, 2022; v1 submitted 6 June, 2022; originally announced June 2022.

    Comments: Accepted to IEEE/ACM TASLP

    Journal ref: IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 31, pp. 706-720, 2023

  4. arXiv:2110.04694  [pdf, other

    eess.AS cs.CL cs.SD

    Multi-Channel End-to-End Neural Diarization with Distributed Microphones

    Authors: Shota Horiguchi, Yuki Takashima, Paola Garcia, Shinji Watanabe, Yohei Kawaguchi

    Abstract: Recent progress on end-to-end neural diarization (EEND) has enabled overlap-aware speaker diarization with a single neural network. This paper proposes to enhance EEND by using multi-channel signals from distributed microphones. We replace Transformer encoders in EEND with two types of encoders that process a multi-channel input: spatio-temporal and co-attention encoders. Both are independent of t… ▽ More

    Submitted 28 March, 2022; v1 submitted 9 October, 2021; originally announced October 2021.

    Comments: Accepted to ICASSP 2022

  5. arXiv:2107.01545  [pdf, other

    eess.AS cs.CL cs.SD

    Towards Neural Diarization for Unlimited Numbers of Speakers Using Global and Local Attractors

    Authors: Shota Horiguchi, Shinji Watanabe, Paola Garcia, Yawen Xue, Yuki Takashima, Yohei Kawaguchi

    Abstract: Attractor-based end-to-end diarization is achieving comparable accuracy to the carefully tuned conventional clustering-based methods on challenging datasets. However, the main drawback is that it cannot deal with the case where the number of speakers is larger than the one observed during training. This is because its speaker counting relies on supervised learning. In this work, we introduce an un… ▽ More

    Submitted 23 September, 2021; v1 submitted 4 July, 2021; originally announced July 2021.

    Comments: Accepted to ASRU 2021

  6. arXiv:2106.04764  [pdf, other

    eess.AS cs.SD

    Semi-Supervised Training with Pseudo-Labeling for End-to-End Neural Diarization

    Authors: Yuki Takashima, Yusuke Fujita, Shota Horiguchi, Shinji Watanabe, Paola García, Kenji Nagamatsu

    Abstract: In this paper, we present a semi-supervised training technique using pseudo-labeling for end-to-end neural diarization (EEND). The EEND system has shown promising performance compared with traditional clustering-based methods, especially in the case of overlap** speech. However, to get a well-tuned model, EEND requires labeled data for all the joint speech activities of every speaker at each tim… ▽ More

    Submitted 8 June, 2021; originally announced June 2021.

    Comments: Accepted for Interspeech 2021

  7. arXiv:2106.04078  [pdf, other

    eess.AS cs.SD

    End-to-End Speaker Diarization Conditioned on Speech Activity and Overlap Detection

    Authors: Yuki Takashima, Yusuke Fujita, Shinji Watanabe, Shota Horiguchi, Paola García, Kenji Nagamatsu

    Abstract: In this paper, we present a conditional multitask learning method for end-to-end neural speaker diarization (EEND). The EEND system has shown promising performance compared with traditional clustering-based methods, especially in the case of overlap** speech. In this paper, to further improve the performance of the EEND system, we propose a novel multitask learning framework that solves speaker… ▽ More

    Submitted 7 June, 2021; originally announced June 2021.

    Comments: Accepted for SLT 2021

    Journal ref: IEEE Spoken Language Technology Workshop (SLT), 2021, pp. 849-856

  8. arXiv:2102.01363  [pdf, other

    eess.AS cs.CL cs.SD

    The Hitachi-JHU DIHARD III System: Competitive End-to-End Neural Diarization and X-Vector Clustering Systems Combined by DOVER-Lap

    Authors: Shota Horiguchi, Nelson Yalta, Paola Garcia, Yuki Takashima, Yawen Xue, Desh Raj, Zili Huang, Yusuke Fujita, Shinji Watanabe, Sanjeev Khudanpur

    Abstract: This paper provides a detailed description of the Hitachi-JHU system that was submitted to the Third DIHARD Speech Diarization Challenge. The system outputs the ensemble results of the five subsystems: two x-vector-based subsystems, two end-to-end neural diarization-based subsystems, and one hybrid subsystem. We refine each system and all five subsystems become competitive and complementary. After… ▽ More

    Submitted 2 February, 2021; originally announced February 2021.

  9. arXiv:2101.08473  [pdf, other

    cs.SD eess.AS

    Online Streaming End-to-End Neural Diarization Handling Overlap** Speech and Flexible Numbers of Speakers

    Authors: Yawen Xue, Shota Horiguchi, Yusuke Fujita, Yuki Takashima, Shinji Watanabe, Paola Garcia, Kenji Nagamatsu

    Abstract: We propose a streaming diarization method based on an end-to-end neural diarization (EEND) model, which handles flexible numbers of speakers and overlap** speech. In our previous study, the speaker-tracing buffer (STB) mechanism was proposed to achieve a chunk-wise streaming diarization using a pre-trained EEND model. STB traces the speaker information in previous chunks to map the speakers in a… ▽ More

    Submitted 6 April, 2021; v1 submitted 21 January, 2021; originally announced January 2021.