Showing 1–4 of 4 results for author: Yalta, N

  1. arXiv:2102.01363  [pdf, other]

    eess.AS cs.CL cs.SD

    The Hitachi-JHU DIHARD III System: Competitive End-to-End Neural Diarization and X-Vector Clustering Systems Combined by DOVER-Lap

    Authors: Shota Horiguchi, Nelson Yalta, Paola Garcia, Yuki Takashima, Yawen Xue, Desh Raj, Zili Huang, Yusuke Fujita, Shinji Watanabe, Sanjeev Khudanpur

    Abstract: This paper provides a detailed description of the Hitachi-JHU system that was submitted to the Third DIHARD Speech Diarization Challenge. The system outputs the ensemble results of the five subsystems: two x-vector-based subsystems, two end-to-end neural diarization-based subsystems, and one hybrid subsystem. We refine each system and all five subsystems become competitive and complementary. After…

    Submitted 2 February, 2021; originally announced February 2021.

  2. arXiv:1811.02735  [pdf, other]

    eess.AS cs.CL cs.SD

    CNN-based MultiChannel End-to-End Speech Recognition for everyday home environments

    Authors: Nelson Yalta, Shinji Watanabe, Takaaki Hori, Kazuhiro Nakadai, Tetsuya Ogata

    Abstract: Casual conversations involving multiple speakers and noises from surrounding devices are common in everyday environments, which degrades the performances of automatic speech recognition systems. These challenging characteristics of environments are the target of the CHiME-5 challenge. By employing a convolutional neural network (CNN)-based multichannel end-to-end speech recognition system, this st…

    Submitted 20 June, 2019; v1 submitted 6 November, 2018; originally announced November 2018.

    Comments: 5 pages, 1 figure, EUSIPCO 2019

  3. arXiv:1810.03459  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    Multilingual sequence-to-sequence speech recognition: architecture, transfer learning, and language modeling

    Authors: Jaejin Cho, Murali Karthick Baskar, Ruizhi Li, Matthew Wiesner, Sri Harish Mallidi, Nelson Yalta, Martin Karafiat, Shinji Watanabe, Takaaki Hori

    Abstract: Sequence-to-sequence (seq2seq) approach for low-resource ASR is a relatively new direction in speech research. The approach benefits by performing model training without using lexicon and alignments. However, this poses a new problem of requiring more data compared to conventional DNN-HMM systems. In this work, we attempt to use data from 10 BABEL languages to build a multi-lingual seq2seq model a…

    Submitted 4 October, 2018; originally announced October 2018.

  4. arXiv:1807.01126  [pdf, other]

    cs.LG cs.SD eess.AS stat.ML

    Weakly Supervised Deep Recurrent Neural Networks for Basic Dance Step Generation

    Authors: Nelson Yalta, Shinji Watanabe, Kazuhiro Nakadai, Tetsuya Ogata

    Abstract: Synthesizing human's movements such as dancing is a flourishing research field which has several applications in computer graphics. Recent studies have demonstrated the advantages of deep neural networks (DNNs) for achieving remarkable performance in motion and music tasks with little effort for feature pre-processing. However, applying DNNs for generating dance to a piece of music is nevertheless…

    Submitted 20 June, 2019; v1 submitted 3 July, 2018; originally announced July 2018.

    Comments: 8 pages, 7 figures. Proc. IJCNN 2019