Skip to main content

Showing 1–5 of 5 results for author: Hirano, M

Searching in archive eess. Search in all archives.
.
  1. arXiv:2308.06981  [pdf, other

    eess.AS cs.SD

    The Sound Demixing Challenge 2023 $\unicode{x2013}$ Cinematic Demixing Track

    Authors: Stefan Uhlich, Giorgio Fabbro, Masato Hirano, Shusuke Takahashi, Gordon Wichern, Jonathan Le Roux, Dipam Chakraborty, Sharada Mohanty, Kai Li, Yi Luo, Jianwei Yu, Rongzhi Gu, Roman Solovyev, Alexander Stempkovskiy, Tatiana Habruseva, Mikhail Sukhovei, Yuki Mitsufuji

    Abstract: This paper summarizes the cinematic demixing (CDX) track of the Sound Demixing Challenge 2023 (SDX'23). We provide a comprehensive summary of the challenge setup, detailing the structure of the competition and the datasets used. Especially, we detail CDXDB23, a new hidden dataset constructed from real movies that was used to rank the submissions. The paper also offers insights into the most succes… ▽ More

    Submitted 18 April, 2024; v1 submitted 14 August, 2023; originally announced August 2023.

    Comments: Accepted for Transactions of the International Society for Music Information Retrieval

  2. arXiv:2305.10734  [pdf, other

    cs.SD cs.CL eess.AS

    Diffusion-Based Speech Enhancement with Joint Generative and Predictive Decoders

    Authors: Hao Shi, Kazuki Shimada, Masato Hirano, Takashi Shibuya, Yuichiro Koyama, Zhi Zhong, Shusuke Takahashi, Tatsuya Kawahara, Yuki Mitsufuji

    Abstract: Diffusion-based generative speech enhancement (SE) has recently received attention, but reverse diffusion remains time-consuming. One solution is to initialize the reverse diffusion process with enhanced features estimated by a predictive SE system. However, the pipeline structure currently does not consider for a combined use of generative and predictive decoders. The predictive decoder allows us… ▽ More

    Submitted 28 February, 2024; v1 submitted 18 May, 2023; originally announced May 2023.

  3. arXiv:2305.06701  [pdf, ps, other

    cs.SD eess.AS

    Extending Audio Masked Autoencoders Toward Audio Restoration

    Authors: Zhi Zhong, Hao Shi, Masato Hirano, Kazuki Shimada, Kazuya Tateishi, Takashi Shibuya, Shusuke Takahashi, Yuki Mitsufuji

    Abstract: Audio classification and restoration are among major downstream tasks in audio signal processing. However, restoration derives less of a benefit from pretrained models compared to the overwhelming success of pretrained models in classification tasks. Due to such unbalanced benefits, there has been rising interest in how to improve the performance of pretrained models for restoration tasks, e.g., s… ▽ More

    Submitted 17 August, 2023; v1 submitted 11 May, 2023; originally announced May 2023.

    Comments: WASPAA 2023.Copyright 2023 IEEE.Personal use of this material is permitted.Permission from IEEE must be obtained for all other uses,in any current or future media,including reprinting/republishing this material for advertising or promotional purposes, creating new collective works,for resale or redistribution to servers or lists,or reuse of any copyrighted component of this work in other works

  4. arXiv:2305.05857  [pdf, other

    eess.AS cs.SD

    Diffusion-based Signal Refiner for Speech Separation

    Authors: Masato Hirano, Kazuki Shimada, Yuichiro Koyama, Shusuke Takahashi, Yuki Mitsufuji

    Abstract: We have developed a diffusion-based speech refiner that improves the reference-free perceptual quality of the audio predicted by preceding single-channel speech separation models. Although modern deep neural network-based speech separation models have show high performance in reference-based metrics, they often produce perceptually unnatural artifacts. The recent advancements made to diffusion mod… ▽ More

    Submitted 12 May, 2023; v1 submitted 9 May, 2023; originally announced May 2023.

    Comments: Under review

  5. arXiv:2302.08136  [pdf, ps, other

    cs.SD eess.AS

    An Attention-based Approach to Hierarchical Multi-label Music Instrument Classification

    Authors: Zhi Zhong, Masato Hirano, Kazuki Shimada, Kazuya Tateishi, Shusuke Takahashi, Yuki Mitsufuji

    Abstract: Although music is typically multi-label, many works have studied hierarchical music tagging with simplified settings such as single-label data. Moreover, there lacks a framework to describe various joint training methods under the multi-label setting. In order to discuss the above topics, we introduce hierarchical multi-label music instrument classification task. The task provides a realistic sett… ▽ More

    Submitted 16 February, 2023; originally announced February 2023.

    Comments: To appear at ICASSP 2023