Showing 1–5 of 5 results for author: Chon, B S

  1. arXiv:2307.04292  [pdf, other]

    eess.AS cs.AI

    A Demand-Driven Perspective on Generative Audio AI

    Authors: Sangshin Oh, Minsung Kang, Hyeongi Moon, Keunwoo Choi, Ben Sangbae Chon

    Abstract: To achieve successful deployment of AI research, it is crucial to understand the demands of the industry. In this paper, we present the results of a survey conducted with professional audio engineers, in order to determine research priorities and define various research tasks. We also summarize the current challenges in audio quality and controllability based on the survey. Our analysis emphasizes…

    Submitted 9 July, 2023; originally announced July 2023.

    Comments: 10 pages, 7 figures

  2. arXiv:2306.09807  [pdf, other]

    eess.AS cs.LG cs.SD

    FALL-E: A Foley Sound Synthesis Model and Strategies

    Authors: Minsung Kang, Sangshin Oh, Hyeongi Moon, Kyungyun Lee, Ben Sangbae Chon

    Abstract: This paper introduces FALL-E, a foley synthesis system and its training/inference strategies. The FALL-E model employs a cascaded approach comprising low-resolution spectrogram generation, spectrogram super-resolution, and a vocoder. We trained every sound-related model from scratch using our extensive datasets, and utilized a pre-trained language model. We conditioned the model with dataset-speci…

    Submitted 10 August, 2023; v1 submitted 16 June, 2023; originally announced June 2023.

    Comments: 5 pages, 3 figures

  3. arXiv:2305.15898  [pdf, other]

    cs.SD eess.AS

    Room Impulse Response Estimation in a Multiple Source Environment

    Authors: Kyungyun Lee, Jeonghun Seo, Keunwoo Choi, Sangmoon Lee, Ben Sangbae Chon

    Abstract: In real-world acoustic scenarios, there often are multiple sound sources present in a room. These sources are situated in various locations and produce sounds that reach the listener from multiple directions. The presence of multiple sources in a room creates new challenges in estimating the room impulse response (RIR) as each source has a unique RIR, dependent on its location and orientation. The…

    Submitted 25 May, 2023; originally announced May 2023.

    Comments: 2023 AES International Conference on Spatial and Immersive Audio

  4. arXiv:2211.07302  [pdf, other]

    cs.SD cs.LG eess.AS

    MedleyVox: An Evaluation Dataset for Multiple Singing Voices Separation

    Authors: Chang-Bin Jeon, Hyeongi Moon, Keunwoo Choi, Ben Sangbae Chon, Kyogu Lee

    Abstract: Separation of multiple singing voices into each voice is a rarely studied area in music source separation research. The absence of a benchmark dataset has hindered its progress. In this paper, we present an evaluation dataset and provide baseline studies for multiple singing voices separation. First, we introduce MedleyVox, an evaluation dataset for multiple singing voices separation. We specify t…

    Submitted 4 May, 2023; v1 submitted 14 November, 2022; originally announced November 2022.

    Comments: 5 pages, 3 figures, 6 tables, To appear in ICASSP 2023 (camera-ready version)

  5. arXiv:2010.12139  [pdf]

    cs.SD cs.MM eess.AS

    GSEP: A robust vocal and accompaniment separation system using gated CBHG module and loudness normalization

    Authors: Soochul Park, Ben Sangbae Chon

    Abstract: In the field of audio signal processing research, source separation has been a popular research topic for a long time and the recent adoption of the deep neural networks have shown a significant improvement in performance. The improvement vitalizes the industry to productize audio deep learning based products and services including Karaoke in the music streaming apps and dialogue enhancement in th…

    Submitted 22 February, 2021; v1 submitted 22 October, 2020; originally announced October 2020.

    Comments: 5 pages, 5 figures