
Showing 1–5 of 5 results for author: Suh, Y

Searching in archive eess.
  1. arXiv:2310.06546  [pdf, other]

    cs.SD cs.CL eess.AS

    AutoCycle-VC: Towards Bottleneck-Independent Zero-Shot Cross-Lingual Voice Conversion

    Authors: Haeyun Choi, Jio Gim, Yuho Lee, Youngin Kim, Young-Joo Suh

    Abstract: This paper proposes a simple and robust zero-shot voice conversion system with a cycle structure and mel-spectrogram pre-processing. Previous works suffer from information loss and poor synthesis quality due to their reliance on a carefully designed bottleneck structure. Moreover, models relying solely on self-reconstruction loss struggled with reproducing different speakers' voices. To address th…

    Submitted 10 October, 2023; originally announced October 2023.

  2. arXiv:2011.01174  [pdf, other]

    eess.AS cs.LG cs.SD

    Learning to Maximize Speech Quality Directly Using MOS Prediction for Neural Text-to-Speech

    Authors: Yeunju Choi, Youngmoon Jung, Youngjoo Suh, Hoirin Kim

    Abstract: Although recent neural text-to-speech (TTS) systems have achieved high-quality speech synthesis, there are cases where a TTS system generates low-quality speech, mainly caused by limited training data or information loss during knowledge distillation. Therefore, we propose a novel method to improve speech quality by training a TTS model under the supervision of perceptual loss, which measures the…

    Submitted 25 May, 2022; v1 submitted 2 November, 2020; originally announced November 2020.

    Comments: 9 pages, 5 figures, 4 tables

    Journal ref: IEEE Access, vol. 10, pp. 52621 - 52629, 2022

  3. PIINET: A 360-degree Panoramic Image Inpainting Network Using a Cube Map

    Authors: Seo Woo Han, Doug Young Suh

    Abstract: Inpainting has been continuously studied in the field of computer vision. As artificial intelligence technology developed, deep learning technology was introduced in inpainting research, helping to improve performance. Currently, the input target of an inpainting algorithm using deep learning has been studied from a single image to a video. However, deep learning-based inpainting technology for pa…

    Submitted 26 January, 2021; v1 submitted 29 October, 2020; originally announced October 2020.

    Journal ref: Vol.66, No.1, 2021, pp.213-228

  4. arXiv:2006.06937  [pdf]

    eess.AS

    Non-parallel voice conversion based on source-to-target direct mapping

    Authors: Sunghee Jung, Youngjoo Suh, Yeunju Choi, Hoirin Kim

    Abstract: Recent works utilizing phonetic posteriorgrams (PPGs) for non-parallel voice conversion have significantly increased the usability of voice conversion since the source and target DBs are no longer required for matching contents. In this approach, the PPGs are used as the linguistic bridge between source and target speaker features. However, this PPG-based non-parallel voice conversion has some l…

    Submitted 12 June, 2020; originally announced June 2020.

    Comments: Submitted to Interspeech 2019

  5. Learning-Driven Wireless Communications, towards 6G

    Authors: Md. Jalil Piran, Doug Young Suh

    Abstract: The fifth generation (5G) of wireless communication is in its infancy, and its evolving versions will be launched over the coming years. However, according to exposing the inherent constraints of 5G and the emerging applications and services with stringent requirements e.g. latency, energy/bit, traffic capacity, peak data rate, and reliability, telecom researchers are turning their attention to co…

    Submitted 1 August, 2019; originally announced August 2019.

    Report number: 19276363

    Journal ref: 2019 International Conference on Computing, Electronics & Communications Engineering (iCCECE)