Showing 1–9 of 9 results for author: Chae, Y

Searching in archive eess.
  1. arXiv:2308.12599  [pdf, other]

    cs.SD cs.LG eess.AS

    Exploiting Time-Frequency Conformers for Music Audio Enhancement

    Authors: Yunkee Chae, Junghyun Koo, Sungho Lee, Kyogu Lee

    Abstract: With the proliferation of video platforms on the internet, recording musical performances by mobile devices has become commonplace. However, these recordings often suffer from degradation such as noise and reverberation, which negatively impact the listening experience. Consequently, the necessity for music audio enhancement (referred to as music enhancement from this point onward), involving the…

    Submitted 24 August, 2023; originally announced August 2023.

    Comments: Accepted by ACM Multimedia 2023

  2. arXiv:2307.12576  [pdf, other]

    eess.AS cs.IR cs.LG cs.SD

    Self-refining of Pseudo Labels for Music Source Separation with Noisy Labeled Data

    Authors: Junghyun Koo, Yunkee Chae, Chang-Bin Jeon, Kyogu Lee

    Abstract: Music source separation (MSS) faces challenges due to the limited availability of correctly-labeled individual instrument tracks. With the push to acquire larger datasets to improve MSS performance, the inevitability of encountering mislabeled individual instrument tracks becomes a significant challenge to address. This paper introduces an automated technique for refining the labels in a partially…

    Submitted 24 July, 2023; originally announced July 2023.

    Comments: 24th International Society for Music Information Retrieval Conference (ISMIR 2023)

  3. arXiv:2305.13108  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    Debiased Automatic Speech Recognition for Dysarthric Speech via Sample Reweighting with Sample Affinity Test

    Authors: Eungbeom Kim, Yunkee Chae, Jaeheon Sim, Kyogu Lee

    Abstract: Automatic speech recognition systems based on deep learning are mainly trained under empirical risk minimization (ERM). Since ERM utilizes the averaged performance on the data samples regardless of a group such as healthy or dysarthric speakers, ASR systems are unaware of the performance disparities across the groups. This results in biased ASR systems whose performance differences among groups ar…

    Submitted 27 June, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

    Comments: Accepted by Interspeech 2023

  4. arXiv:2305.09167  [pdf, other]

    cs.SD cs.CL eess.AS

    Adversarial Speaker Disentanglement Using Unannotated External Data for Self-supervised Representation Based Voice Conversion

    Authors: Xintao Zhao, Shuai Wang, Yang Chao, Zhiyong Wu, Helen Meng

    Abstract: Nowadays, recognition-synthesis-based methods have been quite popular with voice conversion (VC). By introducing linguistics features with good disentangling characters extracted from an automatic speech recognition (ASR) model, the VC performance achieved considerable breakthroughs. Recently, self-supervised learning (SSL) methods trained with a large-scale unannotated speech corpus have been app…

    Submitted 16 May, 2023; originally announced May 2023.

    Comments: Accepted by ICME 2023

  5. arXiv:2304.14496  [pdf, ps, other]

    physics.ins-det cs.LG eess.SP nucl-ex

    Restoring Original Signal From Pile-up Signal using Deep Learning

    Authors: C. H. Kim, S. Ahn, K. Y. Chae, J. Hooker, G. V. Rogachev

    Abstract: Pile-up signals are frequently produced in experimental physics. They create inaccurate physics data with high uncertainty and cause various problems. Therefore, the correction to pile-up signals is crucially required. In this study, we implemented a deep learning method to restore the original signals from the pile-up signals. We showed that a deep learning model could accurately reconstruct the…

    Submitted 24 April, 2023; originally announced April 2023.

  6. arXiv:2211.07951  [pdf, other]

    cs.SD cs.LG eess.AS

    Show Me the Instruments: Musical Instrument Retrieval from Mixture Audio

    Authors: Kyungsu Kim, Minju Park, Haesun Joung, Yunkee Chae, Yeongbeom Hong, Seonghyeon Go, Kyogu Lee

    Abstract: As digital music production has become mainstream, the selection of appropriate virtual instruments plays a crucial role in determining the quality of music. To search the musical instrument samples or virtual instruments that make one's desired sound, music producers use their ears to listen and compare each instrument sample in their collection, which is time-consuming and inefficient. In this p…

    Submitted 15 November, 2022; originally announced November 2022.

    Comments: 5 pages, 4 figures, submitted to ICASSP 2023

  7. arXiv:2206.06730  [pdf, other]

    eess.IV cs.CV

    Automated Precision Localization of Peripherally Inserted Central Catheter Tip through Model-Agnostic Multi-Stage Networks

    Authors: Subin Park, Yoon Ki Cha, Soyoung Park, Kyung-Su Kim, Myung ** Chung

    Abstract: Peripherally inserted central catheters (PICCs) have been widely used as one of the representative central venous lines (CVCs) due to their long-term intravascular access with low infectivity. However, PICCs have a fatal drawback of a high frequency of tip mispositions, increasing the risk of puncture, embolism, and complications such as cardiac arrhythmias. To automatically and precisely detect i…

    Submitted 14 June, 2022; originally announced June 2022.

    Comments: Subin Park and Yoon Ki Cha have contributed equally to this work as the co-first author. Kyung-Su Kim and Myung ** Chung have contributed equally to this work as the co-corresponding author

  8. Pre-demosaic Graph-based Light Field Image Compression

    Authors: Yung-Hsuan Chao, Haoran Hong, Gene Cheung, Antonio Ortega

    Abstract: An unfocused plenoptic light field (LF) camera places an array of microlenses in front of an image sensor in order to separately capture different directional rays arriving at an image pixel. Using a conventional Bayer pattern, data captured at each pixel is a single color component (R, G or B). The sensed data then undergoes demosaicking (interpolation of RGB components per pixel) and conversion t…

    Submitted 6 January, 2022; v1 submitted 15 February, 2021; originally announced February 2021.

    Comments: 13 pages, 12 figures, 6 tables, Accepted by IEEE Transactions on Image Processing

  9. arXiv:1909.00952  [pdf, other]

    eess.IV cs.LG cs.MM eess.SY stat.AP stat.ML

    Graph-based Transforms for Video Coding

    Authors: Hilmi E. Egilmez, Yung-Hsuan Chao, Antonio Ortega

    Abstract: In many state-of-the-art compression systems, signal transformation is an integral part of the encoding and decoding process, where transforms provide compact representations for the signals of interest. This paper introduces a class of transforms called graph-based transforms (GBTs) for video compression, and proposes two different techniques to design GBTs. In the first technique, we formulate a…

    Submitted 18 September, 2020; v1 submitted 3 September, 2019; originally announced September 2019.

    Comments: To appear in IEEE Trans. on Image Processing (14 pages)