Showing 1–50 of 63 results for author: Kim, W

Searching in archive eess.
  1. arXiv:2406.11427  [pdf, other]

    eess.AS cs.AI cs.CL cs.LG cs.SD

    DiTTo-TTS: Efficient and Scalable Zero-Shot Text-to-Speech with Diffusion Transformer

    Authors: Keon Lee, Dong Won Kim, Jaehyeon Kim, Jaewoong Cho

    Abstract: Large-scale diffusion models have shown outstanding generative abilities across multiple modalities including images, videos, and audio. However, text-to-speech (TTS) systems typically involve domain-specific modeling factors (e.g., phonemes and phoneme-level durations) to ensure precise temporal alignments between text and speech, which hinders the efficiency and scalability of diffusion models f…

    Submitted 17 June, 2024; originally announced June 2024.

  2. arXiv:2406.06650  [pdf, other]

    eess.IV cs.CV

    Predicting the risk of early-stage breast cancer recurrence using H&E-stained tissue images

    Authors: Geongyu Lee, Joonho Lee, Tae-Yeong Kwak, Sun Woo Kim, Youngmee Kwon, Chungyeul Kim, Hyeyoon Chang

    Abstract: Accurate prediction of the likelihood of recurrence is important in the selection of postoperative treatment for patients with early-stage breast cancer. In this study, we investigated whether deep learning algorithms can predict patients' risk of recurrence by analyzing the pathology images of their cancer histology. A total of 125 hematoxylin and eosin stained breast cancer whole slide images la…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: 12 pages, 7 figures

  3. arXiv:2405.02066  [pdf, other]

    cs.CV eess.IV

    WateRF: Robust Watermarks in Radiance Fields for Protection of Copyrights

    Authors: Youngdong Jang, Dong In Lee, MinHyuk Jang, Jong Wook Kim, Feng Yang, Sangpil Kim

    Abstract: The advances in the Neural Radiance Fields (NeRF) research offer extensive applications in diverse domains, but protecting their copyrights has not yet been researched in depth. Recently, NeRF watermarking has been considered one of the pivotal solutions for safely deploying NeRF-based 3D representations. However, existing methods are designed to apply only to implicit or explicit NeRF representat…

    Submitted 27 May, 2024; v1 submitted 3 May, 2024; originally announced May 2024.

  4. arXiv:2403.08187  [pdf, other]

    cs.CL cs.SD eess.AS

    Automatic Speech Recognition (ASR) for the Diagnosis of pronunciation of Speech Sound Disorders in Korean children

    Authors: Taekyung Ahn, Yeonjung Hong, Younggon Im, Do Hyung Kim, Dayoung Kang, Joo Won Jeong, Jae Won Kim, Min Jung Kim, Ah-ra Cho, Dae-Hyun Jang, Hosung Nam

    Abstract: This study presents a model of automatic speech recognition (ASR) designed to diagnose pronunciation issues in children with speech sound disorders (SSDs) to replace manual transcriptions in clinical procedures. Since ASR models trained for general purposes primarily predict input speech into real words, employing a well-known high-performance ASR model for evaluating pronunciation in children wit…

    Submitted 12 March, 2024; originally announced March 2024.

    Comments: 12 pages, 2 figures

    ACM Class: I.2.7

  5. arXiv:2402.00977  [pdf, other]

    cs.CV eess.IV

    Enhanced fringe-to-phase framework using deep learning

    Authors: Won-Hoe Kim, Bongjoong Kim, Hyung-Gun Chi, Jae-Sang Hyun

    Abstract: In Fringe Projection Profilometry (FPP), achieving robust and accurate 3D reconstruction with a limited number of fringe patterns remains a challenge in structured light 3D imaging. Conventional methods require a set of fringe images, but using only one or two patterns complicates phase recovery and unwrapping. In this study, we introduce SFNet, a symmetric fusion network that transforms two fring…

    Submitted 1 February, 2024; originally announced February 2024.

    Comments: 35 pages, 13 figures, 6 tables

  6. arXiv:2401.18006  [pdf, other]

    q-bio.QM cs.LG eess.SP

    EEG-GPT: Exploring Capabilities of Large Language Models for EEG Classification and Interpretation

    Authors: Jonathan W. Kim, Ahmed Alaa, Danilo Bernardo

    Abstract: In conventional machine learning (ML) approaches applied to electroencephalography (EEG), there is often a limited focus, isolating specific brain activities occurring across disparate temporal scales (from transient spikes in milliseconds to seizures lasting minutes) and spatial scales (from localized high-frequency oscillations to global sleep activity). This siloed approach limits the developmen…

    Submitted 3 February, 2024; v1 submitted 31 January, 2024; originally announced January 2024.

  7. Machine learning for industrial sensing and control: A survey and practical perspective

    Authors: Nathan P. Lawrence, Seshu Kumar Damarla, Jong Woo Kim, Aditya Tulsyan, Faraz Amjad, Kai Wang, Benoit Chachuat, Jong Min Lee, Biao Huang, R. Bhushan Gopaluni

    Abstract: With the rise of deep learning, there has been renewed interest within the process industries to utilize data on large-scale nonlinear sensing and control problems. We identify key statistical and machine learning techniques that have seen practical success in the process industries. To do so, we start with hybrid modeling to provide a methodological framework underlying core application areas: so…

    Submitted 24 January, 2024; originally announced January 2024.

    Comments: 48 pages

    Journal ref: Control Engineering Practice 2024

  8. arXiv:2312.13313  [pdf, other]

    eess.IV cs.CV

    ParamISP: Learned Forward and Inverse ISPs using Camera Parameters

    Authors: Woohyeok Kim, Geonu Kim, Junyong Lee, Seungyong Lee, Seung-Hwan Baek, Sunghyun Cho

    Abstract: RAW images are rarely shared, mainly due to their excessive data size compared to their sRGB counterparts obtained by camera ISPs. Learning the forward and inverse processes of camera ISPs has been recently demonstrated, enabling physically-meaningful RAW-level image processing on input sRGB images. However, existing learning-based ISP methods fail to handle the large variations in the ISP processes…

    Submitted 14 April, 2024; v1 submitted 20 December, 2023; originally announced December 2023.

  9. arXiv:2311.04468  [pdf]

    eess.IV q-bio.NC

    A human brain atlas of chi-separation for normative iron and myelin distributions

    Authors: Kyeongseon Min, Beomseok Sohn, Woo Jung Kim, Chae Jung Park, Soohwa Song, Dong Hoon Shin, Kyung Won Chang, Na-Young Shin, Minjun Kim, Hyeong-Geol Shin, Phil Hyu Lee, Jongho Lee

    Abstract: Iron and myelin are primary susceptibility sources in the human brain. These substances are essential for a healthy brain, and their abnormalities are often related to various neurological disorders. Recently, an advanced susceptibility mapping technique, which is referred to as chi-separation, has been proposed, successfully disentangling paramagnetic iron from diamagnetic myelin. This method opene…

    Submitted 2 April, 2024; v1 submitted 8 November, 2023; originally announced November 2023.

    Comments: 19 pages, 9 figures

  10. Deep Video Inpainting Guided by Audio-Visual Self-Supervision

    Authors: Kyuyeon Kim, Junsik Jung, Woo Jae Kim, Sung-Eui Yoon

    Abstract: Humans can easily imagine a scene from auditory information based on their prior knowledge of audio-visual events. In this paper, we mimic this innate human ability in deep learning models to improve the quality of video inpainting. To implement the prior knowledge, we first train the audio-visual network, which learns the correspondence between auditory and visual information. Then, the audio-vis…

    Submitted 11 October, 2023; originally announced October 2023.

    Comments: Accepted at ICASSP 2022

  11. arXiv:2309.11988  [pdf, ps, other]

    math.OC eess.SY

    Relaxed Conditions for Parameterized Linear Matrix Inequality in the Form of Nested Fuzzy Summations

    Authors: Do Wan Kim, Donghwan Lee

    Abstract: The aim of this study is to investigate less conservative conditions for parameterized linear matrix inequalities (PLMIs) that are formulated as nested fuzzy summations. Such PLMIs are commonly encountered in stability analysis and control design problems for Takagi-Sugeno (T-S) fuzzy systems. Utilizing the weighted inequality of arithmetic and geometric means (AM-GM inequality), we develop new, l…

    Submitted 18 December, 2023; v1 submitted 21 September, 2023; originally announced September 2023.

    Comments: This work has been submitted to IEEE Transactions on Systems, Man and Cybernetics: Systems for possible publication

  12. arXiv:2309.08208  [pdf, other]

    cs.SD cs.CR cs.LG eess.AS

    HM-Conformer: A Conformer-based audio deepfake detection system with hierarchical pooling and multi-level classification token aggregation methods

    Authors: Hyun-seo Shin, Jungwoo Heo, Ju-ho Kim, Chan-yeong Lim, Wonbin Kim, Ha-Jin Yu

    Abstract: Audio deepfake detection (ADD) is the task of detecting spoofing attacks generated by text-to-speech or voice conversion systems. Spoofing evidence, which helps to distinguish between spoofed and bona-fide utterances, might exist either locally or globally in the input features. To capture these, the Conformer, which consists of Transformers and CNN, possesses a suitable structure. However, since…

    Submitted 15 September, 2023; originally announced September 2023.

    Comments: Submitted to 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2024)

  13. arXiv:2309.06841  [pdf, ps, other]

    eess.SY cs.AI

    On the Local Quadratic Stability of T-S Fuzzy Systems in the Vicinity of the Origin

    Authors: Donghwan Lee, Do Wan Kim

    Abstract: The main goal of this paper is to introduce new local stability conditions for continuous-time Takagi-Sugeno (T-S) fuzzy systems. These stability conditions are based on linear matrix inequalities (LMIs) in combination with quadratic Lyapunov functions. Moreover, they integrate information on the membership functions at the origin and effectively leverage the linear structure of the underlying non…

    Submitted 13 September, 2023; v1 submitted 13 September, 2023; originally announced September 2023.

  14. arXiv:2308.07788  [pdf, ps, other]

    eess.AS

    GIST-AiTeR Speaker Diarization System for VoxCeleb Speaker Recognition Challenge (VoxSRC) 2023

    Authors: Dongkeon Park, Ji Won Kim, Kang Ryeol Kim, Do Hyun Lee, Hong Kook Kim

    Abstract: This report describes the submission system by the GIST-AiTeR team for the VoxCeleb Speaker Recognition Challenge 2023 (VoxSRC-23) Track 4. Our submission system focuses on implementing diverse speaker diarization (SD) techniques, including ResNet293 and MFA-Conformer with different combinations of segment and hop length. Then, those models are combined into an ensemble model. The ResNet293 and MF…

    Submitted 25 August, 2023; v1 submitted 15 August, 2023; originally announced August 2023.

    Comments: VoxSRC 2023 Track4

  15. arXiv:2307.16706  [pdf, ps, other]

    eess.SY cs.AI

    Continuous-Time Distributed Dynamic Programming for Networked Multi-Agent Markov Decision Processes

    Authors: Donghwan Lee, Han-Dong Lim, Do Wan Kim

    Abstract: The main goal of this paper is to investigate continuous-time distributed dynamic programming (DP) algorithms for networked multi-agent Markov decision problems (MAMDPs). In our study, we adopt a distributed multi-agent framework where individual agents have access only to their own rewards, lacking insights into the rewards of other agents. Moreover, each agent has the ability to share its parame…

    Submitted 13 June, 2024; v1 submitted 31 July, 2023; originally announced July 2023.

  16. arXiv:2307.10628  [pdf, other]

    eess.AS cs.SD

    PAS: Partial Additive Speech Data Augmentation Method for Noise Robust Speaker Verification

    Authors: Wonbin Kim, Hyun-seo Shin, Ju-ho Kim, Jungwoo Heo, Chan-yeong Lim, Ha-Jin Yu

    Abstract: Background noise reduces speech intelligibility and quality, making speaker verification (SV) in noisy environments a challenging task. To improve the noise robustness of SV systems, the additive noise data augmentation method has been commonly used. In this paper, we propose a new additive noise method, partial additive speech (PAS), which aims to train SV systems to be less affected by noisy environ…

    Submitted 20 July, 2023; originally announced July 2023.

    Comments: 5 pages, 2 figures, 1 table, accepted to CKAIA2023 as a conference paper

  17. arXiv:2306.14384  [pdf, other]

    cs.RO eess.SY

    Multitask Learning for Multiple Recognition Tasks: A Framework for Lower-limb Exoskeleton Robot Applications

    Authors: Joonhyun Kim, Seongmin Ha, Dongbin Shin, Seoyeon Ham, Jaepil Jang, Wansoo Kim

    Abstract: To control the lower-limb exoskeleton robot effectively, it is essential to accurately recognize user status and environmental conditions. Previous studies have typically addressed these recognition challenges through independent models for each task, resulting in an inefficient model development process. In this study, we propose a multitask learning approach that can address multiple recognition…

    Submitted 25 June, 2023; originally announced June 2023.

    Comments: Accepted for publication in the Proceedings of the 2023 IEEE International Conference on RO-MAN 2023 BUSAN, 7 pages

  18. arXiv:2306.13020  [pdf]

    eess.IV cs.AI cs.CV

    Toward Automated Detection of Microbleeds with Anatomical Scale Localization: A Complete Clinical Diagnosis Support Using Deep Learning

    Authors: Jun-Ho Kim, Young Noh, Haejoon Lee, Seul Lee, Woo-Ram Kim, Koung Mi Kang, Eung Yeop Kim, Mohammed A. Al-masni, Dong-Hyun Kim

    Abstract: Cerebral Microbleeds (CMBs) are chronic deposits of small blood products in the brain tissues, which have an explicit relation to various cerebrovascular diseases depending on their anatomical location, including cognitive decline, intracerebral hemorrhage, and cerebral infarction. However, manual detection of CMBs is a time-consuming and error-prone process because of their sparse and tiny structura…

    Submitted 22 June, 2023; originally announced June 2023.

    Comments: 16 pages, 10 figures, 3 tables

  19. arXiv:2306.06461  [pdf]

    eess.AS cs.SD

    Semi-supervised Learning-based Sound Event Detection using Frequency Dynamic Convolution with Large Kernel Attention for DCASE Challenge 2023 Task 4

    Authors: Ji Won Kim, Sang Won Son, Yoonah Song, Hong Kook Kim, Il Hoon Song, Jeong Eun Lim

    Abstract: This report proposes a frequency dynamic convolution (FDY) with a large kernel attention (LKA)-convolutional recurrent neural network (CRNN) with a pre-trained bidirectional encoder representation from audio transformers (BEATs) embedding-based sound event detection (SED) model that employs a mean-teacher and pseudo-label approach to address the challenge of limited labeled data for DCASE 2023 Tas…

    Submitted 10 June, 2023; originally announced June 2023.

    Comments: DCASE 2023 Challenge Task 4A, 5 pages

  20. Low-Cost GNSS Simulators with Wireless Clock Synchronization for Indoor Positioning

    Authors: Woohyun Kim, Jiwon Seo

    Abstract: In regions where global navigation satellite systems (GNSS) signals are unavailable, such as underground areas and tunnels, GNSS simulators can be deployed for transmitting simulated GNSS signals. Then, a GNSS receiver in the simulator coverage outputs the position based on the received GNSS signals (e.g., Global Positioning System (GPS) L1 signals in this study) transmitted by the corresponding s…

    Submitted 1 June, 2023; originally announced June 2023.

    Comments: Submitted to IEEE Access

  21. arXiv:2304.06237  [pdf, other]

    cs.LG eess.SP

    Deep learning based ECG segmentation for delineation of diverse arrhythmias

    Authors: Chankyu Joung, Mijin Kim, Taejin Paik, Seong-Ho Kong, Seung-Young Oh, Won Kyeong Jeon, Jae-hu Jeon, Joong-Sik Hong, Wan-Joong Kim, Woong Kook, Myung-Jin Cha, Otto van Koert

    Abstract: Accurate delineation of key waveforms in an ECG is a critical initial step in extracting relevant features to support the diagnosis and treatment of heart conditions. Although deep learning based methods using a segmentation model to locate the P, QRS, and T waves have shown promising results, their ability to handle signals exhibiting arrhythmia remains unclear. This study builds on existing rese…

    Submitted 6 September, 2023; v1 submitted 12 April, 2023; originally announced April 2023.

  22. arXiv:2303.08670  [pdf, other]

    cs.CV cs.AI cs.SD eess.AS

    Deep Visual Forced Alignment: Learning to Align Transcription with Talking Face Video

    Authors: Minsu Kim, Chae Won Kim, Yong Man Ro

    Abstract: Forced alignment refers to a technology that time-aligns a given transcription with the corresponding speech. However, as forced alignment technologies have been developed using speech audio, they might fail in alignment when the input speech audio is noise-corrupted or is not accessible. We focus on the fact that there is another component from which the speech can be inferred, the speech video (i.e., talking…

    Submitted 26 February, 2023; originally announced March 2023.

    Comments: Accepted in AAAI2023

  23. arXiv:2302.12172  [pdf, other]

    eess.IV cs.CV cs.LG

    Vision-Language Generative Model for View-Specific Chest X-ray Generation

    Authors: Hyungyung Lee, Da Young Lee, Wonjae Kim, Jin-Hwa Kim, Tackeun Kim, Jihang Kim, Leonard Sunwoo, Edward Choi

    Abstract: Synthetic medical data generation has opened up new possibilities in the healthcare domain, offering a powerful tool for simulating clinical scenarios, enhancing diagnostic and treatment quality, gaining granular medical knowledge, and accelerating the development of unbiased algorithms. In this context, we present a novel approach called ViewXGen, designed to overcome the limitations of existing…

    Submitted 29 April, 2024; v1 submitted 23 February, 2023; originally announced February 2023.

    Comments: Accepted at CHIL 2024

  24. arXiv:2212.04356  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    Robust Speech Recognition via Large-Scale Weak Supervision

    Authors: Alec Radford, Jong Wook Kim, Tao Xu, Greg Brockman, Christine McLeavey, Ilya Sutskever

    Abstract: We study the capabilities of speech processing systems trained simply to predict large amounts of transcripts of audio on the internet. When scaled to 680,000 hours of multilingual and multitask supervision, the resulting models generalize well to standard benchmarks and are often competitive with prior fully supervised results but in a zero-shot transfer setting without the need for any fine-tuni…

    Submitted 6 December, 2022; originally announced December 2022.

  25. A Design Method of Distributed Algorithms via Discrete-time Blended Dynamics Theorem

    Authors: Jeong Woo Kim, Jin Gyu Lee, Donggil Lee, Hyungbo Shim

    Abstract: We develop a discrete-time version of the blended dynamics theorem for use in designing distributed computation algorithms. The blended dynamics theorem makes it possible to predict the behavior of heterogeneous multi-agent systems. Therefore, once we obtain the blended dynamics for a particular computational task, a design idea for the node dynamics of individual heterogeneous agents follows easily. In the cont…

    Submitted 11 October, 2022; originally announced October 2022.

    Journal ref: Automatica, vol. 159, pp. 111371, Jan 2024

  26. Modern Machine Learning Tools for Monitoring and Control of Industrial Processes: A Survey

    Authors: R. Bhushan Gopaluni, Aditya Tulsyan, Benoit Chachuat, Biao Huang, Jong Min Lee, Faraz Amjad, Seshu Kumar Damarla, Jong Woo Kim, Nathan P. Lawrence

    Abstract: Over the last ten years, we have seen a significant increase in industrial data, tremendous improvement in computational power, and major theoretical advances in machine learning. This opens up an opportunity to use modern machine learning tools on large-scale nonlinear monitoring and control problems. This article provides a survey of recent results with applications in the process industry.

    Submitted 22 September, 2022; originally announced September 2022.

    Comments: IFAC World Congress 2020

  27. arXiv:2209.10357  [pdf, other]

    eess.AS

    GIST-AiTeR System for the Diarization Task of the 2022 VoxCeleb Speaker Recognition Challenge

    Authors: Dongkeon Park, Yechan Yu, Kyeong Wan Park, Ji Won Kim, Hong Kook Kim

    Abstract: This report describes the submission system of the GIST-AiTeR team at the 2022 VoxCeleb Speaker Recognition Challenge (VoxSRC) Track 4. Our system mainly includes speech enhancement, voice activity detection, multi-scaled speaker embedding, probabilistic linear discriminant analysis-based speaker clustering, and overlapped speech detection models. We first construct four different diarization sys…

    Submitted 6 October, 2022; v1 submitted 21 September, 2022; originally announced September 2022.

    Comments: 2022 VoxSRC Track4

  28. arXiv:2209.01724  [pdf, ps, other]

    eess.SP

    Towards Deep Learning-aided Wireless Channel Estimation and Channel State Information Feedback for 6G

    Authors: Wonjun Kim, Yongjun Ahn, Jinhong Kim, Byonghyo Shim

    Abstract: Deep learning (DL), a branch of artificial intelligence (AI) techniques, has shown great promise in various disciplines such as image classification and segmentation, speech recognition, language translation, among others. This remarkable success of DL has stimulated increasing interest in applying this paradigm to wireless channel estimation in recent years. Since DL principles are inductive in n…

    Submitted 4 September, 2022; originally announced September 2022.

  29. arXiv:2208.12544  [pdf]

    cs.LG eess.SP physics.flu-dyn

    Deep learning-based denoising for fast time-resolved flame emission spectroscopy in high-pressure combustion environment

    Authors: Taekeun Yoon, Seon Woong Kim, Hosung Byun, Younsik Kim, Campbell D. Carter, Hyungrok Do

    Abstract: A deep learning strategy is developed for fast and accurate gas property measurements using flame emission spectroscopy (FES). Particularly, the short-gated fast FES is essential to resolve fast-evolving combustion behaviors. However, as the exposure time for capturing the flame emission spectrum gets shorter, the signal-to-noise ratio (SNR) decreases, and characteristic spectral features indicati…

    Submitted 26 December, 2022; v1 submitted 29 July, 2022; originally announced August 2022.

    Comments: 25 pages, 12 figures, accepted to Combustion and Flame

    Report number: Combustion and Flame 248 (2023) 112583

  30. arXiv:2207.06330  [pdf, other]

    eess.IV cs.CV

    Left Ventricle Contouring of Apical Three-Chamber Views on 2D Echocardiography

    Authors: Alberto Gomez, Mihaela Porumb, Angela Mumith, Thierry Judge, Shan Gao, Woo-Jin Cho Kim, Jorge Oliveira, Agis Chartsias

    Abstract: We propose a new method to automatically contour the left ventricle on 2D echocardiographic images. Unlike most existing segmentation methods, which are based on predicting segmentation masks, we focus on predicting the endocardial contour and the key landmark points within this contour (basal points and apex). This provides a representation that is closer to how experts perform manual annotations…

    Submitted 13 July, 2022; originally announced July 2022.

    Comments: Submitted to MICCAI-ASMUS 2022

  31. arXiv:2206.06541  [pdf, other]

    eess.IV cs.CV cs.MM

    Pixel-by-pixel Mean Opinion Score (pMOS) for No-Reference Image Quality Assessment

    Authors: Wook-Hyung Kim, Cheul-hee Hahm, Anant Baijal, Namuk Kim, Ilhyun Cho, Jayoon Koo

    Abstract: Deep-learning based techniques have contributed to the remarkable progress in the field of automatic image quality assessment (IQA). Existing IQA methods are designed to measure the quality of an image in terms of Mean Opinion Score (MOS) at the image-level (i.e. the whole image) or at the patch-level (dividing the image into multiple units and measuring quality of each patch). Some applications m…

    Submitted 13 June, 2022; originally announced June 2022.

  32. arXiv:2206.02222  [pdf, other]

    math.OC cs.GT cs.MA eess.SY

    How does a Rational Agent Act in an Epidemic?

    Authors: S. Yagiz Olmez, Shubham Aggarwal, Jin Won Kim, Erik Miehling, Tamer Başar, Matthew West, Prashant G. Mehta

    Abstract: Evolution of disease in a large population is a function of the top-down policy measures from a centralized planner, as well as the self-interested decisions (to be socially active) of individual agents in a large heterogeneous population. This paper is concerned with understanding the latter based on a mean-field type optimal control model. Specifically, the model is used to investigate the role…

    Submitted 5 June, 2022; originally announced June 2022.

    Comments: arXiv admin note: text overlap with arXiv:2111.10422

  33. arXiv:2204.10479  [pdf, ps, other]

    cs.LG eess.SY

    Finite-Time Analysis of Temporal Difference Learning: Discrete-Time Linear System Perspective

    Authors: Donghwan Lee, Do Wan Kim

    Abstract: TD-learning is a fundamental algorithm in the field of reinforcement learning (RL) that is employed to evaluate a given policy by estimating the corresponding value function for a Markov decision process. While significant progress has been made in the theoretical analysis of TD-learning, recent research has uncovered guarantees concerning its statistical efficiency by developing finite-time erro…

    Submitted 2 June, 2023; v1 submitted 21 April, 2022; originally announced April 2022.

    Comments: arXiv admin note: text overlap with arXiv:2112.14417

  34. arXiv:2203.12053  [pdf, other]

    eess.AS cs.SD

    Upmixing via style transfer: a variational autoencoder for disentangling spatial images and musical content

    Authors: Haici Yang, Sanna Wager, Spencer Russell, Mike Luo, Minje Kim, Wontak Kim

    Abstract: In the stereo-to-multichannel upmixing problem for music, one of the main tasks is to set the directionality of the instrument sources in the multichannel rendering results. In this paper, we propose a modified variational autoencoder model that learns a latent space to describe the spatial images in multichannel music. We seek to disentangle the spatial images and music content, so the learned la…

    Submitted 22 March, 2022; originally announced March 2022.

  35. arXiv:2203.07211  [pdf, other]

    q-bio.QM eess.SY

    Model predictive control and moving horizon estimation for adaptive optimal bolus feeding in high-throughput cultivation of E. coli

    Authors: Jong Woo Kim, Niels Krausch, Judit Aizpuru, Tilman Barz, Sergio Lucia, Peter Neubauer, Mariano Nicolas Cruz Bournazou

    Abstract: We discuss the application of a nonlinear model predictive control (MPC) and a moving horizon estimation (MHE) to achieve an optimal operation of E. coli fed-batch cultivations with intermittent bolus feeding. 24 parallel experiments were considered in a high-throughput microbioreactor platform at a 10 mL scale. The robotic island in question can run up to 48 fed-batch processes in parall…

    Submitted 6 February, 2023; v1 submitted 14 March, 2022; originally announced March 2022.

  36. arXiv:2112.14417   

    cs.AI cs.LG eess.SY

    Control Theoretic Analysis of Temporal Difference Learning

    Authors: Donghwan Lee, Do Wan Kim

    Abstract: The goal of this manuscript is to conduct a control-theoretic analysis of Temporal Difference (TD) learning algorithms. TD-learning serves as a cornerstone in the realm of reinforcement learning, offering a methodology for approximating the value function associated with a given policy in a Markov Decision Process. Despite several existing works that have contributed to the theoretical understandin…

    Submitted 8 September, 2023; v1 submitted 29 December, 2021; originally announced December 2021.

    Comments: The contents of this paper have some overlaps with some other arxiv paper we have submitted. Therefore, this paper is redundant in my opinion

  37. arXiv:2112.13283  [pdf, other]

    q-bio.QM eess.SY

    Fitting nonlinear models to continuous oxygen data with oscillatory signal variations via a loss based on Dynamic Time Warping

    Authors: Judit Aizpuru, Annina Karolin Kemmer, Jong Woo Kim, Stefan Born, Peter Neubauer, Mariano N. Cruz Bournazou, Tilman Barz

    Abstract: High throughput experimental systems play an important role in bioprocess development, as they provide an efficient way of analysing different experimental conditions and performing strain discrimination in the phases prior to industrial scale production. In the millilitre scale, these systems are combinations of parallel mini-bioreactors, liquid handling robots and automated workflows for data ha…

    Submitted 25 December, 2021; originally announced December 2021.

  38. arXiv:2110.14513  [pdf, other]

    cs.SD cs.AI eess.AS

    Neural Analysis and Synthesis: Reconstructing Speech from Self-Supervised Representations

    Authors: Hyeong-Seok Choi, Juheon Lee, Wansoo Kim, Jie Hwan Lee, Hoon Heo, Kyogu Lee

    Abstract: We present a neural analysis and synthesis (NANSY) framework that can manipulate voice, pitch, and speed of an arbitrary speech signal. Most of the previous works have focused on using information bottleneck to disentangle analysis features for controllable synthesis, which usually results in poor reconstruction quality. We address this issue by proposing a novel training strategy based on informa…

    Submitted 28 October, 2021; v1 submitted 27 October, 2021; originally announced October 2021.

    Comments: Neural Information Processing Systems (NeurIPS) 2021

  39. arXiv:2109.09088  [pdf, ps, other]

    eess.SY

    Relaxed Conditions for Parameterized Linear Matrix Inequality in the Form of Double Sum

    Authors: Do Wan Kim, Dong Hwan Lee

    Abstract: The aim of this study is to investigate less conservative conditions for a parameterized linear matrix inequality (PLMI) expressed in the form of a double convex sum. This type of PLMI frequently appears in T-S fuzzy control system analysis and design problems. In this letter, we derive new, less conservative linear matrix inequalities (LMIs) for the PLMI by employing the proposed sum relaxation m…

    Submitted 13 July, 2023; v1 submitted 19 September, 2021; originally announced September 2021.

  40. First Demonstration of the Korean eLoran Accuracy in a Narrow Waterway Using Improved ASF Maps

    Authors: Woohyun Kim, Pyo-Woong Son, Sul Gee Park, Sang Hyun Park, Jiwon Seo

    Abstract: The vulnerabilities of global navigation satellite systems (GNSSs) to radio frequency jamming and spoofing have attracted significant research attention. In particular, the large-scale jamming incidents that occurred in South Korea substantiate the practical importance of implementing a complementary navigation system. This letter briefly summarizes the efforts of South Korea to deploy an enhanced…

    Submitted 28 September, 2021; v1 submitted 18 September, 2021; originally announced September 2021.

    Comments: Submitted to IEEE Transactions on Aerospace and Electronic Systems

  41. arXiv:2107.05009  [pdf, other]

    cs.SD eess.AS

    PocketVAE: A Two-step Model for Groove Generation and Control

    Authors: Kyungyun Lee, Wonil Kim, Juhan Nam

    Abstract: Creating a good drum track to imitate a skilled performer in digital audio workstations (DAWs) can be a time-consuming process, especially for those unfamiliar with drums. In this work, we introduce PocketVAE, a groove generation system that applies grooves to users' rudimentary MIDI tracks, i.e., templates. Grooves can be either transferred from a reference track, generated randomly or with condit…

    Submitted 11 July, 2021; originally announced July 2021.

  42. arXiv:2106.02391  [pdf, ps, other]

    math.OC eess.SY

    Data-Driven Control Design with LMIs and Dynamic Programming

    Authors: Donghwan Lee, Do Wan Kim

    Abstract: The goal of this paper is to develop data-driven control design and evaluation strategies based on linear matrix inequalities (LMIs) and dynamic programming. We consider deterministic discrete-time LTI systems, where the system model is unknown. We propose efficient data collection schemes from the state-input trajectories together with data-driven LMIs to design state-feedback controllers for sta…

    Submitted 16 June, 2021; v1 submitted 4 June, 2021; originally announced June 2021.

  43. arXiv:2105.14760  [pdf, ps, other]

    math.OC eess.SY

    Multi-Objective LQG Design with Primal-Dual Method

    Authors: Donghwan Lee, Do Wan Kim

    Abstract: The goal of this paper is to study a multi-objective linear quadratic Gaussian (LQG) control problem. In particular, we consider an optimal control problem minimizing a quadratic cost over a finite time horizon for linear stochastic systems subject to control energy constraints. To solve the problem, we suggest an efficient bisection line search algorithm which is computationally efficient compare…

    Submitted 31 May, 2021; originally announced May 2021.

  44. arXiv:2012.02753  [pdf, other]

    eess.SY

    Model-plant mismatch learning offset-free model predictive control

    Authors: Sang Hwan Son, Jong Woo Kim, Tae Hoon Oh, Jong Min Lee

    Abstract: We propose model-plant mismatch learning offset-free model predictive control (MPC), which learns and applies the intrinsic model-plant mismatch, to effectively exploit the advantages of model-based and data-driven control strategies and overcome the limitations of each approach. In this study, the model-plant mismatch map on steady-state manifold in the controlled variable space is approximated v…

    Submitted 13 December, 2020; v1 submitted 4 December, 2020; originally announced December 2020.

  45. Effect of Outlier Removal from Temporal ASF Corrections on Multichain Loran Positioning Accuracy

    Authors: Jongmin Park, Pyo-Woong Son, Woohyun Kim, Joon Hyo Rhee, Jiwon Seo

    Abstract: The widely used global navigation satellite systems (GNSSs) are vulnerable to radio frequency interference (RFI). Long-range navigation (Loran), a terrestrial navigation system, can compensate for this weakness; however, it suffers from low positioning accuracy, and studies are under way to improve its positioning performance. One such study has proposed the multichain Loran positioning method tha…

    Submitted 24 September, 2020; originally announced September 2020.

    Comments: Submitted to ICCAS 2020

    Journal ref: 2020 20th International Conference on Control, Automation and Systems (ICCAS)

  46. Effects of Initial Attitude Estimation Errors on Loosely Coupled Smartphone GPS/IMU Integration System

    Authors: Kwansik Park, Woohyun Kim, Jiwon Seo

    Abstract: Global Positioning System (GPS) and inertial measurement unit (IMU) sensors are commonly integrated using the extended Kalman filter (EKF), for achieving better navigation performance. However, because of nonlinearity, the performance of the EKF is affected by the initial state estimation errors, and the navigation solutions, including the attitude, diverge rapidly as the initial errors increase.…

    Submitted 24 September, 2020; originally announced September 2020.

    Comments: Submitted to ICCAS 2020

    Journal ref: 2020 20th International Conference on Control, Automation and Systems (ICCAS)

  47. Development of Record and Management Software for GPS/Loran Measurements

    Authors: Woohyun Kim, Pyo-Woong Son, Joon Hyo Rhee, Jiwon Seo

    Abstract: In this paper, a software implementation that records Global Positioning System (GPS) and long-range navigation (Loran) measurement data output from an integrated GPS/Loran receiver and organizes them based on time is proposed. The purpose of the developed software is to collect measurements from multiple Loran transmitter chains for performance analysis of navigation methods using Loran, and to o…

    Submitted 24 September, 2020; originally announced September 2020.

    Comments: Submitted to ICCAS 2020

    Journal ref: 2020 20th International Conference on Control, Automation and Systems (ICCAS)

  48. arXiv:2009.11587  [pdf, other]

    eess.IV cs.LG

    Transfer Learning by Cascaded Network to identify and classify lung nodules for cancer detection

    Authors: Shah B. Shrey, Lukman Hakim, Muthusubash Kavitha, Hae Won Kim, Takio Kurita

    Abstract: Lung cancer is one of the most deadly diseases in the world. Detecting such tumors at an early stage can be a tedious task. Existing deep learning architecture for lung nodule identification used complex architecture with large number of parameters. This study developed a cascaded architecture which can accurately segment and classify the benign or malignant lung nodules on computed tomography (CT…

    Submitted 24 September, 2020; originally announced September 2020.

  49. arXiv:2008.05060  [pdf, other]

    cs.CV cs.LG eess.SP stat.ML

    Online Graph Completion: Multivariate Signal Recovery in Computer Vision

    Authors: Won Hwa Kim, Mona Jalal, Seongjae Hwang, Sterling C. Johnson, Vikas Singh

    Abstract: The adoption of "human-in-the-loop" paradigms in computer vision and machine learning is leading to various applications where the actual data acquisition (e.g., human supervision) and the underlying inference algorithms are closely intertwined. While classical work in active learning provides effective solutions when the learning module involves classification and regression tasks, many practical…

    Submitted 11 August, 2020; originally announced August 2020.

    Comments: 9 pages, 7 figures, CVPR 2017 Conference

  50. arXiv:2005.04117  [pdf, other]

    cs.CV eess.IV

    NTIRE 2020 Challenge on Real Image Denoising: Dataset, Methods and Results

    Authors: Abdelrahman Abdelhamed, Mahmoud Afifi, Radu Timofte, Michael S. Brown, Yue Cao, Zhilu Zhang, Wangmeng Zuo, Xiaoling Zhang, Jiye Liu, Wendong Chen, Changyuan Wen, Meng Liu, Shuailin Lv, Yunchao Zhang, Zhihong Pan, Baopu Li, Teng Xi, Yanwen Fan, Xiyu Yu, Gang Zhang, Jingtuo Liu, Junyu Han, Errui Ding, Songhyun Yu, Bumjun Park, et al. (65 additional authors not shown)

    Abstract: This paper reviews the NTIRE 2020 challenge on real image denoising with focus on the newly introduced dataset, the proposed methods and their results. The challenge is a new version of the previous NTIRE 2019 challenge on real image denoising that was based on the SIDD benchmark. This challenge is based on a newly collected validation and testing image datasets, and hence, named SIDD+. This chall…

    Submitted 8 May, 2020; originally announced May 2020.