Showing 1–50 of 304 results for author: Kim, H

  1. arXiv:2406.16886  [pdf, other]

    eess.SP cs.CV cs.LG

    Sensor Data Augmentation from Skeleton Pose Sequences for Improving Human Activity Recognition

    Authors: Parham Zolfaghari, Vitor Fortes Rey, Lala Ray, Hyun Kim, Sungho Suh, Paul Lukowicz

    Abstract: The proliferation of deep learning has significantly advanced various fields, yet Human Activity Recognition (HAR) has not fully capitalized on these developments, primarily due to the scarcity of labeled datasets. Despite the integration of advanced Inertial Measurement Units (IMUs) in ubiquitous wearable devices like smartwatches and fitness trackers, which offer self-labeled activity data from…

    Submitted 25 April, 2024; originally announced June 2024.

    Comments: Accepted in IEEE 6th International Conference on Activity and Behavior Computing (ABC 2024)

  2. arXiv:2406.16716  [pdf, other]

    eess.AS cs.CR cs.SD

    One-Class Learning with Adaptive Centroid Shift for Audio Deepfake Detection

    Authors: Hyun Myung Kim, Kangwook Jang, Hoirin Kim

    Abstract: As speech synthesis systems continue to make remarkable advances in recent years, the importance of robust deepfake detection systems that perform well in unseen systems has grown. In this paper, we propose a novel adaptive centroid shift (ACS) method that updates the centroid representation by continually shifting as the weighted average of bonafide representations. Our approach uses only bonafid…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: Accepted by Interspeech 2024

  3. arXiv:2406.13935  [pdf, other]

    eess.AS cs.AI cs.SD

    CONMOD: Controllable Neural Frame-based Modulation Effects

    Authors: Gyubin Lee, Hounsu Kim, Junwon Lee, Juhan Nam

    Abstract: Deep learning models have seen widespread use in modelling LFO-driven audio effects, such as phaser and flanger. Although existing neural architectures exhibit high-quality emulation of individual effects, they do not possess the capability to manipulate the output via control parameters. To address this issue, we introduce Controllable Neural Frame-based Modulation Effects (CONMOD), a single blac…

    Submitted 19 June, 2024; originally announced June 2024.

  4. arXiv:2406.12721  [pdf]

    eess.AS cs.SD

    Sound event detection based on auxiliary decoder and maximum probability aggregation for DCASE Challenge 2024 Task 4

    Authors: Sang Won Son, Jongyeon Park, Hong Kook Kim, Sulaiman Vesal, Jeong Eun Lim

    Abstract: In this report, we propose three novel methods for developing a sound event detection (SED) model for the DCASE 2024 Challenge Task 4. First, we propose an auxiliary decoder attached to the final convolutional block to improve feature extraction capabilities while reducing dependency on embeddings from pre-trained large models. The proposed auxiliary decoder operates independently from the main de…

    Submitted 24 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: DCASE 2024 challenge Task4, 4 pages

  5. arXiv:2406.11248  [pdf]

    eess.AS cs.AI cs.SD

    Performance Improvement of Language-Queried Audio Source Separation Based on Caption Augmentation From Large Language Models for DCASE Challenge 2024 Task 9

    Authors: Do Hyun Lee, Yoonah Song, Hong Kook Kim

    Abstract: We present a prompt-engineering-based text-augmentation approach applied to a language-queried audio source separation (LASS) task. To enhance the performance of LASS, the proposed approach utilizes large language models (LLMs) to generate multiple captions corresponding to each sentence of the training dataset. To this end, we first perform experiments to identify the most effective prompts for c…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: DCASE 2024 Challenge Task 9, 4 pages

  6. arXiv:2406.10549  [pdf, other]

    eess.AS cs.CL cs.SD

    Lightweight Audio Segmentation for Long-form Speech Translation

    Authors: Jaesong Lee, Soyoon Kim, Hanbyul Kim, Joon Son Chung

    Abstract: Speech segmentation is an essential part of speech translation (ST) systems in real-world scenarios. Since most ST models are designed to process speech segments, long-form audio must be partitioned into shorter segments before translation. Recently, data-driven approaches for the speech segmentation task have been developed. Although the approaches improve overall translation quality, a performan…

    Submitted 15 June, 2024; originally announced June 2024.

    Comments: Accepted to Interspeech 2024

  7. arXiv:2406.07909  [pdf, other]

    eess.AS cs.CL cs.SD stat.ML

    Guiding Frame-Level CTC Alignments Using Self-knowledge Distillation

    Authors: Eungbeom Kim, Hantae Kim, Kyogu Lee

    Abstract: Transformer encoder with connectionist temporal classification (CTC) framework is widely used for automatic speech recognition (ASR). However, knowledge distillation (KD) for ASR displays a problem of disagreement between teacher-student models in frame-level alignment which ultimately hinders it from improving the student model's performance. In order to resolve this problem, this paper introduce…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: Accepted by Interspeech 2024

  8. arXiv:2406.02479  [pdf]

    cs.LG eess.SP eess.SY

    Applying Fine-Tuned LLMs for Reducing Data Needs in Load Profile Analysis

    Authors: Yi Hu, Hyeon** Kim, Kai Ye, Ning Lu

    Abstract: This paper presents a novel method for utilizing fine-tuned Large Language Models (LLMs) to minimize data requirements in load profile analysis, demonstrated through the restoration of missing data in power system load profiles. A two-stage fine-tuning strategy is proposed to adapt a pre-trained LLM, i.e., GPT-3.5, for missing data restoration tasks. Through empirical evaluation, we demonstrate t…

    Submitted 2 June, 2024; originally announced June 2024.

  9. arXiv:2405.03684  [pdf, other]

    eess.IV

    All-in-One Deep Learning Framework for MR Image Reconstruction

    Authors: Geunu Jeong, Hyeonsoo Kim, Joonyoung Yang, Kyungeun Jang, Jeewook Kim

    Abstract: We introduce a novel, all-in-one deep learning framework for MR image reconstruction, enabling a single model to enhance image quality across multiple aspects of k-space sampling and to be effective across a wide range of clinical and technical scenarios. This DICOM-based algorithm serves as the core of SwiftMR (AIRS Medical, Seoul, Korea), which is FDA-cleared, CE-certified, and commercially avai…

    Submitted 26 May, 2024; v1 submitted 6 May, 2024; originally announced May 2024.

    Comments: 22 pages, 9 figures; number of collected MR raw data corrected

  10. arXiv:2404.19075  [pdf, other]

    eess.IV cs.AI cs.CV cs.LG math.NA

    Distributed Stochastic Optimization of a Neural Representation Network for Time-Space Tomography Reconstruction

    Authors: K. Aditya Mohan, Massimiliano Ferrucci, Chuck Divin, Garrett A. Stevenson, Hyo** Kim

    Abstract: 4D time-space reconstruction of dynamic events or deforming objects using X-ray computed tomography (CT) is an extremely ill-posed inverse problem. Existing approaches assume that the object remains static for the duration of several tens or hundreds of X-ray projection measurement images (reconstruction of consecutive limited-angle CT scans). However, this is an unrealistic assumption for many in…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: submitted to Nature Machine Intelligence

  11. arXiv:2404.17736  [pdf, other]

    eess.SP cs.CV cs.IT eess.IV

    Diffusion-Aided Joint Source Channel Coding For High Realism Wireless Image Transmission

    Authors: Mingyu Yang, Bowen Liu, Boyang Wang, Hun-Seok Kim

    Abstract: Deep learning-based joint source-channel coding (deep JSCC) has been demonstrated as an effective approach for wireless image transmission. Nevertheless, current research has concentrated on minimizing a standard distortion metric such as Mean Squared Error (MSE), which does not necessarily improve the perceptual quality. To address this issue, we propose DiffJSCC, a novel framework that leverages…

    Submitted 26 April, 2024; originally announced April 2024.

  12. arXiv:2404.17585  [pdf, other]

    cs.HC cs.AI cs.LG eess.SP

    NeuroNet: A Novel Hybrid Self-Supervised Learning Framework for Sleep Stage Classification Using Single-Channel EEG

    Authors: Cheol-Hui Lee, Hakseung Kim, Hyun-jee Han, Min-Kyung Jung, Byung C. Yoon, Dong-Joo Kim

    Abstract: The classification of sleep stages is a pivotal aspect of diagnosing sleep disorders and evaluating sleep quality. However, the conventional manual scoring process, conducted by clinicians, is time-consuming and prone to human bias. Recent advancements in deep learning have substantially propelled the automation of sleep stage classification. Nevertheless, challenges persist, including the need fo…

    Submitted 13 May, 2024; v1 submitted 10 April, 2024; originally announced April 2024.

    Comments: 14 pages, 4 figures

  13. arXiv:2404.08175  [pdf, ps, other]

    eess.SY

    A Novel Vision Transformer based Load Profile Analysis using Load Images as Inputs

    Authors: Hyeon** Kim, Yi Hu, Kai Ye, Ning Lu

    Abstract: This paper introduces ViT4LPA, an innovative Vision Transformer (ViT) based approach for Load Profile Analysis (LPA). We transform time-series load profiles into load images. This allows us to leverage the ViT architecture, originally designed for image processing, as a pre-trained image encoder to uncover latent patterns within load data. ViT is pre-trained using an extensive load image dataset,…

    Submitted 11 April, 2024; originally announced April 2024.

  14. arXiv:2404.07021  [pdf, other]

    eess.SP

    A 4x32Gb/s 1.8pJ/bit Collaborative Baud-Rate CDR with Background Eye-Climbing Algorithm and Low-Power Global Clock Distribution

    Authors: Jihee Kim, Jia Park, Jiwon Shin, Hanseok Kim, Kahyun Kim, Haengbeom Shin, Ha-Jung Park, Woo-Seok Choi

    Abstract: This paper presents design techniques for an energy-efficient multi-lane receiver (RX) with baud-rate clock and data recovery (CDR), which is essential for high-throughput low-latency communication in high-performance computing systems. The proposed low-power global clock distribution not only significantly reduces power consumption across multi-lane RXs but is capable of compensating for the freq…

    Submitted 22 April, 2024; v1 submitted 10 April, 2024; originally announced April 2024.

  15. arXiv:2404.06452  [pdf, other]

    cs.RO eess.SY

    PAAM: A Framework for Coordinated and Priority-Driven Accelerator Management in ROS 2

    Authors: Daniel Enright, Yecheng Xiang, Hyunjong Choi, Hyoseung Kim

    Abstract: This paper proposes a Priority-driven Accelerator Access Management (PAAM) framework for multi-process robotic applications built on top of the Robot Operating System (ROS) 2 middleware platform. The framework addresses the issue of predictable execution of time- and safety-critical callback chains that require hardware accelerators such as GPUs and TPUs. PAAM provides a standalone ROS executor th…

    Submitted 9 April, 2024; originally announced April 2024.

    Comments: 14 Pages, 14 Figures

  16. arXiv:2404.05119  [pdf, other]

    eess.SP

    A 0.65-pJ/bit 3.6-TB/s/mm I/O Interface with XTalk Minimizing Affine Signaling for Next-Generation HBM with High Interconnect Density

    Authors: Hyunjun Park, Jiwon Shin, Hanseok Kim, Jihee Kim, Haengbeom Shin, Taehoon Kim, Jung-Hun Park, Woo-Seok Choi

    Abstract: This paper presents an I/O interface with Xtalk Minimizing Affine Signaling (XMAS), which is designed to support high-speed data transmission in die-to-die communication over silicon interposers or similar high-density interconnects susceptible to crosstalk. The operating principles of XMAS are elucidated through rigorous analyses, and its advantages over existing signaling are validated through n…

    Submitted 7 April, 2024; originally announced April 2024.

  17. arXiv:2404.04096  [pdf, other]

    cs.IT eess.SP

    Machine Learning-Aided Cooperative Localization under Dense Urban Environment

    Authors: Hoon Lee, Hong Ki Kim, Seung Hyun Oh, Sang Hyun Lee

    Abstract: Future wireless network technology provides automobiles with the connectivity feature to consolidate the concept of vehicular networks that collaborate on conducting cooperative driving tasks. The full potential of connected vehicles, which promises road safety and quality driving experience, can be leveraged if machine learning models guarantee the robustness in performing core functions includin…

    Submitted 5 April, 2024; originally announced April 2024.

  18. arXiv:2404.02342  [pdf, other]

    cs.CL cs.SD eess.AS

    A Computational Analysis of Lyric Similarity Perception

    Authors: Haven Kim, Taketo Akama

    Abstract: In musical compositions that include vocals, lyrics significantly contribute to artistic expression. Consequently, previous studies have introduced the concept of a recommendation system that suggests lyrics similar to a user's favorites or personalized preferences, aiding in the discovery of lyrics among millions of tracks. However, many of these systems do not fully consider human perceptions of…

    Submitted 2 April, 2024; originally announced April 2024.

  19. arXiv:2404.01661  [pdf, other]

    cs.RO eess.SY

    Interaction-Aware Vehicle Motion Planning with Collision Avoidance Constraints in Highway Traffic

    Authors: Dongryul Kim, Hyeonjeong Kim, Kyoungseok Han

    Abstract: This paper proposes collision-free optimal trajectory planning for autonomous vehicles in highway traffic, where vehicles need to deal with the interaction among each other. To address this issue, a novel optimal control framework is suggested, which couples the trajectory of surrounding vehicles with collision avoidance constraints. Additionally, we describe a trajectory optimization technique un…

    Submitted 2 April, 2024; originally announced April 2024.

  20. arXiv:2403.19785  [pdf, other]

    cs.IT eess.SP

    Integrated Communication, Localization, and Sensing in 6G D-MIMO Networks

    Authors: Hao Guo, Henk Wymeersch, Behrooz Makki, Hui Chen, Yibo Wu, Giuseppe Durisi, Musa Furkan Keskin, Mohammad H. Moghaddam, Charitha Madapatha, Han Yu, Peter Hammarberg, Hyowon Kim, Tommy Svensson

    Abstract: Future generations of mobile networks call for concurrent sensing and communication functionalities in the same hardware and/or spectrum. Compared to communication, sensing services often suffer from limited coverage, due to the high path loss of the reflected signal and the increased infrastructure requirements. To provide a more uniform quality of service, distributed multiple input multiple out…

    Submitted 28 March, 2024; originally announced March 2024.

  21. arXiv:2403.12726  [pdf]

    eess.SP

    Small Distance Increment Method for Measuring Complex Permittivity With mmWave Radar

    Authors: Hang Song, Hyun Joon Kim, Mingxia Wan, Bo Wei, Takamaro Kikkawa, Jun-ichi Takada

    Abstract: Measuring the complex permittivity of material is essential in many scenarios such as quality check and component analysis. Generally, measurement methods for characterizing the material are based on the usage of vector network analyzer, which is large and not easy for on-site measurement, especially in high frequency range such as millimeter wave (mmWave). In addition, some measurement methods re…

    Submitted 19 March, 2024; originally announced March 2024.

  22. arXiv:2403.08187  [pdf, other]

    cs.CL cs.SD eess.AS

    Automatic Speech Recognition (ASR) for the Diagnosis of pronunciation of Speech Sound Disorders in Korean children

    Authors: Taekyung Ahn, Yeonjung Hong, Younggon Im, Do Hyung Kim, Dayoung Kang, Joo Won Jeong, Jae Won Kim, Min Jung Kim, Ah-ra Cho, Dae-Hyun Jang, Hosung Nam

    Abstract: This study presents a model of automatic speech recognition (ASR) designed to diagnose pronunciation issues in children with speech sound disorders (SSDs) to replace manual transcriptions in clinical procedures. Since ASR models trained for general purposes primarily predict input speech into real words, employing a well-known high-performance ASR model for evaluating pronunciation in children wit…

    Submitted 12 March, 2024; originally announced March 2024.

    Comments: 12 pages, 2 figures

    ACM Class: I.2.7

  23. arXiv:2403.06397  [pdf, other]

    cs.LG cs.AI eess.SY

    DeepSafeMPC: Deep Learning-Based Model Predictive Control for Safe Multi-Agent Reinforcement Learning

    Authors: Xuefeng Wang, Henglin Pu, Hyung Jun Kim, Husheng Li

    Abstract: Safe Multi-agent reinforcement learning (safe MARL) has increasingly gained attention in recent years, emphasizing the need for agents to not only optimize the global return but also adhere to safety requirements through behavioral constraints. Some recent work has integrated control theory with multi-agent reinforcement learning to address the challenge of ensuring safety. However, there have bee…

    Submitted 11 March, 2024; v1 submitted 10 March, 2024; originally announced March 2024.

    Comments: 8 pages, 5 figures

  24. arXiv:2403.05136  [pdf, other]

    cs.RO eess.SP

    DeRO: Dead Reckoning Based on Radar Odometry With Accelerometers Aided for Robot Localization

    Authors: Hoang Viet Do, Yong Hun Kim, Joo Han Lee, Min Ho Lee, ** Woo Song

    Abstract: In this paper, we propose a radar odometry structure that directly utilizes radar velocity measurements for dead reckoning while maintaining its ability to update estimations within the Kalman filter framework. Specifically, we employ the Doppler velocity obtained by a 4D Frequency Modulated Continuous Wave (FMCW) radar in conjunction with gyroscope data to calculate poses. This approach helps mit…

    Submitted 8 March, 2024; originally announced March 2024.

    Comments: 9 pages, 5 figures, 1 table, conference

    ACM Class: I.2.9

  25. arXiv:2402.05965  [pdf, other]

    cs.LG eess.SP

    Hybrid Neural Representations for Spherical Data

    Authors: Hyomin Kim, Yunhui Jang, Jaeho Lee, Sungsoo Ahn

    Abstract: In this paper, we study hybrid neural representations for spherical data, a domain of increasing relevance in scientific research. In particular, our work focuses on weather and climate data as well as cosmic microwave background (CMB) data. Although previous studies have delved into coordinate-based neural representations for spherical signals, they often fail to capture the intricate details of h…

    Submitted 5 February, 2024; originally announced February 2024.

    Comments: 13 pages, 8 figures

  26. arXiv:2402.05706  [pdf, other]

    cs.CL cs.SD eess.AS

    Unified Speech-Text Pretraining for Spoken Dialog Modeling

    Authors: Heeseung Kim, Soonshin Seo, Kyeongseok Jeong, Ohsung Kwon, Jungwhan Kim, Jaehong Lee, Eunwoo Song, Myungwoo Oh, Sungroh Yoon, Kang Min Yoo

    Abstract: While recent work shows promising results in expanding the capabilities of large language models (LLM) to directly understand and synthesize speech, an LLM-based strategy for modeling spoken dialogs remains elusive and calls for further investigation. This work proposes an extensive speech-text LLM framework, named the Unified Spoken Dialog Model (USDM), to generate coherent spoken responses with…

    Submitted 8 February, 2024; originally announced February 2024.

  27. arXiv:2402.01116  [pdf, other]

    cs.RO cs.LG eess.SY

    Scalable Multi-modal Model Predictive Control via Duality-based Interaction Predictions

    Authors: Hansung Kim, Siddharth H. Nair, Francesco Borrelli

    Abstract: We propose a hierarchical architecture designed for scalable real-time Model Predictive Control (MPC) in complex, multi-modal traffic scenarios. This architecture comprises two key components: 1) RAID-Net, a novel attention-based Recurrent Neural Network that predicts relevant interactions along the MPC prediction horizon between the autonomous vehicle and the surrounding vehicles using Lagrangian…

    Submitted 2 June, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

    Comments: Accepted at IEEE Intelligent Vehicles Symposium 2024

  28. arXiv:2401.13498  [pdf, other]

    cs.SD cs.AI cs.LG eess.AS eess.SP

    Expressive Acoustic Guitar Sound Synthesis with an Instrument-Specific Input Representation and Diffusion Outpainting

    Authors: Hounsu Kim, Soonbeom Choi, Juhan Nam

    Abstract: Synthesizing performing guitar sound is a highly challenging task due to the polyphony and high variability in expression. Recently, deep generative models have shown promising results in synthesizing expressive polyphonic instrument sounds from music scores, often using a generic MIDI input. In this work, we propose an expressive acoustic guitar sound synthesis model with a customized input repre…

    Submitted 24 January, 2024; originally announced January 2024.

    Comments: Accepted to ICASSP 2024

  29. arXiv:2401.10288  [pdf, other]

    cs.LG eess.SP

    CLAN: A Contrastive Learning based Novelty Detection Framework for Human Activity Recognition

    Authors: Hyunju Kim, Dongman Lee

    Abstract: In ambient assisted living, human activity recognition from time series sensor data mainly focuses on predefined activities, often overlooking new activity patterns. We propose CLAN, a two-tower contrastive learning-based novelty detection framework with diverse types of negative pairs for human activity recognition. It is tailored to challenges with human activity characteristics, including the s…

    Submitted 16 January, 2024; originally announced January 2024.

  30. arXiv:2401.08962  [pdf, other]

    cs.HC cs.LG cs.SD eess.AS

    DOO-RE: A dataset of ambient sensors in a meeting room for activity recognition

    Authors: Hyunju Kim, Geon Kim, Taehoon Lee, Kisoo Kim, Dongman Lee

    Abstract: With the advancement of IoT technology, recognizing user activities with machine learning methods is a promising way to provide various smart services to users. High-quality data with privacy protection is essential for deploying such services in the real world. Data streams from surrounding ambient sensors are well suited to the requirement. Existing ambient sensor datasets only support constrain…

    Submitted 16 January, 2024; originally announced January 2024.

  31. arXiv:2312.15914  [pdf, other]

    cs.NI eess.SP

    Improving One-Shot Transmission in NR Sidelink Resource Allocation for V2X Communication

    Authors: Hojeong Lee, Hyogon Kim

    Abstract: The Society of Automotive Engineers (SAE) has specified a wireless channel congestion control algorithm for cellular vehicle-to-everything (C-V2X) communication in J3161/1. A notable aspect of the J3161/1 standard is that it addresses persistent packet collisions between neighboring vehicles. Although the chances are slim, the persistent collisions can cause the so-called wireless blind spot once the…

    Submitted 26 December, 2023; originally announced December 2023.

  32. arXiv:2312.09456  [pdf, other]

    eess.SP cs.AI cs.LG

    Pioneering EEG Motor Imagery Classification Through Counterfactual Analysis

    Authors: Kang Yin, Hye-Bin Shin, Hee-Dong Kim, Seong-Whan Lee

    Abstract: The application of counterfactual explanation (CE) techniques in the realm of electroencephalography (EEG) classification has been relatively infrequent in contemporary research. In this study, we attempt to introduce and explore a novel non-generative approach to CE, specifically tailored for the analysis of EEG signals. This innovative approach assesses the model's decision-making process by str…

    Submitted 10 November, 2023; originally announced December 2023.

  33. arXiv:2312.09040  [pdf, other]

    cs.SD cs.CL eess.AS

    STaR: Distilling Speech Temporal Relation for Lightweight Speech Self-Supervised Learning Models

    Authors: Kangwook Jang, Sungnyun Kim, Hoirin Kim

    Abstract: Albeit great performance of Transformer-based speech self-supervised learning (SSL) models, their large parameter size and computational cost make them unfavorable to utilize. In this study, we propose to compress the speech SSL models by distilling speech temporal relation (STaR). Unlike previous works that directly match the representation for each speech frame, STaR distillation transfers tempor…

    Submitted 25 April, 2024; v1 submitted 14 December, 2023; originally announced December 2023.

    Comments: ICASSP 2024 Best Student Paper Awarded. Code URL: https://github.com/sungnyun/ARMHuBERT

  34. Synergistic Perception and Control Simplex for Verifiable Safe Vertical Landing

    Authors: Ayoosh Bansal, Yang Zhao, James Zhu, Sheng Cheng, Yuliang Gu, Hyung-** Yoon, Hunmin Kim, Naira Hovakimyan, Lui Sha

    Abstract: Perception, Planning, and Control form the essential components of autonomy in advanced air mobility. This work advances the holistic integration of these components to enhance the performance and robustness of the complete cyber-physical system. We adapt Perception Simplex, a system for verifiable collision avoidance amidst obstacle detection faults, to the vertical landing maneuver for autonomou…

    Submitted 5 December, 2023; originally announced December 2023.

    Comments: To appear in AIAA SciTech 2024

    ACM Class: C.3; C.4; J.7

    Journal ref: AIAA SCITECH 2024 Forum, p. 1167

  35. arXiv:2312.02753  [pdf, other]

    eess.IV cs.CV cs.LG stat.ML

    C3: High-performance and low-complexity neural compression from a single image or video

    Authors: Hyunjik Kim, Matthias Bauer, Lucas Theis, Jonathan Richard Schwarz, Emilien Dupont

    Abstract: Most neural compression models are trained on large datasets of images or videos in order to generalize to unseen data. Such generalization typically requires large and expressive architectures with a high decoding complexity. Here we introduce C3, a neural compression method with strong rate-distortion (RD) performance that instead overfits a small model to each image or video separately. The res…

    Submitted 5 December, 2023; originally announced December 2023.

  36. arXiv:2312.01285  [pdf, other]

    eess.SY

    A Literature Review on the Smart Wheelchair Systems

    Authors: Yane Kim, Bharath Velamala, Youngseo Choi, Yu** Kim, Hyunkin Kim, Nishad Kulkarni, Eung-Joo Lee

    Abstract: This study offers an in-depth analysis of smart wheelchair (SW) systems, charting their progression from early developments to future innovations. It delves into various Brain-Computer Interface (BCI) systems, including mu rhythm, event-related potential, and steady-state visual evoked potential. The paper addresses challenges in signal categorization, proposing the sparse Bayesian extreme learnin…

    Submitted 3 December, 2023; originally announced December 2023.

  37. arXiv:2311.17358  [pdf, other]

    eess.SY

    OpenSense: An Open-World Sensing Framework for Incremental Learning and Dynamic Sensor Scheduling on Embedded Edge Devices

    Authors: Abdulrahman Bukhari, Seyedmehdi Hosseinimotlagh, Hyoseung Kim

    Abstract: Recent advances in Internet-of-Things (IoT) technologies have sparked significant interest towards developing learning-based sensing applications on embedded edge devices. These efforts, however, are being challenged by the complexities of adapting to unforeseen conditions in an open-world environment, mainly due to the intensive computational and energy demands exceeding the capabilities of edge…

    Submitted 21 February, 2024; v1 submitted 29 November, 2023; originally announced November 2023.

  38. arXiv:2311.10306  [pdf, other]

    eess.IV cs.CV cs.LG

    MPSeg : Multi-Phase strategy for coronary artery Segmentation

    Authors: Jonghoe Ku, Yong-Hee Lee, Junsup Shin, In Kyu Lee, Hyun-Woo Kim

    Abstract: Accurate segmentation of coronary arteries is a pivotal process in assessing cardiovascular diseases. However, the intricate structure of the cardiovascular system presents significant challenges for automatic segmentation, especially when utilizing methodologies like the SYNTAX Score, which relies extensively on detailed structural information for precise risk stratification. To address these dif…

    Submitted 16 November, 2023; originally announced November 2023.

    Comments: MICCAI 2023 Conference ARCADE Challenge

  39. arXiv:2311.07227  [pdf, other]

    cs.OS eess.SY

    CARTOS: A Charging-Aware Real-Time Operating System for Intermittent Batteryless Devices

    Authors: Mohsen Karimi, Yidi Wang, Youngbin Kim, Yoo** Lim, Hyoseung Kim

    Abstract: This paper presents CARTOS, a charging-aware real-time operating system designed to enhance the functionality of intermittently-powered batteryless devices (IPDs) for various Internet of Things (IoT) applications. While IPDs offer significant advantages such as extended lifespan and operability in extreme environments, they pose unique challenges, including the need to ensure forward progress of p…

    Submitted 13 November, 2023; originally announced November 2023.

  40. arXiv:2311.05878  [pdf]

    cs.CV eess.IV

    Central Angle Optimization for 360-degree Holographic 3D Content

    Authors: Hakdong Kim, Minsung Yoon, Cheongwon Kim

    Abstract: In this study, we propose a method to find an optimal central angle in deep learning-based depth map estimation used to produce realistic holographic content. The acquisition of RGB-depth map images as detailed as possible must be performed to generate holograms of high quality, despite the high computational cost. Therefore, we introduce a novel pipeline designed to analyze various values of cent…

    Submitted 10 November, 2023; originally announced November 2023.

  41. arXiv:2310.19264  [pdf, other]

    cs.MM cs.SD eess.AS

    Sound of Story: Multi-modal Storytelling with Audio

    Authors: Jaeyeon Bae, Seokhoon Jeong, Seokun Kang, Namgi Han, Jae-Yon Lee, Hyounghun Kim, Taehwan Kim

    Abstract: Storytelling is multi-modal in the real world. When one tells a story, one may use all of the visualizations and sounds along with the story itself. However, prior studies on storytelling datasets and tasks have paid little attention to sound even though sound also conveys meaningful semantics of the story. Therefore, we propose to extend story understanding and telling areas by establishing a new…

    Submitted 30 October, 2023; originally announced October 2023.

    Comments: Findings of EMNLP 2023, project: https://github.com/Sosdatasets/SoS_Dataset/

  42. arXiv:2310.18882  [pdf, other]

    cs.LG cs.AI cs.CV eess.IV eess.SP

    Differentiable Learning of Generalized Structured Matrices for Efficient Deep Neural Networks

    Authors: Changwoo Lee, Hun-Seok Kim

    Abstract: This paper investigates efficient deep neural networks (DNNs) to replace dense unstructured weight matrices with structured ones that possess desired properties. The challenge arises because the optimal weight matrix structure in popular neural network models is obscure in most cases and may vary from layer to layer even in the same network. Prior structured matrices proposed for efficient DNNs we…

    Submitted 7 March, 2024; v1 submitted 28 October, 2023; originally announced October 2023.

  43. arXiv:2310.17742  [pdf]

    eess.AS cs.LG eess.SP

    BERT-PIN: A BERT-based Framework for Recovering Missing Data Segments in Time-series Load Profiles

    Authors: Yi Hu, Kai Ye, Hyeonjin Kim, Ning Lu

    Abstract: Inspired by the success of the Transformer model in natural language processing and computer vision, this paper introduces BERT-PIN, a Bidirectional Encoder Representations from Transformers (BERT) powered Profile Inpainting Network. BERT-PIN recovers multiple missing data segments (MDSs) using load and temperature time-series profiles as inputs. To adopt a standard Transformer model structure for…

    Submitted 26 October, 2023; originally announced October 2023.

  44. arXiv:2310.08660  [pdf, other]

    cs.LG cs.AI eess.SP

    Learning RL-Policies for Joint Beamforming Without Exploration: A Batch Constrained Off-Policy Approach

    Authors: Heasung Kim, Sravan Kumar Ankireddy

    Abstract: In this work, we consider the problem of network parameter optimization for rate maximization. We frame this as a joint optimization problem of power control, beam forming, and interference cancellation. We consider the setting where multiple Base Stations (BSs) communicate with multiple user equipment (UEs). Because of the exponential computational complexity of brute force search, we instead sol…

    Submitted 11 November, 2023; v1 submitted 12 October, 2023; originally announced October 2023.

    Comments: 10 pages, 8 figures

  45. arXiv:2310.07974  [pdf, other]

    eess.SY

    Causality-based Cost Allocation for Peer-to-Peer Energy Trading in Distribution System

    Authors: Hyun Joong Kim, Yong Hyun Song, Jip Kim

    Abstract: While peer-to-peer energy trading has the potential to harness the capabilities of small-scale energy resources, a peer-matching process often overlooks power grid conditions, yielding increased losses, line congestion, and voltage problems. This imposes a great challenge on the distribution system operator (DSO), which can eventually limit peer-to-peer energy trading. To align the peer-matching p…

    Submitted 20 February, 2024; v1 submitted 11 October, 2023; originally announced October 2023.

    Comments: 7 pages, 7 figures

  46. arXiv:2310.04010  [pdf, other]

    cs.CV cs.AI eess.IV

    Excision And Recovery: Visual Defect Obfuscation Based Self-Supervised Anomaly Detection Strategy

    Authors: YeongHyeon Park, Sungho Kang, Myung Jin Kim, Yeonho Lee, Hyeong Seok Kim, Juneho Yi

    Abstract: Due to scarcity of anomaly situations in the early manufacturing stage, an unsupervised anomaly detection (UAD) approach is widely adopted which only uses normal samples for training. This approach is based on the assumption that the trained UAD model will accurately reconstruct normal patterns but struggles with unseen anomalous patterns. To enhance the UAD performance, reconstruction-by-inpainti…

    Submitted 9 November, 2023; v1 submitted 6 October, 2023; originally announced October 2023.

    Comments: 10 pages, 5 figures, 5 tables

  47. arXiv:2309.14967  [pdf, other]

    cs.CV eess.IV

    A novel approach for holographic 3D content generation without depth map

    Authors: Hakdong Kim, Minkyu Jee, Yurim Lee, Kyudam Choi, MinSung Yoon, Cheongwon Kim

    Abstract: In preparation for observing holographic 3D content, acquiring a set of RGB color and depth map images per scene is necessary to generate computer-generated holograms (CGHs) when using the fast Fourier transform (FFT) algorithm. However, in real-world situations, these paired formats of RGB color and depth map images are not always fully available. We propose a deep learning-based method to synthe…

    Submitted 26 September, 2023; originally announced September 2023.

  48. arXiv:2309.12047  [pdf, other]

    cs.CV cs.GR eess.IV

    Self-Calibrating, Fully Differentiable NLOS Inverse Rendering

    Authors: Kiseok Choi, Inchul Kim, Dongyoung Choi, Julio Marco, Diego Gutierrez, Min H. Kim

    Abstract: Existing time-resolved non-line-of-sight (NLOS) imaging methods reconstruct hidden scenes by inverting the optical paths of indirect illumination measured at visible relay surfaces. These methods are prone to reconstruction artifacts due to inversion ambiguities and capture noise, which are typically mitigated through the manual selection of filtering functions and parameters. We introduce a fully…

    Submitted 25 September, 2023; v1 submitted 21 September, 2023; originally announced September 2023.

    Journal ref: Proceedings of ACM SIGGRAPH Asia 2023 (December 2023)

  49. arXiv:2309.07152  [pdf]

    eess.SP physics.med-ph

    Novel Smart N95 Filtering Facepiece Respirator with Real-time Adaptive Fit Functionality and Wireless Humidity Monitoring for Enhanced Wearable Comfort

    Authors: Kangkyu Kwon, Yoon Jae Lee, Yeongju Jung, Ira Soltis, Chanyeong Choi, Yewon Na, Lissette Romero, Myung Chul Kim, Nathan Rodeheaver, Hodam Kim, Michael S. Lloyd, Ziqing Zhuang, William King, Susan Xu, Seung-Hwan Ko, Jinwoo Lee, Woon-Hong Yeo

    Abstract: The widespread emergence of the COVID-19 pandemic has transformed our lifestyle, and facial respirators have become an essential part of daily life. Nevertheless, the current respirators possess several limitations such as poor respirator fit because they are incapable of covering diverse human facial sizes and shapes, potentially diminishing the effect of wearing respirators. In addition, the cur…

    Submitted 8 September, 2023; originally announced September 2023.

    Comments: 20 pages, 5 figures, 1 table, submitted for possible publication

    MSC Class: 92C55

  50. arXiv:2309.06770  [pdf, other]

    eess.IV eess.SY

    Deep Learning-based Synthetic High-Resolution In-Depth Imaging Using an Attachable Dual-element Endoscopic Ultrasound Probe

    Authors: Hah Min Lew, Jae Seong Kim, Moon Hwan Lee, Jaegeun Park, Sangyeon Youn, Hee Man Kim, Jihun Kim, Jae Youn Hwang

    Abstract: Endoscopic ultrasound (EUS) imaging has a trade-off between resolution and penetration depth. By considering the in-vivo characteristics of human organs, it is necessary to provide clinicians with appropriate hardware specifications for precise diagnosis. Recently, super-resolution (SR) ultrasound imaging studies, including the SR task in deep learning fields, have been reported for enhancing ultr…

    Submitted 13 September, 2023; originally announced September 2023.

    Comments: 10 pages, 9 figures