Showing 1–27 of 27 results for author: Kim, U

Searching in archive cs.
  1. arXiv:2404.06621  [pdf, other]

    cs.CL

    What is Your Favorite Gender, MLM? Gender Bias Evaluation in Multilingual Masked Language Models

    Authors: Jeongrok Yu, Seong Ug Kim, Jacob Choi, Jinho D. Choi

    Abstract: Bias is a disproportionate prejudice in favor of one side against another. Due to the success of transformer-based Masked Language Models (MLMs) and their impact on many NLP tasks, a systematic evaluation of bias in these models is needed more than ever. While many studies have evaluated gender bias in English MLMs, only a few works have been conducted for the task in other languages. This paper p…

    Submitted 9 April, 2024; originally announced April 2024.

  2. arXiv:2403.17420  [pdf, other]

    cs.CV cs.MM cs.SD eess.AS

    Learning to Visually Localize Sound Sources from Mixtures without Prior Source Knowledge

    Authors: Dongjin Kim, Sung Jin Um, Sangmin Lee, Jung Uk Kim

    Abstract: The goal of the multi-sound source localization task is to localize sound sources from the mixture individually. While recent multi-sound source localization methods have shown improved performance, they face challenges due to their reliance on prior information about the number of objects to be separated. In this paper, to overcome this limitation, we present a novel multi-sound source localizati…

    Submitted 26 March, 2024; originally announced March 2024.

    Comments: Accepted at CVPR 2024

  3. arXiv:2308.09303  [pdf, other]

    cs.CV cs.LG

    Online Class Incremental Learning on Stochastic Blurry Task Boundary via Mask and Visual Prompt Tuning

    Authors: Jun-Yeong Moon, Keon-Hee Park, Jung Uk Kim, Gyeong-Moon Park

    Abstract: Continual learning aims to learn a model from a continuous stream of data, but it mainly assumes a fixed number of data and tasks with clear task boundaries. However, in real-world scenarios, the number of input data and tasks is constantly changing in a statistical way, not a static way. Although recently introduced incremental learning scenarios having blurry task boundaries somewhat address the…

    Submitted 18 August, 2023; originally announced August 2023.

  4. arXiv:2308.06087  [pdf, other]

    cs.MM cs.AI cs.CV

    Audio-Visual Spatial Integration and Recursive Attention for Robust Sound Source Localization

    Authors: Sung Jin Um, Dongjin Kim, Jung Uk Kim

    Abstract: The objective of the sound source localization task is to enable machines to detect the location of sound-making objects within a visual scene. While the audio modality provides spatial cues to locate the sound source, existing approaches only use audio as an auxiliary role to compare spatial regions of the visual modality. Humans, on the other hand, utilize both audio and visual modalities as spa…

    Submitted 17 August, 2023; v1 submitted 11 August, 2023; originally announced August 2023.

    Comments: Camera-Ready, ACM MM 2023

  5. arXiv:2306.14289  [pdf, other]

    cs.CV

    Faster Segment Anything: Towards Lightweight SAM for Mobile Applications

    Authors: Chaoning Zhang, Dongshen Han, Yu Qiao, Jung Uk Kim, Sung-Ho Bae, Seungkyu Lee, Choong Seon Hong

    Abstract: Segment Anything Model (SAM) has attracted significant attention due to its impressive zero-shot transfer performance and high versatility for numerous vision applications (like image editing with fine-grained control). Many of such applications need to be run on resource-constraint edge devices, like mobile phones. In this work, we aim to make SAM mobile-friendly by replacing the heavyweight imag…

    Submitted 1 July, 2023; v1 submitted 25 June, 2023; originally announced June 2023.

    Comments: First work to make SAM lightweight for mobile applications

  6. arXiv:2304.06488  [pdf, other]

    cs.CY cs.AI cs.CL cs.CV cs.LG

    One Small Step for Generative AI, One Giant Leap for AGI: A Complete Survey on ChatGPT in AIGC Era

    Authors: Chaoning Zhang, Chenshuang Zhang, Chenghao Li, Yu Qiao, Sheng Zheng, Sumit Kumar Dam, Mengchun Zhang, Jung Uk Kim, Seong Tae Kim, Jinwoo Choi, Gyeong-Moon Park, Sung-Ho Bae, Lik-Hang Lee, Pan Hui, In So Kweon, Choong Seon Hong

    Abstract: OpenAI has recently released GPT-4 (a.k.a. ChatGPT plus), which is demonstrated to be one small step for generative AI (GAI), but one giant leap for artificial general intelligence (AGI). Since its official release in November 2022, ChatGPT has quickly attracted numerous users with extensive media coverage. Such unprecedented attention has also motivated numerous researchers to investigate ChatGPT…

    Submitted 4 April, 2023; originally announced April 2023.

    Comments: A Survey on ChatGPT and GPT-4, 29 pages. Feedback is appreciated

  7. arXiv:2210.16788  [pdf, other]

    cs.CV

    Image-free Domain Generalization via CLIP for 3D Hand Pose Estimation

    Authors: Seongyeong Lee, Hansoo Park, Dong Uk Kim, Jihyeon Kim, Muhammadjon Boboev, Seungryul Baek

    Abstract: RGB-based 3D hand pose estimation has been successful for decades thanks to large-scale databases and deep learning. However, the hand pose estimation network does not operate well for hand pose images whose characteristics are far different from the training data. This is caused by various factors such as illuminations, camera angles, diverse backgrounds in the input images, etc. Many existing me…

    Submitted 30 October, 2022; originally announced October 2022.

  8. arXiv:2209.09486  [pdf, other]

    cs.CV

    Self-supervised 3D Object Detection from Monocular Pseudo-LiDAR

    Authors: Curie Kim, Ue-Hwan Kim, Jong-Hwan Kim

    Abstract: There have been attempts to detect 3D objects by fusion of stereo camera images and LiDAR sensor data or using LiDAR for pre-training and only monocular images for testing, but there have been less attempts to use only monocular image sequences due to low accuracy. In addition, when depth prediction using only monocular images, only scale-inconsistent depth can be predicted, which is the reason wh…

    Submitted 20 September, 2022; originally announced September 2022.

    Comments: Accepted for the 2022 IEEE International Conference on Multisensor Fusion and Integration (MFI 2022)

  9. arXiv:2209.08844  [pdf, other]

    cs.CV

    A Dual-Cycled Cross-View Transformer Network for Unified Road Layout Estimation and 3D Object Detection in the Bird's-Eye-View

    Authors: Curie Kim, Ue-Hwan Kim

    Abstract: The bird's-eye-view (BEV) representation allows robust learning of multiple tasks for autonomous driving including road layout estimation and 3D object detection. However, contemporary methods for unified road layout estimation and 3D object detection rarely handle the class imbalance of the training dataset and multi-class learning to reduce the total number of networks required. To overcome thes…

    Submitted 19 September, 2022; originally announced September 2022.

  10. arXiv:2207.10324  [pdf, other]

    eess.IV cs.CV cs.LG

    Enhancing Generative Networks for Chest Anomaly Localization through Automatic Registration-Based Unpaired-to-Pseudo-Paired Training Data Translation

    Authors: Kyungsu Kim, Seong Je Oh, Chae Yeon Lim, Ju Hwan Lee, Tae Uk Kim, Myung Jin Chung

    Abstract: Image translation based on a generative adversarial network (GAN-IT) is a promising method for the precise localization of abnormal regions in chest X-ray images (AL-CXR) even without the pixel-level annotation. However, heterogeneous unpaired datasets undermine existing methods to extract key features and distinguish normal from abnormal cases, resulting in inaccurate and unstable AL-CXR. To addr…

    Submitted 15 June, 2024; v1 submitted 21 July, 2022; originally announced July 2022.

  11. arXiv:2108.09030  [pdf, other]

    cs.HC

    Type Anywhere You Want: An Introduction to Invisible Mobile Keyboard

    Authors: Sahng-Min Yoo, Ue-Hwan Kim, Yewon Hwang, Jong-Hwan Kim

    Abstract: Contemporary soft keyboards possess limitations: the lack of physical feedback results in an increase of typos, and the interface of soft keyboards degrades the utility of the screen. To overcome these limitations, we propose an Invisible Mobile Keyboard (IMK), which lets users freely type on the desired area without any constraints. To facilitate a data-driven IMK decoding task, we have collected…

    Submitted 20 August, 2021; originally announced August 2021.

    Comments: Accepted by IJCAI 2021

  12. arXiv:2104.09021  [pdf, other]

    cs.CV

    Writing in The Air: Unconstrained Text Recognition from Finger Movement Using Spatio-Temporal Convolution

    Authors: Ue-Hwan Kim, Yewon Hwang, Sun-Kyung Lee, Jong-Hwan Kim

    Abstract: In this paper, we introduce a new benchmark dataset for the challenging writing in the air (WiTA) task -- an elaborate task bridging vision and NLP. WiTA implements an intuitive and natural writing method with finger movement for human-computer interaction (HCI). Our WiTA dataset will facilitate the development of data-driven WiTA systems which thus far have displayed unsatisfactory performance --…

    Submitted 18 April, 2021; originally announced April 2021.

    Comments: 10 pages, 6 figures, 6 tables

  13. arXiv:2103.12496  [pdf, other]

    cs.CV cs.AI cs.RO

    Revisiting Self-Supervised Monocular Depth Estimation

    Authors: Ue-Hwan Kim, Jong-Hwan Kim

    Abstract: Self-supervised learning of depth map prediction and motion estimation from monocular video sequences is of vital importance -- since it realizes a broad range of tasks in robotics and autonomous vehicles. A large number of research efforts have enhanced the performance by tackling illumination variation, occlusions, and dynamic objects, to name a few. However, each of those efforts targets indivi…

    Submitted 23 March, 2021; originally announced March 2021.

    Comments: 14 pages, 3 figures, 4 tables

  14. arXiv:2103.05368  [pdf, other]

    cs.CV cs.RO

    ChangeSim: Towards End-to-End Online Scene Change Detection in Industrial Indoor Environments

    Authors: Jin-Man Park, Jae-Hyuk Jang, Sahng-Min Yoo, Sun-Kyung Lee, Ue-Hwan Kim, Jong-Hwan Kim

    Abstract: We present a challenging dataset, ChangeSim, aimed at online scene change detection (SCD) and more. The data is collected in photo-realistic simulation environments with the presence of environmental non-targeted variations, such as air turbidity and light condition changes, as well as targeted object changes in industrial indoor environments. By collecting data in simulations, multi-modal sensor…

    Submitted 22 July, 2021; v1 submitted 9 March, 2021; originally announced March 2021.

    Comments: Accepted to IROS 2021

  15. arXiv:2009.10868  [pdf, other]

    cs.CV

    A Real-Time Predictive Pedestrian Collision Warning Service for Cooperative Intelligent Transportation Systems Using 3D Pose Estimation

    Authors: Ue-Hwan Kim, Dongho Ka, Hwasoo Yeo, Jong-Hwan Kim

    Abstract: Minimizing traffic accidents between vehicles and pedestrians is one of the primary research goals in intelligent transportation systems. To achieve the goal, pedestrian orientation recognition and prediction of pedestrian's crossing or not-crossing intention play a central role. Contemporary approaches do not guarantee satisfactory performance due to limited field-of-view, lack of generalization,…

    Submitted 21 February, 2022; v1 submitted 22 September, 2020; originally announced September 2020.

    Comments: 12 pages, 8 figures, 4 tables

  16. arXiv:2007.08154  [pdf, other]

    cs.CV

    Comprehensive Facial Expression Synthesis using Human-Interpretable Language

    Authors: Joanna Hong, Jung Uk Kim, Sangmin Lee, Yong Man Ro

    Abstract: Recent advances in facial expression synthesis have shown promising results using diverse expression representations including facial action units. Facial action units for an elaborate facial expression synthesis need to be intuitively represented for human comprehension, not a numeric categorization of facial action units. To address this issue, we utilize human-friendly approach: use of natural…

    Submitted 16 July, 2020; originally announced July 2020.

    Comments: ICIP 2020

  17. arXiv:2005.14390  [pdf, ps, other]

    cs.CV cs.CR eess.IV

    Privacy-Protection Drone Patrol System based on Face Anonymization

    Authors: Harim Lee, Myeung Un Kim, Yeongjun Kim, Hyeonsu Lyu, Hyun Jong Yang

    Abstract: The robot market has been growing significantly and is expected to become 1.5 times larger in 2024 than what it was in 2019. Robots have attracted attention of security companies thanks to their mobility. These days, for security robots, unmanned aerial vehicles (UAVs) have quickly emerged by highlighting their advantage: they can even go to any hazardous place that humans cannot access. For UAVs,…

    Submitted 29 May, 2020; originally announced May 2020.

  18. arXiv:2005.10987  [pdf, other]

    cs.CV

    Investigating Vulnerability to Adversarial Examples on Multimodal Data Fusion in Deep Learning

    Authors: Youngjoon Yu, Hong Joo Lee, Byeong Cheon Kim, Jung Uk Kim, Yong Man Ro

    Abstract: The success of multimodal data fusion in deep learning appears to be attributed to the use of complementary information between multiple input data. Compared to their predictive performance, relatively less attention has been devoted to the robustness of multimodal fusion models. In this paper, we investigated whether the current multimodal fusion model utilizes the complementary intelligence to…

    Submitted 21 May, 2020; originally announced May 2020.

  19. arXiv:2005.10750  [pdf, other]

    cs.CV cs.LG

    Revisiting Role of Autoencoders in Adversarial Settings

    Authors: Byeong Cheon Kim, Jung Uk Kim, Hakmin Lee, Yong Man Ro

    Abstract: To combat against adversarial attacks, autoencoder structure is widely used to perform denoising which is regarded as gradient masking. In this paper, we revisit the role of autoencoders in adversarial settings. Through the comprehensive experimental results and analysis, this paper presents the inherent property of adversarial robustness in the autoencoders. We also found that autoencoders may us…

    Submitted 21 May, 2020; originally announced May 2020.

    Comments: Accepted at ICIP 2020

  20. arXiv:1912.08541  [pdf, other]

    cs.CV cs.LG cs.NE

    s-DRN: Stabilized Developmental Resonance Network

    Authors: In-Ug Yoon, Ue-Hwan Kim, Jong-Hwan Kim

    Abstract: Online incremental clustering of sequentially incoming data without prior knowledge suffers from changing cluster numbers and tends to fall into local extrema according to given data order. To overcome these limitations, we propose a stabilized developmental resonance network (s-DRN). First, we analyze the instability of the conventional choice function during the node activation process and desig…

    Submitted 15 July, 2020; v1 submitted 18 December, 2019; originally announced December 2019.

    Comments: Under review

  21. arXiv:1911.05939  [pdf, other]

    cs.CV cs.LG

    SimVODIS: Simultaneous Visual Odometry, Object Detection, and Instance Segmentation

    Authors: Ue-Hwan Kim, Se-Ho Kim, Jong-Hwan Kim

    Abstract: Intelligent agents need to understand the surrounding environment to provide meaningful services to or interact intelligently with humans. The agents should perceive geometric features as well as semantic entities inherent in the environment. Contemporary methods in general provide one type of information regarding the environment at a time, making it difficult to conduct high-level tasks. Moreove…

    Submitted 16 November, 2019; v1 submitted 14 November, 2019; originally announced November 2019.

    Comments: Submitted to TPAMI

  22. arXiv:1908.08204  [pdf, other]

    eess.IV cs.LG stat.ML

    Convolutional Recurrent Reconstructive Network for Spatiotemporal Anomaly Detection in Solder Paste Inspection

    Authors: Yong-Ho Yoo, Ue-Hwan Kim, Jong-Hwan Kim

    Abstract: Surface mount technology (SMT) is a process for producing printed circuit boards. Solder paste printer (SPP), package mounter, and solder reflow oven are used for SMT. The board on which the solder paste is deposited from the SPP is monitored by solder paste inspector (SPI). If SPP malfunctions due to the printer defects, the SPP produces defective products, and then abnormal patterns are detected…

    Submitted 22 August, 2019; originally announced August 2019.

  23. 3-D Scene Graph: A Sparse and Semantic Representation of Physical Environments for Intelligent Agents

    Authors: Ue-Hwan Kim, Jin-Man Park, Taek-Jin Song, Jong-Hwan Kim

    Abstract: Intelligent agents gather information and perceive semantics within the environments before taking on given tasks. The agents store the collected information in the form of environment models that compactly represent the surrounding environments. The agents, however, can only conduct limited tasks without an efficient and effective environment model. Thus, such an environment model takes a crucial…

    Submitted 13 August, 2019; originally announced August 2019.

    Comments: Early Access

  24. arXiv:1907.13285  [pdf, other]

    cs.HC cs.AI cs.CV

    I-Keyboard: Fully Imaginary Keyboard on Touch Devices Empowered by Deep Neural Decoder

    Authors: Ue-Hwan Kim, Sahng-Min Yoo, Jong-Hwan Kim

    Abstract: Text-entry aims to provide an effective and efficient pathway for humans to deliver their messages to computers. With the advent of mobile computing, the recent focus of text-entry research has moved from physical keyboards to soft keyboards. Current soft keyboards, however, increase the typo rate due to lack of tactile feedback and degrade the usability of mobile devices due to their large portio…

    Submitted 30 July, 2019; originally announced July 2019.

    Comments: Submitted to IEEE TRANSACTIONS ON CYBERNETICS

  25. arXiv:1907.13274  [pdf]

    cs.RO cs.AI cs.HC

    A Stabilized Feedback Episodic Memory (SF-EM) and Home Service Provision Framework for Robot and IoT Collaboration

    Authors: Ue-Hwan Kim, Jong-Hwan Kim

    Abstract: The automated home referred to as Smart Home is expected to offer fully customized services to its residents, reducing the amount of home labor, thus improving human beings' welfare. Service robots and Internet of Things (IoT) play the key roles in the development of Smart Home. The service provision with these two main components in a Smart Home environment requires: 1) learning and reasoning alg…

    Submitted 30 July, 2019; originally announced July 2019.

    Comments: Accepted (Early Access)

  26. arXiv:1809.05001  [pdf]

    cs.AI

    Reductive property of new fuzzy reasoning method based on distance measure

    Authors: Son-il Kwak, Gum-ju Kim, Michio Sugeno, Gwang-chol Li, Myong-suk Son, Hyok-chol Kim, Un-ha Kim

    Abstract: Firstly in this paper we propose a new criterion function for evaluation of the reductive property about the fuzzy reasoning result for fuzzy modus ponens and fuzzy modus tollens. Secondly unlike fuzzy reasoning methods based on the similarity measure, we propose a new fuzzy reasoning method based on distance measure. Thirdly the reductive property for 5 fuzzy reasoning methods are checked with re…

    Submitted 7 September, 2018; originally announced September 2018.

  27. arXiv:1708.03431  [pdf]

    cs.CV

    Iterative Deep Convolutional Encoder-Decoder Network for Medical Image Segmentation

    Authors: Jung Uk Kim, Hak Gu Kim, Yong Man Ro

    Abstract: In this paper, we propose a novel medical image segmentation using iterative deep learning framework. We have combined an iterative learning approach and an encoder-decoder network to improve segmentation results, which enables to precisely localize the regions of interest (ROIs) including complex shapes or detailed textures of medical images in an iterative manner. The proposed iterative deep con…

    Submitted 11 August, 2017; originally announced August 2017.

    Comments: accepted at EMBC 2017