Showing 1–50 of 52 results for author: Kim, S J

  1. arXiv:2406.08222  [pdf]

    cs.CV cs.AI cs.CY cs.HC

    A Sociotechnical Lens for Evaluating Computer Vision Models: A Case Study on Detecting and Reasoning about Gender and Emotion

    Authors: Sha Luo, Sang Jung Kim, Zening Duan, Kaiping Chen

    Abstract: In the evolving landscape of computer vision (CV) technologies, the automatic detection and interpretation of gender and emotion in images is a critical area of study. This paper investigates social biases in CV models, emphasizing the limitations of traditional evaluation metrics such as precision, recall, and accuracy. These metrics often fall short in capturing the complexities of gender and em…

    Submitted 12 June, 2024; originally announced June 2024.

  2. arXiv:2406.01079  [pdf, other]

    cs.CV cs.AI

    Object Aware Egocentric Online Action Detection

    Authors: Joungbin An, Yunsu Park, Hyolim Kang, Seon Joo Kim

    Abstract: Advancements in egocentric video datasets like Ego4D, EPIC-Kitchens, and Ego-Exo4D have enriched the study of first-person human interactions, which is crucial for applications in augmented reality and assisted living. Despite these advancements, current Online Action Detection methods, which efficiently detect actions in streaming videos, are predominantly designed for exocentric views and thus f…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: CVPR First Joint Egocentric Vision Workshop 2024

  3. arXiv:2406.01020  [pdf, other]

    cs.CV

    CLIP-Guided Attribute Aware Pretraining for Generalizable Image Quality Assessment

    Authors: Daekyu Kwon, Dongyoung Kim, Sehwan Ki, Younghyun Jo, Hyong-Euk Lee, Seon Joo Kim

    Abstract: In no-reference image quality assessment (NR-IQA), the challenge of limited dataset sizes hampers the development of robust and generalizable models. Conventional methods address this issue by utilizing large datasets to extract rich representations for IQA. Also, some approaches propose vision language models (VLM) based IQA, but the domain gap between generic VLM and IQA constrains their scalabi…

    Submitted 3 June, 2024; originally announced June 2024.

  4. arXiv:2405.06284  [pdf, other]

    eess.IV cs.CV cs.LG

    Modality-agnostic Domain Generalizable Medical Image Segmentation by Multi-Frequency in Multi-Scale Attention

    Authors: Ju-Hyeon Nam, Nur Suriza Syazwany, Su Jung Kim, Sang-Chul Lee

    Abstract: Generalizability in deep neural networks plays a pivotal role in medical image segmentation. However, deep learning-based medical image analyses tend to overlook the importance of frequency variance, which is a critical element for achieving a model that is both modality-agnostic and domain-generalizable. Additionally, various models fail to account for the potential information loss that can arise…

    Submitted 10 May, 2024; originally announced May 2024.

    Comments: Accepted in Computer Vision and Pattern Recognition (CVPR) 2024

  5. arXiv:2402.18277  [pdf, other]

    cs.CV

    Attentive Illumination Decomposition Model for Multi-Illuminant White Balancing

    Authors: Dongyoung Kim, Jinwoo Kim, Junsang Yu, Seon Joo Kim

    Abstract: White balance (WB) algorithms in many commercial cameras assume single and uniform illumination, leading to undesirable results when multiple lighting sources with different chromaticities exist in the scene. Prior research on multi-illuminant WB typically predicts illumination at the pixel level without fully grasping the scene's actual lighting conditions, including the number and color of light…

    Submitted 28 February, 2024; originally announced February 2024.

    Comments: CVPR 2024

  6. arXiv:2312.04885  [pdf, other]

    cs.CV

    VISAGE: Video Instance Segmentation with Appearance-Guided Enhancement

    Authors: Hanjung Kim, Jaehyun Kang, Miran Heo, Sukjun Hwang, Seoung Wug Oh, Seon Joo Kim

    Abstract: In recent years, online Video Instance Segmentation (VIS) methods have shown remarkable advancement with their powerful query-based detectors. Utilizing the output queries of the detector at the frame-level, these methods achieve high accuracy on challenging benchmarks. However, our observations demonstrate that these methods heavily rely on location information, which often causes incorrect assoc…

    Submitted 8 March, 2024; v1 submitted 8 December, 2023; originally announced December 2023.

    Comments: Technical report

  7. arXiv:2310.08929  [pdf, other]

    cs.CV

    Leveraging Image Augmentation for Object Manipulation: Towards Interpretable Controllability in Object-Centric Learning

    Authors: Jinwoo Kim, Janghyuk Choi, Jaehyun Kang, Changyeon Lee, Ho-Jin Choi, Seon Joo Kim

    Abstract: The binding problem in artificial neural networks is actively explored with the goal of achieving human-level recognition skills through the comprehension of the world in terms of symbol-like entities. Especially in the field of computer vision, object-centric learning (OCL) is extensively researched to better understand complex scenes by acquiring object representations or slots. While recent stu…

    Submitted 1 March, 2024; v1 submitted 13 October, 2023; originally announced October 2023.

  8. arXiv:2308.10269  [pdf, other]

    cs.CV eess.IV

    Domain Reduction Strategy for Non Line of Sight Imaging

    Authors: Hyunbo Shim, In Cho, Daekyu Kwon, Seon Joo Kim

    Abstract: This paper presents a novel optimization-based method for non-line-of-sight (NLOS) imaging that aims to reconstruct hidden scenes under various setups. Our method is built upon the observation that photons returning from each point in hidden volumes can be independently computed if the interactions between hidden surfaces are trivially ignored. We model the generalized light propagation function t…

    Submitted 20 August, 2023; originally announced August 2023.

  9. arXiv:2303.17842  [pdf, other]

    cs.CV

    Shepherding Slots to Objects: Towards Stable and Robust Object-Centric Learning

    Authors: Jinwoo Kim, Janghyuk Choi, Ho-Jin Choi, Seon Joo Kim

    Abstract: Object-centric learning (OCL) aspires general and compositional understanding of scenes by representing a scene as a collection of object-centric representations. OCL has also been extended to multi-view image and video datasets to apply various data-driven inductive biases by utilizing geometric or temporal information in the multi-image data. Single-view images carry less information about how t…

    Submitted 31 March, 2023; originally announced March 2023.

  10. arXiv:2301.06244  [pdf, other]

    cs.RO eess.SY

    Haptic Transparency and Interaction Force Control for a Lower-Limb Exoskeleton

    Authors: Emek Barış Küçüktabak, Yue Wen, Sangjoon J. Kim, Matthew Short, Daniel Ludvig, Levi Hargrove, Eric Perreault, Kevin Lynch, Jose Pons

    Abstract: Controlling the interaction forces between a human and an exoskeleton is crucial for providing transparency or adjusting assistance or resistance levels. However, it is an open problem to control the interaction forces of lower-limb exoskeletons designed for unrestricted overground walking. For these types of exoskeletons, it is challenging to implement force/torque sensors at every contact betwee…

    Submitted 22 January, 2024; v1 submitted 15 January, 2023; originally announced January 2023.

    Comments: 19 pages, 13 figures. Accepted for publication in the IEEE Transactions on Robotics (T-RO)

  11. arXiv:2211.09385  [pdf, other]

    cs.SD cs.AI cs.MM eess.AS

    ComMU: Dataset for Combinatorial Music Generation

    Authors: Lee Hyun, Taehyun Kim, Hyolim Kang, Minjoo Ki, Hyeonchan Hwang, Kwanho Park, Sharang Han, Seon Joo Kim

    Abstract: Commercial adoption of automatic music composition requires the capability of generating diverse and high-quality music suitable for the desired context (e.g., music for romantic movies, action games, restaurants, etc.). In this paper, we introduce combinatorial music generation, a new task to create varying background music based on given conditions. Combinatorial music generation creates short s…

    Submitted 17 November, 2022; originally announced November 2022.

    Comments: 19 pages, 12 figures

  12. arXiv:2211.08834  [pdf, other]

    cs.CV

    A Generalized Framework for Video Instance Segmentation

    Authors: Miran Heo, Sukjun Hwang, Jeongseok Hyun, Hanjung Kim, Seoung Wug Oh, Joon-Young Lee, Seon Joo Kim

    Abstract: The handling of long videos with complex and occluded sequences has recently emerged as a new challenge in the video instance segmentation (VIS) community. However, existing methods have limitations in addressing this challenge. We argue that the biggest bottleneck in current approaches is the discrepancy between training and inference. To effectively bridge this gap, we propose a Generalized fram…

    Submitted 24 March, 2023; v1 submitted 16 November, 2022; originally announced November 2022.

    Comments: CVPR 2023

  13. arXiv:2211.06023  [pdf, other]

    cs.CV

    Soft-Landing Strategy for Alleviating the Task Discrepancy Problem in Temporal Action Localization Tasks

    Authors: Hyolim Kang, Hanjung Kim, Joungbin An, Minsu Cho, Seon Joo Kim

    Abstract: Temporal Action Localization (TAL) methods typically operate on top of feature sequences from a frozen snippet encoder that is pretrained with the Trimmed Action Classification (TAC) tasks, resulting in a task discrepancy problem. While existing TAL methods mitigate this issue either by retraining the encoder with a pretext task or by end-to-end fine-tuning, they commonly require an overload of hi…

    Submitted 11 November, 2022; originally announced November 2022.

  14. arXiv:2210.02526  [pdf, other]

    cs.CL

    "No, they did not": Dialogue response dynamics in pre-trained language models

    Authors: Sanghee J. Kim, Lang Yu, Allyson Ettinger

    Abstract: A critical component of competence in language is being able to identify relevant components of an utterance and reply appropriately. In this paper we examine the extent of such dialogue response sensitivity in pre-trained language models, conducting a series of experiments with a particular focus on sensitivity to dynamics involving phenomena of at-issueness and ellipsis. We find that models show…

    Submitted 5 October, 2022; originally announced October 2022.

    Comments: 12 pages, 8 figures, COLING 2022, see https://github.com/sangheek16/dialogue-response-dynamics for codes and material

  15. arXiv:2207.10391  [pdf, other]

    cs.CV cs.LG

    Error Compensation Framework for Flow-Guided Video Inpainting

    Authors: Jaeyeon Kang, Seoung Wug Oh, Seon Joo Kim

    Abstract: The key to video inpainting is to use correlation information from as many reference frames as possible. Existing flow-based propagation methods split the video synthesis process into multiple steps: flow completion -> pixel propagation -> synthesis. However, there is a significant drawback that the errors in each step continue to accumulate and amplify in the next step. To this end, we propose an…

    Submitted 21 July, 2022; originally announced July 2022.

    Comments: ECCV2022 accepted

  16. Residual-based physics-informed transfer learning: A hybrid method for accelerating long-term CFD simulations via deep learning

    Authors: Joongoo Jeon, Juhyeong Lee, Ricardo Vinuesa, Sung Joong Kim

    Abstract: While a big wave of artificial intelligence (AI) has propagated to the field of computational fluid dynamics (CFD) acceleration studies, recent research has highlighted that the development of AI techniques that reconciles the following goals remains our primary task: (1) accurate prediction of unseen (future) time series in long-term CFD simulations (2) acceleration of simulations (3) an acceptab…

    Submitted 26 November, 2023; v1 submitted 14 June, 2022; originally announced June 2022.

    Comments: 36 pages, 10 figures

    Journal ref: International Journal of Heat and Mass Transfer 220 (2024) 124900

  17. arXiv:2206.04403  [pdf, other]

    cs.CV

    VITA: Video Instance Segmentation via Object Token Association

    Authors: Miran Heo, Sukjun Hwang, Seoung Wug Oh, Joon-Young Lee, Seon Joo Kim

    Abstract: We introduce a novel paradigm for offline Video Instance Segmentation (VIS), based on the hypothesis that explicit object-oriented information can be a strong clue for understanding the context of the entire sequence. To this end, we propose VITA, a simple structure built on top of an off-the-shelf Transformer-based image instance segmentation model. Specifically, we use an image object detector a…

    Submitted 20 October, 2022; v1 submitted 9 June, 2022; originally announced June 2022.

  18. arXiv:2206.02116  [pdf, other]

    cs.CV

    Cannot See the Forest for the Trees: Aggregating Multiple Viewpoints to Better Classify Objects in Videos

    Authors: Sukjun Hwang, Miran Heo, Seoung Wug Oh, Seon Joo Kim

    Abstract: Recently, both long-tailed recognition and object tracking have made great advances individually. TAO benchmark presented a mixture of the two, long-tailed object tracking, in order to further reflect the aspect of the real-world. To date, existing solutions have adopted detectors showing robustness in long-tailed distributions, which derive per-frame results. Then, they used tracking algorithms t…

    Submitted 5 June, 2022; originally announced June 2022.

    Comments: Accepted to CVPR 2022

  19. arXiv:2201.01479  [pdf, other]

    cs.NE

    Gradient-based Bit Encoding Optimization for Noise-Robust Binary Memristive Crossbar

    Authors: Youngeun Kim, Hyunsoo Kim, Seijoon Kim, Sang Joon Kim, Priyadarshini Panda

    Abstract: Binary memristive crossbars have gained huge attention as an energy-efficient deep learning hardware accelerator. Nonetheless, they suffer from various noises due to the analog nature of the crossbars. To overcome such limitations, most previous works train weight parameters with noise data obtained from a crossbar. These methods are, however, ineffective because it is difficult to collect noise d…

    Submitted 5 January, 2022; originally announced January 2022.

    Comments: Accepted to DATE2022

  20. arXiv:2112.04177  [pdf, other]

    cs.CV

    VISOLO: Grid-Based Space-Time Aggregation for Efficient Online Video Instance Segmentation

    Authors: Su Ho Han, Sukjun Hwang, Seoung Wug Oh, Yeonchool Park, Hyunwoo Kim, Min-Jung Kim, Seon Joo Kim

    Abstract: For online video instance segmentation (VIS), fully utilizing the information from previous frames in an efficient manner is essential for real-time applications. Most previous methods follow a two-stage approach requiring additional computations such as RPN and RoIAlign, and do not fully exploit the available information in the video for all subtasks in VIS. In this paper, we propose a novel sing…

    Submitted 30 March, 2022; v1 submitted 8 December, 2021; originally announced December 2021.

  21. arXiv:2111.14799  [pdf, other]

    cs.CV cs.AI

    UBoCo : Unsupervised Boundary Contrastive Learning for Generic Event Boundary Detection

    Authors: Hyolim Kang, Jinwoo Kim, Taehyun Kim, Seon Joo Kim

    Abstract: Generic Event Boundary Detection (GEBD) is a newly suggested video understanding task that aims to find one level deeper semantic boundaries of events. Bridging the gap between natural human perception and video understanding, it has various potential applications, including interpretable and semantically valid video parsing. Still at an early development stage, existing GEBD solvers are simple ex…

    Submitted 29 November, 2021; v1 submitted 29 November, 2021; originally announced November 2021.

  22. arXiv:2106.11549  [pdf, other]

    cs.CV

    Winning the CVPR'2021 Kinetics-GEBD Challenge: Contrastive Learning Approach

    Authors: Hyolim Kang, Jinwoo Kim, Kyungmin Kim, Taehyun Kim, Seon Joo Kim

    Abstract: Generic Event Boundary Detection (GEBD) is a newly introduced task that aims to detect "general" event boundaries that correspond to natural human perception. In this paper, we introduce a novel contrastive learning based approach to deal with the GEBD. Our intuition is that the feature similarity of the video snippet would significantly vary near the event boundaries, while remaining relatively t…

    Submitted 22 June, 2021; originally announced June 2021.

  23. arXiv:2106.03299  [pdf, other]

    cs.CV

    Video Instance Segmentation using Inter-Frame Communication Transformers

    Authors: Sukjun Hwang, Miran Heo, Seoung Wug Oh, Seon Joo Kim

    Abstract: We propose a novel end-to-end solution for video instance segmentation (VIS) based on transformers. Recently, the per-clip pipeline shows superior performance over per-frame methods leveraging richer information from multiple frames. However, previous per-clip models require heavy computation and memory usage to achieve frame-to-frame communications, limiting practicality. In this work, we propose…

    Submitted 6 June, 2021; originally announced June 2021.

  24. arXiv:2105.14584  [pdf, other]

    cs.CV cs.AI cs.LG

    Polygonal Point Set Tracking

    Authors: Gunhee Nam, Miran Heo, Seoung Wug Oh, Joon-Young Lee, Seon Joo Kim

    Abstract: In this paper, we propose a novel learning-based polygonal point set tracking method. Compared to existing video object segmentation (VOS) methods that propagate pixel-wise object mask information, we propagate a polygonal point set over frames. Specifically, the set is defined as a subset of points in the target contour, and our goal is to track corresponding points on the target contour. Those…

    Submitted 30 May, 2021; originally announced May 2021.

    Comments: 14 pages, 10 figures, 6 tables

  25. Finite volume method network for acceleration of unsteady computational fluid dynamics: non-reacting and reacting flows

    Authors: Joongoo Jeon, Juhyeong Lee, Sung Joong Kim

    Abstract: Despite rapid improvements in the performance of central processing unit (CPU), the calculation cost of simulating chemically reacting flow using CFD remains infeasible in many cases. The application of the convolutional neural networks (CNNs) specialized in image processing in flow field prediction has been studied, but the need to develop a neural network design fitted for CFD is recently emerg…

    Submitted 24 July, 2023; v1 submitted 7 May, 2021; originally announced May 2021.

    Journal ref: International Journal of Energy Research, 2022

  26. Temporally smooth online action detection using cycle-consistent future anticipation

    Authors: Young Hwi Kim, Seonghyeon Nam, Seon Joo Kim

    Abstract: Many video understanding tasks work in the offline setting by assuming that the input video is given from the start to the end. However, many real-world problems require the online setting, making a decision immediately using only the current and the past frames of videos such as in autonomous driving and surveillance systems. In this paper, we present a novel solution for online action detection…

    Submitted 16 April, 2021; originally announced April 2021.

    Comments: Accepted by Pattern Recognition

    Journal ref: Pattern Recognition, Volume 116, August 2021, 107954

  27. arXiv:2102.05790  [pdf]

    physics.optics cs.LG eess.IV

    Nonlocal metasurfaces for spectrally decoupled wavefront manipulation and eye tracking

    Authors: Jung-Hwan Song, Jorik van de Groep, Soo Jin Kim, Mark L. Brongersma

    Abstract: Metasurface-based optical elements typically manipulate light waves by imparting space-variant changes in the amplitude and phase with a dense array of scattering nanostructures. The highly-localized and low optical-quality-factor (Q) modes of nanostructures are beneficial for wavefront-shaping as they afford quasi-local control over the electromagnetic fields. However, many emerging imaging, sens…

    Submitted 10 February, 2021; originally announced February 2021.

  28. arXiv:2102.01163  [pdf]

    cs.MM cs.CV cs.LG

    Visual Framing of Science Conspiracy Videos: Integrating Machine Learning with Communication Theories to Study the Use of Color and Brightness

    Authors: Kaiping Chen, Sang Jung Kim, Qiantong Gao, Sebastian Raschka

    Abstract: Recent years have witnessed an explosion of science conspiracy videos on the Internet, challenging science epistemology and public understanding of science. Scholars have started to examine the persuasion techniques used in conspiracy messages such as uncertainty and fear yet, little is understood about the visual narratives, especially how visual narratives differ in videos that debunk conspiraci…

    Submitted 17 November, 2021; v1 submitted 1 February, 2021; originally announced February 2021.

  29. arXiv:2012.01632  [pdf, other]

    cs.CV

    Single-shot Path Integrated Panoptic Segmentation

    Authors: Sukjun Hwang, Seoung Wug Oh, Seon Joo Kim

    Abstract: Panoptic segmentation, which is a novel task of unifying instance segmentation and semantic segmentation, has attracted a lot of attention lately. However, most of the previous methods are composed of multiple pathways with each pathway specialized to a designated segmentation task. In this paper, we propose to resolve panoptic segmentation in single-shot by integrating the execution flows. With t…

    Submitted 3 December, 2020; v1 submitted 2 December, 2020; originally announced December 2020.

    Comments: 10 pages, 5 figures

  30. arXiv:2007.08786  [pdf, other]

    cs.CV

    Cross-Identity Motion Transfer for Arbitrary Objects through Pose-Attentive Video Reassembling

    Authors: Subin Jeon, Seonghyeon Nam, Seoung Wug Oh, Seon Joo Kim

    Abstract: We propose an attention-based networks for transferring motions between arbitrary objects. Given a source image(s) and a driving video, our networks animate the subject in the source images according to the motion in the driving video. In our attention mechanism, dense similarities between the learned keypoints in the source and the driving images are computed in order to retrieve the appearance i…

    Submitted 17 July, 2020; originally announced July 2020.

    Comments: ECCV 2020

  31. arXiv:2006.12314  [pdf]

    cs.NE cs.LG

    Always-On, Sub-300-nW, Event-Driven Spiking Neural Network based on Spike-Driven Clock-Generation and Clock- and Power-Gating for an Ultra-Low-Power Intelligent Device

    Authors: Dewei Wang, Pavan Kumar Chundi, Sung Justin Kim, Minhao Yang, Joao Pedro Cerqueira, Joonsung Kang, Seungchul Jung, Sangjoon Kim, Mingoo Seok

    Abstract: Always-on artificial intelligent (AI) functions such as keyword spotting (KWS) and visual wake-up tend to dominate total power consumption in ultra-low power devices. A key observation is that the signals to an always-on function are sparse in time, which a spiking neural network (SNN) classifier can leverage for power savings, because the switching activity and power consumption of SNNs tend to s…

    Submitted 23 June, 2020; v1 submitted 22 June, 2020; originally announced June 2020.

  32. arXiv:2005.01056  [pdf, other]

    eess.IV cs.CV

    NTIRE 2020 Challenge on Perceptual Extreme Super-Resolution: Methods and Results

    Authors: Kai Zhang, Shuhang Gu, Radu Timofte, Taizhang Shang, Qiuju Dai, Shengchen Zhu, Tong Yang, Yandong Guo, Younghyun Jo, Sejong Yang, Seon Joo Kim, Lin Zha, Jiande Jiang, Xinbo Gao, Wen Lu, Jing Liu, Kwangjin Yoon, Taegyun Jeon, Kazutoshi Akita, Takeru Ooba, Norimichi Ukita, Zhipeng Luo, Yuehan Yao, Zhenyu Xu, Dongliang He, et al. (38 additional authors not shown)

    Abstract: This paper reviews the NTIRE 2020 challenge on perceptual extreme super-resolution with focus on proposed solutions and results. The challenge task was to super-resolve an input image with a magnification factor 16 based on a set of prior examples of low and corresponding high resolution images. The goal is to obtain a network design capable to produce high resolution results with the best percept…

    Submitted 3 May, 2020; originally announced May 2020.

    Comments: CVPRW 2020

  33. arXiv:2004.02432  [pdf, other]

    cs.CV

    Deep Space-Time Video Upsampling Networks

    Authors: Jaeyeon Kang, Younghyun Jo, Seoung Wug Oh, Peter Vajda, Seon Joo Kim

    Abstract: Video super-resolution (VSR) and frame interpolation (FI) are traditional computer vision problems, and the performance have been improving by incorporating deep learning recently. In this paper, we investigate the problem of jointly upsampling videos both in space and time, which is becoming more important with advances in display systems. One solution for this is to run VSR and FI, one by one, i…

    Submitted 9 August, 2020; v1 submitted 6 April, 2020; originally announced April 2020.

    Comments: ECCV2020 accepted

  34. arXiv:2003.09171  [pdf, other]

    cs.CV cs.LG eess.IV

    DMV: Visual Object Tracking via Part-level Dense Memory and Voting-based Retrieval

    Authors: Gunhee Nam, Seoung Wug Oh, Joon-Young Lee, Seon Joo Kim

    Abstract: We propose a novel memory-based tracker via part-level dense memory and voting-based retrieval, called DMV. Since deep learning techniques have been introduced to the tracking field, Siamese trackers have attracted many researchers due to the balance between speed and accuracy. However, most of them are based on a single template matching, which limits the performance as it restricts the accessibl…

    Submitted 20 March, 2020; originally announced March 2020.

    Comments: 19 pages, 9 figures

  35. arXiv:2003.09124  [pdf, other]

    eess.IV cs.CV

    Learning the Loss Functions in a Discriminative Space for Video Restoration

    Authors: Younghyun Jo, Jaeyeon Kang, Seoung Wug Oh, Seonghyeon Nam, Peter Vajda, Seon Joo Kim

    Abstract: With more advanced deep network architectures and learning schemes such as GANs, the performance of video restoration algorithms has greatly improved recently. Meanwhile, the loss functions for optimizing deep neural networks remain relatively unchanged. To this end, we propose a new framework for building effective loss functions by learning a discriminative space specific to a video restoration…

    Submitted 20 March, 2020; originally announced March 2020.

    Comments: 24 pages

  36. arXiv:1910.02027  [pdf, other]

    cs.CV

    Unsupervised Keypoint Learning for Guiding Class-Conditional Video Prediction

    Authors: Yunji Kim, Seonghyeon Nam, In Cho, Seon Joo Kim

    Abstract: We propose a deep video prediction model conditioned on a single image and an action class. To generate future frames, we first detect keypoints of a moving object and predict future motion as a sequence of keypoints. The input image is then translated following the predicted keypoints sequence to compose future frames. Detecting the keypoints is central to our algorithm, and our method is trained…

    Submitted 4 October, 2019; originally announced October 2019.

    Comments: NeurIPS 2019

  37. arXiv:1908.11587  [pdf, other]

    cs.CV

    Copy-and-Paste Networks for Deep Video Inpainting

    Authors: Sungho Lee, Seoung Wug Oh, DaeYeun Won, Seon Joo Kim

    Abstract: We present a novel deep learning based algorithm for video inpainting. Video inpainting is a process of completing corrupted or missing regions in videos. Video inpainting has additional challenges compared to image inpainting due to the extra temporal information as well as the need for maintaining the temporal coherency. We propose a novel DNN-based framework called the Copy-and-Paste Networks f…

    Submitted 30 August, 2019; originally announced August 2019.

    Comments: ICCV 2019

  38. arXiv:1908.08718  [pdf, other]

    cs.CV

    Onion-Peel Networks for Deep Video Completion

    Authors: Seoung Wug Oh, Sungho Lee, Joon-Young Lee, Seon Joo Kim

    Abstract: We propose the onion-peel networks for video completion. Given a set of reference images and a target image with holes, our network fills the hole by referring the contents in the reference images. Our onion-peel network progressively fills the hole from the hole boundary enabling it to exploit richer contextual information for the missing regions every step. Given a sufficient number of recurrenc…

    Submitted 23 August, 2019; originally announced August 2019.

    Comments: ICCV 2019

  39. arXiv:1904.09791  [pdf, other]

    cs.CV

    Fast User-Guided Video Object Segmentation by Interaction-and-Propagation Networks

    Authors: Seoung Wug Oh, Joon-Young Lee, Ning Xu, Seon Joo Kim

    Abstract: We present a deep learning method for the interactive video object segmentation. Our method is built upon two core operations, interaction and propagation, and each operation is conducted by Convolutional Neural Networks. The two networks are connected both internally and externally so that the networks are trained jointly and interact with each other to solve the complex video object segmentation…

    Submitted 2 May, 2019; v1 submitted 22 April, 2019; originally announced April 2019.

    Comments: CVPR 2019

  40. arXiv:1904.00680  [pdf, other]

    cs.CV

    End-to-End Time-Lapse Video Synthesis from a Single Outdoor Image

    Authors: Seonghyeon Nam, Chongyang Ma, Menglei Chai, William Brendel, Ning Xu, Seon Joo Kim

    Abstract: Time-lapse videos usually contain visually appealing content but are often difficult and costly to create. In this paper, we present an end-to-end solution to synthesize a time-lapse video from a single outdoor image using deep neural networks. Our key idea is to train a conditional generative adversarial network based on existing datasets of time-lapse videos and image sequences. We propose a mul…

    Submitted 1 April, 2019; originally announced April 2019.

    Comments: To appear in CVPR 2019

  41. arXiv:1904.00607  [pdf, other]

    cs.CV

    Video Object Segmentation using Space-Time Memory Networks

    Authors: Seoung Wug Oh, Joon-Young Lee, Ning Xu, Seon Joo Kim

    Abstract: We propose a novel solution for semi-supervised video object segmentation. By the nature of the problem, available cues (e.g. video frame(s) with object masks) become richer with the intermediate predictions. However, the existing methods are unable to fully exploit this rich source of information. We resolve the issue by leveraging memory networks and learn to read relevant information from all a…

    Submitted 12 August, 2019; v1 submitted 1 April, 2019; originally announced April 2019.

    Comments: ICCV 2019

  42. arXiv:1810.11919  [pdf, other]

    cs.CV

    Text-Adaptive Generative Adversarial Networks: Manipulating Images with Natural Language

    Authors: Seonghyeon Nam, Yunji Kim, Seon Joo Kim

    Abstract: This paper addresses the problem of manipulating images using natural language description. Our task aims to semantically modify visual attributes of an object in an image according to the text describing the new visual appearance. Although existing methods synthesize images having new attributes, they do not fully preserve text-irrelevant contents of the original image. In this paper, we propose…

    Submitted 27 November, 2018; v1 submitted 28 October, 2018; originally announced October 2018.

    Comments: NeurIPS 2018

  43. arXiv:1804.02379  [pdf, other]

    cs.CV

    EPINET: A Fully-Convolutional Neural Network Using Epipolar Geometry for Depth from Light Field Images

    Authors: Changha Shin, Hae-Gon Jeon, Youngjin Yoon, In So Kweon, Seon Joo Kim

    Abstract: Light field cameras capture both the spatial and the angular properties of light rays in space. Due to its property, one can compute the depth from light fields in uncontrolled lighting environments, which is a big advantage over active sensing devices. Depth computed from light fields can be used for many applications including 3D modelling and refocusing. However, light field images from hand-he…

    Submitted 6 April, 2018; originally announced April 2018.

    Comments: Accepted to CVPR 2018, Total 10 pages

  44. arXiv:1707.08350  [pdf, other]

    cs.CV

    Modelling the Scene Dependent Imaging in Cameras with a Deep Neural Network

    Authors: Seonghyeon Nam, Seon Joo Kim

    Abstract: We present a novel deep learning framework that models the scene dependent image processing inside cameras. Often called as the radiometric calibration, the process of recovering RAW images from processed images (JPEG format in the sRGB color space) is essential for many computer vision tasks that rely on physically accurate radiance values. All previous works rely on the deterministic imaging mod…

    Submitted 26 July, 2017; originally announced July 2017.

    Comments: To appear in ICCV 2017

  45. arXiv:1706.08260  [pdf, other]

    cs.CV

    Deep Semantics-Aware Photo Adjustment

    Authors: Seonghyeon Nam, Seon Joo Kim

    Abstract: Automatic photo adjustment is to mimic the photo retouching style of professional photographers and automatically adjust photos to the learned style. There have been many attempts to model the tone and the color adjustment globally with low-level color statistics. Also, spatially varying photo adjustment methods have been studied by exploiting high-level features and semantic label maps. Those met…

    Submitted 26 June, 2017; originally announced June 2017.

  46. arXiv:1705.07543  [pdf, other]

    cs.CV cs.MM

    Building Emotional Machines: Recognizing Image Emotions through Deep Neural Networks

    Authors: Hye-Rin Kim, Yeong-Seok Kim, Seon Joo Kim, In-Kwon Lee

    Abstract: An image is a very effective tool for conveying emotions. Many researchers have investigated in computing the image emotions by using various features extracted from images. In this paper, we focus on two high level features, the object and the background, and assume that the semantic information of images is a good cue for predicting emotion. An object is one of the most important elements that d…

    Submitted 3 July, 2017; v1 submitted 21 May, 2017; originally announced May 2017.

    Comments: 11 pages, 15 figures

  47. Approaching the Computational Color Constancy as a Classification Problem through Deep Learning

    Authors: Seoung Wug Oh, Seon Joo Kim

    Abstract: Computational color constancy refers to the problem of computing the illuminant color so that the images of a scene under varying illumination can be normalized to an image under the canonical illumination. In this paper, we adopt a deep learning framework for the illumination estimation problem. The proposed method works under the assumption of uniform illumination over the scene and aims for the…

    Submitted 29 August, 2016; originally announced August 2016.

    Comments: This is a preprint of an article accepted for publication in Pattern Recognition, ELSEVIER

    Journal ref: Pattern Recognition, Volume 61, January 2017, Pages 405 to 416

  48. Regularization and Kernelization of the Maximin Correlation Approach

    Authors: Taehoon Lee, Taesup Moon, Seung Jean Kim, Sungroh Yoon

    Abstract: Robust classification becomes challenging when each class consists of multiple subclasses. Examples include multi-font optical character recognition and automated protein function prediction. In correlation-based nearest-neighbor classification, the maximin correlation approach (MCA) provides the worst-case optimal solution by minimizing the maximum misclassification risk through an iterative proc…

    Submitted 29 March, 2016; v1 submitted 21 February, 2015; originally announced February 2015.

    Comments: Submitted to IEEE Access

  49. arXiv:1002.0123  [pdf, ps, other]

    cs.IT

    Achievable rate regions and outer bounds for a multi-pair bi-directional relay network

    Authors: Sang Joon Kim, Besma Smida, Natasha Devroye

    Abstract: In a bi-directional relay channel, a pair of nodes wish to exchange independent messages over a shared wireless half-duplex channel with the help of relays. Recent work has mostly considered information theoretic limits of the bi-directional relay channel with two terminal nodes (or end users) and one relay. In this work we consider bi-directional relaying with one base station, multiple termina…

    Submitted 31 January, 2010; originally announced February 2010.

    Comments: 61 pages, 12 figures, will be submitted to IEEE info theory

  50. arXiv:0810.1268  [pdf, ps, other]

    cs.IT

    Bi-directional half-duplex protocols with multiple relays

    Authors: Sang Joon Kim, Natasha Devroye, Vahid Tarokh

    Abstract: In a bi-directional relay channel, two nodes wish to exchange independent messages over a shared wireless half-duplex channel with the help of relays. Recent work has considered information theoretic limits of the bi-directional relay channel with a single relay. In this work we consider bi-directional relaying with multiple relays. We derive achievable rate regions and outer bounds for half-duple…

    Submitted 9 September, 2010; v1 submitted 7 October, 2008; originally announced October 2008.

    Comments: 44 pages, 17 figures, Submitted to IEEE Transactions on Information Theory