
Showing 1–50 of 117 results for author: Moon, J

Searching in archive cs.
  1. arXiv:2407.01193  [pdf, other]

    cs.CV

    Cross-Architecture Auxiliary Feature Space Translation for Efficient Few-Shot Personalized Object Detection

    Authors: Francesco Barbato, Umberto Michieli, Jijoong Moon, Pietro Zanuttigh, Mete Ozay

    Abstract: Recent years have seen object detection robotic systems deployed in several personal devices (e.g., home robots and appliances). This has highlighted a challenge in their design, i.e., they cannot efficiently update their knowledge to distinguish between general classes and user-specific instances (e.g., a dog vs. user's dog). We refer to this challenging task as Instance-level Personalized Object…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: Accepted at IROS 2024, 8 pages, 4 figures, 6 tables

  2. arXiv:2406.12632  [pdf, other]

    eess.IV cs.CV

    Cyclic 2.5D Perceptual Loss for Cross-Modal 3D Image Synthesis: T1 MRI to Tau-PET

    Authors: Symac Kim, Junho Moon, Haejun Chung, Ikbeom Jang

    Abstract: Alzheimer's Disease (AD) is the most common form of dementia, characterised by cognitive decline and biomarkers such as tau-proteins. Tau-positron emission tomography (tau-PET), which employs a radiotracer to selectively bind, detect, and visualise tau protein aggregates within the brain, is valuable for early AD diagnosis but is less accessible due to high costs, limited availability, and its inv…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 24 pages, 5 figures

  3. arXiv:2405.18623  [pdf]

    cs.HC

    I See You: Teacher Analytics with GPT-4 Vision-Powered Observational Assessment

    Authors: Unggi Lee, Yeil Jeong, Junbo Koh, Gyuri Byun, Yunseo Lee, Hyunwoong Lee, Seunmin Eun, Jewoong Moon, Cheolil Lim, Hyeoncheol Kim

    Abstract: This preliminary study explores the integration of GPT-4 Vision (GPT-4V) technology into teacher analytics, focusing on its applicability in observational assessment to enhance reflective teaching practice. This research is grounded in developing a Video-based Automatic Assessment System (VidAAS) empowered by GPT-4V. Our approach aims to revolutionize teachers' assessment of students' practices by…

    Submitted 30 May, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

    Comments: 27 pages, 5 figures, 4 tables

  4. arXiv:2405.03945  [pdf, other]

    cs.CV cs.NI

    Role of Sensing and Computer Vision in 6G Wireless Communications

    Authors: Seungnyun Kim, Jihoon Moon, Jinhong Kim, Yongjun Ahn, Donghoon Kim, Sunwoo Kim, Kyuhong Shim, Byonghyo Shim

    Abstract: Recently, we are witnessing the remarkable progress and widespread adoption of sensing technologies in autonomous driving, robotics, and metaverse. Considering the rapid advancement of computer vision (CV) technology to analyze the sensing information, we anticipate a proliferation of wireless applications exploiting the sensing and CV technologies in 6G. In this article, we provide a holistic ove…

    Submitted 6 May, 2024; originally announced May 2024.

  5. arXiv:2404.11916  [pdf, other]

    cs.CL cs.AI

    SKIP: Skill-Localized Prompt Tuning for Inference Speed Boost-Up

    Authors: Nakyeong Yang, Junseok Kim, Jiwon Moon, Yunah Jang, Kyomin Jung

    Abstract: Prompt-tuning methods have shown comparable performance as parameter-efficient fine-tuning (PEFT) methods in various natural language understanding tasks. However, existing prompt tuning methods still utilize the entire model architecture; thus, they fail to accelerate inference speed in the application. In this paper, we propose a novel approach called SKIll-localized Prompt tuning (SKIP), which…

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: 6 pages

  6. arXiv:2404.10649  [pdf]

    cs.HC

    Navigating the Serious Game Design Landscape: A Comprehensive Reference Document

    Authors: Julieana Moon, Naimul Khan

    Abstract: Within the evolving field of digital intervention, serious games emerge as promising tools for evidence-based interventions. Research indicates that gamified therapy, whether employed independently or in conjunction with online psychoeducation or traditional programs, proves more efficacious in delivering care to patients. As we navigate the intricate realm of serious game design, bridging the gap…

    Submitted 16 April, 2024; originally announced April 2024.

    Comments: In press

  7. arXiv:2404.07610  [pdf, other]

    cs.CV

    Do You Remember? Dense Video Captioning with Cross-Modal Memory Retrieval

    Authors: Minkuk Kim, Hyeon Bae Kim, Jinyoung Moon, Jinwoo Choi, Seong Tae Kim

    Abstract: There has been significant attention to the research on dense video captioning, which aims to automatically localize and caption all events within untrimmed video. Several studies introduce methods by designing dense video captioning as a multitasking problem of event localization and event captioning to consider inter-task relations. However, addressing both tasks using only visual input is chall…

    Submitted 11 April, 2024; originally announced April 2024.

    Comments: CVPR 2024

  8. arXiv:2404.01397  [pdf, other]

    cs.CV cs.AI cs.RO

    Object-conditioned Bag of Instances for Few-Shot Personalized Instance Recognition

    Authors: Umberto Michieli, Jijoong Moon, Daehyun Kim, Mete Ozay

    Abstract: Nowadays, users demand for increased personalization of vision systems to localize and identify personal instances of objects (e.g., my dog rather than dog) from a few-shot dataset only. Despite outstanding results of deep networks on classical label-abundant benchmarks (e.g., those of the latest YOLOv8 model for standard object detection), they struggle to maintain within-class variability to rep…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: ICASSP 2024. Copyright 2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other

  9. arXiv:2404.00928  [pdf, other]

    cs.CV cs.LG

    Instance-Aware Group Quantization for Vision Transformers

    Authors: Jaehyeon Moon, Dohyung Kim, Junyong Cheon, Bumsub Ham

    Abstract: Post-training quantization (PTQ) is an efficient model compression technique that quantizes a pretrained full-precision model using only a small calibration set of unlabeled samples without retraining. PTQ methods for convolutional neural networks (CNNs) provide quantization results comparable to full-precision counterparts. Directly applying them to vision transformers (ViTs), however, incurs sev…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: CVPR 2024

  10. arXiv:2403.14335  [pdf, other]

    cs.CV

    FFT-based Selection and Optimization of Statistics for Robust Recognition of Severely Corrupted Images

    Authors: Elena Camuffo, Umberto Michieli, Jijoong Moon, Daehyun Kim, Mete Ozay

    Abstract: Improving model robustness in case of corrupted images is among the key challenges to enable robust vision systems on smart devices, such as robotic agents. Particularly, robust test-time performance is imperative for most of the applications. This paper presents a novel approach to improve robustness of any classification model, especially on severely corrupted images. Our method (FROST) employs…

    Submitted 21 March, 2024; originally announced March 2024.

    Comments: ICASSP 2024. Copyright 2024 IEEE.

  11. arXiv:2402.18614  [pdf, other]

    cs.LG cs.CV cs.NE

    Deep Neural Network Models Trained With A Fixed Random Classifier Transfer Better Across Domains

    Authors: Hafiz Tiomoko Ali, Umberto Michieli, Ji Joong Moon, Daehyun Kim, Mete Ozay

    Abstract: The recently discovered Neural collapse (NC) phenomenon states that the last-layer weights of Deep Neural Networks (DNN), converge to the so-called Equiangular Tight Frame (ETF) simplex, at the terminal phase of their training. This ETF geometry is equivalent to vanishing within-class variability of the last layer activations. Inspired by NC properties, we explore in this paper the transferability…

    Submitted 28 February, 2024; originally announced February 2024.

    Comments: ICASSP 2024. Copyright 2024 IEEE.

  12. arXiv:2402.15019  [pdf, other]

    cs.LG cs.AI stat.ML

    Consistency-Guided Temperature Scaling Using Style and Content Information for Out-of-Domain Calibration

    Authors: Wonjeong Choi, Jungwuk Park, Dong-Jun Han, Younghyun Park, Jaekyun Moon

    Abstract: Research interests in the robustness of deep neural networks against domain shifts have been rapidly increasing in recent years. Most existing works, however, focus on improving the accuracy of the model, not the calibration performance which is another important requirement for trustworthy AI systems. Temperature scaling (TS), an accuracy-preserving post-hoc calibration method, has been proven to…

    Submitted 22 February, 2024; originally announced February 2024.

    Comments: Accepted at AAAI-24 (The 38th AAAI Conference on Artificial Intelligence, February 2024)

  13. arXiv:2401.09786  [pdf, other]

    cs.CV cs.AI

    Adaptive Self-training Framework for Fine-grained Scene Graph Generation

    Authors: Kibum Kim, Kanghoon Yoon, Yeonjun In, Jinyoung Moon, Donghyun Kim, Chanyoung Park

    Abstract: Scene graph generation (SGG) models have suffered from inherent problems regarding the benchmark datasets such as the long-tailed predicate distribution and missing annotation problems. In this work, we aim to alleviate the long-tailed problem of SGG by utilizing unannotated triplets. To this end, we introduce a Self-Training framework for SGG (ST-SGG) that assigns pseudo-labels for unannotated tr…

    Submitted 18 January, 2024; originally announced January 2024.

    Comments: 9 pages; ICLR 2024

  14. arXiv:2312.10118  [pdf, other]

    cs.CV

    From-Ground-To-Objects: Coarse-to-Fine Self-supervised Monocular Depth Estimation of Dynamic Objects with Ground Contact Prior

    Authors: Jaeho Moon, Juan Luis Gonzalez Bello, Byeongjun Kwon, Munchurl Kim

    Abstract: Self-supervised monocular depth estimation (DE) is an approach to learning depth without costly depth ground truths. However, it often struggles with moving objects that violate the static scene assumption during training. To address this issue, we introduce a coarse-to-fine training strategy leveraging the ground contacting prior based on the observation that most moving objects in outdoor scenes…

    Submitted 15 December, 2023; originally announced December 2023.

  15. arXiv:2312.06532  [pdf, other]

    cs.AR cs.ET cs.LG

    RACE-IT: A Reconfigurable Analog CAM-Crossbar Engine for In-Memory Transformer Acceleration

    Authors: Lei Zhao, Luca Buonanno, Ron M. Roth, Sergey Serebryakov, Archit Gajjar, John Moon, Jim Ignowski, Giacomo Pedretti

    Abstract: Transformer models represent the cutting edge of Deep Neural Networks (DNNs) and excel in a wide range of machine learning tasks. However, processing these models demands significant computational resources and results in a substantial memory footprint. While In-memory Computing (IMC) offers promise for accelerating Matrix-Vector Multiplications (MVMs) with high computational parallelism and minim…

    Submitted 29 November, 2023; originally announced December 2023.

  16. arXiv:2311.07485  [pdf, other]

    cs.LG cs.AI cs.CV cs.NE

    EvoFed: Leveraging Evolutionary Strategies for Communication-Efficient Federated Learning

    Authors: Mohammad Mahdi Rahimi, Hasnain Irshad Bhatti, Younghyun Park, Humaira Kousar, Jaekyun Moon

    Abstract: Federated Learning (FL) is a decentralized machine learning paradigm that enables collaborative model training across dispersed nodes without having to force individual nodes to share data. However, its broad adoption is hindered by the high communication costs of transmitting a large number of model parameters. This paper presents EvoFed, a novel approach that integrates Evolutionary Strategies (…

    Submitted 13 November, 2023; originally announced November 2023.

  17. arXiv:2311.05669  [pdf, other]

    cs.CV

    Multi-Modal Gaze Following in Conversational Scenarios

    Authors: Yuqi Hou, Zhongqun Zhang, Nora Horanyi, Jaewon Moon, Yihua Cheng, Hyung Jin Chang

    Abstract: Gaze following estimates gaze targets of in-scene person by understanding human behavior and scene information. Existing methods usually analyze scene images for gaze following. However, compared with visual images, audio also provides crucial cues for determining human behavior. This suggests that we can further improve gaze following considering audio cues. In this paper, we explore gaze followin…

    Submitted 9 November, 2023; originally announced November 2023.

  18. arXiv:2311.00428  [pdf, other]

    cs.LG

    NEO-KD: Knowledge-Distillation-Based Adversarial Training for Robust Multi-Exit Neural Networks

    Authors: Seokil Ham, Jungwuk Park, Dong-Jun Han, Jaekyun Moon

    Abstract: While multi-exit neural networks are regarded as a promising solution for making efficient inference via early exits, combating adversarial attacks remains a challenging problem. In multi-exit networks, due to the high dependency among different submodels, an adversarial example targeting a specific exit not only degrades the performance of the target exit but also reduces the performance of all o…

    Submitted 1 November, 2023; originally announced November 2023.

    Comments: 10 pages, 4 figures, accepted by 37th Conference on Neural Information Processing Systems (NeurIPS 2023)

  19. arXiv:2311.00227  [pdf, other]

    cs.LG cs.AI

    StableFDG: Style and Attention Based Learning for Federated Domain Generalization

    Authors: Jungwuk Park, Dong-Jun Han, Jinho Kim, Shiqiang Wang, Christopher G. Brinton, Jaekyun Moon

    Abstract: Traditional federated learning (FL) algorithms operate under the assumption that the data distributions at training (source domains) and testing (target domain) are the same. The fact that domain shifts often occur in practice necessitates equipping FL methods with a domain generalization (DG) capability. However, existing DG algorithms face fundamental challenges in FL setups due to the lack of s…

    Submitted 31 October, 2023; originally announced November 2023.

    Comments: Accepted at NeurIPS 2023, 19 pages

  20. arXiv:2310.18481  [pdf, other]

    cs.LG cs.AI cs.OS

    MOSEL: Inference Serving Using Dynamic Modality Selection

    Authors: Bodun Hu, Le Xu, Jeongyoon Moon, Neeraja J. Yadwadkar, Aditya Akella

    Abstract: Rapid advancements over the years have helped machine learning models reach previously hard-to-achieve goals, sometimes even exceeding human capabilities. However, to attain the desired accuracy, the model sizes and in turn their computational requirements have increased drastically. Thus, serving predictions from these models to meet any target latency and cost requirements of applications remain…

    Submitted 27 October, 2023; originally announced October 2023.

  21. arXiv:2310.10404  [pdf, other]

    cs.CV cs.AI cs.LG

    LLM4SGG: Large Language Models for Weakly Supervised Scene Graph Generation

    Authors: Kibum Kim, Kanghoon Yoon, Jaehyeong Jeon, Yeonjun In, Jinyoung Moon, Donghyun Kim, Chanyoung Park

    Abstract: Weakly-Supervised Scene Graph Generation (WSSGG) research has recently emerged as an alternative to the fully-supervised approach that heavily relies on costly annotations. In this regard, studies on WSSGG have utilized image captions to obtain unlocalized triplets while primarily focusing on grounding the unlocalized triplets over image regions. However, they have overlooked the two issues involv…

    Submitted 18 April, 2024; v1 submitted 16 October, 2023; originally announced October 2023.

    Comments: 8 pages; CVPR 2024

  22. arXiv:2309.05913  [pdf, other]

    cs.CR

    Behind The Wings: The Case of Reverse Engineering and Drone Hijacking in DJI Enhanced Wi-Fi Protocol

    Authors: Derry Pratama, Jaegeun Moon, Agus Mahardika Ari Laksmono, Dongwook Yun, Iqbal Muhammad, Byeonguk Jeong, Janghyun Ji, Howon Kim

    Abstract: This research paper entails an examination of the Enhanced Wi-Fi protocol, focusing on its control command reverse-engineering analysis and subsequent demonstration of a hijacking attack. Our investigation discovered vulnerabilities in the Enhanced Wi-Fi control commands, rendering them susceptible to hijacking attacks. Notably, the study established that even readily available and cost-effective…

    Submitted 11 September, 2023; originally announced September 2023.

    Comments: Open-source PoC available on GitHub at https://github.com/ibndias/dji-drone-hijacking ; 10 pages

  23. arXiv:2309.00237  [pdf, other]

    cs.CL cs.AI

    Publicly Shareable Clinical Large Language Model Built on Synthetic Clinical Notes

    Authors: Sunjun Kweon, Junu Kim, Jiyoun Kim, Sujeong Im, Eunbyeol Cho, Seongsu Bae, Jungwoo Oh, Gyubok Lee, Jong Hak Moon, Seng Chan You, Seungjin Baek, Chang Hoon Han, Yoon Bin Jung, Yohan Jo, Edward Choi

    Abstract: The development of large language models tailored for handling patients' clinical notes is often hindered by the limited accessibility and usability of these notes due to strict privacy regulations. To address these challenges, we first create synthetic large-scale clinical notes using publicly available case reports extracted from biomedical literature. We then use these synthetic notes to train…

    Submitted 13 June, 2024; v1 submitted 1 September, 2023; originally announced September 2023.

    Comments: ACL 2024 (Findings)

  24. arXiv:2308.09303  [pdf, other]

    cs.CV cs.LG

    Online Class Incremental Learning on Stochastic Blurry Task Boundary via Mask and Visual Prompt Tuning

    Authors: Jun-Yeong Moon, Keon-Hee Park, Jung Uk Kim, Gyeong-Moon Park

    Abstract: Continual learning aims to learn a model from a continuous stream of data, but it mainly assumes a fixed number of data and tasks with clear task boundaries. However, in real-world scenarios, the number of input data and tasks is constantly changing in a statistical way, not a static way. Although recently introduced incremental learning scenarios having blurry task boundaries somewhat address the…

    Submitted 18 August, 2023; originally announced August 2023.

  25. arXiv:2308.02161  [pdf, other]

    cs.CV

    M2Former: Multi-Scale Patch Selection for Fine-Grained Visual Recognition

    Authors: Jiyong Moon, Junseok Lee, Yunju Lee, Seongsik Park

    Abstract: Recently, vision Transformers (ViTs) have been actively applied to fine-grained visual recognition (FGVR). ViT can effectively model the interdependencies between patch-divided object regions through an inherent self-attention mechanism. In addition, patch selection is used with ViT to remove redundant patch information and highlight the most discriminative object patches. However, existing ViT-ba…

    Submitted 4 August, 2023; originally announced August 2023.

  26. arXiv:2308.00009  [pdf]

    eess.IV cs.LG

    A 3D deep learning classifier and its explainability when assessing coronary artery disease

    Authors: Wing Keung Cheung, Jeremy Kalindjian, Robert Bell, Arjun Nair, Leon J. Menezes, Riyaz Patel, Simon Wan, Kacy Chou, Jiahang Chen, Ryo Torii, Rhodri H. Davies, James C. Moon, Daniel C. Alexander, Joseph Jacob

    Abstract: Early detection and diagnosis of coronary artery disease (CAD) could save lives and reduce healthcare costs. In this study, we propose a 3D Resnet-50 deep learning model to directly classify normal subjects and CAD patients on computed tomography coronary angiography images. Our proposed method outperforms a 2D Resnet-50 model by 23.65%. Explainability is also provided by using a Grad-CAM. Further…

    Submitted 29 July, 2023; originally announced August 2023.

  27. arXiv:2306.04911  [pdf, other]

    cs.CV cs.AI

    Test-Time Style Shifting: Handling Arbitrary Styles in Domain Generalization

    Authors: Jungwuk Park, Dong-Jun Han, Soyeong Kim, Jaekyun Moon

    Abstract: In domain generalization (DG), the target domain is unknown when the model is being trained, and the trained model should successfully work on an arbitrary (and possibly unseen) target domain during inference. This is a difficult problem, and despite active studies in recent years, it remains a great challenge. In this paper, we take a simple yet effective approach to tackle this issue. We propose…

    Submitted 12 June, 2023; v1 submitted 7 June, 2023; originally announced June 2023.

    Comments: ICML 2023 camera-ready version

  28. It's Enough: Relaxing Diagonal Constraints in Linear Autoencoders for Recommendation

    Authors: Jaewan Moon, Hye-young Kim, Jongwuk Lee

    Abstract: Linear autoencoder models learn an item-to-item weight matrix via convex optimization with L2 regularization and zero-diagonal constraints. Despite their simplicity, they have shown remarkable performance compared to sophisticated non-linear models. This paper aims to theoretically understand the properties of two terms in linear autoencoders. Through the lens of singular value decomposition (SVD)…

    Submitted 22 May, 2023; originally announced May 2023.

    Comments: Accepted by SIGIR 2023

  29. arXiv:2305.10731  [pdf, other]

    cs.CL

    Analyzing Norm Violations in Live-Stream Chat

    Authors: Jihyung Moon, Dong-Ho Lee, Hyundong Cho, Woojeong Jin, Chan Young Park, Minwoo Kim, Jonathan May, Jay Pujara, Sungjoon Park

    Abstract: Toxic language, such as hate speech, can deter users from participating in online communities and enjoying popular platforms. Previous approaches to detecting toxic language and norm violations have been primarily concerned with conversations from online forums and social media, such as Reddit and Twitter. These approaches are less effective when applied to conversations on live-streaming platform…

    Submitted 7 October, 2023; v1 submitted 18 May, 2023; originally announced May 2023.

    Comments: 17 pages, 8 figures, 15 tables

  30. arXiv:2304.01285  [pdf, other]

    cs.LG

    X-TIME: An in-memory engine for accelerating machine learning on tabular data with CAMs

    Authors: Giacomo Pedretti, John Moon, Pedro Bruel, Sergey Serebryakov, Ron M. Roth, Luca Buonanno, Archit Gajjar, Tobias Ziegler, Cong Xu, Martin Foltin, Paolo Faraboschi, Jim Ignowski, Catherine E. Graves

    Abstract: Structured, or tabular, data is the most common format in data science. While deep learning models have proven formidable in learning from unstructured data such as images or speech, they are less accurate than simpler approaches when learning from tabular data. In contrast, modern tree-based Machine Learning (ML) models shine in extracting relevant information from structured data. An essential r…

    Submitted 2 February, 2024; v1 submitted 3 April, 2023; originally announced April 2023.

  31. arXiv:2303.07648  [pdf, ps, other]

    cs.CV

    SimFLE: Simple Facial Landmark Encoding for Self-Supervised Facial Expression Recognition in the Wild

    Authors: Jiyong Moon, Seongsik Park

    Abstract: One of the key issues in facial expression recognition in the wild (FER-W) is that curating large-scale labeled facial images is challenging due to the inherent complexity and ambiguity of facial images. Therefore, in this paper, we propose a self-supervised simple facial landmark encoding (SimFLE) method that can learn effective encoding of facial landmarks, which are important features for impro…

    Submitted 14 March, 2023; originally announced March 2023.

  32. arXiv:2212.08343  [pdf, other]

    cs.LG

    SplitGP: Achieving Both Generalization and Personalization in Federated Learning

    Authors: Dong-Jun Han, Do-Yeon Kim, Minseok Choi, Christopher G. Brinton, Jaekyun Moon

    Abstract: A fundamental challenge to providing edge-AI services is the need for a machine learning (ML) model that achieves personalization (i.e., to individual clients) and generalization (i.e., to unseen data) properties concurrently. Existing techniques in federated learning (FL) have encountered a steep tradeoff between these objectives and impose large computational requirements on edge devices during…

    Submitted 11 February, 2023; v1 submitted 16 December, 2022; originally announced December 2022.

    Comments: To appear in IEEE INFOCOM 2023

  33. arXiv:2212.00443  [pdf, other]

    cs.AI

    Unbiased Heterogeneous Scene Graph Generation with Relation-aware Message Passing Neural Network

    Authors: Kanghoon Yoon, Kibum Kim, Jinyoung Moon, Chanyoung Park

    Abstract: Recent scene graph generation (SGG) frameworks have focused on learning complex relationships among multiple objects in an image. Thanks to the nature of the message passing neural network (MPNN) that models high-order interactions between objects and their neighboring objects, they are dominant representation learning modules for SGG. However, existing MPNN-based frameworks assume the scene graph…

    Submitted 6 July, 2023; v1 submitted 1 December, 2022; originally announced December 2022.

    Comments: 9 pages; AAAI 2023

  34. arXiv:2210.08819  [pdf, other]

    cs.CV

    Correlation between Alignment-Uniformity and Performance of Dense Contrastive Representations

    Authors: Jong Hak Moon, Wonjae Kim, Edward Choi

    Abstract: Recently, dense contrastive learning has shown superior performance on dense prediction tasks compared to instance-level contrastive learning. Despite its supremacy, the properties of dense contrastive representations have not yet been carefully studied. Therefore, we analyze the theoretical ideas of dense contrastive learning using a standard CNN and straightforward feature matching scheme rather…

    Submitted 17 October, 2022; originally announced October 2022.

    Comments: Accepted at BMVC 2022

  35. A Codebook Design for FD-MIMO Systems with Multi-Panel Array

    Authors: Zhilin Fu, Sangwon Hwang, Jihwan Moon, Haibao Ren, Inkyu Lee

    Abstract: In this work, we study codebook designs for full-dimension multiple-input multiple-output (FD-MIMO) systems with a multi-panel array (MPA). We propose novel codebooks which allow precise beam structures for MPA FD-MIMO systems by investigating the physical properties and alignments of the panels. We specifically exploit the characteristic that a group of antennas in a vertical direction exhibit mo…

    Submitted 9 August, 2022; originally announced August 2022.

  36. arXiv:2208.00821  [pdf, other]

    cs.LG

    Locally Supervised Learning with Periodic Global Guidance

    Authors: Hasnain Irshad Bhatti, Jaekyun Moon

    Abstract: Locally supervised learning aims to train a neural network based on a local estimation of the global loss function at each decoupled module of the network. Auxiliary networks are typically appended to the modules to approximate the gradient updates based on the greedy local losses. Despite being advantageous in terms of parallelism and reduced memory consumption, this paradigm of training severely…

    Submitted 1 August, 2022; originally announced August 2022.

    Comments: Accepted at ICML 2022 Hardware Aware Efficient Training (HAET) Workshop

  37. arXiv:2207.00003  [pdf, other]

    cs.LG cs.CV eess.IV

    A Multi-stage Framework with Mean Subspace Computation and Recursive Feedback for Online Unsupervised Domain Adaptation

    Authors: Jihoon Moon, Debasmit Das, C. S. George Lee

    Abstract: In this paper, we address the Online Unsupervised Domain Adaptation (OUDA) problem and propose a novel multi-stage framework to solve real-world situations when the target data are unlabeled and arriving online sequentially in batches. To project the data from the source and the target domains to a common subspace and manipulate the projected data in real-time, our proposed framework institutes a…

    Submitted 23 June, 2022; originally announced July 2022.

  38. arXiv:2206.04688  [pdf, other]

    cs.LG

    A New Frontier of AI: On-Device AI Training and Personalization

    Authors: Ji Joong Moon, Hyun Suk Lee, Jiho Chu, Donghak Park, Seungbaek Hong, Hyungjun Seo, Donghyeon Jeong, Sungsik Kong, MyungJoo Ham

    Abstract: Modern consumer electronic devices have started executing deep learning-based intelligence services on devices, not cloud servers, to keep personal data on devices and to reduce network and cloud costs. We find such a trend as the opportunity to personalize intelligence services by updating neural networks with user data without exposing the data out of devices: on-device training. However, the li…

    Submitted 4 January, 2024; v1 submitted 9 June, 2022; originally announced June 2022.

    Comments: 12 pages, 16 figures, Accepted in ICSE 2024

  39. arXiv:2205.11315  [pdf, other]

    cs.CL cs.AI

    KOLD: Korean Offensive Language Dataset

    Authors: Younghoon Jeong, Juhyun Oh, Jaimeen Ahn, Jongwon Lee, Jihyung Moon, Sungjoon Park, Alice Oh

    Abstract: Recent directions for offensive language detection are hierarchical modeling, identifying the type and the target of offensive language, and interpretability with offensive span annotation and prediction. These improvements are focused on English and do not transfer well to other languages because of cultural and linguistic differences. In this paper, we present the Korean Offensive Language Datas…

    Submitted 4 November, 2022; v1 submitted 23 May, 2022; originally announced May 2022.

    Comments: 9 pages, 2 figures

  40. arXiv:2205.08851  [pdf, other]

    cs.CV eess.IV

    Positional Information is All You Need: A Novel Pipeline for Self-Supervised SVDE from Videos

    Authors: Juan Luis Gonzalez Bello, Jaeho Moon, Munchurl Kim

    Abstract: Recently, much attention has been drawn to learning the underlying 3D structures of a scene from monocular videos in a fully self-supervised fashion. One of the most challenging aspects of this task is handling the independently moving objects as they break the rigid-scene assumption. For the first time, we show that pixel positional information can be exploited to learn SVDE (Single View Depth Es…

    Submitted 18 May, 2022; originally announced May 2022.

  41. arXiv:2202.06498  [pdf, other]

    cs.CV

    Task-Adaptive Feature Transformer with Semantic Enrichment for Few-Shot Segmentation

    Authors: Jun Seo, Young-Hyun Park, Sung Whan Yoon, Jaekyun Moon

    Abstract: Few-shot learning allows machines to classify novel classes using only a few labeled samples. Recently, few-shot segmentation aiming at semantic segmentation on low sample data has also seen great interest. In this paper, we propose a learnable module that can be placed on top of existing segmentation networks for performing few-shot segmentation. This module, called the task-adaptive feature tran…

    Submitted 14 February, 2022; originally announced February 2022.

    Comments: 8 pages, 7 figures. arXiv admin note: text overlap with arXiv:2010.11437

  42. arXiv:2201.02354  [pdf, other]

    cs.LG

    GenLabel: Mixup Relabeling using Generative Models

    Authors: Jy-yong Sohn, Liang Shang, Hongxu Chen, Jaekyun Moon, Dimitris Papailiopoulos, Kangwook Lee

    Abstract: Mixup is a data augmentation method that generates new data points by mixing a pair of input data. While mixup generally improves the prediction performance, it sometimes degrades the performance. In this paper, we first identify the main causes of this phenomenon by theoretically and empirically analyzing the mixup algorithm. To resolve this, we propose GenLabel, a simple yet effective relabeling…

    Submitted 7 January, 2022; originally announced January 2022.

  43. arXiv:2109.13572  [pdf, other]

    cs.CV

    Information Elevation Network for Fast Online Action Detection

    Authors: Sunah Min, Jinyoung Moon

    Abstract: Online action detection (OAD) is a task that receives video segments within a streaming video as inputs and identifies ongoing actions within them. It is important to retain past information associated with a current action. However, long short-term memory (LSTM), a popular recurrent unit for modeling temporal information from videos, accumulates past information from the previous hidden and cell…

    Submitted 28 September, 2021; originally announced September 2021.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  44. Learning to Discriminate Information for Online Action Detection: Analysis and Application

    Authors: Sumin Lee, Hyunjun Eun, Jinyoung Moon, Seokeon Choi, Yoonhyung Kim, Chanho Jung, Changick Kim

    Abstract: Online action detection, which aims to identify an ongoing action from a streaming video, is an important subject in real-world applications. For this task, previous methods use recurrent neural networks for modeling temporal relations in an input sequence. However, these methods overlook the fact that the input image sequence includes not only the action of interest but background and irrelevant…

    Submitted 18 November, 2022; v1 submitted 7 September, 2021; originally announced September 2021.

    Comments: To appear in TPAMI. arXiv admin note: substantial text overlap with arXiv:1912.04461

  45. arXiv:2109.02925  [pdf]

    cs.CV

    Learning to Combine the Modalities of Language and Video for Temporal Moment Localization

    Authors: Jungkyoo Shin, Jinyoung Moon

    Abstract: Temporal moment localization aims to retrieve the best video segment matching a moment specified by a query. The existing methods generate the visual and semantic embeddings independently and fuse them without full consideration of the long-term temporal relationship between them. To address these shortcomings, we introduce a novel recurrent unit, cross-modal long short-term memory (CM-LSTM), by m…

    Submitted 7 September, 2021; originally announced September 2021.

  46. arXiv:2106.13937  [pdf, ps, other]

    cs.IT eess.SP

    Unified Simultaneous Wireless Information and Power Transfer for IoT: Signaling and Architecture with Deep Learning Adaptive Control

    Authors: Jong Jin Park, Jong Ho Moon, Hyeon Ho Jang, Dong In Kim

    Abstract: In this paper, we propose a unified SWIPT signal and its architecture design in order to take advantage of both single tone and multi-tone signaling by adjusting only the power allocation ratio of a unified signal. For this, we design a novel unified and integrated receiver architecture for the proposed unified SWIPT signaling, which consumes low power with an envelope detection. To relieve the co…

    Submitted 25 June, 2021; originally announced June 2021.

    Comments: 15 pages, 15 figures

  47. arXiv:2106.01016  [pdf, other]

    eess.SY cs.LG cs.RO

    Deep Reinforcement Learning-based UAV Navigation and Control: A Soft Actor-Critic with Hindsight Experience Replay Approach

    Authors: Myoung Hoon Lee, Jun Moon

    Abstract: In this paper, we propose SACHER (soft actor-critic (SAC) with hindsight experience replay (HER)), which constitutes a class of deep reinforcement learning (DRL) algorithms. SAC is known as an off-policy model-free DRL algorithm based on the maximum entropy framework, which outperforms earlier DRL algorithms in terms of exploration, robustness and learning performance. However, in SAC, maximizing…

    Submitted 5 June, 2021; v1 submitted 2 June, 2021; originally announced June 2021.

    Comments: 12 pages, 9 figures

    MSC Class: 60J20; 68T05

  48. arXiv:2105.11826  [pdf, other]

    cs.LG cs.IR cs.MM

    Reproducibility Companion Paper: Knowledge Enhanced Neural Fashion Trend Forecasting

    Authors: Yunshan Ma, Yujuan Ding, Xun Yang, Lizi Liao, Wai Keung Wong, Tat-Seng Chua, Jinyoung Moon, Hong-Han Shuai

    Abstract: This companion paper supports the replication of the fashion trend forecasting experiments with the KERN (Knowledge Enhanced Recurrent Network) method that we presented in the ICMR 2020. We provide an artifact that allows the replication of the experiments using a Python implementation. The artifact is easy to deploy with simple installation, training and evaluation. We reproduce the experiments c…

    Submitted 25 May, 2021; originally announced May 2021.

    Journal ref: ICMR 2021

  49. Multi-modal Understanding and Generation for Medical Images and Text via Vision-Language Pre-Training

    Authors: Jong Hak Moon, Hyungyung Lee, Woncheol Shin, Young-Hak Kim, Edward Choi

    Abstract: Recently a number of studies demonstrated impressive performance on diverse vision-language multi-modal tasks such as image captioning and visual question answering by extending the BERT architecture with multi-modal pre-training objectives. In this work we explore a broad set of multi-modal representation learning tasks in the medical domain, specifically using radiology images and the unstructur…

    Submitted 21 September, 2022; v1 submitted 24 May, 2021; originally announced May 2021.

    Comments: Accepted by IEEE Journal of Biomedical and Health Informatics

    Journal ref: IEEE Journal of Biomedical and Health Informatics 2022

  50. arXiv:2105.09680  [pdf, other]

    cs.CL

    KLUE: Korean Language Understanding Evaluation

    Authors: Sungjoon Park, Jihyung Moon, Sungdong Kim, Won Ik Cho, Jiyoon Han, Jangwon Park, Chisung Song, Junseong Kim, Yongsook Song, Taehwan Oh, Joohong Lee, Juhyun Oh, Sungwon Lyu, Younghoon Jeong, Inkwon Lee, Sangwoo Seo, Dongjun Lee, Hyunwoo Kim, Myeonghwa Lee, Seongbo Jang, Seungwon Do, Sunkyoung Kim, Kyungtae Lim, Jongwon Lee, Kyumin Park, et al. (6 additional authors not shown)

    Abstract: We introduce Korean Language Understanding Evaluation (KLUE) benchmark. KLUE is a collection of 8 Korean natural language understanding (NLU) tasks, including Topic Classification, Semantic Textual Similarity, Natural Language Inference, Named Entity Recognition, Relation Extraction, Dependency Parsing, Machine Reading Comprehension, and Dialogue State Tracking. We build all of the tasks from scrat…

    Submitted 2 November, 2021; v1 submitted 20 May, 2021; originally announced May 2021.

    Comments: 76 pages, 10 figures, 36 tables