Showing 1–50 of 140 results for author: Jung, H

Searching in archive cs.
  1. arXiv:2407.00693 [pdf, other]

    cs.AI cs.CL cs.LG

    BAPO: Base-Anchored Preference Optimization for Personalized Alignment in Large Language Models

    Authors: Gihun Lee, Minchan Jeong, Yujin Kim, Hojung Jung, Jaehoon Oh, Sangmook Kim, Se-Young Yun

    Abstract: While learning to align Large Language Models (LLMs) with human preferences has shown remarkable success, aligning these models to meet the diverse user preferences presents further challenges in preserving previous knowledge. This paper examines the impact of personalized preference optimization on LLMs, revealing that the extent of knowledge loss varies significantly with preference heterogeneit…

    Submitted 30 June, 2024; originally announced July 2024.

    Comments: under review

  2. arXiv:2406.16030 [pdf, other]

    cs.CL cs.AI

    Zero-Shot Cross-Lingual NER Using Phonemic Representations for Low-Resource Languages

    Authors: Jimin Sohn, Haeji Jung, Alex Cheng, Jooeon Kang, Yilin Du, David R. Mortensen

    Abstract: Existing zero-shot cross-lingual NER approaches require substantial prior knowledge of the target language, which is impractical for low-resource languages. In this paper, we propose a novel approach to NER using phonemic representation based on the International Phonetic Alphabet (IPA) to bridge the gap between representations of different languages. Our experiments show that our method significa…

    Submitted 23 June, 2024; originally announced June 2024.

    Comments: 7 pages, 5 figures, 5 tables

  3. arXiv:2406.10296 [pdf, other]

    cs.CL cs.AI cs.CY

    CLST: Cold-Start Mitigation in Knowledge Tracing by Aligning a Generative Language Model as a Students' Knowledge Tracer

    Authors: Heeseok Jung, Jaesang Yoo, Yohaan Yoon, Yeonju Jang

    Abstract: Knowledge tracing (KT), wherein students' problem-solving histories are used to estimate their current levels of knowledge, has attracted significant interest from researchers. However, most existing KT models were developed with an ID-based paradigm, which exhibits limitations in cold-start performance. These limitations can be mitigated by leveraging the vast quantities of external knowledge pos…

    Submitted 17 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

  4. arXiv:2406.07034 [pdf, other]

    cs.AI cs.CL

    Improving Multi-hop Logical Reasoning in Knowledge Graphs with Context-Aware Query Representation Learning

    Authors: Jeonghoon Kim, Heesoo Jung, Hyeju Jang, Hogun Park

    Abstract: Multi-hop logical reasoning on knowledge graphs is a pivotal task in natural language processing, with numerous approaches aiming to answer First-Order Logic (FOL) queries. Recent geometry (e.g., box, cone) and probability (e.g., beta distribution)-based methodologies have effectively addressed complex FOL queries. However, a common challenge across these methods lies in determining accurate geome…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: Accepted to ACL 2024 Findings

  5. arXiv:2406.06786 [pdf, other]

    cs.SD cs.AI eess.AS

    BTS: Bridging Text and Sound Modalities for Metadata-Aided Respiratory Sound Classification

    Authors: June-Woo Kim, Miika Toikkanen, Yera Choi, Seoung-Eun Moon, Ho-Young Jung

    Abstract: Respiratory sound classification (RSC) is challenging due to varied acoustic signatures, primarily influenced by patient demographics and recording environments. To address this issue, we introduce a text-audio multimodal model that utilizes metadata of respiratory sounds, which provides useful complementary information for RSC. Specifically, we fine-tune a pretrained text-audio multimodal model u…

    Submitted 14 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

    Comments: Accepted INTERSPEECH 2024

  6. arXiv:2405.18986 [pdf, other]

    cs.LG q-bio.BM q-bio.QM

    Robust Optimization in Protein Fitness Landscapes Using Reinforcement Learning in Latent Space

    Authors: Minji Lee, Luiz Felipe Vecchietti, Hyunkyu Jung, Hyun Joo Ro, Meeyoung Cha, Ho Min Kim

    Abstract: Proteins are complex molecules responsible for different functions in nature. Enhancing the functionality of proteins and cellular fitness can significantly impact various industries. However, protein optimization using computational methods remains challenging, especially when starting from low-fitness sequences. We propose LatProtRL, an optimization method to efficiently traverse a latent space…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: ICML 2024

  7. arXiv:2405.18602 [pdf, other]

    cs.AI

    SST-GCN: The Sequential based Spatio-Temporal Graph Convolutional networks for Minute-level and Road-level Traffic Accident Risk Prediction

    Authors: Tae-wook Kim, Han-jin Lee, Hyeon-jin Jung, Ji-Woong Yang, Ellen J. Hong

    Abstract: Traffic accidents are recognized as a major social issue worldwide, causing numerous injuries and significant costs annually. Consequently, methods for predicting and preventing traffic accidents have been researched for many years. With advancements in the field of artificial intelligence, various studies have applied Machine Learning and Deep Learning techniques to traffic accident prediction. M…

    Submitted 3 June, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

  8. arXiv:2405.15097 [pdf, other]

    cs.CL cs.AI

    Contrastive and Consistency Learning for Neural Noisy-Channel Model in Spoken Language Understanding

    Authors: Suyoung Kim, Jiyeon Hwang, Ho-Young Jung

    Abstract: Recently, deep end-to-end learning has been studied for intent classification in Spoken Language Understanding (SLU). However, end-to-end models require a large amount of speech data with intent labels, and highly optimized models are generally sensitive to the inconsistency between the training and evaluation conditions. Therefore, a natural language understanding approach based on Automatic Spee…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: Accepted NAACL 2024

  9. arXiv:2405.15092 [pdf, other]

    cs.AI cs.CL

    Dissociation of Faithful and Unfaithful Reasoning in LLMs

    Authors: Evelyn Yee, Alice Li, Chenyu Tang, Yeon Ho Jung, Ramamohan Paturi, Leon Bergen

    Abstract: Large language models (LLMs) improve their performance in downstream tasks when they generate Chain of Thought reasoning text before producing an answer. Our research investigates how LLMs recover from errors in Chain of Thought, reaching the correct final answer despite mistakes in the reasoning text. Through analysis of these error recovery behaviors, we find evidence for unfaithfulness in Chain…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: code published at https://github.com/CoTErrorRecovery/CoTErrorRecovery

  10. arXiv:2405.05378 [pdf, other]

    cs.CL cs.AI cs.CY cs.HC cs.LG

    "They are uncultured": Unveiling Covert Harms and Social Threats in LLM Generated Conversations

    Authors: Preetam Prabhu Srikar Dammu, Hayoung Jung, Anjali Singh, Monojit Choudhury, Tanushree Mitra

    Abstract: Large language models (LLMs) have emerged as an integral part of modern societies, powering user-facing applications such as personal assistants and enterprise applications like recruitment tools. Despite their utility, research indicates that LLMs perpetuate systemic biases. Yet, prior works on LLM harms predominantly focus on Western concepts like race and gender, often overlooking cultural conc…

    Submitted 8 May, 2024; originally announced May 2024.

  11. arXiv:2405.02996 [pdf, other]

    cs.SD cs.AI eess.AS

    RepAugment: Input-Agnostic Representation-Level Augmentation for Respiratory Sound Classification

    Authors: June-Woo Kim, Miika Toikkanen, Sangmin Bae, Minseok Kim, Ho-Young Jung

    Abstract: Recent advancements in AI have democratized its deployment as a healthcare assistant. While pretrained models from large-scale visual and audio datasets have demonstrably generalized to this task, surprisingly, no studies have explored pretrained speech models, which, as human-originated sounds, intuitively would share closer resemblance to lung sounds. This paper explores the efficacy of pretrain…

    Submitted 5 May, 2024; originally announced May 2024.

    Comments: Accepted EMBC 2024

  12. arXiv:2404.17598 [pdf, other]

    cs.IR cs.AI cs.LG cs.SI

    Revealing and Utilizing In-group Favoritism for Graph-based Collaborative Filtering

    Authors: Hoin Jung, Hyunsoo Cho, Myungje Choi, Joowon Lee, Jung Ho Park, Myungjoo Kang

    Abstract: When it comes to a personalized item recommendation system, it is essential to extract users' preferences and purchasing patterns. Assuming that users in the real world form a cluster and there is common favoritism in each cluster, in this work, we introduce Co-Clustering Wrapper (CCW). We compute co-clusters of users and items with co-clustering algorithms and add CF subnetworks for each cluster…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: 7 pages, 6 figures

  13. arXiv:2404.05717 [pdf, other]

    cs.CV cs.AI

    SwapAnything: Enabling Arbitrary Object Swapping in Personalized Visual Editing

    Authors: Jing Gu, Yilin Wang, Nanxuan Zhao, Wei Xiong, Qing Liu, Zhifei Zhang, He Zhang, Jianming Zhang, HyunJoon Jung, Xin Eric Wang

    Abstract: Effective editing of personal content holds a pivotal role in enabling individuals to express their creativity, weave captivating narratives within their visual stories, and elevate the overall quality and impact of their visual content. Therefore, in this work, we introduce SwapAnything, a novel framework that can swap any objects in an image with personalized concepts given by the reference, w…

    Submitted 6 May, 2024; v1 submitted 8 April, 2024; originally announced April 2024.

    Comments: 18 pages, 16 figures, 3 tables

  14. arXiv:2404.05144 [pdf, other]

    cs.CL cs.CV cs.LG

    Enhancing Clinical Efficiency through LLM: Discharge Note Generation for Cardiac Patients

    Authors: HyoJe Jung, Yunha Kim, Heejung Choi, Hyeram Seo, Minkyoung Kim, JiYe Han, Gaeun Kee, Seohyun Park, Soyoung Ko, Byeolhee Kim, Suyeon Kim, Tae Joon Jun, Young-Hak Kim

    Abstract: Medical documentation, including discharge notes, is crucial for ensuring patient care quality, continuity, and effective medical communication. However, the manual creation of these documents is not only time-consuming but also prone to inconsistencies and potential errors. The automation of this documentation process using artificial intelligence (AI) represents a promising area of innovation in…

    Submitted 7 April, 2024; originally announced April 2024.

    Comments: 10 pages, 1 figure, 3 tables, conference

  15. arXiv:2404.01954 [pdf, other]

    cs.CL cs.AI

    HyperCLOVA X Technical Report

    Authors: Kang Min Yoo, Jaegeun Han, Sookyo In, Heewon Jeon, Jisu Jeong, Jaewook Kang, Hyunwook Kim, Kyung-Min Kim, Munhyong Kim, Sungju Kim, Donghyun Kwak, Hanock Kwak, Se Jung Kwon, Bado Lee, Dongsoo Lee, Gichang Lee, Jooho Lee, Baeseong Park, Seongjin Shin, Joonsang Yu, Seolki Baek, Sumin Byeon, Eungsup Cho, Dooseok Choe, Jeesung Han, et al. (371 additional authors not shown)

    Abstract: We introduce HyperCLOVA X, a family of large language models (LLMs) tailored to the Korean language and culture, along with competitive capabilities in English, math, and coding. HyperCLOVA X was trained on a balanced mix of Korean, English, and code data, followed by instruction-tuning with high-quality human-annotated datasets while abiding by strict safety guidelines reflecting our commitment t…

    Submitted 13 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

    Comments: 44 pages; updated authors list and fixed author names

  16. arXiv:2404.00285 [pdf, other]

    cs.CV cs.AI

    Long-Tailed Recognition on Binary Networks by Calibrating A Pre-trained Model

    Authors: Jihun Kim, Dahyun Kim, Hyungrok Jung, Taeil Oh, Jonghyun Choi

    Abstract: Deploying deep models in real-world scenarios entails a number of challenges, including computational efficiency and real-world (e.g., long-tailed) data distributions. We address the combined challenge of learning long-tailed distributions using highly resource-efficient binary neural networks as backbones. Specifically, we propose a calibrate-and-distill framework that uses off-the-shelf pretrain…

    Submitted 30 March, 2024; originally announced April 2024.

  17. arXiv:2403.09632 [pdf, other]

    cs.CV

    Holo-Relighting: Controllable Volumetric Portrait Relighting from a Single Image

    Authors: Yiqun Mei, Yu Zeng, He Zhang, Zhixin Shu, Xuaner Zhang, Sai Bi, Jianming Zhang, HyunJoon Jung, Vishal M. Patel

    Abstract: At the core of portrait photography is the search for ideal lighting and viewpoint. The process often requires advanced knowledge in photography and an elaborate studio setup. In this work, we propose Holo-Relighting, a volumetric relighting method that is capable of synthesizing novel viewpoints, and novel lighting from a single image. Holo-Relighting leverages the pretrained 3D GAN (EG3D) to rec…

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: CVPR2024

  18. arXiv:2403.08244 [pdf]

    cs.CE

    Evaluating the Efficiency and Cost-effectiveness of RPB-based CO2 Capture: A Comprehensive Approach to Simultaneous Design and Operating Condition Optimization

    Authors: Howoun Jung, Nohjin Park, Jay H. Lee

    Abstract: Despite ongoing global initiatives to reduce CO2 emissions, implementing large-scale CO2 capture using amine solvents is fraught with economic uncertainties and technical hurdles. The Rotating Packed Bed (RPB) presents a promising alternative to traditional packed towers, offering compact design and adaptability. Nonetheless, scaling RPB processes to an industrial level is challenging due to the n…

    Submitted 13 March, 2024; originally announced March 2024.

    Comments: 44 pages, 11 figures, 6 tables

  19. arXiv:2402.14279 [pdf, other]

    cs.CL cs.AI

    Mitigating the Linguistic Gap with Phonemic Representations for Robust Multilingual Language Understanding

    Authors: Haeji Jung, Changdae Oh, Jooeon Kang, Jimin Sohn, Kyungwoo Song, Jinkyu Kim, David R. Mortensen

    Abstract: Approaches to improving multilingual language understanding often require multiple languages during the training phase, rely on complicated training techniques, and -- importantly -- struggle with significant performance gaps between high-resource and low-resource languages. We hypothesize that the performance gaps between languages are affected by linguistic gaps between those languages and provi…

    Submitted 21 February, 2024; originally announced February 2024.

  20. arXiv:2402.11883 [pdf, other]

    cs.CV

    InMD-X: Large Language Models for Internal Medicine Doctors

    Authors: Hansle Gwon, Imjin Ahn, Hyoje Jung, Byeolhee Kim, Young-Hak Kim, Tae Joon Jun

    Abstract: In this paper, we introduce InMD-X, a collection of multiple large language models specifically designed to cater to the unique characteristics and demands of Internal Medicine Doctors (IMD). InMD-X represents a groundbreaking development in natural language processing, offering a suite of language models fine-tuned for various aspects of the internal medicine field. These models encompass a wide…

    Submitted 19 February, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

  21. arXiv:2402.09784 [pdf, other]

    cs.IR cs.AI

    Sequential Recommendation on Temporal Proximities with Contrastive Learning and Self-Attention

    Authors: Hansol Jung, Hyunwoo Seo, Chiehyeon Lim

    Abstract: Sequential recommender systems identify user preferences from their past interactions to predict subsequent items optimally. Although traditional deep-learning-based models and modern transformer-based models in previous studies capture unidirectional and bidirectional patterns within user-item interactions, the importance of temporal contexts, such as individual behavioral and societal trend patt…

    Submitted 17 February, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

    Comments: 10 pages, 9 figures

  22. arXiv:2402.06790 [pdf, other]

    cs.RO cs.HC

    Towards Robotic Companions: Understanding Handler-Guide Dog Interactions for Informed Guide Dog Robot Design

    Authors: Hochul Hwang, Hee-Tae Jung, Nicholas A Giudice, Joydeep Biswas, Sunghoon Ivan Lee, Donghyun Kim

    Abstract: Dog guides are favored by blind and low-vision (BLV) individuals for their ability to enhance independence and confidence by reducing safety concerns and increasing navigation efficiency compared to traditional mobility aids. However, only a relatively small proportion of BLV individuals work with dog guides due to their limited availability and associated maintenance responsibilities. There is co…

    Submitted 9 February, 2024; originally announced February 2024.

  23. arXiv:2402.00863 [pdf, other]

    cs.CV

    Geometry Transfer for Stylizing Radiance Fields

    Authors: Hyunyoung Jung, Seonghyeon Nam, Nikolaos Sarafianos, Sungjoo Yoo, Alexander Sorkine-Hornung, Rakesh Ranjan

    Abstract: Shape and geometric patterns are essential in defining stylistic identity. However, current 3D style transfer methods predominantly focus on transferring colors and textures, often overlooking geometric aspects. In this paper, we introduce Geometry Transfer, a novel method that leverages geometric deformation for 3D style transfer. This technique employs depth maps to extract a style guide, subseq…

    Submitted 6 April, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

    Comments: CVPR 2024. Project page: https://hyblue.github.io/geo-srf/

  24. arXiv:2401.08897 [pdf, other]

    cs.LG cs.AI

    CFASL: Composite Factor-Aligned Symmetry Learning for Disentanglement in Variational AutoEncoder

    Authors: Hee-Jun Jung, Jaehyoung Jeong, Kangil Kim

    Abstract: Symmetries of input and latent vectors have provided valuable insights for disentanglement learning in VAEs. However, only a few works were proposed as an unsupervised method, and even these works require known factor information in training data. We propose a novel method, Composite Factor-Aligned Symmetry Learning (CFASL), which is integrated into VAEs for learning symmetry-based disentanglement…

    Submitted 18 January, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

    Comments: 21 pages, 14 figures

  25. arXiv:2312.15059 [pdf, other]

    cs.CV cs.AI

    Deformable 3D Gaussian Splatting for Animatable Human Avatars

    Authors: HyunJun Jung, Nikolas Brasch, Jifei Song, Eduardo Perez-Pellitero, Yiren Zhou, Zhihao Li, Nassir Navab, Benjamin Busam

    Abstract: Recent advances in neural radiance fields enable novel view synthesis of photo-realistic images in dynamic settings, which can be applied to scenarios with human animation. Commonly used implicit backbones to establish accurate models, however, require many input views and additional annotations such as human masks, UV maps and depth maps. In this work, we propose ParDy-Human (Parameterized Dynami…

    Submitted 22 December, 2023; originally announced December 2023.

  26. arXiv:2312.09603 [pdf, other]

    cs.SD cs.LG eess.AS

    Stethoscope-guided Supervised Contrastive Learning for Cross-domain Adaptation on Respiratory Sound Classification

    Authors: June-Woo Kim, Sangmin Bae, Won-Yang Cho, Byungjo Lee, Ho-Young Jung

    Abstract: Despite the remarkable advances in deep learning technology, achieving satisfactory performance in lung sound classification remains a challenge due to the scarcity of available data. Moreover, the respiratory sound samples are collected from a variety of electronic stethoscopes, which could potentially introduce biases into the trained models. When a significant distribution shift occurs within t…

    Submitted 15 December, 2023; originally announced December 2023.

    Comments: accepted to ICASSP 2024

  27. arXiv:2312.07266 [pdf, other]

    cs.CV

    ProxyDet: Synthesizing Proxy Novel Classes via Classwise Mixup for Open-Vocabulary Object Detection

    Authors: Joonhyun Jeong, Geondo Park, Jayeon Yoo, Hyungsik Jung, Heesu Kim

    Abstract: Open-vocabulary object detection (OVOD) aims to recognize novel objects whose categories are not included in the training set. In order to classify these unseen classes during training, many OVOD frameworks leverage the zero-shot capability of largely pretrained vision and language models, such as CLIP. To further improve generalization on the unseen novel classes, several approaches proposed to a…

    Submitted 20 February, 2024; v1 submitted 12 December, 2023; originally announced December 2023.

    Comments: Accepted in AAAI24. Code: https://github.com/clovaai/ProxyDet Project page: https://proxydet.github.io

  28. arXiv:2312.06886 [pdf, other]

    cs.CV

    Relightful Harmonization: Lighting-aware Portrait Background Replacement

    Authors: Mengwei Ren, Wei Xiong, Jae Shin Yoon, Zhixin Shu, Jianming Zhang, HyunJoon Jung, Guido Gerig, He Zhang

    Abstract: Portrait harmonization aims to composite a subject into a new background, adjusting its lighting and color to ensure harmony with the background scene. Existing harmonization techniques often only focus on adjusting the global color and brightness of the foreground and ignore crucial illumination cues from the background such as apparent lighting direction, leading to unrealistic compositions. We…

    Submitted 7 April, 2024; v1 submitted 11 December, 2023; originally announced December 2023.

    Comments: CVPR 2024 camera ready

  29. arXiv:2311.06480 [pdf, other]

    cs.SD cs.LG eess.AS

    Adversarial Fine-tuning using Generated Respiratory Sound to Address Class Imbalance

    Authors: June-Woo Kim, Chihyeon Yoon, Miika Toikkanen, Sangmin Bae, Ho-Young Jung

    Abstract: Deep generative models have emerged as a promising approach in the medical image domain to address data scarcity. However, their use for sequential data like respiratory sounds is less explored. In this work, we propose a straightforward approach to augment imbalanced respiratory sound data using an audio diffusion model as a conditional neural vocoder. We also demonstrate a simple yet effective a…

    Submitted 11 November, 2023; originally announced November 2023.

    Comments: accepted in NeurIPS 2023 Workshop on Deep Generative Models for Health (DGM4H)

  30. arXiv:2311.05889 [pdf, other]

    eess.IV cs.CV cs.LG

    Semantic Map Guided Synthesis of Wireless Capsule Endoscopy Images using Diffusion Models

    Authors: Haejin Lee, Jeongwoo Ju, Jonghyuck Lee, Yeoun Joo Lee, Heechul Jung

    Abstract: Wireless capsule endoscopy (WCE) is a non-invasive method for visualizing the gastrointestinal (GI) tract, crucial for diagnosing GI tract diseases. However, interpreting WCE results can be time-consuming and tiring. Existing studies have employed deep neural networks (DNNs) for automatic GI tract lesion detection, but acquiring sufficient training examples, particularly due to privacy concerns, r…

    Submitted 10 November, 2023; originally announced November 2023.

  31. arXiv:2311.02304 [pdf, other]

    cs.RO

    Imitating and Finetuning Model Predictive Control for Robust and Symmetric Quadrupedal Locomotion

    Authors: Donghoon Youm, Hyunyoung Jung, Hyeongjun Kim, Jemin Hwangbo, Hae-Won Park, Sehoon Ha

    Abstract: Control of legged robots is a challenging problem that has been investigated by different approaches, such as model-based control and learning algorithms. This work proposes a novel Imitating and Finetuning Model Predictive Control (IFM) framework to take the strengths of both approaches. Our framework first develops a conventional model predictive controller (MPC) using Differential Dynamic Progr…

    Submitted 3 November, 2023; originally announced November 2023.

  32. arXiv:2309.17046 [pdf, other]

    cs.RO

    CrossLoco: Human Motion Driven Control of Legged Robots via Guided Unsupervised Reinforcement Learning

    Authors: Tianyu Li, Hyunyoung Jung, Matthew Gombolay, Yong Kwon Cho, Sehoon Ha

    Abstract: Human motion driven control (HMDC) is an effective approach for generating natural and compelling robot motions while preserving high-level semantics. However, establishing the correspondence between humans and robots with different body structures is not straightforward due to the mismatches in kinematics and dynamics properties, which causes intrinsic ambiguity to the problem. Many previous algo…

    Submitted 29 September, 2023; originally announced September 2023.

  33. arXiv:2308.10627 [pdf, other]

    cs.CV

    Polarimetric Information for Multi-Modal 6D Pose Estimation of Photometrically Challenging Objects with Limited Data

    Authors: Patrick Ruhkamp, Daoyi Gao, HyunJun Jung, Nassir Navab, Benjamin Busam

    Abstract: 6D pose estimation pipelines that rely on RGB-only or RGB-D data show limitations for photometrically challenging objects with e.g. textureless surfaces, reflections or transparency. A supervised learning-based method utilising complementary polarisation information as input modality is proposed to overcome such limitations. This supervised approach is then extended to a self-supervised paradigm b…

    Submitted 21 August, 2023; originally announced August 2023.

    Comments: Accepted at ICCV 2023 TRICKY Workshop

  34. arXiv:2308.10621 [pdf, other]

    cs.CV

    Multi-Modal Dataset Acquisition for Photometrically Challenging Object

    Authors: HyunJun Jung, Patrick Ruhkamp, Nassir Navab, Benjamin Busam

    Abstract: This paper addresses the limitations of current datasets for 3D vision tasks in terms of accuracy, size, realism, and suitable imaging modalities for photometrically challenging objects. We propose a novel annotation and acquisition pipeline that enhances existing 3D perception and 6D object pose datasets. Our approach integrates robotic forward-kinematics, external infrared trackers, and improved…

    Submitted 21 August, 2023; originally announced August 2023.

    Comments: Accepted at ICCV 2023 TRICKY Workshop

  35. Robust Monocular Depth Estimation under Challenging Conditions

    Authors: Stefano Gasperini, Nils Morbitzer, HyunJun Jung, Nassir Navab, Federico Tombari

    Abstract: While state-of-the-art monocular depth estimation approaches achieve impressive results in ideal settings, they are highly unreliable under challenging illumination and weather conditions, such as at nighttime or in the presence of rain. In this paper, we uncover these safety-critical issues and tackle them with md4all: a simple and effective solution that works reliably under both adverse and ide…

    Submitted 18 August, 2023; originally announced August 2023.

    Comments: ICCV 2023. Source code and data: https://md4all.github.io

  36. Discrete Prompt Compression with Reinforcement Learning

    Authors: Hoyoun Jung, Kyung-Joong Kim

    Abstract: Compressed prompts aid instruction-tuned language models (LMs) in overcoming context window limitations and reducing computational costs. Existing methods, which are primarily based on training embeddings, face various challenges associated with interpretability, the fixed number of embedding tokens, reusability across different LMs, and inapplicability when interacting with black-box APIs. This study…

    Submitted 2 June, 2024; v1 submitted 16 August, 2023; originally announced August 2023.

  37. arXiv:2307.04550 [pdf, other]

    cs.LG cs.AI

    Gradient Surgery for One-shot Unlearning on Generative Model

    Authors: Seohui Bae, Seoyoon Kim, Hyemin Jung, Woohyung Lim

    Abstract: Recent regulation on the right to be forgotten has sparked considerable interest in unlearning pre-trained machine learning models. While approximating a straightforward yet expensive approach of retrain-from-scratch, recent machine unlearning methods unlearn a sample by updating weights to remove its influence on the weight parameters. In this paper, we introduce a simple yet effective approach to remove a da…

    Submitted 18 July, 2023; v1 submitted 10 July, 2023; originally announced July 2023.

    Comments: ICML 2023 Workshop on Generative AI & Law

  38. arXiv:2307.01676 [pdf, other]

    cs.AI

    RaidEnv: Exploring New Challenges in Automated Content Balancing for Boss Raid Games

    Authors: Hyeon-Chang Jeon, In-Chang Baek, Cheong-mok Bae, Taehwa Park, Wonsang You, Taegwan Ha, Hoyun Jung, Jinha Noh, Seungwon Oh, Kyung-Joong Kim

    Abstract: The balance of game content significantly impacts the gaming experience. Unbalanced game content diminishes engagement or increases frustration because of repetitive failure. Although game designers intend to adjust the difficulty of game content, this is a repetitive, labor-intensive, and challenging process, especially for commercial-level games with extensive content. To address this issue, the…

    Submitted 4 July, 2023; originally announced July 2023.

    Comments: 14 pages, 6 figures, 6 tables, 2 algorithms

  39. arXiv:2305.18286 [pdf, other]

    cs.CV cs.AI

    Photoswap: Personalized Subject Swapping in Images

    Authors: Jing Gu, Yilin Wang, Nanxuan Zhao, Tsu-Jui Fu, Wei Xiong, Qing Liu, Zhifei Zhang, He Zhang, Jianming Zhang, HyunJoon Jung, Xin Eric Wang

    Abstract: In an era where images and visual content dominate our digital landscape, the ability to manipulate and personalize these images has become a necessity. Envision seamlessly substituting a tabby cat lounging on a sunlit window sill in a photograph with your own playful puppy, all while preserving the original charm and composition of the image. We present Photoswap, a novel approach that enables th…

    Submitted 29 May, 2023; originally announced May 2023.

    Comments: 14 pages

  40. arXiv:2305.15080 [pdf, other]

    cs.CL cs.AI

    Visually-Situated Natural Language Understanding with Contrastive Reading Model and Frozen Large Language Models

    Authors: Geewook Kim, Hodong Lee, Daehee Kim, Haeji Jung, Sanghee Park, Yoonsik Kim, Sangdoo Yun, Taeho Kil, Bado Lee, Seunghyun Park

    Abstract: Recent advances in Large Language Models (LLMs) have stimulated a surge of research aimed at extending their applications to the visual domain. While these models exhibit promise in generating abstract image captions and facilitating natural conversations, their performance on text-rich images still requires improvement. In this paper, we introduce Contrastive Reading Model (Cream), a novel neural…

    Submitted 26 October, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: 22 pages; To appear at EMNLP 2023 Main Conference (Project page: https://naver-ai.github.io/cream )

  41. arXiv:2305.01505 [pdf, other]

    cs.CL cs.AI cs.CY

    Beyond Classification: Financial Reasoning in State-of-the-Art Language Models

    Authors: Guijin Son, Hanearl Jung, Moonjeong Hahm, Keonju Na, Sol Jin

    Abstract: Large Language Models (LLMs), consisting of 100 billion or more parameters, have demonstrated remarkable ability in complex multi-step reasoning tasks. However, the application of such generic advancements has been limited to a few fields, such as clinical or legal, with the field of financial reasoning remaining largely unexplored. To the best of our knowledge, the ability of LLMs to solve financ…

    Submitted 25 June, 2023; v1 submitted 30 April, 2023; originally announced May 2023.

    Comments: Accepted by FinNLP (Financial Technology and Natural Language Processing) @ IJCAI2023 as long paper

  42. arXiv:2304.03411  [pdf, other]

    cs.CV

    InstantBooth: Personalized Text-to-Image Generation without Test-Time Finetuning

    Authors: Jing Shi, Wei Xiong, Zhe Lin, Hyun Joon Jung

    Abstract: Recent advances in personalized image generation allow a pre-trained text-to-image model to learn a new concept from a set of images. However, existing personalization approaches usually require heavy test-time finetuning for each concept, which is time-consuming and difficult to scale. We propose InstantBooth, a novel approach built upon pre-trained text-to-image models that enables instant text-…

    Submitted 6 April, 2023; originally announced April 2023.

    Comments: 13 pages

  43. arXiv:2303.16493  [pdf, other]

    cs.CV

    AnyFlow: Arbitrary Scale Optical Flow with Implicit Neural Representation

    Authors: Hyunyoung Jung, Zhuo Hui, Lei Luo, Haitao Yang, Feng Liu, Sungjoo Yoo, Rakesh Ranjan, Denis Demandolx

    Abstract: To apply optical flow in practice, it is often necessary to resize the input to smaller dimensions in order to reduce computational costs. However, downsizing inputs makes the estimation more challenging because objects and motion ranges become smaller. Even though recent approaches have demonstrated high-quality flow estimation, they tend to fail to accurately model small objects and precise boun…

    Submitted 29 March, 2023; originally announced March 2023.

    Comments: CVPR 2023 (Highlight)

  44. arXiv:2303.14840  [pdf, other]

    cs.CV

    On the Importance of Accurate Geometry Data for Dense 3D Vision Tasks

    Authors: HyunJun Jung, Patrick Ruhkamp, Guangyao Zhai, Nikolas Brasch, Yitong Li, Yannick Verdie, Jifei Song, Yiren Zhou, Anil Armagan, Slobodan Ilic, Ales Leonardis, Nassir Navab, Benjamin Busam

    Abstract: Learning-based methods to solve dense 3D vision problems typically train on 3D sensor data. The respectively used principle of measuring distances provides advantages and drawbacks. These are typically not compared nor discussed in the literature due to a lack of multi-modal datasets. Texture-less regions are problematic for structure from motion and stereo, reflective material poses issues for ac…

    Submitted 26 March, 2023; originally announced March 2023.

    Comments: Accepted at CVPR 2023, Main Paper + Supp. Mat. arXiv admin note: substantial text overlap with arXiv:2205.04565

  45. arXiv:2303.12950  [pdf, other]

    cs.CV cs.GR

    LightPainter: Interactive Portrait Relighting with Freehand Scribble

    Authors: Yiqun Mei, He Zhang, Xuaner Zhang, Jianming Zhang, Zhixin Shu, Yilin Wang, Zijun Wei, Shi Yan, HyunJoon Jung, Vishal M. Patel

    Abstract: Recent portrait relighting methods have achieved realistic results of portrait lighting effects given a desired lighting representation such as an environment map. However, these methods are not intuitive for user interaction and lack precise lighting control. We introduce LightPainter, a scribble-based relighting system that allows users to interactively manipulate portrait lighting effect with e…

    Submitted 22 March, 2023; originally announced March 2023.

    Comments: CVPR2023

  46. arXiv:2303.06274  [pdf]

    cs.CV cs.LG

    CoNIC Challenge: Pushing the Frontiers of Nuclear Detection, Segmentation, Classification and Counting

    Authors: Simon Graham, Quoc Dang Vu, Mostafa Jahanifar, Martin Weigert, Uwe Schmidt, Wenhua Zhang, Jun Zhang, Sen Yang, Jinxi Xiang, Xiyue Wang, Josef Lorenz Rumberger, Elias Baumann, Peter Hirsch, Lihao Liu, Chenyang Hong, Angelica I. Aviles-Rivero, Ayushi Jain, Heeyoung Ahn, Yiyu Hong, Hussam Azzuni, Min Xu, Mohammad Yaqub, Marie-Claire Blache, Benoît Piégu, Bertrand Vernay, et al. (64 additional authors not shown)

    Abstract: Nuclear detection, segmentation and morphometric profiling are essential in helping us further understand the relationship between histology and patient outcome. To drive innovation in this area, we setup a community-wide challenge using the largest available dataset of its kind to assess nuclear segmentation and cellular composition. Our challenge, named CoNIC, stimulated the development of repro…

    Submitted 14 March, 2023; v1 submitted 10 March, 2023; originally announced March 2023.

  47. Dual Policy Learning for Aggregation Optimization in Graph Neural Network-based Recommender Systems

    Authors: Heesoo Jung, Sangpil Kim, Hogun Park

    Abstract: Graph Neural Networks (GNNs) provide powerful representations for recommendation tasks. GNN-based recommendation systems capture the complex high-order connectivity between users and items by aggregating information from distant neighbors and can improve the performance of recommender systems. Recently, Knowledge Graphs (KGs) have also been incorporated into the user-item interaction graph to prov…

    Submitted 21 February, 2023; originally announced February 2023.

    Comments: Accepted by the Web Conference 2023

  48. arXiv:2301.11063  [pdf, other]

    cs.CV

    Rewarded meta-pruning: Meta Learning with Rewards for Channel Pruning

    Authors: Athul Shibu, Abhishek Kumar, Heechul Jung, Dong-Gyu Lee

    Abstract: Convolutional Neural Networks (CNNs) have a large number of parameters and take significantly large hardware resources to compute, so edge devices struggle to run high-level networks. This paper proposes a novel method to reduce the parameters and FLOPs for computational efficiency in deep learning models. We introduce accuracy and efficiency coefficients to control the trade-off between the accur…

    Submitted 26 January, 2023; originally announced January 2023.

  49. arXiv:2301.05843  [pdf, other]

    cs.HC cs.AI cs.CL

    Leveraging Large Language Models to Power Chatbots for Collecting User Self-Reported Data

    Authors: Jing Wei, Sungdong Kim, Hyunhoon Jung, Young-Ho Kim

    Abstract: Large language models (LLMs) provide a new way to build chatbots by accepting natural language prompts. Yet, it is unclear how to design prompts to power chatbots to carry on naturalistic conversations while pursuing a given goal, such as collecting self-report data from users. We explore what design factors of prompts can help steer chatbots to talk naturally and collect data reliably. To this ai…

    Submitted 22 September, 2023; v1 submitted 14 January, 2023; originally announced January 2023.

    Comments: 22 pages including Appendix, 7 figures, 7 tables. Accepted to PACM HCI (CSCW 2024)

    Report number: 87 ACM Class: H.5.2; I.2.7

    Journal ref: Proceedings of the ACM on Human-Computer Interaction, 2024, Volume 8, Issue CSCW1, Article No. 87

  50. arXiv:2301.05331  [pdf, other]

    math.ST cs.LG math.PR stat.ML

    Detection problems in the spiked matrix models

    Authors: Ji Hyung Jung, Hye Won Chung, Ji Oon Lee

    Abstract: We study the statistical decision process of detecting the low-rank signal from various signal-plus-noise type data matrices, known as the spiked random matrix models. We first show that the principal component analysis can be improved by entrywise pre-transforming the data matrix if the noise is non-Gaussian, generalizing the known results for the spiked random matrix models with rank-1 signals.…

    Submitted 16 January, 2023; v1 submitted 12 January, 2023; originally announced January 2023.

    Comments: 80 pages, 6 figures. arXiv admin note: text overlap with arXiv:2104.13517

    MSC Class: 62H25; 62H15; 60B20