Showing 1–50 of 995 results for author: Choi, J

Searching in archive cs.

  1. arXiv:2406.19634  [pdf, other]

    cs.RO

    CLOi-Mapper: Consistent, Lightweight, Robust, and Incremental Mapper With Embedded Systems for Commercial Robot Services

    Authors: DongKi Noh, Hyungtae Lim, Gyuho Eoh, Duckyu Choi, Jeongsik Choi, Hyunjun Lim, SeungMin Baek, Hyun Myung

    Abstract: In commercial autonomous service robots with several form factors, simultaneous localization and mapping (SLAM) is an essential technology for providing proper services such as cleaning and guidance. Such robots require SLAM algorithms suitable for specific applications and environments. Hence, several SLAM frameworks have been proposed to address various requirements in the past decade. However,…

    Submitted 27 June, 2024; originally announced June 2024.

    Journal ref: IEEE Robotics and Automation Letters, 2024

  2. arXiv:2406.17256  [pdf, other]

    cs.CV

    Disentangled Motion Modeling for Video Frame Interpolation

    Authors: Jaihyun Lew, Jooyoung Choi, Chaehun Shin, Dahuin Jung, Sungroh Yoon

    Abstract: Video frame interpolation (VFI) aims to synthesize intermediate frames in between existing frames to enhance visual smoothness and quality. Beyond the conventional methods based on the reconstruction loss, recent works employ the high quality generative models for perceptual quality. However, they require complex training and large computational cost for modeling on the pixel space. In this paper,…

    Submitted 24 June, 2024; originally announced June 2024.

  3. arXiv:2406.15996  [pdf, other]

    cs.CL cs.AI

    Memorizing Documents with Guidance in Large Language Models

    Authors: Bumjin Park, Jaesik Choi

    Abstract: Training data plays a pivotal role in AI models. Large language models (LLMs) are trained with massive amounts of documents, and their parameters hold document-related contents. Recently, several studies identified content-specific locations in LLMs by examining the parameters. Instead of the post hoc interpretation, we propose another approach. We propose document-wise memory architecture to trac…

    Submitted 22 June, 2024; originally announced June 2024.

    Comments: IJCAI 2024

  4. arXiv:2406.15635  [pdf, other]

    cs.LG cs.CR cs.CV

    DataFreeShield: Defending Adversarial Attacks without Training Data

    Authors: Hyeyoon Lee, Kanghyun Choi, Dain Kwon, Sunjong Park, Mayoore Selvarasa Jaiswal, Noseong Park, Jonghyun Choi, Jinho Lee

    Abstract: Recent advances in adversarial robustness rely on an abundant set of training data, where using external or additional datasets has become a common setting. However, in real life, the training data is often kept private for security and privacy issues, while only the pretrained weight is available to the public. In such scenarios, existing methods that assume accessibility to the original data bec…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: ICML 2024

  5. arXiv:2406.12909  [pdf, other]

    cs.LG physics.comp-ph

    Scalable Training of Graph Foundation Models for Atomistic Materials Modeling: A Case Study with HydraGNN

    Authors: Massimiliano Lupo Pasini, Jong Youl Choi, Kshitij Mehta, Pei Zhang, David Rogers, Jonghyun Bae, Khaled Z. Ibrahim, Ashwin M. Aji, Karl W. Schulz, Jorda Polo, Prasanna Balaprakash

    Abstract: We present our work on developing and training scalable graph foundation models (GFM) using HydraGNN, a multi-headed graph convolutional neural network architecture. HydraGNN expands the boundaries of graph neural network (GNN) in both training scale and data diversity. It abstracts over message passing algorithms, allowing both reproduction of and comparison across algorithmic innovations that de…

    Submitted 28 June, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

    Comments: 16 pages, 13 figures

    MSC Class: 68T07; 68T09 ACM Class: C.2.4; I.2.11

  6. arXiv:2406.12402  [pdf, other]

    cs.CL

    Flee the Flaw: Annotating the Underlying Logic of Fallacious Arguments Through Templates and Slot-filling

    Authors: Irfan Robbani, Paul Reisert, Naoya Inoue, Surawat Pothong, Camélia Guerraoui, Wenzhi Wang, Shoichi Naito, Jungmin Choi, Kentaro Inui

    Abstract: Prior research in computational argumentation has mainly focused on scoring the quality of arguments, with less attention on explicating logical errors. In this work, we introduce four sets of explainable templates for common informal logical fallacies designed to explicate a fallacy's implicit logic. Using our templates, we conduct an annotation study on top of 400 fallacious arguments taken from…

    Submitted 18 June, 2024; originally announced June 2024.

  7. arXiv:2406.12233  [pdf, other]

    cs.AI cs.CL cs.CV

    SyncVSR: Data-Efficient Visual Speech Recognition with End-to-End Crossmodal Audio Token Synchronization

    Authors: Young Jin Ahn, Jungwoo Park, Sangha Park, Jonghyun Choi, Kee-Eung Kim

    Abstract: Visual Speech Recognition (VSR) stands at the intersection of computer vision and speech recognition, aiming to interpret spoken content from visual cues. A prominent challenge in VSR is the presence of homophenes, visually similar lip gestures that represent different phonemes. Prior approaches have sought to distinguish fine-grained visemes by aligning visual and auditory semantics, but often fel…

    Submitted 17 June, 2024; originally announced June 2024.

  8. arXiv:2406.11384  [pdf, other]

    cs.CV

    Understanding Multi-Granularity for Open-Vocabulary Part Segmentation

    Authors: Jiho Choi, Seonho Lee, Seungho Lee, Minhyun Lee, Hyunjung Shim

    Abstract: Open-vocabulary part segmentation (OVPS) is an emerging research area focused on segmenting fine-grained entities based on diverse and previously unseen vocabularies. Our study highlights the inherent complexities of part segmentation due to intricate boundaries and diverse granularity, reflecting the knowledge-based nature of part identification. To address these challenges, we propose PartCLIPSe…

    Submitted 17 June, 2024; originally announced June 2024.

  9. arXiv:2406.11313  [pdf, other]

    cs.CV

    Semi-Supervised Domain Adaptation Using Target-Oriented Domain Augmentation for 3D Object Detection

    Authors: Yecheol Kim, Junho Lee, Changsoo Park, Hyoung won Kim, Inho Lim, Christopher Chang, Jun Won Choi

    Abstract: 3D object detection is crucial for applications like autonomous driving and robotics. However, in real-world environments, variations in sensor data distribution due to sensor upgrades, weather changes, and geographic differences can adversely affect detection performance. Semi-Supervised Domain Adaptation (SSDA) aims to mitigate these challenges by transferring knowledge from a source domain, abu…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: Accepted to IEEE Transactions on Intelligent Vehicles (T-IV). The code is available at: https://github.com/rasd3/TODA

  10. arXiv:2406.11280  [pdf, other]

    cs.CV

    i-SRT: Aligning Large Multimodal Models for Videos by Iterative Self-Retrospective Judgment

    Authors: Daechul Ahn, Yura Choi, San Kim, Youngjae Yu, Dongyeop Kang, Jonghyun Choi

    Abstract: Aligning Video Large Multimodal Models (VLMMs) face challenges such as modality misalignment and verbose responses. Although iterative approaches such as self-rewarding or iterative direct preference optimization (DPO) recently showed a significant improvement in language model alignment, particularly on reasoning tasks, self-aligned models applied to large video-language models often result in le…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: Technical report

  11. arXiv:2406.11244  [pdf, other]

    cs.LG cs.AI

    SpoT-Mamba: Learning Long-Range Dependency on Spatio-Temporal Graphs with Selective State Spaces

    Authors: Jinhyeok Choi, Heehyeon Kim, Minhyeong An, Joyce Jiyoung Whang

    Abstract: Spatio-temporal graph (STG) forecasting is a critical task with extensive applications in the real world, including traffic and weather forecasting. Although several recent methods have been proposed to model complex dynamics in STGs, addressing long-range spatio-temporal dependencies remains a significant challenge, leading to limited performance gains. Inspired by a recently proposed state space…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 6 pages, 2 figures, 3 tables. Spatio-Temporal Reasoning and Learning (STRL) Workshop at the 33rd International Joint Conference on Artificial Intelligence (IJCAI 2024)

  12. arXiv:2406.09138  [pdf, other]

    cs.CL

    Leveraging Explicit Reasoning for Inference Integration in Commonsense-Augmented Dialogue Models

    Authors: Sarah E. Finch, Jinho D. Choi

    Abstract: Open-domain dialogue systems need to grasp social commonsense to understand and respond effectively to human users. Commonsense-augmented dialogue models have been proposed that aim to infer commonsense knowledge from dialogue contexts in order to improve response quality. However, existing approaches to commonsense-augmented dialogue rely on implicit reasoning to integrate commonsense inferences…

    Submitted 13 June, 2024; originally announced June 2024.

  13. arXiv:2406.08796  [pdf, other]

    cs.CL

    Deep Exploration of Cross-Lingual Zero-Shot Generalization in Instruction Tuning

    Authors: Janghoon Han, Changho Lee, Joongbo Shin, Stanley Jungkyu Choi, Honglak Lee, Kynghoon Bae

    Abstract: Instruction tuning has emerged as a powerful technique, significantly boosting zero-shot performance on unseen tasks. While recent work has explored cross-lingual generalization by applying instruction tuning to multilingual models, previous studies have primarily focused on English, with a limited exploration of non-English tasks. For an in-depth exploration of cross-lingual generalization in ins…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: Findings of ACL 2024 (Camera-ready), by Janghoon Han and Changho Lee, with equal contribution

  14. arXiv:2406.07922  [pdf]

    cs.CL

    Automated Information Extraction from Thyroid Operation Narrative: A Comparative Study of GPT-4 and Fine-tuned KoELECTRA

    Authors: Dongsuk Jang, Hyeryun Park, Jiye Son, Hyeonuk Hwang, Sujin Kim, Jinwook Choi

    Abstract: In the rapidly evolving field of healthcare, the integration of artificial intelligence (AI) has become a pivotal component in the automation of clinical workflows, ushering in a new era of efficiency and accuracy. This study focuses on the transformative capabilities of the fine-tuned KoELECTRA model in comparison to the GPT-4 model, aiming to facilitate automated information extraction from thyr…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: 9 pages, 2 figures, 3 tables

    Journal ref: AMIA Joint Summits on Translational Science Proceedings, 2024, pp. 249-257

  15. arXiv:2406.05965  [pdf, other]

    eess.AS cs.AI

    MakeSinger: A Semi-Supervised Training Method for Data-Efficient Singing Voice Synthesis via Classifier-free Diffusion Guidance

    Authors: Semin Kim, Myeonghun Jeong, Hyeonseung Lee, Minchan Kim, Byoung Jin Choi, Nam Soo Kim

    Abstract: In this paper, we propose MakeSinger, a semi-supervised training method for singing voice synthesis (SVS) via classifier-free diffusion guidance. The challenge in SVS lies in the costly process of gathering aligned sets of text, pitch, and audio data. MakeSinger enables the training of the diffusion-based SVS model from any speech and singing voice data regardless of its labeling, thereby enhancin…

    Submitted 9 June, 2024; originally announced June 2024.

    Comments: Accepted to Interspeech 2024

  16. arXiv:2406.05446  [pdf]

    cs.CL cs.AI

    Design of reliable technology valuation model with calibrated machine learning of patent indicators

    Authors: Seunghyun Lee, Janghyeok Yoon, Jaewoong Choi

    Abstract: Machine learning (ML) has revolutionized the digital transformation of technology valuation by predicting the value of patents with high accuracy. However, the lack of validation regarding the reliability of these models hinders experts from fully trusting the confidence of model predictions. To address this issue, we propose an analytical framework for reliable technology valuation using calibrat…

    Submitted 8 June, 2024; originally announced June 2024.

  17. arXiv:2406.05431  [pdf]

    cs.CL

    MaTableGPT: GPT-based Table Data Extractor from Materials Science Literature

    Authors: Gyeong Hoon Yi, Jiwoo Choi, Hyeongyun Song, Olivia Miano, Jaewoong Choi, Kihoon Bang, Byungju Lee, Seok Su Sohn, David Buttler, Anna Hiszpanski, Sang Soo Han, Donghun Kim

    Abstract: Efficiently extracting data from tables in the scientific literature is pivotal for building large-scale databases. However, the tables reported in materials science papers exist in highly diverse forms; thus, rule-based extractions are an ineffective approach. To overcome this challenge, we present MaTableGPT, which is a GPT-based table data extractor from the materials science literature. MaTabl…

    Submitted 8 June, 2024; originally announced June 2024.

  18. arXiv:2406.05255  [pdf, other]

    cs.CL cs.AI

    Generative Explore-Exploit: Training-free Optimization of Generative Recommender Systems using LLM Optimizers

    Authors: Lütfi Kerem Senel, Besnik Fetahu, Davis Yoshida, Zhiyu Chen, Giuseppe Castellucci, Nikhita Vedula, Jason Choi, Shervin Malmasi

    Abstract: Recommender systems are widely used to suggest engaging content, and Large Language Models (LLMs) have given rise to generative recommenders. Such systems can directly generate items, including for open-set tasks like question suggestion. While the world knowledge of LLMs enable good recommendations, improving the generated content through user feedback is challenging as continuously fine-tuning L…

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: Accepted at ACL 2024 Main Proceedings

  19. arXiv:2406.03671  [pdf, other]

    cs.LG cs.AI

    PANDA: Expanded Width-Aware Message Passing Beyond Rewiring

    Authors: Jeongwhan Choi, Sumin Park, Hyowon Wi, Sung-Bae Cho, Noseong Park

    Abstract: Recent research in the field of graph neural network (GNN) has identified a critical issue known as "over-squashing," resulting from the bottleneck phenomenon in graph structures, which impedes the propagation of long-range information. Prior works have proposed a variety of graph rewiring concepts that aim at optimizing the spatial or spectral properties of graphs to promote the signal propagatio…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: Accepted at ICML 2024

  20. arXiv:2406.03665  [pdf, other]

    cs.LG cs.AI

    Towards Dynamic Trend Filtering through Trend Point Detection with Reinforcement Learning

    Authors: Jihyeon Seong, Sekwang Oh, Jaesik Choi

    Abstract: Trend filtering simplifies complex time series data by applying smoothness to filter out noise while emphasizing proximity to the original data. However, existing trend filtering methods fail to reflect abrupt changes in the trend due to 'approximateness,' resulting in constant smoothness. This approximateness uniformly filters out the tail distribution of time series data, characterized by extrem…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: 18 pages, 11 figures

    Journal ref: IJCAI2024

  21. arXiv:2405.20630  [pdf, other]

    cs.LG

    Stochastic Optimal Control for Diffusion Bridges in Function Spaces

    Authors: Byoungwoo Park, Jungwon Choi, Sungbin Lim, Juho Lee

    Abstract: Recent advancements in diffusion models and diffusion bridges primarily focus on finite-dimensional spaces, yet many real-world problems necessitate operations in infinite-dimensional function spaces for more natural and interpretable formulations. In this paper, we present a theory of stochastic optimal control (SOC) tailored to infinite-dimensional spaces, aiming to extend diffusion-based algori…

    Submitted 2 June, 2024; v1 submitted 31 May, 2024; originally announced May 2024.

  22. arXiv:2405.19899  [pdf, other]

    cs.CV cs.AI

    Open-Set Domain Adaptation for Semantic Segmentation

    Authors: Seun-An Choe, Ah-Hyung Shin, Keon-Hee Park, Jinwoo Choi, Gyeong-Moon Park

    Abstract: Unsupervised domain adaptation (UDA) for semantic segmentation aims to transfer the pixel-wise knowledge from the labeled source domain to the unlabeled target domain. However, current UDA methods typically assume a shared label space between source and target, limiting their applicability in real-world scenarios where novel categories may emerge in the target domain. In this paper, we introduce O…

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: 14 pages, 5 figures, 13 tables, CVPR 2024 Poster

  23. arXiv:2405.17878  [pdf, other]

    cs.LG cs.AI

    An Information Theoretic Metric for Evaluating Unlearning Models

    Authors: Dongjae Jeon, Wonje Jeung, Taeheon Kim, Albert No, Jonghyun Choi

    Abstract: Machine unlearning (MU) addresses privacy concerns by removing information of 'forgetting data' samples from trained models. Typically, evaluating MU methods involves comparing unlearned models to those retrained from scratch without forgetting data, using metrics such as membership inference attacks (MIA) and accuracy measurements. These evaluations implicitly assume that if the output logits of…

    Submitted 28 May, 2024; originally announced May 2024.

  24. arXiv:2405.16483  [pdf, other]

    cs.NI eess.SP

    Enhancing Reliability in LEO Satellite Networks via High-Speed Inter-Satellite Links

    Authors: Jinho Choi

    Abstract: Low Earth orbit (LEO) satellites play a crucial role in providing global connectivity for non-terrestrial networks (NTNs) and supporting various Internet-of-Remote-Things (IoRT) applications. Each LEO satellite functions as a relay node in the sky, employing store-and-forward transmission strategies that necessitate the use of buffers. However, due to the finite size of these buffers, occurrences…

    Submitted 26 May, 2024; originally announced May 2024.

    Comments: 13 pages, 7 figures, to be published in IEEE Wireless Communications Letters

  25. arXiv:2405.16301  [pdf, other]

    cs.CV cs.LG

    Active Learning for Finely-Categorized Image-Text Retrieval by Selecting Hard Negative Unpaired Samples

    Authors: Dae Ung Jo, Kyuewang Lee, JaeHo Chung, Jin Young Choi

    Abstract: Securing a sufficient amount of paired data is important to train an image-text retrieval (ITR) model, but collecting paired data is very expensive. To address this issue, in this paper, we propose an active learning algorithm for ITR that can collect paired data cost-efficiently. Previous studies assume that image-text pairs are given and their category labels are asked to the annotator. However,…

    Submitted 25 May, 2024; originally announced May 2024.

  26. arXiv:2405.15780  [pdf, other]

    cs.CV cs.LG

    Sequence Length Scaling in Vision Transformers for Scientific Images on Frontier

    Authors: Aristeidis Tsaris, Chengming Zhang, Xiao Wang, Junqi Yin, Siyan Liu, Moetasim Ashfaq, Ming Fan, Jong Youl Choi, Mohamed Wahib, Dan Lu, Prasanna Balaprakash, Feiyi Wang

    Abstract: Vision Transformers (ViTs) are pivotal for foundational models in scientific imagery, including Earth science applications, due to their capability to process large sequence lengths. While transformers for text has inspired scaling sequence lengths in ViTs, yet adapting these for ViTs introduces unique challenges. We develop distributed sequence parallelism for ViTs, enabling them to handle up to…

    Submitted 17 April, 2024; originally announced May 2024.

  27. arXiv:2405.12468  [pdf, other]

    cs.CL

    Diverse and Effective Synthetic Data Generation for Adaptable Zero-Shot Dialogue State Tracking

    Authors: James D. Finch, Jinho D. Choi

    Abstract: We demonstrate substantial performance gains in zero-shot dialogue state tracking (DST) by enhancing training data diversity through synthetic data generation. Existing DST datasets are severely limited in the number of application domains and slot types they cover due to the high costs of data collection, restricting their adaptability to new domains. This work addresses this challenge with a nov…

    Submitted 13 June, 2024; v1 submitted 20 May, 2024; originally announced May 2024.

  28. arXiv:2405.11473  [pdf, other]

    cs.CV cs.AI

    FIFO-Diffusion: Generating Infinite Videos from Text without Training

    Authors: Jihwan Kim, Junoh Kang, Jinyoung Choi, Bohyung Han

    Abstract: We propose a novel inference technique based on a pretrained diffusion model for text-conditional video generation. Our approach, called FIFO-Diffusion, is conceptually capable of generating infinitely long videos without additional training. This is achieved by iteratively performing diagonal denoising, which concurrently processes a series of consecutive frames with increasing noise levels in a…

    Submitted 12 June, 2024; v1 submitted 19 May, 2024; originally announced May 2024.

    Comments: Project Page: https://jjihwan.github.io/projects/FIFO-Diffusion

  29. arXiv:2405.11178  [pdf, other]

    cs.CL

    Automating PTSD Diagnostics in Clinical Interviews: Leveraging Large Language Models for Trauma Assessments

    Authors: Sichang Tu, Abigail Powers, Natalie Merrill, Negar Fani, Sierra Carter, Stephen Doogan, Jinho D. Choi

    Abstract: The shortage of clinical workforce presents significant challenges in mental healthcare, limiting access to formal diagnostics and services. We aim to tackle this shortage by integrating a customized large language model (LLM) into the workflow, thus promoting equity in mental healthcare for the general population. Although LLMs have showcased their capability in clinical decision-making, their ad…

    Submitted 18 May, 2024; originally announced May 2024.

  30. arXiv:2405.09976  [pdf, other]

    cs.CV eess.SP

    Language-Oriented Semantic Latent Representation for Image Transmission

    Authors: Giordano Cicchetti, Eleonora Grassucci, Jihong Park, Jinho Choi, Sergio Barbarossa, Danilo Comminiello

    Abstract: In the new paradigm of semantic communication (SC), the focus is on delivering meanings behind bits by extracting semantic information from raw data. Recent advances in data-to-text models facilitate language-oriented SC, particularly for text-transformed image communication via image-to-text (I2T) encoding and text-to-image (T2I) decoding. However, although semantically aligned, the text is too c…

    Submitted 16 May, 2024; originally announced May 2024.

    Comments: Under review at IEEE International Workshop on Machine Learning for Signal Processing (MLSP) 2024

  31. arXiv:2405.09866  [pdf, other]

    eess.SP cs.LG

    Rethinking Multi-User Semantic Communications with Deep Generative Models

    Authors: Eleonora Grassucci, Jinho Choi, Jihong Park, Riccardo F. Gramaccioni, Giordano Cicchetti, Danilo Comminiello

    Abstract: In recent years, novel communication strategies have emerged to face the challenges that the increased number of connected devices and the higher quality of transmitted information are posing. Among them, semantic communication obtained promising results especially when combined with state-of-the-art deep generative models, such as large language or diffusion models, able to regenerate content fro…

    Submitted 16 May, 2024; originally announced May 2024.

    Comments: Under review in IEEE Journal on Selected Areas in Communications

  32. arXiv:2405.06931  [pdf, other]

    cs.IR

    Identifying Key Terms in Prompts for Relevance Evaluation with GPT Models

    Authors: Jaekeol Choi

    Abstract: Relevance evaluation of a query and a passage is essential in Information Retrieval (IR). Recently, numerous studies have been conducted on tasks related to relevance judgment using Large Language Models (LLMs) such as GPT-4, demonstrating significant improvements. However, the efficacy of LLMs is considerably influenced by the design of the prompt. The purpose of this paper is to identify which s…

    Submitted 11 May, 2024; originally announced May 2024.

    Comments: 19 pages, 2 figures

    Journal ref: International Journal of Natural Language Computing, April 2024, Volume 13, Number 2

  33. arXiv:2405.04746  [pdf, other]

    cs.IR cs.AI cs.LG

    SVD-AE: Simple Autoencoders for Collaborative Filtering

    Authors: Seoyoung Hong, Jeongwhan Choi, Yeon-Chang Lee, Srijan Kumar, Noseong Park

    Abstract: Collaborative filtering (CF) methods for recommendation systems have been extensively researched, ranging from matrix factorization and autoencoder-based to graph filtering-based methods. Recently, lightweight methods that require almost no training have been recently proposed to reduce overall computation. However, existing methods still have room to improve the trade-offs among accuracy, efficie…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: Accepted by IJCAI 2024

  34. arXiv:2405.03958  [pdf, other]

    cs.CV cs.AI cs.LG

    Simple Drop-in LoRA Conditioning on Attention Layers Will Improve Your Diffusion Model

    Authors: Joo Young Choi, Jaesung R. Park, Inkyu Park, Jaewoong Cho, Albert No, Ernest K. Ryu

    Abstract: Current state-of-the-art diffusion models employ U-Net architectures containing convolutional and (qkv) self-attention layers. The U-Net processes images while being conditioned on the time embedding input for each sampling step and the class or caption embedding input corresponding to the desired conditional generation. Such conditioning involves scale-and-shift operations to the convolutional la…

    Submitted 6 May, 2024; originally announced May 2024.

  35. arXiv:2405.02762  [pdf, other]

    cs.CV cs.LG cs.RO

    TK-Planes: Tiered K-Planes with High Dimensional Feature Vectors for Dynamic UAV-based Scenes

    Authors: Christopher Maxey, Jaehoon Choi, Yonghan Lee, Hyungtae Lee, Dinesh Manocha, Heesung Kwon

    Abstract: In this paper, we present a new approach to bridge the domain gap between synthetic and real-world data for unmanned aerial vehicle (UAV)-based perception. Our formulation is designed for dynamic scenes, consisting of moving objects or human actions, where the goal is to recognize the pose or actions. We propose an extension of K-Planes Neural Radiance Field (NeRF), wherein our algorithm store…

    Submitted 4 May, 2024; originally announced May 2024.

    Comments: 8 pages, submitted to IROS2024

  36. arXiv:2405.02347  [pdf, other]

    cs.LG cs.AI cs.CL

    COPAL: Continual Pruning in Large Language Generative Models

    Authors: Srikanth Malla, Joon Hee Choi, Chiho Choi

    Abstract: Adapting pre-trained large language models to different domains in natural language processing requires two key considerations: high computational demands and model's inability to continual adaptation. To simultaneously address both issues, this paper presents COPAL (COntinual Pruning in Adaptive Language settings), an algorithm developed for pruning large language generative models under a contin…

    Submitted 14 June, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

    Comments: ICML2024

  37. arXiv:2405.01554  [pdf, other]

    cs.LG cs.AI q-bio.NC

    Early-stage detection of cognitive impairment by hybrid quantum-classical algorithm using resting-state functional MRI time-series

    Authors: Junggu Choi, Tak Hur, Daniel K. Park, Na-Young Shin, Seung-Koo Lee, Hakbae Lee, Sanghoon Han

    Abstract: Following the recent development of quantum machine learning techniques, the literature has reported several quantum machine learning algorithms for disease detection. This study explores the application of a hybrid quantum-classical algorithm for classifying region-of-interest time-series data obtained from resting-state functional magnetic resonance imaging in patients with early-stage cognitive…

    Submitted 16 March, 2024; originally announced May 2024.

    Comments: 28 pages, 10 figures

  38. arXiv:2405.01022  [pdf, other]

    cs.CL cs.AI

    UniGen: Universal Domain Generalization for Sentiment Classification via Zero-shot Dataset Generation

    Authors: Juhwan Choi, Yeonghwa Kim, Seunguk Yu, JungMin Yun, YoungBin Kim

    Abstract: Although pre-trained language models have exhibited great flexibility and versatility with prompt-based few-shot learning, they suffer from the extensive parameter size and limited applicability for inference. Recent studies have suggested that PLMs be used as dataset generators and a tiny task-specific model be trained to achieve efficient inference. However, their applicability to various domain…

    Submitted 2 May, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

  39. arXiv:2405.00523  [pdf, other]

    cs.AI cs.CL

    CookingSense: A Culinary Knowledgebase with Multidisciplinary Assertions

    Authors: Donghee Choi, Mogan Gim, Donghyeon Park, Mujeen Sung, Hyunjae Kim, Jaewoo Kang, Jihun Choi

    Abstract: This paper introduces CookingSense, a descriptive collection of knowledge assertions in the culinary domain extracted from various sources, including web data, scientific papers, and recipes, from which knowledge covering a broad range of aspects is acquired. CookingSense is constructed through a series of dictionary-based filtering and language model-based semantic filtering techniques, which res…

    Submitted 1 May, 2024; originally announced May 2024.

    Comments: LREC-COLING 2024 Accepted

  40. arXiv:2405.00287  [pdf, other]

    cs.IR cs.AI cs.LG

    Stochastic Sampling for Contrastive Views and Hard Negative Samples in Graph-based Collaborative Filtering

    Authors: Chaejeong Lee, Jeongwhan Choi, Hyowon Wi, Sung-Bae Cho, Noseong Park

    Abstract: Graph-based collaborative filtering (CF) has emerged as a promising approach in recommendation systems. Despite its achievements, graph-based CF models face challenges due to data sparsity and negative sampling. In this paper, we propose a novel Stochastic sampling for i) COntrastive views and ii) hard NEgative samples (SCONE) to overcome these issues. By considering that they are both sampling ta…

    Submitted 30 April, 2024; originally announced May 2024.

  41. arXiv:2404.16418  [pdf, other]

    cs.CL

    Instruction Matters, a Simple yet Effective Task Selection Approach in Instruction Tuning for Specific Tasks

    Authors: Changho Lee, Janghoon Han, Seonghyeon Ye, Stanley Jungkyu Choi, Honglak Lee, Kyunghoon Bae

    Abstract: Instruction tuning has shown its ability to not only enhance zero-shot generalization across various tasks but also its effectiveness in improving the performance of specific tasks. A crucial aspect in instruction tuning for a particular task is a strategic selection of related tasks that offer meaningful supervision, thereby enhancing efficiency and preventing performance degradation from irrelev…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: 21 pages, 6 figures, 16 tables

  42. arXiv:2404.14712  [pdf, other

    physics.ao-ph cs.AI cs.DC eess.IV physics.geo-ph

    ORBIT: Oak Ridge Base Foundation Model for Earth System Predictability

    Authors: Xiao Wang, Aristeidis Tsaris, Siyan Liu, Jong-Youl Choi, Ming Fan, Wei Zhang, Junqi Yin, Moetasim Ashfaq, Dan Lu, Prasanna Balaprakash

    Abstract: Earth system predictability is challenged by the complexity of environmental dynamics and the multitude of variables involved. Current AI foundation models, although advanced by leveraging large and heterogeneous data, are often constrained by their size and data integration, limiting their effectiveness in addressing the full range of Earth system prediction challenges. To overcome these limitati…

    Submitted 22 April, 2024; originally announced April 2024.

  43. arXiv:2404.12168  [pdf, other]

    cs.CV cs.AI

    Real-World Efficient Blind Motion Deblurring via Blur Pixel Discretization

    Authors: Insoo Kim, Jae Seok Choi, Geonseok Seo, Kinam Kwon, Jinwoo Shin, Hyong-Euk Lee

    Abstract: As recent advances in mobile camera technology have enabled the capability to capture high-resolution images, such as 4K images, the demand for an efficient deblurring model handling large motion has increased. In this paper, we discover that the image residual errors, i.e., blur-sharp pixel differences, can be grouped into some categories according to their motion blur type and how complex their…

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: CVPR2024 Camera-Ready

  44. arXiv:2404.09682  [pdf, other]

    cs.CL cs.AI

    Multi-News+: Cost-efficient Dataset Cleansing via LLM-based Data Annotation

    Authors: Juhwan Choi, Jungmin Yun, Kyohoon Jin, YoungBin Kim

    Abstract: The quality of the dataset is crucial for ensuring optimal performance and reliability of downstream task models. However, datasets often contain noisy data inadvertently included during the construction process. Numerous attempts have been made to correct this issue through human annotators. However, hiring and managing human annotators is expensive and time-consuming. As an alternative, recent s…

    Submitted 15 April, 2024; originally announced April 2024.

  45. arXiv:2404.09480  [pdf, other]

    cs.CL cs.AI

    Mitigating Hallucination in Abstractive Summarization with Domain-Conditional Mutual Information

    Authors: Kyubyung Chae, Jaepill Choi, Yohan Jo, Taesup Kim

    Abstract: A primary challenge in abstractive summarization is hallucination -- the phenomenon where a model generates plausible text that is absent in the source text. We hypothesize that the domain (or topic) of the source text triggers the model to generate text that is highly probable in the domain, neglecting the details of the source text. To alleviate this model bias, we introduce a decoding strategy…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: Accepted by Findings of NAACL 2024

  46. arXiv:2404.09228  [pdf, other]

    cs.RO

    A Survey on Integration of Large Language Models with Intelligent Robots

    Authors: Yeseung Kim, Dohyun Kim, Jieun Choi, Jisang Park, Nayoung Oh, Daehyung Park

    Abstract: In recent years, the integration of large language models (LLMs) has revolutionized the field of robotics, enabling robots to communicate, understand, and reason with human-like proficiency. This paper explores the multifaceted impact of LLMs on robotics, addressing key challenges and opportunities for leveraging these models across various domains. By categorizing and analyzing LLM applications w…

    Submitted 23 June, 2024; v1 submitted 14 April, 2024; originally announced April 2024.

    Comments: 24 pages, 1 figure, Accepted to Intelligent Service Robotics (ISR)

  47. arXiv:2404.08654  [pdf]

    cs.CL cs.AI

    Optimal path for Biomedical Text Summarization Using Pointer GPT

    Authors: Hyunkyung Han, Jaesik Choi

    Abstract: Biomedical text summarization is a critical tool that enables clinicians to effectively ascertain patient status. Traditionally, text summarization has been accomplished with transformer models, which are capable of compressing long documents into brief summaries. However, transformer models are known to be among the most challenging natural language processing (NLP) tasks. Specifically, GPT model…

    Submitted 21 March, 2024; originally announced April 2024.

    Comments: 3 pages, 3 figures

    Journal ref: KSC2023

  48. arXiv:2404.07610  [pdf, other]

    cs.CV

    Do You Remember? Dense Video Captioning with Cross-Modal Memory Retrieval

    Authors: Minkuk Kim, Hyeon Bae Kim, Jinyoung Moon, Jinwoo Choi, Seong Tae Kim

    Abstract: There has been significant attention to the research on dense video captioning, which aims to automatically localize and caption all events within untrimmed video. Several studies introduce methods by designing dense video captioning as a multitasking problem of event localization and event captioning to consider inter-task relations. However, addressing both tasks using only visual input is chall…

    Submitted 11 April, 2024; originally announced April 2024.

    Comments: CVPR 2024

  49. arXiv:2404.06621  [pdf, other]

    cs.CL

    What is Your Favorite Gender, MLM? Gender Bias Evaluation in Multilingual Masked Language Models

    Authors: Jeongrok Yu, Seong Ug Kim, Jacob Choi, Jinho D. Choi

    Abstract: Bias is a disproportionate prejudice in favor of one side against another. Due to the success of transformer-based Masked Language Models (MLMs) and their impact on many NLP tasks, a systematic evaluation of bias in these models is needed more than ever. While many studies have evaluated gender bias in English MLMs, only a few works have been conducted for the task in other languages. This paper p…

    Submitted 9 April, 2024; originally announced April 2024.

  50. arXiv:2404.04870  [pdf, other]

    cs.LG eess.SP nlin.CD

    Signal-noise separation using unsupervised reservoir computing

    Authors: Jaesung Choi, Pilwon Kim

    Abstract: Removing noise from a signal without knowing the characteristics of the noise is a challenging task. This paper introduces a signal-noise separation method based on time series prediction. We use Reservoir Computing (RC) to extract the maximum portion of "predictable information" from a given signal. Reproducing the deterministic component of the signal using RC, we estimate the noise distribution…

    Submitted 30 May, 2024; v1 submitted 7 April, 2024; originally announced April 2024.