Showing 1–50 of 789 results for author: Kim, K

Searching in archive cs.
  1. arXiv:2406.12721  [pdf]

    eess.AS cs.SD

    Sound event detection based on auxiliary decoder and maximum probability aggregation for DCASE Challenge 2024 Task 4

    Authors: Sang Won Son, Jongyeon Park, Hong Kook Kim, Sulaiman Vesal, Jeong Eun Lim

    Abstract: In this report, we propose three novel methods for developing a sound event detection (SED) model for the DCASE 2024 Challenge Task 4. First, we propose an auxiliary decoder attached to the final convolutional block to improve feature extraction capabilities while reducing dependency on embeddings from pre-trained large models. The proposed auxiliary decoder operates independently from the main de…

    Submitted 24 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: DCASE 2024 challenge Task4, 4 pages

  2. arXiv:2406.12233  [pdf, other]

    cs.AI cs.CL cs.CV

    SyncVSR: Data-Efficient Visual Speech Recognition with End-to-End Crossmodal Audio Token Synchronization

    Authors: Young Jin Ahn, Jungwoo Park, Sangha Park, Jonghyun Choi, Kee-Eung Kim

    Abstract: Visual Speech Recognition (VSR) stands at the intersection of computer vision and speech recognition, aiming to interpret spoken content from visual cues. A prominent challenge in VSR is the presence of homophenes, visually similar lip gestures that represent different phonemes. Prior approaches have sought to distinguish fine-grained visemes by aligning visual and auditory semantics, but often fel…

    Submitted 17 June, 2024; originally announced June 2024.

  3. arXiv:2406.12016  [pdf, other]

    cs.LG

    Prefixing Attention Sinks can Mitigate Activation Outliers for Large Language Model Quantization

    Authors: Seungwoo Son, Wonpyo Park, Woohyun Han, Kyuyeun Kim, Jaeho Lee

    Abstract: Despite recent advances in LLM quantization, activation quantization remains challenging due to activation outliers. Conventional remedies, e.g., mixing precisions for different channels, introduce extra overhead and reduce the speedup. In this work, we develop a simple yet effective strategy to facilitate per-tensor activation quantization by preventing the generation of problematic tok…

    Submitted 17 June, 2024; originally announced June 2024.

  4. arXiv:2406.11875  [pdf, other]

    cs.AI

    ChatPCG: Large Language Model-Driven Reward Design for Procedural Content Generation

    Authors: In-Chang Baek, Tae-Hwa Park, Jin-Ha Noh, Cheong-Mok Bae, Kyung-Joong Kim

    Abstract: Driven by the rapid growth of machine learning, recent advances in game artificial intelligence (AI) have significantly impacted productivity across various gaming genres. Reward design plays a pivotal role in training game AI models, wherein researchers implement concepts of specific reward functions. However, despite the presence of AI, the reward design process predominantly remains in the doma…

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: 4 pages, 2 figures, accepted at IEEE Conference on Games 2024

  5. arXiv:2406.11248  [pdf]

    eess.AS cs.AI cs.SD

    Performance Improvement of Language-Queried Audio Source Separation Based on Caption Augmentation From Large Language Models for DCASE Challenge 2024 Task 9

    Authors: Do Hyun Lee, Yoonah Song, Hong Kook Kim

    Abstract: We present a prompt-engineering-based text-augmentation approach applied to a language-queried audio source separation (LASS) task. To enhance the performance of LASS, the proposed approach utilizes large language models (LLMs) to generate multiple captions corresponding to each sentence of the training dataset. To this end, we first perform experiments to identify the most effective prompts for c…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: DCASE 2024 Challenge Task 9, 4 pages

  6. arXiv:2406.09345  [pdf, other]

    cs.CL cs.SD eess.AS

    DiscreteSLU: A Large Language Model with Self-Supervised Discrete Speech Units for Spoken Language Understanding

    Authors: Suwon Shon, Kwangyoun Kim, Yi-Te Hsu, Prashant Sridhar, Shinji Watanabe, Karen Livescu

    Abstract: The integration of pre-trained text-based large language models (LLM) with speech input has enabled instruction-following capabilities for diverse speech tasks. This integration requires the use of a speech encoder, a speech adapter, and an LLM, trained on diverse tasks. We propose the use of discrete speech units (DSU), rather than continuous-valued speech encoder outputs, that are converted to t…

    Submitted 13 June, 2024; originally announced June 2024.

  7. arXiv:2406.08527  [pdf, other]

    cs.LG cs.AI

    Optimized Feature Generation for Tabular Data via LLMs with Decision Tree Reasoning

    Authors: Jaehyun Nam, Kyuyoung Kim, Seunghyuk Oh, Jihoon Tack, Jaehyung Kim, Jinwoo Shin

    Abstract: Learning effective representations from raw data is crucial for the success of deep learning methods. However, in the tabular domain, practitioners often prefer augmenting raw column features over using learned representations, as conventional tree-based algorithms frequently outperform competing approaches. As a result, feature engineering methods that automatically generate candidate features ha…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: 18 pages

  8. arXiv:2406.05963  [pdf, other]

    cs.CV cs.AI

    Solution for SMART-101 Challenge of CVPR Multi-modal Algorithmic Reasoning Task 2024

    Authors: Jinwoo Ahn, Junhyeok Park, Min-Jun Kim, Kang-Hyeon Kim, So-Yeong Sohn, Yun-Ji Lee, Du-Seong Chang, Yu-Jung Heo, Eun-Sol Kim

    Abstract: In this paper, the solution of HYU MLLAB KT Team to the Multimodal Algorithmic Reasoning Task: SMART-101 CVPR 2024 Challenge is presented. Beyond conventional visual question-answering problems, the SMART-101 challenge aims to achieve human-level multimodal understanding by tackling complex visio-linguistic puzzles designed for children in the 6-8 age group. To solve this problem, we suggest two m…

    Submitted 9 June, 2024; originally announced June 2024.

  9. arXiv:2406.05794  [pdf, other]

    cs.CL cs.AI

    RE-RAG: Improving Open-Domain QA Performance and Interpretability with Relevance Estimator in Retrieval-Augmented Generation

    Authors: Kiseung Kim, Jay-Yoon Lee

    Abstract: The Retrieval Augmented Generation (RAG) framework utilizes a combination of parametric knowledge and external knowledge to demonstrate state-of-the-art performance on open-domain question answering tasks. However, the RAG framework suffers from performance degradation when the query is accompanied by irrelevant contexts. In this work, we propose the RE-RAG framework, which introduces a relevance…

    Submitted 16 June, 2024; v1 submitted 9 June, 2024; originally announced June 2024.

  10. arXiv:2406.04772  [pdf, other]

    cs.LG cs.AI cs.CV

    REP: Resource-Efficient Prompting for On-device Continual Learning

    Authors: Sungho Jeon, Xinyue Ma, Kwang In Kim, Myeongjae Jeon

    Abstract: On-device continual learning (CL) requires the co-optimization of model accuracy and resource efficiency to be practical. This is extremely challenging because it must preserve accuracy while learning new tasks with continuously drifting data and maintain both high energy and memory efficiency to be deployable on real-world devices. Typically, a CL method leverages one of two types of backbone net…

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: 19 pages, 10 figures

  11. arXiv:2406.04308  [pdf, other]

    cs.LG stat.ML

    Approximation-Aware Bayesian Optimization

    Authors: Natalie Maus, Kyurae Kim, Geoff Pleiss, David Eriksson, John P. Cunningham, Jacob R. Gardner

    Abstract: High-dimensional Bayesian optimization (BO) tasks such as molecular design often require 10,000 function evaluations before obtaining meaningful results. While methods like sparse variational Gaussian processes (SVGPs) reduce computational requirements in these settings, the underlying approximations result in suboptimal data acquisitions that slow the progress of optimization. In this paper we mo…

    Submitted 6 June, 2024; originally announced June 2024.

  12. arXiv:2406.03773  [pdf, other]

    cs.IT

    Optimizing Multi-User Semantic Communication via Transfer Learning and Knowledge Distillation

    Authors: Loc X. Nguyen, Kitae Kim, Ye Lin Tun, Sheikh Salman Hassan, Yan Kyaw Tun, Zhu Han, Choong Seon Hong

    Abstract: Semantic communication, notable for ensuring quality of service by jointly optimizing source and channel coding, effectively extracts data semantics, reduces transmission length, and mitigates channel noise. However, most studies overlook multi-user scenarios and resource availability, limiting real-world application. This paper addresses this gap by focusing on downlink communication from a base…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: 5 pages, 5 figures

  13. arXiv:2406.03486  [pdf, other]

    cs.CL

    BIPED: Pedagogically Informed Tutoring System for ESL Education

    Authors: Soonwoo Kwon, Sojung Kim, Minju Park, Seunghyun Lee, Kyuseok Kim

    Abstract: Large Language Models (LLMs) have a great potential to serve as readily available and cost-efficient Conversational Intelligent Tutoring Systems (CITS) for teaching L2 learners of English. Existing CITS, however, are designed to teach only simple concepts or lack the pedagogical depth necessary to address diverse learning strategies. To develop a more pedagogically informed CITS capable of teachin…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: ACL 2024

  14. arXiv:2406.02000  [pdf, other]

    cs.NI eess.SP

    Advancing Ultra-Reliable 6G: Transformer and Semantic Localization Empowered Robust Beamforming in Millimeter-Wave Communications

    Authors: Avi Deb Raha, Kitae Kim, Apurba Adhikary, Mrityunjoy Gain, Choong Seon Hong

    Abstract: Advancements in 6G wireless technology have elevated the importance of beamforming, especially for attaining ultra-high data rates via millimeter-wave (mmWave) frequency deployment. Although promising, mmWave bands require substantial beam training to achieve precise beamforming. While initial deep learning models that use RGB camera images demonstrated promise in reducing beam training overhead,…

    Submitted 21 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

  15. arXiv:2406.00920  [pdf, ps, other]

    stat.ML cs.LG math.OC

    Demystifying SGD with Doubly Stochastic Gradients

    Authors: Kyurae Kim, Joohwan Ko, Yi-An Ma, Jacob R. Gardner

    Abstract: Optimization objectives in the form of a sum of intractable expectations are rising in importance (e.g., diffusion models, variational autoencoders, and many more), a setting also known as "finite sum with infinite data." For these problems, a popular strategy is to employ SGD with doubly stochastic gradients (doubly SGD): the expectations are estimated using the gradient estimator of each compone…

    Submitted 2 June, 2024; originally announced June 2024.

    Comments: Accepted to ICML'24

  16. arXiv:2406.00810  [pdf, other]

    cs.CR

    Expanding the Attack Scenarios of SAE J1939: A Comprehensive Analysis of Established and Novel Vulnerabilities in Transport Protocol

    Authors: Hwejae Lee, Hyosun Lee, Saehee Jun, Huy Kang Kim

    Abstract: Following the enactment of the UN Regulation, substantial efforts have been directed toward implementing intrusion detection and prevention systems (IDPSs) and vulnerability analysis in Controller Area Network (CAN). However, Society of Automotive Engineers (SAE) J1939 protocol, despite its extensive application in camping cars and commercial vehicles, has seen limited vulnerability identification…

    Submitted 2 June, 2024; originally announced June 2024.

    Comments: 18 pages, 7 figures, 5 tables; This is the accepted version of ESCAR USA 2024

    MSC Class: 68M25 ACM Class: K.6.5

  17. arXiv:2405.20233  [pdf, other]

    cs.LG cs.AI

    Grokfast: Accelerated Grokking by Amplifying Slow Gradients

    Authors: Jaerin Lee, Bong Gyun Kang, Kihoon Kim, Kyoung Mu Lee

    Abstract: One puzzling artifact in machine learning dubbed grokking is where delayed generalization is achieved tenfolds of iterations after near perfect overfitting to the training data. Focusing on the long delay itself on behalf of machine learning practitioners, our goal is to accelerate generalization of a model under grokking phenomenon. By regarding a series of gradients of a parameter over training…

    Submitted 5 June, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

    Comments: 17 pages, 13 figures. Typo fixed. Project page: https://jaerinlee.com/research/grokfast

  18. arXiv:2405.19771  [pdf, other]

    cs.NI eess.SP

    Data Service Maximization in Integrated Terrestrial-Non-Terrestrial 6G Networks: A Deep Reinforcement Learning Approach

    Authors: Nway Nway Ei, Kitae Kim, Yan Kyaw Tun, Choong Seon Hong

    Abstract: Integrating terrestrial and non-terrestrial networks has emerged as a promising paradigm to fulfill the constantly growing demand for connectivity, low transmission delay, and quality of services (QoS). This integration brings together the strengths of terrestrial and non-terrestrial networks, such as the reliability of terrestrial networks, broad coverage, and service continuity of non-terrestria…

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: 5 pages, 4 figures

  19. arXiv:2405.18792  [pdf, other]

    cs.LG cs.AI

    Kernel Metric Learning for In-Sample Off-Policy Evaluation of Deterministic RL Policies

    Authors: Haanvid Lee, Tri Wahyu Guntara, Jongmin Lee, Yung-Kyun Noh, Kee-Eung Kim

    Abstract: We consider off-policy evaluation (OPE) of deterministic target policies for reinforcement learning (RL) in environments with continuous action spaces. While it is common to use importance sampling for OPE, it suffers from high variance when the behavior policy deviates significantly from the target policy. In order to address this issue, some recent works on OPE proposed in-sample learning with i…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: 23 pages, 2 figures, Accepted at ICLR 2024 (spotlight)

  20. arXiv:2405.15987  [pdf, other]

    cs.CY

    Modes of Analyzing Disinformation Narratives With AI/ML/Text Mining to Assist in Mitigating the Weaponization of Social Media

    Authors: Andy Skumanich, Han Kyul Kim

    Abstract: This paper highlights the developing need for quantitative modes for capturing and monitoring malicious communication in social media. There has been a deliberate "weaponization" of messaging through the use of social networks including by politically oriented entities both state sponsored and privately run. The article identifies a use of AI/ML characterization of generalized "mal-info," a broad…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: Accepted at ICWSM-2024 Workshop on Digital State Sponsored Disinformation and Propaganda: Challenges and Opportunities (DSSDP24)

  21. arXiv:2405.12421  [pdf, other]

    cs.LG cs.AI stat.ML

    A Unified Linear Programming Framework for Offline Reward Learning from Human Demonstrations and Feedback

    Authors: Kihyun Kim, Jiawei Zhang, Asuman Ozdaglar, Pablo A. Parrilo

    Abstract: Inverse Reinforcement Learning (IRL) and Reinforcement Learning from Human Feedback (RLHF) are pivotal methodologies in reward learning, which involve inferring and shaping the underlying reward function of sequential decision-making problems based on observed human demonstrations and feedback. Most prior work in reward learning has relied on prior knowledge or assumptions about decision or prefer…

    Submitted 3 June, 2024; v1 submitted 20 May, 2024; originally announced May 2024.

    Comments: ICML 2024

  22. arXiv:2405.11905  [pdf, other]

    cs.CV

    CSTA: CNN-based Spatiotemporal Attention for Video Summarization

    Authors: Jaewon Son, Jaehun Park, Kwangsu Kim

    Abstract: Video summarization aims to generate a concise representation of a video, capturing its essential content and key moments while reducing its overall length. Although several methods employ attention mechanisms to handle long-term dependencies, they often fail to capture the visual significance inherent in frames. To address this limitation, we propose a CNN-based SpatioTemporal Attention (CSTA) me…

    Submitted 21 May, 2024; v1 submitted 20 May, 2024; originally announced May 2024.

    Comments: Accepted at CVPR 2024

  23. arXiv:2405.10123  [pdf, other]

    cs.LG cs.DC

    Asynchronous Federated Stochastic Optimization for Heterogeneous Objectives Under Arbitrary Delays

    Authors: Charikleia Iakovidou, Kibaek Kim

    Abstract: Federated learning (FL) was recently proposed to securely train models with data held over multiple locations ("clients") under the coordination of a central server. Two major challenges hindering the performance of FL algorithms are long training times caused by straggling clients, and a decline in model accuracy under non-iid local data distributions ("client drift"). In this work, we propose an…

    Submitted 28 May, 2024; v1 submitted 16 May, 2024; originally announced May 2024.

  24. arXiv:2405.09935  [pdf, other]

    cs.CL cs.AI

    DEBATE: Devil's Advocate-Based Assessment and Text Evaluation

    Authors: Alex Kim, Keonwoo Kim, Sangwon Yoon

    Abstract: As natural language generation (NLG) models have become prevalent, systematically assessing the quality of machine-generated texts has become increasingly important. Recent studies introduce LLM-based evaluators that operate as reference-free metrics, demonstrating their capability to adeptly handle novel tasks. However, these models generally rely on a single-agent approach, which, we argue, intr…

    Submitted 23 May, 2024; v1 submitted 16 May, 2024; originally announced May 2024.

  25. Unsupervised Extractive Dialogue Summarization in Hyperdimensional Space

    Authors: Seongmin Park, Kyungho Kim, Jaejin Seo, Jihwa Lee

    Abstract: We present HyperSum, an extractive summarization framework that captures both the efficiency of traditional lexical summarization and the accuracy of contemporary neural approaches. HyperSum exploits the pseudo-orthogonality that emerges when randomly initializing vectors at extremely high dimensions ("blessing of dimensionality") to construct representative and efficient sentence embeddings. Simp…

    Submitted 15 May, 2024; originally announced May 2024.

    Comments: ICASSP 2024

  26. arXiv:2405.08311  [pdf, ps, other]

    cs.CL cs.AI

    A Decoupling and Aggregating Framework for Joint Extraction of Entities and Relations

    Authors: Yao Wang, Xin Liu, Weikun Kong, Hai-Tao Yu, Teeradaj Racharak, Kyoung-Sook Kim, Minh Le Nguyen

    Abstract: Named Entity Recognition and Relation Extraction are two crucial and challenging subtasks in the field of Information Extraction. Despite the successes achieved by the traditional approaches, fundamental research questions remain open. First, most recent studies use parameter sharing for a single subtask or shared features for both two subtasks, ignoring their semantic differences. Second, informa…

    Submitted 14 May, 2024; originally announced May 2024.

  27. arXiv:2405.05787  [pdf, other]

    cs.RO cs.CV eess.SY

    Autonomous Robotic Ultrasound System for Liver Follow-up Diagnosis: Pilot Phantom Study

    Authors: Tianpeng Zhang, Sekeun Kim, Jerome Charton, Haitong Ma, Kyungsang Kim, Na Li, Quanzheng Li

    Abstract: The paper introduces a novel autonomous robot ultrasound (US) system targeting liver follow-up scans for outpatients in local communities. Given a computed tomography (CT) image with specific target regions of interest, the proposed system carries out the autonomous follow-up scan in three steps: (i) initial robot contact to surface, (ii) coordinate mapping between CT image and robot, and (iii) ta…

    Submitted 9 May, 2024; originally announced May 2024.

  28. arXiv:2405.03905  [pdf, other]

    cs.AR cs.CV cs.SD eess.AS

    A 65nm 36nJ/Decision Bio-inspired Temporal-Sparsity-Aware Digital Keyword Spotting IC with 0.6V Near-Threshold SRAM

    Authors: Qinyu Chen, Kwantae Kim, Chang Gao, Sheng Zhou, Taekwang Jang, Tobi Delbruck, Shih-Chii Liu

    Abstract: This paper introduces, to the best of the authors' knowledge, the first fine-grained temporal sparsity-aware keyword spotting (KWS) IC leveraging temporal similarities between neighboring feature vectors extracted from input frames and network hidden states, eliminating unnecessary operations and memory accesses. This KWS IC, featuring a bio-inspired delta-gated recurrent neural network (ΔRNN) cla…

    Submitted 6 May, 2024; originally announced May 2024.

  29. arXiv:2405.03083  [pdf, other]

    stat.ME cs.LG stat.ML

    Causal K-Means Clustering

    Authors: Kwangho Kim, Jisu Kim, Edward H. Kennedy

    Abstract: Causal effects are often characterized with population summaries. These might provide an incomplete picture when there are heterogeneous treatment effects across subgroups. Since the subgroup structure is typically unknown, it is more challenging to identify and evaluate subgroup effects than population effects. We propose a new solution to this problem: Causal k-Means Clustering, which harnesses…

    Submitted 29 June, 2024; v1 submitted 5 May, 2024; originally announced May 2024.

  30. arXiv:2405.02367  [pdf, other]

    cs.LG cs.CV

    Enhancing Social Media Post Popularity Prediction with Visual Content

    Authors: Dahyun Jeong, Hyelim Son, Yunjin Choi, Keunwoo Kim

    Abstract: Our study presents a framework for predicting image-based social media content popularity that focuses on addressing complex image information and a hierarchical data structure. We utilize the Google Cloud Vision API to effectively extract key image and color information from users' postings, achieving 6.8% higher accuracy compared to using non-image covariates alone. For prediction, we explore a…

    Submitted 8 May, 2024; v1 submitted 3 May, 2024; originally announced May 2024.

    Report number: JKSS-D-23-00299R1

  31. arXiv:2404.19535  [pdf, other]

    physics.app-ph cs.ET

    Ferroelectrically-enhanced Schottky barrier transistors for Logic-in-Memory applications

    Authors: Daniele Nazzari, Lukas Wind, Masiar Sistani, Dominik Mayr, Kihye Kim, Walter M. Weber

    Abstract: Artificial neural networks (ANNs) have had an enormous impact on a multitude of sectors, from research to industry, generating an unprecedented demand for tailor-suited hardware platforms. Their training and execution is highly memory-intensive, clearly evidencing the limitations affecting the currently available hardware based on the von Neumann architecture, which requires frequent data shuttlin…

    Submitted 30 April, 2024; originally announced April 2024.

  32. arXiv:2404.08672  [pdf, other]

    cs.IR cs.AI cs.CL cs.CY cs.LG

    Taxonomy and Analysis of Sensitive User Queries in Generative AI Search

    Authors: Hwiyeol Jo, Taiwoo Park, Nayoung Choi, Changbong Kim, Ohjoon Kwon, Donghyeon Jeon, Hyunwoo Lee, Eui-Hyeon Lee, Kyoungho Shin, Sun Suk Lim, Kyungmi Kim, Jihye Lee, Sun Kim

    Abstract: Although there has been a growing interest among industries to integrate generative LLMs into their services, limited experience and scarcity of resources act as a barrier in launching and servicing large-scale LLM-based conversational services. In this paper, we share our experiences in developing and operating generative AI models within a national-scale search engine, with a specific focus on…

    Submitted 5 April, 2024; originally announced April 2024.

  33. arXiv:2404.07947  [pdf, other]

    cs.DC cs.LG

    ExeGPT: Constraint-Aware Resource Scheduling for LLM Inference

    Authors: Hyungjun Oh, Kihong Kim, Jaemin Kim, Sungkyun Kim, Junyeol Lee, Du-seong Chang, Jiwon Seo

    Abstract: This paper presents ExeGPT, a distributed system designed for constraint-aware LLM inference. ExeGPT finds and runs with an optimal execution schedule to maximize inference throughput while satisfying a given latency constraint. By leveraging the distribution of input and output sequences, it effectively allocates resources and determines optimal execution configurations, including batch sizes and…

    Submitted 15 March, 2024; originally announced April 2024.

    Comments: Accepted to ASPLOS 2024 (summer cycle)

    Journal ref: 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 24 summer cycle), Volume 2, Nov 15, 2023 (Notification Date)

  34. arXiv:2404.06731  [pdf]

    cs.CY cs.AI

    Accuracy of a Large Language Model in Distinguishing Anti- And Pro-vaccination Messages on Social Media: The Case of Human Papillomavirus Vaccination

    Authors: Soojong Kim, Kwanho Kim, Claire Wonjeong Jo

    Abstract: Objective. Vaccination has engendered a spectrum of public opinions, with social media acting as a crucial platform for health-related discussions. The emergence of artificial intelligence technologies, such as large language models (LLMs), offers a novel opportunity to efficiently investigate public discourses. This research assesses the accuracy of ChatGPT, a widely used and freely available ser…

    Submitted 10 April, 2024; originally announced April 2024.

    Comments: Forthcoming in Preventive Medicine Reports

  35. arXiv:2404.06324  [pdf, other]

    cs.NI cs.AI cs.LG

    Dynamic D2D-Assisted Federated Learning over O-RAN: Performance Analysis, MAC Scheduler, and Asymmetric User Selection

    Authors: Payam Abdisarabshali, Kwang Taik Kim, Michael Langberg, Weifeng Su, Seyyedali Hosseinalipour

    Abstract: Existing studies on federated learning (FL) are mostly focused on system orchestration for static snapshots of the network and making static control decisions (e.g., spectrum allocation). However, real-world wireless networks are susceptible to temporal variations of wireless channel capacity and users' datasets. In this paper, we incorporate multi-granular system dynamics (MSDs) into FL, includin…

    Submitted 9 April, 2024; originally announced April 2024.

    Comments: 120 pages, 13 figures

  36. arXiv:2404.05916  [pdf, other]

    cs.CV

    Prompt-driven Universal Model for View-Agnostic Echocardiography Analysis

    Authors: Sekeun Kim, Hui Ren, Peng Guo, Abder-Rahman Ali, Patrick Zhang, Kyungsang Kim, Xiang Li, Quanzheng Li

    Abstract: Echocardiography segmentation for cardiac analysis is time-consuming and resource-intensive due to the variability in image quality and the necessity to process scans from various standard views. While current automated segmentation methods in echocardiography show promising performance, they are trained on specific scan views to analyze corresponding data. However, this solution has a limitation…

    Submitted 8 April, 2024; originally announced April 2024.

  37. arXiv:2404.04096  [pdf, other]

    cs.IT eess.SP

    Machine Learning-Aided Cooperative Localization under Dense Urban Environment

    Authors: Hoon Lee, Hong Ki Kim, Seung Hyun Oh, Sang Hyun Lee

    Abstract: Future wireless network technology provides automobiles with the connectivity feature to consolidate the concept of vehicular networks that collaborate on conducting cooperative driving tasks. The full potential of connected vehicles, which promises road safety and quality driving experience, can be leveraged if machine learning models guarantee the robustness in performing core functions includin…

    Submitted 5 April, 2024; originally announced April 2024.

  38. arXiv:2404.01954  [pdf, other]

    cs.CL cs.AI

    HyperCLOVA X Technical Report

    Authors: Kang Min Yoo, Jaegeun Han, Sookyo In, Heewon Jeon, Jisu Jeong, Jaewook Kang, Hyunwook Kim, Kyung-Min Kim, Munhyong Kim, Sungju Kim, Donghyun Kwak, Hanock Kwak, Se Jung Kwon, Bado Lee, Dongsoo Lee, Gichang Lee, Jooho Lee, Baeseong Park, Seongjin Shin, Joonsang Yu, Seolki Baek, Sumin Byeon, Eungsup Cho, Dooseok Choe, Jeesung Han, et al. (371 additional authors not shown)

    Abstract: We introduce HyperCLOVA X, a family of large language models (LLMs) tailored to the Korean language and culture, along with competitive capabilities in English, math, and coding. HyperCLOVA X was trained on a balanced mix of Korean, English, and code data, followed by instruction-tuning with high-quality human-annotated datasets while abiding by strict safety guidelines reflecting our commitment t…

    Submitted 13 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

    Comments: 44 pages; updated authors list and fixed author names

  39. arXiv:2404.01863  [pdf, other]

    cs.LG cs.AI

    Confidence-aware Reward Optimization for Fine-tuning Text-to-Image Models

    Authors: Kyuyoung Kim, Jongheon Jeong, Minyong An, Mohammad Ghavamzadeh, Krishnamurthy Dvijotham, Jinwoo Shin, Kimin Lee

    Abstract: Fine-tuning text-to-image models with reward functions trained on human feedback data has proven effective for aligning model behavior with human intent. However, excessive optimization with such reward models, which serve as mere proxy objectives, can compromise the performance of fine-tuned models, a phenomenon known as reward overoptimization. To investigate this issue in depth, we introduce th…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: ICLR 2024

  40. arXiv:2404.01517  [pdf, other]

    cs.LG eess.SP

    Addressing Heterogeneity in Federated Load Forecasting with Personalization Layers

    Authors: Shourya Bose, Yu Zhang, Kibaek Kim

    Abstract: The advent of smart meters has enabled pervasive collection of energy consumption data for training short-term load forecasting models. In response to privacy concerns, federated learning (FL) has been proposed as a privacy-preserving approach for training, but the quality of trained models degrades as client data becomes heterogeneous. In this paper we propose the use of personalization layers fo…

    Submitted 1 April, 2024; originally announced April 2024.

  41. arXiv:2404.01464  [pdf, other]

    eess.IV cs.AI cs.CV cs.LG

    Data-Efficient Unsupervised Interpolation Without Any Intermediate Frame for 4D Medical Images

    Authors: JungEun Kim, Hangyul Yoon, Geondo Park, Kyungsu Kim, Eunho Yang

    Abstract: 4D medical images, which represent 3D images with temporal information, are crucial in clinical practice for capturing dynamic changes and monitoring long-term disease progression. However, acquiring 4D medical images poses challenges due to factors such as radiation exposure and imaging duration, necessitating a balance between achieving high temporal resolution and minimizing adverse effects. Gi…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: CVPR 2024

  42. arXiv:2404.01140  [pdf, other]

    cs.CL

    KoCoNovel: Annotated Dataset of Character Coreference in Korean Novels

    Authors: Kyuhee Kim, Surin Lee, Sangah Lee

    Abstract: In this paper, we present KoCoNovel, a novel character coreference dataset derived from Korean literary texts, complete with detailed annotation guidelines. Comprising 178K tokens from 50 modern and contemporary novels, KoCoNovel stands as one of the largest public coreference resolution corpora in Korean, and the first to be based on literary texts. KoCoNovel offers four distinct versions to acco…

    Submitted 11 April, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

    Comments: 12 pages

  43. arXiv:2404.01104  [pdf, other]

    cs.CL

    SentiCSE: A Sentiment-aware Contrastive Sentence Embedding Framework with Sentiment-guided Textual Similarity

    Authors: Jaemin Kim, Yohan Na, Kangmin Kim, Sang Rak Lee, Dong-Kyu Chae

    Abstract: Recently, sentiment-aware pre-trained language models (PLMs) have demonstrated impressive results in downstream sentiment analysis tasks. However, they neglect to evaluate the quality of their constructed sentiment representations; they just focus on improving the fine-tuning performance, which overshadows the representation quality. We argue that without guaranteeing the representation quality, their d…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: 14 pages, 8 figures

    MSC Class: 68T50 ACM Class: I.2.7

    Journal ref: LREC-COLING2024

  44. arXiv:2404.00974  [pdf, other]

    cs.CV

    Improving Visual Recognition with Hyperbolical Visual Hierarchy Mapping

    Authors: Hyeongjun Kwon, Jinhyun Jang, Jin Kim, Kwonyoung Kim, Kwanghoon Sohn

    Abstract: Visual scenes are naturally organized in a hierarchy, where a coarse semantic is recursively comprised of several fine details. Exploring such a visual hierarchy is crucial to recognize the complex relations of visual elements, leading to a comprehensive scene understanding. In this paper, we propose a Visual Hierarchy Mapper (Hi-Mapper), a novel approach for enhancing the structured understanding…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: This paper is accepted to CVPR 2024. The supplementary material is included. The code is available at https://github.com/kwonjunn01/Hi-Mapper

  45. arXiv:2404.00384  [pdf, other]

    cs.CV

    TTD: Text-Tag Self-Distillation Enhancing Image-Text Alignment in CLIP to Alleviate Single Tag Bias

    Authors: Sanghyun Jo, Soohyun Ryu, Sungyub Kim, Eunho Yang, Kyungsu Kim

    Abstract: We identify a critical bias in contemporary CLIP-based models, which we denote as single tag bias. This bias manifests as a disproportionate focus on a singular tag (word) while neglecting other pertinent tags, stemming from CLIP's text embeddings that prioritize one specific tag in image-text relationships. When deconstructing text into individual tags, only one tag tends to have high relevancy w…

    Submitted 20 May, 2024; v1 submitted 30 March, 2024; originally announced April 2024.

  46. arXiv:2404.00380  [pdf, other]

    cs.CV

    DHR: Dual Features-Driven Hierarchical Rebalancing in Inter- and Intra-Class Regions for Weakly-Supervised Semantic Segmentation

    Authors: Sanghyun Jo, Fei Pan, In-Jae Yu, Kyungsu Kim

    Abstract: Weakly-supervised semantic segmentation (WSS) ensures high-quality segmentation with limited data and excels when employed as input seed masks for large-scale vision models such as Segment Anything. However, WSS faces challenges related to minor classes since those are overlooked in images with adjacent multiple classes, a limitation originating from the overfitting of traditional expansion method…

    Submitted 19 May, 2024; v1 submitted 30 March, 2024; originally announced April 2024.

  47. arXiv:2404.00033  [pdf, other]

    cs.HC

    The Hall of Singularity: VR Experience of Prophecy by AI

    Authors: Jisu Kim, Kirak Kim

    Abstract: "The Hall of Singularity" is an immersive art that creates personalized experiences of receiving prophecies from an AI deity through an integration of Artificial Intelligence (AI) and Virtual Reality (VR). As a metaphor for the mythologizing of AI in our society, "The Hall of Singularity" offers an immersive quasi-religious experience where individuals can encounter an AI that has the power to mak…

    Submitted 22 March, 2024; originally announced April 2024.

    Comments: 3 pages, 4 figures

  48. arXiv:2403.19144  [pdf, other]

    cs.CV

    MoDiTalker: Motion-Disentangled Diffusion Model for High-Fidelity Talking Head Generation

    Authors: Seyeon Kim, Siyoon Jin, Jihye Park, Kihong Kim, Jiyoung Kim, Jisu Nam, Seungryong Kim

    Abstract: Conventional GAN-based models for talking head generation often suffer from limited quality and unstable training. Recent approaches based on diffusion models aimed to address these limitations and improve fidelity. However, they still face challenges, including extensive sampling times and difficulties in maintaining temporal consistency due to the high stochasticity of diffusion models. To overc…

    Submitted 28 March, 2024; originally announced March 2024.

  49. arXiv:2403.18277  [pdf, other]

    cs.CL

    BlendX: Complex Multi-Intent Detection with Blended Patterns

    Authors: Yejin Yoon, Jungyeon Lee, Kangsan Kim, Chanhee Park, Taeuk Kim

    Abstract: Task-oriented dialogue (TOD) systems are commonly designed with the presumption that each utterance represents a single intent. However, this assumption may not accurately reflect real-world situations, where users frequently express multiple intents within a single utterance. While there is an emerging interest in multi-intent detection (MID), existing in-domain datasets such as MixATIS and MixSN…

    Submitted 27 March, 2024; originally announced March 2024.

    Comments: Accepted to LREC-COLING2024

  50. arXiv:2403.15040  [pdf, other]

    cs.CL

    ESG Classification by Implicit Rule Learning via GPT-4

    Authors: Hyo Jeong Yun, Chanyoung Kim, Moonjeong Hahm, Kyuri Kim, Guijin Son

    Abstract: Environmental, social, and governance (ESG) factors are widely adopted as higher investment return indicators. Accordingly, ongoing efforts are being made to automate ESG evaluation with language models to extract signals from massive web text easily. However, recent approaches suffer from a lack of training data, as rating agencies keep their evaluation metrics confidential. This paper investigat…

    Submitted 22 March, 2024; originally announced March 2024.

    Comments: Accepted as Shared Track Paper at 7th FinNLP Workshop @ LREC-COLING 2024