Showing 1–50 of 173 results for author: Ahn, S

Searching in archive cs.
  1. arXiv:2406.12272  [pdf, other]

    cs.AI

    Slot State Space Models

    Authors: Jindong Jiang, Fei Deng, Gautam Singh, Minseung Lee, Sungjin Ahn

    Abstract: Recent State Space Models (SSMs) such as S4, S5, and Mamba have shown remarkable computational benefits in long-range temporal dependency modeling. However, in many sequence modeling problems, the underlying process is inherently modular and it is of interest to have inductive biases that mimic this modular structure. In this paper, we introduce SlotSSMs, a novel framework for incorporating indepe…

    Submitted 30 June, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

  2. arXiv:2406.06793  [pdf, other]

    cs.LG cs.AI

    PlanDQ: Hierarchical Plan Orchestration via D-Conductor and Q-Performer

    Authors: Chang Chen, Junyeob Baek, Fei Deng, Kenji Kawaguchi, Caglar Gulcehre, Sungjin Ahn

    Abstract: Despite the recent advancements in offline RL, no unified algorithm could achieve superior performance across a broad range of tasks. Offline value function learning, in particular, struggles with sparse-reward, long-horizon tasks due to the difficulty of solving credit assignment and extrapolation errors that accumulate as the horizon of the task grows. On the other hand, models that ca…

    Submitted 10 June, 2024; originally announced June 2024.

  3. arXiv:2406.02355  [pdf, other]

    cs.CV cs.AI cs.DC cs.LG

    FedDr+: Stabilizing Dot-regression with Global Feature Distillation for Federated Learning

    Authors: Seongyoon Kim, Minchan Jeong, Sungnyun Kim, Sungwoo Cho, Sumyeong Ahn, Se-Young Yun

    Abstract: Federated Learning (FL) has emerged as a pivotal framework for the development of effective global models (global FL) or personalized models (personalized FL) across clients with heterogeneous, non-iid data distribution. A key challenge in FL is client drift, where data heterogeneity impedes the aggregation of scattered knowledge. Recent studies have tackled the client drift issue by identifying s…

    Submitted 4 June, 2024; originally announced June 2024.

  4. arXiv:2406.01302  [pdf]

    cs.CV

    Pulmonary Embolism Mortality Prediction Using Multimodal Learning Based on Computed Tomography Angiography and Clinical Data

    Authors: Zhusi Zhong, Helen Zhang, Fayez H. Fayad, Andrew C. Lancaster, John Sollee, Shreyas Kulkarni, Cheng Ting Lin, Jie Li, Xinbo Gao, Scott Collins, Colin Greineder, Sun H. Ahn, Harrison X. Bai, Zhicheng Jiao, Michael K. Atalay

    Abstract: Purpose: Pulmonary embolism (PE) is a significant cause of mortality in the United States. The objective of this study is to implement deep learning (DL) models using Computed Tomography Pulmonary Angiography (CTPA), clinical data, and PE Severity Index (PESI) scores to predict PE mortality. Materials and Methods: 918 patients (median age 64 years, range 13-99 years, 52% female) with 3,978 CTPAs w…

    Submitted 5 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

  5. arXiv:2405.19961  [pdf, other]

    cs.LG

    Collective Variable Free Transition Path Sampling with Generative Flow Network

    Authors: Kiyoung Seong, Seonghyun Park, Seonghwan Kim, Woo Youn Kim, Sungsoo Ahn

    Abstract: Understanding transition paths between meta-stable states in molecular systems is fundamental for material design and drug discovery. However, sampling these paths via molecular dynamics simulations is computationally prohibitive due to the high-energy barriers between the meta-stable states. Recent machine learning approaches are often restricted to simple systems or rely on collective variables…

    Submitted 31 May, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

    Comments: 9 pages, 5 figures, 2 tables

  6. arXiv:2405.19691  [pdf, other]

    cs.HC

    Designing Prompt Analytics Dashboards to Analyze Student-ChatGPT Interactions in EFL Writing

    Authors: Minsun Kim, SeonGyeom Kim, Suyoun Lee, Yoosang Yoon, Junho Myung, Haneul Yoo, Hyungseung Lim, Jieun Han, Yoonsu Kim, So-Yeon Ahn, Juho Kim, Alice Oh, Hwajung Hong, Tak Yeon Lee

    Abstract: While ChatGPT has significantly impacted education by offering personalized resources for students, its integration into educational settings poses unprecedented risks, such as inaccuracies and biases in AI-generated content, plagiarism and over-reliance on AI, and privacy and security issues. To help teachers address such risks, we conducted a two-phase iterative design process that comprises sur…

    Submitted 30 May, 2024; originally announced May 2024.

  7. arXiv:2405.16413  [pdf, other]

    cs.AI cs.CL cs.LG stat.AP

    Augmented Risk Prediction for the Onset of Alzheimer's Disease from Electronic Health Records with Large Language Models

    Authors: Jiankun Wang, Sumyeong Ahn, Taykhoom Dalal, Xiaodan Zhang, Weishen Pan, Qiannan Zhang, Bin Chen, Hiroko H. Dodge, Fei Wang, Jiayu Zhou

    Abstract: Alzheimer's disease (AD) is the fifth-leading cause of death among Americans aged 65 and older. Screening and early detection of AD and related dementias (ADRD) are critical for timely intervention and for identifying clinical trial participants. The widespread adoption of electronic health records (EHRs) offers an important resource for developing ADRD screening tools such as machine learning bas…

    Submitted 25 May, 2024; originally announced May 2024.

  8. arXiv:2405.16012  [pdf, other]

    cs.LG

    Pessimistic Backward Policy for GFlowNets

    Authors: Hyosoon Jang, Yunhui Jang, Minsu Kim, Jinkyoo Park, Sungsoo Ahn

    Abstract: This paper studies Generative Flow Networks (GFlowNets), which learn to sample objects proportionally to a given reward function through the trajectory of state transitions. In this work, we observe that GFlowNets tend to under-exploit the high-reward objects due to training on an insufficient number of trajectories, which may lead to a large gap between the estimated flow and the (known) reward valu…

    Submitted 24 May, 2024; originally announced May 2024.

  9. arXiv:2405.08424  [pdf, other]

    cs.LG math.OC

    Tackling Prevalent Conditions in Unsupervised Combinatorial Optimization: Cardinality, Minimum, Covering, and More

    Authors: Fanchen Bu, Hyeonsoo Jo, Soo Yong Lee, Sungsoo Ahn, Kijung Shin

    Abstract: Combinatorial optimization (CO) is naturally discrete, making machine learning based on differentiable optimization inapplicable. Karalias & Loukas (2020) adapted the probabilistic method to incorporate CO into differentiable optimization. Their work ignited the research on unsupervised learning for CO, composed of two main components: probabilistic objectives and derandomization. However, each co…

    Submitted 23 May, 2024; v1 submitted 14 May, 2024; originally announced May 2024.

    Comments: ICML 2024

  10. arXiv:2405.04752  [pdf, other]

    eess.AS cs.SD

    HILCodec: High Fidelity and Lightweight Neural Audio Codec

    Authors: Sunghwan Ahn, Beom Jun Woo, Min Hyun Han, Chanyeong Moon, Nam Soo Kim

    Abstract: The recent advancement of end-to-end neural audio codecs enables compressing audio at very low bitrates while reconstructing the output audio with high fidelity. Nonetheless, such improvements often come at the cost of increased model complexity. In this paper, we identify and address the problems of existing neural audio codecs. We show that the performance of Wave-U-Net does not increase consist…

    Submitted 7 May, 2024; originally announced May 2024.

  11. arXiv:2405.00646  [pdf, other]

    cs.CV cs.LG

    Learning to Compose: Improving Object Centric Learning by Injecting Compositionality

    Authors: Whie Jung, Jaehoon Yoo, Sungjin Ahn, Seunghoon Hong

    Abstract: Learning compositional representation is a key aspect of object-centric learning as it enables flexible systematic generalization and supports complex visual reasoning. However, most of the existing approaches rely on an auto-encoding objective, while the compositionality is implicitly imposed by the architectural or algorithmic bias in the encoder. This misalignment between auto-encoding objective a…

    Submitted 1 May, 2024; originally announced May 2024.

  12. arXiv:2404.16012  [pdf, other]

    cs.CV cs.MM

    GaussianTalker: Real-Time High-Fidelity Talking Head Synthesis with Audio-Driven 3D Gaussian Splatting

    Authors: Kyusun Cho, Joungbin Lee, Heeji Yoon, Yeobin Hong, Jaehoon Ko, Sangjun Ahn, Seungryong Kim

    Abstract: We propose GaussianTalker, a novel framework for real-time generation of pose-controllable talking heads. It leverages the fast rendering capabilities of 3D Gaussian Splatting (3DGS) while addressing the challenges of directly controlling 3DGS with speech audio. GaussianTalker constructs a canonical 3DGS representation of the head and deforms it in sync with the audio. A key insight is to encode t…

    Submitted 25 April, 2024; v1 submitted 24 April, 2024; originally announced April 2024.

    Comments: Project Page: https://ku-cvlab.github.io/GaussianTalker

  13. arXiv:2404.05832  [pdf, other]

    cs.HC eess.SY

    Human-Machine Interaction in Automated Vehicles: Reducing Voluntary Driver Intervention

    Authors: Xinzhi Zhong, Yang Zhou, Varshini Kamaraj, Zhenhao Zhou, Wissam Kontar, Dan Negrut, John D. Lee, Soyoung Ahn

    Abstract: This paper develops a novel car-following control method to reduce voluntary driver interventions and improve traffic stability in Automated Vehicles (AVs). Through a combination of experimental and empirical analysis, we show how voluntary driver interventions can instigate substantial traffic disturbances that are amplified along the traffic upstream. Motivated by these findings, we present a fr…

    Submitted 8 April, 2024; originally announced April 2024.

  14. arXiv:2404.01954  [pdf, other]

    cs.CL cs.AI

    HyperCLOVA X Technical Report

    Authors: Kang Min Yoo, Jaegeun Han, Sookyo In, Heewon Jeon, Jisu Jeong, Jaewook Kang, Hyunwook Kim, Kyung-Min Kim, Munhyong Kim, Sungju Kim, Donghyun Kwak, Hanock Kwak, Se Jung Kwon, Bado Lee, Dongsoo Lee, Gichang Lee, Jooho Lee, Baeseong Park, Seongjin Shin, Joonsang Yu, Seolki Baek, Sumin Byeon, Eungsup Cho, Dooseok Choe, Jeesung Han, et al. (371 additional authors not shown)

    Abstract: We introduce HyperCLOVA X, a family of large language models (LLMs) tailored to the Korean language and culture, along with competitive capabilities in English, math, and coding. HyperCLOVA X was trained on a balanced mix of Korean, English, and code data, followed by instruction-tuning with high-quality human-annotated datasets while abiding by strict safety guidelines reflecting our commitment t…

    Submitted 13 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

    Comments: 44 pages; updated authors list and fixed author names

  15. arXiv:2403.20153  [pdf, other]

    cs.CV

    Talk3D: High-Fidelity Talking Portrait Synthesis via Personalized 3D Generative Prior

    Authors: Jaehoon Ko, Kyusun Cho, Joungbin Lee, Heeji Yoon, Sangmin Lee, Sangjun Ahn, Seungryong Kim

    Abstract: Recent methods for audio-driven talking head synthesis often optimize neural radiance fields (NeRF) on a monocular talking portrait video, leveraging its capability to render high-fidelity and 3D-consistent novel-view frames. However, they often struggle to reconstruct complete face geometry due to the absence of comprehensive 3D information in the input monocular videos. In this paper, we introdu…

    Submitted 29 March, 2024; originally announced March 2024.

    Comments: Project page: https://ku-cvlab.github.io/Talk3D/

  16. arXiv:2403.08272  [pdf, other]

    cs.CL

    RECIPE4U: Student-ChatGPT Interaction Dataset in EFL Writing Education

    Authors: Jieun Han, Haneul Yoo, Junho Myung, Minsun Kim, Tak Yeon Lee, So-Yeon Ahn, Alice Oh

    Abstract: The integration of generative AI in education is expanding, yet empirical analyses of large-scale and real-world interactions between students and AI systems still remain limited. Addressing this gap, we present RECIPE4U (RECIPE for University), a dataset sourced from a semester-long experiment with 212 college students in English as a Foreign Language (EFL) writing courses. During the study, studen…

    Submitted 13 March, 2024; originally announced March 2024.

    Comments: arXiv admin note: text overlap with arXiv:2309.13243

  17. arXiv:2403.02642  [pdf, other]

    cs.RO cs.CV

    UFO: Uncertainty-aware LiDAR-image Fusion for Off-road Semantic Terrain Map Estimation

    Authors: Ohn Kim, Junwon Seo, Seongyong Ahn, Chong Hui Kim

    Abstract: Autonomous off-road navigation requires an accurate semantic understanding of the environment, often converted into a bird's-eye view (BEV) representation for various downstream tasks. While learning-based methods have shown success in generating local semantic terrain maps directly from sensor data, their efficacy in off-road environments is hindered by challenges in accurately representing uncer…

    Submitted 4 March, 2024; originally announced March 2024.

  18. arXiv:2402.18866  [pdf, other]

    cs.LG

    Dr. Strategy: Model-Based Generalist Agents with Strategic Dreaming

    Authors: Hany Hamed, Subin Kim, Dongyeong Kim, Jaesik Yoon, Sungjin Ahn

    Abstract: Model-based reinforcement learning (MBRL) has been a primary approach to ameliorating the sample efficiency issue as well as to make a generalist agent. However, there has not been much effort toward enhancing the strategy of dreaming itself. Therefore, it is a question whether and how an agent can "dream better" in a more structured and strategic way. In this paper, inspired by the observation fr…

    Submitted 4 June, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

    Comments: First two authors contributed equally

  19. arXiv:2402.17077  [pdf, other]

    cs.LG cs.CV

    Parallelized Spatiotemporal Binding

    Authors: Gautam Singh, Yue Wang, Jiawei Yang, Boris Ivanovic, Sungjin Ahn, Marco Pavone, Tong Che

    Abstract: While modern best practices advocate for scalable architectures that support long-range interactions, object-centric models are yet to fully embrace these architectures. In particular, existing object-centric models for handling sequential inputs, due to their reliance on RNN-based implementation, show poor stability and capacity and are slow to train on long sequences. We introduce Parallelizable…

    Submitted 26 February, 2024; originally announced February 2024.

    Comments: See project page at https://parallel-st-binder.github.io

  20. arXiv:2402.16733  [pdf, other]

    cs.CL cs.AI

    DREsS: Dataset for Rubric-based Essay Scoring on EFL Writing

    Authors: Haneul Yoo, Jieun Han, So-Yeon Ahn, Alice Oh

    Abstract: Automated essay scoring (AES) is a useful tool in English as a Foreign Language (EFL) writing education, offering real-time essay scores for students and instructors. However, previous AES models were trained on essays and scores irrelevant to the practical scenarios of EFL writing education and usually provided a single holistic score due to the lack of appropriate datasets. In this paper, we rel…

    Submitted 21 February, 2024; originally announced February 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2310.05191

  21. arXiv:2402.15160  [pdf, other]

    cs.LG cs.AI

    Spatially-Aware Transformer for Embodied Agents

    Authors: Junmo Cho, Jaesik Yoon, Sungjin Ahn

    Abstract: Episodic memory plays a crucial role in various cognitive processes, such as the ability to mentally recall past events. While cognitive science emphasizes the significance of spatial context in the formation and retrieval of episodic memory, the current primary approach to implementing episodic memory in AI systems is through transformers that store temporally ordered experiences, which overlooks…

    Submitted 29 February, 2024; v1 submitted 23 February, 2024; originally announced February 2024.

    Comments: ICLR 2024 Spotlight. First two authors contributed equally

  22. arXiv:2402.12412  [pdf, other]

    cs.HC cs.AI cs.MM eess.SP

    Dynamic and Super-Personalized Media Ecosystem Driven by Generative AI: Unpredictable Plays Never Repeating The Same

    Authors: Sungjun Ahn, Hyun-Jeong Yim, Youngwan Lee, Sung-Ik Park

    Abstract: This paper introduces a media service model that exploits artificial intelligence (AI) video generators at the receive end. This proposal deviates from the traditional multimedia ecosystem, completely relying on in-house production, by shifting part of the content creation onto the receiver. We bring a semantic process into the framework, allowing the distribution network to provide service elemen…

    Submitted 18 February, 2024; originally announced February 2024.

    Comments: 13 pages, 7 figures

  23. arXiv:2402.05982  [pdf, other]

    q-bio.QM cs.LG

    Decoupled Sequence and Structure Generation for Realistic Antibody Design

    Authors: Nayoung Kim, Minsu Kim, Sungsoo Ahn, Jinkyoo Park

    Abstract: Antibody design plays a pivotal role in advancing therapeutics. Although deep learning has made rapid progress in this field, existing methods jointly generate antibody sequences and structures, limiting task-specific optimization. In response, we propose an antibody sequence-structure decoupling (ASSD) framework, which separates sequence generation and structure prediction. Although our approach…

    Submitted 27 May, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

    Comments: 18 pages, 6 figures

  24. arXiv:2402.05965  [pdf, other]

    cs.LG eess.SP

    Hybrid Neural Representations for Spherical Data

    Authors: Hyomin Kim, Yunhui Jang, Jaeho Lee, Sungsoo Ahn

    Abstract: In this paper, we study hybrid neural representations for spherical data, a domain of increasing relevance in scientific research. In particular, our work focuses on weather and climate data as well as cosmic microwave background (CMB) data. Although previous studies have delved into coordinate-based neural representations for spherical signals, they often fail to capture the intricate details of h…

    Submitted 5 February, 2024; originally announced February 2024.

    Comments: 13 pages, 8 figures

  25. arXiv:2402.04278  [pdf, other]

    physics.chem-ph cs.LG

    Gaussian Plane-Wave Neural Operator for Electron Density Estimation

    Authors: Seongsu Kim, Sungsoo Ahn

    Abstract: This work studies machine learning for electron density prediction, which is fundamental for understanding chemical systems and density functional theory (DFT) simulations. To this end, we introduce the Gaussian plane-wave neural operator (GPWNO), which operates in the infinite-dimensional functional space using the plane-wave and Gaussian-type orbital bases, widely recognized in the context of DF…

    Submitted 13 June, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

    Comments: Accepted by ICML 2024 main poster

    Journal ref: International Conference on Machine Learning (ICML), 2024

  26. arXiv:2402.01203  [pdf, other]

    cs.LG cs.CV

    Neural Language of Thought Models

    Authors: Yi-Fu Wu, Minseung Lee, Sungjin Ahn

    Abstract: The Language of Thought Hypothesis suggests that human cognition operates on a structured, language-like system of mental representations. While neural language models can naturally benefit from the compositional structure inherently and explicitly expressed in language data, learning such representations from non-linguistic general observations, like images, remains a challenge. In this work, we…

    Submitted 16 April, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

    Comments: Accepted in ICLR 2024

  27. arXiv:2401.12920  [pdf, other]

    cs.AI

    Truck Parking Usage Prediction with Decomposed Graph Neural Networks

    Authors: Rei Tamaru, Yang Cheng, Steven Parker, Ernie Perry, Bin Ran, Soyoung Ahn

    Abstract: Truck parking on freight corridors faces various challenges, such as insufficient parking spaces and compliance with Hour-of-Service (HOS) regulations. These constraints often result in unauthorized parking practices, causing safety concerns. To enhance the safety of freight operations, providing accurate parking usage prediction proves to be a cost-effective solution. Despite the existing researc…

    Submitted 23 January, 2024; originally announced January 2024.

    Comments: 10 pages, 5 figures, 3 tables, Manuscript for IEEE Transactions on Intelligent Transportation Systems

  28. arXiv:2401.02644  [pdf, other]

    cs.LG cs.AI

    Simple Hierarchical Planning with Diffusion

    Authors: Chang Chen, Fei Deng, Kenji Kawaguchi, Caglar Gulcehre, Sungjin Ahn

    Abstract: Diffusion-based generative methods have proven effective in modeling trajectories with offline datasets. However, they often face computational challenges and can falter in generalization, especially in capturing temporal abstractions for long-horizon tasks. To overcome this, we introduce the Hierarchical Diffuser, a simple, fast, yet surprisingly effective planning method combining the advantages…

    Submitted 5 January, 2024; originally announced January 2024.

  29. arXiv:2312.16839  [pdf, other]

    cs.RO

    Similar but Different: A Survey of Ground Segmentation and Traversability Estimation for Terrestrial Robots

    Authors: Hyungtae Lim, Minho Oh, Seungjae Lee, Seunguk Ahn, Hyun Myung

    Abstract: With the increasing demand for mobile robots and autonomous vehicles, several approaches for long-term robot navigation have been proposed. Among these techniques, ground segmentation and traversability estimation play important roles in perception and path planning, respectively. Even though these two techniques appear similar, their objectives are different. Ground segmentation divides data into…

    Submitted 2 January, 2024; v1 submitted 28 December, 2023; originally announced December 2023.

    Comments: 10 pages, 8 figures

  30. arXiv:2312.14184  [pdf]

    cs.CL cs.AI cs.LG

    Large Language Models in Medical Term Classification and Unexpected Misalignment Between Response and Reasoning

    Authors: Xiaodan Zhang, Sandeep Vemulapalli, Nabasmita Talukdar, Sumyeong Ahn, Jiankun Wang, Han Meng, Sardar Mehtab Bin Murtaza, Aakash Ajay Dave, Dmitry Leshchiner, Dimitri F. Joseph, Martin Witteveen-Lane, Dave Chesla, Jiayu Zhou, Bin Chen

    Abstract: This study assesses the ability of state-of-the-art large language models (LLMs) including GPT-3.5, GPT-4, Falcon, and LLaMA 2 to identify patients with mild cognitive impairment (MCI) from discharge summaries and examines instances where the models' responses were misaligned with their reasoning. Utilizing the MIMIC-IV v2.2 database, we focused on a cohort aged 65 and older, verifying MCI diagnos…

    Submitted 19 December, 2023; originally announced December 2023.

  31. arXiv:2312.10042  [pdf]

    cs.LG cs.RO

    A Generic Stochastic Hybrid Car-following Model Based on Approximate Bayesian Computation

    Authors: Jiwan Jiang, Yang Zhou, Xin Wang, Soyoung Ahn

    Abstract: Car following (CF) models are fundamental to describing traffic dynamics. However, the CF behavior of human drivers is highly stochastic and nonlinear. As a result, identifying the best CF model has been challenging and controversial despite decades of research. Introduction of automated vehicles has further complicated this matter as their CF controllers remain proprietary, though their behavior…

    Submitted 26 November, 2023; originally announced December 2023.

    Comments: 25 pages, 6 figures

  32. arXiv:2312.05611  [pdf, other]

    cs.LG cs.AI

    Triplet Edge Attention for Algorithmic Reasoning

    Authors: Yeonjoon Jung, Sungsoo Ahn

    Abstract: This work investigates neural algorithmic reasoning to develop neural networks capable of learning from classical algorithms. The main challenge is to develop graph neural networks that are expressive enough to predict the given algorithm outputs while generalizing well to out-of-distribution data. In this work, we introduce a new graph neural network layer called Triplet Edge Attention (TEA), an…

    Submitted 9 December, 2023; originally announced December 2023.

  33. arXiv:2312.02230  [pdf, other]

    cs.LG cs.AI

    A Simple and Scalable Representation for Graph Generation

    Authors: Yunhui Jang, Seul Lee, Sungsoo Ahn

    Abstract: Recently, there has been a surge of interest in employing neural networks for graph generation, a fundamental statistical learning problem with critical applications like molecule design and community analysis. However, most approaches encounter significant limitations when generating large-scale graphs. This is due to their requirement to output the full adjacency matrices whose size grows quadra…

    Submitted 26 March, 2024; v1 submitted 3 December, 2023; originally announced December 2023.

    Comments: International Conference on Learning Representations (ICLR) 2024

  34. arXiv:2312.00836  [pdf, other]

    eess.IV cs.CV

    Heteroscedastic Uncertainty Estimation for Probabilistic Unsupervised Registration of Noisy Medical Images

    Authors: Xiaoran Zhang, Daniel H. Pak, Shawn S. Ahn, Xiaoxiao Li, Chenyu You, Lawrence Staib, Albert J. Sinusas, Alex Wong, James S. Duncan

    Abstract: This paper proposes a heteroscedastic uncertainty estimation framework for unsupervised medical image registration. Existing methods rely on objectives (e.g. mean-squared error) that assume a uniform noise level across the image, disregarding the heteroscedastic and input-dependent characteristics of noise distribution in real-world medical images. This further introduces noisy gradients due to un…

    Submitted 30 November, 2023; originally announced December 2023.

  35. arXiv:2311.11178  [pdf, other]

    cs.CV

    Active Prompt Learning in Vision Language Models

    Authors: Jihwan Bang, Sumyeong Ahn, Jae-Gil Lee

    Abstract: Pre-trained Vision Language Models (VLMs) have demonstrated notable progress in various zero-shot tasks, such as classification and retrieval. Despite their performance, because improving performance on new tasks requires task-specific knowledge, their adaptation is essential. While labels are needed for the adaptation, acquiring them is typically expensive. To overcome this challenge, active lear…

    Submitted 21 March, 2024; v1 submitted 18 November, 2023; originally announced November 2023.

    Comments: accepted at CVPR 2024

  36. arXiv:2311.09064  [pdf, other]

    cs.CV cs.LG

    Imagine the Unseen World: A Benchmark for Systematic Generalization in Visual World Models

    Authors: Yeongbin Kim, Gautam Singh, Junyeong Park, Caglar Gulcehre, Sungjin Ahn

    Abstract: Systematic compositionality, or the ability to adapt to novel situations by creating a mental model of the world using reusable pieces of knowledge, remains a significant challenge in machine learning. While there has been considerable progress in the language domain, efforts towards systematic visual imagination, or envisioning the dynamical implications of a visual observation, are in their infa…

    Submitted 15 November, 2023; originally announced November 2023.

    Comments: Published as a conference paper at NeurIPS 2023. The first two authors contributed equally. To download the benchmark, visit https://systematic-visual-imagination.github.io

  37. arXiv:2310.17668  [pdf, other]

    cs.LG

    Fine tuning Pre trained Models for Robustness Under Noisy Labels

    Authors: Sumyeong Ahn, Sihyeon Kim, Jongwoo Ko, Se-Young Yun

    Abstract: The presence of noisy labels in a training dataset can significantly impact the performance of machine learning models. To tackle this issue, researchers have explored methods for Learning with Noisy Labels to identify clean samples and reduce the influence of noisy labels. However, constraining the influence of a certain portion of the training dataset can result in a reduction in overall general…

    Submitted 24 October, 2023; originally announced October 2023.

    Comments: 10 pages (17 pages including supplementary)

    MSC Class: Computer Science; Artificial Intelligence

  38. arXiv:2310.13312  [pdf, other]

    cs.CL

    Exploring the Impact of Corpus Diversity on Financial Pretrained Language Models

    Authors: Jaeyoung Choe, Keonwoong Noh, Nayeon Kim, Seyun Ahn, Woohwan Jung

    Abstract: Over the past few years, various domain-specific pretrained language models (PLMs) have been proposed and have outperformed general-domain PLMs in specialized areas such as biomedical, scientific, and clinical domains. In addition, financial PLMs have been studied because of the high economic impact of financial data analysis. However, we found that financial PLMs were not pretrained on sufficient…

    Submitted 20 October, 2023; originally announced October 2023.

    Comments: Accepted to EMNLP 2023 (Findings)

  39. arXiv:2310.10054  [pdf, other]

    cs.CL cs.AI cs.LG

    NASH: A Simple Unified Framework of Structured Pruning for Accelerating Encoder-Decoder Language Models

    Authors: Jongwoo Ko, Seungjoon Park, Yujin Kim, Sumyeong Ahn, Du-Seong Chang, Euijai Ahn, Se-Young Yun

    Abstract: Structured pruning methods have proven effective in reducing the model size and accelerating inference speed in various network architectures such as Transformers. Despite the versatility of encoder-decoder models in numerous NLP tasks, the structured pruning methods on such models are relatively less explored compared to encoder-only models. In this study, we investigate the behavior of the struc…

    Submitted 16 October, 2023; originally announced October 2023.

    Comments: Findings of the Association for Computational Linguistics: EMNLP 2023

  40. arXiv:2310.07430  [pdf, other]

    cs.LG stat.ML

    Non-backtracking Graph Neural Networks

    Authors: Seonghyun Park, Narae Ryu, Gahee Kim, Dongyeop Woo, Se-Young Yun, Sungsoo Ahn

    Abstract: The celebrated message-passing updates for graph neural networks allow the representation of large-scale graphs with local and computationally tractable updates. However, the local updates suffer from backtracking, i.e., a message flows through the same edge twice and revisits the previously visited node. Since the number of message flows increases exponentially with the number of updates, the red…

    Submitted 11 October, 2023; originally announced October 2023.

  41. arXiv:2310.05191  [pdf, other]

    cs.CL

    FABRIC: Automated Scoring and Feedback Generation for Essays

    Authors: Jieun Han, Haneul Yoo, Junho Myung, Minsun Kim, Hyunseung Lim, Yoonsu Kim, Tak Yeon Lee, Hwajung Hong, Juho Kim, So-Yeon Ahn, Alice Oh

    Abstract: Automated essay scoring (AES) provides a useful tool for students and instructors in writing classes by generating essay scores in real-time. However, previous AES models do not provide more specific rubric-based scores nor feedback on how to improve the essays, which can be even more important than the overall scores for learning. We present FABRIC, a pipeline to help students and instructors in…

    Submitted 8 October, 2023; originally announced October 2023.

  42. arXiv:2310.03301  [pdf, other]

    cs.LG

    Learning Energy Decompositions for Partial Inference of GFlowNets

    Authors: Hyosoon Jang, Minsu Kim, Sungsoo Ahn

    Abstract: This paper studies generative flow networks (GFlowNets) to sample objects from the Boltzmann energy distribution via a sequence of actions. In particular, we focus on improving GFlowNet with partial inference: training flow functions with the evaluation of the intermediate states or transitions. To this end, the recently developed forward-looking GFlowNet reparameterizes the flow functions based o…

    Submitted 5 October, 2023; originally announced October 2023.

  43. arXiv:2310.02710  [pdf, other]

    cs.LG stat.ML

    Local Search GFlowNets

    Authors: Minsu Kim, Taeyoung Yun, Emmanuel Bengio, Dinghuai Zhang, Yoshua Bengio, Sungsoo Ahn, Jinkyoo Park

    Abstract: Generative Flow Networks (GFlowNets) are amortized sampling methods that learn a distribution over discrete objects proportional to their rewards. GFlowNets exhibit a remarkable ability to generate diverse samples, yet occasionally struggle to consistently produce samples with high rewards due to over-exploration on wide sample space. This paper proposes to train GFlowNets with local search, which…

    Submitted 22 March, 2024; v1 submitted 4 October, 2023; originally announced October 2023.

    Comments: ICLR 2024 (Spotlight paper), 18 pages, 17 figures

  44. arXiv:2309.16898  [pdf, other]

    cs.RO cs.CL cs.CV cs.HC

    A Sign Language Recognition System with Pepper, Lightweight-Transformer, and LLM

    Authors: JongYoon Lim, Inkyu Sa, Bruce MacDonald, Ho Seok Ahn

    Abstract: This research explores using lightweight deep neural network architectures to enable the humanoid robot Pepper to understand American Sign Language (ASL) and facilitate non-verbal human-robot interaction. First, we introduce a lightweight and efficient model for ASL understanding optimized for embedded systems, ensuring rapid sign recognition while conserving computational resources. Building upon…

    Submitted 28 September, 2023; originally announced September 2023.

  45. arXiv:2309.15284  [pdf]

    cs.LG

    A Physics Enhanced Residual Learning (PERL) Framework for Vehicle Trajectory Prediction

    Authors: Keke Long, Zihao Sheng, Haotian Shi, Xiaopeng Li, Sikai Chen, Sue Ahn

    Abstract: In vehicle trajectory prediction, physics models and data-driven models are two predominant methodologies. However, each approach presents its own set of challenges: physics models fall short in predictability, while data-driven models lack interpretability. Addressing these identified shortcomings, this paper proposes a novel framework, the Physics-Enhanced Residual Learning (PERL) model. PERL in…

    Submitted 21 March, 2024; v1 submitted 26 September, 2023; originally announced September 2023.

  46. arXiv:2309.13243   

    cs.CL

    ChEDDAR: Student-ChatGPT Dialogue in EFL Writing Education

    Authors: Jieun Han, Haneul Yoo, Junho Myung, Minsun Kim, Tak Yeon Lee, So-Yeon Ahn, Alice Oh

    Abstract: The integration of generative AI in education is expanding, yet empirical analyses of large-scale, real-world interactions between students and AI systems still remain limited. In this study, we present ChEDDAR, ChatGPT & EFL Learner's Dialogue Dataset As Revising an essay, which is collected from a semester-long longitudinal experiment involving 212 college students enrolled in English as Foreign…

    Submitted 20 March, 2024; v1 submitted 22 September, 2023; originally announced September 2023.

    Comments: The new version of this paper is on arXiv as arXiv:2403.08272

  47. arXiv:2308.16870  [pdf, other]

    cs.RO cs.AI eess.SY

    Learning Driver Models for Automated Vehicles via Knowledge Sharing and Personalization

    Authors: Wissam Kontar, Xinzhi Zhong, Soyoung Ahn

    Abstract: This paper describes a framework for learning Automated Vehicles (AVs) driver models via knowledge sharing between vehicles and personalization. The innate variability in the transportation system makes it exceptionally challenging to expose AVs to all possible driving scenarios during empirical experimentation or testing. Consequently, AVs could be blind to certain encounters that are deemed detr…

    Submitted 31 August, 2023; originally announced August 2023.

    Comments: 10 pages, 8 figures

  48. arXiv:2308.14466  [pdf, other]

    cs.CV

    Improving the performance of object detection by preserving label distribution

    Authors: Heewon Lee, Sangtae Ahn

    Abstract: Object detection is a task that performs position identification and label classification of objects in images or videos. The information obtained through this process plays an essential role in various tasks in the field of computer vision. In object detection, the data utilized for training and validation typically originate from public datasets that are well-balanced in terms of the number of o…

    Submitted 28 August, 2023; originally announced August 2023.

    Comments: Code is available at https://github.com/leeheewon-01/YOLOstratifiedKFold/tree/main

  49. arXiv:2308.07512  [pdf, other]

    cs.RO

    Seeing the Fruit for the Leaves: Robotically Mapping Apple Fruitlets in a Commercial Orchard

    Authors: Ans Qureshi, David Smith, Trevor Gee, Mahla Nejati, Jalil Shahabi, JongYoon Lim, Ho Seok Ahn, Ben McGuinness, Catherine Downes, Rahul Jangali, Kale Black, Hin Lim, Mike Duke, Bruce MacDonald, Henry Williams

    Abstract: Aotearoa New Zealand has a strong and growing apple industry but struggles to access workers to complete skilled, seasonal tasks such as thinning. To ensure effective thinning and make informed decisions on a per-tree basis, it is crucial to accurately measure the crop load of individual apple trees. However, this task poses challenges due to the dense foliage that hides the fruitlets within the t…

    Submitted 14 August, 2023; originally announced August 2023.

    Comments: Accepted at the International Conference on Intelligent Robots and Systems (IROS 2023)

  50. arXiv:2307.13991  [pdf, other]

    cs.RO cs.CV cs.LG

    METAVerse: Meta-Learning Traversability Cost Map for Off-Road Navigation

    Authors: Junwon Seo, Taekyung Kim, Seongyong Ahn, Kiho Kwak

    Abstract: Autonomous navigation in off-road conditions requires an accurate estimation of terrain traversability. However, traversability estimation in unstructured environments is subject to high uncertainty due to the variability of numerous factors that influence vehicle-terrain interaction. Consequently, it is challenging to obtain a generalizable model that can accurately predict traversability in a va…

    Submitted 4 March, 2024; v1 submitted 26 July, 2023; originally announced July 2023.

    Comments: Our video can be found at https://youtu.be/4rIAMM1ZKMo