Showing 1–50 of 193 results for author: Jung, S

Searching in archive cs.
  1. arXiv:2406.16994  [pdf, other]

    eess.SP cs.AI

    Quantum Multi-Agent Reinforcement Learning for Cooperative Mobile Access in Space-Air-Ground Integrated Networks

    Authors: Gyu Seon Kim, Yeryeong Cho, Jaehyun Chung, Soohyun Park, Soyi Jung, Zhu Han, Joongheon Kim

    Abstract: Achieving global space-air-ground integrated network (SAGIN) access only with CubeSats presents significant challenges such as the access sustainability limitations in specific regions (e.g., polar regions) and the energy efficiency limitations in CubeSats. To tackle these problems, high-altitude long-endurance unmanned aerial vehicles (HALE-UAVs) can complement these CubeSat shortcomings for prov…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: 17 pages, 22 figures

  2. Lesion-Aware Cross-Phase Attention Network for Renal Tumor Subtype Classification on Multi-Phase CT Scans

    Authors: Kwang-Hyun Uhm, Seung-Won Jung, Sung-Hoo Hong, Sung-Jea Ko

    Abstract: Multi-phase computed tomography (CT) has been widely used for the preoperative diagnosis of kidney cancer due to its non-invasive nature and ability to characterize renal lesions. However, since enhancement patterns of renal lesions across CT phases are different even for the same lesion type, the visual assessment by radiologists suffers from inter-observer variability in clinical practice. Altho…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: This article has been accepted for publication in Computers in Biology and Medicine

    Journal ref: Computers in Biology and Medicine, 108746, 2024

  3. arXiv:2405.10944  [pdf, other]

    physics.chem-ph cs.LG

    Probabilistic transfer learning methodology to expedite high fidelity simulation of reactive flows

    Authors: Bruno S. Soriano, Ki Sung Jung, Tarek Echekki, Jacqueline H. Chen, Mohammad Khalil

    Abstract: Reduced order models based on the transport of a lower dimensional manifold representation of the thermochemical state, such as Principal Component (PC) transport and Machine Learning (ML) techniques, have been developed to reduce the computational cost associated with the Direct Numerical Simulations (DNS) of reactive flows. Both PC transport and ML normally require an abundance of data to exhibi…

    Submitted 17 May, 2024; originally announced May 2024.

  4. arXiv:2404.18063  [pdf, other]

    cs.LG physics.flu-dyn

    Machine Learning Techniques for Data Reduction of CFD Applications

    Authors: Jaemoon Lee, Ki Sung Jung, Qian Gong, Xiao Li, Scott Klasky, Jacqueline Chen, Anand Rangarajan, Sanjay Ranka

    Abstract: We present an approach called guaranteed block autoencoder that leverages Tensor Correlations (GBATC) for reducing the spatiotemporal data generated by computational fluid dynamics (CFD) and other scientific applications. It uses a multidimensional block of tensors (spanning in space and time) for both input and output, capturing the spatiotemporal and interspecies relationship within a tensor. Th…

    Submitted 28 April, 2024; originally announced April 2024.

    Comments: 10 pages, 8 figures

  5. arXiv:2404.14664  [pdf, ps, other]

    cs.LG cs.AI

    Employing Layerwised Unsupervised Learning to Lessen Data and Loss Requirements in Forward-Forward Algorithms

    Authors: Taewook Hwang, Hyein Seo, Sangkeun Jung

    Abstract: Recent deep learning models such as ChatGPT utilizing the back-propagation algorithm have exhibited remarkable performance. However, the disparity between the biological brain processes and the back-propagation algorithm has been noted. The Forward-Forward algorithm, which trains deep learning models solely through the forward pass, has emerged to address this. Although the Forward-Forward algorit…

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: 8 pages, 6 figures

  6. arXiv:2404.06021  [pdf, other]

    cs.HC

    Combinational Nonuniform Timeslicing of Dynamic Networks

    Authors: Seokweon Jung, DongHwa Shin, Hyeon Jeon, Jinwook Seo

    Abstract: Dynamic networks represent the complex and evolving interrelationships between real-world entities. Given the scale and variability of these networks, finding an optimal slicing interval is essential for meaningful analysis. Nonuniform timeslicing, which adapts to density changes within the network, is drawing attention as a solution to this problem. In this research, we categorized existing algor…

    Submitted 9 April, 2024; originally announced April 2024.

    Comments: PacificVis2024 poster

  7. arXiv:2404.04819  [pdf, other]

    cs.CV

    Joint Reconstruction of 3D Human and Object via Contact-Based Refinement Transformer

    Authors: Hyeongjin Nam, Daniel Sungho Jung, Gyeongsik Moon, Kyoung Mu Lee

    Abstract: Human-object contact serves as a strong cue to understand how humans physically interact with objects. Nevertheless, it is not widely explored to utilize human-object contact information for the joint reconstruction of 3D human and object from a single image. In this work, we present a novel joint 3D human-object reconstruction method (CONTHO) that effectively exploits contact information between…

    Submitted 7 April, 2024; originally announced April 2024.

    Comments: Published at CVPR 2024, 19 pages including the supplementary material

  8. arXiv:2404.04656  [pdf, other]

    cs.LG cs.AI cs.CL

    Binary Classifier Optimization for Large Language Model Alignment

    Authors: Seungjae Jung, Gunsoo Han, Daniel Wontae Nam, Kyoung-Woon On

    Abstract: Aligning Large Language Models (LLMs) to human preferences through preference optimization has been crucial but labor-intensive, necessitating for each prompt a comparison of both a chosen and a rejected text completion by evaluators. Recently, Kahneman-Tversky Optimization (KTO) has demonstrated that LLMs can be aligned using merely binary "thumbs-up" or "thumbs-down" signals on each prompt-compl…

    Submitted 6 April, 2024; originally announced April 2024.

    Comments: 18 pages, 9 figures

  9. arXiv:2404.03991  [pdf, other]

    eess.IV cs.CV cs.LG

    Towards Efficient and Accurate CT Segmentation via Edge-Preserving Probabilistic Downsampling

    Authors: Shahzad Ali, Yu Rim Lee, Soo Young Park, Won Young Tak, Soon Ki Jung

    Abstract: Downsampling images and labels, often necessitated by limited resources or to expedite network training, leads to the loss of small objects and thin boundaries. This undermines the segmentation network's capacity to interpret images accurately and predict detailed labels, resulting in diminished performance compared to processing at original resolutions. This situation exemplifies the trade-off be…

    Submitted 5 April, 2024; originally announced April 2024.

    Comments: 5 pages (4 figures, 1 table); This work has been submitted to the IEEE Signal Processing Letters. Copyright may be transferred without notice, after which this version may no longer be accessible

  10. arXiv:2404.01954  [pdf, other]

    cs.CL cs.AI

    HyperCLOVA X Technical Report

    Authors: Kang Min Yoo, Jaegeun Han, Sookyo In, Heewon Jeon, Jisu Jeong, Jaewook Kang, Hyunwook Kim, Kyung-Min Kim, Munhyong Kim, Sungju Kim, Donghyun Kwak, Hanock Kwak, Se Jung Kwon, Bado Lee, Dongsoo Lee, Gichang Lee, Jooho Lee, Baeseong Park, Seongjin Shin, Joonsang Yu, Seolki Baek, Sumin Byeon, Eungsup Cho, Dooseok Choe, Jeesung Han, et al. (371 additional authors not shown)

    Abstract: We introduce HyperCLOVA X, a family of large language models (LLMs) tailored to the Korean language and culture, along with competitive capabilities in English, math, and coding. HyperCLOVA X was trained on a balanced mix of Korean, English, and code data, followed by instruction-tuning with high-quality human-annotated datasets while abiding by strict safety guidelines reflecting our commitment t…

    Submitted 13 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

    Comments: 44 pages; updated authors list and fixed author names

  11. arXiv:2404.01690  [pdf, other]

    cs.CV

    RefQSR: Reference-based Quantization for Image Super-Resolution Networks

    Authors: Hongjae Lee, Jun-Sang Yoo, Seung-Won Jung

    Abstract: Single image super-resolution (SISR) aims to reconstruct a high-resolution image from its low-resolution observation. Recent deep learning-based SISR models show high performance at the expense of increased computational costs, limiting their use in resource-constrained environments. As a promising solution for computationally efficient network design, network quantization has been extensively stu…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: Accepted by IEEE Transactions on Image Processing (TIP)

  12. Imaging radar and LiDAR image translation for 3-DOF extrinsic calibration

    Authors: Sangwoo Jung, Hyesu Jang, Minwoo Jung, Ayoung Kim, Myung-Hwan Jeon

    Abstract: The integration of sensor data is crucial in the field of robotics to take full advantage of the various sensors employed. One critical aspect of this integration is determining the extrinsic calibration parameters, such as the relative transformation, between each sensor. The use of data fusion between complementary sensors, such as radar and LiDAR, can provide significant benefits, particularly…

    Submitted 27 March, 2024; originally announced March 2024.

  13. arXiv:2403.17330  [pdf, other]

    cs.CV

    Staircase Localization for Autonomous Exploration in Urban Environments

    Authors: Jinrae Kim, Sunggoo Jung, Sung-Kyun Kim, Youdan Kim, Ali-akbar Agha-mohammadi

    Abstract: A staircase localization method is proposed for robots to explore urban environments autonomously. The proposed method employs a modular design in the form of a cascade pipeline consisting of three modules of stair detection, line segment detection, and stair localization modules. The stair detection module utilizes an object detection algorithm based on deep learning to generate a region of inter…

    Submitted 25 March, 2024; originally announced March 2024.

    Comments: 9 pages, 10 figures

  14. arXiv:2403.09193  [pdf, other]

    cs.CV cs.AI cs.LG q-bio.NC

    Are Vision Language Models Texture or Shape Biased and Can We Steer Them?

    Authors: Paul Gavrikov, Jovita Lukasik, Steffen Jung, Robert Geirhos, Bianca Lamm, Muhammad Jehanzeb Mirza, Margret Keuper, Janis Keuper

    Abstract: Vision language models (VLMs) have drastically changed the computer vision model landscape in only a few years, opening an exciting array of new applications from zero-shot image classification, over to image captioning, and visual question answering. Unlike pure vision models, they offer an intuitive way to access visual content through language prompting. The wide applicability of such models en…

    Submitted 14 March, 2024; originally announced March 2024.

  15. arXiv:2403.09055  [pdf, other]

    cs.CV

    StreamMultiDiffusion: Real-Time Interactive Generation with Region-Based Semantic Control

    Authors: Jaerin Lee, Daniel Sungho Jung, Kanggeon Lee, Kyoung Mu Lee

    Abstract: The enormous success of diffusion models in text-to-image synthesis has made them promising candidates for the next generation of end-user applications for image generation and editing. Previous works have focused on improving the usability of diffusion models by reducing the inference time or increasing user interactivity by allowing new, fine-grained controls such as region-based text prompts. H…

    Submitted 1 April, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

    Comments: 29 pages, 16 figures. v2: typos corrected, references added. Project page: https://jaerinlee.com/research/StreamMultiDiffusion

  16. arXiv:2403.08639  [pdf, other]

    cs.CV

    HIMap: HybrId Representation Learning for End-to-end Vectorized HD Map Construction

    Authors: Yi Zhou, Hui Zhang, Jiaqian Yu, Yifan Yang, Sangil Jung, Seung-In Park, ByungIn Yoo

    Abstract: Vectorized High-Definition (HD) map construction requires predictions of the category and point coordinates of map elements (e.g. road boundary, lane divider, pedestrian crossing, etc.). State-of-the-art methods are mainly based on point-level representation learning for regressing accurate point coordinates. However, this pipeline has limitations in obtaining element-level information and handlin…

    Submitted 26 March, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

    Comments: Accepted to CVPR 2024

  17. arXiv:2403.05093  [pdf, other]

    cs.CV eess.IV

    Spectrum Translation for Refinement of Image Generation (STIG) Based on Contrastive Learning and Spectral Filter Profile

    Authors: Seokjun Lee, Seung-Won Jung, Hyunseok Seo

    Abstract: Currently, image generation and synthesis have remarkably progressed with generative models. Despite photo-realistic results, intrinsic discrepancies are still observed in the frequency domain. The spectral discrepancy appeared not only in generative adversarial networks but in diffusion models. In this study, we propose a framework to effectively mitigate the disparity in frequency domain of the…

    Submitted 8 March, 2024; originally announced March 2024.

    Comments: Accepted to AAAI 2024

  18. arXiv:2402.09754  [pdf, other]

    stat.ML cs.LG math.ST

    Robust SVD Made Easy: A fast and reliable algorithm for large-scale data analysis

    Authors: Sangil Han, Kyoowon Kim, Sungkyu Jung

    Abstract: The singular value decomposition (SVD) is a crucial tool in machine learning and statistical data analysis. However, it is highly susceptible to outliers in the data matrix. Existing robust SVD algorithms often sacrifice speed for robustness or fail in the presence of only a few outliers. This study introduces an efficient algorithm, called Spherically Normalized SVD, for robust SVD approximation…

    Submitted 15 February, 2024; originally announced February 2024.

  19. arXiv:2402.05195  [pdf, other]

    cs.CV cs.CL

    $λ$-ECLIPSE: Multi-Concept Personalized Text-to-Image Diffusion Models by Leveraging CLIP Latent Space

    Authors: Maitreya Patel, Sangmin Jung, Chitta Baral, Yezhou Yang

    Abstract: Despite the recent advances in personalized text-to-image (P-T2I) generative models, it remains challenging to perform finetuning-free multi-subject-driven T2I in a resource-efficient manner. Predominantly, contemporary approaches, involving the training of Hypernetworks and Multimodal Large Language Models (MLLMs), require heavy computing resources that range from 600 to 12300 GPU hours of traini…

    Submitted 9 April, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

    Comments: Project page: https://eclipse-t2i.github.io/Lambda-ECLIPSE/

  20. arXiv:2401.13921  [pdf, other]

    eess.AS cs.SD

    Intelli-Z: Toward Intelligible Zero-Shot TTS

    Authors: Sunghee Jung, Won Jang, Jaesam Yoon, Bongwan Kim

    Abstract: Although numerous recent studies have suggested new frameworks for zero-shot TTS using large-scale, real-world data, studies that focus on the intelligibility of zero-shot TTS are relatively scarce. Zero-shot TTS demands additional efforts to ensure clear pronunciation and speech quality due to its inherent requirement of replacing a core parameter (speaker embedding or acoustic prompt) with a new…

    Submitted 24 January, 2024; originally announced January 2024.

  21. arXiv:2401.13146  [pdf, other]

    eess.AS cs.CL cs.SD

    Locality enhanced dynamic biasing and sampling strategies for contextual ASR

    Authors: Md Asif Jalal, Pablo Peso Parada, George Pavlidis, Vasileios Moschopoulos, Karthikeyan Saravanan, Chrysovalantis-Giorgos Kontoulis, Jisi Zhang, Anastasios Drosou, Gil Ho Lee, Jungin Lee, Seokyeong Jung

    Abstract: Automatic Speech Recognition (ASR) still face challenges when recognizing time-variant rare-phrases. Contextual biasing (CB) modules bias ASR model towards such contextually-relevant phrases. During training, a list of biasing phrases are selected from a large pool of phrases following a sampling strategy. In this work we firstly analyse different sampling strategies to provide insights into the t…

    Submitted 23 January, 2024; originally announced January 2024.

    Comments: Accepted for IEEE ASRU 2023

  22. arXiv:2401.12326  [pdf, other]

    cs.CL cs.AI

    Fine-tuning Large Language Models for Multigenerator, Multidomain, and Multilingual Machine-Generated Text Detection

    Authors: Feng Xiong, Thanet Markchom, Ziwei Zheng, Subin Jung, Varun Ojha, Huizhi Liang

    Abstract: SemEval-2024 Task 8 introduces the challenge of identifying machine-generated texts from diverse Large Language Models (LLMs) in various languages and domains. The task comprises three subtasks: binary classification in monolingual and multilingual (Subtask A), multi-class classification (Subtask B), and mixed text detection (Subtask C). This paper focuses on Subtask A & B. Each subtask is support…

    Submitted 22 January, 2024; originally announced January 2024.

  23. arXiv:2401.12085  [pdf, other]

    eess.AS cs.SD

    Consistency Based Unsupervised Self-training For ASR Personalisation

    Authors: Jisi Zhang, Vandana Rajan, Haaris Mehmood, David Tuckey, Pablo Peso Parada, Md Asif Jalal, Karthikeyan Saravanan, Gil Ho Lee, Jungin Lee, Seokyeong Jung

    Abstract: On-device Automatic Speech Recognition (ASR) models trained on speech data of a large population might underperform for individuals unseen during training. This is due to a domain shift between user data and the original training data, differed by user's speaking characteristics and environmental acoustic conditions. ASR personalisation is a solution that aims to exploit user data to improve model…

    Submitted 22 January, 2024; originally announced January 2024.

    Comments: Accepted for IEEE ASRU 2023

  24. arXiv:2401.04143  [pdf, other]

    cs.CV

    RHOBIN Challenge: Reconstruction of Human Object Interaction

    Authors: Xianghui Xie, Xi Wang, Nikos Athanasiou, Bharat Lal Bhatnagar, Chun-Hao P. Huang, Kaichun Mo, Hao Chen, Xia Jia, Zerui Zhang, Liangxian Cui, Xiao Lin, Bingqiao Qian, Jie Xiao, Wenfei Yang, Hyeongjin Nam, Daniel Sungho Jung, Kihoon Kim, Kyoung Mu Lee, Otmar Hilliges, Gerard Pons-Moll

    Abstract: Modeling the interaction between humans and objects has been an emerging research direction in recent years. Capturing human-object interaction is however a very challenging task due to heavy occlusion and complex dynamics, which requires understanding not only 3D human pose, and object pose but also the interaction between them. Reconstruction of 3D humans and objects has been two separate resear…

    Submitted 7 January, 2024; originally announced January 2024.

    Comments: 14 pages, 5 tables, 7 figure. Technical report of the CVPR'23 workshop: RHOBIN challenge (https://rhobin-challenge.github.io/)

  25. arXiv:2312.16016  [pdf, other]

    cs.RO

    V-STRONG: Visual Self-Supervised Traversability Learning for Off-road Navigation

    Authors: Sanghun Jung, JoonHo Lee, Xiangyun Meng, Byron Boots, Alexander Lambert

    Abstract: Reliable estimation of terrain traversability is critical for the successful deployment of autonomous systems in wild, outdoor environments. Given the lack of large-scale annotated datasets for off-road navigation, strictly-supervised learning approaches remain limited in their generalization ability. To this end, we introduce a novel, image-based self-supervised learning method for traversability…

    Submitted 15 March, 2024; v1 submitted 26 December, 2023; originally announced December 2023.

    Comments: ICRA 2024; 8 pages

  26. arXiv:2312.05548  [pdf, other]

    eess.IV cs.CV cs.LG

    A Unified Multi-Phase CT Synthesis and Classification Framework for Kidney Cancer Diagnosis with Incomplete Data

    Authors: Kwang-Hyun Uhm, Seung-Won Jung, Moon Hyung Choi, Sung-Hoo Hong, Sung-Jea Ko

    Abstract: Multi-phase CT is widely adopted for the diagnosis of kidney cancer due to the complementary information among phases. However, the complete set of multi-phase CT is often not available in practical clinical applications. In recent years, there have been some studies to generate the missing modality image from the available data. Nevertheless, the generated images are not guaranteed to be effectiv…

    Submitted 9 December, 2023; originally announced December 2023.

    Comments: This article has been accepted for publication in IEEE Journal of Biomedical and Health Informatics

    Journal ref: JBHI, 2022

  27. arXiv:2312.05528  [pdf, other]

    eess.IV cs.CV

    Exploring 3D U-Net Training Configurations and Post-Processing Strategies for the MICCAI 2023 Kidney and Tumor Segmentation Challenge

    Authors: Kwang-Hyun Uhm, Hyunjun Cho, Zhixin Xu, Seohoon Lim, Seung-Won Jung, Sung-Hoo Hong, Sung-Jea Ko

    Abstract: In 2023, it is estimated that 81,800 kidney cancer cases will be newly diagnosed, and 14,890 people will die from this cancer in the United States. Preoperative dynamic contrast-enhanced abdominal computed tomography (CT) is often used for detecting lesions. However, there exists inter-observer variability due to subtle differences in the imaging features of kidney and kidney tumors. In this paper…

    Submitted 9 December, 2023; originally announced December 2023.

    Comments: MICCAI 2023, KITS 2023 challenge 2nd place

  28. arXiv:2312.01638  [pdf, other]

    eess.IV cs.CV

    J-Net: Improved U-Net for Terahertz Image Super-Resolution

    Authors: Woon-Ha Yeo, Seung-Hwan Jung, Seung Jae Oh, Inhee Maeng, Eui Su Lee, Han-Cheol Ryu

    Abstract: Terahertz (THz) waves are electromagnetic waves in the 0.1 to 10 THz frequency range, and THz imaging is utilized in a range of applications, including security inspections, biomedical fields, and the non-destructive examination of materials. However, THz images have low resolution due to the long wavelength of THz waves. Therefore, improving the resolution of THz images is one of the current hot…

    Submitted 4 December, 2023; originally announced December 2023.

  29. arXiv:2312.00356  [pdf, other]

    physics.chem-ph cs.LG

    Transfer learning for predicting source terms of principal component transport in chemically reactive flow

    Authors: Ki Sung Jung, Tarek Echekki, Jacqueline H. Chen, Mohammad Khalil

    Abstract: The objective of this study is to evaluate whether the number of requisite training samples can be reduced with the use of various transfer learning models for predicting, for example, the chemical source terms of the data-driven reduced-order model that represents the homogeneous ignition process of a hydrogen/air mixture. Principal component analysis is applied to reduce the dimensionality of th…

    Submitted 1 December, 2023; originally announced December 2023.

    Comments: 41 pages, 14 figures

  30. arXiv:2311.15683  [pdf]

    eess.AS cs.SD eess.SP

    Ultrasensitive Textile Strain Sensors Redefine Wearable Silent Speech Interfaces with High Machine Learning Efficiency

    Authors: Chenyu Tang, Muzi Xu, Wentian Yi, Zibo Zhang, Edoardo Occhipinti, Chaoqun Dong, Dafydd Ravenscroft, Sung-Min Jung, Sanghyo Lee, Shuo Gao, Jong Min Kim, Luigi G. Occhipinti

    Abstract: Our research presents a wearable Silent Speech Interface (SSI) technology that excels in device comfort, time-energy efficiency, and speech decoding accuracy for real-world use. We developed a biocompatible, durable textile choker with an embedded graphene-based strain sensor, capable of accurately detecting subtle throat movements. This sensor, surpassing other strain sensors in sensitivity by 42…

    Submitted 7 December, 2023; v1 submitted 27 November, 2023; originally announced November 2023.

    Comments: 5 figures in the article; 11 figures and 4 tables in supplementary information

    Journal ref: npj Flexible Electronics (2024)

  31. arXiv:2311.13338  [pdf, other]

    cs.CV

    High-Quality Face Caricature via Style Translation

    Authors: Lamyanba Laishram, Muhammad Shaheryar, Jong Taek Lee, Soon Ki Jung

    Abstract: Caricature is an exaggerated form of artistic portraiture that accentuates unique yet subtle characteristics of human faces. Recently, advancements in deep end-to-end techniques have yielded encouraging outcomes in capturing both style and elevated exaggerations in creating face caricatures. Most of these approaches tend to produce cartoon-like results that could be more practical for real-world a…

    Submitted 22 November, 2023; originally announced November 2023.

    Comments: 14 pages, 21 figures

  32. arXiv:2311.12805  [pdf, other]

    cs.CV cs.AI

    DeepCompass: AI-driven Location-Orientation Synchronization for Navigating Platforms

    Authors: Jihun Lee, SP Choi, Bumsoo Kang, Hyekyoung Seok, Hyoungseok Ahn, Sanghee Jung

    Abstract: In current navigating platforms, the user's orientation is typically estimated based on the difference between two consecutive locations. In other words, the orientation cannot be identified until the second location is taken. This asynchronous location-orientation identification often leads to our real-life question: Why does my navigator tell the wrong direction of my car at the beginning? We pr…

    Submitted 15 September, 2023; originally announced November 2023.

    Comments: 7page with 3 supplemental pages

  33. arXiv:2311.10922  [pdf, other]

    cs.AI cs.CL cs.DB cs.IR

    Explainable Product Classification for Customs

    Authors: Eunji Lee, Sihyeon Kim, Sundong Kim, Soyeon Jung, Heeja Kim, Meeyoung Cha

    Abstract: The task of assigning internationally accepted commodity codes (aka HS codes) to traded goods is a critical function of customs offices. Like court decisions made by judges, this task follows the doctrine of precedent and can be nontrivial even for experienced officers. Together with the Korea Customs Service (KCS), we propose a first-ever explainable decision supporting model that suggests the mo…

    Submitted 17 November, 2023; originally announced November 2023.

    Comments: 24 pages, Accepted to ACM Transactions on Intelligent Systems and Technology

  34. arXiv:2311.05407  [pdf]

    physics.comp-ph cs.LG physics.chem-ph

    Data Distillation for Neural Network Potentials toward Foundational Dataset

    Authors: Gang Seob Jung, Sangkeun Lee, Jong Youl Choi

    Abstract: Machine learning (ML) techniques and atomistic modeling have rapidly transformed materials design and discovery. Specifically, generative models can swiftly propose promising materials for targeted applications. However, the predicted properties of materials through the generative models often do not match with calculated properties through ab initio calculations. This discrepancy can arise becaus…

    Submitted 9 November, 2023; originally announced November 2023.

  35. arXiv:2310.19583  [pdf, other]

    cs.CV cs.LG

    GC-MVSNet: Multi-View, Multi-Scale, Geometrically-Consistent Multi-View Stereo

    Authors: Vibhas K. Vats, Sripad Joshi, David J. Crandall, Md. Alimoor Reza, Soon-heung Jung

    Abstract: Traditional multi-view stereo (MVS) methods rely heavily on photometric and geometric consistency constraints, but newer machine learning-based MVS methods check geometric consistency across multiple source views only as a post-processing step. In this paper, we present a novel approach that explicitly encourages geometric consistency of reference view depth maps across multiple source views at di…

    Submitted 21 December, 2023; v1 submitted 30 October, 2023; originally announced October 2023.

    Comments: Accepted in WACV 2024 Link: https://openaccess.thecvf.com/content/WACV2024/html/Vats_GC-MVSNet_Multi-View_Multi-Scale_Geometrically-Consistent_Multi-View_Stereo_WACV_2024_paper.html

    Journal ref: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2024

  36. arXiv:2309.15314  [pdf]

    physics.med-ph cs.CV

    Conversion of single-energy computed tomography to parametric maps of dual-energy computed tomography using convolutional neural network

    Authors: Sangwook Kim, Jimin Lee, Jungye Kim, Bitbyeol Kim, Chang Heon Choi, Seongmoon Jung

    Abstract: Objectives: We propose a deep learning (DL) multi-task learning framework using convolutional neural network (CNN) for a direct conversion of single-energy CT (SECT) to three different parametric maps of dual-energy CT (DECT): Virtual-monochromatic image (VMI), effective atomic number (EAN), and relative electron density (RED). Methods: We propose VMI-Net for conversion of SECT to 70, 120, and 2…

    Submitted 26 September, 2023; originally announced September 2023.

    Comments: 29 pages, 17 figures

  37. arXiv:2309.13523  [pdf, other]

    cs.CV

    LiDAR-UDA: Self-ensembling Through Time for Unsupervised LiDAR Domain Adaptation

    Authors: Amirreza Shaban, JoonHo Lee, Sanghun Jung, Xiangyun Meng, Byron Boots

    Abstract: We introduce LiDAR-UDA, a novel two-stage self-training-based Unsupervised Domain Adaptation (UDA) method for LiDAR segmentation. Existing self-training methods use a model trained on labeled source data to generate pseudo labels for target data and refine the predictions via fine-tuning the network on the pseudo labels. These methods suffer from domain shifts caused by different LiDAR sensor conf…

    Submitted 23 September, 2023; originally announced September 2023.

    Comments: Accepted ICCV 2023 (Oral)

  38. arXiv:2309.13457  [pdf, other]

    cs.LG cs.CV physics.comp-ph physics.flu-dyn

    Turbulence in Focus: Benchmarking Scaling Behavior of 3D Volumetric Super-Resolution with BLASTNet 2.0 Data

    Authors: Wai Tong Chung, Bassem Akoush, Pushan Sharma, Alex Tamkin, Ki Sung Jung, Jacqueline H. Chen, Jack Guo, Davy Brouzet, Mohsen Talei, Bruno Savard, Alexei Y. Poludnenko, Matthias Ihme

    Abstract: Analysis of compressible turbulent flows is essential for applications related to propulsion, energy generation, and the environment. Here, we present BLASTNet 2.0, a 2.2 TB network-of-datasets containing 744 full-domain samples from 34 high-fidelity direct numerical simulations, which addresses the current limited availability of 3D high-fidelity reacting and non-reacting compressible turbulent f…

    Submitted 27 October, 2023; v1 submitted 23 September, 2023; originally announced September 2023.

    Comments: Accepted in Adv. in Neural Information Processing Systems 36 (NeurIPS 2023). Link: https://nips.cc/virtual/2023/poster/73433 . 55 pages, 21 figures. Keywords: Super-resolution, 3D, Neural Scaling, Physics-informed Loss, Computational Fluid Dynamics, Partial Differential Equations, Turbulent Reacting Flows, Direct Numerical Simulation, Fluid Mechanics, Combustion, Computer Vision

  39. arXiv:2309.01943  [pdf, other]

    cs.CV

    Extract-and-Adaptation Network for 3D Interacting Hand Mesh Recovery

    Authors: JoonKyu Park, Daniel Sungho Jung, Gyeongsik Moon, Kyoung Mu Lee

    Abstract: Understanding how two hands interact with each other is a key component of accurate 3D interacting hand mesh recovery. However, recent Transformer-based methods struggle to learn the interaction between two hands as they directly utilize two hand features as input tokens, which results in distant token problem. The distant token problem represents that input tokens are in heterogeneous spaces, lea…

    Submitted 5 September, 2023; originally announced September 2023.

    Comments: Accepted at ICCVW 2023

  40. arXiv:2308.10557  [pdf, other]

    cs.CV

    Local Spherical Harmonics Improve Skeleton-Based Hand Action Recognition

    Authors: Katharina Prasse, Steffen Jung, Yuxuan Zhou, Margret Keuper

    Abstract: Hand action recognition is essential. Communication, human-robot interactions, and gesture control are dependent on it. Skeleton-based action recognition traditionally includes hands, which belong to the classes which remain challenging to correctly recognize to date. We propose a method specifically designed for hand action recognition which uses relative angular embeddings and local Spherical Ha…

    Submitted 14 November, 2023; v1 submitted 21 August, 2023; originally announced August 2023.

  41. arXiv:2308.07728  [pdf, other]

    cs.LG cs.CV

    Domain-Aware Fine-Tuning: Enhancing Neural Network Adaptability

    Authors: Seokhyeon Ha, Sunbeom Jung, Jungwoo Lee

    Abstract: Fine-tuning pre-trained neural network models has become a widely adopted approach across various domains. However, it can lead to the distortion of pre-trained feature extractors that already possess strong generalization capabilities. Mitigating feature distortion during adaptation to new target domains is crucial. Recent studies have shown promising results in handling feature distortion by ali…

    Submitted 26 March, 2024; v1 submitted 15 August, 2023; originally announced August 2023.

  42. arXiv:2308.06554  [pdf, other]

    cs.CV

    Cyclic Test-Time Adaptation on Monocular Video for 3D Human Mesh Reconstruction

    Authors: Hyeongjin Nam, Daniel Sungho Jung, Yeonguk Oh, Kyoung Mu Lee

    Abstract: Despite recent advances in 3D human mesh reconstruction, domain gap between training and test data is still a major challenge. Several prior works tackle the domain gap problem via test-time adaptation that fine-tunes a network relying on 2D evidence (e.g., 2D human keypoints) from test images. However, the high reliance on 2D evidence during adaptation causes two major issues. First, 2D evidence…

    Submitted 12 August, 2023; originally announced August 2023.

    Comments: Published at ICCV 2023, 16 pages including the supplementary material

  43. arXiv:2308.01519  [pdf, other]

    cs.MA cs.AI

    Quantum Multi-Agent Reinforcement Learning for Autonomous Mobility Cooperation

    Authors: Soohyun Park, Jae Pyoung Kim, Chanyoung Park, Soyi Jung, Joongheon Kim

    Abstract: For Industry 4.0 Revolution, cooperative autonomous mobility systems are widely used based on multi-agent reinforcement learning (MARL). However, the MARL-based algorithms suffer from huge parameter utilization and convergence difficulties with many agents. To tackle these problems, a quantum MARL (QMARL) algorithm based on the concept of actor-critic network is proposed, which is beneficial in te…

    Submitted 2 August, 2023; originally announced August 2023.

    Comments: 7 pages, 3 figures, 2 tables

  44. arXiv:2307.13343  [pdf, other]

    eess.AS cs.CR cs.SD

    On-Device Speaker Anonymization of Acoustic Embeddings for ASR based on Flexible Location Gradient Reversal Layer

    Authors: Md Asif Jalal, Pablo Peso Parada, Jisi Zhang, Karthikeyan Saravanan, Mete Ozay, Myoungji Han, Jung In Lee, Seokyeong Jung

    Abstract: Smart devices serviced by large-scale AI models necessitates user data transfer to the cloud for inference. For speech applications, this means transferring private user information, e.g., speaker identity. Our paper proposes a privacy-enhancing framework that targets speaker identity anonymization while preserving speech recognition accuracy for our downstream task - Automatic Speech Recognition…

    Submitted 25 July, 2023; originally announced July 2023.

    Comments: Proceedings of INTERSPEECH 2023

  45. arXiv:2307.09711  [pdf, other]

    cs.AI

    Two Tales of Platoon Intelligence for Autonomous Mobility Control: Enabling Deep Learning Recipes

    Authors: Soohyun Park, Haemin Lee, Chanyoung Park, Soyi Jung, Minseok Choi, Joongheon Kim

    Abstract: This paper presents the deep learning-based recent achievements to resolve the problem of autonomous mobility control and efficient resource management of autonomous vehicles and UAVs, i.e., (i) multi-agent reinforcement learning (MARL), and (ii) neural Myerson auction. Representatively, communication network (CommNet), which is one of the most popular MARL algorithms, is introduced to enable mult…

    Submitted 18 July, 2023; originally announced July 2023.

    Comments: 8 pages, 3 figures

  46. arXiv:2307.08263  [pdf, other]

    cs.CV

    Hierarchical Spatiotemporal Transformers for Video Object Segmentation

    Authors: Jun-Sang Yoo, Hongjae Lee, Seung-Won Jung

    Abstract: This paper presents a novel framework called HST for semi-supervised video object segmentation (VOS). HST extracts image and video features using the latest Swin Transformer and Video Swin Transformer to inherit their inductive bias for the spatiotemporal locality, which is essential for temporally coherent VOS. To take full advantage of the image and video features, HST casts image and video feat…

    Submitted 17 July, 2023; originally announced July 2023.

  47. arXiv:2307.05977  [pdf, other]

    cs.CV cs.AI cs.LG

    Towards Safe Self-Distillation of Internet-Scale Text-to-Image Diffusion Models

    Authors: Sanghyun Kim, Seohyeon Jung, Balhae Kim, Moonseok Choi, Jinwoo Shin, Juho Lee

    Abstract: Large-scale image generation models, with impressive quality made possible by the vast amount of data available on the Internet, raise social concerns that these models may generate harmful or copyrighted content. The biases and harmfulness arise throughout the entire training process and are hard to completely remove, which have become significant hurdles to the safe deployment of these models. I…

    Submitted 12 July, 2023; originally announced July 2023.

    Comments: 17 pages, 13 figures, ICML 2023 Workshop on Challenges in Deployable Generative AI

  48. arXiv:2307.05016  [pdf, other]

    cs.CV cs.RO

    TRansPose: Large-Scale Multispectral Dataset for Transparent Object

    Authors: Jeongyun Kim, Myung-Hwan Jeon, Sangwoo Jung, Wooseong Yang, Minwoo Jung, Jaeho Shin, Ayoung Kim

    Abstract: Transparent objects are encountered frequently in our daily lives, yet recognizing them poses challenges for conventional vision sensors due to their unique material properties, not being well perceived from RGB or depth cameras. Overcoming this limitation, thermal infrared cameras have emerged as a solution, offering improved visibility and shape information for transparent objects. In this paper…

    Submitted 10 November, 2023; v1 submitted 11 July, 2023; originally announced July 2023.

    Comments: The International Journal of Robotics Research (IJRR)

  49. arXiv:2306.10989  [pdf, other]

    cs.LG

    Scaling of Class-wise Training Losses for Post-hoc Calibration

    Authors: Seungjin Jung, Seungmo Seo, Yonghyun Jeong, Jongwon Choi

    Abstract: The class-wise training losses often diverge as a result of the various levels of intra-class and inter-class appearance variation, and we find that the diverging class-wise training losses cause the uncalibrated prediction with its reliability. To resolve the issue, we propose a new calibration method to synchronize the class-wise training losses. We design a new training loss to alleviate the va…

    Submitted 19 June, 2023; originally announced June 2023.

    Comments: Published at ICML 2023. Camera ready version

  50. arXiv:2306.09382  [pdf, ps, other]

    cs.SD cs.LG cs.MM eess.AS

    Sound Demixing Challenge 2023 Music Demixing Track Technical Report: TFC-TDF-UNet v3

    Authors: Minseok Kim, Jun Hyung Lee, Soonyoung Jung

    Abstract: In this report, we present our award-winning solutions for the Music Demixing Track of Sound Demixing Challenge 2023. First, we propose TFC-TDF-UNet v3, a time-efficient music source separation model that achieves state-of-the-art results on the MUSDB benchmark. We then give full details regarding our solutions for each Leaderboard, including a loss masking approach for noise-robust training. Code…

    Submitted 21 July, 2023; v1 submitted 15 June, 2023; originally announced June 2023.

    Comments: 5 pages, 4 tables