
Showing 101–150 of 2,932 results for author: Kim, M

  1. A Comparative Analysis of Poetry Reading Audio: Singing, Narrating, or Somewhere In Between?

    Authors: Kahyun Choi, Minje Kim

    Abstract: This paper provides a computational analysis of poetry reading audio signals at a large scale to unveil the musicality within professionally-read poems. Although the acoustic characteristics of other types of spoken language have been extensively studied, most of the literature is limited to narrative speech or singing voice, discussing how different they are from each other. In this work, we deve…

    Submitted 31 March, 2024; originally announced April 2024.

    Journal ref: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2024, pp. 1296-1300

  2. arXiv:2404.00678  [pdf, other]

    cs.CV cs.GR

    OmniSDF: Scene Reconstruction using Omnidirectional Signed Distance Functions and Adaptive Binoctrees

    Authors: Hakyeong Kim, Andreas Meuleman, Hyeonjoong Jang, James Tompkin, Min H. Kim

    Abstract: We present a method to reconstruct indoor and outdoor static scene geometry and appearance from an omnidirectional video moving in a small circular sweep. This setting is challenging because of the small baseline and large depth ranges, making it difficult to find ray crossings. To better constrain the optimization, we estimate geometry as a signed distance field within a spherical binoctree data…

    Submitted 31 March, 2024; originally announced April 2024.

    Journal ref: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024

  3. arXiv:2404.00676  [pdf, other]

    cs.CV cs.GR

    OmniLocalRF: Omnidirectional Local Radiance Fields from Dynamic Videos

    Authors: Dongyoung Choi, Hyeonjoong Jang, Min H. Kim

    Abstract: Omnidirectional cameras are extensively used in various applications to provide a wide field of vision. However, they face a challenge in synthesizing novel views due to the inevitable presence of dynamic objects, including the photographer, in their wide field of view. In this paper, we introduce a new approach called Omnidirectional Local Radiance Fields (OmniLocalRF) that can render static-only…

    Submitted 31 March, 2024; originally announced April 2024.

    Journal ref: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024

  4. arXiv:2403.20225  [pdf, other]

    cs.CV

    MTMMC: A Large-Scale Real-World Multi-Modal Camera Tracking Benchmark

    Authors: Sanghyun Woo, Kwanyong Park, Inkyu Shin, Myungchul Kim, In So Kweon

    Abstract: Multi-target multi-camera tracking is a crucial task that involves identifying and tracking individuals over time using video streams from multiple cameras. This task has practical applications in various fields, such as visual surveillance, crowd behavior analysis, and anomaly detection. However, due to the difficulty and cost of collecting and labeling data, existing datasets for this task are e…

    Submitted 29 March, 2024; originally announced March 2024.

    Comments: Accepted on CVPR 2024

  5. arXiv:2403.19904  [pdf, other]

    cs.CV

    Fully Geometric Panoramic Localization

    Authors: Junho Kim, Jiwon Jeong, Young Min Kim

    Abstract: We introduce a lightweight and accurate localization method that only utilizes the geometry of 2D-3D lines. Given a pre-captured 3D map, our approach localizes a panorama image, taking advantage of the holistic 360 view. The system mitigates potential privacy breaches or domain discrepancies by avoiding trained or hand-crafted visual descriptors. However, as lines alone can be ambiguous, we expres…

    Submitted 28 March, 2024; originally announced March 2024.

    Comments: Accepted to CVPR 2024

  6. arXiv:2403.19132  [pdf, ps, other]

    eess.SP

    Meta-Heuristic Fronthaul Bit Allocation for Cell-free Massive MIMO Systems

    Authors: Minje Kim, In-soo Kim, Junil Choi

    Abstract: Limited capacity of fronthaul links in a cell-free massive multiple-input multiple-output (MIMO) system can cause quantization errors at a central processing unit (CPU) during data transmission, complicating the centralized rate optimization problem. Addressing this challenge, we propose a harmony search (HS)-based algorithm that renders the combinatorial non-convex problem tractable. One of the d…

    Submitted 28 March, 2024; originally announced March 2024.

    Comments: 16 pages, 13 figures, accepted to IEEE Transactions on Wireless Communications (TWC)

  7. arXiv:2403.18992  [pdf]

    eess.IV

    Tractography with T1-weighted MRI and associated anatomical constraints on clinical quality diffusion MRI

    Authors: Tian Yu, Yunhe Li, Michael E. Kim, Chenyu Gao, Qi Yang, Leon Y. Cai, Susane M. Resnick, Lori L. Beason-Held, Daniel C. Moyer, Kurt G. Schilling, Bennett A. Landman

    Abstract: Diffusion MRI (dMRI) streamline tractography, the gold standard for in vivo estimation of brain white matter (WM) pathways, has long been considered indicative of macroscopic relationships with WM microstructure. However, recent advances in tractography demonstrated that convolutional recurrent neural networks (CoRNN) trained with a teacher-student framework have the ability to learn and propagate…

    Submitted 27 March, 2024; originally announced March 2024.

  8. arXiv:2403.16167  [pdf, other]

    cs.CV cs.CL

    Exploiting Semantic Reconstruction to Mitigate Hallucinations in Vision-Language Models

    Authors: Minchan Kim, Minyeong Kim, Junik Bae, Suhwan Choi, Sungkyung Kim, Buru Chang

    Abstract: Hallucinations in vision-language models pose a significant challenge to their reliability, particularly in the generation of long captions. Current methods fall short of accurately identifying and mitigating these hallucinations. To address this issue, we introduce ESREAL, a novel unsupervised learning framework designed to suppress the generation of hallucinations through accurate localization a…

    Submitted 5 May, 2024; v1 submitted 24 March, 2024; originally announced March 2024.

  9. arXiv:2403.16158  [pdf, other]

    cs.CL

    Korean Bio-Medical Corpus (KBMC) for Medical Named Entity Recognition

    Authors: Sungjoo Byun, Jiseung Hong, Sumin Park, Dongjun Jang, Jean Seo, Minseok Kim, Chaeyoung Oh, Hyopil Shin

    Abstract: Named Entity Recognition (NER) plays a pivotal role in medical Natural Language Processing (NLP). Yet, there has not been an open-source medical NER dataset specifically for the Korean language. To address this, we utilized ChatGPT to assist in constructing the KBMC (Korean Bio-Medical Corpus), which we are now presenting to the public. With the KBMC dataset, we noticed an impressive 20% increase…

    Submitted 24 March, 2024; originally announced March 2024.

    Journal ref: LREC-COLING 2024

  10. arXiv:2403.14852  [pdf, other]

    cs.CV

    KeyPoint Relative Position Encoding for Face Recognition

    Authors: Minchul Kim, Yiyang Su, Feng Liu, Anil Jain, Xiaoming Liu

    Abstract: In this paper, we address the challenge of making ViT models more robust to unseen affine transformations. Such robustness becomes useful in various recognition tasks such as face recognition when image alignment failures occur. We propose a novel method called KP-RPE, which leverages key points (e.g., facial landmarks) to make ViT more resilient to scale, translation, and pose variations. We begin…

    Submitted 21 March, 2024; originally announced March 2024.

    Comments: To appear in CVPR2024

  11. PECI-Net: Bolus segmentation from video fluoroscopic swallowing study images using preprocessing ensemble and cascaded inference

    Authors: Dougho Park, Younghun Kim, Harim Kang, Junmyeoung Lee, Jinyoung Choi, Taeyeon Kim, Sangeok Lee, Seokil Son, Minsol Kim, Injung Kim

    Abstract: Bolus segmentation is crucial for the automated detection of swallowing disorders in videofluoroscopic swallowing studies (VFSS). However, it is difficult for the model to accurately segment a bolus region in a VFSS image because VFSS images are translucent, have low contrast and unclear region boundaries, and lack color information. To overcome these challenges, we propose PECI-Net, a network arc…

    Submitted 21 March, 2024; originally announced March 2024.

    Comments: 20 pages, 8 figures

    Journal ref: Computers in Biology and Medicine (2024)

  12. arXiv:2403.12862  [pdf, other]

    cs.CL

    Epistemology of Language Models: Do Language Models Have Holistic Knowledge?

    Authors: Minsu Kim, James Thorne

    Abstract: This paper investigates the inherent knowledge in language models from the perspective of epistemological holism. The purpose of this paper is to explore whether LLMs exhibit characteristics consistent with epistemological holism. These characteristics suggest that core knowledge, such as general scientific knowledge, each plays a specific role, serving as the foundation of our knowledge system an…

    Submitted 19 March, 2024; originally announced March 2024.

  13. arXiv:2403.12794  [pdf, other]

    physics.atom-ph physics.optics

    Optical Atomic Clock Interrogation Via an Integrated Spiral Cavity Laser

    Authors: William Loh, David Reens, Dave Kharas, Alkesh Sumant, Connor Belanger, Ryan T. Maxson, Alexander Medeiros, William Setzer, Dodd Gray, Kyle DeBry, Colin D. Bruzewicz, Jason Plant, John Liddell, Gavin N. West, Sagar Doshi, Matthew Roychowdhury, May Kim, Danielle Braje, Paul W. Juodawlkis, John Chiaverini, Robert McConnell

    Abstract: Optical atomic clocks have demonstrated revolutionary advances in precision timekeeping, but their applicability to the real world is critically dependent on whether such clocks can operate outside a laboratory setting. The challenge to clock portability stems from the many obstacles not only in miniaturizing the underlying components of the clock $-$ namely the ultrastable laser, the frequency co…

    Submitted 19 March, 2024; originally announced March 2024.

  14. arXiv:2403.11472  [pdf, other]

    cs.LG cs.AR cs.DB

    Accelerating String-Key Learned Index Structures via Memoization-based Incremental Training

    Authors: Minsu Kim, Jinwoo Hwang, Guseul Heo, Seiyeon Cho, Divya Mahajan, Jongse Park

    Abstract: Learned indexes use machine learning models to learn the mappings between keys and their corresponding positions in key-value indexes. These indexes use the mapping information as training data. Learned indexes require frequent retrainings of their models to incorporate the changes introduced by update queries. To efficiently retrain the models, existing learned index systems often harness a linea…

    Submitted 18 March, 2024; originally announced March 2024.

    Comments: Accepted at VLDB '24; 12 pages + 2 pages (ref), 18 figures, 2 tables

  15. arXiv:2403.11399  [pdf, other]

    cs.CL

    X-LLaVA: Optimizing Bilingual Large Vision-Language Alignment

    Authors: Dongjae Shin, Hyeonseok Lim, Inho Won, Changsu Choi, Minjun Kim, Seungwoo Song, Hangyeol Yoo, Sangmin Kim, Kyungtae Lim

    Abstract: The impressive development of large language models (LLMs) is expanding into the realm of large multimodal models (LMMs), which incorporate multiple types of data beyond text. However, the nature of multimodal models leads to significant expenses in the creation of training data. Furthermore, constructing multilingual data for LMMs presents its own set of challenges due to language diversity and c…

    Submitted 1 April, 2024; v1 submitted 17 March, 2024; originally announced March 2024.

  16. arXiv:2403.11382  [pdf, other]

    cond-mat.str-el

    Topological singularity-induced self-energy in strongly correlated fermion systems

    Authors: Byungkyun Kang, Zachary Brown, Myoung-Hwan Kim, Hyunsoo Kim, Chul Hong Park

    Abstract: Employing ab initio many-body perturbation theory combined with dynamical mean field theory, we discovered that in strongly correlated topological semimetals HoPtBi and PrAlGe, which exhibit topological singular points in the vicinity of the Fermi level, the formation of 4$f$ quasiparticles is forbidden. We show that blocking hybridization channels at the topological singular point effectively en…

    Submitted 6 May, 2024; v1 submitted 17 March, 2024; originally announced March 2024.

  17. arXiv:2403.10494  [pdf, other]

    cs.RO

    Lifelong LERF: Local 3D Semantic Inventory Monitoring Using FogROS2

    Authors: Adam Rashid, Chung Min Kim, Justin Kerr, Letian Fu, Kush Hari, Ayah Ahmad, Kaiyuan Chen, Huang Huang, Marcus Gualtieri, Michael Wang, Christian Juette, Nan Tian, Liu Ren, Ken Goldberg

    Abstract: Inventory monitoring in homes, factories, and retail stores relies on maintaining data despite objects being swapped, added, removed, or moved. We introduce Lifelong LERF, a method that allows a mobile robot with minimal compute to jointly optimize a dense language and geometric representation of its surroundings. Lifelong LERF maintains this representation over time by detecting semantic changes…

    Submitted 15 March, 2024; originally announced March 2024.

    Comments: See project webpage at: https://sites.google.com/berkeley.edu/lifelonglerf/home

  18. arXiv:2403.09967  [pdf, other]

    eess.SP

    NR-Surface: NextG-ready $μ$W-reconfigurable mmWave Metasurface

    Authors: Minseok Kim, Namjo Ahn, Song Min Kim

    Abstract: Metasurface has recently emerged as an economic solution to expand mmWave coverage. However, their pervasive deployment remains a challenge, mainly due to the difficulty in reaching the tight 260ns NR synchronization requirement and real-time wireless reconfiguration while maintaining multi-year battery life. This paper presents NR-Surface, the first real-time reconfigurable metasurface fully comp…

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: 17 pages, 28 figures, to be published in NSDI '24

  19. arXiv:2403.09508  [pdf, other]

    cs.CV

    SkateFormer: Skeletal-Temporal Transformer for Human Action Recognition

    Authors: Jeonghyeok Do, Munchurl Kim

    Abstract: Skeleton-based action recognition, which classifies human actions based on the coordinates of joints and their connectivity within skeleton data, is widely utilized in various scenarios. While Graph Convolutional Networks (GCNs) have been proposed for skeleton data represented as graphs, they suffer from limited receptive fields constrained by joint connectivity. To address this limitation, recent…

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: Please visit our project page at https://jeonghyeokdo.github.io/SkateFormer_site/

  20. arXiv:2403.08827  [pdf, other]

    math.OC

    Locational Scenario-based Pricing in a Bilateral Distribution Energy Market under Uncertainty

    Authors: Hien Thanh Doan, Minsoo Kim, Keunju Song, Hongseok Kim

    Abstract: In recent years, there has been a significant focus on advancing the next generation of power systems. Despite these efforts, persistent challenges revolve around addressing the operational impact of uncertainty on predicted data, especially concerning economic dispatch and optimal power flow. To tackle these challenges, we introduce a stochastic day-ahead scheduling approach for a community. This…

    Submitted 11 March, 2024; originally announced March 2024.

  21. arXiv:2403.08302  [pdf, other]

    cs.RO

    Online Multi-Contact Feedback Model Predictive Control for Interactive Robotic Tasks

    Authors: Seo Wook Han, Maged Iskandar, Jinoh Lee, Min Jun Kim

    Abstract: In this paper, we propose a model predictive control (MPC) that accomplishes interactive robotic tasks, in which multiple contacts may occur at unknown locations. To address such scenarios, we made an explicit contact feedback loop in the MPC framework. An algorithm called Multi-Contact Particle Filter with Exploration Particle (MCP-EP) is employed to establish real-time feedback of multi-contact…

    Submitted 13 March, 2024; originally announced March 2024.

    Comments: This paper has been accepted for publication at the IEEE International Conference on Robotics and Automation (ICRA), Yokohama, 2024

  22. arXiv:2403.08277  [pdf, other]

    cs.CV

    VIGFace: Virtual Identity Generation Model for Face Image Synthesis

    Authors: Minsoo Kim, Min-Cheol Sagong, Gi Pyo Nam, Junghyun Cho, Ig-Jae Kim

    Abstract: Deep learning-based face recognition continues to face challenges due to its reliance on huge datasets obtained from web crawling, which can be costly to gather and raise significant real-world privacy concerns. To address this issue, we propose VIGFace, a novel framework capable of generating synthetic facial images. Initially, we train the face recognition model using a real face dataset and cre…

    Submitted 13 March, 2024; originally announced March 2024.

  23. arXiv:2403.08272  [pdf, other]

    cs.CL

    RECIPE4U: Student-ChatGPT Interaction Dataset in EFL Writing Education

    Authors: Jieun Han, Haneul Yoo, Junho Myung, Minsun Kim, Tak Yeon Lee, So-Yeon Ahn, Alice Oh

    Abstract: The integration of generative AI in education is expanding, yet empirical analyses of large-scale and real-world interactions between students and AI systems still remain limited. Addressing this gap, we present RECIPE4U (RECIPE for University), a dataset sourced from a semester-long experiment with 212 college students in English as Foreign Language (EFL) writing courses. During the study, studen…

    Submitted 13 March, 2024; originally announced March 2024.

    Comments: arXiv admin note: text overlap with arXiv:2309.13243

  24. arXiv:2403.08262  [pdf, other]

    cs.CV

    BiTT: Bi-directional Texture Reconstruction of Interacting Two Hands from a Single Image

    Authors: Minje Kim, Tae-Kyun Kim

    Abstract: Creating personalized hand avatars is important to offer a realistic experience to users on AR / VR platforms. While most prior studies focused on reconstructing 3D hand shapes, some recent work has tackled the reconstruction of hand textures on top of shapes. However, these methods are often limited to capturing pixels on the visible side of a hand, requiring diverse views of the hand in a video…

    Submitted 25 March, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

    Comments: Accepted by CVPR 2024, Project Page: https://yunmin**2.github.io/projects/bitt/

  25. arXiv:2403.08256  [pdf, other]

    cs.CV

    IG-FIQA: Improving Face Image Quality Assessment through Intra-class Variance Guidance robust to Inaccurate Pseudo-Labels

    Authors: Minsoo Kim, Gi Pyo Nam, Haksub Kim, Haesol Park, Ig-Jae Kim

    Abstract: In the realm of face image quality assessment (FIQA), methods based on sample relative classification have shown impressive performance. However, the quality scores used as pseudo-labels assigned from images of classes with low intra-class variance could be unrelated to the actual quality in this method. To address this issue, we present IG-FIQA, a novel approach to guide FIQA training, introducing…

    Submitted 13 March, 2024; originally announced March 2024.

  26. arXiv:2403.08187  [pdf, other]

    cs.CL cs.SD eess.AS

    Automatic Speech Recognition (ASR) for the Diagnosis of pronunciation of Speech Sound Disorders in Korean children

    Authors: Taekyung Ahn, Yeonjung Hong, Younggon Im, Do Hyung Kim, Dayoung Kang, Joo Won Jeong, Jae Won Kim, Min Jung Kim, Ah-ra Cho, Dae-Hyun Jang, Hosung Nam

    Abstract: This study presents a model of automatic speech recognition (ASR) designed to diagnose pronunciation issues in children with speech sound disorders (SSDs) to replace manual transcriptions in clinical procedures. Since ASR models trained for general purposes primarily predict input speech into real words, employing a well-known high-performance ASR model for evaluating pronunciation in children wit…

    Submitted 12 March, 2024; originally announced March 2024.

    Comments: 12 pages, 2 figures

    ACM Class: I.2.7

  27. arXiv:2403.07041  [pdf, other]

    cs.LG cs.NE

    Ant Colony Sampling with GFlowNets for Combinatorial Optimization

    Authors: Minsu Kim, Sanghyeok Choi, Hyeonah Kim, Jiwoo Son, Jinkyoo Park, Yoshua Bengio

    Abstract: This paper introduces the Generative Flow Ant Colony Sampler (GFACS), a neural-guided probabilistic search algorithm for solving combinatorial optimization (CO). GFACS integrates generative flow networks (GFlowNets), an emerging amortized inference method, with ant colony optimization (ACO), a promising probabilistic search algorithm. Specifically, we use GFlowNets to learn a constructive policy i…

    Submitted 22 May, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

    Comments: 23 pages, 5 figures

  28. arXiv:2403.05252  [pdf, other]

    quant-ph

    Quantum error cancellation in photonic systems -- undoing photon losses

    Authors: Adam Taylor, Gabriele Bressanini, Hyukjoon Kwon, M. S. Kim

    Abstract: Real photonic devices are subject to photon losses that can decohere quantum information encoded in the system. In the absence of full fault tolerance, quantum error mitigation techniques have been introduced to help manage errors in noisy quantum devices. In this work, we introduce an error mitigation protocol inspired by probabilistic error cancellation (a popular error mitigation technique in d…

    Submitted 28 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

    Comments: Comments welcome. 22 pages, 10 figures

  29. arXiv:2403.04460  [pdf, other]

    cs.CL

    Pearl: A Review-driven Persona-Knowledge Grounded Conversational Recommendation Dataset

    Authors: Minjin Kim, Minju Kim, Hana Kim, Beong-woo Kwak, Soyeon Chun, Hyunseo Kim, SeongKu Kang, Youngjae Yu, Jinyoung Yeo, Dongha Lee

    Abstract: Conversational recommender system is an emerging area that has garnered an increasing interest in the community, especially with the advancements in large language models (LLMs) that enable diverse reasoning over conversational input. Despite the progress, the field has many aspects left to explore. The currently available public datasets for conversational recommendation lack specific user prefer…

    Submitted 8 June, 2024; v1 submitted 7 March, 2024; originally announced March 2024.

    Comments: Published at ACL 2024 Findings

  30. arXiv:2403.03919  [pdf, other]

    quant-ph

    Multi-parameter quantum estimation of single- and two-mode pure Gaussian states

    Authors: Gabriele Bressanini, Marco G. Genoni, M. S. Kim, Matteo G. A. Paris

    Abstract: We discuss the ultimate precision bounds on the multiparameter estimation of single- and two-mode pure Gaussian states. By leveraging on previous approaches that focused on the estimation of a complex displacement only, we derive the Holevo Cramér-Rao bound (HCRB) for both displacement and squeezing parameter characterizing single and two-mode squeezed states. In the single-mode scenario, we obtai…

    Submitted 6 March, 2024; originally announced March 2024.

  31. arXiv:2403.03693  [pdf, other]

    physics.plasm-ph

    Operational Space and Plasma Performance with an RMP-ELM Suppressed Edge

    Authors: C. Paz-Soldan, S. Gu, N. Leuthold, P. Lunia, P. Xie, M. W. Kim, S. K. Kim, N. C. Logan, J. -K. Park, W. Suttrop, Y. Sun, D. B. Weisberg, M. Willensdorfer, the ASDEX-Upgrade, DIII-D, EAST, KSTAR Teams

    Abstract: The operational space and global performance of plasmas with edge-localized modes (ELMs) suppressed by resonant magnetic perturbations (RMPs) are surveyed by comparing AUG, DIII-D, EAST, and KSTAR stationary operating points. RMP-ELM suppression is achieved over a range of plasma currents, toroidal fields, and RMP toroidal mode numbers. Consistent operational windows in edge safety factor are foun…

    Submitted 6 March, 2024; originally announced March 2024.

    Comments: 22 pages, 11 figures

  32. arXiv:2403.03368  [pdf, other]

    cs.LG cs.CY

    Leveraging Federated Learning for Automatic Detection of Clopidogrel Treatment Failures

    Authors: Samuel Kim, Min Sang Kim

    Abstract: The effectiveness of clopidogrel, a widely used antiplatelet medication, varies significantly among individuals, necessitating the development of precise predictive models to optimize patient care. In this study, we leverage federated learning strategies to address clopidogrel treatment failure detection. Our research harnesses the collaborative power of multiple healthcare institutions, allowing…

    Submitted 5 March, 2024; originally announced March 2024.

  33. arXiv:2403.03004  [pdf, other]

    astro-ph.CO gr-qc hep-ph

    Ultralight vector dark matter search using data from the KAGRA O3GK run

    Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, A. G. Abac, R. Abbott, H. Abe, I. Abouelfettouh, F. Acernese, K. Ackley, C. Adamcewicz, S. Adhicary, N. Adhikari, R. X. Adhikari, V. K. Adkins, V. B. Adya, C. Affeldt, D. Agarwal, M. Agathos, O. D. Aguiar, I. Aguilar, L. Aiello, A. Ain, P. Ajith, T. Akutsu, S. Albanesi, et al. (1778 additional authors not shown)

    Abstract: Among the various candidates for dark matter (DM), ultralight vector DM can be probed by laser interferometric gravitational wave detectors through the measurement of oscillating length changes in the arm cavities. In this context, KAGRA has a unique feature due to differing compositions of its mirrors, enhancing the signal of vector DM in the length change in the auxiliary channels. Here we prese…

    Submitted 5 March, 2024; originally announced March 2024.

    Comments: 20 pages, 5 figures

    Report number: LIGO-P2300250

  34. arXiv:2403.02944  [pdf, other]

    cs.CV cs.LG

    Neural Image Compression with Text-guided Encoding for both Pixel-level and Perceptual Fidelity

    Authors: Hagyeong Lee, Minkyu Kim, Jun-Hyuk Kim, Seungeon Kim, Dokwan Oh, Jaeho Lee

    Abstract: Recent advances in text-guided image compression have shown great potential to enhance the perceptual quality of reconstructed images. These methods, however, tend to have significantly degraded pixel-wise fidelity, limiting their practicality. To fill this gap, we develop a new text-guided image compression algorithm that achieves both high perceptual and pixel-wise fidelity. In particular, we pr…

    Submitted 21 May, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

    Comments: The first two authors contributed equally

  35. arXiv:2403.02734  [pdf, ps, other]

    cond-mat.str-el cond-mat.mtrl-sci

    Strain tunable electronic ground states in two-dimensional iridate thin films

    Authors: Donghan Kim, Byungmin Sohn, Yeonjae Lee, Jeongkeun Song, Mi Kyung Kim, Minjae Kim, Tae Won Noh, Changyoung Kim

    Abstract: Quantum phases of matter such as superconducting, ferromagnetic and Wigner crystal states are often driven by the two-dimensionality (2D) of correlated systems. Meanwhile, spin-orbit coupling (SOC) is a fundamental element leading to nontrivial topology which gives rise to quantum phenomena such as the large anomalous Hall effect and nontrivial superconductivity. However, the search for controllab…

    Submitted 5 March, 2024; originally announced March 2024.

    Comments: 7 pages, 4 figures

  36. arXiv:2403.01861  [pdf, other]

    cs.RO cs.AI cs.CV

    AiSDF: Structure-aware Neural Signed Distance Fields in Indoor Scenes

    Authors: Jaehoon Jang, Inha Lee, Minje Kim, Kyungdon Joo

    Abstract: Indoor scenes we are living in are visually homogenous or textureless, while they inherently have structural forms and provide enough structural priors for 3D scene reconstruction. Motivated by this fact, we propose a structure-aware online signed distance fields (SDF) reconstruction framework in indoor scenes, especially under the Atlanta world (AW) assumption. Thus, we dub this incremental SDF r…

    Submitted 4 March, 2024; originally announced March 2024.

    Comments: 8 pages, 6 figures, Accepted to IEEE RA-L (First two authors contributed equally)

    Journal ref: IEEE Robotics and Automation Letters (RA-L), vol. 9, no. 5, pp. 4106-4113, 2024

  37. arXiv:2403.01469  [pdf, other]

    cs.CL

    KorMedMCQA: Multi-Choice Question Answering Benchmark for Korean Healthcare Professional Licensing Examinations

    Authors: Sunjun Kweon, Byungjin Choi, Minkyu Kim, Rae Woong Park, Edward Choi

    Abstract: We introduce KorMedMCQA, the first Korean multiple-choice question answering (MCQA) benchmark derived from Korean healthcare professional licensing examinations, covering from the year 2012 to year 2023. This dataset consists of a selection of questions from the license examinations for doctors, nurses, and pharmacists, featuring a diverse array of subjects. We conduct baseline experiments on vari…

    Submitted 5 March, 2024; v1 submitted 3 March, 2024; originally announced March 2024.

  38. arXiv:2403.00398  [pdf, other]

    cs.RO

    Learning Quadrupedal Locomotion with Impaired Joints Using Random Joint Masking

    Authors: Mincheol Kim, Ukcheol Shin, Jung-Yup Kim

    Abstract: Quadrupedal robots have played a crucial role in various environments, from structured environments to complex harsh terrains, thanks to their agile locomotion ability. However, these robots can easily lose their locomotion functionality if damaged by external accidents or internal malfunctions. In this paper, we propose a novel deep reinforcement learning framework to enable a quadrupedal robot t…

    Submitted 1 March, 2024; originally announced March 2024.

    Comments: Appear to ICRA 2024, Project page: https://sites.google.com/view/learning-impaired-joints-loco

  39. arXiv:2402.18778  [pdf, other]

    cs.NI quant-ph

    X-ResQ: Reverse Annealing for Quantum MIMO Detection with Flexible Parallelism

    Authors: Minsung Kim, Abhishek Kumar Singh, Davide Venturelli, John Kaewell, Kyle Jamieson

    Abstract: Quantum Annealing (QA)-accelerated MIMO detection is an emerging research approach in the context of NextG wireless networks. The opportunity is to enable large MIMO systems and thus improve wireless performance. The approach aims to leverage QA to expedite the computation required for theoretically optimal but computationally-demanding Maximum Likelihood detection to overcome the limitations of t…

    Submitted 9 March, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

    Comments: 22 pages

  40. arXiv:2402.18372  [pdf, other]

    cs.LG cs.AI cs.DC

    FedUV: Uniformity and Variance for Heterogeneous Federated Learning

    Authors: Ha Min Son, Moon-Hyun Kim, Tai-Myoung Chung, Chao Huang, Xin Liu

    Abstract: Federated learning is a promising framework to train neural networks with widely distributed data. However, performance degrades heavily with heterogeneously distributed data. Recent work has shown this is due to the final layer of the network being most prone to local bias, some finding success freezing the final layer as an orthogonal classifier. We investigate the training dynamics of the class…

    Submitted 1 March, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

    Comments: 11 pages, 4 figures, 5 tables, to appear at CVPR 2024

  41. arXiv:2402.18351  [pdf, other]

    cs.CV

    LatentSwap: An Efficient Latent Code Mapping Framework for Face Swapping

    Authors: Changho Choi, Minho Kim, Junhyeok Lee, Hyoung-Kyu Song, Younggeun Kim, Seungryong Kim

    Abstract: We propose LatentSwap, a simple face swapping framework generating a face swap latent code of a given generator. Utilizing randomly sampled latent codes, our framework is light and does not require datasets besides employing the pre-trained models, with the training procedure also being fast and straightforward. The loss objective consists of only three terms, and can effectively control the face…

    Submitted 2 March, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

    Comments: 9 pages, 11 figures

  42. arXiv:2402.17984  [pdf, other

    stat.ME

    Sampling low-fidelity outputs for estimation of high-fidelity density and its tails

    Authors: Minji Kim, Vladas Pipiras, Kevin O'Connor, Themistoklis Sapsis

    Abstract: In a multifidelity setting, data are available under the same conditions from two (or more) sources, e.g. computer codes, one being lower-fidelity but computationally cheaper, and the other higher-fidelity and more expensive. This work studies for which low-fidelity outputs, one should obtain high-fidelity outputs, if the goal is to estimate the probability density function of the latter, especial… ▽ More

    Submitted 27 February, 2024; originally announced February 2024.

    Comments: 32 pages, 11 figures, 2 tables

  43. arXiv:2402.17267  [pdf, other

    physics.plasm-ph

    Narrowband THz Emission from a Plasma Oscillator Imbedded in a Plasma Density Gradient

    Authors: Manoj Kumar, Bernhard Ersfeld, Jaeho Lee, Dohyun Park, Seungyun Kim, Inhyuk Nam, Minseok Kim, Seongjin Jeon, Dino A. Jaroszynski, Hyyong Suk, Min Sup Hur

    Abstract: A novel method is presented for generating radiation using the beat wave associated with a bi-frequency laser pulse, to excite plasma oscillations in a plasma slab with a density gradient. By resonantly exciting a plasma wave, it can be localised and transformed into a plasma oscillator that produces a beam of radially polarised terahertz radiation. Particle-in-cell simulations and analytic theory… ▽ More

    Submitted 27 February, 2024; originally announced February 2024.

  44. arXiv:2402.16785  [pdf, other

    cs.LG

    CARTE: Pretraining and Transfer for Tabular Learning

    Authors: Myung Jun Kim, Léo Grinsztajn, Gaël Varoquaux

    Abstract: Pretrained deep-learning models are the go-to solution for images or text. However, for tabular data the standard is still to train tree-based models. Indeed, transfer learning on tables hits the challenge of data integration: finding correspondences, correspondences in the entries (entity matching) where different words may denote the same entity, correspondences across columns (schema matching),… ▽ More

    Submitted 31 May, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

  45. arXiv:2402.16768  [pdf, other

    cond-mat.mtrl-sci cond-mat.str-el

    Synthesis, structural and magnetic characterizations of Li$_4$Cu$_{1-x}$Ni$_x$TeO$_6$ ( $x$ = 0, 0.1, 0.2, 0.5, and 1)

    Authors: Ashiwini Balodhi, Brianna Billingsley, Tai Kong, Min Gyu Kim

    Abstract: We investigated the effect of Ni doping in a recently proposed quantum spin liquid (QSL) candidate Li$_4$CuTeO$_6$. We performed a comprehensive study on the structural and magnetic properties. We find that the anti-site disorder between Li$^+$ and Cu$^{2+}$ persists until 50\% Ni doping in which Ni and Cu occupy different crystallographic sites. As a result, while Cu sits in both triangular and h… ▽ More

    Submitted 26 February, 2024; originally announced February 2024.

    Comments: 7 pages, 4 figures

  46. arXiv:2402.16307  [pdf, ps, other

    eess.SP

    Analyzing Downlink Coverage in Clustered Low Earth Orbit Satellite Constellations: A Stochastic Geometry Approach

    Authors: Miyeon Lee, Sucheol Kim, Minje Kim, Dong-Hyun Jung, Junil Choi

    Abstract: Satellite networks are emerging as vital solutions for global connectivity beyond 5G. As companies such as SpaceX, OneWeb, and Amazon are poised to launch a large number of satellites in low Earth orbit, the heightened inter-satellite interference caused by mega-constellations has become a significant concern. To address this challenge, recent works have introduced the concept of satellite cluster… ▽ More

    Submitted 29 March, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

    Comments: submitted to IEEE Transactions on Communications

  47. arXiv:2402.16021  [pdf, other

    cs.CL cs.AI cs.CV eess.AS

    TMT: Tri-Modal Translation between Speech, Image, and Text by Processing Different Modalities as Different Languages

    Authors: Minsu Kim, Jee-weon Jung, Hyeongseop Rha, Soumi Maiti, Siddhant Arora, Xuankai Chang, Shinji Watanabe, Yong Man Ro

    Abstract: The capability to jointly process multi-modal information is becoming an essential task. However, the limited number of paired multi-modal data and the large computational requirements in multi-modal learning hinder the development. We propose a novel Tri-Modal Translation (TMT) model that translates between arbitrary modalities spanning speech, image, and text. We introduce a novel viewpoint, whe… ▽ More

    Submitted 25 February, 2024; originally announced February 2024.

  48. Artful Path to Healing: Using Machine Learning for Visual Art Recommendation to Prevent and Reduce Post-Intensive Care

    Authors: Bereket A. Yilma, Chan Mi Kim, Gerald C. Cupchik, Luis A. Leiva

    Abstract: Staying in the intensive care unit (ICU) is often traumatic, leading to post-intensive care syndrome (PICS), which encompasses physical, psychological, and cognitive impairments. Currently, there are limited interventions available for PICS. Studies indicate that exposure to visual art may help address the psychological aspects of PICS and be more effective if it is personalized. We develop Machin… ▽ More

    Submitted 23 February, 2024; originally announced February 2024.

    Comments: Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems (CHI 24)

  49. arXiv:2402.15151  [pdf, other

    cs.CV cs.CL eess.AS eess.IV

    Where Visual Speech Meets Language: VSP-LLM Framework for Efficient and Context-Aware Visual Speech Processing

    Authors: Jeong Hun Yeo, Seunghee Han, Minsu Kim, Yong Man Ro

    Abstract: In visual speech processing, context modeling capability is one of the most important requirements due to the ambiguous nature of lip movements. For example, homophenes, words that share identical lip movements but produce different sounds, can be distinguished by considering the context. In this paper, we propose a novel framework, namely Visual Speech Processing incorporated with LLMs (VSP-LLM),… ▽ More

    Submitted 13 May, 2024; v1 submitted 23 February, 2024; originally announced February 2024.

    Comments: An Erratum was added on the last page of this paper

  50. arXiv:2402.15046  [pdf, other

    cs.CL

    CARBD-Ko: A Contextually Annotated Review Benchmark Dataset for Aspect-Level Sentiment Classification in Korean

    Authors: Dongjun Jang, Jean Seo, Sungjoo Byun, Taekyoung Kim, Minseok Kim, Hyopil Shin

    Abstract: This paper explores the challenges posed by aspect-based sentiment classification (ABSC) within pretrained language models (PLMs), with a particular focus on contextualization and hallucination issues. In order to tackle these challenges, we introduce CARBD-Ko (a Contextually Annotated Review Benchmark Dataset for Aspect-Based Sentiment Classification in Korean), a benchmark dataset that incorpora… ▽ More

    Submitted 22 February, 2024; originally announced February 2024.