Search | arXiv e-print repository

Weighted decoupling estimates and the Bochner-Riesz means

Abstract: We prove new weighted decoupling estimates. As an application, we give an improved sufficient condition for almost everywhere convergence of the Bochner-Riesz means of arbitrary $L^p$ functions for $1<p<2$ in dimensions 2 and 3.

Submitted 6 June, 2024; originally announced June 2024.

Comments: 17 pages

MSC Class: 42B25

arXiv:2406.02936 [pdf]

Radiomics-guided Multimodal Self-attention Network for Predicting Pathological Complete Response in Breast MRI

Authors: Jonghun Kim, Hyun** Park

Abstract: Breast cancer is the most prevalent cancer among women and predicting pathologic complete response (pCR) after anti-cancer treatment is crucial for patient prognosis and treatment customization. Deep learning has shown promise in medical imaging diagnosis, particularly when utilizing multiple imaging modalities to enhance accuracy. This study presents a model that predicts pCR in breast cancer patients using dynamic contrast-enhanced (DCE) magnetic resonance imaging (MRI) and apparent diffusion coefficient (ADC) maps. Radiomics features are established hand-crafted features of the tumor region and thus could be useful in medical image analysis. Our approach extracts features from both DCE MRI and ADC using an encoder with a self-attention mechanism, leveraging radiomics to guide feature extraction from tumor-related regions. Our experimental results demonstrate the superior performance of our model in predicting pCR compared to other baseline methods.

Submitted 5 June, 2024; originally announced June 2024.

Comments: 5 pages, 5 figures, IEEE ISBI 2024 proceedings

arXiv:2406.02840 [pdf, other]

Statistical inference of convex order by Wasserstein projection

Authors: Jakwang Kim, Young-Heon Kim, Yuanlong Ruan, Andrew Warren

Abstract: Ranking distributions according to a stochastic order has wide applications in diverse areas. Although stochastic dominance has received much attention, convex order, particularly in general dimensions, has yet to be investigated from a statistical point of view. This article addresses this gap by introducing a simple statistical test for convex order based on the Wasserstein projection distance. This projection distance not only encodes whether two distributions are indeed in convex order, but also quantifies the deviation from the desired convex order and produces an optimal convex order approximation. Lipschitz stability of the backward and forward Wasserstein projection distance is proved, which leads to elegant consistency results of the estimator we employ as our test statistic. Combining these with state-of-the-art results regarding the convergence rate of empirical distributions, we also derive upper bounds for the $p$-value and type I error of our test statistic, as well as upper bounds on the type II error for an appropriate class of strict alternatives. Lastly, we provide an efficient numerical scheme for our test statistic, by way of an entropic Frank-Wolfe algorithm. Some experiments based on synthetic data sets illuminate the success of our approach empirically.

Submitted 4 June, 2024; originally announced June 2024.

MSC Class: 62G10; 49K27

arXiv:2406.02441 [pdf, other]

Probing the Scalar WIMP-Pion Coupling with the first LUX-ZEPLIN data

Authors: J. Aalbers, D. S. Akerib, A. K. Al Musalhi, F. Alder, C. S. Amarasinghe, A. Ames, T. J. Anderson, N. Angelides, H. M. Araújo, J. E. Armstrong, M. Arthurs, A. Baker, S. Balashov, J. Bang, E. E. Barillier, J. W. Bargemann, K. Beattie, T. Benson, A. Bhatti, A. Biekert, T. P. Biesiadzinski, H. J. Birch, E. J. Bishop, G. M. Blockinger, B. Boxer , et al. (178 additional authors not shown)

Abstract: Weakly interacting massive particles (WIMPs) may interact with a virtual pion that is exchanged between nucleons. This interaction channel is important to consider in models where the spin-independent isoscalar channel is suppressed. Using data from the first science run of the LUX-ZEPLIN dark matter experiment, containing 60 live days of data in a 5.5 tonne fiducial mass of liquid xenon, we report the results on a search for WIMP-pion interactions. We observe no significant excess and set an upper limit of $1.5\times10^{-46}$ cm$^2$ at a 90% confidence level for a WIMP mass of 33 GeV/c$^2$ for this interaction.

Submitted 4 June, 2024; originally announced June 2024.

arXiv:2406.02331 [pdf, other]

Translation Deserves Better: Analyzing Translation Artifacts in Cross-lingual Visual Question Answering

Authors: ChaeHun Park, Koanho Lee, Hyesu Lim, Jaeseok Kim, Junmo Park, Yu-Jung Heo, Du-Seong Chang, Jaegul Choo

Abstract: Building a reliable visual question answering (VQA) system across different languages is a challenging problem, primarily due to the lack of abundant samples for training. To address this challenge, recent studies have employed machine translation systems for the cross-lingual VQA task. This involves translating the evaluation samples into a source language (usually English) and using monolingual models (i.e., translate-test). However, our analysis reveals that translated texts contain unique characteristics distinct from human-written ones, referred to as translation artifacts. We find that these artifacts can significantly affect the models, confirmed by extensive experiments across diverse models, languages, and translation processes. In light of this, we present a simple data augmentation strategy that can alleviate the adverse impacts of translation artifacts.

Submitted 4 June, 2024; originally announced June 2024.

Comments: ACL 2024 Findings Accepted

arXiv:2406.01920 [pdf, other]

CODE: Contrasting Self-generated Description to Combat Hallucination in Large Multi-modal Models

Authors: Junho Kim, Hyunjun Kim, Yeonju Kim, Yong Man Ro

Abstract: Large Multi-modal Models (LMMs) have recently demonstrated remarkable abilities in visual context understanding and coherent response generation. However, alongside these advancements, the issue of hallucinations has emerged as a significant challenge, producing erroneous responses that are unrelated to the visual contents. In this paper, we introduce a novel contrastive-based decoding method, COuntering DEscription Contrastive Decoding (CODE), which leverages self-generated descriptions as contrasting references during the decoding phase of LMMs to address hallucination issues. CODE utilizes the comprehensive descriptions from the model itself as a visual counterpart to correct and improve response alignment with actual visual content. By dynamically adjusting the information flow and distribution of next-token predictions in the LMM's vocabulary, CODE enhances the coherence and informativeness of generated responses. Extensive experiments demonstrate that our method significantly reduces hallucinations and improves cross-modal consistency across various benchmarks and cutting-edge LMMs. Our method provides a simple yet effective decoding strategy that can be integrated into existing LMM frameworks without additional training.

Submitted 3 June, 2024; originally announced June 2024.

Comments: Project page: https://ivy-lvlm.github.io/CODE/

arXiv:2406.01833 [pdf, other]

doi 10.1145/3637528.3671724

CAFO: Feature-Centric Explanation on Time Series Classification

Authors: Jaeho Kim, Seok-Ju Hahn, Yoontae Hwang, Junghye Lee, Seulki Lee

Abstract: In multivariate time series (MTS) classification, finding the important features (e.g., sensors) for model performance is crucial yet challenging due to the complex, high-dimensional nature of MTS data, intricate temporal dynamics, and the necessity for domain-specific interpretations. Current explanation methods for MTS mostly focus on time-centric explanations, apt for pinpointing important time periods but less effective in identifying key features. This limitation underscores the pressing need for a feature-centric approach, a vital yet often overlooked perspective that complements time-centric analysis. To bridge this gap, our study introduces a novel feature-centric explanation and evaluation framework for MTS, named CAFO (Channel Attention and Feature Orthogonalization). CAFO employs a convolution-based approach with channel attention mechanisms, incorporating a depth-wise separable channel attention module (DepCA) and a QR decomposition-based loss for promoting feature-wise orthogonality. We demonstrate that this orthogonalization enhances the separability of attention distributions, thereby refining and stabilizing the ranking of feature importance. This improvement in feature-wise ranking enhances our understanding of feature explainability in MTS. Furthermore, we develop metrics to evaluate global and class-specific feature importance. Our framework's efficacy is validated through extensive empirical analyses on two major public benchmarks and real-world datasets, both synthetic and self-collected, specifically designed to highlight class-wise discriminative features. The results confirm CAFO's robustness and informative capacity in assessing feature importance in MTS classification tasks. This study not only advances the understanding of feature-centric explanations in MTS but also sets a foundation for future explorations in feature-centric explanations.

Submitted 11 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

Comments: Accepted to KDD 2024 Research Track

arXiv:2406.01742 [pdf, other]

UV Cooling via O VI Emission in the Superwind of M82 Observed with the Far Ultraviolet Spectroscopic Explorer (FUSE)

Authors: **-Ah Kim, Haeun Chung, Carlos J. Vargas, Erika Hamden

Abstract: We examined archival Far Ultraviolet Spectroscopic Explorer data to search for far-ultraviolet emission lines in the starburst galaxy M82. The observations were made in an outflow region that extends beyond the galactic disk. We found the O VI $λλ$ 1032, 1038 emission lines from the galaxy's southern outflow region. The O VI lines suggest that the outflowing warm-hot gas is undergoing radiative cooling. We measured a radial velocity of $\sim$420 km s$^{-1}$ from the O VI lines, which is faster than the velocity seen in H$α$ observations. The O VI $λ$1038 emission line seems to be blended with the C II $λ$1037 emission line, which has a radial velocity of $\sim$300 km s$^{-1}$, similar to what is observed in H$α$ observations. The outflow medium of M82 appears to be composed of gas in multiple phases with varying temperatures and kinematics. Future spectroscopic observations in high energy regimes covering a wider spatial area are necessary to understand better the properties of the warm-hot gas medium in the outflow.

Submitted 3 June, 2024; originally announced June 2024.

Comments: 9 pages, 5 figures, Accepted to AJ

arXiv:2406.01079 [pdf, other]

Object Aware Egocentric Online Action Detection

Authors: Joungbin An, Yunsu Park, Hyolim Kang, Seon Joo Kim

Abstract: Advancements in egocentric video datasets like Ego4D, EPIC-Kitchens, and Ego-Exo4D have enriched the study of first-person human interactions, which is crucial for applications in augmented reality and assisted living. Despite these advancements, current Online Action Detection methods, which efficiently detect actions in streaming videos, are predominantly designed for exocentric views and thus fail to capitalize on the unique perspectives inherent to egocentric videos. To address this gap, we introduce an Object-Aware Module that integrates egocentric-specific priors into existing OAD frameworks, enhancing first-person footage interpretation. Utilizing object-specific details and temporal dynamics, our module improves scene understanding in detecting actions. Validated extensively on the Epic-Kitchens 100 dataset, our work can be seamlessly integrated into existing models with minimal overhead and bring consistent performance enhancements, marking an important step forward in adapting action detection systems to egocentric video analysis.

Submitted 3 June, 2024; originally announced June 2024.

Comments: CVPR First Joint Egocentric Vision Workshop 2024

arXiv:2406.01020 [pdf, other]

CLIP-Guided Attribute Aware Pretraining for Generalizable Image Quality Assessment

Authors: Daekyu Kwon, Dongyoung Kim, Sehwan Ki, Younghyun Jo, Hyong-Euk Lee, Seon Joo Kim

Abstract: In no-reference image quality assessment (NR-IQA), the challenge of limited dataset sizes hampers the development of robust and generalizable models. Conventional methods address this issue by utilizing large datasets to extract rich representations for IQA. Also, some approaches propose vision language models (VLM) based IQA, but the domain gap between generic VLM and IQA constrains their scalability. In this work, we propose a novel pretraining framework that constructs a generalizable representation for IQA by selectively extracting quality-related knowledge from VLM and leveraging the scalability of large datasets. Specifically, we carefully select optimal text prompts for five representative image quality attributes and use VLM to generate pseudo-labels. Numerous attribute-aware pseudo-labels can be generated with large image datasets, allowing our IQA model to learn rich representations about image quality. Our approach achieves state-of-the-art performance on multiple IQA datasets and exhibits remarkable generalization capabilities. Leveraging these strengths, we propose several applications, such as evaluating image generation models and training image enhancement models, demonstrating our model's real-world applicability. We will make the code available for access.

Submitted 3 June, 2024; originally announced June 2024.

arXiv:2406.00945 [pdf, other]

General relativistic self-gravitating equilibrium disks around rotating neutron stars

Authors: Yoonsoo Kim, **ho Kim, Hee Il Kim, Hyung Mok Lee

Abstract: In modeling a relativistic disk around a compact object, the self-gravity of the disk is often neglected while it needs to be incorporated for more accurate descriptions in several circumstances. Extending the Komatsu-Eriguchi-Hachisu self-consistent field method, we present numerical models of a rapidly rotating neutron star with a self-gravitating disk in stationary equilibrium. In particular, our approach allows us to obtain numerical solutions involving a massive disk with the rest mass $O(10^{-1})-O(10^0) M_\odot$ closely attached to a rotating neutron star. We also assess the impact of self-gravity on the internal structure of the disk and the neutron star. These axisymmetric, stationary solutions can be employed for simulations involving the neutron star-disk system in the context of high-energy transients and gravitational wave emissions.

Submitted 2 June, 2024; originally announced June 2024.

Comments: 15 pages, 12 figures

arXiv:2406.00798 [pdf, other]

PruNeRF: Segment-Centric Dataset Pruning via 3D Spatial Consistency

Authors: Yeonsung Jung, Heecheol Yun, Joonhyung Park, **-Hwa Kim, Eunho Yang

Abstract: Neural Radiance Fields (NeRF) have shown remarkable performance in learning 3D scenes. However, NeRF exhibits vulnerability when confronted with distractors in the training images -- unexpected objects are present only within specific views, such as moving entities like pedestrians or birds. Excluding distractors during dataset construction is a straightforward solution, but without prior knowledge of their types and quantities, it becomes prohibitively expensive. In this paper, we propose PruNeRF, a segment-centric dataset pruning framework via 3D spatial consistency, that effectively identifies and prunes the distractors. We first examine existing metrics for measuring pixel-wise distraction and introduce Influence Functions for more accurate measurements. Then, we assess 3D spatial consistency using a depth-based reprojection technique to obtain 3D-aware distraction. Furthermore, we incorporate segmentation for pixel-to-segment refinement, enabling more precise identification. Our experiments on benchmark datasets demonstrate that PruNeRF consistently outperforms state-of-the-art methods in robustness against distractors.

Submitted 2 June, 2024; originally announced June 2024.

arXiv:2406.00158 [pdf, other]

doi 10.1145/3650200.3656632

Distributed Ranges: A Model for Distributed Data Structures, Algorithms, and Views

Authors: Benjamin Brock, Robert Cohn, Suyash Bakshi, Tuomas Karna, Jeongnim Kim, Mateusz Nowak, Łukasz Ślusarczyk, Kacper Stefanski, Timothy G. Mattson

Abstract: Data structures and algorithms are essential building blocks for programs, and distributed data structures, which automatically partition data across multiple memory locales, are essential to writing high-level parallel programs. While many projects have designed and implemented C++ distributed data structures and algorithms, there has not been widespread adoption of an interoperable model allowing algorithms and data structures from different libraries to work together. This paper introduces distributed ranges, which is a model for building generic data structures, views, and algorithms. A distributed range extends a C++ range, which is an iterable sequence of values, with a concept of segmentation, thus exposing how the distributed range is partitioned over multiple memory locales. Distributed data structures provide this distributed range interface, which allows them to be used with a collection of generic algorithms implemented using the distributed range interface. The modular nature of the model allows for the straightforward implementation of distributed views, which are lightweight objects that provide a lazily evaluated view of another range. Views can be composed together recursively and combined with algorithms to implement computational kernels using efficient, flexible, and high-level standard C++ primitives. We evaluate the distributed ranges model by implementing a set of standard concepts and views as well as two execution runtimes, a multi-node, MPI-based runtime and a single-process, multi-GPU runtime. We demonstrate that high-level algorithms implemented using generic, high-level distributed ranges can achieve performance competitive with highly-tuned, expert-written code.

Submitted 31 May, 2024; originally announced June 2024.

Comments: To appear in ACM International Conference on Supercomputing (ICS) 2024

Journal ref: In Proceedings of the 38th ACM International Conference on Supercomputing (ICS 2024) 236-246

arXiv:2406.00123 [pdf]

Correlation-aware Coarse-to-fine MLPs for Deformable Medical Image Registration

Authors: Mingyuan Meng, Dagan Feng, Lei Bi, **man Kim

Abstract: Deformable image registration is a fundamental step for medical image analysis. Recently, transformers have been used for registration and outperformed Convolutional Neural Networks (CNNs). Transformers can capture long-range dependence among image features, which has been shown to be beneficial for registration. However, due to the high computation/memory loads of self-attention, transformers are typically used at downsampled feature resolutions and cannot capture fine-grained long-range dependence at the full image resolution. This limits deformable registration as it necessitates precise dense correspondence between each image pixel. Multi-layer Perceptrons (MLPs) without self-attention are efficient in computation/memory usage, enabling the feasibility of capturing fine-grained long-range dependence at full resolution. Nevertheless, MLPs have not been extensively explored for image registration and are lacking the consideration of inductive bias crucial for medical registration tasks. In this study, we propose the first correlation-aware MLP-based registration network (CorrMLP) for deformable medical image registration. Our CorrMLP introduces a correlation-aware multi-window MLP block in a novel coarse-to-fine registration architecture, which captures fine-grained multi-range dependence to perform correlation-aware coarse-to-fine registration. Extensive experiments with seven public medical datasets show that our CorrMLP outperforms state-of-the-art deformable registration methods.

Submitted 12 June, 2024; v1 submitted 31 May, 2024; originally announced June 2024.

Comments: Accepted at CVPR2024 as Oral Presentation && Best Paper Candidate

Journal ref: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 9645-9654

arXiv:2405.20720 [pdf, other]

Power of Cooperative Supervision: Multiple Teachers Framework for Enhanced 3D Semi-Supervised Object Detection

Authors: **-Hee Lee, Jae-Keun Lee, Je-Seok Kim, Soon Kwon

Abstract: To ensure safe urban driving for autonomous platforms, it is crucial not only to develop high-performance object detection techniques but also to establish a diverse and representative dataset that captures various urban environments and object characteristics. To address these two issues, we have constructed a multi-class 3D LiDAR dataset reflecting diverse urban environments and object characteristics, and developed a robust 3D semi-supervised object detection (SSOD) based on a multiple teachers framework. This SSOD framework categorizes similar classes and assigns specialized teachers to each category. Through collaborative supervision among these category-specialized teachers, the student network becomes increasingly proficient, leading to a highly effective object detector. We propose a simple yet effective augmentation technique, Pie-based Point Compensating Augmentation (PieAug), to enable the teacher network to generate high-quality pseudo-labels. Extensive experiments on the WOD, KITTI, and our datasets validate the effectiveness of our proposed method and the quality of our dataset. Experimental results demonstrate that our approach consistently outperforms existing state-of-the-art 3D semi-supervised object detection methods across all datasets. We plan to make our multi-class LiDAR dataset and the source code available on our GitHub repository in the near future.

Submitted 31 May, 2024; originally announced May 2024.

Comments: under review

arXiv:2405.20610 [pdf, other]

Revisiting and Maximizing Temporal Knowledge in Semi-supervised Semantic Segmentation

Authors: Wooseok Shin, Hyun Joon Park, ** Sob Kim, Sung Won Han

Abstract: In semi-supervised semantic segmentation, the Mean Teacher- and co-training-based approaches are employed to mitigate confirmation bias and coupling problems. However, despite their high performance, these approaches frequently involve complex training pipelines and a substantial computational burden, limiting the scalability and compatibility of these methods. In this paper, we propose a PrevMatch framework that effectively mitigates the aforementioned limitations by maximizing the utilization of the temporal knowledge obtained during the training process. The PrevMatch framework relies on two core strategies: (1) we reconsider the use of temporal knowledge and thus directly utilize previous models obtained during training to generate additional pseudo-label guidance, referred to as previous guidance; (2) we design a highly randomized ensemble strategy to maximize the effectiveness of the previous guidance. Experimental results on four benchmark semantic segmentation datasets confirm that the proposed method consistently outperforms existing methods across various evaluation protocols. In particular, with DeepLabV3+ and ResNet-101 network settings, PrevMatch outperforms the existing state-of-the-art method, Diverse Co-training, by +1.6 mIoU on Pascal VOC with only 92 annotated images, while achieving 2.4 times faster training. Furthermore, the results indicate that PrevMatch induces stable optimization, particularly in benefiting classes that exhibit poor performance. Code is available at https://github.com/wooseok-shin/PrevMatch

Submitted 30 May, 2024; originally announced May 2024.

Comments: 14 pages, 5 figures, submitted to IEEE TPAMI. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2405.20597 [pdf]

Double-sided van der Waals epitaxy of topological insulators across an atomically thin membrane

Authors: Joon Young Park, Young Jae Shin, Jeacheol Shin, Jehyun Kim, Janghyun Jo, Hyobin Yoo, Danial Haei, Chohee Hyun, Jiyoung Yun, Robert M. Huber, Arijit Gupta, Kenji Watanabe, Takashi Taniguchi, Wan Kyu Park, Hyeon Suk Shin, Miyoung Kim, Dohun Kim, Gyu-Chul Yi, Philip Kim

Abstract: Atomically thin van der Waals (vdW) films provide a novel material platform for epitaxial growth of quantum heterostructures. However, unlike the remote epitaxial growth of three-dimensional bulk crystals, the growth of two-dimensional (2D) material heterostructures across atomic layers has been limited due to the weak vdW interaction. Here, we report the double-sided epitaxy of vdW layered materials through atomic membranes. We grow vdW topological insulators (TIs) Sb$_2$Te$_3$ and Bi$_2$Se$_3$ by molecular beam epitaxy on both surfaces of atomically thin graphene or hBN, which serve as suspended 2D vdW "substrate" layers. Both homo- and hetero- double-sided vdW TI tunnel junctions are fabricated, with the atomically thin hBN acting as a crystal-momentum-conserving tunnelling barrier with abrupt and epitaxial interface. By performing field-angle dependent magneto-tunnelling spectroscopy on these devices, we reveal the energy-momentum-spin resonant tunnelling of massless Dirac electrons between helical Landau levels developed in the topological surface states at the interface.

Submitted 30 May, 2024; originally announced May 2024.

Comments: 24 pages, 4 main figures, 7 extended data figures

arXiv:2405.20586 [pdf, ps, other]

Entanglement witness and nonlocality in confidence of measurement from multipartite quantum state discrimination

Authors: Donghoon Ha, Jeong San Kim

Abstract: We consider multipartite quantum state discrimination and provide a specific relation between the properties of entanglement witness and quantum nonlocality inherent in the confidence of measurements. We first provide the definition of the confidence of measurements as well as its useful properties for various types of multipartite measurements. We show that globally maximum confidence that cannot be achieved by local operations and classical communication strongly depends on the existence of entanglement witness. We also provide conditions for an upper bound on maximum of locally-achievable confidences. Finally, we establish a method in terms of entanglement witness to construct quantum state ensemble with nonlocal maximum confidences.

Submitted 30 May, 2024; originally announced May 2024.

Comments: 12 pages, no figure

arXiv:2405.20245 [pdf, other]

Retrieval Augmented Structured Generation: Business Document Information Extraction As Tool Use

Authors: Franz Louis Cesista, Rui Aguiar, Jason Kim, Paolo Acilo

Abstract: Business Document Information Extraction (BDIE) is the problem of transforming a blob of unstructured information (raw text, scanned documents, etc.) into a structured format that downstream systems can parse and use. It has two main tasks: Key-Information Extraction (KIE) and Line Items Recognition (LIR). In this paper, we argue that BDIE is best modeled as a Tool Use problem, where the tools are these downstream systems. We then present Retrieval Augmented Structured Generation (RASG), a novel general framework for BDIE that achieves state of the art (SOTA) results on both KIE and LIR tasks on BDIE benchmarks. The contributions of this paper are threefold: (1) We show, with ablation benchmarks, that Large Language Models (LLMs) with RASG are already competitive with or surpass current SOTA Large Multimodal Models (LMMs) without RASG on BDIE benchmarks. (2) We propose a new metric class for Line Items Recognition, General Line Items Recognition Metric (GLIRM), that is more aligned with practical BDIE use cases compared to existing metrics, such as ANLS*, DocILE, and GriTS. (3) We provide a heuristic algorithm for backcalculating bounding boxes of predicted line items and tables without the need for vision encoders. Finally, we claim that, while LMMs might sometimes offer marginal performance benefits, LLMs + RASG is oftentimes superior given real-world applications and constraints of BDIE.

Submitted 30 May, 2024; originally announced May 2024.

Comments: Accepted by IEEE 7th International Conference on Multimedia Information Processing and Retrieval (MIPR), 2024

arXiv:2405.19703 [pdf, other]

Towards a Better Evaluation of Out-of-Domain Generalization

Authors: Duhun Hwang, Suhyun Kang, Moonjung Eo, Jimyeong Kim, Wonjong Rhee

Abstract: The objective of Domain Generalization (DG) is to devise algorithms and models capable of achieving high performance on previously unseen test distributions. In the pursuit of this objective, average measure has been employed as the prevalent measure for evaluating models and comparing algorithms in the existing DG studies. Despite its significance, a comprehensive exploration of the average measure has been lacking and its suitability in approximating the true domain generalization performance has been questionable. In this study, we carefully investigate the limitations inherent in the average measure and propose worst+gap measure as a robust alternative. We establish theoretical grounds of the proposed measure by deriving two theorems starting from two different assumptions. We conduct extensive experimental investigations to compare the proposed worst+gap measure with the conventional average measure. Given the indispensable need to access the true DG performance for studying measures, we modify five existing datasets to come up with SR-CMNIST, C-Cats&Dogs, L-CIFAR10, PACS-corrupted, and VLCS-corrupted datasets. The experiment results unveil an inferior performance of the average measure in approximating the true DG performance and confirm the robustness of the theoretically supported worst+gap measure.

Submitted 2 June, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

arXiv:2405.19702 [pdf, ps, other]

Acylindrical hyperbolicity of outer automorphism groups of right-angled Artin groups

Authors: Hyungryul Baik, Junseok Kim

Abstract: We study the acylindrical hyperbolicity of the outer automorphism group of a right-angled Artin group $A_Γ$. When the defining graph $Γ$ has no SIL-pair (separating intersection of links), we obtain a necessary and sufficient condition for $\mathrm{Out}(A_Γ)$ to be acylindrically hyperbolic. As a corollary, if $Γ$ is a random connected graph satisfying a certain probabilistic condition, then $\mathrm{Out}(A_Γ)$ is not acylindrically hyperbolic with high probability. When $Γ$ has a maximal SIL-pair system, we derive a classification theorem for partial conjugations. Such a classification theorem allows us to show that the acylindrical hyperbolicity of $\mathrm{Out}(A_Γ)$ is closely related to the existence of a specific type of partial conjugations.

Submitted 30 May, 2024; originally announced May 2024.

Comments: Comments are welcome!

arXiv:2405.19691 [pdf, other]

Designing Prompt Analytics Dashboards to Analyze Student-ChatGPT Interactions in EFL Writing

Authors: Minsun Kim, SeonGyeom Kim, Suyoun Lee, Yoosang Yoon, Junho Myung, Haneul Yoo, Hyungseung Lim, Jieun Han, Yoonsu Kim, So-Yeon Ahn, Juho Kim, Alice Oh, Hwajung Hong, Tak Yeon Lee

Abstract: While ChatGPT has significantly impacted education by offering personalized resources for students, its integration into educational settings poses unprecedented risks, such as inaccuracies and biases in AI-generated content, plagiarism and over-reliance on AI, and privacy and security issues. To help teachers address such risks, we conducted a two-phase iterative design process that comprises surveys, interviews, and prototype demonstration involving six EFL (English as a Foreign Language) teachers, who integrated ChatGPT into semester-long English essay writing classes. Based on the needs identified during the initial survey and interviews, we developed a prototype of Prompt Analytics Dashboard (PAD) that integrates the essay editing history and chat logs between students and ChatGPT. Teacher's feedback on the prototype informs additional features and unmet needs for designing future PAD, which helps them (1) analyze contextual analysis of student behaviors, (2) design an overall learning loop, and (3) develop their teaching skills.

Submitted 30 May, 2024; originally announced May 2024.

arXiv:2405.19405 [pdf, other]

Quarter- and half-filled quantum Hall states and their competing interactions in bilayer graphene

Authors: Ravi Kumar, André Haug, Jehyun Kim, Misha Yutushui, Konstantin Khudiakov, Vishal Bhardwaj, Alexey Ilin, Kenji Watanabe, Takashi Taniguchi, David F. Mross, Yuval Ronen

Abstract: Bilayer graphene has emerged as a key platform for studying non-Abelian fractional quantum Hall (FQH) states. Its multiple half-filled plateaus with large energy gaps combined with its tunability offer an opportunity to distill the principles that determine their topological order. Here, we report four additional plateaus at $ν=\frac{1}{2}$ for different spin and valley, revealing a systematic pattern of non-Abelian states according to their Levin--Halperin daughter states. Whenever a pair of $N=1$ Landau levels cross, anti-Pfaffian and Pfaffian develop at half filling of the lower and higher levels, respectively. In the $N=0$ levels, where half-filled plateaus are absent, we instead observe four unexpected incompressible quarter-filled states along with daughters. The mutual exclusion of half- and quarter-filled states indicates a robust competition between the interactions favoring either paired states of two-flux or four-flux composite fermions. Finally, we observe several FQH states that require strong interactions between composite fermions. Our combined findings herald a new generation of quantum Hall physics in graphene-based heterostructures.

Submitted 29 May, 2024; originally announced May 2024.

Comments: 17 pages, 5 figures

arXiv:2405.19227 [pdf, other]

Metallicity Dependence of Pressure-Regulated Feedback-Modulated Star Formation in the TIGRESS-NCR Simulation Suite

Authors: Chang-Goo Kim, Eve C. Ostriker, Jeong-Gyu Kim, Munan Gong, Greg L. Bryan, Drummond B. Fielding, Sultan Hassan, Matthew Ho, Sarah M. R. Jeffreson, Rachel S. Somerville, Ulrich P. Steinwandel

Abstract: We present a new simulation suite for the star-forming interstellar medium (ISM) in galactic disks using the TIGRESS-NCR framework. Distinctive aspects of our simulation suite are: (1) sophisticated and comprehensive numerical treatments of essential physical processes including magnetohydrodynamics, self-gravity, and galactic differential rotation, as well as photochemistry, cooling, and heating coupled with ray-tracing UV radiation transfer and resolved supernova feedback and (2) wide parameter coverage including metallicity over $Z'\equiv Z/Z_\odot\sim0.1-3$, gas surface density $Σ_{\rm gas}\sim5-150 M_{\odot}{\rm pc^{-2}}$, and stellar surface density $Σ_{\rm star}\sim 1-50 M_{\odot}{\rm pc^{-2}}$. The range of emergent star formation rate surface density is $Σ_{\rm SFR}\sim 10^{-4}-0.5 M_{\odot}{\rm kpc^{-2}yr^{-1}}$ and ISM total midplane pressure is $P_{\rm tot}/k_B=10^3-10^6{\rm cm^{-3}K}$, with $P_{\rm tot}$ equal to the ISM weight $W$. For given $Σ_{\rm gas}$ and $Σ_{\rm star}$, we find $Σ_{\rm SFR} \propto Z'^{0.3}$. We provide an interpretation based on the pressure-regulated feedback-modulated (PRFM) star formation theory. We characterize feedback modulation in terms of the yield $Υ$, defined as the ratio of each stress to $Σ_{\rm SFR}$. The thermal feedback yield varies sensitively with both weight and metallicity as $Υ_{\rm th}\propto W^{-0.46}Z'^{-0.53}$, while the combined turbulent and magnetic feedback yield shows weaker dependence $Υ_{\rm turb+mag}\propto W^{-0.22}Z'^{-0.18}$. The reduction in $Σ_{\rm SFR}$ at low metallicity is due mainly to enhanced thermal feedback yield, resulting from reduced attenuation of UV radiation. With the metallicity-dependent calibrations we provide, PRFM theory can be used for a new subgrid star formation prescription in cosmological simulations where the ISM is unresolved.

Submitted 6 June, 2024; v1 submitted 29 May, 2024; originally announced May 2024.

Comments: Resubmitted to ApJ after minor revision
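
The power-law scalings quoted in the abstract above can be made concrete with a short numerical sketch. Only the exponents are taken from the abstract; the function names, the normalization to $Z'=1$, and the reference weight are assumptions introduced here for illustration, and this is not the authors' calibration code.

```python
# Illustrative sketch only. The exponents come from the abstract; every name
# and the normalization to Z' = 1 (and W = W_ref) are hypothetical choices.

def relative_sfr(z_prime: float) -> float:
    """Sigma_SFR relative to its Z' = 1 value at fixed Sigma_gas, Sigma_star."""
    return z_prime ** 0.3


def relative_thermal_yield(z_prime: float, w_ratio: float = 1.0) -> float:
    """Thermal yield relative to (W_ref, Z' = 1): W^-0.46 * Z'^-0.53."""
    return w_ratio ** -0.46 * z_prime ** -0.53


def relative_turb_mag_yield(z_prime: float, w_ratio: float = 1.0) -> float:
    """Turbulent+magnetic yield relative to (W_ref, Z' = 1): W^-0.22 * Z'^-0.18."""
    return w_ratio ** -0.22 * z_prime ** -0.18


if __name__ == "__main__":
    # Sample the metallicity range quoted in the abstract (Z' ~ 0.1 to 3).
    for z in (0.1, 0.3, 1.0, 3.0):
        print(
            f"Z'={z:4.1f}  SFR x{relative_sfr(z):.2f}  "
            f"Y_th x{relative_thermal_yield(z):.2f}  "
            f"Y_turb+mag x{relative_turb_mag_yield(z):.2f}"
        )
```

Under these scalings, a tenfold drop in metallicity roughly halves $Σ_{\rm SFR}$ (since $0.1^{0.3}\approx 0.5$), while the thermal yield rises more steeply than the turbulent and magnetic yield at fixed weight.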

arXiv:2405.18957 [pdf, other]

Donnan equilibrium in charged slit-pores from a hybrid nonequilibrium Molecular Dynamics / Monte Carlo method with ions and solvent exchange

Authors: Jeongmin Kim, Benjamin Rotenberg

Abstract: Ion partitioning between different compartments (\emph{e.g.} a porous material and a bulk solution reservoir), known as Donnan equilibrium, plays a fundamental role in various contexts such as energy, environment, or water treatment. The linearized Poisson-Boltzmann (PB) equation, capturing the thermal motion of the ions with mean-field electrostatic interactions, is practically useful to understand and predict ion partitioning, despite its limited applicability to conditions of low salt concentrations and surface charge densities. Here, we investigate the Donnan equilibrium of coarse-grained dilute electrolytes confined in charged slit-pores in equilibrium with a reservoir of ions and solvent. We introduce and use an extension to confined systems of a recently developed hybrid nonequilibrium molecular dynamics / grand canonical Monte Carlo simulation method ("H4D"), which enhances the efficiency of solvent and ion-pair exchange via a fourth spatial dimension. We show that the validity range of linearized PB theory to predict the Donnan equilibrium of dilute electrolytes can be extended to highly charged pores, by simply considering \textit{renormalized} surface charge densities. We compare with simulations of implicit solvent models of electrolytes and show that in the low salt concentrations and thin electric double layer limit considered here, an explicit solvent has a limited effect on the Donnan equilibrium and that the main limitations of the analytical predictions are not due to the breakdown of the mean-field description, but rather to the charge renormalization approximation, because it only focuses on the behavior far from the surfaces.

Submitted 29 May, 2024; originally announced May 2024.

Comments: 14 pages, 6 figures

arXiv:2405.18042 [pdf, other]

Visualizing the loss landscape of Self-supervised Vision Transformer

Authors: Youngwan Lee, Jeffrey Ryan Willette, Jonghee Kim, Sung Ju Hwang

Abstract: The Masked autoencoder (MAE) has drawn attention as a representative self-supervised approach for masked image modeling with vision transformers. However, even though MAE shows better generalization capability than fully supervised training from scratch, the reason why has not been explored. In another line of work, the Reconstruction Consistent Masked Auto Encoder (RC-MAE), has been proposed which adopts a self-distillation scheme in the form of an exponential moving average (EMA) teacher into MAE, and it has been shown that the EMA-teacher performs a conditional gradient correction during optimization. To further investigate the reason for better generalization of the self-supervised ViT when trained by MAE (MAE-ViT) and the effect of the gradient correction of RC-MAE from the perspective of optimization, we visualize the loss landscapes of the self-supervised vision transformer by both MAE and RC-MAE and compare them with the supervised ViT (Sup-ViT). Unlike previous loss landscape visualizations of neural networks based on classification task loss, we visualize the loss landscape of ViT by computing pre-training task loss. Through the lens of loss landscapes, we find two interesting observations: (1) MAE-ViT has a smoother and wider overall loss curvature than Sup-ViT. (2) The EMA-teacher allows MAE to widen the region of convexity in both pretraining and linear probing, leading to quicker convergence. To the best of our knowledge, this work is the first to investigate the self-supervised ViT through the lens of the loss landscape.

Submitted 28 May, 2024; originally announced May 2024.

Comments: NeurIPS 2023 Workshop: Self-Supervised Learning - Theory and Practice
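
For readers unfamiliar with the exponential moving average (EMA) teacher mentioned in the abstract above, the generic update rule can be sketched as follows. This is the standard EMA weight update, not code from the paper; the function name, the flat-list weight representation, and the momentum value are assumptions for illustration.

```python
# Generic EMA-teacher update, shown on flat lists of floats for simplicity.
# Hypothetical helper; RC-MAE's actual implementation is not reproduced here.

def ema_update(teacher, student, momentum=0.99):
    """Return new teacher weights: momentum * teacher + (1 - momentum) * student."""
    if len(teacher) != len(student):
        raise ValueError("teacher and student must have the same number of weights")
    return [momentum * t + (1.0 - momentum) * s for t, s in zip(teacher, student)]


if __name__ == "__main__":
    teacher = [0.0, 0.0]
    student = [1.0, -2.0]
    # With a fixed student, repeated updates pull the teacher toward it.
    for _ in range(500):
        teacher = ema_update(teacher, student)
    print(teacher)
```

The teacher is never trained by gradient descent in this scheme; it only tracks a smoothed copy of the student's weights.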

arXiv:2405.18027 [pdf, other]

TimeChara: Evaluating Point-in-Time Character Hallucination of Role-Playing Large Language Models

Authors: Jaewoo Ahn, Taehyun Lee, Junyoung Lim, Jin-Hwa Kim, Sangdoo Yun, Hwaran Lee, Gunhee Kim

Abstract: While Large Language Models (LLMs) can serve as agents to simulate human behaviors (i.e., role-playing agents), we emphasize the importance of point-in-time role-playing. This situates characters at specific moments in the narrative progression for three main reasons: (i) enhancing users' narrative immersion, (ii) avoiding spoilers, and (iii) fostering engagement in fandom role-playing. To accurately represent characters at specific time points, agents must avoid character hallucination, where they display knowledge that contradicts their characters' identities and historical timelines. We introduce TimeChara, a new benchmark designed to evaluate point-in-time character hallucination in role-playing LLMs. Comprising 10,895 instances generated through an automated pipeline, this benchmark reveals significant hallucination issues in current state-of-the-art LLMs (e.g., GPT-4o). To counter this challenge, we propose Narrative-Experts, a method that decomposes the reasoning steps and utilizes narrative experts to reduce point-in-time character hallucinations effectively. Still, our findings with TimeChara highlight the ongoing challenges of point-in-time character hallucination, calling for further study.

Submitted 28 May, 2024; originally announced May 2024.

Comments: ACL 2024 Findings. Code and dataset are released at https://ahnjaewoo.github.io/timechara

arXiv:2405.17928 [pdf, other]

Relational Self-supervised Distillation with Compact Descriptors for Image Copy Detection

Authors: Juntae Kim, Sungwon Woo, Jongho Nang

Abstract: This paper addresses image copy detection, a task in online sharing platforms for copyright protection. While previous approaches have performed exceptionally well, the large size of their networks and descriptors remains a significant disadvantage, complicating their practical application. In this paper, we propose a novel method that achieves a competitive performance by using a lightweight network and compact descriptors. By utilizing relational self-supervised distillation to transfer knowledge from a large network to a small network, we enable the training of lightweight networks with a small descriptor size. Our approach, which we call Relational self-supervised Distillation with Compact Descriptors (RDCD), introduces relational self-supervised distillation (RSD) for flexible representation in a smaller feature space and applies contrastive learning with a hard negative (HN) loss to prevent dimensional collapse. We demonstrate the effectiveness of our method using the DISC2021, Copydays, and NDEC benchmark datasets, with which our lightweight network with compact descriptors achieves a competitive performance. For the DISC2021 benchmark, with ResNet-50 and EfficientNet-B0 used as teacher and student respectively, the micro average precision improved by 5.0%/4.9%/5.9% for 64/128/256 descriptor sizes compared to the baseline method.

Submitted 7 June, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

Comments: 12 pages, 8 figures

ACM Class: I.4.0; I.4.10

arXiv:2405.17825 [pdf, other]

Diffusion Model Patching via Mixture-of-Prompts

Authors: Seokil Ham, Sangmin Woo, Jin-Young Kim, Hyojun Go, Byeongjun Park, Changick Kim

Abstract: We present Diffusion Model Patching (DMP), a simple method to boost the performance of pre-trained diffusion models that have already reached convergence, with a negligible increase in parameters. DMP inserts a small, learnable set of prompts into the model's input space while keeping the original model frozen. The effectiveness of DMP is not merely due to the addition of parameters but stems from its dynamic gating mechanism, which selects and combines a subset of learnable prompts at every step of the generative process (e.g., reverse denoising steps). This strategy, which we term "mixture-of-prompts", enables the model to draw on the distinct expertise of each prompt, essentially "patching" the model's functionality at every step with minimal yet specialized parameters. Uniquely, DMP enhances the model by further training on the same dataset on which it was originally trained, even in a scenario where significant improvements are typically not expected due to model convergence. Experiments show that DMP significantly enhances the converged FID of DiT-L/2 on FFHQ 256x256 by 10.38%, achieved with only a 1.43% parameter increase and 50K additional training iterations.

Submitted 30 May, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

Comments: Project page: https://sangminwoo.github.io/DMP/

arXiv:2405.17056 [pdf, other]

Massive twistor worldline in electromagnetic fields

Authors: Joon-Hwi Kim, Jung-Wook Kim, Sangmin Lee

Abstract: We study the (ambi-)twistor model for spinning particles interacting via electromagnetic field, as a toy model for studying classical dynamics of gravitating bodies including effects of both spins to all orders. We compute the momentum kick and spin kick up to one-loop order and show precisely how they are encoded in the classical eikonal. The all-orders-in-spin effects are encoded as a dynamical implementation of the Newman-Janis shift, and we find that the expansion in both spins can be resummed to simple expressions in special kinematic configurations, at least up to one-loop order. We confirm that the classical eikonal can be understood as the generator of canonical transformations that map the in-states of a scattering process to the out-states. We also remark that cut contributions for converting worldline propagators from time-symmetric to retarded amount to the iterated action of the leading eikonal at one-loop order.

Submitted 5 June, 2024; v1 submitted 27 May, 2024; originally announced May 2024.

Comments: 74 pages, 15 figures; (v2) 76 pages, 15 figures; additional references; extended discussions on singularity structures of axial scattering eikonal

arXiv:2405.16997 [pdf, other]

Program Synthesis is $Σ_3^0$-Complete

Authors: Jinwoo Kim

Abstract: This paper considers program synthesis in the context of computational hardness, asking the question: How hard is it to determine whether a given synthesis problem has a solution or not? To answer this question, this paper studies program synthesis for a basic imperative, Turing-complete language IMP, for which this paper proves that program synthesis is $Σ_3^0$-\emph{complete} in the arithmetical hierarchy. The proof of this fact relies on a fully constructive encoding of program synthesis (which is typically formulated as a second-order query) as a first-order formula in the standard model of arithmetic (i.e., Peano arithmetic). Constructing such a formula then allows us to reduce the decision problem for COF (the set of functions which diverge only on a finite set of inputs), which is well-known to be a $Σ_3^0$-complete problem, into the constructed first-order representation of synthesis. In addition to this main result, we also consider the hardness of variants of synthesis problems, such as those introduced in previous work to make program synthesis more tractable (e.g., synthesis over finite examples). To the best of our knowledge, this paper is the first to give a first-order characterization of program synthesis in general, and precisely define the computability of synthesis problems and their variants.

Submitted 27 May, 2024; originally announced May 2024.

arXiv:2405.16155 [pdf, other]

Improving Multi-lingual Alignment Through Soft Contrastive Learning

Authors: Minsu Park, Seyeon Choi, Chanyeol Choi, Jun-Seong Kim, Jy-yong Sohn

Abstract: Making decent multi-lingual sentence representations is critical to achieve high performances in cross-lingual downstream tasks. In this work, we propose a novel method to align multi-lingual embeddings based on the similarity of sentences measured by a pre-trained mono-lingual embedding model. Given translation sentence pairs, we train a multi-lingual model in a way that the similarity between cross-lingual embeddings follows the similarity of sentences measured at the mono-lingual teacher model. Our method can be considered as contrastive learning with soft labels defined as the similarity between sentences. Our experimental results on five languages show that our contrastive loss with soft labels far outperforms conventional contrastive loss with hard labels in various benchmarks for bitext mining tasks and STS tasks. In addition, our method outperforms existing multi-lingual embeddings including LaBSE, for Tatoeba dataset. The code is available at https://github.com/YAI12xLinq-B/IMASCL

Submitted 28 May, 2024; v1 submitted 25 May, 2024; originally announced May 2024.

Comments: 8 pages, 1 figure, Accepted at NAACL SRW 2024
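
The idea of contrastive learning with soft labels, as described in the abstract above, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation (their code is at the URL given in the abstract): it assumes the soft labels are a temperature-scaled softmax over the teacher's similarity rows and that the loss is the cross-entropy against the student's cross-lingual similarity rows. All function names and the temperature value are hypothetical.

```python
import math

# Minimal sketch, assuming soft labels = softmax of teacher similarities.
# Names and temperature are hypothetical; see the paper's repository for
# the actual method.

def log_softmax(row, temperature):
    """Numerically stable log-softmax of a list of similarity scores."""
    scaled = [x / temperature for x in row]
    peak = max(scaled)
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in scaled))
    return [x - log_norm for x in scaled]


def soft_contrastive_loss(student_sim, teacher_sim, temperature=0.1):
    """Mean cross-entropy between teacher soft labels and student predictions.

    student_sim[i][j]: similarity of source sentence i and target sentence j
    under the multi-lingual student; teacher_sim[i][j]: similarity of the
    same sentences under the mono-lingual teacher.
    """
    total = 0.0
    for s_row, t_row in zip(student_sim, teacher_sim):
        targets = [math.exp(v) for v in log_softmax(t_row, temperature)]
        log_preds = log_softmax(s_row, temperature)
        total -= sum(t * lp for t, lp in zip(targets, log_preds))
    return total / len(student_sim)


if __name__ == "__main__":
    teacher = [[1.0, 0.2], [0.2, 1.0]]
    aligned = [[1.0, 0.2], [0.2, 1.0]]      # student agrees with teacher
    misaligned = [[0.2, 1.0], [1.0, 0.2]]   # student ranks pairs the wrong way
    print(soft_contrastive_loss(aligned, teacher))
    print(soft_contrastive_loss(misaligned, teacher))
```

With hard labels the targets would be one-hot on the translation pair; the soft version instead lets near-paraphrases in the batch receive partial credit, which is the distinction the abstract draws.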

arXiv:2405.15591 [pdf, ps, other]

Constraints for electron-capture decays mimicking production of axion-like particles in nuclei

Authors: Aagrah Agnihotri, Jouni Suhonen, Hong Joo Kim

Abstract: We give, for the first time, theoretical estimates of ground-state-to-ground-state (GS-to-GS) electron-capture (EC) branch decay rates of $^{44}$Ti, $^{57}$Co, and $^{139}$Ce. The nuclear-structure calculations have been done exploiting the nuclear shell model (NSM) with well-established Hamiltonians and an advanced theory of $β$ decay. In the absence of experimental measurements of these GS-to-GS branches, these estimates are of utmost importance for terrestrial searches of axion-like particles (ALPs). Predictions are made for EC-decay rates of 2$^{nd}$-forbidden unique (FU) and 2$^{nd}$-forbidden non-unique (FNU) EC transitions that can potentially mimic nuclear axion production in experiments designed to detect ALPs in nuclear environments.

Submitted 24 May, 2024; originally announced May 2024.

arXiv:2405.15386 [pdf, other]

Exploring Baryon Resonances with Transition Generalized Parton Distributions: Status and Perspectives

Authors: Stefan Diehl, Kyungseon Joo, Kirill Semenov-Tian-Shansky, Christian Weiss, Vladimir Braun, Wen-Chen Chang, Pierre Chatagnon, Martha Constantinou, Yuxun Guo, Parada T. P. Hutauruk, Hyon-Suk Jo, Andrey Kim, Jun-Young Kim, Peter Kroll, Shunzo Kumano, Chang-Hwan Lee, Simonetta Liuti, Ronan McNulty, Hyeon-Dong Son, Pawel Sznajder, Ali Usman, Charlotte Van Hulse, Marc Vanderhaeghen, Michael Winn

Abstract: QCD gives rise to a rich spectrum of excited baryon states. Understanding their internal structure is important for many areas of nuclear physics, such as nuclear forces, dense matter, and neutrino-nucleus interactions. Generalized parton distributions (GPDs) are an established tool for characterizing the QCD structure of the ground-state nucleon. They are used to create 3D tomographic images of the quark/gluon structure and quantify the mechanical properties such as the distribution of mass, angular momentum and forces in the system. Transition GPDs extend these concepts to $N \rightarrow N^\ast$ transitions and can be used to characterize the 3D structure and mechanical properties of baryon resonances. They can be probed in high-momentum-transfer exclusive electroproduction processes with resonance transitions $e + N \rightarrow e' + M + N^\ast$, such as deeply-virtual Compton scattering ($M = γ$) or meson production ($M = π, K$, $etc.$), and in related photon/hadron-induced processes. This White Paper describes a research program aiming to explore baryon resonance structure with transition GPDs. This includes the properties and interpretation of the transition GPDs, theoretical methods for structures and processes, first experimental results from JLab 12 GeV, future measurements with existing and planned facilities (JLab detector and energy upgrades, COMPASS/AMBER, EIC, EicC, J-PARC, LHC ultraperipheral collisions), and the theoretical and experimental developments needed to realize this program.

Submitted 24 May, 2024; originally announced May 2024.

Report number: JLAB-THY-24-4071

arXiv:2405.14943 [pdf, other]

SDSS-V Local Volume Mapper (LVM): A Glimpse into Orion

Authors: K. Kreckel, O. V. Egorov, E. Egorova, G. A. Blanc, N. Drory, M. Kounkel, J. E. Mendez-Delgado, C. G. Roman-Zuniga, S. F. Sanchez, G. S. Stringfellow, A. M. Stutz, E. Zari, J. K. Barrera-Ballesteros, D. Bizyaev, J. R. Brownstein, E. Congiu, J. G. Fernandez-Trincado, P. Garcia, L. Hillenbrand, H. J. Ibarra-Medel, Y. **, E. J. Johnston, A. M. Jones, J. Serena Kim, J. A. Kollmeier, et al. (15 additional authors not shown)

Abstract: The Orion Molecular Cloud complex, one of the nearest (D~400 pc) and most extensively studied massive star-forming regions, is ideal for constraining the physics of stellar feedback, but its ~12 deg diameter on the sky requires a dedicated approach to mapping ionized gas structures within and around the nebula. The Sloan Digital Sky Survey (SDSS-V) Local Volume Mapper (LVM) is a new optical integral field unit (IFU) that will map the ionized gas within the Milky Way and Local Group galaxies, covering 4300 deg^2 of the sky with the new LVM Instrument (LVM-I). We showcase optical emission line maps from LVM covering 12 deg^2 inside of the Orion belt region, with 195,000 individual spectra combined to produce images at 0.07 pc (35.3 arcsec) resolution. This is the largest IFU map made (to date) of the Milky Way, and contains well-known nebulae (the Horsehead Nebula, Flame Nebula, IC 434, and IC 432), as well as ionized interfaces with the neighboring dense Orion B molecular cloud. We resolve the ionization structure of each nebula, and map the increase in both the [SII]/Ha and [NII]/Ha line ratios at the outskirts of nebulae and along the ionization front with Orion B. [OIII] line emission is only spatially resolved within the center of the Flame Nebula and IC 434, and our ~0.1 pc scale line ratio diagrams show how variations in these diagnostics are lost as we move from the resolved to the integrated view of each nebula. We detect ionized gas emission associated with the dusty bow wave driven ahead of the star sigma Orionis, where the stellar wind interacts with the ambient interstellar medium. The Horsehead Nebula is seen as a dark occlusion of the bright surrounding photo-dissociation region. This small glimpse into Orion only hints at the rich science that will be enabled by the LVM.

Submitted 23 May, 2024; originally announced May 2024.

Comments: 12 pages, 9 figures, submitted to A&A

arXiv:2405.14732 [pdf, other]

The Data Acquisition System of the LZ Dark Matter Detector: FADR

Authors: J. Aalbers, D. S. Akerib, A. K. Al Musalhi, F. Alder, C. S. Amarasinghe, A. Ames, T. J. Anderson, N. Angelides, H. M. Araújo, J. E. Armstrong, M. Arthurs, A. Baker, S. Balashov, J. Bang, E. E. Barillier, J. W. Bargemann, K. Beattie, T. Benson, A. Bhatti, A. Biekert, T. P. Biesiadzinski, H. J. Birch, E. Bishop, G. M. Blockinger, B. Boxer, et al. (190 additional authors not shown)

Abstract: The Data Acquisition System (DAQ) for the LUX-ZEPLIN (LZ) dark matter detector is described. The signals from 745 PMTs, distributed across three subsystems, are sampled with 100-MHz 32-channel digitizers (DDC-32s). A basic waveform analysis is carried out on the on-board Field Programmable Gate Arrays (FPGAs) to extract information about the observed scintillation and electroluminescence signals. This information is used to determine if the digitized waveforms should be preserved for offline analysis. The system is designed around the Kintex-7 FPGA. In addition to digitizing the PMT signals and providing basic event selection in real time, the flexibility provided by the use of FPGAs allows us to monitor the performance of the detector and the DAQ in parallel to normal data acquisition. The hardware and software/firmware of this FPGA-based Architecture for Data acquisition and Realtime monitoring (FADR) are discussed and performance measurements are described.

Submitted 23 May, 2024; originally announced May 2024.

Comments: 18 pages, 24 figures

arXiv:2405.14624 [pdf, other]

Quantum Simulation of Spin-Boson Models with Structured Bath

Authors: Ke Sun, Mingyu Kang, Hanggai Nuomin, George Schwartz, David N. Beratan, Kenneth R. Brown, Jungsang Kim

Abstract: The spin-boson model, involving spins interacting with a bath of quantum harmonic oscillators, is a widely used representation of open quantum systems. Trapped ions present a natural platform for simulating the quantum dynamics of such models, thanks to the presence of both high quality internal qubit states and the motional modes of the ions that can simulate the relevant quantum degrees of freedom. In our work, we extend the previous body of work that focused on coherent coupling of the spins and bosons to perform quantum simulations with structured dissipative baths using the motional states of trapped ions. We demonstrate the capability for adjusting the bath's temperature and continuous spectral density by adding randomness to fully programmable control parameters. Subsequently, we simulate the dynamics of various spin-boson models with noise spectral densities constructed from coupling to several dissipative harmonic oscillator modes. The experimental outcomes closely align with theoretical predictions, indicating successful simulation of open quantum systems using a trapped-ion system.

Submitted 6 June, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

Comments: 11 pages, 7 figures

arXiv:2405.14515 [pdf, other]

Visuo-Tactile Keypoint Correspondences for Object Manipulation

Authors: Jeong-Jung Kim, Doo-Yeol Koh, Chang-Hyun Kim

Abstract: This paper presents a novel manipulation strategy that uses keypoint correspondences extracted from visuo-tactile sensor images to facilitate precise object manipulation. Our approach uses the visuo-tactile feedback to guide the robot's actions for accurate object grasping and placement, eliminating the need for post-grasp adjustments and extensive training. This method provides an improvement in deployment efficiency, addressing the challenges of manipulation tasks in environments where object locations are not predefined. We validate the effectiveness of our strategy through experiments demonstrating the extraction of keypoint correspondences and their application to real-world tasks such as block alignment and gear insertion, which require millimeter-level precision. The results show an average error margin significantly lower than that of traditional vision-based methods, which is sufficient to achieve the target tasks.

Submitted 23 May, 2024; originally announced May 2024.

arXiv:2405.14283 [pdf, ps, other]

Diffusion-based Quantum Error Mitigation using Stochastic Differential Equation

Authors: Joo Yong Shim, Joongheon Kim

Abstract: Unlike closed systems, where the total energy and information are conserved within the system, open systems interact with the external environment, which often leads to complex behaviors not seen in closed systems. The random fluctuations that arise due to the interaction with the external environment cause noise affecting the states of the quantum system, resulting in system errors. To effectively address quantum errors in open quantum systems, this paper introduces a novel approach to mitigate errors using diffusion models. This approach is realized by formulating the noise that occurs during state evolution as forward-backward stochastic differential equations (FBSDE) and adapting the score-based generative model (SGM) to denoise errors in quantum states.

Submitted 23 May, 2024; originally announced May 2024.

arXiv:2405.14126 [pdf, other]

The Disappearance of Timestep Embedding in Modern Time-Dependent Neural Networks

Authors: Bum Jun Kim, Yoshinobu Kawahara, Sang Woo Kim

Abstract: Dynamical systems are often time-varying, and modeling them requires a function that evolves with respect to time. Recent studies such as the neural ordinary differential equation proposed a time-dependent neural network, which provides a neural network varying with respect to time. However, we claim that the architectural choice to build a time-dependent neural network significantly affects its time-awareness but still lacks sufficient validation in its current state. In this study, we conduct an in-depth analysis of the architecture of modern time-dependent neural networks. Here, we report a vulnerability of vanishing timestep embedding, which disables the time-awareness of a time-dependent neural network. Furthermore, we find that this vulnerability can also be observed in diffusion models because they employ a similar architecture that incorporates timestep embedding to discriminate between different timesteps during a diffusion process. Our analysis provides a detailed description of this phenomenon as well as several solutions to address the root cause. Through experiments on neural ordinary differential equations and diffusion models, we observed that keeping time-awareness alive via the proposed solutions boosted their performance, which implies that their current implementations lack sufficient time-dependency.