Search | arXiv e-print repository

Pointwise estimates of the Bergman kernel with an exponential weight on the unit ball

Abstract: We consider the weighted Bergman space $A^2_ψ(\Bn)$ of all holomorphic functions on $\Bn$ square integrable with respect to a particular exponential weight measure $e^{-ψ} dV$ on $\Bn$, where \begin{align*} ψ(z):=\frac{1}{1-|z|^2}. \end{align*} We prove the following estimate for the Bergman kernel $K_ψ(z,w)$ of $A^2_ψ(\Bn)$: \begin{align*} |K_ψ(z,w)|^2\le C\frac{e^{ψ(z)+ψ(w)}}{{\rm Vol}(B_ψ(z,1)){\rm Vol}(B_ψ(w, 1))}e^{-\varepsilon d_ψ(z,w)}, \quad z, w\in\Bn, \end{align*} where $d_ψ$ is the Riemannian distance induced by the potential function $ψ$ and $B_ψ(z,1)$ is the $d_ψ$-ball of center $z$ and radius $1$. The result is motivated by Christ \cite{Chr}.

Submitted 1 July, 2024; originally announced July 2024.

arXiv:2407.00965 [pdf, other]

Measurement of the integrated luminosity of data samples collected during 2019-2022 by the Belle II experiment

Authors: The Belle II Collaboration, I. Adachi, L. Aggarwal, H. Ahmed, J. K. Ahn, H. Aihara, N. Akopov, A. Aloisio, N. Althubiti, N. Anh Ky, D. M. Asner, H. Atmacan, T. Aushev, V. Aushev, M. Aversano, R. Ayad, V. Babu, H. Bae, S. Bahinipati, P. Bambade, Sw. Banerjee, M. Barrett, J. Baudot, A. Baur, A. Beaubien, et al. (382 additional authors not shown)

Abstract: A series of data samples was collected with the Belle II detector at the SuperKEKB collider from March 2019 to June 2022. We determine the integrated luminosities of these data samples using three distinct methodologies involving Bhabha ($e^+e^- \to e^+e^-(nγ)$), digamma ($e^+e^- \to γγ(nγ)$), and dimuon ($e^+e^- \to μ^+ μ^- (nγ)$) events. The total integrated luminosity obtained with Bhabha, digamma, and dimuon events is (426.52 $\pm$ 0.03 $\pm$ 2.48)~fb$^{-1}$, (427.32 $\pm$ 0.03 $\pm$ 2.56)~fb$^{-1}$, and (424.84 $\pm$ 0.04 $\pm$ 3.88)~fb$^{-1}$, where the first uncertainties are statistical and the second are systematic. The resulting total integrated luminosity obtained from the combination of the three methods is (426.88 $\pm$ 1.93)~fb$^{-1}$.

Submitted 1 July, 2024; originally announced July 2024.

Comments: 12 pages, 3 figures

Report number: Belle II Preprint 2024-019; KEK Preprint 2024-16

arXiv:2407.00879 [pdf, ps, other]

Study of $χ_{bJ}(2P)\toωΥ(1S)$ at Belle

Authors: Z. S. Stottler, T. K. Pedlar, B. G. Fulsom, I. Adachi, K. Adamczyk, H. Aihara, S. Al Said, D. M. Asner, H. Atmacan, T. Aushev, R. Ayad, V. Babu, Sw. Banerjee, M. Bauer, P. Behera, K. Belous, J. Bennett, F. Bernlochner, M. Bessner, T. Bilka, D. Biswas, A. Bobrov, D. Bodrov, G. Bonvicini, J. Borah, et al. (159 additional authors not shown)

Abstract: We report a study of the hadronic transitions $χ_{bJ}(2P)\toωΥ(1S)$, with $ω\toπ^{+}π^{-}π^{0}$, using $28.2\times10^6~Υ(3S)$ mesons recorded by the Belle detector. We present the first evidence for the near--threshold transition $χ_{b0}(2P)\toωΥ(1S)$, the analog of the charm sector decay $χ_{c1}(3872)\toωJ/ψ$, with a branching fraction of $\mathcal{B}\big(χ_{b0}(2P)\toωΥ(1S)\big) = \big(0.55\pm0.19\pm0.07\big)\%$. We also obtain branching fractions of $\mathcal{B}\big(χ_{b1}(2P)\toωΥ(1S)\big) = \big(2.39{}^{+0.20}_{-0.19}\pm0.24\big)\%$ and $\mathcal{B}\big(χ_{b2}(2P)\toωΥ(1S)\big) = \big(0.47{}^{+0.13}_{-0.12}\pm0.06\big)\%$, confirming the measurement of the $ω$ transitions of the $J=1,2~P$--wave states. The ratio for the $J=2$ to $J=1$ transitions is also measured and found to differ by 3.3 standard deviations from the expected value in the QCD multipole expansion.

Submitted 30 June, 2024; originally announced July 2024.

Comments: 6 pages, 2 figures

Report number: Belle Preprint: 2024-05; KEK Preprint: 2024-10

arXiv:2406.19502 [pdf, other]

Investigating How Large Language Models Leverage Internal Knowledge to Perform Complex Reasoning

Authors: Miyoung Ko, Sue Hyun Park, Joonsuk Park, Minjoon Seo

Abstract: Despite significant advancements, there is a limited understanding of how large language models (LLMs) utilize knowledge for reasoning. To address this, we propose a method that deconstructs complex real-world questions into a graph, representing each question as a node with parent nodes of background knowledge needed to solve the question. We develop the DepthQA dataset, deconstructing questions into three depths: (i) recalling conceptual knowledge, (ii) applying procedural knowledge, and (iii) analyzing strategic knowledge. Based on a hierarchical graph, we quantify forward discrepancy, discrepancies in LLMs' performance on simpler sub-problems versus complex questions. We also measure backward discrepancy, where LLMs answer complex questions but struggle with simpler ones. Our analysis shows that smaller models have more discrepancies than larger models. Additionally, guiding models from simpler to complex questions through multi-turn interactions improves performance across model sizes, highlighting the importance of structured intermediate steps in knowledge reasoning. This work enhances our understanding of LLM reasoning and suggests ways to improve their problem-solving abilities.

Submitted 27 June, 2024; originally announced June 2024.

Comments: Work in progress; code is available at https://github.com/kaistAI/knowledge-reasoning

arXiv:2406.18904 [pdf, ps, other]

Finite size scaling of the Kuramoto model at criticality

Authors: Su-Chan Park, Hyunggyu Park

Abstract: The asymptotic scaling behavior of the Kuramoto model with finite populations has been notably elusive, despite comprehensive investigations employing both analytical and numerical methods. In this study, we explore the Kuramoto model with "deterministic" sampling of natural frequencies, employing extensive numerical simulations and report the asymptotic values of the finite-size scaling (FSS) exponents, which deviate significantly from the previously reported values in the literature. Additionally, we observe that these exponents are sensitive to the specifics of the sampling method. We discuss the origins of this variability through the self-consistent theory of the entrained oscillators.

Submitted 27 June, 2024; originally announced June 2024.

Comments: 13 pages, 8 figures, 1 table

arXiv:2406.17145 [pdf, other]

GraphPipe: Improving Performance and Scalability of DNN Training with Graph Pipeline Parallelism

Authors: Byungsoo Jeon, Mengdi Wu, Shiyi Cao, Sunghyun Kim, Sunghyun Park, Neeraj Aggarwal, Colin Unger, Daiyaan Arfeen, Peiyuan Liao, Xupeng Miao, Mohammad Alizadeh, Gregory R. Ganger, Tianqi Chen, Zhihao Jia

Abstract: Deep neural networks (DNNs) continue to grow rapidly in size, making them infeasible to train on a single device. Pipeline parallelism is commonly used in existing DNN systems to support large-scale DNN training by partitioning a DNN into multiple stages, which concurrently perform DNN training for different micro-batches in a pipeline fashion. However, existing pipeline-parallel approaches only consider sequential pipeline stages and thus ignore the topology of a DNN, resulting in missed model-parallel opportunities. This paper presents graph pipeline parallelism (GPP), a new pipeline-parallel scheme that partitions a DNN into pipeline stages whose dependencies are identified by a directed acyclic graph. GPP generalizes existing sequential pipeline parallelism and preserves the inherent topology of a DNN to enable concurrent execution of computationally-independent operators, resulting in reduced memory requirement and improved GPU performance. In addition, we develop GraphPipe, a distributed system that exploits GPP strategies to enable performant and scalable DNN training. GraphPipe partitions a DNN into a graph of stages, optimizes micro-batch schedules for these stages, and parallelizes DNN training using the discovered GPP strategies. Evaluation on a variety of DNNs shows that GraphPipe outperforms existing pipeline-parallel systems such as PipeDream and Piper by up to 1.6X. GraphPipe also reduces the search time by 9-21X compared to PipeDream and Piper.

Submitted 24 June, 2024; originally announced June 2024.

arXiv:2406.16994 [pdf, other]

Quantum Multi-Agent Reinforcement Learning for Cooperative Mobile Access in Space-Air-Ground Integrated Networks

Authors: Gyu Seon Kim, Yeryeong Cho, Jaehyun Chung, Soohyun Park, Soyi Jung, Zhu Han, Joongheon Kim

Abstract: Achieving global space-air-ground integrated network (SAGIN) access only with CubeSats presents significant challenges such as the access sustainability limitations in specific regions (e.g., polar regions) and the energy efficiency limitations in CubeSats. To tackle these problems, high-altitude long-endurance unmanned aerial vehicles (HALE-UAVs) can complement these CubeSat shortcomings for providing cooperatively global access sustainability and energy efficiency. However, as the number of CubeSats and HALE-UAVs increases, the scheduling dimension of each ground station (GS) increases. As a result, each GS can fall into the curse of dimensionality, and this challenge becomes one major hurdle for efficient global access. Therefore, this paper provides a quantum multi-agent reinforcement learning (QMARL)-based method for scheduling between GSs and CubeSats/HALE-UAVs in order to improve global access availability and energy efficiency. The main reason why the QMARL-based scheduler can be beneficial is that the algorithm facilitates a logarithmic-scale reduction in scheduling action dimensions, which is one critical feature as the number of CubeSats and HALE-UAVs expands. Additionally, individual GSs have different traffic demands depending on their locations and characteristics, thus it is essential to provide differentiated access services. The superiority of the proposed scheduler is validated through data-intensive experiments in realistic CubeSat/HALE-UAV settings.

Submitted 24 June, 2024; originally announced June 2024.

Comments: 17 pages, 22 figures

arXiv:2406.16771 [pdf, other]

An antiferromagnetic diode effect in even-layered MnBi2Te4

Authors: Anyuan Gao, Shao-Wen Chen, Barun Ghosh, Jian-Xiang Qiu, Yu-Fei Liu, Yugo Onishi, Chaowei Hu, Tiema Qian, Damien Bérubé, Thao Dinh, Houchen Li, Christian Tzschaschel, Seunghyun Park, Tianye Huang, Shang-Wei Lien, Zhe Sun, Sheng-Chin Ho, Bahadur Singh, Kenji Watanabe, Takashi Taniguchi, David C. Bell, Arun Bansil, Hsin Lin, Tay-Rong Chang, Amir Yacoby, et al. (4 additional authors not shown)

Abstract: In a PN junction, the separation between positive and negative charges leads to diode transport. In the past few years, the intrinsic diode transport in noncentrosymmetric polar conductors has attracted great interest, because it suggests novel nonlinear applications and provides a symmetry-sensitive probe of Fermi surface. Recently, such studies have been extended to noncentrosymmetric superconductors, realizing the superconducting diode effect. Here, we show that, even in a centrosymmetric crystal without directional charge separation, the spins of an antiferromagnet (AFM) can generate a spatial directionality, leading to an AFM diode effect. We observe large second-harmonic transport in a nonlinear electronic device enabled by the compensated AFM state of even-layered MnBi2Te4. We also report a novel electrical sum-frequency generation (SFG), which has been rarely explored in contrast to the well-known optical SFG in wide-gap insulators. We demonstrate that the AFM enables an in-plane field-effect transistor and harvesting of wireless electromagnetic energy. The electrical SFG establishes a powerful method to study nonlinear electronics built by quantum materials. The AFM diode effect paves the way for potential device concepts including AFM logic circuits, self-powered AFM spintronics, and other applications that potentially bridge nonlinear electronics with AFM spintronics.

Submitted 24 June, 2024; originally announced June 2024.

Comments: 33+8 pages, 14+2 figures

arXiv:2406.16147 [pdf, ps, other]

Time transformation between the solar system barycenter and the surfaces of the Earth and Moon

Authors: Slava G. Turyshev, James G. Williams, Dale H. Boggs, Ryan S. Park

Abstract: The transformation of time between the surface of the Earth, the solar system barycenter, and the surface of the Moon involves relativistic corrections. For solar system Barycentric Dynamical Time (TDB), we also require that there be no rate difference between Terrestrial Time (TT) and TDB. The IAU has addressed these transformations with several resolutions. A series of robotic and crewed landings on the Moon are planned. The analogous transformation between TDB and time on the surface of the Moon (TL) needs a review and discussion. In this paper, we compute the rate terms involved in that transformation. We also present the TDB-compatible spatial scale and Lorentz contraction of Moon-centered positional coordinates. These transformations have been implemented in the JPL programs used to generate ephemerides of the Moon and planets. Finally, we provide expressions that can be used to synchronize TT and TL using either TDB or TT. The relevant transformations contain a small secular drift between the two time scales, along with additional small periodic terms that can be numerically evaluated using the solar system ephemerides.

Submitted 23 June, 2024; originally announced June 2024.

Comments: 15 pages, 3 tables

arXiv:2406.15965 [pdf, other]

Search for charmed baryons in the $Λ_c^+η$ system and measurement of the branching fractions of $Λ_c(2880)^+$ and $Λ_c(2940)^+$ decaying to $Λ_c^+η$ and $pD^0$ relative to $Σ_c(2455)π$

Authors: Belle Collaboration, S. X. Li, C. P. Shen, I. Adachi, J. K. Ahn, H. Aihara, D. M. Asner, H. Atmacan, T. Aushev, R. Ayad, Sw. Banerjee, K. Belous, J. Bennett, M. Bessner, T. Bilka, D. Biswas, D. Bodrov, A. Bozek, M. Bračko, P. Branchini, T. E. Browder, A. Budano, M. Campajola, M. -C. Chang, B. G. Cheon, et al. (102 additional authors not shown)

Abstract: We search for excited charmed baryons in the $Λ_c^+η$ system using a data sample corresponding to an integrated luminosity of 980 $\rm fb^{-1}$. The data were collected by the Belle detector at the KEKB $e^{+}$$e^{-}$ asymmetric-energy collider. No significant signals are found in the $Λ_c^+η$ mass spectrum, including the known $Λ_c(2880)^+$ and $Λ_c(2940)^+$. Clear $Λ_c(2880)^+$ and $Λ_c(2940)^+$ signals are observed in the $pD^0$ mass spectrum. We set upper limits at 90\% credibility level on ratios of branching fractions of $Λ_c(2880)^+$ and $Λ_c(2940)^+$ decaying to $Λ_c^+η$ relative to $Σ_c(2455)π$ of $<0.13$ for the $Λ_c(2880)^+$ and $<1.11$ for the $Λ_c(2940)^+$. We measure ratios of branching fractions of $Λ_c(2880)^+$ and $Λ_c(2940)^+$ decaying to $pD^0$ relative to $Σ_c(2455)π$ of $0.75 \pm 0.03(\text{stat.}) \pm 0.07(\text{syst.})$ for the $Λ_c(2880)^+$ and $3.59 \pm 0.21(\text{stat.}) \pm 0.56(\text{syst.})$ for the $Λ_c(2940)^+$.

Submitted 22 June, 2024; originally announced June 2024.

Comments: 10 pages, 4 figures

Report number: Belle Preprint: 2024-06; KEK Preprint: 2024-15

arXiv:2406.15819 [pdf, other]

Automatic AI Model Selection for Wireless Systems: Online Learning via Digital Twinning

Authors: Qiushuo Hou, Matteo Zecchin, Sangwoo Park, Yunlong Cai, Guanding Yu, Kaushik Chowdhury, Osvaldo Simeone

Abstract: In modern wireless network architectures, such as O-RAN, artificial intelligence (AI)-based applications are deployed at intelligent controllers to carry out functionalities like scheduling or power control. The AI "apps" are selected on the basis of contextual information such as network conditions, topology, traffic statistics, and design goals. The mapping between context and AI model parameters is ideally done in a zero-shot fashion via an automatic model selection (AMS) mapping that leverages only contextual information without requiring any current data. This paper introduces a general methodology for the online optimization of AMS mappings. Optimizing an AMS mapping is challenging, as it requires exposure to data collected from many different contexts. Therefore, if carried out online, this initial optimization phase would be extremely time consuming. A possible solution is to leverage a digital twin of the physical system to generate synthetic data from multiple simulated contexts. However, given that the simulator at the digital twin is imperfect, a direct use of simulated data for the optimization of the AMS mapping would yield poor performance when tested in the real system. This paper proposes a novel method for the online optimization of AMS mapping that corrects for the bias of the simulator by means of limited real data collected from the physical system. Experimental results for a graph neural network-based power control app demonstrate the significant advantages of the proposed approach.

Submitted 22 June, 2024; originally announced June 2024.

Comments: submitted for a journal publication

arXiv:2406.15635 [pdf, other]

DataFreeShield: Defending Adversarial Attacks without Training Data

Authors: Hyeyoon Lee, Kanghyun Choi, Dain Kwon, Sunjong Park, Mayoore Selvarasa Jaiswal, Noseong Park, Jonghyun Choi, Jinho Lee

Abstract: Recent advances in adversarial robustness rely on an abundant set of training data, where using external or additional datasets has become a common setting. However, in real life, the training data is often kept private for security and privacy issues, while only the pretrained weight is available to the public. In such scenarios, existing methods that assume accessibility to the original data become inapplicable. Thus we investigate the pivotal problem of data-free adversarial robustness, where we try to achieve adversarial robustness without accessing any real data. Through a preliminary study, we highlight the severity of the problem by showing that robustness without the original dataset is difficult to achieve, even with similar domain datasets. To address this issue, we propose DataFreeShield, which tackles the problem from two perspectives: surrogate dataset generation and adversarial training using the generated data. Through extensive validation, we show that DataFreeShield outperforms baselines, demonstrating that the proposed method sets the first entirely data-free solution for the adversarial robustness problem.

Submitted 21 June, 2024; originally announced June 2024.

Comments: ICML 2024

arXiv:2406.13682 [pdf, ps, other]

A variational perspective on the dissipative Hamiltonian structure of the Vlasov-Fokker-Planck equation

Authors: Sangmin Park

Abstract: The Vlasov-Fokker-Planck equation describes the evolution of the probability density of the position and velocity of particles under the influence of external confinement, interaction, friction, and stochastic force. It is well-known that this equation can be formally seen as a dissipative Hamiltonian system in the Wasserstein space of probability measures. In order to better understand this geometric formalism, we introduce a time-discrete variational scheme, solutions of which converge to the solution of the Vlasov-Fokker-Planck equation as time-step vanishes. The implicit scheme is based on the symplectic Euler scheme, and updates the probability density at each iteration first in the velocity variable then in the position variable. The algorithm leverages the geometric structure of the Wasserstein space, and has several desirable properties. Energy functionals involved in each variational problem are geodesically-convex, which implies the unique solvability of the problem. Furthermore, the correct dissipation of the Hamiltonian is observed at the discrete level up to higher order errors.

Submitted 19 June, 2024; originally announced June 2024.

Comments: 51 pages

MSC Class: 35Q84; 49Q22; 46E27; 35A15

arXiv:2406.13023 [pdf]

Stackelberg Games with $k$-Submodular Function under Distributional Risk-Receptiveness and Robustness

Authors: Seonghun Park, Manish Bansal

Abstract: We study submodular optimization in adversarial context, applicable to machine learning problems such as feature selection using data susceptible to uncertainties and attacks. We focus on Stackelberg games between an attacker (or interdictor) and a defender where the attacker aims to minimize the defender's objective of maximizing a $k$-submodular function. We allow uncertainties arising from the success of attacks and inherent data noise, and address challenges due to incomplete knowledge of the probability distribution of random parameters. Specifically, we introduce Distributionally Risk-Averse $k$-Submodular Interdiction Problem (DRA $k$-SIP) and Distributionally Risk-Receptive $k$-Submodular Interdiction Problem (DRR $k$-SIP) along with finitely convergent exact algorithms for solving them. The DRA $k$-SIP solution allows risk-averse interdictor to develop robust strategies for real-world uncertainties. Conversely, DRR $k$-SIP solution suggests aggressive tactics for attackers, willing to embrace (distributional) risk to inflict maximum damage, identifying critical vulnerable components, which can be used for the defender's defensive strategies. The optimal values derived from both DRA $k$-SIP and DRR $k$-SIP offer a confidence interval-like range for the expected value of the defender's objective function, capturing distributional ambiguity. We conduct computational experiments using instances of feature selection and sensor placement problems, and Wisconsin breast cancer data and synthetic data, respectively.

Submitted 28 June, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

arXiv:2406.12607 [pdf, other]

Explanations for the two-component spectral energy distributions of gravitationally lensed stars at high redshifts

Authors: Armin Nabizadeh, Erik Zackrisson, Emma Lundqvist, Massimo Ricotti, Seyong Park, Brian Welch, Jose M. Diego

Abstract: Observations of gravitationally lensed, high-mass stars at redshifts $\gtrsim1$ occasionally reveal spectral energy distributions that contain two components with different effective temperatures. Given that two separate stars are involved, it suggests that both stars have simultaneously reached very high magnification, as expected for two stars in a binary system close to the caustic curve of the foreground galaxy-cluster lens. The inferred effective temperatures and luminosities of these stars are, however, difficult to reconcile with known binaries, or even with isolated stars of the same age. Here, we explore three alternative explanations for these cases: circumstellar dust around the cooler of the two stars; age differences of a few Myr among stars in the same star cluster, and a scenario in which the stars originate in two separate star clusters of different age along the lensing caustic. While all of these scenarios are deemed plausible in principle, dust solutions would require more circumstellar extinction than seen in local observations of the relevant super/hypergiant stars. Hence, we argue that age differences between the two stars are the most likely scenario, given the current data.

Submitted 18 June, 2024; originally announced June 2024.

Comments: 7 pages, 7 figures, 1 table, submitted to A&A

arXiv:2406.12233 [pdf, other]

SyncVSR: Data-Efficient Visual Speech Recognition with End-to-End Crossmodal Audio Token Synchronization

Authors: Young Jin Ahn, Jungwoo Park, Sangha Park, Jonghyun Choi, Kee-Eung Kim

Abstract: Visual Speech Recognition (VSR) stands at the intersection of computer vision and speech recognition, aiming to interpret spoken content from visual cues. A prominent challenge in VSR is the presence of homophenes, visually similar lip gestures that represent different phonemes. Prior approaches have sought to distinguish fine-grained visemes by aligning visual and auditory semantics, but often fell short of full synchronization. To address this, we present SyncVSR, an end-to-end learning framework that leverages quantized audio for frame-level crossmodal supervision. By integrating a projection layer that synchronizes visual representation with acoustic data, our encoder learns to generate discrete audio tokens from a video sequence in a non-autoregressive manner. SyncVSR shows versatility across tasks, languages, and modalities at the cost of a forward pass. Our empirical evaluations show that it not only achieves state-of-the-art results but also reduces data usage by up to ninefold.

Submitted 17 June, 2024; originally announced June 2024.

arXiv:2406.11608 [pdf, other]

Learning Hierarchical Semantic Classification by Grounding on Consistent Image Segmentations

Authors: Seulki Park, Youren Zhang, Stella X. Yu, Sara Beery, Jonathan Huang

Abstract: Hierarchical semantic classification requires the prediction of a taxonomy tree instead of a single flat level of the tree, where both accuracies at individual levels and consistency across levels matter. We can train classifiers for individual levels, which has accuracy but not consistency, or we can train only the finest level classification and infer higher levels, which has consistency but not accuracy. Our key insight is that hierarchical recognition should not be treated as multi-task classification, as each level is essentially a different task and they would have to compromise with each other, but be grounded on image segmentations that are consistent across semantic granularities. Consistency can in fact improve accuracy. We build upon recent work on learning hierarchical segmentation for flat-level recognition, and extend it to hierarchical recognition. It naturally captures the intuition that fine-grained recognition requires fine image segmentation whereas coarse-grained recognition requires coarse segmentation; they can all be integrated into one recognition model that drives fine-to-coarse internal visual parsing. Additionally, we introduce a Tree-path KL Divergence loss to enforce consistent accurate predictions across levels. Our extensive experimentation and analysis demonstrate our significant gains on predicting an accurate and consistent taxonomy tree.

Submitted 17 June, 2024; originally announced June 2024.

Comments: 34 pages

arXiv:2406.11260 [pdf, other]

Adversarial Style Augmentation via Large Language Model for Robust Fake News Detection

Authors: Sungwon Park, Sungwon Han, Meeyoung Cha

Abstract: The spread of fake news negatively impacts individuals and is regarded as a significant social challenge that needs to be addressed. A number of algorithmic and insightful features have been identified for detecting fake news. However, with the recent LLMs and their advanced generation capabilities, many of the detectable features (e.g., style-conversion attacks) can be altered, making it more challenging to distinguish from real news. This study proposes adversarial style augmentation, AdStyle, to train a fake news detector that remains robust against various style-conversion attacks. Our model's key mechanism is the careful use of LLMs to automatically generate a diverse yet coherent range of style-conversion attack prompts. This improves the generation of prompts that are particularly difficult for the detector to handle. Experiments show that our augmentation strategy improves robustness and detection performance when tested on fake news benchmark datasets.

Submitted 17 June, 2024; originally announced June 2024.

Comments: 8 pages

arXiv:2406.09799 [pdf, other]

GeoSEE: Regional Socio-Economic Estimation With a Large Language Model

Authors: Sungwon Han, Donghyun Ahn, Seungeon Lee, Minhyuk Song, Sungwon Park, Sangyoon Park, Jihee Kim, Meeyoung Cha

Abstract: Moving beyond traditional surveys, combining heterogeneous data sources with AI-driven inference models brings new opportunities to measure socio-economic conditions, such as poverty and population, over expansive geographic areas. The current research presents GeoSEE, a method that can estimate various socio-economic indicators using a unified pipeline powered by a large language model (LLM). Presented with a diverse set of information modules, including those pre-constructed from satellite imagery, GeoSEE selects which modules to use in estimation, for each indicator and country. This selection is guided by the LLM's prior socio-geographic knowledge, which functions similarly to the insights of a domain expert. The system then computes target indicators via in-context learning after aggregating results from selected modules in the format of natural language-based texts. Comprehensive evaluation across countries at various stages of development reveals that our method outperforms other predictive models in both unsupervised and low-shot contexts. This reliable performance under data-scarce setting in under-developed or developing countries, combined with its cost-effectiveness, underscores its potential to continuously support and monitor the progress of Sustainable Development Goals, such as poverty alleviation and equitable growth, on a global scale.

Submitted 14 June, 2024; originally announced June 2024.

arXiv:2406.09698 [pdf, other]

Projected background and sensitivity of AMoRE-II

Authors: A. Agrawal, V. V. Alenkov, P. Aryal, J. Beyer, B. Bhandari, R. S. Boiko, K. Boonin, O. Buzanov, C. R. Byeon, N. Chanthima, M. K. Cheoun, J. S. Choe, Seonho Choi, S. Choudhury, J. S. Chung, F. A. Danevich, M. Djamal, D. Drung, C. Enss, A. Fleischmann, A. M. Gangapshev, L. Gastaldo, Y. M. Gavrilyuk, A. M. Gezhaev, O. Gileva , et al. (81 additional authors not shown)

Abstract: AMoRE-II aims to search for neutrinoless double beta decay with an array of 423 Li$_2$$^{100}$MoO$_4$ crystals operating in the cryogenic system as the main phase of the Advanced Molybdenum-based Rare process Experiment (AMoRE). AMoRE has been planned to operate in three phases: AMoRE-pilot, AMoRE-I, and AMoRE-II. AMoRE-II is currently being installed at the Yemi Underground Laboratory, located approximately 1000 meters deep in Jeongseon, Korea. The goal of AMoRE-II is to reach up to $T^{0νββ}_{1/2}$ $\sim$ 6 $\times$ 10$^{26}$ years, corresponding to an effective Majorana mass of 15 - 29 meV, covering all the inverted mass hierarchy regions. To achieve this, the background level of the experimental configurations and possible background sources of gamma and beta events should be well understood. We have intensively performed Monte Carlo simulations using the GEANT4 toolkit in all the experimental configurations with potential sources. We report the estimated background level that meets the 10$^{-4}$ counts/(keV$\cdot$kg$\cdot$yr) requirement for AMoRE-II in the region of interest (ROI) and show the projected half-life sensitivity based on the simulation study.

Submitted 13 June, 2024; originally announced June 2024.

arXiv:2406.09329 [pdf, other]

Is Value Learning Really the Main Bottleneck in Offline RL?

Authors: Seohong Park, Kevin Frans, Sergey Levine, Aviral Kumar

Abstract: While imitation learning requires access to high-quality data, offline reinforcement learning (RL) should, in principle, perform similarly or better with substantially lower data quality by using a value function. However, current results indicate that offline RL often performs worse than imitation learning, and it is often unclear what holds back the performance of offline RL. Motivated by this observation, we aim to understand the bottlenecks in current offline RL algorithms. While poor performance of offline RL is typically attributed to an imperfect value function, we ask: is the main bottleneck of offline RL indeed in learning the value function, or something else? To answer this question, we perform a systematic empirical study of (1) value learning, (2) policy extraction, and (3) policy generalization in offline RL problems, analyzing how these components affect performance. We make two surprising observations. First, we find that the choice of a policy extraction algorithm significantly affects the performance and scalability of offline RL, often more so than the value learning objective. For instance, we show that common value-weighted behavioral cloning objectives (e.g., AWR) do not fully leverage the learned value function, and switching to behavior-constrained policy gradient objectives (e.g., DDPG+BC) often leads to substantial improvements in performance and scalability. Second, we find that a big barrier to improving offline RL performance is often imperfect policy generalization on test-time states out of the support of the training data, rather than policy learning on in-distribution states. We then show that the use of suboptimal but high-coverage data or test-time policy training techniques can address this generalization issue in practice. Specifically, we propose two simple test-time policy improvement methods and show that these methods lead to better performance.

Submitted 13 June, 2024; originally announced June 2024.

arXiv:2406.09106 [pdf, other]

Selecting Alternative Metals for Advanced Interconnects

Authors: Jean-Philippe Soulié, Kiroubanand Sankaran, Benoit Van Troeye, Alicja Leśniewska, Olalla Varela Pedreira, Herman Oprins, Gilles Delie, Claudia Fleischmann, Lizzie Boakes, Cédric Rolin, Lars-Åke Ragnarsson, Kristof Croes, Seongho Park, Johan Swerts, Geoffrey Pourtois, Zsolt Tőkei, Christoph Adelmann

Abstract: Today, interconnect resistance and reliability are key limiters for the performance of advanced CMOS circuits. As transistor scaling is slowing, interconnect scaling has become the main driver for circuit miniaturization, and interconnect limitations are expected to become even more stringent in future CMOS technology nodes. Current Cu dual-damascene metallization is also becoming increasingly challenging as critical interconnect dimensions approach 10 nm, alternative metallization schemes are researched with increasing intensity for about a decade. The selection of alternative metals is a highly multifaceted task and includes many criteria, covering the resistivity at reduced dimension, reliability and thermal aspects, as well as a sustainability perspective. In this tutorial, we introduce the basic criteria for alternative metal benchmarking and selection, and discuss the current state of the art of the field. The tutorial covers materials close to manufacturing introduction, materials under actual research, as well as future directions for fundamental research. While first alternatives to Cu metallization in commercial CMOS devices have become a reality recently, research for the ultimate interconnect metal is ongoing.

Submitted 13 June, 2024; originally announced June 2024.

Comments: 72 pages, 27 figures

arXiv:2406.08301 [pdf, other]

Jet modification via $π^0$-hadron correlations in Au$+$Au collisions at $\sqrt{s_{_{NN}}}=200$ GeV

Authors: PHENIX Collaboration, N. J. Abdulameer, U. Acharya, A. Adare, S. Afanasiev, C. Aidala, N. N. Ajitanand, Y. Akiba, H. Al-Bataineh, J. Alexander, M. Alfred, K. Aoki, N. Apadula, L. Aphecetche, J. Asai, H. Asano, E. T. Atomssa, R. Averbeck, T. C. Awes, B. Azmoun, V. Babintsev, M. Bai, G. Baksay, L. Baksay, A. Baldisseri , et al. (510 additional authors not shown)

Abstract: High-momentum two-particle correlations are a useful tool for studying jet-quenching effects in the quark-gluon plasma. Angular correlations between neutral-pion triggers and charged hadrons with transverse momenta in the range 4--12~GeV/$c$ and 0.5--7~GeV/$c$, respectively, have been measured by the PHENIX experiment in 2014 for Au$+$Au collisions at $\sqrt{s_{_{NN}}}=200$~GeV. Suppression is observed in the yield of high-momentum jet fragments opposite the trigger particle, which indicates jet suppression stemming from in-medium partonic energy loss, while enhancement is observed for low-momentum particles. The ratio and differences between the yield in Au$+$Au collisions and $p$$+$$p$ collisions, $I_{AA}$ and $Δ_{AA}$, as a function of the trigger-hadron azimuthal separation, $Δφ$, are measured for the first time at the Relativistic Heavy Ion Collider. These results better quantify how the yield of low-$p_T$ associated hadrons is enhanced at wide angle, which is crucial for studying energy loss as well as medium-response effects.

Submitted 12 June, 2024; originally announced June 2024.

Comments: 534 authors from 83 institutions, 12 pages, 7 figures. v1 is version submitted to Physical Review C. HEPdata tables for the points plotted in figures for this and previous PHENIX publications are (or will be) publicly available at http://www.phenix.bnl.gov/papers.html

arXiv:2406.08020 [pdf, other]

Generalizable Disaster Damage Assessment via Change Detection with Vision Foundation Model

Authors: Kyeong** Ahn, Sungwon Han, Sungwon Park, Jihee Kim, Sangyoon Park, Meeyoung Cha

Abstract: The increasing frequency and intensity of natural disasters demand more sophisticated approaches for rapid and precise damage assessment. To tackle this issue, researchers have developed various methods on disaster benchmark datasets from satellite imagery to aid in detecting disaster damage. However, the diverse nature of geographical landscapes and disasters makes it challenging to apply existing methods to regions unseen during training. We present DAVI (Disaster Assessment with VIsion foundation model), which overcomes domain disparities and detects structural damage (e.g., building) without requiring ground-truth labels of the target region. DAVI integrates task-specific knowledge from a model trained on source regions with an image segmentation foundation model to generate pseudo labels of possible damage in the target region. It then employs a two-stage refinement process, targeting both the pixel and overall image, to more accurately pinpoint changes in disaster-struck areas based on before-and-after images. Comprehensive evaluations demonstrate that DAVI achieves exceptional performance across diverse terrains (e.g., USA and Mexico) and disaster types (e.g., wildfires, hurricanes, and earthquakes). This confirms its robustness in assessing disaster impact without dependence on ground-truth labels.

Submitted 12 June, 2024; originally announced June 2024.

Comments: 9 pages, 4 figures, 2 tables

arXiv:2406.07886 [pdf, other]

Label-aware Hard Negative Sampling Strategies with Momentum Contrastive Learning for Implicit Hate Speech Detection

Authors: Jaehoon Kim, Seungwan **, Sohyun Park, Someen Park, Kyungsik Han

Abstract: Detecting implicit hate speech that is not directly hateful remains a challenge. Recent research has attempted to detect implicit hate speech by applying contrastive learning to pre-trained language models such as BERT and RoBERTa, but the proposed models still do not have a significant advantage over cross-entropy loss-based learning. We found that contrastive learning based on randomly sampled batch data does not encourage the model to learn hard negative samples. In this work, we propose Label-aware Hard Negative sampling strategies (LAHN) that encourage the model to learn detailed features from hard negative samples, instead of naive negative samples in random batch, using momentum-integrated contrastive learning. LAHN outperforms the existing models for implicit hate speech detection both in- and cross-datasets. The code is available at https://github.com/Hanyang-HCC-Lab/LAHN

Submitted 12 June, 2024; originally announced June 2024.

Comments: Accepted to ACL 2024 Findings

arXiv:2406.07867 [pdf, other]

Let's Go Real Talk: Spoken Dialogue Model for Face-to-Face Conversation

Authors: Se ** Park, Chae Won Kim, Hyeongseop Rha, Minsu Kim, Joanna Hong, Jeong Hun Yeo, Yong Man Ro

Abstract: In this paper, we introduce a novel Face-to-Face spoken dialogue model. It processes audio-visual speech from user input and generates audio-visual speech as the response, marking the initial step towards creating an avatar chatbot system without relying on intermediate text. To this end, we newly introduce MultiDialog, the first large-scale multimodal (i.e., audio and visual) spoken dialogue corpus containing 340 hours of approximately 9,000 dialogues, recorded based on the open domain dialogue dataset, TopicalChat. The MultiDialog contains parallel audio-visual recordings of conversation partners acting according to the given script with emotion annotations, which we expect to open up research opportunities in multimodal synthesis. Our Face-to-Face spoken dialogue model incorporates a textually pretrained large language model and adapts it into the audio-visual spoken dialogue domain by incorporating speech-text joint pretraining. Through extensive experiments, we validate the effectiveness of our model in facilitating a face-to-face conversation. Demo and data are available at https://multidialog.github.io and https://huggingface.co/datasets/IVLLab/MultiDialog, respectively.

Submitted 12 June, 2024; originally announced June 2024.

Comments: Accepted to ACL 2024

arXiv:2406.07736 [pdf, other]

MultiPragEval: Multilingual Pragmatic Evaluation of Large Language Models

Authors: Dojun Park, Jiwoo Lee, Seohyun Park, Hyeyun Jeong, Youngeun Koo, Soonha Hwang, Seonwoo Park, Sungeun Lee

Abstract: As the capabilities of LLMs expand, it becomes increasingly important to evaluate them beyond basic knowledge assessment, focusing on higher-level language understanding. This study introduces MultiPragEval, a robust test suite designed for the multilingual pragmatic evaluation of LLMs across English, German, Korean, and Chinese. Comprising 1200 question units categorized according to Grice's Cooperative Principle and its four conversational maxims, MultiPragEval enables an in-depth assessment of LLMs' contextual awareness and their ability to infer implied meanings. Our findings demonstrate that Claude3-Opus significantly outperforms other models in all tested languages, establishing a state-of-the-art in the field. Among open-source models, Solar-10.7B and Qwen1.5-14B emerge as strong competitors. This study not only leads the way in the multilingual evaluation of LLMs in pragmatic inference but also provides valuable insights into the nuanced capabilities necessary for advanced language comprehension in AI systems.

Submitted 11 June, 2024; originally announced June 2024.

Comments: 8 pages, under review

arXiv:2406.07488 [pdf, other]

ReduceFormer: Attention with Tensor Reduction by Summation

Authors: John Yang, Le An, Su Inn Park

Abstract: Transformers have excelled in many tasks including vision. However, efficient deployment of transformer models in low-latency or high-throughput applications is hindered by the computation in the attention mechanism which involves expensive operations such as matrix multiplication and Softmax. To address this, we introduce ReduceFormer, a family of models optimized for efficiency with the spirit of attention. ReduceFormer leverages only simple operations such as reduction and element-wise multiplication, leading to greatly simplified architecture and improved inference performance, with up to 37% reduction in latency and 44% improvement in throughput, while maintaining competitive accuracy comparable to other recent methods. The proposed model family is suitable for edge devices where compute resource and memory bandwidth are limited, as well as for cloud computing where high throughput is sought after.

Submitted 11 June, 2024; originally announced June 2024.

arXiv:2406.06913 [pdf]

Frustrated phonon with charge density wave in vanadium Kagome metal

Authors: Seung-Phil Heo, Choongjae Won, Heemin Lee, Hanbyul Kim, Eunyoung Park, Sung Yun Lee, Junha Hwang, Hyeongi Choi, Sang-Youn Park, Byungjune Lee, Woo-Suk Noh, Hoyoung Jang, Jae-Hoon Park, Dongbin Shin, Changyong Song

Abstract: Crystals with unique ionic arrangements and strong electronic correlations serve as a fertile ground for the emergence of exotic phases, as evidenced by the coexistence of charge density wave (CDW) and superconductivity in vanadium Kagome metals, specifically AV3Sb5 (where A represents K, Rb, or Cs). The formation of a star of David CDW superstructure, resulting from the coordinated displacements of vanadium ions on a corner sharing triangular lattice, has garnered significant attention in efforts to comprehend the influence of electron phonon interaction within this geometrically intricate lattice. However, understanding of the underlying mechanism behind CDW formation, coupled with symmetry protected lattice vibrations, remains elusive. In this study, we employed time resolved X ray scattering experiments utilising an X ray free electron laser. Our findings reveal that the phonon mode associated with the out of plane motion of Cs ions becomes frustrated in the CDW phase. Furthermore, we observed the photoinduced emergence of a metastable CDW phase, facilitated by the alleviation of frustration through nonadiabatic changes in free energy. By elucidating the longstanding puzzle surrounding the intervention of phonons in CDW ordering, this research offers fresh insights into the competition between phonons and periodic lattice distortions, a phenomenon widespread in other correlated quantum materials including layered high Tc superconductors.

Submitted 10 June, 2024; originally announced June 2024.

Comments: Manuscript: 20 pages, 4 figures, SI: 14 pages, 8 figures

arXiv:2406.06559 [pdf, other]

Harnessing Business and Media Insights with Large Language Models

Authors: Yujia Bao, Ankit Parag Shah, Neeru Narang, Jonathan Rivers, Rajeev Maksey, Lan Guan, Louise N. Barrere, Shelley Evenson, Rahul Basole, Connie Miao, Ankit Mehta, Fabien Boulay, Su Min Park, Natalie E. Pearson, Eldhose Joy, Tiger He, Sumiran Thakur, Koustav Ghosal, Josh On, Phoebe Morrison, Tim Major, Eva Siqi Wang, Gina Escobar, Jiaheng Wei, Tharindu Cyril Weerasooriya , et al. (8 additional authors not shown)

Abstract: This paper introduces Fortune Analytics Language Model (FALM). FALM empowers users with direct access to comprehensive business analysis, including market trends, company performance metrics, and expert insights. Unlike generic LLMs, FALM leverages a curated knowledge base built from professional journalism, enabling it to deliver precise and in-depth answers to intricate business questions. Users can further leverage natural language queries to directly visualize financial data, generating insightful charts and graphs to understand trends across diverse business sectors clearly. FALM fosters user trust and ensures output accuracy through three novel methods: 1) Time-aware reasoning guarantees accurate event registration and prioritizes recent updates. 2) Thematic trend analysis explicitly examines topic evolution over time, providing insights into emerging business landscapes. 3) Content referencing and task decomposition enhance answer fidelity and data visualization accuracy. We conduct both automated and human evaluations, demonstrating FALM's significant performance improvements over baseline methods while prioritizing responsible AI practices. These benchmarks establish FALM as a cutting-edge LLM in the business and media domains, with exceptional accuracy and trustworthiness.

Submitted 2 June, 2024; originally announced June 2024.

arXiv:2406.06287 [pdf, other]

VS-PINN: A Fast and efficient training of physics-informed neural networks using variable-scaling methods for solving PDEs with stiff behavior

Authors: Seungchan Ko, Sang Hyeon Park

Abstract: Physics-informed neural networks (PINNs) have recently emerged as a promising way to compute the solutions of partial differential equations (PDEs) using deep neural networks. However, despite their significant success in various fields, it remains unclear in many aspects how to effectively train PINNs if the solutions of PDEs exhibit stiff behaviors or high frequencies. In this paper, we propose a new method for training PINNs using variable-scaling techniques. This method is simple and it can be applied to a wide range of problems including PDEs with rapidly-varying solutions. Throughout various numerical experiments, we will demonstrate the effectiveness of the proposed method for these problems and confirm that it can significantly improve the training efficiency and performance of PINNs. Furthermore, based on the analysis of the neural tangent kernel (NTK), we will provide theoretical evidence for this phenomenon and show that our methods can indeed improve the performance of PINNs.

Submitted 10 June, 2024; originally announced June 2024.

arXiv:2406.06277 [pdf, other]

Measurement of the branching fractions of $\bar{B}\to D^{(*)} K^- K^{(*)0}_{(S)}$ and $\bar{B}\to D^{(*)}D_s^{-}$ decays at Belle II

Authors: Belle II Collaboration, I. Adachi, L. Aggarwal, H. Aihara, N. Akopov, A. Aloisio, N. Althubiti, N. Anh Ky, D. M. Asner, H. Atmacan, T. Aushev, V. Aushev, M. Aversano, R. Ayad, V. Babu, H. Bae, S. Bahinipati, P. Bambade, Sw. Banerjee, S. Bansal, M. Barrett, J. Baudot, A. Baur, A. Beaubien, F. Becherer , et al. (382 additional authors not shown)

Abstract: We present measurements of the branching fractions of eight $\overline B{}^0\to D^{(*)+} K^- K^{(*)0}_{(S)}$, $B^{-}\to D^{(*)0} K^- K^{(*)0}_{(S)}$ decay channels. The results are based on data from SuperKEKB electron-positron collisions at the $Υ(4S)$ resonance collected with the Belle II detector, corresponding to an integrated luminosity of $362~\text{fb}^{-1}$. The event yields are extracted from fits to the distributions of the difference between expected and observed $B$ meson energy, and are efficiency-corrected as a function of $m(K^-K^{(*)0}_{(S)})$ and $m(D^{(*)}K^{(*)0}_{(S)})$ in order to avoid dependence on the decay model. These results include the first observation of $\overline B{}^0\to D^+K^-K_S^0$, $B^-\to D^{*0}K^-K_S^0$, and $\overline B{}^0\to D^{*+}K^-K_S^0$ decays and a significant improvement in the precision of the other channels compared to previous measurements. The helicity-angle distributions and the invariant mass distributions of the $K^- K^{(*)0}_{(S)}$ systems are compatible with quasi-two-body decays via a resonant transition with spin-parity $J^P=1^-$ for the $K^-K_S^0$ systems and $J^P= 1^+$ for the $K^-K^{*0}$ systems. We also present measurements of the branching fractions of four $\overline B{}^0\to D^{(*)+} D_s^-$, $B^{-}\to D^{(*)0} D_s^- $ decay channels with a precision compatible to the current world averages.

Submitted 10 June, 2024; originally announced June 2024.

Comments: Prepared for submission to JHEP. 34 pages, 14 figures

Report number: Belle II Preprint: 2024-014, KEK Preprint: 2024-8

arXiv:2406.05761 [pdf, other]

The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models

Authors: Seungone Kim, Juyoung Suk, Ji Yong Cho, Shayne Longpre, Chaeeun Kim, Dongkeun Yoon, Gui** Son, Ye** Cho, Sheikh Shafayat, **heon Baek, Sue Hyun Park, Hyeonbin Hwang, **kyung Jo, Hyowon Cho, Haebin Shin, Seongyun Lee, Hanseok Oh, Noah Lee, Namgyu Ho, Se June Joo, Miyoung Ko, Yoonjoo Lee, Hyungjoo Chae, Jamin Shin, Joel Jang , et al. (7 additional authors not shown)

Abstract: As language models (LMs) become capable of handling a wide range of tasks, their evaluation is becoming as challenging as their development. Most generation benchmarks currently assess LMs using abstract evaluation criteria like helpfulness and harmlessness, which often lack the flexibility and granularity of human assessment. Additionally, these benchmarks tend to focus disproportionately on specific capabilities such as instruction following, leading to coverage bias. To overcome these limitations, we introduce the BiGGen Bench, a principled generation benchmark designed to thoroughly evaluate nine distinct capabilities of LMs across 77 diverse tasks. A key feature of the BiGGen Bench is its use of instance-specific evaluation criteria, closely mirroring the nuanced discernment of human evaluation. We apply this benchmark to assess 103 frontier LMs using five evaluator LMs. Our code, data, and evaluation results are all publicly available at https://github.com/prometheus-eval/prometheus-eval/tree/main/BiGGen-Bench.

Submitted 9 June, 2024; originally announced June 2024.

Comments: Work in Progress

arXiv:2406.05432 [pdf, other]

Regularized Training with Generated Datasets for Name-Only Transfer of Vision-Language Models

Authors: Minho Park, Sunghyun Park, Jooyeol Yun, Jaegul Choo

Abstract: Recent advancements in text-to-image generation have inspired researchers to generate datasets tailored for perception models using generative models, which prove particularly valuable in scenarios where real-world data is limited. In this study, our goal is to address the challenges when fine-tuning vision-language models (e.g., CLIP) on generated datasets. Specifically, we aim to fine-tune vision-language models to a specific classification model without access to any real images, also known as name-only transfer. However, despite the high fidelity of generated images, we observed a significant performance degradation when fine-tuning the model using the generated datasets due to the domain gap between real and generated images. To overcome the domain gap, we provide two regularization methods for training and post-training, respectively. First, we leverage the domain-agnostic knowledge from the original pre-trained vision-language model by conducting the weight-space ensemble of the fine-tuned model on the generated dataset with the original pre-trained model at the post-training. Secondly, we reveal that fine-tuned models with high feature diversity score high performance in the real domain, which indicates that increasing feature diversity prevents learning the generated domain-specific knowledge. Thus, we encourage feature diversity by providing additional regularization at training time.
Extensive experiments on various classification datasets and various text-to-image generation models demonstrated that our analysis and regularization techniques effectively mitigate the domain gap, which has long been overlooked, and enable us to achieve state-of-the-art performance by training with generated images. Code is available at https://github.com/pmh9960/regft-for-gen

Submitted 8 June, 2024; originally announced June 2024.

Comments: Preprint. Under review

arXiv:2406.05396 [pdf, other]

Mean-field Chaos Diffusion Models

Authors: Sungwoo Park, Dongjun Kim, Ahmed Alaa

Abstract: In this paper, we introduce a new class of score-based generative models (SGMs) designed to handle high-cardinality data distributions by leveraging concepts from mean-field theory. We present mean-field chaos diffusion models (MF-CDMs), which address the curse of dimensionality inherent in high-cardinality data by utilizing the propagation of chaos property of interacting particles. By treating high-cardinality data as a large stochastic system of interacting particles, we develop a novel score-matching method for infinite-dimensional chaotic particle systems and propose an approximation scheme that employs a subdivision strategy for efficient training. Our theoretical and empirical results demonstrate the scalability and effectiveness of MF-CDMs for managing large high-cardinality data structures, such as 3D point clouds.

Submitted 8 June, 2024; originally announced June 2024.

arXiv:2406.04642 [pdf, ps, other]

Measurements of the branching fractions of $Ξ_{c}^{0}\toΞ^{0}π^{0}$, $Ξ_{c}^{0}\toΞ^{0}η$, and $Ξ_{c}^{0}\toΞ^{0}η^{\prime}$ and asymmetry parameter of $Ξ_{c}^{0}\toΞ^{0}π^{0}$

Authors: Belle and Belle II Collaborations: I. Adachi, L. Aggarwal, H. Aihara, N. Akopov, A. Aloisio, N. Althubiti, N. Anh Ky, D. M. Asner, H. Atmacan, T. Aushev, V. Aushev, M. Aversano, R. Ayad, V. Babu, H. Bae, S. Bahinipati, P. Bambade, Sw. Banerjee, M. Barrett, J. Baudot, A. Baur, A. Beaubien, et al. (360 additional authors not shown)

Abstract: We present a study of $Ξ_{c}^{0}\toΞ^{0}π^{0}$, $Ξ_{c}^{0}\toΞ^{0}η$, and $Ξ_{c}^{0}\toΞ^{0}η^{\prime}$ decays using the Belle and Belle~II data samples, which have integrated luminosities of 980~$\mathrm{fb}^{-1}$ and 426~$\mathrm{fb}^{-1}$, respectively. We measure the following relative branching fractions $${\cal B}(Ξ_{c}^{0}\toΞ^{0}π^{0})/{\cal B}(Ξ_{c}^{0}\toΞ^{-}π^{+}) = 0.48 \pm 0.02 ({\rm stat}) \pm 0.03 ({\rm syst}) ,$$ $${\cal B}(Ξ_{c}^{0}\toΞ^{0}η)/{\cal B}(Ξ_{c}^{0}\toΞ^{-}π^{+}) = 0.11 \pm 0.01 ({\rm stat}) \pm 0.01 ({\rm syst}) ,$$ $${\cal B}(Ξ_{c}^{0}\toΞ^{0}η^{\prime})/{\cal B}(Ξ_{c}^{0}\toΞ^{-}π^{+}) = 0.08 \pm 0.02 ({\rm stat}) \pm 0.01 ({\rm syst}) $$ for the first time, where the uncertainties are statistical ($\rm stat$) and systematic ($\rm syst$). By multiplying by the branching fraction of the normalization mode, ${\mathcal B}(Ξ_{c}^{0}\toΞ^{-}π^{+})$, we obtain the following absolute branching fraction results $(6.9 \pm 0.3 ({\rm stat}) \pm 0.5 ({\rm syst}) \pm 1.3 ({\rm norm})) \times 10^{-3}$, $(1.6 \pm 0.2 ({\rm stat}) \pm 0.2 ({\rm syst}) \pm 0.3 ({\rm norm})) \times 10^{-3}$, and $(1.2 \pm 0.3 ({\rm stat}) \pm 0.1 ({\rm syst}) \pm 0.2 ({\rm norm})) \times 10^{-3}$, for $Ξ_{c}^{0}$ decays to $Ξ^{0}π^{0}$, $Ξ^{0}η$, and $Ξ^{0}η^{\prime}$ final states, respectively. The third errors are from the uncertainty on ${\mathcal B}(Ξ_{c}^{0}\toΞ^{-}π^{+})$. The asymmetry parameter for $Ξ_{c}^{0}\toΞ^{0}π^{0}$ is measured to be $α(Ξ_{c}^{0}\toΞ^{0}π^{0}) = -0.90\pm0.15({\rm stat})\pm0.23({\rm syst})$.

Submitted 7 June, 2024; originally announced June 2024.

Comments: 23 pages, 5 figures

Report number: Belle II Preprint 2024-015; KEK Preprint 2024-9

arXiv:2406.03685 [pdf, other]

Shockingly Bright Warm Carbon Monoxide Molecular Features in the Supernova Remnant Cassiopeia A Revealed by JWST

Authors: J. Rho, S. -H. Park, R. Arendt, M. Matsuura, D. Milisavljevic, T. Temim, I. De Looze, W. P. Blair, A. Rest, O. Fox, A. P. Ravi, B. -C. Koo, M. Barlow, A. Burrows, R. Chevalier, G. Clayton, R. Fesen, C. Fransson, C. Fryer, H. L. Gomez, H. -T. Janka, F. Kirchschlarger, J. M. Laming, S. Orlando, D. Patnaude, et al. (14 additional authors not shown)

Abstract: We present JWST NIRCam (F356W and F444W filters) and MIRI (F770W) images and NIRSpec-IFU spectroscopy of the young supernova remnant Cassiopeia A (Cas A). We obtained the data as part of a JWST survey of Cas A. The NIRCam and MIRI images map the spatial distributions of synchrotron radiation, Ar-rich ejecta, and CO on both large and small scales, revealing remarkably complex structures. The CO emission is stronger at the outer layers than the Ar ejecta, which indicates the reformation of CO molecules behind the reverse shock. NIRSpec-IFU spectra (3 - 5.5 microns) were obtained toward two representative knots in the NE and S fields. Both regions are dominated by the bright fundamental rovibrational band of CO in the two R and P branches, with strong [Ar VI] and relatively weaker, variable strength ejecta lines of [Si IX], [Ca IV], [Ca V] and [Mg IV]. The NIRSpec-IFU data resolve individual ejecta knots and filaments spatially and in velocity space. The fundamental CO band in the JWST spectra reveals unique shapes of CO, showing a few tens of sinusoidal patterns of rovibrational lines with pseudo-continuum underneath, which is attributed to the high-velocity widths of CO lines. The CO also shows high J lines at different vibrational transitions. Our results with LTE modeling of CO emission indicate a temperature of 1080 K and provide unique insight into the correlations between dust, molecules, and highly ionized ejecta in supernovae, and have strong ramifications for modeling dust formation that is led by CO cooling in the early Universe.

Submitted 5 June, 2024; originally announced June 2024.

Comments: accepted for the ApJ letter (17 pages and 10 figures)

arXiv:2406.03671 [pdf, other]

PANDA: Expanded Width-Aware Message Passing Beyond Rewiring

Authors: Jeongwhan Choi, Sumin Park, Hyowon Wi, Sung-Bae Cho, Noseong Park

Abstract: Recent research in the field of graph neural network (GNN) has identified a critical issue known as "over-squashing," resulting from the bottleneck phenomenon in graph structures, which impedes the propagation of long-range information. Prior works have proposed a variety of graph rewiring concepts that aim at optimizing the spatial or spectral properties of graphs to promote the signal propagation. However, such approaches inevitably deteriorate the original graph topology, which may lead to a distortion of information flow. To address this, we introduce an expanded width-aware (PANDA) message passing, a new message passing paradigm where nodes with high centrality, a potential source of over-squashing, are selectively expanded in width to encapsulate the growing influx of signals from distant nodes. Experimental results show that our method outperforms existing rewiring methods, suggesting that selectively expanding the hidden state of nodes can be a compelling alternative to graph rewiring for addressing the over-squashing.

Submitted 5 June, 2024; originally announced June 2024.

Comments: Accepted at ICML 2024

arXiv:2406.02223 [pdf, other]

doi 10.1109/ICASSP49357.2023.10097143

SMCL: Saliency Masked Contrastive Learning for Long-tailed Recognition

Authors: Sanglee Park, Seung-won Hwang, Jungmin So

Abstract: Real-world data often follow a long-tailed distribution with a high imbalance in the number of samples between classes. The problem with training from imbalanced data is that some background features, common to all classes, can be unobserved in classes with scarce samples. As a result, this background correlates to biased predictions into "major" classes. In this paper, we propose saliency masked contrastive learning, a new method that uses saliency masking and contrastive learning to mitigate the problem and improve the generalizability of a model. Our key idea is to mask the important part of an image using saliency detection and use contrastive learning to move the masked image towards minor classes in the feature space, so that background features present in the masked image are no longer correlated with the original class. Experiment results show that our method achieves state-of-the-art level performance on benchmark long-tailed datasets.

Submitted 4 June, 2024; originally announced June 2024.

Comments: accepted at ICASSP 2023

arXiv:2406.01413 [pdf, other]

doi 10.21105/joss.06587

KerrGeoPy: A Python Package for Computing Timelike Geodesics in Kerr Spacetime

Authors: Seyong Park, Zachary Nasipak

Abstract: KerrGeoPy is a Python package which computes both stable and plunging timelike geodesics in Kerr spacetime using analytic solutions to the geodesic equation that are written in terms of Legendre elliptic integrals. It mirrors and builds upon much of the functionality of the KerrGeodesics Mathematica library. Users can construct a geodesic by providing the initial position and four-velocity, or by providing either the constants of motion or a generalized version of the parameters defining a Keplerian orbit. The package provides methods for computing the four-velocity, fundamental frequencies, and constants of motion associated with a given geodesic along with the location of the separatrix. It also includes several methods for visualizing and animating geodesics.

Submitted 3 June, 2024; originally announced June 2024.

Comments: 4 pages, 1 figure; repository: https://github.com/BlackHolePerturbationToolkit/KerrGeoPy ; documentation: https://kerrgeopy.readthedocs.io/

Journal ref: Journal of Open Source Software, 9(98), 6587 (2024)

arXiv:2406.00505 [pdf, other]

Improving Text Generation on Images with Synthetic Captions

Authors: Jun Young Koh, Sang Hyun Park, Joy Song

Abstract: The recent emergence of latent diffusion models such as SDXL and SD 1.5 has shown significant capability in generating highly detailed and realistic images. Despite their remarkable ability to produce images, generating accurate text within images still remains a challenging task. In this paper, we examine the validity of fine-tuning approaches in generating legible text within the image. We propose a low-cost approach by leveraging SDXL without any time-consuming training on large-scale datasets. The proposed strategy employs a fine-tuning technique that examines the effects of data refinement levels and synthetic captions. Moreover, our results demonstrate how our small scale fine-tuning approach can improve the accuracy of text generation in different scenarios without the need of additional multimodal encoders. Our experiments show that with the addition of random letters to our raw dataset, our model's performance improves in producing well-formed visual text.

Submitted 1 June, 2024; originally announced June 2024.

Comments: 9 pages, 12 figures

arXiv:2406.00464 [pdf, other]

Sub-wavelength optical lattice in 2D materials

Authors: Supratik Sarkar, Mahmoud Jalali Mehrabad, Daniel G. Suárez-Forero, Liuxin Gu, Christopher J. Flower, Lida Xu, Kenji Watanabe, Takashi Taniguchi, Suji Park, Houk Jang, You Zhou, Mohammad Hafezi

Abstract: Recently, light-matter interaction has been vastly expanded as a control tool for inducing and enhancing many emergent non-equilibrium phenomena. However, conventional schemes for exploring such light-induced phenomena rely on uniform and diffraction-limited free-space optics, which limits the spatial resolution and the efficiency of light-matter interaction. Here, we overcome these challenges using metasurface plasmon polaritons (MPPs) to form a sub-wavelength optical lattice. Specifically, we report a "nonlocal" pump-probe scheme where MPPs are excited to induce a spatially modulated AC Stark shift for excitons in a monolayer of MoSe$_2$, several microns away from the illumination spot. Remarkably, we identify nearly two orders of magnitude reduction for the required modulation power compared to the free-space optical illumination counterpart. Moreover, we demonstrate a broadening of the excitons' linewidth as a robust signature of MPP-induced periodic sub-diffraction modulation. Our results open new avenues for exploring power-efficient light-induced lattice phenomena below the diffraction limit in active chip-compatible MPP architectures.

Submitted 1 June, 2024; originally announced June 2024.

arXiv:2405.20829 [pdf, other]

Rethinking Open-World Semi-Supervised Learning: Distribution Mismatch and Inductive Inference

Authors: Seongheon Park, Hyuk Kwon, Kwanghoon Sohn, Kibok Lee

Abstract: Open-world semi-supervised learning (OWSSL) extends conventional semi-supervised learning to open-world scenarios by taking account of novel categories in unlabeled datasets. Despite the recent advancements in OWSSL, the success often relies on the assumptions that 1) labeled and unlabeled datasets share the same balanced class prior distribution, which does not generally hold in real-world applications, and 2) unlabeled training datasets are utilized for evaluation, where such transductive inference might not adequately address challenges in the wild. In this paper, we aim to generalize OWSSL by addressing them. Our work suggests that practical OWSSL may require different training settings, evaluation methods, and learning strategies compared to those prevalent in the existing literature.

Submitted 31 May, 2024; originally announced May 2024.

Comments: CVPR Workshop on Computer Vision in the Wild (CVinW), 2024

arXiv:2405.19961 [pdf, other]

Collective Variable Free Transition Path Sampling with Generative Flow Network

Authors: Kiyoung Seong, Seonghyun Park, Seonghwan Kim, Woo Youn Kim, Sungsoo Ahn

Abstract: Understanding transition paths between meta-stable states in molecular systems is fundamental for material design and drug discovery. However, sampling these paths via molecular dynamics simulations is computationally prohibitive due to the high-energy barriers between the meta-stable states. Recent machine learning approaches are often restricted to simple systems or rely on collective variables (CVs) extracted from expensive domain knowledge. In this work, we propose to leverage generative flow networks (GFlowNets) to sample transition paths without relying on CVs. We reformulate the problem as amortized energy-based sampling over molecular trajectories and train a bias potential by minimizing the squared log-ratio between the target distribution and the generator, derived from the flow matching objective of GFlowNets. Our evaluation on three proteins (Alanine Dipeptide, Polyproline, and Chignolin) demonstrates that our approach, called TPS-GFN, generates more realistic and diverse transition paths than the previous CV-free machine learning approach.

Submitted 31 May, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

Comments: 9 pages, 5 figures, 2 tables

arXiv:2405.19734 [pdf, other]

Search for the decay $B^{0}\toγγ$ using Belle and Belle II data

Authors: Belle and Belle II Collaborations: I. Adachi, L. Aggarwal, H. Aihara, N. Akopov, A. Aloisio, S. Al Said, N. Althubiti, N. Anh Ky, D. M. Asner, H. Atmacan, T. Aushev, V. Aushev, M. Aversano, R. Ayad, V. Babu, H. Bae, S. Bahinipati, P. Bambade, Sw. Banerjee, S. Bansal, M. Barrett, J. Baudot, et al. (385 additional authors not shown)

Abstract: We report the result of a search for the rare decay $B^{0} \to γγ$ using a combined dataset of $753\times10^{6}$ $B\bar{B}$ pairs collected by the Belle experiment and $387\times10^{6}$ $B\bar{B}$ pairs collected by the Belle II experiment from decays of the $\rm Υ(4S)$ resonance produced in $e^{+}e^{-}$ collisions. A simultaneous fit to the Belle and Belle II data sets yields $11.0^{+6.5}_{-5.5}$ signal events, corresponding to a 2.5$σ$ significance. We determine the branching fraction $\mathcal{B}(B^{0} \to γγ) = (3.7^{+2.2}_{-1.8}(\rm stat)\pm0.5(\rm syst))\times10^{-8}$ and set a 90% credibility level upper limit of $\mathcal{B}(B^{0} \to γγ) < 6.4\times10^{-8}$.

Submitted 30 May, 2024; originally announced May 2024.

Report number: Belle II Preprint: 2024-017, KEK Preprint: 2024-13

arXiv:2405.19346 [pdf, other]

Subject-Adaptive Transfer Learning Using Resting State EEG Signals for Cross-Subject EEG Motor Imagery Classification

Authors: Sion An, Myeongkyun Kang, Soopil Kim, Philip Chikontwe, Li Shen, Sang Hyun Park

Abstract: Electroencephalography (EEG) motor imagery (MI) classification is a fundamental, yet challenging task due to the variation of signals between individuals i.e., inter-subject variability. Previous approaches try to mitigate this using task-specific (TS) EEG signals from the target subject in training. However, recording TS EEG signals requires time and limits its applicability in various fields. In contrast, resting state (RS) EEG signals are a viable alternative due to ease of acquisition with rich subject information. In this paper, we propose a novel subject-adaptive transfer learning strategy that utilizes RS EEG signals to adapt models on unseen subject data. Specifically, we disentangle extracted features into task- and subject-dependent features and use them to calibrate RS EEG signals for obtaining task information while preserving subject characteristics. The calibrated signals are then used to adapt the model to the target subject, enabling the model to simulate processing TS EEG signals of the target subject. The proposed method achieves state-of-the-art accuracy on three public benchmarks, demonstrating the effectiveness of our method in cross-subject EEG MI classification. Our findings highlight the potential of leveraging RS EEG signals to advance practical brain-computer interface systems.

Submitted 17 May, 2024; originally announced May 2024.

Comments: Early Accepted at MICCAI 2024

arXiv:2405.18928 [pdf, other]

Measurement of the energy dependence of the $e^+e^- \to B\bar{B}$, $B\bar{B}{}^*$, and $B^*\bar{B}{}^*$ cross sections at Belle~II

Authors: Belle II Collaboration, I. Adachi, L. Aggarwal, H. Ahmed, H. Aihara, N. Akopov, A. Aloisio, N. Althubiti, N. Anh Ky, D. M. Asner, H. Atmacan, T. Aushev, V. Aushev, M. Aversano, R. Ayad, V. Babu, H. Bae, S. Bahinipati, P. Bambade, Sw. Banerjee, S. Bansal, M. Barrett, J. Baudot, M. Bauer, A. Baur, et al. (444 additional authors not shown)

Abstract: We report measurements of the $e^+e^- \to B\bar{B}$, $B\bar{B}{}^*$, and $B^*\bar{B}{}^*$ cross sections at four energies, 10653, 10701, 10746 and 10805 MeV, using data collected by the Belle~II experiment. We reconstruct one $B$ meson in a large number of hadronic final states and use its momentum to identify the production process. In the first $2-5$ MeV above $B^*\bar{B}{}^*$ threshold, the $e^+e^- \to B^*\bar{B}{}^*$ cross section increases rapidly. This may indicate the presence of a pole close to the threshold.

Submitted 29 May, 2024; originally announced May 2024.

Comments: 30 pages, 15 figures, submitted to JHEP

Report number: Belle II Preprint 2024-016, KEK Preprint 2024-12

arXiv:2405.18783 [pdf, other]

Global optimization in variational quantum algorithms via dynamic tunneling method

Authors: Seung Park, Kyunghyun Baek, Seung** Lee, Mahn-Soo Choi

Abstract: We present a global optimization routine for the variational quantum algorithms, which utilizes the dynamic tunneling flow. Originally designed to leverage information gathered by a gradient-based optimizer around local minima, we adapt the conventional dynamic tunneling flow to exploit the distance measure of quantum states, resolving issues of extrinsic degeneracy arising from the parametrization of quantum states. Our global optimization algorithm is applied to the variational quantum eigensolver for the transverse-field Ising model to demonstrate the performance of our routine while comparing it with the conventional dynamic tunneling method, which is based on the Euclidean distance measure on the parameter space.

Submitted 29 May, 2024; originally announced May 2024.

Comments: 12 pages; 6 figures

arXiv:2405.18400 [pdf, other]

Superposed Decoding: Multiple Generations from a Single Autoregressive Inference Pass

Authors: Ethan Shen, Alan Fan, Sarah M. Pratt, Jae Sung Park, Matthew Wallingford, Sham M. Kakade, Ari Holtzman, Ranjay Krishna, Ali Farhadi, Aditya Kusupati

Abstract: Many applications today provide users with multiple auto-complete drafts as they type, including GitHub's code completion, Gmail's smart compose, and Apple's messaging auto-suggestions. Under the hood, language models support this by running an autoregressive inference pass to provide a draft. Consequently, providing $k$ drafts to the user requires running an expensive language model $k$ times. To alleviate the computation cost of running $k$ inference passes, we propose Superposed Decoding, a new decoding algorithm that generates $k$ drafts at the computation cost of one autoregressive inference pass. We achieve this by feeding a superposition of the most recent token embeddings from the $k$ drafts as input to the next decoding step of the language model. At every inference step we combine the $k$ drafts with the top-$k$ tokens to get $k^2$ new drafts and cache the $k$ most likely options, using an n-gram interpolation with minimal compute overhead to filter out incoherent generations. Our experiments show that $k$ drafts from Superposed Decoding are at least as coherent and factual as Nucleus Sampling and Greedy Decoding respectively, while being at least $2.44\times$ faster for $k\ge3$. In a compute-normalized setting, user evaluations demonstrably favor text generated by Superposed Decoding over Nucleus Sampling. Code and more examples open-sourced at https://github.com/RAIVNLab/SuperposedDecoding.

Submitted 24 June, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

Comments: 22 pages, 15 figures

arXiv:2405.17977 [pdf, other]

Aligning to Thousands of Preferences via System Message Generalization

Authors: Seongyun Lee, Sue Hyun Park, Seungone Kim, Minjoon Seo

Abstract: Although humans inherently have diverse values, current large language model (LLM) alignment methods often assume that aligning LLMs with the general public's preferences is optimal. A major challenge in adopting a more individualized approach to LLM alignment is its lack of scalability, as it involves repeatedly acquiring preference data and training new reward models and LLMs for each individual's preferences. To address these challenges, we propose a new paradigm where users specify what they value most within the system message, steering the LLM's generation behavior to better align with the user's intentions. However, a naive application of such an approach is non-trivial since LLMs are typically trained on a uniform system message (e.g., "You are a helpful assistant") which limits their ability to generalize to diverse, unseen system messages. To improve this generalization, we create the Multifaceted Collection, a preference dataset with 192k combinations of values beyond generic helpfulness and harmlessness, spanning 65k user instructions. Using this dataset, we train a 7B LLM called Janus and test it on 921 prompts from 5 benchmarks (AlpacaEval 2.0, FLASK, Koala, MT-Bench, and Self-Instruct) by adding various unseen system messages that reflect user preferences. Janus achieves tie+win rate of 75.2%, 72.4%, and 66.4% against Mistral 7B Instruct v0.2, GPT-3.5 Turbo, and GPT-4, respectively.
Unexpectedly, on three benchmarks focused on response helpfulness (AlpacaEval 2.0, MT-Bench, Arena Hard Auto v0.1), Janus also outperforms LLaMA 3 8B Instruct by a +4.0%, +0.1%, +3.0% margin, underscoring that training with a vast array of system messages could also enhance alignment to the general public's preference as well. Our code, dataset, benchmark, and models are available at https://github.com/kaistAI/Janus.

Submitted 28 May, 2024; originally announced May 2024.

Comments: Work in progress

Showing 1–50 of 3,002 results for author: Park, S