Search | arXiv e-print repository

Measurement of the integrated luminosity of data samples collected during 2019-2022 by the Belle II experiment

Authors: The Belle II Collaboration, I. Adachi, L. Aggarwal, H. Ahmed, J. K. Ahn, H. Aihara, N. Akopov, A. Aloisio, N. Althubiti, N. Anh Ky, D. M. Asner, H. Atmacan, T. Aushev, V. Aushev, M. Aversano, R. Ayad, V. Babu, H. Bae, S. Bahinipati, P. Bambade, Sw. Banerjee, M. Barrett, J. Baudot, A. Baur, A. Beaubien, et al. (382 additional authors not shown)

Abstract: A series of data samples was collected with the Belle II detector at the SuperKEKB collider from March 2019 to June 2022. We determine the integrated luminosities of these data samples using three distinct methodologies involving Bhabha ($e^+e^- \to e^+e^-(nγ)$), digamma ($e^+e^- \to γγ(nγ)$), and dimuon ($e^+e^- \to μ^+ μ^- (nγ)$) events. The total integrated luminosity obtained with Bhabha, digamma, and dimuon events is (426.52 $\pm$ 0.03 $\pm$ 2.48)~fb$^{-1}$, (427.32 $\pm$ 0.03 $\pm$ 2.56)~fb$^{-1}$, and (424.84 $\pm$ 0.04 $\pm$ 3.88)~fb$^{-1}$, where the first uncertainties are statistical and the second are systematic. The resulting total integrated luminosity obtained from the combination of the three methods is (426.88 $\pm$ 1.93)~fb$^{-1}$.

Submitted 1 July, 2024; originally announced July 2024.

Comments: 12 pages, 3 figures

Report number: Belle II Preprint 2024-019; KEK Preprint 2024-16

arXiv:2407.00879 [pdf, ps, other]

Study of $χ_{bJ}(2P)\toωΥ(1S)$ at Belle

Authors: Z. S. Stottler, T. K. Pedlar, B. G. Fulsom, I. Adachi, K. Adamczyk, H. Aihara, S. Al Said, D. M. Asner, H. Atmacan, T. Aushev, R. Ayad, V. Babu, Sw. Banerjee, M. Bauer, P. Behera, K. Belous, J. Bennett, F. Bernlochner, M. Bessner, T. Bilka, D. Biswas, A. Bobrov, D. Bodrov, G. Bonvicini, J. Borah, et al. (159 additional authors not shown)

Abstract: We report a study of the hadronic transitions $χ_{bJ}(2P)\toωΥ(1S)$, with $ω\toπ^{+}π^{-}π^{0}$, using $28.2\times10^6~Υ(3S)$ mesons recorded by the Belle detector. We present the first evidence for the near--threshold transition $χ_{b0}(2P)\toωΥ(1S)$, the analog of the charm sector decay $χ_{c1}(3872)\toωJ/ψ$, with a branching fraction of $\mathcal{B}\big(χ_{b0}(2P)\toωΥ(1S)\big) = \big(0.55\pm0.19\pm0.07\big)\%$. We also obtain branching fractions of $\mathcal{B}\big(χ_{b1}(2P)\toωΥ(1S)\big) = \big(2.39{}^{+0.20}_{-0.19}\pm0.24\big)\%$ and $\mathcal{B}\big(χ_{b2}(2P)\toωΥ(1S)\big) = \big(0.47{}^{+0.13}_{-0.12}\pm0.06\big)\%$, confirming the measurement of the $ω$ transitions of the $J=1,2~P$--wave states. The ratio for the $J=2$ to $J=1$ transitions is also measured and found to differ by 3.3 standard deviations from the expected value in the QCD multipole expansion.

Submitted 30 June, 2024; originally announced July 2024.

Comments: 6 pages, 2 figures

Report number: Belle Preprint: 2024-05; KEK Preprint: 2024-10

arXiv:2406.19587 [pdf, other]

Filtration learning in exact multi-parameter persistent homology and classification of time-series data

Authors: Keunsu Kim, Jae-Hun Jung

Abstract: To analyze the topological properties of the given discrete data, one needs to consider a continuous transform called filtration. Persistent homology serves as a tool to track changes of homology in the filtration. The outcome of the topological analysis of data varies depending on the choice of filtration, making the selection of filtration crucial. Filtration learning is an attempt to find an optimal filtration that minimizes the loss function. Exact Multi-parameter Persistent Homology (EMPH) has been recently proposed, particularly for topological time-series analysis, that utilizes the exact formula of rank invariant instead of calculating it. In this paper, we propose a framework for filtration learning of EMPH. We formulate an optimization problem and propose an algorithm for solving the problem. We then apply the proposed algorithm to several classification problems. Particularly, we derive the exact formula of the gradient of the loss function with respect to the filtration parameter, which makes it possible to directly update the filtration without using automatic differentiation, significantly enhancing the learning process.

Submitted 27 June, 2024; originally announced June 2024.

Comments: 28 pages

MSC Class: 37M10; 55N31; 65K10

arXiv:2406.18823 [pdf, other]

Emergence of metachronal waves in a chain of symmetrically beating filaments

Authors: Narina Jung, Won Kyu Kim, Changbong Hyeon

Abstract: Recent experiments have shown that metachronal waves (MCWs) can emerge from a chain of symmetrically beating nematodes aligned at the edge of sessile droplets. Our study, employing a coupled elastohydrodynamic model of active filaments, elucidates that a misalignment caused by a tilt against the bounding wall disrupts the synchronization and generates a constant time lag between adjacent filaments, leading to MCWs. The MCWs, enhancing the fluid circulation, achieve their maximum thermodynamic efficiency over the same range of tilt angles observed in the nematode experiments.

Submitted 26 June, 2024; originally announced June 2024.

Comments: 12 pages, 8 figures

arXiv:2406.18730 [pdf, other]

Chandra detects low-luminosity AGN with $M_\mathrm{BH}=10^{4}-10^{6}~M_\mathrm{\odot}$ in nearby ($z<0.5$), dwarf and star-forming galaxies

Authors: Mainak Singha, Julissa Sarmiento, Sangeeta Malhotra, James E. Rhoads, L. Y. Aaron Yung, Junxian Wang, Zhen-Ya Zheng, Ruqiu Lin, Keunho Kim, Jialai Kang, Santosh Harish

Abstract: We searched the Chandra and XMM archives for observations of 900 green pea galaxies to find AGN signatures. Green peas are low-mass galaxies with prominent emission lines, similar in size and star formation rate to high-redshift dwarf galaxies. Of the 29 observations found, 9 show X-ray detections with $S/N>3$. The 2-10 keV X-ray luminosity for these 9 sources exceeds $10^{40}~\mathrm{erg~s}^{-1}$, with 2 sources exceeding $10^{41}~\mathrm{erg~s}^{-1}$, suggesting the presence of intermediate-mass black holes (IMBH) or low-luminosity AGN (LLAGN) with BH masses between $100-10^6M_\mathrm{\odot}$. All X-ray detected sources (plus 6 additional sources) show He~II$\lambda4686$ emission and a broad component of the H$α$ emission line, indicating winds. The line widths of the broad H$α$ and He II$\lambda4686$ emitting gas clouds are weakly correlated ($R^{2}=0.15$), suggesting He II$\lambda4686$ emission is inconsistent with winds from super-Eddington accretors. However, the ratio of X-ray luminosity to star formation rate shows an anti-correlation with metallicity in 5 out of 9 X-ray detected sources, implying ultraluminous X-ray sources are key contributors to the observed X-ray luminosity. This could be due to super-Eddington accretors or IMBH. The X-ray emission is much higher than that produced by Wolf-Rayet stars and supernovae-driven winds. Thus, the X-ray luminosity in these 9 sources can only be explained by black holes with masses over $100~M_\mathrm{\odot}$.
Our findings suggest the presence of LLAGN in these galaxies, with broad H$α$ line widths implying BH masses of $10^4-10^6M_\mathrm{\odot}$. Given Green Peas' role as significant Lyman Continuum leakers, LLAGN in these galaxies could have contributed significantly to cosmic reionization.

Submitted 26 June, 2024; originally announced June 2024.

Comments: Submitted to ApJ. 17 pages, 11 figures and 3 tables. Comments welcome

arXiv:2406.16727 [pdf, other]

Higher differentiability for the fractional $p$-Laplacian

Authors: Lars Diening, Kyeongbae Kim, Ho-Sik Lee, Simon Nowak

Abstract: In this work, we study the higher differentiability of solutions to the inhomogeneous fractional $p$-Laplace equation under different regularity assumptions on the data. In the superquadratic case, we extend and sharpen several previous results, while in the subquadratic regime our results constitute completely novel developments even in the homogeneous case. In particular, in the local limit our results are consistent with well-known higher differentiability results for the standard inhomogeneous $p$-Laplace equation. All of our main results remain valid in the vectorial context of fractional $p$-Laplace systems.

Submitted 24 June, 2024; originally announced June 2024.

Comments: 48 pages

arXiv:2406.13740 [pdf, other]

Kinetic Inductance, Quantum Geometry, and Superconductivity in Magic-Angle Twisted Bilayer Graphene

Authors: Miuko Tanaka, Joel Î-j. Wang, Thao H. Dinh, Daniel Rodan-Legrain, Sameia Zaman, Max Hays, Bharath Kannan, Aziza Almanakly, David K. Kim, Bethany M. Niedzielski, Kyle Serniak, Mollie E. Schwartz, Kenji Watanabe, Takashi Taniguchi, Jeffrey A. Grover, Terry P. Orlando, Simon Gustavsson, Pablo Jarillo-Herrero, William D. Oliver

Abstract: The physics of superconductivity in magic-angle twisted bilayer graphene (MATBG) is a topic of keen interest in moiré systems research, and it may provide insight into the pairing mechanism of other strongly correlated materials such as high-$T_{\mathrm{c}}$ superconductors. Here, we use DC-transport and microwave circuit quantum electrodynamics (cQED) to measure directly the superfluid stiffness of superconducting MATBG via its kinetic inductance. We find the superfluid stiffness to be much larger than expected from conventional single-band Fermi liquid theory; rather, it aligns well with theory involving quantum geometric effects that are dominant at the magic angle. The temperature dependence of the superfluid stiffness exhibits a power-law behavior, which contraindicates an isotropic BCS model; instead, the extracted power-law exponents indicate an anisotropic superconducting gap, whether interpreted using the conventional anisotropic BCS model or a quantum geometric theory of flat-band superconductivity. Moreover, the quadratic dependence of the stiffness on both DC and microwave current is consistent with Ginzburg-Landau theory. Taken together, these findings strongly suggest a connection between quantum geometry, superfluid stiffness, and unconventional superconductivity in MATBG. Finally, the combined DC-microwave measurement platform used here is applicable to the investigation of other atomically thin superconductors.

Submitted 19 June, 2024; originally announced June 2024.

arXiv:2406.12721 [pdf]

Sound event detection based on auxiliary decoder and maximum probability aggregation for DCASE Challenge 2024 Task 4

Authors: Sang Won Son, Jongyeon Park, Hong Kook Kim, Sulaiman Vesal, Jeong Eun Lim

Abstract: In this report, we propose three novel methods for developing a sound event detection (SED) model for the DCASE 2024 Challenge Task 4. First, we propose an auxiliary decoder attached to the final convolutional block to improve feature extraction capabilities while reducing dependency on embeddings from pre-trained large models. The proposed auxiliary decoder operates independently from the main decoder, enhancing performance of the convolutional block during the initial training stages by assigning a different weight strategy between main and auxiliary decoder losses. Next, to address the time interval issue between the DESED and MAESTRO datasets, we propose maximum probability aggregation (MPA) during the training step. The proposed MPA method enables the model's output to be aligned with soft labels of 1 s in the MAESTRO dataset. Finally, we propose a multi-channel input feature that employs various versions of logmel and MFCC features to generate time-frequency patterns. The experimental results demonstrate the efficacy of these proposed methods in improving SED performance by achieving a balanced enhancement across different datasets and label types. Ultimately, this approach presents a significant step forward in developing more robust and flexible SED models.
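The abstract does not spell out how MPA is computed. One plausible reading, offered here only as a minimal sketch, is that frame-level probabilities are grouped into fixed-length windows matching the 1 s soft-label resolution and the maximum of each window is kept. The function name, the `frames_per_segment` parameter, and the example values are hypothetical and are not taken from the report.

```python
def max_probability_aggregation(frame_probs, frames_per_segment):
    """Collapse frame-level probabilities into one value per segment.

    Assumption (not from the report): each segment keeps the maximum
    probability among its frames; a trailing partial window is kept too.
    """
    if frames_per_segment <= 0:
        raise ValueError("frames_per_segment must be positive")
    segments = []
    for start in range(0, len(frame_probs), frames_per_segment):
        window = frame_probs[start:start + frames_per_segment]
        segments.append(max(window))
    return segments


# Hypothetical example: 10 frames, 4 frames per 1 s segment.
frame_probs = [0.1, 0.2, 0.9, 0.3, 0.0, 0.05, 0.1, 0.2, 0.6, 0.7]
segment_probs = max_probability_aggregation(frame_probs, 4)
print(segment_probs)
```

The segment-level values could then be compared against 1 s soft labels with whatever loss the training setup uses; that step is outside this sketch.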

Submitted 24 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

Comments: DCASE 2024 challenge Task4, 4 pages

arXiv:2406.12233 [pdf, other]

SyncVSR: Data-Efficient Visual Speech Recognition with End-to-End Crossmodal Audio Token Synchronization

Authors: Young Jin Ahn, Jungwoo Park, Sangha Park, Jonghyun Choi, Kee-Eung Kim

Abstract: Visual Speech Recognition (VSR) stands at the intersection of computer vision and speech recognition, aiming to interpret spoken content from visual cues. A prominent challenge in VSR is the presence of homophenes, visually similar lip gestures that represent different phonemes. Prior approaches have sought to distinguish fine-grained visemes by aligning visual and auditory semantics, but often fell short of full synchronization. To address this, we present SyncVSR, an end-to-end learning framework that leverages quantized audio for frame-level crossmodal supervision. By integrating a projection layer that synchronizes visual representation with acoustic data, our encoder learns to generate discrete audio tokens from a video sequence in a non-autoregressive manner. SyncVSR shows versatility across tasks, languages, and modalities at the cost of a forward pass. Our empirical evaluations show that it not only achieves state-of-the-art results but also reduces data usage by up to ninefold.

Submitted 17 June, 2024; originally announced June 2024.

arXiv:2406.12016 [pdf, other]

Prefixing Attention Sinks can Mitigate Activation Outliers for Large Language Model Quantization

Authors: Seungwoo Son, Wonpyo Park, Woohyun Han, Kyuyeun Kim, Jaeho Lee

Abstract: Despite recent advances in LLM quantization, activation quantization remains challenging due to activation outliers. Conventional remedies, e.g., mixing precisions for different channels, introduce extra overhead and reduce the speedup. In this work, we develop a simple yet effective strategy to facilitate per-tensor activation quantization by preventing the generation of problematic tokens. Precisely, we propose a method to find a set of key-value cache, coined CushionCache, which mitigates outliers in subsequent tokens when inserted as a prefix. CushionCache works in two steps: First, we greedily search for a prompt token sequence that minimizes the maximum activation values in subsequent tokens. Then, we further tune the token cache to regularize the activations of subsequent tokens to be more quantization-friendly. The proposed method successfully addresses activation outliers of LLMs, providing a substantial performance boost for per-tensor activation quantization methods. We thoroughly evaluate our method over a wide range of models and benchmarks and find that it significantly surpasses the established baseline of per-tensor W8A8 quantization and can be seamlessly integrated with the recent activation quantization method.
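To see why activation outliers hurt per-tensor quantization, the following minimal sketch applies generic symmetric 8-bit quantization with a single scale for the whole tensor. It does not implement CushionCache; the function names and the activation values are hypothetical and chosen only for illustration.

```python
def quantize_per_tensor(values, num_bits=8):
    """Symmetric per-tensor quantization: one scale shared by every value."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Map integer codes back to approximate real values."""
    return [q * scale for q in quantized]


def mean_abs_error(original, restored):
    return sum(abs(a - b) for a, b in zip(original, restored)) / len(original)


# Hypothetical activations: five ordinary values, then the same plus one outlier.
activations = [0.12, -0.40, 0.33, 0.05, -0.27]
with_outlier = activations + [60.0]

q_clean, s_clean = quantize_per_tensor(activations)
err_clean = mean_abs_error(activations, dequantize(q_clean, s_clean))

q_out, s_out = quantize_per_tensor(with_outlier)
restored = dequantize(q_out, s_out)
# Error measured on the ordinary values only, with the outlier setting the scale.
err_with_outlier = mean_abs_error(activations, restored[:len(activations)])

print(err_clean, err_with_outlier)
```

Because the single outlier sets the shared scale, the ordinary values collapse onto a handful of integer levels and their reconstruction error grows sharply. This is the effect that the abstract's approach aims to prevent by suppressing the outliers themselves.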

Submitted 17 June, 2024; originally announced June 2024.

arXiv:2406.11875 [pdf, other]

ChatPCG: Large Language Model-Driven Reward Design for Procedural Content Generation

Authors: In-Chang Baek, Tae-Hwa Park, Jin-Ha Noh, Cheong-Mok Bae, Kyung-Joong Kim

Abstract: Driven by the rapid growth of machine learning, recent advances in game artificial intelligence (AI) have significantly impacted productivity across various gaming genres. Reward design plays a pivotal role in training game AI models, wherein researchers implement concepts of specific reward functions. However, despite the presence of AI, the reward design process predominantly remains in the domain of human experts, as it is heavily reliant on their creativity and engineering skills. Therefore, this paper proposes ChatPCG, a large language model (LLM)-driven reward design framework. It leverages human-level insights, coupled with game expertise, to generate rewards tailored to specific game features automatically. Moreover, ChatPCG is integrated with deep reinforcement learning, demonstrating its potential for multiplayer game content generation tasks. The results suggest that the proposed LLM exhibits the capability to comprehend game mechanics and content generation tasks, enabling tailored content generation for a specified game. This study not only highlights the potential for improving accessibility in content generation but also aims to streamline the game AI development process.

Submitted 7 June, 2024; originally announced June 2024.

Comments: 4 pages, 2 figures, accepted at IEEE Conference on Games 2024

arXiv:2406.11261 [pdf]

Anti-aliased metasurfaces beyond the Nyquist limit

Authors: Seokwoo Kim, Joohoon Kim, Kyungtae Kim, Minsu Jeong, Junsuk Rho

Abstract: Sampling is a pivotal element in the design of metasurfaces, enabling a broad spectrum of applications. Despite its flexibility, sampling can result in reduced efficiency and unintended diffractions, which are more pronounced at high numerical aperture or shorter wavelengths, e.g. ultraviolet spectrum. Prevailing metasurface research has often relied on the conventional Nyquist sampling theorem to assess sampling appropriateness; however, our findings reveal that the Nyquist criterion is insufficient for preventing the diffractive distortion. Specifically, we find that the performance of a metasurface is significantly correlated to the geometric relationship between the spectrum morphology and sampling lattice. Based on lattice-based diffraction analysis, we demonstrate several anti-aliasing strategies from visible to ultraviolet regimes. These approaches significantly reduce aliasing phenomena occurring in high numerical aperture metasurfaces.

Submitted 17 June, 2024; originally announced June 2024.

Comments: 45 pages, 28 figures

arXiv:2406.11248 [pdf]

Performance Improvement of Language-Queried Audio Source Separation Based on Caption Augmentation From Large Language Models for DCASE Challenge 2024 Task 9

Authors: Do Hyun Lee, Yoonah Song, Hong Kook Kim

Abstract: We present a prompt-engineering-based text-augmentation approach applied to a language-queried audio source separation (LASS) task. To enhance the performance of LASS, the proposed approach utilizes large language models (LLMs) to generate multiple captions corresponding to each sentence of the training dataset. To this end, we first perform experiments to identify the most effective prompts for caption augmentation with a smaller number of captions. A LASS model trained with these augmented captions demonstrates improved performance on the DCASE 2024 Task 9 validation set compared to that trained without augmentation. This study highlights the effectiveness of LLM-based caption augmentation in advancing language-queried audio source separation.

Submitted 17 June, 2024; originally announced June 2024.

Comments: DCASE 2024 Challenge Task 9, 4 pages

arXiv:2406.09698 [pdf, other]

Projected background and sensitivity of AMoRE-II

Authors: A. Agrawal, V. V. Alenkov, P. Aryal, J. Beyer, B. Bhandari, R. S. Boiko, K. Boonin, O. Buzanov, C. R. Byeon, N. Chanthima, M. K. Cheoun, J. S. Choe, Seonho Choi, S. Choudhury, J. S. Chung, F. A. Danevich, M. Djamal, D. Drung, C. Enss, A. Fleischmann, A. M. Gangapshev, L. Gastaldo, Y. M. Gavrilyuk, A. M. Gezhaev, O. Gileva, et al. (81 additional authors not shown)

Abstract: AMoRE-II aims to search for neutrinoless double beta decay with an array of 423 Li$_2$$^{100}$MoO$_4$ crystals operating in the cryogenic system as the main phase of the Advanced Molybdenum-based Rare process Experiment (AMoRE). AMoRE has been planned to operate in three phases: AMoRE-pilot, AMoRE-I, and AMoRE-II. AMoRE-II is currently being installed at the Yemi Underground Laboratory, located approximately 1000 meters deep in Jeongseon, Korea. The goal of AMoRE-II is to reach up to $T^{0νββ}_{1/2}$ $\sim$ 6 $\times$ 10$^{26}$ years, corresponding to an effective Majorana mass of 15 - 29 meV, covering all the inverted mass hierarchy regions. To achieve this, the background level of the experimental configurations and possible background sources of gamma and beta events should be well understood. We have intensively performed Monte Carlo simulations using the GEANT4 toolkit in all the experimental configurations with potential sources. We report the estimated background level that meets the 10$^{-4}$ counts/(keV$\cdot$kg$\cdot$yr) requirement for AMoRE-II in the region of interest (ROI) and show the projected half-life sensitivity based on the simulation study.

Submitted 13 June, 2024; originally announced June 2024.

arXiv:2406.09345 [pdf, other]

DiscreteSLU: A Large Language Model with Self-Supervised Discrete Speech Units for Spoken Language Understanding

Authors: Suwon Shon, Kwangyoun Kim, Yi-Te Hsu, Prashant Sridhar, Shinji Watanabe, Karen Livescu

Abstract: The integration of pre-trained text-based large language models (LLM) with speech input has enabled instruction-following capabilities for diverse speech tasks. This integration requires the use of a speech encoder, a speech adapter, and an LLM, trained on diverse tasks. We propose the use of discrete speech units (DSU), rather than continuous-valued speech encoder outputs, that are converted to the LLM token embedding space using the speech adapter. We generate DSU using a self-supervised speech encoder followed by k-means clustering. The proposed model shows robust performance on speech inputs from seen/unseen domains and instruction-following capability in spoken question answering. We also explore various types of DSU extracted from different layers of the self-supervised speech encoder, as well as Mel frequency Cepstral Coefficients (MFCC). Our findings suggest that the ASR task and datasets are not crucial in instruction-tuning for spoken question answering tasks.

Submitted 13 June, 2024; originally announced June 2024.

arXiv:2406.08527 [pdf, other]

Optimized Feature Generation for Tabular Data via LLMs with Decision Tree Reasoning

Authors: Jaehyun Nam, Kyuyoung Kim, Seunghyuk Oh, Jihoon Tack, Jaehyung Kim, Jinwoo Shin

Abstract: Learning effective representations from raw data is crucial for the success of deep learning methods. However, in the tabular domain, practitioners often prefer augmenting raw column features over using learned representations, as conventional tree-based algorithms frequently outperform competing approaches. As a result, feature engineering methods that automatically generate candidate features have been widely used. While these approaches are often effective, there remains ambiguity in defining the space over which to search for candidate features. Moreover, they often rely solely on validation scores to select good features, neglecting valuable feedback from past experiments that could inform the planning of future experiments. To address the shortcomings, we propose a new tabular learning framework based on large language models (LLMs), coined Optimizing Column feature generator with decision Tree reasoning (OCTree). Our key idea is to leverage LLMs' reasoning capabilities to find good feature generation rules without manually specifying the search space and provide language-based reasoning information highlighting past experiments as feedback for iterative rule improvements. Here, we choose a decision tree as reasoning as it can be interpreted in natural language, effectively conveying knowledge of past experiments (i.e., the prediction models trained with the generated features) to the LLM.
Our empirical results demonstrate that this simple framework consistently enhances the performance of various prediction models across diverse tabular benchmarks, outperforming competing automatic feature engineering methods.

Submitted 12 June, 2024; originally announced June 2024.

Comments: 18 pages

arXiv:2406.08156 [pdf, other]

doi 10.1016/j.rinp.2024.107820

Scaling behavior of the localization length for TE waves at critical incidence on short-range correlated stratified random media

Authors: Seulong Kim, Kihong Kim

Abstract: We theoretically investigate the scaling behavior of the localization length for $s$-polarized electromagnetic waves incident at a critical angle on stratified random media with short-range correlated disorder. By employing the invariant embedding method, extended to waves in correlated random media, and utilizing the Shapiro-Loginov formula of differentiation, we accurately compute the localization length $ξ$ of $s$ waves incident obliquely on stratified random media that exhibit short-range correlated dichotomous randomness in the dielectric permittivity. The random component of the permittivity is characterized by the disorder strength parameter $σ^2$ and the disorder correlation length $l_c$. Away from the critical angle, $ξ$ depends on these parameters independently. However, precisely at the critical angle, we discover that for waves with wavenumber $k$, $kξ$ depends on the single parameter $kl_cσ^2$, satisfying a universal equation $kξ\approx 1.3717\left(kl_cσ^2\right)^{-1/3}$ across the entire range of parameter values. Additionally, we find that $ξ$ scales as $λ^{4/3}$ for the entire range of the wavelength $λ$, regardless of the values of $σ^2$ and $l_c$. We demonstrate that under sufficiently strong disorder, the scaling behavior of the localization length for all other incident angles converges to that for the critical incidence.
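The $λ^{4/3}$ scaling quoted in the abstract is consistent with its universal equation, as a short check shows. The only assumption added here is the standard relation $k = 2π/λ$, which the abstract does not state explicitly.

$$
kξ \approx 1.3717\left(kl_cσ^2\right)^{-1/3}
\;\Longrightarrow\;
ξ \approx 1.3717\,k^{-4/3}\left(l_cσ^2\right)^{-1/3}
= 1.3717\,(2π)^{-4/3}\left(l_cσ^2\right)^{-1/3}λ^{4/3}
$$

For fixed $σ^2$ and $l_c$ the prefactor is constant, so $ξ \propto λ^{4/3}$ at critical incidence.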

Submitted 12 June, 2024; originally announced June 2024.

Comments: 8 pages, 5 figures

Journal ref: Results in Physics 62, 107820 (2024)
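
As a quick consistency check on the abstract above, the $λ^{4/3}$ scaling at critical incidence follows directly from the quoted universal equation. This is a sketch that assumes the standard relation $k = 2π/λ$ (not stated in the abstract) and holds $σ^2$ and $l_c$ fixed: $$kξ \approx 1.3717\left(kl_cσ^2\right)^{-1/3} \;\Rightarrow\; ξ \approx 1.3717\,k^{-4/3}\left(l_cσ^2\right)^{-1/3} = 1.3717\,(2π)^{-4/3}\left(l_cσ^2\right)^{-1/3}λ^{4/3}.$$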

arXiv:2406.08025 [pdf, other]

Holstein polarons, Rashba-like spin splitting and Ising superconductivity in electron-doped MoSe2

Authors: Sung Won Jung, Saumya Mukherjee, Matthew D. Watson, Daniil V. Evtushinsky, Cephise Cacho, Edoardo Martino, Helmut Berger, Timur K. Kim

Abstract: Interaction between electrons and phonons in solids is a key effect defining physical properties of materials such as electrical and thermal conductivity. In transition metal dichalcogenides (TMDCs) the electron-phonon coupling results in the creation of polarons, quasiparticles that manifest themselves as discrete features in the electronic spectral function. In this study, we report the formation of polarons at the alkali dosed MoSe2 surface, where Rashba-like spin splitting of the conduction band states is caused by an inversion-symmetry breaking electric field. In addition, we observe the crossover from phonon-like to plasmon-like polaronic spectral features at the MoSe2 surface with increasing doping. Our findings support the concept of electron-phonon coupling mediated superconductivity in electron-doped layered TMDC materials, observed using ionic liquid gating technology. Furthermore, the discovered spin-splitting at the Fermi level could offer crucial experimental validation for theoretical models of Ising-type superconductivity in these materials.

Submitted 12 June, 2024; originally announced June 2024.

arXiv:2406.07395 [pdf, other]

Holographic reconstruction of black hole spacetime: machine learning and entanglement entropy

Authors: Byoungjoon Ahn, Hyun-Sik Jeong, Keun-Young Kim, Kwan Yun

Abstract: We investigate the bulk reconstruction of AdS black hole spacetime emergent from quantum entanglement within a machine learning framework. Utilizing neural ordinary differential equations alongside Monte-Carlo integration, we develop a method tailored for continuous training functions to extract the general isotropic bulk metric from entanglement entropy data. To validate our approach, we first apply our machine learning algorithm to holographic entanglement entropy data derived from the Gubser-Rocha and superconductor models, which serve as representative models of strongly coupled matters in holography. Our algorithm successfully extracts the corresponding bulk metrics from these data. Additionally, we extend our methodology to many-body systems by employing entanglement entropy data from a fermionic tight-binding chain at half filling, exemplifying critical one-dimensional systems, and derive the associated bulk metric. We find that the metrics for a tight-binding chain and the Gubser-Rocha model are similar. We speculate this similarity is due to the metallic property of these models.

Submitted 11 June, 2024; originally announced June 2024.

Comments: 44 pages, 14 figures

Report number: IFT-UAM/CSIC-24-88

arXiv:2406.06277 [pdf, other]

Measurement of the branching fractions of $\bar{B}\to D^{(*)} K^- K^{(*)0}_{(S)}$ and $\bar{B}\to D^{(*)}D_s^{-}$ decays at Belle II

Authors: Belle II Collaboration, I. Adachi, L. Aggarwal, H. Aihara, N. Akopov, A. Aloisio, N. Althubiti, N. Anh Ky, D. M. Asner, H. Atmacan, T. Aushev, V. Aushev, M. Aversano, R. Ayad, V. Babu, H. Bae, S. Bahinipati, P. Bambade, Sw. Banerjee, S. Bansal, M. Barrett, J. Baudot, A. Baur, A. Beaubien, F. Becherer, et al. (382 additional authors not shown)

Abstract: We present measurements of the branching fractions of eight $\overline B{}^0\to D^{(*)+} K^- K^{(*)0}_{(S)}$, $B^{-}\to D^{(*)0} K^- K^{(*)0}_{(S)}$ decay channels. The results are based on data from SuperKEKB electron-positron collisions at the $Υ(4S)$ resonance collected with the Belle II detector, corresponding to an integrated luminosity of $362~\text{fb}^{-1}$. The event yields are extracted from fits to the distributions of the difference between expected and observed $B$ meson energy, and are efficiency-corrected as a function of $m(K^-K^{(*)0}_{(S)})$ and $m(D^{(*)}K^{(*)0}_{(S)})$ in order to avoid dependence on the decay model. These results include the first observation of $\overline B{}^0\to D^+K^-K_S^0$, $B^-\to D^{*0}K^-K_S^0$, and $\overline B{}^0\to D^{*+}K^-K_S^0$ decays and a significant improvement in the precision of the other channels compared to previous measurements. The helicity-angle distributions and the invariant mass distributions of the $K^- K^{(*)0}_{(S)}$ systems are compatible with quasi-two-body decays via a resonant transition with spin-parity $J^P=1^-$ for the $K^-K_S^0$ systems and $J^P= 1^+$ for the $K^-K^{*0}$ systems. We also present measurements of the branching fractions of four $\overline B{}^0\to D^{(*)+} D_s^-$, $B^{-}\to D^{(*)0} D_s^- $ decay channels with a precision comparable to the current world averages.

Submitted 10 June, 2024; originally announced June 2024.

Comments: Prepared for submission to JHEP. 34 pages, 14 figures

Report number: Belle II Preprint: 2024-014, KEK Preprint: 2024-8

arXiv:2406.06117 [pdf, other]

Exclusion of the Cosmological Triangle in Reactor-Based Search for Axion-Like Particles

Authors: Byung Ju Park, Jae ** Choi, Eunju Jeon, **yu Kim, Kyungwon Kim, Sung Hyun Kim, Sun Kee Kim, Yeongduk Kim, Young Ju Ko, Byoung-Cheol Koh, Chang Hyon Ha, Seo Hyun Lee, In Soo Lee, Hyunseok Lee, Hyun Su Lee, Jaison Lee, Yoomin Oh, Doo** Kim

Abstract: We report new constraints on axion-like particles (ALPs) using data corresponding to a sodium iodide target exposure of 3063 kg$\cdot$days from the neutrino elastic scattering observation with NaI (NEON) experiment. A 16.7 kg thallium-doped sodium iodide target was located 23.7 meters from a 2.8 GW thermal power nuclear reactor. We searched for ALPs produced by high-flux photons by comparing the energy spectra of data collected during reactor-on (1596 kg$\cdot$days exposure) and reactor-off (1467 kg$\cdot$days exposure) periods. No signal consistent with ALP interaction was identified, allowing us to set exclusion limits at the 95% confidence level. Our limits cover previously unexplored regions for both photon couplings (${g_{aγ}}$) and electron couplings (${g_{ae}}$) for axion masses around 1 MeV/c$^2$. Notably, the NEON data excludes the unconstrained region identified by laboratory-based searches for photon couplings within the "cosmological triangle" for the first time. The observed 95% confidence level limits reach as low as ${g_{aγ}}$ of 4.33$\times$ 10$^{-8}$ GeV$^{-1}$ and ${g_{ae}}$ of 1.10$\times$ 10$^{-9}$ for axion masses of 1.7 MeV/c$^2$ and 1.0 MeV/c$^2$, respectively.

Submitted 11 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

arXiv:2406.05963 [pdf, other]

Solution for SMART-101 Challenge of CVPR Multi-modal Algorithmic Reasoning Task 2024

Authors: **woo Ahn, Junhyeok Park, Min-Jun Kim, Kang-Hyeon Kim, So-Yeong Sohn, Yun-Ji Lee, Du-Seong Chang, Yu-Jung Heo, Eun-Sol Kim

Abstract: In this paper, the solution of HYU MLLAB KT Team to the Multimodal Algorithmic Reasoning Task: SMART-101 CVPR 2024 Challenge is presented. Beyond conventional visual question-answering problems, the SMART-101 challenge aims to achieve human-level multimodal understanding by tackling complex visio-linguistic puzzles designed for children in the 6-8 age group. To solve this problem, we suggest two main ideas. First, to utilize the reasoning ability of a large-scale language model (LLM), the given visual cues (images) are grounded in the text modality. For this purpose, we generate highly detailed text captions that describe the context of the image and use these captions as input for the LLM. Second, due to the nature of puzzle images, which often contain various geometric visual patterns, we utilize an object detection algorithm to ensure these patterns are not overlooked in the captioning process. We employed the SAM algorithm, which can detect various-size objects, to capture the visual features of these geometric patterns and used this information as input for the LLM. Under the puzzle split configuration, we achieved an option selection accuracy Oacc of 29.5 on the test set and a weighted option selection accuracy (WOSA) of 27.1 on the challenge set.

Submitted 9 June, 2024; originally announced June 2024.

arXiv:2406.05935 [pdf, other]

Control of spin-wave polarity and velocity using a ferrimagnetic domain wall

Authors: Ehsan Faridi, Giovanni Vignale, Se Kwon Kim

Abstract: We present a theoretical study of the scattering of spin waves by a domain wall (DW) in a ferrimagnetic (FiM) spin chain in which two sublattices carry spins of unequal magnitudes. We find that a narrow, but atomically smooth FiM DW exhibits a different behavior in comparison with similarly smooth ferromagnetic and antiferromagnetic DWs due to the inequivalence of the two sublattices. Specifically, for sufficiently weak anisotropy, the smaller spin at the center of the DW is found to become precisely normal to the easy-axis, selecting an arbitrary direction in the $xy$-plane and thereby breaking the U(1) spin-rotational symmetry spontaneously. This particular form of a FiM DW does not occur in antiferromagnetic systems and is shown to lead to a strong dependence of spin wave scattering pattern on the state of polarization of the spin wave, which can be either right-handed or left-handed, suggesting the utilization of such a narrow DW as a spin-wave filter. Moreover, we find that in the case of an atomically sharp DW, where all the spins point either up or down due to strong easy-axis anisotropy and therefore the polarization of the spin wave is conserved upon transmission, the wave vector of the spin wave changes after passing through the DW leading to a change in the group velocity of the spin wave. This change of the wave vector indicates the acceleration or deceleration of the spin waves and thus a sharp FiM DW could serve as a spin wave accelerator or decelerator in spintronics devices, offering a functionality absent in a ferromagnetic and an antiferromagnetic counterpart. Our results indicate that FiM spin textures can interact with spin waves distinctly from ferromagnetic and antiferromagnetic counterparts, suggesting that they may offer spin-wave functionalities that are absent in more conventional magnets.

Submitted 9 June, 2024; originally announced June 2024.

arXiv:2406.05794 [pdf, other]

RE-RAG: Improving Open-Domain QA Performance and Interpretability with Relevance Estimator in Retrieval-Augmented Generation

Authors: Kiseung Kim, Jay-Yoon Lee

Abstract: The Retrieval Augmented Generation (RAG) framework utilizes a combination of parametric knowledge and external knowledge to demonstrate state-of-the-art performance on open-domain question answering tasks. However, the RAG framework suffers from performance degradation when the query is accompanied by irrelevant contexts. In this work, we propose the RE-RAG framework, which introduces a relevance estimator (RE) that not only provides relative relevance between contexts as previous rerankers did, but also provides confidence, which can be used to classify whether a given context is useful for answering the given question. We propose a weakly supervised method for training the RE simply utilizing question-answer data without any labels for correct contexts. We show that RE trained with a small generator (sLM) can not only improve the sLM fine-tuned together with RE but also improve previously unreferenced large language models (LLMs). Furthermore, we investigate new decoding strategies that utilize the proposed confidence measured by RE, such as choosing to let the user know that the question is "unanswerable" given the retrieved contexts or choosing to rely on the LLM's parametric knowledge rather than unrelated contexts.

Submitted 16 June, 2024; v1 submitted 9 June, 2024; originally announced June 2024.

arXiv:2406.04772 [pdf, other]

REP: Resource-Efficient Prompting for On-device Continual Learning

Authors: Sungho Jeon, Xinyue Ma, Kwang In Kim, Myeongjae Jeon

Abstract: On-device continual learning (CL) requires the co-optimization of model accuracy and resource efficiency to be practical. This is extremely challenging because it must preserve accuracy while learning new tasks with continuously drifting data and maintain both high energy and memory efficiency to be deployable on real-world devices. Typically, a CL method leverages one of two types of backbone networks: CNN or ViT. It is commonly believed that CNN-based CL excels in resource efficiency, whereas ViT-based CL is superior in model performance, making each option attractive only for a single aspect. In this paper, we revisit this comparison while embracing powerful pre-trained ViT models of various sizes, including ViT-Ti (5.8M parameters). Our detailed analysis reveals that many practical options exist today for making ViT-based methods more suitable for on-device CL, even when accuracy, energy, and memory are all considered. To further expand this impact, we introduce REP, which improves resource efficiency specifically targeting prompt-based rehearsal-free methods. Our key focus is on avoiding catastrophic trade-offs with accuracy while trimming computational and memory costs throughout the training process. We achieve this by exploiting swift prompt selection that enhances input data using a carefully provisioned model, and by developing two novel algorithms, adaptive token merging (AToM) and adaptive layer dropping (ALD), that optimize the prompt updating stage. In particular, AToM and ALD perform selective skipping across the data and model-layer dimensions without compromising task-specific features in vision transformer models. Extensive experiments on three image classification datasets validate REP's superior resource efficiency over current state-of-the-art methods.

Submitted 7 June, 2024; originally announced June 2024.

Comments: 19 pages, 10 figures

arXiv:2406.04642 [pdf, ps, other]

Measurements of the branching fractions of $Ξ_{c}^{0}\toΞ^{0}π^{0}$, $Ξ_{c}^{0}\toΞ^{0}η$, and $Ξ_{c}^{0}\toΞ^{0}η^{\prime}$ and asymmetry parameter of $Ξ_{c}^{0}\toΞ^{0}π^{0}$

Authors: Belle and Belle II Collaborations, I. Adachi, L. Aggarwal, H. Aihara, N. Akopov, A. Aloisio, N. Althubiti, N. Anh Ky, D. M. Asner, H. Atmacan, T. Aushev, V. Aushev, M. Aversano, R. Ayad, V. Babu, H. Bae, S. Bahinipati, P. Bambade, Sw. Banerjee, M. Barrett, J. Baudot, A. Baur, A. Beaubien, et al. (360 additional authors not shown)

Abstract: We present a study of $Ξ_{c}^{0}\toΞ^{0}π^{0}$, $Ξ_{c}^{0}\toΞ^{0}η$, and $Ξ_{c}^{0}\toΞ^{0}η^{\prime}$ decays using the Belle and Belle~II data samples, which have integrated luminosities of 980~$\mathrm{fb}^{-1}$ and 426~$\mathrm{fb}^{-1}$, respectively. We measure the following relative branching fractions $${\cal B}(Ξ_{c}^{0}\toΞ^{0}π^{0})/{\cal B}(Ξ_{c}^{0}\toΞ^{-}π^{+}) = 0.48 \pm 0.02 ({\rm stat}) \pm 0.03 ({\rm syst}) ,$$ $${\cal B}(Ξ_{c}^{0}\toΞ^{0}η)/{\cal B}(Ξ_{c}^{0}\toΞ^{-}π^{+}) = 0.11 \pm 0.01 ({\rm stat}) \pm 0.01 ({\rm syst}) ,$$ $${\cal B}(Ξ_{c}^{0}\toΞ^{0}η^{\prime})/{\cal B}(Ξ_{c}^{0}\toΞ^{-}π^{+}) = 0.08 \pm 0.02 ({\rm stat}) \pm 0.01 ({\rm syst}) $$ for the first time, where the uncertainties are statistical ($\rm stat$) and systematic ($\rm syst$). By multiplying by the branching fraction of the normalization mode, ${\mathcal B}(Ξ_{c}^{0}\toΞ^{-}π^{+})$, we obtain the following absolute branching fraction results $(6.9 \pm 0.3 ({\rm stat}) \pm 0.5 ({\rm syst}) \pm 1.3 ({\rm norm})) \times 10^{-3}$, $(1.6 \pm 0.2 ({\rm stat}) \pm 0.2 ({\rm syst}) \pm 0.3 ({\rm norm})) \times 10^{-3}$, and $(1.2 \pm 0.3 ({\rm stat}) \pm 0.1 ({\rm syst}) \pm 0.2 ({\rm norm})) \times 10^{-3}$, for $Ξ_{c}^{0}$ decays to $Ξ^{0}π^{0}$, $Ξ^{0}η$, and $Ξ^{0}η^{\prime}$ final states, respectively. The third errors are from the uncertainty on ${\mathcal B}(Ξ_{c}^{0}\toΞ^{-}π^{+})$. The asymmetry parameter for $Ξ_{c}^{0}\toΞ^{0}π^{0}$ is measured to be $α(Ξ_{c}^{0}\toΞ^{0}π^{0}) = -0.90\pm0.15({\rm stat})\pm0.23({\rm syst})$.

Submitted 7 June, 2024; originally announced June 2024.

Comments: 23 pages, 5 figures

Report number: Belle II Preprint 2024-015; KEK Preprint 2024-9

arXiv:2406.04308 [pdf, other]

Approximation-Aware Bayesian Optimization

Authors: Natalie Maus, Kyurae Kim, Geoff Pleiss, David Eriksson, John P. Cunningham, Jacob R. Gardner

Abstract: High-dimensional Bayesian optimization (BO) tasks such as molecular design often require 10,000 function evaluations before obtaining meaningful results. While methods like sparse variational Gaussian processes (SVGPs) reduce computational requirements in these settings, the underlying approximations result in suboptimal data acquisitions that slow the progress of optimization. In this paper we modify SVGPs to better align with the goals of BO: targeting informed data acquisition rather than global posterior fidelity. Using the framework of utility-calibrated variational inference, we unify GP approximation and data acquisition into a joint optimization problem, thereby ensuring optimal decisions under a limited computational budget. Our approach can be used with any decision-theoretic acquisition function and is compatible with trust region methods like TuRBO. We derive efficient joint objectives for the expected improvement and knowledge gradient acquisition functions in both the standard and batch BO settings. Our approach outperforms standard SVGPs on high-dimensional benchmark tasks in control and molecular design.

Submitted 6 June, 2024; originally announced June 2024.

arXiv:2406.03804 [pdf, other]

Exploring the interplay between mass-energy equivalence, interactions and entanglement in an optical lattice clock

Authors: Anjun Chu, Victor J. Martínez-Lahuerta, Maya Miklos, Kyungtae Kim, Peter Zoller, Klemens Hammerer, Jun Ye, Ana Maria Rey

Abstract: We propose protocols that probe manifestations of the mass-energy equivalence in an optical lattice clock (OLC) interrogated with spin coherent and entangled quantum states. To tune and uniquely distinguish the mass-energy equivalence effects (gravitational redshift and second order Doppler shift) in such a setting, we devise a dressing protocol using an additional nuclear spin state. We then analyze the interplay between photon-mediated interactions and gravitational redshift and show that such interplay can lead to entanglement generation and frequency synchronization. In the regime where all atomic spins synchronize, we show the synchronization time depends on the initial entanglement of the state and can be used as a proxy of its metrological gain compared to a classical state. Our work opens new possibilities for exploring the effects of general relativity on quantum coherence and entanglement in OLC experiments.

Submitted 6 June, 2024; originally announced June 2024.

Comments: 7+17 pages, 4+6 figures

arXiv:2406.03773 [pdf, other]

Optimizing Multi-User Semantic Communication via Transfer Learning and Knowledge Distillation

Authors: Loc X. Nguyen, Kitae Kim, Ye Lin Tun, Sheikh Salman Hassan, Yan Kyaw Tun, Zhu Han, Choong Seon Hong

Abstract: Semantic communication, notable for ensuring quality of service by jointly optimizing source and channel coding, effectively extracts data semantics, reduces transmission length, and mitigates channel noise. However, most studies overlook multi-user scenarios and resource availability, limiting real-world application. This paper addresses this gap by focusing on downlink communication from a base station to multiple users with varying computing capacities. Users employ variants of Swin transformer models for source decoding and a simple architecture for channel decoding. We propose a novel training regimen, incorporating transfer learning and knowledge distillation to improve low-computing users' performance. Extensive simulations validate the proposed methods.

Submitted 6 June, 2024; originally announced June 2024.

Comments: 5 pages, 5 figures

arXiv:2406.03486 [pdf, other]

BIPED: Pedagogically Informed Tutoring System for ESL Education

Authors: Soonwoo Kwon, Sojung Kim, Minju Park, Seunghyun Lee, Kyuseok Kim

Abstract: Large Language Models (LLMs) have a great potential to serve as readily available and cost-efficient Conversational Intelligent Tutoring Systems (CITS) for teaching L2 learners of English. Existing CITS, however, are designed to teach only simple concepts or lack the pedagogical depth necessary to address diverse learning strategies. To develop a more pedagogically informed CITS capable of teaching complex concepts, we construct a BIlingual PEDagogically-informed Tutoring Dataset (BIPED) of one-on-one, human-to-human English tutoring interactions. Through post-hoc analysis of the tutoring interactions, we come up with a lexicon of dialogue acts (34 tutor acts and 9 student acts), which we use to further annotate the collected dataset. Based on a two-step framework of first predicting the appropriate tutor act then generating the corresponding response, we implemented two CITS models using GPT-4 and SOLAR-KO, respectively. We experimentally demonstrate that the implemented models not only replicate the style of human teachers but also employ diverse and contextually appropriate pedagogical strategies.

Submitted 5 June, 2024; originally announced June 2024.

Comments: ACL 2024

arXiv:2406.02008 [pdf]

High-Performance Ferroelectric Field-Effect Transistors with Ultra-High Current and Carrier Densities

Authors: Seunguk Song, Kwan-Ho Kim, Rachael Keneipp, Nicholas Trainor, Chen Chen, Jeffrey Zheng, Joan M. Redwing, Marija Drndić, Roy H. Olsson III, Deep Jariwala

Abstract: Ferroelectric field-effect transistors (FeFET) with two-dimensional (2D) semiconductor channels are promising low-power, embedded non-volatile memory (NVM) candidates for next-generation in-memory computing. However, the performance of FeFETs can be limited by a charge imbalance between the ferroelectric layer and the channel, and for low-dimensional semiconductors, also by a high contact resistance between the metal electrodes and the channel. Here, we report a significant enhancement in performance of contact-engineered FeFETs with a 2D MoS2 channel and a ferroelectric Al0.68Sc0.32N (AlScN) gate dielectric. Replacing Ti with In contact electrodes results in a fivefold increase in on-state current (~120 uA/um at 1 V) and on-to-off ratio (~2*10^7) in the FeFETs. In addition, the high carrier concentration in the MoS2 channel during the on-state (> 10^14 cm^-2) facilitates the observation of a metal-to-insulator phase transition in monolayer MoS2, permitting observation of high field effect mobility (> 100 cm^2V^-1s^-1) at cryogenic temperatures. Our work and devices broaden the potential of FeFETs and provide a unique platform to implement high-carrier-density transport in a 2D channel.

Submitted 4 June, 2024; originally announced June 2024.

Comments: 42 pages, 5 main figures

arXiv:2406.02000 [pdf, other]

Advancing Ultra-Reliable 6G: Transformer and Semantic Localization Empowered Robust Beamforming in Millimeter-Wave Communications

Authors: Avi Deb Raha, Kitae Kim, Apurba Adhikary, Mrityunjoy Gain, Choong Seon Hong

Abstract: Advancements in 6G wireless technology have elevated the importance of beamforming, especially for attaining ultra-high data rates via millimeter-wave (mmWave) frequency deployment. Although promising, mmWave bands require substantial beam training to achieve precise beamforming. While initial deep learning models that use RGB camera images demonstrated promise in reducing beam training overhead, their performance suffers due to sensitivity to lighting and environmental variations. Due to this sensitivity, Quality of Service (QoS) fluctuates, eventually affecting the stability and dependability of networks in dynamic environments. This emphasizes a critical need for more robust solutions. This paper proposes a robust beamforming technique to ensure consistent QoS under varying environmental conditions. An optimization problem has been formulated to maximize users' data rates. To solve the formulated NP-hard optimization problem, we decompose it into two subproblems: the semantic localization problem and the optimal beam selection problem. To solve the semantic localization problem, we propose a novel method that leverages the k-means clustering and YOLOv8 model. To solve the beam selection problem, we propose a novel lightweight hybrid architecture that utilizes various data sources and a weighted entropy-based mechanism to predict the optimal beams. Rapid and accurate beam predictions are needed to maintain QoS. A novel metric, Accuracy-Complexity Efficiency (ACE), has been proposed to quantify this. Six testing scenarios have been developed to evaluate the robustness of the proposed model. Finally, the simulation result demonstrates that the proposed model outperforms several state-of-the-art baselines regarding beam prediction accuracy, received power, and ACE in the developed test scenarios.

Submitted 21 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

arXiv:2406.00920 [pdf, ps, other]

Demystifying SGD with Doubly Stochastic Gradients

Authors: Kyurae Kim, Joohwan Ko, Yi-An Ma, Jacob R. Gardner

Abstract: Optimization objectives in the form of a sum of intractable expectations are rising in importance (e.g., diffusion models, variational autoencoders, and many more), a setting also known as "finite sum with infinite data." For these problems, a popular strategy is to employ SGD with doubly stochastic gradients (doubly SGD): the expectations are estimated using the gradient estimator of each component, while the sum is estimated by subsampling over these estimators. Despite its popularity, little is known about the convergence properties of doubly SGD, except under strong assumptions such as bounded variance. In this work, we establish the convergence of doubly SGD with independent minibatching and random reshuffling under general conditions, which encompasses dependent component gradient estimators. In particular, for dependent estimators, our analysis allows fine-grained analysis of the effect of correlations. As a result, under a per-iteration computational budget of $b \times m$, where $b$ is the minibatch size and $m$ is the number of Monte Carlo samples, our analysis suggests where one should invest most of the budget in general. Furthermore, we prove that random reshuffling (RR) improves the complexity dependence on the subsampling noise.

Submitted 2 June, 2024; originally announced June 2024.

Comments: Accepted to ICML'24

arXiv:2406.00857 [pdf, other]

Modeling the refractive index profile n(z) of polar ice for ultra-high energy neutrino experiments

Authors: S. Ali, P. Allison, S. Archambault, J. J. Beatty, D. Z. Besson, A. Bishop, P. Chen, Y. C. Chen, B. A. Clark, W. Clay, A. Connolly, K. Couberly, L. Cremonesi, A. Cummings, P. Dasgupta, R. Debolt, S. de Kockere, K. D. de Vries, C. Deaconu, M. A. DuVernois, J. Flaherty, E. Friedman, R. Gaior, P. Giri, J. Hanson , et al. (45 additional authors not shown)

Abstract: We develop an in-situ index of refraction profile using the transit time of radio signals broadcast from an englacial transmitter to 2-5 km distant radio-frequency receivers, deployed at depths up to 200 m. Maxwell's equations generally admit two ray propagation solutions from a given transmitter, corresponding to a direct path (D) and a refracted path (R); the measured D vs. R (dt(D,R)) timing differences provide constraints on the index of refraction profile near South Pole, where the Askaryan Radio Array (ARA) neutrino observatory is located. We constrain the refractive index profile by simulating D and R ray paths via ray tracing and comparing those to measured dt(D,R) signals. Using previous ice density data as a proxy for n(z), we demonstrate that our data strongly favors a glaciologically-motivated three-phase densification model rather than a single exponential scale height model. Simulations show that the single exponential model overestimates ARA neutrino sensitivity compared to the three-phase model.

Submitted 11 June, 2024; v1 submitted 2 June, 2024; originally announced June 2024.

arXiv:2406.00810 [pdf, other]

Expanding the Attack Scenarios of SAE J1939: A Comprehensive Analysis of Established and Novel Vulnerabilities in Transport Protocol

Authors: Hwejae Lee, Hyosun Lee, Saehee Jun, Huy Kang Kim

Abstract: Following the enactment of the UN Regulation, substantial efforts have been directed toward implementing intrusion detection and prevention systems (IDPSs) and vulnerability analysis in Controller Area Network (CAN). However, Society of Automotive Engineers (SAE) J1939 protocol, despite its extensive application in camping cars and commercial vehicles, has seen limited vulnerability identification, which raises significant safety concerns in the event of security breaches. In this research, we explore and demonstrate attack techniques specific to SAE J1939 communication protocol. We introduce 14 attack scenarios, enhancing the discourse with seven scenarios recognized in the previous research and unveiling seven novel scenarios through our elaborate study. To verify the feasibility of these scenarios, we leverage a sophisticated testbed that facilitates real-time communication and the simulation of attacks. Our testing confirms the successful execution of 11 scenarios, underscoring their imminent threat to commercial vehicle operations. Some attacks will be difficult to detect because they only inject a single message. These results highlight unique vulnerabilities within SAE J1939 protocol, indicating the automotive cybersecurity community needs to address the identified risks.

Submitted 2 June, 2024; originally announced June 2024.

Comments: 18 pages, 7 figures, 5 tables; This is the accepted version of ESCAR USA 2024

MSC Class: 68M25 ACM Class: K.6.5

arXiv:2405.20233 [pdf, other]

Grokfast: Accelerated Grokking by Amplifying Slow Gradients

Authors: Jaerin Lee, Bong Gyun Kang, Kihoon Kim, Kyoung Mu Lee

Abstract: One puzzling artifact in machine learning dubbed grokking is where delayed generalization is achieved tenfolds of iterations after near perfect overfitting to the training data. Focusing on the long delay itself on behalf of machine learning practitioners, our goal is to accelerate generalization of a model under grokking phenomenon. By regarding a series of gradients of a parameter over training iterations as a random signal over time, we can spectrally decompose the parameter trajectories under gradient descent into two components: the fast-varying, overfitting-yielding component and the slow-varying, generalization-inducing component. This analysis allows us to accelerate the grokking phenomenon more than $\times 50$ with only a few lines of code that amplifies the slow-varying components of gradients. The experiments show that our algorithm applies to diverse tasks involving images, languages, and graphs, enabling practical availability of this peculiar artifact of sudden generalization. Our code is available at https://github.com/ironjr/grokfast.

Submitted 5 June, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

Comments: 17 pages, 13 figures. Typo fixed. Project page: https://jaerinlee.com/research/grokfast

arXiv:2405.19771 [pdf, other]

Data Service Maximization in Integrated Terrestrial-Non-Terrestrial 6G Networks: A Deep Reinforcement Learning Approach

Authors: Nway Nway Ei, Kitae Kim, Yan Kyaw Tun, Choong Seon Hong

Abstract: Integrating terrestrial and non-terrestrial networks has emerged as a promising paradigm to fulfill the constantly growing demand for connectivity, low transmission delay, and quality of services (QoS). This integration brings together the strengths of terrestrial and non-terrestrial networks, such as the reliability of terrestrial networks, broad coverage, and service continuity of non-terrestrial networks like low earth orbit (LEO) satellites. In this work, we study a data service maximization problem in an integrated terrestrial-non-terrestrial network (I-TNT) where the ground base stations (GBSs) and LEO satellites cooperatively serve the coexisting aerial users (AUs) and ground users (GUs). Then, by considering the spectrum scarcity, interference, and QoS requirements of the users, we jointly optimize the user association, AUE's trajectory, and power allocation. To tackle the formulated mixed-integer non-convex problem, we disintegrate it into two subproblems: 1) user association problem and 2) trajectory and power allocation problem. Since the user association problem is a binary integer programming problem, we use the standard convex optimization method to solve it. Meanwhile, the trajectory and power allocation problem is solved by the deep deterministic policy gradient (DDPG) method to cope with the problem's non-convexity and dynamic network environments. Then, the two subproblems are alternately solved by the proposed iterative algorithm. By comparing with the baselines in the existing literature, extensive simulations are conducted to evaluate the performance of the proposed framework.

Submitted 30 May, 2024; originally announced May 2024.

Comments: 5 pages, 4 figures

arXiv:2405.19734 [pdf, other]

Search for the decay $B^{0}\toγγ$ using Belle and Belle II data

Authors: Belle, Belle II Collaborations, I. Adachi, L. Aggarwal, H. Aihara, N. Akopov, A. Aloisio, S. Al Said, N. Althubiti, N. Anh Ky, D. M. Asner, H. Atmacan, T. Aushev, V. Aushev, M. Aversano, R. Ayad, V. Babu, H. Bae, S. Bahinipati, P. Bambade, Sw. Banerjee, S. Bansal, M. Barrett, J. Baudot , et al. (385 additional authors not shown)

Abstract: We report the result of a search for the rare decay $B^{0} \to γγ$ using a combined dataset of $753\times10^{6}$ $B\bar{B}$ pairs collected by the Belle experiment and $387\times10^{6}$ $B\bar{B}$ pairs collected by the Belle II experiment from decays of the $\rm Υ(4S)$ resonance produced in $e^{+}e^{-}$ collisions. A simultaneous fit to the Belle and Belle II data sets yields $11.0^{+6.5}_{-5.5}$ signal events, corresponding to a 2.5$σ$ significance. We determine the branching fraction $\mathcal{B}(B^{0} \to γγ) = (3.7^{+2.2}_{-1.8}(\rm stat)\pm0.5(\rm syst))\times10^{-8}$ and set a 90% credibility level upper limit of $\mathcal{B}(B^{0} \to γγ) < 6.4\times10^{-8}$.

Submitted 30 May, 2024; originally announced May 2024.

Report number: Belle II Preprint: 2024-017, KEK Preprint: 2024-13

arXiv:2405.18928 [pdf, other]

Measurement of the energy dependence of the $e^+e^- \to B\bar{B}$, $B\bar{B}{}^*$, and $B^*\bar{B}{}^*$ cross sections at Belle~II

Authors: Belle II Collaboration, I. Adachi, L. Aggarwal, H. Ahmed, H. Aihara, N. Akopov, A. Aloisio, N. Althubiti, N. Anh Ky, D. M. Asner, H. Atmacan, T. Aushev, V. Aushev, M. Aversano, R. Ayad, V. Babu, H. Bae, S. Bahinipati, P. Bambade, Sw. Banerjee, S. Bansal, M. Barrett, J. Baudot, M. Bauer, A. Baur , et al. (444 additional authors not shown)

Abstract: We report measurements of the $e^+e^- \to B\bar{B}$, $B\bar{B}{}^*$, and $B^*\bar{B}{}^*$ cross sections at four energies, 10653, 10701, 10746 and 10805 MeV, using data collected by the Belle~II experiment. We reconstruct one $B$ meson in a large number of hadronic final states and use its momentum to identify the production process. In the first $2-5$ MeV above $B^*\bar{B}{}^*$ threshold, the $e^+e^- \to B^*\bar{B}{}^*$ cross section increases rapidly. This may indicate the presence of a pole close to the threshold.

Submitted 29 May, 2024; originally announced May 2024.

Comments: 30 pages, 15 figures, submitted to JHEP

Report number: Belle II Preprint 2024-016, KEK Preprint 2024-12

arXiv:2405.18792 [pdf, other]

Kernel Metric Learning for In-Sample Off-Policy Evaluation of Deterministic RL Policies

Authors: Haanvid Lee, Tri Wahyu Guntara, Jongmin Lee, Yung-Kyun Noh, Kee-Eung Kim

Abstract: We consider off-policy evaluation (OPE) of deterministic target policies for reinforcement learning (RL) in environments with continuous action spaces. While it is common to use importance sampling for OPE, it suffers from high variance when the behavior policy deviates significantly from the target policy. In order to address this issue, some recent works on OPE proposed in-sample learning with importance resampling. Yet, these approaches are not applicable to deterministic target policies for continuous action spaces. To address this limitation, we propose to relax the deterministic target policy using a kernel and learn the kernel metrics that minimize the overall mean squared error of the estimated temporal difference update vector of an action value function, where the action value function is used for policy evaluation. We derive the bias and variance of the estimation error due to this relaxation and provide analytic solutions for the optimal kernel metric. In empirical studies using various test domains, we show that the OPE with in-sample learning using the kernel with optimized metric achieves significantly improved accuracy than other baselines.

Submitted 29 May, 2024; originally announced May 2024.

Comments: 23 pages, 2 figures, Accepted at ICLR 2024 (spotlight)

arXiv:2405.15987 [pdf, other]

Modes of Analyzing Disinformation Narratives With AI/ML/Text Mining to Assist in Mitigating the Weaponization of Social Media

Authors: Andy Skumanich, Han Kyul Kim

Abstract: This paper highlights the developing need for quantitative modes for capturing and monitoring malicious communication in social media. There has been a deliberate "weaponization" of messaging through the use of social networks including by politically oriented entities both state sponsored and privately run. The article identifies a use of AI/ML characterization of generalized "mal-info," a broad term which includes deliberate malicious narratives similar with hate speech, which adversely impact society. A key point of the discussion is that this mal-info will dramatically increase in volume, and it will become essential for sharable quantifying tools to provide support for human expert intervention. Despite attempts to introduce moderation on major platforms like Facebook and X/Twitter, there are now established alternative social networks that offer completely unmoderated spaces. The paper presents an introduction to these platforms and the initial results of a qualitative and semi-quantitative analysis of characteristic mal-info posts. The authors perform a rudimentary text mining function for a preliminary characterization in order to evaluate the modes for better-automated monitoring. The action examines several inflammatory terms using text analysis and, importantly, discusses the use of generative algorithms by one political agent in particular, providing some examples of the potential risks to society. This latter is of grave concern, and monitoring tools must be established. This paper presents a preliminary step to selecting relevant sources and to setting a foundation for characterizing the mal-info, which must be monitored. The AI/ML methods provide a means for semi-quantitative signature capture. The impending use of "mal-GenAI" is presented.

Submitted 24 May, 2024; originally announced May 2024.

Comments: Accepted at ICWSM-2024 Workshop on Digital State Sponsored Disinformation and Propaganda: Challenges and Opportunities (DSSDP24)

arXiv:2405.14727 [pdf, other]

Quantized geodesic lengths for Teichmüller spaces: algebraic aspects

Authors: Hyun Kyu Kim

Abstract: In 1980's H Verlinde suggested to construct and use a quantization of Teichmüller spaces to construct spaces of conformal blocks for the Liouville conformal field theory. This suggestion led to a mathematical formulation by Fock in 1990's, called the modular functor conjecture, based on the Chekhov-Fock quantum Teichmüller theory. In 2000's Teschner combined the Chekhov-Fock version and the Kashaev version of quantum Teichmüller theory to construct a solution to a modified form of the conjecture. We embark on a direct approach to the conjecture based on the Chekhov-Fock(-Goncharov) theory. We construct quantized trace-of-monodromy along simple loops via Bonahon and Wong's quantum trace maps developed in 2010's, and investigate algebraic structures of them, which will eventually lead to construction and properties of quantized geodesic length operators. We show that a special recursion relation used by Teschner is satisfied by the quantized trace-of-monodromy, and that the quantized trace-of-monodromy for disjoint loops commute in a certain strong sense.

Submitted 23 May, 2024; originally announced May 2024.

Comments: 74 pages

MSC Class: 18M20; 57K31; 57K20; 13F60; 81R60; 46L65

arXiv:2405.14625 [pdf, other]

Test of light-lepton universality in $τ$ decays with the Belle II experiment

Authors: Belle II Collaboration, I. Adachi, K. Adamczyk, L. Aggarwal, H. Aihara, N. Akopov, A. Aloisio, N. Anh Ky, D. M. Asner, H. Atmacan, V. Aushev, M. Aversano, R. Ayad, V. Babu, H. Bae, S. Bahinipati, P. Bambade, Sw. Banerjee, S. Bansal, M. Barrett, J. Baudot, A. Baur, A. Beaubien, F. Becherer, J. Becker , et al. (406 additional authors not shown)

Abstract: We present a measurement of the ratio $R_μ= \mathcal{B}(τ^-\to μ^-\barν_μν_τ) / \mathcal{B}(τ^-\to e^-\barν_eν_τ)$ of branching fractions $\mathcal{B}$ of the $τ$ lepton decaying to muons or electrons using data collected with the Belle II detector at the SuperKEKB $e^+e^-$ collider. The sample has an integrated luminosity of 362 fb$^{-1}$ at a centre-of-mass energy of 10.58 GeV. Using an optimised event selection, a binned maximum likelihood fit is performed using the momentum spectra of the electron and muon candidates. The result, $R_μ= 0.9675 \pm 0.0007 \pm 0.0036$, where the first uncertainty is statistical and the second is systematic, is the most precise to date. It provides a stringent test of the light-lepton universality, translating to a ratio of the couplings of the muon and electron to the $W$ boson in $τ$ decays of $0.9974 \pm 0.0019$, in agreement with the standard model expectation of unity.

Submitted 23 May, 2024; originally announced May 2024.

Report number: Belle II Preprint 2024-002, KEK Preprint 2023-49

arXiv:2405.14155 [pdf]

Room-temperature waveguide-integrated photodetector using bolometric effect for mid-infrared spectroscopy applications

Authors: Joonsup Shim, **ha Lim, Inki Kim, Jaeyong Jeong, Bong Ho Kim, Seong Kwang Kim, Dae-Myeong Geum, SangHyeon Kim

Abstract: Waveguide-integrated mid-infrared (MIR) photodetectors are pivotal components for developing molecular spectroscopy applications, leveraging mature photonic integrated circuit (PIC) technologies. Despite various strategies, critical challenges still remain in achieving broadband photoresponse, cooling-free operation, and large-scale complementary-metal-oxide-semiconductor (CMOS)-compatible manufacturability. To leap beyond these limitations, the bolometric effect - a thermal detection mechanism - is introduced into the waveguide platform. More importantly, we pursue a free-carrier absorption (FCA) process in germanium (Ge) to create an efficient light-absorbing medium, providing a pragmatic solution for full coverage of the MIR spectrum without incorporating exotic materials into CMOS. Here, we present an uncooled waveguide-integrated photodetector based on a Ge-on-insulator (Ge-OI) PIC architecture, exploiting the bolometric effect combined with FCA. Notably, our device exhibits a broadband responsivity of ~12 mA/W across 4030-4360 nm (and potentially beyond), challenging the state of the art, while achieving a noise-equivalent power of 3.4x10^-9 W/Hz^0.5 at 4180 nm. We further demonstrate label-free sensing of carbon dioxide using our integrated photodetector and sensing waveguide on a single chip. This approach to room-temperature waveguide-integrated MIR photodetection, harnessing bolometry with FCA in Ge, not only facilitates the realization of fully integrated lab-on-a-chip systems with wavelength flexibility but also provides a blueprint for MIR PICs with CMOS-foundry-compatibility.

Submitted 23 May, 2024; originally announced May 2024.

Comments: 6 figures for the main manuscript and 14 figures for the supplementary information

arXiv:2405.12421 [pdf, other]

A Unified Linear Programming Framework for Offline Reward Learning from Human Demonstrations and Feedback

Authors: Kihyun Kim, Jiawei Zhang, Asuman Ozdaglar, Pablo A. Parrilo

Abstract: Inverse Reinforcement Learning (IRL) and Reinforcement Learning from Human Feedback (RLHF) are pivotal methodologies in reward learning, which involve inferring and shaping the underlying reward function of sequential decision-making problems based on observed human demonstrations and feedback. Most prior work in reward learning has relied on prior knowledge or assumptions about decision or preference models, potentially leading to robustness issues. In response, this paper introduces a novel linear programming (LP) framework tailored for offline reward learning. Utilizing pre-collected trajectories without online exploration, this framework estimates a feasible reward set from the primal-dual optimality conditions of a suitably designed LP, and offers an optimality guarantee with provable sample efficiency. Our LP framework also enables aligning the reward functions with human feedback, such as pairwise trajectory comparison data, while maintaining computational tractability and sample efficiency. We demonstrate that our framework potentially achieves better performance compared to the conventional maximum likelihood estimation (MLE) approach through analytical examples and numerical experiments.

Submitted 3 June, 2024; v1 submitted 20 May, 2024; originally announced May 2024.

Comments: ICML 2024

arXiv:2405.11905 [pdf, other]

CSTA: CNN-based Spatiotemporal Attention for Video Summarization

Authors: Jaewon Son, Jaehun Park, Kwangsu Kim

Abstract: Video summarization aims to generate a concise representation of a video, capturing its essential content and key moments while reducing its overall length. Although several methods employ attention mechanisms to handle long-term dependencies, they often fail to capture the visual significance inherent in frames. To address this limitation, we propose a CNN-based SpatioTemporal Attention (CSTA) method that stacks each feature of frames from a single video to form image-like frame representations and applies 2D CNN to these frame features. Our methodology relies on CNN to comprehend the inter and intra-frame relations and to find crucial attributes in videos by exploiting its ability to learn absolute positions within images. In contrast to previous work compromising efficiency by designing additional modules to focus on spatial importance, CSTA requires minimal computational overhead as it uses CNN as a sliding window. Extensive experiments on two benchmark datasets (SumMe and TVSum) demonstrate that our proposed approach achieves state-of-the-art performance with fewer MACs compared to previous methods. Codes are available at https://github.com/thswodnjs3/CSTA.

Submitted 21 May, 2024; v1 submitted 20 May, 2024; originally announced May 2024.

Comments: Accepted at CVPR 2024

arXiv:2405.11390 [pdf, other]

Search for Two-Body $B$ Meson Decays to $Λ^{0}$ and $Ω^{(*)0}_{c}$

Authors: Belle Collaboration, V. Savinov, I. Adachi, J. K. Ahn, H. Aihara, D. M. Asner, H. Atmacan, R. Ayad, Sw. Banerjee, J. Bennett, M. Bessner, V. Bhardwaj, D. Biswas, A. Bobrov, D. Bodrov, J. Borah, M. Bračko, P. Branchini, T. E. Browder, A. Budano, D. Červenkov, M. -C. Chang, P. Chang, B. G. Cheon, K. Cho , et al. (124 additional authors not shown)

Abstract: We report the results of the first search for Standard Model and baryon-number-violating two-body decays of the neutral $B$ mesons to $Λ^{0}$ and $Ω^{(*)0}_c$ using 711~${\rm fb^{-1}}$ of data collected at the $Υ(4S)$ resonance with the Belle detector at the KEKB asymmetric-energy $e^+ e^-$ collider. We observe no evidence of signal from any such decays and set 95\% confidence-level upper limits on the products of $B^0$ and $\bar{B}^0$ branching fractions for these two-body decays with $\mathcal{B}(Ω_{c}^{0} \to π^+ Ω^-)$ in the range between 9.5~$\times 10^{-8}$ and 31.2~$\times 10^{-8}$.

Submitted 18 May, 2024; originally announced May 2024.

Comments: 6 pages, 2 figures, submitted to PRD(L)

Report number: Belle Preprint 2024-04, KEK Preprint 2024-5

arXiv:2405.11254 [pdf, other]

Spread and Spectral Complexity in Quantum Spin Chains: from Integrability to Chaos

Authors: Hugo A. Camargo, Kyoung-Bum Huh, Viktor Jahnke, Hyun-Sik Jeong, Keun-Young Kim, Mitsuhiro Nishida

Abstract: We explore spread and spectral complexity in quantum systems that exhibit a transition from integrability to chaos, namely the mixed-field Ising model and the next-to-nearest-neighbor deformation of the Heisenberg XXZ spin chain. We corroborate the observation that the presence of a peak in spread complexity before its saturation, is a characteristic feature in chaotic systems. We find that, in general, the saturation value of spread complexity post-peak depends not only on the spectral statistics of the Hamiltonian, but also on the specific state. However, there appears to be a maximal universal bound determined by the symmetries and dimension of the Hamiltonian, which is realized by the thermofield double state (TFD) at infinite temperature. We also find that the time scales at which the spread complexity and spectral form factor change their behaviour agree with each other and are independent of the chaotic properties of the systems. In the case of spectral complexity, we identify that the key factor determining its saturation value and timescale in chaotic systems is given by minimum energy difference in the theory's spectrum. This explains observations made in the literature regarding its earlier saturation in chaotic systems compared to their integrable counterparts. We conclude by discussing the properties of the TFD which, we conjecture, make it suitable for probing signatures of chaos in quantum many-body systems.

Submitted 3 June, 2024; v1 submitted 18 May, 2024; originally announced May 2024.

Comments: v1: 35 pages, 18 figures, v2: references added, minor changes

Report number: IFT-UAM/CSIC-24-65

arXiv:2405.10908 [pdf, other]

UVCANDELS: The role of dust on the stellar mass-size relation of disk galaxies at 0.5 $\leq z \leq$ 3.0

Authors: Kalina V. Nedkova, Marc Rafelski, Harry I. Teplitz, Vihang Mehta, Laura DeGroot, Swara Ravindranath, Anahita Alavi, Alexander Beckett, Norman A. Grogin, Boris Häußler, Anton M. Koekemoer, Grecco A. Oyarzún, Laura Prichard, Mitchell Revalski, Gregory F. Snyder, Ben Sunnquist, Xin Wang, Rogier A. Windhorst, Nima Chartab, Christopher J. Conselice, Yicheng Guo, Nimish Hathi, Matthew J. Hayes, Zhiyuan Ji, Keunho J. Kim , et al. (8 additional authors not shown)

Abstract: We use the Ultraviolet Imaging of the Cosmic Assembly Near-infrared Deep Extragalactic Legacy Survey fields (UVCANDELS) to measure half-light radii in the rest-frame far-UV for $\sim$16,000 disk-like galaxies over $0.5\leq z \leq 3$. We compare these results to rest-frame optical sizes that we measure in a self-consistent way and find that the stellar mass-size relation of disk galaxies is steeper in the rest-frame UV than in the optical across our entire redshift range. We show that this is mainly driven by massive galaxies ($\gtrsim10^{10}$M$_\odot$), which we find to also be among the most dusty. Our results are consistent with the literature and have commonly been interpreted as evidence of inside-out growth wherein galaxies form their central structures first. However, they could also suggest that the centers of massive galaxies are more heavily attenuated than their outskirts. We distinguish between these scenarios by modeling and selecting galaxies at $z=2$ from the VELA simulation suite in a way that is consistent with UVCANDELS. We show that the effects of dust alone can account for the size differences we measure at $z=2$. This indicates that, at different wavelengths, size differences and the different slopes of the stellar mass-size relation do not constitute evidence for inside-out growth.

Submitted 28 June, 2024; v1 submitted 17 May, 2024; originally announced May 2024.

Comments: Accepted for publication in ApJ. 22 pages, 12 figures, and 4 tables

arXiv:2405.10123 [pdf, other]

Asynchronous Federated Stochastic Optimization for Heterogeneous Objectives Under Arbitrary Delays

Authors: Charikleia Iakovidou, Kibaek Kim

Abstract: Federated learning (FL) was recently proposed to securely train models with data held over multiple locations ("clients") under the coordination of a central server. Two major challenges hindering the performance of FL algorithms are long training times caused by straggling clients, and a decline in model accuracy under non-iid local data distributions ("client drift"). In this work, we propose and analyze Asynchronous Exact Averaging (AREA), a new stochastic (sub)gradient algorithm that utilizes asynchronous communication to speed up convergence and enhance scalability, and employs client memory to correct the client drift caused by variations in client update frequencies. Moreover, AREA is, to the best of our knowledge, the first method that is guaranteed to converge under arbitrarily long delays, without the use of delay-adaptive stepsizes, and (i) for strongly convex, smooth functions, asymptotically converges to an error neighborhood whose size depends only on the variance of the stochastic gradients used with respect to the number of iterations, and (ii) for convex, non-smooth functions, matches the convergence rate of the centralized stochastic subgradient method up to a constant factor, which depends on the average of the individual client update frequencies instead of their minimum (or maximum). Our numerical results validate our theoretical analysis and indicate AREA outperforms state-of-the-art methods when local data are highly non-iid, especially as the number of clients grows.

Submitted 28 May, 2024; v1 submitted 16 May, 2024; originally announced May 2024.

Showing 1–50 of 4,214 results for author: Kim, K