-
Test of light-lepton universality in $τ$ decays with the Belle II experiment
Authors:
Belle II Collaboration,
I. Adachi,
K. Adamczyk,
L. Aggarwal,
H. Aihara,
N. Akopov,
A. Aloisio,
N. Anh Ky,
D. M. Asner,
H. Atmacan,
V. Aushev,
M. Aversano,
R. Ayad,
V. Babu,
H. Bae,
S. Bahinipati,
P. Bambade,
Sw. Banerjee,
S. Bansal,
M. Barrett,
J. Baudot,
A. Baur,
A. Beaubien,
F. Becherer,
J. Becker
, et al. (406 additional authors not shown)
Abstract:
We present a measurement of the ratio $R_μ= \mathcal{B}(τ^-\to μ^-\barν_μν_τ) / \mathcal{B}(τ^-\to e^-\barν_eν_τ)$ of branching fractions $\mathcal{B}$ of the $τ$ lepton decaying to muons or electrons using data collected with the Belle II detector at the SuperKEKB $e^+e^-$ collider. The sample has an integrated luminosity of 362 fb$^{-1}$ at a centre-of-mass energy of 10.58 GeV. Using an optimised event selection, a binned maximum likelihood fit is performed using the momentum spectra of the electron and muon candidates. The result, $R_μ= 0.9675 \pm 0.0007 \pm 0.0036$, where the first uncertainty is statistical and the second is systematic, is the most precise to date. It provides a stringent test of the light-lepton universality, translating to a ratio of the couplings of the muon and electron to the $W$ boson in $τ$ decays of $0.9974 \pm 0.0019$, in agreement with the standard model expectation of unity.
Submitted 23 May, 2024;
originally announced May 2024.
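As a rough cross-check of the numbers quoted in the abstract above, the coupling ratio can be recovered from $R_μ$ by dividing out the phase-space factor and taking a square root. The sketch below is an illustration only: the lepton masses are approximate values assumed here (they are not given in the abstract), the phase-space function is the standard tree-level expression, and radiative corrections are neglected.

```python
import math

# Assumed approximate lepton masses in MeV (not taken from the abstract).
M_E, M_MU, M_TAU = 0.511, 105.658, 1776.86

def phase_space(x):
    """Tree-level phase-space factor f(x) for a leptonic tau decay, x = (m_l / m_tau)^2."""
    return 1 - 8 * x + 8 * x**3 - x**4 - 12 * x**2 * math.log(x)

def coupling_ratio(r_mu, stat, syst):
    """Return (g_mu / g_e, total uncertainty) from R_mu and its two uncertainties."""
    r_sm = phase_space((M_MU / M_TAU) ** 2) / phase_space((M_E / M_TAU) ** 2)
    ratio = math.sqrt(r_mu / r_sm)
    total = math.hypot(stat, syst)             # stat and syst combined in quadrature
    return ratio, 0.5 * ratio * total / r_mu   # error propagation through the square root

ratio, err = coupling_ratio(0.9675, 0.0007, 0.0036)
print(f"g_mu/g_e = {ratio:.4f} +/- {err:.4f}")
```

Under these assumptions the result lands close to the quoted $0.9974 \pm 0.0019$, which suggests the quoted coupling uncertainty is dominated by the systematic uncertainty on $R_μ$.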
-
Out-of-plane magnetic phase diagram of Kitaev quantum spin liquid candidate Na2Co2TeO6
Authors:
Shengzhi Zhang,
Sangyun Lee,
Eric Brosha,
Qing Huang,
Haidong Zhou,
Vivien S. Zapf,
Minseong Lee
Abstract:
We have investigated the magnetic properties and mapped out the phase diagram of the honeycomb magnet Na2Co2TeO6 with Co 3d7 in out-of-plane magnetic fields. This material has previously been proposed to show nearest-neighbor Kitaev interactions between Co spins and possibly even Kitaev quantum spin liquid behavior in high fields. At low magnetic fields, we observe a thermal phase transition at TN = 27 K from a paramagnetic state to a canonical ferrimagnetic state. Under applied magnetic fields, a spin-flop-like phase transition occurs before saturation of J = 1/2 between 10 K and TN. Below 10 K, a peak-dip-peak structure emerges between 10 and 17 T in the magnetic susceptibility (dM/dH) before the magnetic saturation, reminiscent of magnetic plateau behavior. The measurement of the magnetocaloric effect also shows dip-peak-dip behavior in this field range. Our data can be explained by an XXZ model with a single-ion anisotropy and possibly small Kitaev and Γ exchange interactions. We also unambiguously determined the magnetization saturation field, which helps constrain the energy scale of the exchange interactions.
Submitted 30 May, 2024; v1 submitted 22 May, 2024;
originally announced May 2024.
-
Ambisonizer: Neural Upmixing as Spherical Harmonics Generation
Authors:
Yongyi Zang,
Yifan Wang,
Minglun Lee
Abstract:
Neural upmixing, the task of generating immersive music with an increased number of channels from fewer input channels, has been an active research area, with mono-to-stereo and stereo-to-surround upmixing treated as separate problems. In this paper, we propose a unified approach to neural upmixing by formulating it as spherical harmonics generation or, more specifically, Ambisonic generation. We explicitly formulate mono upmixing as unconditional generation and stereo upmixing as conditional generation, where the stereo signals serve as conditions. We provide evidence that our proposed methodology, when decoded to stereo, matches a strong commercial stereo widener in subjective ratings. Overall, our work presents direct upmixing to Ambisonic format as a strong and promising approach to neural upmixing. A discussion of limitations is also provided.
Submitted 22 May, 2024;
originally announced May 2024.
-
High-yield fabrication of bubble-free magic-angle twisted bilayer graphene devices with high twist-angle homogeneity
Authors:
J. Diez-Merida,
I. Das,
G. Di Battista,
A. Diez-Carlon,
M. Lee,
L. Zeng,
K. Watanabe,
T. Taniguchi,
E. Olsson,
D. K. Efetov
Abstract:
Magic-angle twisted bilayer graphene (MATBG) stands as one of the most versatile materials in condensed-matter physics because it hosts a wide variety of exotic phases while also offering convenient tunability. However, the fabrication of MATBG is still manual and remains a challenging and inefficient process, with devices being highly dependent on specific fabrication methods that often result in inconsistency and variability. In this work, we present an optimized protocol for the fabrication of MATBG samples, for which we use deterministic graphene anchoring to stabilize the twist-angle, and careful bubble-removal techniques to ensure a high twist-angle homogeneity. We use low-temperature transport experiments to extract the average twist-angle between pairs of leads. We find that up to 38 percent of the devices fabricated in this way show square-micrometer-sized regions with a twist-angle in the range 1.1 ± 0.1 degrees and a twist-angle variation of only 0.02 degrees, where in some instances such regions were as large as 36 square micrometers. We are certain that the discussed protocols can be directly transferred to non-graphene materials, and will be useful for the growing field of moire materials.
Submitted 18 May, 2024;
originally announced May 2024.
-
LG AI Research & KAIST at EHRSQL 2024: Self-Training Large Language Models with Pseudo-Labeled Unanswerable Questions for a Reliable Text-to-SQL System on EHRs
Authors:
Yongrae Jo,
Seongyun Lee,
Minju Seo,
Sung Ju Hwang,
Moontae Lee
Abstract:
Text-to-SQL models are pivotal for making Electronic Health Records (EHRs) accessible to healthcare professionals without SQL knowledge. With the advancements in large language models, these systems have become more adept at translating complex questions into SQL queries. Nonetheless, the critical need for reliability in healthcare requires these models to accurately identify unanswerable questions or uncertain predictions, preventing misinformation. To address this problem, we present a self-training strategy using pseudo-labeled unanswerable questions to enhance the reliability of text-to-SQL models for EHRs. This approach includes a two-stage training process followed by a filtering method based on token entropy and query execution. Our methodology's effectiveness is validated by our top performance in the EHRSQL 2024 shared task, showcasing the potential to improve healthcare decision-making through more reliable text-to-SQL systems.
Submitted 17 May, 2024;
originally announced May 2024.
-
LIFL: A Lightweight, Event-driven Serverless Platform for Federated Learning
Authors:
Shixiong Qi,
K. K. Ramakrishnan,
Myung** Lee
Abstract:
Federated Learning (FL) typically involves a large-scale, distributed system with individual user devices/servers training models locally and then aggregating their model updates on a trusted central server. Existing systems for FL often use an always-on server for model aggregation, which can be inefficient in terms of resource utilization. They may also be inelastic in their resource management. This is particularly exacerbated when aggregating model updates at scale in a highly dynamic environment with varying numbers of heterogeneous user devices/servers.
We present LIFL, a lightweight and elastic serverless cloud platform with fine-grained resource management for efficient FL aggregation at scale. LIFL is enhanced by a streamlined, event-driven serverless design that eliminates the individual heavy-weight message broker and replaces inefficient container-based sidecars with lightweight eBPF-based proxies. We leverage shared memory processing to achieve high-performance communication for hierarchical aggregation, which is commonly adopted to speed up FL aggregation at scale. We further introduce locality-aware placement in LIFL to maximize the benefits of shared memory processing. LIFL precisely scales and carefully reuses the resources for hierarchical aggregation to achieve the highest degree of parallelism while minimizing the aggregation time and resource consumption. Our experimental results show that LIFL achieves significant improvement in resource efficiency and aggregation speed for supporting FL at scale, compared to existing serverful and serverless FL systems.
Submitted 5 May, 2024;
originally announced May 2024.
-
Scalarisation-based risk concepts for robust multi-objective optimisation
Authors:
Ben Tu,
Nikolas Kantas,
Robert M. Lee,
Behrang Shafei
Abstract:
Robust optimisation is a well-established framework for optimising functions in the presence of uncertainty. The inherent goal of this problem is to identify a collection of inputs whose outputs are both desirable for the decision maker and robust to the underlying uncertainties in the problem. In this work, we study the multi-objective extension of this problem from a computational standpoint. We identify that the majority of robust multi-objective algorithms rely on two key operations: robustification and scalarisation. Robustification refers to the strategy that is used to marginalise over the uncertainty in the problem, whilst scalarisation refers to the procedure that is used to encode the relative importance of each objective. As these operations are not necessarily commutative, the order in which they are performed has an impact on the resulting solutions that are identified and the final decisions that are made. This work aims to give an exposition on the philosophical differences between these two operations and highlight when one should opt for one ordering over the other. As part of our analysis, we showcase how many existing risk concepts can be easily integrated into the specification and solution of a robust multi-objective optimisation problem. Besides this, we also demonstrate how one can define, in a principled way, the notion of a robust Pareto front and a robust performance metric based on our robustify and scalarise methodology. To illustrate the efficacy of these new ideas, we present two insightful numerical case studies which are based on real-world data sets.
Submitted 16 May, 2024;
originally announced May 2024.
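The remark in the abstract above that robustification and scalarisation need not commute can be illustrated with a toy example. The sketch below is not the authors' method: the two scenarios, the worst-case robustification, and the equal-weight linear scalarisation are all assumptions chosen for illustration, with both objectives to be minimised.

```python
# Objective vectors of one candidate input under two uncertainty scenarios
# (hypothetical numbers; both objectives are minimised).
outcomes = [(1.0, 0.0), (0.0, 1.0)]
weights = (0.5, 0.5)

def scalarise(y, w=weights):
    """Linear scalarisation: weighted sum of the objectives."""
    return sum(wi * yi for wi, yi in zip(w, y))

def robustify_then_scalarise(ys):
    """Take the worst case of each objective separately, then scalarise."""
    worst = tuple(max(col) for col in zip(*ys))
    return scalarise(worst)

def scalarise_then_robustify(ys):
    """Scalarise each scenario, then take the worst scalarised value."""
    return max(scalarise(y) for y in ys)

print(robustify_then_scalarise(outcomes))  # 1.0
print(scalarise_then_robustify(outcomes))  # 0.5
```

Here the first ordering is more pessimistic because it combines worst cases that never occur in the same scenario, which is one way the choice of ordering can change the final decision.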
-
COSMOS-Web: The Role of Galaxy Interactions and Disk Instabilities in Producing Starbursts at z<4
Authors:
A. L. Faisst,
M. Brinch,
C. M. Casey,
N. Chartab,
M. Dessauges-Zavadsky,
N. E. Drakos,
S. Gillman,
G. Gonzaliasl,
C. C. Hayward,
O. Ilbert,
P. Jablonka,
J. S. Kartaltepe,
A. M. Koekemoer,
V. Kokorev,
E. Lambrides,
D. Liu,
C. Maraston,
C. L. Martin,
A. Renzini,
B. E. Robertson,
D. B. Sanders,
Z. Sattari,
N. Scoville,
C. M. Urry,
A. P. Vijayan
, et al. (27 additional authors not shown)
Abstract:
We study the role of galaxy-galaxy interactions and disk instabilities in producing starburst activity in galaxies out to z=4. For this, we use a sample of 387 galaxies with robust total star formation rate measurements from Herschel, gas masses from ALMA, stellar masses and redshifts from multi-band photometry, and JWST/NIRCam rest-frame optical imaging. Using mass-controlled samples, we find an increased fraction of interacting galaxies in the starburst regime at all redshifts out to z=4. This increase correlates with star formation efficiency (SFE), but not with gas fraction. However, the correlation is weak (and only significant out to z=2), which could be explained by the short duration of SFE increase during interaction. In addition, we find that isolated disk galaxies make up a significant fraction of the starburst population. The fraction of such galaxies with star-forming clumps ("clumpy disks") is significantly increased compared to the main-sequence disk population. Furthermore, this fraction directly correlates with SFE. This is direct observational evidence for a long-term increase of SFE maintained due to disk instabilities, contributing to the majority of starburst galaxies in our sample and hence to substantial mass growth in these systems. This result could also be of importance for explaining the growth of the most massive galaxies at z>6.
Submitted 15 May, 2024;
originally announced May 2024.
-
GAD: A Real-time Gait Anomaly Detection System with Online Adaptive Learning
Authors:
Ming-Chang Lee,
Jia-Chun Lin,
Sokratis Katsikas
Abstract:
Gait anomaly detection is a task that involves detecting deviations from a person's normal gait pattern. These deviations can indicate health issues and medical conditions in the healthcare domain, or fraudulent impersonation and unauthorized identity access in the security domain. A number of gait anomaly detection approaches have been introduced, but many of them require offline data preprocessing, offline model learning, setting parameters, and so on, which might restrict their effectiveness and applicability in real-world scenarios. To address these issues, this paper introduces GAD, a real-time gait anomaly detection system. GAD focuses on detecting anomalies within an individual's three-dimensional accelerometer readings based on dimensionality reduction and Long Short-Term Memory (LSTM). Upon being launched, GAD begins collecting a gait segment from the user and training an anomaly detector to learn the user's walking pattern on the fly. If the subsequent model verification is successful, which involves validating the trained detector using the user's subsequent steps, the detector is employed to identify abnormalities in the user's subsequent gait readings at the user's request. The anomaly detector will be retained online to adapt to minor pattern changes and will undergo retraining as long as it cannot provide adequate prediction. We explored two methods for capturing users' gait segments: a personalized method tailored to each individual's step length, and a uniform method utilizing a fixed step length. Experimental results using an open-source gait dataset show that GAD achieves a higher detection accuracy ratio when combined with the personalized method.
Submitted 4 May, 2024;
originally announced May 2024.
-
Cross-Domain Feature Augmentation for Domain Generalization
Authors:
Yingnan Liu,
Yingtian Zou,
Rui Qiao,
Fusheng Liu,
Mong Li Lee,
Wynne Hsu
Abstract:
Domain generalization aims to develop models that are robust to distribution shifts. Existing methods focus on learning invariance across domains to enhance model robustness, and data augmentation has been widely used to learn invariant predictors, with most methods performing augmentation in the input space. However, augmentation in the input space has limited diversity, whereas augmentation in the feature space is more versatile and has shown promising results. Nonetheless, feature semantics is seldom considered and existing feature augmentation methods suffer from a limited variety of augmented features. We decompose features into class-generic, class-specific, domain-generic, and domain-specific components. We propose a cross-domain feature augmentation method named XDomainMix that enables us to increase sample diversity while emphasizing the learning of invariant representations to achieve domain generalization. Experiments on widely used benchmark datasets demonstrate that our proposed method is able to achieve state-of-the-art performance. Quantitative analysis indicates that our feature augmentation approach facilitates the learning of effective models that are invariant across different domains.
Submitted 14 May, 2024;
originally announced May 2024.
-
How Non-native English Speakers Use, Assess, and Select AI-Generated Paraphrases with Information Aids
Authors:
Yewon Kim,
Thanh-Long V. Le,
Donghwi Kim,
Mina Lee,
Sung-Ju Lee
Abstract:
Non-native English speakers (NNESs) often face challenges in achieving fluency in their written English. AI paraphrasing tools have the potential to improve their writing by suggesting more fluent paraphrases of their original sentences. Yet, the effectiveness of these tools depends on the user's ability to accurately assess and select context-appropriate suggestions, which is a significant challenge for those with limited English proficiency. This paper explores how NNESs utilize a paraphrasing tool augmented with information aids designed to facilitate the assessment of paraphrased suggestions. Through a formative study with 15 NNESs, we identify their specific needs when paraphrasing with AI, leading to the design of a paraphrasing tool integrated with five types of information aids, termed "support features." A user study with 22 NNESs demonstrates their heavy reliance on the paraphrasing functionality throughout the writing process, where they leverage the support features to assess and select suggestions efficiently and comprehensively. When equipped with the support features, NNESs report an enhanced writing experience in terms of efficiency, confidence, and trust. Our findings contribute to the HCI community by (i) identifying the distinct needs of NNESs in AI paraphrasing tools, (ii) elucidating how NNESs use paraphrasing tools with support features, and (iii) offering design implications for the development of more effective AI paraphrasing tools tailored to NNESs' requirements.
Submitted 13 May, 2024;
originally announced May 2024.
-
Search for lepton-flavor-violating $τ^- \to μ^-μ^+μ^-$ decays at Belle II
Authors:
Belle II Collaboration,
I. Adachi,
L. Aggarwal,
H. Aihara,
N. Akopov,
A. Aloisio,
N. Althubiti,
N. Anh Ky,
D. M. Asner,
H. Atmacan,
V. Aushev,
M. Aversano,
R. Ayad,
V. Babu,
H. Bae,
S. Bahinipati,
P. Bambade,
Sw. Banerjee,
S. Bansal,
M. Barrett,
J. Baudot,
A. Baur,
A. Beaubien,
F. Becherer,
J. Becker
, et al. (407 additional authors not shown)
Abstract:
We present the result of a search for the charged-lepton-flavor violating decay $τ^- \to μ^-μ^+μ^-$ using a 424 fb$^{-1}$ sample of data recorded by the Belle II experiment at the SuperKEKB $e^{-}e^{+}$ collider. The selection of $e^{-}e^{+} \to τ^+τ^-$ events is based on an inclusive reconstruction of the non-signal tau decay, and on a boosted decision tree to suppress background. We observe one signal candidate, which is compatible with the expectation from background processes. We set a $90\%$ confidence level upper limit of $1.9 \times 10^{-8}$ on the branching fraction of the $τ^- \to μ^-μ^+μ^-$ decay, which is the most stringent bound to date.
Submitted 12 May, 2024;
originally announced May 2024.
-
A Highly Efficient Hybrid Fiber Optic Laser Using a Cesium Atom Vapor Cell as an Optical Gain Medium
Authors:
Seok** Kim,
Mingyu Lee,
Sanggwon Song,
Seong** Hong,
Johan Nilsson,
Kyunghwan Oh
Abstract:
A new scheme of a highly efficient hybrid laser cavity is proposed and experimentally demonstrated utilizing a hot cesium (Cs) vapor cell as an optical gain medium. The laser cavity consists of a macroscopic concave reflecting mirror (>99% reflectivity) and a 4% Fresnel-reflecting perpendicularly cleaved facet of a single mode fiber (SMF). The cylindrical cesium gain cell is located between these two reflectors. The SMF serves multiple roles: 1) a passive mode-matching component that matches the pump beam diameter to that of the laser cavity mode within the cesium cell, 2) an output coupler with low reflectivity, and 3) a low-loss delivery fiber for a high-beam-quality laser output. By optimizing the pump beam waist diameter and the cesium vapor cell temperature, a high slope efficiency of 86% and a continuous wave power of 419 mW were obtained in the pump power range of 400 to 600 mW, with an optical-to-optical conversion efficiency of 71%. The unique multi-functional role of the SMF in the hybrid cavity is fully described, and it can also be applied to other phases of high optical gain media.
Submitted 9 May, 2024;
originally announced May 2024.
-
Understanding the Capabilities and Limitations of Large Language Models for Cultural Commonsense
Authors:
Siqi Shen,
Lajanugen Logeswaran,
Moontae Lee,
Honglak Lee,
Soujanya Poria,
Rada Mihalcea
Abstract:
Large language models (LLMs) have demonstrated substantial commonsense understanding through numerous benchmark evaluations. However, their understanding of cultural commonsense remains largely unexamined. In this paper, we conduct a comprehensive examination of the capabilities and limitations of several state-of-the-art LLMs in the context of cultural commonsense tasks. Using several general and cultural commonsense benchmarks, we find that (1) LLMs have a significant discrepancy in performance when tested on culture-specific commonsense knowledge for different cultures; (2) LLMs' general commonsense capability is affected by cultural context; and (3) the language used to query the LLMs can impact their performance on culture-related tasks. Our study points to the inherent bias in the cultural understanding of LLMs and provides insights that can help develop culturally aware language models.
Submitted 7 May, 2024;
originally announced May 2024.
-
Collapsing immortal Kähler-Ricci flows
Authors:
Hans-Joachim Hein,
Man-Chun Lee,
Valentino Tosatti
Abstract:
We consider the Kähler-Ricci flow on compact Kähler manifolds with semiample canonical bundle and intermediate Kodaira dimension, and show that the flow collapses to a canonical metric on the base of the Iitaka fibration in the locally smooth topology and with bounded Ricci curvature away from the singular fibers. This follows from an asymptotic expansion for the evolving metrics, in the spirit of recent work of the first and third-named authors on collapsing Calabi-Yau metrics, and proves two conjectures of Song and Tian.
Submitted 7 May, 2024;
originally announced May 2024.
-
Scattering of Giant Planets and Implications for the Origin of the Hierarchical and Eccentric Two-planet System GJ 1148
Authors:
Longhui Yuan,
Man Hoi Lee
Abstract:
The GJ 1148 system has two Saturn-mass planets orbiting around an M dwarf star on hierarchical and eccentric orbits, with an orbital period ratio of 13 and eccentricities of 0.375 for both planets. The inner planet is in the regime of eccentric warm Jupiters. We perform numerical experiments to study the planet-planet scattering scenario for the origin of this orbital architecture. We consider a third planet of $0.1 M_J$ (Jupiter's mass) in the initial GJ 1148 system with initial orbital separations of 3.5, 4, and 4.5 mutual Hill radii and initial semimajor axis of the innermost planet in the range of 0.10-0.50 au. The majority of scattering events result in planet-planet collisions, followed in frequency by planet ejections and planet-star close approaches. Among these outcomes, only planet ejections produce eccentric and widely separated two-planet systems, with some having similar orbital properties to the GJ 1148 system. We also examine the effects of general relativistic apsidal precession and a higher mass of $0.227 M_J$ for the third planet. The simulation results suggest that the GJ 1148 system may have lost a giant planet. We also perform simulations of the general problem of the origin of warm Jupiters by planet-planet scattering. As in the GJ 1148 simulations, a nontrivial number of stable two-planet systems are produced by ejection, which disagrees with the result from a previous study showing that two-planet systems arise exclusively through planet-planet collisions.
Submitted 7 May, 2024;
originally announced May 2024.
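Two of the quantities in the abstract above follow from standard formulas, sketched here for orientation. Kepler's third law converts the period ratio into a semimajor-axis ratio (planet masses neglected relative to the star), and the usual mutual Hill radius definition gives the outer semimajor axis for a separation of k mutual Hill radii. The function names are hypothetical, and the masses and semimajor axis in the usage lines are placeholder values assumed for illustration, not the system parameters used in the paper.

```python
def axis_ratio_from_period_ratio(period_ratio):
    """Kepler's third law: a_out / a_in = (P_out / P_in)^(2/3)."""
    return period_ratio ** (2.0 / 3.0)

def outer_axis_for_separation(a_in, m_in, m_out, m_star, k):
    """Solve a_out - a_in = k * R_H for a_out, where the mutual Hill radius is
    R_H = ((m_in + m_out) / (3 m_star))^(1/3) * (a_in + a_out) / 2."""
    x = 0.5 * k * ((m_in + m_out) / (3.0 * m_star)) ** (1.0 / 3.0)
    if x >= 1.0:
        raise ValueError("separation too large for these masses")
    return a_in * (1.0 + x) / (1.0 - x)

# A period ratio of 13 corresponds to a semimajor-axis ratio of about 5.5.
print(axis_ratio_from_period_ratio(13.0))

# Placeholder inputs: masses in solar masses, a_in in au.
print(outer_axis_for_separation(a_in=0.2, m_in=1e-4, m_out=3e-4, m_star=0.35, k=4.0))
```

The closed form follows from substituting the Hill radius into the separation condition and solving the resulting linear equation for the outer semimajor axis.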
-
Measurement of Low Energy Nuclear Recoil Events with the phonon-mediated Voltage-Assisted Hybrid Detector for Rare Event Searches
Authors:
Sandro Maludze,
William Baker,
Mahdi Mirzakhani,
Matthew Lee,
Chelsea Savage,
Himangshu Neog,
Rupak Mahapatra,
Nader Mirabolfathi,
Mark Platt,
Andrew Jastram
Abstract:
The phonon-mediated hybrid detector is made out of a monolithic silicon crystal characterized by two interconnected regions linked through a narrow neck. Operating solely on phonon signal measurements, the hybrid design facilitates the differentiation between electron recoil and nuclear recoil events, effectively discerning two types of interaction down to low energy levels. With a newly implemented software triggering technique (SWT), low-energy nuclear recoil events of approximately $500~\mathrm{eV}_{ee}$ have been measured. In addition to this, an electron recoil background reduction of up to $95\%$ has been successfully demonstrated.
Submitted 31 March, 2024;
originally announced May 2024.
-
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models
Authors:
Seungone Kim,
Juyoung Suk,
Shayne Longpre,
Bill Yuchen Lin,
Jamin Shin,
Sean Welleck,
Graham Neubig,
Moontae Lee,
Kyungjae Lee,
Minjoon Seo
Abstract:
Proprietary LMs such as GPT-4 are often employed to assess the quality of responses from various LMs. However, concerns including transparency, controllability, and affordability strongly motivate the development of open-source LMs specialized in evaluations. On the other hand, existing open evaluator LMs exhibit critical shortcomings: 1) they issue scores that significantly diverge from those assigned by humans, and 2) they lack the flexibility to perform both direct assessment and pairwise ranking, the two most prevalent forms of assessment. Additionally, they do not possess the ability to evaluate based on custom evaluation criteria, focusing instead on general attributes like helpfulness and harmlessness. To address these issues, we introduce Prometheus 2, a more powerful evaluator LM than its predecessor that closely mirrors human and GPT-4 judgements. Moreover, it is capable of processing both direct assessment and pair-wise ranking formats grouped with a user-defined evaluation criteria. On four direct assessment benchmarks and four pairwise ranking benchmarks, Prometheus 2 scores the highest correlation and agreement with humans and proprietary LM judges among all tested open evaluator LMs. Our models, code, and data are all publicly available at https://github.com/prometheus-eval/prometheus-eval.
Submitted 2 May, 2024;
originally announced May 2024.
-
Random Pareto front surfaces
Authors:
Ben Tu,
Nikolas Kantas,
Robert M. Lee,
Behrang Shafei
Abstract:
The goal of multi-objective optimisation is to identify the Pareto front surface which is the set obtained by connecting the best trade-off points. Typically this surface is computed by evaluating the objectives at different points and then interpolating between the subset of the best evaluated trade-off points. In this work, we propose to parameterise the Pareto front surface using polar coordinates. More precisely, we show that any Pareto front surface can be equivalently represented using a scalar-valued length function which returns the projected length along any positive radial direction. We then use this representation in order to rigorously develop the theory and applications of stochastic Pareto front surfaces. In particular, we derive many Pareto front surface statistics of interest such as the expectation, covariance and quantiles. We then discuss how these can be used in practice within a design of experiments setting, where the goal is to both infer and use the Pareto front surface distribution in order to make effective decisions. Our framework allows for clear uncertainty quantification and we also develop advanced visualisation techniques for this purpose. Finally we discuss the applicability of our ideas within multivariate extreme value theory and illustrate our methodology in a variety of numerical examples, including a case study with a real-world air pollution data set.
Submitted 21 June, 2024; v1 submitted 2 May, 2024;
originally announced May 2024.
-
The origin of wall-shear stress fluctuations in wall-bounded turbulence
Authors:
Myoungkyu Lee,
Yongyun Hwang
Abstract:
The origin of wall shear-stress fluctuations in wall turbulence was studied through energy dissipation at the wall. While confirming the universality in wall dissipation at small inner scales, the dissipation at larger scales is a consequence of near-wall scale interactions. In particular, the energy transport from the universal small to larger scale strengthens with Reynolds number due to the growing number of intermediate scales associated with the log layer. We anticipate that these insights broadly apply to all canonical wall-bounded turbulence for sufficiently high Reynolds numbers.
Submitted 1 May, 2024;
originally announced May 2024.
-
Noise propagation and MP-PCA image denoising for high-resolution quantitative T2* and magnetic susceptibility mapping (QSM)
Authors:
Liad Doniza,
Mitchel Lee,
Tamar Blumenfeld Katzir,
Moran Artzi,
Dafna Ben Bashat,
Dvir Radunsky,
Karin Shmueli,
Noam Ben-Eliezer
Abstract:
Quantitative Susceptibility Mapping (QSM) is a technique for measuring magnetic susceptibility of tissues, aiding in the detection of pathologies like traumatic brain injury and multiple sclerosis by analyzing variations in substances such as iron and calcium. Despite its clinical value, achieving high-resolution QSM (voxel sizes < 1 mm$^3$) reduces signal-to-noise ratio (SNR), compromising diagnostic quality. To mitigate this, we applied the Marchenko-Pastur Principal Component Analysis (MP-PCA) denoising technique on T2* weighted data, to enhance the quality of R2*, T2*, and QSM maps. Denoising was tested on a numerical phantom, healthy subjects, and patients with brain metastases and sickle cell disease, demonstrating effective and robust improvements across different scan settings. Further analysis examined noise propagation in R2* and T2* values, revealing lower noise-related variations in R2* values compared to T2* values which tended to be overestimated due to noise. Reduced variability was observed in QSM values post denoising, demonstrating MP-PCA's potential to improve the
Submitted 30 April, 2024;
originally announced April 2024.
-
Low-rank Matrix Bandits with Heavy-tailed Rewards
Authors:
Yue Kang,
Cho-Jui Hsieh,
Thomas C. M. Lee
Abstract:
In stochastic low-rank matrix bandit, the expected reward of an arm is equal to the inner product between its feature matrix and some unknown $d_1$ by $d_2$ low-rank parameter matrix $Θ^*$ with rank $r \ll d_1\wedge d_2$. While all prior studies assume the payoffs are mixed with sub-Gaussian noises, in this work we loosen this strict assumption and consider the new problem of \underline{low}-rank matrix bandit with \underline{h}eavy-\underline{t}ailed \underline{r}ewards (LowHTR), where the rewards only have finite $(1+δ)$ moment for some $δ\in (0,1]$. By utilizing the truncation on observed payoffs and the dynamic exploration, we propose a novel algorithm called LOTUS attaining the regret bound of order $\tilde O(d^\frac{3}{2}r^\frac{1}{2}T^\frac{1}{1+δ}/\tilde{D}_{rr})$ without knowing $T$, which matches the state-of-the-art regret bound under sub-Gaussian noises~\citep{lu2021low,kang2022efficient} with $δ= 1$. Moreover, we establish a lower bound of the order $Ω(d^\fracδ{1+δ} r^\fracδ{1+δ} T^\frac{1}{1+δ}) = Ω(T^\frac{1}{1+δ})$ for LowHTR, which indicates our LOTUS is nearly optimal in the order of $T$. In addition, we improve LOTUS so that it does not require knowledge of the rank $r$ with $\tilde O(dr^\frac{3}{2}T^\frac{1+δ}{1+2δ})$ regret bound, and it is efficient under the high-dimensional scenario. We also conduct simulations to demonstrate the practical superiority of our algorithm.
Submitted 26 April, 2024;
originally announced April 2024.
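As a quick sanity check on the exponents quoted in the abstract above, the bounds can be evaluated at the boundary case $δ = 1$ (finite second moment). This is plain substitution into the rates as stated in the abstract, not a derivation taken from the paper itself:

```latex
% Upper bound for LOTUS, as quoted in the abstract, evaluated at \delta = 1:
\tilde O\!\left( \frac{d^{3/2}\, r^{1/2}\, T^{\frac{1}{1+\delta}}}{\tilde{D}_{rr}} \right)
\;\xrightarrow{\;\delta = 1\;}\;
\tilde O\!\left( \frac{d^{3/2}\, r^{1/2}\, \sqrt{T}}{\tilde{D}_{rr}} \right)

% Lower bound, as quoted in the abstract, evaluated at \delta = 1:
\Omega\!\left( d^{\frac{\delta}{1+\delta}}\, r^{\frac{\delta}{1+\delta}}\, T^{\frac{1}{1+\delta}} \right)
\;\xrightarrow{\;\delta = 1\;}\;
\Omega\!\left( \sqrt{d\, r\, T} \right)

% Rank-agnostic variant, as quoted in the abstract, evaluated at \delta = 1:
\tilde O\!\left( d\, r^{3/2}\, T^{\frac{1+\delta}{1+2\delta}} \right)
\;\xrightarrow{\;\delta = 1\;}\;
\tilde O\!\left( d\, r^{3/2}\, T^{2/3} \right)
```

At $δ = 1$ the upper and lower bounds share the same $\sqrt{T}$ dependence, which is the sense in which the abstract calls LOTUS nearly optimal in the order of $T$; the rank-agnostic variant gives up some of that rate ($T^{2/3}$ instead of $T^{1/2}$) in exchange for not needing $r$.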
-
Small Language Models Need Strong Verifiers to Self-Correct Reasoning
Authors:
Yunxiang Zhang,
Muhammad Khalifa,
Lajanugen Logeswaran,
Jaekyeom Kim,
Moontae Lee,
Honglak Lee,
Lu Wang
Abstract:
Self-correction has emerged as a promising solution to boost the reasoning performance of large language models (LLMs), where LLMs refine their solutions using self-generated critiques that pinpoint the errors. This work explores whether small (<= 13B) language models (LMs) have the ability of self-correction on reasoning tasks with minimal inputs from stronger LMs. We propose a novel pipeline that prompts smaller LMs to collect self-correction data that supports the training of self-refinement abilities. First, we leverage correct solutions to guide the model in critiquing their incorrect responses. Second, the generated critiques, after filtering, are used for supervised fine-tuning of the self-correcting reasoner through solution refinement. Our experimental results show improved self-correction abilities of two models on five datasets spanning math and commonsense reasoning, with notable performance gains when paired with a strong GPT-4-based verifier, though limitations are identified when using a weak self-verifier for determining when to correct.
Submitted 5 June, 2024; v1 submitted 25 April, 2024;
originally announced April 2024.
-
Andes: Defining and Enhancing Quality-of-Experience in LLM-Based Text Streaming Services
Authors:
Jiachen Liu,
Zhiyu Wu,
Jae-Won Chung,
Fan Lai,
Myungjin Lee,
Mosharaf Chowdhury
Abstract:
The advent of large language models (LLMs) has transformed text-based services, enabling capabilities ranging from real-time translation to AI-driven chatbots. However, existing serving systems primarily focus on optimizing server-side aggregate metrics like token generation throughput, ignoring individual user experience with streamed text. As a result, under high and/or bursty load, a significant number of users can receive unfavorable service quality or poor Quality-of-Experience (QoE). In this paper, we first formally define QoE of text streaming services, where text is delivered incrementally and interactively to users, by considering the end-to-end token delivery process throughout the entire interaction with the user. Thereafter, we propose Andes, a QoE-aware serving system that enhances user experience for LLM-enabled text streaming services. At its core, Andes strategically allocates contended GPU resources among multiple requests over time to optimize their QoE. Our evaluations demonstrate that, compared to the state-of-the-art LLM serving systems like vLLM, Andes improves the average QoE by up to 3.2$\times$ under high request rate, or alternatively, it attains up to 1.6$\times$ higher request rate while preserving high QoE.
Submitted 24 April, 2024;
originally announced April 2024.
-
Return of EM: Entity-driven Answer Set Expansion for QA Evaluation
Authors:
Dongryeol Lee,
Minwoo Lee,
Kyungmin Min,
Joonsuk Park,
Kyomin Jung
Abstract:
Recently, directly using large language models (LLMs) has been shown to be the most reliable method to evaluate QA models. However, it suffers from limited interpretability, high cost, and environmental harm. To address these, we propose to use soft EM with entity-driven answer set expansion. Our approach expands the gold answer set to include diverse surface forms, based on the observation that the surface forms often follow particular patterns depending on the entity type. The experimental results show that our method outperforms traditional evaluation methods by a large margin. Moreover, the reliability of our evaluation method is comparable to that of LLM-based ones, while offering the benefits of high interpretability and reduced environmental harm.
Submitted 11 June, 2024; v1 submitted 24 April, 2024;
originally announced April 2024.
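The idea in the abstract above, soft exact match against an expanded gold answer set, can be illustrated with a minimal sketch. Everything named here is an assumption for illustration: `normalize`, `soft_em`, and `expand_person` are hypothetical helpers, the normalization follows the common lowercase/strip-punctuation/drop-articles convention, and the hand-written surname rule merely stands in for whatever entity-type-specific expansion the authors actually use.

```python
import re
import string


def normalize(text: str) -> str:
    """Lowercase, strip punctuation, drop English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def soft_em(prediction: str, gold_answers) -> bool:
    """Return True if any normalized gold answer occurs as a whole-word span in the prediction."""
    padded = f" {normalize(prediction)} "
    for gold in gold_answers:
        gold_norm = normalize(gold)
        if gold_norm and f" {gold_norm} " in padded:
            return True
    return False


def expand_person(name: str) -> set:
    """Hypothetical expansion rule for PERSON entities: add the surname and first+last forms."""
    parts = name.split()
    forms = {name}
    if len(parts) > 1:
        forms.add(parts[-1])                   # surname only
        forms.add(f"{parts[0]} {parts[-1]}")   # drop middle names
    return forms


if __name__ == "__main__":
    prediction = "It was written by Austen in 1813."
    print(soft_em(prediction, {"Jane Austen"}))             # original gold set
    print(soft_em(prediction, expand_person("Jane Austen")))  # expanded gold set
```

The design point is that the matching step stays a transparent string comparison; only the gold set changes, which is what keeps the metric interpretable compared with asking an LLM to judge each answer.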
-
The CO-to-H$_2$ Conversion Factor in the Barred Spiral Galaxy M83
Authors:
Amanda M Lee,
Jin Koda,
Akihiko Hirota,
Fumi Egusa,
Mark Heyer
Abstract:
We analyze the CO-to-H$_2$ conversion factor ($α_{\rm{CO}}$) in the nearby barred spiral galaxy M83. We present new HI observations from the JVLA and single-dish GBT in the disk of the galaxy, and combine them with maps of CO(1-0) integrated intensity and dust surface density from the literature. $α_{\rm{CO}}$ and the gas-to-dust ratio ($δ_{\rm{GDR}}$) are simultaneously derived in annuli of 2 kpc width from R = 1-7 kpc. We find that $α_{\rm{CO}}$ and $δ_{\rm{GDR}}$ both increase radially, by a factor of $\sim$ 2-3 from the center to the outskirts of the disk. The luminosity-weighted averages over the disk are $α_{\rm{CO}} = 3.14$ (2.06, 4.96) M$_{\odot}$ pc$^{-2}$[K$\cdot$ km s$^{-1}$]$^{-1}$ and $δ_{\rm{GDR}}$ = 137 (111, 182) at the 68% (1$σ$) confidence level. These are consistent with the $α_{\rm{CO}}$ and $δ_{\rm{GDR}}$ values measured in the Milky Way. In addition to possible variations of $α_{\rm{CO}}$ due to the radial metallicity gradient, we test the possibility of variations in $α_{\rm{CO}}$ due to changes in the underlying cloud populations, as a function of galactic radius. Using a truncated power-law molecular cloud CO luminosity function and an empirical power-law relation for cloud-mass and luminosity, we show that the changes in the underlying cloud population may account for a factor of $\sim 1.5-2.0$ radial change in $α_{\rm{CO}}$.
Submitted 22 April, 2024;
originally announced April 2024.
-
Fidelitous Augmentation of Human Accelerometric Data for Deep Learning
Authors:
Tracey K. M. Lee,
H. W. Chan,
K. H. Leo,
Effie Chew,
L. Zhao,
Saeid Sanei
Abstract:
Time series (TS) data have consistently been in short supply, yet their demand remains high for training systems in prediction, modeling, classification, and various other applications. Synthesis can serve to expand the sample population, yet it is crucial to maintain the statistical characteristics between the synthesized and the original TS: this ensures consistent sampling of data for both training and testing purposes. However, the time domain features of the data may not be maintained. This motivates our work, the objective of which is to preserve the following features in a synthesized TS: its fundamental statistical characteristics and important time domain features like its general shape and prominent transients. In a novel way, we first isolate important TS features into various components using a spectrogram and singular spectrum analysis. The residual signal is then randomized in a way that preserves its statistical properties. These components are then recombined for the synthetic time series. Using accelerometer data in a clinical setting, we use statistical and shape measures to compare our method to others. We show it has higher fidelity to the original signal features, has good diversity and performs better data classification in a deep learning application.
Submitted 22 April, 2024;
originally announced April 2024.
-
Spontaneous emission decay and excitation in photonic temporal crystals
Authors:
Jagang Park,
Kyungmin Lee,
Ruo-Yang Zhang,
Hee-Chul Park,
Jung-Wan Ryu,
Gil Young Cho,
Min Yeul Lee,
Zhaoqing Zhang,
Namkyoo Park,
Wonju Jeon,
Jonghwa Shin,
C. T. Chan,
Bumki Min
Abstract:
Over the last few decades, the prominent strategy for controlling spontaneous emission has been the use of resonant or space-periodic photonic structures. This approach, initially articulated by Purcell and later expanded upon by Yablonovitch in the context of photonic crystals, leverages the spatial surroundings to modify the spontaneous emission decay rate of atoms or quantum emitters. However, the rise of time-varying photonics has compelled a reevaluation of the spontaneous emission process within dynamically changing environments, especially concerning photonic temporal crystals where optical properties undergo time-periodic modulation. Here, we apply classical light-matter interaction theory along with Floquet analysis to reveal a substantial enhancement in the spontaneous emission decay rate at the momentum gap frequency in photonic temporal crystals. This enhancement is attributed to time-periodicity-induced loss and gain mechanisms, as well as the non-orthogonality of Floquet eigenstates that are inherent to photonic temporal crystals. Intriguingly, our findings also suggest that photonic temporal crystals enable the spontaneous excitation of an atom from its ground state to an excited state, accompanied by the concurrent emission of a photon.
Submitted 20 April, 2024;
originally announced April 2024.
-
Atacama Large Aperture Submillimeter Telescope (AtLAST) Science: Probing the Transient and Time-variable Sky
Authors:
John Orlowski-Scherer,
Thomas J. Maccarone,
Joe Bright,
Tomasz Kaminski,
Michael Koss,
Atul Mohan,
Francisco Miguel Montenegro-Montes,
Sigurd Næss,
Claudio Ricci,
Paola Severgnini,
Thomas Stanke,
Cristian Vignali,
Sven Wedemeyer,
Mark Booth,
Claudia Cicone,
Luca Di Mascolo,
Doug Johnstone,
Tony Mroczkowski,
Martin A. Cordiner,
Jochen Greiner,
Evanthia Hatziminaoglou,
Eelco van Kampen,
Pamela Klaassen,
Minju M. Lee,
Daizhong Liu
, et al. (3 additional authors not shown)
Abstract:
The study of transient and variable events, including novae, active galactic nuclei, and black hole binaries, has historically been a fruitful path for elucidating the evolutionary mechanisms of our universe. The study of such events in the millimeter and submillimeter is, however, still in its infancy. Submillimeter observations probe a variety of materials, such as optically thick dust, which are hard to study in other wavelengths. Submillimeter observations are sensitive to a number of emission mechanisms, from the aforementioned cold dust, to hot free-free emission, and synchrotron emission from energetic particles. Study of these phenomena has been hampered by a lack of prompt, high sensitivity submillimeter follow-up, as well as by a lack of high-sky-coverage submillimeter surveys. In this paper, we describe how the proposed Atacama Large Aperture Submillimeter Telescope (AtLAST) could fill in these gaps in our understanding of the transient universe. We discuss a number of science cases that would benefit from AtLAST observations, and detail how AtLAST is uniquely suited to contributing to them. In particular, AtLAST's large field of view will enable serendipitous detections of transient events, while its anticipated ability to get on source quickly and observe simultaneously in multiple bands makes it also ideally suited for transient follow-up. We make theoretical predictions for the instrumental and observatory properties required to significantly contribute to these science cases, and compare them to the projected AtLAST capabilities. Finally, we consider the unique ways in which transient science cases constrain the observational strategies of AtLAST, and make prescriptions for how AtLAST should observe in order to maximize its transient science output without impinging on other science cases.
Submitted 19 April, 2024;
originally announced April 2024.
-
Determination of the CKM angle $φ_{3}$ from a combination of Belle and Belle II results
Authors:
Belle and Belle II Collaborations,
I. Adachi,
L. Aggarwal,
H. Aihara,
N. Akopov,
A. Aloisio,
S. Al Said,
N. Anh Ky,
D. M. Asner,
H. Atmacan,
V. Aushev,
M. Aversano,
R. Ayad,
V. Babu,
H. Bae,
S. Bahinipati,
P. Bambade,
Sw. Banerjee,
S. Bansal,
M. Barrett,
J. Baudot,
A. Baur,
A. Beaubien
, et al. (377 additional authors not shown)
Abstract:
We report a determination of the CKM angle $φ_{3}$, also known as $γ$, from a combination of measurements using samples of up to 711~fb$^{-1}$ from the Belle experiment and up to 362~fb$^{-1}$ from the Belle II experiment. We combine results from analyses of $B^+\to DK^+, B^+\to Dπ^+$, and $B^+ \to D^{*}K^+$ decays, where $D$ is an admixture of $D^0$ and $\overline{D}{}^{0}$ mesons, in a likelihood fit to obtain $φ_{3} = (78.6^{+7.2}_{-7.3})^{\circ}$. We also briefly discuss the interpretation of this result.
Submitted 19 April, 2024;
originally announced April 2024.
-
Expanding Ricci solitons coming out of weakly PIC1 metric cones
Authors:
Pak-Yeung Chan,
Man-Chun Lee,
Luke T. Peachey
Abstract:
Motivated by recent work of Deruelle-Schulze-Simon, we study complete weakly PIC1 Ricci flows with Euclidean volume growth coming out of metric cones. We show that such a Ricci flow must be an expanding gradient Ricci soliton, and as a consequence, any metric cone at infinity of a complete weakly PIC1 Kähler manifold with Euclidean volume growth is biholomorphic to complex Euclidean space in a canonical way.
Submitted 1 May, 2024; v1 submitted 19 April, 2024;
originally announced April 2024.
-
Dyakonov-Perel-like Orbital and Spin Relaxations in Centrosymmetric Systems
Authors:
Jeonghun Sohn,
Jongjun M. Lee,
Hyun-Woo Lee
Abstract:
The Dyakonov-Perel (DP) mechanism of spin relaxation has long been considered irrelevant in centrosymmetric systems since it was developed originally for non-centrosymmetric ones. We investigate whether this conventional understanding extends to the realm of orbital relaxation, which has recently attracted significant attention. Surprisingly, we find that orbital relaxation in centrosymmetric systems exhibits the DP-like behavior in the weak scattering regime. Moreover, the DP-like orbital relaxation can make the spin relaxation in centrosymmetric systems DP-like through the spin-orbit coupling. We also find that the DP-like orbital and spin relaxations are anisotropic even in materials with high crystal symmetry (such as face-centered cubic structure) and may depend on the orbital and spin nature of electron wavefunctions.
Submitted 17 April, 2024;
originally announced April 2024.
-
DeblurGS: Gaussian Splatting for Camera Motion Blur
Authors:
Jeongtaek Oh,
Jaeyoung Chung,
Dongwoo Lee,
Kyoung Mu Lee
Abstract:
Although significant progress has been made in reconstructing sharp 3D scenes from motion-blurred images, a transition to real-world applications remains challenging. The primary obstacle stems from the severe blur which leads to inaccuracies in the acquisition of initial camera poses through Structure-from-Motion, a critical aspect often overlooked by previous approaches. To address this challenge, we propose DeblurGS, a method to optimize sharp 3D Gaussian Splatting from motion-blurred images, even with the noisy camera pose initialization. We restore a fine-grained sharp scene by leveraging the remarkable reconstruction capability of 3D Gaussian Splatting. Our approach estimates the 6-Degree-of-Freedom camera motion for each blurry observation and synthesizes corresponding blurry renderings for the optimization process. Furthermore, we propose Gaussian Densification Annealing strategy to prevent the generation of inaccurate Gaussians at erroneous locations during the early training stages when camera motion is still imprecise. Comprehensive experiments demonstrate that our DeblurGS achieves state-of-the-art performance in deblurring and novel view synthesis for real-world and synthetic benchmark datasets, as well as field-captured blurry smartphone videos.
Submitted 17 April, 2024; v1 submitted 17 April, 2024;
originally announced April 2024.
-
Measurement of the branching fraction of the decay $B^- \to D^0 ρ(770)^-$ at Belle II
Authors:
Belle II Collaboration,
I. Adachi,
L. Aggarwal,
H. Aihara,
N. Akopov,
A. Aloisio,
N. Anh Ky,
D. M. Asner,
H. Atmacan,
V. Aushev,
M. Aversano,
R. Ayad,
V. Babu,
H. Bae,
S. Bahinipati,
P. Bambade,
Sw. Banerjee,
S. Bansal,
M. Barrett,
J. Baudot,
A. Baur,
A. Beaubien,
F. Becherer,
J. Becker,
J. V. Bennett
, et al. (367 additional authors not shown)
Abstract:
We measure the branching fraction of the decay $B^- \to D^0 ρ(770)^-$ using data collected with the Belle II detector. The data contain 387 million $B\overline{B}$ pairs produced in $e^+e^-$ collisions at the $Υ(4S)$ resonance. We reconstruct $8360\pm 180$ decays from an analysis of the distributions of the $B^-$ energy and the $ρ(770)^-$ helicity angle. We determine the branching fraction to be $(0.939 \pm 0.021\mathrm{(stat)} \pm 0.050\mathrm{(syst)})\%$, in agreement with previous results. Our measurement improves the relative precision of the world average by more than a factor of two.
Submitted 27 June, 2024; v1 submitted 16 April, 2024;
originally announced April 2024.
-
Tracing the evolutionary pathways of dust and cold gas in high-z quiescent galaxies with SIMBA
Authors:
G. Lorenzon,
D. Donevski,
K. Lisiecki,
C. Lovell,
M. Romano,
D. Narayanan,
R. Davé,
A. Man,
K. E. Whitaker,
A. Nanni,
A. Long,
M. M. Lee,
Junais,
K. Małek,
G. Rodighiero,
Q. Li
Abstract:
Recent discoveries of copious amounts of dust in quiescent galaxies (QGs) at high redshifts ($z\gtrsim 1-2$) challenge the conventional view that these objects have poor interstellar medium (ISM) in proportion to their stellar mass. We use the SIMBA cosmological simulation to explore the evolution of dust and cold gas content in QGs in relation to the quenching processes affecting them. We track the changes in the ISM dust abundance across the evolutionary history of QGs identified at $0 \lesssim z \lesssim2$ in the field and cluster environments. The QGs quench via diverse pathways, both rapid and slow, and exhibit a wide range of times elapsed between the quenching event and cold gas removal (from $\sim650$ Myr to $\sim8$ Gyr). We find that quenching modes attributed to the feedback from active galactic nuclei (AGN) do not affect dust and cold gas within the same timescales. Remarkably, QGs may replenish their dust content in the quenched phase primarily due to internal processes and marginally by external factors such as minor mergers. The key mechanism for re-formation of dust is prolonged grain growth on gas-phase metals, it is effective within $\sim100$ Myr after the quenching event, and rapidly increases the dust-to-gas mass ratio in QGs above the standard values ($δ_{\rm DGR}\gtrsim1/100$). As a result, despite heavily depleted cold gas reservoirs, roughly half of QGs maintain little evolution in their ISM dust with stellar age within the first 2 Gyr following the quenching. Overall, we predict that relatively dusty QGs ($M_{\rm dust}/M_{\star}\gtrsim10^{-3}-10^{-4}$) arise from both fast and slow quenchers, and are prevalent in systems of intermediate and low stellar masses ($9<\log(M_{\star}/M_{\odot})<10.5$). This prediction poses an immediate quest for observational synergy between e.g., James Webb Space Telescope (JWST) and the Atacama Large Millimeter Array (ALMA).
Submitted 16 April, 2024;
originally announced April 2024.
-
On Speculative Decoding for Multimodal Large Language Models
Authors:
Mukul Gagrani,
Raghavv Goel,
Wonseok Jeon,
Junyoung Park,
Mingu Lee,
Christopher Lott
Abstract:
Inference with Multimodal Large Language Models (MLLMs) is slow due to their large-language-model backbone, which suffers from a memory bandwidth bottleneck and generates tokens auto-regressively. In this paper, we explore the application of speculative decoding to enhance the inference efficiency of MLLMs, specifically the LLaVA 7B model. We show that a language-only model can serve as a good draft model for speculative decoding with LLaVA 7B, bypassing the need for image tokens and their associated processing components from the draft model. Our experiments across three different tasks show that speculative decoding can achieve a memory-bound speedup of up to 2.37$\times$ using a 115M parameter language model that we trained from scratch. Additionally, we introduce a compact LLaVA draft model incorporating an image adapter, which shows marginal performance gains in image captioning while maintaining comparable results in other tasks.
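As a rough illustration of the speculative decoding idea this abstract refers to, the sketch below implements the simplest greedy variant: a cheap draft model proposes `gamma` tokens, and the target model keeps the longest prefix it agrees with plus one token of its own. This is not the paper's implementation. The function names, the `gamma` parameter, and the toy integer "models" are all hypothetical, and production systems usually use a rejection-sampling acceptance rule rather than exact greedy matching.

```python
from typing import Callable, List, Tuple

NextToken = Callable[[List[int]], int]  # maps a token context to the next token


def greedy_decode(model: NextToken, prompt: List[int], max_new: int) -> List[int]:
    """Plain autoregressive decoding: one model call per generated token."""
    out = list(prompt)
    for _ in range(max_new):
        out.append(model(out))
    return out


def speculative_decode(target: NextToken, draft: NextToken, prompt: List[int],
                       max_new: int, gamma: int = 4) -> Tuple[List[int], int]:
    """Greedy speculative decoding; returns (tokens, number of verification passes)."""
    out = list(prompt)
    passes = 0
    while len(out) - len(prompt) < max_new:
        # 1) The draft model proposes gamma tokens autoregressively (cheap).
        proposal: List[int] = []
        for _ in range(gamma):
            proposal.append(draft(out + proposal))
        # 2) The target checks the proposal. A real system scores all gamma + 1
        #    positions in one batched forward pass; here that pass is emulated
        #    with a loop and counted once.
        passes += 1
        accepted: List[int] = []
        for i in range(gamma + 1):
            t = target(out + accepted)
            accepted.append(t)  # always the target's own token
            if i == gamma or t != proposal[i]:
                break  # first disagreement (or bonus token) ends the round
        out.extend(accepted)
    return out[:len(prompt) + max_new], passes


# Hypothetical toy models: the draft agrees with the target except when the
# context length is a multiple of 4.
def toy_target(ctx: List[int]) -> int:
    return (3 * ctx[-1] + len(ctx)) % 17


def toy_draft(ctx: List[int]) -> int:
    t = toy_target(ctx)
    return t if len(ctx) % 4 else (t + 1) % 17


tokens, passes = speculative_decode(toy_target, toy_draft, [1, 2, 3], max_new=12)
print(tokens == greedy_decode(toy_target, [1, 2, 3], 12), passes)
```

Because every accepted token is the target's own greedy choice, the output matches target-only decoding exactly; the gain is that several tokens can be committed per target pass whenever the draft guesses well.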
Submitted 12 April, 2024;
originally announced April 2024.
-
AutoGFI: Streamlined Generalized Fiducial Inference for Modern Inference Problems
Authors:
Wei Du,
Jan Hannig,
Thomas C. M. Lee,
Yi Su,
Chunzhe Zhang
Abstract:
The origins of fiducial inference trace back to the 1930s when R. A. Fisher first introduced the concept as a response to what he perceived as a limitation of Bayesian inference - the requirement for a subjective prior distribution on model parameters in cases where no prior information was available. However, Fisher's initial fiducial approach fell out of favor as complications arose, particularly in multi-parameter problems. In the wake of 2000, amidst a renewed interest in contemporary adaptations of fiducial inference, generalized fiducial inference (GFI) emerged to extend Fisher's fiducial argument, providing a promising avenue for addressing numerous crucial and practical inference challenges. Nevertheless, the adoption of GFI has been limited due to its often demanding mathematical derivations and the necessity for implementing complex Markov Chain Monte Carlo algorithms. This complexity has impeded its widespread utilization and practical applicability. This paper presents a significant advancement by introducing an innovative variant of GFI designed to alleviate these challenges. Specifically, this paper proposes AutoGFI, an easily implementable algorithm that streamlines the application of GFI to a broad spectrum of inference problems involving additive noise. AutoGFI can be readily implemented as long as a fitting routine is available, making it accessible to a broader audience of researchers and practitioners. To demonstrate its effectiveness, AutoGFI is applied to three contemporary and challenging problems: tensor regression, matrix completion, and regression with network cohesion. These case studies highlight the immense potential of GFI and illustrate AutoGFI's promising performance when compared to specialized solutions for these problems. Overall, this research paves the way for a more accessible and powerful application of GFI in a range of practical domains.
Submitted 11 April, 2024;
originally announced April 2024.
-
Search for rare $b \to d\ell^+\ell^-$ transitions at Belle
Authors:
Belle and Belle II Collaborations,
I. Adachi,
L. Aggarwal,
H. Aihara,
N. Akopov,
A. Aloisio,
S. Al Said,
D. M. Asner,
H. Atmacan,
V. Aushev,
M. Aversano,
R. Ayad,
V. Babu,
H. Bae,
S. Bahinipati,
P. Bambade,
Sw. Banerjee,
S. Bansal,
M. Barrett,
J. Baudot,
A. Beaubien,
F. Becherer,
J. Becker
, et al. (371 additional authors not shown)
Abstract:
We present the results of a search for the $b \to d\ell^+\ell^-$ flavor-changing neutral-current rare decays $B^{+, 0} \to (η, ω, π^{+,0}, ρ^{+, 0}) e^+e^-$ and $B^{+, 0} \to (η, ω, π^{0}, ρ^{+}) μ^+μ^-$ using a $711$ fb$^{-1}$ data sample that contains $772 \times 10^{6}$ $B\overline{B}$ events. The data were collected at the $Υ(4S)$ resonance with the Belle detector at the KEKB asymmetric-energy $e^+e^-$ collider. We find no evidence for signal and set upper limits on branching fractions at the $90\%$ confidence level in the range $(3.8 - 47) \times 10^{-8}$ depending on the decay channel. The obtained limits are the world's best results. This is the first search for the channels $B^{+, 0} \to (ω, ρ^{+,0}) e^+e^-$ and $B^{+, 0} \to (ω, ρ^{+})μ^+μ^-$.
Submitted 11 April, 2024;
originally announced April 2024.
-
FedAuxHMTL: Federated Auxiliary Hard-Parameter Sharing Multi-Task Learning for Network Edge Traffic Classification
Authors:
Faisal Ahmed,
Myungjin Lee,
Suresh Subramaniam,
Motoharu Matsuura,
Hiroshi Hasegawa,
Shih-Chun Lin
Abstract:
Federated Learning (FL) has garnered significant interest recently due to its potential as an effective solution for tackling many challenges in diverse application scenarios, for example, data privacy in network edge traffic classification. Despite its recognized advantages, FL encounters obstacles linked to statistical data heterogeneity and labeled data scarcity during the training of single-task models for machine learning-based traffic classification, leading to hindered learning performance. In response to these challenges, adopting a hard-parameter sharing multi-task learning model with auxiliary tasks proves to be a suitable approach. Such a model has the capability to reduce communication and computation costs, navigate statistical complexities inherent in FL contexts, and overcome labeled data scarcity by leveraging knowledge derived from interconnected auxiliary tasks. This paper introduces a new framework for federated auxiliary hard-parameter sharing multi-task learning, namely, FedAuxHMTL. The introduced framework incorporates model parameter exchanges between the edge server and base stations, enabling base stations from distributed areas to participate in the FedAuxHMTL process and enhance the learning performance of the main task: network edge traffic classification. Empirical experiments are conducted to validate and demonstrate FedAuxHMTL's effectiveness in terms of accuracy, total global loss, communication costs, computing time, and energy consumption compared to its counterparts.
Submitted 11 April, 2024;
originally announced April 2024.
-
Learning Locally Interacting Discrete Dynamical Systems: Towards Data-Efficient and Scalable Prediction
Authors:
Beomseok Kang,
Harshit Kumar,
Minah Lee,
Biswadeep Chakraborty,
Saibal Mukhopadhyay
Abstract:
Locally interacting dynamical systems, such as epidemic spread, rumor propagation through crowds, and forest fires, exhibit complex global dynamics originating from local, relatively simple, and often stochastic interactions between dynamic elements. Their temporal evolution is often driven by transitions between a finite number of discrete states. Despite significant advancements in predictive modeling through deep learning, such interactions among many elements have rarely been explored as a specific domain for predictive modeling. We present Attentive Recurrent Neural Cellular Automata (AR-NCA) to effectively discover unknown local state transition rules by associating the temporal information between neighboring cells in a permutation-invariant manner. AR-NCA exhibits superior generalizability across various system configurations (i.e., spatial distribution of states), data efficiency and robustness in extremely data-limited scenarios even in the presence of stochastic interactions, and scalability through spatial dimension-independent prediction.
Submitted 27 May, 2024; v1 submitted 9 April, 2024;
originally announced April 2024.
-
Measurement of the $e^+e^- \to π^+π^-π^0$ cross section in the energy range 0.62-3.50 GeV at Belle II
Authors:
Belle II Collaboration,
I. Adachi,
L. Aggarwal,
H. Aihara,
N. Akopov,
A. Aloisio,
N. Anh Ky,
D. M. Asner,
H. Atmacan,
V. Aushev,
M. Aversano,
R. Ayad,
V. Babu,
H. Bae,
S. Bahinipati,
P. Bambade,
Sw. Banerjee,
S. Bansal,
M. Barrett,
J. Baudot,
A. Baur,
A. Beaubien,
F. Becherer,
J. Becker,
J. V. Bennett
, et al. (338 additional authors not shown)
Abstract:
We report a measurement of the $e^+e^- \to π^+π^-π^0$ cross section in the energy range from 0.62 to 3.50 GeV using an initial-state radiation technique. We use an $e^+e^-$ data sample corresponding to 191 $\text{fb}^{-1}$ of integrated luminosity, collected at a center-of-mass energy at or near the $Υ{(4S)}$ resonance with the Belle II detector at the SuperKEKB collider. Signal yields are extracted by fitting the two-photon mass distribution in $e^+e^- \to π^+π^-π^0γ$ events, which involve a $π^0 \to γγ$ decay and an energetic photon radiated from the initial state. Signal efficiency corrections with an accuracy of 1.6% are obtained from several control data samples. The uncertainty on the cross section at the $ω$ and $φ$ resonances is dominated by the systematic uncertainty of 2.2%. The resulting cross sections in the 0.62-1.80 GeV energy range yield $ a_μ^{3π} = [48.91 \pm 0.23~(\mathrm{stat}) \pm 1.07~(\mathrm{syst})] \times 10^{-10} $ for the leading-order hadronic vacuum polarization contribution to the muon anomalous magnetic moment. This result differs by $2.5$ standard deviations from the most precise current determination.
Submitted 7 April, 2024;
originally announced April 2024.
-
Comparative Study of Quantum-Circuit Scalability in a Financial Problem
Authors:
Jaewoong Heo,
Moonjoo Lee
Abstract:
Quantum computers are extensively used in solving financial problems. Quantum amplitude estimation, an algorithm that aims to estimate the amplitude of a given quantum state, can be utilized to determine the expectation value of bonds, following the logic introduced in quantum risk analysis. As the number of evaluation qubits increases, the outcome expectation value becomes more precise. This augmentation in qubits, however, also leads to a varied escalation in circuit complexity, contingent upon the type of quantum computing device. By analyzing the number of two-qubit gates in the superconducting circuit and ion-trap quantum system, this study finds that the native gates and connectivity of the ion-trap system lead to less complicated quantum circuits. Across a range of experiments conducted with one to nineteen qubits, the examination reveals that the ion-trap system exhibits a two- to three-fold reduction in the number of required two-qubit gates when compared to the superconducting circuit system.
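To make the link between evaluation qubits and precision concrete, the sketch below uses the standard readout grid of canonical (phase-estimation-based) amplitude estimation: with m evaluation qubits, an integer outcome y maps to the estimate sin²(πy / 2^m). It is an idealized classical calculation, not a circuit simulation and not the authors' code. The helper name `qae_estimate` and the example amplitude 0.3 are assumptions, and the helper simply returns the grid point nearest to the true angle, ignoring the measurement probability distribution.

```python
import math


def qae_estimate(a: float, m: int) -> float:
    """Idealized best-case QAE readout for true amplitude a with m evaluation qubits."""
    grid_size = 2 ** m
    theta = math.asin(math.sqrt(a))           # a = sin^2(theta), theta in [0, pi/2]
    y = round(grid_size * theta / math.pi)    # nearest integer measurement outcome
    return math.sin(math.pi * y / grid_size) ** 2


true_amplitude = 0.3  # hypothetical value, e.g. a rescaled expected payoff
for m in (1, 3, 5, 7, 9):
    estimate = qae_estimate(true_amplitude, m)
    bound = math.pi / 2 ** (m + 1)            # worst-case grid error for this m
    print(f"m={m}: estimate={estimate:.5f}, "
          f"error={abs(estimate - true_amplitude):.5f}, bound={bound:.5f}")
```

The worst-case grid error halves with each added evaluation qubit, which is the precision gain the abstract describes; the price is a deeper circuit, and the number of two-qubit gates is the cost the study compares across hardware.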
Submitted 7 April, 2024;
originally announced April 2024.
-
Joint Reconstruction of 3D Human and Object via Contact-Based Refinement Transformer
Authors:
Hyeongjin Nam,
Daniel Sungho Jung,
Gyeongsik Moon,
Kyoung Mu Lee
Abstract:
Human-object contact serves as a strong cue to understand how humans physically interact with objects. Nevertheless, the use of human-object contact information for the joint reconstruction of 3D human and object from a single image has not been widely explored. In this work, we present a novel joint 3D human-object reconstruction method (CONTHO) that effectively exploits contact information between humans and objects. There are two core designs in our system: 1) 3D-guided contact estimation and 2) contact-based 3D human and object refinement. First, for accurate human-object contact estimation, CONTHO initially reconstructs 3D humans and objects and utilizes them as explicit 3D guidance for contact estimation. Second, to refine the initial reconstructions of 3D human and object, we propose a novel contact-based refinement Transformer that effectively aggregates human features and object features based on the estimated human-object contact. The proposed contact-based refinement prevents the learning of erroneous correlation between human and object, which enables accurate 3D reconstruction. As a result, our CONTHO achieves state-of-the-art performance in both human-object contact estimation and joint reconstruction of 3D human and object. The code is publicly available at https://github.com/dqj5182/CONTHO_RELEASE.
Submitted 7 April, 2024;
originally announced April 2024.
-
AdaBM: On-the-Fly Adaptive Bit Mapping for Image Super-Resolution
Authors:
Cheeun Hong,
Kyoung Mu Lee
Abstract:
Although the image super-resolution (SR) problem has seen unprecedented restoration accuracy with deep neural networks, its applications remain limited due to the substantial computational costs. Since different input images for SR face different restoration difficulties, adapting computational costs based on the input image, referred to as adaptive inference, has emerged as a promising solution to compress SR networks. Specifically, adapting the quantization bit-widths has successfully reduced the inference and memory cost without sacrificing the accuracy. However, despite the benefits of the resultant adaptive network, existing works rely on time-intensive quantization-aware training with full access to the original training pairs to learn the appropriate bit allocation policies, which limits its ubiquitous usage. To this end, we introduce the first on-the-fly adaptive quantization framework that accelerates the processing time from hours to seconds. We formulate the bit allocation problem with only two bit mapping modules: one to map the input image to the image-wise bit adaptation factor and one to obtain the layer-wise adaptation factors. These bit mappings are calibrated and fine-tuned using only a small number of calibration images. We achieve competitive performance with the previous adaptive quantization methods, while the processing time is accelerated by 2000x. Codes are available at https://github.com/Cheeun/AdaBM.
Submitted 4 April, 2024;
originally announced April 2024.
-
End-To-End Self-tuning Self-supervised Time Series Anomaly Detection
Authors:
Boje Deforce,
Meng-Chieh Lee,
Bart Baesens,
Estefanía Serral Asensio,
Jaemin Yoo,
Leman Akoglu
Abstract:
Time series anomaly detection (TSAD) finds many applications such as monitoring environmental sensors, industry KPIs, patient biomarkers, etc. A two-fold challenge for TSAD is a versatile and unsupervised model that can detect various different types of time series anomalies (spikes, discontinuities, trend shifts, etc.) without any labeled data. Modern neural networks have outstanding ability in modeling complex time series. Self-supervised models in particular tackle unsupervised TSAD by transforming the input via various augmentations to create pseudo anomalies for training. However, their performance is sensitive to the choice of augmentation, which is hard to choose in practice, while there exists no effort in the literature on data augmentation tuning for TSAD without labels. Our work aims to fill this gap. We introduce TSAP for TSA "on autoPilot", which can (self-)tune augmentation hyperparameters end-to-end. It stands on two key components: a differentiable augmentation architecture and an unsupervised validation loss to effectively assess the alignment between augmentation type and anomaly type. Case studies show TSAP's ability to effectively select the (discrete) augmentation type and associated (continuous) hyperparameters. In turn, it outperforms established baselines, including SOTA self-supervised models, on diverse TSAD tasks exhibiting different anomaly types.
Submitted 3 April, 2024;
originally announced April 2024.
-
The JWST-PRIMAL Legacy Survey. A JWST/NIRSpec reference sample for the physical properties and Lyman-$α$ absorption and emission of $\sim 500$ galaxies at $z=5.5-13.4$
Authors:
K. E. Heintz,
G. B. Brammer,
D. Watson,
P. A. Oesch,
L. C. Keating,
M. J. Hayes,
Abdurro'uf,
K. Z. Arellano-Córdova,
A. C. Carnall,
C. R. Christiansen,
F. Cullen,
R. Davé,
P. Dayal,
A. Ferrara,
K. Finlator,
J. P. U. Fynbo,
S. R. Flury,
V. Gelli,
S. Gillman,
R. Gottumukkala,
K. Gould,
T. R. Greve,
S. E. Hardin,
T. Y. -Y Hsiao,
A. Hutter
, et al. (23 additional authors not shown)
Abstract:
One of the surprising early findings with JWST has been the discovery of a strong "roll-over" or a softening of the absorption edge of Ly$α$ in a large number of galaxies at ($z\gtrsim 6$), in addition to systematic offsets from photometric redshift estimates and fundamental galaxy scaling relations. This has been interpreted as damped Ly$α$ absorption (DLA) wings from high column densities of neutral atomic hydrogen (HI), signifying major gas accretion events in the formation of these galaxies. To explore this new phenomenon systematically, we assemble the JWST/NIRSpec PRImordial gas Mass AssembLy (PRIMAL) legacy survey of 494 galaxies at $z=5.5-13.4$. We characterize this benchmark sample in full and spectroscopically derive the galaxy redshifts, metallicities, star-formation rates, and ultraviolet slopes. We define a new diagnostic, the Ly$α$ damping parameter $D_{\rm Lyα}$, to measure and quantify the Ly$α$ emission strength, HI fraction in the IGM, or local HI column density for each source. The JWST-PRIMAL survey is based on the spectroscopic DAWN JWST Archive (DJA-Spec). All the software, reduced spectra, and spectroscopically derived quantities and catalogs are made publicly available in dedicated repositories. The fraction of strong galaxy DLAs is found to be in the range $65-95\%$ at $z>5.5$. The fraction of strong Ly$α$ emitters (LAEs) is found to increase with decreasing redshift, in qualitative agreement with previous observational results, and LAEs are predominantly associated with low-metallicity and UV-faint galaxies. By contrast, strong DLAs are observed in galaxies with a variety of intrinsic physical properties. Our results indicate that strong DLAs likely reflect a particular early assembly phase of reionization-era galaxies, at which point they are largely dominated by pristine HI gas accretion. [abridged]
Submitted 2 April, 2024;
originally announced April 2024.
-
HyperCLOVA X Technical Report
Authors:
Kang Min Yoo,
Jaegeun Han,
Sookyo In,
Heewon Jeon,
Jisu Jeong,
Jaewook Kang,
Hyunwook Kim,
Kyung-Min Kim,
Munhyong Kim,
Sungju Kim,
Donghyun Kwak,
Hanock Kwak,
Se Jung Kwon,
Bado Lee,
Dongsoo Lee,
Gichang Lee,
Jooho Lee,
Baeseong Park,
Seongjin Shin,
Joonsang Yu,
Seolki Baek,
Sumin Byeon,
Eungsup Cho,
Dooseok Choe,
Jeesung Han
, et al. (371 additional authors not shown)
Abstract:
We introduce HyperCLOVA X, a family of large language models (LLMs) tailored to the Korean language and culture, along with competitive capabilities in English, math, and coding. HyperCLOVA X was trained on a balanced mix of Korean, English, and code data, followed by instruction-tuning with high-quality human-annotated datasets while abiding by strict safety guidelines reflecting our commitment to responsible AI. The model is evaluated across various benchmarks, including comprehensive reasoning, knowledge, commonsense, factuality, coding, math, chatting, instruction-following, and harmlessness, in both Korean and English. HyperCLOVA X exhibits strong reasoning capabilities in Korean backed by a deep understanding of the language and cultural nuances. Further analysis of the inherent bilingual nature and its extension to multilingualism highlights the model's cross-lingual proficiency and strong generalization ability to untargeted languages, including machine translation between several language pairs and cross-lingual inference tasks. We believe that HyperCLOVA X can provide helpful guidance for regions or countries in developing their sovereign LLMs.
Submitted 13 April, 2024; v1 submitted 2 April, 2024;
originally announced April 2024.
-
Beyond Image Super-Resolution for Image Recognition with Task-Driven Perceptual Loss
Authors:
Jaeha Kim,
Junghun Oh,
Kyoung Mu Lee
Abstract:
In real-world scenarios, image recognition tasks, such as semantic segmentation and object detection, often pose greater challenges due to the lack of information available within low-resolution (LR) content. Image super-resolution (SR) is one of the promising solutions for addressing the challenges. However, due to the ill-posed property of SR, it is challenging for typical SR methods to restore task-relevant high-frequency contents, which may dilute the advantage of utilizing the SR method. Therefore, in this paper, we propose Super-Resolution for Image Recognition (SR4IR) that effectively guides the generation of SR images beneficial to achieving satisfactory image recognition performance when processing LR images. The critical component of our SR4IR is the task-driven perceptual (TDP) loss that enables the SR network to acquire task-specific knowledge from a network tailored for a specific task. Moreover, we propose a cross-quality patch mix and an alternate training framework that significantly enhances the efficacy of the TDP loss by addressing potential problems when employing the TDP loss. Through extensive experiments, we demonstrate that our SR4IR achieves outstanding task performance by generating SR images useful for a specific image recognition task, including semantic segmentation, object detection, and image classification. The implementation code is available at https://github.com/JaehaKim97/SR4IR.
Submitted 4 April, 2024; v1 submitted 2 April, 2024;
originally announced April 2024.
-
Learning Equi-angular Representations for Online Continual Learning
Authors:
Minhyuk Seo,
Hyunseo Koh,
Wonje Jeung,
Minjae Lee,
San Kim,
Hankook Lee,
Sungjun Cho,
Sungik Choi,
Hyunwoo Kim,
Jonghyun Choi
Abstract:
Online continual learning suffers from an underfitted solution due to insufficient training for prompt model update (e.g., single-epoch training). To address the challenge, we propose an efficient online continual learning method using the neural collapse phenomenon. In particular, we induce neural collapse to form a simplex equiangular tight frame (ETF) structure in the representation space so that the continuously learned model with a single epoch can better fit to the streamed data by proposing preparatory data training and residual correction in the representation space. With an extensive set of empirical validations using CIFAR-10/100, TinyImageNet, ImageNet-200, and ImageNet-1K, we show that our proposed method outperforms state-of-the-art methods by a noticeable margin in various online continual learning scenarios such as disjoint and Gaussian scheduled continuous (i.e., boundary-free) data setups.
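For readers unfamiliar with the simplex equiangular tight frame (ETF) mentioned in this abstract, the sketch below constructs one and checks its defining properties: K unit-norm vectors whose pairwise cosine similarity is exactly -1/(K-1). It uses the textbook construction in K dimensions with an identity rotation, which is an assumption made for simplicity. It does not reproduce the paper's training procedure, and the function names are hypothetical.

```python
import math
from typing import List


def simplex_etf(k: int) -> List[List[float]]:
    """Return k unit vectors in R^k: sqrt(k / (k - 1)) * (e_i - (1/k) * ones)."""
    scale = math.sqrt(k / (k - 1))
    return [[scale * ((1.0 if i == j else 0.0) - 1.0 / k) for j in range(k)]
            for i in range(k)]


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


num_classes = 10  # e.g. one target vector per CIFAR-10 class
etf = simplex_etf(num_classes)
print(cosine(etf[0], etf[0]), cosine(etf[0], etf[1]), -1 / (num_classes - 1))
```

Each class is assigned one of these maximally separated directions as a fixed target, so features pulled toward their class vector are automatically pushed equally far from every other class.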
Submitted 2 April, 2024;
originally announced April 2024.
-
Stationary solutions to the relativistic BGK model for gas mixtures in a slab
Authors:
Byung-Hoon Hwang,
Myeong-Su Lee
Abstract:
In a recent paper [16], the authors proposed a BGK model for relativistic gas mixtures based on the Marle-type approximation, which satisfies the fundamental kinetic properties: non-negativity of distribution functions, conservation laws, H-theorem, and indifferentiability principle. In this paper, we are concerned with the stationary problems to the relativistic BGK model for gas mixtures in slab geometry. We establish the existence of a unique mild solution with the fixed inflow boundary data when the collision frequencies for each species are sufficiently small.
Submitted 31 March, 2024;
originally announced April 2024.