-
AdaCQR: Enhancing Query Reformulation for Conversational Search via Sparse and Dense Retrieval Alignment
Authors:
Yilong Lai,
Jialong Wu,
Congzhi Zhang,
Haowen Sun,
Deyu Zhou
Abstract:
Conversational Query Reformulation (CQR) has significantly advanced in addressing the challenges of conversational search, particularly those stemming from the latent user intent and the need for historical context. Recent works aimed to boost the performance of CQR through alignment. However, they are designed for one specific retrieval system, which potentially results in poor generalization. To overcome this limitation, we present a novel framework AdaCQR. By aligning reformulation models with both term-based and semantic-based retrieval systems, AdaCQR enhances the generalizability of information-seeking queries across diverse retrieval environments through a dual-phase training strategy. We also developed two effective approaches for acquiring superior labels and diverse input candidates, boosting the efficiency and robustness of the framework. Experimental evaluations on the TopiOCQA and QReCC datasets demonstrate that AdaCQR significantly outperforms existing methods, offering both quantitative and qualitative improvements in conversational query reformulation.
Submitted 2 July, 2024;
originally announced July 2024.
-
Measurement of the integrated luminosity of data samples collected during 2019-2022 by the Belle II experiment
Authors:
The Belle II Collaboration,
I. Adachi,
L. Aggarwal,
H. Ahmed,
J. K. Ahn,
H. Aihara,
N. Akopov,
A. Aloisio,
N. Althubiti,
N. Anh Ky,
D. M. Asner,
H. Atmacan,
T. Aushev,
V. Aushev,
M. Aversano,
R. Ayad,
V. Babu,
H. Bae,
S. Bahinipati,
P. Bambade,
Sw. Banerjee,
M. Barrett,
J. Baudot,
A. Baur,
A. Beaubien
, et al. (382 additional authors not shown)
Abstract:
A series of data samples was collected with the Belle II detector at the SuperKEKB collider from March 2019 to June 2022. We determine the integrated luminosities of these data samples using three distinct methodologies involving Bhabha ($e^+e^- \to e^+e^-(nγ)$), digamma ($e^+e^- \to γγ(nγ)$), and dimuon ($e^+e^- \to μ^+ μ^- (nγ)$) events. The total integrated luminosity obtained with Bhabha, digamma, and dimuon events is (426.52 $\pm$ 0.03 $\pm$ 2.48)~fb$^{-1}$, (427.32 $\pm$ 0.03 $\pm$ 2.56)~fb$^{-1}$, and (424.84 $\pm$ 0.04 $\pm$ 3.88)~fb$^{-1}$, where the first uncertainties are statistical and the second are systematic. The resulting total integrated luminosity obtained from the combination of the three methods is (426.88 $\pm$ 1.93)~fb$^{-1}$.
Submitted 1 July, 2024;
originally announced July 2024.
-
Study of $χ_{bJ}(2P)\toωΥ(1S)$ at Belle
Authors:
Z. S. Stottler,
T. K. Pedlar,
B. G. Fulsom,
I. Adachi,
K. Adamczyk,
H. Aihara,
S. Al Said,
D. M. Asner,
H. Atmacan,
T. Aushev,
R. Ayad,
V. Babu,
Sw. Banerjee,
M. Bauer,
P. Behera,
K. Belous,
J. Bennett,
F. Bernlochner,
M. Bessner,
T. Bilka,
D. Biswas,
A. Bobrov,
D. Bodrov,
G. Bonvicini,
J. Borah
, et al. (159 additional authors not shown)
Abstract:
We report a study of the hadronic transitions $χ_{bJ}(2P)\toωΥ(1S)$, with $ω\toπ^{+}π^{-}π^{0}$, using $28.2\times10^6~Υ(3S)$ mesons recorded by the Belle detector. We present the first evidence for the near--threshold transition $χ_{b0}(2P)\toωΥ(1S)$, the analog of the charm sector decay $χ_{c1}(3872)\toωJ/ψ$, with a branching fraction of $\mathcal{B}\big(χ_{b0}(2P)\toωΥ(1S)\big) = \big(0.55\pm0.19\pm0.07\big)\%$. We also obtain branching fractions of $\mathcal{B}\big(χ_{b1}(2P)\toωΥ(1S)\big) = \big(2.39{}^{+0.20}_{-0.19}\pm0.24\big)\%$ and $\mathcal{B}\big(χ_{b2}(2P)\toωΥ(1S)\big) = \big(0.47{}^{+0.13}_{-0.12}\pm0.06\big)\%$, confirming the measurement of the $ω$ transitions of the $J=1,2~P$--wave states. The ratio for the $J=2$ to $J=1$ transitions is also measured and found to differ by 3.3 standard deviations from the expected value in the QCD multipole expansion.
Submitted 30 June, 2024;
originally announced July 2024.
-
GM-DF: Generalized Multi-Scenario Deepfake Detection
Authors:
Yingxin Lai,
Zitong Yu,
Jing Yang,
Bin Li,
Xiangui Kang,
Linlin Shen
Abstract:
Existing face forgery detection usually follows the paradigm of training models in a single domain, which leads to limited generalization capacity when unseen scenarios and unknown attacks occur. In this paper, we elaborately investigate the generalization capacity of deepfake detection models when jointly trained on multiple face forgery detection datasets. We first find a rapid degradation of detection accuracy when models are directly trained on combined datasets due to the discrepancy across collection scenarios and generation methods. To address the above issue, a Generalized Multi-Scenario Deepfake Detection framework (GM-DF) is proposed to serve multiple real-world scenarios by a unified model. First, we propose a hybrid expert modeling approach for domain-specific real/forgery feature extraction. Besides, as for the commonality representation, we use CLIP to extract the common features for better aligning visual and textual features across domains. Meanwhile, we introduce a masked image reconstruction mechanism to force models to capture rich forged details. Finally, we supervise the models via a domain-aware meta-learning strategy to further enhance their generalization capacities. Specifically, we design a novel domain alignment loss to strongly align the distributions of the meta-test domains and meta-train domains. Thus, the updated models are able to represent both specific and common real/forgery features across multiple datasets. In consideration of the lack of study of multi-dataset training, we establish a new benchmark leveraging multi-source data to fairly evaluate the models' generalization capacity on unseen scenarios. Both qualitative and quantitative experiments on five datasets conducted on traditional protocols as well as the proposed benchmark demonstrate the effectiveness of our approach.
Submitted 28 June, 2024;
originally announced June 2024.
-
The Belle II Detector Upgrades Framework Conceptual Design Report
Authors:
H. Aihara,
A. Aloisio,
D. P. Auguste,
M. Aversano,
M. Babeluk,
S. Bahinipati,
Sw. Banerjee,
M. Barbero,
J. Baudot,
A. Beaubien,
F. Becherer,
T. Bergauer,
F. U. Bernlochner,
V. Bertacchi,
G. Bertolone,
C. Bespin,
M. Bessner,
S. Bettarini,
A. J. Bevan,
B. Bhuyan,
M. Bona,
J. F. Bonis,
J. Borah,
F. Bosi,
R. Boudagga
, et al. (183 additional authors not shown)
Abstract:
We describe the planned near-term and potential longer-term upgrades of the Belle II detector at the SuperKEKB electron-positron collider operating at the KEK laboratory in Tsukuba, Japan. These upgrades will allow increasingly sensitive searches for possible new physics beyond the Standard Model in flavor, tau, electroweak and dark sector physics that are both complementary to and competitive with the LHC and other experiments.
Submitted 26 June, 2024;
originally announced June 2024.
-
SEED: Accelerating Reasoning Tree Construction via Scheduled Speculative Decoding
Authors:
Zhenglin Wang,
Jialong Wu,
Yilong Lai,
Congzhi Zhang,
Deyu Zhou
Abstract:
Large Language Models (LLMs) demonstrate remarkable emergent abilities across various tasks, yet fall short of complex reasoning and planning tasks. The tree-search-based reasoning methods address this by surpassing the capabilities of chain-of-thought prompting, encouraging exploration of intermediate steps. However, such methods introduce significant inference latency due to the systematic exploration and evaluation of multiple thought paths. This paper introduces SeeD, a novel and efficient inference framework to optimize runtime speed and GPU memory management concurrently. By employing a scheduled speculative execution, SeeD efficiently handles multiple iterations for the thought generation and the state evaluation, leveraging a rounds-scheduled strategy to manage draft model dispatching. Extensive experimental evaluations on three reasoning datasets demonstrate superior speedup performance of SeeD, providing a viable path for batched inference in training-free speculative decoding.
Submitted 26 June, 2024;
originally announced June 2024.
-
Feature-prompting GBMSeg: One-Shot Reference Guided Training-Free Prompt Engineering for Glomerular Basement Membrane Segmentation
Authors:
Xueyu Liu,
Guangze Shi,
Rui Wang,
Yexin Lai,
Jianan Zhang,
Lele Sun,
Quan Yang,
Yongfei Wu,
Ming Li,
Weixia Han,
Wen Zheng
Abstract:
Assessment of the glomerular basement membrane (GBM) in transmission electron microscopy (TEM) is crucial for diagnosing chronic kidney disease (CKD). The lack of domain-independent automatic segmentation tools for the GBM necessitates an AI-based solution to automate the process. In this study, we introduce GBMSeg, a training-free framework designed to automatically segment the GBM in TEM images guided only by a one-shot annotated reference. Specifically, GBMSeg first exploits the robust feature matching capabilities of the pretrained foundation model to generate initial prompt points, then introduces a series of novel automatic prompt engineering techniques across the feature and physical space to optimize the prompt scheme. Finally, GBMSeg employs a class-agnostic foundation segmentation model with the generated prompt scheme to obtain accurate segmentation results. Experimental results on our collected 2538 TEM images confirm that GBMSeg achieves superior segmentation performance with a Dice similarity coefficient (DSC) of 87.27% using only one labeled reference image in a training-free manner, outperforming recently proposed one-shot or few-shot methods. In summary, GBMSeg introduces a distinctive automatic prompt framework that facilitates robust domain-independent segmentation performance without training, particularly advancing the automatic prompting of foundation segmentation models for medical images. Future work involves automating the thickness measurement of segmented GBM and quantifying pathological indicators, holding significant potential for advancing pathology assessments in clinical applications. The source code is available on https://github.com/SnowRain510/GBMSeg
Submitted 23 June, 2024;
originally announced June 2024.
-
SuperSVG: Superpixel-based Scalable Vector Graphics Synthesis
Authors:
Teng Hu,
Ran Yi,
Baihong Qian,
Jiangning Zhang,
Paul L. Rosin,
Yu-Kun Lai
Abstract:
SVG (Scalable Vector Graphics) is a widely used graphics format that possesses excellent scalability and editability. Image vectorization, which aims to convert raster images to SVGs, is an important yet challenging problem in computer vision and graphics. Existing image vectorization methods either suffer from low reconstruction accuracy for complex images or require long computation time. To address this issue, we propose SuperSVG, a superpixel-based vectorization model that achieves fast and high-precision image vectorization. Specifically, we decompose the input image into superpixels to help the model focus on areas with similar colors and textures. Then, we propose a two-stage self-training framework, where a coarse-stage model is employed to reconstruct the main structure and a refinement-stage model is used for enriching the details. Moreover, we propose a novel dynamic path warping loss to help the refinement-stage model to inherit knowledge from the coarse-stage model. Extensive qualitative and quantitative experiments demonstrate the superior performance of our method in terms of reconstruction accuracy and inference time compared to state-of-the-art approaches. The code is available in \url{https://github.com/sjtuplayer/SuperSVG}.
Submitted 14 June, 2024;
originally announced June 2024.
-
AutomaChef: A Physics-informed Demonstration-guided Learning Framework for Granular Material Manipulation
Authors:
Minglun Wei,
Xintong Yang,
Yu-Kun Lai,
Seyed Amir Tafrishi,
Ze Ji
Abstract:
Due to the complex physical properties of granular materials, research on robot learning for manipulating such materials predominantly either disregards the consideration of their physical characteristics or uses surrogate models to approximate their physical properties. Learning to manipulate granular materials based on physical information obtained through precise modelling remains an unsolved problem. In this paper, we propose to address this challenge by constructing a differentiable physics simulator for granular materials based on the Taichi programming language and developing a learning framework accelerated by imperfect demonstrations that are generated via gradient-based optimisation on non-granular materials through our simulator. Experimental results show that our method trains three policies that, when chained, are capable of executing the task of transporting granular materials in both simulated and real-world scenarios, which existing popular deep reinforcement learning models fail to accomplish.
Submitted 13 June, 2024;
originally announced June 2024.
-
Jet modification via $π^0$-hadron correlations in Au$+$Au collisions at $\sqrt{s_{_{NN}}}=200$ GeV
Authors:
PHENIX Collaboration,
N. J. Abdulameer,
U. Acharya,
A. Adare,
S. Afanasiev,
C. Aidala,
N. N. Ajitanand,
Y. Akiba,
H. Al-Bataineh,
J. Alexander,
M. Alfred,
K. Aoki,
N. Apadula,
L. Aphecetche,
J. Asai,
H. Asano,
E. T. Atomssa,
R. Averbeck,
T. C. Awes,
B. Azmoun,
V. Babintsev,
M. Bai,
G. Baksay,
L. Baksay,
A. Baldisseri
, et al. (510 additional authors not shown)
Abstract:
High-momentum two-particle correlations are a useful tool for studying jet-quenching effects in the quark-gluon plasma. Angular correlations between neutral-pion triggers and charged hadrons with transverse momenta in the range 4--12~GeV/$c$ and 0.5--7~GeV/$c$, respectively, have been measured by the PHENIX experiment in 2014 for Au$+$Au collisions at $\sqrt{s_{_{NN}}}=200$~GeV. Suppression is observed in the yield of high-momentum jet fragments opposite the trigger particle, which indicates jet suppression stemming from in-medium partonic energy loss, while enhancement is observed for low-momentum particles. The ratio and differences between the yield in Au$+$Au collisions and $p$$+$$p$ collisions, $I_{AA}$ and $Δ_{AA}$, as a function of the trigger-hadron azimuthal separation, $Δφ$, are measured for the first time at the Relativistic Heavy Ion Collider. These results better quantify how the yield of low-$p_T$ associated hadrons is enhanced at wide angle, which is crucial for studying energy loss as well as medium-response effects.
Submitted 12 June, 2024;
originally announced June 2024.
-
Efficient Topology-aware Data Augmentation for High-Degree Graph Neural Networks
Authors:
Yurui Lai,
Xiaoyang Lin,
Renchi Yang,
Hongtao Wang
Abstract:
In recent years, graph neural networks (GNNs) have emerged as a potent tool for learning on graph-structured data and won fruitful successes in varied fields. The majority of GNNs follow the message-passing paradigm, where representations of each node are learned by recursively aggregating features of its neighbors. However, this mechanism brings severe over-smoothing and efficiency issues over high-degree graphs (HDGs), wherein most nodes have dozens (or even hundreds) of neighbors, such as social networks, transaction graphs, power grids, etc. Additionally, such graphs usually encompass rich and complex structure semantics, which are hard to capture merely by feature aggregations in GNNs. Motivated by the above limitations, we propose TADA, an efficient and effective front-mounted data augmentation framework for GNNs on HDGs. Under the hood, TADA includes two key modules: (i) feature expansion with structure embeddings, and (ii) topology- and attribute-aware graph sparsification. The former obtains augmented node features and enhanced model capacity by encoding the graph structure into high-quality structure embeddings with our highly-efficient sketching method. Further, by exploiting task-relevant features extracted from graph structures and attributes, the second module enables the accurate identification and reduction of numerous redundant/noisy edges from the input graph, thereby alleviating over-smoothing and facilitating faster feature aggregations over HDGs. Empirically, TADA considerably improves the predictive performance of mainstream GNN models on 8 real homophilic/heterophilic HDGs in terms of node classification, while achieving efficient training and inference processes.
Submitted 17 June, 2024; v1 submitted 8 June, 2024;
originally announced June 2024.
-
LLM-Enhanced Bayesian Optimization for Efficient Analog Layout Constraint Generation
Authors:
Guojin Chen,
Keren Zhu,
Seunggeun Kim,
Hanqing Zhu,
Yao Lai,
Bei Yu,
David Z. Pan
Abstract:
Analog layout synthesis faces significant challenges due to its dependence on manual processes, considerable time requirements, and performance instability. Current Bayesian Optimization (BO)-based techniques for analog layout synthesis, despite their potential for automation, suffer from slow convergence and extensive data needs, limiting their practical application. This paper presents the \texttt{LLANA} framework, a novel approach that leverages Large Language Models (LLMs) to enhance BO by exploiting the few-shot learning abilities of LLMs for more efficient generation of analog design-dependent parameter constraints. Experimental results demonstrate that \texttt{LLANA} not only achieves performance comparable to state-of-the-art (SOTA) BO methods but also enables a more effective exploration of the analog circuit design space, thanks to LLM's superior contextual understanding and learning efficiency. The code is available at https://github.com/dekura/LLANA.
Submitted 19 June, 2024; v1 submitted 7 June, 2024;
originally announced June 2024.
-
Zero-Shot Video Editing through Adaptive Sliding Score Distillation
Authors:
Lianghan Zhu,
Yanqi Bao,
Jing Huo,
Jing Wu,
Yu-Kun Lai,
Wenbin Li,
Yang Gao
Abstract:
The burgeoning field of text-based video generation (T2V) has reignited significant interest in the research of controllable video editing. Although pre-trained T2V-based editing models have achieved efficient editing capabilities, current works are still plagued by two major challenges. Firstly, the inherent limitations of T2V models lead to content inconsistencies and motion discontinuities between frames. Secondly, the notorious issue of over-editing significantly disrupts areas that are intended to remain unaltered. To address these challenges, our work aims to explore a robust video-based editing paradigm based on score distillation. Specifically, we propose an Adaptive Sliding Score Distillation strategy, which not only enhances the stability of T2V supervision but also incorporates both global and local video guidance to mitigate the impact of generation errors. Additionally, we modify the self-attention layers during the editing process to further preserve the key features of the original video. Extensive experiments demonstrate that these strategies enable us to effectively address the aforementioned challenges, achieving superior editing performance compared to existing state-of-the-art methods.
Submitted 7 June, 2024;
originally announced June 2024.
-
Entanglement engineering of optomechanical systems by reinforcement learning
Authors:
Li-Li Ye,
Christian Arenz,
Joseph M. Lukens,
Ying-Cheng Lai
Abstract:
Entanglement is fundamental to quantum information science and technology, yet controlling and manipulating entanglement -- so-called entanglement engineering -- for arbitrary quantum systems remains a formidable challenge. There are two difficulties: the fragility of quantum entanglement and its experimental characterization. We develop a model-free deep reinforcement-learning (RL) approach to entanglement engineering, in which feedback control together with weak continuous measurement and partial state observation is exploited to generate and maintain desired entanglement. We employ quantum optomechanical systems with linear or nonlinear photon-phonon interactions to demonstrate the workings of our machine-learning-based entanglement engineering protocol. In particular, the RL agent sequentially interacts with one or multiple parallel quantum optomechanical environments, collects trajectories, and updates the policy to maximize the accumulated reward to create and stabilize quantum entanglement over an arbitrary amount of time. The machine-learning-based model-free control principle is applicable to the entanglement engineering of experimental quantum systems in general.
Submitted 2 July, 2024; v1 submitted 6 June, 2024;
originally announced June 2024.
-
Measurement of the energy dependence of the $e^+e^- \to B\bar{B}$, $B\bar{B}{}^*$, and $B^*\bar{B}{}^*$ cross sections at Belle~II
Authors:
Belle II Collaboration,
I. Adachi,
L. Aggarwal,
H. Ahmed,
H. Aihara,
N. Akopov,
A. Aloisio,
N. Althubiti,
N. Anh Ky,
D. M. Asner,
H. Atmacan,
T. Aushev,
V. Aushev,
M. Aversano,
R. Ayad,
V. Babu,
H. Bae,
S. Bahinipati,
P. Bambade,
Sw. Banerjee,
S. Bansal,
M. Barrett,
J. Baudot,
M. Bauer,
A. Baur
, et al. (444 additional authors not shown)
Abstract:
We report measurements of the $e^+e^- \to B\bar{B}$, $B\bar{B}{}^*$, and $B^*\bar{B}{}^*$ cross sections at four energies, 10653, 10701, 10746 and 10805 MeV, using data collected by the Belle~II experiment. We reconstruct one $B$ meson in a large number of hadronic final states and use its momentum to identify the production process. In the first $2-5$ MeV above $B^*\bar{B}{}^*$ threshold, the $e^+e^- \to B^*\bar{B}{}^*$ cross section increases rapidly. This may indicate the presence of a pole close to the threshold.
Submitted 29 May, 2024;
originally announced May 2024.
-
Training-free Editioning of Text-to-Image Models
Authors:
Jinqi Wang,
Yunfei Fu,
Zhangcan Ding,
Bailin Deng,
Yu-Kun Lai,
Yipeng Qin
Abstract:
Inspired by the software industry's practice of offering different editions or versions of a product tailored to specific user groups or use cases, we propose a novel task, namely, training-free editioning, for text-to-image models. Specifically, we aim to create variations of a base text-to-image model without retraining, enabling the model to cater to the diverse needs of different user groups or to offer distinct features and functionalities. To achieve this, we propose that different editions of a given text-to-image model can be formulated as concept subspaces in the latent space of its text encoder (e.g., CLIP). In such a concept subspace, all points satisfy a specific user need (e.g., generating images of a cat lying on the grass/ground/falling leaves). Technically, we apply Principal Component Analysis (PCA) to obtain the desired concept subspaces from representative text embeddings that correspond to a specific user need or requirement. Projecting the text embedding of a given prompt into these low-dimensional subspaces enables efficient model editioning without retraining. Intuitively, our proposed editioning paradigm enables a service provider to customize the base model into its "cat edition" (or other editions) that restricts image generation to cats, regardless of the user's prompt (e.g., dogs, people, etc.). This introduces a new dimension for product differentiation, targeted functionality, and pricing strategies, unlocking novel business models for text-to-image generators. Extensive experimental results demonstrate the validity of our approach and its potential to enable a wide range of customized text-to-image model editions across various domains and applications.
Submitted 27 May, 2024;
originally announced May 2024.
-
AnalogCoder: Analog Circuit Design via Training-Free Code Generation
Authors:
Yao Lai,
Sungyoung Lee,
Guojin Chen,
Souradip Poddar,
Mengkang Hu,
David Z. Pan,
Ping Luo
Abstract:
Analog circuit design is a significant task in modern chip technology, focusing on the selection of component types, connectivity, and parameters to ensure proper circuit functionality. Despite advances made by Large Language Models (LLMs) in digital circuit design, the complexity and scarcity of data in analog circuitry pose significant challenges. To mitigate these issues, we introduce AnalogCoder, the first training-free LLM agent for designing analog circuits through Python code generation. Firstly, AnalogCoder incorporates a feedback-enhanced flow with tailored domain-specific prompts, enabling the automated and self-correcting design of analog circuits with a high success rate. Secondly, it proposes a circuit tool library to archive successful designs as reusable modular sub-circuits, simplifying composite circuit creation. Thirdly, extensive experiments on a benchmark designed to cover a wide range of analog circuit tasks show that AnalogCoder outperforms other LLM-based methods. It has successfully designed 20 circuits, 5 more than standard GPT-4o. We believe AnalogCoder can significantly improve the labor-intensive chip design process, enabling non-experts to design analog circuits efficiently.
Submitted 30 May, 2024; v1 submitted 23 May, 2024;
originally announced May 2024.
-
Test of light-lepton universality in $τ$ decays with the Belle II experiment
Authors:
Belle II Collaboration,
I. Adachi,
K. Adamczyk,
L. Aggarwal,
H. Aihara,
N. Akopov,
A. Aloisio,
N. Anh Ky,
D. M. Asner,
H. Atmacan,
V. Aushev,
M. Aversano,
R. Ayad,
V. Babu,
H. Bae,
S. Bahinipati,
P. Bambade,
Sw. Banerjee,
S. Bansal,
M. Barrett,
J. Baudot,
A. Baur,
A. Beaubien,
F. Becherer,
J. Becker
, et al. (406 additional authors not shown)
Abstract:
We present a measurement of the ratio $R_μ= \mathcal{B}(τ^-\to μ^-\barν_μν_τ) / \mathcal{B}(τ^-\to e^-\barν_eν_τ)$ of branching fractions $\mathcal{B}$ of the $τ$ lepton decaying to muons or electrons using data collected with the Belle II detector at the SuperKEKB $e^+e^-$ collider. The sample has an integrated luminosity of 362 fb$^{-1}$ at a centre-of-mass energy of 10.58 GeV. Using an optimised event selection, a binned maximum likelihood fit is performed using the momentum spectra of the electron and muon candidates. The result, $R_μ= 0.9675 \pm 0.0007 \pm 0.0036$, where the first uncertainty is statistical and the second is systematic, is the most precise to date. It provides a stringent test of the light-lepton universality, translating to a ratio of the couplings of the muon and electron to the $W$ boson in $τ$ decays of $0.9974 \pm 0.0019$, in agreement with the standard model expectation of unity.
Submitted 23 May, 2024;
originally announced May 2024.
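The step from the measured $R_μ$ to the coupling ratio quoted in the light-lepton universality abstract above can be checked with a short calculation. This is a sketch under stated assumptions, not the collaboration's procedure: it uses the standard leading-order phase-space factor $f(x) = 1 - 8x + 8x^3 - x^4 - 12x^2\ln x$ with $x = m_\ell^2/m_τ^2$, approximate lepton masses that do not appear in the abstract, and it neglects radiative corrections, which largely cancel in the ratio.

```python
import math

# Assumed inputs (not given in the abstract): approximate lepton masses in MeV.
M_TAU, M_MU, M_E = 1776.86, 105.658, 0.511

def phase_space(x):
    """Leading-order phase-space factor for a leptonic tau decay, x = (m_l / m_tau)^2."""
    return 1 - 8 * x + 8 * x**3 - x**4 - 12 * x**2 * math.log(x)

def coupling_ratio(r_mu, stat, syst):
    """Return (|g_mu/g_e|, uncertainty) from R_mu with stat and syst errors."""
    total = math.hypot(stat, syst)  # add uncertainties in quadrature
    f_ratio = phase_space((M_MU / M_TAU) ** 2) / phase_space((M_E / M_TAU) ** 2)
    ratio = math.sqrt(r_mu / f_ratio)
    # Error propagation through the square root: d(ratio)/ratio = 0.5 * d(R)/R.
    err = 0.5 * ratio * total / r_mu
    return ratio, err

ratio, err = coupling_ratio(0.9675, 0.0007, 0.0036)
print(f"|g_mu/g_e| = {ratio:.4f} +/- {err:.4f}")
```

Under these assumptions the result agrees with the quoted $0.9974 \pm 0.0019$ to the displayed precision.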
-
Error Analysis of Three-Layer Neural Network Trained with PGD for Deep Ritz Method
Authors:
Yuling Jiao,
Yanming Lai,
Yang Wang
Abstract:
Machine learning is a rapidly advancing field with diverse applications across various domains. One prominent area of research is the utilization of deep learning techniques for solving partial differential equations (PDEs). In this work, we specifically focus on employing a three-layer tanh neural network within the framework of the deep Ritz method (DRM) to solve second-order elliptic equations with three different types of boundary conditions. We perform projected gradient descent (PGD) to train the three-layer network and we establish its global convergence. To the best of our knowledge, we are the first to provide a comprehensive error analysis of using overparameterized networks to solve PDE problems, as our analysis simultaneously includes estimates for approximation error, generalization error, and optimization error. We present error bounds in terms of the sample size $n$, and our work provides guidance on how to set the network depth, width, step size, and number of iterations for the projected gradient descent algorithm. Importantly, our assumptions in this work are classical and we do not require any additional assumptions on the solution of the equation. This ensures the broad applicability and generality of our results.
Submitted 19 May, 2024;
originally announced May 2024.
-
Search for Two-Body $B$ Meson Decays to $Λ^{0}$ and $Ω^{(*)0}_{c}$
Authors:
Belle Collaboration,
V. Savinov,
I. Adachi,
J. K. Ahn,
H. Aihara,
D. M. Asner,
H. Atmacan,
R. Ayad,
Sw. Banerjee,
J. Bennett,
M. Bessner,
V. Bhardwaj,
D. Biswas,
A. Bobrov,
D. Bodrov,
J. Borah,
M. Bračko,
P. Branchini,
T. E. Browder,
A. Budano,
D. Červenkov,
M. -C. Chang,
P. Chang,
B. G. Cheon,
K. Cho
, et al. (124 additional authors not shown)
Abstract:
We report the results of the first search for Standard Model and baryon-number-violating two-body decays of the neutral $B$ mesons to $Λ^{0}$ and $Ω^{(*)0}_c$ using 711~${\rm fb^{-1}}$ of data collected at the $Υ(4S)$ resonance with the Belle detector at the KEKB asymmetric-energy $e^+ e^-$ collider. We observe no evidence of signal from any such decays and set 95\% confidence-level upper limits on the products of $B^0$ and $\bar{B}^0$ branching fractions for these two-body decays with $\mathcal{B}(Ω_{c}^{0} \to π^+ Ω^-)$ in the range between 9.5~$\times 10^{-8}$ and 31.2~$\times 10^{-8}$.
Submitted 18 May, 2024;
originally announced May 2024.
-
Magnetic Relaxometry of Hemoglobin by Widefield Nitrogen-Vacancy Microscopy
Authors:
Suvechhya Lamichhane,
Evelyn Carreto Guevara,
Ilja Fescenko,
Sy-Hwang Liou,
Rebecca Y. Lai,
Abdelghani Laraoui
Abstract:
Hemoglobin (Hb) is a multifaceted protein, classified as a metalloprotein, chromoprotein, and globulin. It incorporates iron, which plays a crucial role in transporting oxygen within red blood cells. Hb functions by carrying oxygen from the respiratory organs to diverse tissues in the body, where it releases oxygen to fuel aerobic respiration, thus supporting the organism's metabolic processes. Deviations in Hb concentration in the blood have been linked to various medical conditions, including anemia and other blood disorders. Here, we use optically detected magnetic relaxometry of paramagnetic iron spins in Hb drop-cast onto nanostructured diamond doped with shallow (~ 5.5 nm) high density nitrogen vacancy (NV) spin qubits. We modify the Hb concentration in the range of 6 x 10^6 to 1.8 x 10^7 adsorbed Fe+3 spins per um^2 and observe an increase of the NV relaxation rate G1 (= 1/T1, where T1 is the NV spin lattice relaxation time) up to 2 x 10^3 s^-1. NV magnetic relaxometry of Hb in phosphate-buffered saline solution shows a similar effect, with an increase of G1 to 6.7 x 10^3 s^-1 upon increasing the Hb concentration to 100 uM. The increase of NV G1 is explained by the increased spin noise coming from the Fe+3 spins present in Hb proteins. This study presents an additional usage of NV quantum sensors to detect paramagnetic centers in biomolecules.
Submitted 13 May, 2024;
originally announced May 2024.
-
Lai Loss: A Novel Loss for Gradient Control
Authors:
YuFei Lai
Abstract:
In the field of machine learning, traditional regularization methods tend to directly add regularization terms to the loss function. This paper introduces the "Lai loss", a novel loss design that integrates the regularization terms (specifically, gradients) into the traditional loss function through straightforward geometric concepts. This design penalizes the gradients with the loss itself, allowing for control of the gradients while ensuring maximum accuracy. With this loss, we can effectively control the model's smoothness and sensitivity, potentially offering the dual benefits of improving the model's generalization performance and enhancing its noise resistance on specific features. Additionally, we proposed a training method that successfully addresses the challenges in practical applications. We conducted preliminary experiments using publicly available datasets from Kaggle, demonstrating that the design of Lai loss can control the model's smoothness and sensitivity while maintaining stable model performance.
Submitted 23 May, 2024; v1 submitted 13 May, 2024;
originally announced May 2024.
-
Scalable and Effective Arithmetic Tree Generation for Adder and Multiplier Designs
Authors:
Yao Lai,
**xin Liu,
David Z. Pan,
** Luo
Abstract:
Across a wide range of hardware scenarios, the computational efficiency and physical size of the arithmetic units significantly influence the speed and footprint of the overall hardware system. Nevertheless, the effectiveness of prior arithmetic design techniques proves inadequate, as it does not sufficiently optimize speed and area, resulting in a reduced processing rate and larger module size. To boost the arithmetic performance, in this work, we focus on the two most common and fundamental arithmetic modules: adders and multipliers. We cast the design tasks as single-player tree generation games, leveraging reinforcement learning techniques to optimize their arithmetic tree structures. Such a tree generation formulation allows us to efficiently navigate the vast search space and discover superior arithmetic designs that improve computational efficiency and hardware size within just a few hours. For adders, our approach discovers designs of 128-bit adders that achieve Pareto optimality in theoretical metrics. Compared with the state-of-the-art PrefixRL, our method decreases computational delay and hardware size by up to 26% and 30%, respectively. For multipliers, when compared to RL-MUL, our approach increases speed and reduces size by as much as 49% and 45%. Moreover, the inherent flexibility and scalability of our method enable us to deploy our designs into cutting-edge technologies, as we show that they can be seamlessly integrated into 7nm technology. We believe our work will offer valuable insights into hardware design, further accelerating speed and reducing size through the refined search space and our tree generation methodologies. See our introduction video at https://bit.ly/ArithmeticTree. Codes are released at https://github.com/laiyao1/ArithmeticTree.
Submitted 10 May, 2024;
originally announced May 2024.
-
SketchDream: Sketch-based Text-to-3D Generation and Editing
Authors:
Feng-Lin Liu,
Hongbo Fu,
Yu-Kun Lai,
Lin Gao
Abstract:
Existing text-based 3D generation methods generate attractive results but lack detailed geometry control. Sketches, known for their conciseness and expressiveness, have contributed to intuitive 3D modeling but are confined to producing texture-less mesh models within predefined categories. Integrating sketch and text simultaneously for 3D generation promises enhanced control over geometry and appearance but faces challenges from 2D-to-3D translation ambiguity and multi-modal condition integration. Moreover, further editing of 3D models in arbitrary views will give users more freedom to customize their models. However, it is difficult to achieve high generation quality, preserve unedited regions, and manage proper interactions between shape components. To solve the above issues, we propose a text-driven 3D content generation and editing method, SketchDream, which supports NeRF generation from given hand-drawn sketches and achieves free-view sketch-based local editing. To tackle the 2D-to-3D ambiguity challenge, we introduce a sketch-based multi-view image generation diffusion model, which leverages depth guidance to establish spatial correspondence. A 3D ControlNet with a 3D attention module is utilized to control multi-view images and ensure their 3D consistency. To support local editing, we further propose a coarse-to-fine editing approach: the coarse phase analyzes component interactions and provides 3D masks to label edited regions, while the fine stage generates realistic results with refined details by local enhancement. Extensive experiments validate that our method generates higher-quality results compared with a combination of 2D ControlNet and image-to-3D generation techniques and achieves detailed control compared with existing diffusion-based 3D editing approaches.
Submitted 14 May, 2024; v1 submitted 10 May, 2024;
originally announced May 2024.
-
Deep-learning design of graphene metasurfaces for quantum control and Dirac electron holography
Authors:
Chen-Di Han,
Li-Li Ye,
Zin Lin,
Vassilios Kovanis,
Ying-Cheng Lai
Abstract:
Metasurfaces are sub-wavelength patterned layers for controlling waves in physical systems. In optics, metasurfaces are created by materials with different dielectric constants and are capable of unconventional functionalities. We develop a deep-learning framework for Dirac-material metasurface design for controlling electronic waves. The metasurface is a configuration of circular graphene quantum dots, each created by an electric potential. Employing deep convolutional neural networks, we show that the original scattering wave can be reconstructed with fidelity over 95$\%$, suggesting the feasibility of Dirac electron holography. Additional applications, such as plane wave generation and the design of broadband and multi-functionality graphene metasurface systems, are illustrated.
Submitted 1 May, 2024;
originally announced May 2024.
-
Evolving R2 to R2+: Optimal, Delayed Line-of-sight Vector-based Path Planning
Authors:
Yan Kai Lai,
Prahlad Vadakkepat,
Cheng Xiang
Abstract:
A vector-based any-angle path planner, R2, is evolved into R2+ in this paper. By delaying line-of-sight, R2 and R2+ search times are largely unaffected by the distance between the start and goal points, but are exponential in the worst case with respect to the number of collisions during searches. To improve search times, additional discarding conditions in the overlap rule are introduced in R2+. In addition, R2+ resolves interminable chases in R2 by replacing ad hoc points with limited occupied-sector traces from target nodes, and simplifies R2 by employing new abstract structures and ensuring target progression during a trace. R2+ preserves the speed of R2 when paths are expected to detour around few obstacles, and searches significantly faster than R2 in maps with many disjoint obstacles.
Submitted 8 May, 2024;
originally announced May 2024.
-
Automated Deep Learning Optimization via DSL-Based Source Code Transformation
Authors:
Ruixin Wang,
Minghai Lu,
Cody Hao Yu,
Yi-Hsiang Lai,
Tianyi Zhang
Abstract:
As deep learning models become increasingly bigger and more complex, it is critical to improve model training and inference efficiency. Though a variety of highly optimized libraries and packages (known as DL kernels) have been developed, it is tedious and time-consuming to figure out which kernel to use, where to use, and how to use them correctly. To address this challenge, we propose an Automated Deep learning OPTimization approach called Adopter. We design a Domain-Specific Language (DSL) to represent DL model architectures and leverage this DSL to specify model transformation rules required to integrate a DL kernel into a model. Given the source code of a DL model and the transformation rules for a set of kernels, Adopter first performs inter-procedural analysis to identify and express the model architecture in our DSL. Then, Adopter performs scope analysis and sub-sequence matching to identify locations in the model architecture where the transformation rules can be applied. Finally, Adopter proposes a synthesis-based code transformation method to apply the transformation rule. We curated a benchmark with 199 models from Hugging Face and a diverse set of DL kernels. We found that, compared to a state-of-the-art automated code transformation technique, Adopter helps improve the precision and recall by 3% and 56%, respectively. An in-depth analysis of 9 models revealed that on average, Adopter improved the training speed by 22.7% while decreasing the GPU memory usage by 10.5%.
Submitted 5 May, 2024;
originally announced May 2024.
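The matching-and-rewriting step described in the Adopter abstract above can be sketched on a flattened list of layer names. This is only an illustration under assumptions: the layer names, the fused-kernel name `FusedDropoutAddLayerNorm`, and the functions `find_matches` and `apply_rule` are hypothetical, and the sketch matches contiguous runs of layers for simplicity, which may differ from the sub-sequence matching and DSL actually used by the authors.

```python
def find_matches(architecture, pattern):
    """Return start indices where `pattern` occurs as a contiguous run of layers."""
    if not pattern:
        raise ValueError("pattern must contain at least one layer")
    m = len(pattern)
    return [i for i in range(len(architecture) - m + 1)
            if architecture[i:i + m] == pattern]

def apply_rule(architecture, pattern, replacement):
    """Rewrite every non-overlapping occurrence of `pattern` with `replacement`."""
    if not pattern:
        raise ValueError("pattern must contain at least one layer")
    out, i, m = [], 0, len(pattern)
    while i < len(architecture):
        if architecture[i:i + m] == pattern:
            out.extend(replacement)  # substitute the optimized kernel
            i += m
        else:
            out.append(architecture[i])
            i += 1
    return out

# Hypothetical model architecture and transformation rule.
architecture = ["Embedding", "Linear", "Dropout", "Add", "LayerNorm", "Linear"]
rule_pattern = ["Dropout", "Add", "LayerNorm"]
rule_replacement = ["FusedDropoutAddLayerNorm"]

locations = find_matches(architecture, rule_pattern)
rewritten = apply_rule(architecture, rule_pattern, rule_replacement)
```

Separating location finding from rewriting mirrors the abstract's ordering, where applicable locations are identified before the code transformation is synthesized.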
-
Agent Hospital: A Simulacrum of Hospital with Evolvable Medical Agents
Authors:
Junkai Li,
Siyu Wang,
Meng Zhang,
Weitao Li,
Yunghwei Lai,
Xinhui Kang,
Weizhi Ma,
Yang Liu
Abstract:
In this paper, we introduce a simulacrum of hospital called Agent Hospital that simulates the entire process of treating illness. All patients, nurses, and doctors are autonomous agents powered by large language models (LLMs). Our central goal is to enable a doctor agent to learn how to treat illness within the simulacrum. To do so, we propose a method called MedAgent-Zero. As the simulacrum can simulate disease onset and progression based on knowledge bases and LLMs, doctor agents can keep accumulating experience from both successful and unsuccessful cases. Simulation experiments show that the treatment performance of doctor agents consistently improves on various tasks. More interestingly, the knowledge the doctor agents have acquired in Agent Hospital is applicable to real-world medicare benchmarks. After treating around ten thousand patients (real-world doctors may take over two years), the evolved doctor agent achieves a state-of-the-art accuracy of 93.06% on a subset of the MedQA dataset that covers major respiratory diseases. This work paves the way for advancing the applications of LLM-powered agent techniques in medical scenarios.
Submitted 5 May, 2024;
originally announced May 2024.
-
In-situ Doppler-free spectroscopy and laser frequency stabilization based on time-division multiplexing differential saturated absorption
Authors:
Yuxin Wang,
Zhiyue Zheng,
Qiuxin Zhang,
Yonglang Lai,
Zongqi Ge,
Tianyi Wang,
Liangyu Ding,
Smirnov Vasilii,
Shuaining Zhang,
Wei Zhang,
Xiang Zhang
Abstract:
We introduce a novel time-division multiplexing differential saturated absorption spectroscopy (TDMDSAS) approach, providing superior accuracy and stability in Doppler-free spectroscopy. By distinguishing probe and reference fields in the temporal domain, TDMDSAS efficiently suppresses Doppler broadening and common-mode optical noise. We utilized this technology to determine the absolute frequency of diverse neutral Yb isotopes across their $6s^2\ ^{1}S_0\to 6s6p\ ^{1}P_1$ transitions. Furthermore, the first-ever observation of in-situ Doppler-free Zeeman sub-level spectra was accomplished, enabling the determination of magnetic field gradients. We stabilized a UV diode laser at 399 nm using an error signal derived from the spectral first-derivative demodulated signal of $^{174}\mathrm{Yb}$. This technique yielded a frequency stability of up to 15 kHz with a 40 s averaging time and a standard deviation of around 180 kHz over a half-hour period. Given its low cost, simplicity, and scalability, TDMDSAS holds excellent potential in metrology and quantum applications.
Submitted 23 April, 2024;
originally announced April 2024.
-
LTOS: Layout-controllable Text-Object Synthesis via Adaptive Cross-attention Fusions
Authors:
Xiaoran Zhao,
Tianhao Wu,
Yu Lai,
Zhiliang Tian,
Zhen Huang,
Yahui Liu,
Zejiang He,
Dongsheng Li
Abstract:
Controllable text-to-image generation synthesizes visual text and objects in images with certain conditions, which are frequently applied to emoji and poster generation. Visual text rendering and layout-to-image generation tasks have been popular in controllable text-to-image generation. However, each of these tasks typically focuses on single modality generation or rendering, leaving yet-to-be-bridged gaps between the approaches correspondingly designed for each of the tasks. In this paper, we combine text rendering and layout-to-image generation tasks into a single task: layout-controllable text-object synthesis (LTOS) task, aiming at synthesizing images with object and visual text based on predefined object layout and text contents. As compliant datasets are not readily available for our LTOS task, we construct a layout-aware text-object synthesis dataset, containing elaborate well-aligned labels of visual text and object information. Based on the dataset, we propose a layout-controllable text-object adaptive fusion (TOF) framework, which generates images with clear, legible visual text and plausible objects. We construct a visual-text rendering module to synthesize text and employ an object-layout control module to generate objects while integrating the two modules to harmoniously generate and integrate text content and objects in images. To better the image-text integration, we propose a self-adaptive cross-attention fusion module that helps the image generation to attend more to important text information. Within such a fusion module, we use a self-adaptive learnable factor to learn to flexibly control the influence of cross-attention outputs on image generation. Experimental results show that our method outperforms the state-of-the-art in LTOS, text rendering, and layout-to-image tasks, enabling harmonious visual text rendering and object generation.
Submitted 21 April, 2024;
originally announced April 2024.
-
Determination of the CKM angle $φ_{3}$ from a combination of Belle and Belle II results
Authors:
Belle,
Belle II Collaborations,
I. Adachi,
L. Aggarwal,
H. Aihara,
N. Akopov,
A. Aloisio,
S. Al Said,
N. Anh Ky,
D. M. Asner,
H. Atmacan,
V. Aushev,
M. Aversano,
R. Ayad,
V. Babu,
H. Bae,
S. Bahinipati,
P. Bambade,
Sw. Banerjee,
S. Bansal,
M. Barrett,
J. Baudot,
A. Baur,
A. Beaubien
, et al. (377 additional authors not shown)
Abstract:
We report a determination of the CKM angle $φ_{3}$, also known as $γ$, from a combination of measurements using samples of up to 711~fb$^{-1}$ from the Belle experiment and up to 362~fb$^{-1}$ from the Belle II experiment. We combine results from analyses of $B^+\to DK^+, B^+\to Dπ^+$, and $B^+ \to D^{*}K^+$ decays, where $D$ is an admixture of $D^0$ and $\overline{D}{}^{0}$ mesons, in a likelihood fit to obtain $φ_{3} = (78.6^{+7.2}_{-7.3})^{\circ}$. We also briefly discuss the interpretation of this result.
Submitted 19 April, 2024;
originally announced April 2024.
-
DeferredGS: Decoupled and Editable Gaussian Splatting with Deferred Shading
Authors:
Tong Wu,
Jia-Mu Sun,
Yu-Kun Lai,
Yuewen Ma,
Leif Kobbelt,
Lin Gao
Abstract:
Reconstructing and editing 3D objects and scenes both play crucial roles in computer graphics and computer vision. Neural radiance fields (NeRFs) can achieve realistic reconstruction and editing results but suffer from inefficiency in rendering. Gaussian splatting significantly accelerates rendering by rasterizing Gaussian ellipsoids. However, Gaussian splatting utilizes a single Spherical Harmonic (SH) function to model both texture and lighting, limiting independent editing capabilities of these components. Recently, attempts have been made to decouple texture and lighting with the Gaussian splatting representation but may fail to produce plausible geometry and decomposition results on reflective scenes. Additionally, the forward shading technique they employ introduces noticeable blending artifacts during relighting, as the geometry attributes of Gaussians are optimized under the original illumination and may not be suitable for novel lighting conditions. To address these issues, we introduce DeferredGS, a method for decoupling and editing the Gaussian splatting representation using deferred shading. To achieve successful decoupling, we model the illumination with a learnable environment map and define additional attributes such as texture parameters and normal direction on Gaussians, where the normal is distilled from a jointly trained signed distance function. More importantly, we apply deferred shading, resulting in more realistic relighting effects compared to previous methods. Both qualitative and quantitative experiments demonstrate the superior performance of DeferredGS in novel view synthesis and editing tasks.
Submitted 6 May, 2024; v1 submitted 14 April, 2024;
originally announced April 2024.
-
Search for rare $b \to d\ell^+\ell^-$ transitions at Belle
Authors:
Belle,
Belle II Collaborations,
I. Adachi,
L. Aggarwal,
H. Aihara,
N. Akopov,
A. Aloisio,
S. Al Said,
D. M. Asner,
H. Atmacan,
V. Aushev,
M. Aversano,
R. Ayad,
V. Babu,
H. Bae,
S. Bahinipati,
P. Bambade,
Sw. Banerjee,
S. Bansal,
M. Barrett,
J. Baudot,
A. Beaubien,
F. Becherer,
J. Becker
, et al. (371 additional authors not shown)
Abstract:
We present the results of a search for the $b \to d\ell^+\ell^-$ flavor-changing neutral-current rare decays $B^{+, 0} \to (η, ω, π^{+,0}, ρ^{+, 0}) e^+e^-$ and $B^{+, 0} \to (η, ω, π^{0}, ρ^{+}) μ^+μ^-$ using a $711$ fb$^{-1}$ data sample that contains $772 \times 10^{6}$ $B\overline{B}$ events. The data were collected at the $Υ(4S)$ resonance with the Belle detector at the KEKB asymmetric-energy $e^+e^-$ collider. We find no evidence for signal and set upper limits on branching fractions at the $90\%$ confidence level in the range $(3.8 - 47) \times 10^{-8}$ depending on the decay channel. The obtained limits are the world's best results. This is the first search for the channels $B^{+, 0} \to (ω, ρ^{+,0}) e^+e^-$ and $B^{+, 0} \to (ω, ρ^{+})μ^+μ^-$.
Submitted 11 April, 2024;
originally announced April 2024.
-
An analysis of parameter compression and full-modeling techniques with Velocileptors for DESI 2024 and beyond
Authors:
M. Maus,
S. Chen,
M. White,
J. Aguilar,
S. Ahlen,
A. Aviles,
S. Brieden,
D. Brooks,
T. Claybaugh,
S. Cole,
A. de la Macorra,
Arjun Dey,
P. Doel,
S. Ferraro,
N. Findlay,
J. E. Forero-Romero,
E. Gaztañaga,
H. Gil-Marín,
S. Gontcho A Gontcho,
C. Hahn,
K. Honscheid,
C. Howlett,
M. Ishak,
S. Juneau,
A. Kremin
, et al. (30 additional authors not shown)
Abstract:
In anticipation of forthcoming data releases of current and future spectroscopic surveys, we present the validation tests and analysis of systematic effects within the \texttt{velocileptors} modeling pipeline when fitting mock data from the \texttt{AbacusSummit} N-body simulations. We compare the constraints obtained from parameter compression methods to the direct fitting (Full-Modeling) approaches of modeling the galaxy power spectra, and show that the ShapeFit extension to the traditional template method is consistent with the Full-Modeling method within the standard $Λ$CDM parameter space. We show the dependence on scale cuts when fitting the different redshift bins using the ShapeFit and Full-Modeling methods. We test the ability to jointly fit data from multiple redshift bins as well as joint analysis of the pre-reconstruction power spectrum with the post-reconstruction BAO correlation function signal. We further demonstrate the behavior of the model when opening up the parameter space beyond $Λ$CDM and also when combining likelihoods with external datasets, namely the Planck CMB priors. Finally, we describe different parametrization options for the galaxy bias, counterterm, and stochastic parameters, and employ the halo model in order to physically motivate suitable priors that are necessary to ensure the stability of the perturbation theory.
Submitted 17 April, 2024; v1 submitted 10 April, 2024;
originally announced April 2024.
-
A comparison between Shapefit compression and Full-Modelling method with PyBird for DESI 2024 and beyond
Authors:
Y. Lai,
C. Howlett,
M. Maus,
H. Gil-Marín,
H. E. Noriega,
S. Ramírez-Solano,
P. Zarrouk,
J. Aguilar,
S. Ahlen,
O. Alves,
A. Aviles,
D. Brooks,
S. Chen,
T. Claybaugh,
T. M. Davis,
K. Dawson,
A. de la Macorra,
P. Doel,
J. E. Forero-Romero,
E. Gaztañaga,
S. Gontcho A Gontcho,
K. Honscheid,
S. Juneau,
M. Landriau,
M. Manera
, et al. (18 additional authors not shown)
Abstract:
DESI aims to provide one of the tightest constraints on cosmological parameters by analysing the clustering of more than thirty million galaxies. However, obtaining such constraints requires special care in validating the methodology and efforts to reduce the computational time required through data compression and emulation techniques. In this work, we perform a rigorous validation of the PyBird power spectrum modelling code with both a traditional emulated Full-Modelling approach and the model-independent ShapeFit compression approach. By using cubic box simulations that accurately reproduce the clustering and precision of the DESI survey, we find that the cosmological constraints from ShapeFit and Full-Modelling are consistent with each other at the $\sim0.3σ$ level for the $Λ$CDM model. Both ShapeFit and Full-Modelling are also consistent with the true $Λ$CDM simulation cosmology down to a scale of $k_{\mathrm{max}} = 0.20 h\mathrm{Mpc}^{-1}$ even after including the hexadecapole. For extended models such as the wCDM and the oCDM models, we find that including the hexadecapole can significantly improve the constraints and reduce the modelling errors with the same $k_{\mathrm{max}}$. While the discrepancies between the constraints from ShapeFit and Full-Modelling are more significant for these models than for $Λ$CDM, they remain consistent within $0.7σ$. Lastly, we also show that the constraints on cosmological parameters with the correlation function evaluated from PyBird down to $s_{\mathrm{min}} = 30 h^{-1} \mathrm{Mpc}$ are unbiased and consistent with the constraints from the power spectrum.
Submitted 26 June, 2024; v1 submitted 10 April, 2024;
originally announced April 2024.
-
A comparison of effective field theory models of redshift space galaxy power spectra for DESI 2024 and future surveys
Authors:
M. Maus,
Y. Lai,
H. E. Noriega,
S. Ramirez-Solano,
A. Aviles,
S. Chen,
S. Fromenteau,
H. Gil-Marín,
C. Howlett,
M. Vargas-Magaña,
M. White,
P. Zarrouk,
J. Aguilar,
S. Ahlen,
O. Alves,
S. Brieden,
D. Brooks,
E. Burtin,
T. Claybaugh,
S. Cole,
K. Dawson,
M. Icaza-Lizaola,
A. de la Macorra,
A. de Mattia,
P. Doel
, et al. (32 additional authors not shown)
Abstract:
In preparation for the next generation of galaxy redshift surveys, and in particular the year-one data release from the Dark Energy Spectroscopic Instrument (DESI), we investigate the consistency of a variety of effective field theory models that describe the galaxy-galaxy power spectra in redshift space into the quasi-linear regime using 1-loop perturbation theory. These models are employed in the pipelines \texttt{velocileptors}, \texttt{PyBird}, and \texttt{Folps$ν$}. While these models have been validated independently, a detailed comparison with consistent choices has not been attempted. After briefly discussing the theoretical differences between the models we describe how to provide a more apples-to-apples comparison between them. We present the results of fitting mock spectra from the \texttt{AbacusSummit} suite of N-body simulations provided in three redshift bins to mimic the types of dark time tracers targeted by the DESI survey. We show that the theories behave similarly and give consistent constraints in both the forward-modeling and ShapeFit compressed fitting approaches. We additionally generate (noiseless) synthetic data from each pipeline to be fit by the others, varying the scale cuts in order to show that the models agree within the range of scales for which we expect 1-loop perturbation theory to be applicable. This work lays the foundation of Full-Shape analysis with DESI Y1 galaxy samples where in the tests we performed, we found no systematic error associated with the modeling of the galaxy redshift space power spectrum for this volume.
Submitted 6 June, 2024; v1 submitted 10 April, 2024;
originally announced April 2024.
-
Comparing Compressed and Full-modeling Analyses with FOLPS: Implications for DESI 2024 and beyond
Authors:
H. E. Noriega,
A. Aviles,
H. Gil-Marín,
S. Ramirez-Solano,
S. Fromenteau,
M. Vargas-Magaña,
J. Aguilar,
S. Ahlen,
O. Alves,
S. Brieden,
D. Brooks,
J. L. Cervantes-Cota,
S. Chen,
T. Claybaugh,
S. Cole,
K. Dawson,
A. de la Macorra,
A. de Mattia,
P. Doel,
N. Findlay,
J. E. Forero-Romero,
E. Gaztañaga,
S. Gontcho A Gontcho,
K. Honscheid,
J. Hou
, et al. (29 additional authors not shown)
Abstract:
The Dark Energy Spectroscopic Instrument (DESI) will provide unprecedented information about the large-scale structure of our Universe. In this work, we study the robustness of the theoretical modelling of the power spectrum of FOLPS, a novel effective field theory-based package for evaluating the redshift space power spectrum in the presence of massive neutrinos. We perform this validation by fitting the AbacusSummit high-accuracy $N$-body simulations for Luminous Red Galaxies, Emission Line Galaxies and Quasar tracers, calibrated to describe DESI observations. We quantify the potential systematic error budget of FOLPS, finding that the modelling errors are fully sub-dominant for the DESI statistical precision within the studied range of scales. Additionally, we study two complementary approaches to fit and analyse the power spectrum data, one based on direct Full-Modelling fits and the other on the ShapeFit compression variables, both resulting in very good agreement in precision and accuracy. In each of these approaches, we study a set of potential systematic errors induced by several assumptions, such as the choice of template cosmology, the effect of prior choice in the nuisance parameters of the model, or the range of scales used in the analysis. Furthermore, we show how opening up the parameter space beyond the vanilla $Λ$CDM model affects the DESI observables. These studies include the addition of massive neutrinos, spatial curvature, and dark energy equation of state. We also examine how relaxing the usual Cosmic Microwave Background and Big Bang Nucleosynthesis priors on the primordial spectral index and the baryonic matter abundance, respectively, impacts the inference on the rest of the parameters of interest. This paper paves the way towards performing a robust and reliable analysis of the shape of the power spectrum of DESI galaxy and quasar clustering using FOLPS.
Submitted 13 April, 2024; v1 submitted 10 April, 2024;
originally announced April 2024.
-
Full Modeling and Parameter Compression Methods in configuration space for DESI 2024 and beyond
Authors:
S. Ramirez-Solano,
M. Icaza-Lizaola,
H. E. Noriega,
M. Vargas-Magaña,
S. Fromenteau,
A. Aviles,
F. Rodriguez-Martinez,
J. Aguilar,
S. Ahlen,
O. Alves,
S. Brieden,
D. Brooks,
T. Claybaugh,
S. Cole,
A. de la Macorra,
Arjun Dey,
B. Dey,
P. Doel,
K. Fanning,
J. E. Forero-Romero,
E. Gaztañaga,
H. Gil-Marín,
S. Gontcho A Gontcho,
K. Honscheid,
C. Howlett
, et al. (27 additional authors not shown)
Abstract:
In the contemporary era of high-precision spectroscopic surveys, led by projects like DESI, there is an increasing demand for optimizing the extraction of cosmological information from clustering data. This work conducts a thorough comparison of various methodologies for modeling the full shape of the two-point statistics in configuration space. We investigate the performance of both direct fits (Full-Modeling) and the parameter compression approaches (ShapeFit and Standard). We utilize the ABACUS-SUMMIT simulations, tailored to exceed DESI's precision requirements. Particularly, we fit the two-point statistics of three distinct tracers (LRG, ELG, and QSO), by employing a Gaussian Streaming Model in tandem with Convolution Lagrangian Perturbation Theory and Effective Field Theory. We explore methodological setup variations, including the range of scales, the set of galaxy bias parameters, the inclusion of the hexadecapole, as well as model extensions encompassing varying $n_s$ and allowing for $w_0w_a$CDM dark energy model. Throughout these varied explorations, while precision levels fluctuate and certain configurations exhibit tighter parameter constraints, our pipeline consistently recovers the parameter values of the mocks within $1σ$ in all cases for a 1-year DESI volume. Additionally, we compare the performance of configuration space analysis with its Fourier space counterpart using three models: PyBird, FOLPS and velocileptors, presented in companion papers. We find good agreement with the results from all these models.
Submitted 16 April, 2024; v1 submitted 10 April, 2024;
originally announced April 2024.
-
Flexible Fairness Learning via Inverse Conditional Permutation
Authors:
Yuheng Lai,
Leying Guan
Abstract:
Equalized odds, as a popular notion of algorithmic fairness, aims to ensure that sensitive variables, such as race and gender, do not unfairly influence the algorithm prediction when conditioning on the true outcome. Despite rapid advancements, most of the current research focuses on the violation of equalized odds caused by one sensitive attribute, leaving the challenge of simultaneously accounting for multiple attributes under-addressed. We address this gap by introducing a fairness learning approach that integrates adversarial learning with a novel inverse conditional permutation. This approach effectively and flexibly handles multiple sensitive attributes, potentially of mixed data types. The efficacy and flexibility of our method are demonstrated through both simulation studies and empirical analysis of real-world datasets.
Submitted 9 April, 2024; v1 submitted 8 April, 2024;
originally announced April 2024.
-
DESI 2024 VI: Cosmological Constraints from the Measurements of Baryon Acoustic Oscillations
Authors:
DESI Collaboration,
A. G. Adame,
J. Aguilar,
S. Ahlen,
S. Alam,
D. M. Alexander,
M. Alvarez,
O. Alves,
A. Anand,
U. Andrade,
E. Armengaud,
S. Avila,
A. Aviles,
H. Awan,
B. Bahr-Kalus,
S. Bailey,
C. Baltay,
A. Bault,
J. Behera,
S. BenZvi,
A. Bera,
F. Beutler,
D. Bianchi,
C. Blake,
R. Blum
, et al. (178 additional authors not shown)
Abstract:
We present cosmological results from the measurement of baryon acoustic oscillations (BAO) in galaxy, quasar and Lyman-$α$ forest tracers from the first year of observations from the Dark Energy Spectroscopic Instrument (DESI), to be released in the DESI Data Release 1. DESI BAO provide robust measurements of the transverse comoving distance and Hubble rate, or their combination, relative to the sound horizon, in seven redshift bins from over 6 million extragalactic objects in the redshift range $0.1<z<4.2$. DESI BAO data alone are consistent with the standard flat $Λ$CDM cosmological model with a matter density $Ω_\mathrm{m}=0.295\pm 0.015$. Paired with a BBN prior and the robustly measured acoustic angular scale from the CMB, DESI requires $H_0=(68.52\pm0.62)$ km/s/Mpc. In conjunction with CMB anisotropies from Planck and CMB lensing data from Planck and ACT, we find $Ω_\mathrm{m}=0.307\pm 0.005$ and $H_0=(67.97\pm0.38)$ km/s/Mpc. Extending the baseline model with a constant dark energy equation of state parameter $w$, DESI BAO alone require $w=-0.99^{+0.15}_{-0.13}$. In models with a time-varying dark energy equation of state parametrized by $w_0$ and $w_a$, combinations of DESI with CMB or with SN~Ia individually prefer $w_0>-1$ and $w_a<0$. This preference is 2.6$σ$ for the DESI+CMB combination, and persists or grows when SN~Ia are added in, giving results discrepant with the $Λ$CDM model at the $2.5σ$, $3.5σ$ or $3.9σ$ levels for the addition of Pantheon+, Union3, or DES-SN5YR datasets respectively. For the flat $Λ$CDM model with the sum of neutrino mass $\sum m_ν$ free, combining the DESI and CMB data yields an upper limit $\sum m_ν< 0.072$ $(0.113)$ eV at 95% confidence for a $\sum m_ν>0$ $(\sum m_ν>0.059)$ eV prior. These neutrino-mass constraints are substantially relaxed in models beyond $Λ$CDM. [Abridged.]
Submitted 24 April, 2024; v1 submitted 3 April, 2024;
originally announced April 2024.
-
DESI 2024 IV: Baryon Acoustic Oscillations from the Lyman Alpha Forest
Authors:
DESI Collaboration,
A. G. Adame,
J. Aguilar,
S. Ahlen,
S. Alam,
D. M. Alexander,
M. Alvarez,
O. Alves,
A. Anand,
U. Andrade,
E. Armengaud,
S. Avila,
A. Aviles,
H. Awan,
S. Bailey,
C. Baltay,
A. Bault,
J. Bautista,
J. Behera,
S. BenZvi,
F. Beutler,
D. Bianchi,
C. Blake,
R. Blum,
S. Brieden
, et al. (174 additional authors not shown)
Abstract:
We present the measurement of Baryon Acoustic Oscillations (BAO) from the Lyman-$α$ (Ly$α$) forest of high-redshift quasars with the first-year dataset of the Dark Energy Spectroscopic Instrument (DESI). Our analysis uses over $420\,000$ Ly$α$ forest spectra and their correlation with the spatial distribution of more than $700\,000$ quasars. An essential facet of this work is the development of a new analysis methodology on a blinded dataset. We conducted rigorous tests using synthetic data to ensure the reliability of our methodology and findings before unblinding. Additionally, we conducted multiple data splits to assess the consistency of the results and scrutinized various analysis approaches to confirm their robustness. For a given value of the sound horizon ($r_d$), we measure the expansion at $z_{\rm eff}=2.33$ with 2\% precision, $H(z_{\rm eff}) = (239.2 \pm 4.8) (147.09~{\rm Mpc} /r_d)$ km/s/Mpc. Similarly, we present a 2.4\% measurement of the transverse comoving distance to the same redshift, $D_M(z_{\rm eff}) = (5.84 \pm 0.14) (r_d/147.09~{\rm Mpc})$ Gpc. Together with other DESI BAO measurements at lower redshifts, these results are used in a companion paper to constrain cosmological parameters.
Submitted 12 April, 2024; v1 submitted 3 April, 2024;
originally announced April 2024.
-
DESI 2024 III: Baryon Acoustic Oscillations from Galaxies and Quasars
Authors:
DESI Collaboration,
A. G. Adame,
J. Aguilar,
S. Ahlen,
S. Alam,
D. M. Alexander,
M. Alvarez,
O. Alves,
A. Anand,
U. Andrade,
E. Armengaud,
S. Avila,
A. Aviles,
H. Awan,
S. Bailey,
C. Baltay,
A. Bault,
J. Behera,
S. BenZvi,
F. Beutler,
D. Bianchi,
C. Blake,
R. Blum,
S. Brieden,
A. Brodzeller
, et al. (171 additional authors not shown)
Abstract:
We present the DESI 2024 galaxy and quasar baryon acoustic oscillations (BAO) measurements using over 5.7 million unique galaxy and quasar redshifts in the range 0.1<z<2.1. Divided by tracer type, we utilize 300,017 galaxies from the magnitude-limited Bright Galaxy Survey with 0.1<z<0.4, 2,138,600 Luminous Red Galaxies with 0.4<z<1.1, 2,432,022 Emission Line Galaxies with 0.8<z<1.6, and 856,652 quasars with 0.8<z<2.1, over a ~7,500 square degree footprint. The analysis was blinded at the catalog-level to avoid confirmation bias. All fiducial choices of the BAO fitting and reconstruction methodology, as well as the size of the systematic errors, were determined on the basis of the tests with mock catalogs and the blinded data catalogs. We present several improvements to the BAO analysis pipeline, including enhancing the BAO fitting and reconstruction methods in a more physically-motivated direction, and also present results using combinations of tracers. We present a re-analysis of SDSS BOSS and eBOSS results applying the improved DESI methodology and find scatter consistent with the level of the quoted SDSS theoretical systematic uncertainties. With the total effective survey volume of ~ 18 Gpc$^3$, the combined precision of the BAO measurements across the six different redshift bins is ~0.52%, marking a 1.2-fold improvement over the previous state-of-the-art results using only first-year data. We detect the BAO in all of these six redshift bins. The highest significance of BAO detection is $9.1σ$ at the effective redshift of 0.93, with a constraint of 0.86% placed on the BAO scale. We find our measurements are systematically larger than the prediction of Planck-2018 LCDM model at z<0.8. We translate the results into transverse comoving distance and radial Hubble distance measurements, which are used to constrain cosmological models in our companion paper [abridged].
Submitted 3 April, 2024;
originally announced April 2024.
-
Convergence Analysis of Flow Matching in Latent Space with Transformers
Authors:
Yuling Jiao,
Yanming Lai,
Yang Wang,
Bokai Yan
Abstract:
We present theoretical convergence guarantees for ODE-based generative models, specifically flow matching. We use a pre-trained autoencoder network to map high-dimensional original inputs to a low-dimensional latent space, where a transformer network is trained to predict the velocity field of the transformation from a standard normal distribution to the target latent distribution. Our error analysis demonstrates the effectiveness of this approach, showing that the distribution of samples generated via estimated ODE flow converges to the target distribution in the Wasserstein-2 distance under mild and practical assumptions. Furthermore, we show that arbitrary smooth functions can be effectively approximated by transformer networks with Lipschitz continuity, which may be of independent interest.
Submitted 28 April, 2024; v1 submitted 3 April, 2024;
originally announced April 2024.
-
Angular analysis of $B \to K^* e^+ e^-$ in the low-$q^2$ region with new electron identification at Belle
Authors:
Belle Collaboration,
D. Ferlewicz,
P. Urquijo,
I. Adachi,
K. Adamczyk,
H. Aihara,
D. M. Asner,
H. Atmacan,
R. Ayad,
V. Babu,
Sw. Banerjee,
P. Behera,
K. Belous,
J. Bennett,
M. Bessner,
V. Bhardwaj,
B. Bhuyan,
T. Bilka,
D. Biswas,
D. Bodrov,
M. Bračko,
P. Branchini,
T. E. Browder,
A. Budano,
M. Campajola
, et al. (145 additional authors not shown)
Abstract:
We perform an angular analysis of the $B\to K^* e^+ e^-$ decay for the dielectron mass squared, $q^2$, range of $0.0008$ to $1.1200 ~\text{GeV}^2 /c^4$ using the full Belle data set in the $K^{*0} \to K^+ π^-$ and $K^{*+} \to K_S^0 π^+$ channels, incorporating new methods of electron identification to improve the statistical power of the data set. This analysis is sensitive to contributions from right-handed currents from physics beyond the Standard Model by constraining the Wilson coefficients $\mathcal{C}_7^{(\prime)}$. We perform a fit to the $B\to K^* e^+ e^-$ differential decay rate and measure the imaginary component of the transversality amplitude to be $A_T^{\rm Im} = -1.27 \pm 0.52 \pm 0.12$, and the $K^*$ transverse asymmetry to be $A_T^{(2)} = 0.52 \pm 0.53 \pm 0.11$. The resulting constraints on the value of $\mathcal{C}_7^{\prime}$ are consistent with the Standard Model within a $2σ$ confidence interval.
Submitted 2 April, 2024; v1 submitted 29 March, 2024;
originally announced April 2024.
-
Selective Domain-Invariant Feature for Generalizable Deepfake Detection
Authors:
Yingxin Lai,
Guoqing Yang,
Yifan He,
Zhiming Luo,
Shaozi Li
Abstract:
With diverse presentation forgery methods emerging continually, detecting the authenticity of images has drawn growing attention. Although existing methods have achieved impressive accuracy in training dataset detection, they still perform poorly in the unseen domain and suffer from forgery of irrelevant information such as background and identity, affecting generalizability. To solve this problem, we propose a novel framework, Selective Domain-Invariant Feature (SDIF), which reduces the sensitivity to face forgery by fusing content features and styles. Specifically, we first use a Farthest-Point Sampling (FPS) training strategy to construct a task-relevant style sample representation space for fusing with content features. Then, we propose a dynamic feature extraction module to generate features with diverse styles to improve the performance and effectiveness of the feature extractor. Finally, a domain separation strategy is used to retain domain-related features to help distinguish between real and fake faces. Both qualitative and quantitative results in existing benchmarks and proposals demonstrate the effectiveness of our approach.
Submitted 19 March, 2024;
originally announced March 2024.
-
Tokensome: Towards a Genetic Vision-Language GPT for Explainable and Cognitive Karyotyping
Authors:
Haoxi Zhang,
Xinxu Zhang,
Yuanxin Lin,
Maiqi Wang,
Yi Lai,
Yu Wang,
Linfeng Yu,
Yufeng Xu,
Ran Cheng,
Edward Szczerbicki
Abstract:
Automatic karyotype analysis is often defined as a visual perception task focused solely on chromosomal object-level modeling. This definition has led most existing methods to overlook componential and holistic information, significantly constraining model performance. Moreover, the lack of interpretability in current technologies hinders clinical adoption. In this paper, we introduce Tokensome, a novel vision-language model based on chromosome tokenization for explainable and cognitive karyotyping. Tokensome elevates the method from the conventional visual perception layer to the cognitive decision-making layer. This elevation enables the integration of domain knowledge and cognitive reasoning via knowledge graphs and LLMs, markedly enhancing the model's explainability and facilitating abnormality detection.
Submitted 16 March, 2024;
originally announced March 2024.
-
Texture-GS: Disentangling the Geometry and Texture for 3D Gaussian Splatting Editing
Authors:
Tian-Xing Xu,
Wenbo Hu,
Yu-Kun Lai,
Ying Shan,
Song-Hai Zhang
Abstract:
3D Gaussian splatting, emerging as a groundbreaking approach, has drawn increasing attention for its capabilities of high-fidelity reconstruction and real-time rendering. However, it couples the appearance and geometry of the scene within the Gaussian attributes, which hinders the flexibility of editing operations, such as texture swapping. To address this issue, we propose a novel approach, namely Texture-GS, to disentangle the appearance from the geometry by representing it as a 2D texture mapped onto the 3D surface, thereby facilitating appearance editing. Technically, the disentanglement is achieved by our proposed texture mapping module, which consists of a UV mapping MLP to learn the UV coordinates for the 3D Gaussian centers, a local Taylor expansion of the MLP to efficiently approximate the UV coordinates for the ray-Gaussian intersections, and a learnable texture to capture the fine-grained appearance. Extensive experiments on the DTU dataset demonstrate that our method not only facilitates high-fidelity appearance editing but also achieves real-time rendering on consumer-level devices, e.g. a single RTX 2080 Ti GPU.
Submitted 15 March, 2024;
originally announced March 2024.
-
Select and Distill: Selective Dual-Teacher Knowledge Transfer for Continual Learning on Vision-Language Models
Authors:
Yu-Chu Yu,
Chi-Pin Huang,
Jr-Jen Chen,
Kai-Po Chang,
Yung-Hsuan Lai,
Fu-En Yang,
Yu-Chiang Frank Wang
Abstract:
Large-scale vision-language models (VLMs) have shown a strong zero-shot generalization capability on unseen-domain data. However, when adapting pre-trained VLMs to a sequence of downstream tasks, they are prone to forgetting previously learned knowledge and degrade their zero-shot classification capability. To tackle this problem, we propose a unique Selective Dual-Teacher Knowledge Transfer framework that leverages the most recent fine-tuned and the original pre-trained VLMs as dual teachers to preserve the previously learned knowledge and zero-shot capabilities, respectively. With only access to an unlabeled reference dataset, our proposed framework performs a selective knowledge distillation mechanism by measuring the feature discrepancy from the dual teacher VLMs. Consequently, our selective dual-teacher knowledge distillation would mitigate catastrophic forgetting of previously learned knowledge while preserving the zero-shot capabilities from pre-trained VLMs. Through extensive experiments on benchmark datasets, we show that our proposed framework is favorable against state-of-the-art continual learning approaches for preventing catastrophic forgetting and zero-shot degradation.
Submitted 14 March, 2024;
originally announced March 2024.
-
Large, Small or Both: A Novel Data Augmentation Framework Based on Language Models for Debiasing Opinion Summarization
Authors:
Yanyue Zhang,
Pengfei Li,
Yilong Lai,
Deyu Zhou,
Yulan He
Abstract:
As more than 70$\%$ of reviews in the existing opinion summary data set are positive, current opinion summarization approaches are reluctant to generate negative summaries given the input of negative texts. To address such sentiment bias, a direct approach without the over-reliance on a specific framework is to generate additional data based on large language models to balance the emotional distribution of the dataset. However, data augmentation based on large language models faces two disadvantages: 1) the potential issues or toxicity in the augmented data; 2) the expensive costs. Therefore, in this paper, we propose a novel data augmentation framework based on both large and small language models for debiasing opinion summarization. Specifically, a small set of synthesized negative reviews is obtained by rewriting the positive text via a large language model. Then, a disentangle reconstruction model is trained based on the generated data. After training, a large amount of synthetic data can be obtained by decoding the new representation obtained from the combination of different sample representations and filtering based on confusion degree and sentiment classification. Experiments show that our framework can alleviate emotional bias as effectively as using only large models, but more economically.
Submitted 19 March, 2024; v1 submitted 12 March, 2024;
originally announced March 2024.
-
ALaRM: Align Language Models via Hierarchical Rewards Modeling
Authors:
Yuhang Lai,
Siyuan Wang,
Shujun Liu,
Xuanjing Huang,
Zhongyu Wei
Abstract:
We introduce ALaRM, the first framework modeling hierarchical rewards in reinforcement learning from human feedback (RLHF), which is designed to enhance the alignment of large language models (LLMs) with human preferences. The framework addresses the limitations of current alignment approaches, which often struggle with the inconsistency and sparsity of human supervision signals, by integrating holistic rewards with aspect-specific rewards. This integration enables more precise and consistent guidance of language models towards desired outcomes, particularly in complex and open text generation tasks. By employing a methodology that filters and combines multiple rewards based on their consistency, the framework provides a reliable mechanism for improving model alignment. We validate our approach through applications in long-form question answering and machine translation tasks, employing gpt-3.5-turbo for pairwise comparisons, and demonstrate improvements over existing baselines. Our work underscores the effectiveness of hierarchical rewards modeling in refining LLM training processes for better human preference alignment. We release our code at https://ALaRM-fdu.github.io.
Submitted 16 March, 2024; v1 submitted 11 March, 2024;
originally announced March 2024.