Search | arXiv e-print repository

doi 10.1007/JHEP11(2023)228

Measurement of the cross section of $e^+e^-\rightarrowΞ^{-}\barΞ^{+}$ at center-of-mass energies between 3.510 and 4.843 GeV

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, et al. (599 additional authors not shown)

Abstract: Using $e^+e^-$ collision data corresponding to a total integrated luminosity of 12.9 $fb^{-1}$ collected with the BESIII detector at the BEPCII collider, the exclusive Born cross sections and the effective form factors of the reaction $e^+e^-\rightarrowΞ^{-}\barΞ^{+}$ are measured via the single baryon-tag method at 23 center-of-mass energies between 3.510 and 4.843 GeV. Evidence for the decay $ψ(3770)\rightarrowΞ^{-}\barΞ^{+}$ is observed with a significance of 4.5$σ$ by analyzing the measured cross sections together with earlier BESIII results. For the other charmonium(-like) states $ψ(4040)$, $ψ(4160)$, $Y(4230)$, $Y(4360)$, $ψ(4415)$, and $Y(4660)$, no significant signal of their decay to $Ξ^-\bar Ξ^+$ is found. For these states, upper limits of the products of the branching fraction and the electronic partial width at the 90% confidence level are provided.

Submitted 30 November, 2023; v1 submitted 8 September, 2023; originally announced September 2023.

Comments: 24 pages, 4 tables, 3 figures, consistent with the publication in JHEP11(2023)228

Journal ref: JHEP11(2023)228

arXiv:2309.04139 [pdf, other]

Novel method to extract the femtometer structure of strange baryons using the vacuum polarization effect

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, M. Albrecht, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, R. Baldini Ferroli, I. Balossino, Y. Ban, V. Batozskaya, D. Becker, K. Begzsuren, N. Berger, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, J. Bloms, A. Bortone, I. Boyko, et al. (560 additional authors not shown)

Abstract: One of the fundamental goals of particle physics is to gain a microscopic understanding of the strong interaction. Electromagnetic form factors quantify the structure of hadrons in terms of charge and magnetization distributions. While the nucleon structure has been investigated extensively, data on hyperons are still scarce. It has recently been demonstrated that electron-positron annihilations into hyperon-antihyperon pairs provide a powerful tool to investigate their inner structure. We present a novel method useful for hyperon-antihyperon pairs of different types which exploits the cross section enhancement due to the vacuum polarization effect at the $J/ψ$ resonance. Using the 10 billion $J/ψ$ events collected with the BESIII detector, this allows a thorough determination of the hyperon structure. The result is essentially a precise snapshot of a $\barΛΣ^0$~($Λ\barΣ^0$) pair in the making, encoded in the form factor ratio and the phase. Their values are measured to be $R = 0.860\pm0.029({\rm stat.})\pm0.010({\rm syst.})$, $ΔΦ_1=(1.011\pm0.094({\rm stat.})\pm0.010({\rm syst.}))~\rm rad$ for $\barΛΣ^0$ and $ΔΦ_2=(2.128\pm0.094({\rm stat.})\pm0.010({\rm syst.}))~\rm rad$ for $Λ\barΣ^0$, respectively. Furthermore, charge-parity (CP) breaking is investigated for the first time in this reaction and found to be consistent with CP symmetry.

Submitted 8 September, 2023; originally announced September 2023.

arXiv:2309.04090 [pdf, ps, other]

Search for the semileptonic decays $D^+_s \to K_1(1270)^0 e^+ν_e$ and $D^+_s \to b_1(1235)^0 e^+ν_e$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, et al. (601 additional authors not shown)

Abstract: By analyzing 7.33\,fb$^{-1}$ of $e^+e^-$ collision data collected at center-of-mass energies between 4.128 and 4.226 GeV with the BESIII detector, we search for the semileptonic decays $D^+_s \to K_1(1270)^0 e^+ν_e$ and $D^+_s \to b_1(1235)^0 e^+ν_e$ for the first time. No significant signals are observed for either decay mode. The upper limits on the (product) branching fractions are determined to be ${\mathcal B}[D^+_s \to K_1(1270)^0 e^+ν_e] < 4.1\times 10^{-4}$ and ${\mathcal B}[D^+_s \to b_1(1235)^0 e^+ν_e]\cdot {\mathcal B}[b_1(1235)^0\to ωπ^0] < 6.4\times 10^{-4}$ at 90\% confidence level.

Submitted 7 September, 2023; originally announced September 2023.

Comments: 12 pages, 4 figures

arXiv:2309.03466 [pdf, other]

MIRA: Cracking Black-box Watermarking on Deep Neural Networks via Model Inversion-based Removal Attacks

Authors: Yifan Lu, Wenxuan Li, Mi Zhang, Xudong Pan, Min Yang

Abstract: To protect the intellectual property of well-trained deep neural networks (DNNs), black-box DNN watermarks, which are embedded into the prediction behavior of DNN models on a set of specially-crafted samples, have gained increasing popularity in both academia and industry. Watermark robustness is usually implemented against attackers who steal the protected model and obfuscate its parameters for watermark removal. Recent studies empirically prove the robustness of most black-box watermarking schemes against known removal attempts. In this paper, we propose a novel Model Inversion-based Removal Attack (\textsc{Mira}), which is watermark-agnostic and effective against most mainstream black-box DNN watermarking schemes. In general, our attack pipeline exploits the internals of the protected model to recover and unlearn the watermark message. We further design target class detection and recovered sample splitting algorithms to reduce the utility loss caused by \textsc{Mira} and achieve data-free watermark removal on half of the watermarking schemes. We conduct a comprehensive evaluation of \textsc{Mira} against ten mainstream black-box watermarks on three benchmark datasets and DNN architectures. Compared with six baseline removal attacks, \textsc{Mira} achieves strong watermark removal effects on the covered watermarks, preserving at least $90\%$ of the stolen model utility, under more relaxed or even no assumptions on the dataset availability.

Submitted 6 September, 2023; originally announced September 2023.

Comments: 13 pages, ver 1.0

arXiv:2309.02774 [pdf, other]

doi 10.1103/PhysRevLett.132.031801

First Measurement of the Decay Asymmetry in the pure W-boson-exchange Decay $Λ_{c}^{+}\toΞ^{0}K^{+}$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (618 additional authors not shown)

Abstract: Based on $4.4~\text{fb}^{-1}$ of $e^{+}e^{-}$ annihilation data collected at the center-of-mass energies between $4.60$ and $4.70~\text{GeV}$ with the BESIII detector at the BEPCII collider, the pure \textit{W}-boson-exchange decay $Λ_{c}^{+}\toΞ^{0}K^{+}$ is studied with a full angular analysis. The corresponding decay asymmetry is measured for the first time to be $α_{Ξ^{0}K^{+}}=0.01\pm0.16({\rm stat.})\pm0.03({\rm syst.})$. This result reflects the non-interference effect between the $S$- and $P$-wave amplitudes. The phase shift between $S$- and $P$-wave amplitudes has two solutions, which are $δ_{p}-δ_{s}=-1.55\pm0.25({\rm stat.})\pm0.05({\rm syst.})~\text{rad}$ or $1.59\pm0.25({\rm stat.})\pm0.05({\rm syst.})~\text{rad}$.

Submitted 20 January, 2024; v1 submitted 6 September, 2023; originally announced September 2023.

Journal ref: Phys. Rev. Lett. 132, 031801 (2024)

arXiv:2309.02033 [pdf, other]

Data-Juicer: A One-Stop Data Processing System for Large Language Models

Authors: Daoyuan Chen, Yilun Huang, Zhijian Ma, Hesen Chen, Xuchen Pan, Ce Ge, Dawei Gao, Yuexiang Xie, Zhaoyang Liu, Jinyang Gao, Yaliang Li, Bolin Ding, Jingren Zhou

Abstract: The immense evolution in Large Language Models (LLMs) has underscored the importance of massive, heterogeneous, and high-quality data. A data recipe is a mixture of data from different sources for training LLMs, which plays a vital role in LLMs' performance. Existing open-source tools for LLM data processing are mostly tailored for specific data recipes. To continuously uncover the potential of LLMs, incorporate data from new sources, and improve LLMs' performance, we build a new system named Data-Juicer, with which we can efficiently generate diverse data recipes, explore different possibilities in forming data mixtures, and evaluate their effects on model performance. Different from traditional data-analytics pipelines, Data-Juicer faces some unique challenges. Firstly, the possible data sources for forming data recipes are truly heterogeneous and massive with various qualities. Secondly, it is extremely expensive to precisely evaluate data recipes' impact on LLMs' performance. Thirdly, the end users of Data-Juicer, model developers, need sufficient flexibility to configure and evaluate different data recipes. Data-Juicer features a fine-grained abstraction of pipelines for constructing data recipes, with over 50 built-in operators for easy composition and extension. By incorporating visualization and auto-evaluation capabilities, Data-Juicer enables a timely feedback loop for both LLM pre-training and fine-tuning. Further, Data-Juicer is optimized and integrated with ecosystems for LLM training, evaluation, and distributed computing.
The data recipes derived with Data-Juicer gain notable improvements on state-of-the-art LLMs, by up to 7.45% increase in averaged score across 16 LLM benchmarks and 17.5% higher win rate in pair-wise GPT-4 evaluations. Our system, data recipes, and tutorials are released, calling for broader data-centric research on training and understanding LLMs.

Submitted 20 December, 2023; v1 submitted 5 September, 2023; originally announced September 2023.

Comments: 20 Pages, 10 figures, 9 tables. The system, data recipes, and demos are continuously maintained at https://github.com/alibaba/data-juicer

arXiv:2309.01646 [pdf, other]

ReLoc-PDR: Visual Relocalization Enhanced Pedestrian Dead Reckoning via Graph Optimization

Authors: Zongyang Chen, Xianfei Pan, Changhao Chen

Abstract: Accurately and reliably positioning pedestrians in satellite-denied conditions remains a significant challenge. Pedestrian dead reckoning (PDR) is commonly employed to estimate pedestrian location using low-cost inertial sensors. However, PDR is susceptible to drift due to sensor noise, incorrect step detection, and inaccurate stride length estimation. This work proposes ReLoc-PDR, a fusion framework combining PDR and visual relocalization using graph optimization. ReLoc-PDR leverages time-correlated visual observations and learned descriptors to achieve robust positioning in visually-degraded environments. A graph optimization-based fusion mechanism with the Tukey kernel effectively corrects cumulative errors and mitigates the impact of abnormal visual observations. Real-world experiments demonstrate that our ReLoc-PDR surpasses representative methods in accuracy and robustness, achieving accurate and robust pedestrian positioning results using only a smartphone in challenging environments such as less-textured corridors and dark nighttime scenarios.

Submitted 4 September, 2023; originally announced September 2023.

Comments: 11 pages, 14 figures

arXiv:2309.01502 [pdf, other]

A coupled-channel analysis of the $X(3872)$ lineshape with BESIII data

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, et al. (600 additional authors not shown)

Abstract: We perform a study of the $X(3872)$ lineshape using the data samples of $e^+e^-\toγX(3872)$, $X(3872)\to D^0\bar{D}^0 π^0$ and $π^+π^- J/ψ$ collected with the BESIII detector. The effects of the coupled-channels and the off-shell $D^{*0}$ are included in the parameterization of the lineshape. The lineshape mass parameter is obtained to be $M_{X}=(3871.63\pm 0.13^{+0.06}_{-0.05})$ MeV. Two poles are found on the first and second Riemann sheets corresponding to the $D^{*0}\bar{D}^0$ branch cut. The pole location on the first sheet is much closer to the $D^{*0}\bar{D}^0$ threshold than the other, and is determined to be $7.04\pm0.15^{+0.07}_{-0.08}$ MeV above the $D^0\bar{D}^0π^0$ threshold with an imaginary part $-0.19\pm0.08^{+0.14}_{-0.19}$ MeV.

Submitted 4 September, 2023; originally announced September 2023.

arXiv:2309.01430 [pdf, other]

DAT++: Spatially Dynamic Vision Transformer with Deformable Attention

Authors: Zhuofan Xia, Xuran Pan, Shiji Song, Li Erran Li, Gao Huang

Abstract: Transformers have shown superior performance on various vision tasks. Their large receptive field endows Transformer models with higher representation power than their CNN counterparts. Nevertheless, simply enlarging the receptive field also raises several concerns. On the one hand, using dense attention in ViT leads to excessive memory and computational cost, and features can be influenced by irrelevant parts that are beyond the regions of interest. On the other hand, the handcrafted attention adopted in PVT or Swin Transformer is data agnostic and may limit the ability to model long-range relations. To solve this dilemma, we propose a novel deformable multi-head attention module, where the positions of key and value pairs in self-attention are adaptively allocated in a data-dependent way. This flexible scheme enables the proposed deformable attention to dynamically focus on relevant regions while maintaining the representation power of global attention. On this basis, we present Deformable Attention Transformer (DAT), a general vision backbone efficient and effective for visual recognition. We further build an enhanced version DAT++. Extensive experiments show that our DAT++ achieves state-of-the-art results on various visual recognition benchmarks, with 85.9% ImageNet accuracy, 54.5 and 47.0 MS-COCO instance segmentation mAP, and 51.5 ADE20K semantic segmentation mIoU.

Submitted 4 September, 2023; originally announced September 2023.

Comments: 17 pages, 6 figures, 11 tables

arXiv:2309.01172 [pdf, other]

FusionAI: Decentralized Training and Deploying LLMs with Massive Consumer-Level GPUs

Authors: Zhenheng Tang, Yuxin Wang, Xin He, Longteng Zhang, Xinglin Pan, Qiang Wang, Rongfei Zeng, Kaiyong Zhao, Shaohuai Shi, Bingsheng He, Xiaowen Chu

Abstract: The rapid growth of memory and computation requirements of large language models (LLMs) has outpaced the development of hardware, hindering people who lack large-scale high-end GPUs from training or deploying LLMs. However, consumer-level GPUs, which constitute a larger market share, are typically overlooked for LLMs due to their weaker computing performance, smaller storage capacity, and lower communication bandwidth. Additionally, users may have privacy concerns when interacting with remote LLMs. In this paper, we envision a decentralized system unlocking the vast untapped potential of consumer-level GPUs in pre-training, inference and fine-tuning of LLMs with privacy protection. However, this system faces critical challenges, including limited CPU and GPU memory, low network bandwidth, peer variability, and device heterogeneity. To address these challenges, our system design incorporates: 1) a broker with backup pool to implement dynamic join and quit of computing providers; 2) task scheduling with hardware performance to improve system efficiency; 3) abstracting ML procedures into directed acyclic graphs (DAGs) to achieve model and task universality; 4) abstracting intermediate representation and execution planes to ensure compatibility of various devices and deep learning (DL) frameworks. Our performance analysis demonstrates that 50 RTX 3080 GPUs can achieve throughputs comparable to those of 4 H100 GPUs, which are significantly more expensive.

Submitted 3 September, 2023; originally announced September 2023.

arXiv:2309.01075 [pdf, other]

doi 10.1145/3607828.3617798

Multi-Stage Hierarchical Food Classification

Authors: Xinyue Pan, Jiangpeng He, Fengqing Zhu

Abstract: Food image classification serves as a fundamental and critical step in image-based dietary assessment, facilitating nutrient intake analysis from captured food images. However, existing work in food classification predominantly focuses on predicting 'food types', which do not contain direct nutritional composition information. This limitation arises from the inherent discrepancies in nutrition databases, which are tasked with associating each 'food item' with its respective information. Therefore, in this work we aim to classify food items to align with nutrition databases. To this end, we first introduce the VFN-nutrient dataset by annotating each food image in VFN with a food item that includes nutritional composition information. Such annotation of food items, being more discriminative than food types, creates a hierarchical structure within the dataset. However, since the food item annotations are solely based on nutritional composition information, they do not always show visual relations with each other, which poses significant challenges when applying deep learning-based techniques for classification. To address this issue, we then propose a multi-stage hierarchical framework for food item classification by iteratively clustering and merging food items during the training process, which allows the deep model to extract image features that are discriminative across labels. Our method is evaluated on the VFN-nutrient dataset and achieves promising results compared with existing work in terms of both food type and food item classification.

Submitted 3 September, 2023; originally announced September 2023.

Comments: accepted for ACM MM 2023 Madima

arXiv:2309.00428 [pdf, other]

doi 10.1145/3610548.3618148

A Locality-based Neural Solver for Optical Motion Capture

Authors: Xiaoyu Pan, Bowen Zheng, Xinwei Jiang, Guanglong Xu, Xianli Gu, Jingxiang Li, Qilong Kou, He Wang, Tianjia Shao, Kun Zhou, Xiaogang Jin

Abstract: We present a novel locality-based learning method for cleaning and solving optical motion capture data. Given noisy marker data, we propose a new heterogeneous graph neural network which treats markers and joints as different types of nodes, and uses graph convolution operations to extract the local features of markers and joints and transform them to clean motions. To deal with anomalous markers (e.g. occluded or with big tracking errors), the key insight is that a marker's motion shows strong correlations with the motions of its immediate neighboring markers but less so with other markers, a.k.a. locality, which enables us to efficiently fill missing markers (e.g. due to occlusion). We also identify marker outliers due to tracking errors by investigating their acceleration profiles. We further propose a training regime based on representation learning and data augmentation, by training the model on data with masking. The masking schemes aim to mimic the occluded and noisy markers often observed in the real data. Finally, we show that our method achieves high accuracy on multiple metrics across various datasets. Extensive comparison shows our method outperforms state-of-the-art methods in terms of prediction accuracy of occluded marker position error by approximately 20%, which leads to a further error reduction on the reconstructed joint rotations and positions by 30%. The code and data for this paper are available at https://github.com/non-void/LocalMoCap.

Submitted 4 September, 2023; v1 submitted 1 September, 2023; originally announced September 2023.

Comments: Siggraph Asia 2023 Conference Paper

arXiv:2309.00363 [pdf, other]

FederatedScope-LLM: A Comprehensive Package for Fine-tuning Large Language Models in Federated Learning

Authors: Weirui Kuang, Bingchen Qian, Zitao Li, Daoyuan Chen, Dawei Gao, Xuchen Pan, Yuexiang Xie, Yaliang Li, Bolin Ding, Jingren Zhou

Abstract: LLMs have demonstrated great capabilities in various NLP tasks. Different entities can further improve the performance of those LLMs on their specific downstream tasks by fine-tuning LLMs. When several entities have similar tasks of interest, but their data cannot be shared because of privacy concerns or regulations, federated learning (FL) is a mainstream solution to leverage the data of different entities. However, fine-tuning LLMs in federated learning settings still lacks adequate support from existing FL frameworks because it has to deal with optimizing the consumption of significant communication and computational resources, data preparation for different tasks, and distinct information protection demands. This paper first discusses these challenges of federated fine-tuning LLMs, and introduces our package FS-LLM as a main contribution, which consists of the following components: (1) we build an end-to-end benchmarking pipeline, automating the processes of dataset preprocessing, federated fine-tuning execution, and performance evaluation on federated LLM fine-tuning; (2) we provide comprehensive federated parameter-efficient fine-tuning algorithm implementations and versatile programming interfaces for future extension in FL scenarios with low communication and computation costs, even without accessing the full model; (3) we adopt several accelerating and resource-efficient operators for fine-tuning LLMs with limited resources and flexible pluggable sub-routines for interdisciplinary study.
We conduct extensive experiments to validate the effectiveness of FS-LLM and benchmark advanced LLMs with state-of-the-art parameter-efficient fine-tuning algorithms in FL settings, which also yields valuable insights into federated fine-tuning LLMs for the research community. To facilitate further research and adoption, we release FS-LLM at https://github.com/alibaba/FederatedScope/tree/llm.

Submitted 1 September, 2023; originally announced September 2023.

Comments: Source code: https://github.com/alibaba/FederatedScope/tree/llm

arXiv:2309.00187 [pdf]

Vision-aided nonlinear control framework for shake table tests

Authors: Zhongwei Chen, T. Y. Yang, Yifei Xiao, Xiao Pan, Wanyan Yang

Abstract: The structural response under earthquake excitations can be simulated by scaled-down model shake table tests or full-scale model shake table tests. In this paper, adaptive control theory is used as a nonlinear shake table control algorithm which considers the inherent nonlinearity of the shake table system and the Control-Structural Interaction (CSI) effect that linear controllers, such as the Proportional-Integral-Derivative (PID) controller, cannot consider. The mass of the specimen can be assumed as an unknown variation and the unknown parameter will be replaced by an estimated value in the proposed control framework. The signal generated by the control law of the adaptive control method will be implemented by a loop-shaping controller. To verify the stability and feasibility of the proposed control framework, a simulation of a bare shake table and experiments with a bare shake table with a two-story frame were carried out. This study randomly selects earthquake recordings from the Pacific Earthquake Engineering Research Center (PEER) database. The simulation and experimental results show that the proposed control framework can be effectively used in shake table control.

Submitted 31 August, 2023; originally announced September 2023.

Comments: 10 pages, 7 figures, accepted in the Canadian Conference - Pacific Conference on Earthquake Engineering 2023, Vancouver, British Columbia

arXiv:2308.16380 [pdf]

3D vision-based structural masonry damage detection

Authors: Elmira Faraji Zonouz, Xiao Pan, Yu-Cheng Hsu, Tony Yang

Abstract: The detection of masonry damage is essential for preventing potentially disastrous outcomes. Manual inspection can, however, take a long time and be hazardous to human inspectors. Automation of the inspection process using novel computer vision and machine learning algorithms can be a more efficient and safe solution to prevent further deterioration of the masonry structures. Most existing 2D vision-based methods are limited to qualitative damage classification, 2D localization, and in-plane quantification. In this study, we present a 3D vision-based methodology for accurate masonry damage detection, which offers a more robust solution with a greater field of view, depth of vision, and the ability to detect failures in complex environments. First, images of the masonry specimens are collected to generate a 3D point cloud. Second, 3D point cloud processing methods are developed to evaluate the masonry damage. We demonstrate the effectiveness of our approach through experiments on structural masonry components. Our experiments showed the proposed system can effectively classify damage states and localize and quantify critical damage features. The result showed the proposed method can improve the level of autonomy during the inspection of masonry structures.

Submitted 30 August, 2023; originally announced August 2023.

Comments: 10 pages, accepted in the Canadian Conference - Pacific Conference on Earthquake Engineering 2023, Vancouver, British Columbia

arXiv:2308.16280 [pdf]

A reinforcement learning based construction material supply strategy using robotic crane and computer vision for building reconstruction after an earthquake

Authors: Yifei Xiao, T. Y. Yang, Xiao Pan, Fan Xie, Zhongwei Chen

Abstract: After an earthquake, it is particularly important to provide the necessary resources on site because a large number of infrastructures need to be repaired or newly constructed. Due to the complex construction environment after the disaster, there are potential safety hazards for human laborers working in this environment. With the advancement of robotic technology and artificial intelligence (AI) algorithms, smart robotic technology is a potential solution to provide construction resources after an earthquake. In this paper, a robotic crane with advanced AI algorithms is proposed to provide resources for infrastructure reconstruction after an earthquake. Proximal policy optimization (PPO), a reinforcement learning (RL) algorithm, is implemented for 3D lift path planning when transporting the construction materials. The state and reward function are designed in detail for RL model training. Two models are trained through a loading task in different environments by using the PPO algorithm, one considering the influence of obstacles and the other not considering obstacles. Then, the two trained models are compared and evaluated through an unloading task and a loading task in simulation environments. For each task, two different cases are considered. One is that there is no obstacle between the initial position where the construction material is lifted and the target position, and the other is that there are obstacles between the initial position and the target position. The results show that the model that considers the obstacles during training can generate proper actions for the robotic crane to execute so that the crane can automatically transport the construction materials to the desired location with swing suppression, short time consumption and collision avoidance.

Submitted 30 August, 2023; originally announced August 2023.

Comments: 12 pages, 7 figures, accepted in the Canadian Conference - Pacific Conference on Earthquake Engineering 2023, Vancouver, British Columbia

arXiv:2308.16278 [pdf]

Autonomous damage assessment of structural columns using low-cost micro aerial vehicles and multi-view computer vision

Authors: Sina Tavasoli, Xiao Pan, T. Y. Yang, Saudah Gazi, Mohsen Azimi

Abstract: Structural columns are the crucial load-carrying components of buildings and bridges. Early detection of column damage is important for the assessment of the residual performance and the prevention of system-level collapse. This research proposes an innovative end-to-end micro aerial vehicles (MAVs)-based approach to automatically scan and inspect columns. First, an MAV-based automatic image collection method is proposed. The MAV is programmed to sense the structural columns and their surrounding environment. During the navigation, the MAV first detects and approaches the structural columns. Then, it starts to collect image data at multiple viewpoints around every detected column. Second, the collected images will be used to assess the damage types and damage locations. Third, the damage state of the structural column will be determined by fusing the evaluation outcomes from multiple camera views. In this study, reinforced concrete (RC) columns are selected to demonstrate the effectiveness of the approach. Experimental results indicate that the proposed MAV-based inspection approach can effectively collect images from multiple viewing angles, and accurately assess critical RC column damages. The approach improves the level of autonomy during the inspection. In addition, the evaluation outcomes are more comprehensive than the existing 2D vision methods. The concept of the proposed inspection approach can be extended to other structural columns such as bridge piers.

Submitted 30 August, 2023; originally announced August 2023.

Comments: 12 pages, 11 figures, accepted in the Canadian Conference - Pacific Conference on Earthquake Engineering 2023, Vancouver, British Columbia

arXiv:2308.15362 [pdf, other]

doi 10.1103/PhysRevLett.131.211902

Observation of a vector charmoniumlike state at 4.7 ${\rm GeV}/c^2$ and search for $Z_{cs}$ in $e^+e^-\to K^+K^-J/ψ$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann , et al. (599 additional authors not shown)

Abstract: Using data samples with an integrated luminosity of 5.85~fb$^{-1}$ collected at center-of-mass energies from 4.61 to 4.95 GeV with the BESIII detector operating at the BEPCII storage ring, we measure the cross section for the process $e^+e^-\to K^+K^-J/ψ$. A new resonance with a mass of $M = 4708_{-15}^{+17}\pm21$ MeV/$c^{2}$ and a width of $Γ= 126_{-23}^{+27}\pm30$ MeV is observed in the energy-dependent line shape of the $e^+e^-\to K^+K^-J/ψ$ cross section with a significance over $5σ$. The $K^{+}J/ψ$ system is also investigated to search for charged charmoniumlike states, but no significant $Z_{cs}^+$ states are observed. Upper limits on the Born cross sections for $e^+e^-\to K^{-} Z_{cs}(3985)^{+}/K^{-} Z_{cs}(4000)^{+} + c.c.$ with $Z_{cs}(3985)^{\pm}/Z_{cs}(4000)^{\pm}\to K^{\pm} J/ψ$ are reported at 90\% confidence levels. The ratio of branching fractions $\frac{\mathcal{B}(Z_{cs}(3985)^{+}\to K^+ J/ψ)}{\mathcal{B}(Z_{cs}(3985)^{+}\to (\bar{D}^{0}D_s^{*+} + \bar{D}^{*0}D_s^+))}$ is measured to be less than 0.03 at 90\% confidence level.

Submitted 24 November, 2023; v1 submitted 29 August, 2023; originally announced August 2023.

Journal ref: Phys. Rev. Lett. 131, 211902 (2023)

arXiv:2308.15206 [pdf, ps, other]

doi 10.1103/PhysRevD.109.072008

Study of excited $Ξ$ states in $ψ(3686)\rightarrow{}K^{-}Λ\overlineΞ^{+}+c.c.$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, J. Bloms, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann , et al. (587 additional authors not shown)

Abstract: Based on a sample of $(448.1\pm2.9)\times10^{6}$ $ψ(3686)$ events collected with the BESIII detector at BEPCII, the decays of $ψ(3686)\to{}K^{-}Λ\overlineΞ^{+} + c.c.$ with $\overlineΞ^+ \to \overlineΛ π^+$, $\overlineΛ\to \overline{p} π^+$ are studied. Two excited hyperons, $Ξ(1690)^-$ and $Ξ(1820)^-$, are observed with large significance ($ \gg 10 σ$) in the $K^{-}Λ$ invariant mass distributions. A partial wave analysis is performed, and the spin-parities of $Ξ(1690)^-$ and $Ξ(1820)^-$ are determined to be $\frac{1}{2}^{-}$ and $\frac{3}{2}^{-}$, respectively. The masses, widths, and product branching fractions of $Ξ(1690)^-$ and $Ξ(1820)^-$ are also measured.

Submitted 28 April, 2024; v1 submitted 29 August, 2023; originally announced August 2023.

Comments: 12 pages, 4 figures

Journal ref: Physical Review D 109, 072008 (2024)

arXiv:2308.13980 [pdf, ps, other]

doi 10.1103/PhysRevD.109.L011102

Search for the light hadron decay $χ_{c1}(3872) \to π^{+}π^{-}η$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann , et al. (600 additional authors not shown)

Abstract: With a data sample corresponding to an integrated luminosity of 11.5~fb$^{-1}$ collected with the BESIII detector operating at the BEPCII storage ring, for the first time the light hadron decay $χ_{c1}(3872) \rightarrow π^{+}π^{-}η$ is searched for. While no significant signal is observed, the upper limits at the 90\% confidence level for $σ[e^{+}e^{-} \rightarrow γχ_{c1}(3872)] \mathcal{B}[χ_{c1}(3872) \rightarrow π^{+}π^{-}η]$ at center-of-mass energies from 4.13 to 4.34 GeV are determined. By normalizing to the $χ_{c1}(3872)\toπ^+π^- J/ψ$ decay channel, a 90\% confidence level upper limit for the branching fraction ratio $\mathcal{R}=\mathcal{B}[χ_{c1}(3872) \rightarrowπ^{+}π^{-}η]/\mathcal{B}[χ_{c1}(3872) \rightarrow π^{+}π^{-} J/ψ] < 0.12$ is given. These measurements provide important inputs for understanding the internal structure of the $χ_{c1}(3872)$ resonance.

Submitted 19 January, 2024; v1 submitted 26 August, 2023; originally announced August 2023.

Comments: 11 pages, 5 figures, version to appear in PRD(L)

Journal ref: Phys. Rev. D 109, L011102 (2024)

arXiv:2308.13725 [pdf, ps, other]

Improved measurement of the branching fractions for $J/ψ\toγπ^0$, $γη$ and $γη^\prime$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann , et al. (598 additional authors not shown)

Abstract: Using a data sample of $(1.0087\pm 0.0044)\times 10^{10}$ $J/ψ$ events collected with the BESIII detector, the decays of $J/ψ\toγπ^{0} (η, η^\prime)\toγγγ$ are studied. Newly measured branching fractions are $\mathcal{B}$$(J/ψ\toγπ^{0})$=$(3.34\pm 0.02\pm 0.09)\times 10^{-5}$, $\mathcal{B}$$(J/ψ\toγη)$=$(1.096\pm 0.001\pm0.019)\times 10^{-3}$ and $\mathcal{B}$$(J/ψ\toγη^\prime)$=$(5.40\pm 0.01\pm0.11)\times 10^{-3}$, where the first uncertainties are statistical and the second are systematic. These results are consistent with the world average values within two standard deviations. The ratio of partial widths $Γ(J/ψ\toγη^\prime)/Γ(J/ψ\toγη)$ is measured to be $4.93 \pm 0.13$. The singlet-octet pseudoscalar mixing angle $θ_P$ is determined to be $θ_P = -(22.11 \pm0.26)^\circ$ or $-(19.34 \pm 0.34)^\circ$ with two different phenomenological models.
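
As a rough cross-check of the quoted ratio of partial widths, the sketch below divides the two branching fractions from the abstract (both decays share the same total $J/ψ$ width, so the widths ratio equals the branching-fraction ratio). The error propagation treats every statistical and systematic uncertainty as uncorrelated, which is an assumption made here for illustration and not necessarily how the collaboration handled correlated systematics. The names `B_ETA`, `B_ETAP`, and `ratio_with_uncertainty` are hypothetical.

```python
import math

# Branching fractions quoted in the abstract: (value, statistical, systematic).
B_ETA = (1.096e-3, 0.001e-3, 0.019e-3)   # J/psi -> gamma eta
B_ETAP = (5.40e-3, 0.01e-3, 0.11e-3)     # J/psi -> gamma eta'


def ratio_with_uncertainty(num, den):
    """Return (ratio, uncertainty) for num/den.

    Assumption: all uncertainties are uncorrelated, so the relative
    errors are added in quadrature.
    """
    ratio = num[0] / den[0]
    rel_sq = sum((err / val) ** 2 for val, *errs in (num, den) for err in errs)
    return ratio, ratio * math.sqrt(rel_sq)


ratio, sigma = ratio_with_uncertainty(B_ETAP, B_ETA)
print(f"{ratio:.2f} +/- {sigma:.2f}")
```

Under this uncorrelated-errors assumption the result rounds to the value quoted in the abstract.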

Submitted 25 August, 2023; originally announced August 2023.

arXiv:2308.13561 [pdf, other]

Project Aria: A New Tool for Egocentric Multi-Modal AI Research

Authors: Jakob Engel, Kiran Somasundaram, Michael Goesele, Albert Sun, Alexander Gamino, Andrew Turner, Arjang Talattof, Arnie Yuan, Bilal Souti, Brighid Meredith, Cheng Peng, Chris Sweeney, Cole Wilson, Dan Barnes, Daniel DeTone, David Caruso, Derek Valleroy, Dinesh Ginjupalli, Duncan Frost, Edward Miller, Elias Mueggler, Evgeniy Oleinik, Fan Zhang, Guruprasad Somasundaram, Gustavo Solaira , et al. (49 additional authors not shown)

Abstract: Egocentric, multi-modal data as available on future augmented reality (AR) devices provides unique challenges and opportunities for machine perception. These future devices will need to be all-day wearable in a socially acceptable form-factor to support always available, context-aware and personalized AI applications. Our team at Meta Reality Labs Research built the Aria device, an egocentric, multi-modal data recording and streaming device with the goal to foster and accelerate research in this area. In this paper, we describe the Aria device hardware including its sensor configuration and the corresponding software tools that enable recording and processing of such data.

Submitted 1 October, 2023; v1 submitted 24 August, 2023; originally announced August 2023.

arXiv:2308.13351 [pdf, other]

Optically Detected Magnetic Resonance of Nitrogen-Vacancy Centers in Diamond under Weak Laser Excitation

Authors: Yong-Hong Yu, Rui-Zhi Zhang, Yue Xu, Xiu-Qi Chen, Huijie Zheng, Quan Li, Ren-Bao Liu, Xin-Yu Pan, Dmitry Budker, Gang-Qin Liu

Abstract: As promising quantum sensors, nitrogen-vacancy (NV) centers in diamond have been widely used in frontier studies in condensed matter physics, material sciences, and life sciences. In practical applications, weak laser excitation is favorable as it reduces the side effects of laser irradiation, for example, phototoxicity and heating. Here we report a combined theoretical and experimental study of optically detected magnetic resonance (ODMR) of NV-center ensembles under weak 532-nm laser excitation. In this regime, both the width and splitting of ODMR spectra decrease with increasing laser power. This power dependence is reproduced with a model considering laser-induced charge neutralization of NV--N+ pairs, which alters the local electric field environment. These results are important for understanding and designing NV-based quantum sensing in light-sensitive applications.

Submitted 24 April, 2024; v1 submitted 25 August, 2023; originally announced August 2023.

arXiv:2308.10527 [pdf, other]

doi 10.1145/3583780.3615218

DPAN: Dynamic Preference-based and Attribute-aware Network for Relevant Recommendations

Authors: Wei Dai, Yingmin Su, Xiaofeng Pan

Abstract: In e-commerce platforms, the relevant recommendation is a unique scenario providing related items for a trigger item that users are interested in. However, users' preferences for the similarity and diversity of recommendation results are dynamic and vary under different conditions. Moreover, individual item-level diversity is too coarse-grained since all recommended items are related to the trigger item. Thus, the two main challenges are to learn fine-grained representations of similarity and diversity and capture users' dynamic preferences for them under different conditions. To address these challenges, we propose a novel method called the Dynamic Preference-based and Attribute-aware Network (DPAN) for predicting Click-Through Rate (CTR) in relevant recommendations. Specifically, based on Attribute-aware Activation Values Generation (AAVG), Bi-dimensional Compression-based Re-expression (BCR) is designed to obtain similarity and diversity representations of user interests and item information. Then Shallow and Deep Union-based Fusion (SDUF) is proposed to capture users' dynamic preferences for the diverse degree of recommendation results according to various conditions. DPAN has demonstrated its effectiveness through extensive offline experiments and online A/B testing, resulting in a significant 7.62% improvement in CTR. Currently, DPAN has been successfully deployed on our e-commerce platform serving the primary traffic for relevant recommendations. The code of DPAN has been made publicly available.

Submitted 21 August, 2023; originally announced August 2023.

arXiv:2308.09604 [pdf, other]

Faster Stochastic Variance Reduction Methods for Compositional MiniMax Optimization

Authors: ** Liu, Xiaokang Pan, Junwen Duan, Hongdong Li, Youqi Li, Zhe Qu

Abstract: This paper delves into the realm of stochastic optimization for compositional minimax optimization - a pivotal challenge across various machine learning domains, including deep AUC and reinforcement learning policy evaluation. Despite its significance, the problem of compositional minimax optimization is still under-explored. Adding to the complexity, current methods of compositional minimax optimization are plagued by sub-optimal complexities or heavy reliance on sizable batch sizes. To respond to these constraints, this paper introduces a novel method, called Nested STOchastic Recursive Momentum (NSTORM), which can achieve the optimal sample complexity of $O(κ^3 /ε^3 )$ to obtain the $ε$-accuracy solution. We also demonstrate that NSTORM can achieve the same sample complexity under the Polyak-Łojasiewicz (PL)-condition - an insightful extension of its capabilities. Yet, NSTORM encounters an issue with its requirement for low learning rates, potentially constraining its real-world applicability in machine learning. To overcome this hurdle, we present ADAptive NSTORM (ADA-NSTORM) with adaptive learning rates. We demonstrate that ADA-NSTORM can achieve the same sample complexity, while the experimental results show that it is more effective. All the proposed complexities indicate that our proposed methods can match lower bounds to existing minimax optimizations, without requiring a large batch size in each iteration. Extensive experiments support the efficiency of our proposed methods.

Submitted 12 December, 2023; v1 submitted 18 August, 2023; originally announced August 2023.

arXiv:2308.09278 [pdf, other]

MATLABER: Material-Aware Text-to-3D via LAtent BRDF auto-EncodeR

Authors: Xudong Xu, Zhaoyang Lyu, Xingang Pan, Bo Dai

Abstract: Based on powerful text-to-image diffusion models, text-to-3D generation has made significant progress in generating compelling geometry and appearance. However, existing methods still struggle to recover high-fidelity object materials, either only considering Lambertian reflectance, or failing to disentangle BRDF materials from the environment lights. In this work, we propose Material-Aware Text-to-3D via LAtent BRDF auto-EncodeR (MATLABER) that leverages a novel latent BRDF auto-encoder for material generation. We train this auto-encoder with large-scale real-world BRDF collections and ensure the smoothness of its latent space, which implicitly acts as a natural distribution of materials. During appearance modeling in text-to-3D generation, the latent BRDF embeddings, rather than BRDF parameters, are predicted via a material network. Through exhaustive experiments, our approach demonstrates the superiority over existing ones in generating realistic and coherent object materials. Moreover, high-quality materials naturally enable multiple downstream tasks such as relighting and material editing. Code and model will be publicly available at https://sheldontsui.github.io/projects/Matlaber.

Submitted 17 August, 2023; originally announced August 2023.

arXiv:2308.08161 [pdf, ps, other]

Study of $e^+e^-\toηφ$ at center-of-mass energies from 3.773 to 4.600 GeV

Authors: M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, J. Bloms, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann , et al. (600 additional authors not shown)

Abstract: We present a study of the process $e^{+}e^{-}\toηφ$ using data samples collected with the BESIII detector corresponding to an integrated luminosity of 15.03 fb$^{-1}$ at 23 center-of-mass energies from 3.773 to 4.600 GeV. The Born cross sections are measured at each energy and a coherent fit to cross-section lineshape is performed using a Breit-Wigner parametrization to search for charmonium-like vector states. No significant signals of the $Y(4230)$ and $Y(4360)$ resonances are observed.

Submitted 24 October, 2023; v1 submitted 16 August, 2023; originally announced August 2023.

Comments: 11 pages, 5(8) figures

Report number: about 4000

arXiv:2308.05490 [pdf, other]

Search for the lepton number violation decay $φ\to π^+ π^+ e^- e^-$ via $J/ψ\to φη$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, V. Batozskaya, K. Begzsuren, N. Berger, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, J. Bloms, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, H. Cai , et al. (584 additional authors not shown)

Abstract: Using $(1.0087\pm0.0044)\times10^{10}$ $J/ψ$ events collected by the BESIII detector at the BEPCII collider, we search for the lepton number violation decay $φ\to π^+ π^+ e^- e^-$ via $J/ψ\to φη$. No signal is found and the upper limit on the branching fraction of $φ\to π^+ π^+ e^- e^-$ is set to be $9.7\times10^{-6}$ at the 90\% confidence level.

Submitted 10 August, 2023; originally announced August 2023.

Comments: 10 pages, 5 figures

arXiv:2308.03361 [pdf, other]

Measurement of the $e^+e^- \to Λ\barΣ^0 + c.c.$ cross sections at $\sqrt{s}$ from 2.3094 to 3.0800 GeV

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, J. Bloms, A. Bortone, I. Boyko, R. A. Briere , et al. (601 additional authors not shown)

Abstract: The Born cross sections and effective form factors of the process $e^+e^-\toΛ\barΣ^0 + c.c.$ are measured at 14 center-of-mass energy points from 2.3094 to 3.0800 GeV, based on data corresponding to an integrated luminosity of $(478.5 \pm 4.8)\ \text{pb}^{-1}$ collected with the BESIII detector. A non-zero Born cross section is observed at the center-of-mass energy of 2.3094 GeV with a statistical significance of more than five standard deviations, and the cross sections at other energies are obtained with improved precision compared to earlier measurements from the BaBar Collaboration. The Born cross-section lineshape is described better by a shape with a plateau near the threshold than by a pQCD motivated functional form.

Submitted 7 August, 2023; originally announced August 2023.

arXiv:2308.01201 [pdf, ps, other]

A Real-Time Robust Ecological-Adaptive Cruise Control Strategy for Battery Electric Vehicles

Authors: Sheng Yu, Xiao Pan, Anastasis Georgiou, Boli Chen, Imad M. Jaimoukha, Simos A. Evangelou

Abstract: This work addresses the ecological-adaptive cruise control problem for connected electric vehicles by a computationally efficient robust control strategy. The problem is formulated in the space-domain with a realistic description of the nonlinear electric powertrain model and motion dynamics to yield a convex optimal control problem (OCP). The OCP is approached by a novel robust model predictive control (RMPC) method handling various disturbances due to modelling mismatch and inaccurate leading vehicle information. The RMPC problem is solved by semi-definite programming relaxation and single linear matrix inequality (sLMI) techniques for further enhanced computational efficiency. The performance of the proposed real-time robust ecological-adaptive cruise control (REACC) method is evaluated using an experimentally collected driving cycle. Its robustness is verified by comparison with a nominal MPC which is shown to result in speed-limit constraint violations. The energy economy of the proposed method outperforms a state-of-the-art time-domain RMPC scheme, as a more precisely fitted convex powertrain model can be integrated into the space-domain scheme. The additional comparison with a traditional constant distance following strategy (CDFS) further verifies the effectiveness of the proposed REACC. Finally, it is verified that the REACC can be potentially implemented in real-time owing to the sLMI and resulting convex algorithm.

Submitted 15 August, 2023; v1 submitted 2 August, 2023; originally announced August 2023.

Comments: 15 pages, 12 figures and 2 tables. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2308.00442 [pdf, other]

FLatten Transformer: Vision Transformer using Focused Linear Attention

Authors: Dongchen Han, Xuran Pan, Yizeng Han, Shiji Song, Gao Huang

Abstract: The quadratic computation complexity of self-attention has been a persistent challenge when applying Transformer models to vision tasks. Linear attention, on the other hand, offers a much more efficient alternative with its linear complexity by approximating the Softmax operation through carefully designed mapping functions. However, current linear attention approaches either suffer from significant performance degradation or introduce additional computation overhead from the mapping functions. In this paper, we propose a novel Focused Linear Attention module to achieve both high efficiency and expressiveness. Specifically, we first analyze the factors contributing to the performance degradation of linear attention from two perspectives: the focus ability and feature diversity. To overcome these limitations, we introduce a simple yet effective mapping function and an efficient rank restoration module to enhance the expressiveness of self-attention while maintaining low computation complexity. Extensive experiments show that our linear attention module is applicable to a variety of advanced vision Transformers, and achieves consistently improved performances on multiple benchmarks. Code is available at https://github.com/LeapLabTHU/FLatten-Transformer.
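
To illustrate why linear attention scales linearly in the number of tokens, here is a minimal pure-Python sketch of generic kernel-based linear attention. It is not the paper's Focused Linear Attention module: the feature map `phi` (a shifted ReLU) and the function names are assumptions chosen for illustration. The point is only that, once Softmax is replaced by a product of feature maps, the sum over keys can be computed once and reused for every query.

```python
def phi(row):
    # Hypothetical positive feature map (shifted ReLU); NOT the paper's mapping function.
    return [max(v, 0.0) + 1e-6 for v in row]


def quadratic_attention(Q, K, V):
    """O(N^2 d): builds every query-key similarity explicitly."""
    Qf, Kf = [phi(q) for q in Q], [phi(k) for k in K]
    out = []
    for q in Qf:
        sims = [sum(a * b for a, b in zip(q, k)) for k in Kf]
        norm = sum(sims)
        out.append([sum(s * v[j] for s, v in zip(sims, V)) / norm
                    for j in range(len(V[0]))])
    return out


def linear_attention(Q, K, V):
    """O(N d^2): aggregates phi(K)^T V and sum(phi(K)) once, then reuses them."""
    Qf, Kf = [phi(q) for q in Q], [phi(k) for k in K]
    d, dv = len(Kf[0]), len(V[0])
    kv = [[sum(k[i] * v[j] for k, v in zip(Kf, V)) for j in range(dv)]
          for i in range(d)]
    ksum = [sum(k[i] for k in Kf) for i in range(d)]
    out = []
    for q in Qf:
        norm = sum(q[i] * ksum[i] for i in range(d))
        out.append([sum(q[i] * kv[i][j] for i in range(d)) / norm
                    for j in range(dv)])
    return out


# Tiny example: 3 tokens, feature dimension 2.
Q = [[1.0, 0.5], [0.2, -0.3], [0.0, 1.5]]
K = [[0.4, 0.1], [1.0, 1.0], [-0.5, 0.7]]
V = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
same = all(abs(a - b) < 1e-9
           for ra, rb in zip(quadratic_attention(Q, K, V), linear_attention(Q, K, V))
           for a, b in zip(ra, rb))
```

Both functions compute the same output by associativity of matrix multiplication; only the order of operations, and hence the cost, differs.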

Submitted 1 September, 2023; v1 submitted 1 August, 2023; originally announced August 2023.

Comments: ICCV 2023

arXiv:2308.00304 [pdf, other]

Skills-in-Context Prompting: Unlocking Compositionality in Large Language Models

Authors: Jiaao Chen, Xiaoman Pan, Dian Yu, Kaiqiang Song, Xiaoyang Wang, Dong Yu, Jianshu Chen

Abstract: We consider the problem of eliciting compositional generalization capabilities in large language models (LLMs) with a novel type of prompting strategy. Compositional generalization empowers the LLMs to solve problems that are harder than the ones they have seen (i.e., easy-to-hard generalization), which is a critical reasoning capability of human-like intelligence. However, even the current state-of-the-art LLMs still struggle with this form of reasoning. To bridge this gap, we propose skills-in-context (SKiC) prompting, which instructs LLMs how to compose basic skills to resolve more complex problems. We find that it is crucial to demonstrate both the skills and the compositional examples within the same prompting context. With as few as two exemplars, our SKiC prompting initiates strong synergies between skills and their composition capabilities. Notably, it empowers LLMs to solve unseen problems that require innovative skill compositions, achieving near-perfect generalization on a broad range of challenging compositionality tasks. Intriguingly, SKiC prompting unlocks the latent potential of LLMs, enabling them to leverage pre-existing internal skills acquired during earlier pre-training stages, even when these skills are not explicitly presented in the prompting context. This results in the capability of LLMs to solve unseen complex problems by activating and composing internal competencies. With such prominent features, SKiC prompting is able to achieve state-of-the-art performance on challenging mathematical reasoning benchmarks (e.g., MATH).

Submitted 14 August, 2023; v1 submitted 1 August, 2023; originally announced August 2023.

arXiv:2308.00212 [pdf, other]

Optimizing dual-energy CT technique for iodine-based contrast-to-noise ratio

Authors: Fatma Terzioglu, Emil Y. Sidky, Jp Phillips, Ingrid Reiser, Guillaume Bal, Xiaochuan Pan

Abstract: Purpose: This study proposes a systematic method for determining the optimal x-ray tube settings/energy windows and fluence for minimal noise and maximum CNR in material density images obtained from DECT scans by fixing the subject size and the total radiation dose. Methods: The noise propagation in the process of sinogram and image reconstruction from DECT measurements is analyzed. Analytic estimates for the sinogram and monochromatic image pixel variances and the CNR as functions of tube potentials, fluence, and virtual monochromatic image (VMI) energy are derived, and then used in a phantom experiment as an objective function for optimizing the tube settings to minimize the image noise and maximize the CNR. Results: A non-trivial example that shows the existence of singular solutions to the inversion of sinograms-to-DECT measurements map was presented. Additionally, the optimal VMI energy for maximal CNR was determined. The optimal energy VMI was found to be the least noisy monochromatic image synthesized from the iodine and water density images, and it was shown that using more general weights in combining the two images linearly does not improve image quality. When the x-ray beam filter material was fixed at 2 mm of aluminum and the photon fluence for low and high kV scans were considered equal, the tube potential pair of 60/120 kV led to the maximal CNR in the VMI formed at energy 55 keV. Conclusions: Optimizing DECT scan parameters to maximize the CNR can be done in a systematic way. Also, choosing the parameters that maximize the Jacobian determinant over the sinogram domain would lead to more stable reconstructions due to the reduced amplification of the measurement noise. Since the values of the Jacobian determinant depend strongly on the imaging task, careful consideration of all of the relevant factors is needed when implementing the proposed framework.

Submitted 28 July, 2023; originally announced August 2023.

arXiv:2307.15894 [pdf, ps, other]

doi 10.1103/PhysRevLett.132.081904

Determination of the $Σ^{+}$ Timelike Electromagnetic Form Factors

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann , et al. (604 additional authors not shown)

Abstract: Based on data samples collected with the BESIII detector at the BEPCII collider, the process $e^{+}e^{-} \to Σ^{+}\barΣ^{-}$ is studied at center-of-mass energies $\sqrt{s}$ = 2.3960, 2.6454, and 2.9000 GeV. Using a fully differential angular description of the final state particles, both the relative magnitude and phase information of the $Σ^{+}$ electromagnetic form factors in the timelike region are extracted. The relative phase between the electric and magnetic form factors is determined to be $\sinΔΦ$ = -0.67~$\pm$~0.29~(stat)~$\pm$~0.18~(syst) at $\sqrt{s}$ = 2.3960 GeV, $ΔΦ$ = 55$^{\circ}$~$\pm$~19$^{\circ}$~(stat) $\pm$~14$^{\circ}$~(syst) at $\sqrt{s}$ = 2.6454 GeV, and 78$^{\circ}$~$\pm$~22$^{\circ}$~(stat) $\pm$~9$^{\circ}$~(syst) at $\sqrt{s}$ = 2.9000 GeV. For the first time, the phase of the hyperon electromagnetic form factors is explored in a wide range of four-momentum transfer. The evolution of the phase along with four-momentum transfer is an important input for understanding its asymptotic behavior and the dynamics of baryons.

Submitted 5 March, 2024; v1 submitted 29 July, 2023; originally announced July 2023.

Journal ref: Phys. Rev. Lett. 132, 081904 (2024)

arXiv:2307.14633 [pdf, ps, other]

Observation of the decay $J/ψ\to e^+ e^- η(1405)$ with $η(1405) \to π^0 f_0(980)$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann , et al. (601 additional authors not shown)

Abstract: Using a data sample of $(10087\pm44)\times 10^6$ $J/ψ$ events collected by the BESIII detector in 2009, 2012, 2018 and 2019, the electromagnetic Dalitz process $J/ψ\to e^+ e^- η(1405)$ is observed via the decay $η(1405) \to π^0 f_0(980)$, $f_0(980) \to π^+ π^-$, with a significance of about $9.6σ$. The branching fraction of this decay is measured to be ${\mathcal B}(J/ψ\to e^+ e^- π^0 η(1405) \to e^+ e^- π^0 f_0(980) \to e^+ e^- π^0 π^+ π^-)=(2.02\pm0.24(\rm{stat.})\pm0.09(\rm{syst.}))\times 10^{-7}$. The branching-fraction ratio ${\mathcal B}(J/ψ\to e^+ e^- η(1405))$/${\mathcal B}(J/ψ\to γη(1405))$ is determined to be $(1.35\pm0.19(\rm{stat.})\pm0.06(\rm{syst.}))\times10^{-2}$. Furthermore, an $e^+e^-$ invariant-mass dependent transition form factor of $J/ψ\to e^+ e^-η(1405)$ is presented for the first time. The obtained result provides input for different theoretical models, and is valuable for improving the understanding of the intrinsic structure of the $η(1405)$ meson.

Submitted 27 July, 2023; originally announced July 2023.

Comments: 9 pages, 3 figures

arXiv:2307.14585 [pdf, other]

Improved measurement of the branching fraction of $D_s^+\toμ^+ν_μ$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, J. Bloms, A. Bortone, I. Boyko, R. A. Briere , et al. (598 additional authors not shown)

Abstract: Using $e^+e^-$ collision data with an integrated luminosity of $7.33~\mathrm{fb}^{-1}$ collected at center-of-mass energies between 4.128 and 4.226 GeV with the BESIII detector operating at the BEPCII collider, the branching fraction of the leptonic decay $D_s^+\toμ^+ν_μ$ is measured to be $(0.5294\pm0.0108_{\rm stat}\pm0.0085_{\rm syst})$\%. Based on this, the product of the $D_s^+$ decay constant $f_{D_s^+}$ and the magnitude of the $c\to s$ quark mixing matrix element $|V_{cs}|$ is determined to be $f_{D_s^+}|V_{cs}|=241.8\pm2.5_{\rm stat}\pm2.2_{\rm syst}~\mathrm{MeV}$. Using the value of $|V_{cs}|$ given by the global standard model fit, $f_{D_s^+}$ is found to be $248.4\pm2.5_{\rm stat}\pm2.2_{\rm syst}$\,MeV. Alternatively, using the value of $f_{D_s^+}$ from a recent lattice quantum chromodynamics calculation, $|V_{cs}|$ is determined to be $0.968\pm0.010_{\rm stat}\pm0.009_{\rm syst}$.

Submitted 26 July, 2023; originally announced July 2023.

Comments: 12 pages, 2 figures

arXiv:2307.13145 [pdf, other]

Our Nudges, Our Selves: Tailoring Mobile User Engagement Using Personality

Authors: Nima Jamalian, Marios Constantinides, Sagar Joglekar, Xueni Pan, Daniele Quercia

Abstract: To increase mobile user engagement, current apps employ a variety of behavioral nudges, but these engagement techniques are applied in a one-size-fits-all approach. Yet the very same techniques may be perceived differently by different individuals. To test this, we developed HarrySpotter, a location-based AR app that embedded six engagement techniques. We deployed it in a 2-week study involving 29 users who also took the Big-Five personality test. Preferences for specific engagement techniques are not only descriptive but also predictive of personality traits. The Adj. $R^2$ ranges from 0.16 for conscientious users (encouraged by competition) to 0.32 for neurotic users (self-centered and focused on their own achievements), and even up to 0.61 for extroverts (motivated by both exploration of objects and places). These findings suggest that these techniques need to be personalized in the future.

Submitted 24 July, 2023; originally announced July 2023.

Comments: 10 pages, 1 figure, 2 tables

arXiv:2307.12852 [pdf, ps, other]

doi 10.1103/PhysRevLett.132.091802

Observation of $D^+_s\to η^\prime μ^+ν_μ$, Precision Test of Lepton Flavor Universality with $D^+_s\to η^{(\prime)} \ell^+ν_\ell$, and First Measurements of $D^+_s\to η^{(\prime)}μ^+ν_μ$ Decay Dynamics

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, J. Bloms, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann , et al. (584 additional authors not shown)

Abstract: By analyzing 7.33 fb$^{-1}$ of $e^+e^-$ annihilation data collected at center-of-mass energies between 4.128 and 4.226 GeV with the BESIII detector, we report the observation of the semileptonic decay $D^+_s\to η^\prime μ^+ν_μ$, with a statistical significance larger than 10$σ$, and the measurements of the $D_s^+ \to ημ^+ν_μ$ and $D_s^+ \to η^\primeμ^+ν_μ$ decay dynamics for the first time. The branching fractions of $D_s^+ \to ημ^+ν_μ$ and $D_s^+ \to η^\primeμ^+ν_μ$ are determined to be $(2.235\pm0.051_{\rm stat}\pm0.052_{\rm syst})\%$ and $(0.801\pm0.055_{\rm stat}\pm0.028_{\rm syst})\%$, respectively, with precision improved by factors of 6.0 and 6.6 compared to the previous best measurements. Combined with the results for the decays $D_s^+ \to ηe^+ν_e$ and $D_s^+ \to η^\prime e^+ν_e$, the ratios of the decay widths are examined both inclusively and in several $\ell^+ν_\ell$ four-momentum transfer ranges. No evidence for lepton flavor universality violation is found within the current statistics. The products of the hadronic form factors $f_{+,0}^{η^{(\prime)}}(0)$ and the $c\to s$ Cabibbo-Kobayashi-Maskawa matrix element $|V_{cs}|$ are determined. The results based on the two-parameter series expansion are $f^η_{+,0}(0)|V_{cs}| = 0.452\pm0.010_{\rm stat}\pm0.007_{\rm syst}$ and $f^{η^{\prime}}_{+,0}(0)|V_{cs}| = 0.504\pm0.037_{\rm stat}\pm0.012_{\rm syst}$, which help to constrain present models on $f_{+,0}^{η^{(\prime)}}(0)$. The forward-backward asymmetries are determined to be $\langle A_{\rm FB}^η\rangle=-0.059\pm0.031_{\rm stat}\pm0.005_{\rm syst}$ and $\langle A_{\rm FB}^{η^\prime}\rangle=-0.064\pm0.079_{\rm stat}\pm0.006_{\rm syst}$ for the first time, which are consistent with the theoretical calculation.

Submitted 28 February, 2024; v1 submitted 24 July, 2023; originally announced July 2023.

Journal ref: Physical Review Letters 132, 091802 (2024)

arXiv:2307.12736 [pdf, other]

Measurement of $e^{+}e^{-}\toφη'$ cross sections at center-of-mass energies from 3.508 to 4.951 GeV and search for the decay $ψ(3770)\toφη'$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann , et al. (600 additional authors not shown)

Abstract: The cross sections of the $e^{+}e^{-}\toφη'$ process at center-of-mass energies from 3.508 to 4.951 GeV are measured with high precision using 26.1 fb$^{-1}$ of data collected with the BESIII detector operating at the BEPCII storage ring. The cross sections are of the order of a few picobarn, and decrease as the center-of-mass energy increases as $s^{-n/2}$ with $n=4.35\pm 0.14$. This result is in agreement with the Nambu-Jona-Lasinio model prediction of $n=3.5\pm 0.9$. In addition, the charmless decay $ψ(3770)\toφη'$ is searched for by fitting the measured cross sections, yet no significant signal is observed. The upper limit of ${\cal B}(ψ(3770)\toφη')$ at the 90\% confidence level is determined to be $2.3\times 10^{-5}$.

Submitted 11 September, 2023; v1 submitted 24 July, 2023; originally announced July 2023.

arXiv:2307.12291 [pdf, other]

TransHuman: A Transformer-based Human Representation for Generalizable Neural Human Rendering

Authors: Xiao Pan, Zongxin Yang, Jianxin Ma, Chang Zhou, Yi Yang

Abstract: In this paper, we focus on the task of generalizable neural human rendering which trains conditional Neural Radiance Fields (NeRF) from multi-view videos of different characters. To handle the dynamic human motion, previous methods have primarily used a SparseConvNet (SPC)-based human representation to process the painted SMPL. However, such SPC-based representation i) optimizes under the volatile observation space, which leads to pose misalignment between the training and inference stages, and ii) lacks the global relationships among human parts that are critical for handling the incomplete painted SMPL. Tackling these issues, we present a brand-new framework named TransHuman, which learns the painted SMPL under the canonical space and captures the global relationships between human parts with transformers. Specifically, TransHuman is mainly composed of Transformer-based Human Encoding (TransHE), Deformable Partial Radiance Fields (DPaRF), and Fine-grained Detail Integration (FDI). TransHE first processes the painted SMPL under the canonical space via transformers for capturing the global relationships between human parts. Then, DPaRF binds each output token with a deformable radiance field for encoding the query point under the observation space. Finally, the FDI is employed to further integrate fine-grained information from reference images. Extensive experiments on ZJU-MoCap and H36M show that our TransHuman achieves a significantly new state-of-the-art performance with high efficiency. Project page: https://pansanity666.github.io/TransHuman/

Submitted 23 July, 2023; originally announced July 2023.

Comments: Accepted by ICCV 2023

arXiv:2307.11077 [pdf, other]

AlignDet: Aligning Pre-training and Fine-tuning in Object Detection

Authors: Ming Li, Jie Wu, Xionghui Wang, Chen Chen, Jie Qin, Xuefeng Xiao, Rui Wang, Min Zheng, Xin Pan

Abstract: The paradigm of large-scale pre-training followed by downstream fine-tuning has been widely employed in various object detection algorithms. In this paper, we reveal discrepancies in data, model, and task between the pre-training and fine-tuning procedures in existing practices, which implicitly limit the detector's performance, generalization ability, and convergence speed. To this end, we propose AlignDet, a unified pre-training framework that can be adapted to various existing detectors to alleviate the discrepancies. AlignDet decouples the pre-training process into two stages, i.e., image-domain and box-domain pre-training. The image-domain pre-training optimizes the detection backbone to capture holistic visual abstraction, and box-domain pre-training learns instance-level semantics and task-aware concepts to initialize the parts out of the backbone. By incorporating the self-supervised pre-trained backbones, we can pre-train all modules for various detectors in an unsupervised paradigm. As depicted in Figure 1, extensive experiments demonstrate that AlignDet can achieve significant improvements across diverse protocols, such as detection algorithm, model backbone, data setting, and training schedule. For example, AlignDet improves FCOS by 5.3 mAP, RetinaNet by 2.1 mAP, Faster R-CNN by 3.3 mAP, and DETR by 2.3 mAP under fewer epochs.

Submitted 13 August, 2023; v1 submitted 20 July, 2023; originally announced July 2023.

Comments: Camera Ready Version on ICCV 2023. Code and Models are publicly available. Project Page: https://liming-ai.github.io/AlignDet

arXiv:2307.10948 [pdf, ps, other]

doi 10.1103/PhysRevLett.132.191902

First Observation of a Three-Resonance Structure in $e^+e^-\rightarrow$Nonopen Charm Hadrons

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (608 additional authors not shown)

Abstract: We report the measurement of the inclusive cross sections for $e^+e^-$$\rightarrow$nOCH (where nOCH denotes non-open charm hadrons) with improved precision at center-of-mass (c.m.) energies from 3.645 to 3.871 GeV. We observe three resonances: $\mathcal R(3760)$, $\mathcal R(3780)$, and $\mathcal R(3810)$ with significances of $8.1σ$, $13.7σ$, and $8.8σ$, respectively. The $\mathcal R(3810)$ state is observed for the first time, while the $\mathcal R(3760)$ and $\mathcal R(3780)$ states are observed for the first time in the nOCH cross sections. Two sets of resonance parameters describe the energy-dependent line shape of the cross sections well. In set I [set II], the $\mathcal R(3810)$ state has mass $(3805.7 \pm 1.1 \pm 2.7)$ [$(3805.7 \pm 1.1 \pm 2.7)$] MeV/$c^2$, total width $(11.6 \pm 2.9 \pm 1.9)$ [$(11.5 \pm 2.8 \pm 1.9)$] MeV, and an electronic width multiplied by the nOCH decay branching fraction of $(10.9\pm 3.8\pm 2.5)$ [$(11.0\pm 3.4\pm 2.5)$] eV. In addition, we measure the branching fractions ${\mathcal B}[{\mathcal R}(3760)$$\rightarrow$nOCH$]=(25.2 \pm 16.1 \pm 30.4)\% [(6.4 \pm 4.8 \pm 7.7)\%]$ and ${\mathcal B}[\mathcal R(3780)$$\rightarrow$nOCH$]=(12.3 \pm 6.6 \pm 8.3)\% [(10.4 \pm 4.8 \pm 7.0)\%]$ for the first time. The $\mathcal R(3760)$ state can be interpreted as an open-charm (OC) molecular state, but containing a simple four-quark state component. The $\mathcal R(3810)$ state can be interpreted as a hadrocharmonium state.

Submitted 11 May, 2024; v1 submitted 20 July, 2023; originally announced July 2023.

Journal ref: Physical Review Letters 132, 191902 (2024)

arXiv:2307.10442 [pdf, other]

Thrust: Adaptively Propels Large Language Models with External Knowledge

Authors: Xinran Zhao, Hongming Zhang, Xiaoman Pan, Wenlin Yao, Dong Yu, Jianshu Chen

Abstract: Although large-scale pre-trained language models (PTLMs) are shown to encode rich knowledge in their model parameters, the inherent knowledge in PTLMs can be opaque or static, making external knowledge necessary. However, the existing information retrieval techniques could be costly and may even introduce noisy and sometimes misleading knowledge. To address these challenges, we propose the instance-level adaptive propulsion of external knowledge (IAPEK), where we only conduct the retrieval when necessary. To achieve this goal, we propose measuring whether a PTLM contains enough knowledge to solve an instance with a novel metric, Thrust, which leverages the representation distribution of a small number of seen instances. Extensive experiments demonstrate that Thrust is a good measurement of PTLMs' instance-level knowledgeability. Moreover, we can achieve significantly higher cost-efficiency with the Thrust score as the retrieval indicator than the naive usage of external knowledge on 88% of the evaluated tasks with 26% average performance improvement. Such findings shed light on the real-world practice of knowledge-enhanced LMs with a limited knowledge-seeking budget due to computation latency or costs.

Submitted 19 July, 2023; originally announced July 2023.

Comments: 13 pages, 6 figures

arXiv:2307.09266 [pdf, other]

doi 10.1007/JHEP11(2023)137

Measurement of the branching fractions of the singly Cabibbo-suppressed decays $Λ_{c}^{+}\to pη$ and $Λ_{c}^{+}\to pω$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, J. Bloms, A. Bortone, I. Boyko, R. A. Briere , et al. (608 additional authors not shown)

Abstract: Based on 4.5 $\mbox{fb$^{-1}$}$ of $e^{+}e^{-}$ collision data collected with the BESIII detector at seven energy points between 4.600 and 4.699 GeV, the branching fractions for $Λ_{c}^{+}\to pη$ and $Λ_{c}^{+}\to pω$ were measured by means of the single-tag method. The branching fractions of $Λ_{c}^{+}\to pη$ and $Λ_{c}^{+}\to pω$ are determined to be $(1.57\pm0.11_{\rm {stat}}\pm0.04_{\rm{syst}})\times10^{-3}$ and $(1.11\pm0.20_{\rm{stat}}\pm0.07_{\rm{syst}})\times10^{-3}$, with statistical significances of greater than 10 $σ$ and 5.7 $σ$, respectively. These results are consistent with the previous measurements by BESIII, LHCb and Belle, and the result of $Λ_{c}^{+}\to pη$ is the most precise to date.

Submitted 17 October, 2023; v1 submitted 18 July, 2023; originally announced July 2023.

Comments: 24 pages, 4 figures

Journal ref: J. High Energ. Phys. 2023, 137 (2023)

arXiv:2307.07951 [pdf, other]

MinT: Boosting Generalization in Mathematical Reasoning via Multi-View Fine-Tuning

Authors: Zhenwen Liang, Dian Yu, Xiaoman Pan, Wenlin Yao, Qingkai Zeng, Xiangliang Zhang, Dong Yu

Abstract: Reasoning in mathematical domains remains a significant challenge for relatively small language models (LMs). Many current methods focus on specializing LMs in mathematical reasoning and rely heavily on knowledge distillation from powerful but inefficient large LMs (LLMs). In this work, we explore a new direction that avoids over-reliance on LLM teachers, introducing a multi-view fine-tuning method that efficiently exploits existing mathematical problem datasets with diverse annotation styles. Our approach uniquely considers the various annotation formats as different "views" and leverages them in training the model. By postpending distinct instructions to input questions, models can learn to generate solutions in diverse formats in a flexible manner. Experimental results show that our strategy enables a LLaMA-7B model to outperform prior approaches that utilize knowledge distillation, as well as carefully established baselines. Additionally, the proposed method grants the models promising generalization ability across various views and datasets, and the capability to learn from inaccurate or incomplete noisy data. We hope our multi-view training paradigm could inspire future studies in other machine reasoning domains.

Submitted 16 July, 2023; originally announced July 2023.

arXiv:2307.07316 [pdf, other]

doi 10.1103/PhysRevLett.131.191901

Measurement of the Energy-Dependent Electromagnetic Form Factors of a Charmed Baryon

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann , et al. (598 additional authors not shown)

Abstract: We study the process $e^{+}e^{-}\toΛ_{c}^{+}\barΛ_c^{-}$ at twelve center-of-mass energies from $4.6119$ to $4.9509~\mathrm{GeV}$ using data samples collected by the BESIII detector at the BEPCII collider. The Born cross sections and effective form factors ($|G_{\mathrm{eff}}|$) are determined with unprecedented precision after combining the single and double-tag methods based on the decay process $Λ_{c}^{+}\to pK^{-}π^{+}$. Flat cross sections around $4.63~\mathrm{GeV}$ are obtained and no indication of the resonant structure $Y(4630)$, as reported by Belle, is found. In addition, no oscillatory behavior is discerned in the $|G_{\mathrm{eff}}|$ energy-dependence of $Λ_{c}^{+}$, in contrast to what is seen for the proton and neutron cases. Analyzing the cross section together with the polar-angle distribution of the $Λ_{c}^{+}$ baryon at each energy point, the moduli of electric and magnetic form factors ($|G_{E}|$ and $|G_{M}|$) are extracted and separated. For the first time, the energy-dependence of the form factor ratio $|G_{E}/G_{M}|$ is observed, which can be well described by an oscillatory function.

Submitted 14 July, 2023; originally announced July 2023.

Comments: 10 pages, 3 figures

Journal ref: Phys. Rev. Lett. 131, 191901 (2023)

arXiv:2307.06023 [pdf, other]

On the Uplink Distributed Detection in UAV-enabled Aerial Cell-Free mMIMO Systems

Authors: Xuesong Pan, Zhong Zheng, Xueqing Huang, Zesong Fei

Abstract: In this paper, we investigate the uplink signal detection approaches in the cell-free massive MIMO systems with unmanned aerial vehicles (UAVs) serving as aerial access points (APs). The ground users are equipped with multiple antennas and the ground-to-air propagation channels are subject to correlated Rician fading. To overcome the huge signaling overhead in the fully-centralized detection, we propose a two-layer distributed uplink detection scheme, where the uplink signals are first detected in the AP-UAVs by using the minimum mean-squared error (MMSE) detector depending on local channel state information (CSI), and then collected and combined with weights at the CPU-UAV to obtain the refined detection. By using the operator-valued free probability theory, the asymptotic expressions of the combining weights are obtained, which only depend on the statistical CSI and show excellent accuracy. Based on the proposed distributed scheme, we further investigate the impacts of different distributed deployments on the achieved spectral efficiency (SE). Numerical results show that in urban and dense urban environments, it is more beneficial to deploy more AP-UAVs to achieve higher SE. On the other hand, in suburban environments, an optimal ratio between the number of deployed UAVs and the number of antennas per UAV exists to maximize the SE.

Submitted 12 July, 2023; originally announced July 2023.

arXiv:2307.05889 [pdf, other]

Rethinking Mitosis Detection: Towards Diverse Data and Feature Representation

Authors: Hao Wang, Jiatai Lin, Danyi Li, Jing Wang, Bingchao Zhao, Zhenwei Shi, Xipeng Pan, Huadeng Wang, Bingbing Li, Changhong Liang, Guoqiang Han, Li Liang, Chu Han, Zaiyi Liu

Abstract: Mitosis detection is one of the fundamental tasks in computational pathology, and it is extremely challenging due to the heterogeneity of mitotic cells. Most current studies address the heterogeneity from a technical standpoint by increasing model complexity. However, neglecting biological knowledge while relying on complex model design may lead to overfitting and limit the generalizability of the detection model. In this paper, we systematically study the morphological appearances in different mitotic phases as well as the ambiguous non-mitotic cells, and identify that balancing data and feature diversity can achieve better generalizability. Based on this observation, we propose a novel generalizable framework (MitDet) for mitosis detection. Data diversity is addressed by the proposed diversity-guided sample balancing (DGSB), and feature diversity is preserved by the inter- and intra-class feature diversity-preserved module (InCDP). A stain enhancement (SE) module is introduced to enhance the domain-relevant diversity of both data and features simultaneously. Extensive experiments demonstrate that our proposed model outperforms all the SOTA approaches on several popular mitosis detection datasets, in both internal and external test sets, using minimal annotation effort with point annotations only. Comprehensive ablation studies also demonstrate the effectiveness of rethinking data and feature diversity balancing. By analyzing the results quantitatively and qualitatively, we believe that our proposed model not only achieves SOTA performance but also might inspire future studies from new perspectives. Source code is at https://github.com/Onehour0108/MitDet.

Submitted 11 July, 2023; originally announced July 2023.

arXiv:2307.04657 [pdf, other]

BeaverTails: Towards Improved Safety Alignment of LLM via a Human-Preference Dataset

Authors: Jiaming Ji, Mickel Liu, Juntao Dai, Xuehai Pan, Chi Zhang, Ce Bian, Chi Zhang, Ruiyang Sun, Yizhou Wang, Yaodong Yang

Abstract: In this paper, we introduce the BeaverTails dataset, aimed at fostering research on safety alignment in large language models (LLMs). This dataset uniquely separates annotations of helpfulness and harmlessness for question-answering pairs, thus offering distinct perspectives on these crucial attributes. In total, we have gathered safety meta-labels for 333,963 question-answer (QA) pairs and 361,903 pairs of expert comparison data for both the helpfulness and harmlessness metrics. We further showcase applications of BeaverTails in content moderation and reinforcement learning with human feedback (RLHF), emphasizing its potential for practical safety measures in LLMs. We believe this dataset provides vital resources for the community, contributing towards the safe development and deployment of LLMs. Our project page is available at the following URL: https://sites.google.com/view/pku-beavertails.

Submitted 6 November, 2023; v1 submitted 10 July, 2023; originally announced July 2023.

Comments: Published at NeurIPS 2023

arXiv:2307.03465 [pdf]

TBGC: Task-level Backbone-Oriented Gradient Clip for Multi-Task Foundation Model Learning

Authors: Zelun Zhang, Xue Pan

Abstract: The AllInOne training paradigm squeezes a wide range of tasks into a unified model in a multi-task learning manner. However, optimization in multi-task learning is more challenging than in single-task learning, as the gradient norms from different tasks may vary greatly, making the backbone overly biased towards one specific task. To address this issue, we propose the task-level backbone-oriented gradient clip paradigm. Compared with the vanilla gradient clip method, it has two points of emphasis: 1) gradient clipping is performed independently for each task; 2) backbone gradients generated from each task are rescaled to the same norm scale. Based on the experimental results, we argue that the task-level backbone-oriented gradient clip paradigm can relieve the gradient bias problem to some extent. We also propose a novel multi-branch data augmentation strategy in which conflicting augmentations are placed in different branches. Our approach has been shown to be effective, achieving 1st place on Leaderboard A and 2nd place on Leaderboard B of the CVPR2023 Foundation Model Challenge. It is worth noting that, whereas Leaderboard A evaluates all three tasks (detection, segmentation and fine-grained classification), Leaderboard B does not evaluate the segmentation task, in which our team has a huge advantage.

Submitted 7 July, 2023; originally announced July 2023.

Comments: Foundation Model Challenge@CVPR2023, Accepted by CVPR2023 Workshop

Journal ref: Conference on Computer Vision and Pattern Recognition, 2023

Showing 201–250 of 1,062 results for author: Pan, X