Search | arXiv e-print repository

Private Learning with Public Features

Authors: Walid Krichene, Nicolas Mayoraz, Steffen Rendle, Shuang Song, Abhradeep Thakurta, Li Zhang

Abstract: We study a class of private learning problems in which the data is a join of private and public features. This is often the case in private personalization tasks such as recommendation or ad prediction, in which features related to individuals are sensitive, while features related to items (the movies or songs to be recommended, or the ads to be shown to users) are publicly available and do not require protection. A natural question is whether private algorithms can achieve higher utility in the presence of public features. We give a positive answer for multi-encoder models where one of the encoders operates on public features. We develop new algorithms that take advantage of this separation by only protecting certain sufficient statistics (instead of adding noise to the gradient). This method has a guaranteed utility improvement for linear regression, and importantly, achieves the state of the art on two standard private recommendation benchmarks, demonstrating the importance of methods that adapt to the private-public feature separation.

Submitted 23 October, 2023; originally announced October 2023.

arXiv:2310.13546 [pdf, other]

Observation of $Ξ_b^0 \rightarrow Ξ_c^+ D_s^-$ and $Ξ_b^- \rightarrow Ξ_c^0 D_s^-$ decays

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, B. Adeva, M. Adinolfi, P. Adlarson, H. Afsharnia, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, A. Alfonso Albero, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey , et al. (1087 additional authors not shown)

Abstract: The $Ξ_b^0 \rightarrow Ξ_c^+ D_s^-$ and $Ξ_b^- \rightarrow Ξ_c^0 D_s^-$ decays are observed for the first time using proton-proton collision data collected by the LHCb experiment at a centre-of-mass energy of $\sqrt{s}=13\mathrm{TeV}$, corresponding to an integrated luminosity of $5.1\mathrm{fb}^{-1}$. The relative branching fractions times the beauty-baryon production cross-sections are measured to be \begin{align*} \mathcal{R}\left(\frac{Ξ_b^0}{Λ_b^0}\right) \equiv \frac{σ\left(Ξ_b^0\right)}{σ\left(Λ_b^0\right)} \times \frac{\mathcal{B}\left(Ξ_b^0 \rightarrow Ξ_c^+ D_s^-\right)}{\mathcal{B}\left(Λ_b^0 \rightarrow Λ_c^0 D_s^-\right)} =(15.8\pm1.1\pm0.6\pm7.7)\%, \mathcal{R}\left(\frac{Ξ_b^-}{Λ_b^0}\right) \equiv \frac{σ\left(Ξ_b^-\right)}{σ\left(Λ_b^0\right)} \times \frac{\mathcal{B}\left(Ξ_b^- \rightarrow Ξ_c^0 D_s^-\right)}{\mathcal{B}\left(Λ_b^0 \rightarrow Λ_c^0 D_s^-\right)} =(16.9\pm1.3\pm0.9\pm4.3)\%, \end{align*} where the first uncertainties are statistical, the second systematic, and the third due to the uncertainties on the branching fractions of relevant charm-baryon decays. The masses of $Ξ_b^0$ and $Ξ_b^-$ baryons are measured to be $m_{Ξ_b^0}=5791.12\pm0.60\pm0.45\pm0.24\mathrm{MeV}/c^2$ and $m_{Ξ_b^-}=5797.02\pm0.63\pm0.49\pm0.29\mathrm{MeV}/c^2$, where the uncertainties are statistical, systematic, and those due to charm-hadron masses, respectively.

Submitted 20 October, 2023; originally announced October 2023.

Comments: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2023-017.html (LHCb public pages)

Report number: CERN-EP-2023-173, LHCb-PAPER-2023-017

arXiv:2310.12649 [pdf, other]

A measurement of $ΔΓ_{s}$

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, B. Adeva, M. Adinolfi, P. Adlarson, H. Afsharnia, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, A. Alfonso Albero, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey , et al. (1078 additional authors not shown)

Abstract: Using a dataset corresponding to $9~\mathrm{fb}^{-1}$ of integrated luminosity collected with the LHCb detector between 2011 and 2018 in proton-proton collisions, the decay-time distributions of the decay modes $B_s^0 \rightarrow J/ψη'$ and $B_s^0 \rightarrow J/ψπ^{+} π^{-}$ are studied. The decay-width difference between the light and heavy mass eigenstates of the $B_s^0$ meson is measured to be $ΔΓ_s = 0.087 \pm 0.012 \pm 0.009 \, \mathrm{ps}^{-1}$, where the first uncertainty is statistical and the second systematic.

Submitted 19 October, 2023; originally announced October 2023.

Comments: All figures and tables, along with machine-readable versions and any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2023-025.html

Report number: LHCb-PAPER-2023-025, CERN-EP-2023-218

arXiv:2310.12278 [pdf, other]

doi 10.1103/PhysRevLett.132.081901

Enhanced production of $Λ_{b}^{0}$ baryons in high-multiplicity $pp$ collisions at $\sqrt{s} = 13$ TeV

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, A. Alfonso Albero, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis , et al. (1082 additional authors not shown)

Abstract: The production rate of $Λ_{b}^{0}$ baryons relative to $B^{0}$ mesons in $pp$ collisions at a center-of-mass energy $\sqrt{s} = 13$ TeV is measured by the LHCb experiment. The ratio of $Λ_{b}^{0}$ to $B^{0}$ production cross-sections shows a significant dependence on both the transverse momentum and the measured charged-particle multiplicity. At low multiplicity, the ratio measured at LHCb is consistent with the value measured in $e^{+}e^{-}$ collisions, and increases by a factor of $\sim2$ with increasing multiplicity. At relatively low transverse momentum, the ratio of $Λ_{b}^{0}$ to $B^{0}$ cross-sections is higher than what is measured in $e^{+}e^{-}$ collisions, but converges with the $e^{+}e^{-}$ ratio as the momentum increases. These results imply that the evolution of heavy $b$ quarks into final-state hadrons is influenced by the density of the hadronic environment produced in the collision. Comparisons with several models and implications for the mechanisms enforcing quark confinement are discussed.

Submitted 22 February, 2024; v1 submitted 18 October, 2023; originally announced October 2023.

Comments: All figures and tables, along with machine-readable versions and any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2023-027.html (LHCb public pages)

Report number: LHCb-PAPER-2023-027, CERN-EP-2023-208

Journal ref: Phys. Rev. Lett. 132 (2024) 081901

arXiv:2310.11959 [pdf, other]

A Multi-Scale Decomposition MLP-Mixer for Time Series Analysis

Authors: Shuhan Zhong, Sizhe Song, Weipeng Zhuo, Guanyao Li, Yang Liu, S. -H. Gary Chan

Abstract: Time series data, including univariate and multivariate ones, are characterized by unique composition and complex multi-scale temporal variations. They often require special consideration of decomposition and multi-scale modeling to analyze. Existing deep learning methods on this best fit to univariate time series only, and have not sufficiently considered sub-series modeling and decomposition completeness. To address these challenges, we propose MSD-Mixer, a Multi-Scale Decomposition MLP-Mixer, which learns to explicitly decompose and represent the input time series in its different layers. To handle the multi-scale temporal patterns and multivariate dependencies, we propose a novel temporal patching approach to model the time series as multi-scale patches, and employ MLPs to capture intra- and inter-patch variations and channel-wise correlations. In addition, we propose a novel loss function to constrain both the mean and the autocorrelation of the decomposition residual for better decomposition completeness. Through extensive experiments on various real-world datasets for five common time series analysis tasks, we demonstrate that MSD-Mixer consistently and significantly outperforms other state-of-the-art algorithms with better efficiency.

Submitted 24 March, 2024; v1 submitted 18 October, 2023; originally announced October 2023.

Comments: Accepted for VLDB 2024

arXiv:2310.10283 [pdf]

Unveiling Early Warning Signals of Systemic Risks in Banks: A Recurrence Network-Based Approach

Authors: Shijia Song, Handong Li

Abstract: Bank crisis is challenging to define but can be manifested through bank contagion. This study presents a comprehensive framework grounded in nonlinear time series analysis to identify potential early warning signals (EWS) for impending phase transitions in bank systems, with the goal of anticipating severe bank crisis. In contrast to traditional analyses of exposure networks using low-frequency data, we argue that studying the dynamic relationships among bank stocks using high-frequency data offers a more insightful perspective on changes in the banking system. We construct multiple recurrence networks (MRNs) based on multidimensional returns of listed banks' stocks in China, aiming to monitor the nonlinear dynamics of the system through the corresponding indicators and topological structures. Empirical findings indicate that key indicators of MRNs, specifically the average mutual information, provide valuable insights into periods of extreme volatility of bank system. This paper contributes to the ongoing discourse on early warning signals for bank instability, highlighting the applicability of predicting systemic risks in the context of banking networks.

Submitted 16 October, 2023; originally announced October 2023.

Comments: 24 pages

arXiv:2310.09489 [pdf, other]

Study of residual artificial neural network for particle identification in the CEPC high-granularity calorimeter prototype

Authors: Siyuan Song, Jiyuan Chen, Jianbei Liu, Yong Liu, Baohua Qi, Yukun Shi, Jiaxuan Wang, Zhen Wang, Haijun Yang

Abstract: Particle Identification (PID) plays a central role in associating the energy depositions in calorimeter cells with the type of primary particle in a particle flow oriented detector system. In this paper, we propose novel PID methods based on the Residual Network (ResNet) architecture which enable the training of very deep networks, bypass the need to reconstruct feature variables, and ensure the generalization ability among various geometries of detectors, to classify electromagnetic showers and hadronic showers. Using Geant4 simulation samples with energy ranging from 5 GeV to 120 GeV, the efficacy of Residual Connections is validated and the performance of our model is compared with Boosted Decision Trees (BDT) and other pioneering Artificial Neural Network (ANN) approaches. In shower classification, we observe an improvement in background rejection over a wide range of high signal efficiency ($> 95\%$). These findings highlight the prospects of ANN with Residual Blocks for imaging detectors in the PID task of particle physics experiments.

Submitted 9 March, 2024; v1 submitted 14 October, 2023; originally announced October 2023.

arXiv:2310.09475 [pdf]

Twisted DNA origami-based chiral monolayers for spin filtering

Authors: Haozhi Wang, Fangfei Yin, Linyun Li, Mingqiang Li, Zheng Fang, Chenyun Sun, Bochen Li, Jiye Shi, Jiang Li, Lihua Wang, Shiping Song, Xiaolei Zuo, Xiaoguo Liu, Chunhai Fan

Abstract: DNA monolayers with inherent chirality play a pivotal role across various domains, including biosensors, DNA chips, and bioelectronics. Nonetheless, conventional DNA chiral monolayers, typically constructed from single-stranded DNA (ssDNA) or double-stranded DNA (dsDNA), often lack structural orderliness and design flexibility at the interface. Structural DNA nanotechnology emerges as a promising solution to tackle these challenges. In this study, we present a strategy for crafting highly adaptable twisted DNA origami-based chiral monolayers. These structures exhibit distinct interfacial assembly characteristics and effectively mitigate the structural disorder of dsDNA monolayers, which is constrained by a limited persistence length of ~50 nm of dsDNA. We highlight the spin-filtering capabilities of four representative DNA origami-based chiral monolayers, demonstrating a maximal one-order-of-magnitude increase in spin-filtering efficiency per unit area compared to conventional dsDNA chiral monolayers. Intriguingly, our findings reveal that the higher-order, tertiary, chiral structure of twisted DNA origami further enhances the spin-filtering efficiency. This work paves the way for the rational design of DNA chiral monolayers.

Submitted 13 October, 2023; originally announced October 2023.

arXiv:2310.08921 [pdf, other]

Feature Proliferation -- the "Cancer" in StyleGAN and its Treatments

Authors: Shuang Song, Yuanbang Liang, Jing Wu, Yu-Kun Lai, Yipeng Qin

Abstract: Despite the success of StyleGAN in image synthesis, the images it synthesizes are not always perfect and the well-known truncation trick has become a standard post-processing technique for StyleGAN to synthesize high-quality images. Although effective, it has long been noted that the truncation trick tends to reduce the diversity of synthesized images and unnecessarily sacrifices many distinct image features. To address this issue, in this paper, we first delve into the StyleGAN image synthesis mechanism and discover an important phenomenon, namely Feature Proliferation, which demonstrates how specific features reproduce with forward propagation. Then, we show how the occurrence of Feature Proliferation results in StyleGAN image artifacts. As an analogy, we refer to it as the "cancer" in StyleGAN from its proliferating and malignant nature. Finally, we propose a novel feature rescaling method that identifies and modulates risky features to mitigate feature proliferation. Thanks to our discovery of Feature Proliferation, the proposed feature rescaling method is less destructive and retains more useful image features than the truncation trick, as it is more fine-grained and works in a lower-level feature space rather than a high-level latent space. Experimental results justify the validity of our claims and the effectiveness of the proposed feature rescaling method. Our code is available at https://github.com/songc42/Feature-proliferation.

Submitted 13 October, 2023; originally announced October 2023.

Comments: Accepted at ICCV 2023

arXiv:2310.08864 [pdf, other]

Open X-Embodiment: Robotic Learning Datasets and RT-X Models

Authors: Open X-Embodiment Collaboration, Abby O'Neill, Abdul Rehman, Abhinav Gupta, Abhiram Maddukuri, Abhishek Gupta, Abhishek Padalkar, Abraham Lee, Acorn Pooley, Agrim Gupta, Ajay Mandlekar, Ajinkya Jain, Albert Tung, Alex Bewley, Alex Herzog, Alex Irpan, Alexander Khazatsky, Anant Rai, Anchit Gupta, Andrew Wang, Andrey Kolobov, Anikait Singh, Animesh Garg, Aniruddha Kembhavi, Annie Xie , et al. (267 additional authors not shown)

Abstract: Large, high-capacity models trained on diverse datasets have shown remarkable successes on efficiently tackling downstream applications. In domains from NLP to Computer Vision, this has led to a consolidation of pretrained models, with general pretrained backbones serving as a starting point for many applications. Can such a consolidation happen in robotics? Conventionally, robotic learning methods train a separate model for every application, every robot, and even every environment. Can we instead train generalist X-robot policy that can be adapted efficiently to new robots, tasks, and environments? In this paper, we provide datasets in standardized data formats and models to make it possible to explore this possibility in the context of robotic manipulation, alongside experimental results that provide an example of effective X-robot policies. We assemble a dataset from 22 different robots collected through a collaboration between 21 institutions, demonstrating 527 skills (160266 tasks). We show that a high-capacity model trained on this data, which we call RT-X, exhibits positive transfer and improves the capabilities of multiple robots by leveraging experience from other platforms. More details can be found on the project website https://robotics-transformer-x.github.io.

Submitted 1 June, 2024; v1 submitted 13 October, 2023; originally announced October 2023.

Comments: Project website: https://robotics-transformer-x.github.io

arXiv:2310.07592 [pdf, other]

Transformers for Green Semantic Communication: Less Energy, More Semantics

Authors: Shubhabrata Mukherjee, Cory Beard, Sejun Song

Abstract: Semantic communication aims to transmit meaningful and effective information, rather than focusing on individual symbols or bits. This results in benefits like reduced latency, bandwidth usage, and higher throughput compared with traditional communication. However, semantic communication poses significant challenges due to the need for universal metrics to benchmark the joint effects of semantic information loss and practical energy consumption. This research presents a novel multi-objective loss function named "Energy-Optimized Semantic Loss" (EOSL), addressing the challenge of balancing semantic information loss and energy consumption. Through comprehensive experiments on transformer models, including CPU and GPU energy usage, it is demonstrated that EOSL-based encoder model selection can save up to 90% of energy while achieving a 44% improvement in semantic similarity performance during inference in this experiment. This work paves the way for energy-efficient neural network selection and the development of greener semantic communication architectures.

Submitted 18 February, 2024; v1 submitted 11 October, 2023; originally announced October 2023.

Comments: First revision, Version 2

arXiv:2310.05864 [pdf, other]

doi 10.1088/1748-0221/19/02/P02010

Helium identification with LHCb

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, A. Alfonso Albero, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis , et al. (1079 additional authors not shown)

Abstract: The identification of helium nuclei at LHCb is achieved using a method based on measurements of ionisation losses in the silicon sensors and timing measurements in the Outer Tracker drift tubes. The background from photon conversions is reduced using the RICH detectors and an isolation requirement. The method is developed using $pp$ collision data at $\sqrt{s}=13\,{\rm TeV}$ recorded by the LHCb experiment in the years 2016 to 2018, corresponding to an integrated luminosity of $5.5\,{\rm fb}^{-1}$. A total of around $10^5$ helium and antihelium candidates are identified with negligible background contamination. The helium identification efficiency is estimated to be approximately $50\%$ with a corresponding background rejection rate of up to $\mathcal O(10^{12})$. These results demonstrate the feasibility of a rich programme of measurements of QCD and astrophysics interest involving light nuclei.

Submitted 6 February, 2024; v1 submitted 9 October, 2023; originally announced October 2023.

Comments: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-DP-2023-002.html (LHCb public pages)

Report number: CERN-EP-2023-227, LHCb-DP-2023-002

Journal ref: JINST 19 (2024) P02010

arXiv:2310.04610 [pdf, other]

DeepSpeed4Science Initiative: Enabling Large-Scale Scientific Discovery through Sophisticated AI System Technologies

Authors: Shuaiwen Leon Song, Bonnie Kruft, Minjia Zhang, Conglong Li, Shiyang Chen, Chengming Zhang, Masahiro Tanaka, Xiaoxia Wu, Jeff Rasley, Ammar Ahmad Awan, Connor Holmes, Martin Cai, Adam Ghanem, Zhongzhu Zhou, Yuxiong He, Pete Luferenko, Divya Kumar, Jonathan Weyn, Ruixiong Zhang, Sylwester Klocek, Volodymyr Vragov, Mohammed AlQuraishi, Gustaf Ahdritz, Christina Floristean, Cristina Negri , et al. (67 additional authors not shown)

Abstract: In the upcoming decade, deep learning may revolutionize the natural sciences, enhancing our capacity to model and predict natural occurrences. This could herald a new era of scientific exploration, bringing significant advancements across sectors from drug development to renewable energy. To answer this call, we present DeepSpeed4Science initiative (deepspeed4science.ai) which aims to build unique capabilities through AI system technology innovations to help domain experts to unlock today's biggest science mysteries. By leveraging DeepSpeed's current technology pillars (training, inference and compression) as base technology enablers, DeepSpeed4Science will create a new set of AI system technologies tailored for accelerating scientific discoveries by addressing their unique complexity beyond the common technical approaches used for accelerating generic large language models (LLMs). In this paper, we showcase the early progress we made with DeepSpeed4Science in addressing two of the critical system challenges in structural biology research.

Submitted 11 October, 2023; v1 submitted 6 October, 2023; originally announced October 2023.

arXiv:2310.04440 [pdf, other]

Facilitating Battery Swapping Services for Freight Trucks with Spatial-Temporal Demand Prediction

Authors: Linyu Liu, Zhen Dai, Shiji Song, Xiaocheng Li, Guanting Chen

Abstract: Electrifying heavy-duty trucks offers a substantial opportunity to curtail carbon emissions, advancing toward a carbon-neutral future. However, the inherent challenges of limited battery energy and the sheer weight of heavy-duty trucks lead to reduced mileage and prolonged charging durations. Consequently, battery-swapping services emerge as an attractive solution for these trucks. This paper employs a two-fold approach to investigate the potential and enhance the efficacy of such services. Firstly, spatial-temporal demand prediction models are adopted to predict the traffic patterns for the upcoming hours. Subsequently, the prediction guides an optimization module for efficient battery allocation and deployment. Analyzing the heavy-duty truck data on a highway network spanning over 2,500 miles, our model and analysis underscore the value of prediction/machine learning in facilitating future decision-makings. In particular, we find that the initial phase of implementing battery-swapping services favors mobile battery-swapping stations, but as the system matures, fixed-location stations are preferred.

Submitted 23 May, 2024; v1 submitted 1 October, 2023; originally announced October 2023.

Comments: 9 pages, 6 figures

MSC Class: 90B06; 68T07

arXiv:2310.04411 [pdf, other]

Understanding, Predicting and Better Resolving Q-Value Divergence in Offline-RL

Authors: Yang Yue, Rui Lu, Bingyi Kang, Shiji Song, Gao Huang

Abstract: The divergence of the Q-value estimation has been a prominent issue in offline RL, where the agent has no access to real dynamics. Traditional beliefs attribute this instability to querying out-of-distribution actions when bootstrapping value targets. Though this issue can be alleviated with policy constraints or conservative Q estimation, a theoretical understanding of the underlying mechanism causing the divergence has been absent. In this work, we aim to thoroughly comprehend this mechanism and attain an improved solution. We first identify a fundamental pattern, self-excitation, as the primary cause of Q-value estimation divergence in offline RL. Then, we propose a novel Self-Excite Eigenvalue Measure (SEEM) metric based on Neural Tangent Kernel (NTK) to measure the evolving property of Q-network at training, which provides an intriguing explanation of the emergence of divergence. For the first time, our theory can reliably decide whether the training will diverge at an early stage, and even predict the order of the growth for the estimated Q-value, the model's norm, and the crashing step when an SGD optimizer is used. The experiments demonstrate perfect alignment with this theoretic analysis. Building on our insights, we propose to resolve divergence from a novel perspective, namely improving the model's architecture for better extrapolating behavior. Through extensive empirical studies, we identify LayerNorm as a good solution to effectively avoid divergence without introducing detrimental bias, leading to superior performance. Experimental results prove that it can still work in some most challenging settings, i.e. using only 1% transitions of the dataset, where all previous methods fail. Moreover, it can be easily plugged into modern offline RL methods and achieve SOTA results on many challenging tasks. We also give unique insights into its effectiveness.

Submitted 7 November, 2023; v1 submitted 6 October, 2023; originally announced October 2023.

Comments: 31 pages, 20 figures

Journal ref: NeurIPS 2023

arXiv:2310.04277 [pdf, other]

doi 10.1007/JHEP12(2023)013

Measurement of the CKM angle $γ$ using the $B^{\pm}\rightarrow D^{*} h^{\pm}$ channels

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, B. Adeva, M. Adinolfi, P. Adlarson, H. Afsharnia, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, A. Alfonso Albero, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey , et al. (1076 additional authors not shown)

Abstract: A measurement of the $CP$-violating observables from $B^{\pm}\rightarrow D^* K^{\pm}$ and $B^{\pm}\rightarrow D^* π^{\pm}$ decays is presented, where $D^* (D) $ is an admixture of $D^{*0}$ and $\bar{D}^{*0}$ ($D^0$ and $\bar{D}^0$) states and is reconstructed through the decay chains $ D^* \rightarrow Dπ^0/γ$ and $D \to K_S^0 π^+π^-/K_S^0 K^+K^-$. The measurement is performed by analysing the signal yield variation across the $D$ decay phase space and is independent of any amplitude model. The data sample used was collected by the LHCb experiment in proton-proton collisions and corresponds to a total integrated luminosity of 9 fb$^{-1}$ at centre-of-mass energies of 7, 8 and 13 TeV. The CKM angle $γ$ is determined to be $(69^{+13}_{-14})^{\circ}$ using the measured $CP$-violating observables. The hadronic parameters $r^{D^* K^{\pm}}_B, r^{D^* π^{\pm}}_B, δ^{D^* K^{\pm}}_B, δ^{D^* π^{\pm}}_B$, which are the ratios and strong phase differences between favoured and suppressed $B^{\pm}$ decays, are also reported.

Submitted 8 April, 2024; v1 submitted 6 October, 2023; originally announced October 2023.

Comments: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2023-012.html (LHCb public pages)

Report number: LHCb-PAPER-2023-012, CERN-EP-2023-170

Journal ref: JHEP12(2023)013

arXiv:2310.03889 [pdf, ps, other]

doi 10.1109/LSP.2023.3319233

Audio Event-Relational Graph Representation Learning for Acoustic Scene Classification

Authors: Yuanbo Hou, Siyang Song, Chuang Yu, Wenwu Wang, Dick Botteldooren

Abstract: Most deep learning-based acoustic scene classification (ASC) approaches identify scenes based on acoustic features converted from audio clips containing mixed information entangled by polyphonic audio events (AEs). However, these approaches have difficulties in explaining what cues they use to identify scenes. This paper conducts the first study on disclosing the relationship between real-life acoustic scenes and semantic embeddings from the most relevant AEs. Specifically, we propose an event-relational graph representation learning (ERGL) framework for ASC to classify scenes, and simultaneously answer clearly and straightly which cues are used in classifying. In the event-relational graph, embeddings of each event are treated as nodes, while relationship cues derived from each pair of nodes are described by multi-dimensional edge features. Experiments on a real-life ASC dataset show that the proposed ERGL achieves competitive performance on ASC by learning embeddings of only a limited number of AEs. The results show the feasibility of recognizing diverse acoustic scenes based on the audio event-relational graph. Visualizations of graph representations learned by ERGL are available here (https://github.com/Yuanbo2020/ERGL).

Submitted 5 October, 2023; originally announced October 2023.

Comments: IEEE Signal Processing Letters, doi: 10.1109/LSP.2023.3319233

arXiv:2310.01320 [pdf, other]

Avalon's Game of Thoughts: Battle Against Deception through Recursive Contemplation

Authors: Shenzhi Wang, Chang Liu, Zilong Zheng, Siyuan Qi, Shuo Chen, Qisen Yang, Andrew Zhao, Chaofei Wang, Shiji Song, Gao Huang

Abstract: Recent breakthroughs in large language models (LLMs) have brought remarkable success in the field of LLM-as-Agent. Nevertheless, a prevalent assumption is that the information processed by LLMs is consistently honest, neglecting the pervasive deceptive or misleading information in human society and AI-generated content. This oversight makes LLMs susceptible to malicious manipulations, potentially resulting in detrimental outcomes. This study utilizes the intricate Avalon game as a testbed to explore LLMs' potential in deceptive environments. Avalon, full of misinformation and requiring sophisticated logic, manifests as a "Game-of-Thoughts". Inspired by the efficacy of humans' recursive thinking and perspective-taking in the Avalon game, we introduce a novel framework, Recursive Contemplation (ReCon), to enhance LLMs' ability to identify and counteract deceptive information. ReCon combines formulation and refinement contemplation processes; formulation contemplation produces initial thoughts and speech, while refinement contemplation further polishes them. Additionally, we incorporate first-order and second-order perspective transitions into these processes respectively. Specifically, the first-order allows an LLM agent to infer others' mental states, and the second-order involves understanding how others perceive the agent's mental state. After integrating ReCon with different LLMs, extensive experiment results from the Avalon game indicate its efficacy in aiding LLMs to discern and maneuver around deceptive information without extra fine-tuning and data.
Finally, we offer a possible explanation for the efficacy of ReCon and explore the current limitations of LLMs in terms of safety, reasoning, speaking style, and format, potentially furnishing insights for subsequent research.

Submitted 24 October, 2023; v1 submitted 2 October, 2023; originally announced October 2023.

Comments: 40 pages

arXiv:2309.15753 [pdf]

Optical detection of bond-dependent and frustrated spin in the two-dimensional cobalt-based honeycomb antiferromagnet Cu3Co2SbO6

Authors: Baekjune Kang, Uksam Choi, Taek Sun Jung, Seunghyeon Noh, Gye-Hyeon Kim, UiHyeon Seo, Miju Park, **-Hyun Choi, Minjae Kim, GwangCheol Ji, Sehwan Song, Hyesung Jo, Seokjo Hong, Nguyen Xuan Duong, Tae Heon Kim, Yongsoo Yang, Sungkyun Park, Jong Mok Ok, Jung-Woo Yoo, Jae Hoon Kim, Changhee Sohn

Abstract: Two-dimensional honeycomb antiferromagnet becomes an important class of materials as it can provide a route to Kitaev quantum spin liquid, characterized by massive quantum entanglement and fractional excitations. The signatures of its proximity to Kitaev quantum spin liquid in the honeycomb antiferromagnet includes anisotropic bond-dependent magnetic responses and persistent fluctuation by frustration in paramagnetic regime. Here, we propose Cu3Co2SbO6 heterostructures as an intriguing honeycomb antiferromagnet for quantum spin liquid, wherein bond-dependent and frustrated spins interact with optical excitons. This system exhibits antiferromagnetism at 16 K with different spin-flip magnetic fields between a bond-parallel and bond-perpendicular directions, aligning more closely with the generalized Heisenberg-Kitaev than the XXZ model. Optical spectroscopy reveals a strong excitonic transition coupled to the antiferromagnetism, enabling optical detection of its spin states. Particularly, such spin-exciton coupling presents anisotropic responses between bond-parallel and bond-perpendicular magnetic field as well as a finite spin-spin correlation function around 40 K, higher than twice its Néel temperature. The characteristic temperature that remains barely changed even under strong magnetic fields highlights the robustness of the spin-fluctuation region. Our results demonstrate Cu3Co2SbO6 as a unique candidate for the quantum spin liquid phase, where the spin Hamiltonian and quasiparticle excitations can be probed and potentially controlled by light.

Submitted 27 September, 2023; originally announced September 2023.

arXiv:2309.15433 [pdf, other]

Cardinality Estimation of Subgraph Matching: A Filtering-Sampling Approach

Authors: Wonseok Shin, Siwoo Song, Kunsoo Park, Wook-Shin Han

Abstract: Subgraph counting is a fundamental problem in understanding and analyzing graph structured data, yet computationally challenging. This calls for an accurate and efficient algorithm for Subgraph Cardinality Estimation, which is to estimate the number of all isomorphic embeddings of a query graph in a data graph. We present FaSTest, a novel algorithm that combines (1) a powerful filtering technique to significantly reduce the sample space, (2) an adaptive tree sampling algorithm for accurate and efficient estimation, and (3) a worst-case optimal stratified graph sampling algorithm for difficult instances. Extensive experiments on real-world datasets show that FaSTest outperforms state-of-the-art sampling-based methods by up to two orders of magnitude and GNN-based methods by up to three orders of magnitude in terms of accuracy.

Submitted 15 April, 2024; v1 submitted 27 September, 2023; originally announced September 2023.

arXiv:2309.14626 [pdf]

On Hyperelastic Crease

Authors: Siyuan Song, Mrityunjay Kothari, Kyung-Suk Kim

Abstract: We present analyses of crease-formation and stability criteria for incompressible hyperelastic solids. A generic singular perturbation over a laterally compressed half-space creates a far-field eigenmode of three energy-release angular sectors separated by two energy-elevating sectors of incremental deformation. The far-field eigenmode braces the energy-release field of the surface flaw against the transition to a self-similar crease field, and the braced-incremental-deformation (bid) field has a unique shape factor that determines the creasing stability. The shape factor, which is identified by two conservation integrals that represent a subsurface dislocation in the tangential manifold, is a monotonically increasing function of compressive strain. For Neo-Hookean material, when the shape factor is below unity, the bid field is configurationally stable. When the compressive strain is 0.356, the shape factor becomes unity, and the bid field undergoes a higher-order transition to a crease field. At the crease-limit point, we have two asymptotic solutions of the crease-tip folding field and the leading-order far field with two scaling parameters, the ratio of which is determined by matched asymptotes. Our analyses show that the surface is stable against singular perturbation up to the crease limit point and becomes unstable beyond the limit. However, the flat state is metastable against a regular perturbation between the crease limit point and wrinkle critical point, which is a first-order instability point.
We introduced a novel finite element method for simulating the bid field with a finite domain size. For Gent model, the strain-stiffening alters the shape factor dependence on the compressive strain, raising crease resistance. The new findings in crease mechanisms will help study ruga mechanics of self-organization and design soft-material structures for high crease resistance.

Submitted 25 September, 2023; originally announced September 2023.

Comments: 42 pages, 6 figures

arXiv:2309.14509 [pdf, other]

DeepSpeed Ulysses: System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models

Authors: Sam Ade Jacobs, Masahiro Tanaka, Chengming Zhang, Minjia Zhang, Shuaiwen Leon Song, Samyam Rajbhandari, Yuxiong He

Abstract: Computation in a typical Transformer-based large language model (LLM) can be characterized by batch size, hidden dimension, number of layers, and sequence length. Until now, system works for accelerating LLM training have focused on the first three dimensions: data parallelism for batch size, tensor parallelism for hidden size and pipeline parallelism for model depth or layers. These widely studied forms of parallelism are not targeted or optimized for long sequence Transformer models. Given practical application needs for long sequence LLM, renewed attentions are being drawn to sequence parallelism. However, existing works in sequence parallelism are constrained by memory-communication inefficiency, limiting their scalability to long sequence large models. In this work, we introduce DeepSpeed-Ulysses, a novel, portable and effective methodology for enabling highly efficient and scalable LLM training with extremely long sequence length. DeepSpeed-Ulysses at its core partitions input data along the sequence dimension and employs an efficient all-to-all collective communication for attention computation. Theoretical communication analysis shows that whereas other methods incur communication overhead as sequence length increases, DeepSpeed-Ulysses maintains constant communication volume when sequence length and compute devices are increased proportionally. Furthermore, experimental evaluations show that DeepSpeed-Ulysses trains 2.5x faster with 4x longer sequence length than the existing method SOTA baseline.

Submitted 4 October, 2023; v1 submitted 25 September, 2023; originally announced September 2023.

arXiv:2309.12783 [pdf, ps, other]

Multi-objective Optimization of Space-Air-Ground Integrated Network Slicing Relying on a Pair of Central and Distributed Learning Algorithms

Authors: Guorong Zhou, Liqiang Zhao, Gan Zheng, Shenghui Song, Jiankang Zhang, Lajos Hanzo

Abstract: As an attractive enabling technology for next-generation wireless communications, network slicing supports diverse customized services in the global space-air-ground integrated network (SAGIN) with diverse resource constraints. In this paper, we dynamically consider three typical classes of radio access network (RAN) slices, namely high-throughput slices, low-delay slices and wide-coverage slices, under the same underlying physical SAGIN. The throughput, the service delay and the coverage area of these three classes of RAN slices are jointly optimized in a non-scalar form by considering the distinct channel features and service advantages of the terrestrial, aerial and satellite components of SAGINs. A joint central and distributed multi-agent deep deterministic policy gradient (CDMADDPG) algorithm is proposed for solving the above problem to obtain the Pareto optimal solutions. The algorithm first determines the optimal virtual unmanned aerial vehicle (vUAV) positions and the inter-slice sub-channel and power sharing by relying on a centralized unit. Then it optimizes the intra-slice sub-channel and power allocation, and the virtual base station (vBS)/vUAV/virtual low earth orbit (vLEO) satellite deployment in support of three classes of slices by three separate distributed units. Simulation results verify that the proposed method approaches the Pareto-optimal exploitation of multiple RAN slices, and outperforms the benchmarkers.

Submitted 22 September, 2023; originally announced September 2023.

Comments: 19 pages, 14 figures, journal

arXiv:2309.11235 [pdf, other]

OpenChat: Advancing Open-source Language Models with Mixed-Quality Data

Authors: Guan Wang, Sijie Cheng, Xianyuan Zhan, Xiangang Li, Sen Song, Yang Liu

Abstract: Nowadays, open-source large language models like LLaMA have emerged. Recent developments have incorporated supervised fine-tuning (SFT) and reinforcement learning fine-tuning (RLFT) to align these models with human goals. However, SFT methods treat all training data with mixed quality equally, while RLFT methods require high-quality pairwise or ranking-based preference data. In this study, we present a novel framework, named OpenChat, to advance open-source language models with mixed-quality data. Specifically, we consider the general SFT training data, consisting of a small amount of expert data mixed with a large proportion of sub-optimal data, without any preference labels. We propose the C(onditioned)-RLFT, which regards different data sources as coarse-grained reward labels and learns a class-conditioned policy to leverage complementary data quality information. Interestingly, the optimal policy in C-RLFT can be easily solved through single-stage, RL-free supervised learning, which is lightweight and avoids costly human preference labeling. Through extensive experiments on three standard benchmarks, our openchat-13b fine-tuned with C-RLFT achieves the highest average performance among all 13b open-source language models. Moreover, we use AGIEval to validate the model generalization performance, in which only openchat-13b surpasses the base model. Finally, we conduct a series of analyses to shed light on the effectiveness and robustness of OpenChat.
Our code, data, and models are publicly available at https://github.com/imoneoi/openchat and https://huggingface.co/openchat.

Submitted 16 March, 2024; v1 submitted 20 September, 2023; originally announced September 2023.

arXiv:2309.10285 [pdf, other]

Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity

Authors: Haojun Xia, Zhen Zheng, Yuchao Li, Donglin Zhuang, Zhongzhu Zhou, Xiafei Qiu, Yong Li, Wei Lin, Shuaiwen Leon Song

Abstract: With the fast growth of parameter size, it becomes increasingly challenging to deploy large generative models as they typically require large GPU memory consumption and massive computation. Unstructured model pruning has been a common approach to reduce both GPU memory footprint and the overall computation while retaining good model accuracy. However, the existing solutions do not provide a highly-efficient support for handling unstructured sparsity on modern GPUs, especially on the highly-structured Tensor Core hardware. Therefore, we propose Flash-LLM for enabling low-cost and highly-efficient large generative model inference with the sophisticated support of unstructured sparsity on high-performance but highly restrictive Tensor Cores. Based on our key observation that the main bottleneck of generative model inference is the several skinny matrix multiplications for which Tensor Cores would be significantly under-utilized due to low computational intensity, we propose a general Load-as-Sparse and Compute-as-Dense methodology for unstructured sparse matrix multiplication. The basic insight is to address the significant memory bandwidth bottleneck while tolerating redundant computations that are not critical for end-to-end performance on Tensor Cores. Based on this, we design an effective software framework for Tensor Core based unstructured SpMM, leveraging on-chip resources for efficient sparse data extraction and computation/memory-access overlapping.
At SpMM kernel level, Flash-LLM significantly outperforms the state-of-the-art library, i.e., Sputnik and SparTA by an average of 2.9x and 1.5x, respectively. At end-to-end framework level on OPT-30B/66B/175B models, for tokens per GPU-second, Flash-LLM achieves up to 3.8x and 3.6x improvement over DeepSpeed and FasterTransformer, respectively, with significantly lower inference cost.

Submitted 18 September, 2023; originally announced September 2023.

Comments: VLDB 2024

arXiv:2309.10065 [pdf, other]

Bayesian longitudinal tensor response regression for modeling neuroplasticity

Authors: Suprateek Kundu, Alec Reinhardt, Serena Song, Joo Han, M. Lawson Meadows, Bruce Crosson, Venkatagiri Krishnamurthy

Abstract: A major interest in longitudinal neuroimaging studies involves investigating voxel-level neuroplasticity due to treatment and other factors across visits. However, traditional voxel-wise methods are beset with several pitfalls, which can compromise the accuracy of these approaches. We propose a novel Bayesian tensor response regression approach for longitudinal imaging data, which pools information across spatially-distributed voxels to infer significant changes while adjusting for covariates. The proposed method, which is implemented using Markov chain Monte Carlo (MCMC) sampling, utilizes low-rank decomposition to reduce dimensionality and preserve spatial configurations of voxels when estimating coefficients. It also enables feature selection via joint credible regions which respect the shape of the posterior distributions for more accurate inference. In addition to group level inferences, the method is able to infer individual-level neuroplasticity, allowing for examination of personalized disease or recovery trajectories. The advantages of the proposed approach in terms of prediction and feature selection over voxel-wise regression are highlighted via extensive simulation studies. Subsequently, we apply the approach to a longitudinal Aphasia dataset consisting of task functional MRI images from a group of subjects who were administered either a control intervention or intention treatment at baseline and were followed up over subsequent visits.
Our analysis revealed that while the control therapy showed long-term increases in brain activity, the intention treatment produced predominantly short-term changes, both of which were concentrated in distinct localized regions. In contrast, the voxel-wise regression failed to detect any significant neuroplasticity after multiplicity adjustments, which is biologically implausible and implies lack of power.

Submitted 18 October, 2023; v1 submitted 12 September, 2023; originally announced September 2023.

Comments: 28 pages, 8 figures, 6 tables

arXiv:2309.09728 [pdf, other]

doi 10.1103/PhysRevLett.132.021801

Measurement of CP violation in $B^0\toψ(\to\ell^+\ell^-)K^0_S(\toπ^+π^-)$ decays

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, B. Adeva, M. Adinolfi, P. Adlarson, H. Afsharnia, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, A. Alfonso Albero, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey , et al. (1080 additional authors not shown)

Abstract: A measurement of time-dependent CP violation in the decays of $B^0$ and $\overline{B}^0$ mesons to the final states $J/ψ(\toμ^+μ^-)K^0_S$, $ψ(2S)(\toμ^+μ^-)K^0_S$ and $J/ψ(\to e^+e^-)K^0_S$ with $K^0_S\toπ^+π^-$ is presented. The data correspond to an integrated luminosity of 6 fb${}^{-1}$ collected at a center-of-mass energy of $\sqrt{s}=13$ TeV with the LHCb detector. The CP-violation parameters are measured to be \begin{align*} S_{ψK^0_S} &= 0.717 \pm 0.013 (\text{stat}) \pm 0.008 (\text{syst}), \\ C_{ψK^0_S} &= 0.008 \pm 0.012 (\text{stat}) \pm 0.003 (\text{syst}). \end{align*} This measurement of $S_{ψK^0_S}$ represents the most precise single measurement of the CKM angle $β$ to date and is more precise than the current world average. In addition, measurements of the CP-violation parameters of the individual channels are reported and a combination with the LHCb Run 1 measurements is performed.

Submitted 9 January, 2024; v1 submitted 18 September, 2023; originally announced September 2023.

Comments: All figures and tables, along with machine-readable versions and any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2023-013.html (LHCb public pages)

Report number: LHCb-PAPER-2023-013, CERN-EP-2023-177

Journal ref: Phys. Rev. Lett. 132 (2024) 021801

arXiv:2309.09575 [pdf, other]

AI-Native Transceiver Design for Near-Field Ultra-Massive MIMO: Principles and Techniques

Authors: Wentao Yu, Yifan Ma, Hengtao He, Shenghui Song, Jun Zhang, Khaled B. Letaief

Abstract: Ultra-massive multiple-input multiple-output (UM-MIMO) is a cutting-edge technology that promises to revolutionize wireless networks by providing an unprecedentedly high spectral and energy efficiency. The enlarged array aperture of UM-MIMO facilitates the accessibility of the near-field region, thereby offering a novel degree of freedom for communications and sensing. Nevertheless, the transceiver design for such systems is challenging because of the enormous system scale, the complicated channel characteristics, and the uncertainties of the propagation environments. Hence, it is critical to study scalable, low-complexity, and robust algorithms that can efficiently characterize and leverage the properties of the near-field channel. In this article, we advocate two general frameworks from an artificial intelligence (AI)-native perspective to design iterative and noniterative algorithms for the near-field UM-MIMO transceivers, respectively. Near-field beam focusing and channel estimation are presented as two tutorial-style examples to demonstrate the significant advantages of the proposed AI-native frameworks in terms of various key performance indicators.

Submitted 3 January, 2024; v1 submitted 18 September, 2023; originally announced September 2023.

Comments: 7 pages, 3 figures, 2 tables, magazine manuscript, submitted to IEEE for possible publication

arXiv:2309.08524 [pdf, other]

Multi-orbital Kondo screening in strongly correlated polyradical nanographenes

Authors: Aitor Calvo-Fernández, Diego Soler-Polo, Andrés Pinar Solé, Shaotang Song, Oleksander Stetsovych, Manish Kumar, Guangwu Li, Jishan Wu, Jiong Lu, Asier Eiguren, María Blanco-Rey, Pavel Jelínek

Abstract: We discuss coexistence of Kondo and spin excitation signals in tunneling spectroscopy in strongly correlated polyradical $π$-magnetic nanographenes on a metal surface. The Kondo signal is rationalized by a multi-orbital Kondo screening of the unpaired electrons. The fundamental processes are spin-flips of antiferromagnetic (AFM) order involving charged molecular multiplets. We introduce a perturbative model, which provides simple rules to identify the presence of AFM channels responsible for Kondo screening. The Kondo regime is confirmed by numerical renormalization group calculations. This framework can be applied to similar strongly correlated open-shell systems.

Submitted 15 September, 2023; originally announced September 2023.

Comments: 43 pages (14 in main, 29 in SM), 20 figures (3 in main, 17 in SM)

arXiv:2309.07796 [pdf, other]

For A More Comprehensive Evaluation of 6DoF Object Pose Tracking

Authors: Yang Li, Fan Zhong, Xin Wang, Shuangbing Song, Jiachen Li, Xueying Qin, Changhe Tu

Abstract: Previous evaluations on 6DoF object pose tracking have presented obvious limitations along with the development of this area. In particular, the evaluation protocols are not unified for different methods, the widely-used YCBV dataset contains significant annotation error, and the error metrics also may be biased. As a result, it is hard to fairly compare the methods, which has become a big obstacle for developing new algorithms. In this paper we contribute a unified benchmark to address the above problems. For more accurate annotation of YCBV, we propose a multi-view multi-object global pose refinement method, which can jointly refine the poses of all objects and view cameras, resulting in sub-pixel sub-millimeter alignment errors. The limitations of previous scoring methods and error metrics are analyzed, based on which we introduce our improved evaluation methods. The unified benchmark takes both YCBV and BCOT as base datasets, which are shown to be complementary in scene categories. In experiments, we validate the precision and reliability of the proposed global pose refinement method with a realistic semi-synthesized dataset particularly for YCBV, and then present the benchmark results unifying learning&non-learning and RGB&RGBD methods, with some finds not discovered in previous studies.

Submitted 14 September, 2023; v1 submitted 14 September, 2023; originally announced September 2023.

arXiv:2309.05514 [pdf, other]

doi 10.1140/epjc/s10052-023-12376-z

Measurement of the CKM angle $γ$ in the $B^0 \to DK^{*0}$ channel using self-conjugate $D \to K_S^0 h^+ h^-$ decays

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, B. Adeva, M. Adinolfi, P. Adlarson, H. Afsharnia, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, A. Alfonso Albero, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey , et al. (1055 additional authors not shown)

Abstract: A model-independent study of CP violation in $B^0 \to DK^{*0}$ decays is presented using data corresponding to an integrated luminosity of 9fb$^{-1}$ collected by the LHCb experiment at centre-of-mass energies of $\sqrt{s}=7, \, 8$ and $13$TeV. The CKM angle $γ$ is determined by examining the distributions of signal decays in phase-space bins of the self-conjugate $D \to K_S^0 h^+ h^-$ decays, where $h = π, K$. Observables related to CP violation are measured and the angle $γ$ is determined to be $γ=(49^{+ 22}_{-19})^\circ$. Measurements of the amplitude ratio and strong-phase difference between the favoured and suppressed $B^0$ decays are also presented.

Submitted 21 April, 2024; v1 submitted 11 September, 2023; originally announced September 2023.

Comments: All figures and tables, along with machine-readable versions and any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2023-009.html (LHCb public pages)

Report number: LHCb-PAPER-2023-009, CERN-EP-2023-138

Journal ref: Eur. Phys. J. C 84 (2024) 206

arXiv:2309.01788 [pdf, other]

Hierarchical Grammar-Induced Geometry for Data-Efficient Molecular Property Prediction

Authors: Minghao Guo, Veronika Thost, Samuel W Song, Adithya Balachandran, Payel Das, Jie Chen, Wojciech Matusik

Abstract: The prediction of molecular properties is a crucial task in the field of material and drug discovery. The potential benefits of using deep learning techniques are reflected in the wealth of recent literature. Still, these techniques are faced with a common challenge in practice: Labeled data are limited by the cost of manual extraction from literature and laborious experimentation. In this work, we propose a data-efficient property predictor by utilizing a learnable hierarchical molecular grammar that can generate molecules from grammar production rules. Such a grammar induces an explicit geometry of the space of molecular graphs, which provides an informative prior on molecular structural similarity. The property prediction is performed using graph neural diffusion over the grammar-induced geometry. On both small and large datasets, our evaluation shows that this approach outperforms a wide spectrum of baselines, including supervised and pre-trained graph neural networks. We include a detailed ablation study and further analysis of our solution, showing its effectiveness in cases with extremely limited data. Code is available at https://github.com/gmh14/Geo-DEG.

Submitted 4 September, 2023; originally announced September 2023.

Comments: 22 pages, 10 figures; ICML 2023

arXiv:2309.01458 [pdf, other]

Leveraging Reward Consistency for Interpretable Feature Discovery in Reinforcement Learning

Authors: Qisen Yang, Huanqian Wang, Mukun Tong, Wenjie Shi, Gao Huang, Shiji Song

Abstract: The black-box nature of deep reinforcement learning (RL) hinders its use in real-world applications. Therefore, interpreting and explaining RL agents have been active research topics in recent years. Existing methods for post-hoc explanations usually adopt the action matching principle to enable an easy understanding of vision-based RL agents. In this paper, it is argued that the commonly used action matching principle is more like an explanation of deep neural networks (DNNs) than the interpretation of RL agents. It may lead to irrelevant or misplaced feature attribution when different DNNs' outputs lead to the same rewards or different rewards result from the same outputs. Therefore, we propose to consider rewards, the essential objective of RL agents, as the essential objective of interpreting RL agents as well. To ensure reward consistency during interpretable feature discovery, a novel framework (RL interpreting RL, denoted as RL-in-RL) is proposed to solve the gradient disconnection from actions to rewards. We verify and evaluate our method on the Atari 2600 games as well as Duckietown, a challenging self-driving car simulator environment. The results show that our method manages to keep reward (or return) consistency and achieves high-quality feature attribution. Further, a series of analytical experiments validate our assumption of the action matching principle's limitations.

Submitted 4 September, 2023; originally announced September 2023.

arXiv:2309.01448 [pdf, other]

Hundreds Guide Millions: Adaptive Offline Reinforcement Learning with Expert Guidance

Authors: Qisen Yang, Shenzhi Wang, Qihang Zhang, Gao Huang, Shiji Song

Abstract: Offline reinforcement learning (RL) optimizes the policy on a previously collected dataset without any interactions with the environment, yet usually suffers from the distributional shift problem. To mitigate this issue, a typical solution is to impose a policy constraint on a policy improvement objective. However, existing methods generally adopt a "one-size-fits-all" practice, i.e., keeping only a single improvement-constraint balance for all the samples in a mini-batch or even the entire offline dataset. In this work, we argue that different samples should be treated with different policy constraint intensities. Based on this idea, a novel plug-in approach named Guided Offline RL (GORL) is proposed. GORL employs a guiding network, along with only a few expert demonstrations, to adaptively determine the relative importance of the policy improvement and policy constraint for every sample. We theoretically prove that the guidance provided by our method is rational and near-optimal. Extensive experiments on various environments suggest that GORL can be easily installed on most offline RL algorithms with statistically significant performance improvements.

Submitted 4 September, 2023; originally announced September 2023.

arXiv:2309.01430 [pdf, other]

DAT++: Spatially Dynamic Vision Transformer with Deformable Attention

Authors: Zhuofan Xia, Xuran Pan, Shiji Song, Li Erran Li, Gao Huang

Abstract: Transformers have shown superior performance on various vision tasks. Their large receptive field endows Transformer models with higher representation power than their CNN counterparts. Nevertheless, simply enlarging the receptive field also raises several concerns. On the one hand, using dense attention in ViT leads to excessive memory and computational cost, and features can be influenced by irrelevant parts that are beyond the regions of interest. On the other hand, the handcrafted attention adopted in PVT or Swin Transformer is data agnostic and may limit the ability to model long-range relations. To solve this dilemma, we propose a novel deformable multi-head attention module, where the positions of key and value pairs in self-attention are adaptively allocated in a data-dependent way. This flexible scheme enables the proposed deformable attention to dynamically focus on relevant regions while maintaining the representation power of global attention. On this basis, we present Deformable Attention Transformer (DAT), a general vision backbone efficient and effective for visual recognition. We further build an enhanced version DAT++. Extensive experiments show that our DAT++ achieves state-of-the-art results on various visual recognition benchmarks, with 85.9% ImageNet accuracy, 54.5 and 47.0 MS-COCO instance segmentation mAP, and 51.5 ADE20K semantic segmentation mIoU.

Submitted 4 September, 2023; originally announced September 2023.

Comments: 17 pages, 6 figures, 11 tables

arXiv:2309.00859 [pdf, other]

DeepScaler: Holistic Autoscaling for Microservices Based on Spatiotemporal GNN with Adaptive Graph Learning

Authors: Chunyang Meng, Shijie Song, Haogang Tong, Maolin Pan, Yang Yu

Abstract: Autoscaling functions provide the foundation for achieving elasticity in the modern cloud computing paradigm. It enables dynamic provisioning or de-provisioning resources for cloud software services and applications without human intervention to adapt to workload fluctuations. However, autoscaling microservices is challenging due to various factors. In particular, complex, time-varying service dependencies are difficult to quantify accurately and can lead to cascading effects when allocating resources. This paper presents DeepScaler, a deep learning-based holistic autoscaling approach for microservices that focuses on coping with service dependencies to optimize service-level agreements (SLA) assurance and cost efficiency. DeepScaler employs (i) an expectation-maximization-based learning method to adaptively generate affinity matrices revealing service dependencies and (ii) an attention-based graph convolutional network to extract spatio-temporal features of microservices by aggregating neighbors' information of graph-structural data. Thus DeepScaler can capture more potential service dependencies and accurately estimate the resource requirements of all services under dynamic workloads. It allows DeepScaler to reconfigure the resources of the interacting services simultaneously in one resource provisioning operation, avoiding the cascading effect caused by service dependencies. Experimental results demonstrate that our method implements a more effective autoscaling mechanism for microservices that not only allocates resources accurately but also adapts to dependency changes, significantly reducing SLA violations by an average of 41% at lower costs.

Submitted 2 September, 2023; originally announced September 2023.

Comments: To be published in the 38th IEEE/ACM International Conference on Automated Software Engineering (ASE 2023)

arXiv:2309.00810 [pdf, other]

RenAIssance: A Survey into AI Text-to-Image Generation in the Era of Large Model

Authors: Fengxiang Bie, Yibo Yang, Zhongzhu Zhou, Adam Ghanem, Minjia Zhang, Zhewei Yao, Xiaoxia Wu, Connor Holmes, Pareesa Golnari, David A. Clifton, Yuxiong He, Dacheng Tao, Shuaiwen Leon Song

Abstract: Text-to-image generation (TTI) refers to the usage of models that could process text input and generate high fidelity images based on text descriptions. Text-to-image generation using neural networks could be traced back to the emergence of Generative Adversarial Network (GAN), followed by the autoregressive Transformer. Diffusion models are one prominent type of generative model used for the generation of images through the systematic introduction of noises with repeating steps. As an effect of the impressive results of diffusion models on image synthesis, they have been cemented as the major image decoder used by text-to-image models and brought text-to-image generation to the forefront of machine-learning (ML) research. In the era of large models, scaling up model size and the integration with large language models have further improved the performance of TTI models, resulting in generation results nearly indistinguishable from real-world images, revolutionizing the way we retrieve images. Our explorative study has incentivised us to think that there are further ways of scaling text-to-image models with the combination of innovative model architectures and prediction enhancement techniques. We have divided the work of this survey into five main sections wherein we detail the frameworks of major literature in order to delve into the different types of text-to-image generation methods. Following this we provide a detailed comparison and critique of these methods and offer possible pathways of improvement for future work. In the future work, we argue that TTI development could yield impressive productivity improvements for creation, particularly in the context of the AIGC era, and could be extended to more complex tasks such as video generation and 3D generation.

Submitted 1 September, 2023; originally announced September 2023.

arXiv:2308.15949 [pdf, other]

Latency-aware Unified Dynamic Networks for Efficient Image Recognition

Authors: Yizeng Han, Zeyu Liu, Zhihang Yuan, Yifan Pu, Chaofei Wang, Shiji Song, Gao Huang

Abstract: Dynamic computation has emerged as a promising avenue to enhance the inference efficiency of deep networks. It allows selective activation of computational units, leading to a reduction in unnecessary computations for each input sample. However, the actual efficiency of these dynamic models can deviate from theoretical predictions. This mismatch arises from: 1) the lack of a unified approach due to fragmented research; 2) the focus on algorithm design over critical scheduling strategies, especially in CUDA-enabled GPU contexts; and 3) challenges in measuring practical latency, given that most libraries cater to static operations. Addressing these issues, we unveil the Latency-Aware Unified Dynamic Networks (LAUDNet), a framework that integrates three primary dynamic paradigms: spatially adaptive computation, dynamic layer skipping, and dynamic channel skipping. To bridge the theoretical and practical efficiency gap, LAUDNet merges algorithmic design with scheduling optimization, guided by a latency predictor that accurately gauges dynamic operator latency. We've tested LAUDNet across multiple vision tasks, demonstrating its capacity to notably reduce the latency of models like ResNet-101 by over 50% on platforms such as V100, RTX3090, and TX2 GPUs. Notably, LAUDNet stands out in balancing accuracy and efficiency. Code is available at: https://www.github.com/LeapLabTHU/LAUDNet.

Submitted 20 February, 2024; v1 submitted 30 August, 2023; originally announced August 2023.

arXiv:2308.15640 [pdf, other]

Identifying Constitutive Parameters for Complex Hyperelastic Materials using Physics-Informed Neural Networks

Authors: Siyuan Song, Hanxun **

Abstract: Identifying constitutive parameters in engineering and biological materials, particularly those with intricate geometries and mechanical behaviors, remains a longstanding challenge. The recent advent of Physics-Informed Neural Networks (PINNs) offers promising solutions, but current frameworks are often limited to basic constitutive laws and encounter practical constraints when combined with experimental data. In this paper, we introduce a robust PINN-based framework designed to identify material parameters for soft materials, specifically those exhibiting complex constitutive behaviors, under large deformation in plane stress conditions. Distinctively, our model emphasizes training PINNs with multi-modal synthetic experimental datasets consisting of full-field deformation and loading history, ensuring algorithm robustness even with noisy data. Our results reveal that the PINNs framework can accurately identify constitutive parameters of the incompressible Arruda-Boyce model for samples with intricate geometries, maintaining an error below 5%, even with an experimental noise level of 5%. We believe our framework provides a robust modulus identification approach for complex solids, especially for those with geometrical and constitutive complexity.

Submitted 23 June, 2024; v1 submitted 29 August, 2023; originally announced August 2023.

Comments: 28 pages, 5 figures, 1 table

arXiv:2308.14960 [pdf, other]

Read-only Prompt Optimization for Vision-Language Few-shot Learning

Authors: Dongjun Lee, Seokwon Song, Jihee Suh, Joonmyung Choi, Sanghyeok Lee, Hyunwoo J. Kim

Abstract: In recent years, prompt tuning has proven effective in adapting pre-trained vision-language models to downstream tasks. These methods aim to adapt the pre-trained models by introducing learnable prompts while keeping pre-trained weights frozen. However, learnable prompts can affect the internal representation within the self-attention module, which may negatively impact performance variance and generalization, especially in data-deficient settings. To address these issues, we propose a novel approach, Read-only Prompt Optimization (RPO). RPO leverages masked attention to prevent the internal representation shift in the pre-trained model. Further, to facilitate the optimization of RPO, the read-only prompts are initialized based on special tokens of the pre-trained model. Our extensive experiments demonstrate that RPO outperforms CLIP and CoCoOp in base-to-new generalization and domain generalization while displaying better robustness. Also, the proposed method achieves better generalization on extremely data-deficient settings, while improving parameter efficiency and computational overhead. Code is available at https://github.com/mlvlab/RPO.

Submitted 9 November, 2023; v1 submitted 28 August, 2023; originally announced August 2023.

Comments: Accepted at ICCV2023

arXiv:2308.14252 [pdf, other]

Key technologies and application for radar and smart video fusion in perimeter intrusion alarm system

Authors: Shujun Fu, Shenghai Liao, **g**g Gao, Shi**g Song, Zhonghua Man

Abstract: With the continuous development of modern science and technology, radar detection, video surveillance and perimeter alarm systems are more and more widely used in the field of social security. This paper introduces video surveillance and perimeter alarm in detail, mathematical modeling and key technologies, analyzes their fusion and application status, and puts forward suggestions combined with the development trend of intelligent security system in the future.

Submitted 27 August, 2023; originally announced August 2023.

Comments: submitted

arXiv:2308.13998 [pdf, other]

Computation-efficient Deep Learning for Computer Vision: A Survey

Authors: Yulin Wang, Yizeng Han, Chaofei Wang, Shiji Song, Qi Tian, Gao Huang

Abstract: Over the past decade, deep learning models have exhibited considerable advancements, reaching or even exceeding human-level performance in a range of visual perception tasks. This remarkable progress has sparked interest in applying deep networks to real-world applications, such as autonomous vehicles, mobile devices, robotics, and edge computing. However, the challenge remains that state-of-the-art models usually demand significant computational resources, leading to impractical power consumption, latency, or carbon emissions in real-world scenarios. This trade-off between effectiveness and efficiency has catalyzed the emergence of a new research focus: computationally efficient deep learning, which strives to achieve satisfactory performance while minimizing the computational cost during inference. This review offers an extensive analysis of this rapidly evolving field by examining four key areas: 1) the development of static or dynamic light-weighted backbone models for the efficient extraction of discriminative deep representations; 2) the specialized network architectures or algorithms tailored for specific computer vision tasks; 3) the techniques employed for compressing deep learning models; and 4) the strategies for deploying efficient deep networks on hardware platforms. Additionally, we provide a systematic discussion on the critical challenges faced in this domain, such as network architecture design, training schemes, practical efficiency, and more realistic model compression approaches, as well as potential future research directions.

Submitted 26 August, 2023; originally announced August 2023.

arXiv:2308.12940 [pdf, other]

doi 10.1007/JHEP02(2024)070

Measurement of the $Z$ boson production cross-section in $pp$ collisions at $\sqrt{s} = 5.02$ TeV

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, B. Adeva, M. Adinolfi, P. Adlarson, H. Afsharnia, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, A. Alfonso Albero, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey , et al. (1075 additional authors not shown)

Abstract: The first measurement of the $Z$ boson production cross-section at centre-of-mass energy $\sqrt{s} = 5.02\,$TeV in the forward region is reported, using $pp$ collision data collected by the LHCb experiment in year 2017, corresponding to an integrated luminosity of $100 \pm 2\,\rm{pb^{-1}}$. The production cross-section is measured for final-state muons in the pseudorapidity range $2.0<η<4.5$ with transverse momentum $p_{\rm{T}}> 20\,\rm{GeV/}\it{c}$. The integrated cross-section is determined to be \[ σ_{Z \rightarrow μ^{+}μ^{-}} = 39.6 \pm 0.7\,(\rm{stat}) \pm 0.6\,(\rm{syst}) \pm 0.8\,(\rm{lumi}) \ \rm{pb} \] for the di-muon invariant mass in the range $60<M_{μμ}<120\,\rm{GeV/}\it{c^{2}}$. This result and the differential cross-section results are in good agreement with theoretical predictions at next-to-next-to-leading order in the strong coupling. Based on a previous LHCb measurement of the $Z$ boson production cross-section in $p$Pb collisions at $\sqrt{s_{NN}}=5.02$ TeV, the nuclear modification factor $R_{p\rm{Pb}}$ is measured for the first time at this energy. The measured values are $1.2^{+0.5}_{-0.3}\,(\rm{stat}) \pm 0.1\,(\rm{syst})$ in the forward region ($1.53<y^*_μ<4.03$) and $3.6^{+1.6}_{-0.9}\,(\rm{stat}) \pm 0.2\,(\rm{syst})$ in the backward region ($-4.97<y^*_μ<-2.47$), where $y^*_μ$ represents the muon rapidity in the centre-of-mass frame.

Submitted 8 March, 2024; v1 submitted 24 August, 2023; originally announced August 2023.

Comments: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2023-010.html (LHCb public pages)

Report number: LHCb-PAPER-2023-010, CERN-EP-2023-141

Journal ref: JHEP02(2024)070
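
The integrated cross-section in the entry above is quoted with three separate uncertainties (statistical, systematic, luminosity). As a rough illustration only, the sketch below combines them in quadrature, which assumes the three sources are uncorrelated; the abstract itself does not quote a combined uncertainty, and the function name `combine_in_quadrature` is hypothetical.

```python
import math


def combine_in_quadrature(*uncertainties):
    """Combine uncertainties assumed to be uncorrelated: sqrt(sum of squares)."""
    return math.sqrt(sum(u * u for u in uncertainties))


# Values copied from the abstract, in pb.
central = 39.6
stat, syst, lumi = 0.7, 0.6, 0.8

# Assumption: stat, syst and lumi are independent of one another.
total = combine_in_quadrature(stat, syst, lumi)
relative = total / central

print(f"sigma = {central} +/- {total:.1f} pb ({relative:.1%} relative)")
```

Under that independence assumption the total comes out at roughly 1.2 pb, with the luminosity term as the largest single contribution.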

arXiv:2308.12139 [pdf]

Mesh Conflation of Oblique Photogrammetric Models using Virtual Cameras and Truncated Signed Distance Field

Authors: Shuang Song, Rongjun Qin

Abstract: Conflating/stitching 2.5D raster digital surface models (DSM) into a large one has been a running practice in geoscience applications, however, conflating full-3D mesh models, such as those from oblique photogrammetry, is extremely challenging. In this letter, we propose a novel approach to address this challenge by conflating multiple full-3D oblique photogrammetric models into a single, seamless mesh for high-resolution site modeling. Given two or more individually collected and created photogrammetric meshes, we first propose to create a virtual camera field (with a panoramic field of view) to incubate virtual spaces represented by Truncated Signed Distance Field (TSDF), an implicit volumetric field friendly for linear 3D fusion; then we adaptively leverage the truncated bound of meshes in TSDF to conflate them into a single and accurate full 3D site model. With drone-based 3D meshes, we show that our approach significantly improves upon traditional methods for model conflations, to drive new potentials to create excessively large and accurate full 3D mesh models in support of geoscience and environmental applications.

Submitted 23 August, 2023; originally announced August 2023.

Comments: 5 Figures
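
The entry above describes TSDF as "friendly for linear 3D fusion". A minimal sketch of what linear fusion usually means for TSDF grids is a per-voxel weighted average, shown below on flattened grids. This is the generic textbook scheme, not the authors' adaptive truncation-bound method; the function name `fuse_tsdf`, the toy values, and the weights are all illustrative assumptions.

```python
def fuse_tsdf(grids, weights):
    """Fuse flattened TSDF grids by a per-voxel weighted average.

    grids and weights are lists of equal-length lists, one per input mesh.
    Voxels with zero total weight (never observed) are returned as None.
    """
    fused = []
    for values, ws in zip(zip(*grids), zip(*weights)):
        total = sum(ws)
        if total > 0:
            fused.append(sum(v * w for v, w in zip(values, ws)) / total)
        else:
            fused.append(None)
    return fused


# Two toy 3-voxel grids (hypothetical signed distances, already truncated).
grid_a = [0.2, -0.4, 1.0]
grid_b = [0.6, 0.0, -1.0]
# Hypothetical per-voxel confidences; the last voxel is unobserved in both.
weights_a = [1.0, 1.0, 0.0]
weights_b = [1.0, 3.0, 0.0]

fused = fuse_tsdf([grid_a, grid_b], [weights_a, weights_b])
print(fused)
```

Because the average is linear in the inputs, grids from additional meshes can be folded in one at a time by keeping the running weighted sum and total weight per voxel.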

arXiv:2308.11980 [pdf, other]

Joint Prediction of Audio Event and Annoyance Rating in an Urban Soundscape by Hierarchical Graph Representation Learning

Authors: Yuanbo Hou, Siyang Song, Cheng Luo, Andrew Mitchell, Qiaoqiao Ren, Weicheng Xie, Jian Kang, Wenwu Wang, Dick Botteldooren

Abstract: Sound events in daily life carry rich information about the objective world. The composition of these sounds affects the mood of people in a soundscape. Most previous approaches only focus on classifying and detecting audio events and scenes, but may ignore their perceptual quality that may impact humans' listening mood for the environment, e.g. annoyance. To this end, this paper proposes a novel hierarchical graph representation learning (HGRL) approach which links objective audio events (AE) with subjective annoyance ratings (AR) of the soundscape perceived by humans. The hierarchical graph consists of fine-grained event (fAE) embeddings with single-class event semantics, coarse-grained event (cAE) embeddings with multi-class event semantics, and AR embeddings. Experiments show the proposed HGRL successfully integrates AE with AR for AEC and ARP tasks, while coordinating the relations between cAE and fAE and further aligning the two different grains of AE information with the AR.

Submitted 23 August, 2023; originally announced August 2023.

Comments: INTERSPEECH 2023, Code and models: https://github.com/Yuanbo2020/HGRL

arXiv:2308.08742 [pdf, other]

PMET: Precise Model Editing in a Transformer

Authors: Xiaopeng Li, Shasha Li, Shezheng Song, **g Yang, Jun Ma, Jie Yu

Abstract: Model editing techniques modify a minor proportion of knowledge in Large Language Models (LLMs) at a relatively low cost, which have demonstrated notable success. Existing methods assume Transformer Layer (TL) hidden states are values of key-value memories of the Feed-Forward Network (FFN). They usually optimize the TL hidden states to memorize target knowledge and use it to update the weights of the FFN in LLMs. However, the information flow of TL hidden states comes from three parts: Multi-Head Self-Attention (MHSA), FFN, and residual connections. Existing methods neglect the fact that the TL hidden states contain information not specifically required for FFN. Consequently, the performance of model editing decreases. To achieve more precise model editing, we analyze hidden states of MHSA and FFN, finding that MHSA encodes certain general knowledge extraction patterns. This implies that MHSA weights do not require updating when new knowledge is introduced. Based on the above findings, we introduce PMET, which simultaneously optimizes Transformer Component (TC, namely MHSA and FFN) hidden states, while only using the optimized TC hidden states of FFN to precisely update FFN weights. Our experiments demonstrate that PMET exhibits state-of-the-art performance on both the COUNTERFACT and zsRE datasets. Our ablation experiments substantiate the effectiveness of our enhancements, further reinforcing the finding that the MHSA encodes certain general knowledge extraction patterns and indicating its storage of a small amount of factual knowledge. Our code is available at https://github.com/xpq-tech/PMET.

Submitted 11 March, 2024; v1 submitted 16 August, 2023; originally announced August 2023.

Comments: AAAI24

arXiv:2308.08512 [pdf, other]

doi 10.1103/PhysRevLett.132.081802

Observation of Cabibbo-Suppressed Two-Body Hadronic Decays and Precision Mass Measurement of the $Ω_{c}^{0}$ Baryon

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, B. Adeva, M. Adinolfi, P. Adlarson, H. Afsharnia, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, A. Alfonso Albero, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey , et al. (1076 additional authors not shown)

Abstract: The first observation of the singly Cabibbo-suppressed $Ω_{c}^{0}\toΩ^{-}K^{+}$ and $Ω_{c}^{0}\toΞ^{-}π^{+}$ decays is reported, using proton-proton collision data at a center-of-mass energy of $13\,{\rm TeV}$, corresponding to an integrated luminosity of $5.4\,{\rm fb}^{-1}$, collected with the LHCb detector between 2016 and 2018. The branching fraction ratios are measured to be $\frac{\mathcal{B}(Ω_{c}^{0}\toΩ^{-}K^{+})}{\mathcal{B}(Ω_{c}^{0}\toΩ^{-}π^{+})}=[6.08\pm0.51({\rm stat})\pm0.40({\rm syst})]\%$, $\frac{\mathcal{B}(Ω_{c}^{0}\toΞ^{-}π^{+})}{\mathcal{B}(Ω_{c}^{0}\toΩ^{-}π^{+})}=[15.81\pm0.87({\rm stat})\pm0.44({\rm syst})\pm0.16({\rm ext})]\%$. In addition, using the $Ω_{c}^{0}\toΩ^{-}π^{+}$ decay channel, the $Ω_{c}^{0}$ baryon mass is measured to be $M(Ω_{c}^{0})=2695.28\pm0.07({\rm stat})\pm0.27({\rm syst})\pm0.30({\rm ext})\,{\rm MeV}$, improving the precision of the previous world average by a factor of 4.

Submitted 26 February, 2024; v1 submitted 16 August, 2023; originally announced August 2023.

Comments: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2023-011.html (LHCb public pages)

Report number: LHCb-PAPER-2023-011, CERN-EP-2023-155

Journal ref: Phys. Rev. Lett. 132, 081802 (2024)

arXiv:2308.08176 [pdf, other]

RSpell: Retrieval-augmented Framework for Domain Adaptive Chinese Spelling Check

Authors: Siqi Song, Qi Lv, Lei Geng, Ziqiang Cao, Guohong Fu

Abstract: Chinese Spelling Check (CSC) refers to the detection and correction of spelling errors in Chinese texts. In practical application scenarios, it is important to make CSC models have the ability to correct errors across different domains. In this paper, we propose a retrieval-augmented spelling check framework called RSpell, which searches corresponding domain terms and incorporates them into CSC models. Specifically, we employ pinyin fuzzy matching to search for terms, which are combined with the input and fed into the CSC model. Then, we introduce an adaptive process control mechanism to dynamically adjust the impact of external knowledge on the model. Additionally, we develop an iterative strategy for the RSpell framework to enhance reasoning capabilities. We conducted experiments on CSC datasets in three domains: law, medicine, and official document writing. The results demonstrate that RSpell achieves state-of-the-art performance in both zero-shot and fine-tuning scenarios, demonstrating the effectiveness of the retrieval-augmented CSC framework. Our code is available at https://github.com/47777777/Rspell.

Submitted 16 August, 2023; originally announced August 2023.

Journal ref: NLPCC 2023

arXiv:2308.05495 [pdf, other]

Topological soliton molecule in quasi 1D charge density wave

Authors: Taehwan Im, Sun Kyu Song, Jae Whan Park, Han Woong Yeom

Abstract: Soliton molecules, bound states of two solitons, can be important for informatics using solitons and for the quest for exotic particles in a wide range of physical systems, from unconventional superconductors to nuclear matter and the Higgs field, but have been observed only in the temporal dimension for classical wave optical systems. Here, we identify a topological soliton molecule formed spatially in an electronic system, a quasi 1D charge density wave of indium atomic wires. This system is composed of two coupled Peierls chains, which are endowed with a Z$_4$ topology and three distinct solitons: right-chiral, left-chiral, and non-chiral. Our scanning tunneling microscopy measurements identify a bound state of right- and left-chiral solitons with distinct in-gap states and net zero phase shift. Our density functional theory calculations reveal the attractive interaction of these solitons and the hybridization of their electronic states. This result initiates the study of the interaction between solitons in electronic systems, which can provide novel many-body electronic states and extra data-handling capacity beyond the given soliton topology.

Submitted 10 August, 2023; originally announced August 2023.

arXiv:2308.05267 [pdf, other]

doi 10.1103/PhysRevB.108.064503

Absence of superconductivity in electron-doped chromium pnictides ThCrAsN$_{1-x}$O$_x$

Authors: Zhi-Cheng Wang, Ye-Ting Shao, Yi-Qiang Lin, Shi-Jie Song, Bai-Zhuo Li, Er-Jian Cheng, Shi-Yan Li, Qin-Qing Zhu, Zhi Ren, Guang-Han Cao

Abstract: Theoretical studies predicted possible superconductivity in electron-doped chromium pnictides isostructural to their iron counterparts. Here, we report the synthesis and characterization of a new ZrCuSiAs-type Cr-based compound ThCrAsN, as well as its oxygen-doped variants. All samples of ThCrAsN$_{1-x}$O$_x$ show metallic conduction, but no superconductivity is observed above 30 mK even though the oxygen substitution reaches 75%. The magnetic structure of ThCrAsN is determined to be G-type antiferromagnetic by magnetization measurements and first-principles calculations jointly. The calculations also indicate that the in-plane Cr--Cr direct interaction of ThCrAsN is robust against the heavy electron doping. The calculated density of states of the orbital occupations of Cr for ThCrAs(N,O) is strongly spin-polarized. Our results suggest the similarities between chromium pnictides and iron-based superconductors shouldn't be overestimated.

Submitted 9 August, 2023; originally announced August 2023.

Journal ref: Phys. Rev. B 108, 064503 (2023)

Showing 201–250 of 1,119 results for author: Song, S