Search | arXiv e-print repository

Hopf algebras and alternating multiple zeta values in positive characteristic

Authors: Bo-Hae Im, Hojin Kim, Khac Nhuan Le, Tuan Ngo Dac, Lan Huong Pham

Abstract: In \cite{IKLNDP23} we presented a systematic study of algebra structures of multiple zeta values in positive characteristic introduced by Thakur as analogues of classical multiple zeta values of Euler. In this paper we construct algebra and Hopf algebra structures of alternating multiple zeta values introduced by Harada, extending our previous work. Our results could be considered as an analogue of those of Hoffman \cite{Hof00} and Racinet \cite{Rac02} in the classical setting. The proof is based on two new ingredients: the first one is a direct and explicit construction of the shuffle Hopf algebra structure, and the second one is the notion of horizontal maps.

Submitted 5 April, 2023; originally announced April 2023.

Comments: 37 pages. arXiv admin note: text overlap with arXiv:2301.05906

MSC Class: 11M32

arXiv:2304.01981 [pdf, other]

doi 10.1140/epjc/s10052-023-11759-6

Search for $D^{*}(2007)^0\toμ^+μ^-$ in $B^-\toπ^-μ^+μ^-$ decays

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, B. Adeva, M. Adinolfi, P. Adlarson, H. Afsharnia, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, A. Alfonso Albero, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, et al. (1040 additional authors not shown)

Abstract: The very rare $D^{*}(2007)^0\toμ^+μ^-$ decay is searched for by analysing $B^-\toπ^-μ^+μ^-$ decays. The analysis uses a sample of beauty mesons produced in proton-proton collisions collected with the LHCb detector between 2011 and 2018, corresponding to an integrated luminosity of 9 fb$^{-1}$. The signal signature corresponds to simultaneous peaks in the $μ^+μ^-$ and $π^-μ^+μ^-$ invariant masses. No evidence for an excess of events over background is observed and an upper limit is set on the branching fraction of the decay at ${\cal B}(D^{*}(2007)^0\toμ^+μ^-) < 2.6\times 10^{-8}$ at $90\%$ confidence level. This is the first limit on the branching fraction of $D^{*}(2007)^0\toμ^+μ^-$ decays and the most stringent limit on $D^{*}(2007)^0$ decays to leptonic final states. The analysis is the first search for a rare charm-meson decay exploiting production via beauty decays.

Submitted 15 August, 2023; v1 submitted 4 April, 2023; originally announced April 2023.

Comments: All figures and tables, along with machine-readable versions and any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2023-004.html (LHCb public pages)

Report number: LHCb-PAPER-2023-004, CERN-EP-2023-050

Journal ref: Eur. Phys. J. C 83, 666 (2023)

arXiv:2304.01220 [pdf, other]

Evaluating the impact of an explainable machine learning system on the interobserver agreement in chest radiograph interpretation

Authors: Hieu H. Pham, Ha Q. Nguyen, Hieu T. Nguyen, Linh T. Le, Khanh Lam

Abstract: We conducted a prospective study to measure the clinical impact of an explainable machine learning system on interobserver agreement in chest radiograph interpretation. The AI system, which we call VinDr-CXR, significantly improved the agreement between six radiologists when used as a diagnosis-supporting tool, with an increase of 1.5% in mean Fleiss' Kappa. In addition, we observed that, after the radiologists consulted the AI's suggestions, the agreement between each radiologist and the system increased markedly, by 3.3% in mean Cohen's Kappa. This work has been accepted for publication in IEEE Access and this paper is our short version submitted to the Midwest Machine Learning Symposium (MMLS 2023), Chicago, IL, USA.

Submitted 1 April, 2023; originally announced April 2023.

Comments: This work has been accepted for publication in IEEE Access. This is a short version submitted to the Midwest Machine Learning Symposium (MMLS 2023), Chicago, IL, USA

arXiv:2304.00557 [pdf, other]

Semi-supervised Neural Machine Translation with Consistency Regularization for Low-Resource Languages

Authors: Viet H. Pham, Thang M. Pham, Giang Nguyen, Long Nguyen, Dien Dinh

Abstract: The advent of deep learning has led to a significant gain in machine translation. However, most studies require a large parallel dataset, which is scarce and expensive to construct and even unavailable for some languages. This paper presents a simple yet effective method to tackle this problem for low-resource languages by augmenting high-quality sentence pairs and training NMT models in a semi-supervised manner. Specifically, our approach combines the cross-entropy loss for supervised learning with KL divergence for unsupervised learning, given pseudo and augmented target sentences derived from the model. We also introduce a SentenceBERT-based filter to enhance the quality of the augmented data by retaining semantically similar sentence pairs. Experimental results show that our approach significantly improves NMT baselines, especially on low-resource datasets with 0.46--2.03 BLEU scores. We also demonstrate that using unsupervised training for augmented data is more efficient than reusing the ground-truth target sentences for supervised learning.

Submitted 2 April, 2023; originally announced April 2023.

Comments: TMP and GN contributed equally

arXiv:2303.16507 [pdf, other]

Improving Object Detection in Medical Image Analysis through Multiple Expert Annotators: An Empirical Investigation

Authors: Hieu H. Pham, Khiem H. Le, Tuan V. Tran, Ha Q. Nguyen

Abstract: The work discusses the use of machine learning algorithms for anomaly detection in medical image analysis and how the performance of these algorithms depends on the number of annotators and the quality of labels. To address the issue of subjectivity in labeling with a single annotator, we introduce a simple and effective approach that aggregates annotations from multiple annotators with varying levels of expertise. We then aim to improve the efficiency of predictive models in abnormality detection tasks by estimating hidden labels from multiple annotations and using a re-weighted loss function to improve detection performance. Our method is evaluated on a real-world medical imaging dataset and outperforms relevant baselines that do not consider disagreements among annotators.

Submitted 29 March, 2023; originally announced March 2023.

Comments: This is a short version submitted to the Midwest Machine Learning Symposium (MMLS 2023), Chicago, IL, USA

arXiv:2303.15614 [pdf, other]

Modeling Population Movements under Uncertainty at the Border in Humanitarian Crises: A Situational Analysis Tool

Authors: Arturo de Nieves Gutierrez de Rubalcava, Oscar Sanchez Piñeiro, Rebeca Moreno Jiménez, Joseph Aylett-Bullock, Azra Ismail, Sofia Kyriazi, Catherine Schneider, Fred Sekidde, Giulia del Panta, Chao Huang, Vanessa Maigné, Miguel Luengo-Oroz, Katherine Hoffmann Pham

Abstract: Humanitarian agencies must be prepared to mobilize quickly in response to complex emergencies, and their effectiveness depends on their ability to identify, anticipate, and prepare for future needs. These are typically highly uncertain situations in which predictive modeling tools can be useful but challenging to build. To better understand the need for humanitarian support -- including shelter and assistance -- and strengthen contingency planning and protection efforts for displaced populations, we present a situational analysis tool to help anticipate the number of migrants and forcibly displaced persons that will cross a border in a humanitarian crisis. The tool consists of: (i) indicators of potential intent to move drawn from traditional and big data sources; (ii) predictive models for forecasting possible future movements; and (iii) a simulation of border crossings and shelter capacity requirements under different conditions. This tool has been specifically adapted to contingency planning in settings of high uncertainty, with an application to the Brazil-Venezuela border during the COVID-19 pandemic.

Submitted 27 March, 2023; originally announced March 2023.

Comments: 9 pages, 5 figures

Journal ref: Proceedings of the 3rd KDD Workshop on Data-driven Humanitarian Mapping, 2022, Washington, DC USA

arXiv:2303.11429 [pdf, other]

Machine learning-based detection of cardiovascular disease using ECG signals: performance vs. complexity

Authors: Huy Pham, Konstantin Egorov, Alexey Kazakov, Semen Budennyy

Abstract: Cardiovascular disease remains a significant problem in modern society. Among non-invasive techniques, the electrocardiogram (ECG) is one of the most reliable methods for detecting abnormalities in cardiac activity. However, ECG interpretation requires expert knowledge and is time-consuming. Developing a novel method to detect the disease early could prevent death and complications. The paper presents various novel approaches for classifying cardiac diseases from ECG recordings. The first approach uses the Poincare representation of the ECG signal and deep-learning-based image classifiers (ResNet50 and DenseNet121 were trained on Poincare diagrams), which showed decent performance in predicting AF (atrial fibrillation) but not other types of arrhythmia. XGBoost, a gradient-boosting model, showed acceptable performance on long-term data but had a long inference time due to costly computation in the pre-processing phase. Finally, the 1D convolutional model, specifically the 1D ResNet, showed the best results on both the CinC 2017 and CinC 2020 datasets, reaching F1 scores of 85% and 71%, respectively, which was superior to the first-ranking solution of each challenge. The paper also investigated efficiency metrics such as power consumption and equivalent CO2 emissions, with one-dimensional models like 1D CNN and 1D ResNet being the most energy efficient. Model interpretation analysis showed that the DenseNet detected AF using heart rate variability while the 1D ResNet assessed the AF pattern in raw ECG signals.

Submitted 10 March, 2023; originally announced March 2023.

Comments: 12 pages, 6 figures, 6 tables

arXiv:2303.10801 [pdf, other]

doi 10.1103/PhysRevApplied.21.054067

Efficient site-resolved imaging and spin-state detection in dynamic two-dimensional ion crystals

Authors: Robert N. Wolf, Joseph H. Pham, Julian Y. Z. Jee, Alexander Rischka, Michael J. Biercuk

Abstract: Resolving the locations and discriminating the spin states of individual trapped ions with high fidelity is critical for a large class of applications in quantum computing, simulation, and sensing. We report on a method for high-fidelity state discrimination in large two-dimensional (2D) crystals with over 100 trapped ions in a single trapping region, combining a hardware detector and an artificial neural network. A high-data-rate, spatially resolving, single-photon sensitive timestamping detector performs efficient single-shot detection of 2D crystals in a Penning trap, exhibiting rotation at about $25\,\mathrm{kHz}$. We then train an artificial neural network to process the fluorescence photon data in the rest frame of the rotating crystal in order to identify ion locations with a success rate of $\sim 90\%$, accounting for substantial illumination inhomogeneity across the crystal. Finally, employing a time-binned state detection method, we arrive at an average spin-state detection fidelity of $94(2)\%$. This technique can be used to analyze spatial and temporal correlations in arrays of hundreds of trapped-ion qubits.

Submitted 1 June, 2024; v1 submitted 19 March, 2023; originally announced March 2023.

Journal ref: Phys. Rev. Applied 21, 054067 (2024)

arXiv:2303.09782 [pdf, other]

High Accurate and Explainable Multi-Pill Detection Framework with Graph Neural Network-Assisted Multimodal Data Fusion

Authors: Anh Duy Nguyen, Huy Hieu Pham, Huynh Thanh Trung, Quoc Viet Hung Nguyen, Thao Nguyen Truong, Phi Le Nguyen

Abstract: Due to the significant resemblance in visual appearance, pill misuse is prevalent and has become a critical issue, responsible for one-third of all deaths worldwide. Pill identification is thus a crucial concern that needs to be investigated thoroughly. Recently, several attempts have been made to exploit deep learning to tackle the pill identification problem. However, most published works consider only single-pill identification and fail to distinguish hard samples with identical appearances. Also, most existing pill image datasets only feature single pill images captured in carefully controlled environments under ideal lighting conditions and clean backgrounds. In this work, we are the first to tackle the multi-pill detection problem in real-world settings, aiming at localizing and identifying pills captured by users in a pill intake. Moreover, we also introduce a multi-pill image dataset taken in unconstrained conditions. To handle hard samples, we propose a novel method for constructing heterogeneous a priori graphs incorporating three forms of inter-pill relationships, including co-occurrence likelihood, relative size, and visual semantic correlation. We then offer a framework for integrating a priori with pills' visual features to enhance detection accuracy. Our experimental results have proved the robustness, reliability, and explainability of the proposed framework. Experimentally, it outperforms all detection benchmarks in terms of all evaluation metrics. Specifically, our proposed framework improves COCO mAP metrics by 9.4% over Faster R-CNN and 12.0% compared to vanilla YOLOv5. Our study opens up new opportunities for protecting patients from medication errors using an AI-based pill identification solution.

Submitted 17 March, 2023; originally announced March 2023.

Comments: Under review by PLOS ONE journal

arXiv:2303.09443 [pdf, other]

doi 10.1007/JHEP08(2023)174

Observation of the $B^+ \rightarrow J/ψη^{\prime} K^+$ decay

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, B. Adeva, M. Adinolfi, P. Adlarson, H. Afsharnia, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, A. Alfonso Albero, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, et al. (1041 additional authors not shown)

Abstract: The $B^+ \rightarrow J/ψη^{\prime} K^+$ decay is observed for the first time using proton-proton collision data collected by the LHCb experiment at centre-of-mass energies of 7, 8, and 13 TeV, corresponding to a total integrated luminosity of 9 fb$^{-1}$. The branching fraction of this decay is measured relative to the known branching fraction of the $B^+ \rightarrow ψ(2S) K^+$ decays and found to be $$ \frac{\mathcal{B}( B^+ \rightarrow J/ψη^{\prime}K^+)}{\mathcal{B}( B^+ \rightarrow ψ(2S)K^+)} = \left(4.91\pm 0.47\pm0.29\pm0.07\right)\times10^{-2}, $$ where the first uncertainty is statistical, the second is systematic and the third is related to external branching fractions. A first look at the $J/ψη^{\prime}$ mass distribution is performed and no signal of intermediate resonances is observed.

Submitted 13 December, 2023; v1 submitted 16 March, 2023; originally announced March 2023.

Comments: 17 pages, 3 figures. All figures and tables, along with machine-readable versions and any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2022-054.html (LHCb public pages)

Report number: LHCb-PAPER-2022-054, CERN-EP-2023-022

Journal ref: JHEP 08 (2023) 174

arXiv:2303.09115 [pdf, other]

Learning for Amalgamation: A Multi-Source Transfer Learning Framework For Sentiment Classification

Authors: Cuong V. Nguyen, Khiem H. Le, Anh M. Tran, Quang H. Pham, Binh T. Nguyen

Abstract: Transfer learning plays an essential role in deep learning, where it can remarkably improve performance on a target domain whose training data is insufficient. Our work explores beyond the common practice of transfer learning with a single pre-trained model. We focus on the task of Vietnamese sentiment classification and propose LIFA, a framework to learn a unified embedding from several pre-trained models. We further propose two more LIFA variants that encourage the pre-trained models to either cooperate or compete with one another. Studying these variants sheds light on the success of LIFA by showing that sharing knowledge among the models is more beneficial for transfer learning. Moreover, we construct the AISIA-VN-Review-F dataset, the first large-scale Vietnamese sentiment classification database. We conduct extensive experiments on the AISIA-VN-Review-F and existing benchmarks to demonstrate the efficacy of LIFA compared to other techniques. To contribute to Vietnamese NLP research, we will publish our source code and datasets to the research community upon acceptance.

Submitted 16 March, 2023; originally announced March 2023.

Comments: Information Sciences

arXiv:2303.07884 [pdf, other]

Distributed least square solution method to linear algebraic equations over multiagent networks

Authors: Viet Hoang Pham, Hyo-Sung Ahn

Abstract: This paper designs a distributed least square solution method for a linear algebraic equation over a multiagent network. The coefficient matrix is divided into multiple blocks, and each agent only knows a subset of these blocks. The designed method is discrete-time and based on a proximal ADMM algorithm. By applying the designed method, each agent can find its corresponding part in one least square solution of the considered linear algebraic equation while using only its information and communicating with its neighbors. Numerical simulations verify the effectiveness of the designed method in MATLAB.

Submitted 28 December, 2023; v1 submitted 14 March, 2023; originally announced March 2023.

Comments: I need to correct some unclear points in the paper

arXiv:2303.06993 [pdf, other]

Actor-Critic learning for mean-field control in continuous time

Authors: Noufel Frikha, Maximilien Germain, Mathieu Laurière, Huyên Pham, Xuanye Song

Abstract: We study policy gradient for mean-field control in continuous time in a reinforcement learning setting. By considering randomised policies with entropy regularisation, we derive a gradient expectation representation of the value function, which is amenable to actor-critic type algorithms, where the value functions and the policies are learnt alternately based on observation samples of the state and model-free estimation of the population state distribution, either by offline or online learning. In the linear-quadratic mean-field framework, we obtain an exact parametrisation of the actor and critic functions defined on the Wasserstein space. Finally, we illustrate the results of our algorithms with some numerical experiments on concrete examples.

Submitted 13 March, 2023; originally announced March 2023.

arXiv:2303.06744 [pdf, other]

Ensemble Learning of Myocardial Displacements for Myocardial Infarction Detection in Echocardiography

Authors: Nguyen Tuan, Phi Nguyen, Dai Tran, Hung Pham, Quang Nguyen, Thanh Le, Hanh Van, Bach Do, Phuong Tran, Vinh Le, Thuy Nguyen, Long Tran, Hieu Pham

Abstract: Early detection and localization of myocardial infarction (MI) can reduce the severity of cardiac damage through timely treatment interventions. In recent years, deep learning techniques have shown promise for detecting MI in echocardiographic images. However, there has been no examination of how segmentation accuracy affects MI classification performance and the potential benefits of using ensemble learning approaches. Our study investigates this relationship and introduces a robust method that combines features from multiple segmentation models to improve MI classification performance by leveraging ensemble learning. Our method combines myocardial segment displacement features from multiple segmentation models, which are then input into a typical classifier to estimate the risk of MI. We validated the proposed approach on two datasets: the public HMC-QU dataset (109 echocardiograms) for training and validation, and an E-Hospital dataset (60 echocardiograms) from a local clinical site in Vietnam for independent testing. Model performance was evaluated based on accuracy, sensitivity, and specificity. The proposed approach demonstrated excellent performance in detecting MI. The results showed that the proposed approach outperformed the state-of-the-art feature-based method. Further research is necessary to determine its potential use in clinical settings as a tool to assist cardiologists and technicians with objective assessments and reduce dependence on operator subjectivity. Our research codes are available on GitHub at https://github.com/vinuni-vishc/mi-detection-echo.

Submitted 12 March, 2023; originally announced March 2023.

arXiv:2303.02213 [pdf, other]

Backdoor Attacks and Defenses in Federated Learning: Survey, Challenges and Future Research Directions

Authors: Thuy Dung Nguyen, Tuan Nguyen, Phi Le Nguyen, Hieu H. Pham, Khoa Doan, Kok-Seng Wong

Abstract: Federated learning (FL) is a machine learning (ML) approach that allows the use of distributed data without compromising personal privacy. However, the heterogeneous distribution of data among clients in FL can make it difficult for the orchestration server to validate the integrity of local model updates, making FL vulnerable to various threats, including backdoor attacks. Backdoor attacks involve the insertion of malicious functionality into a targeted model through poisoned updates from malicious clients. These attacks can cause the global model to misbehave on specific inputs while appearing normal in other cases. Backdoor attacks have received significant attention in the literature due to their potential to impact real-world deep learning applications. However, they have not been thoroughly studied in the context of FL. In this survey, we provide a comprehensive review of current backdoor attack strategies and defenses in FL, including a detailed analysis of different approaches. We also discuss the challenges and potential future directions for attacks and defenses in the context of FL.

Submitted 3 March, 2023; originally announced March 2023.

arXiv:2302.12020 [pdf, other]

Personalized Privacy-Preserving Framework for Cross-Silo Federated Learning

Authors: Van-Tuan Tran, Huy-Hieu Pham, Kok-Seng Wong

Abstract: Federated learning (FL) has recently surged as a promising decentralized deep learning (DL) framework that enables DL-based approaches to be trained collaboratively across clients without sharing private data. However, when the central party is active and dishonest, the data of individual clients might be perfectly reconstructed, leading to a high possibility of sensitive information being leaked. Moreover, FL also suffers from non-independent and identically distributed (non-IID) data among clients, resulting in degraded inference performance on local clients' data. In this paper, we propose a novel framework, namely Personalized Privacy-Preserving Federated Learning (PPPFL), with a concentration on cross-silo FL to overcome these challenges. Specifically, we introduce a stabilized variant of the Model-Agnostic Meta-Learning (MAML) algorithm to collaboratively train a global initialization from clients' synthetic data generated by Differentially Private Generative Adversarial Networks (DP-GANs). After reaching convergence, the global initialization will be locally adapted by the clients to their private data. Through extensive experiments, we empirically show that our proposed framework outperforms multiple FL baselines on different datasets, including MNIST, Fashion-MNIST, CIFAR-10, and CIFAR-100.

Submitted 22 February, 2023; originally announced February 2023.

arXiv:2302.10629 [pdf, other]

doi 10.1007/JHEP07(2023)084

Observation of the $B^0_s\rightarrow χ_{c1}(3872)π^+π^-$ decay

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, B. Adeva, M. Adinolfi, P. Adlarson, H. Afsharnia, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, A. Alfonso Albero, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, et al. (1037 additional authors not shown)

Abstract: The first observation of the $B^0_s \rightarrow \left( χ_{c1}(3872) \rightarrow J/ψπ^+π^-\right) π^+ π^-$ decay is reported using proton-proton collision data, corresponding to integrated luminosities of 1, 2 and 6 fb$^{-1}$, collected by the LHCb experiment at centre-of-mass energies of 7, 8 and 13 TeV, respectively. The ratio of branching fractions relative to the $B^0_s \rightarrow \left( ψ(2S) \rightarrow J/ψπ^+π^- \right) π^+ π^-$ decay is measured to be $$ \frac{ \mathcal{B} \left( B^0_s \rightarrow χ_{c1}(3872) π^+π^-\right) \times \mathcal{B} \left( χ_{c1}(3872) \rightarrow J/ψπ^+π^-\right)} { \mathcal{B} \left( B^0_s \rightarrow ψ(2S) π^+ π^- \right) \times \mathcal{B} \left( ψ(2S) \rightarrow J/ψπ^+π^-\right) } = \left( 6.8 \pm 1.1 \pm 0.2 \right) \times 10^{-2} , $$ where the first uncertainty is statistical and the second systematic. The mass spectrum of the $π^+π^-$ system recoiling against the $χ_{c1}(3872)$ meson exhibits a large contribution from $B^0_s \rightarrow χ_{c1}(3872) \left( f_0(980) \rightarrow π^+ π^-\right)$ decays.

Submitted 13 December, 2023; v1 submitted 21 February, 2023; originally announced February 2023.

Comments: 16 pages, 2 figures. All figures and tables, along with machine-readable versions and any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2022-049.html (LHCb public pages)

Report number: CERN-EP-2023-015, LHCb-PAPER-2022-049

Journal ref: JHEP 07 (2023) 084

arXiv:2302.10413 [pdf, ps, other]

CADIS: Handling Cluster-skewed Non-IID Data in Federated Learning with Clustered Aggregation and Knowledge DIStilled Regularization

Authors: Nang Hung Nguyen, Duc Long Nguyen, Trong Bang Nguyen, Thanh-Hung Nguyen, Huy Hieu Pham, Truong Thao Nguyen, Phi Le Nguyen

Abstract: Federated learning enables edge devices to train a global model collaboratively without exposing their data. Despite achieving outstanding advantages in computing efficiency and privacy protection, federated learning faces a significant challenge when dealing with non-IID data, i.e., data generated by clients that are typically not independent and identically distributed. In this paper, we tackle a new type of Non-IID data, called cluster-skewed non-IID, discovered in actual data sets. The cluster-skewed non-IID is a phenomenon in which clients can be grouped into clusters with similar data distributions. By performing an in-depth analysis of the behavior of a classification model's penultimate layer, we introduce a metric that quantifies the similarity between two clients' data distributions without violating their privacy. We then propose an aggregation scheme that guarantees equality between clusters. In addition, we offer a novel local training regularization based on the knowledge-distillation technique that reduces the overfitting problem at clients and dramatically boosts the training scheme's performance. We theoretically prove the superiority of the proposed aggregation over the benchmark FedAvg. Extensive experimental results on both standard public datasets and our in-house real-world dataset demonstrate that the proposed approach improves accuracy by up to 16% compared to the FedAvg algorithm.

Submitted 15 April, 2023; v1 submitted 20 February, 2023; originally announced February 2023.

Comments: Accepted for presentation at the 23rd International Symposium on Cluster, Cloud and Internet Computing (CCGrid 2023)

arXiv:2302.08262 [pdf, other]

doi 10.1103/PhysRevLett.131.151801

Measurement of the $Λ_{b}^{0}\to Λ(1520) μ^{+}μ^{-}$ differential branching fraction

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, B. Adeva, M. Adinolfi, P. Adlarson, H. Afsharnia, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, A. Alfonso Albero, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey , et al. (1038 additional authors not shown)

Abstract: The branching fraction of the rare decay $Λ_{b}^{0}\to Λ(1520) μ^{+}μ^{-}$ is measured for the first time, in the squared dimuon mass intervals, $q^2$, excluding the $J/ψ$ and $ψ(2S)$ regions. The data sample analyzed was collected by the LHCb experiment at center-of-mass energies of 7, 8, and 13 TeV, corresponding to a total integrated luminosity of $9\ \mathrm{fb}^{-1}$. The result in the highest $q^{2}$ interval, $q^{2} >15.0\ \mathrm{GeV}^2/c^4$, where theoretical predictions have the smallest model dependence, agrees with the predictions.

Submitted 24 October, 2023; v1 submitted 16 February, 2023; originally announced February 2023.

Comments: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2022-050.html (LHCb public pages)

Report number: LHCb-PAPER-2022-050, CERN-EP-2023-007

Journal ref: Phys. Rev. Lett. 131 (2023), 151801

arXiv:2302.07320 [pdf, other]

Policy gradient learning methods for stochastic control with exit time and applications to share repurchase pricing

Authors: Mohamed Hamdouche, Pierre Henry-Labordere, Huyen Pham

Abstract: We develop policy gradient methods for stochastic control with exit time in a model-free setting. We propose two types of algorithms for learning either directly the optimal policy or by learning alternately the value function (critic) and the optimal control (actor). The use of randomized policies is crucial for overcoming notably the issue related to the exit time in the gradient computation. We demonstrate the effectiveness of our approach by implementing our numerical schemes in the application to the problem of share repurchase pricing. Our results show that the proposed policy gradient methods outperform PDE or other neural network techniques in a model-based setting. Furthermore, our algorithms are flexible enough to incorporate realistic market conditions such as price impact or transaction costs.

Submitted 14 February, 2023; originally announced February 2023.

Comments: 19 pages, 6 figures

arXiv:2302.06847 [pdf, other]

doi 10.1088/1361-6501/acde9c

An in-situ thermoelectric measurement apparatus inside a thermal-evaporator

Authors: Kien Trung Nguyen, Giang Bui-Thanh, Hong Thi Pham, Thuat Nguyen-Tran, Chi Hieu Hoang, Hung Q. Nguyen

Abstract: At the ultra-thin limit below 20 nm, a film's electrical conductivity, thermal conductivity, or thermoelectricity depends heavily on its thickness. In most studies, each sample is fabricated one at a time, potentially leading to considerable uncertainty in later characterizations. We design and build an in-situ apparatus to measure the thermoelectricity of films during their deposition inside a thermal evaporator. A temperature difference of up to 2 K is generated by a current passing through an on-chip resistor patterned using photolithography. The Seebeck voltage is measured on a Hall bar structure of a film deposited through a shadow mask. The measurement system is calibrated carefully before loading into the thermal evaporator. This in-situ thermoelectricity measurement system has been thoroughly tested on various materials, including Bi, Te, and Bi$_2$Te$_3$, at high temperatures up to 500 K.

Submitted 20 June, 2023; v1 submitted 14 February, 2023; originally announced February 2023.

Journal ref: Measurement Science and Technology, 2023

arXiv:2302.06675 [pdf, other]

Symbolic Discovery of Optimization Algorithms

Authors: Xiangning Chen, Chen Liang, Da Huang, Esteban Real, Kaiyuan Wang, Yao Liu, Hieu Pham, Xuanyi Dong, Thang Luong, Cho-Jui Hsieh, Yifeng Lu, Quoc V. Le

Abstract: We present a method to formulate algorithm discovery as program search, and apply it to discover optimization algorithms for deep neural network training. We leverage efficient search techniques to explore an infinite and sparse program space. To bridge the large generalization gap between proxy and target tasks, we also introduce program selection and simplification strategies. Our method discovers a simple and effective optimization algorithm, $\textbf{Lion}$ ($\textit{Evo$\textbf{L}$ved S$\textbf{i}$gn M$\textbf{o}$me$\textbf{n}$tum}$). It is more memory-efficient than Adam as it only keeps track of the momentum. Different from adaptive optimizers, its update has the same magnitude for each parameter calculated through the sign operation. We compare Lion with widely used optimizers, such as Adam and Adafactor, for training a variety of models on different tasks. On image classification, Lion boosts the accuracy of ViT by up to 2% on ImageNet and saves up to 5x the pre-training compute on JFT. On vision-language contrastive learning, we achieve 88.3% $\textit{zero-shot}$ and 91.1% $\textit{fine-tuning}$ accuracy on ImageNet, surpassing the previous best results by 2% and 0.1%, respectively. On diffusion models, Lion outperforms Adam by achieving a better FID score and reducing the training compute by up to 2.3x. For autoregressive, masked language modeling, and fine-tuning, Lion exhibits a similar or better performance compared to Adam. Our analysis of Lion reveals that its performance gain grows with the training batch size. It also requires a smaller learning rate than Adam due to the larger norm of the update produced by the sign function. Additionally, we examine the limitations of Lion and identify scenarios where its improvements are small or not statistically significant. Lion is also successfully deployed in production systems such as the Google search ads CTR model.

Submitted 8 May, 2023; v1 submitted 13 February, 2023; originally announced February 2023.

Comments: 30 pages, Lion is successfully deployed in production systems. We also add comparison with other automatically discovered optimizers

arXiv:2302.03676 [pdf, other]

doi 10.3847/1538-4365/acdc9f

Open data from the third observing run of LIGO, Virgo, KAGRA and GEO

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, R. Abbott, H. Abe, F. Acernese, K. Ackley, S. Adhicary, N. Adhikari, R. X. Adhikari, V. K. Adkins, V. B. Adya, C. Affeldt, D. Agarwal, M. Agathos, O. D. Aguiar, L. Aiello, A. Ain, P. Ajith, T. Akutsu, S. Albanesi, R. A. Alfaidi, A. Al-Jodah, C. Alléné, A. Allocca , et al. (1719 additional authors not shown)

Abstract: The global network of gravitational-wave observatories now includes five detectors, namely LIGO Hanford, LIGO Livingston, Virgo, KAGRA, and GEO 600. These detectors collected data during their third observing run, O3, composed of three phases: O3a starting in April of 2019 and lasting six months, O3b starting in November of 2019 and lasting five months, and O3GK starting in April of 2020 and lasting 2 weeks. In this paper we describe these data and various other science products that can be freely accessed through the Gravitational Wave Open Science Center at https://gwosc.org. The main dataset, consisting of the gravitational-wave strain time series that contains the astrophysical signals, is released together with supporting data useful for their analysis and documentation, tutorials, as well as analysis software packages.

Submitted 7 February, 2023; originally announced February 2023.

Comments: 27 pages, 3 figures

Report number: LIGO-P2200316

arXiv:2301.10439 [pdf, other]

ViDeBERTa: A powerful pre-trained language model for Vietnamese

Authors: Cong Dao Tran, Nhut Huy Pham, Anh Nguyen, Truong Son Hy, Tu Vu

Abstract: This paper presents ViDeBERTa, a new pre-trained monolingual language model for Vietnamese, with three versions - ViDeBERTa_xsmall, ViDeBERTa_base, and ViDeBERTa_large, which are pre-trained on a large-scale corpus of high-quality and diverse Vietnamese texts using DeBERTa architecture. Although many successful pre-trained language models based on Transformer have been widely proposed for the English language, there are still few pre-trained models for Vietnamese, a low-resource language, that achieve good results on downstream tasks, especially Question answering. We fine-tune and evaluate our model on three important natural language downstream tasks, Part-of-speech tagging, Named-entity recognition, and Question answering. The empirical results demonstrate that ViDeBERTa with far fewer parameters surpasses the previous state-of-the-art models on multiple Vietnamese-specific natural language understanding tasks. Notably, ViDeBERTa_base with 86M parameters, which is only about 23% of PhoBERT_large with 370M parameters, still achieves the same or better results than the previous state-of-the-art model. Our ViDeBERTa models are available at: https://github.com/HySonLab/ViDeBERTa.

Submitted 10 February, 2023; v1 submitted 25 January, 2023; originally announced January 2023.

arXiv:2301.05906 [pdf, ps, other]

Hopf algebras and multiple zeta values in positive characteristic

Authors: Bo-Hae Im, Ho** Kim, Khac Nhuan Le, Tuan Ngo Dac, Lan Huong Pham

Abstract: Multiple zeta values (MZV's for short) in positive characteristic were introduced by Thakur as analogues of classical multiple zeta values of Euler. In this paper we give a systematic study of algebraic structures of MZV's in positive characteristic. We construct both the stuffle algebra and the shuffle algebra of these MZV's and equip them with algebra and Hopf algebra structures. In particular, we completely solve a problem suggested by Deligne and Thakur \cite{Del17} in 2017 and establish Shi's conjectures \cite{Shi18}. The construction of the stuffle algebra is based on our recent work \cite{IKLNDP22}.

Submitted 14 January, 2023; originally announced January 2023.

Comments: 116 pages

MSC Class: Primary 11M32; Secondary 11M38; 16S10; 16T30; 11R58

arXiv:2301.01733 [pdf, other]

doi 10.1063/5.0141616

Enhancing the Accuracy of Density Functional Tight Binding Models Through ChIMES Many-body Interaction Potentials

Authors: Nir Goldman, Laurence E. Fried, Rebecca K. Lindsey, C. Huy Pham, R. Dettori

Abstract: Semi-empirical quantum models such as Density Functional Tight Binding (DFTB) are attractive methods for obtaining quantum simulation data at longer time and length scales than possible with standard approaches. However, application of these models can require lengthy effort due to the lack of a systematic approach for their development. In this work, we discuss use of the Chebyshev Interaction Model for Efficient Simulation (ChIMES) to create rapidly parameterized DFTB models which exhibit strong transferability due to the inclusion of many-body interactions that might otherwise be inaccurate. We apply our modeling approach to silicon polymorphs and review previous work on titanium hydride. We also review creation of a general purpose DFTB/ChIMES model for organic molecules and compounds that approaches hybrid functional and coupled cluster accuracy with two orders of magnitude fewer parameters than similar neural network approaches. In all cases, DFTB/ChIMES yields similar accuracy to the underlying quantum method with orders of magnitude improvement in computational cost. Our developments provide a way to create computationally efficient and highly accurate simulations over varying extreme thermodynamic conditions, where physical and chemical properties can be difficult to interrogate directly and there is historically a significant reliance on theoretical approaches for interpretation and validation of experimental results.

Submitted 27 February, 2023; v1 submitted 4 January, 2023; originally announced January 2023.

Comments: 42 pages, 8 figures, 6 tables. In review for a special issue in J. Chem. Phys

arXiv:2212.14353 [pdf, other]

Sheaf-theoretic self-filtering network of low-cost sensors for local air quality monitoring: A causal approach

Authors: Anh-Duy Pham, Chuong Dinh Le, Hoang Viet Pham, Thinh Gia Tran, Dat Thanh Vo, Chau Long Tran, An Dinh Le, Hien Bich Vo

Abstract: Sheaf theory, which is a complex but powerful tool supported by topological theory, offers more flexibility and precision than traditional graph theory when it comes to modeling relationships between multiple features. In the realm of air quality monitoring, this can be incredibly useful in detecting sudden changes in local dust particle density, which can be difficult to accurately measure using commercial instruments. Traditional methods for air quality measurement often rely on calibrating the measurement with public standard instruments or calculating the measurement's moving average over a constant period. However, this can lead to an incorrect index at the measurement location, as well as an oversmoothing effect on the signal. In this study, we propose a compact device that uses sheaf theory to detect and count vehicles as a local air quality change-causing factor. By inferring the number of vehicles into the PM2.5 index and propagating it into the recorded PM2.5 index from low-cost air monitoring sensors such as PMS7003 and BME280, we can achieve self-correction in real-time. In addition, the sheaf-theoretic method allows for easy scaling to multiple nodes for further filtering effects. By implementing sheaf theory in air quality monitoring, we can overcome the limitations of traditional methods and provide more accurate and reliable results.

Submitted 29 December, 2022; originally announced December 2022.

arXiv:2212.13381 [pdf, other]

MixupE: Understanding and Improving Mixup from Directional Derivative Perspective

Authors: Yingtian Zou, Vikas Verma, Sarthak Mittal, Wai Hoh Tang, Hieu Pham, Juho Kannala, Yoshua Bengio, Arno Solin, Kenji Kawaguchi

Abstract: Mixup is a popular data augmentation technique for training deep neural networks where additional samples are generated by linearly interpolating pairs of inputs and their labels. This technique is known to improve the generalization performance in many learning paradigms and applications. In this work, we first analyze Mixup and show that it implicitly regularizes infinitely many directional derivatives of all orders. Based on this new insight, we propose an improved version of Mixup, theoretically justified to deliver better generalization performance than the vanilla Mixup. To demonstrate the effectiveness of the proposed method, we conduct experiments across various domains such as images, tabular data, speech, and graphs. Our results show that the proposed method improves Mixup across multiple datasets using a variety of architectures, for instance, exhibiting an improvement over Mixup by 0.8% in ImageNet top-1 accuracy.

Submitted 15 October, 2023; v1 submitted 27 December, 2022; originally announced December 2022.

Comments: 16 pages, Best Student Paper Award at UAI 2023

arXiv:2212.12574 [pdf, other]

doi 10.1007/JHEP07(2023)075

First observation and branching fraction measurement of the $Λ_b^0\to D_s^- p$ decay

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, B. Adeva, M. Adinolfi, P. Adlarson, H. Afsharnia, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, A. Alfonso Albero, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey , et al. (1040 additional authors not shown)

Abstract: The first observation of the $Λ_b^0\to D_s^- p$ decay is presented using proton-proton collision data collected by the LHCb experiment at a centre-of-mass energy of ${\sqrt{s}=13 \,\textrm{TeV}}$, corresponding to a total integrated luminosity of $6\,\textrm{fb}^{-1}$. Using the $Λ_b^0\toΛ_c^+π^-$ decay as the normalisation mode, the branching fraction of the $Λ_b^0\to D_s^- p$ decay is measured to be ${\mathcal{B}(Λ_b^0\to D_s^- p)=(12.6 \pm 0.5 \pm 0.3 \pm 1.2 )\times 10^{-6}}$, where the first uncertainty is statistical, the second systematic and the third due to uncertainties in the branching fractions of the $Λ_b^0\toΛ_c^+π^-$, $D_s^- \to K^-K^+π^-$ and $Λ_c^+\to p K^- π^+$ decays.

Submitted 17 July, 2023; v1 submitted 23 December, 2022; originally announced December 2022.

Comments: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2022-038.html (LHCb public pages)

Report number: LHCb-PAPER-2022-038, CERN-EP-2022-272

Journal ref: JHEP 07 (2023) 075

arXiv:2212.11518 [pdf, other]

Mean-field neural networks-based algorithms for McKean-Vlasov control problems *

Authors: Huyên Pham, Xavier Warin

Abstract: This paper is devoted to the numerical resolution of McKean-Vlasov control problems via the class of mean-field neural networks introduced in our companion paper [25] in order to learn the solution on the Wasserstein space. We propose several algorithms either based on dynamic programming with control learning by policy or value iteration, or backward SDE from stochastic maximum principle with global or local loss functions. Extensive numerical results on different examples are presented to illustrate the accuracy of each of our eight algorithms. We discuss and compare the pros and cons of all the tested methods.

Submitted 19 March, 2024; v1 submitted 22 December, 2022; originally announced December 2022.

arXiv:2212.09153 [pdf, other]

doi 10.1103/PhysRevD.108.032002

Measurement of lepton universality parameters in $B^+\to K^+\ell^+\ell^-$ and $B^0\to K^{*0}\ell^+\ell^-$ decays

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, B. Adeva, M. Adinolfi, P. Adlarson, H. Afsharnia, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, A. Alfonso Albero, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey , et al. (1039 additional authors not shown)

Abstract: A simultaneous analysis of the $B^+\to K^+\ell^+\ell^-$ and $B^0\to K^{*0}\ell^+\ell^-$ decays is performed to test muon-electron universality in two ranges of the square of the dilepton invariant mass, $q^2$. The measurement uses a sample of beauty meson decays produced in proton-proton collisions collected with the LHCb detector between 2011 and 2018, corresponding to an integrated luminosity of $9$ $\text{fb}^{-1}$. A sequence of multivariate selections and strict particle identification requirements produce a higher signal purity and a better statistical sensitivity per unit luminosity than previous LHCb lepton universality tests using the same decay modes. Residual backgrounds due to misidentified hadronic decays are studied using data and included in the fit model. Each of the four lepton universality measurements reported is either the first in the given $q^2$ interval or supersedes previous LHCb measurements. The results are compatible with the predictions of the Standard Model.

Submitted 7 November, 2023; v1 submitted 18 December, 2022; originally announced December 2022.

Comments: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2022-045.html (LHCb public pages)

Report number: LHCb-PAPER-2022-045, CERN-EP-2022-278

Journal ref: Phys. Rev. D 108 (2023) 032002

arXiv:2212.09152 [pdf, other]

doi 10.1103/PhysRevLett.131.051803

Test of lepton universality in $b \rightarrow s \ell^+ \ell^-$ decays

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, B. Adeva, M. Adinolfi, P. Adlarson, H. Afsharnia, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, A. Alfonso Albero, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey , et al. (1039 additional authors not shown)

Abstract: The first simultaneous test of muon-electron universality using $B^{+}\rightarrow K^{+}\ell^{+}\ell^{-}$ and $B^{0}\rightarrow K^{*0}\ell^{+}\ell^{-}$ decays is performed, in two ranges of the dilepton invariant-mass squared, $q^{2}$. The analysis uses beauty mesons produced in proton-proton collisions collected with the LHCb detector between 2011 and 2018, corresponding to an integrated luminosity of 9 $\mathrm{fb}^{-1}$. Each of the four lepton universality measurements reported is either the first in the given $q^{2}$ interval or supersedes previous LHCb measurements. The results are compatible with the predictions of the Standard Model.

Submitted 7 November, 2023; v1 submitted 18 December, 2022; originally announced December 2022.

Comments: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2022-046.html (LHCb public pages)

Report number: LHCb-PAPER-2022-046, CERN-EP-2022-277

Journal ref: Phys. Rev. Lett. 131 (2023) 051803

arXiv:2212.06109 [pdf, ps, other]

Optimal thresholds for Latin squares, Steiner Triple Systems, and edge colorings

Authors: Vishesh Jain, Huy Tuan Pham

Abstract: We show that the threshold for the binomial random $3$-partite, $3$-uniform hypergraph $G^{3}((n,n,n),p)$ to contain a Latin square is $Θ(\log{n}/n)$. We also prove analogous results for Steiner triple systems and proper list edge-colorings of the complete (bipartite) graph with random lists. Our results answer several related questions of Johansson, Luria-Simkin, Casselgren-Häggkvist, Simkin, and Kang-Kelly-Kühn-Methuku-Osthus.

Submitted 19 December, 2022; v1 submitted 12 December, 2022; originally announced December 2022.

Comments: 10 pages. Simplified proof; results unchanged

arXiv:2212.04313 [pdf]

Scalable, low-cost, and versatile system design for air pollution and traffic density monitoring and analysis

Authors: Thinh Gia Tran, Dat Thanh Vo, Long Chau Tran, Hoang Viet Pham, Chuong Dinh Le, An Dinh Le, Duy Anh Pham, Hien Bich Vo

Abstract: Vietnam requires sustainable urbanization, for which city sensing is used in planning and decision-making. Large cities need portable, scalable, and inexpensive digital technology for this purpose. End-to-end air quality monitoring companies such as AirVisual and Plume Air have shown their reliability with portable devices outfitted with superior air sensors. They are pricey, yet homeowners use them to get local air data without evaluating the causal effect. Our air quality inspection system is scalable, reasonably priced, and flexible. The system's minicomputer remotely monitors PMS7003 and BME280 sensor data through a microcontroller processor. The 5-megapixel camera module enables researchers to infer the causal relationship between traffic intensity and dust concentration. The design enables inexpensive, commercial-grade hardware, with Azure Blob storing air pollution data and surrounding-area imagery and preventing the system from physically expanding. In addition, by including an air channel that replenishes and distributes temperature, the design improves ventilation and safeguards electrical components. The gadget allows for the analysis of the correlation between traffic and air quality data, which might aid in the establishment of sustainable urban development plans and policies.

Submitted 8 December, 2022; originally announced December 2022.

arXiv:2212.02717 [pdf, other]

doi 10.1103/PhysRevD.108.012017

Amplitude analysis of $B^0 \rightarrow \overline{D}^0 D_s^+ π^-$ and $B^+ \rightarrow D^- D_s^+ π^+$ decays

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, B. Adeva, M. Adinolfi, P. Adlarson, H. Afsharnia, C. Agapopoulou, C. A. Aidala, S. Aiola, Z. Ajaltouni, S. Akar, K. Akiba, J. Albrecht, F. Alessio, M. Alexander, A. Alfonso Albero, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey , et al. (1047 additional authors not shown)

Abstract: Resonant contributions in $B^0 \rightarrow \overline{D}^0 D^+_sπ^-$ and $B^+\rightarrow D^- D^+_sπ^+$ decays are determined with an amplitude analysis, which is performed both separately and simultaneously, where in the latter case isospin symmetry between the decays is assumed. The analysis is based on data collected by the LHCb detector in proton-proton collisions at center-of-mass energies of 7, 8 and 13 $\rm{TeV}$. The full data sample corresponds to an integrated luminosity of 9 $\rm fb^{-1}$. A doubly charged spin-0 open-charm tetraquark candidate together with a neutral partner, both with masses near $2.9\,\rm{GeV}$, are observed in the $D_sπ$ decay channel.

Submitted 1 August, 2023; v1 submitted 5 December, 2022; originally announced December 2022.

Comments: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2022-027.html (LHCb public pages)

Report number: LHCb-PAPER-2022-027, CERN-EP-2022-246

Journal ref: Phys. Rev. D 108, 012017 (2023)

arXiv:2212.02716 [pdf, other]

doi 10.1103/PhysRevLett.131.041902

First observation of a doubly charged tetraquark and its neutral partner

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, B. Adeva, M. Adinolfi, P. Adlarson, H. Afsharnia, C. Agapopoulou, C. A. Aidala, S. Aiola, Z. Ajaltouni, S. Akar, K. Akiba, J. Albrecht, F. Alessio, M. Alexander, A. Alfonso Albero, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey , et al. (1047 additional authors not shown)

Abstract: A combined amplitude analysis is performed for the decays $B^0 \rightarrow \overline{D}^0 D^+_sπ^-$ and $B^+\rightarrow D^- D^+_sπ^+$, which are related by isospin symmetry. The analysis is based on data collected by the LHCb detector in proton-proton collisions at center-of-mass energies of 7, 8 and 13$\,\rm{TeV}$. The full data sample corresponds to an integrated luminosity of 9$\,\rm{fb^{-1}}$. Two new resonant states with masses of $2.908\pm0.011\pm0.020\,\rm{GeV}$ and widths of $0.136\pm0.023\pm0.011\,\rm{GeV}$ are observed, which decay to $D^+_sπ^+$ and $D^+_sπ^-$ respectively. The former state indicates the first observation of a doubly charged open-charm tetraquark state with minimal quark content $[c\bar{s}u\bar{d}]$, and the latter state is a neutral tetraquark composed of $[c\bar{s}\bar{u}d]$ quarks. Both states are found to have spin-parity $0^+$, and their resonant parameters are consistent with each other, which suggests that they belong to an isospin triplet.

Submitted 1 August, 2023; v1 submitted 5 December, 2022; originally announced December 2022.

Comments: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2022-026.html (LHCb public pages)

Report number: LHCb-PAPER-2022-026, CERN-EP-2022-239

Journal ref: Phys. Rev. Lett. 131, 041902 (2023)

arXiv:2212.01761 [pdf]

A PM2.5 concentration prediction framework with vehicle tracking system: From cause to effect

Authors: Chuong D. Le, Hoang V. Pham, Duy A. Pham, An D. Le, Hien B. Vo

Abstract: Air pollution is an emerging problem that needs to be solved especially in developed and developing countries. In Vietnam, air pollution is also a concerning issue in big cities such as Hanoi and Ho Chi Minh cities where air pollution comes mostly from vehicles such as cars and motorbikes. In order to tackle the problem, the paper focuses on developing a solution that can estimate the emitted PM2.5 pollutants by counting the number of vehicles in the traffic. We first investigated among the recent object detection models and developed our own traffic surveillance system. The observed traffic density showed a similar trend to the measured PM2.5 with a certain lagging in time, suggesting a relation between traffic density and PM2.5. We further express this relationship with a mathematical model which can estimate the PM2.5 value based on the observed traffic density. The estimated result showed a great correlation with the measured PM2.5 plots in the urban area context.

Submitted 4 December, 2022; originally announced December 2022.

arXiv:2212.01477 [pdf, other]

doi 10.1093/mnras/stad3120

Search for subsolar-mass black hole binaries in the second part of Advanced LIGO's and Advanced Virgo's third observing run

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, R. Abbott, H. Abe, F. Acernese, K. Ackley, S. Adhicary, N. Adhikari, R. X. Adhikari, V. K. Adkins, V. B. Adya, C. Affeldt, D. Agarwal, M. Agathos, O. D. Aguiar, L. Aiello, A. Ain, P. Ajith, T. Akutsu, S. Albanesi, R. A. Alfaidi, C. Alléné, A. Allocca, P. A. Altin , et al. (1680 additional authors not shown)

Abstract: We describe a search for gravitational waves from compact binaries with at least one component with mass 0.2 $M_\odot$ -- $1.0 M_\odot$ and mass ratio $q \geq 0.1$ in Advanced LIGO and Advanced Virgo data collected between 1 November 2019, 15:00 UTC and 27 March 2020, 17:00 UTC. No signals were detected. The most significant candidate has a false alarm rate of 0.2 $\mathrm{yr}^{-1}$. We estimate the sensitivity of our search over the entirety of Advanced LIGO's and Advanced Virgo's third observing run, and present the most stringent limits to date on the merger rate of binary black holes with at least one subsolar-mass component. We use the upper limits to constrain two fiducial scenarios that could produce subsolar-mass black holes: primordial black holes (PBH) and a model of dissipative dark matter. The PBH model uses recent prescriptions for the merger rate of PBH binaries that include a rate suppression factor to effectively account for PBH early binary disruptions. If the PBHs are monochromatically distributed, we can exclude a dark matter fraction in PBHs $f_\mathrm{PBH} \gtrsim 0.6$ (at 90% confidence) in the probed subsolar-mass range. However, if we allow for broad PBH mass distributions we are unable to rule out $f_\mathrm{PBH} = 1$. For the dissipative model, where the dark matter has chemistry that allows a small fraction to cool and collapse into black holes, we find an upper bound $f_{\mathrm{DBH}} < 10^{-5}$ on the fraction of atomic dark matter collapsed into black holes.

Submitted 26 January, 2024; v1 submitted 2 December, 2022; originally announced December 2022.

Comments: https://dcc.ligo.org/P2200139

arXiv:2211.11633 [pdf, other]

doi 10.1140/epjc/s10052-023-11641-5

Open charm production and asymmetry in $p$Ne collisions at $\sqrt{s_{\scriptscriptstyle\rm NN}} =$ 68.5 GeV

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, B. Adeva, M. Adinolfi, P. Adlarson, H. Afsharnia, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, J. Albrecht, F. Alessio, M. Alexander, A. Alfonso Albero, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis , et al. (1045 additional authors not shown)

Abstract: A measurement of $D^0$ meson production by the LHCb experiment in its fixed-target configuration is presented. The production of $D^0$ mesons is studied with a beam of 2.5 TeV protons colliding on a gaseous neon target at rest, corresponding to a nucleon-nucleon centre-of-mass energy of $\sqrt{s_{\rm NN}}$ = 68.5 GeV. The sum of the $D^0$ and ${\overline D^0}$ production cross-section in $p$Ne collisions in the centre-of-mass rapidity range $y^{\star}\in [-2.29, 0]$ is found to be $σ_{D^{0}}^{y^\star \in [-2.29, 0]} = 48.2 \pm 0.3 \pm 4.5 \,μ\textrm{b/nucleon}$ where the first uncertainty is statistical and the second is systematic. The $D^0-{\overline D^0}$ production asymmetry is also evaluated and suggests a trend towards negative values at large negative $y^{\star}$. The considered models do not account precisely for all the features observed in the LHCb data, but theoretical predictions including 1$\%$ intrinsic charm and 10$\%$ recombination contributions better describe the data than the other models considered.

Submitted 20 February, 2024; v1 submitted 21 November, 2022; originally announced November 2022.

Comments: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2022-015.html (LHCb public pages). arXiv admin note: text overlap with arXiv:1810.07907

Report number: CERN-EP-2022-217, LHCb-PAPER-2022-015

Journal ref: Eur. Phys. J. C83 (2023) 541

arXiv:2211.10948 [pdf, other]

doi 10.1109/TNSM.2023.3314066

FedDCT: Federated Learning of Large Convolutional Neural Networks on Resource Constrained Devices using Divide and Collaborative Training

Authors: Quan Nguyen, Hieu H. Pham, Kok-Seng Wong, Phi Le Nguyen, Truong Thao Nguyen, Minh N. Do

Abstract: We introduce FedDCT, a novel distributed learning paradigm that enables the usage of large, high-performance CNNs on resource-limited edge devices. As opposed to traditional FL approaches, which require each client to train the full-size neural network independently during each training round, the proposed FedDCT allows a cluster of several clients to collaboratively train a large deep learning model by dividing it into an ensemble of several small sub-models and train them on multiple devices in parallel while maintaining privacy. In this collaborative training process, clients from the same cluster can also learn from each other, further improving their ensemble performance. In the aggregation stage, the server takes a weighted average of all the ensemble models trained by all the clusters. FedDCT reduces the memory requirements and allows low-end devices to participate in FL. We empirically conduct extensive experiments on standardized datasets, including CIFAR-10, CIFAR-100, and two real-world medical datasets HAM10000 and VAIPE. Experimental results show that FedDCT outperforms a set of current SOTA FL methods with interesting convergence behaviors. Furthermore, compared to other existing approaches, FedDCT achieves higher accuracy and substantially reduces the number of communication rounds (with $4-8$ times fewer memory requirements) to achieve the desired accuracy on the testing dataset without incurring any extra training cost on the server side.

Submitted 18 September, 2023; v1 submitted 20 November, 2022; originally announced November 2022.

Comments: Update v2: Final version as published in IEEE Transactions on Network and Service Management 2023

arXiv:2211.06828 [pdf, other]

doi 10.1109/ACCESS.2023.3298299

Enhancing Few-shot Image Classification with Cosine Transformer

Authors: Quang-Huy Nguyen, Cuong Q. Nguyen, Dung D. Le, Hieu H. Pham

Abstract: This paper addresses the few-shot image classification problem, where the classification task is performed on unlabeled query samples given a small amount of labeled support samples only. One major challenge of the few-shot learning problem is the large variety of object visual appearances that prevents the support samples to represent that object comprehensively. This might result in a significant difference between support and query samples, therefore undermining the performance of few-shot algorithms. In this paper, we tackle the problem by proposing Few-shot Cosine Transformer (FS-CT), where the relational map between supports and queries is effectively obtained for the few-shot tasks. The FS-CT consists of two parts, a learnable prototypical embedding network to obtain categorical representations from support samples with hard cases, and a transformer encoder to effectively achieve the relational map from two different support and query samples. We introduce Cosine Attention, a more robust and stable attention module that enhances the transformer module significantly and therefore improves FS-CT performance from 5% to over 20% in accuracy compared to the default scaled dot-product mechanism. Our method performs competitive results in mini-ImageNet, CUB-200, and CIFAR-FS on 1-shot learning and 5-shot learning tasks across backbones and few-shot configurations. We also developed a custom few-shot dataset for Yoga pose recognition to demonstrate the potential of our algorithm for practical application.
Our FS-CT with cosine attention is a lightweight, simple few-shot algorithm that can be applied for a wide range of applications, such as healthcare, medical, and security surveillance. The official implementation code of our Few-shot Cosine Transformer is available at https://github.com/vinuni-vishc/Few-Shot-Cosine-Transformer

Submitted 21 July, 2023; v1 submitted 13 November, 2022; originally announced November 2022.

Journal ref: IEEE Access (2023)

arXiv:2211.05034 [pdf, other]

doi 10.1103/PhysRevD.108.034012

First observation of the $B^+ \rightarrow D_s^+ D_s^- K^+$ decay

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, B. Adeva, M. Adinolfi, H. Afsharnia, C. Agapopoulou, C. A. Aidala, S. Aiola, Z. Ajaltouni, S. Akar, K. Akiba, J. Albrecht, F. Alessio, M. Alexander, A. Alfonso Albero, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis , et al. (1038 additional authors not shown)

Abstract: The $B^+ \rightarrow D_s^+ D_s^- K^+$ decay is observed for the first time using proton-proton collision data collected by the LHCb detector at centre-of-mass energies of $7$, $8$ and $13\, \text{TeV}$, corresponding to an integrated luminosity of $9\,\text{fb}^{-1}$. Its branching fraction relative to that of the $B^{+} \rightarrow D^{+} D^{-} K^{+}$ decay is measured to be $$\frac{B\left(B^{+} \rightarrow D_s^{+} D_s^{-} K^{+}\right)}{B\left(B^{+} \rightarrow D^{+} D^{-} K^{+}\right)}=0.525 \pm 0.033 \pm 0.027 \pm 0.034,$$ where the first uncertainty is statistical, the second systematic, and the third is due to the uncertainties on the branching fractions of the $D_s^{\pm} \rightarrow K^{\mp} K^{\pm} π^{\pm}$ and $D^{\pm} \rightarrow K^{\mp} π^{\pm} π^{\pm}$ decays. This measurement fills an experimental gap in the knowledge of the family of Cabibbo$-$favoured $\bar{b} \rightarrow \bar{c} c \bar{s}$ transitions and opens the path for unique studies of spectroscopy in future.

Submitted 7 November, 2023; v1 submitted 9 November, 2022; originally announced November 2022.

Comments: All figures and tables, along with machine-readable versions and any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2022-019.html (LHCb public pages)

Report number: LHCb-PAPER-2022-019, CERN-EP-2022-205

Journal ref: Phys. Rev. D 108 (2023) 034012

arXiv:2211.03970 [pdf, other]

On the Algorithmic Stability and Generalization of Adaptive Optimization Methods

Authors: Han Nguyen, Hai Pham, Sashank J. Reddi, Barnabás Póczos

Abstract: Despite their popularity in deep learning and machine learning in general, the theoretical properties of adaptive optimizers such as Adagrad, RMSProp, Adam or AdamW are not yet fully understood. In this paper, we develop a novel framework to study the stability and generalization of these optimization methods. Based on this framework, we show provable guarantees about such properties that depend heavily on a single parameter $β_2$. Our empirical experiments support our claims and provide practical insights into the stability and generalization properties of adaptive optimization methods.

Submitted 7 November, 2022; originally announced November 2022.

Comments: 21 pages including appendix

arXiv:2210.15179 [pdf, other]

Mean-field neural networks: learning mappings on Wasserstein space

Authors: Huyên Pham, Xavier Warin

Abstract: We study the machine learning task for models with operators mapping between the Wasserstein space of probability measures and a space of functions, like e.g. in mean-field games/control problems. Two classes of neural networks, based on bin density and on cylindrical approximation, are proposed to learn these so-called mean-field functions, and are theoretically supported by universal approximation theorems. We perform several numerical experiments for training these two mean-field neural networks, and show their accuracy and efficiency in the generalization error with various test distributions. Finally, we present different algorithms relying on mean-field neural networks for solving time-dependent mean-field problems, and illustrate our results with numerical tests for the example of a semi-linear partial differential equation in the Wasserstein space of probability measures.

Submitted 18 September, 2023; v1 submitted 27 October, 2022; originally announced October 2022.

Comments: 32 pages, 15 figures

MSC Class: 60G99

arXiv:2210.15153 [pdf, other]

doi 10.1103/PhysRevLett.131.071901

Observation of a resonant structure near the $D_s^+ D_s^-$ threshold in the $B^+\to D_s^+ D_s^- K^+$ decay

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, B. Adeva, M. Adinolfi, H. Afsharnia, C. Agapopoulou, C. A. Aidala, S. Aiola, Z. Ajaltouni, S. Akar, K. Akiba, J. Albrecht, F. Alessio, M. Alexander, A. Alfonso Albero, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis , et al. (1038 additional authors not shown)

Abstract: An amplitude analysis of the $B^+\to D_s^+ D_s^- K^+$ decay is carried out to study for the first time its intermediate resonant contributions, using proton-proton collision data collected with the LHCb detector at centre-of-mass energies of 7, 8 and 13 TeV. A near-threshold peaking structure, referred to as $X(3960)$, is observed in the $D_s^+ D_s^-$ invariant-mass spectrum with significance greater than 12 standard deviations. The mass, width and the quantum numbers of the structure are measured to be $3956\pm5\pm10$ MeV, $43\pm13\pm8$ MeV and $J^{PC}=0^{++}$, respectively, where the first uncertainties are statistical and the second systematic. The properties of the new structure are consistent with recent theoretical predictions for a state composed of $c\bar{c}s\bar{s}$ quarks. Evidence for an additional structure is found around 4140 MeV in the $D_s^+ D_s^-$ invariant mass, which might be caused either by a new resonance with the $0^{++}$ assignment or by a $J/ψφ\leftrightarrow D_s^+ D_s^-$ coupled-channel effect.

Submitted 18 August, 2023; v1 submitted 26 October, 2022; originally announced October 2022.

Comments: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2022-018.html (LHCb public pages)

Report number: CERN-EP-2022-200, LHCb-PAPER-2022-018

Journal ref: Phys. Rev. Lett. 131, 071901 (2023)

arXiv:2210.14945 [pdf, other]

doi 10.1007/JHEP07(2023)119

Observation of the $B^0_s\!\to D^{*+}D^{*-}$ decay

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, B. Adeva, M. Adinolfi, P. Adlarson, H. Afsharnia, C. Agapopoulou, C. A. Aidala, S. Aiola, Z. Ajaltouni, S. Akar, K. Akiba, J. Albrecht, F. Alessio, M. Alexander, A. Alfonso Albero, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey , et al. (1049 additional authors not shown)

Abstract: The first observation of the $B^0_s\!\to D^{*+}D^{*-}$ decay and the measurement of its branching ratio relative to the $B^0\!\to D^{*+}D^{*-}$ decay are presented. The data sample used corresponds to an integrated luminosity of $9\,\text{fb}^{-1}$ of proton-proton collisions recorded by the LHCb experiment at centre-of-mass energies of 7, 8 and $13\,\text{TeV}$ between 2011 and 2018. The decay is observed with more than $10$ standard deviations and the time-integrated ratio of branching fractions is determined to be \begin{align*} \frac{\mathcal{B}(B^0_s\!\to D^{*+}D^{*-})}{\mathcal{B}(B^0\!\to D^{*+}D^{*-})} = 0.269 \pm 0.032 \pm 0.011 \pm 0.008\, , \end{align*} where the first uncertainty is statistical, the second systematic and the third due to the uncertainty of the fragmentation fraction ratio $f_s/f_d$. The $B^0_s\!\to D^{*+}D^{*-}$ branching fraction is calculated to be \begin{align*} \mathcal{B}(B^0_s\!\to D^{*+}D^{*-}) = (2.15 \pm 0.26 \pm 0.09 \pm 0.06 \pm 0.16)\times 10^{-4} \,, \end{align*} where the fourth uncertainty is due to the $B^0\!\to D^{*+}D^{*-}$ branching fraction. These results are calculated using the average $B^0_s$ meson lifetime in simulation. Correction factors are reported for scenarios where either a purely heavy or a purely light $B^0_s$ eigenstate is considered.

Submitted 17 July, 2023; v1 submitted 26 October, 2022; originally announced October 2022.

Comments: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2022-023.html (LHCb public pages)

Report number: LHCb-PAPER-2022-023, CERN-EP-2022-193

Journal ref: JHEP 07 (2023) 119

arXiv:2210.12000 [pdf, other]

doi 10.1007/JHEP07(2023)066

Measurement of the ratio of branching fractions $\mathcal{B}(B_c^+ \to B_s^0 π^+)/\mathcal{B}(B_c^+ \to J/ψπ^+)$

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, B. Adeva, M. Adinolfi, P. Adlarson, H. Afsharnia, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, A. Alfonso Albero, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey , et al. (1046 additional authors not shown)

Abstract: The ratio of branching fractions of $B_c^+ \to B_s^0 π^+$ and $B_c^+ \to J/ψπ^+$ decays is measured with proton-proton collision data of a centre-of-mass energy of $13\text{TeV}$. The data were collected with the LHCb experiment during 2016--2018, corresponding to an integrated luminosity of $5.4 \text{fb}^{-1}$. The $B_s^0$ mesons are reconstructed via the decays $B_s^0 \to J/ψφ$ and $B_s^0 \to D_s^- π^+$. The ratio of branching fractions is measured to be $\mathcal{B}(B_c^+ \to B_s^0 π^+)/\mathcal{B}(B_c^+ \to J/ψπ^+) = 91 \pm 10 \pm 8 \pm 3$ where the first uncertainty is statistical, the second is systematic and the third is due to the knowledge of the branching fractions of the intermediate state decays.

Submitted 18 July, 2023; v1 submitted 21 October, 2022; originally announced October 2022.

Comments: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2022-034.html (LHCb public pages)

Report number: LHCb-PAPER-2022-034, CERN-EP-2022-204

Journal ref: JHEP 07 (2023) 066

arXiv:2210.10931 [pdf, other]

Search for gravitational-wave transients associated with magnetar bursts in Advanced LIGO and Advanced Virgo data from the third observing run

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, R. Abbott, H. Abe, F. Acernese, K. Ackley, N. Adhikari, R. X. Adhikari, V. K. Adkins, V. B. Adya, C. Affeldt, D. Agarwal, M. Agathos, K. Agatsuma, N. Aggarwal, O. D. Aguiar, L. Aiello, A. Ain, P. Ajith, T. Akutsu, S. Albanesi, R. A. Alfaidi, A. Allocca, P. A. Altin , et al. (1645 additional authors not shown)

Abstract: Gravitational waves are expected to be produced from neutron star oscillations associated with magnetar giant flares and short bursts. We present the results of a search for short-duration (milliseconds to seconds) and long-duration ($\sim$ 100 s) transient gravitational waves from 13 magnetar short bursts observed during Advanced LIGO, Advanced Virgo and KAGRA's third observation run. These 13 bursts come from two magnetars, SGR 1935$+$2154 and Swift J1818.0$-$1607. We also include three other electromagnetic burst events detected by Fermi GBM which were identified as likely coming from one or more magnetars, but they have no association with a known magnetar. No magnetar giant flares were detected during the analysis period. We find no evidence of gravitational waves associated with any of these 16 bursts. We place upper bounds on the root-sum-square of the integrated gravitational-wave strain that reach $2.2 \times 10^{-23}$ $/\sqrt{\text{Hz}}$ at 100 Hz for the short-duration search and $8.7 \times 10^{-23}$ $/\sqrt{\text{Hz}}$ at $450$ Hz for the long-duration search, given a detection efficiency of 50%. For a ringdown signal at 1590 Hz targeted by the short-duration search the limit is set to $1.8 \times 10^{-22}$ $/\sqrt{\text{Hz}}$. Using the estimated distance to each magnetar, we derive upper bounds on the emitted gravitational-wave energy of $3.2 \times 10^{43}$ erg ($7.3 \times 10^{43}$ erg) for SGR 1935$+$2154 and $8.2 \times 10^{42}$ erg ($2.8 \times 10^{43}$ erg) for Swift J1818.0$-$1607, for the short-duration (long-duration) search.
Assuming isotropic emission of electromagnetic radiation of the burst fluences, we constrain the ratio of gravitational-wave energy to electromagnetic energy for bursts from SGR 1935$+$2154 with available fluence information. The lowest of these ratios is $3 \times 10^3$.

Submitted 19 October, 2022; originally announced October 2022.

Comments: 30 pages with appendices, 5 figures, 10 tables

Report number: LIGO-P2100387

arXiv:2210.07196 [pdf, ps, other]

Small subsets with large sumset: Beyond the Cauchy--Davenport bound

Authors: Jacob Fox, Sammy Luo, Huy Tuan Pham, Yunkun Zhou

Abstract: For a subset $A$ of an abelian group $G$, given its size $|A|$, its doubling $κ=|A+A|/|A|$, and a parameter $s$ which is small compared to $|A|$, we study the size of the largest sumset $A+A'$ that can be guaranteed for a subset $A'$ of $A$ of size at most $s$. We show that a subset $A'\subseteq A$ of size at most $s$ can be found so that $|A+A'| = Ω(\min(κ^{1/3},s)|A|)$. Thus a sumset significantly larger than the Cauchy--Davenport bound can be guaranteed by a bounded size subset assuming that the doubling $κ$ is large. Building up on the same ideas, we resolve a conjecture of Bollobás, Leader and Tiba that for subsets $A,B$ of $\mathbb{Z}_p$ of size at most $αp$ for an appropriate constant $α>0$, one only needs three elements $b_1,b_2,b_3\in B$ to guarantee $|A+\{b_1,b_2,b_3\}|\ge |A|+|B|-1$. Allowing the use of larger subsets $A'$, we show that for sets $A$ of bounded doubling, one only needs a subset $A'$ with $o(|A|)$ elements to guarantee that $A+A'=A+A$. We also address another conjecture and a question raised by Bollobás, Leader and Tiba on high-dimensional analogs and sets whose sumset cannot be saturated by a bounded size subset.

Submitted 13 October, 2022; originally announced October 2022.

arXiv:2210.04996 [pdf, other]

Graph2Vid: Flow graph to Video Grounding for Weakly-supervised Multi-Step Localization

Authors: Nikita Dvornik, Isma Hadji, Hai Pham, Dhaivat Bhatt, Brais Martinez, Afsaneh Fazly, Allan D. Jepson

Abstract: In this work, we consider the problem of weakly-supervised multi-step localization in instructional videos. An established approach to this problem is to rely on a given list of steps. However, in reality, there is often more than one way to execute a procedure successfully, by following the set of steps in slightly varying orders. Thus, for successful localization in a given video, recent works require the actual order of procedure steps in the video, to be provided by human annotators at both training and test times. Instead, here, we only rely on generic procedural text that is not tied to a specific video. We represent the various ways to complete the procedure by transforming the list of instructions into a procedure flow graph which captures the partial order of steps. Using the flow graphs reduces both training and test time annotation requirements. To this end, we introduce the new problem of flow graph to video grounding. In this setup, we seek the optimal step ordering consistent with the procedure flow graph and a given video. To solve this problem, we propose a new algorithm - Graph2Vid - that infers the actual ordering of steps in the video and simultaneously localizes them. To show the advantage of our proposed formulation, we extend the CrossTask dataset with procedure flow graph information. Our experiments show that Graph2Vid is both more efficient than the baselines and yields strong step localization results, without the need for step order annotation.

Submitted 31 October, 2022; v1 submitted 10 October, 2022; originally announced October 2022.

Comments: ECCV'22, oral

Journal ref: ECCV 2022

Showing 151–200 of 644 results for author: Pham, H