Search for the radiative transition $χ_{c1}(3872)\toγψ_2(3823)$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, et al. (635 additional authors not shown)

Abstract: Using 9.0 $\rm fb^{-1}$ of $e^+e^-$ collision data collected at center-of-mass energies from 4.178 to 4.278 GeV with the BESIII detector at the BEPCII collider, we perform the first search for the radiative transition $χ_{c1}(3872)\toγψ_2(3823)$. No $χ_{c1}(3872)\toγψ_2(3823)$ signal is observed. The upper limit on the ratio of branching fractions $\mathcal{B}(χ_{c1}(3872)\toγψ_2(3823), ψ_2(3823)\toγχ_{c1})/\mathcal{B}(χ_{c1}(3872)\toπ^+π^- J/ψ)$ is set as 0.075 at the 90\% confidence level. Our result contradicts theoretical predictions under the assumption that the $χ_{c1}(3872)$ is the pure charmonium state $χ_{c1}(2P)$.

Submitted 13 May, 2024; originally announced May 2024.

Comments: 8 pages, 2 figures

arXiv:2405.07691 [pdf, other]

Discovery of Very-high-energy Gamma-ray Emissions from the Low Luminosity AGN NGC 4278 by LHAASO

Authors: Zhen Cao, F. Aharonian, Q. An, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen, et al. (255 additional authors not shown)

Abstract: The first source catalog of the Large High Altitude Air Shower Observatory reported the detection of a very-high-energy gamma-ray source, 1LHAASO J1219+2915. In this paper a further detailed study of the spectral and temporal behavior of this point-like source has been carried out. The best-fit position of the TeV source ($\rm{RA}=185.05^{\circ}\pm0.04^{\circ}$, $\rm{Dec}=29.25^{\circ}\pm0.03^{\circ}$) is compatible with NGC 4278 within $\sim0.03$ degree. Variation analysis shows an indication of variability at the level of a few months in the TeV band, which is consistent with low-frequency observations. Based on these observations, we report the detection of TeV $γ$-ray emissions from this low-luminosity AGN NGC 4278. The observation by LHAASO-WCDA during the active period has a significance level of 8.8\,$σ$ with best-fit photon spectral index $\varGamma=2.56\pm0.14$ and a flux $f_{1-10\,\rm{TeV}}=(7.0\pm1.1_{\rm{sta}}\pm0.35_{\rm{syst}})\times10^{-13}\,\rm{photons\,cm^{-2}\,s^{-1}}$, or approximately $5\%$ of the Crab Nebula. The discovery of VHE emission from NGC 4278 indicates that the compact, weak radio jet can efficiently accelerate particles and emit TeV photons.

Submitted 13 May, 2024; originally announced May 2024.

Comments: 11 pages, 5 figures

arXiv:2405.07474 [pdf, other]

Integrating Intent Understanding and Optimal Behavior Planning for Behavior Tree Generation from Human Instructions

Authors: Xinglin Chen, Yishuai Cai, Yunxin Mao, Minglong Li, Wen**g Yang, Weixia Xu, Ji Wang

Abstract: Robots executing tasks following human instructions in domestic or industrial environments essentially require both adaptability and reliability. Behavior Tree (BT) emerges as an appropriate control architecture for these scenarios due to its modularity and reactivity. Existing BT generation methods, however, either do not involve interpreting natural language or cannot theoretically guarantee the BTs' success. This paper proposes a two-stage framework for BT generation, which first employs large language models (LLMs) to interpret goals from high-level instructions, then constructs an efficient goal-specific BT through the Optimal Behavior Tree Expansion Algorithm (OBTEA). We represent goals as well-formed formulas in first-order logic, effectively bridging intent understanding and optimal behavior planning. Experiments in the service robot validate the proficiency of LLMs in producing grammatically correct and accurately interpreted goals, demonstrate OBTEA's superiority over the baseline BT Expansion algorithm in various metrics, and finally confirm the practical deployability of our framework. The project website is https://dids-ei.github.io/Project/LLM-OBTEA/.

Submitted 27 June, 2024; v1 submitted 13 May, 2024; originally announced May 2024.

arXiv:2405.06393 [pdf, other]

Measurement of the ${e}^{+}{e}^{-}\to p \bar{p}π^{0}$ cross section at $\sqrt{s}=2.1000-3.0800$ GeV

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (639 additional authors not shown)

Abstract: The process $e^{+}e^{-}\to p\bar{p}π^{0}$ is studied at 20 center-of-mass energies ranging from 2.1000 to 3.0800 GeV using 636.8 pb$^{-1}$ of data collected with the BESIII detector operating at the BEPCII collider. The Born cross sections for $e^{+}e^{-}\to p\bar{p}π^{0}$ are measured with high precision. Since the lowest center-of-mass energy, 2.1000 GeV, is less than 90 MeV above the $p\bar{p}π^0$ energy threshold, we can probe the threshold behavior for this reaction. However, no anomalous threshold enhancement is found in the cross sections for $e^{+}e^{-}\to p\bar{p}π^{0}$.

Submitted 10 May, 2024; originally announced May 2024.

arXiv:2405.06227 [pdf, other]

MaskMatch: Boosting Semi-Supervised Learning Through Mask Autoencoder-Driven Feature Learning

Authors: Wen** Zhang, Keyi Li, Sen Yang, Chenyang Gao, Wanzhao Yang, Sifan Yuan, Ivan Marsic

Abstract: Conventional methods in semi-supervised learning (SSL) often face challenges related to limited data utilization, mainly due to their reliance on threshold-based techniques for selecting high-confidence unlabeled data during training. Various efforts (e.g., FreeMatch) have been made to enhance data utilization by tweaking the thresholds, yet none have managed to use 100% of the available data. To overcome this limitation and improve SSL performance, we introduce MaskMatch, a novel algorithm that fully utilizes unlabeled data to boost semi-supervised learning. MaskMatch integrates a self-supervised learning strategy, i.e., Masked Autoencoder (MAE), that uses all available data to enforce the visual representation learning. This enables the SSL algorithm to leverage all available data, including samples typically filtered out by traditional methods. In addition, we propose a synthetic data training approach to further increase data utilization and improve generalization. These innovations lead MaskMatch to achieve state-of-the-art results on challenging datasets. For instance, on CIFAR-100 with 2 labels per class, STL-10 with 4 labels per class, and Euro-SAT with 2 labels per class, MaskMatch achieves low error rates of 18.71%, 9.47%, and 3.07%, respectively. The code will be made publicly available.

Submitted 9 May, 2024; originally announced May 2024.

arXiv:2405.05739 [pdf]

Preliminary Exploration on the Low-Pressure Ar-O2 Plasma Generated by Low-Frequency Alternating Current (AC) Power Supply

Authors: Niaz Wali, W. W. Xiao, Q. U. Din, N. U. Rehman, C. Y. Wang, J. T. Ma, W. J. Zhong, Q. W. Yang

Abstract: This study reports a low-frequency alternating current (AC) power supply as a novel approach for generating low-pressure capacitively coupled Ar-O2 plasma, offering advantages in cost, compactness, and operational simplicity, which are crucial for both material science and biological applications. The effectiveness of low-frequency AC-generated plasma relative to traditional RF systems is investigated by examining key plasma parameters such as electron density, electron temperature, and the electron energy distribution function (EEDF). Experimental results revealed that the AC power supply could effectively produce low-pressure Ar-O2 plasma with properties comparable to those of RF systems. Most notably, the AC-generated plasma achieved a significant reduction in bacterial growth, suggesting its potential as a more economical and flexible alternative for enhancing plasma-assisted applications in sterilization and material processing.

Submitted 9 May, 2024; originally announced May 2024.

Comments: 16 pages, 7 figures

arXiv:2405.04954 [pdf, ps, other]

On vector parking functions and q-analogue

Authors: Wenkai Yang

Abstract: In 2000, it was demonstrated that the set of $x$-parking functions of length $n$, where $x$=($a,b,...,b$) $\in \mathbbm{N}^n$, is equivalent to the set of rooted multicolored forests on [$n$]=\{1,...,$n$\}. In 2020, Yue Cai and Catherine H. Yan systematically investigated the properties of rational parking functions. Subsequently, a series of Context-free grammars possessing the requisite property were introduced by William Y.C. Chen and Harold R.L. Yang in 2021. In this paper, I discuss generalized parking functions in terms of grammars. The primary result is to obtain the q-analogue about the number of '1's in certain vector parking functions with the assistance of grammars.

Submitted 8 May, 2024; originally announced May 2024.

MSC Class: 05A15; 05A19; 05A30

arXiv:2405.04503 [pdf, other]

Physics-data hybrid dynamic model of a multi-axis manipulator for sensorless dexterous manipulation and high-performance motion planning

Authors: Wu-Te Yang, Jyun-Ming Liao, Pei-Chun Lin

Abstract: We report on the development of an implementable physics-data hybrid dynamic model for an articulated manipulator to plan and operate in various scenarios. Meanwhile, the physics-based and data-driven dynamic models are studied in this research to select the best model for planning. The physics-based model is constructed using the Lagrangian method, and the loss terms include inertia loss, viscous loss, and friction loss. As for the data-driven model, three methods are explored, including DNN, LSTM, and XGBoost. Our modeling results demonstrate that, after comprehensive hyperparameter optimization, the XGBoost architecture outperforms DNN and LSTM in accurately representing manipulator dynamics. The hybrid model with physics-based and data-driven terms has the best performance among all models based on the RMSE criteria, and it only needs about 24k of training data. In addition, we developed a virtual force sensor of a manipulator using the observed external torque derived from the dynamic model and designed a motion planner through the physics-data hybrid dynamic model. The external torque contributes to forces and torque on the end effector, facilitating interaction with the surroundings, while the internal torque governs manipulator motion dynamics and compensates for internal losses. By estimating external torque via the difference between measured joint torque and internal losses, we implement a sensorless control strategy which is demonstrated through a peg-in-hole task. Lastly, a learning-based motion planner based on the hybrid dynamic model assists in planning time-efficient trajectories for the manipulator. This comprehensive approach underscores the efficacy of integrating physics-based and data-driven models for advanced manipulator control and planning in industrial environments.

Submitted 7 May, 2024; originally announced May 2024.

Comments: 26 pages, 16 figures

arXiv:2405.03209 [pdf, other]

Evidence for a cyclotron absorption line and spectral transition in EXO 2030+375 during 2021 giant outburst

Authors: Wen Yang, Wei Wang, Prahlad R. Epili

Abstract: Based on HXMT observations of EXO 2030+375 during its 2021 giant outburst, we report the analysis of pulse variations and the broadband X-ray spectrum, and find the presence of a potential cyclotron resonant scattering feature with the fundamental line at 47 keV from both average spectra and phase-resolved spectroscopy. During the outburst, the source reached an X-ray luminosity of $\sim 1\times 10^{38}$ erg /cm/s from 2-105 keV at a distance of 7.1 kpc. The X-ray pulsar at the spin period of 41.27 seconds exhibits complex timing and spectral variations with both energy and luminosity during the outburst. The shapes of the pulses profiles show the single main peak above 20 keV, while appear to exhibit multi-peak patterns in low energy bands, and the transition of pulse profiles from multi-peak to single-peak is observed at $0.8\times 10^{38}$ erg /cm/s, which suggests the evolution from the subcritical luminosity (pencil-beam dominated) to supercritical luminosity (fan-beam dominated) regimes. A dip structure before the energy of the cyclotron resonant scattering features is found in the pulse fraction-energy relation near the peak luminosity. A detailed analysis of spectral parameters showed that the power-law photon index exhibits three distinct trends as luminosity increases, and these changes also signify a spectral transition from sub-critical to super-critical regimes. The critical luminosity infers the magnetic field of $(4.8-6.0)\times 10^{12}$ G, which supports the presence of the cyclotron line at 47 keV. A Comptonization model applied for the broad X-ray spectra during the outburst also suggests the surface magnetic field ranging from $(5-9)\times 10^{12}$ G.

Submitted 6 May, 2024; originally announced May 2024.

Comments: 17 pages, 13 figures, 5 tables, accepted for publication in ApJ

arXiv:2405.01829 [pdf, other]

Fractional quantum anomalous Hall effect in a semimetal

Authors: Wenqi Yang, Dawei Zhai, Feng-Ren Fan, Wang Yao

Abstract: In the search for fractional quantum anomalous Hall (FQAH) effects, the conventional wisdom is to start from a flat Chern band isolated from the rest of the Hilbert space by a bandgap, so that many-body interaction can be projected to a landscape that mimics a Landau level. Here we report the finding of FQAH in a 2D semimetal. Described by a 2π-flux dice lattice, the model features a flat band touching a dispersive lower band, where the band touching is symmetry-protected from being gapped by electron interaction. At 1/3 and 2/3 filling of the gapless flat band, FQAH phases are found using density matrix renormalisation group calculations taking into account all bands. Symmetry breaking to gap the band touching can turn the semimetal into a Chern insulator while keeping the Chern band nearly flat, but counter-intuitively this suppresses the FQAH, as the gap opening introduces strong inhomogeneity to the quantum geometry. We show an optical scheme to realize the 2π-flux dice lattice for cold atoms. Our finding uncovers a new arena for the exploration of fractional quantum Hall physics in addition to the Landau levels and Chern insulators.

Submitted 2 May, 2024; originally announced May 2024.

arXiv:2405.01460 [pdf, other]

Purify Unlearnable Examples via Rate-Constrained Variational Autoencoders

Authors: Yi Yu, Yufei Wang, Song Xia, Wenhan Yang, Shijian Lu, Yap-Peng Tan, Alex C. Kot

Abstract: Unlearnable examples (UEs) seek to maximize testing error by making subtle modifications to training examples that are correctly labeled. Defenses against these poisoning attacks can be categorized based on whether specific interventions are adopted during training. The first approach is training-time defense, such as adversarial training, which can mitigate poisoning effects but is computationally intensive. The other approach is pre-training purification, e.g., image short squeezing, which consists of several simple compressions but often encounters challenges in dealing with various UEs. Our work provides a novel disentanglement mechanism to build an efficient pre-training purification method. Firstly, we uncover rate-constrained variational autoencoders (VAEs), demonstrating a clear tendency to suppress the perturbations in UEs. We subsequently conduct a theoretical analysis for this phenomenon. Building upon these insights, we introduce a disentangle variational autoencoder (D-VAE), capable of disentangling the perturbations with learnable class-wise embeddings. Based on this network, a two-stage purification approach is naturally developed. The first stage focuses on roughly eliminating perturbations, while the second stage produces refined, poison-free results, ensuring effectiveness and robustness across various scenarios. Extensive experiments demonstrate the remarkable performance of our method across CIFAR-10, CIFAR-100, and a 100-class ImageNet-subset. Code is available at https://github.com/yuyi-sd/D-VAE.

Submitted 6 May, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

Comments: Accepted by ICML 2024

arXiv:2405.01186 [pdf, other]

Potential Energy based Mixture Model for Noisy Label Learning

Authors: Zijia Wang, Wenbin Yang, Zhisong Liu, Zhen Jia

Abstract: Training deep neural networks (DNNs) from noisy labels is an important and challenging task. However, most existing approaches focus on the corrupted labels and ignore the importance of inherent data structure. To bridge the gap between noisy labels and data, inspired by the concept of potential energy in physics, we propose a novel Potential Energy based Mixture Model (PEMM) for noise-labels learning. We innovate a distance-based classifier with the potential energy regularization on its class centers. Embedding our proposed classifier with existing deep learning backbones, we can have robust networks with better feature representations. They can preserve intrinsic structures from the data, resulting in a superior noisy tolerance. We conducted extensive experiments to analyze the efficiency of our proposed model on several real-world datasets. Quantitative results show that it can achieve state-of-the-art performance.

Submitted 2 May, 2024; originally announced May 2024.

Comments: 36th Conference on Neural Information Processing Systems (NeurIPS 2022)

arXiv:2405.01175 [pdf, other]

Uncertainty-aware self-training with expectation maximization basis transformation

Authors: Zijia Wang, Wenbin Yang, Zhisong Liu, Zhen Jia

Abstract: Self-training is a powerful approach to deep learning. The key process is to find a pseudo-label for modeling. However, previous self-training algorithms suffer from the over-confidence issue brought by the hard labels, even some confidence-related regularizers cannot comprehensively catch the uncertainty. Therefore, we propose a new self-training framework to combine uncertainty information of both model and dataset. Specifically, we propose to use Expectation-Maximization (EM) to smooth the labels and comprehensively estimate the uncertainty information. We further design a basis extraction network to estimate the initial basis from the dataset. The obtained basis with uncertainty can be filtered based on uncertainty information. It can then be transformed into the real hard label to iteratively update the model and basis in the retraining process. Experiments on image classification and semantic segmentation show the advantages of our methods among confidence-aware self-training algorithms with 1-3 percentage improvement on different datasets.

Submitted 2 May, 2024; originally announced May 2024.

Journal ref: 36th Conference on Neural Information Processing Systems (NeurIPS 2022)

arXiv:2405.00822 [pdf, ps, other]

Kernel-based Learning for Safe Control of Discrete-Time Unknown Systems under Incomplete Observations

Authors: Zewen Yang, Xiaobing Dai, Weijie Yang, Bahar İlgen, Aleksandar Anžel, Georges Hattab

Abstract: Safe control for dynamical systems is critical, yet the presence of unknown dynamics poses significant challenges. In this paper, we present a learning-based control approach for tracking control of a class of high-order systems, operating under the constraint of partially observable states. The uncertainties inherent within the systems are modeled by kernel ridge regression, leveraging the proposed strategic data acquisition approach with limited state measurements. To achieve accurate trajectory tracking, a state observer that seamlessly integrates with the control law is devised. The analysis of the guaranteed control performance is conducted using Lyapunov theory due to the deterministic prediction error bound of kernel ridge regression, ensuring the adaptability of the approach in safety-critical scenarios. To demonstrate the effectiveness of our proposed approach, numerical simulations are performed, underscoring its contributions to the advancement of control strategies.

Submitted 1 May, 2024; originally announced May 2024.

arXiv:2405.00816 [pdf]

Sifting out communities in large sparse networks

Authors: Sharlee Climer, Kenneth Smith Jr, Wei Yang, Lisa de las Fuentes, Victor G. Dávila-Román, C. Charles Gu

Abstract: Research data sets are growing to unprecedented sizes and network modeling is commonly used to extract complex relationships in diverse domains, such as genetic interactions involved in disease, logistics, and social communities. As the number of nodes increases in a network, an increasing sparsity of edges is a practical limitation due to memory restrictions. Moreover, many of these sparse networks exhibit very large numbers of nodes with no adjacent edges, as well as disjoint components of nodes with no edges connecting them. A prevalent aim in network modeling is the identification of clusters, or communities, of nodes that are highly interrelated. Several definitions of strong community structure have been introduced to facilitate this task, each with inherent assumptions and biases. We introduce an intuitive objective function for quantifying the quality of clustering results in large sparse networks. We utilize a two-step method for identifying communities which is especially well-suited for this domain as the first step efficiently divides the network into the disjoint components, while the second step optimizes clustering of the produced components based on the new objective. Using simulated networks, optimization based on the new objective function consistently yields significantly higher accuracy than those based on the modularity function, with the widest gaps appearing for the noisiest networks. Additionally, applications to benchmark problems illustrate the intuitive correctness of our approach. Finally, the practicality of our approach is demonstrated in real-world data in which we identify complex genetic interactions in large-scale networks comprised of tens of thousands of nodes. Based on these three different types of trials, our results clearly demonstrate the usefulness of our two-step procedure and the accuracy of our simple objective.

Submitted 1 May, 2024; originally announced May 2024.

arXiv:2405.00236 [pdf, other]

STT: Stateful Tracking with Transformers for Autonomous Driving

Authors: Longlong **g, Ruichi Yu, Xu Chen, Zhengli Zhao, Shiwei Sheng, Colin Graber, Qi Chen, Qinru Li, Shangxuan Wu, Han Deng, Sang** Lee, Chris Sweeney, Qiurui He, Wei-Chih Hung, Tong He, Xingyi Zhou, Farshid Moussavi, Zijian Guo, Yin Zhou, Mingxing Tan, Weilong Yang, Congcong Li

Abstract: Tracking objects in three-dimensional space is critical for autonomous driving. To ensure safety while driving, the tracker must be able to reliably track objects across frames and accurately estimate their states such as velocity and acceleration in the present. Existing works frequently focus on the association task while either neglecting the model performance on state estimation or deploying complex heuristics to predict the states. In this paper, we propose STT, a Stateful Tracking model built with Transformers, that can consistently track objects in the scenes while also predicting their states accurately. STT consumes rich appearance, geometry, and motion signals through long term history of detections and is jointly optimized for both data association and state estimation tasks. Since the standard tracking metrics like MOTA and MOTP do not capture the combined performance of the two tasks in the wider spectrum of object states, we extend them with new metrics called S-MOTA and MOTPS that address this limitation. STT achieves competitive real-time performance on the Waymo Open Dataset.

Submitted 30 April, 2024; originally announced May 2024.

Comments: ICRA 2024

arXiv:2404.19381 [pdf, other]

Low-overhead General-purpose Near-Data Processing in CXL Memory Expanders

Authors: Hyungkyu Ham, Jeongmin Hong, Geonwoo Park, Yunseon Shin, Okkyun Woo, Wonhyuk Yang, **hoon Bae, Eunhyeok Park, Hyo** Sung, Euicheol Lim, Gwangsun Kim

Abstract: To overcome the memory capacity wall of large-scale AI and big data applications, Compute Express Link (CXL) enables cost-efficient memory expansion beyond the local DRAM of processors. While its CXL.mem protocol stack minimizes interconnect latency, CXL memory accesses can still result in significant slowdowns for memory-bound applications. While near-data processing (NDP) in CXL memory can overcome such limitations, prior works propose application-specific HW units that are not suitable for practical CXL memory-based systems that should support various applications. On the other hand, existing CPU or GPU cores are not cost-effective for NDP because they are not optimized for memory-bound applications. In addition, the communication between the host processor and CXL controller for NDP offloading should achieve low latency, but the CXL$.$io (or PCIe) protocol incurs $μ$s-scale latency and is not suitable for fine-grain NDP. To achieve high-performance NDP end-to-end, we propose a low-overhead general-purpose NDP architecture for CXL memory referred to as Memory-Mapped NDP (M$^2$NDP), which comprises memory-mapped functions (M$^2$func) and memory-mapped $μ$threading (M$^2μ$thr). The M$^2$func is a CXL.mem-compatible low-overhead communication mechanism between the host processor and NDP controller in the CXL memory. The M$^2μ$thr enables low-cost, general-purpose NDP unit design by introducing lightweight $μ$threads that support highly concurrent execution of NDP kernels with minimal resource wastage. By combining them, our M$^2$NDP achieves significant speedups for various applications, including in-memory OLAP, key-value store, large language model, recommendation model, and graph analytics by up to 128$\times$ (11.5$\times$ overall) and reduces energy by up to 87.9\% (80.1\% overall) compared to a baseline CPU or GPU host with passive CXL memory.

Submitted 30 April, 2024; originally announced April 2024.

arXiv:2404.19240 [pdf, ps, other]

Surface energy and elementary excitations of the XYZ spin chain with integrable open boundary fields

Authors: Zhirong Xin, Junpeng Cao, Wen-Li Yang, Yupeng Wang

Abstract: We study the thermodynamic limit of the anisotropic XYZ spin chain with non-diagonal integrable open boundary conditions. Although the $U(1)$-symmetry is broken, by using the new parametrization scheme, we exactly obtain the surface energy and the excitation energy of the system, which resolves the difficulty in the inhomogeneous $T-Q$ relation. With the boundary parameters in the regions making the Hamiltonian Hermitian, we have obtained the distribution patterns of the zero roots of the eigenvalue of the transfer matrix for the ground state and the excited ones. We find that the surface and excitation energies depend on the parity of the site number $N$, due to the long-range Neel order in the bulk. The spontaneous magnetization and easy-axis for all the regions of boundary parameters are studied. We also obtain the physical quantities in the thermodynamic limit of boundary XXZ model by taking the triangular limit.

Submitted 29 April, 2024; originally announced April 2024.

Comments: 44 pages, 12 figures

arXiv:2404.18762 [pdf, ps, other]

Genericity of sublinearly Morse directions in general metric spaces

Authors: Yulan Qing, Wenyuan Yang

Abstract: In this paper, we show that for any proper statistically convex-cocompact action on a proper metric space, the sublinearly Morse boundary has full Patterson-Sullivan measure in the horofunction boundary.

Submitted 29 April, 2024; originally announced April 2024.

Comments: 23 pages, 5 figures

MSC Class: 20F65; 20F67

arXiv:2404.18117 [pdf, ps, other]

A Basis-preserving Algorithm for Computing the Bezout Matrix of Newton Polynomials

Authors: **g Yang, Wei Yang

Abstract: This paper tackles the problem of constructing Bezout matrices for Newton polynomials in a basis-preserving approach that operates directly with the given Newton basis, thus avoiding the need for transformation from Newton basis to monomial basis. This approach significantly reduces the computational cost and also mitigates numerical instability caused by basis transformation. For this purpose, we investigate the internal structure of Bezout matrices in Newton basis and design a basis-preserving algorithm that generates the Bezout matrix in the specified basis used to formulate the input polynomials. Furthermore, we show an application of the proposed algorithm to constructing confederate resultant matrices for Newton polynomials. Experimental results demonstrate that the proposed methods outperform the basis-transformation-based ones.

Submitted 28 April, 2024; originally announced April 2024.

arXiv:2404.17894 [pdf, ps, other]

Unpaired Multi-view Clustering via Reliable View Guidance

Authors: Like Xin, Wanqi Yang, Lei Wang, Ming Yang

Abstract: This paper focuses on unpaired multi-view clustering (UMC), a challenging problem where paired observed samples are unavailable across multiple views. The goal is to perform effective joint clustering using the unpaired observed samples in all views. In incomplete multi-view clustering, existing methods typically rely on sample pairing between views to capture their complementarity. However, that is not applicable in the case of UMC. Hence, we aim to extract the consistent cluster structure across views. In UMC, two challenging issues arise: uncertain cluster structure due to the lack of labels and uncertain pairing relationship due to the absence of paired samples. We assume that the view with a good cluster structure is the reliable view, which acts as a supervisor to guide the clustering of the other views. With the guidance of reliable views, a more certain cluster structure of these views is obtained while achieving alignment between reliable views and other views. Then we propose Reliable view Guidance with one reliable view (RG-UMC) and multiple reliable views (RGs-UMC) for UMC. Specifically, we design alignment modules with one reliable view and multiple reliable views, respectively, to adaptively guide the optimization process. Also, we utilize the compactness module to enhance the relationship of samples within the same cluster. Meanwhile, an orthogonal constraint is applied to the latent representation to obtain discriminative features. Extensive experiments show that both RG-UMC and RGs-UMC outperform the best state-of-the-art method by an average of 24.14\% and 29.42\% in NMI, respectively.

Submitted 27 April, 2024; originally announced April 2024.

arXiv:2404.17617 [pdf, other]

Beyond Traditional Threats: A Persistent Backdoor Attack on Federated Learning

Authors: Tao Liu, Yuhang Zhang, Zhu Feng, Zhiqin Yang, Chen Xu, Dapeng Man, Wu Yang

Abstract: Backdoors on federated learning will be diluted by subsequent benign updates. This is reflected in the significant reduction of attack success rate as iterations increase, ultimately failing. We use a new metric to quantify the degree of this weakened backdoor effect, called attack persistence. Given that research to improve this performance has not been widely noted, we propose a Full Combination Backdoor Attack (FCBA) method. It aggregates more combined trigger information for a more complete backdoor pattern in the global model. The trained backdoored global model is more resilient to benign updates, leading to a higher attack success rate on the test set. We test on three datasets and evaluate with two models across various settings. FCBA's persistence outperforms SOTA federated learning backdoor attacks. On GTSRB, 120 rounds after the attack, our attack success rate rose over 50% from baseline. The core code of our method is available at https://github.com/PhD-TaoLiu/FCBA.

Submitted 26 April, 2024; originally announced April 2024.

Journal ref: Proceedings of the AAAI Conference on Artificial Intelligence. 2024, 38(19): 21359-21367

arXiv:2404.16565 [pdf, other]

PyRadar: Towards Automatically Retrieving and Validating Source Code Repository Information for PyPI Packages

Authors: Kai Gao, Weiwei Xu, Wenhao Yang, Minghui Zhou

Abstract: A package's source code repository records the development history of the package, providing indispensable information for the use and risk monitoring of the package. However, a package release often misses its source code repository due to the separation of the package's development platform from its distribution platform. Existing tools retrieve the release's repository information from its metadata, which suffers from two limitations: the metadata may lack the repository information or contain wrong information. Our analysis shows that existing tools can only retrieve repository information for up to 70.5% of PyPI releases. To address the limitations, this paper proposes PyRadar, a novel framework that utilizes the metadata and source distribution to retrieve and validate the repository information for PyPI releases. We start with an empirical study to compare four existing tools on 4,227,425 PyPI releases and analyze phantom files (files appearing in the release's distribution but not in the release's repository) in 14,375 correct package-repository links and 2,064 incorrect links. Based on the findings, we design PyRadar with three components, i.e., Metadata-based Retriever, Source Code Repository Validator, and Source Code-based Retriever. In particular, the Metadata-based Retriever combines best practices of existing tools and successfully retrieves repository information from the metadata for 72.1% of PyPI releases. The Source Code Repository Validator applies common machine learning algorithms on six crafted features and achieves an AUC of up to 0.995. The Source Code-based Retriever queries World of Code with the SHA-1 hashes of all Python files in the release's source distribution and retrieves repository information for 90.2% of packages in our dataset with an accuracy of 0.970. Both practitioners and researchers can employ PyRadar to better use PyPI packages.

Submitted 25 April, 2024; originally announced April 2024.

Comments: This paper has been accepted at FSE 2024

arXiv:2404.15631 [pdf]

Are There Echo Chambers in the US News Ecosystem? Evidence From Twitter/X

Authors: Wen Yang

Abstract: This study investigates echo chambers in social networks through an analysis of Twitter news accounts. Utilizing bias labels from the AllSides website, we construct a dataset representing six dimensions of news bias. Through manual extraction of follower/following relationships, we analyze interactions among 65 active Twitter news accounts. Despite the relatively small size of the network node data utilized, results reveal distinct clustering patterns indicative of echo chambers, with limited interaction between conflicting ideologies. This study underscores the potential impact of bias on information dissemination and democratic expression. These findings offer valuable insights into the dynamics of echo chambers in contemporary social media environments.

Submitted 23 April, 2024; originally announced April 2024.

Comments: 13 pages, 7 figures, 2 tables

arXiv:2404.15630 [pdf]

Information Cocoons on Social Media: Why and How Should the Government Regulate Algorithms

Authors: Wen Yang

Abstract: Information cocoons are frequently cited in the literature on whether and how social media might lead to ideological segregation and political polarization. From the behavioural and communication perspectives, this paper first examines why algorithm-based social media, as opposed to its traditional counterpart, is more likely to produce information cocoons. We then explore populism and short-termism in voting, bias and noise in decision-making, and prerequisite capital for innovation, demonstrating the importance of information diversity for a sustainable information environment. Finally, this study argues for libertarian paternalism by evaluating the criteria and trade-offs involved in regulating algorithms and proposes to employ nudges to address the core issues while preserving freedom of choice.

Submitted 23 April, 2024; originally announced April 2024.

Comments: 2 tables

arXiv:2404.15542 [pdf]

Twisted MoSe2 Homobilayer Behaving as a Heterobilayer

Authors: Arka Karmakar, Abdullah Al-Mahboob, Natalia Zawadzka, Mateusz Raczyński, Weiguang Yang, Mehdi Arfaoui, Gayatri, Julia Kucharek, Jerzy T. Sadowski, Hyeon Suk Shin, Adam Babiński, Wojciech Pacuski, Tomasz Kazimierczuk, Maciej R Molas

Abstract: Heterostructures (HSs) formed by transition-metal dichalcogenide (TMDC) materials have shown great promise in next-generation optoelectronic and photonic applications. An artificially twisted HS allows us to manipulate the optical and electronic properties. With this work, we introduce the understanding of the complex energy transfer (ET) process governed by the dipolar interaction in a twisted molybdenum diselenide (MoSe2) homobilayer without any charge-blocking interlayer. We fabricated an unconventional homobilayer (i.e., HS) with a large twist angle by combining the chemical vapor deposition (CVD) and mechanical exfoliation (Exf.) techniques to fully exploit the lattice-parameter mismatch and indirect/direct (CVD/Exf.) bandgap nature. This effectively weakens the charge transfer (CT) process and allows the ET process to take over the carrier recombination channels. We utilize a series of optical and electron spectroscopy techniques, complemented by density functional theory calculations, to describe a massive photoluminescence enhancement from the HS area due to an efficient ET process. Our results show that the electronically decoupled MoSe2 homobilayer is coupled by the ET process, mimicking a 'true' heterobilayer nature.

Submitted 7 June, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

Comments: 4 figures

arXiv:2404.14248 [pdf, other]

NTIRE 2024 Challenge on Low Light Image Enhancement: Methods and Results

Authors: Xiaoning Liu, Zongwei Wu, Ao Li, Florin-Alexandru Vasluianu, Yulun Zhang, Shuhang Gu, Le Zhang, Ce Zhu, Radu Timofte, Zhi **, Hongjun Wu, Chenxi Wang, Haitao Ling, Yuanhao Cai, Hao Bian, Yuxin Zheng, **g Lin, Alan Yuille, Ben Shao, ** Guo, Tianli Liu, Mohao Wu, Yixu Feng, Shuo Hou, Haotian Lin , et al. (87 additional authors not shown)

Abstract: This paper reviews the NTIRE 2024 low light image enhancement challenge, highlighting the proposed solutions and results. The aim of this challenge is to discover an effective network design or solution capable of generating brighter, clearer, and visually appealing results when dealing with a variety of conditions, including ultra-high resolution (4K and beyond), non-uniform illumination, backlighting, extreme darkness, and night scenes. A notable total of 428 participants registered for the challenge, with 22 teams ultimately making valid submissions. This paper meticulously evaluates the state-of-the-art advancements in enhancing low-light images, reflecting the significant progress and creativity in this field.

Submitted 22 April, 2024; originally announced April 2024.

Comments: NTIRE 2024 Challenge Report

arXiv:2404.13840 [pdf, other]

Study of $e^+e^-\toωX(3872)$ and $γX(3872)$ from 4.66 to 4.95 GeV

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (634 additional authors not shown)

Abstract: Using data samples with an integrated luminosity of $4.5~\text{fb}^{-1}$ collected by the BESIII detector at center-of-mass energies ranging from 4.66 to 4.95 GeV, we study the processes of $e^+e^-\toωX(3872)$ and $e^+e^-\toγX(3872)$. With the $e^+e^-\toωX(3872)$ process, the branching fraction ratio $R\equiv\frac{\mathcal{B}(X(3872)\toγJ/ψ)}{\mathcal{B}(X(3872)\toπ^+π^- J/ψ)}$ is measured to be $0.38\pm0.20_\text{stat.}\pm0.01_\text{syst.}$ ($R< 0.83$ at 90\% confidence level). In addition, we measure the ratio of the average cross section of $e^+e^-\toωX(3872)$ to $e^+e^-\toωχ_{c1}(ωχ_{c2})$ to be $σ_{ωX(3872)}/σ_{ωχ_{c1}}~(σ_{ωX(3872)}/σ_{ωχ_{c2}})=5.2\pm1.0_\text{stat.}\pm1.9_\text{syst.}~ (5.5\pm1.1_\text{stat.}\pm2.4_\text{syst.})$. Finally, we search for the process of $e^+e^-\toγX(3872)$, and no obvious signal is observed. The upper limit on the ratio of the average cross section of $e^+e^-\toγX(3872)$ to $e^+e^-\toωX(3872)$ is set as $σ_{γX(3872)}/σ_{ωX(3872)}<0.23$ at 90\% confidence level.

Submitted 21 April, 2024; originally announced April 2024.

Comments: 19 pages, 10 figures

arXiv:2404.13619 [pdf, other]

Towards Unified Representation of Multi-Modal Pre-training for 3D Understanding via Differentiable Rendering

Authors: Ben Fei, Yixuan Li, Weidong Yang, Lipeng Ma, Ying He

Abstract: State-of-the-art 3D models, which excel in recognition tasks, typically depend on large-scale datasets and well-defined category sets. Recent advances in multi-modal pre-training have demonstrated potential in learning 3D representations by aligning features from 3D shapes with their 2D RGB or depth counterparts. However, these existing frameworks often rely solely on either RGB or depth images, limiting their effectiveness in harnessing a comprehensive range of multi-modal data for 3D applications. To tackle this challenge, we present DR-Point, a tri-modal pre-training framework that learns a unified representation of RGB images, depth images, and 3D point clouds by pre-training with object triplets garnered from each modality. To address the scarcity of such triplets, DR-Point employs differentiable rendering to obtain various depth images. This approach not only augments the supply of depth images but also enhances the accuracy of reconstructed point clouds, thereby promoting the representative learning of the Transformer backbone. Subsequently, using a limited number of synthetically generated triplets, DR-Point effectively learns a 3D representation space that aligns seamlessly with the RGB-Depth image space. Our extensive experiments demonstrate that DR-Point outperforms existing self-supervised learning methods in a wide range of downstream tasks, including 3D object classification, part segmentation, point cloud completion, semantic segmentation, and detection. Additionally, our ablation studies validate the effectiveness of DR-Point in enhancing point cloud understanding.

Submitted 21 April, 2024; originally announced April 2024.

arXiv:2404.13534 [pdf, other]

Motion-aware Latent Diffusion Models for Video Frame Interpolation

Authors: Zhilin Huang, Yijie Yu, Ling Yang, Chujun Qin, Bing Zheng, Xiawu Zheng, Zikun Zhou, Yaowei Wang, Wenming Yang

Abstract: With the advancement of AIGC, video frame interpolation (VFI) has become a crucial component in existing video generation frameworks, attracting widespread research interest. For the VFI task, the motion estimation between neighboring frames plays a crucial role in avoiding motion ambiguity. However, existing VFI methods always struggle to accurately predict the motion information between consecutive frames, and this imprecise estimation leads to blurred and visually incoherent interpolated frames. In this paper, we propose a novel diffusion framework, motion-aware latent diffusion models (MADiff), which is specifically designed for the VFI task. By incorporating motion priors between the conditional neighboring frames with the target interpolated frame predicted throughout the diffusion sampling procedure, MADiff progressively refines the intermediate outcomes, culminating in generating both visually smooth and realistic results. Extensive experiments conducted on benchmark datasets demonstrate that our method achieves state-of-the-art performance, significantly outperforming existing approaches, especially under challenging scenarios involving dynamic textures with complex motion.

Submitted 4 June, 2024; v1 submitted 21 April, 2024; originally announced April 2024.

Comments: 17 pages, 4 figures. arXiv admin note: substantial text overlap with arXiv:2303.09508 by other authors

arXiv:2404.12777 [pdf, other]

EfficientGS: Streamlining Gaussian Splatting for Large-Scale High-Resolution Scene Representation

Authors: Wenkai Liu, Tao Guan, Bin Zhu, Lili Ju, Zikai Song, Dan Li, Yuesong Wang, Wei Yang

Abstract: In the domain of 3D scene representation, 3D Gaussian Splatting (3DGS) has emerged as a pivotal technology. However, its application to large-scale, high-resolution scenes (exceeding 4k$\times$4k pixels) is hindered by the excessive computational requirements for managing a large number of Gaussians. Addressing this, we introduce 'EfficientGS', an advanced approach that optimizes 3DGS for high-resolution, large-scale scenes. We analyze the densification process in 3DGS and identify areas of Gaussian over-proliferation. We propose a selective strategy, limiting Gaussian increase to key primitives, thereby enhancing the representational efficiency. Additionally, we develop a pruning mechanism to remove redundant Gaussians, those that are merely auxiliary to adjacent ones. For further enhancement, we integrate a sparse order increment for Spherical Harmonics (SH), designed to alleviate storage constraints and reduce training overhead. Our empirical evaluations, conducted on a range of datasets including extensive 4K+ aerial images, demonstrate that 'EfficientGS' not only expedites training and rendering times but also achieves this with a model size approximately tenfold smaller than conventional 3DGS while maintaining high rendering fidelity.

Submitted 19 April, 2024; originally announced April 2024.

arXiv:2404.11112 [pdf, other]

An Adaptive Regularized Proximal Newton-Type Methods for Composite Optimization over the Stiefel Manifold

Authors: Qinsi Wang, Wei Hong Yang

Abstract: Recently, the proximal Newton-type method and its variants have been generalized to solve composite optimization problems over the Stiefel manifold whose objective function is the summation of a smooth function and a nonsmooth function. In this paper, we propose an adaptive quadratically regularized proximal quasi-Newton method, named ARPQN, to solve this class of problems. Under some mild assumptions, the global convergence, the local linear convergence rate and the iteration complexity of ARPQN are established. Numerical experiments and comparisons with other state-of-the-art methods indicate that ARPQN is very promising. We also propose an adaptive quadratically regularized proximal Newton method, named ARPN. It is shown that the ARPN method has a local superlinear convergence rate under certain reasonable assumptions, which demonstrates attractive convergence properties of regularized proximal Newton methods.

Submitted 17 April, 2024; originally announced April 2024.

Comments: 38 pages, 6 figures

arXiv:2404.10249 [pdf]

Picturing the Gap Between the Performance and US-DOE's Hydrogen Storage Target: A Data-Driven Model for MgH2 Dehydrogenation

Authors: Chaoqun Li, Weijie Yang, Hao Liu, Xinyuan Liu, Xiu**g Xing, Zhengyang Gao, Shuai Dong, Hao Li

Abstract: Developing solid-state hydrogen storage materials is as pressing as ever, which requires a comprehensive understanding of the dehydrogenation chemistry of a solid-state hydride. Transition state search and kinetics calculations are essential to understanding and designing high-performance solid-state hydrogen storage materials by filling in the knowledge gap that current experimental techniques cannot measure. However, the ab initio analysis of these processes is computationally expensive and time-consuming. Searching for descriptors to accurately predict the energy barrier is urgently needed, to accelerate the prediction of hydrogen storage material properties and identify the opportunities and challenges in this field. Herein, we develop a data-driven model to describe and predict the dehydrogenation barriers of a typical solid-state hydrogen storage material, magnesium hydride (MgH2), based on the combination of the crystal Hamilton population orbital of Mg-H bond and the distance between atomic hydrogen. By deriving the distance energy ratio, this model elucidates the key chemistry of the reaction kinetics. All the parameters in this model can be directly calculated with significantly less computational cost than conventional transition state search, so that the dehydrogenation performance of hydrogen storage materials can be predicted efficiently. Finally, we found that this model leads to excellent agreement with typical experimental measurements reported to date and provides clear design guidelines on how to propel the performance of MgH2 closer to the target set by the United States Department of Energy (US-DOE).

Submitted 29 April, 2024; v1 submitted 15 April, 2024; originally announced April 2024.

arXiv:2404.10087 [pdf, ps, other]

cuFastTuckerPlus: A Stochastic Parallel Sparse FastTucker Decomposition Using GPU Tensor Cores

Authors: Zixuan Li, Mingxing Duan, Huizhang Luo, Wangdong Yang, Kenli Li, Keqin Li

Abstract: Sparse tensors are prevalent in real-world applications, often characterized by their large-scale, high-order, and high-dimensional nature. Directly handling raw tensors is impractical due to the significant memory and computational overhead involved. The current mainstream approach involves compressing or decomposing the original tensor. One popular tensor decomposition algorithm is the Tucker decomposition. However, existing state-of-the-art algorithms for large-scale Tucker decomposition typically relax the original optimization problem into multiple convex optimization problems to ensure polynomial convergence. Unfortunately, these algorithms tend to converge slowly. In contrast, tensor decomposition exhibits a simple optimization landscape, making local search algorithms capable of converging to a global (approximate) optimum much faster. In this paper, we propose the FastTuckerPlus algorithm, which decomposes the original optimization problem into two non-convex optimization problems and solves them alternately using the Stochastic Gradient Descent method. Furthermore, we introduce cuFastTuckerPlus, a fine-grained parallel algorithm designed for GPU platforms, leveraging the performance of tensor cores. This algorithm minimizes memory access overhead and computational costs, surpassing the state-of-the-art algorithms. Our experimental results demonstrate that our method achieves a speedup of $3X$ to $5X$ compared to state-of-the-art algorithms.

Submitted 23 May, 2024; v1 submitted 15 April, 2024; originally announced April 2024.

arXiv:2404.09219 [pdf, ps, other]

Observation of $D \to a_{0}(980)π$ in the decays $D^{0} \rightarrow π^{+}π^{-}η$ and $D^{+} \rightarrow π^{+}π^{0}η$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (634 additional authors not shown)

Abstract: We report the first amplitude analysis of the decays $D^{0} \to π^{+} π^{-} η$ and $D^{+} \rightarrow π^{+}π^{0}η$ using a data sample taken with the BESIII detector at the center-of-mass energy of 3.773 GeV, corresponding to an integrated luminosity of 7.9 ${\rm fb}^{-1}$. The contribution from the process $D^{0(+)} \to a_{0}(980)^{+} π^{-(0)}$ is significantly larger than the $D^{0(+)} \to a_{0}(980)^{-(0)} π^{+}$ contribution. The ratios $\mathcal{B}(D^{0} \rightarrow a_{0}(980)^{+}π^{-})/\mathcal{B}(D^{0} \rightarrow a_{0}(980)^{-}π^{+})$ and $\mathcal{B}(D^{+} \rightarrow a_{0}(980)^{+}π^{0})/\mathcal{B}(D^{+} \rightarrow a_{0}(980)^{0}π^{+})$ are measured to be $7.5^{+2.5}_{-0.8\,\mathrm{stat.}}\pm1.7_{\mathrm{syst.}}$ and $2.6\pm0.6_{\mathrm{stat.}}\pm0.3_{\mathrm{syst.}}$, respectively. The measured $D^{0}$ ratio disagrees with the theoretical predictions by orders of magnitude, thus implying a substantial contribution from final-state interactions.

Submitted 14 April, 2024; originally announced April 2024.

arXiv:2404.09198 [pdf, ps, other]

Unsourced Random Access in MIMO Quasi-Static Rayleigh Fading Channels with Finite Blocklength

Authors: Junyuan Gao, Yongpeng Wu, Giuseppe Caire, Wei Yang, Wenjun Zhang

Abstract: This paper explores the fundamental limits of unsourced random access (URA) with a random and unknown number ${\rm{K}}_a$ of active users in MIMO quasi-static Rayleigh fading channels. First, we derive an upper bound on the probability of incorrectly estimating the number of active users. We prove that it decays exponentially with the number of receive antennas and eventually vanishes, whereas it reaches a plateau as the power and blocklength increase. Then, we derive non-asymptotic achievability and converse bounds on the minimum energy-per-bit required by each active user to reliably transmit $J$ bits with blocklength $n$. Numerical results verify the tightness of our bounds, suggesting that they provide benchmarks to evaluate existing schemes. The extra required energy-per-bit due to the uncertainty of the number of active users decreases as $\mathbb{E}[{\rm{K}}_a]$ increases. Compared to random access with individual codebooks, the URA paradigm achieves higher spectral and energy efficiency. Moreover, using codewords distributed on a sphere is shown to outperform the Gaussian random coding scheme in the non-asymptotic regime.

Submitted 14 April, 2024; originally announced April 2024.

Comments: Accepted by ISIT 2024

arXiv:2404.09125 [pdf]

Achieving High Yield of Perpendicular SOT-MTJ Manufactured on 300 mm Wafers

Authors: Wenlong Yang, Zhenghui Ji, Yang Gao, Kaiyuan Zhou, Qijun Guo, Dinggui Zeng, Shasha Wang, Ming Wang, Lijie Shen, Guilin Chen, Yihui Sun, Enlong Liu, Shikun He

Abstract: The large-scale fabrication of three-terminal magnetic tunnel junctions (MTJs) with high yield is becoming increasingly crucial, especially with the growing interest in spin-orbit torque (SOT) magnetic random access memory (MRAM) as the next generation of MRAM technology. To achieve high yield and consistent device performance in MTJs with perpendicular magnetic anisotropy, an integration flow has been developed that incorporates a special MTJ etching technique and other CMOS-compatible processes on a 300 mm wafer manufacturing platform. Systematic studies have been conducted on device performance and statistical uniformity, encompassing magnetic properties, electrical switching behavior, and reliability. Achievements include a switching current of 680 μA at 2 ns, a TMR as high as 119%, ultra-high endurance (over $10^{12}$ cycles), and excellent uniformity in the fabricated SOT-MTJ devices, with a yield of up to 99.6%. The proposed integration process, featuring high yield, is anticipated to streamline the mass production of SOT-MRAM.

Submitted 13 April, 2024; originally announced April 2024.

Comments: 8 pages, 5 figures

ACM Class: J.2.6

arXiv:2404.08449 [pdf, other]

OccGaussian: 3D Gaussian Splatting for Occluded Human Rendering

Authors: Jingrui Ye, Zongkai Zhang, Yujiao Jiang, Qingmin Liao, Wenming Yang, Zongqing Lu

Abstract: Rendering dynamic 3D humans from monocular videos is crucial for various applications such as virtual reality and digital entertainment. Most methods assume that people are in an unobstructed scene, whereas in real-life scenarios various objects may occlude body parts. A previous method uses NeRF for surface rendering to recover the occluded areas, but it requires more than one day to train and several seconds to render, failing to meet the requirements of real-time interactive applications. To address these issues, we propose OccGaussian, based on 3D Gaussian Splatting, which can be trained within 6 minutes and produces high-quality human renderings at up to 160 FPS with occluded input. OccGaussian initializes 3D Gaussian distributions in the canonical space, and we perform an occlusion feature query at occluded regions, where the aggregated pixel-aligned feature is extracted to compensate for the missing information. We then use a Gaussian Feature MLP to further process the feature, along with occlusion-aware loss functions, to better perceive the occluded area. Extensive experiments on both simulated and real-world occlusions demonstrate that our method achieves comparable or even superior performance compared to the state-of-the-art method. We also improve training and inference speeds by 250x and 800x, respectively. Our code will be available for research purposes.

Submitted 14 April, 2024; v1 submitted 12 April, 2024; originally announced April 2024.

arXiv:2404.08145 [pdf]

Polar vortex hidden in twisted bilayers of paraelectric SrTiO3

Authors: Haozhi Sha, Yixuan Zhang, Yunpeng Ma, Wei Li, Wenfeng Yang, Jizhe Cui, Qian Li, Houbing Huang, Rong Yu

Abstract: Polar topologies, such as vortices and skyrmions, have attracted significant interest due to their unique physical properties and promising applications in high-density memory devices. Currently, most polar vortices are observed in heterostructures containing ferroelectric materials and constrained by substrates. In this study, we unravel arrays of polar vortices formed in twisted freestanding bilayers composed of SrTiO3, a quantum-paraelectric material. Depth-resolved structures of the bilayers are measured with deep-sub-angstrom resolution and one picometer accuracy using multislice ptychography, enabling identification of the three-dimensional variations of polarization topology. Our findings reveal the evolution of the polar vortices in the twisted overlapping layers, demonstrating a reversal of the rotation manner in the depth direction. Twisted freestanding bilayers provide a unique platform for exploration and modulation of novel polar topologies.

Submitted 11 April, 2024; originally announced April 2024.

arXiv:2404.08027 [pdf, other]

SurvMamba: State Space Model with Multi-grained Multi-modal Interaction for Survival Prediction

Authors: Ying Chen, Jiajing Xie, Yuxiang Lin, Yuhang Song, Wenxian Yang, Rongshan Yu

Abstract: Multi-modal learning that combines pathological images with genomic data has significantly enhanced the accuracy of survival prediction. Nevertheless, existing methods have not fully utilized the inherent hierarchical structure within both whole slide images (WSIs) and transcriptomic data, from which better intra-modal representations and inter-modal integration could be derived. Moreover, many existing studies attempt to improve multi-modal representations through attention mechanisms, which inevitably lead to high complexity when processing high-dimensional WSIs and transcriptomic data. Recently, a structured state space model named Mamba emerged as a promising approach for its superior performance in modeling long sequences with low complexity. In this study, we propose Mamba with multi-grained multi-modal interaction (SurvMamba) for survival prediction. SurvMamba is implemented with a Hierarchical Interaction Mamba (HIM) module that facilitates efficient intra-modal interactions at different granularities, thereby capturing more detailed local features as well as rich global representations. In addition, an Interaction Fusion Mamba (IFM) module is used for cascaded inter-modal interactive fusion, yielding more comprehensive features for survival prediction. Comprehensive evaluations on five TCGA datasets demonstrate that SurvMamba outperforms other existing methods in terms of performance and computational cost.

Submitted 11 April, 2024; originally announced April 2024.

arXiv:2404.07486 [pdf, other]

Extremal triangle-free graphs with chromatic number at least four

Authors: Sijie Ren, Jian Wang, Shipeng Wang, Weihua Yang

Abstract: Let $G$ be an $n$-vertex triangle-free graph. The celebrated Mantel's theorem states that $e(G)\leq \lfloor\frac{n^2}{4}\rfloor$. In 1962, Erdős (together with Gallai), and independently Andrásfai, proved that if $G$ is non-bipartite then $e(G)\leq \lfloor\frac{(n-1)^2}{4}\rfloor+1$. In this paper, we extend this result and show that if $G$ has chromatic number at least four and $n\geq 150$, then $e(G)\leq \lfloor\frac{(n-3)^2}{4}\rfloor+5$. The blow-up of the Grötzsch graph shows that this bound is best possible.

Submitted 11 April, 2024; originally announced April 2024.

Comments: 14 pages, 4 figures

arXiv:2404.07436 [pdf, other]

Measurement of $e^{+}e^{-}\to ωη^{\prime}$ cross sections at $\sqrt{s}=$ 2.000 to 3.080 GeV

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann , et al. (599 additional authors not shown)

Abstract: The Born cross sections for the process $e^{+}e^{-}\to ωη^{\prime}$ are measured at 22 center-of-mass energies from 2.000 to 3.080 GeV using data collected with the BESIII detector at the BEPCII collider. A resonant structure is observed with a statistical significance of 9.6$σ$. A Breit-Wigner fit determines its mass to be $M_R=(2153\pm30\pm31)~{\rm{MeV}}/c^{2}$ and its width to be $Γ_{R}=(167\pm77\pm7)~\rm{MeV}$, where the first uncertainties are statistical and the second are systematic.

Submitted 10 April, 2024; originally announced April 2024.

arXiv:2404.07106 [pdf, other]

3DMambaComplete: Exploring Structured State Space Model for Point Cloud Completion

Authors: Yixuan Li, Weidong Yang, Ben Fei

Abstract: Point cloud completion aims to generate a complete and high-fidelity point cloud from an initially incomplete and low-quality input. A prevalent strategy involves leveraging Transformer-based models to encode global features and facilitate the reconstruction process. However, the adoption of pooling operations to obtain global feature representations often results in the loss of local details within the point cloud. Moreover, the attention mechanism inherent in Transformers introduces additional computational complexity, rendering it challenging to handle long sequences effectively. To address these issues, we propose 3DMambaComplete, a point cloud completion network built on the novel Mamba framework. It comprises three modules: HyperPoint Generation encodes point cloud features using Mamba's selection mechanism and predicts a set of Hyperpoints. A specific offset is estimated, and the down-sampled points become HyperPoints. The HyperPoint Spread module disperses these HyperPoints across different spatial locations to avoid concentration. Finally, a deformation method transforms the 2D mesh representation of HyperPoints into a fine-grained 3D structure for point cloud reconstruction. Extensive experiments conducted on various established benchmarks demonstrate that 3DMambaComplete surpasses state-of-the-art point cloud completion methods, as confirmed by qualitative and quantitative analyses.

Submitted 10 April, 2024; originally announced April 2024.

Comments: 10 pages, 8 figures, 7 tables

arXiv:2404.06997 [pdf, other]

Agent-driven Generative Semantic Communication for Remote Surveillance

Authors: Wanting Yang, Zehui Xiong, Yanli Yuan, Wenchao Jiang, Tony Q. S. Quek, Merouane Debbah

Abstract: In the era of 6G, featuring compelling visions of intelligent transportation systems and digital twins, remote surveillance is poised to become a ubiquitous practice. The substantial data volume and frequent updates present challenges in wireless networks. To address this, we propose a novel agent-driven generative semantic communication (A-GSC) framework based on reinforcement learning. In contrast to the existing research on semantic communication (SemCom), which mainly focuses on semantic compression or semantic sampling, we seamlessly cascade both together by jointly considering the intrinsic attributes of source information and the contextual information regarding the task. Notably, the introduction of generative artificial intelligence (GAI) enables the independent design of semantic encoders and decoders. In this work, we develop an agent-assisted semantic encoder leveraging the knowledge-based soft actor-critic algorithm, which can track the semantic changes, channel condition, and sampling intervals, so as to perform adaptive semantic sampling. Accordingly, we design a semantic decoder with both predictive and generative capabilities, which consists of two tailored modules. Moreover, the effectiveness of the designed models has been verified on a dataset generated from CDNet2014, and the performance gain of the overall A-GSC framework in both energy saving and reconstruction accuracy has been demonstrated.

Submitted 10 April, 2024; originally announced April 2024.

Comments: Under review with IEEE Transactions on Wireless Communications

arXiv:2404.06929 [pdf, other]

Exact solution of a two-parameter extended Bariev model

Authors: Mingchen Zheng, Xin Zhang, Junpeng Cao, Wen-li Yang, Yupeng Wang

Abstract: An exactly solvable strongly correlated electron model with two independent parameters is constructed in the framework of the quantum inverse scattering method, which can be seen as a generalization of the Bariev model. Through the Bethe ansatz method, a set of Bethe ansatz equations is derived. In the thermodynamic limit, to study the ground state of the model, we obtain the integral equations for the density of Bethe roots. Numerical validation is carried out to confirm the accuracy of our analytic results.

Submitted 4 July, 2024; v1 submitted 10 April, 2024; originally announced April 2024.

arXiv:2404.06765 [pdf, other]

Harnessing the Power of AI-Generated Content for Semantic Communication

Authors: Yiru Wang, Wanting Yang, Zehui Xiong, Yuping Zhao, Tony Q. S. Quek, Zhu Han

Abstract: Semantic Communication (SemCom) is envisaged as the next-generation paradigm to address challenges stemming from the conflicts between the increasing volume of transmission data and the scarcity of spectrum resources. However, existing SemCom systems face drawbacks, such as low explainability, modality rigidity, and inadequate reconstruction functionality. Recognizing the transformative capabilities of AI-generated content (AIGC) technologies in content generation, this paper explores a pioneering approach by integrating them into SemCom to address the aforementioned challenges. We employ a three-layer model to illustrate the proposed AIGC-assisted SemCom (AIGC-SCM) architecture, emphasizing its clear deviation from existing SemCom. Grounded in this model, we investigate various AIGC technologies with the potential to augment SemCom's performance. In alignment with SemCom's goal of conveying semantic meanings, we also introduce new evaluation methods for our AIGC-SCM system. Subsequently, we explore communication scenarios where our proposed AIGC-SCM can realize its potential. For practical implementation, we construct a detailed integration workflow and conduct a case study in a virtual reality image transmission scenario. The results demonstrate our ability to maintain a high degree of alignment between the reconstructed content and the original source information, while substantially minimizing the data volume required for transmission. These findings pave the way for further enhancements in communication efficiency and the improvement of Quality of Service. Finally, we present future directions for AIGC-SCM studies.

Submitted 10 April, 2024; originally announced April 2024.

arXiv:2404.06718 [pdf, other]

Measurement of the Born cross section for $e^{+}e^{-}\to ηh_c $ at center-of-mass energies between 4.1 and 4.6\,GeV

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (634 additional authors not shown)

Abstract: We measure the Born cross section for the reaction $e^{+}e^{-} \rightarrow ηh_c$ from $\sqrt{s} = 4.129$ to $4.600$~GeV using data sets collected by the BESIII detector running at the BEPCII collider. A resonant structure in the cross section line shape near 4.200~GeV is observed with a statistical significance of 7$σ$. The parameters of this resonance are measured to be \MeasMass\ and \MeasWidth, where the first uncertainties are statistical and the second systematic.

Submitted 10 April, 2024; originally announced April 2024.

arXiv:2404.06182 [pdf, other]

Streamlined Transmission: A Semantic-Aware XR Deployment Framework Enhanced by Generative AI

Authors: Wanting Yang, Zehui Xiong, Tony Q. S. Quek, Xuemin Shen

Abstract: In the era of 6G, featuring compelling visions of digital twins and metaverses, Extended Reality (XR) has emerged as a vital conduit connecting the digital and physical realms, garnering widespread interest. Ensuring a fully immersive wireless XR experience stands as a paramount technical necessity, demanding the liberation of XR from the confines of wired connections. In this paper, we first introduce the technologies applied in the wireless XR domain, delve into their benefits and limitations, and highlight the ongoing challenges. We then propose a novel deployment framework for a broad XR pipeline, termed "GeSa-XRF", inspired by the core philosophy of Semantic Communication (SemCom) which shifts the concern from "how" to transmit to "what" to transmit. Particularly, the framework comprises three stages: data collection, data analysis, and data delivery. In each stage, we integrate semantic awareness to achieve streamlined transmission and employ Generative Artificial Intelligence (GAI) to achieve collaborative refinements. For the data collection of multi-modal data with differentiated data volumes and heterogeneous latency requirements, we propose a novel SemCom paradigm based on multi-modal fusion and separation and a GAI-based robust superposition scheme. To perform a comprehensive data analysis, we employ multi-task learning to perform the prediction of field of view and personalized attention and discuss the possible preprocessing approaches assisted by GAI. Lastly, for the data delivery stage, we present a semantic-aware multicast-based delivery strategy aimed at reducing pixel level redundant transmissions and introduce the GAI collaborative refinement approach. The performance gain of the proposed GeSa-XRF is preliminarily demonstrated through a case study.

Submitted 9 April, 2024; originally announced April 2024.

Comments: Under review with IEEE Network

arXiv:2404.06022

Band-Attention Modulated RetNet for Face Forgery Detection

Authors: Zhida Zhang, Jie Cao, Wenkui Yang, Qihang Fan, Kai Zhou, Ran He

Abstract: Transformer networks are extensively utilized in face forgery detection due to their scalability across large datasets. Despite their success, transformers face challenges in balancing the capture of global context, which is crucial for unveiling forgery clues, with computational complexity. To mitigate this issue, we introduce Band-Attention modulated RetNet (BAR-Net), a lightweight network designed to efficiently process extensive visual contexts while avoiding catastrophic forgetting. Our approach empowers the target token to perceive global information by assigning differential attention levels to tokens at varying distances. We implement self-attention along both spatial axes, thereby maintaining spatial priors and easing the computational burden. Moreover, we present the adaptive frequency Band-Attention Modulation mechanism, which treats the entire Discrete Cosine Transform spectrogram as a series of frequency bands with learnable weights. Together, BAR-Net achieves favorable performance on several face forgery datasets, outperforming current state-of-the-art methods.

Submitted 1 July, 2024; v1 submitted 9 April, 2024; originally announced April 2024.

Comments: The essay is poorly expressed in writing and will be re-optimised

arXiv:2404.05973 [pdf, ps, other]

Search for the Rare Decays $D_s^+\to h^+(h^{0})e^+e^-$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (618 additional authors not shown)

Abstract: Using 7.33~fb$^{-1}$ of $e^{+}e^{-}$ collision data collected by the BESIII detector at center-of-mass energies in the range of $\sqrt{s}=4.128 - 4.226$~GeV, we search for the rare decays $D_{s}^+\to h^+(h^{0})e^{+}e^{-}$, where $h$ represents a kaon or pion. By requiring the $e^{+}e^{-}$ invariant mass to be consistent with a $φ(1020)$, $0.98<M(e^{+}e^{-})<1.04$ ~GeV/$c^2$, the decay $D_s^+\toπ^+φ,φ\to e^{+}e^{-}$ is observed with a statistical significance of 7.8$σ$, and evidence for the decay $D_s^+\toρ^+φ,φ\to e^{+}e^{-}$ is found for the first time with a statistical significance of 4.4$σ$. The decay branching fractions are measured to be $\mathcal{B}(D_s^+\toπ^+φ, φ\to e^{+}e^{-} )=(1.17^{+0.23}_{-0.21}\pm0.03)\times 10^{-5}$, and $\mathcal{B}(D_s^+\toρ^+φ, φ\to e^{+}e^{-} )=(2.44^{+0.67}_{-0.62}\pm 0.16)\times 10^{-5}$, where the first uncertainties are statistical and the second systematic. No significant signal for the three four-body decays of $D_{s}^{+}\to π^{+}π^{0}e^{+}e^{-},\ D_{s}^{+}\to K^{+}π^{0}e^{+}e^{-}$, and $D_{s}^{+}\to K_{S}^{0}π^{+}e^{+}e^{-}$ is observed. For $D_{s}^{+}\to π^{+}π^{0}e^{+}e^{-}$, the $φ$ mass region is vetoed to minimize the long-distance effects. The 90$\%$ confidence level upper limits set on the branching fractions of these decays are in the range of $(7.0-8.1)\times 10^{-5}$.

Submitted 8 April, 2024; originally announced April 2024.

Comments: 10 pages, 2 figures, 1 table

Showing 101–150 of 2,367 results for author: Yang, W