Search | arXiv e-print repository

arXiv:2407.01347 [pdf, ps, other]

Bulk and fracture process zone contribution to the rate-dependent adhesion amplification in viscoelastic broad-band materials

Authors: Ali Maghami, Qingao Wang, Michele Tricarico, Michele Ciavarella, Qunyang Li, Antonio Papangelo

Abstract: The contact between a rigid Hertzian indenter and an adhesive broad-band viscoelastic substrate is considered. The material behaviour is described by a modified power law model, which is characterized by only four parameters: the glassy and rubbery elastic moduli, a characteristic exponent n and a timescale $τ_0$. The maximum adherence force that can be reached while unloading the rigid indenter from a relaxed viscoelastic half-space is studied by means of a numerical implementation based on the boundary element method, as a function of the unloading velocity, preload and by varying the broadness of the viscoelastic material spectrum. Through a comprehensive numerical analysis we have determined the minimum contact radius that is needed to achieve the maximum amplification of the pull-off force at a specified unloading rate and for different material exponents n. The numerical results are then compared with the prediction of Persson and Brener viscoelastic crack propagation theory, providing excellent agreement. However, comparison against experimental tests for a glass lens indenting a PDMS substrate shows that the data can be fitted with the linear theory only up to an unloading rate of about $100\ \mathrm{μm/s}$, showing that the fracture process zone rate-dependent contribution to the energy enhancement is of the same order as the bulk dissipation contribution. Hence, the limitations of the current numerical and theoretical models for viscoelastic adhesion are discussed in light of the most recent literature results.

Submitted 1 July, 2024; originally announced July 2024.

arXiv:2407.00958 [pdf, other]

Universal Approximation Theory: The basic theory for large language models

Authors: Wei Wang, Qing Li

Abstract: Language models have emerged as a critical area of focus in artificial intelligence, particularly with the introduction of groundbreaking innovations like ChatGPT. Large-scale Transformer networks have quickly become the leading approach for advancing natural language processing algorithms. Built on the Transformer architecture, these models enable interactions that closely mimic human communication and, equipped with extensive knowledge, can even assist in guiding human tasks. Despite their impressive capabilities and growing complexity, a key question remains: the theoretical foundations of large language models (LLMs). What makes the Transformer so effective for powering intelligent language applications, such as translation and coding? What underlies LLMs' ability for In-Context Learning (ICL)? How does the LoRA scheme enhance the fine-tuning of LLMs? And what supports the practicality of pruning LLMs? To address these critical questions and explore the technological strategies within LLMs, we leverage the Universal Approximation Theory (UAT) to offer a theoretical backdrop, shedding light on the mechanisms that underpin these advancements.

Submitted 1 July, 2024; originally announced July 2024.

arXiv:2407.00918 [pdf, other]

Robust and Reliable Early-Stage Website Fingerprinting Attacks via Spatial-Temporal Distribution Analysis

Authors: Xinhao Deng, Qi Li, Ke Xu

Abstract: Website Fingerprinting (WF) attacks identify the websites visited by users by performing traffic analysis, compromising user privacy. Particularly, DL-based WF attacks demonstrate impressive attack performance. However, the effectiveness of DL-based WF attacks relies on the collected complete and pure traffic during the page loading, which impacts the practicality of these attacks. The WF performance is rather low under dynamic network conditions and various WF defenses, particularly when the analyzed traffic is only a small part of the complete traffic. In this paper, we propose Holmes, a robust and reliable early-stage WF attack. Holmes utilizes temporal and spatial distribution analysis of website traffic to effectively identify websites in the early stages of page loading. Specifically, Holmes develops adaptive data augmentation based on the temporal distribution of website traffic and utilizes a supervised contrastive learning method to extract the correlations between the early-stage traffic and the pre-collected complete traffic. Holmes accurately identifies traffic in the early stages of page loading by computing the correlation of the traffic with the spatial distribution information, which ensures robust and reliable detection according to early-stage traffic. We extensively evaluate Holmes using six datasets. Compared to nine existing DL-based WF attacks, Holmes improves the F1-score of identifying early-stage traffic by an average of 169.18%. Furthermore, we replay the traffic of visiting real-world dark web websites. Holmes successfully identifies dark web websites when the ratio of page loading on average is only 21.71%, with an average precision improvement of 169.36% over the existing WF attacks.

Submitted 30 June, 2024; originally announced July 2024.

Comments: To appear in the Proceedings of The ACM Conference on Computer and Communications Security (CCS), 2024

arXiv:2407.00569 [pdf, other]

Investigating and Mitigating the Multimodal Hallucination Snowballing in Large Vision-Language Models

Authors: Weihong Zhong, Xiaocheng Feng, Liang Zhao, Qiming Li, Lei Huang, Yuxuan Gu, Weitao Ma, Yuan Xu, Bing Qin

Abstract: Though advanced in understanding visual information with human languages, Large Vision-Language Models (LVLMs) still suffer from multimodal hallucinations. A natural concern is that during multimodal interaction, the generated hallucinations could influence the LVLMs' subsequent generation. Thus, we raise a question: When presented with a query relevant to the previously generated hallucination, will LVLMs be misled and respond incorrectly, even though the ground visual information exists? To answer this, we propose a framework called MMHalSnowball to evaluate LVLMs' behaviors when encountering generated hallucinations, where LVLMs are required to answer specific visual questions within a curated hallucinatory conversation. Crucially, our experiment shows that the performance of open-source LVLMs drops by at least $31\%$, indicating that LVLMs are prone to accept the generated hallucinations and make false claims that they would not have supported without distractions. We term this phenomenon Multimodal Hallucination Snowballing. To mitigate this, we further propose a training-free method called Residual Visual Decoding, where we revise the output distribution of LVLMs with the one derived from the residual visual input, providing models with direct access to the visual information. Experiments show that our method can mitigate more than $24\%$ of the snowballed multimodal hallucination while maintaining capabilities.

Submitted 29 June, 2024; originally announced July 2024.

Comments: Accepted to ACL 2024 Main Conference. 21 pages, 20 figures

arXiv:2407.00136 [pdf, other]

Observation of the Electromagnetic Dalitz Transition $h_c \rightarrow e^+e^-η_c$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, S. Ahmed, M. Albrecht, R. Aliberti, A. Amoroso, M. R. An, Q. An, X. H. Bai, Y. Bai, O. Bakina, R. Baldini Ferroli, I. Balossino, Y. Ban, K. Begzsuren, N. Berger, M. Bertani, D. Bettoni, F. Bianchi, J. Bloms, A. Bortone, I. Boyko, R. A. Briere, et al. (495 additional authors not shown)

Abstract: Using $(27.12\pm 0.14)\times10^8$ $ψ(3686)$ decays and data samples of $e^+e^-$ collisions with $\sqrt{s}$ from 4.130 to 4.780 GeV collected with the BESIII detector, we report the first observation of the electromagnetic Dalitz transition $h_c\to e^+e^-η_c$ with a statistical significance of $5.4σ$. We measure the ratio of the branching fractions $\frac{\mathcal{B}(h_c\rightarrow e^+e^-η_c)}{\mathcal{B}(h_c\rightarrow γη_c)}$ separately for the $h_c$ samples produced via $ψ(3686)\to π^0h_c$ and $e^+e^-\to π^+π^-h_c$. The average ratio is determined to be $(0.59\pm0.10(\text{stat.})\pm0.04(\text{syst.}))\%$, where the uncertainty includes both statistical and systematic components.

Submitted 28 June, 2024; originally announced July 2024.

arXiv:2407.00114 [pdf, other]

OmniJARVIS: Unified Vision-Language-Action Tokenization Enables Open-World Instruction Following Agents

Authors: Zihao Wang, Shaofei Cai, Zhancun Mu, Haowei Lin, Ceyao Zhang, Xuejie Liu, Qing Li, Anji Liu, Xiaojian Ma, Yitao Liang

Abstract: We present OmniJARVIS, a novel Vision-Language-Action (VLA) model for open-world instruction-following agents in open-world Minecraft. Compared to prior works that either emit textual goals to separate controllers or produce the control command directly, OmniJARVIS seeks a different path to ensure both strong reasoning and efficient decision-making capabilities via unified tokenization of multimodal interaction data. First, we introduce a self-supervised approach to learn a behavior encoder that produces discretized tokens for behavior trajectories $τ = \{o_0, a_0, \dots\}$ and an imitation learning (IL) policy decoder conditioned on these tokens. These additional behavior tokens will be augmented to the vocabulary of pretrained Multimodal Language Models (MLMs). With this encoder, we then pack long-term multimodal interactions involving task instructions, memories, thoughts, observations, textual responses, behavior trajectories, etc. into unified token sequences and model them with autoregressive transformers. Thanks to the semantically meaningful behavior tokens, the resulting VLA model, OmniJARVIS, can reason (by producing chain-of-thoughts), plan, answer questions, and act (by producing behavior tokens for the IL policy decoder). OmniJARVIS demonstrates excellent performances on a comprehensive collection of atomic, programmatic, and open-ended tasks in open-world Minecraft. Our analysis further unveils the crucial design principles in interaction data formation, unified tokenization, and its scaling potentials.

Submitted 27 June, 2024; originally announced July 2024.

arXiv:2406.20040 [pdf]

Twist angle driven electronic structure evolution of twisted bilayer graphene

Authors: Jiawei Yu, Guihao Jia, Qian Li, Yuyang Wang, Kebin Xiao, Yongkang Ju, Hongyun Zhang, Zhiqiang Hu, Yunkai Guo, Biao Lian, Peizhe Tang, Shuyun Zhou, Qi-Kun Xue, Wei Li

Abstract: In twisted bilayer graphene (TBG) devices, local strains often coexist and entangle with the twist-angle dependent moiré superlattice, both of which can significantly affect the electronic properties of TBG. Here, using low-temperature scanning tunneling microscopy, we investigate the fine evolution of the electronic structures of a TBG device with continuous variation of twist angles from 0.32° to 1.29°, spanning the first (1.1°), second (0.5°) and third (0.3°) magic angles. We reveal the exotic behavior of the flat bands and remote bands in both the energy space and real space near the magic angles. Interestingly, we observe an anomalous spectral weight transfer between the two flat band peaks in the tunneling spectra when approaching the first magic angle, suggesting strong inter-flat-bands interactions. The position of the remote band peak can be an index for the twist angle in TBG, since it positively correlates with the twist angle but is insensitive to the strain. Moreover, influences of the twist angle gradient on symmetry breaking of the flat bands are also studied.

Submitted 28 June, 2024; originally announced June 2024.

arXiv:2406.19311 [pdf, other]

Zero-Query Adversarial Attack on Black-box Automatic Speech Recognition Systems

Authors: Zheng Fang, Tao Wang, Lingchen Zhao, Shenyi Zhang, Bowen Li, Yunjie Ge, Qi Li, Chao Shen, Qian Wang

Abstract: In recent years, extensive research has been conducted on the vulnerability of ASR systems, revealing that black-box adversarial example attacks pose significant threats to real-world ASR systems. However, most existing black-box attacks rely on queries to the target ASRs, which is impractical when queries are not permitted. In this paper, we propose ZQ-Attack, a transfer-based adversarial attack on ASR systems in the zero-query black-box setting. Through a comprehensive review and categorization of modern ASR technologies, we first meticulously select surrogate ASRs of diverse types to generate adversarial examples. Following this, ZQ-Attack initializes the adversarial perturbation with a scaled target command audio, rendering it relatively imperceptible while maintaining effectiveness. Subsequently, to achieve high transferability of adversarial perturbations, we propose a sequential ensemble optimization algorithm, which iteratively optimizes the adversarial perturbation on each surrogate model, leveraging collaborative information from other models. We conduct extensive experiments to evaluate ZQ-Attack. In the over-the-line setting, ZQ-Attack achieves a 100% success rate of attack (SRoA) with an average signal-to-noise ratio (SNR) of 21.91 dB on 4 online speech recognition services, and attains an average SRoA of 100% and SNR of 19.67 dB on 16 open-source ASRs. For commercial intelligent voice control devices, ZQ-Attack also achieves a 100% SRoA with an average SNR of 15.77 dB in the over-the-air setting.

Submitted 27 June, 2024; originally announced June 2024.

Comments: To appear in the Proceedings of The ACM Conference on Computer and Communications Security (CCS), 2024

arXiv:2406.19190 [pdf, ps, other]

Improved measurement of the semileptonic decay $D^+_{s}\to K^0 e^+ν_e$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (643 additional authors not shown)

Abstract: Analyzing $e^+e^-$ collision data corresponding to an integrated luminosity of $7.33~\mathrm{fb}^{-1}$ collected at center-of-mass energies between 4.128 and 4.226 GeV with the BESIII detector, we measure the branching fraction of the semileptonic decay $D^+_{s}\to K^0 e^+ν_e$ to be $(2.98\pm0.23\pm0.12)\times10^{-3}$. The $D_s^+\to K^0$ hadronic form factor is determined from the differential decay rate of $D^+_s\to K^0 e^+ν_e$ to be $f^{K^0}_+(0)=0.636\pm0.049\pm0.013$. For both measurements, the first uncertainty is statistical and the second systematic. The branching fraction and form factor measurements are factors of 1.6 and 1.7 more precise than the previous world averages, respectively.

Submitted 27 June, 2024; originally announced June 2024.

Comments: 13 pages, 6 figures

arXiv:2406.19068 [pdf]

Enhancing interfacial thermal transport by nanostructures: Monte Carlo simulations with ab initio phonon properties

Authors: Wenzhu Luo, Neng Wang, Wenlei Lian, Ershuai Yin, Qiang Li

Abstract: Recent experiments have indicated that employing nanostructures can enhance interfacial heat transport, but the mechanism by which different structural morphologies and dimensions contribute to the full-spectrum phonon interfacial transport remains unclear. In this paper, a multiscale method to study the thermal transfer at nanostructured interfaces is developed by combining density functional calculation, Monte Carlo simulation, and diffuse mismatch method. The changes in the transport paths and contributions to thermal conductance of different frequency phonons caused by changes in nanostructure morphology and size are investigated. The results show that, compared to the triangular and trapezoidal nanostructures, the rectangular nanostructures are more beneficial in enhancing the probability of the reflected phonons encountering the interface, and thus the phonon interfacial transmittance. The nanostructure makes the interfacial heat flow extremely heterogeneous, with significant transverse heat flow occurring at the sidewalls, resulting in a new thermal conduction pathway. The phenomena of multiple reflections and double transmission together lead to the existence of an optimal dimension that maximizes the nanostructures' enhancement effect on interfacial heat transfer. The optimal nanostructure width is 100 nm when the height is 100 nm and the maximum interfacial thermal conductance enhancement ratio is 1.31. These results can guide the design of heat transfer enhancement structures at the interface of actual high-power chips.

Submitted 27 June, 2024; originally announced June 2024.

Comments: 18 pages, 9 figures

ACM Class: J.2.7

arXiv:2406.19043 [pdf]

CMRxRecon2024: A Multi-Modality, Multi-View K-Space Dataset Boosting Universal Machine Learning for Accelerated Cardiac MRI

Authors: Zi Wang, Fanwen Wang, Chen Qin, Jun Lyu, Ouyang Cheng, Shuo Wang, Yan Li, Mengyao Yu, Haoyu Zhang, Kunyuan Guo, Zhang Shi, Qirong Li, Ziqiang Xu, Ya**g Zhang, Hao Li, Sha Hua, Binghua Chen, Longyu Sun, Mengting Sun, Qin Li, Ying-Hua Chu, Wenjia Bai, **g Qin, Xiahai Zhuang, Claudia Prieto, et al. (7 additional authors not shown)

Abstract: Cardiac magnetic resonance imaging (MRI) has emerged as a clinically gold-standard technique for diagnosing cardiac diseases, thanks to its ability to provide diverse information with multiple modalities and anatomical views. Accelerated cardiac MRI is highly expected to achieve time-efficient and patient-friendly imaging, and then advanced image reconstruction approaches are required to recover high-quality, clinically interpretable images from undersampled measurements. However, the lack of publicly available cardiac MRI k-space datasets in terms of both quantity and diversity has severely hindered substantial technological progress, particularly for data-driven artificial intelligence. Here, we provide a standardized, diverse, and high-quality CMRxRecon2024 dataset to facilitate the technical development, fair evaluation, and clinical transfer of cardiac MRI reconstruction approaches, towards promoting the universal frameworks that enable fast and robust reconstructions across different cardiac MRI protocols in clinical practice. To the best of our knowledge, the CMRxRecon2024 dataset is the largest and most diverse publicly available cardiac k-space dataset. It is acquired from 330 healthy volunteers, covering commonly used modalities, anatomical views, and acquisition trajectories in clinical cardiac MRI workflows. Besides, an open platform with tutorials, benchmarks, and data processing tools is provided to facilitate data usage, advanced method development, and fair performance evaluation.

Submitted 27 June, 2024; originally announced June 2024.

Comments: 19 pages, 3 figures, 2 tables

arXiv:2406.18577 [pdf]

Measurement of dynamic nonlocal deformation using nanodiamond sensors

Authors: Yue Cui, Weng-Hang Leong, Guoli Zhu, Ren-Bao Liu, Quan Li

Abstract: Nonlocal deformation sensing achieved by integrating atomic force microscopy indentation with nanodiamond-based orientation tracking features high precision and high spatial resolution, providing a useful technique for studying the mechanical properties of soft biological systems. However, this technique is currently limited to lifeless systems because it cannot differentiate the indentation-induced deformation from that associated with live activities or other external perturbations. Here we develop a dynamic nonlocal deformation sensing method using oscillatory nanoindentation and spectroscopic analysis to overcome this limitation. The method realizes both temporally and spatially resolved mechanical analysis, with tens of microsecond time-lag precision, nanometer vertical deformation precision, and sub-hundred nanometer lateral spatial resolution, leading to the disclosure of surface/interface effects in the mechanical response of viscoelastic materials and live cells. Neglecting surface tension would underestimate the liquid-like characteristics of the materials. This work demonstrates nanodiamond sensors as a useful tool for spatial-temporal mechanical analysis of soft, complex bio-relevant materials.

Submitted 5 June, 2024; originally announced June 2024.

Comments: 33 pages (4 figures) + 26 pages (20 figures)

arXiv:2406.18189 [pdf, other]

Functional knockoffs selection with applications to functional data analysis in high dimensions

Authors: Xinghao Qiao, Mingya Long, Qizhai Li

Abstract: The knockoffs is a recently proposed powerful framework that effectively controls the false discovery rate (FDR) for variable selection. However, none of the existing knockoff solutions are directly suited to handle multivariate or high-dimensional functional data, which has become increasingly prevalent in various scientific applications. In this paper, we propose a novel functional model-X knockoffs selection framework tailored to sparse high-dimensional functional models, and show that our proposal can achieve effective FDR control for any sample size. Furthermore, we illustrate the proposed functional model-X knockoffs selection procedure along with the associated theoretical guarantees for both FDR control and asymptotic power using examples of commonly adopted functional linear additive regression models and the functional graphical model. In the construction of functional knockoffs, we integrate essential components including the correlation operator matrix, the Karhunen-Loève expansion, and semidefinite programming, and develop executable algorithms. We demonstrate the superiority of our proposed methods over the competitors through both extensive simulations and the analysis of two brain imaging datasets.

Submitted 27 June, 2024; v1 submitted 26 June, 2024; originally announced June 2024.

arXiv:2406.18183 [pdf, other]

Measurement of the cross sections of $e^+e^-\to K^{-}\bar{Ξ}^{+}Λ/Σ^{0}$ at center-of-mass energies between 3.510 and 4.914 GeV

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (638 additional authors not shown)

Abstract: Using $e^+e^-$ collision data collected with the BESIII detector at the BEPCII collider at center-of-mass energies between 3.510 and 4.914 GeV, corresponding to an integrated luminosity of 25 fb$^{-1}$, we measure the Born cross sections for the process $e^+e^-\to K^-\bar{Ξ}^+Λ/Σ^{0}$ at thirty-five energy points with a partial-reconstruction strategy. By fitting the dressed cross sections of $e^+e^-\to K^-\bar{Ξ}^+Λ/Σ^{0}$, evidence for $ψ(4160) \to K^{-}\bar{Ξ}^{+}Λ$ is found for the first time with a significance of 4.4$σ$, including systematic uncertainties. No evidence for other possible resonances is found. In addition, the products of electronic partial width and branching fraction for all assumed resonances decaying into $K^{-}\bar{Ξ}^{+}Λ/Σ^{0}$ are determined.

Submitted 26 June, 2024; originally announced June 2024.

Comments: 26 pages, 5 tables, 4 figures

arXiv:2406.18144 [pdf, other]

doi 10.1007/s11263-024-02153-0

Artificial Immune System of Secure Face Recognition Against Adversarial Attacks

Authors: Min Ren, Yunlong Wang, Yuhao Zhu, Yongzhen Huang, Zhenan Sun, Qi Li, Tieniu Tan

Abstract: Insect production for food and feed presents a promising supplement to ensure food safety and address the adverse impacts of agriculture on climate and environment in the future. However, optimisation is required for insect production to realise its full potential. This can be by targeted improvement of traits of interest through selective breeding, an approach which has so far been underexplored and underutilised in insect farming. Here we present a comprehensive review of the selective breeding framework in the context of insect production. We systematically evaluate adjustments of selective breeding techniques to the realm of insects and highlight the essential components integral to the breeding process. The discussion covers every step of a conventional breeding scheme, such as formulation of breeding objectives, phenotyping, estimation of genetic parameters and breeding values, selection of appropriate breeding strategies, and mitigation of issues associated with genetic diversity depletion and inbreeding. This review combines knowledge from diverse disciplines, bridging the gap between animal breeding, quantitative genetics, evolutionary biology, and entomology, offering an integrated view of the insect breeding research area and uniting knowledge which has previously remained scattered across diverse fields of expertise.

Submitted 26 June, 2024; originally announced June 2024.

Journal ref: International Journal of Computer Vision (IJCV), 2024

arXiv:2406.18083 [pdf, other]

Measurements of $K_S^0$-$K_L^0$ asymmetries in the decays $Λ_c^+ \to pK_{L,S}^0$, $pK_{L,S}^0π^+π^-$ and $pK_{L,S}^0π^0$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (643 additional authors not shown)

Abstract: Using $e^+e^-$ annihilation data sets corresponding to an integrated luminosity of 4.5 $\text{fb}^{-1}$, collected with the BESIII detector at center-of-mass energies between 4.600 and 4.699 GeV, we report the first measurements of the absolute branching fractions $\mathcal{B}(Λ_c^+\to pK_{L}^{0})=(1.67 \pm 0.06 \pm 0.04)\%$, $\mathcal{B}(Λ_c^+\to pK_{L}^{0}π^+π^-)=(1.69 \pm 0.10 \pm 0.05)\%$, and $\mathcal{B}(Λ_c^+\to pK_{L}^{0}π^0)=(2.02 \pm 0.13 \pm 0.05)\%$, where the first uncertainties are statistical and the second systematic. Combining with the known branching fractions of $Λ_c^+ \to pK_{S}^{0}$, $Λ_c^+ \to pK_{S}^{0}π^+π^-$, and $Λ_c^+ \to pK_{S}^{0}π^0$, we present the first measurements of the $K_{S}^{0}$-$K_{L}^{0}$ asymmetries $R(Λ_c^+, K_{S,L}^0X) = \frac{\mathcal{B}(Λ_c^+ \to K_{S}^{0} X) - \mathcal{B}(Λ_c^+ \to K_{L}^{0} X)}{\mathcal{B}(Λ_c^+ \to K_{S}^{0} X) + \mathcal{B}(Λ_c^+ \to K_{L}^{0} X)}$ in charmed baryon decays: $R(Λ_c^+, pK_{S,L}^0) = -0.025 \pm 0.031$, $R(Λ_c^+, pK_{S,L}^0π^+π^-) = -0.027 \pm 0.048$, and $R(Λ_c^+, pK_{S,L}^0π^0) = -0.015 \pm 0.046$. No significant asymmetries within the uncertainties are observed.

Submitted 26 June, 2024; originally announced June 2024.

Comments: 19 pages, 2 figures
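
As an illustration of the asymmetry definition quoted in the abstract above, the following minimal Python sketch evaluates $R$ from a pair of branching fractions and propagates their uncertainties to first order. The function name `ks_kl_asymmetry`, the assumption that the two inputs are uncorrelated, and the $K_S^0$ input used in the usage lines are illustrative assumptions, not values or methods taken from the paper.

```python
import math


def ks_kl_asymmetry(b_s, b_l, sigma_s=0.0, sigma_l=0.0):
    """Return (R, sigma_R) for R = (B_S - B_L) / (B_S + B_L).

    sigma_R uses first-order error propagation and assumes the two
    branching-fraction uncertainties are uncorrelated (an assumption).
    """
    total = b_s + b_l
    r = (b_s - b_l) / total
    # Partial derivatives: dR/dB_S = 2*B_L/total^2, dR/dB_L = -2*B_S/total^2
    d_s = 2.0 * b_l / total**2
    d_l = -2.0 * b_s / total**2
    sigma_r = math.hypot(d_s * sigma_s, d_l * sigma_l)
    return r, sigma_r


# K_L input from the abstract: (1.67 +/- 0.06 +/- 0.04)%,
# with statistical and systematic parts combined in quadrature.
b_l, sigma_l = 1.67, math.hypot(0.06, 0.04)
# Placeholder K_S input (hypothetical, NOT taken from the listing).
b_s, sigma_s = 1.60, 0.07
r, sigma_r = ks_kl_asymmetry(b_s, b_l, sigma_s, sigma_l)
print(f"R = {r:.3f} +/- {sigma_r:.3f}")
```

A negative $R$ means the $K_L^0$ mode has the larger branching fraction; the sketch ignores any correlated systematic uncertainties that a full analysis would treat separately.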

arXiv:2406.17712 [pdf, ps, other]

Representations of domains via closure spaces in the quantale-valued setting

Authors: Guojun Wu, Wei Yao, Qingguo Li

Abstract: With a commutative unital quantale $L$ as the truth value table, this study focuses on the representations of $L$-domains by means of $L$-closure spaces. First, the notions of interpolative generalized $L$-closure spaces and directed closed sets are introduced. It is proved that in an interpolative generalized $L$-closure space (resp., $L$-closure space), the collection of directed closed sets with respect to the inclusion $L$-order forms a continuous $L$-dcpo (resp., an algebraic $L$-dcpo). Conversely, it is shown that every continuous $L$-dcpo (resp., algebraic $L$-dcpo) can be reconstructed by an interpolative generalized $L$-closure space (resp., $L$-closure space). Second, when $L$ is integral, the notion of dense subspaces of generalized $L$-closure spaces is introduced. By means of dense subspaces, an alternative representation for algebraic $L$-dcpos is given. Moreover, the concept of $L$-approximable relations between interpolative generalized $L$-closure spaces is introduced. Consequently, a categorical equivalence between the category of interpolative generalized $L$-closure spaces (resp., $L$-closure spaces) with $L$-approximable relations and that of continuous $L$-dcpos (resp., algebraic $L$-dcpos) with Scott continuous mappings is established.

Submitted 25 June, 2024; originally announced June 2024.

arXiv:2406.17496 [pdf, other]

Comparison of the origin of Short Gamma ray Bursts with or without extended emission

Authors: Qin-Mei Li, Qi-Bin Sun

Abstract: The merger of compact binary stars produces short gamma-ray bursts (sGRBs), involving channels such as neutron star - neutron star (BNS) and neutron star - black hole (NS-BH). The association between sGRB 170817A and gravitational wave GW 170817 provides reliable evidence for the BNS channel. The spatial distribution and merger rate differ between BNS mergers and NS-BH mergers. Some speculations suggest that sGRBs with extended emission (EE) may represent another distinct population. We compared the offset distributions of these two types of samples and found that they follow the same distribution. Utilizing non-parametric methods, we investigated the origin of these burst types in terms of their formation rate. We examined the luminosity function and formation rate of sGRBs without making any assumptions. The luminosity function can be described as $ψ(L_{0}) \propto L_{0}^{-0.09 \pm 0.01}$ for $L_{0} < L_0^b$ ($ψ(L_{0}) \propto L_{0}^{-0.57 \pm 0.02}$ for $L_{0} > L_0^b$) for standard sGRBs and $ψ(L_{0}) \propto L_{0}^{-0.11 \pm 0.004}$ for $L_{0} < L_0^b$ ($ψ(L_{0}) \propto L_{0}^{-0.61 \pm 0.01}$ for $L_{0} > L_0^b$) for sGRBs with EE. The formation rate is characterized as $ρ(z) \propto (1 + z)^{-4.21 \pm 0.22}$ for $z < 0.8$ and $ρ(z) \propto (1 + z)^{-0.22 \pm 0.74}$ for $0.8 < z < 3$ for standard sGRBs, while for sGRBs with EE, it is $ρ(z) \propto (1 + z)^{-4.30 \pm 0.13}$ for $z < 0.8$ and $ρ(z) \propto (1 + z)^{-0.33 \pm 0.66}$ for $0.8 < z < 3$. Based on these findings, we suggest that there is no significant difference in the progenitor stars of sGRBs with and without EE, considering the spatial offset and formation rate perspectives.

Submitted 26 June, 2024; v1 submitted 25 June, 2024; originally announced June 2024.

Comments: 16 pages, 11 figures and 2 tables

arXiv:2406.17469 [pdf, other]

Cross-Modal Spherical Aggregation for Weakly Supervised Remote Sensing Shadow Removal

Authors: Kaichen Chi, Wei Jing, Junjie Li, Qiang Li, Qi Wang

Abstract: Remote sensing shadow removal, which aims to recover contaminated surface information, is tricky since shadows typically display overwhelmingly low illumination intensities. In contrast, the infrared image is robust toward significant light changes, providing visual clues complementary to the visible image. Nevertheless, the existing methods ignore the collaboration between heterogeneous modalities, leading to undesired quality degradation. To fill this gap, we propose a weakly supervised shadow removal network with a spherical feature space, dubbed S2-ShadowNet, to explore the best of both worlds for visible and infrared modalities. Specifically, we employ a modal translation (visible-to-infrared) model to learn the cross-domain mapping, thus generating realistic infrared samples. Then, Swin Transformer is utilized to extract strong representational visible/infrared features. Simultaneously, the extracted features are mapped to the smooth spherical manifold, which alleviates the domain shift through regularization. Well-designed similarity loss and orthogonality loss are embedded into the spherical space, prompting the separation of private visible/infrared features and the alignment of shared visible/infrared features through constraints on both representation content and orientation. Such a manner encourages implicit reciprocity between modalities, thus providing a novel insight into shadow removal. Notably, ground truth is not available in practice, thus S2-ShadowNet is trained by cropping shadow and shadow-free patches from the shadow image itself, avoiding stereotypical and strict pair data acquisition. More importantly, we contribute a large-scale weakly supervised shadow removal benchmark, including 4000 shadow images with corresponding shadow masks.

Submitted 25 June, 2024; originally announced June 2024.

Comments: 9 pages, 11 figures

arXiv:2406.17452 [pdf, ps, other]

Study of the $f_{0}(980)$ through the decay $D_{s}^{+}\rightarrow π^{+}π^{+}π^{-}π^{0}$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (649 additional authors not shown)

Abstract: We perform the first amplitude analysis of $D^+_s \to π^+π^+π^-π^0$ decays, based on data samples of electron-positron collisions recorded with the BESIII detector at center-of-mass energies between 4.128 and 4.226 GeV, corresponding to an integrated luminosity of 7.33~fb$^{-1}$. We report the observation of $D_{s}^{+} \to f_0(980)ρ(770)^{+}$ with a statistical significance greater than 10$σ$ and determine the branching fractions $\mathcal{B}(D_s^+\toπ^+π^+π^-π^0|_{{\rm non}-η})=(2.04\pm0.08_{\rm stat.}\pm0.05_{\rm syst.})\%$ and $\mathcal{B}(D_s^+\toηπ^+)=(1.56\pm0.09_{\rm stat.}\pm0.04_{\rm syst.})\%$. Moreover, we measure the relative branching fraction between $φ\toπ^+π^-π^0$ and $φ\to K^+K^-$ to be $\frac{\mathcal{B}(φ(1020) \to π^+π^-π^0)}{\mathcal{B}(φ(1020) \to K^+K^-)}=0.230 \pm 0.014_{\rm stat.} \pm 0.010_{\rm syst.}$, which deviates from the world average value by more than $4σ$.

Submitted 25 June, 2024; originally announced June 2024.

arXiv:2406.17305 [pdf, other]

Retrieval Augmented Instruction Tuning for Open NER with Large Language Models

Authors: Tingyu Xie, Jian Zhang, Yan Zhang, Yuanyuan Liang, Qi Li, Hongwei Wang

Abstract: The strong capability of large language models (LLMs) has been applied to information extraction (IE) through either retrieval augmented prompting or instruction tuning (IT). However, the best way to incorporate information with LLMs for IE remains an open question. In this paper, we explore Retrieval Augmented Instruction Tuning (RA-IT) for IE, focusing on the task of open named entity recognition (NER). Specifically, for each training sample, we retrieve semantically similar examples from the training dataset as the context and prepend them to the input of the original instruction. To evaluate our RA-IT approach more thoroughly, we construct a Chinese IT dataset for open NER and evaluate RA-IT in both English and Chinese scenarios. Experimental results verify the effectiveness of RA-IT across various data sizes and in both English and Chinese scenarios. We also conduct thorough studies to explore the impacts of various retrieval strategies in the proposed RA-IT framework. Code and data are available at: https://github.com/Emma1066/Retrieval-Augmented-IT-OpenNER

Submitted 25 June, 2024; originally announced June 2024.
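
The retrieve-and-prepend step described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the bag-of-words cosine similarity is a simple stand-in for a semantic retriever, and the function names, the `text`/`entities` record fields, and the prompt layout are all hypothetical.

```python
import math
from collections import Counter


def bow(text):
    """Bag-of-words vector; a crude stand-in for a semantic embedding."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two Counter vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_ra_it_input(index, train_set, k=2):
    """Prepend the k most similar *other* training examples to sample `index`."""
    query = train_set[index]
    q_vec = bow(query["text"])
    scored = [
        (cosine(q_vec, bow(ex["text"])), j)
        for j, ex in enumerate(train_set)
        if j != index  # never retrieve the sample itself
    ]
    # Highest similarity first; ties broken by dataset order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    context = [train_set[j] for _, j in scored[:k]]
    blocks = [f"Text: {ex['text']}\nEntities: {ex['entities']}" for ex in context]
    blocks.append(f"Text: {query['text']}\nEntities:")
    return "\n\n".join(blocks)


train = [
    {"text": "Paris is in France", "entities": "Paris; France"},
    {"text": "Berlin is in Germany", "entities": "Berlin; Germany"},
    {"text": "I like green apples", "entities": ""},
]
print(build_ra_it_input(0, train, k=1))
```

In this toy run the second sentence shares the tokens "is" and "in" with the query and is therefore retrieved as context; a real system would presumably use a learned sentence encoder instead.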

arXiv:2406.17248 [pdf, other]

MindSpore Quantum: A User-Friendly, High-Performance, and AI-Compatible Quantum Computing Framework

Authors: Xusheng Xu, Jiangyu Cui, Zidong Cui, Runhong He, Qingyu Li, Xiaowei Li, Yanling Lin, Jiale Liu, Wuxin Liu, Jiale Lu, Maolin Luo, Chufan Lyu, Shijie Pan, Mosharev Pavel, Runqiu Shu, Jialiang Tang, Ruoqian Xu, Shu Xu, Kang Yang, Fan Yu, Qingguo Zeng, Haiying Zhao, Qiang Zheng, Junyuan Zhou, Xu Zhou, et al. (14 additional authors not shown)

Abstract: We introduce MindSpore Quantum, a pioneering hybrid quantum-classical framework with a primary focus on the design and implementation of noisy intermediate-scale quantum (NISQ) algorithms. Leveraging the robust support of MindSpore, an advanced open-source deep learning training/inference framework, MindSpore Quantum exhibits exceptional efficiency in the design and training of variational quantum algorithms on both CPU and GPU platforms, delivering remarkable performance. Furthermore, this framework places a strong emphasis on enhancing the operational efficiency of quantum algorithms when executed on real quantum hardware. This encompasses the development of algorithms for quantum circuit compilation and qubit mapping, crucial components for achieving optimal performance on quantum processors. In addition to the core framework, we introduce QuPack, a meticulously crafted quantum computing acceleration engine. QuPack significantly accelerates the simulation speed of MindSpore Quantum, particularly in variational quantum eigensolver (VQE), quantum approximate optimization algorithm (QAOA), and tensor network simulations, providing astonishing speed. This combination of cutting-edge technologies empowers researchers and practitioners to explore the frontiers of quantum computing with unprecedented efficiency and performance.

Submitted 27 June, 2024; v1 submitted 24 June, 2024; originally announced June 2024.

arXiv:2406.17006 [pdf, other]

Probing the nature of the $χ_{c1}(3872)$ state using radiative decays

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis, et al. (1094 additional authors not shown)

Abstract: The radiative decays $χ_{c1}(3872)\rightarrowψ(2S)γ$ and $χ_{c1}(3872)\rightarrow J/ψγ$ are used to probe the~nature of the~$χ_{c1}(3872)$ state using proton-proton collision data collected with the LHCb detector, corresponding to an~integrated luminosity of~9 fb$^{-1}$. Using the~$B^+\rightarrow χ_{c1}(3872)K^+$ decay, the $χ_{c1}(3872)\rightarrow ψ(2S)γ$ process is observed for the first time and the ratio of its partial width to that of the $χ_{c1}(3872)\rightarrow J/ψγ$ decay is measured to be $$ \frac{Γ_{χ_{c1}(3872)\rightarrow ψ(2S)γ}} {Γ_{χ_{c1}(3872)\rightarrow J/ψγ}} = 1.67 \pm 0.21 \pm 0.12 \pm0.04 , $$ where the first uncertainty is statistical, the second systematic and the third is due to the uncertainties on the branching fractions of the $ψ(2S)$ and $J/ψ$ mesons. The measured ratio makes the interpretation of the $χ_{c1}(3872)$ state as a~pure $D^0\bar{D}^{*0}+\bar{D}^0D^{*0}$ molecule questionable and strongly indicates a sizeable compact charmonium or tetraquark component within the $χ_{c1}(3872)$ state.

Submitted 24 June, 2024; originally announced June 2024.

Comments: 31 pages, 2 figures. All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2024-015.html (LHCb public pages)

Report number: LHCb-PAPER-2024-015, CERN-EP-2024-157
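
For readers who want a single overall uncertainty on the partial-width ratio quoted in the abstract above, a common convention is to add the independent components in quadrature. The short sketch below does this under the assumption that the three quoted uncertainties are uncorrelated; the helper name `combine_in_quadrature` is hypothetical, and the paper itself quotes the components separately.

```python
import math


def combine_in_quadrature(*sigmas):
    """Combine independent uncertainties as sqrt(sum of squares)."""
    return math.sqrt(sum(s * s for s in sigmas))


ratio = 1.67
# Statistical, systematic, and external branching-fraction uncertainties
# as quoted in the abstract; treated here as uncorrelated (an assumption).
total = combine_in_quadrature(0.21, 0.12, 0.04)
print(f"ratio = {ratio:.2f} +/- {total:.3f} (combined)")
```

The statistical term dominates the combination, so the total is only slightly larger than 0.21.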

arXiv:2406.16639 [pdf, other]

Boosting the convergence of DSMC by GSIS

Authors: Liyan Luo, Qi Li, Fei Fei, Lei Wu

Abstract: A deterministic-stochastic coupling scheme is developed for simulating rarefied gas flows, where the key process is the alternating solution of the macroscopic synthetic equations [Su et al., J. Comput. Phys., 407 (2020) 109245] and the mesoscopic equation via the asymptotic-preserving time-relaxed Monte Carlo scheme [Fei, J. Comput. Phys., 486 (2023) 112128]. Firstly, the macroscopic synthetic equations are exactly derived from the Boltzmann equation, incorporating not only the Newtonian viscosity and Fourier thermal conduction laws but also higher-order constitutive relations that capture rarefaction effects; the latter are extracted from the stochastic solver over a defined sampling interval. Secondly, the macroscopic synthetic equations, with the initial field extracted from the stochastic solver over the same sampling interval, are solved to the steady state or over certain iteration steps. Finally, the simulation particles in the stochastic solver are updated to match the density, velocity, and temperature obtained from the macroscopic synthetic equations. Moreover, simulation particles in the subsequent interval will be partly sampled according to the solutions of macroscopic synthetic equations. As a result, our coupling strategy enhances the asymptotic-preserving characteristic of the stochastic solver and substantially accelerates convergence towards the steady state. Several numerical tests are performed, and it is found that our method can reduce the computational cost in the near-continuum flow regime by two orders of magnitude compared to the direct simulation Monte Carlo method.

Submitted 25 June, 2024; v1 submitted 24 June, 2024; originally announced June 2024.

arXiv:2406.16486 [pdf, other]

Towards Comprehensive Preference Data Collection for Reward Modeling

Authors: Yulan Hu, Qingyang Li, Sheng Ouyang, Ge Chen, Kaihui Chen, Lijun Mei, Xucheng Ye, Fuzheng Zhang, Yong Liu

Abstract: Reinforcement Learning from Human Feedback (RLHF) facilitates the alignment of large language models (LLMs) with human preferences, thereby enhancing the quality of responses generated. A critical component of RLHF is the reward model, which is trained on preference data and outputs a scalar reward during the inference stage. However, the collection of preference data still lacks thorough investigation. Recent studies indicate that preference data is collected either by AI or humans, where chosen and rejected instances are identified among pairwise responses. We question whether this process effectively filters out noise and ensures sufficient diversity in collected data. To address these concerns, for the first time, we propose a comprehensive framework for preference data collection, decomposing the process into four incremental steps: Prompt Generation, Response Generation, Response Filtering, and Human Labeling. This structured approach ensures the collection of high-quality preferences while reducing reliance on human labor. We conducted comprehensive experiments based on the data collected at different stages, demonstrating the effectiveness of the proposed data collection method.

Submitted 24 June, 2024; originally announced June 2024.

arXiv:2406.16087 [pdf, other]

Imperative Learning: A Self-supervised Neural-Symbolic Learning Framework for Robot Autonomy

Authors: Chen Wang, Kaiyi Ji, Junyi Geng, Zhongqiang Ren, Taimeng Fu, Fan Yang, Yifan Guo, Haonan He, Xiangyu Chen, Zitong Zhan, Qiwei Du, Shaoshu Su, Bowen Li, Yuheng Qiu, Yi Du, Qihang Li, Yifan Yang, Xiao Lin, Zhipeng Zhao

Abstract: Data-driven methods such as reinforcement and imitation learning have achieved remarkable success in robot autonomy. However, their data-centric nature still hinders them from generalizing well to ever-changing environments. Moreover, collecting large datasets for robotic tasks is often impractical and expensive. To overcome these challenges, we introduce a new self-supervised neural-symbolic (NeSy) computational framework, imperative learning (IL), for robot autonomy, leveraging the generalization abilities of symbolic reasoning. The framework of IL consists of three primary components: a neural module, a reasoning engine, and a memory system. We formulate IL as a special bilevel optimization (BLO), which enables reciprocal learning over the three modules. This overcomes the label-intensive obstacles associated with data-driven approaches and takes advantage of symbolic reasoning concerning logical reasoning, physical principles, geometric analysis, etc. We discuss several optimization techniques for IL and verify their effectiveness in five distinct robot autonomy tasks including path planning, rule induction, optimal control, visual odometry, and multi-robot routing. Through various experiments, we show that IL can significantly enhance robot autonomy capabilities and we anticipate that it will catalyze further research across diverse domains.

Submitted 23 June, 2024; originally announced June 2024.

arXiv:2406.15416 [pdf, other]

A novel dual-stage algorithm for capacitated arc routing problems with time-dependent service costs

Authors: Qingya Li, Shengcai Liu, Juan Zou, Ke Tang

Abstract: This paper focuses on solving the capacitated arc routing problem with time-dependent service costs (CARPTDSC), which is motivated by winter gritting applications. In the current literature, exact algorithms designed for CARPTDSC can only handle small-scale instances, while heuristic algorithms fail to obtain high-quality solutions. To overcome these limitations, we propose a novel dual-stage algorithm, called MAENS-GN, that consists of a routing stage and a vehicle departure time optimization stage. The former obtains the routing plan, while the latter determines the vehicle departure time. Importantly, existing literature often ignores the characteristic information contained in the relationship between the route cost and the vehicle departure time. The most significant innovation in this paper lies in the exploitation of this characteristic information during the vehicle departure time optimization stage. Specifically, we conduct a detailed analysis of this relationship under various scenarios and employ tailored methods to obtain the (approximately) optimal vehicle departure time. Furthermore, we propose an improved initialization strategy that considers time-dependent characteristics to achieve better solution quality. In addition to the modified benchmark test sets, we also experiment on a real-world test set. Experimental results demonstrate that MAENS-GN can obtain high-quality solutions on both small-scale and larger-scale instances of CARPTDSC.

Submitted 16 May, 2024; originally announced June 2024.

arXiv:2406.15030 [pdf, ps, other]

Search for the $e^+e^- \to φχ_{c1}(3872)$ process at BESIII

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (639 additional authors not shown)

Abstract: Based on 368.5 pb$^{-1}$ of $e^+e^-$ collision data collected at center-of-mass energies of 4.914 and 4.946 GeV by the BESIII detector, the $e^+e^- \to φχ_{c1}(3872)$ process is searched for for the first time. No significant signal is observed and the upper limits at the 90\% confidence level on the product of the Born cross section $σ(e^+e^- \to φχ_{c1}(3872))$ and the branching fraction $\mathcal{B}[χ_{c1}(3872)\toπ^+π^- J/ψ]$ at 4.914 and 4.946 GeV are set to be 0.85 and 0.96 pb, respectively. These measurements provide useful information on the production of the $χ_{c1}(3872)$ at $e^+e^-$ colliders and deepen our understanding of the nature of this particle.

Submitted 21 June, 2024; originally announced June 2024.

Comments: 11 pages, 3 figures

arXiv:2406.14869 [pdf, other]

Cost-Effective RF Fingerprinting Based on Hybrid CVNN-RF Classifier with Automated Multi-Dimensional Early-Exit Strategy

Authors: Jiayan Gan, Zhixing Du, Qiang Li, Huaizong Shao, Jingran Lin, Ye Pan, Zhongyi Wen, Shafei Wang

Abstract: While the Internet of Things (IoT) technology is booming and offers huge opportunities for information exchange, it also faces unprecedented security challenges. As an important complement to the physical layer security technologies for IoT, radio frequency fingerprinting (RFF) is of great interest due to its difficulty in counterfeiting. Recently, many machine learning (ML)-based RFF algorithms have emerged. In particular, deep learning (DL) has shown great benefits in automatically extracting complex and subtle features from raw data with high classification accuracy. However, DL algorithms face the computational cost problem as the difficulty of the RFF task and the size of the DNN have increased dramatically. To address the above challenge, this paper proposes a novel cost-effective early-exit neural network consisting of a complex-valued neural network (CVNN) backbone with multiple random forest branches, called hybrid CVNN-RF. Unlike conventional studies that use a single fixed DL model to process all RF samples, our hybrid CVNN-RF considers differences in the recognition difficulty of RF samples and introduces an early-exit mechanism to dynamically process the samples. When processing "easy" samples that can be well classified with high confidence, the hybrid CVNN-RF can end early at the random forest branch to reduce computational cost. Conversely, subsequent network layers will be activated to ensure accuracy. To further improve the early-exit rate, an automated multi-dimensional early-exit strategy is proposed to achieve scheduling control from multiple dimensions within the network depth and classification category. Finally, our experiments on the public ADS-B dataset show that the proposed algorithm can reduce the computational cost by 83% while improving the accuracy by 1.6% under a classification task with 100 categories.

Submitted 21 June, 2024; originally announced June 2024.

Comments: Accepted by IEEE Internet of Things Journal
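
The early-exit idea in the abstract above (stop at an intermediate branch when it is confident, otherwise continue through deeper layers) can be sketched generically as below. This is a simplified illustration under stated assumptions: the backbone stages and branch classifiers are arbitrary callables rather than a CVNN and random forests, thresholds vary only with depth (the paper's strategy also varies them by class), and all names are hypothetical.

```python
def argmax(probs):
    """Index of the largest probability (first one on ties)."""
    return max(range(len(probs)), key=probs.__getitem__)


def early_exit_predict(sample, stages, branches, final_head, thresholds):
    """Run backbone stages in order and exit at the first confident branch.

    stages:      callables mapping features -> features
    branches:    callables mapping features -> list of class probabilities
    final_head:  callable used when no branch is confident enough
    thresholds:  one confidence threshold per branch
    Returns (predicted_label, exit_depth).
    """
    features = sample
    for depth, (stage, branch, tau) in enumerate(zip(stages, branches, thresholds)):
        features = stage(features)
        probs = branch(features)
        label = argmax(probs)
        if probs[label] >= tau:
            # "Easy" sample: skip the remaining, more expensive stages.
            return label, depth
    # "Hard" sample: fall through to the full network output.
    return argmax(final_head(features)), len(stages)


# Toy stand-ins for the backbone and branch classifiers.
stages = [lambda x: x + 1, lambda x: x * 2]
branches = [lambda f: [0.6, 0.4], lambda f: [0.1, 0.9]]
final_head = lambda f: [0.5, 0.5]
print(early_exit_predict(0, stages, branches, final_head, [0.8, 0.8]))
```

Lowering a threshold raises the early-exit rate at that depth at the risk of more misclassifications, which is the trade-off an automated threshold-selection strategy has to balance.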

arXiv:2406.13595 [pdf, other]

Topological representations for frame-valued domains via $L$-sobriety

Authors: Guojun Wu, Wei Yao, Qingguo Li

Abstract: With a frame $L$ as the truth value table, we study the topological representations for frame-valued domains. We introduce the notions of locally super-compact $L$-topological space and strong locally super-compact $L$-topological space. Using these concepts, continuous $L$-dcpos and algebraic $L$-dcpos are successfully represented via $L$-sobriety. By means of Scott $L$-topology and specialization $L$-order, we establish a categorical isomorphism between the category of the continuous (resp., algebraic) $L$-dcpos with Scott continuous maps and that of the locally super-compact (resp., strong locally super-compact) $L$-sober spaces with continuous maps. As an application, for a continuous $L$-poset $P$, we obtain a categorical isomorphism between the category of directed completions of $P$ with Scott continuous maps and that of the $L$-sobrifications of $(P, σ_{L}(P))$ with continuous maps.

Submitted 19 June, 2024; originally announced June 2024.

arXiv:2406.13532 [pdf, other]

SALI: Short-term Alignment and Long-term Interaction Network for Colonoscopy Video Polyp Segmentation

Authors: Qiang Hu, Zhenyu Yi, Ying Zhou, Fang Peng, Mei Liu, Qiang Li, Zhiwei Wang

Abstract: Colonoscopy videos provide richer information in polyp segmentation for rectal cancer diagnosis. However, the endoscope's fast movement and close-up observation cause current methods to suffer from large spatial incoherence and continuous low-quality frames, and thus yield limited segmentation accuracy. In this context, we focus on robust video polyp segmentation by enhancing the adjacent feature consistency and rebuilding the reliable polyp representation. To achieve this goal, we propose the SALI network, a hybrid of a Short-term Alignment Module (SAM) and a Long-term Interaction Module (LIM). The SAM learns spatially aligned features of adjacent frames via deformable convolution and further harmonizes them to capture a more stable short-term polyp representation. In case of low-quality frames, the LIM stores the historical polyp representations as a long-term memory bank, and explores the retrospective relations to interactively rebuild more reliable polyp features for the current segmentation. Combining SAM and LIM, the SALI network for video segmentation shows great robustness to spatial variations and low visual cues. A benchmark on the large-scale SUNSEG dataset verifies the superiority of SALI over the current state of the art, improving Dice by 2.1%, 2.5%, 4.1% and 1.9% for the four test sub-sets, respectively. Codes are at https://github.com/Scatteredrain/SALI.

Submitted 19 June, 2024; originally announced June 2024.

Comments: Accepted to MICCAI 2024. Code and models: https://github.com/Scatteredrain/SALI

arXiv:2406.13173 [pdf, other]

Biomedical Visual Instruction Tuning with Clinician Preference Alignment

Authors: Hejie Cui, Lingjun Mao, Xin Liang, Jieyu Zhang, Hui Ren, Quanzheng Li, Xiang Li, Carl Yang

Abstract: Recent advancements in multimodal foundation models have showcased impressive capabilities in understanding and reasoning with visual and textual information. Adapting these foundation models trained for general usage to specialized domains like biomedicine requires large-scale domain-specific instruction datasets. While existing works have explored curating such datasets automatically, the resultant datasets are not explicitly aligned with domain expertise. In this work, we propose a data-centric framework, Biomedical Visual Instruction Tuning with Clinician Preference Alignment (BioMed-VITAL), that incorporates clinician preferences into both stages of generating and selecting instruction data for tuning biomedical multimodal foundation models. First, during the generation stage, we prompt the GPT-4V generator with a diverse set of clinician-selected demonstrations for preference-aligned data candidate generation. Then, during the selection phase, we train a separate selection model, which explicitly distills clinician and policy-guided model preferences into a rating function to select high-quality data for medical instruction tuning. Results show that the model tuned with the instruction-following data from our method demonstrates a significant improvement in open visual chat (18.5% relatively) and medical VQA (win rate up to 81.73%). Our instruction-following data and models are available at BioMed-VITAL.github.io.

Submitted 29 June, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

MSC Class: 68T50; 68T45; 68T37; 68T05; 68T07; 68T09; ACM Class: I.2.7; I.2.6; I.2.10

arXiv:2406.12947 [pdf, other]

AutoFirm: Automatically Identifying Reused Libraries inside IoT Firmware at Large-Scale

Authors: YongLe Chen, Feng Ma, Ying Zhang, YongZhong He, Haining Wang, Qiang Li

Abstract: The Internet of Things (IoT) has become indispensable to our daily lives and work. Unfortunately, developers often reuse software libraries in IoT firmware, leading to a major security concern. If vulnerabilities or insecure versions of these libraries go unpatched, a massive number of IoT devices can be impacted. In this paper, we propose AutoFirm, an automated tool for detecting reused libraries in IoT firmware at a large scale. Specifically, AutoFirm leverages the syntax information (library name and version) to determine whether IoT firmware reuses the libraries. We conduct a large-scale empirical study of reused libraries of IoT firmware, investigating 6,900+ firmware and 2,700+ distinct vulnerabilities affecting 11,300+ vulnerable versions from 349 open-source software libraries. Leveraging this diverse information set, we conduct a qualitative assessment of vulnerable library versions to understand security gaps and the misplaced trust of libraries in IoT firmware. Our research reveals that: manufacturers neglected to update outdated libraries for IoT firmware in 67.3% of cases; on average, outdated libraries persisted for over 1.34 years prior to remediation; vulnerabilities of software libraries have posed severe threats to widespread IoT devices.

Submitted 18 June, 2024; originally announced June 2024.

Comments: 13 pages, 20 figures

arXiv:2406.12784 [pdf, other]

UBENCH: Benchmarking Uncertainty in Large Language Models with Multiple Choice Questions

Authors: Xunzhi Wang, Zhuowei Zhang, Qiongyu Li, Gaonan Chen, Mengting Hu, Zhiyu Li, Bitong Luo, Hang Gao, Zhixin Han, Haotian Wang

Abstract: The rapid development of large language models (LLMs) has shown promising practical results. However, their low interpretability often leads to errors in unforeseen circumstances, limiting their utility. Many works have focused on creating comprehensive evaluation systems, but previous benchmarks have primarily assessed problem-solving abilities while neglecting the response's uncertainty, which may result in unreliability. Recent methods for measuring LLM reliability are resource-intensive and unable to test black-box models. To address this, we propose UBENCH, a comprehensive benchmark for evaluating LLM reliability. UBENCH includes 3,978 multiple-choice questions covering knowledge, language, understanding, and reasoning abilities. Experimental results show that UBENCH has achieved state-of-the-art performance, while its single-sampling method significantly saves computational resources compared to baseline methods that require multiple samplings. Additionally, based on UBENCH, we evaluate the reliability of 15 popular LLMs, finding GLM4 to be the most outstanding, closely followed by GPT-4. We also explore the impact of Chain-of-Thought prompts, role-playing prompts, option order, and temperature on LLM reliability, analyzing the varying effects on different LLMs.

Submitted 18 June, 2024; originally announced June 2024.

Comments: Under review

arXiv:2406.12646 [pdf, other]

An Empirical Study on the Fairness of Foundation Models for Multi-Organ Image Segmentation

Authors: Qin Li, Yizhe Zhang, Yan Li, Jun Lyu, Meng Liu, Longyu Sun, Mengting Sun, Qirong Li, Wenyue Mao, Xinran Wu, Ya**g Zhang, Yinghua Chu, Shuo Wang, Chengyan Wang

Abstract: The segmentation foundation model, e.g., Segment Anything Model (SAM), has attracted increasing interest in the medical image community. Early pioneering studies primarily concentrated on assessing and improving SAM's performance from the perspectives of overall accuracy and efficiency, yet little attention was given to fairness considerations. This oversight raises questions about the potential for performance biases that could mirror those found in task-specific deep learning models like nnU-Net. In this paper, we explore the fairness dilemma concerning large segmentation foundation models. We prospectively curate a benchmark dataset of 3D MRI and CT scans of organs including the liver, kidney, spleen, lung and aorta from a total of 1056 healthy subjects with expert segmentations. Crucially, we document demographic details such as gender, age, and body mass index (BMI) for each subject to facilitate a nuanced fairness analysis. We test state-of-the-art foundation models for medical image segmentation, including the original SAM, medical SAM and SAT models, to evaluate segmentation efficacy across different demographic groups and identify disparities. Our comprehensive analysis, which accounts for various confounding factors, reveals significant fairness concerns within these foundation models. Moreover, our findings highlight not only disparities in overall segmentation metrics, such as the Dice Similarity Coefficient, but also significant variations in the spatial distribution of segmentation errors, offering empirical evidence of the nuanced challenges in ensuring fairness in medical image segmentation.

Submitted 18 June, 2024; originally announced June 2024.

Comments: Accepted to MICCAI-2024

arXiv:2406.12479 [pdf, other]

RS-GPT4V: A Unified Multimodal Instruction-Following Dataset for Remote Sensing Image Understanding

Authors: Linrui Xu, Ling Zhao, Wang Guo, Qiujun Li, Kewang Long, Kaiqi Zou, Yuhan Wang, Haifeng Li

Abstract: Remote sensing image (RSI) intelligent understanding is undergoing a profound paradigm shift driven by multi-modal large language models (MLLMs): from the paradigm of learning a domain model (LaDM) to the paradigm of learning a pre-trained general foundation model followed by an adaptive domain model (LaGD). Under the new LaGD paradigm, the old datasets, which have led to advances in RSI intelligence understanding in the last decade, are no longer suitable for brand-new tasks. We argue that a new dataset must be designed to lighten tasks with the following features: 1) Generalization: training the model to learn shared knowledge among tasks and to adapt to different tasks; 2) Understanding complex scenes: training the model to understand the fine-grained attributes of the objects of interest and to describe the scene in natural language; 3) Reasoning: training the model to perform high-level visual reasoning. In this paper, we design a high-quality, diversified, and unified multimodal instruction-following dataset for RSI understanding, produced with GPT-4V and existing datasets, which we call RS-GPT4V. To achieve generalization, we use (Question, Answer) pairs deduced from GPT-4V via instruction-following to unify tasks such as captioning and localization. To achieve complex scene understanding, we propose a hierarchical instruction description with a local strategy, in which the fine-grained attributes of the objects and their spatial relationships are described, and a global strategy, in which all the local information is integrated to yield a detailed instruction description. To achieve reasoning, we design multiple-turn QA pairs to provide reasoning ability for a model. The empirical results show that MLLMs fine-tuned on RS-GPT4V can describe fine-grained information. The dataset is available at: https://github.com/GeoX-Lab/RS-GPT4V.

Submitted 18 June, 2024; originally announced June 2024.

Comments: 14 pages, 6 figures, 4 tables

arXiv:2406.12111 [pdf, other]

Precision measurement of the $Ξ^-_b$ baryon lifetime

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis , et al. (1064 additional authors not shown)

Abstract: A sample of $pp$ collision data, corresponding to an integrated luminosity of 5.5 fb$^{-1}$ and collected by the LHCb experiment during Run 2, is used to measure the ratio of the lifetime of the $Ξ^-_b$ baryon to that of the $Λ^0_b$ baryon, $r_τ\equivτ_{Ξ^-_b}/τ_{Λ^0_b}$. The value ${r_τ^{\rm Run\,2}=1.076\pm0.013\pm0.006}$ is obtained, where the first uncertainty is statistical and the second systematic. This value is averaged with the corresponding value from Run 1 to obtain ${r_τ^{\rm Run\,1,2} = 1.078\pm0.012\pm0.007}$. Multiplying by the world-average value of the $Λ^0_b$ lifetime yields $τ_{Ξ^-_b}^{\rm Run~1,2} = 1.578\pm0.018\pm0.010\pm0.011$ ps, where the uncertainties are statistical, systematic, and due to the limited knowledge of the $Λ^0_b$ lifetime. This measurement improves the precision of the current world average of the $Ξ^-_b$ lifetime by about a factor of two, and is in good agreement with the most recent theoretical predictions.

Submitted 17 June, 2024; originally announced June 2024.

Comments: 12 pages, 5 figures. All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2024-010.html (LHCb public pages)

Report number: LHCb-PAPER-2024-010, CERN-EP-2024-139

arXiv:2406.11900 [pdf, other]

Horizon-wise Learning Paradigm Promotes Gene Splicing Identification

Authors: Qi-Jie Li, Qian Sun, Shao-Qun Zhang

Abstract: Identifying gene splicing is a core and significant task confronted in modern collaboration between artificial intelligence and bioinformatics. Past decades have witnessed great efforts on this concern, such as the bio-plausible splicing pattern AT-CG and the famous SpliceAI. In this paper, we propose a novel framework for the task of gene splicing identification, named Horizon-wise Gene Splicing Identification (H-GSI). The proposed H-GSI follows the horizon-wise identification paradigm and comprises four components: the pre-processing procedure transforming string data into tensors, the sliding window technique handling long sequences, the SeqLab model, and the predictor. In contrast to existing studies that process gene information with a truncated fixed-length sequence, H-GSI employs a horizon-wise identification paradigm in which all positions in a sequence are predicted with only one forward computation, improving accuracy and efficiency. The experiments conducted on the real-world Human dataset show that our proposed H-GSI outperforms SpliceAI and achieves the best accuracy of 97.20%. The source code is available from this link.

Submitted 15 June, 2024; originally announced June 2024.

arXiv:2406.11882 [pdf]

Applications of Explainable artificial intelligence in Earth system science

Authors: Feini Huang, Shijie Jiang, Lu Li, Yongkun Zhang, Ye Zhang, Ruqing Zhang, Qingliang Li, Danxi Li, Wei Shangguan, Yongjiu Dai

Abstract: In recent years, artificial intelligence (AI) has rapidly accelerated its influence and is expected to promote the development of Earth system science (ESS) if properly harnessed. In the application of AI to ESS, a significant hurdle lies in the interpretability conundrum, an inherent problem of black-box nature arising from the complexity of AI algorithms. To address this, explainable AI (XAI) offers a set of powerful tools that make the models more transparent. The purpose of this review is twofold: first, to provide ESS scholars, especially newcomers, with a foundational understanding of XAI, serving as a primer to inspire future research advances; second, to encourage ESS professionals to embrace the benefits of AI, free from preconceived biases due to its lack of interpretability. We begin by elucidating the concept of XAI, along with typical methods. We then delve into a review of XAI applications in the ESS literature, highlighting the important role that XAI has played in facilitating communication with AI model decisions, improving model diagnosis, and uncovering scientific insights. We identify four significant challenges that XAI faces within ESS, and propose solutions. Furthermore, we provide a comprehensive illustration of multifaceted perspectives. Given the unique challenges in ESS, an interpretable hybrid approach that seamlessly integrates AI with domain-specific knowledge appears to be a promising way to enhance the utility of AI in ESS. A visionary outlook for ESS envisions a harmonious blend where process-based models govern the known, AI models explore the unknown, and XAI bridges the gap by providing explanations.

Submitted 12 June, 2024; originally announced June 2024.

arXiv:2406.11602 [pdf, other]

Association between a Failed Prominence Eruption and the Drainage of Mass from Another Prominence

Authors: Jianchao Xue, Li Feng, Hui Li, ** Zhang, Jun Chen, Guanglu Shi, Kaifan Ji, Ye Qiu, Chuan Li, Lei Lu, Beili Ying, Ying Li, Yu Huang, You** Li, **gwei Li, Jie Zhao, Dechao Song, Shuting Li, Zhengyuan Tian, Yingna Su, Qingmin Zhang, Yunyi Ge, Jiahui Shan, Qiao Li, Gen Li , et al. (9 additional authors not shown)

Abstract: Sympathetic eruptions of solar prominences have been studied for decades; however, it is usually difficult to identify their causal links. Here we present two failed prominence eruptions on 26 October 2022 and explore their connections. Stereoscopic observations show that the south prominence (PRO-S) erupts with untwisting motions, flare ribbons occur underneath, and new connections are formed during the eruption. The north prominence (PRO-N) rises up along with PRO-S, and its upper part disappears due to catastrophic mass draining along an elongated structure after the failed eruption of PRO-S. We suggest that the eruption of PRO-S initiates due to a kink instability, further rises up, and fails to erupt due to reconnection with surrounding fields. The elongated structure connecting PRO-N overlies PRO-S, which causes the rising up of PRO-N along with PRO-S and mass drainage after the PRO-S eruption. This study suggests that a prominence may end its life through mass drainage forced by an eruption underneath.

Submitted 20 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

Comments: 15 pages, 7 figures, has been accepted by Solar Physics

arXiv:2406.11515 [pdf, other]

Obfuscating IoT Device Scanning Activity via Adversarial Example Generation

Authors: Haocong Li, Yaxin Zhang, Long Cheng, Wenjia Niu, Haining Wang, Qiang Li

Abstract: Nowadays, attackers target Internet of Things (IoT) devices for security exploitation, and search engines for devices and services compromise user privacy, including IP addresses, open ports, device types, vendors, and products. Typically, application banners are used to recognize IoT device profiles during network measurement and reconnaissance. In this paper, we propose a novel approach to obfuscating IoT device banners (BANADV) based on adversarial examples. The key idea is to explore the susceptibility of fingerprinting techniques to a slight perturbation of an IoT device banner. By modifying device banners, BANADV disrupts the collection of IoT device profiles. To validate the efficacy of BANADV, we conduct a set of experiments. Our evaluation results show that adversarial examples can spoof state-of-the-art fingerprinting techniques, including learning- and matching-based approaches. We further provide a detailed analysis of the weakness of learning-based/matching-based fingerprints to carefully crafted samples. Overall, the innovations of BANADV lie in three aspects: (1) it utilizes an IoT-related semantic space and a visual similarity space to locate available manipulating perturbations of IoT banners; (2) it achieves at least an 80% success rate for spoofing IoT scanning techniques; and (3) it is the first to utilize adversarial examples of IoT banners in network measurement and reconnaissance.

Submitted 17 June, 2024; originally announced June 2024.

arXiv:2406.10928 [pdf, other]

doi 10.1145/3637528.3671708

Make Your Home Safe: Time-aware Unsupervised User Behavior Anomaly Detection in Smart Homes via Loss-guided Mask

Authors: **gyu Xiao, Zhiyao Xu, Qingsong Zou, Qing Li, Dan Zhao, Dong Fang, Ruoyu Li, Wenxin Tang, Kang Li, Xudong Zuo, Penghui Hu, Yong Jiang, Zixuan Weng, Michael R. Lyu

Abstract: Smart homes, powered by the Internet of Things, offer great convenience but also pose security concerns due to abnormal behaviors, such as improper operations of users and potential attacks from malicious attackers. Several behavior modeling methods have been proposed to identify abnormal behaviors and mitigate potential risks. However, their performance often falls short because they do not effectively learn less frequent behaviors, consider temporal context, or account for the impact of noise in human behaviors. In this paper, we propose SmartGuard, an autoencoder-based unsupervised user behavior anomaly detection framework. First, we design a Loss-guided Dynamic Mask Strategy (LDMS) to encourage the model to learn less frequent behaviors, which are often overlooked during learning. Second, we propose a Three-level Time-aware Position Embedding (TTPE) to incorporate temporal information into positional embedding to detect temporal context anomalies. Third, we propose a Noise-aware Weighted Reconstruction Loss (NWRL) that assigns different weights for routine behaviors and noise behaviors to mitigate the interference of noise behaviors during inference. Comprehensive experiments on three datasets with ten types of anomalous behaviors demonstrate that SmartGuard consistently outperforms state-of-the-art baselines and also offers highly interpretable results.

Submitted 18 June, 2024; v1 submitted 16 June, 2024; originally announced June 2024.

Comments: KDD 2024

arXiv:2406.10744

Technique Report of CVPR 2024 PBDL Challenges

Authors: Ying Fu, Yu Li, Shaodi You, Boxin Shi, Jose Alvarez, Coert van Gemeren, Linwei Chen, Yunhao Zou, Zichun Wang, Yichen Li, Yuze Han, Yingkai Zhang, Jianan Wang, Qinglin Liu, Wei Yu, Xiaoqian Lv, Jianing Li, Sheng** Zhang, Xiangyang Ji, Yuanpei Chen, Yuhan Zhang, Weihang Peng, Liwen Zhang, Zhe Xu, Dingyong Gou , et al. (77 additional authors not shown)

Abstract: The intersection of physics-based vision and deep learning presents an exciting frontier for advancing computer vision technologies. By leveraging the principles of physics to inform and enhance deep learning models, we can develop more robust and accurate vision systems. Physics-based vision aims to invert the processes to recover scene properties such as shape, reflectance, light distribution, and medium properties from images. In recent years, deep learning has shown promising improvements for various vision tasks, and when combined with physics-based vision, these approaches can enhance the robustness and accuracy of vision systems. This technical report summarizes the outcomes of the Physics-Based Vision Meets Deep Learning (PBDL) 2024 challenge, held at the CVPR 2024 workshop. The challenge consisted of eight tracks, focusing on Low-Light Enhancement and Detection as well as High Dynamic Range (HDR) Imaging. This report details the objectives, methodologies, and results of each track, highlighting the top-performing solutions and their innovative approaches.

Submitted 27 June, 2024; v1 submitted 15 June, 2024; originally announced June 2024.

Comments: The author list and contents need to be verified by all authors

arXiv:2406.10450 [pdf, other]

TokenRec: Learning to Tokenize ID for LLM-based Generative Recommendation

Authors: Haohao Qu, Wenqi Fan, Zihuai Zhao, Qing Li

Abstract: There is a growing interest in utilizing large-scale language models (LLMs) to advance next-generation Recommender Systems (RecSys), driven by their outstanding language understanding and in-context learning capabilities. In this scenario, tokenizing (i.e., indexing) users and items becomes essential for ensuring a seamless alignment of LLMs with recommendations. While several studies have made progress in representing users and items through textual contents or latent representations, challenges remain in efficiently capturing high-order collaborative knowledge into discrete tokens that are compatible with LLMs. Additionally, the majority of existing tokenization approaches often face difficulties in generalizing effectively to new/unseen users or items that were not in the training corpus. To address these challenges, we propose a novel framework called TokenRec, which introduces not only an effective ID tokenization strategy but also an efficient retrieval paradigm for LLM-based recommendations. Specifically, our tokenization strategy, Masked Vector-Quantized (MQ) Tokenizer, involves quantizing the masked user/item representations learned from collaborative filtering into discrete tokens, thus achieving a smooth incorporation of high-order collaborative knowledge and a generalizable tokenization of users and items for LLM-based RecSys. Meanwhile, our generative retrieval paradigm is designed to efficiently recommend top-$K$ items for users to eliminate the need for the time-consuming auto-regressive decoding and beam search processes used by LLMs, thus significantly reducing inference time. Comprehensive experiments validate the effectiveness of the proposed methods, demonstrating that TokenRec outperforms competitive benchmarks, including both traditional recommender systems and emerging LLM-based recommender systems.

Submitted 14 June, 2024; originally announced June 2024.

arXiv:2406.10358 [pdf, other]

I Still See You: Why Existing IoT Traffic Reshaping Fails

Authors: Su Wang, Keyang Yu, Qi Li, Dong Chen

Abstract: The Internet traffic data produced by Internet of Things (IoT) devices are collected by Internet Service Providers (ISPs) and device manufacturers, and often shared with their third parties to maintain and enhance user services. Unfortunately, on-path adversaries could infer and fingerprint users' sensitive private information, such as occupancy and user activities, by analyzing these network traffic traces. While there is a growing body of literature on defending against this side-channel attack, malicious IoT traffic analytics (TA), there is currently no systematic method to compare and evaluate the comprehensiveness of these existing studies. To address this problem, we design a new low-cost, open-source system framework, the IoT Traffic Exposure Monitoring Toolkit (ITEMTK), that enables people to comprehensively examine and validate prior attack models and their defending approaches. In particular, we also design a novel image-based attack capable of inferring sensitive user information, even when users employ the most robust preventative measures in their smart homes. Researchers could leverage our new image-based attack to systematize and understand the existing literature on IoT traffic analysis attacks and prevention studies. Our results show that current defending approaches are not sufficient to protect IoT device user privacy. IoT devices are significantly vulnerable to our new image-based user privacy inference attacks, posing a grave threat to IoT device user privacy. We also highlight potential future improvements to enhance the defending approaches. ITEMTK's flexibility allows other researchers to expand it easily by integrating new TA attack models and prevention methods to benchmark their future work.

Submitted 14 June, 2024; originally announced June 2024.

Comments: EWSN'24 paper accepted, to appear

arXiv:2406.10243 [pdf, ps, other]

RKKY interaction in helical higher-order topological insulators

Authors: Sha **, Jian Li, Qing-Xu Li, Jia-Ji Zhu

Abstract: We theoretically investigate the RKKY interaction in helical higher-order topological insulators (HOTIs), revealing distinct behaviors mediated by hinge and Dirac-type bulk carriers. Our findings show that hinge-mediated interactions consist of Heisenberg, Ising, and Dzyaloshinskii-Moriya (DM) terms, exhibiting a decay with impurity spacing $z$ and oscillations with Fermi energy $ε_F$. These interactions demonstrate ferromagnetic behaviors for the Heisenberg and Ising terms and alternating behavior for the DM term. In contrast, bulk-mediated interactions include Heisenberg, twisted Ising, and DM terms, with a conventional cubic oscillating decay. This study highlights the nuanced interplay between hinge and bulk RKKY interactions in HOTIs, offering insights into the design of next-generation quantum devices based on the HOTIs.

Submitted 6 June, 2024; originally announced June 2024.

Comments: 8 pages, 5 figures

arXiv:2406.09628 [pdf]

doi 10.1103/PhysRevB.104.085153

Massive Dirac Fermions and Strong Shubnikov-de Haas Oscillations in Topological Insulator Sm,Fe:Bi2Se3 Single Crystals

Authors: Weiyao Zhao, Chi Xuan Trang, Qile Li, Lei Chen, Zengji Yue, Abdulhakim Bake, Cheng Tan, Lan Wang, Mitchell Nancarrow, Mark Edmonds, David Cortie, Xiaolin Wang

Abstract: Topological insulators (TIs) are emergent materials with unique band structure, which allow the study of quantum effects in solids, as well as contribute to high performance quantum devices. To achieve better TI performance, here we present a co-doping strategy using synergistic rare-earth Sm and transition-metal Fe dopants in Bi2Se3 single crystals, which combines the advantages of both transition-metal-doped TIs (high ferromagnetic ordering temperature and observed QAHE) and rare-earth-doped TIs (large magnetic moments and significant spin-orbit coupling). In the as-grown single crystals, clear evidence of ferromagnetic ordering was observed. Angle-resolved photoemission spectroscopy indicates that the ferromagnetism opens a 44 meV band gap at the surface Dirac point. Moreover, the carrier mobility at 3 K is ~7400 cm$^2$/Vs, and we thus observed an ultra-strong Shubnikov-de Haas oscillation in the longitudinal resistivity, as well as Hall steps in the transverse resistivity below 14 T. Our transport and angle-resolved photoemission spectroscopy results suggest that rare-earth and transition-metal co-doping in the Bi2Se3 system is a promising avenue to implement the quantum anomalous Hall effect, as well as to harness the massive Dirac fermion in electrical devices.

Submitted 13 June, 2024; originally announced June 2024.

Comments: 5 figures

Journal ref: Physical Review B 104, 085153 (2021)

arXiv:2406.09475 [pdf, other]

Search for $X(1870)$ via the decay $J/ψ\to ωK^+ K^-η$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (644 additional authors not shown)

Abstract: Using a sample of $(10087\pm 44)\times10^{6}$ $J/ψ$ events collected by the BESIII detector at the BEPCII collider, we search for the decay $X(1870)\to K^+ K^- η$ via the $J/ψ\to ωK^+ K^- η$ process for the first time. No significant $X(1870)$ signal is observed. The upper limit on the branching fraction of the decay $J/ψ\to ωX(1870) \to ωK^+ K^- η$ is determined to be $9.55\times 10^{-7}$ at the $90\%$ confidence level. In addition, the branching fraction $B(J/ψ\to ωK^+ K^- η)$ is measured to be $(3.33\pm0.02(\rm{stat.})\pm 0.12(\rm{syst.}))\times 10^{-4}$.

Submitted 13 June, 2024; originally announced June 2024.

arXiv:2406.09187 [pdf, other]

GuardAgent: Safeguard LLM Agents by a Guard Agent via Knowledge-Enabled Reasoning

Authors: Zhen Xiang, Linzhi Zheng, Yanjie Li, Junyuan Hong, Qinbin Li, Han Xie, Jiawei Zhang, Zidi Xiong, Chulin Xie, Carl Yang, Dawn Song, Bo Li

Abstract: The rapid advancement of large language models (LLMs) has catalyzed the deployment of LLM-powered agents across numerous applications, raising new concerns regarding their safety and trustworthiness. Existing methods for enhancing the safety of LLMs are not directly transferable to LLM-powered agents due to their diverse objectives and output modalities. In this paper, we propose GuardAgent, the first LLM agent as a guardrail to other LLM agents. Specifically, GuardAgent oversees a target LLM agent by checking whether its inputs/outputs satisfy a set of given guard requests defined by the users. GuardAgent comprises two steps: 1) creating a task plan by analyzing the provided guard requests, and 2) generating guardrail code based on the task plan and executing the code by calling APIs or using external engines. In both steps, an LLM is utilized as the core reasoning component, supplemented by in-context demonstrations retrieved from a memory module. Such knowledge-enabled reasoning allows GuardAgent to understand various textual guard requests and accurately "translate" them into executable code that provides reliable guardrails. Furthermore, GuardAgent is equipped with an extendable toolbox containing functions and APIs and requires no additional LLM training, which underscores its generalization capabilities and low operational overhead. Additionally, we propose two novel benchmarks: an EICU-AC benchmark for assessing privacy-related access control for healthcare agents and a Mind2Web-SC benchmark for safety evaluation for web agents. We show the effectiveness of GuardAgent on these two benchmarks with 98.7% and 90.0% accuracy in moderating invalid inputs and outputs for the two types of agents, respectively. We also show that GuardAgent is able to define novel functions in adaptation to emergent LLM agents and guard requests, which underscores its strong generalization capabilities.

Submitted 13 June, 2024; originally announced June 2024.

arXiv:2406.09135 [pdf, other]

AdaRevD: Adaptive Patch Exiting Reversible Decoder Pushes the Limit of Image Deblurring

Authors: Xintian Mao, Qingli Li, Yan Wang

Abstract: Despite the recent progress in enhancing the efficacy of image deblurring, the limited decoding capability constrains the upper limit of State-Of-The-Art (SOTA) methods. This paper proposes a pioneering work, Adaptive Patch Exiting Reversible Decoder (AdaRevD), to explore their insufficient decoding capability. By inheriting the weights of the well-trained encoder, we refactor a reversible decoder which scales up the single-decoder training to multi-decoder training while remaining GPU memory-friendly. Meanwhile, we show that our reversible structure gradually disentangles high-level degradation degree and low-level blur pattern (the residual between the blurred image and its sharp counterpart) from the compact degradation representation. Besides, due to the spatially-variant motion blur kernels, different blur patches differ in deblurring difficulty. We further introduce a classifier to learn the degradation degree of image patches, enabling them to exit at different sub-decoders for speedup. Experiments show that our AdaRevD pushes the limit of image deblurring, e.g., achieving 34.60 dB in PSNR on the GoPro dataset.

Submitted 13 June, 2024; originally announced June 2024.

Showing 1–50 of 4,202 results for author: Li, Q