Search | arXiv e-print repository

Precision measurement of the branching fraction of \boldmath $J/ψ\rightarrow K^+K^-$ via $ψ(2S)\rightarrow π^+π^-J/ψ$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (604 additional authors not shown)

Abstract: Using a sample of $448.1 \times 10^6$ $ψ(2S)$ events collected with the BESIII detector, we perform a study of the decay $J/ψ\rightarrow K^+K^-$ via $ψ(2S)\rightarrow π^+π^-J/ψ$. The branching fraction of $J/ψ\rightarrow K^+K^-$ is determined to be $\mathcal{B}_{K^+K^-}=(3.072\pm 0.023({\rm stat.})\pm 0.050({\rm syst.}))\times 10^{-4}$, which is consistent with previous measurements but with sig… ▽ More Using a sample of $448.1 \times 10^6$ $ψ(2S)$ events collected with the BESIII detector, we perform a study of the decay $J/ψ\rightarrow K^+K^-$ via $ψ(2S)\rightarrow π^+π^-J/ψ$. The branching fraction of $J/ψ\rightarrow K^+K^-$ is determined to be $\mathcal{B}_{K^+K^-}=(3.072\pm 0.023({\rm stat.})\pm 0.050({\rm syst.}))\times 10^{-4}$, which is consistent with previous measurements but with significantly improved precision. △ Less

Submitted 21 May, 2024; originally announced May 2024.

Comments: to be submitted to PRD

arXiv:2405.12725 [pdf, other]

Nearest is Not Dearest: Towards Practical Defense against Quantization-conditioned Backdoor Attacks

Authors: Boheng Li, Yishuo Cai, Haowei Li, Feng Xue, Zhifeng Li, Yiming Li

Abstract: Model quantization is widely used to compress and accelerate deep neural networks. However, recent studies have revealed the feasibility of weaponizing model quantization via implanting quantization-conditioned backdoors (QCBs). These special backdoors stay dormant on released full-precision models but will come into effect after standard quantization. Due to the peculiarity of QCBs, existing defe… ▽ More Model quantization is widely used to compress and accelerate deep neural networks. However, recent studies have revealed the feasibility of weaponizing model quantization via implanting quantization-conditioned backdoors (QCBs). These special backdoors stay dormant on released full-precision models but will come into effect after standard quantization. Due to the peculiarity of QCBs, existing defenses have minor effects on reducing their threats or are even infeasible. In this paper, we conduct the first in-depth analysis of QCBs. We reveal that the activation of existing QCBs primarily stems from the nearest rounding operation and is closely related to the norms of neuron-wise truncation errors (i.e., the difference between the continuous full-precision weights and its quantized version). Motivated by these insights, we propose Error-guided Flipped Rounding with Activation Preservation (EFRAP), an effective and practical defense against QCBs. Specifically, EFRAP learns a non-nearest rounding strategy with neuron-wise error norm and layer-wise activation preservation guidance, flip** the rounding strategies of neurons crucial for backdoor effects but with minimal impact on clean accuracy. Extensive evaluations on benchmark datasets demonstrate that our EFRAP can defeat state-of-the-art QCB attacks under various settings. Code is available at https://github.com/AntigoneRandy/QuantBackdoor_EFRAP. △ Less

Submitted 21 May, 2024; originally announced May 2024.

Comments: Accepted to CVPR 2024. 19 pages, 9 figures

arXiv:2405.12601 [pdf, other]

FFAM: Feature Factorization Activation Map for Explanation of 3D Detectors

Authors: Shuai Liu, Boyang Li, Zhiyu Fang, Mingyue Cui, Kai Huang

Abstract: LiDAR-based 3D object detection has made impressive progress recently, yet most existing models are black-box, lacking interpretability. Previous explanation approaches primarily focus on analyzing image-based models and are not readily applicable to LiDAR-based 3D detectors. In this paper, we propose a feature factorization activation map (FFAM) to generate high-quality visual explanations for 3D… ▽ More LiDAR-based 3D object detection has made impressive progress recently, yet most existing models are black-box, lacking interpretability. Previous explanation approaches primarily focus on analyzing image-based models and are not readily applicable to LiDAR-based 3D detectors. In this paper, we propose a feature factorization activation map (FFAM) to generate high-quality visual explanations for 3D detectors. FFAM employs non-negative matrix factorization to generate concept activation maps and subsequently aggregates these maps to obtain a global visual explanation. To achieve object-specific visual explanations, we refine the global visual explanation using the feature gradient of a target object. Additionally, we introduce a voxel upsampling strategy to align the scale between the activation map and input point cloud. We qualitatively and quantitatively analyze FFAM with multiple detectors on several datasets. Experimental results validate the high-quality visual explanations produced by FFAM. The Code will be available at \url{https://github.com/Say2L/FFAM.git}. △ Less

Submitted 21 May, 2024; originally announced May 2024.

arXiv:2405.12490 [pdf, other]

Customize Your Own Paired Data via Few-shot Way

Authors: **shu Chen, Bingchuan Li, Miao Hua, Panpan Xu, Qian He

Abstract: Existing solutions to image editing tasks suffer from several issues. Though achieving remarkably satisfying generated results, some supervised methods require huge amounts of paired training data, which greatly limits their usages. The other unsupervised methods take full advantage of large-scale pre-trained priors, thus being strictly restricted to the domains where the priors are trained on and… ▽ More Existing solutions to image editing tasks suffer from several issues. Though achieving remarkably satisfying generated results, some supervised methods require huge amounts of paired training data, which greatly limits their usages. The other unsupervised methods take full advantage of large-scale pre-trained priors, thus being strictly restricted to the domains where the priors are trained on and behaving badly in out-of-distribution cases. The task we focus on is how to enable the users to customize their desired effects through only few image pairs. In our proposed framework, a novel few-shot learning mechanism based on the directional transformations among samples is introduced and expands the learnable space exponentially. Adopting a diffusion model pipeline, we redesign the condition calculating modules in our model and apply several technical improvements. Experimental results demonstrate the capabilities of our method in various cases. △ Less

Submitted 21 May, 2024; originally announced May 2024.

Comments: Accepted by AI4CC CVPR2024 WorkShop

arXiv:2405.12420 [pdf, other]

GarmentDreamer: 3DGS Guided Garment Synthesis with Diverse Geometry and Texture Details

Authors: Boqian Li, Xuan Li, Ying Jiang, Tianyi Xie, Feng Gao, Huamin Wang, Yin Yang, Chenfanfu Jiang

Abstract: Traditional 3D garment creation is labor-intensive, involving sketching, modeling, UV map**, and texturing, which are time-consuming and costly. Recent advances in diffusion-based generative models have enabled new possibilities for 3D garment generation from text prompts, images, and videos. However, existing methods either suffer from inconsistencies among multi-view images or require addition… ▽ More Traditional 3D garment creation is labor-intensive, involving sketching, modeling, UV map**, and texturing, which are time-consuming and costly. Recent advances in diffusion-based generative models have enabled new possibilities for 3D garment generation from text prompts, images, and videos. However, existing methods either suffer from inconsistencies among multi-view images or require additional processes to separate cloth from the underlying human model. In this paper, we propose GarmentDreamer, a novel method that leverages 3D Gaussian Splatting (GS) as guidance to generate wearable, simulation-ready 3D garment meshes from text prompts. In contrast to using multi-view images directly predicted by generative models as guidance, our 3DGS guidance ensures consistent optimization in both garment deformation and texture synthesis. Our method introduces a novel garment augmentation module, guided by normal and RGBA information, and employs implicit Neural Texture Fields (NeTF) combined with Score Distillation Sampling (SDS) to generate diverse geometric and texture details. We validate the effectiveness of our approach through comprehensive qualitative and quantitative experiments, showcasing the superior performance of GarmentDreamer over state-of-the-art alternatives. Our project page is available at: https://xuan-li.github.io/GarmentDreamerDemo/. △ Less

Submitted 20 May, 2024; originally announced May 2024.

arXiv:2405.12296 [pdf, other]

Constraining regularization ambiguities in Loop Quantum Cosmology via CMB

Authors: Bao-Fei Li, Meysam Motaharfar, Parampreet Singh

Abstract: In order to investigate the potential observational signals of different regularization ambiguities in loop quantum cosmological models, we systematically compute and compare the primordial scalar power spectra and the resulting angular power spectra in the standard loop quantum cosmology (LQC) and its Thiemann regularized versions -- modified LQC-I/II (mLQC-I/II), using both the dressed metric an… ▽ More In order to investigate the potential observational signals of different regularization ambiguities in loop quantum cosmological models, we systematically compute and compare the primordial scalar power spectra and the resulting angular power spectra in the standard loop quantum cosmology (LQC) and its Thiemann regularized versions -- modified LQC-I/II (mLQC-I/II), using both the dressed metric and the hybrid approaches. All three loop quantum cosmological models yield a non-singular bounce with a post-bounce physics that converges rapidly in a few Planck seconds. Using Starobinsky potential and the initial conditions for the background dynamics chosen to yield the same inflationary e-foldings, which are fixed to be $65$ in all three LQC models, we require that all three models result in the same scale-invariant regime for the primordial power spectrum with a relative difference of less than one percent. This permits us to explore the differences resulting from the deep Planck regime in the angular power spectrum. For the adiabatic states, our results demonstrate that the angular power spectrum predicted by the hybrid approach has a smaller deviation from the angular power spectrum predicted by the standard $Λ$CDM cosmological model at large angles in comparison with the dressed metric approach for all three models. The angular power spectrum predicted by mLQC-I in both the hybrid and the dressed metric approaches shows the smallest deviation from the one predicted by the standard $Λ$CDM cosmological model at large angular scales, except for the case of fourth order adiabatic initial states in the hybrid approach. On the contrary, mLQC-II results in the largest deviations for the amplitude of the angular power spectrum at large angles and is most disfavored. △ Less

Submitted 20 May, 2024; originally announced May 2024.

Comments: 28 pages, 12 figures

arXiv:2405.11826 [pdf, other]

Data quality control system and long-term performance monitor of the LHAASO-KM2A

Authors: Zhen Cao, F. Aharonian, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, W. Bian, A. V. Bukevich, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, H. X. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. Chen , et al. (263 additional authors not shown)

Abstract: The KM2A is the largest sub-array of the Large High Altitude Air Shower Observatory (LHAASO). It consists of 5216 electromagnetic particle detectors (EDs) and 1188 muon detectors (MDs). The data recorded by the EDs and MDs are used to reconstruct primary information of cosmic ray and gamma-ray showers. This information is used for physical analysis in gamma-ray astronomy and cosmic ray physics. To… ▽ More The KM2A is the largest sub-array of the Large High Altitude Air Shower Observatory (LHAASO). It consists of 5216 electromagnetic particle detectors (EDs) and 1188 muon detectors (MDs). The data recorded by the EDs and MDs are used to reconstruct primary information of cosmic ray and gamma-ray showers. This information is used for physical analysis in gamma-ray astronomy and cosmic ray physics. To ensure the reliability of the LHAASO-KM2A data, a three-level quality control system has been established. It is used to monitor the status of detector units, stability of reconstructed parameters and the performance of the array based on observations of the Crab Nebula and Moon shadow. This paper will introduce the control system and its application on the LHAASO-KM2A data collected from August 2021 to July 2023. During this period, the pointing and angular resolution of the array were stable. From the observations of the Moon shadow and Crab Nebula, the results achieved using the two methods are consistent with each other. According to the observation of the Crab Nebula at energies from 25 TeV to 100 TeV, the time averaged pointing errors are estimated to be $-0.003^{\circ} \pm 0.005^{\circ}$ and $0.001^{\circ} \pm 0.006^{\circ}$ in the R.A. and Dec directions, respectively. △ Less

Submitted 13 June, 2024; v1 submitted 20 May, 2024; originally announced May 2024.

Comments: 15 pages, 9 figures

arXiv:2405.11585 [pdf, other]

Improved measurement of the branching fraction of $h_{c}\rightarrowγη^\prime/η$ and search for $h_{c}\rightarrowγπ^0$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (645 additional authors not shown)

Abstract: The processes $h_c\rightarrowγP(P = η^\prime,~η,~π^{0}))$ are studied with a sample of $(27.12\pm0.14)\times10^{8}$ $ψ(3686)$ events collected by the BESIII detector at the BEPCII collider. The branching fractions of $h_c\rightarrowγη^\prime$ and $h_c\rightarrowγη$ are measured to be $(1.40\pm0.11\pm0.04\pm0.10)\times10^{-3}$ and $(3.77\pm0.55\pm0.13\pm0.26)\times10^{-4}$, respectively, where the… ▽ More The processes $h_c\rightarrowγP(P = η^\prime,~η,~π^{0}))$ are studied with a sample of $(27.12\pm0.14)\times10^{8}$ $ψ(3686)$ events collected by the BESIII detector at the BEPCII collider. The branching fractions of $h_c\rightarrowγη^\prime$ and $h_c\rightarrowγη$ are measured to be $(1.40\pm0.11\pm0.04\pm0.10)\times10^{-3}$ and $(3.77\pm0.55\pm0.13\pm0.26)\times10^{-4}$, respectively, where the first uncertainties are statistical, the second systematic, and the third from the branching fraction of $ψ(3686)\rightarrowπ^{0}h_c$. The ratio $R_{h_c}=\frac{\mathscr{B}(h_c\rightarrowγη)}{\mathscr{B}(h_c\rightarrowγη^\prime)}$ is calculated to be $(27.0\pm4.4\pm1.0)\%$. The measurements are consistent with the previous results with improved precision by a factor of 2. The results are valuable for gaining a deeper understanding of $η-η^\prime$ mixing, and its manifestation within quantum chromodynamics. No significant signal is found for the decay $h_c\rightarrowγπ^{0}$, and an upper limit is placed on its branching fraction of $\mathscr{B}(h_c\rightarrowγπ^{0})<5.0\times10^{-5}$, at the 90\% confidence level. △ Less

Submitted 19 May, 2024; originally announced May 2024.

arXiv:2405.11390 [pdf, other]

Search for Two-Body $B$ Meson Decays to $Λ^{0}$ and $Ω^{(*)0}_{c}$

Authors: Belle Collaboration, V. Savinov, I. Adachi, J. K. Ahn, H. Aihara, D. M. Asner, H. Atmacan, R. Ayad, Sw. Banerjee, J. Bennett, M. Bessner, V. Bhardwaj, D. Biswas, A. Bobrov, D. Bodrov, J. Borah, M. Bračko, P. Branchini, T. E. Browder, A. Budano, D. Červenkov, M. -C. Chang, P. Chang, B. G. Cheon, K. Cho , et al. (124 additional authors not shown)

Abstract: We report the results of the first search for Standard Model and baryon-number-violating two-body decays of the neutral $B$ mesons to $Λ^{0}$ and $Ω^{(*)0}_c$ using 711~${\rm fb^{-1}}$ of data collected at the $Υ(4S)$ resonance with the Belle detector at the KEKB asymmetric-energy $e^+ e^-$ collider. We observe no evidence of signal from any such decays and set 95\% confidence-level upper limits o… ▽ More We report the results of the first search for Standard Model and baryon-number-violating two-body decays of the neutral $B$ mesons to $Λ^{0}$ and $Ω^{(*)0}_c$ using 711~${\rm fb^{-1}}$ of data collected at the $Υ(4S)$ resonance with the Belle detector at the KEKB asymmetric-energy $e^+ e^-$ collider. We observe no evidence of signal from any such decays and set 95\% confidence-level upper limits on the products of $B^0$ and $\bar{B}^0$ branching fractions for these two-body decays with $\mathcal{B}(Ω_{c}^{0} \to π^+ Ω^-)$ in the range between 9.5~$\times 10^{-8}$ and 31.2~$\times 10^{-8}$. △ Less

Submitted 18 May, 2024; originally announced May 2024.

Comments: 6 pages, 2 figures, submitted to PRD(L)

Report number: Belle Preprint 2024-04, KEK Preprint 2024-5

arXiv:2405.10990 [pdf, ps, other]

Linear canonical space-time transform and convolution theorems

Authors: Yi-Qiao Xu, Bing-Zhao Li

Abstract: Following the idea of the fractional space-time Fourier transform, a linear canonical space-time transform for 16-dimensional space-time $C\ell_{3,1}$-valued signals is investigated in this paper. First, the definition of the proposed linear canonical space-time transform is given, and some related properties of this transform are obtained. Second, the convolution operator and the corresponding co… ▽ More Following the idea of the fractional space-time Fourier transform, a linear canonical space-time transform for 16-dimensional space-time $C\ell_{3,1}$-valued signals is investigated in this paper. First, the definition of the proposed linear canonical space-time transform is given, and some related properties of this transform are obtained. Second, the convolution operator and the corresponding convolution theorem are proposed. Third, the convolution theorem associated with the two-sided linear canonical space-time transform is derived. △ Less

Submitted 16 May, 2024; originally announced May 2024.

Comments: 18 pages

arXiv:2405.10835 [pdf, other]

doi 10.1145/3626772.3661356

A Unified Search and Recommendation Framework Based on Multi-Scenario Learning for Ranking in E-commerce

Authors: **han Liu, Qiyu Chen, Junjie Xu, Junjie Li, Baoli Li, Sulong Xu

Abstract: Search and recommendation (S&R) are the two most important scenarios in e-commerce. The majority of users typically interact with products in S&R scenarios, indicating the need and potential for joint modeling. Traditional multi-scenario models use shared parameters to learn the similarity of multiple tasks, and task-specific parameters to learn the divergence of individual tasks. This coarse-grai… ▽ More Search and recommendation (S&R) are the two most important scenarios in e-commerce. The majority of users typically interact with products in S&R scenarios, indicating the need and potential for joint modeling. Traditional multi-scenario models use shared parameters to learn the similarity of multiple tasks, and task-specific parameters to learn the divergence of individual tasks. This coarse-grained modeling approach does not effectively capture the differences between S&R scenarios. Furthermore, this approach does not sufficiently exploit the information across the global label space. These issues can result in the suboptimal performance of multi-scenario models in handling both S&R scenarios. To address these issues, we propose an effective and universal framework for Unified Search and Recommendation (USR), designed with S&R Views User Interest Extractor Layer (IE) and S&R Views Feature Generator Layer (FG) to separately generate user interests and scenario-agnostic feature representations for S&R. Next, we introduce a Global Label Space Multi-Task Layer (GLMT) that uses global labels as supervised signals of auxiliary tasks and jointly models the main task and auxiliary tasks using conditional probability. Extensive experimental evaluations on real-world industrial datasets show that USR can be applied to various multi-scenario models and significantly improve their performance. Online A/B testing also indicates substantial performance gains across multiple metrics. Currently, USR has been successfully deployed in the 7Fresh App. △ Less

Submitted 11 June, 2024; v1 submitted 17 May, 2024; originally announced May 2024.

Comments: Accepted by SIGIR 2024

arXiv:2405.10163 [pdf]

Electrically Injected mid-infrared GeSn laser on Si operating at 140 K

Authors: Sudip Acharya, Hryhorii Stanchu, Rajesh Kumar, Solomon Ojo, Mourad Benamara, Guo-En Chang, Baohua Li, Wei Du, Shui-Qing Yu

Abstract: Owing to its true direct bandgap and tunable bandgap energies,GeSn alloys are increasingly attractive as gain media for mid-IR lasers that can be monolithically integrated on Si. Demonstrations of optically pumped GeSn laser at room under pulsed condition and at cryogenic temperature under continuous-wave excitation show great promise of GeSn lasers to be efficient electrically injected light sour… ▽ More Owing to its true direct bandgap and tunable bandgap energies,GeSn alloys are increasingly attractive as gain media for mid-IR lasers that can be monolithically integrated on Si. Demonstrations of optically pumped GeSn laser at room under pulsed condition and at cryogenic temperature under continuous-wave excitation show great promise of GeSn lasers to be efficient electrically injected light sources on Si. Here we report electrically injected GeSn lasers using Fabry-Perot cavity with 20, 40, and 80 micron ridge widths. A maximum operating temperature of 140 K with lasing threshold of 0.756 kA/cm2 at 77 K and emitting wavelength of 2722 nm at 140 K was obtained. The lower threshold current density compared to previous works was achieved by reducing optical loss and improving the optical confinement. The peak power was measured as 2.2 mW/facet at 77 K. △ Less

Submitted 16 May, 2024; originally announced May 2024.

arXiv:2405.10128 [pdf, other]

Red Teaming Language Models for Contradictory Dialogues

Authors: Xiaofei Wen, Bangzheng Li, Tenghao Huang, Muhao Chen

Abstract: Most language models currently available are prone to self-contradiction during dialogues. To mitigate this issue, this study explores a novel contradictory dialogue processing task that aims to detect and modify contradictory statements in a conversation. This task is inspired by research on context faithfulness and dialogue comprehension, which have demonstrated that the detection and understand… ▽ More Most language models currently available are prone to self-contradiction during dialogues. To mitigate this issue, this study explores a novel contradictory dialogue processing task that aims to detect and modify contradictory statements in a conversation. This task is inspired by research on context faithfulness and dialogue comprehension, which have demonstrated that the detection and understanding of contradictions often necessitate detailed explanations. We develop a dataset comprising contradictory dialogues, in which one side of the conversation contradicts itself. Each dialogue is accompanied by an explanatory label that highlights the location and details of the contradiction. With this dataset, we present a Red Teaming framework for contradictory dialogue processing. The framework detects and attempts to explain the dialogue, then modifies the existing contradictory content using the explanation. Our experiments demonstrate that the framework improves the ability to detect contradictory dialogues and provides valid explanations. Additionally, it showcases distinct capabilities for modifying such dialogues. Our study highlights the importance of the logical inconsistency problem in conversational AI. △ Less

Submitted 16 May, 2024; v1 submitted 16 May, 2024; originally announced May 2024.

Comments: 18 pages, 5 figures

arXiv:2405.09787 [pdf, other]

Analysis of the BraTS 2023 Intracranial Meningioma Segmentation Challenge

Authors: Dominic LaBella, Ujjwal Baid, Omaditya Khanna, Shan McBurney-Lin, Ryan McLean, Pierre Nedelec, Arif Rashid, Nourel Hoda Tahon, Talissa Altes, Radhika Bhalerao, Yaseen Dhemesh, Devon Godfrey, Fathi Hilal, Scott Floyd, Anastasia Janas, Anahita Fathi Kazerooni, John Kirkpatrick, Collin Kent, Florian Kofler, Kevin Leu, Nazanin Maleki, Bjoern Menze, Maxence Pajot, Zachary J. Reitman, Jeffrey D. Rudie , et al. (96 additional authors not shown)

Abstract: We describe the design and results from the BraTS 2023 Intracranial Meningioma Segmentation Challenge. The BraTS Meningioma Challenge differed from prior BraTS Glioma challenges in that it focused on meningiomas, which are typically benign extra-axial tumors with diverse radiologic and anatomical presentation and a propensity for multiplicity. Nine participating teams each developed deep-learning… ▽ More We describe the design and results from the BraTS 2023 Intracranial Meningioma Segmentation Challenge. The BraTS Meningioma Challenge differed from prior BraTS Glioma challenges in that it focused on meningiomas, which are typically benign extra-axial tumors with diverse radiologic and anatomical presentation and a propensity for multiplicity. Nine participating teams each developed deep-learning automated segmentation models using image data from the largest multi-institutional systematically expert annotated multilabel multi-sequence meningioma MRI dataset to date, which included 1000 training set cases, 141 validation set cases, and 283 hidden test set cases. Each case included T2, T2/FLAIR, T1, and T1Gd brain MRI sequences with associated tumor compartment labels delineating enhancing tumor, non-enhancing tumor, and surrounding non-enhancing T2/FLAIR hyperintensity. Participant automated segmentation models were evaluated and ranked based on a scoring system evaluating lesion-wise metrics including dice similarity coefficient (DSC) and 95% Hausdorff Distance. The top ranked team had a lesion-wise median dice similarity coefficient (DSC) of 0.976, 0.976, and 0.964 for enhancing tumor, tumor core, and whole tumor, respectively and a corresponding average DSC of 0.899, 0.904, and 0.871, respectively. These results serve as state-of-the-art benchmarks for future pre-operative meningioma automated segmentation algorithms. Additionally, we found that 1286 of 1424 cases (90.3%) had at least 1 compartment voxel abutting the edge of the skull-stripped image edge, which requires further investigation into optimal pre-processing face anonymization steps. △ Less

Submitted 15 May, 2024; originally announced May 2024.

Comments: 16 pages, 11 tables, 10 figures, MICCAI

arXiv:2405.09762 [pdf, other]

Reassessment of the dipole in the distribution of quasars on the sky

Authors: Arefe Abghari, Emory F. Bunn, Lukas T. Hergt, Boris Li, Douglas Scott, Raelyn M. Sullivan, Dingchen Wei

Abstract: We investigate claims of an anomalously large amplitude of the dipole in the distribution of quasars on the sky. Two main issues indicate that the systematic uncertainties in the derived quasar-density dipole are underestimated. Firstly, the spatial distribution of the quasars is not a pure dipole, possessing low-order multipoles of comparable size to the dipole. These multipoles are unexpected an… ▽ More We investigate claims of an anomalously large amplitude of the dipole in the distribution of quasars on the sky. Two main issues indicate that the systematic uncertainties in the derived quasar-density dipole are underestimated. Firstly, the spatial distribution of the quasars is not a pure dipole, possessing low-order multipoles of comparable size to the dipole. These multipoles are unexpected and presumably caused by unknown systematic effects; we cannot be confident that the dipole amplitude is not also affected by the same systematics until the origin of these fluctuations is understood. Secondly, the 50 percent sky cut associated with the quasar catalogue strongly couples the multipoles, meaning that the power estimate at ell=1 contains significant contributions from ell>1. In particular, the dominant quadrupole mode in the Galactic mask strongly couples the dipole with the octupole, leading to a large uncertainty in the dipole amplitude. Together these issues mean that the dipole in the quasar catalogue has an uncertainty large enough that consistency with the cosmic microwave background (CMB) dipole cannot be ruled out. More generally, current data sets are insufficiently clean to robustly measure the quasar dipole and future studies will require samples that are larger (preferably covering more of the sky) and free of systematic effects to make strong claims regarding their consistency with the CMB dipole. △ Less

Submitted 15 May, 2024; originally announced May 2024.

Comments: 19 pages, 9 figures

arXiv:2405.09761 [pdf, other]

Receptivity of a supersonic jet due to acoustic excitations near the nozzle lip

Authors: Binhong Li, Sicheng Zhang, Benshuai Lyu

Abstract: In this paper, we develop an analytical model to investigate the generation of instability waves triggered by the upstream acoustic forcing near the nozzle lip of a supersonic jet. This represents an important stage, i.e. the jet receptivity, of the screech feedback loop. The upstream acoustic forcing, resulting from the shock-instability interaction, reaches the nozzle lip and excites new shear-l… ▽ More In this paper, we develop an analytical model to investigate the generation of instability waves triggered by the upstream acoustic forcing near the nozzle lip of a supersonic jet. This represents an important stage, i.e. the jet receptivity, of the screech feedback loop. The upstream acoustic forcing, resulting from the shock-instability interaction, reaches the nozzle lip and excites new shear-layer instability waves. To obtain the newly-excited instability wave, we first determine the scattered sound field due to the upstream forcing using the Wiener-Hopf technique, with the kernel function factored using asymptotic expansions and overlap** approximations. Subsequently, the unsteady Kutta condition is imposed at the nozzle lip, enabling the derivation of the dispersion relation for the newly-excited instability wave. A linear transfer function between the upstream forcing and the newly-excited instability wave is obtained. We calculate the amplitude and phase delay in this receptivity process and examine their variations against the frequency. The phase delay enables us to re-evaluate the phase condition for jet screech and propose a new frequency prediction model. The new model shows improved agreement between the predicted screech frequencies and the experimental data compared to classical models. It is hoped that this model may help in develo** a full screech model. △ Less

Submitted 15 May, 2024; originally announced May 2024.

Comments: 33 pages, 11 figures

arXiv:2405.09066 [pdf, other]

Search for the leptonic decays $D^{*+}\to e^+ν_e$ and $D^{*+}\to μ^+ν_μ$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, M. Albrecht, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, R. Baldini Ferroli, I. Balossino, Y. Ban, V. Batozskaya, D. Becker, K. Begzsuren, N. Berger, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, J. Bloms, A. Bortone, I. Boyko , et al. (559 additional authors not shown)

Abstract: We present the first search for the leptonic decays $D^{*+}\to e^+ν_e$ and $D^{*+}\to μ^+ν_μ$ by analyzing a data sample of electron-positron collisions recorded with the BESIII detector at center-of-mass energies between 4.178 and 4.226 GeV, corresponding to an integrated luminosity of 6.32~fb$^{-1}$. No significant signal is observed. The upper limits on the branching fractions for… ▽ More We present the first search for the leptonic decays $D^{*+}\to e^+ν_e$ and $D^{*+}\to μ^+ν_μ$ by analyzing a data sample of electron-positron collisions recorded with the BESIII detector at center-of-mass energies between 4.178 and 4.226 GeV, corresponding to an integrated luminosity of 6.32~fb$^{-1}$. No significant signal is observed. The upper limits on the branching fractions for $D^{*+}\to e^+ν_e$ and $D^{*+}\to μ^+ν_μ$ are set to be $1.1 \times 10^{-5}$ and $4.3 \times 10^{-6}$ at 90\% confidence level, respectively. △ Less

Submitted 14 May, 2024; originally announced May 2024.

Comments: 14 pages, 7 figures

arXiv:2405.08674 [pdf, other]

Expensive Multi-Objective Bayesian Optimization Based on Diffusion Models

Authors: Bingdong Li, Zixiang Di, Yongfan Lu, Hong Qian, Feng Wang, Peng Yang, Ke Tang, Aimin Zhou

Abstract: Multi-objective Bayesian optimization (MOBO) has shown promising performance on various expensive multi-objective optimization problems (EMOPs). However, effectively modeling complex distributions of the Pareto optimal solutions is difficult with limited function evaluations. Existing Pareto set learning algorithms may exhibit considerable instability in such expensive scenarios, leading to signif… ▽ More Multi-objective Bayesian optimization (MOBO) has shown promising performance on various expensive multi-objective optimization problems (EMOPs). However, effectively modeling complex distributions of the Pareto optimal solutions is difficult with limited function evaluations. Existing Pareto set learning algorithms may exhibit considerable instability in such expensive scenarios, leading to significant deviations between the obtained solution set and the Pareto set (PS). In this paper, we propose a novel Composite Diffusion Model based Pareto Set Learning algorithm, namely CDM-PSL, for expensive MOBO. CDM-PSL includes both unconditional and conditional diffusion model for generating high-quality samples. Besides, we introduce an information entropy based weighting method to balance different objectives of EMOPs. This method is integrated with the guiding strategy, ensuring that all the objectives are appropriately balanced and given due consideration during the optimization process; Extensive experimental results on both synthetic benchmarks and real-world problems demonstrates that our proposed algorithm attains superior performance compared with various state-of-the-art MOBO algorithms. △ Less

Submitted 14 May, 2024; originally announced May 2024.

arXiv:2405.08633 [pdf, other]

On the superconducting gap structure of the miassite Rh17S15: Nodal or nodeless?

Authors: J. Y. Nie, C. C. Zhao, C. Q. Xu, B. Li, C. P. Tu, X. Zhang, D. Z. Dai, H. R. Wang, S. Xu, Wenhe Jiao, B. M. Wang, Zhu'an Xu, Xiaofeng Xu, S. Y. Li

Abstract: Recent penetration depth measurement claimed the observation of unconventional superconductivity in the miassite Rh$_{17}$S$_{15}$ single crystals, evidenced by the linear-in-temperature penetration depth at low temperatures, thereby arguing for the presence of the lines of node in its superconducting gap structure. Here we measure the thermal conductivity of Rh$_{17}$S$_{15}$ single crystals down… ▽ More Recent penetration depth measurement claimed the observation of unconventional superconductivity in the miassite Rh$_{17}$S$_{15}$ single crystals, evidenced by the linear-in-temperature penetration depth at low temperatures, thereby arguing for the presence of the lines of node in its superconducting gap structure. Here we measure the thermal conductivity of Rh$_{17}$S$_{15}$ single crystals down to 110 mK and up to a field of 8 T ($\simeq 0.4H{\rm_{c2}}$). In marked contrast to the penetration depth measurement, we observe a negligible residual linear term $κ_0/T$ in zero field, in line with the nodeless gap structure. The field dependence of $κ_0(H)/T$ shows a profile that is more consistent with either a highly anisotropic gap structure or multiple nodeless gaps with significantly different magnitudes. Moreover, first-principles calculations give two electronic bands with complex shape of Fermi surfaces. These results suggest multigap nodeless superconductivity in this multiband Rh$_{17}$S$_{15}$ superconductor. △ Less

Submitted 14 May, 2024; originally announced May 2024.

Comments: 7 pages, 6 figures

arXiv:2405.08604 [pdf, other]

Towards Geometry-Aware Pareto Set Learning for Neural Multi-Objective Combinatorial Optimization

Authors: Yongfan Lu, Zixiang Di, Bingdong Li, Shengcai Liu, Hong Qian, Peng Yang, Ke Tang, Aimin Zhou

Abstract: Multi-objective combinatorial optimization (MOCO) problems are prevalent in various real-world applications. Most existing neural MOCO methods rely on problem decomposition to transform an MOCO problem into a series of singe-objective combinatorial optimization (SOCO) problems. However, these methods often approximate partial regions of the Pareto front and spend excessive time on diversity enhanc… ▽ More Multi-objective combinatorial optimization (MOCO) problems are prevalent in various real-world applications. Most existing neural MOCO methods rely on problem decomposition to transform an MOCO problem into a series of singe-objective combinatorial optimization (SOCO) problems. However, these methods often approximate partial regions of the Pareto front and spend excessive time on diversity enhancement because of ambiguous decomposition and time-consuming precise hypervolume calculation. To address these limitations, we design a Geometry-Aware Pareto set Learning algorithm named GAPL, which provides a novel geometric perspective for neural MOCO via a Pareto attention model based on hypervolume expectation maximization. In addition, we propose a hypervolume residual update strategy to enable the Pareto attention model to capture both local and non-local information of the Pareto set/front. We also design a novel inference approach to further improve quality of the solution set and speed up hypervolume calculation. Experimental results on three classic MOCO problems demonstrate that our GAPL outperforms several state-of-the-art baselines via superior decomposition and efficient diversity enhancement. △ Less

Submitted 23 May, 2024; v1 submitted 14 May, 2024; originally announced May 2024.

arXiv:2405.08573 [pdf, other]

ViSTooth: A Visualization Framework for Tooth Segmentation on Panoramic Radiograph

Authors: Shenji Zhu, Miaoxin Hu, Tianya Pan, Yue Hong, Bin Li, Zhiguang Zhou, Ting Xu

Abstract: Tooth segmentation is a key step for computer aided diagnosis of dental diseases. Numerous machine learning models have been employed for tooth segmentation on dental panoramic radiograph. However, it is a difficult task to achieve accurate tooth segmentation due to complex tooth shapes, diverse tooth categories and incomplete sample set for machine learning. In this paper, we propose ViSTooth, a… ▽ More Tooth segmentation is a key step for computer aided diagnosis of dental diseases. Numerous machine learning models have been employed for tooth segmentation on dental panoramic radiograph. However, it is a difficult task to achieve accurate tooth segmentation due to complex tooth shapes, diverse tooth categories and incomplete sample set for machine learning. In this paper, we propose ViSTooth, a visualization framework for tooth segmentation on dental panoramic radiograph. First, we employ Mask R-CNN to conduct preliminary tooth segmentation, and a set of domain metrics are proposed to estimate the accuracy of the segmented teeth, including tooth shape, tooth position and tooth angle. Then, we represent the teeth with high-dimensional vectors and visualize their distribution in a low-dimensional space, in which experts can easily observe those teeth with specific metrics. Further, we expand the sample set with the expert-specified teeth and train the tooth segmentation model iteratively. Finally, we conduct case study and expert study to demonstrate the effectiveness and usability of our ViSTooth, in aiding experts to implement accurate tooth segmentation guided by expert knowledge. △ Less

Submitted 14 May, 2024; originally announced May 2024.

arXiv:2405.07741 [pdf, other]

Search for the radiative transition $χ_{c1}(3872)\toγψ_2(3823)$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko , et al. (635 additional authors not shown)

Abstract: Using 9.0 $\rm fb^{-1}$ of $e^+e^-$ collision data collected at center-of-mass energies from 4.178 to 4.278 GeV with the BESIII detector at the BEPCII collider, we perform the first search for the radiative transition $χ_{c1}(3872)\toγψ_2(3823)$. No $χ_{c1}(3872)\toγψ_2(3823)$ signal is observed. The upper limit on the ratio of branching fractions… ▽ More Using 9.0 $\rm fb^{-1}$ of $e^+e^-$ collision data collected at center-of-mass energies from 4.178 to 4.278 GeV with the BESIII detector at the BEPCII collider, we perform the first search for the radiative transition $χ_{c1}(3872)\toγψ_2(3823)$. No $χ_{c1}(3872)\toγψ_2(3823)$ signal is observed. The upper limit on the ratio of branching fractions $\mathcal{B}(χ_{c1}(3872)\toγψ_2(3823), ψ_2(3823)\toγχ_{c1})/\mathcal{B}(χ_{c1}(3872)\toπ^+π^- J/ψ)$ is set as 0.075 at the 90\% confidence level. Our result contradicts theoretical predictions under the assumption that the $χ_{c1}(3872)$ is the pure charmonium state $χ_{c1}(2P)$. △ Less

Submitted 13 May, 2024; originally announced May 2024.

Comments: 8 pages, 2 figures

arXiv:2405.07691 [pdf, other]

Discovery of Very-high-energy Gamma-ray Emissions from the Low Luminosity AGN NGC 4278 by LHAASO

Authors: Zhen Cao, F. Aharonian, Q. An, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen , et al. (255 additional authors not shown)

Abstract: The first source catalog of Large High Altitude Air Shower Observatory reported the detection of a very-high-energy gamma ray source, 1LHAASO J1219+2915. In this paper a further detailed study of the spectral and temporal behavior of this point-like source have been carried. The best-fit position of the TeV source ($\rm{RA}=185.05^{\circ}\pm0.04^{\circ}$, $\rm{Dec}=29.25^{\circ}\pm0.03^{\circ}$) i… ▽ More The first source catalog of Large High Altitude Air Shower Observatory reported the detection of a very-high-energy gamma ray source, 1LHAASO J1219+2915. In this paper a further detailed study of the spectral and temporal behavior of this point-like source have been carried. The best-fit position of the TeV source ($\rm{RA}=185.05^{\circ}\pm0.04^{\circ}$, $\rm{Dec}=29.25^{\circ}\pm0.03^{\circ}$) is compatible with NGC 4278 within $\sim0.03$ degree. Variation analysis shows an indication of the variability at a few months level in the TeV band, which is consistent with low frequency observations. Based on these observations, we report the detection of TeV $γ$-ray emissions from this low-luminosity AGN NGC 4278. The observations by LHAASO-WCDA during active period has a significance level of 8.8\,$σ$ with best-fit photon spectral index $\varGamma=2.56\pm0.14$ and a flux $f_{1-10\,\rm{TeV}}=(7.0\pm1.1_{\rm{sta}}\pm0.35_{\rm{syst}})\times10^{-13}\,\rm{photons\,cm^{-2}\,s^{-1}}$, or approximately $5\%$ of the Crab Nebula. The discovery of VHE from NGC 4278 indicates that the compact, weak radio jet can efficiently accelerate particles and emit TeV photons. △ Less

Submitted 13 May, 2024; originally announced May 2024.

Comments: 11 pages, 5 figures

arXiv:2405.07386 [pdf, other]

Search for lepton-flavor-violating $τ^- \to μ^-μ^+μ^-$ decays at Belle II

Authors: Belle II Collaboration, I. Adachi, L. Aggarwal, H. Aihara, N. Akopov, A. Aloisio, N. Althubiti, N. Anh Ky, D. M. Asner, H. Atmacan, V. Aushev, M. Aversano, R. Ayad, V. Babu, H. Bae, S. Bahinipati, P. Bambade, Sw. Banerjee, S. Bansal, M. Barrett, J. Baudot, A. Baur, A. Beaubien, F. Becherer, J. Becker , et al. (407 additional authors not shown)

Abstract: We present the result of a search for the charged-lepton-flavor violating decay $τ^- \to μ^-μ^+μ^-$ using a $424fb^{-1}$ sample of data recorded by the Belle II experiment at the SuperKEKB $e^{-}e^{+}$ collider. The selection of $e^{-}e^{+}\toτ^+τ^-$ events is based on an inclusive reconstruction of the non-signal tau decay, and on a boosted decision tree to suppress background. We observe one sig… ▽ More We present the result of a search for the charged-lepton-flavor violating decay $τ^- \to μ^-μ^+μ^-$ using a $424fb^{-1}$ sample of data recorded by the Belle II experiment at the SuperKEKB $e^{-}e^{+}$ collider. The selection of $e^{-}e^{+}\toτ^+τ^-$ events is based on an inclusive reconstruction of the non-signal tau decay, and on a boosted decision tree to suppress background. We observe one signal candidate, which is compatible with the expectation from background processes. We set a $90\%$ confidence level upper limit of $1.9 \times 10^{-8}$ on the branching fraction of the \taumu decay, which is the most stringent bound to date. △ Less

Submitted 12 May, 2024; originally announced May 2024.

Report number: Belle II Preprint 2024-012 KEK Preprint 2024-6

arXiv:2405.07303 [pdf, other]

Search for solar axions by Primakoff effect with the full dataset of the CDEX-1B Experiment

Authors: L. T. Yang, S. K. Liu, Q. Yue, K. J. Kang, Y. J. Li, H. P. An, Greeshma C., J. P. Chang, Y. H. Chen, J. P. Cheng, W. H. Dai, Z. Deng, C. H. Fang, X. P. Geng, H. Gong, Q. J. Guo, T. Guo, X. Y. Guo, L. He, J. R. He, J. W. Hu, H. X. Huang, T. C. Huang, L. Jiang, S. Karmakar , et al. (61 additional authors not shown)

Abstract: We present the first limit on $g_{Aγ}$ coupling constant using the Bragg-Primakoff conversion based on an exposure of 1107.5 kg days of data from the CDEX-1B experiment at the China **** Underground Laboratory. The data are consistent with the null signal hypothesis, and no excess signals are observed. Limits of the coupling $g_{Aγ}<2.08\times10^{-9}$ GeV$^{-1}$ (95\% C.L.) are derived for axio… ▽ More We present the first limit on $g_{Aγ}$ coupling constant using the Bragg-Primakoff conversion based on an exposure of 1107.5 kg days of data from the CDEX-1B experiment at the China **** Underground Laboratory. The data are consistent with the null signal hypothesis, and no excess signals are observed. Limits of the coupling $g_{Aγ}<2.08\times10^{-9}$ GeV$^{-1}$ (95\% C.L.) are derived for axions with mass up to 100 eV/$c^2$. Within the hadronic model of KSVZ, our results exclude axion mass $>5.3~\rm{eV}/c^2$ at 95\% C.L. △ Less

Submitted 12 May, 2024; originally announced May 2024.

Comments: 7 pages, 5 figures

arXiv:2405.06413 [pdf, other]

Multi-level Personalized Federated Learning on Heterogeneous and Long-Tailed Data

Authors: Rongyu Zhang, Yun Chen, Chenrui Wu, Fangxin Wang, Bo Li

Abstract: Federated learning (FL) offers a privacy-centric distributed learning framework, enabling model training on individual clients and central aggregation without necessitating data exchange. Nonetheless, FL implementations often suffer from non-i.i.d. and long-tailed class distributions across mobile applications, e.g., autonomous vehicles, which leads models to overfitting as local training may conv… ▽ More Federated learning (FL) offers a privacy-centric distributed learning framework, enabling model training on individual clients and central aggregation without necessitating data exchange. Nonetheless, FL implementations often suffer from non-i.i.d. and long-tailed class distributions across mobile applications, e.g., autonomous vehicles, which leads models to overfitting as local training may converge to sub-optimal. In our study, we explore the impact of data heterogeneity on model bias and introduce an innovative personalized FL framework, Multi-level Personalized Federated Learning (MuPFL), which leverages the hierarchical architecture of FL to fully harness computational resources at various levels. This framework integrates three pivotal modules: Biased Activation Value Dropout (BAVD) to mitigate overfitting and accelerate training; Adaptive Cluster-based Model Update (ACMU) to refine local models ensuring coherent global aggregation; and Prior Knowledge-assisted Classifier Fine-tuning (PKCF) to bolster classification and personalize models in accord with skewed local data with shared knowledge. Extensive experiments on diverse real-world datasets for image classification and semantic segmentation validate that MuPFL consistently outperforms state-of-the-art baselines, even under extreme non-i.i.d. and long-tail conditions, which enhances accuracy by as much as 7.39% and accelerates training by up to 80% at most, marking significant advancements in both efficiency and effectiveness. △ Less

Submitted 10 May, 2024; originally announced May 2024.

Comments: 14 pages, 10 figures

arXiv:2405.06393 [pdf, other]

Measurement of the ${e}^{+}{e}^{-}\to p \bar{p}π^{0}$ cross section at $\sqrt{s}=2.1000-3.0800$ GeV

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (639 additional authors not shown)

Abstract: The process $e^{+}e^{-}\to p\bar{p}π^{0}$ is studied at 20 center-of-mass energies ranging from 2.1000 to 3.0800 GeV using 636.8 pb$^{-1}$ of data collected with the BESIII detector operating at the BEPCII collider. The Born cross sections for $e^{+}e^{-}\to p\bar{p}π^{0}$ are measured with high precision. Since the lowest center-of-mass energy, 2.1000 GeV, is less than 90 MeV above the… ▽ More The process $e^{+}e^{-}\to p\bar{p}π^{0}$ is studied at 20 center-of-mass energies ranging from 2.1000 to 3.0800 GeV using 636.8 pb$^{-1}$ of data collected with the BESIII detector operating at the BEPCII collider. The Born cross sections for $e^{+}e^{-}\to p\bar{p}π^{0}$ are measured with high precision. Since the lowest center-of-mass energy, 2.1000 GeV, is less than 90 MeV above the $p\bar{p}π^0$ energy threshold, we can probe the threshold behavior for this reaction. However, no anomalous threshold enhancement is found in the cross sections for $e^{+}e^{-}\to p\bar{p}π^{0}$. △ Less

Submitted 10 May, 2024; originally announced May 2024.

arXiv:2405.05413 [pdf]

Digital Evolution: Novo Nordisk's Shift to Ontology-Based Data Management

Authors: Shawn Zheng Kai Tan, Shounak Baksi, Thomas Gade Bjerregaard, Preethi Elangovan, Thrishna Kuttikattu Gopalakrishnan, Darko Hric, Joffrey Joumaa, Beidi Li, Kashif Rabbani, Santhosh Kannan Venkatesan, Joshua Daniel Valdez, Saritha Vettikunnel Kuriakose

Abstract: Biomedical data is growing exponentially, and managing it is increasingly challenging. While Findable, Accessible, Interoperable and Reusable (FAIR) data principles provide guidance, their adoption has proven difficult, especially in larger enterprises like pharmaceutical companies. In this manuscript, we describe how we leverage an Ontology-Based Data Management (OBDM) strategy for digital transf… ▽ More Biomedical data is growing exponentially, and managing it is increasingly challenging. While Findable, Accessible, Interoperable and Reusable (FAIR) data principles provide guidance, their adoption has proven difficult, especially in larger enterprises like pharmaceutical companies. In this manuscript, we describe how we leverage an Ontology-Based Data Management (OBDM) strategy for digital transformation in Novo Nordisk Research & Early Development. Here, we include both our technical blueprint and our approach for organizational change management. We further discuss how such an OBDM ecosystem plays a pivotal role in the organizations digital aspirations for data federation and discovery fuelled by artificial intelligence. Our aim for this paper is to share the lessons learned in order to foster dialogue with parties navigating similar waters while collectively advancing the efforts in the fields of data management, semantics and data driven drug discovery. △ Less

Submitted 10 May, 2024; v1 submitted 8 May, 2024; originally announced May 2024.

Comments: 14 pages, 2 figures

arXiv:2405.04543 [pdf, other]

The Phase Transition of Reissner-Nordström Black Holes

Authors: Bobin Li

Abstract: Under the framework of thermodynamics, the phase transition of the black hole is a general issue in general relativity. In this work, the phase transition of charged black holes is discussed carefully. The metric tensor of thermodynamics is redefined in the charged black hole, based on the Ruppeiner geometry. With the well-defined metric tensor of thermodynamics, the scalar curvature of the charge… ▽ More Under the framework of thermodynamics, the phase transition of the black hole is a general issue in general relativity. In this work, the phase transition of charged black holes is discussed carefully. The metric tensor of thermodynamics is redefined in the charged black hole, based on the Ruppeiner geometry. With the well-defined metric tensor of thermodynamics, the scalar curvature of the charged black hole is obtained. It is indicated that the scalar curvature is diverged and infinite when the mass M or charge Q are set by some values, and it is shown that the charged black hole suffers from a phase transition. At the same time, there is a phase transition from small mass to large mass or from small to high charged state. It is shown that the phase transition of a charged black hole is a common and general process and this work is meaningful for the construction of microscopic states of black holes. △ Less

Submitted 29 April, 2024; originally announced May 2024.

arXiv:2405.04479 [pdf]

doi 10.1038/s41586-024-07584-w

Tunable superconductivity in electron- and hole-doped Bernal bilayer graphene

Authors: Chushan Li, Fan Xu, Bohao Li, Jiayi Li, Guoan Li, Kenji Watanabe, Takashi Taniguchi, Bingbing Tong, Jie Shen, Li Lu, **feng Jia, Fengcheng Wu, Xiaoxue Liu, Tingxin Li

Abstract: Graphene-based, high quality two-dimensional electronic systems have emerged as a highly tunable platform for studying superconductivity. Specifically, superconductivity has been observed in both electron-doped and hole-doped twisted graphene moire systems, whereas in crystalline graphene systems, superconductivity has so far only been observed in hole-doped rhombohedral trilayer and hole-doped Be… ▽ More Graphene-based, high quality two-dimensional electronic systems have emerged as a highly tunable platform for studying superconductivity. Specifically, superconductivity has been observed in both electron-doped and hole-doped twisted graphene moire systems, whereas in crystalline graphene systems, superconductivity has so far only been observed in hole-doped rhombohedral trilayer and hole-doped Bernal bilayer graphene (BBG). Recently, enhanced superconductivity has been demonstrated in BBG due to the proximity with a monolayer WSe2. Here, we report the observation of superconductivity and a series of flavor-symmetry-breaking phases in both electron- and hole-doped BBG/WSe2 device by electrostatic do**. The strength of the observed superconductivity is tunable by applied vertical electric fields. The maximum Berezinskii-Kosterlitz-Thouless (BKT) transition temperature for the electron- and hole-doped superconductivity is about 210 mK and 400 mK, respectively. Superconductivities emerge only when applied electric fields drive BBG electron or hole wavefunctions toward the WSe2 layer, underscoring the importance of the WSe2 layer in the observed superconductivity. We find the hole-doped superconductivity violates the Pauli paramagnetic limit, consistent with an Ising-like superconductor. In contrast, the electron-doped superconductivity obeys the Pauli limit, even though the proximity induced Ising spin-orbit coupling is also notable in the conduction band. Our findings highlight the rich physics associated with the conduction band in BBG, paving the way for further studies into the superconducting mechanisms of crystalline graphene and the development of novel superconductor devices based on BBG. △ Less

Submitted 7 May, 2024; originally announced May 2024.

arXiv:2405.04254 [pdf, ps, other]

Distributed variable screening for generalized linear models

Authors: Tianbo Diao, Lianqiang Qu, Bo Li, Liuquan Sun

Abstract: In this article, we develop a distributed variable screening method for generalized linear models. This method is designed to handle situations where both the sample size and the number of covariates are large. Specifically, the proposed method selects relevant covariates by using a sparsity-restricted surrogate likelihood estimator. It takes into account the joint effects of the covariates rather… ▽ More In this article, we develop a distributed variable screening method for generalized linear models. This method is designed to handle situations where both the sample size and the number of covariates are large. Specifically, the proposed method selects relevant covariates by using a sparsity-restricted surrogate likelihood estimator. It takes into account the joint effects of the covariates rather than just the marginal effect, and this characteristic enhances the reliability of the screening results. We establish the sure screening property of the proposed method, which ensures that with a high probability, the true model is included in the selected model. Simulation studies are conducted to evaluate the finite sample performance of the proposed method, and an application to a real dataset showcases its practical utility. △ Less

Submitted 7 May, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

arXiv:2405.03869 [pdf, other]

Outlier Gradient Analysis: Efficiently Improving Deep Learning Model Performance via Hessian-Free Influence Functions

Authors: Anshuman Chhabra, Bo Li, Jian Chen, Prasant Mohapatra, Hongfu Liu

Abstract: Influence functions offer a robust framework for assessing the impact of each training data sample on model predictions, serving as a prominent tool in data-centric learning. Despite their widespread use in various tasks, the strong convexity assumption on the model and the computational cost associated with calculating the inverse of the Hessian matrix pose constraints, particularly when analyzin… ▽ More Influence functions offer a robust framework for assessing the impact of each training data sample on model predictions, serving as a prominent tool in data-centric learning. Despite their widespread use in various tasks, the strong convexity assumption on the model and the computational cost associated with calculating the inverse of the Hessian matrix pose constraints, particularly when analyzing large deep models. This paper focuses on a classical data-centric scenario--trimming detrimental samples--and addresses both challenges within a unified framework. Specifically, we establish an equivalence transformation between identifying detrimental training samples via influence functions and outlier gradient detection. This transformation not only presents a straightforward and Hessian-free formulation but also provides profound insights into the role of the gradient in sample impact. Moreover, it relaxes the convexity assumption of influence functions, extending their applicability to non-convex deep models. Through systematic empirical evaluations, we first validate the correctness of our proposed outlier gradient analysis on synthetic datasets and then demonstrate its effectiveness in detecting mislabeled samples in vision models, selecting data samples for improving performance of transformer models for natural language processing, and identifying influential samples for fine-tuned Large Language Models. △ Less

Submitted 12 May, 2024; v1 submitted 6 May, 2024; originally announced May 2024.

arXiv:2405.03728 [pdf, other]

GLHF: General Learned Evolutionary Algorithm Via Hyper Functions

Authors: Xiaobin Li, Kai Wu, Yujian Betterest Li, Xiaoyu Zhang, Handing Wang, **g Liu

Abstract: Pretrained Optimization Models (POMs) leverage knowledge gained from optimizing various tasks, providing efficient solutions for new optimization challenges through direct usage or fine-tuning. Despite the inefficiencies and limited generalization abilities observed in current POMs, our proposed model, the general pre-trained optimization model (GPOM), addresses these shortcomings. GPOM constructs… ▽ More Pretrained Optimization Models (POMs) leverage knowledge gained from optimizing various tasks, providing efficient solutions for new optimization challenges through direct usage or fine-tuning. Despite the inefficiencies and limited generalization abilities observed in current POMs, our proposed model, the general pre-trained optimization model (GPOM), addresses these shortcomings. GPOM constructs a population-based pretrained Black-Box Optimization (BBO) model tailored for continuous optimization. Evaluation on the BBOB benchmark and two robot control tasks demonstrates that GPOM outperforms other pretrained BBO models significantly, especially for high-dimensional tasks. Its direct optimization performance exceeds that of state-of-the-art evolutionary algorithms and POMs. Furthermore, GPOM exhibits robust generalization capabilities across diverse task distributions, dimensions, population sizes, and optimization horizons. △ Less

Submitted 6 May, 2024; originally announced May 2024.

arXiv:2405.03685 [pdf, other]

Language-Image Models with 3D Understanding

Authors: Jang Hyun Cho, Boris Ivanovic, Yulong Cao, Edward Schmerling, Yue Wang, Xinshuo Weng, Boyi Li, Yurong You, Philipp Krähenbühl, Yan Wang, Marco Pavone

Abstract: Multi-modal large language models (MLLMs) have shown incredible capabilities in a variety of 2D vision and language tasks. We extend MLLMs' perceptual capabilities to ground and reason about images in 3-dimensional space. To that end, we first develop a large-scale pre-training dataset for 2D and 3D called LV3D by combining multiple existing 2D and 3D recognition datasets under a common task formu… ▽ More Multi-modal large language models (MLLMs) have shown incredible capabilities in a variety of 2D vision and language tasks. We extend MLLMs' perceptual capabilities to ground and reason about images in 3-dimensional space. To that end, we first develop a large-scale pre-training dataset for 2D and 3D called LV3D by combining multiple existing 2D and 3D recognition datasets under a common task formulation: as multi-turn question-answering. Next, we introduce a new MLLM named Cube-LLM and pre-train it on LV3D. We show that pure data scaling makes a strong 3D perception capability without 3D specific architectural design or training objective. Cube-LLM exhibits intriguing properties similar to LLMs: (1) Cube-LLM can apply chain-of-thought prompting to improve 3D understanding from 2D context information. (2) Cube-LLM can follow complex and diverse instructions and adapt to versatile input and output formats. (3) Cube-LLM can be visually prompted such as 2D box or a set of candidate 3D boxes from specialists. Our experiments on outdoor benchmarks demonstrate that Cube-LLM significantly outperforms existing baselines by 21.3 points of AP-BEV on the Talk2Car dataset for 3D grounded reasoning and 17.7 points on the DriveLM dataset for complex reasoning about driving scenarios, respectively. Cube-LLM also shows competitive results in general MLLM benchmarks such as refCOCO for 2D grounding with (87.0) average score, as well as visual question answering benchmarks such as VQAv2, GQA, SQA, POPE, etc. for complex reasoning. Our project is available at https://janghyuncho.github.io/Cube-LLM. △ Less

Submitted 6 May, 2024; originally announced May 2024.

Comments: Project page: https://janghyuncho.github.io/Cube-LLM

arXiv:2405.03316 [pdf, other]

Provably Unlearnable Examples

Authors: Derui Wang, Minhui Xue, Bo Li, Seyit Camtepe, Liming Zhu

Abstract: The exploitation of publicly accessible data has led to escalating concerns regarding data privacy and intellectual property (IP) breaches in the age of artificial intelligence. As a strategy to safeguard both data privacy and IP-related domain knowledge, efforts have been undertaken to render shared data unlearnable for unauthorized models in the wild. Existing methods apply empirically optimized… ▽ More The exploitation of publicly accessible data has led to escalating concerns regarding data privacy and intellectual property (IP) breaches in the age of artificial intelligence. As a strategy to safeguard both data privacy and IP-related domain knowledge, efforts have been undertaken to render shared data unlearnable for unauthorized models in the wild. Existing methods apply empirically optimized perturbations to the data in the hope of disrupting the correlation between the inputs and the corresponding labels such that the data samples are converted into Unlearnable Examples (UEs). Nevertheless, the absence of mechanisms that can verify how robust the UEs are against unknown unauthorized models and train-time techniques engenders several problems. First, the empirically optimized perturbations may suffer from the problem of cross-model generalization, which echoes the fact that the unauthorized models are usually unknown to the defender. Second, UEs can be mitigated by train-time techniques such as data augmentation and adversarial training. Furthermore, we find that a simple recovery attack can restore the clean-task performance of the classifiers trained on UEs by slightly perturbing the learned weights. To mitigate the aforementioned problems, in this paper, we propose a mechanism for certifying the so-called $(q, η)$-Learnability of an unlearnable dataset via parametric smoothing. A lower certified $(q, η)$-Learnability indicates a more robust protection over the dataset. Finally, we try to 1) improve the tightness of certified $(q, η)$-Learnability and 2) design Provably Unlearnable Examples (PUEs) which have reduced $(q, η)$-Learnability. According to experimental results, PUEs demonstrate both decreased certified $(q, η)$-Learnability and enhanced empirical robustness compared to existing UEs. △ Less

Submitted 6 May, 2024; originally announced May 2024.

arXiv:2405.03272 [pdf, other]

WorldQA: Multimodal World Knowledge in Videos through Long-Chain Reasoning

Authors: Yuanhan Zhang, Kaichen Zhang, Bo Li, Fanyi Pu, Christopher Arif Setiadharma, **gkang Yang, Ziwei Liu

Abstract: Multimodal information, together with our knowledge, help us to understand the complex and dynamic world. Large language models (LLM) and large multimodal models (LMM), however, still struggle to emulate this capability. In this paper, we present WorldQA, a video understanding dataset designed to push the boundaries of multimodal world models with three appealing properties: (1) Multimodal Inputs:… ▽ More Multimodal information, together with our knowledge, help us to understand the complex and dynamic world. Large language models (LLM) and large multimodal models (LMM), however, still struggle to emulate this capability. In this paper, we present WorldQA, a video understanding dataset designed to push the boundaries of multimodal world models with three appealing properties: (1) Multimodal Inputs: The dataset comprises 1007 question-answer pairs and 303 videos, necessitating the analysis of both auditory and visual data for successful interpretation. (2) World Knowledge: We identify five essential types of world knowledge for question formulation. This approach challenges models to extend their capabilities beyond mere perception. (3) Long-Chain Reasoning: Our dataset introduces an average reasoning step of 4.45, notably surpassing other videoQA datasets. Furthermore, we introduce WorldRetriever, an agent designed to synthesize expert knowledge into a coherent reasoning chain, thereby facilitating accurate responses to WorldQA queries. Extensive evaluations of 13 prominent LLMs and LMMs reveal that WorldRetriever, although being the most effective model, achieved only 70% of humanlevel performance in multiple-choice questions. This finding highlights the necessity for further advancement in the reasoning and comprehension abilities of models. Our experiments also yield several key insights. For instance, while humans tend to perform better with increased frames, current LMMs, including WorldRetriever, show diminished performance under similar conditions. We hope that WorldQA,our methodology, and these insights could contribute to the future development of multimodal world models. △ Less

Submitted 6 May, 2024; originally announced May 2024.

arXiv:2405.03016 [pdf, ps, other]

Pathwise uniform convergence of a full discretization for a three-dimensional stochastic Allen-Cahn equation with multiplicative noise

Authors: Binjie Li, Qin Zhou

Abstract: This paper analyzes a full discretization of a three-dimensional stochastic Allen-Cahn equation with multiplicative noise. The discretization uses the Euler scheme for temporal discretization and the finite element method for spatial discretization. A key contribution of this work is the introduction of a novel stability estimate for a discrete stochastic convolution, which plays a crucial role in… ▽ More This paper analyzes a full discretization of a three-dimensional stochastic Allen-Cahn equation with multiplicative noise. The discretization uses the Euler scheme for temporal discretization and the finite element method for spatial discretization. A key contribution of this work is the introduction of a novel stability estimate for a discrete stochastic convolution, which plays a crucial role in establishing pathwise uniform convergence estimates for fully discrete approximations of nonlinear stochastic parabolic equations. By using this stability estimate in conjunction with the discrete stochastic maximal $L^p$-regularity estimate, the study derives a pathwise uniform convergence rate that encompasses general general spatial $L^q$-norms. Moreover, the theoretical convergence rate is verified by numerical experiments. △ Less

Submitted 18 June, 2024; v1 submitted 5 May, 2024; originally announced May 2024.

MSC Class: 60H35; 35R60; 60H15

arXiv:2405.02933 [pdf, other]

Relay Decoding: Concatenating Large Language Models for Machine Translation

Authors: Chengpeng Fu, Xiaocheng Feng, Yichong Huang, Wenshuai Huo, Baohang Li, Hui Wang, Bin Qin, Ting Liu

Abstract: Leveraging large language models for machine translation has demonstrated promising results. However, it does require the large language models to possess the capability of handling both the source and target languages in machine translation. When it is challenging to find large models that support the desired languages, resorting to continuous learning methods becomes a costly endeavor. To mitiga… ▽ More Leveraging large language models for machine translation has demonstrated promising results. However, it does require the large language models to possess the capability of handling both the source and target languages in machine translation. When it is challenging to find large models that support the desired languages, resorting to continuous learning methods becomes a costly endeavor. To mitigate these expenses, we propose an innovative approach called RD (Relay Decoding), which entails concatenating two distinct large models that individually support the source and target languages. By incorporating a simple map** layer to facilitate the connection between these two models and utilizing a limited amount of parallel data for training, we successfully achieve superior results in the machine translation task. Experimental results conducted on the Multi30k and WikiMatrix datasets validate the effectiveness of our proposed method. △ Less

Submitted 5 May, 2024; originally announced May 2024.

Comments: Work in progress

arXiv:2405.02564 [pdf]

Leveraging the Human Ventral Visual Stream to Improve Neural Network Robustness

Authors: Zhenan Shao, Linjian Ma, Bo Li, Diane M. Beck

Abstract: Human object recognition exhibits remarkable resilience in cluttered and dynamic visual environments. In contrast, despite their unparalleled performance across numerous visual tasks, Deep Neural Networks (DNNs) remain far less robust than humans, showing, for example, a surprising susceptibility to adversarial attacks involving image perturbations that are (almost) imperceptible to humans. Human… ▽ More Human object recognition exhibits remarkable resilience in cluttered and dynamic visual environments. In contrast, despite their unparalleled performance across numerous visual tasks, Deep Neural Networks (DNNs) remain far less robust than humans, showing, for example, a surprising susceptibility to adversarial attacks involving image perturbations that are (almost) imperceptible to humans. Human object recognition likely owes its robustness, in part, to the increasingly resilient representations that emerge along the hierarchy of the ventral visual cortex. Here we show that DNNs, when guided by neural representations from a hierarchical sequence of regions in the human ventral visual stream, display increasing robustness to adversarial attacks. These neural-guided models also exhibit a gradual shift towards more human-like decision-making patterns and develop hierarchically smoother decision surfaces. Importantly, the resulting representational spaces differ in important ways from those produced by conventional smoothing methods, suggesting that such neural-guidance may provide previously unexplored robustness solutions. Our findings support the gradual emergence of human robustness along the ventral visual hierarchy and suggest that the key to DNN robustness may lie in increasing emulation of the human brain. △ Less

Submitted 4 May, 2024; originally announced May 2024.

arXiv:2405.00466 [pdf, other]

Lazy Layers to Make Fine-Tuned Diffusion Models More Traceable

Authors: Haozhe Liu, Wentian Zhang, Bing Li, Bernard Ghanem, Jürgen Schmidhuber

Abstract: Foundational generative models should be traceable to protect their owners and facilitate safety regulation. To achieve this, traditional approaches embed identifiers based on supervisory trigger-response signals, which are commonly known as backdoor watermarks. They are prone to failure when the model is fine-tuned with nontrigger data. Our experiments show that this vulnerability is due to energ… ▽ More Foundational generative models should be traceable to protect their owners and facilitate safety regulation. To achieve this, traditional approaches embed identifiers based on supervisory trigger-response signals, which are commonly known as backdoor watermarks. They are prone to failure when the model is fine-tuned with nontrigger data. Our experiments show that this vulnerability is due to energetic changes in only a few 'busy' layers during fine-tuning. This yields a novel arbitrary-in-arbitrary-out (AIAO) strategy that makes watermarks resilient to fine-tuning-based removal. The trigger-response pairs of AIAO samples across various neural network depths can be used to construct watermarked subpaths, employing Monte Carlo sampling to achieve stable verification results. In addition, unlike the existing methods of designing a backdoor for the input/output space of diffusion models, in our method, we propose to embed the backdoor into the feature space of sampled subpaths, where a mask-controlled trigger function is proposed to preserve the generation performance and ensure the invisibility of the embedded backdoor. Our empirical studies on the MS-COCO, AFHQ, LSUN, CUB-200, and DreamBooth datasets confirm the robustness of AIAO; while the verification rates of other trigger-based methods fall from ~90% to ~70% after fine-tuning, those of our method remain consistently above 90%. △ Less

Submitted 1 May, 2024; originally announced May 2024.

arXiv:2405.00266 [pdf, other]

Robot-As-A-Sensor: Forming a Sensing Network with Robots for Underground Mining Missions

Authors: Xiaoyu Ai, Chengpei Xu, Binghao Li, Feng Xia

Abstract: Nowadays, robots are deployed as mobile platforms equipped with sensing, communication and computing capabilities, especially in the mining industry, where they perform tasks in hazardous and repetitive environments. Despite their potential, individual robots face significant limitations when completing complex tasks that require the collaboration of multiple robots. This collaboration requires a… ▽ More Nowadays, robots are deployed as mobile platforms equipped with sensing, communication and computing capabilities, especially in the mining industry, where they perform tasks in hazardous and repetitive environments. Despite their potential, individual robots face significant limitations when completing complex tasks that require the collaboration of multiple robots. This collaboration requires a robust wireless network to ensure operational efficiency and reliability. This paper introduces the concept of "Robot-As-A-Sensor" (RAAS), which treats the robots as mobile sensors within structures similar to Wireless Sensor Networks (WSNs). We later identify specific challenges in integrating RAAS technology and propose technological advancements to address these challenges. Finally, we provide an outlook about the technologies that can contribute to realising RAAS, suggesting that this approach could catalyse a shift towards safer, more intelligent, and sustainable industry practices. We believe that this innovative RAAS framework could significantly transform industries requiring advanced technological integration. △ Less

Submitted 30 April, 2024; originally announced May 2024.

Comments: Submitted to Special Issue on Neuro-Inspired Learning for Robotics for IEEE Transactions on Cognitive and Developmental Systems

arXiv:2405.00256 [pdf, other]

ASAM: Boosting Segment Anything Model with Adversarial Tuning

Authors: Bo Li, Haoke Xiao, Lv Tang

Abstract: In the evolving landscape of computer vision, foundation models have emerged as pivotal tools, exhibiting exceptional adaptability to a myriad of tasks. Among these, the Segment Anything Model (SAM) by Meta AI has distinguished itself in image segmentation. However, SAM, like its counterparts, encounters limitations in specific niche applications, prompting a quest for enhancement strategies that… ▽ More In the evolving landscape of computer vision, foundation models have emerged as pivotal tools, exhibiting exceptional adaptability to a myriad of tasks. Among these, the Segment Anything Model (SAM) by Meta AI has distinguished itself in image segmentation. However, SAM, like its counterparts, encounters limitations in specific niche applications, prompting a quest for enhancement strategies that do not compromise its inherent capabilities. This paper introduces ASAM, a novel methodology that amplifies SAM's performance through adversarial tuning. We harness the potential of natural adversarial examples, inspired by their successful implementation in natural language processing. By utilizing a stable diffusion model, we augment a subset (1%) of the SA-1B dataset, generating adversarial instances that are more representative of natural variations rather than conventional imperceptible perturbations. Our approach maintains the photorealism of adversarial examples and ensures alignment with original mask annotations, thereby preserving the integrity of the segmentation task. The fine-tuned ASAM demonstrates significant improvements across a diverse range of segmentation tasks without necessitating additional data or architectural modifications. The results of our extensive evaluations confirm that ASAM establishes new benchmarks in segmentation tasks, thereby contributing to the advancement of foundational models in computer vision. Our project page is in https://asam2024.github.io/. △ Less

Submitted 30 April, 2024; originally announced May 2024.

Comments: This paper is accepted by CVPR2024

arXiv:2404.19534 [pdf, other]

MIPI 2024 Challenge on Nighttime Flare Removal: Methods and Results

Authors: Yuekun Dai, Dafeng Zhang, Xiaoming Li, Zongsheng Yue, Chongyi Li, Shangchen Zhou, Ruicheng Feng, Peiqing Yang, Zhezhu **, Guanqun Liu, Chen Change Loy, Lize Zhang, Shuai Liu, Chaoyu Feng, Luyang Wang, Shuan Chen, Guangqi Shao, Xiaotao Wang, Lei Lei, Qirui Yang, Qihua Cheng, Zhiqiang Xu, Yihao Liu, Huan**g Yue, **gyu Yang , et al. (38 additional authors not shown)

Abstract: The increasing demand for computational photography and imaging on mobile platforms has led to the widespread development and integration of advanced image sensors with novel algorithms in camera systems. However, the scarcity of high-quality data for research and the rare opportunity for in-depth exchange of views from industry and academia constrain the development of mobile intelligent photogra… ▽ More The increasing demand for computational photography and imaging on mobile platforms has led to the widespread development and integration of advanced image sensors with novel algorithms in camera systems. However, the scarcity of high-quality data for research and the rare opportunity for in-depth exchange of views from industry and academia constrain the development of mobile intelligent photography and imaging (MIPI). Building on the achievements of the previous MIPI Workshops held at ECCV 2022 and CVPR 2023, we introduce our third MIPI challenge including three tracks focusing on novel image sensors and imaging algorithms. In this paper, we summarize and review the Nighttime Flare Removal track on MIPI 2024. In total, 170 participants were successfully registered, and 14 teams submitted results in the final testing phase. The developed solutions in this challenge achieved state-of-the-art performance on Nighttime Flare Removal. More details of this challenge and the link to the dataset can be found at https://mipi-challenge.org/MIPI2024/. △ Less

Submitted 27 May, 2024; v1 submitted 30 April, 2024; originally announced April 2024.

Comments: CVPR 2024 Mobile Intelligent Photography and Imaging (MIPI) Workshop--Nighttime Flare Removal Challenge Report. Website: https://mipi-challenge.org/MIPI2024/

arXiv:2404.18413 [pdf, other]

3AM: An Ambiguity-Aware Multi-Modal Machine Translation Dataset

Authors: Xinyu Ma, Xuebo Liu, Derek F. Wong, Jun Rao, Bei Li, Liang Ding, Lidia S. Chao, Dacheng Tao, Min Zhang

Abstract: Multimodal machine translation (MMT) is a challenging task that seeks to improve translation quality by incorporating visual information. However, recent studies have indicated that the visual information provided by existing MMT datasets is insufficient, causing models to disregard it and overestimate their capabilities. This issue presents a significant obstacle to the development of MMT researc… ▽ More Multimodal machine translation (MMT) is a challenging task that seeks to improve translation quality by incorporating visual information. However, recent studies have indicated that the visual information provided by existing MMT datasets is insufficient, causing models to disregard it and overestimate their capabilities. This issue presents a significant obstacle to the development of MMT research. This paper presents a novel solution to this issue by introducing 3AM, an ambiguity-aware MMT dataset comprising 26,000 parallel sentence pairs in English and Chinese, each with corresponding images. Our dataset is specifically designed to include more ambiguity and a greater variety of both captions and images than other MMT datasets. We utilize a word sense disambiguation model to select ambiguous data from vision-and-language datasets, resulting in a more challenging dataset. We further benchmark several state-of-the-art MMT models on our proposed dataset. Experimental results show that MMT models trained on our dataset exhibit a greater ability to exploit visual information than those trained on other MMT datasets. Our work provides a valuable resource for researchers in the field of multimodal learning and encourages further exploration in this area. The data, code and scripts are freely available at https://github.com/MaxyLee/3AM. △ Less

Submitted 29 April, 2024; originally announced April 2024.

arXiv:2404.18361 [pdf, other]

Improving Multi-Instance GPU Efficiency via Sub-Entry Sharing TLB Design

Authors: Bingyao Li, Yueqi Wang, Tianyu Wang, Lieven Eeckhout, Jun Yang, Aamer Jaleel, Xulong Tang

Abstract: NVIDIA's Multi-Instance GPU (MIG) technology enables partitioning GPU computing power and memory into separate hardware instances, providing complete isolation including compute resources, caches, and memory. However, prior work identifies that MIG does not extend to partitioning the last-level TLB (i.e., L3 TLB), which remains shared among all instances. To enhance TLB reach, NVIDIA GPUs reorgani… ▽ More NVIDIA's Multi-Instance GPU (MIG) technology enables partitioning GPU computing power and memory into separate hardware instances, providing complete isolation including compute resources, caches, and memory. However, prior work identifies that MIG does not extend to partitioning the last-level TLB (i.e., L3 TLB), which remains shared among all instances. To enhance TLB reach, NVIDIA GPUs reorganized the TLB structure with 16 sub-entries in each L3 TLB entry that have a one-to-one map** to the address translations for 16 pages of size 64KB located within the same 1MB aligned range. Our comprehensive investigation of address translation efficiency in MIG identifies two main issues caused by L3 TLB sharing interference: (i) it results in performance degradation for co-running applications, and (ii) TLB sub-entries are not fully utilized before eviction. Based on this observation, we propose STAR to improve the utilization of TLB sub-entries through dynamic sharing of TLB entries across multiple base addresses. STAR evaluates TLB entries based on their sub-entry utilization to optimize address translation storage, dynamically adjusting between a shared and non-shared status to cater to current demand. We show that STAR improves overall performance by an average of 30.2% across various multi-tenant workloads. △ Less

Submitted 28 April, 2024; originally announced April 2024.

arXiv:2404.18132 [pdf, other]

Allocating Mixed Goods with Customized Fairness and Indivisibility Ratio

Authors: Bo Li, Zihao Li, Shengxin Liu, Zekai Wu

Abstract: We consider the problem of fairly allocating a combination of divisible and indivisible goods. While fairness criteria like envy-freeness (EF) and proportionality (PROP) can always be achieved for divisible goods, only their relaxed versions, such as the ''up to one'' relaxations EF1 and PROP1, can be satisfied when the goods are indivisible. The ''up to one'' relaxations require the fairness cond… ▽ More We consider the problem of fairly allocating a combination of divisible and indivisible goods. While fairness criteria like envy-freeness (EF) and proportionality (PROP) can always be achieved for divisible goods, only their relaxed versions, such as the ''up to one'' relaxations EF1 and PROP1, can be satisfied when the goods are indivisible. The ''up to one'' relaxations require the fairness conditions to be satisfied provided that one good can be completely eliminated or added in the comparison. In this work, we bridge the gap between the two extremes and propose ''up to a fraction'' relaxations for the allocation of mixed divisible and indivisible goods. The fraction is determined based on the proportion of indivisible goods, which we call the indivisibility ratio. The new concepts also introduce asymmetric conditions that are customized for individuals with varying indivisibility ratios. We provide both upper and lower bounds on the fractions of the modified item in order to satisfy the fairness criterion. Our results are tight up to a constant for EF and asymptotically tight for PROP. △ Less

Submitted 28 April, 2024; originally announced April 2024.

Comments: Appears in the 33rd International Joint Conference on Artificial Intelligence (IJCAI), 2024

arXiv:2404.17994 [pdf]

LpQcM: Adaptable Lesion-Quantification-Consistent Modulation for Deep Learning Low-Count PET Image Denoising

Authors: Menghua Xia, Huidong Xie, Qiong Liu, Bo Zhou, Hanzhong Wang, Biao Li, Axel Rominger, Kuangyu Shi, Georges EI Fakhri, Chi Liu

Abstract: Deep learning-based positron emission tomography (PET) image denoising offers the potential to reduce radiation exposure and scanning time by transforming low-count images into high-count equivalents. However, existing methods typically blur crucial details, leading to inaccurate lesion quantification. This paper proposes a lesion-perceived and quantification-consistent modulation (LpQcM) strategy… ▽ More Deep learning-based positron emission tomography (PET) image denoising offers the potential to reduce radiation exposure and scanning time by transforming low-count images into high-count equivalents. However, existing methods typically blur crucial details, leading to inaccurate lesion quantification. This paper proposes a lesion-perceived and quantification-consistent modulation (LpQcM) strategy for enhanced PET image denoising, via employing downstream lesion quantification analysis as auxiliary tools. The LpQcM is a plug-and-play design adaptable to a wide range of model architectures, modulating the sampling and optimization procedures of model training without adding any computational burden to the inference phase. Specifically, the LpQcM consists of two components, the lesion-perceived modulation (LpM) and the multiscale quantification-consistent modulation (QcM). The LpM enhances lesion contrast and visibility by allocating higher sampling weights and stricter loss criteria to lesion-present samples determined by an auxiliary segmentation network than lesion-absent ones. The QcM further emphasizes accuracy of quantification for both the mean and maximum standardized uptake value (SUVmean and SUVmax) across multiscale sub-regions throughout the entire image, thereby enhancing the overall image quality. Experiments conducted on large PET datasets from multiple centers and vendors, and varying noise levels demonstrated the LpQcM efficacy across various denoising frameworks. Compared to frameworks without LpQcM, the integration of LpQcM reduces the lesion SUVmean bias by 2.92% on average and increases the peak signal-to-noise ratio (PSNR) by 0.34 on average, for denoising images of extremely low-count levels below 10%. △ Less

Submitted 27 April, 2024; originally announced April 2024.

Comments: 10 pages

arXiv:2404.17739 [pdf, other]

How LLMs Aid in UML Modeling: An Exploratory Study with Novice Analysts

Authors: Beian Wang, Chong Wang, Peng Liang, Bing Li, Cheng Zeng

Abstract: Since the emergence of GPT-3, Large Language Models (LLMs) have caught the eyes of researchers, practitioners, and educators in the field of software engineering. However, there has been relatively little investigation regarding the performance of LLMs in assisting with requirements analysis and UML modeling. This paper explores how LLMs can assist novice analysts in creating three types of typica… ▽ More Since the emergence of GPT-3, Large Language Models (LLMs) have caught the eyes of researchers, practitioners, and educators in the field of software engineering. However, there has been relatively little investigation regarding the performance of LLMs in assisting with requirements analysis and UML modeling. This paper explores how LLMs can assist novice analysts in creating three types of typical UML models: use case models, class diagrams, and sequence diagrams. For this purpose, we designed the modeling tasks of these three UML models for 45 undergraduate students who participated in a requirements modeling course, with the help of LLMs. By analyzing their project reports, we found that LLMs can assist undergraduate students as novice analysts in UML modeling tasks, but LLMs also have shortcomings and limitations that should be considered when using them. △ Less

Submitted 10 June, 2024; v1 submitted 26 April, 2024; originally announced April 2024.

Comments: The 21st IEEE International Conference on Software Services Engineering (SSE)

arXiv:2404.17433 [pdf, other]

PromptCIR: Blind Compressed Image Restoration with Prompt Learning

Authors: Bingchen Li, Xin Li, Yiting Lu, Ruoyu Feng, Mengxi Guo, Shijie Zhao, Li Zhang, Zhibo Chen

Abstract: Blind Compressed Image Restoration (CIR) has garnered significant attention due to its practical applications. It aims to mitigate compression artifacts caused by unknown quality factors, particularly with JPEG codecs. Existing works on blind CIR often seek assistance from a quality factor prediction network to facilitate their network to restore compressed images. However, the predicted numerical… ▽ More Blind Compressed Image Restoration (CIR) has garnered significant attention due to its practical applications. It aims to mitigate compression artifacts caused by unknown quality factors, particularly with JPEG codecs. Existing works on blind CIR often seek assistance from a quality factor prediction network to facilitate their network to restore compressed images. However, the predicted numerical quality factor lacks spatial information, preventing network adaptability toward image contents. Recent studies in prompt-learning-based image restoration have showcased the potential of prompts to generalize across varied degradation types and degrees. This motivated us to design a prompt-learning-based compressed image restoration network, dubbed PromptCIR, which can effectively restore images from various compress levels. Specifically, PromptCIR exploits prompts to encode compression information implicitly, where prompts directly interact with soft weights generated from image features, thus providing dynamic content-aware and distortion-aware guidance for the restoration process. The light-weight prompts enable our method to adapt to different compression levels, while introducing minimal parameter overhead. Overall, PromptCIR leverages the powerful transformer-based backbone with the dynamic prompt module to proficiently handle blind CIR tasks, winning first place in the NTIRE 2024 challenge of blind compressed image enhancement track. Extensive experiments have validated the effectiveness of our proposed PromptCIR. The code is available at https://github.com/lbc12345/PromptCIR-NTIRE24. △ Less

Submitted 26 April, 2024; originally announced April 2024.

Comments: Winner of NTIRE 2024 Blind Compressed Image Enhancement Challenge

arXiv:2404.17310 [pdf, other]

Image Copy-Move Forgery Detection via Deep PatchMatch and Pairwise Ranking Learning

Authors: Yuanman Li, Yingjie He, Changsheng Chen, Li Dong, Bin Li, Jiantao Zhou, Xia Li

Abstract: Recent advances in deep learning algorithms have shown impressive progress in image copy-move forgery detection (CMFD). However, these algorithms lack generalizability in practical scenarios where the copied regions are not present in the training images, or the cloned regions are part of the background. Additionally, these algorithms utilize convolution operations to distinguish source and target… ▽ More Recent advances in deep learning algorithms have shown impressive progress in image copy-move forgery detection (CMFD). However, these algorithms lack generalizability in practical scenarios where the copied regions are not present in the training images, or the cloned regions are part of the background. Additionally, these algorithms utilize convolution operations to distinguish source and target regions, leading to unsatisfactory results when the target regions blend well with the background. To address these limitations, this study proposes a novel end-to-end CMFD framework that integrates the strengths of conventional and deep learning methods. Specifically, the study develops a deep cross-scale PatchMatch (PM) method that is customized for CMFD to locate copy-move regions. Unlike existing deep models, our approach utilizes features extracted from high-resolution scales to seek explicit and reliable point-to-point matching between source and target regions. Furthermore, we propose a novel pairwise rank learning framework to separate source and target regions. By leveraging the strong prior of point-to-point matches, the framework can identify subtle differences and effectively discriminate between source and target regions, even when the target regions blend well with the background. Our framework is fully differentiable and can be trained end-to-end. Comprehensive experimental results highlight the remarkable generalizability of our scheme across various copy-move scenarios, significantly outperforming existing methods. △ Less

Submitted 26 April, 2024; originally announced April 2024.

Comments: 16 pages, 14figures

Showing 151–200 of 5,378 results for author: Li, B