-
Singular Regularization with Information Bottleneck Improves Model's Adversarial Robustness
Authors:
Guanlin Li,
Naishan Zheng,
Man Zhou,
Jie Zhang,
Tianwei Zhang
Abstract:
Adversarial examples are one of the most severe threats to deep learning models. Numerous works have been proposed to study and defend against adversarial examples. However, these works lack analysis of adversarial information or perturbation, and so cannot reveal the mystery of adversarial examples or offer a proper interpretation. In this paper, we aim to fill this gap by studying adversarial information as unstructured noise, which does not have a clear pattern. Specifically, we provide empirical studies with singular value decomposition, decomposing images into several matrices, to analyze adversarial information for different attacks. Based on the analysis, we propose a new module to regularize adversarial information and combine it with information bottleneck theory, which is proposed to theoretically restrict intermediate representations. Therefore, our method is interpretable. Moreover, our design follows a novel principle that is general and unified. Equipped with our new module, we evaluate two popular model structures on two mainstream datasets with various adversarial attacks. The results indicate that the improvement in robust accuracy is significant. In addition, we prove that our method is efficient, with only a few additional parameters, and can be explained under regional faithfulness analysis.
Submitted 4 December, 2023;
originally announced December 2023.
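As a rough illustration of the singular value decomposition analysis mentioned in the abstract above, the sketch below splits a 2-D array into a low-rank part and a residual. It is not the authors' module: the function names, the synthetic "image", and the Gaussian noise standing in for a perturbation are all assumptions made for illustration.

```python
import numpy as np


def low_rank_approx(image, rank):
    """Best rank-`rank` approximation of a 2-D array in the Frobenius norm."""
    u, s, vt = np.linalg.svd(image, full_matrices=False)
    # Scale the leading left singular vectors by their singular values.
    return (u[:, :rank] * s[:rank]) @ vt[:rank, :]


def residual_energy(image, rank):
    """Fraction of squared Frobenius norm outside the top-`rank` components."""
    s = np.linalg.svd(image, compute_uv=False)
    return float(np.sum(s[rank:] ** 2) / np.sum(s ** 2))


rng = np.random.default_rng(0)
# Hypothetical stand-in for a grayscale image: an exactly rank-1 pattern.
clean = np.outer(np.linspace(0.0, 1.0, 32), np.linspace(1.0, 2.0, 32))
# Hypothetical stand-in for a perturbation: small unstructured noise.
perturbed = clean + 0.01 * rng.standard_normal(clean.shape)

clean_tail = residual_energy(clean, 1)
perturbed_tail = residual_energy(perturbed, 1)
```

In this toy setting the structured pattern is captured by the leading singular component, while the unstructured noise spreads its energy over the remaining components, so `perturbed_tail` exceeds `clean_tail`.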
-
Improving In-Context Learning in Diffusion Models with Visual Context-Modulated Prompts
Authors:
Tianqi Chen,
Yongfei Liu,
Zhendong Wang,
Jianbo Yuan,
Quanzeng You,
Hongxia Yang,
Mingyuan Zhou
Abstract:
In light of the remarkable success of in-context learning in large language models, its potential extension to the vision domain, particularly with visual foundation models like Stable Diffusion, has sparked considerable interest. Existing approaches in visual in-context learning frequently face hurdles such as expensive pretraining, limiting frameworks, inadequate visual comprehension, and limited adaptability to new tasks. In response to these challenges, we introduce improved Prompt Diffusion (iPromptDiff) in this study. iPromptDiff integrates an end-to-end trained vision encoder that converts visual context into an embedding vector. This vector is subsequently used to modulate the token embeddings of text prompts. We show that a diffusion-based vision foundation model, when equipped with this visual context-modulated text guidance and a standard ControlNet structure, exhibits versatility and robustness across a variety of training tasks and excels in in-context learning for novel vision tasks, such as normal-to-image or image-to-line transformations. The effectiveness of these capabilities relies heavily on a deep visual understanding, which is achieved through relevant visual demonstrations processed by our proposed in-context learning architecture.
Submitted 3 December, 2023;
originally announced December 2023.
-
AV4EV: Open-Source Modular Autonomous Electric Vehicle Platform for Making Mobility Research Accessible
Authors:
Zhijie Qiao,
Mingyan Zhou,
Zhijun Zhuang,
Tejas Agarwal,
Felix Jahncke,
Po-Jen Wang,
Jason Friedman,
Hongyi Lai,
Divyanshu Sahu,
Tomáš Nagy,
Martin Endler,
Jason Schlessman,
Rahul Mangharam
Abstract:
When academic researchers develop and validate autonomous driving algorithms, there is a challenge in balancing high-performance capabilities with the cost and complexity of the vehicle platform. Much of today's research on autonomous vehicles (AV) is limited to experimentation on expensive commercial vehicles that require large skilled teams to retrofit the vehicles and test them in dedicated facilities. On the other hand, 1/10th-1/16th scaled-down vehicle platforms are more affordable but have limited similitude in performance and drivability. To address this issue, we present the design of a one-third-scale autonomous electric go-kart platform with open-source mechatronics design along with fully functional autonomous driving software. The platform's multi-modal driving system is capable of manual, autonomous, and teleoperation driving modes. It also features a flexible sensing suite for the algorithm deployment across perception, localization, planning, and control. This development serves as a bridge between full-scale vehicles and reduced-scale cars while accelerating cost-effective algorithmic advancements. Our experimental results demonstrate the AV4EV platform's capabilities and ease of use for developing new AV algorithms. All materials are available at AV4EV.org to stimulate collaborative efforts within the AV and electric vehicle (EV) communities.
Submitted 12 April, 2024; v1 submitted 1 December, 2023;
originally announced December 2023.
-
OmniMotionGPT: Animal Motion Generation with Limited Data
Authors:
Zhangsihao Yang,
Mingyuan Zhou,
Mengyi Shan,
Bingbing Wen,
Ziwei Xuan,
Mitch Hill,
Junjie Bai,
Guo-Jun Qi,
Yalin Wang
Abstract:
Our paper aims to generate diverse and realistic animal motion sequences from textual descriptions, without a large-scale animal text-motion dataset. While the task of text-driven human motion synthesis is already extensively studied and benchmarked, it remains challenging to transfer this success to other skeleton structures with limited data. In this work, we design a model architecture that imitates Generative Pretraining Transformer (GPT), transferring prior knowledge learned from human data to the animal domain. We jointly train motion autoencoders for both animal and human motions and at the same time optimize through the similarity scores among human motion encoding, animal motion encoding, and text CLIP embedding. Presenting the first solution to this problem, we are able to generate animal motions with high diversity and fidelity, quantitatively and qualitatively outperforming the results of training human motion generation baselines on animal data. Additionally, we introduce AnimalML3D, the first text-animal motion dataset with 1240 animation sequences spanning 36 different animal identities. We hope this dataset will mitigate the data scarcity problem in text-driven animal motion generation, providing a new playground for the research community.
Submitted 30 November, 2023;
originally announced November 2023.
-
Generalized Large-Scale Data Condensation via Various Backbone and Statistical Matching
Authors:
Shitong Shao,
Zeyuan Yin,
Muxin Zhou,
Xindong Zhang,
Zhiqiang Shen
Abstract:
The lightweight "local-match-global" matching introduced by SRe2L successfully creates a distilled dataset with comprehensive information on the full 224x224 ImageNet-1k. However, this one-sided approach is limited to a particular backbone, layer, and statistics, which limits the improvement of the generalization of a distilled dataset. We suggest that sufficient and varied "local-match-global" matchings are more precise and effective than a single one and can create a distilled dataset with richer information and better generalization. We call this perspective "generalized matching" and propose Generalized Various Backbone and Statistical Matching (G-VBSM) in this work, which aims to create a synthetic dataset with densities, ensuring consistency with the complete dataset across various backbones, layers, and statistics. As experimentally demonstrated, G-VBSM is the first algorithm to obtain strong performance across both small-scale and large-scale datasets. Specifically, G-VBSM achieves a performance of 38.7% on CIFAR-100 with 128-width ConvNet, 47.6% on Tiny-ImageNet with ResNet18, and 31.4% on the full 224x224 ImageNet-1k with ResNet18, under images per class (IPC) 10, 50, and 10, respectively. These results surpass all SOTA methods by margins of 3.9%, 6.5%, and 10.1%, respectively.
Submitted 16 March, 2024; v1 submitted 29 November, 2023;
originally announced November 2023.
-
FHEmem: A Processing In-Memory Accelerator for Fully Homomorphic Encryption
Authors:
Minxuan Zhou,
Yu** Nam,
Pranav Gangwar,
Weihong Xu,
Arpan Dutta,
Kartikeyan Subramanyam,
Chris Wilkerson,
Rosario Cammarota,
Saransh Gupta,
Tajana Rosing
Abstract:
Fully Homomorphic Encryption (FHE) is a technique that allows arbitrary computations to be performed on encrypted data without the need for decryption, making it ideal for securing many emerging applications. However, FHE computation is significantly slower than computation on plain data due to the increase in data size after encryption. Processing In-Memory (PIM) is a promising technology that can accelerate data-intensive workloads with extensive parallelism. However, FHE is challenging for PIM acceleration due to the long-bitwidth multiplications and complex data movements involved. We propose a PIM-based FHE accelerator, FHEmem, which exploits a novel processing in-memory architecture to achieve high-throughput and efficient acceleration for FHE. We propose an optimized end-to-end processing flow, from low-level hardware processing to high-level application mapping, that fully exploits the high throughput of FHEmem hardware. Our evaluation shows FHEmem achieves significant speedup and efficiency improvement over state-of-the-art FHE accelerators.
Submitted 27 November, 2023;
originally announced November 2023.
-
Reconstruction of Cosmological Initial Density Field with Observations from the Epoch of Reionization
Authors:
Meng Zhou,
Yi Mao
Abstract:
Initial density distribution provides a basis for understanding the complete evolution of cosmological density fluctuations. While reconstruction in our local Universe exploits the observations of galaxy surveys with large volumes, observations of high-redshift galaxies are performed with a small field of view and therefore can hardly be used for reconstruction. Here we propose to reconstruct the initial density field using the H I 21 cm and CO line intensity maps from the epoch of reionization. Observations of these two intensity maps provide complementary information of the density field -- the H I 21 cm field is a proxy of matter distributions in the neutral regions, while the CO line intensity maps are sensitive to the high-density, star-forming regions that host the sources for reionization. Technically, we employ the conjugate gradient method and develop the machinery for minimizing the cost function for the intensity mapping observations. Analytical expressions for the gradient of cost function are derived explicitly. We show that the resimulated intensity maps match the input maps of mock observations using semi-numerical simulations of reionization with an rms error $\lesssim 7\%$ at all stages of reionization. This reconstruction is also robust at the same level of accuracy when a noise at the level of $\lesssim 1\%$ of the standard deviation is applied to each map. Our proof-of-concept work demonstrates the robustness of the reconstruction method, thereby providing an effective technique for reconstructing the cosmological initial density distribution from high-redshift observations.
Submitted 25 November, 2023;
originally announced November 2023.
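The abstract above minimizes a cost function with the conjugate gradient method using an analytical gradient. The sketch below shows only the textbook linear variant on a toy quadratic cost $f(x) = \tfrac{1}{2} x^{T} A x - b^{T} x$, whose gradient is $A x - b$. The paper's cost function is not quadratic and is not reproduced here; the matrix `A`, the vector `b`, and the function name are hypothetical.

```python
import numpy as np


def conjugate_gradient(A, b, tol=1e-10, max_iter=None):
    """Minimize 0.5 * x^T A x - b^T x for a symmetric positive-definite A."""
    n = b.shape[0]
    if max_iter is None:
        max_iter = 10 * n
    x = np.zeros(n)
    r = b - A @ x          # residual, equal to the negative gradient at x
    p = r.copy()           # first search direction: steepest descent
    rs_old = r @ r
    for _ in range(max_iter):
        if np.sqrt(rs_old) < tol:
            break
        Ap = A @ p
        alpha = rs_old / (p @ Ap)       # exact line search along p
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs_old) * p   # next direction, A-conjugate to earlier ones
        rs_old = rs_new
    return x


rng = np.random.default_rng(1)
M = rng.standard_normal((20, 20))
A = M.T @ M + np.eye(20)   # hypothetical symmetric positive-definite matrix
b = rng.standard_normal(20)
x_min = conjugate_gradient(A, b)
```

At the minimizer the gradient vanishes, so `A @ x_min` should match `b` to within the tolerance. A nonlinear cost would need a nonlinear variant of the method with a line search, which this sketch does not attempt.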
-
Modular Blended Attention Network for Video Question Answering
Authors:
Mingjie Zhou
Abstract:
In multimodal machine learning tasks, the complexity of the task means that the network structure is, in most cases, assembled in a sophisticated way. The holistic architecture can be separated into several logical parts according to the respective ends that the modules are devised to achieve. As the number of modalities of information representation increases, constructing ad hoc subnetworks for processing the data from divergent modalities while mediating the fusion of different information types has become a cumbersome and expensive problem. In this paper, we present an approach that addresses this problem with a reusable and composable neural unit; by connecting the units in series or parallel, the arduous network construction for multimodal machine learning tasks can be accomplished in a much more straightforward way. Additionally, through parameter sharing (weights replication) among the units, the space complexity will be significantly reduced. We have conducted experiments on three commonly used datasets; our method achieves impressive performance compared to several video QA baselines.
Submitted 2 November, 2023;
originally announced November 2023.
-
BrainZ-BP: A Non-invasive Cuff-less Blood Pressure Estimation Approach Leveraging Brain Bio-impedance and Electrocardiogram
Authors:
Bufang Yang,
Le Liu,
Wenxuan Wu,
Mengliang Zhou,
Hongxing Liu,
Xinbao Ning
Abstract:
Accurate and continuous blood pressure (BP) monitoring is essential to the early prevention of cardiovascular diseases. Non-invasive and cuff-less BP estimation algorithms have gained much attention in recent years. Previous studies have demonstrated that brain bio-impedance (BIOZ) is a promising technique for non-invasive intracranial pressure (ICP) monitoring. Clinically, treatment for patients with traumatic brain injuries (TBI) requires monitoring the ICP and BP of patients simultaneously. Estimating BP by brain BIOZ directly can reduce the number of sensors attached to the patients, thus improving their comfort. To address these issues, in this study, we explore the feasibility of leveraging brain BIOZ for BP estimation and propose a novel cuff-less BP estimation approach called BrainZ-BP. Two electrodes are placed on the forehead and occipital bone of the head in the anterior-posterior direction for brain BIOZ measurement. Various features including pulse transit time and morphological features of brain BIOZ are extracted and fed into four regression models for BP estimation. Results show that the mean absolute error, root mean square error, and correlation coefficient of the random forest regression model are 2.17 mmHg, 3.91 mmHg, and 0.90 for systolic pressure estimation, and 1.71 mmHg, 3.02 mmHg, and 0.89 for diastolic pressure estimation. The presented BrainZ-BP can be applied in the brain BIOZ-based ICP monitoring scenario to monitor BP simultaneously.
Submitted 23 November, 2023; v1 submitted 18 November, 2023;
originally announced November 2023.
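The abstract above reports mean absolute error, root mean square error, and correlation coefficient. A minimal sketch of how these three quantities are commonly computed follows. The readings are made-up values in mmHg, the function name is hypothetical, and the correlation is assumed to be Pearson's, which the abstract does not specify.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return (MAE, RMSE, Pearson correlation) for two equal-length sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    mae = float(np.mean(np.abs(err)))
    rmse = float(np.sqrt(np.mean(err ** 2)))
    corr = float(np.corrcoef(y_true, y_pred)[0, 1])
    return mae, rmse, corr


# Hypothetical systolic pressure readings in mmHg (not data from the paper).
reference = [118.0, 125.0, 131.0, 140.0]
estimate = [120.0, 123.0, 134.0, 139.0]
mae, rmse, corr = regression_metrics(reference, estimate)
```

RMSE penalizes large individual errors more heavily than MAE, which is why the two are usually reported together.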
-
A Counterexample for the Principal Eigenvalue of An Elliptic Operator with Large Advection
Authors:
Xueli Bai,
Xin Xu,
Kexin Zhang,
Maolin Zhou
Abstract:
There are numerous studies focusing on the convergence of the principal eigenvalue $λ(s)$ as $s\to+\infty$ corresponding to the elliptic eigenvalue problem
\begin{align*}
-Δ\varphi(x)-2s\mathbf{v}\cdot\nabla\varphi(x)+c(x)\varphi(x)=λ(s)\varphi(x),\quad x\in Ω,
\end{align*}
where $Ω$ is a bounded domain and the advection term $\mathbf{v}$ satisfies certain restrictions.
In this paper, we construct an infinitely oscillating gradient advection term $\mathbf{v}=\nabla m(x)\in C^1(Ω)$ such that the principal eigenvalue $λ(s)$ does not converge as $s\to+\infty$. As far as we know, this is the first result that guarantees the non-convergence of the principal eigenvalue.
Submitted 10 November, 2023;
originally announced November 2023.
-
Quantum Fluctuations Driving the Generation and Strong Correlations of Fission Fragment Angular Momenta
Authors:
M. H. Zhou,
S. Y. Chen,
Z. Y. Li,
M. S. Smith,
Z. P. Li
Abstract:
Two critical issues in the study of the fission mechanism are how the fission fragment angular momenta (FFAM) develop dynamically from equilibrium and how they are correlated with each other. To this end, we construct a time-dependent generator coordinate method that incorporates crucial quantum fluctuations -- multiple rotations, vibrations, and their couplings -- based on covariant density functional theory, providing for the first time a global, microscopic, and dynamical study on the FFAM distribution. The calculated probability distributions of FFAM are in good agreement with the experimental measurements, and the sawtooth-like mass dependence of average FFAM is reproduced very well. It is noteworthy to find that the quantum fluctuations drive the generation and chaotic evolution of FFAM during fission fragment formation and induce the strong correlations of FFAM orientations at the small, medium, and large opening angles ($φ_{\rm LH}\approx 30^{\rm o}$, $90^{\rm o}$, $160^{\rm o}$).
Submitted 10 March, 2024; v1 submitted 10 November, 2023;
originally announced November 2023.
-
Cross-modal Prompts: Adapting Large Pre-trained Models for Audio-Visual Downstream Tasks
Authors:
Haoyi Duan,
Yan Xia,
Mingze Zhou,
Li Tang,
Jieming Zhu,
Zhou Zhao
Abstract:
In recent years, the deployment of large-scale pre-trained models in audio-visual downstream tasks has yielded remarkable outcomes. However, these models, primarily trained on single-modality unconstrained datasets, still encounter challenges in feature extraction for multi-modal tasks, leading to suboptimal performance. This limitation arises due to the introduction of irrelevant modality-specific information during encoding, which adversely affects the performance of downstream tasks. To address this challenge, this paper proposes a novel Dual-Guided Spatial-Channel-Temporal (DG-SCT) attention mechanism. This mechanism leverages audio and visual modalities as soft prompts to dynamically adjust the parameters of pre-trained models based on the current multi-modal input features. Specifically, the DG-SCT module incorporates trainable cross-modal interaction layers into pre-trained audio-visual encoders, allowing adaptive extraction of crucial information from the current modality across spatial, channel, and temporal dimensions, while preserving the frozen parameters of large-scale pre-trained models. Experimental evaluations demonstrate that our proposed model achieves state-of-the-art results across multiple downstream tasks, including AVE, AVVP, AVS, and AVQA. Furthermore, our model exhibits promising performance in challenging few-shot and zero-shot scenarios. The source code and pre-trained models are available at https://github.com/haoyi-duan/DG-SCT.
Submitted 20 December, 2023; v1 submitted 9 November, 2023;
originally announced November 2023.
-
Spatio-Temporal Similarity Measure based Multi-Task Learning for Predicting Alzheimer's Disease Progression using MRI Data
Authors:
Xulong Wang,
Yu Zhang,
Menghui Zhou,
Tong Liu,
Jun Qi,
Po Yang
Abstract:
Identifying and utilising various biomarkers for tracking Alzheimer's disease (AD) progression has received much recent attention and helps clinicians make prompt decisions. Traditional progression models focus on extracting morphological biomarkers in regions of interest (ROIs) from MRI/PET images, such as regional average cortical thickness and regional volume. They are effective but ignore the relationships between brain ROIs over time, which would lead to synergistic deterioration. To explore the synergistic deteriorating relationship between these biomarkers, in this paper, we propose a novel spatio-temporal similarity measure based multi-task learning approach for effectively predicting AD progression and sensitively capturing the critical relationships between biomarkers. Specifically, we first define a temporal measure for estimating the magnitude and velocity of biomarker change over time, which indicate a changing trend (temporal). Converting this trend into a vector, we then compare this variability between biomarkers in a unified vector space (spatial). The experimental results show that, compared with direct ROI-based learning, our proposed method is more effective in predicting disease progression. Our method also enables longitudinal stability selection to identify the changing relationships between biomarkers, which play a key role in disease progression. We prove that the synergistic deteriorating biomarkers between cortical volumes or surface areas have a significant effect on the cognitive prediction.
Submitted 6 November, 2023;
originally announced November 2023.
-
Near-Infrared Ca II Triplet As An Stellar Activity Indicator: Library and Comparative Study
Authors:
Xin Huang,
Yu-JI He,
ZhongRui Bai,
Hailong Yuan,
MingKuan Yang,
Ming Zhou,
Yiqiao Dong,
Mengxin Wang,
Han He,
**ghua Zhang,
Yao-Quan Chu,
Yongheng Zhao,
Yong Zhang,
Haotong Zhang
Abstract:
We have established and released a new stellar index library of the Ca II Triplet, which serves as an indicator for characterizing the chromospheric activity of stars. The library is based on data from the Large Sky Area Multi-Object Fiber Spectroscopic Telescope (LAMOST) Low-Resolution Spectroscopic Survey (LRS) Data Release 9 (DR9). To better reflect the chromospheric activity of stars, we have defined new indices $R$ and $R^{+}$. The library includes measurements of $R$ and $R^{+}$ for each Ca II infrared triplet (IRT) from 699,348 spectra of 562,863 F, G and K-type solar-like stars with Signal-to-Noise Ratio (SNR) higher than 100, as well as the stellar atmospheric parameters and basic information inherited from the LAMOST LRS Catalog. We compared the differences between the three individual indices of the Ca II Triplet and also conducted a comparative analysis of $R^{+}_{\lambda8542}$ to the Ca II H&K $S$ and $R^+_{HK}$ index database. We find that the fraction of low-activity stars decreases with $T_{eff}$, and that the fraction of high-activity stars first decreases with decreasing temperature and then turns to increase with decreasing temperature at 5800K. We also find that a significant fraction of the stars that show a high activity index in both Ca II H&K and IRT are binaries with low activity; some of them could be discriminated in Ca II H&K $S$ index and $R^{+}_{\lambda8542}$ space. This new stellar library serves as a valuable resource for studying chromospheric activity in stars and can be used to improve our comprehension of stellar magnetic activity and other astrophysical phenomena.
Submitted 7 November, 2023; v1 submitted 6 November, 2023;
originally announced November 2023.
-
Age of Information Analysis for CR-NOMA Aided Uplink Systems with Randomly Arrived Packets
Authors:
Yanshi Sun,
Yanglin Ye,
Zhiguo Ding,
Momiao Zhou,
Lei Liu
Abstract:
This paper studies the application of cognitive radio inspired non-orthogonal multiple access (CR-NOMA) to reduce age of information (AoI) for uplink transmission. In particular, a time division multiple access (TDMA) based legacy network is considered, where each user is allocated a dedicated time slot to transmit its status update information. The CR-NOMA is implemented as an add-on to the TDMA legacy network, which enables each user to have more opportunities to transmit by sharing other users' time slots. A rigorous analytical framework is developed to obtain the expressions for AoIs achieved by CR-NOMA with and without re-transmission, by taking the randomness of the status update generating process into consideration. Numerical results are presented to verify the accuracy of the developed analysis. It is shown that the AoI can be significantly reduced by applying CR-NOMA compared to TDMA. Moreover, the use of re-transmission is helpful to reduce AoI, especially when the status arrival rate is low.
Submitted 5 November, 2023;
originally announced November 2023.
-
MATA*: Combining Learnable Node Matching with A* Algorithm for Approximate Graph Edit Distance Computation
Authors:
Junfeng Liu,
Min Zhou,
Shuai Ma,
Lujia Pan
Abstract:
Graph Edit Distance (GED) is a general and domain-agnostic metric to measure graph similarity, widely used in graph search or retrieving tasks. However, the exact GED computation is known to be NP-complete. For instance, the widely used A* algorithms explore the entire search space to find the optimal solution, which inevitably suffers from scalability issues. Learning-based methods apply graph representation techniques to learn the GED by formulating a regression task, which cannot recover the edit path and leads to inaccurate GED approximation (i.e., the predicted GED is smaller than the exact). To this end, in this work, we present a data-driven hybrid approach MATA* for approximate GED computation based on Graph Neural Networks (GNNs) and A* algorithms, which models from the perspective of learning to match nodes instead of directly regressing GED. Specifically, aware of the structure-dominant operations (i.e., node and edge insertion/deletion) property in GED computation, a structure-enhanced GNN is first designed to jointly learn local and high-order structural information for node embeddings for node matchings. Second, top-k candidate nodes are produced via a differentiable top-k operation to enable the training for node matchings, which adheres to another property of GED, i.e., multiple optimal node matchings. Third, benefiting from the candidate nodes, MATA* only performs on the promising search directions, reaching the solution efficiently. Finally, extensive experiments show the superiority of MATA* as it significantly outperforms the combinatorial search-based, learning-based and hybrid methods and scales well to large-size graphs.
Submitted 4 November, 2023;
originally announced November 2023.
-
LLMaAA: Making Large Language Models as Active Annotators
Authors:
Ruoyu Zhang,
Yanzeng Li,
Yongliang Ma,
Ming Zhou,
Lei Zou
Abstract:
Prevalent supervised learning methods in natural language processing (NLP) are notoriously data-hungry, demanding large amounts of high-quality annotated data. In practice, acquiring such data is a costly endeavor. Recently, the superior few-shot performance of large language models (LLMs) has propelled the development of dataset generation, where the training data are solely synthesized from LLMs. However, such an approach usually suffers from low-quality issues, and requires orders of magnitude more labeled data to achieve satisfactory performance. To fully exploit the potential of LLMs and make use of massive unlabeled data, we propose LLMaAA, which takes LLMs as annotators and puts them into an active learning loop to determine what to annotate efficiently. To learn robustly with pseudo labels, we optimize both the annotation and training processes: (1) we draw k-NN examples from a small demonstration pool as in-context examples, and (2) we adopt the example reweighting technique to assign training samples with learnable weights. Compared with previous approaches, LLMaAA features both efficiency and reliability. We conduct experiments and analysis on two classic NLP tasks, named entity recognition and relation extraction. With LLMaAA, task-specific models trained from LLM-generated labels can outperform the teacher with only hundreds of annotated examples, which is much more cost-effective than other baselines.
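Step (1) above, drawing k-NN examples from a demonstration pool, can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how examples are embedded or how prompts are formatted, so the embedding vectors, the cosine-similarity ranking, the prompt layout, and the names `knn_demonstrations` and `build_prompt` are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def knn_demonstrations(query_vec, pool, k):
    """Return the k pool entries most similar to the query.

    pool is a list of (text, label, embedding) tuples; the embeddings are
    assumed to come from some external sentence encoder.
    """
    ranked = sorted(pool, key=lambda item: cosine(query_vec, item[2]), reverse=True)
    return ranked[:k]


def build_prompt(query_text, demos):
    """Format the retrieved demonstrations followed by the unlabeled query."""
    blocks = [f"Text: {text}\nLabel: {label}" for text, label, _ in demos]
    blocks.append(f"Text: {query_text}\nLabel:")
    return "\n\n".join(blocks)


pool = [
    ("a", "X", [1.0, 0.0]),
    ("b", "Y", [0.0, 1.0]),
    ("c", "X", [0.9, 0.1]),
]
demos = knn_demonstrations([1.0, 0.0], pool, k=2)
prompt = build_prompt("q", demos)
```

The resulting prompt would be sent to the annotating LLM; the example reweighting in step (2) is a separate training-side component and is not shown here.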
Submitted 31 October, 2023; v1 submitted 30 October, 2023;
originally announced October 2023.
-
Does or did the supernova remnant Cassiopeia A operate as a PeVatron?
Authors:
Zhen Cao,
F. Aharonian,
Q. An,
Axikegu,
Y. X. Bai,
Y. W. Bao,
D. Bastieri,
X. J. Bi,
Y. J. Bi,
J. T. Cai,
Q. Cao,
W. Y. Cao,
Zhe Cao,
J. Chang,
J. F. Chang,
A. M. Chen,
E. S. Chen,
Liang Chen,
Lin Chen,
Long Chen,
M. J. Chen,
M. L. Chen,
Q. H. Chen,
S. H. Chen,
S. Z. Chen
, et al. (255 additional authors not shown)
Abstract:
For decades, supernova remnants (SNRs) have been considered the prime sources of Galactic cosmic rays (CRs). But whether SNRs can accelerate CR protons to PeV energies and thus dominate CR flux up to the knee is currently under intensive theoretical and phenomenological debate. The direct test of the ability of SNRs to operate as CR PeVatrons can be provided by ultrahigh-energy (UHE; $E_γ\geq 100$~TeV) $γ$-rays. In this context, the historical SNR Cassiopeia A (Cas A) is considered one of the most promising targets for UHE observations. This paper presents the observation of Cas A and its vicinity by the LHAASO KM2A detector. The exceptional sensitivity of LHAASO KM2A in the UHE band, combined with the young age of Cas A, enabled us to derive stringent model-independent limits on the energy budget of UHE protons and nuclei accelerated by Cas A at any epoch after the explosion. The results challenge the prevailing paradigm that Cas A-type SNRs are major suppliers of PeV CRs in the Milky Way.
Submitted 25 October, 2023;
originally announced October 2023.
-
Measurement-Based Small-Scale Channel Model for Sub-6 GHz RIS-Assisted Communications
Authors:
Jian Sang,
Jifeng Lan,
Mingyong Zhou,
Boning Gao,
Wankai Tang,
Xiao Li,
Michail Matthaiou,
Shi Jin,
Marco Di Renzo
Abstract:
Reconfigurable intelligent surfaces (RISs) have attracted increasing interest from both academia and industry, thanks to their unique ability to control electromagnetic (EM) waves. Although theoretical models for RIS-empowered communications have covered a variety of applications, very few papers have investigated the modeling of real propagation characteristics. In this paper, we fill this gap by providing an empirical statistical channel model to describe the small-scale channel variations for an RIS-assisted broadband system at 2.6 GHz. Based on real channel measurements in outdoor, indoor and outdoor-to-indoor (O2I) environments, we compare and analyze the global, inter-cluster and intra-cluster parameters. Measurement results indicate that the deployment of an RIS with proper phase configurations can significantly improve the channel quality by enhancing the $K$-factor and reducing the time dispersion. The small-scale fading is well characterized by the proposed statistical model and the empirical channel parameters. These results are essential for the design of emerging RIS-assisted wireless systems for future applications.
Submitted 4 March, 2024; v1 submitted 20 October, 2023;
originally announced October 2023.
-
Very high energy gamma-ray emission beyond 10 TeV from GRB 221009A
Authors:
Zhen Cao,
F. Aharonian,
Q. An,
A. Axikegu,
Y. X. Bai,
Y. W. Bao,
D. Bastieri,
X. J. Bi,
Y. J. Bi,
J. T. Cai,
Q. Cao,
W. Y. Cao,
Zhe Cao,
J. Chang,
J. F. Chang,
A. M. Chen,
E. S. Chen,
Liang Chen,
Lin Chen,
Long Chen,
M. J. Chen,
M. L. Chen,
Q. H. Chen,
S. H. Chen,
S. Z. Chen
, et al. (255 additional authors not shown)
Abstract:
The highest energy gamma-rays from gamma-ray bursts (GRBs) have important implications for their radiation mechanism. Here we report for the first time the detection of gamma-rays up to 13 TeV from the brightest GRB 221009A by the Large High Altitude Air-shower Observatory (LHAASO). The LHAASO-KM2A detector registered more than 140 gamma-rays with energies above 3 TeV during 230$-$900s after the trigger. The intrinsic energy spectrum of gamma-rays can be described by a power-law after correcting for extragalactic background light (EBL) absorption. Such a hard spectrum challenges the synchrotron self-Compton (SSC) scenario of relativistic electrons for the afterglow emission above several TeV. Observations of gamma-rays up to 13 TeV from a source with a measured redshift of z=0.151 hint at more transparency in intergalactic space than previously expected. Alternatively, one may invoke new physics such as Lorentz Invariance Violation (LIV) or an axion origin of very high energy (VHE) signals.
Submitted 22 November, 2023; v1 submitted 13 October, 2023;
originally announced October 2023.
-
PhyloGFN: Phylogenetic inference with generative flow networks
Authors:
Mingyang Zhou,
Zichao Yan,
Elliot Layne,
Nikolay Malkin,
Dinghuai Zhang,
Moksh Jain,
Mathieu Blanchette,
Yoshua Bengio
Abstract:
Phylogenetics is a branch of computational biology that studies the evolutionary relationships among biological entities. Its long history and numerous applications notwithstanding, inference of phylogenetic trees from sequence data remains challenging: the high complexity of tree space poses a significant obstacle for the current combinatorial and probabilistic techniques. In this paper, we adopt the framework of generative flow networks (GFlowNets) to tackle two core problems in phylogenetics: parsimony-based and Bayesian phylogenetic inference. Because GFlowNets are well-suited for sampling complex combinatorial structures, they are a natural choice for exploring and sampling from the multimodal posterior distribution over tree topologies and evolutionary distances. We demonstrate that our amortized posterior sampler, PhyloGFN, produces diverse and high-quality evolutionary hypotheses on real benchmark datasets. PhyloGFN is competitive with prior works in marginal likelihood estimation and achieves a closer fit to the target distribution than state-of-the-art variational inference methods. Our code is available at https://github.com/zmy1116/phylogfn.
Submitted 24 March, 2024; v1 submitted 12 October, 2023;
originally announced October 2023.
-
Debias the Training of Diffusion Models
Authors:
Hu Yu,
Li Shen,
Jie Huang,
Man Zhou,
Hongsheng Li,
Feng Zhao
Abstract:
Diffusion models have demonstrated compelling generation quality by optimizing the variational lower bound through a simple denoising score matching loss. In this paper, we provide theoretical evidence that the prevailing practice of using a constant loss weight strategy in diffusion models leads to biased estimation during the training phase. Simply optimizing the denoising network to predict Gaussian noise with constant weighting may hinder precise estimations of original images. To address the issue, we propose an elegant and effective weighting strategy grounded in the theoretically unbiased principle. Moreover, we conduct a comprehensive and systematic exploration to dissect the inherent bias problem deriving from constant weighting loss from the perspectives of its existence, impact and reasons. These analyses are expected to advance our understanding and demystify the inner workings of diffusion models. Through empirical evaluation, we demonstrate that our proposed debiased estimation method significantly enhances sample quality without the reliance on complex techniques, and exhibits improved efficiency compared to the baseline method both in training and sampling processes.
Submitted 3 November, 2023; v1 submitted 12 October, 2023;
originally announced October 2023.
-
Two-Loop QCD Corrections to C even Bottomonium Exclusive Decays to Double $J/ψ$
Authors:
Yu-Dong Zhang,
Xiao-Wei Bai,
Feng Feng,
Wen-Long Sang,
Ming-Zhen Zhou
Abstract:
In the framework of nonrelativistic QCD (NRQCD) factorization, we compute both the polarized and the unpolarized decay widths for the processes $η_b(χ_{bJ})\to J/ψJ/ψ$, accurate up to next-to-next-to-leading-order (NNLO) in $α_s$. For the first time, we confirm that the NRQCD factorization does hold at NNLO for a process involving three quarkonia. We find the radiative corrections are considerable. In particular for $χ_{b2}$, both $\mathcal{O}(α_s)$ and $\mathcal{O}(α_s^2)$ corrections are sizable and negative, and can significantly reduce the leading order prediction. At NNLO, the branching fractions are $8.2\times 10^{-7}$, $6.2\times 10^{-6}$, $7.2\times 10^{-7}$ and $2.7\times 10^{-6}$ for $η_b$, $χ_{b0}$, $χ_{b1}$ and $χ_{b2}$ decay, respectively. Our theoretical predictions are consistent with the upper limits measured by the {\tt Belle} Collaboration. Moreover, we investigate the dependence of the theoretical predictions on the ratio of the charm quark mass and the bottom quark mass. By fixing $m_b$ and varying $m_c$ from $1.25$ to $1.9$ GeV, we find the branching fraction can change by a factor of $2$, $3$, and $6$ for $η_b$, $χ_{b0}$, and $χ_{b1}$, respectively. In the phenomenological analysis, with the integrated luminosity $\mathcal{L}=100\,{\rm fb}^{-1}$, we expect about $(5-10)\times 10^3$ $η_b(χ_{bJ})\to J/ψJ/ψ\to \ell \bar{\ell}\ell \bar{\ell}$ events produced at the {\tt LHC}, so a search for these processes appears promising. On the other hand, there are fewer than $100$ $η_b(χ_{bJ})\to J/ψJ/ψ$ signal events at the B factory, so experimental measurements of these channels seem quite challenging with the current dataset.
Submitted 12 October, 2023; v1 submitted 11 October, 2023;
originally announced October 2023.
-
Learning Stackable and Skippable LEGO Bricks for Efficient, Reconfigurable, and Variable-Resolution Diffusion Modeling
Authors:
Huangjie Zheng,
Zhendong Wang,
Jianbo Yuan,
Guanghan Ning,
Pengcheng He,
Quanzeng You,
Hongxia Yang,
Mingyuan Zhou
Abstract:
Diffusion models excel at generating photo-realistic images but come with significant computational costs in both training and sampling. While various techniques address these computational challenges, a less-explored issue is designing an efficient and adaptable network backbone for iterative refinement. Current options like U-Net and Vision Transformer often rely on resource-intensive deep networks and lack the flexibility needed for generating images at variable resolutions or with a smaller network than used in training. This study introduces LEGO bricks, which seamlessly integrate Local-feature Enrichment and Global-content Orchestration. These bricks can be stacked to create a test-time reconfigurable diffusion backbone, allowing selective skipping of bricks to reduce sampling costs and generate higher-resolution images than the training data. LEGO bricks enrich local regions with an MLP and transform them using a Transformer block while maintaining a consistent full-resolution image across all bricks. Experimental results demonstrate that LEGO bricks enhance training efficiency, expedite convergence, and facilitate variable-resolution image generation while maintaining strong generative performance. Moreover, LEGO significantly reduces sampling time compared to other methods, establishing it as a valuable enhancement for diffusion models. Our code and project page are available at https://jegzheng.github.io/LEGODiffusion.
Submitted 27 June, 2024; v1 submitted 10 October, 2023;
originally announced October 2023.
-
A Holistic Evaluation of Piano Sound Quality
Authors:
Monan Zhou,
Shangda Wu,
Shaohua Ji,
Zijin Li,
Wei Li
Abstract:
This paper aims to develop a holistic evaluation method for piano sound quality to assist in purchasing decisions. Unlike previous studies that focused on the effect of piano performance techniques on sound quality, this study evaluates the inherent sound quality of different pianos. To derive quality evaluation systems, the study uses subjective questionnaires based on a piano sound quality dataset. The method selects the optimal piano classification models by comparing the fine-tuning results of different pre-training models of Convolutional Neural Networks (CNN). To improve the interpretability of the models, the study applies Equivalent Rectangular Bandwidth (ERB) analysis. The results reveal that musically trained individuals are better able to distinguish between the sound quality differences of different pianos. The best fine-tuned CNN pre-trained backbone achieves a high accuracy of 98.3\% as the piano classifier. However, the dataset is limited, and the audio is sliced to increase its quantity, resulting in a lack of diversity and balance, so we use focal loss to reduce the impact of data imbalance. To optimize the method, the dataset will be expanded, or few-shot learning techniques will be employed in future research.
Submitted 7 October, 2023;
originally announced October 2023.
-
Generating 3D Brain Tumor Regions in MRI using Vector-Quantization Generative Adversarial Networks
Authors:
Meng Zhou,
Matthias W Wagner,
Uri Tabori,
Cynthia Hawkins,
Birgit B Ertl-Wagner,
Farzad Khalvati
Abstract:
Medical image analysis has significantly benefited from advancements in deep learning, particularly in the application of Generative Adversarial Networks (GANs) for generating realistic and diverse images that can augment training datasets. However, the effectiveness of such approaches is often limited by the amount of available data in clinical settings. Additionally, the common GAN-based approach is to generate entire image volumes, rather than solely the region of interest (ROI). Research on deep learning-based brain tumor classification using MRI has shown that it is easier to classify the tumor ROIs compared to the entire image volumes. In this work, we present a novel framework that uses vector-quantization GAN and a transformer incorporating masked token modeling to generate high-resolution and diverse 3D brain tumor ROIs that can be directly used as augmented data for the classification of brain tumor ROI. We apply our method to two imbalanced datasets where we augment the minority class: (1) the Multimodal Brain Tumor Segmentation Challenge (BraTS) 2019 dataset to generate new low-grade glioma (LGG) ROIs to balance with high-grade glioma (HGG) class; (2) the internal pediatric LGG (pLGG) dataset tumor ROIs with BRAF V600E Mutation genetic marker to balance with BRAF Fusion genetic marker class. We show that the proposed method outperforms various baseline models in both qualitative and quantitative measurements. The generated data was used to balance the data in the brain tumor types classification task. Using the augmented data, our approach surpasses baseline models by 6.4% in AUC on the BraTS 2019 dataset and 4.3% in AUC on our internal pLGG dataset. The results indicate the generated tumor ROIs can effectively address the imbalanced data problem. Our proposed method has the potential to facilitate an accurate diagnosis of rare brain tumors using MRI scans.
Submitted 2 October, 2023;
originally announced October 2023.
-
Data-Driven Newton Raphson Controller Based on Koopman Operator Theory
Authors:
Mi Zhou
Abstract:
The Newton-Raphson controller is a powerful prediction-based variable-gain integral controller. The classical model-based Newton-Raphson controller requires two elements: the prediction of the system output and the derivative of the predicted output with respect to the control input. In real applications, the model may not be known, and it is infeasible to predict the system some time ahead and calculate the derivative by the finite difference method as done in simulation. To solve these problems, in this work, we use the Koopman operator framework to reconstruct a linear model of the original nonlinear dynamical system and then use the output of the new linear system as the predictor of the Newton-Raphson controller. This method relies only on data collected over some time interval and is thus more practical. Three examples related to highly nonlinear systems are provided to verify the effectiveness of our proposed method.
Submitted 29 September, 2023;
originally announced September 2023.
-
Energy Optimal Control of a Harmonic Oscillator with a State Inequality Constraint
Authors:
Mi Zhou,
Erik I Verriest,
Chaouki Abdallah
Abstract:
In this article, the optimal control problem for a harmonic oscillator with an inequality constraint is considered. The applied energy of the oscillator during a fixed final time period is used as the performance criterion. The analytical solution for both small and large terminal times is found for a special case when the undriven oscillator system is initially at rest. For other initial states of the harmonic oscillator, the optimal solution is found to have three modes: wait-move, move-wait, and move-wait-move given a longer terminal time.
Submitted 28 September, 2023;
originally announced September 2023.
-
Time-Domain Channel Measurements and Small-Scale Fading Characterization for RIS-Assisted Wireless Communication Systems
Authors:
Yanqing Ren,
Mingyong Zhou,
Xiaokun Teng,
Shengguo Meng,
Wankai Tang,
Xiao Li,
Shi Jin,
Michail Matthaiou
Abstract:
As a potentially revolutionary enabling technology for the sixth generation (6G) mobile communication system, reconfigurable intelligent surfaces (RISs) have attracted extensive attention from industry and academia. In RIS-assisted wireless communication systems, practical channel measurements and modeling serve as the foundation for system design, network optimization, and performance evaluation. In this paper, a RIS time-domain channel measurement system, based on a software defined radio (SDR) platform, is developed for the first time to investigate the small-scale fading characteristics of RIS-assisted channels. We present RIS channel measurements in corridor and laboratory scenarios and compare the power delay profile (PDP) of the channel without RIS, with RIS specular reflection, and with RIS intelligent reflection. The multipath component parameters and cluster parameters based on the Saleh-Valenzuela model are extracted. We find that the PDPs of the RIS-assisted channel fit the power-law decay model and approximate the law of square decay. Through intelligent reflection, the RIS can decrease the delay and concentrate the energy of the virtual line-of-sight (VLOS) path, thereby reducing delay spread and mitigating multipath fading. Furthermore, the cluster characteristics of RIS-assisted channels are highly related to the measurement environment. In the laboratory scenario, a single cluster dominated by the VLOS path with smooth envelope is observed. On the other hand, in the corridor scenario, some additional clusters introduced by the RIS reflection are created.
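The link between concentrating energy on one path and a smaller delay spread can be illustrated with the standard power-weighted definitions of mean excess delay and RMS delay spread. The sketch below is not the measurement pipeline of the paper; the function name `delay_spread` and the two toy power delay profiles are hypothetical.

```python
import math


def delay_spread(delays_ns, powers_linear):
    """Return (mean excess delay, RMS delay spread) of a power delay profile.

    delays_ns: tap delays in nanoseconds.
    powers_linear: tap powers on a linear scale (not dB).
    """
    total = sum(powers_linear)
    mean = sum(p * t for t, p in zip(delays_ns, powers_linear)) / total
    second_moment = sum(p * t * t for t, p in zip(delays_ns, powers_linear)) / total
    # Guard against tiny negative values caused by floating-point rounding.
    rms = math.sqrt(max(second_moment - mean * mean, 0.0))
    return mean, rms


# Two taps of equal power versus the same taps with most power on the first.
spread_equal = delay_spread([0.0, 2.0], [1.0, 1.0])
spread_focused = delay_spread([0.0, 2.0], [9.0, 1.0])
```

With two equal-power taps at 0 ns and 2 ns the RMS delay spread is 1 ns; shifting the power ratio to 9:1 in favor of the first tap lowers it to about 0.6 ns, which is the qualitative effect the abstract attributes to intelligent reflection.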
Submitted 27 September, 2023;
originally announced September 2023.
-
Interactive Content Diversity and User Exploration in Online Movie Recommenders: A Field Experiment
Authors:
Ruixuan Sun,
Avinash Akella,
Ruoyan Kong,
Moyan Zhou,
Joseph A. Konstan
Abstract:
Recommender systems often struggle to strike a balance between matching users' tastes and providing unexpected recommendations. When recommendations are too narrow and fail to cover the full range of users' preferences, the system is perceived as useless. Conversely, when the system suggests too many items that users don't like, it is considered impersonal or ineffective. To better understand user sentiment about the breadth of recommendations given by a movie recommender, we conducted interviews and surveys and found that many users considered narrow recommendations to be useful, while a smaller number explicitly wanted greater breadth. Additionally, we designed and ran an online field experiment with a larger user group, evaluating two new interfaces designed to provide users with greater access to broader recommendations. We looked at user preferences and behavior for two groups of users: those with higher initial movie diversity and those with lower diversity. Among our findings, we discovered that different levels of exploration control and users' subjective preferences on interfaces are more predictive of their satisfaction with the recommender.
Submitted 23 September, 2023;
originally announced September 2023.
-
WikiMT++ Dataset Card
Authors:
Monan Zhou,
Shangda Wu,
Yuan Wang,
Wei Li
Abstract:
WikiMT++ is an expanded and refined version of WikiMusicText (WikiMT), featuring 1010 curated lead sheets in ABC notation. To expand application scenarios of WikiMT, we add both objective attributes (album, lyrics, video) and subjective attributes, namely emotion (12 emotion adjectives) and emo\_4q (Russell 4Q), enhancing its usability for tasks such as music information retrieval, conditional music generation, automatic composition, and emotion classification. Additionally, CLaMP is implemented to correct the attributes inherited from WikiMT to reduce errors introduced during original data collection and enhance the accuracy and completeness of our dataset.
Submitted 23 September, 2023;
originally announced September 2023.
-
Optimal path planning of multi-agent cooperative systems with rigid formation
Authors:
Ananda Rangan Narayanan,
Mi Zhou,
Erik Verriest
Abstract:
In this article, we consider the path-planning problem of a cooperative homogeneous robotic system with rigid formation. An optimal controller is designed for each agent in such rigid systems based on Pontryagin's minimum principle. We find that the optimal control for each agent is equivalent to the optimal control for the Center of Mass (CoM). This equivalence is then proved using analytical mechanics. Finally, three examples are simulated to illustrate our theoretical results. One application could be utilizing this equivalence to simplify the original multi-agent optimal control problem.
Submitted 15 September, 2023;
originally announced September 2023.
-
Beta Diffusion
Authors:
Mingyuan Zhou,
Tianqi Chen,
Zhendong Wang,
Huangjie Zheng
Abstract:
We introduce beta diffusion, a novel generative modeling method that integrates demasking and denoising to generate data within bounded ranges. Using scaled and shifted beta distributions, beta diffusion utilizes multiplicative transitions over time to create both forward and reverse diffusion processes, maintaining beta distributions in both the forward marginals and the reverse conditionals, given the data at any point in time. Unlike traditional diffusion-based generative models relying on additive Gaussian noise and reweighted evidence lower bounds (ELBOs), beta diffusion is multiplicative and optimized with KL-divergence upper bounds (KLUBs) derived from the convexity of the KL divergence. We demonstrate that the proposed KLUBs are more effective for optimizing beta diffusion compared to negative ELBOs, which can also be derived as the KLUBs of the same KL divergence with its two arguments swapped. The loss function of beta diffusion, expressed in terms of Bregman divergence, further supports the efficacy of KLUBs for optimization. Experimental results on both synthetic data and natural images demonstrate the unique capabilities of beta diffusion in generative modeling of range-bounded data and validate the effectiveness of KLUBs in optimizing diffusion models, thereby making them valuable additions to the family of diffusion-based generative models and the optimization techniques used to train them.
Submitted 24 December, 2023; v1 submitted 14 September, 2023;
originally announced September 2023.
-
SoccerNet 2023 Challenges Results
Authors:
Anthony Cioppa,
Silvio Giancola,
Vladimir Somers,
Floriane Magera,
Xin Zhou,
Hassan Mkhallati,
Adrien Deliège,
Jan Held,
Carlos Hinojosa,
Amir M. Mansourian,
Pierre Miralles,
Olivier Barnich,
Christophe De Vleeschouwer,
Alexandre Alahi,
Bernard Ghanem,
Marc Van Droogenbroeck,
Abdullah Kamal,
Adrien Maglo,
Albert Clapés,
Amr Abdelaziz,
Artur Xarles,
Astrid Orcesi,
Atom Scott,
Bin Liu,
Byoungkwon Lim
, et al. (77 additional authors not shown)
Abstract:
The SoccerNet 2023 challenges were the third annual video understanding challenges organized by the SoccerNet team. For this third edition, the challenges were composed of seven vision-based tasks split into three main themes. The first theme, broadcast video understanding, is composed of three high-level tasks related to describing events occurring in the video broadcasts: (1) action spotting, focusing on retrieving all timestamps related to global actions in soccer, (2) ball action spotting, focusing on retrieving all timestamps related to the soccer ball change of state, and (3) dense video captioning, focusing on describing the broadcast with natural language and anchored timestamps. The second theme, field understanding, relates to the single task of (4) camera calibration, focusing on retrieving the intrinsic and extrinsic camera parameters from images. The third and last theme, player understanding, is composed of three low-level tasks related to extracting information about the players: (5) re-identification, focusing on retrieving the same players across multiple views, (6) multiple object tracking, focusing on tracking players and the ball through unedited video streams, and (7) jersey number recognition, focusing on recognizing the jersey number of players from tracklets. Compared to the previous editions of the SoccerNet challenges, tasks (2), (3), and (7) are novel, including new annotations and data, task (4) was enhanced with more data and annotations, and task (6) now focuses on end-to-end approaches. More information on the tasks, challenges, and leaderboards is available at https://www.soccer-net.org. Baselines and development kits can be found at https://github.com/SoccerNet.
Submitted 12 September, 2023;
originally announced September 2023.
-
Two is Better Than One: Answering Complex Questions by Multiple Knowledge Sources with Generalized Links
Authors:
Minhao Zhang,
Yongliang Ma,
Yanzeng Li,
Ruoyu Zhang,
Lei Zou,
Ming Zhou
Abstract:
Incorporating multiple knowledge sources is proven to be beneficial for answering complex factoid questions. To utilize multiple knowledge bases (KBs), previous works merge all KBs into a single graph via entity alignment and reduce the problem to question answering (QA) over the fused KB. In reality, various link relations between KBs might be adopted in QA over multi-KBs. In addition to the identity between alignable entities (i.e., full link), unalignable entities expressing different aspects or types of an abstract concept may also be treated as identical in a question (i.e., partial link). Hence, the KB fusion in prior works fails to represent all types of links, restricting their ability to comprehend multi-KBs for QA. In this work, we formulate the novel Multi-KB-QA task, which leverages the full and partial links among multiple KBs to derive correct answers. A benchmark with diversified link and query types is also constructed to efficiently evaluate Multi-KB-QA performance. Finally, we propose a method for Multi-KB-QA that encodes all link relations in the KB embedding to score and rank candidate answers. Experiments show that our method markedly surpasses conventional KB-QA systems in Multi-KB-QA, justifying the necessity of devising this task.
Submitted 10 September, 2023;
originally announced September 2023.
-
Finite-sample analysis of rotation operator under $l_2$ norm and $l_\infty$ norm
Authors:
Mi Zhou
Abstract:
In this article, we consider a special operator called the two-dimensional rotation operator and analyze its convergence and finite-sample bounds under the $l_2$ norm and $l_\infty$ norm with constant step size. We then consider the same problem with stochastic noise with affine variance. Furthermore, simulations are provided to illustrate our results. Finally, we conclude this article by proposing some possible future extensions.
Submitted 9 September, 2023;
originally announced September 2023.
-
Covariant density functional theory for nuclear fission based on two-center harmonic oscillator basis
Authors:
Zeyu Li,
Shengyuan Chen,
Minghui Zhou,
Yong**g Chen,
Zhipan Li
Abstract:
Nowadays, modern microscopic approaches for fission are generally based on the framework of nuclear density functional theory (DFT), which has enabled a self-consistent treatment of both static and dynamic aspects of fission. The key issue is a DFT solver with high precision and efficiency, especially for the large elongated configurations. Purpose: To develop a DFT solver with high precision and efficiency based on the point-coupling covariant density functional theory (CDFT), which has achieved great success in describing properties of nuclei for the whole nuclear chart. Method: We have extended the point-coupling CDFT to be based on the two-center harmonic oscillator (TCHO) basis, which matches well with the large elongated configurations during the fission process. Multi-dimensional constraint and time-dependent generator coordinate method (TDGCM) have been used to analyze the fission potential energy surface (PES) and fission dynamics, respectively. To simulate the splitting process of the nascent fragments beyond scission, we also introduce a density constraint into the new CDFT framework. Results: Illustrative calculations have been done for the PESs and induced fission dynamics of two typical examples: $^{226}$Th and $^{240}$Pu. A more reasonable PES is obtained in the new framework compared to that based on the one-center harmonic oscillator (OCHO) with the same basis space. An optimization of about $0.2\sim0.3$ MeV has been achieved for the outer fission barriers and large elongated configurations. The dynamical simulations based on the TCHO basis present a trend to improve the description of fission yields. Conclusions: The newly developed CDFT solver optimizes the elongated configurations, improves the calculation efficiency, and provides a basis for large-scale multi-dimensional constraint calculations and dynamical simulations.
Submitted 6 September, 2023;
originally announced September 2023.
-
Empowering Low-Light Image Enhancer through Customized Learnable Priors
Authors:
Naishan Zheng,
Man Zhou,
Yanmeng Dong,
Xiangyu Rui,
Jie Huang,
Chongyi Li,
Feng Zhao
Abstract:
Deep neural networks have achieved remarkable progress in enhancing low-light images by improving their brightness and eliminating noise. However, most existing methods construct end-to-end mapping networks heuristically, neglecting the intrinsic prior of the image enhancement task and lacking transparency and interpretability. Although some unfolding solutions have been proposed to relieve these issues, they rely on proximal operator networks that deliver ambiguous and implicit priors. In this work, we propose a paradigm for low-light image enhancement that explores the potential of customized learnable priors to improve the transparency of the deep unfolding paradigm. Motivated by the powerful feature representation capability of Masked Autoencoder (MAE), we customize MAE-based illumination and noise priors and redevelop them from two perspectives: 1) \textbf{structure flow}: we train the MAE from a normal-light image to its illumination properties and then embed it into the proximal operator design of the unfolding architecture; and 2) \textbf{optimization flow}: we train MAE from a normal-light image to its gradient representation and then employ it as a regularization term to constrain noise in the model output. These designs improve the interpretability and representation capability of the model. Extensive experiments on multiple low-light image enhancement datasets demonstrate the superiority of our proposed paradigm over state-of-the-art methods. Code is available at https://github.com/zheng980629/CUE.
Submitted 5 September, 2023;
originally announced September 2023.
-
Learned Image Reasoning Prior Penetrates Deep Unfolding Network for Panchromatic and Multi-Spectral Image Fusion
Authors:
Man Zhou,
Jie Huang,
Naishan Zheng,
Chongyi Li
Abstract:
The success of deep neural networks for pan-sharpening commonly comes in the form of a black box, lacking transparency and interpretability. To alleviate this issue, we propose a novel model-driven deep unfolding framework with an image reasoning prior tailored for the pan-sharpening task. Different from existing unfolding solutions that deliver the proximal operator networks as uncertain and vague priors, our framework is motivated by the content reasoning ability of masked autoencoders (MAE) with insightful designs. Specifically, the pre-trained MAE with a spatial masking strategy, acting as an intrinsic reasoning prior, is embedded into the unfolding architecture. Meanwhile, the pre-trained MAE with a spatial-spectral masking strategy is treated as the regularization term within the loss function to constrain the spatial-spectral consistency. Such designs embed the image reasoning prior into deep unfolding networks while improving their interpretability and representation capability. The uniqueness of our framework is that the holistic learning process is explicitly integrated with the inherent physical mechanism underlying the pan-sharpening task. Extensive experiments on multiple satellite datasets demonstrate the superiority of our method over the existing state-of-the-art approaches. Code will be released at \url{https://manman1995.github.io/}.
Submitted 30 August, 2023;
originally announced August 2023.
-
Generalized Lightness Adaptation with Channel Selective Normalization
Authors:
Mingde Yao,
Jie Huang,
Xin **,
Ruikang Xu,
Shenglong Zhou,
Man Zhou,
Zhiwei Xiong
Abstract:
Lightness adaptation is vital to the success of image processing to avoid unexpected visual deterioration, which covers multiple aspects, e.g., low-light image enhancement, image retouching, and inverse tone mapping. Existing methods typically work well on their trained lightness conditions but perform poorly in unknown ones due to their limited generalization ability. To address this limitation, we propose a novel generalized lightness adaptation algorithm that extends conventional normalization techniques through a channel filtering design, dubbed Channel Selective Normalization (CSNorm). The proposed CSNorm purposely normalizes the statistics of lightness-relevant channels and keeps other channels unchanged, so as to improve feature generalization and discrimination. To optimize CSNorm, we propose an alternating training strategy that effectively identifies lightness-relevant channels. The model equipped with our CSNorm only needs to be trained on one lightness condition and can be well generalized to unknown lightness conditions. Experimental results on multiple benchmark datasets demonstrate the effectiveness of CSNorm in enhancing the generalization ability for the existing lightness adaptation methods. Code is available at https://github.com/mdyao/CSNorm.
Submitted 26 August, 2023;
originally announced August 2023.
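The channel-selective idea described in the CSNorm abstract above (normalize the statistics of some channels, leave the rest unchanged) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `channel_selective_norm`, the fixed boolean `selected` mask, and the use of per-channel standardization are all assumptions made for illustration. In the paper the lightness-relevant channels are identified by an alternating training strategy, which is not reproduced here.

```python
from statistics import fmean, pstdev


def channel_selective_norm(channels, selected, eps=1e-5):
    """Standardize only the channels flagged in `selected`.

    channels: list of channels, each a flat list of floats (hypothetical layout).
    selected: list of booleans, one per channel (hypothetical, hand-set mask).
    """
    out = []
    for values, is_selected in zip(channels, selected):
        if is_selected:
            # Remove the channel's own mean and scale by its standard deviation.
            mean = fmean(values)
            std = pstdev(values)
            out.append([(v - mean) / (std + eps) for v in values])
        else:
            # Unselected channels pass through unchanged.
            out.append(list(values))
    return out


# Two toy channels: the first is treated as lightness-relevant, the second is not.
features = [[0.2, 0.4, 0.6], [1.0, 3.0, 5.0]]
mask = [True, False]
normalized = channel_selective_norm(features, mask)
```

After the call, the first channel has approximately zero mean, while the second channel is returned exactly as it was passed in.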
-
Ultra-clean assembly of van der Waals heterostructures
Authors:
Wendong Wang,
Nicholas Clark,
Matthew Hamer,
Amy Carl,
Endre Tovari,
Sam Sullivan-Allsop,
Evan Tillotson,
Yunze Gao,
Hugo de Latour,
Francisco Selles,
James Howarth,
Eli G. Castanon,
Mingwei Zhou,
Haoyu Bai,
Xiao Li,
Astrid Weston,
Kenji Watanabe,
Takashi Taniguchi,
Cecilia Mattevi,
Thomas H. Bointon,
Paul V. Wiper,
Andrew J. Strudwick,
Leonid A. Ponomarenko,
Andrey Kretinin,
Sarah J. Haigh
, et al. (2 additional authors not shown)
Abstract:
Layer-by-layer assembly of van der Waals (vdW) heterostructures underpins new discoveries in solid state physics, material science and chemistry. Despite the successes, all current 2D material (2DM) transfer techniques rely on the use of polymers which limit the cleanliness, ultimate electronic performance, and potential for optoelectronic applications of the heterostructures. In this article, we present a novel polymer-free platform for rapid and facile heterostructure assembly which utilises re-usable flexible silicon nitride membranes. We demonstrate that this allows fast and reproducible production of 2D heterostructures using both exfoliated and CVD-grown materials with perfect interfaces free from interlayer contamination and correspondingly excellent electronic behaviour, limited only by the size and intrinsic quality of the crystals used. Furthermore, removing the need for polymeric carriers allows new possibilities for vdW heterostructure fabrication: assembly at high temperatures up to 600°C, and in different environments including ultra-high vacuum (UHV) and when the materials are fully submerged in liquids. We demonstrate UHV heterostructure assembly for the first time, and show the reliable creation of graphene moiré superlattices with more than an order of magnitude improvement in their structural homogeneity. We believe that broad adaptation of our novel inorganic 2D materials assembly strategy will allow realisation of the full potential of vdW heterostructures as a platform for new physics and advanced optoelectronic technologies.
Submitted 25 August, 2023;
originally announced August 2023.
-
Personalized First Issue Recommender for Newcomers in Open Source Projects
Authors:
Wenxin Xiao,
**gyue Li,
Hao He,
Ruiqiao Qiu,
Minghui Zhou
Abstract:
Many open source projects provide good first issues (GFIs) to attract and retain newcomers. Although several automated GFI recommenders have been proposed, existing recommenders are limited to recommending generic GFIs without considering differences between individual newcomers. However, we observe mismatches between generic GFIs and the diverse backgrounds of newcomers, resulting in failed attempts, discouraged onboarding, and delayed issue resolution. To address this problem, we assume that personalized first issues (PFIs) for newcomers could help reduce the mismatches. To justify the assumption, we empirically analyze 37 newcomers and their first issues resolved across multiple projects. We find that the first issues resolved by the same newcomer share similarities in task type, programming language, and project domain. These findings underscore the need for a PFI recommender to improve over state-of-the-art approaches. For that purpose, we identify features that influence newcomers' personalized selection of first issues by analyzing the relationship between possible features of the newcomers and the characteristics of the newcomers' chosen first issues. We find that the expertise preference, OSS experience, activeness, and sentiment of newcomers drive their personalized choice of the first issues. Based on these findings, we propose a Personalized First Issue Recommender (PFIRec), which employs LambdaMART to rank candidate issues for a given newcomer by leveraging the identified influential features. We evaluate PFIRec using a dataset of 68,858 issues from 100 GitHub projects. The evaluation results show that PFIRec outperforms existing first issue recommenders, potentially doubling the probability that the top recommended issue is suitable for a specific newcomer and reducing by one-third, in the median, a newcomer's unsuccessful attempts to identify suitable first issues.
Submitted 26 August, 2023; v1 submitted 17 August, 2023;
originally announced August 2023.
-
Mitigating Semantic Confusion from Hostile Neighborhood for Graph Active Learning
Authors:
Tianmeng Yang,
Min Zhou,
Yu**g Wang,
Zhengjie Lin,
Lujia Pan,
Bin Cui,
Yunhai Tong
Abstract:
Graph Active Learning (GAL), which aims to find the most informative nodes in graphs for annotation to maximize the performance of Graph Neural Networks (GNNs), has attracted many research efforts but still poses non-trivial challenges. One major challenge is that existing GAL strategies may introduce semantic confusion to the selected training set, particularly when graphs are noisy. Specifically, most existing methods assume all aggregated features to be helpful, ignoring the semantically negative effect of inter-class edges under the message-passing mechanism. In this work, we present a Semantic-aware Active learning framework for Graphs (SAG) to mitigate the semantic confusion problem. Pairwise similarities and dissimilarities of nodes with semantic features are introduced to jointly evaluate the node influence. A new prototype-based criterion and query policy are also designed to maintain diversity and class balance of the selected nodes, respectively. Extensive experiments on public benchmark graphs and a real-world financial dataset demonstrate that SAG significantly improves node classification performance and consistently outperforms previous methods. Moreover, comprehensive analysis and an ablation study also verify the effectiveness of the proposed framework.
Submitted 17 August, 2023;
originally announced August 2023.
-
Growth of millimeter-sized high-quality CuFeSe$_2$ single crystals by the molten salt method and study of their semiconducting behavior
Authors:
Mingwei Ma,
Binbin Ruan,
Menghu Zhou,
Yadong Gu,
Qingxin Dong,
Qingsong Yang,
Qiaoyu Wang,
Lewei Chen,
Yunqing Shi,
Junkun Yi,
Genfu Chen,
Zhian Ren
Abstract:
A eutectic AlCl$_3$/KCl molten salt method in a horizontal configuration was employed to grow millimeter-sized and compositionally homogeneous CuFeSe$_2$ single crystals, owing to the continuous growth process in a temperature-gradient-induced solution convection. The typical as-grown CuFeSe$_2$ single crystals in cubic forms are nearly 1.6$\times$1.2$\times$1.0 mm$^3$ in size. The chemical composition and homogeneity of the crystals were examined by both inductively coupled plasma atomic emission spectroscopy and energy dispersive spectrometry, with Cu:Fe:Se = 0.96:1.00:1.99, consistent with the stoichiometric composition of CuFeSe$_2$. The magnetic measurements suggest a ferrimagnetic or weak ferromagnetic transition below T$_C$ = 146 K, and the resistivity reveals a semiconducting behavior and an abrupt increase below T$_C$.
Submitted 16 August, 2023;
originally announced August 2023.
-
How Early Participation Determines Long-Term Sustained Activity in GitHub Projects?
Authors:
Wenxin Xiao,
Hao He,
Weiwei Xu,
Yuxia Zhang,
Minghui Zhou
Abstract:
Although the open source model bears many advantages in software development, open source projects are always hard to sustain. Previous research on open source sustainability mainly focuses on projects that have already reached a certain level of maturity (e.g., with communities, releases, and downstream projects). However, limited attention is paid to the development of (sustainable) open source projects in their infancy, and we believe an understanding of early sustainability determinants is crucial for project initiators, incubators, newcomers, and users.
In this paper, we aim to explore the relationship between early participation factors and long-term project sustainability. We leverage a novel methodology combining the Blumberg model of performance and machine learning to predict the sustainability of 290,255 GitHub projects. Specifically, we train an XGBoost model based on early participation (first three months of activity) in 290,255 GitHub projects and we interpret the model using LIME. We quantitatively show that early participants have a positive effect on a project's future sustained activity if they have prior experience in OSS project incubation and demonstrate concentrated focus and steady commitment. Participation from non-code contributors and detailed contribution documentation also promote a project's sustained activity. Compared with individual projects, building a community that consists of more experienced core developers and more active peripheral developers is important for organizational projects. This study provides unique insights into the incubation and recognition of sustainable open source projects, and our interpretable prediction approach can also offer guidance to open source project initiators and newcomers.
Submitted 28 September, 2023; v1 submitted 11 August, 2023;
originally announced August 2023.
-
Image-based Geolocalization by Ground-to-2.5D Map Matching
Authors:
Mengjie Zhou,
Liu Liu,
Yiran Zhong,
Andrew Calway
Abstract:
We study the image-based geolocalization problem, aiming to localize ground-view query images on cartographic maps. Current methods often utilize cross-view localization techniques to match ground-view query images with 2D maps. However, the performance of these methods is unsatisfactory due to significant cross-view appearance differences. In this paper, we lift cross-view matching to a 2.5D space, where heights of structures (e.g., trees and buildings) provide geometric information to guide the cross-view matching. We propose a new approach to learning representative embeddings from multi-modal data. Specifically, we establish a projection relationship between 2.5D space and 2D aerial-view space. The projection is further used to combine multi-modal features from the 2.5D and 2D maps using an effective pixel-to-point fusion method. By encoding crucial geometric cues, our method learns discriminative location embeddings for matching panoramic images and maps. Additionally, we construct the first large-scale ground-to-2.5D map geolocalization dataset to validate our method and facilitate future research. Both single-image based and route based localization experiments are conducted to test our method. Extensive experiments demonstrate that the proposed method achieves significantly higher localization accuracy and faster convergence than previous 2D map-based approaches.
Submitted 3 November, 2023; v1 submitted 11 August, 2023;
originally announced August 2023.
-
Understanding and Remediating Open-Source License Incompatibilities in the PyPI Ecosystem
Authors:
Weiwei Xu,
Hao He,
Kai Gao,
Minghui Zhou
Abstract:
The reuse and distribution of open-source software must be in compliance with its accompanying open-source license. In modern packaging ecosystems, maintaining such compliance is challenging because a package may have a complex multi-layered dependency graph with many packages, any of which may have an incompatible license. Although prior research finds that license incompatibilities are prevalent, empirical evidence is still scarce in some modern packaging ecosystems (e.g., PyPI). It also remains unclear how developers remediate the license incompatibilities in the dependency graphs of their packages (including direct and transitive dependencies), let alone any automated approaches. To bridge this gap, we conduct a large-scale empirical study of license incompatibilities and their remediation practices in the PyPI ecosystem. We find that 7.27% of the PyPI package releases have license incompatibilities and 61.3% of them are caused by transitive dependencies, causing challenges in their remediation; for remediation, developers can apply one of the five strategies: migration, removal, pinning versions, changing their own licenses, and negotiation. Inspired by our findings, we propose SILENCE, an SMT-solver-based approach to recommend license incompatibility remediations with minimal costs in package dependency graph. Our evaluation shows that the remediations proposed by SILENCE can match 19 historical real-world cases (except for migrations not covered by an existing knowledge base) and have been accepted by five popular PyPI packages whose developers were previously unaware of their license incompatibilities.
Submitted 11 August, 2023;
originally announced August 2023.
-
TextPainter: Multimodal Text Image Generation with Visual-harmony and Text-comprehension for Poster Design
Authors:
Yifan Gao,
**peng Lin,
Min Zhou,
Chuanbin Liu,
Hongtao Xie,
Tiezheng Ge,
Yuning Jiang
Abstract:
Text design is one of the most critical procedures in poster design, as it relies heavily on the creativity and expertise of humans to design text images considering the visual harmony and text-semantic. This study introduces TextPainter, a novel multimodal approach that leverages contextual visual information and corresponding text semantics to generate text images. Specifically, TextPainter takes the global-local background image as a hint of style and guides the text image generation with visual harmony. Furthermore, we leverage the language model and introduce a text comprehension module to achieve both sentence-level and word-level style variations. Besides, we construct the PosterT80K dataset, consisting of about 80K posters annotated with sentence-level bounding boxes and text contents. We hope this dataset will pave the way for further research on multimodal text image generation. Extensive quantitative and qualitative experiments demonstrate that TextPainter can generate visually-and-semantically-harmonious text images for posters.
Submitted 12 August, 2023; v1 submitted 9 August, 2023;
originally announced August 2023.
-
The effect of Quantum Statistics on the sensitivity in an SU(1,1) interferometer
Authors:
Jie Zeng,
Yingxing Ding,
Mengyao Zhou,
Gao-Feng Jiao,
Keye Zhang,
L. Q. Chen,
Wei** Zhang,
Chun-Hua Yuan
Abstract:
We theoretically study the effect of quantum statistics of the light field on the quantum enhancement of parameter estimation based on cat-state input to the SU(1,1) interferometer. The phase sensitivity is dependent on the relative phase $θ$ between the two coherent states of Schrödinger cat states. The optimal sensitivity is achieved when the relative phase is $π$, i.e., odd coherent states input. For a coherent state input into one port, the phase sensitivity of the odd coherent state into the second input port is inferior to that of the squeezed vacuum state input. However, in the presence of losses, the Schrödinger cat states are more resistant to loss than squeezed vacuum states. As the amplitude of Schrödinger cat states increases, the quantum enhancement of phase sensitivity decreases, which shows that the quantum statistics of Schrödinger cat states tends towards Poisson statistics from sub-Poisson statistics or super-Poisson statistics.
Submitted 5 August, 2023;
originally announced August 2023.
-
Magnetogenesis in a collisionless plasma: from Weibel instability to turbulent dynamo
Authors:
Muni Zhou,
Vladimir Zhdankin,
Matthew W. Kunz,
Nuno F. Loureiro,
Dmitri A. Uzdensky
Abstract:
We report on a first-principles numerical and theoretical study of plasma dynamo in a fully kinetic framework. By applying an external mechanical force to an initially unmagnetized plasma, we develop a self-consistent treatment of the generation of "seed" magnetic fields, the formation of turbulence, and the inductive amplification of fields by the fluctuation dynamo. Driven large-scale motions in an unmagnetized, weakly collisional plasma are subject to strong phase mixing, which leads to the development of thermal pressure anisotropy. This anisotropy triggers the Weibel instability, which produces filamentary "seed" magnetic fields on plasma-kinetic scales. The plasma is thereby magnetized, enabling efficient stretching and folding of the fields by the plasma motions and the development of Larmor-scale kinetic instabilities such as the firehose and mirror. The scattering of particles off the associated microscale magnetic fluctuations provides an effective viscosity, regulating the field morphology and turbulence. During this process, the seed fields are further amplified by the fluctuation dynamo until they reach energy equipartition with the turbulent flow. By demonstrating that equipartition magnetic fields can be generated from an initially unmagnetized plasma through large-scale turbulent flows, this work has important implications for the origin and amplification of magnetic fields in the intracluster and intergalactic mediums.
Submitted 28 July, 2023;
originally announced August 2023.