Search | arXiv e-print repository

doi 10.1145/3539618.3591828

COUPA: An Industrial Recommender System for Online to Offline Service Platforms

Authors: Sicong Xie, Binbin Hu, Fengze Li, Ziqi Liu, Zhiqiang Zhang, Wenliang Zhong, Jun Zhou

Abstract: Aiming at helping users locally discover retail services (e.g., entertainment and dining), Online to Offline (O2O) service platforms have become popular in recent years, which greatly challenge current recommender systems. With the real data in Alipay, a feeds-like scenario for O2O services, we find that recurrence based temporal patterns and position biases commonly exist in our scenarios, which seriously threaten the recommendation effectiveness. To this end, we propose COUPA, an industrial system aimed at characterizing user preference with the following two considerations: (1) Time aware preference: we employ the continuous time aware point process equipped with an attention mechanism to fully capture temporal patterns for recommendation. (2) Position aware preference: a position selector component equipped with a position personalization module is elaborately designed to mitigate position bias in a personalized manner. Finally, we carefully implement and deploy COUPA on Alipay with a cooperation of edge, streaming and batch computing, as well as a two-stage online serving mode, to support several popular recommendation scenarios. We conduct extensive experiments to demonstrate that COUPA consistently achieves superior performance and has potential to provide intuitive evidence for recommendation.

Submitted 24 April, 2023; originally announced April 2023.

Comments: The short version has been accepted by the SIGIR 2023 Industrial track

arXiv:2304.08010 [pdf, other]

doi 10.1088/1674-4527/acd0e9

Mock X-ray observations of hot gas with L-Galaxies semi-analytic models of galaxy formation

Authors: Wenxin Zhong, Jian Fu, Shiyin Shen, Feng Yuan

Abstract: We create mock X-ray observations of hot gas in galaxy clusters with a new extension of the L-Galaxies semi-analytic model of galaxy formation, which includes the radial distribution of hot gas in each halo. Based on the model outputs, we first build some mock light cones, then generate mock spectra with the SOXS package and derive the mock images in the light cones. Using the mock data, we simulate the mock X-ray spectra for the ROSAT all-sky survey, and compare the mock spectra with the observational results. Then, we consider the design parameters of the HUBS mission and simulate the observation of the halo hot gas for HUBS as an important application of our mock work. We find: (1) Our mock data match the observations by current X-ray telescopes. (2) The survey of hot baryons in resolved clusters by HUBS is effective below redshift 0.5, and the observations of the emission lines in point-like sources at z>0.5 by HUBS help us understand the hot baryons in the early universe. (3) By taking advantage of the large simulation box and flexibility in semi-analytic models, our mock X-ray observations provide the opportunity to make target selection and observation strategies for forthcoming X-ray facilities.

Submitted 24 April, 2023; v1 submitted 17 April, 2023; originally announced April 2023.

Comments: 15 pages, 11 figures, accepted for publication in RAA

arXiv:2304.06960 [pdf, other]

Estimating Conditional Average Treatment Effects with Heteroscedasticity by Model Averaging and Matching

Authors: Pengfei Shi, Xinyu Zhang, Wei Zhong

Abstract: Causal inference is indispensable in many fields of empirical research, such as marketing, economic policy, medicine and so on. The estimation of treatment effects is the most important in causal inference. In this paper, we propose a model averaging approach, combined with a partition and matching method to estimate the conditional average treatment effect under heteroskedastic error settings. In our methods, we use the partition and matching method to approximate the true treatment effects and choose the weights by minimizing a leave-one-out cross validation criterion, which is also known as the jackknife method. We prove that the model averaging estimator with weights determined by our criterion has asymptotic optimality, which achieves the lowest possible squared error. When there exist correct models in the candidate model set, we have proved that the sum of weights of the correct models converges to 1 as the sample size increases but with a finite number of baseline covariates. A Monte Carlo simulation study shows that our method has good performance in finite sample cases. We apply this approach to a National Supported Work Demonstration data set.

Submitted 14 April, 2023; originally announced April 2023.

arXiv:2304.06364 [pdf, other]

AGIEval: A Human-Centric Benchmark for Evaluating Foundation Models

Authors: Wanjun Zhong, Ruixiang Cui, Yiduo Guo, Yaobo Liang, Shuai Lu, Yanlin Wang, Amin Saied, Weizhu Chen, Nan Duan

Abstract: Evaluating the general abilities of foundation models to tackle human-level tasks is a vital aspect of their development and application in the pursuit of Artificial General Intelligence (AGI). Traditional benchmarks, which rely on artificial datasets, may not accurately represent human-level capabilities. In this paper, we introduce AGIEval, a novel benchmark specifically designed to assess foundation models in the context of human-centric standardized exams, such as college entrance exams, law school admission tests, math competitions, and lawyer qualification tests. We evaluate several state-of-the-art foundation models, including GPT-4, ChatGPT, and Text-Davinci-003, using this benchmark. Impressively, GPT-4 surpasses average human performance on SAT, LSAT, and math competitions, attaining a 95% accuracy rate on the SAT Math test and a 92.5% accuracy on the English test of the Chinese national college entrance exam. This demonstrates the extraordinary performance of contemporary foundation models. In contrast, we also find that GPT-4 is less proficient in tasks that require complex reasoning or specific domain knowledge. Our comprehensive analyses of model capabilities (understanding, knowledge, reasoning, and calculation) reveal these models' strengths and limitations, providing valuable insights into future directions for enhancing their general capabilities. By concentrating on tasks pertinent to human cognition and decision-making, our benchmark delivers a more meaningful and robust evaluation of foundation models' performance in real-world scenarios. The data, code, and all model outputs are released in https://github.com/ruixiangcui/AGIEval.

Submitted 18 September, 2023; v1 submitted 13 April, 2023; originally announced April 2023.

Comments: 19 pages

arXiv:2304.05871 [pdf, other]

doi 10.1145/3539618.3591976

Edge-cloud Collaborative Learning with Federated and Centralized Features

Authors: Zexi Li, Qunwei Li, Yi Zhou, Wenliang Zhong, Guannan Zhang, Chao Wu

Abstract: Federated learning (FL) is a popular way of edge computing that doesn't compromise users' privacy. Current FL paradigms assume that data only resides on the edge, while cloud servers only perform model averaging. However, in real-life situations such as recommender systems, the cloud server has the ability to store historical and interactive features. In this paper, our proposed Edge-Cloud Collaborative Knowledge Transfer Framework (ECCT) bridges the gap between the edge and cloud, enabling bi-directional knowledge transfer between both, sharing feature embeddings and prediction logits. ECCT consolidates various benefits, including enhancing personalization, enabling model heterogeneity, tolerating training asynchronization, and relieving communication burdens. Extensive experiments on public and industrial datasets demonstrate ECCT's effectiveness and potential for use in academia and industry.

Submitted 12 April, 2023; originally announced April 2023.

Comments: Accepted by Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '23)

arXiv:2303.15858 [pdf, ps, other]

doi 10.1103/PhysRevApplied.19.014036

Device-independent quantum secure direct communication with single photon sources

Authors: Lan Zhou, Bao-Wen Xu, Wei Zhong, Yu-Bo Sheng

Abstract: Quantum secure direct communication (QSDC) can directly transmit secret messages through a quantum channel. Device-independent (DI) QSDC can guarantee the communication security relying only on the observation of the Bell inequality violation, but not on any detailed description or trust of the inner workings of users' devices. In the paper, we propose a DI-QSDC protocol with practical high-efficiency single photon sources. The communication parties construct the entanglement channel from single photons by adopting the heralded architecture, which makes the message leakage rate independent of the photon transmission loss. The secure communication distance and the practical communication efficiency of the current DI-QSDC protocol are about 6 times and 600 times those in the original DI-QSDC protocol. Combining with the entanglement purification, the parties can construct the nearly perfect entanglement channel and completely eliminate the message leakage. This DI-QSDC protocol may have important application in future quantum communication field.

Submitted 28 March, 2023; originally announced March 2023.

Comments: 11 pages, 4 figures

Journal ref: Physical Review Applied 19, 014036 (2023)

arXiv:2303.05172 [pdf, other]

doi 10.1016/j.nima.2023.168680

The JUNO experiment Top Tracker

Authors: JUNO Collaboration, Angel Abusleme, Thomas Adam, Shakeel Ahmad, Rizwan Ahmed, Sebastiano Aiello, Muhammad Akram, Abid Aleem, Tsagkarakis Alexandros, Fengpeng An, Qi An, Giuseppe Andronico, Nikolay Anfimov, Vito Antonelli, Tatiana Antoshkina, Burin Asavapibhop, João Pedro Athayde Marcondes de André, Didier Auguste, Weidong Bai, Nikita Balashov, Wander Baldini, Andrea Barresi, Davide Basilico, Eric Baussan, Marco Bellato , et al. (592 additional authors not shown)

Abstract: The main task of the Top Tracker detector of the neutrino reactor experiment Jiangmen Underground Neutrino Observatory (JUNO) is to reconstruct and extrapolate atmospheric muon tracks down to the central detector. This muon tracker will help to evaluate the contribution of the cosmogenic background to the signal. The Top Tracker is located above JUNO's water Cherenkov Detector and Central Detector, covering about 60% of the surface above them. The JUNO Top Tracker is constituted by the decommissioned OPERA experiment Target Tracker modules. The technology used consists in walls of two planes of plastic scintillator strips, one per transverse direction. Wavelength shifting fibres collect the light signal emitted by the scintillator strips and guide it to both ends where it is read by multianode photomultiplier tubes. Compared to the OPERA Target Tracker, the JUNO Top Tracker uses new electronics able to cope with the high rate produced by the high rock radioactivity compared to the one in Gran Sasso underground laboratory. This paper will present the new electronics and mechanical structure developed for the Top Tracker of JUNO along with its expected performance based on the current detector simulation.

Submitted 9 March, 2023; originally announced March 2023.

Comments: 20 pages

Journal ref: Nucl.Instrum.Meth.A 1057 (2023) 168680

arXiv:2303.03910 [pdf, other]

JUNO sensitivity to $^7$Be, $pep$, and CNO solar neutrinos

Authors: Angel Abusleme, Thomas Adam, Shakeel Ahmad, Rizwan Ahmed, Sebastiano Aiello, Muhammad Akram, Abid Aleem, Tsagkarakis Alexandros, Fengpeng An, Qi An, Giuseppe Andronico, Nikolay Anfimov, Vito Antonelli, Tatiana Antoshkina, Burin Asavapibhop, João Pedro Athayde Marcondes de André, Didier Auguste, Weidong Bai, Nikita Balashov, Wander Baldini, Andrea Barresi, Davide Basilico, Eric Baussan, Marco Bellato, Marco Beretta , et al. (592 additional authors not shown)

Abstract: The Jiangmen Underground Neutrino Observatory (JUNO), the first multi-kton liquid scintillator detector, which is under construction in China, will have a unique potential to perform a real-time measurement of solar neutrinos well below the few MeV threshold typical for Water Cherenkov detectors. JUNO's large target mass and excellent energy resolution are prerequisites for reaching unprecedented levels of precision. In this paper, we provide estimation of the JUNO sensitivity to 7Be, pep, and CNO solar neutrinos that can be obtained via a spectral analysis above the 0.45 MeV threshold. This study is performed assuming different scenarios of the liquid scintillator radiopurity, ranging from the most optimistic one corresponding to the radiopurity levels obtained by the Borexino experiment, up to the minimum requirements needed to perform the neutrino mass ordering determination with reactor antineutrinos - the main goal of JUNO. Our study shows that in most scenarios, JUNO will be able to improve the current best measurements on 7Be, pep, and CNO solar neutrino fluxes. We also perform a study on the JUNO capability to detect periodical time variations in the solar neutrino flux, such as the day-night modulation induced by neutrino flavor regeneration in Earth, and the modulations induced by temperature changes driven by helioseismic waves.

Submitted 7 March, 2023; originally announced March 2023.

arXiv:2303.03055 [pdf]

Low-discrepancy Sampling in the Expanded Dimensional Space: An Acceleration Technique for Particle Swarm Optimization

Authors: Feng Wu, Yuelin Zhao, Jianhua Pang, Jun Yan, Wanxie Zhong

Abstract: Compared with random sampling, low-discrepancy sampling is more effective in covering the search space. However, the existing research cannot definitely state whether the impact of a low-discrepancy sample on particle swarm optimization (PSO) is positive or negative. Using Niederreiter's theorem, this study completes an error analysis of PSO, which reveals that the error bound of PSO at each iteration depends on the dispersion of the sample set in an expanded dimensional space. Based on this error analysis, an acceleration technique for PSO-type algorithms is proposed with low-discrepancy sampling in the expanded dimensional space. The acceleration technique can generate a low-discrepancy sample set with a smaller dispersion, compared with a random sampling, in the expanded dimensional space; it also reduces the error at each iteration, and hence improves the convergence speed. The acceleration technique is combined with the standard PSO and the comprehensive learning particle swarm optimization, and the performance of the improved algorithm is compared with the original algorithm. The experimental results show that the two improved algorithms have significantly faster convergence speed under the same accuracy requirement.

Submitted 2 July, 2023; v1 submitted 6 March, 2023; originally announced March 2023.

Comments: 29 pages, 0 figures

ACM Class: F.2.2

arXiv:2302.13018 [pdf, other]

Sparse Bayesian Learning-Based 3D Spectrum Environment Map Construction-Sampling Optimization, Scenario-Dependent Dictionary Construction and Sparse Recovery

Authors: Jie Wang, Qiuming Zhu, Zhipeng Lin, Qihui Wu, Yang Huang, Xuezhao Cai, Weizhi Zhong, Yi Zhao

Abstract: The spectrum environment map (SEM), which can visualize the information of invisible electromagnetic spectrum, is vital for monitoring, management, and security of spectrum resources in cognitive radio (CR) networks. In view of a limited number of spectrum sensors and constrained sampling time, this paper presents a new three-dimensional (3D) SEM construction scheme based on sparse Bayesian learning (SBL). Firstly, we construct a scenario-dependent channel dictionary matrix by considering the propagation characteristic of the interested scenario. To improve sampling efficiency, a maximum mutual information (MMI)-based optimization algorithm is developed for the layout of sampling sensors. Then, a maximum and minimum distance (MMD) clustering-based SBL algorithm is proposed to recover the spectrum data at the unsampled positions and construct the whole 3D SEM. We finally use the simulation data of the campus scenario to construct the 3D SEMs and compare the proposed method with the state-of-the-art. The recovery performance and the impact of different sparsity on the constructed SEMs are also analyzed. Numerical results show that the proposed scheme can reduce the required spectrum sensor number and has higher accuracy under the low sampling rate.

Submitted 25 February, 2023; originally announced February 2023.

Comments: 13 pages, 13 figures

arXiv:2302.12679 [pdf, other]

doi 10.1093/mnras/stad628

A Study of GeV Gamma-ray Emission toward Supernova Remnant G51.26+0.11 and Its Molecular Environment

Authors: Wen-Juan Zhong, Xiao Zhang, Yang Chen, Qian-Qian Zhang

Abstract: We reanalyze the Fermi-LAT GeV $γ$-ray emission in the region of supernova remnant (SNR) G51.26+0.11 and investigate its interstellar molecular environment with the CO-line data. At GeV energies, based on 13.2 years of Fermi-LAT data, the extended $γ$-ray emission observed in this region is resolved into a uniform-disk source ('Src A') with a significance of 19.5$σ$ and a point source (4FGL J1924.3+1628) with a significance of 4.2$σ$ in 0.2$-$500 GeV. With an angular radius of $\sim$ 0.17$^\circ$, 'Src A' overlaps with SNR G51.26+0.11 significantly in the line of sight. On the other hand, the morphological coincidence between the SNR and the $\sim$ +54 km s$^{-1}$ molecular clouds (MCs) together with the asymmetric or broad $^{12}$CO line profiles near the SNR boundary provides evidence for the very likely SNR-MC interaction. The SNR-MC interaction and the HI absorption features indicate that SNR G51.26+0.11 is located at a kinematic distance of 6.2 $\pm$ 0.5 kpc. Combined with the results from the multi-wavelength analysis, the $γ$-ray emission of the SNR ('Src A') can be naturally explained by a hadronic model with a soft power-law proton spectrum of index $\sim$ 2.25.

Submitted 24 February, 2023; originally announced February 2023.

Comments: 10 pages, 8 figures, accepted for publication in MNRAS

arXiv:2302.09736 [pdf, other]

doi 10.1609/aaai.v37i3.25483

STOA-VLP: Spatial-Temporal Modeling of Object and Action for Video-Language Pre-training

Authors: Weihong Zhong, Mao Zheng, Duyu Tang, Xuan Luo, Heng Gong, Xiaocheng Feng, Bing Qin

Abstract: Although large-scale video-language pre-training models, which usually build a global alignment between the video and the text, have achieved remarkable progress on various downstream tasks, the idea of adopting fine-grained information during the pre-training stage is not well explored. In this work, we propose STOA-VLP, a pre-training framework that jointly models object and action information across spatial and temporal dimensions. More specifically, the model regards object trajectories across frames and multiple action features from the video as fine-grained features. Besides, we design two auxiliary tasks to better incorporate both kinds of information into the pre-training process of the video-language model. The first is the dynamic object-text alignment task, which builds a better connection between object trajectories and the relevant noun tokens. The second is the spatial-temporal action set prediction, which guides the model to generate consistent action features by predicting actions found in the text. Extensive experiments on three downstream tasks (video captioning, text-video retrieval, and video question answering) demonstrate the effectiveness of our proposed STOA-VLP (e.g. 3.7 Rouge-L improvements on MSR-VTT video captioning benchmark, 2.9% accuracy improvements on MSVD video question answering benchmark, compared to previous approaches).

Submitted 23 May, 2023; v1 submitted 19 February, 2023; originally announced February 2023.

Comments: AAAI 2023, 7 pages, 3 figures

arXiv:2302.08662 [pdf, other]

Find Beauty in the Rare: Contrastive Composition Feature Clustering for Nontrivial Cropping Box Regression

Authors: Zhiyu Pan, Yinpeng Chen, Jiale Zhang, Hao Lu, Zhiguo Cao, Weicai Zhong

Abstract: Automatic image cropping algorithms aim to recompose images like human-being photographers by generating the cropping boxes with improved composition quality. Cropping box regression approaches learn the beauty of composition from annotated cropping boxes. However, the bias of annotations leads to quasi-trivial recomposing results, which has an obvious tendency to the average location of training samples. The crux of this predicament is that the task is naively treated as a box regression problem, where rare samples might be dominated by normal samples, and the composition patterns of rare samples are not well exploited. Observing that similar composition patterns tend to be shared by the cropping boundaries annotated nearly, we argue to find the beauty of composition from the rare samples by clustering the samples with similar cropping boundary annotations, i.e., similar composition patterns. We propose a novel Contrastive Composition Clustering (C2C) to regularize the composition features by contrasting dynamically established similar and dissimilar pairs. In this way, common composition patterns of multiple images can be better summarized, which especially benefits the rare samples and endows our model with better generalizability to render nontrivial results. Extensive experimental results show the superiority of our model compared with prior arts. We also illustrate the philosophy of our design with an interesting analytical visualization.

Submitted 16 February, 2023; originally announced February 2023.

Comments: Accepted to AAAI 2023 (Oral); 9 pages, 6 figures

arXiv:2302.03398 [pdf, other]

doi 10.1051/0004-6361/202345904

First Detection of Radio Recombination Lines of Ions Heavier than Helium

Authors: Xunchuan Liu, Tie Liu, Zhiqiang Shen, Paul F. Goldsmith, Neal J. Evans II, Sheng-Li Qin, Qiuyi Luo, Yu Cheng, Sheng-Yuan Liu, Fengyao Zhu, Ken'ichi Tatematsu, Meizhu Liu, Dongting Yang, Chuanshou Li, Li Cen, Juan Li, Xing Lu, Qilao Gu, Rongbing Zhao, Bing Li, Yajun Wu, Weiye Zhong, Zhang Zhao, **qing Wang, Qinghui Liu , et al. (10 additional authors not shown)

Abstract: We report the first detection of radio recombination lines (RRLs) of ions heavier than helium. In a highly sensitive multi-band (12--50 GHz) line survey toward Orion KL with the TianMa 65-m Radio Telescope (TMRT), we successfully detected more than fifteen unblended $α$ lines of RRLs of singly ionized species (XII) recombined from XIII. The Ka-band (35--50 GHz) spectrum also shows tentative signals of $β$ lines of ions. The detected lines can be successfully crossmatched with the rest frequencies of RRLs of CII and/or OII. This finding greatly expands the connotation of ion RRLs, since before this work only two blended lines (105$α$ and 121$α$) of HeII had been reported. Our detected lines can be fitted simultaneously under assumption of local thermodynamic equilibrium (LTE). An abundance of CIII and OIII of 8.8$\times$10$^{-4}$ is obtained, avoiding the complexities of optical/infrared observations and the blending of RRLs of atoms. It is consistent with but approaches the upper bound of the value (10$^{-4}$--10$^{-3}$) estimated from optical/infrared observations. The effects of dielectronic recombination may contribute to enhancing the level populations even at large $n$. We expect future observations using radio interferometers could break the degeneracy between C and O, and help to reveal the ionization structure and dynamical evolution of various ionized regions.

Submitted 7 February, 2023; originally announced February 2023.

Comments: 3 figures, 2 tables, accepted by A&A Letter

Journal ref: A&A 671, L1 (2023)

arXiv:2301.08166 [pdf, ps, other]

doi 10.1007/s11128-022-03807-z

Even- and odd-orthogonality properties of the Wigner D-matrix and their metrological applications

Authors: Wei Zhong, Lan Zhou, Cui-Fang Zhang, Yu-Bo Sheng

Abstract: The Wigner D-matrix is essential in the course of angular momentum techniques. We here derive the new even- and odd-orthogonality properties of the Wigner D-matrix which were yet to be demonstrated in textbooks and also apply them to identifying optimal measurements for linear phase estimation based on two-mode optical interferometry with two specific quantum states.

Submitted 19 January, 2023; originally announced January 2023.

Comments: 16 pages, 2 figures

arXiv:2212.12874 [pdf, ps, other]

Test and Measure for Partial Mean Dependence Based on Machine Learning Methods

Authors: Leheng Cai, Xu Guo, Wei Zhong

Abstract: It is of importance to investigate the significance of a subset of covariates $W$ for the response $Y$ given covariates $Z$ in regression modeling. To this end, we propose a significance test for the partial mean independence problem based on machine learning methods and data splitting. The test statistic converges to the standard chi-squared distribution under the null hypothesis while it converges to a normal distribution under the fixed alternative hypothesis. Power enhancement and algorithm stability are also discussed. If the null hypothesis is rejected, we propose a partial Generalized Measure of Correlation (pGMC) to measure the partial mean dependence of $Y$ given $W$ after controlling for the nonlinear effect of $Z$. We present the appealing theoretical properties of the pGMC and establish the asymptotic normality of its estimator with the optimal root-$N$ convergence rate. Furthermore, the valid confidence interval for the pGMC is also derived. As an important special case when there are no conditional covariates $Z$, we introduce a new test of overall significance of covariates for the response in a model-free setting. Numerical studies and real data analysis are also conducted to compare with existing approaches and to demonstrate the validity and flexibility of our proposed procedures.

Submitted 5 June, 2024; v1 submitted 25 December, 2022; originally announced December 2022.

arXiv:2212.08517 [pdf, other]

doi 10.1093/mnras/stac3735

The hot gas distribution, X-ray luminosity and baryon budget in the L-Galaxies semi-analytic model of galaxy formation

Authors: Wenxin Zhong, Jian Fu, Prateek Sharma, Shiyin Shen, Robert M. Yates

Abstract: Hot ionized gas is important in the baryon cycle of galaxies and contributes the majority of their ``missing baryons''. Until now, most semi-analytic models of galaxy formation have paid little attention to hot gaseous haloes and their X-ray emission. In this paper, we adopt the one-dimensional model from Sharma et al. instead of the isothermal sphere to describe the radial distribution of hot gas in the L-Galaxies semi-analytic model. The hot gas halo can be divided into two parts according to the ratio of the local thermal instability time-scale and the free-fall time-scale: a cool core with $t_{\rm TI}/t_{\rm ff}=10$ and a stable outer halo with $t_{\rm TI}/t_{\rm ff}>10$. We update the prescriptions of cooling, feedback and stripping based on the new hot gas profiles, and then reproduce several X-ray observational results, like the radial profiles of hot gas density, and the scaling relations of X-ray luminosity and temperature. We find: (1) Consistent with observations, flatter density profiles in halo centers produce lower X-ray emission than an isothermal sphere; (2) Cool core regions prone to precipitation have higher gas temperature than the virial temperature, and a larger $T_{\rm X}/T_{\rm 200}$ ratio in smaller haloes leads to a steeper slope in the $L_{\rm X}-T_{\rm X}$ relation; (3) The ionized gas in the unbounded reservoir and low temperature intergalactic gas in low mass haloes could be the main components of the halo ``missing baryons''. Our model outputs can predict the observations of hot gas in the nearby universe and produce mock surveys of baryons probed by future X-ray telescopes.

Submitted 16 December, 2022; originally announced December 2022.

Comments: 18 pages, 15 figures, accepted for publication in MNRAS

Journal ref: MNRAS.519.4344Z (2023)

arXiv:2212.08502 [pdf, other]

doi 10.1088/1674-1137/ace9c6

JUNO Sensitivity on Proton Decay $p\to \bar{\nu} K^+$ Searches

Authors: JUNO Collaboration, Angel Abusleme, Thomas Adam, Shakeel Ahmad, Rizwan Ahmed, Sebastiano Aiello, Muhammad Akram, Fengpeng An, Qi An, Giuseppe Andronico, Nikolay Anfimov, Vito Antonelli, Tatiana Antoshkina, Burin Asavapibhop, João Pedro Athayde Marcondes de André, Didier Auguste, Nikita Balashov, Wander Baldini, Andrea Barresi, Davide Basilico, Eric Baussan, Marco Bellato, Antonio Bergnoli, Thilo Birkenfeld, Sylvie Blin , et al. (586 additional authors not shown)

Abstract: The Jiangmen Underground Neutrino Observatory (JUNO) is a large liquid scintillator detector designed to explore many topics in fundamental physics. In this paper, the potential on searching for proton decay in $p\to \bar{\nu} K^+$ mode with JUNO is investigated. The kaon and its decay particles feature a clear three-fold coincidence signature that results in a high efficiency for identification. Moreover, the excellent energy resolution of JUNO permits to suppress the sizable background caused by other delayed signals. Based on these advantages, the detection efficiency for the proton decay via $p\to \bar{\nu} K^+$ is 36.9% with a background level of 0.2 events after 10 years of data taking. The estimated sensitivity based on 200 kton-years exposure is $9.6 \times 10^{33}$ years, competitive with the current best limits on the proton lifetime in this channel.

Submitted 26 October, 2023; v1 submitted 16 December, 2022; originally announced December 2022.

Comments: 14 pages, 12 figures, an author added

arXiv:2212.08307 [pdf, other]

Controllable Text Generation via Probability Density Estimation in the Latent Space

Authors: Yuxuan Gu, Xiaocheng Feng, Sicheng Ma, Lingyuan Zhang, Heng Gong, Weihong Zhong, Bing Qin

Abstract: Previous work on controllable text generation has explored the idea of control from the latent space, such as optimizing a representation with attribute-related classifiers or sampling a representation from relevant discrete samples. However, they are not effective enough in modeling both the latent space and the control, leaving controlled text with low quality and diversity. In this work, we propose a novel control framework using probability density estimation in the latent space. Our method utilizes an invertible transformation function, the Normalizing Flow, that maps the complex distributions in the latent space to simple Gaussian distributions in the prior space. Thus, we can perform sophisticated and flexible control in the prior space and feed the control effects back into the latent space owing to the one-one-mapping property of invertible transformations. Experiments on single-attribute controls and multi-attribute control reveal that our method outperforms several strong baselines on attribute relevance and text quality and achieves the SOTA. Further analysis of control strength adjustment demonstrates the flexibility of our control strategy.

Submitted 24 May, 2023; v1 submitted 16 December, 2022; originally announced December 2022.

Comments: 25 pages, 9 figures, Accepted to ACL2023

arXiv:2212.07040 [pdf, other]

doi 10.3390/galaxies10060113

Overview of the Observing System and Initial Scientific Accomplishments of the East Asian VLBI Network (EAVN)

Authors: Kazunori Akiyama, Juan-Carlos Algaba, Tao An, Keiichi Asada, Kitiyanee Asanok, Do-Young Byun, Thanapol Chanapote, Wen Chen, Zhong Chen, Xiaopeng Cheng, James O. Chibueze, Ilje Cho, Se-Hyung Cho, Hyun-Soo Chung, Lang Cui, Yuzhu Cui, Akihiro Doi, Jian Dong, Kenta Fujisawa, Wei Gou, Wen Guo, Kazuhiro Hada, Yoshiaki Hagiwara, Tomoya Hirota, Jeffrey A. Hodgson , et al. (79 additional authors not shown)

Abstract: The East Asian VLBI Network (EAVN) is an international VLBI facility in East Asia and is operated under mutual collaboration between East Asian countries, as well as part of Southeast Asian and European countries. EAVN currently consists of 16 radio telescopes and three correlators located in China, Japan, and Korea, and is operated mainly at three frequency bands, 6.7, 22, and 43 GHz with the longest baseline length of 5078 km, resulting in the highest angular resolution of 0.28 milliarcseconds at 43 GHz. One of distinct capabilities of EAVN is multi-frequency simultaneous data reception at nine telescopes, which enable us to employ the frequency phase transfer technique to obtain better sensitivity at higher observing frequencies. EAVN started its open-use program in the second half of 2018, providing a total observing time of more than 1100 hours in a year. EAVN fills geographical gap in global VLBI array, resulting in enabling us to conduct contiguous high-resolution VLBI observations. EAVN has produced various scientific accomplishments especially in observations toward active galactic nuclei, evolved stars, and star-forming regions. These activities motivate us to initiate launch of the 'Global VLBI Alliance' to provide an opportunity of VLBI observation with the longest baselines on the earth.

Submitted 14 December, 2022; originally announced December 2022.

Comments: 27 pages, appeared in Galaxies special issue 'Challenges in Understanding Black Hole Powered Jets with VLBI' as an invited review

Journal ref: Galaxies 2022, 10(6), 113

arXiv:2211.12893 [pdf, other]

doi 10.1145/3511808.3557674

Prototypical Contrastive Learning and Adaptive Interest Selection for Candidate Generation in Recommendations

Authors: Ningning Li, Qunwei Li, Xichen Ding, Shaohu Chen, Wenliang Zhong

Abstract: Deep Candidate Generation plays an important role in large-scale recommender systems. It takes user history behaviors as inputs and learns user and item latent embeddings for candidate generation. In the literature, conventional methods suffer from two problems. First, a user has multiple embeddings to reflect various interests, and such number is fixed. However, taking into account different levels of user activeness, a fixed number of interest embeddings is sub-optimal. For example, for less active users, they may need fewer embeddings to represent their interests compared to active users. Second, the negative samples are often generated by strategies with unobserved supervision, and similar items could have different labels. Such a problem is termed as class collision. In this paper, we aim to advance the typical two-tower DNN candidate generation model. Specifically, an Adaptive Interest Selection Layer is designed to learn the number of user embeddings adaptively in an end-to-end way, according to the level of their activeness. Furthermore, we propose a Prototypical Contrastive Learning Module to tackle the class collision problem introduced by negative sampling. Extensive experimental evaluations show that the proposed scheme remarkably outperforms competitive baselines on multiple benchmarks.

Submitted 23 November, 2022; originally announced November 2022.

arXiv:2211.08776 [pdf, other]

An Efficient COarse-to-fiNE Alignment Framework @ Ego4D Natural Language Queries Challenge 2022

Authors: Zhijian Hou, Wanjun Zhong, Lei Ji, Difei Gao, Kun Yan, Wing-Kwong Chan, Chong-Wah Ngo, Zheng Shou, Nan Duan

Abstract: This technical report describes the CONE approach for Ego4D Natural Language Queries (NLQ) Challenge in ECCV 2022. We leverage our model CONE, an efficient window-centric COarse-to-fiNE alignment framework. Specifically, CONE dynamically slices the long video into candidate windows via a sliding window approach. Centering at windows, CONE (1) learns the inter-window (coarse-grained) semantic variance through contrastive learning and speeds up inference by pre-filtering the candidate windows relevant to the NL query, and (2) conducts intra-window (fine-grained) candidate moments ranking utilizing the powerful multi-modal alignment ability of the contrastive vision-text pre-trained model EgoVLP. On the blind test set, CONE achieves 15.26 and 9.24 for R1@IoU=0.3 and R1@IoU=0.5, respectively.

Submitted 16 November, 2022; originally announced November 2022.

Comments: Technical report for ECCV 2022 Ego4D workshop, 4 pages, 2 figures, 2 tables. arXiv admin note: substantial text overlap with arXiv:2209.10918

arXiv:2210.11265 [pdf, other]

Disentangling Reasoning Capabilities from Language Models with Compositional Reasoning Transformers

Authors: Wanjun Zhong, Tingting Ma, Jiahai Wang, Jian Yin, Tiejun Zhao, Chin-Yew Lin, Nan Duan

Abstract: This paper presents ReasonFormer, a unified reasoning framework for mirroring the modular and compositional reasoning process of humans in complex decision-making. Inspired by dual-process theory in cognitive science, the representation module (automatic thinking) and reasoning modules (controlled thinking) are decoupled to capture different levels of cognition. Upon the top of the representation module, the pre-trained reasoning modules are modular and professional in specific and fundamental reasoning skills (e.g., logic, simple QA, etc). To mimic the controlled compositional thinking process, different reasoning modules are dynamically activated and composed in both parallel and cascaded manners to control what reasoning skills are activated and how deep the reasoning process will be reached to solve the current problems. The unified reasoning framework solves multiple tasks with a single model, and is trained and inferred in an end-to-end manner. Evaluated on 11 datasets requiring different reasoning skills and complexity, ReasonFormer demonstrates substantial performance boosts, revealing the compositional reasoning ability. Few-shot experiments exhibit better generalization ability by learning to compose pre-trained skills for new tasks with limited data, and decoupling the representation module and the reasoning modules. Further analysis shows the modularity of reasoning modules as different tasks activate distinct reasoning skills at different reasoning depths.

Submitted 7 December, 2022; v1 submitted 20 October, 2022; originally announced October 2022.

Comments: 12 pages

arXiv:2210.09339 [pdf, other]

Probability Weighted Clustered Coefficients Regression Models in Complex Survey Sampling

Authors: Mingjun Gang, Xin Wang, Zhonglei Wang, Wei Zhong

Abstract: Regression analysis is commonly conducted in survey sampling. However, existing methods fail when the relationships vary across different areas or domains. In this paper, we propose a unified framework to study the group-wise covariate effect under complex survey sampling based on pairwise penalties, and the associated objective function is solved by the alternating direction method of multipliers. Theoretical properties of the proposed method are investigated under some generality conditions. Numerical experiments demonstrate the superiority of the proposed method in terms of identifying groups and estimation efficiency for both linear regression models and logistic regression models.

Submitted 27 November, 2023; v1 submitted 17 October, 2022; originally announced October 2022.

Comments: 35 pages, 2 figures

arXiv:2210.09139 [pdf, other]

Burstiness and information spreading in the active particles systems

Authors: Wei Zhong, Youjin Deng, Daxing Xiong

Abstract: We construct the temporal network using the two-dimensional active particle systems which are described by the Vicsek model. The bursts of the interevent times for a specific pair of particles are investigated numerically. We find that for different noise strength, the distribution of the interevent times of a target edge follows by a heavy-tail, revealing a strong burstiness of the signals. To further characterize the nature of the burstiness, the burstiness parameter and the memory coefficient are calculated. The results show that near the critical points of the Vicsek model, the burstiness parameters reach the minimum values for each density, indicating a relation between the phase transition of the Vicsek model and the bursty nature of the signals. Besides, the memory plays a negligible role in the burstiness. Further, we investigate the spreading dynamics on our temporal network with the susceptible-infected model, and observe a positive correlation between the burstiness and the information spreading dynamics.

Submitted 17 October, 2022; originally announced October 2022.

Comments: 5 pages, 3 figures

arXiv:2210.08437 [pdf, other]

doi 10.3847/1538-4357/ad2bfd

Model Independent Approach of the JUNO $^8$B Solar Neutrino Program

Authors: JUNO Collaboration, Jie Zhao, Baobiao Yue, Haoqi Lu, Yufeng Li, Jiajie Ling, Zeyuan Yu, Angel Abusleme, Thomas Adam, Shakeel Ahmad, Rizwan Ahmed, Sebastiano Aiello, Muhammad Akram, Abid Aleem, Tsagkarakis Alexandros, Fengpeng An, Qi An, Giuseppe Andronico, Nikolay Anfimov, Vito Antonelli, Tatiana Antoshkina, Burin Asavapibhop, João Pedro Athayde Marcondes de André, Didier Auguste, Weidong Bai , et al. (579 additional authors not shown)

Abstract: The physics potential of detecting $^8$B solar neutrinos will be exploited at the Jiangmen Underground Neutrino Observatory (JUNO), in a model independent manner by using three distinct channels of the charged-current (CC), neutral-current (NC) and elastic scattering (ES) interactions. Due to the largest-ever mass of $^{13}$C nuclei in the liquid-scintillator detectors and the expected low background level, $^8$B solar neutrinos would be observable in the CC and NC interactions on $^{13}$C for the first time. By virtue of optimized event selections and muon veto strategies, backgrounds from the accidental coincidence, muon-induced isotopes, and external backgrounds can be greatly suppressed. Excellent signal-to-background ratios can be achieved in the CC, NC and ES channels to guarantee the $^8$B solar neutrino observation. From the sensitivity studies performed in this work, we show that JUNO, with ten years of data, can reach the 1$\sigma$ precision levels of 5%, 8% and 20% for the $^8$B neutrino flux, $\sin^2\theta_{12}$, and $\Delta m^2_{21}$, respectively. It would be unique and helpful to probe the details of both solar physics and neutrino physics. In addition, when combined with SNO, the world-best precision of 3% is expected for the $^8$B neutrino flux measurement.

Submitted 6 March, 2024; v1 submitted 15 October, 2022; originally announced October 2022.

Comments: 19 pages, 7 figures, accepted version to appear in The Astrophysical Journal. Yufeng Li and Jiajie Ling are corresponding authors

Journal ref: Astrophysical Journal 965 (2024) 122

arXiv:2210.05197 [pdf, other]

Mixed-modality Representation Learning and Pre-training for Joint Table-and-Text Retrieval in OpenQA

Authors: Junjie Huang, Wanjun Zhong, Qian Liu, Ming Gong, Daxin Jiang, Nan Duan

Abstract: Retrieving evidences from tabular and textual resources is essential for open-domain question answering (OpenQA), which provides more comprehensive information. However, training an effective dense table-text retriever is difficult due to the challenges of table-text discrepancy and data sparsity problem. To address the above challenges, we introduce an optimized OpenQA Table-Text Retriever (OTTeR) to jointly retrieve tabular and textual evidences. Firstly, we propose to enhance mixed-modality representation learning via two mechanisms: modality-enhanced representation and mixed-modality negative sampling strategy. Secondly, to alleviate data sparsity problem and enhance the general retrieval ability, we conduct retrieval-centric mixed-modality synthetic pre-training. Experimental results demonstrate that OTTeR substantially improves the performance of table-and-text retrieval on the OTT-QA dataset. Comprehensive analyses examine the effectiveness of all the proposed mechanisms. Besides, equipped with OTTeR, our OpenQA system achieves the state-of-the-art result on the downstream QA task, with 10.1% absolute improvement in terms of the exact match over the previous best system. All the code and data are available at https://github.com/Jun-jie-Huang/OTTeR.

Submitted 11 October, 2022; originally announced October 2022.

Comments: Accepted to Findings of EMNLP 2022

arXiv:2210.03356 [pdf]

Two Iterative algorithms for the matrix sign function based on the adaptive filtering technology

Authors: Feng Wu, Keqi Ye, Li Zhu, Yueling Zhao, Jiqiang Hu, Wanxie Zhong

Abstract: In this paper, two new efficient algorithms for calculating the sign function of the large-scale sparse matrix are proposed by combining filtering algorithm with Newton method and Newton Schultz method respectively. Through the theoretical analysis of the error diffusion in the iterative process, we designed an adaptive filtering threshold, which can ensure that the filtering has little impact on the iterative process and the calculation result. Numerical experiments are consistent with our theoretical analysis, which shows that the computational efficiency of our method is much better than that of Newton method and Newton Schultz method, and the computational error is of the same order of magnitude as that of the two methods.

Submitted 7 October, 2022; originally announced October 2022.

Comments: 18 pages, 12 figures

MSC Class: 65F30; 15A15

arXiv:2209.13123 [pdf, other]

Explainable Graph Pyramid Autoformer for Long-Term Traffic Forecasting

Authors: Weiheng Zhong, Tanwi Mallick, Hadi Meidani, Jane Macfarlane, Prasanna Balaprakash

Abstract: Accurate traffic forecasting is vital to an intelligent transportation system. Although many deep learning models have achieved state-of-art performance for short-term traffic forecasting of up to 1 hour, long-term traffic forecasting that spans multiple hours remains a major challenge. Moreover, most of the existing deep learning traffic forecasting models are black box, presenting additional challenges related to explainability and interpretability. We develop Graph Pyramid Autoformer (X-GPA), an explainable attention-based spatial-temporal graph neural network that uses a novel pyramid autocorrelation attention mechanism. It enables learning from long temporal sequences on graphs and improves long-term traffic forecasting accuracy. Our model can achieve up to 35% better long-term traffic forecast accuracy than that of several state-of-the-art methods. The attention-based scores from the X-GPA model provide spatial and temporal explanations based on the traffic dynamics, which change for normal vs. peak-hour traffic and weekday vs. weekend traffic.

Submitted 26 September, 2022; originally announced September 2022.

arXiv:2209.10918 [pdf, other]

CONE: An Efficient COarse-to-fiNE Alignment Framework for Long Video Temporal Grounding

Authors: Zhijian Hou, Wanjun Zhong, Lei Ji, Difei Gao, Kun Yan, Wing-Kwong Chan, Chong-Wah Ngo, Zheng Shou, Nan Duan

Abstract: This paper tackles an emerging and challenging problem of long video temporal grounding (VTG) that localizes video moments related to a natural language (NL) query. Compared with short videos, long videos are also highly demanded but less explored, which brings new challenges in higher inference computation cost and weaker multi-modal alignment. To address these challenges, we propose CONE, an efficient COarse-to-fiNE alignment framework. CONE is a plug-and-play framework on top of existing VTG models to handle long videos through a sliding window mechanism. Specifically, CONE (1) introduces a query-guided window selection strategy to speed up inference, and (2) proposes a coarse-to-fine mechanism via a novel incorporation of contrastive learning to enhance multi-modal alignment for long videos. Extensive experiments on two large-scale long VTG benchmarks consistently show both substantial performance gains (e.g., from 3.13% to 6.87% on MAD) and state-of-the-art results. Analyses also reveal higher efficiency as the query-guided window selection mechanism accelerates inference time by 2x on Ego4D-NLQ and 15x on MAD while keeping SOTA results. Codes have been released at https://github.com/houzhijian/CONE.

Submitted 29 May, 2023; v1 submitted 22 September, 2022; originally announced September 2022.

Comments: ACL 2023 Camera Ready. 14 pages, 7 figures, 4 tables

arXiv:2209.09438 [pdf]

Updating velocities in heterogeneous comprehensive learning particle swarm optimization with low-discrepancy sequences

Authors: Yuelin Zhao, Feng Wu, Jianhua Pang, Wanxie Zhong

Abstract: Heterogeneous comprehensive learning particle swarm optimization (HCLPSO) is a type of evolutionary algorithm with enhanced exploration and exploitation capabilities. The low-discrepancy sequence (LDS) is more uniform in covering the search space than random sequences. In this paper, making use of the good uniformity of LDS to improve HCLPSO is researched. Numerical experiments are performed to show that it is impossible to effectively improve the search ability of HCLPSO by only using LDS to generate the initial population. However, if we properly choose some random sequences from the HCLPSO velocities updating formula and replace them with the deterministic LDS, we can obtain a more efficient algorithm. Compared with the original HCLPSO under the same accuracy requirement, the HCLPSO updating the velocities with the deterministic LDS can significantly reduce the iterations required for finding the optimal solution, without decreasing the success rate.

Submitted 19 September, 2022; originally announced September 2022.

Comments: 29 pages, 5 figures

arXiv:2209.08800 [pdf, ps, other]

A Realistic 3D Non-Stationary Channel Model for UAV-to-Vehicle Communications Incorporating Fuselage Posture

Authors: Boyu Hua, Tongtong Zhou, Qiuming Zhu, Kai Mao, Junwei Bao, Weizhi Zhong, Naeem Ahmed

Abstract: Considering the unmanned aerial vehicle (UAV) three-dimensional (3D) posture, a novel 3D non-stationary geometry-based stochastic model (GBSM) is proposed for multiple-input multiple-output (MIMO) UAV-to-vehicle (U2V) channels. It consists of a line-of-sight (LoS) and non-line-of-sight (NLoS) components. The factor of fuselage posture is considered by introducing a time-variant 3D posture matrix. Some important statistical properties, i.e. the temporal autocorrelation function (ACF) and spatial cross correlation function (CCF), are derived and investigated. Simulation results show that the fuselage posture has significant impact on the U2V channel characteristic and aggravate the non-stationarity. The agreements between analytical, simulated, and measured results verify the correctness of proposed model and derivations. Moreover, it is demonstrated that the proposed model is also compatible to the existing GBSM without considering fuselage posture.

Submitted 19 September, 2022; originally announced September 2022.

Comments: 12 pages, 8 figures, CNCOM

arXiv:2209.05960 [pdf, other]

doi 10.1103/PhysRevA.107.022221

Classical-driving-assisted quantum synchronization in non-Markovian environments

Authors: Xing Xiao, Tian-Xiang Lu, Wo-Jun Zhong, Yan-Ling Li

Abstract: We study the quantum phase synchronization of a driven two-level system (TLS) coupled to a structured environment and demonstrate that quantum synchronization can be enhanced by the classical driving field. We use the Husimi $Q$-function to characterize the phase preference and find the in-phase and anti-phase locking phenomenon in the phase diagram. Remarkably, we show that the in-phase classical driving enables a TLS to reach stable anti-phase locking in the Markovian regime. However, we find that the synergistic action of classical driving and non-Markovian effects significantly enhances the initial in-phase locking. By introducing the $S$-function and its maximal value to quantify the strength of synchronization and sketch the synchronization regions, we observe the typical signatures of the hollowed Arnold tongue in the parameter regions of synchronization. In the hollowed Arnold tongue, the synchronization regions exist both inside and outside the tongue while unsynchronized regions only lie on the boundary line. We also provide an intuitive interpretation of the above results by using the quasimode theory.

Submitted 7 February, 2023; v1 submitted 13 September, 2022; originally announced September 2022.

Comments: minor version, accepted by Physical Review A

arXiv:2209.03067 [pdf, other]

doi 10.3847/1538-4365/ac9127

A Q-band line survey towards Orion KL using the Tianma radio telescope

Authors: Xunchuan Liu, Tie Liu, Zhiqiang Shen, Sheng-Li Qin, Qiuyi Luo, Yu Cheng, Qilao Gu, Tianwei Zhang, Fengyao Zhu, Sheng-Yuan Liu, Xing Lu, Rongbing Zhao, Weiye Zhong, Yajun Wu, Juan Li, Zhang Zhao, **qing Wang, Qinghui Liu, Bo Xia, Bin Li, Li Fu, Zhen Yan, Chao Zhang, Lingling Wang, Qian Ye, et al. (7 additional authors not shown)

Abstract: We have conducted a line survey towards Orion KL using the Q-band receiver of Tianma 65 m radio telescope (TMRT), covering 34.8--50 GHz with a velocity resolution between 0.79 km s$^{-1}$ and 0.55 km s$^{-1}$ respectively. The observations reach a sensitivity on the level of 1-8 mK, proving that the TMRT is sensitive for conducting deep line surveys. In total, 597 Gaussian features are extracted. Among them, 177 radio recombination lines (RRLs) are identified, including 126, 40 and 11 RRLs of hydrogen, helium and carbon, with a maximum $Δn$ of 16, 7, and 3, respectively. The carbon RRLs are confirmed to originate from photodissociation regions with a $V_{\rm LSR}\sim$9 km s$^{-1}$. In addition, 371 molecular transitions of 53 molecular species are identified. Twenty-one molecular species of this survey were not firmly detected in the Q band by Rizzo et al. (2017), including species such as H$_2$CS, HCOOH, C$_2$H$_5$OH, H$_2^{13}$CO, H$_2$CCO, CH$_3$CHO, CH$_2$OCH$_2$, HCN $v_2=1$, and CH$_3$OCHO $v_t=1$. In particular, the vibrationally excited states of ethyl cyanide (C$_2$H$_5$CN $v$13/$v$21) are for the first time firmly detected in the Q band. NH$_3$ (15,15) and (16,16) are identified, and they are so far the highest transitions of the NH$_3$ inversion lines detected towards Orion KL. All the identified lines can be reproduced by a radiative transfer model.

Submitted 7 September, 2022; originally announced September 2022.

Comments: 51 pages, 18 figures, accepted by ApJS

arXiv:2208.06770 [pdf, ps, other]

Joint User Association and Resource Pricing for Metaverse: Distributed and Centralized Approaches

Authors: Xumin Huang, Weifeng Zhong, Jiangtian Nie, Qin Hu, Zehui Xiong, Jiawen Kang, Tony Q. S. Quek

Abstract: Metaverse as the next-generation Internet provides users with physical-virtual world interactions. To improve the quality of immersive experience, users access to Metaverse service providers (MSPs) and purchase bandwidth resource to reduce the communication latency of the Metaverse services. The MSPs decide selling price of the bandwidth resource to maximize the revenue. This leads to a joint user association and resource pricing problem between all users and MSPs. To tackle the problem, we formulate a Stackelberg game where the MSPs are game leaders and users are game followers. We resolve the Stackelberg equilibrium via the distributed and centralized approaches, according to different privacy requirements. In the distributed approach, the MSPs compete against each other to maximize the individual revenue, and a user selects an MSP in a probabilistic manner. The Stackelberg equilibrium is achieved in a privacy-friendly way. In the centralized approach, all MSPs and users accept the unified management and their strategies are instructed. The centralized approach acquires the superior decision-making performance but sacrifices the privacy of the game players. Finally, we provide numerical results to demonstrate the effectiveness and efficiency of our schemes.

Submitted 13 August, 2022; originally announced August 2022.

arXiv:2208.03229 [pdf, other]

Improving Task Generalization via Unified Schema Prompt

Authors: Wanjun Zhong, Yifan Gao, Ning Ding, Zhiyuan Liu, Ming Zhou, Jiahai Wang, Jian Yin, Nan Duan

Abstract: Task generalization has been a long standing challenge in Natural Language Processing (NLP). Recent research attempts to improve the task generalization ability of pre-trained language models by mapping NLP tasks into human-readable prompted forms. However, these approaches require laborious and inflexible manual collection of prompts, and different prompts on the same downstream task may receive unstable performance. We propose Unified Schema Prompt, a flexible and extensible prompting method, which automatically customizes the learnable prompts for each task according to the task input schema. It models the shared knowledge between tasks, while keeping the characteristics of different task schema, and thus enhances task generalization ability. The schema prompt takes the explicit data structure of each task to formulate prompts so that little human effort is involved. To test the task generalization ability of schema prompt at scale, we conduct schema prompt-based multitask pre-training on a wide variety of general NLP tasks. The framework achieves strong zero-shot and few-shot generalization performance on 16 unseen downstream tasks from 8 task types (e.g., QA, NLI, etc). Furthermore, comprehensive analyses demonstrate the effectiveness of each component in the schema prompt, its flexibility in task compositionality, and its ability to improve performance under a full-data fine-tuning setting.

Submitted 5 August, 2022; originally announced August 2022.

arXiv:2208.00945 [pdf, other]

doi 10.1145/3503161.3548088

DoF-NeRF: Depth-of-Field Meets Neural Radiance Fields

Authors: Zi** Wu, Xingyi Li, Juewen Peng, Hao Lu, Zhiguo Cao, Weicai Zhong

Abstract: Neural Radiance Field (NeRF) and its variants have exhibited great success on representing 3D scenes and synthesizing photo-realistic novel views. However, they are generally based on the pinhole camera model and assume all-in-focus inputs. This limits their applicability as images captured from the real world often have finite depth-of-field (DoF). To mitigate this issue, we introduce DoF-NeRF, a novel neural rendering approach that can deal with shallow DoF inputs and can simulate DoF effect. In particular, it extends NeRF to simulate the aperture of lens following the principles of geometric optics. Such a physical guarantee allows DoF-NeRF to operate views with different focus configurations. Benefiting from explicit aperture modeling, DoF-NeRF also enables direct manipulation of DoF effect by adjusting virtual aperture and focus parameters. It is plug-and-play and can be inserted into NeRF-based frameworks. Experiments on synthetic and real-world datasets show that, DoF-NeRF not only performs comparably with NeRF in the all-in-focus setting, but also can synthesize all-in-focus novel views conditioned on shallow DoF inputs. An interesting application of DoF-NeRF to DoF rendering is also demonstrated. The source code will be made available at https://github.com/zi**wuzi**/DoF-NeRF.

Submitted 1 August, 2022; originally announced August 2022.

Comments: Accepted by ACMMM 2022

arXiv:2208.00439 [pdf, other]

Design What You Desire: Icon Generation from Orthogonal Application and Theme Labels

Authors: Yinpeng Chen, Zhiyu Pan, Min Shi, Hao Lu, Zhiguo Cao, Weicai Zhong

Abstract: Generative adversarial networks (GANs) have been trained to be professional artists able to create stunning artworks such as face generation and image style transfer. In this paper, we focus on a realistic business scenario: automated generation of customizable icons given desired mobile applications and theme styles. We first introduce a theme-application icon dataset, termed AppIcon, where each icon has two orthogonal theme and app labels. By investigating a strong baseline StyleGAN2, we observe mode collapse caused by the entanglement of the orthogonal labels. To solve this challenge, we propose IconGAN composed of a conditional generator and dual discriminators with orthogonal augmentations, and a contrastive feature disentanglement strategy is further designed to regularize the feature space of the two discriminators. Compared with other approaches, IconGAN indicates a superior advantage on the AppIcon benchmark. Further analysis also justifies the effectiveness of disentangling app and theme representations. Our project will be released at: https://github.com/architect-road/IconGAN.

Submitted 31 July, 2022; originally announced August 2022.

Comments: 10 pages, 12 figures

arXiv:2207.08069 [pdf, ps, other]

doi 10.1088/1572-9494/ac6e36

Unusual slow energy relaxation induced by mobile discrete breathers in one-dimensional lattices with next-nearest-neighbor coupling

Authors: Bin Xu, Jun Zhang, Wei Zhong, Chi Xiong, Daxing Xiong

Abstract: We study the energy relaxation process in one-dimensional (1D) lattices with next-nearest-neighbor (NNN) couplings. This relaxation is produced by adding damping (absorbing conditions) to the boundary (free-end) of the lattice. Compared to the 1D lattices with on-site potentials, the properties of discrete breathers (DBs) that are spatially localized intrinsic modes are quite unusual with the NNN couplings included, i.e., these DBs are mobile, and thus they can interact with both the phonons and the boundaries of the lattice. For the interparticle interactions of harmonic and Fermi-Pasta-Ulam-Tsingou-$β$ (FPUT-$β$) types, we find two crossovers of relaxation in general, i.e., a first crossover from the stretched-exponential to the regular exponential relaxation occurring in a short timescale, and a further crossover from the exponential to the power-law relaxation taking place in a long timescale. The first and second relaxations are universal, but the final power-law relaxation is strongly influenced by the properties of DBs, e.g. the scattering processes of DBs with phonons and boundaries in the FPUT-$β$ type systems make the power-law decay relatively faster than that in the counterparts of the harmonic type systems under the same coupling. Our results present new information and insights for understanding the slow energy relaxation in cooling the lattices.

Submitted 16 July, 2022; originally announced July 2022.

Comments: 13 pages, 5 figures

Journal ref: Communications in Theoretical Physics, Volume 74, Number 6, 2022

arXiv:2207.07372 [pdf, other]

3D Instances as 1D Kernels

Authors: Yizheng Wu, Min Shi, Shuaiyuan Du, Hao Lu, Zhiguo Cao, Weicai Zhong

Abstract: We introduce a 3D instance representation, termed instance kernels, where instances are represented by one-dimensional vectors that encode the semantic, positional, and shape information of 3D instances. We show that instance kernels enable easy mask inference by simply scanning kernels over the entire scenes, avoiding the heavy reliance on proposals or heuristic clustering algorithms in standard 3D instance segmentation pipelines. The idea of instance kernel is inspired by recent success of dynamic convolutions in 2D/3D instance segmentation. However, we find it non-trivial to represent 3D instances due to the disordered and unstructured nature of point cloud data, e.g., poor instance localization can significantly degrade instance representation. To remedy this, we construct a novel 3D instance encoding paradigm. First, potential instance centroids are localized as candidates. Then, a candidate merging scheme is devised to simultaneously aggregate duplicated candidates and collect context around the merged centroids to form the instance kernels. Once instance kernels are available, instance masks can be reconstructed via dynamic convolutions whose weights are conditioned on instance kernels. The whole pipeline is instantiated with a dynamic kernel network (DKNet). Results show that DKNet outperforms the state of the arts on both ScanNetV2 and S3DIS datasets with better instance localization. Code is available: https://github.com/W1zheng/DKNet.

Submitted 18 July, 2022; v1 submitted 15 July, 2022; originally announced July 2022.

Comments: Appearing in ECCV, 2022

arXiv:2207.02346 [pdf, other]

Many-body localized hidden generative models

Authors: Weishun Zhong, Xun Gao, Susanne F. Yelin, Khadijeh Najafi

Abstract: Born machines are quantum-inspired generative models that leverage the probabilistic nature of quantum states. Here, we present a new architecture called many-body localized (MBL) hidden Born machine that utilizes both MBL dynamics and hidden units as learning resources. We show that the hidden units act as an effective thermal bath that enhances the trainability of the system, while the MBL dynamics stabilize the training trajectories. We numerically demonstrate that the MBL hidden Born machine is capable of learning a variety of tasks, including a toy version of MNIST handwritten digits, quantum data obtained from quantum many-body states, and non-local parity data. Our architecture and algorithm provide novel strategies of utilizing quantum many-body systems as learning resources, and reveal a powerful connection between disorder, interaction, and learning in quantum many-body systems.

Submitted 28 December, 2023; v1 submitted 5 July, 2022; originally announced July 2022.

Comments: 13 pages, 11 figures; added references

arXiv:2207.00685 [pdf, other]

Engagement Maximization

Authors: Benjamin Hébert, Weijie Zhong

Abstract: We consider the problem of a Bayesian agent receiving signals over time and then taking an action. The agent chooses when to stop and take an action based on her current beliefs, and prefers (all else equal) to act sooner rather than later. The signals received by the agent are determined by a principal, whose objective is to maximize engagement (the total attention paid by the agent to the signals). We show that engagement maximization by the principal minimizes the agent's welfare; the agent does no better than if she gathered no information. Relative to a benchmark in which the agent chooses the signals, engagement maximization induces excessive information acquisition and extreme beliefs. An optimal strategy for the principal involves "suspensive signals" that lead the agent's belief to become "less certain than the prior" and "decisive signals" that lead the agent's belief to jump to the stopping region.

Submitted 18 October, 2022; v1 submitted 1 July, 2022; originally announced July 2022.

arXiv:2206.10346 [pdf, other]

A new stable and avoiding inversion iteration for computing matrix square root

Authors: Li Zhu, Keqi Ye, Yuelin Zhao, Feng Wu, Jiqiang Hu, Wanxie Zhong

Abstract: The objective of this research was to compute the principal matrix square root with sparse approximation. A new stable iterative scheme avoiding fully matrix inversion (SIAI) is provided. The analysis on the sparsity and error of the matrices involved during the iterative process is given. Based on the bandwidth and error analysis, a more efficient algorithm combining the SIAI with the filtering technique is proposed. The high computational efficiency and accuracy of the proposed method are demonstrated by computing the principal square roots of different matrices to reveal its applicability over the existing methods.

Submitted 21 June, 2022; originally announced June 2022.

Comments: 19 pages, 3 figures

arXiv:2206.08933 [pdf, other]

A theory of learning with constrained weight-distribution

Authors: Weishun Zhong, Ben Sorscher, Daniel D Lee, Haim Sompolinsky

Abstract: A central question in computational neuroscience is how structure determines function in neural networks. The emerging high-quality large-scale connectomic datasets raise the question of what general functional principles can be gleaned from structural information such as the distribution of excitatory/inhibitory synapse types and the distribution of synaptic weights. Motivated by this question, we developed a statistical mechanical theory of learning in neural networks that incorporates structural information as constraints. We derived an analytical solution for the memory capacity of the perceptron, a basic feedforward model of supervised learning, with constraint on the distribution of its weights. Our theory predicts that the reduction in capacity due to the constrained weight-distribution is related to the Wasserstein distance between the imposed distribution and that of the standard normal distribution. To test the theoretical predictions, we use optimal transport theory and information geometry to develop an SGD-based algorithm to find weights that simultaneously learn the input-output task and satisfy the distribution constraint. We show that training in our algorithm can be interpreted as geodesic flows in the Wasserstein space of probability distributions. We further developed a statistical mechanical theory for teacher-student perceptron rule learning and ask for the best way for the student to incorporate prior knowledge of the rule. Our theory shows that it is beneficial for the learner to adopt different prior weight distributions during learning, and shows that distribution-constrained learning outperforms unconstrained and sign-constrained learning. Our theory and algorithm provide novel strategies for incorporating prior knowledge about weights into learning, and reveal a powerful connection between structure and function in neural networks.

Submitted 24 October, 2022; v1 submitted 13 June, 2022; originally announced June 2022.

Comments: 38 pages, 13 figures. Updated introduction part and fixed several typos

arXiv:2206.01182 [pdf, other]

An optimal transport approach for selecting a representative subsample with application in efficient kernel density estimation

Authors: **gyi Zhang, Cheng Meng, Jun Yu, Mengrui Zhang, Wenxuan Zhong, ** Ma

Abstract: Subsampling methods aim to select a subsample as a surrogate for the observed sample. Such methods have been used pervasively in large-scale data analytics, active learning, and privacy-preserving analysis in recent decades. Instead of model-based methods, in this paper, we study model-free subsampling methods, which aim to identify a subsample that is not confined by model assumptions. Existing model-free subsampling methods are usually built upon clustering techniques or kernel tricks. Most of these methods suffer from either a large computational burden or a theoretical weakness. In particular, the theoretical weakness is that the empirical distribution of the selected subsample may not necessarily converge to the population distribution. Such computational and theoretical limitations hinder the broad applicability of model-free subsampling methods in practice. We propose a novel model-free subsampling method by utilizing optimal transport techniques. Moreover, we develop an efficient subsampling algorithm that is adaptive to the unknown probability density function. Theoretically, we show the selected subsample can be used for efficient density estimation by deriving the convergence rate for the proposed subsample kernel density estimator. We also provide the optimal bandwidth for the proposed estimator. Numerical studies on synthetic and real-world datasets demonstrate the performance of the proposed method is superior.

Submitted 31 May, 2022; originally announced June 2022.

arXiv:2205.08830 [pdf, other]

doi 10.1088/1475-7516/2022/10/033

Prospects for Detecting the Diffuse Supernova Neutrino Background with JUNO

Authors: JUNO Collaboration, Angel Abusleme, Thomas Adam, Shakeel Ahmad, Rizwan Ahmed, Sebastiano Aiello, Muhammad Akram, Fengpeng An, Qi An, Giuseppe Andronico, Nikolay Anfimov, Vito Antonelli, Tatiana Antoshkina, Burin Asavapibhop, João Pedro Athayde Marcondes de André, Didier Auguste, Nikita Balashov, Wander Baldini, Andrea Barresi, Davide Basilico, Eric Baussan, Marco Bellato, Antonio Bergnoli, Thilo Birkenfeld, Sylvie Blin, et al. (577 additional authors not shown)

Abstract: We present the detection potential for the diffuse supernova neutrino background (DSNB) at the Jiangmen Underground Neutrino Observatory (JUNO), using the inverse-beta-decay (IBD) detection channel on free protons. We employ the latest information on the DSNB flux predictions, and investigate in detail the background and its reduction for the DSNB search at JUNO. The atmospheric neutrino induced neutral current (NC) background turns out to be the most critical background, whose uncertainty is carefully evaluated from both the spread of model predictions and an envisaged \textit{in situ} measurement. We also make a careful study on the background suppression with the pulse shape discrimination (PSD) and triple coincidence (TC) cuts. With latest DSNB signal predictions, more realistic background evaluation and PSD efficiency optimization, and additional TC cut, JUNO can reach the significance of 3$σ$ for 3 years of data taking, and achieve better than 5$σ$ after 10 years for a reference DSNB model. In the pessimistic scenario of non-observation, JUNO would strongly improve the limits and exclude a significant region of the model parameter space.

Submitted 13 October, 2022; v1 submitted 18 May, 2022; originally announced May 2022.

Comments: 29 pages, 11 figures, final published version in JCAP

Journal ref: JCAP 10 (2022) 033

arXiv:2205.08794 [pdf, other]

LogiGAN: Learning Logical Reasoning via Adversarial Pre-training

Authors: Xinyu Pi, Wanjun Zhong, Yan Gao, Nan Duan, Jian-Guang Lou

Abstract: We present LogiGAN, an unsupervised adversarial pre-training framework for improving logical reasoning abilities of language models. Upon automatic identifying logical reasoning phenomena in massive text corpus via detection heuristics, we train language models to predict the masked-out logical statements. Inspired by the facilitation effect of reflective thinking in human learning, we analogically simulate the learning-thinking process with an adversarial Generator-Verifier architecture to assist logic learning. LogiGAN implements a novel sequential GAN approach that (a) circumvents the non-differentiable challenge of the sequential GAN by leveraging the Generator as a sentence-level generative likelihood scorer with a learning objective of reaching scoring consensus with the Verifier; (b) is computationally feasible for large-scale pre-training with arbitrary target length. Both base and large size language models pre-trained with LogiGAN demonstrate obvious performance improvement on 12 datasets requiring general reasoning abilities, revealing the fundamental role of logic in broad reasoning, as well as the effectiveness of LogiGAN. Ablation studies on LogiGAN components reveal the relative orthogonality between linguistic and logic abilities and suggest that reflective thinking's facilitation effect might also generalize to machine learning.

Submitted 9 December, 2022; v1 submitted 18 May, 2022; originally announced May 2022.

Comments: Accepted by NeurIPS 2022

arXiv:2205.06530 [pdf, other]

Modeling Semantic Composition with Syntactic Hypergraph for Video Question Answering

Authors: Zenan Xu, Wanjun Zhong, Qinliang Su, Zi**g Ou, Fuwei Zhang

Abstract: A key challenge in video question answering is how to realize the cross-modal semantic alignment between textual concepts and corresponding visual objects. Existing methods mostly seek to align the word representations with the video regions. However, word representations are often not able to convey a complete description of textual concepts, which are in general described by the compositions of certain words. To address this issue, we propose to first build a syntactic dependency tree for each question with an off-the-shelf tool and use it to guide the extraction of meaningful word compositions. Based on the extracted compositions, a hypergraph is further built by viewing the words as nodes and the compositions as hyperedges. Hypergraph convolutional networks (HCN) are then employed to learn the initial representations of word compositions. Afterwards, an optimal transport based method is proposed to perform cross-modal semantic alignment for the textual and visual semantic space. To reflect the cross-modal influences, the cross-modal information is incorporated into the initial representations, leading to a model named cross-modality-aware syntactic HCN. Experimental results on three benchmarks show that our method outperforms all strong baselines. Further analyses demonstrate the effectiveness of each component, and show that our model is good at modeling different levels of semantic compositions and filtering out irrelevant information.

Submitted 13 May, 2022; originally announced May 2022.

Comments: 11 pages, 7 figures

arXiv:2205.04040 [pdf, other]

ProQA: Structural Prompt-based Pre-training for Unified Question Answering

Authors: Wanjun Zhong, Yifan Gao, Ning Ding, Yujia Qin, Zhiyuan Liu, Ming Zhou, Jiahai Wang, Jian Yin, Nan Duan

Abstract: Question Answering (QA) is a longstanding challenge in natural language processing. Existing QA works mostly focus on specific question types, knowledge domains, or reasoning skills. The specialty in QA research hinders systems from modeling commonalities between tasks and generalization for wider applications. To address this issue, we present ProQA, a unified QA paradigm that solves various tasks through a single model. ProQA takes a unified structural prompt as the bridge and improves the QA-centric ability by structural prompt-based pre-training. Through a structurally designed prompt-based input schema, ProQA concurrently models the knowledge generalization for all QA tasks while keeping the knowledge customization for every specific QA task. Furthermore, ProQA is pre-trained with structural prompt-formatted large-scale synthesized corpus, which empowers the model with the commonly-required QA ability. Experimental results on 11 QA benchmarks demonstrate that ProQA consistently boosts performance on both full data fine-tuning, few-shot learning, and zero-shot testing scenarios. Furthermore, ProQA exhibits strong ability in both continual learning and transfer learning by taking the advantages of the structural prompt.

Submitted 9 December, 2022; v1 submitted 9 May, 2022; originally announced May 2022.

Comments: NAACL 2022

arXiv:2204.13249 [pdf, other]

doi 10.1088/1674-1137/ac8bc9

Sub-percent Precision Measurement of Neutrino Oscillation Parameters with JUNO

Authors: JUNO Collaboration, Angel Abusleme, Thomas Adam, Shakeel Ahmad, Rizwan Ahmed, Sebastiano Aiello, Muhammad Akram, Abid Aleem, Tsagkarakis Alexandros, Fengpeng An, Qi An, Giuseppe Andronico, Nikolay Anfimov, Vito Antonelli, Tatiana Antoshkina, Burin Asavapibhop, João Pedro Athayde Marcondes de André, Didier Auguste, Weidong Bai, Nikita Balashov, Wander Baldini, Andrea Barresi, Davide Basilico, Eric Baussan, Marco Bellato , et al. (581 additional authors not shown)

Abstract: JUNO is a multi-purpose neutrino observatory under construction in the south of China. This publication presents new sensitivity estimates for the measurement of the $Δm^2_{31}$, $Δm^2_{21}$, $\sin^2 θ_{12}$, and $\sin^2 θ_{13}$ oscillation parameters using reactor antineutrinos, which is one of the primary physics goals of the experiment. The sensitivities are obtained using the best knowledge available to date on the location and overburden of the experimental site, the nuclear reactors in the surrounding area and beyond, the detector response uncertainties, and the reactor antineutrino spectral shape constraints expected from the TAO satellite detector. It is found that the $Δm^2_{31}$, $Δm^2_{21}$, and $\sin^2 θ_{12}$ oscillation parameters will be determined to better than 0.5% precision in six years of data collection, which represents approximately an order of magnitude improvement over existing constraints.

Submitted 27 April, 2022; originally announced April 2022.

Comments: 29 pages, 10 figures, submitted to Chinese Physics C

Showing 101–150 of 368 results for author: Zhong, W