
Resistance Distances in Directed Graphs: Definitions, Properties, and Applications

Authors: Mingzhe Zhu, Liwang Zhu, Huan Li, Wei Li, Zhongzhi Zhang

Abstract: Resistance distance has been studied extensively in the past years, with the majority of previous studies devoted to undirected networks, in spite of the fact that various realistic networks are directed. Although several generalizations of resistance distance on directed graphs have been proposed, they either have no physical interpretation or are not a metric. In this paper, we first extend the definition of resistance distance to strongly connected directed graphs based on random walks and show that the two-node resistance distance on directed graphs is a metric. Then, we introduce the Laplacian matrix for directed graphs that subsumes the Laplacian matrix of undirected graphs as a particular case and use its pseudoinverse to express the two-node resistance distance, and many other relevant quantities derived from resistance distances. Moreover, we define the resistance distance between a vertex and a vertex group on directed graphs and further define a problem of optimally selecting a group of fixed number of nodes, such that their resistance distance is minimized. Since this combinatorial optimization problem is NP-hard, we present a greedy algorithm with a proved approximation ratio, and conduct experiments on model and realistic networks to validate the performance of this approximation algorithm.

Submitted 8 February, 2023; originally announced February 2023.

Comments: Submitted to IEEE Transactions on Information Theory

arXiv:2302.03669 [pdf, other]

Deep Reinforcement Learning for Traffic Light Control in Intelligent Transportation Systems

Authors: Xiao-Yang Liu, Ming Zhu, Sem Borst, Anwar Walid

Abstract: Smart traffic lights in intelligent transportation systems (ITSs) are envisioned to greatly increase traffic efficiency and reduce congestion. Deep reinforcement learning (DRL) is a promising approach to adaptively control traffic lights based on the real-time traffic situation in a road network. However, conventional methods may suffer from poor scalability. In this paper, we investigate deep reinforcement learning to control traffic lights, and both theoretical analysis and numerical experiments show that the intelligent behavior "greenwave" (i.e., a vehicle will see a progressive cascade of green lights, and not have to brake at any intersection) emerges naturally in a grid road network, which is proved to be the optimal policy in an avenue with multiple cross streets. As a first step, we use two DRL algorithms for the traffic light control problems in two scenarios. In a single road intersection, we verify that the deep Q-network (DQN) algorithm delivers a thresholding policy; and in a grid road network, we adopt the deep deterministic policy gradient (DDPG) algorithm. Secondly, numerical experiments show that the DQN algorithm delivers the optimal control, and the DDPG algorithm with passive observations has the capability to produce on its own a high-level intelligent behavior in a grid road network, namely, the "greenwave" policy emerges. We also verify the "greenwave" patterns in a $5 \times 10$ grid road network. Thirdly, the "greenwave" patterns demonstrate that DRL algorithms produce favorable solutions since the "greenwave" policy shown in experiment results is proved to be optimal in a specified traffic model (an avenue with multiple cross streets). The delivered policies both in a single road intersection and a grid road network demonstrate the scalability of DRL algorithms.

Submitted 5 March, 2023; v1 submitted 3 February, 2023; originally announced February 2023.

Comments: 17 pages

Journal ref: IEEE Transactions on Network Science and Engineering, 2023

arXiv:2302.03270 [pdf, other]

doi 10.1093/mnras/stad436

High sensitivity HI image of diffuse gas and new tidal features in M51 observed by FAST

Authors: Haiyang Yu, Ming Zhu, **-Long Xu, Mei Ai, Peng Jiang, Yanbin Yang

Abstract: We observed the classical interacting galaxy M51 with FAST and obtained a high-sensitivity HI image with column density down to 3.8 $\times$ 10$^{18}$ cm$^{-2}$. In the image we can see a diffuse extended envelope around the system and several new tidal features. We also get a deeper look at M51b's probable gas, which has an approximate velocity range of 560 to 740 km s$^{-1}$ and a flux of 7.5 Jy km s$^{-1}$. Compared to the VLA image, we observe more complete structures of the Southeast Tail, Northeast Cloud and Northwest Plume, as well as new features of the Northwest Cloud and Southwest Plume. M51's most prominent tidal feature, the Southeast Tail, looks very long and broad, along with two small detached clouds at the periphery. Due to the presence of optical and simulated counterparts, the Northwest Cloud appears to be the tail of M51a, while the Northwest Plume is more likely a tidal tail of M51b. The large mass of the Northwest Plume suggests that M51b may have been as gas-rich as M51a before the interaction. In addition, the formation process of the Northeast Cloud and Southwest Plume is obscured by the lack of optical and simulated counterparts. These novel tidal features, together with M51b's probable gas, will inspire future simulations and provide a deeper understanding of the evolution of this interacting system.

Submitted 7 February, 2023; originally announced February 2023.

Comments: 11 pages, 9 figures, accepted for publication in MNRAS

arXiv:2302.02646 [pdf, other]

doi 10.3847/2041-8213/acb932

Discovery of an isolated dark dwarf galaxy in the nearby universe

Authors: **-Long Xu, Ming Zhu, Nai** Yu, Chuan-Peng Zhang, Xiao-Lan Liu, Mei Ai, Peng Jiang

Abstract: Based on a new HI survey using the Five-hundred-meter Aperture Spherical radio Telescope (FAST), combined with the Pan-STARRS1 images, we identified an isolated HI cloud without any optical counterpart, named FAST J0139+4328. The newly discovered HI cloud appears to be a typical disk galaxy since it has a double-peak shape in the global HI profile and an S-like rotation structure in the velocity-position diagram. Moreover, this disk galaxy has an extremely low absolute magnitude (M_B>-10.0 mag) and stellar mass (<6.9*10^5 Msun). Furthermore, we find that the HI mass of this galaxy is 8.3*10^7 Msun, and the dynamical mass to total baryonic mass ratio is 47±27, implying that dark matter dominates over baryons in FAST J0139+4328. These findings provide observational evidence that FAST J0139+4328 is an isolated dark dwarf galaxy with a redshift of z=0.0083. This is the first time that an isolated dark galaxy has been detected in the nearby universe.

Submitted 6 February, 2023; originally announced February 2023.

Comments: 13 pages, 3 figures, Accepted for publication in the ApJ Letters

arXiv:2302.01970 [pdf, other]

doi 10.1609/aaai.v37i10.26473

Efficient Gradient Approximation Method for Constrained Bilevel Optimization

Authors: Siyuan Xu, Minghui Zhu

Abstract: Bilevel optimization has been developed for many machine learning tasks with large-scale and high-dimensional data. This paper considers a constrained bilevel optimization problem, where the lower-level optimization problem is convex with equality and inequality constraints and the upper-level optimization problem is non-convex. The overall objective function is non-convex and non-differentiable. To solve the problem, we develop a gradient-based approach, called gradient approximation method, which determines the descent direction by computing several representative gradients of the objective function inside a neighborhood of the current estimate. We show that the algorithm asymptotically converges to the set of Clarke stationary points, and demonstrate the efficacy of the algorithm by the experiments on hyperparameter optimization and meta-learning.

Submitted 3 February, 2023; originally announced February 2023.

arXiv:2302.01811 [pdf, other]

CheckedCBox: Type Directed Program Partitioning with Checked C for Incremental Spatial Memory Safety

Authors: Liyi Li, Arunkumar Bhattar, Le Chang, Mingwei Zhu, Aravind Machiry

Abstract: Spatial memory safety violation is still a major issue for C programs. Checked-C is a safe dialect of C and extends it with Checked pointer types and annotations that guarantee spatial memory safety in a backward-compatible manner, allowing the mix of checked pointers and regular (unchecked) pointer types. However, unchecked code vulnerabilities can violate the checked code's spatial safety guarantees. We present CheckedCBox, which adds a flexible, type-directed program partitioning mechanism to Checked-C, by enhancing the Checked-C type system with tainted types that enable flexible partitioning of the program into checked and unchecked regions, in a manner such that unchecked region code does not affect the spatial safety in the checked region. We formalize our type system and prove the non-crashing and non-exposure properties of a well-typed CheckedCBox program. We implemented CheckedCBox in a configurable manner, which enables us to use existing sandbox mechanisms (e.g., WebAssembly) to execute programs. In doing so, CheckedCBox has prevented four known vulnerabilities by efficiently partitioning the program.

Submitted 3 February, 2023; originally announced February 2023.

Comments: Liyi Li and Arunkumar Bhattar contributed equally to this work

arXiv:2302.01642 [pdf, other]

Cluster-CAM: Cluster-Weighted Visual Interpretation of CNNs' Decision in Image Classification

Authors: Zhenpeng Feng, Hongbing Ji, Milos Dakovic, Xiyang Cui, Mingzhe Zhu, Ljubisa Stankovic

Abstract: Despite the tremendous success of convolutional neural networks (CNNs) in computer vision, the mechanism of CNNs still lacks clear interpretation. Currently, class activation mapping (CAM), a famous visualization technique to interpret CNN's decision, has drawn increasing attention. Gradient-based CAMs are efficient while the performance is heavily affected by gradient vanishing and exploding. In contrast, gradient-free CAMs can avoid computing gradients to produce more understandable results. However, existing gradient-free CAMs are quite time-consuming because hundreds of forward inferences per image are required. In this paper, we propose Cluster-CAM, an effective and efficient gradient-free CNN interpretation algorithm. Cluster-CAM can significantly reduce the number of forward propagations by splitting the feature maps into clusters in an unsupervised manner. Furthermore, we propose an artful strategy to forge a cognition-base map and cognition-scissors from clustered feature maps. The final salience heatmap will be computed by merging the above cognition maps. Qualitative results conspicuously show that Cluster-CAM can produce heatmaps where the highlighted regions match the human's cognition more precisely than existing CAMs. The quantitative evaluation further demonstrates the superiority of Cluster-CAM in both effectiveness and efficiency.

Submitted 3 February, 2023; originally announced February 2023.

Comments: 10 pages

arXiv:2301.13502 [pdf]

doi 10.1007/JHEP04(2023)095

Parity-violation in bouncing cosmology

Authors: Mian Zhu, Yong Cai

Abstract: We investigate the possibility of the enhancement of parity-violation signal in bouncing cosmology. Specifically, we are interested in deciding which phase should generate the most significant parity-violation signals. We find that the dominant contribution comes from the bouncing phase, while the contraction phase has a smaller contribution. Therefore, bouncing cosmology can enhance the parity-violation signals during the bouncing phase. Moreover, since the bouncing phase has the highest energy scale in bouncing cosmology, we can also probe new physics at this scale by studying the parity-violation effect.

Submitted 21 April, 2023; v1 submitted 31 January, 2023; originally announced January 2023.

Comments: 28 pages, 22 figures

Journal ref: J. High Energ. Phys. 2023, 95 (2023)

arXiv:2301.12699 [pdf, other]

KG-BERTScore: Incorporating Knowledge Graph into BERTScore for Reference-Free Machine Translation Evaluation

Authors: Zhanglin Wu, Min Zhang, Ming Zhu, Yinglu Li, Ting Zhu, Hao Yang, Song Peng, Ying Qin

Abstract: BERTScore is an effective and robust automatic metric for reference-based machine translation evaluation. In this paper, we incorporate multilingual knowledge graph into BERTScore and propose a metric named KG-BERTScore, which linearly combines the results of BERTScore and bilingual named entity matching for reference-free machine translation evaluation. From the experimental results on WMT19 QE as a metric without references shared tasks, our metric KG-BERTScore gets higher overall correlation with human judgements than the current state-of-the-art metrics for reference-free machine translation evaluation. Moreover, the pre-trained multilingual model used by KG-BERTScore and the parameter for linear combination are also studied in this paper.

Submitted 30 January, 2023; originally announced January 2023.

Comments: 5 pages

arXiv:2301.12257 [pdf, other]

Few-shot Face Image Translation via GAN Prior Distillation

Authors: Ruoyu Zhao, Mingrui Zhu, Xiaoyu Wang, Nannan Wang

Abstract: Face image translation has made notable progress in recent years. However, when training on limited data, the performance of existing approaches significantly declines. Although some studies have attempted to tackle this problem, they either failed to achieve the few-shot setting (fewer than 10) or can only get suboptimal results. In this paper, we propose GAN Prior Distillation (GPD) to enable effective few-shot face image translation. GPD contains two models: a teacher network with GAN Prior and a student network that fulfills end-to-end translation. Specifically, we adapt the teacher network trained on large-scale data in the source domain to the target domain with only a few samples, where it can learn the target domain's knowledge. Then, we can achieve few-shot augmentation by generating source domain and target domain images simultaneously with the same latent codes. We propose an anchor-based knowledge distillation module that can fully use the difference between the training and the augmented data to distill the knowledge of the teacher network into the student network. The trained student network achieves excellent generalization performance with the absorption of additional knowledge. Qualitative and quantitative experiments demonstrate that our method achieves results superior to state-of-the-art approaches in a few-shot setting.

Submitted 28 January, 2023; originally announced January 2023.

arXiv:2301.10008 [pdf, other]

Few-shot Font Generation by Learning Style Difference and Similarity

Authors: Xiao He, Mingrui Zhu, Nannan Wang, Xinbo Gao, Heng Yang

Abstract: Few-shot font generation (FFG) aims to preserve the underlying global structure of the original character while generating target fonts by referring to a few samples. It has been applied to font library creation, a personalized signature, and other scenarios. Existing FFG methods explicitly disentangle content and style of reference glyphs universally or component-wise. However, they ignore the difference between glyphs in different styles and the similarity of glyphs in the same style, which results in artifacts such as local distortions and style inconsistency. To address this issue, we propose a novel font generation approach by learning the Difference between different styles and the Similarity of the same style (DS-Font). We introduce contrastive learning to consider the positive and negative relationship between styles. Specifically, we propose a multi-layer style projector for style encoding and realize a distinctive style representation via our proposed Cluster-level Contrastive Style (CCS) loss. In addition, we design a multi-task patch discriminator, which comprehensively considers different areas of the image and ensures that each style can be distinguished independently. We conduct comprehensive qualitative and quantitative evaluations to demonstrate that our approach achieves significantly better results than state-of-the-art methods.

Submitted 24 January, 2023; originally announced January 2023.

Comments: 11 pages

arXiv:2301.07462 [pdf, ps, other]

On explicit birational geometry for weak Fano varieties and polarised Calabi-Yau varieties

Authors: Minzhe Zhu

Abstract: Given a natural number $l$ and a weak Fano $n$-fold $X$ with $\operatorname{dim}\overline{\varphi_{-lK_X}(X)}\geq n-1$, we study the lower bound of the anti-canonical volume and the upper bound of the anti-canonical stability index. The method can also be used to give similar bounds for polarised Calabi-Yau varieties.

Submitted 18 January, 2023; originally announced January 2023.

Comments: 14 pages

MSC Class: 14J32; 14J40; 14J45

arXiv:2301.05135 [pdf, ps, other]

On Existence Theorems for Conditional Inferential Models

Authors: Rongrong Zhang, Michael Y. Zhu, Chuanhai Liu

Abstract: The framework of Inferential Models (IMs) has recently been developed in search of what is referred to as the holy grail of statistical theory, that is, prior-free probabilistic inference. Its method of Conditional IMs (CIMs) is a critical component in that it serves as a desirable extension of the Bayes theorem for combining information when no prior distribution is available. The general form of CIMs is defined by a system of first-order homogeneous linear partial differential equations (PDEs). When admitting simple solutions, they are referred to as regular, whereas when no regular CIMs exist, they are used as the so-called local CIMs. This paper provides conditions for regular CIMs, which are shown to be equivalent to the existence of a group-theoretical representation of the underlying statistical model. It also establishes existence theorems for CIMs, which state that under mild conditions, local CIMs always exist. Finally, the paper concludes with a simple example and a few remarks on future developments of CIMs for applications to popular but inferentially nontrivial statistical models.

Submitted 12 January, 2023; originally announced January 2023.

arXiv:2301.01802 [pdf, other]

MonoEdge: Monocular 3D Object Detection Using Local Perspectives

Authors: Minghan Zhu, Lingting Ge, Panqu Wang, Huei Peng

Abstract: We propose a novel approach for monocular 3D object detection by leveraging local perspective effects of each object. While the global perspective effect shown as size and position variations has been exploited for monocular 3D detection extensively, the local perspectives have long been overlooked. We design a local perspective module to regress a newly defined variable named keyedge-ratios as the parameterization of the local shape distortion to account for the local perspective, and derive the object depth and yaw angle from it. Theoretically, this module does not rely on the pixel-wise size or position in the image of the objects, and is therefore independent of the camera intrinsic parameters. By plugging this module into existing monocular 3D object detection frameworks, we incorporate the local perspective distortion with global perspective effect for monocular 3D reasoning, and we demonstrate the effectiveness and superior performance over strong baseline methods in multiple datasets.

Submitted 4 January, 2023; originally announced January 2023.

Comments: WACV 2023

arXiv:2301.01448 [pdf, other]

A deep local attention network for pre-operative lymph node metastasis prediction in pancreatic cancer via multiphase CT imaging

Authors: Zhilin Zheng, Xu Fang, Jiawen Yao, Mengmeng Zhu, Le Lu, Lingyun Huang, **g Xiao, Yu Shi, Hong Lu, Jian** Lu, Ling Zhang, Chengwei Shao, Yun Bian

Abstract: Lymph node (LN) metastasis status is one of the most critical prognostic and cancer staging factors for patients with resectable pancreatic ductal adenocarcinoma (PDAC), or in general, for any type of solid malignant tumor. Preoperative prediction of LN metastasis from non-invasive CT imaging is highly desired, as it might be straightforwardly used to guide the following neoadjuvant treatment decision and surgical planning. Most studies only capture the tumor characteristics in CT imaging to implicitly infer LN metastasis, and very few works directly exploit the LNs' CT imaging information. To the best of our knowledge, this is the first work to propose a fully-automated LN segmentation and identification network to directly facilitate the LN metastasis status prediction task. Nevertheless, LN segmentation/detection is very challenging since LNs can be easily confused with other hard negative anatomic structures (e.g., vessels) in radiological images. We explore the anatomical spatial context priors of pancreatic LN locations by generating a guiding attention map from related organs and vessels to assist segmentation and infer LN status. As such, LN segmentation is impelled to focus on regions that are anatomically adjacent or plausible with respect to the specific organs and vessels. The metastasized LN identification network is trained to classify the segmented LN instances into positives or negatives by reusing the segmentation network as a pre-trained backbone and padding a new classification head. More importantly, we develop an LN metastasis status prediction network that combines the patient-wise aggregation results of LN segmentation/identification and deep imaging features extracted from the tumor region. Extensive quantitative nested five-fold cross-validation is conducted on a discovery dataset of 749 patients with PDAC.

Submitted 4 January, 2023; originally announced January 2023.

Comments: 14 pages, 5 figures

arXiv:2301.01036 [pdf, other]

High-Quality Real-Time Rendering Using Subpixel Sampling Reconstruction

Authors: Boyu Zhang, Hongliang Yuan, Mingyan Zhu, Ligang Liu, Jue Wang

Abstract: Generating high-quality, realistic rendering images for real-time applications generally requires tracing a few samples-per-pixel (spp) and using deep learning-based approaches to denoise the resulting low-spp images. Existing denoising methods have yet to achieve real-time performance at high resolutions due to the physically-based sampling and network inference time costs. In this paper, we propose a novel Monte Carlo sampling strategy to accelerate the sampling process and a corresponding denoiser, subpixel sampling reconstruction (SSR), to obtain high-quality images. Extensive experiments demonstrate that our method significantly outperforms previous approaches in denoising quality and reduces overall time costs, enabling real-time rendering capabilities at 2K resolution.

Submitted 25 June, 2023; v1 submitted 3 January, 2023; originally announced January 2023.

arXiv:2301.00937 [pdf, other]

doi 10.3847/1538-4357/acafe8

FEASTS: IGM cooling triggered by tidal interactions through the diffuse HI phase around NGC 4631

Authors: **g Wang, Dong Yang, Se-Heon Oh, Lister Staveley-Smith, Jie Wang, Q. Daniel Wang, Kelley M. Hess, Luis C. Ho, Ligang Hou, Yingjie **g, Peter Kamphuis, Fujia Li, Xuchen Lin, Ziming Liu, Li Shao, Shun Wang, Ming Zhu

Abstract: We use the single-dish radio telescope FAST to map the HI in the tidally interacting NGC 4631 group with a resolution of 3.24$'$ (7 kpc), reaching a 5-$σ$ column density limit of $10^{17.9}$ cm$^{-2}$ assuming a line width of 20 km s$^{-1}$. Taking the existing interferometric HI image from the HALOGAS project of WSRT as reference, we are able to identify and characterize a significant excess of large-scale, low-density, and diffuse HI in the group. This diffuse HI extends for more than 120 kpc across, and accounts for more than one fourth of the total HI detected by FAST in and around the galaxy NGC 4631. In the region of the tidal tails, the diffuse HI has a typical column density above $10^{19.5}$ cm$^{-2}$, and is highly turbulent with a velocity dispersion around 50 km s$^{-1}$. It increases in column density with the dense HI, and tends to be associated with the kinematically ``hotter'' part of the dense HI. Through simple modeling, we find that the majority of the diffuse HI in the tail region is likely to induce cooling out of the hot IGM instead of evaporating or being radiatively ionized. Given these relations of gas in different phases, the diffuse HI may represent a condensing phase of the IGM. Active tidal interactions on-going and in the past may have produced the wide-spreading HI distribution, and triggered the gas accretion to NGC 4631 through the phase of the diffuse HI.

Submitted 2 January, 2023; originally announced January 2023.

Comments: 35 pages, 24 figures. Accepted for publication at ApJ. FEASTS site: http://kavli.pku.edu.cn/~jwang/FEASTS

arXiv:2212.09561 [pdf, other]

Large Language Models are Better Reasoners with Self-Verification

Authors: Yixuan Weng, Minjun Zhu, Fei Xia, Bin Li, Shizhu He, Shengping Liu, Bin Sun, Kang Liu, Jun Zhao

Abstract: Recently, with the chain of thought (CoT) prompting, large language models (LLMs), e.g., GPT-3, have shown strong reasoning ability in several natural language processing tasks such as arithmetic, commonsense, and logical reasoning. However, LLMs with CoT require multi-step prompting and multi-token prediction, which is highly sensitive to individual mistakes and vulnerable to error accumulation. The above issues make the LLMs need the ability to verify the answers. In fact, after inferring conclusions in some thinking decision tasks, people often check them by re-verifying steps to avoid some mistakes. In this paper, we propose and prove that LLMs also have similar self-verification abilities. We take the conclusion obtained by CoT as one of the conditions for solving the original problem. By performing a backward verification of the answers that LLM deduced for itself, we can obtain interpretable answer validation scores to select the candidate answer with the highest score. Experimental results demonstrate that the proposed method can improve the reasoning performance on various arithmetic, commonsense, and logical reasoning datasets. Our code is publicly available at: https://github.com/WENGSYX/Self-Verification.

Submitted 19 October, 2023; v1 submitted 19 December, 2022; originally announced December 2022.

Comments: Accepted to EMNLP 2023 Findings

arXiv:2212.09337 [pdf, other]

doi 10.1109/LSP.2023.3266115

Information Bottleneck-Inspired Type Based Multiple Access for Remote Estimation in IoT Systems

Authors: Meiyi Zhu, Chunyan Feng, Caili Guo, Nan Jiang, Osvaldo Simeone

Abstract: Type-based multiple access (TBMA) is a semantics-aware multiple access protocol for remote inference. In TBMA, codewords are reused across transmitting sensors, with each codeword being assigned to a different observation value. Existing TBMA protocols are based on fixed shared codebooks and on conventional maximum-likelihood or Bayesian decoders, which require knowledge of the distributions of observations and channels. In this letter, we propose a novel design principle for TBMA based on the information bottleneck (IB). In the proposed IB-TBMA protocol, the shared codebook is jointly optimized with a decoder based on artificial neural networks (ANNs), so as to adapt to source, observations, and channel statistics based on data only. We also introduce the Compressed IB-TBMA (CIB-TBMA) protocol, which improves IB-TBMA by enabling a reduction in the number of codewords via an IB-inspired clustering phase. Numerical results demonstrate the importance of a joint design of codebook and neural decoder, and validate the benefits of codebook compression.

Submitted 5 April, 2023; v1 submitted 19 December, 2022; originally announced December 2022.

Comments: 5 pages, 3 figures, accepted by IEEE Signal Processing Letters (SPL)

arXiv:2212.09033 [pdf, other]

Planning Immediate Landmarks of Targets for Model-Free Skill Transfer across Agents

Authors: Minghuan Liu, Zhengbang Zhu, Menghui Zhu, Yuzheng Zhuang, Weinan Zhang, Jianye Hao

Abstract: In reinforcement learning applications like robotics, agents usually need to deal with various input/output features when specified with different state/action spaces by their developers or physical restrictions. This indicates unnecessary re-training from scratch and considerable sample inefficiency, especially when agents follow similar solution steps to achieve tasks. In this paper, we aim to transfer similar high-level goal-transition knowledge to alleviate the challenge. Specifically, we propose PILoT, i.e., Planning Immediate Landmarks of Targets. PILoT utilizes the universal decoupled policy optimization to learn a goal-conditioned state planner; then, distills a goal-planner to plan immediate landmarks in a model-free style that can be shared among different agents. In our experiments, we show the power of PILoT on various transferring challenges, including few-shot transferring across action spaces and dynamics, from low-dimensional vector states to image inputs, from simple robot to complicated morphology; and we also illustrate a zero-shot transfer solution from a simple 2D navigation task to the harder Ant-Maze task.

Submitted 18 December, 2022; originally announced December 2022.

arXiv:2212.06347 [pdf, other]

doi 10.1016/j.cma.2023.116064

Reliable extrapolation of deep neural operators informed by physics or sparse observations

Authors: Min Zhu, Handi Zhang, Anran Jiao, George Em Karniadakis, Lu Lu

Abstract: Deep neural operators can learn nonlinear mappings between infinite-dimensional function spaces via deep neural networks. As promising surrogate solvers of partial differential equations (PDEs) for real-time prediction, deep neural operators such as deep operator networks (DeepONets) provide a new simulation paradigm in science and engineering. Pure data-driven neural operators and deep learning models, in general, are usually limited to interpolation scenarios, where new predictions utilize inputs within the support of the training set. However, in the inference stage of real-world applications, the input may lie outside the support, i.e., extrapolation is required, which may result in large errors and unavoidable failure of deep learning models. Here, we address this challenge of extrapolation for deep neural operators. First, we systematically investigate the extrapolation behavior of DeepONets by quantifying the extrapolation complexity via the 2-Wasserstein distance between two function spaces and propose a new behavior of bias-variance trade-off for extrapolation with respect to model capacity. Subsequently, we develop a complete workflow, including extrapolation determination, and we propose five reliable learning methods that guarantee a safe prediction under extrapolation by requiring additional information -- the governing PDEs of the system or sparse new observations. The proposed methods are based on either fine-tuning a pre-trained DeepONet or multifidelity learning. We demonstrate the effectiveness of the proposed framework for various types of parametric PDEs. Our systematic comparisons provide practical guidelines for selecting a proper extrapolation method depending on the available information, desired accuracy, and required inference speed.

Submitted 12 December, 2022; originally announced December 2022.

arXiv:2212.05759 [pdf, ps, other]

Resistance Distances in Simplicial Networks

Authors: Mingzhe Zhu, Wanyue Xu, Zhongzhi Zhang, Haibin Kan, Guanrong Chen

Abstract: It is well known that in many real networks, such as brain networks and scientific collaboration networks, there exist higher-order nonpairwise relations among nodes, i.e., interactions among more than two nodes at a time. This simplicial structure can be described by simplicial complexes and has an important effect on topological and dynamical properties of networks involving such group interactions. In this paper, we study analytically resistance distances in iteratively growing networks with higher-order interactions characterized by the simplicial structure that is controlled by a parameter q. We derive exact formulas for interesting quantities about resistance distances, including Kirchhoff index, additive degree-Kirchhoff index, multiplicative degree-Kirchhoff index, as well as average resistance distance, which have found applications in various areas elsewhere. We show that the average resistance distance tends to a q-dependent constant, indicating the impact of simplicial organization on the structural robustness measured by average resistance distance.

Submitted 12 December, 2022; originally announced December 2022.

Comments: accepted by The Computer Journal

arXiv:2212.05744 [pdf, other]

Hitting Times of Random Walks on Edge Corona Product Graphs

Authors: Mingzhe Zhu, Wanyue Xu, Wei Li, Zhongzhi Zhang, Haibin Kan

Abstract: Graph products have been extensively applied to model complex networks with striking properties observed in real-world complex systems. In this paper, we study the hitting times for random walks on a class of graphs generated iteratively by edge corona product. We first derive recursive solutions to the eigenvalues and eigenvectors of the normalized adjacency matrix associated with the graphs. Based on these results, we further obtain interesting quantities about hitting times of random walks, providing iterative formulas for two-node hitting time, as well as closed-form expressions for the Kemeny's constant defined as a weighted average of hitting times over all node pairs, as well as the arithmetic mean of hitting times of all pairs of nodes.

Submitted 12 December, 2022; originally announced December 2022.

Comments: accepted by The Computer Journal

arXiv:2212.04105 [pdf, other]

All-to-key Attention for Arbitrary Style Transfer

Authors: Mingrui Zhu, Xiao He, Nannan Wang, Xiaoyu Wang, Xinbo Gao

Abstract: Attention-based arbitrary style transfer studies have shown promising performance in synthesizing vivid local style details. They typically use the all-to-all attention mechanism -- each position of content features is fully matched to all positions of style features. However, all-to-all attention tends to generate distorted style patterns and has quadratic complexity, limiting the effectiveness and efficiency of arbitrary style transfer. In this paper, we propose a novel all-to-key attention mechanism -- each position of content features is matched to stable key positions of style features -- that is more in line with the characteristics of style transfer. Specifically, it integrates two newly proposed attention forms: distributed and progressive attention. Distributed attention assigns attention to key style representations that depict the style distribution of local regions; Progressive attention pays attention from coarse-grained regions to fine-grained key positions. The resultant module, dubbed StyA2K, shows extraordinary performance in preserving the semantic structure and rendering consistent style patterns. Qualitative and quantitative comparisons with state-of-the-art methods demonstrate the superior performance of our approach.

Submitted 6 April, 2023; v1 submitted 8 December, 2022; originally announced December 2022.

arXiv:2212.01547 [pdf, ps, other]

doi 10.1051/0004-6361/202245347

Radio continuum and OH line emission of high-z OH megamaser galaxies

Authors: Zhongzu Wu, Yu. V. Sotnikova, Bo Zhang, T. Mufakharov, Ming Zhu, Peng Jiang, Yongjun Chen, Zhiqiang Shen, Chun Sun, Hao Peng, Hong Wu

Abstract: We present the study of arcsecond scale radio continuum and OH line emission of a sample of known OH megamaser galaxies with $z \geq$ 0.15 using archival Very Large Array (VLA) data. We also present the results of our pilot Five-hundred-meter Aperture Spherical radio Telescope (FAST) observations of 12 of these OHM galaxies. The arcsecond-scale resolution images show that the OH emission is distributed in one compact structure and spatially associated with radio continuum emission. Furthermore, nearly all the fitted components are likely smaller than the beam size ($\sim$ 1.4"), which indicates that the broad OH line profiles of these sources originated from one masing region or that more components are distributed in sub-arcsec scales. The radio parameters, including brightness temperature, spectral index, and q-index, show no significant differences with the low-redshift OHM galaxies, which have significantly lower OH line luminosities. Because these parameters are indicators of the central power sources (AGN, starburst, or both), our results indicate that the presence of radio AGN in the nuclei may not be essential for the formation of OH emission. Over 1/3 of OHMs in this sample (6/17) show possible variable features likely caused by interstellar scintillation due to small angular sizes. We might underestimate this value because these sources are associated with this sample's highest OH line flux densities. Those with low OH line flux densities might need higher sensitivity observations to study the variabilities. These results support the compact nature of OH maser emission and a starburst origin for the OHMs in our selected sample.

Submitted 3 December, 2022; originally announced December 2022.

Comments: 25 pages, 7 figures, accepted by A&A

Journal ref: A&A 669, A148 (2023)

arXiv:2211.17065 [pdf, other]

doi 10.1016/j.dark.2023.101254

Microlensing effects of wormholes associated to blackhole spacetimes

Authors: Ke Gao, Lei-Hua Liu, Mian Zhu

Abstract: In this paper, we investigate the microlensing effects of wormholes associated to black hole spacetimes. Specifically, we work on three typical wormholes (WH): Schwarzschild WH, Kerr WH, and RN WH, as well as their blackhole correspondences. We evaluate the deflection angle upon the second order under weak field approximation using Gauss-Bonnet theorem. Then, we study their magnification with numerics. We find that a Kerr WH could lead to multi peaks in the magnification with certain parameters in the prograde case, while a Kerr BH predicts one peak. Therefore, the multi-peak feature can be used to distinguish the Kerr WH from other compact objects. We also find that the magnification of RN BH will be one peak compared to RN WH, in which the magnification of RN WH is negative in some situations. For other cases, the behavior of magnification from wormholes and their corresponding blackholes is similar. Our result may shed new light on exploring compact objects through the microlensing effect.

Submitted 8 June, 2023; v1 submitted 30 November, 2022; originally announced November 2022.

Comments: Match the publication version

Journal ref: Phys.Dark Univ. 41 (2023) 101254

arXiv:2211.16489 [pdf, ps, other]

Parton distribution of intrinsic charm in two dimensional QCD

Authors: Siwei Hu, Yu Jia, Zhewen Mo, Xiaonu Xiong, Mingliang Zhu

Abstract: We present a detailed investigation on the intrinsic charm content in a light meson within the 't Hooft model, namely, the two-dimensional QCD in large $N_c$ limit. The intrinsic charm parton distribution function (PDF) of a light meson, which first arises at order $N_c^{-1}$, is explicitly expressed in terms of the 't Hooft wave functions of the light meson and an infinite tower of excited charmed mesons. We also derive the functional forms from the two-dimensional counterparts of the meson cloud model (MCM) and Brodsky-Hoyer-Peterson-Sakai (BHPS) model. We then make a quantitative comparison between our rigorous results and model predictions. We also study how the profile of the intrinsic charm PDF varies with charm quark mass. The average momentum fraction carried by the charm quark inside a light meson is found to decrease faster than $m_c^{-4}$ with increasing charm quark mass.

Submitted 9 February, 2023; v1 submitted 29 November, 2022; originally announced November 2022.

Comments: 14 pages, 6 figures, 1 table

arXiv:2211.14758 [pdf, other]

VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild

Authors: Kun Cheng, Xiaodong Cun, Yong Zhang, Menghan Xia, Fei Yin, Mingrui Zhu, Xuan Wang, Jue Wang, Nannan Wang

Abstract: We present VideoReTalking, a new system to edit the faces of a real-world talking head video according to input audio, producing a high-quality and lip-syncing output video even with a different emotion. Our system disentangles this objective into three sequential tasks: (1) face video generation with a canonical expression; (2) audio-driven lip-sync; and (3) face enhancement for improving photo-realism. Given a talking-head video, we first modify the expression of each frame according to the same expression template using the expression editing network, resulting in a video with the canonical expression. This video, together with the given audio, is then fed into the lip-sync network to generate a lip-syncing video. Finally, we improve the photo-realism of the synthesized faces through an identity-aware face enhancement network and post-processing. We use learning-based approaches for all three steps and all our modules can be tackled in a sequential pipeline without any user intervention. Furthermore, our system is a generic approach that does not need to be retrained to a specific person. Evaluations on two widely-used datasets and in-the-wild examples demonstrate the superiority of our framework over other state-of-the-art methods in terms of lip-sync accuracy and visual quality.

Submitted 27 November, 2022; originally announced November 2022.

Comments: Accepted by SIGGRAPH Asia 2022 Conference Proceedings. Project page: https://vinthony.github.io/video-retalking/

arXiv:2211.10356 [pdf, other]

Packing $1.35\cdot 10^{11}$ rectangles into a unit square

Authors: Mingliang Zhu, Antal Joós

Abstract: It is known that $\sum\limits_{i=1}^{\infty} \frac{1}{i (i+1)} = 1$. In 1968, Meir and Moser asked for finding the smallest $ε$ such that all the rectangles of sizes $1/i \times 1/(i + 1)$ for $i = 1, 2, \ldots$, can be packed into a unit square or a rectangle of area $1 + ε$. In this paper, we show that we can pack the first $1.35\cdot10^{11}$ rectangles into the unit square and give an estimate for $ε$ from this packing.

Submitted 17 November, 2022; originally announced November 2022.

Comments: 7 pages, 4 figures
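
The identity quoted in the abstract above follows from the telescoping decomposition $\frac{1}{i(i+1)} = \frac{1}{i} - \frac{1}{i+1}$, so the first $n$ rectangles have total area $\frac{n}{n+1}$. The short Python sketch below is not from the paper; the function names `packed_area` and `closed_form` are illustrative assumptions. It checks the closed form exactly for small $n$ and then evaluates the area left uncovered by the first $1.35\cdot10^{11}$ rectangles. It says nothing about how the rectangles are actually arranged.

```python
from fractions import Fraction


def packed_area(n: int) -> Fraction:
    """Exact total area of the rectangles 1/i x 1/(i+1) for i = 1..n (direct sum)."""
    return sum((Fraction(1, i * (i + 1)) for i in range(1, n + 1)), Fraction(0))


def closed_form(n: int) -> Fraction:
    """Telescoped value of the same sum: 1 - 1/(n+1) = n/(n+1)."""
    return Fraction(n, n + 1)


# The direct sum and the telescoped closed form agree for small n.
for n in (1, 10, 1000):
    assert packed_area(n) == closed_form(n)

# Area of the unit square not covered by the first 1.35e11 rectangles.
n = 135_000_000_000
uncovered = 1 - closed_form(n)  # equals 1/(n+1)
print(uncovered)
```

The uncovered area is only an area count: it shows how little slack the unit square leaves, not that a packing exists.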

arXiv:2211.09555 [pdf, ps, other]

UAV Assisted Data Collection for Internet of Things: A Survey

Authors: Zhiqing Wei, Mingyue Zhu, Ning Zhang, Lin Wang, Yingying Zou, Zeyang Meng, Huici Wu, Zhiyong Feng

Abstract: Thanks to the advantages of flexible deployment and high mobility, unmanned aerial vehicles (UAVs) have been widely applied in the areas of disaster management, agricultural plant protection, environment monitoring and so on. With the development of UAV and sensor technologies, UAV assisted data collection for Internet of Things (IoT) has attracted increasing attention. In this article, the scenarios and key technologies of UAV assisted data collection are comprehensively reviewed. First, we present the system model including the network model and mathematical model of UAV assisted data collection for IoT. Then, we review the key technologies including clustering of sensors, UAV data collection mode as well as joint path planning and resource allocation. Finally, the open problems are discussed from the perspectives of efficient multiple access as well as joint sensing and data collection. This article hopefully provides some guidelines and insights for researchers in the area of UAV assisted data collection for IoT.

Submitted 17 November, 2022; originally announced November 2022.

arXiv:2211.09484 [pdf]

doi 10.1063/5.0136180

Tremendous tunneling magnetoresistance effects based on van der Waals room-temperature ferromagnet Fe$_3$GaTe$_2$ with highly spin-polarized Fermi surfaces

Authors: Xinlu Li, Meng Zhu, Yaoyuan Wang, Fanxing Zheng, Jianting Dong, Ye Zhou, Long You, Jia Zhang

Abstract: Recently, van der Waals (vdW) magnetic heterostructures have received increasing research attention in spintronics. However, the lack of room-temperature magnetic order of vdW material has largely impeded its development in practical spintronics devices. Inspired by the recently discovered vdW ferromagnet Fe3GaTe2, which has been shown to have magnetic order above room temperature and sizable perpendicular magnetic anisotropy, we investigate the basic electronic structure and magnetic properties of Fe3GaTe2 as well as tunneling magnetoresistance effect in magnetic tunnel junctions (MTJs) with structure of Fe3GaTe2/Insulator/Fe3GaTe2 by using first-principles calculations. It is found that Fe3GaTe2 with highly spin-polarized Fermi surface ensures that such magnetic tunnel junctions may have prominent tunneling magnetoresistance effect at room temperature even comparable to existing conventional AlOx and MgO-based MTJs. Our results suggest that Fe3GaTe2-based MTJs may be a promising candidate for realizing long-awaited full magnetic vdW spintronic devices.

Submitted 17 November, 2022; originally announced November 2022.

arXiv:2211.09332 [pdf]

iNavFIter-M: Matrix Formulation of Functional Iteration for Inertial Navigation Computation

Authors: Hongyan Jiang, Maoran Zhu, Yanyan Fu, Yuanxin Wu

Abstract: The acquisition of attitude, velocity, and position is an essential task in the field of inertial navigation, achieved by integrating the measurements from inertial sensors. Recently, the ultra-precision inertial navigation computation has been tackled by the functional iteration approach (iNavFIter) that drives the non-commutativity errors almost to the computer truncation error level. This paper proposes a computationally efficient matrix formulation of the functional iteration approach, named the iNavFIter-M. The Chebyshev polynomial coefficients in two consecutive iterations are explicitly connected through the matrix formulation, in contrast to the implicit iterative relationship in the original iNavFIter. By so doing, it allows a straightforward algorithmic implementation and a number of matrix factors can be pre-calculated for more efficient computation. Numerical results demonstrate that the proposed iNavFIter-M algorithm is able to achieve the same high computation accuracy as the original iNavFIter does, at the computational cost comparable to the typical two-sample algorithm. The iNavFIter-M algorithm is also implemented on an FPGA board to demonstrate its potential in real time applications.

Submitted 16 November, 2022; originally announced November 2022.

Comments: 30 pages, 7 figures

arXiv:2211.06411 [pdf, other]

Qafny: A Quantum-Program Verifier

Authors: Liyi Li, Mingwei Zhu, Rance Cleaveland, Alexander Nicolellis, Yi Lee, Le Chang, Xiaodi Wu

Abstract: Because of the probabilistic/nondeterministic behavior of quantum programs, it is highly advisable to verify them formally to ensure that they correctly implement their specifications. Formal verification, however, also traditionally requires significant effort. To address this challenge, we present Qafny, an automated proof system based on the program verifier Dafny and designed for verifying quantum programs. At its core, Qafny uses a type-guided quantum proof system that translates quantum operations to classical array operations modeled within a classical separation logic framework. We prove the soundness and completeness of our proof system and implement a prototype compiler that transforms Qafny programs and specifications into Dafny for automated verification purposes. We then illustrate the utility of Qafny's automated capabilities in efficiently verifying important quantum algorithms, including quantum-walk algorithms, Grover's algorithm, and Shor's algorithm.

Submitted 19 January, 2024; v1 submitted 11 November, 2022; originally announced November 2022.

Comments: Version 4

arXiv:2211.03107 [pdf, other]

FinRL-Meta: Market Environments and Benchmarks for Data-Driven Financial Reinforcement Learning

Authors: Xiao-Yang Liu, Ziyi Xia, Jingyang Rui, Jiechao Gao, Hongyang Yang, Ming Zhu, Christina Dan Wang, Zhaoran Wang, Jian Guo

Abstract: Finance is a particularly difficult playground for deep reinforcement learning. However, establishing high-quality market environments and benchmarks for financial reinforcement learning is challenging due to three major factors, namely, low signal-to-noise ratio of financial data, survivorship bias of historical data, and model overfitting in the backtesting stage. In this paper, we present an openly accessible FinRL-Meta library that has been actively maintained by the AI4Finance community. First, following a DataOps paradigm, we will provide hundreds of market environments through an automatic pipeline that collects dynamic datasets from real-world markets and processes them into gym-style market environments. Second, we reproduce popular papers as stepping stones for users to design new trading strategies. We also deploy the library on cloud platforms so that users can visualize their own results and assess the relative performance via community-wise competitions. Third, FinRL-Meta provides tens of Jupyter/Python demos organized into a curriculum and a documentation website to serve the rapidly growing community. FinRL-Meta is available at: https://github.com/AI4Finance-Foundation/FinRL-Meta

Submitted 6 November, 2022; originally announced November 2022.

Comments: NeurIPS 2022 Datasets and Benchmarks. 36th Conference on Neural Information Processing Systems Datasets and Benchmarks Track

arXiv:2211.02281 [pdf, other]

An Efficient FPGA-based Accelerator for Deep Forest

Authors: Mingyu Zhu, Jiapeng Luo, Wendong Mao, Zhongfeng Wang

Abstract: Deep Forest is a prominent machine learning algorithm known for its high accuracy in forecasting. Compared with deep neural networks, Deep Forest has almost no multiplication operations and has better performance on small datasets. However, due to the deep structure and large forest quantity, it suffers from large amounts of calculation and memory consumption. In this paper, an efficient hardware accelerator is proposed for deep forest models, which is also the first work to implement Deep Forest on FPGA. Firstly, a delicate node computing unit (NCU) is designed to improve inference speed. Secondly, based on NCU, an efficient architecture and an adaptive dataflow are proposed, in order to alleviate the problem of node computing imbalance in the classification process. Moreover, an optimized storage scheme in this design also improves hardware utilization and power efficiency. The proposed design is implemented on an FPGA board, Intel Stratix V, and it is evaluated by two typical datasets, ADULT and Face Mask Detection. The experimental results show that the proposed design can achieve around 40x speedup compared to that on a 40-core high-performance x86 CPU.

Submitted 4 November, 2022; originally announced November 2022.

Comments: 5 pages, 5 figures, conference

arXiv:2210.16999 [pdf, ps, other]

Uniqueness of positive solutions to elliptic equations with the critical exponential growth on the unit disc and its applications

Authors: Lu Chen, Guozhen Lu, Ying Xue, Maochun Zhu

Abstract: In this paper, we will solve this uniqueness problem of positive solutions to the following equations of exponential growth: \begin{equation*} \begin{cases} -Δu =λue^{u^2},\quad\quad & x\in B_1\subset \mathbb{R}^2,\\ u>0, & x\in B_1,\ \\ u=0,\quad\quad &x\in \partial B_1, \end{cases} \end{equation*} where $ 0<λ<λ_1(B_1)$ and $λ_1(B_1)$ denotes the first eigenvalue of the operator $-Δ$ with the Dirichlet boundary in unit disk. Our method relies on delicate and difficult analysis of radial solutions to the above equation and careful asymptotic expansion of solutions near the boundary. This uniqueness result will shed some light on solving the conjecture that maximizers of the Trudinger-Moser inequality on the unit disc are unique. Furthermore, based on this uniqueness result, we develop a new strategy to establish the quantization property of elliptic equations with the critical exponential growth in the balls of hyperbolic spaces, and obtain the multiplicity and non-existence of positive critical points for super-critical Trudinger-Moser functional. Our method for the quantization property and non-existence of the critical points avoids using the complicated blow-up analysis used in the literature. This method can also be applied to study the similar problems in balls of high dimensional Euclidean space $\mathbb{R}^n$ or hyperbolic spaces provided the uniqueness for the corresponding quasilinear elliptic equations with the critical exponential growth is established.

Submitted 30 October, 2022; originally announced October 2022.

arXiv:2210.11194 [pdf, other]

Controller-Guided Partial Label Consistency Regularization with Unlabeled Data

Authors: Qian-Wei Wang, Bowen Zhao, Mingyan Zhu, Tianxiang Li, Zimo Liu, Shu-Tao Xia

Abstract: Partial label learning (PLL) learns from training examples each associated with multiple candidate labels, among which only one is valid. In recent years, benefiting from the strong capability of dealing with ambiguous supervision and the impetus of modern data augmentation methods, consistency regularization-based PLL methods have achieved a series of successes and become mainstream. However, as the partial annotation becomes insufficient, their performances drop significantly. In this paper, we leverage easily accessible unlabeled examples to facilitate the partial label consistency regularization. In addition to a partial supervised loss, our method performs a controller-guided consistency regularization at both the label-level and representation-level with the help of unlabeled data. To minimize the disadvantages of insufficient capabilities of the initial supervised model, we use the controller to estimate the confidence of each current prediction to guide the subsequent consistency regularization. Furthermore, we dynamically adjust the confidence thresholds so that the number of samples of each class participating in consistency regularization remains roughly equal to alleviate the problem of class-imbalance. Experiments show that our method achieves satisfactory performances in more practical situations, and its modules can be applied to existing PLL methods to enhance their capabilities.

Submitted 27 February, 2024; v1 submitted 20 October, 2022; originally announced October 2022.

arXiv:2210.08763 [pdf, other]

ReasonChainQA: Text-based Complex Question Answering with Explainable Evidence Chains

Authors: Minjun Zhu, Yixuan Weng, Shizhu He, Kang Liu, Jun Zhao

Abstract: The ability of reasoning over evidence has received increasing attention in question answering (QA). Recently, natural language database (NLDB) conducts complex QA in knowledge base with textual evidences rather than structured representations; this task attracts a lot of attention because of the flexibility and richness of textual evidence. However, existing text-based complex question answering datasets fail to provide an explicit reasoning process, which is important for retrieval effectiveness and reasoning interpretability. Therefore, we present a benchmark, ReasonChainQA, with explanatory and explicit evidence chains. ReasonChainQA consists of two subtasks: answer generation and evidence chains extraction. It also contains higher diversity for multi-hop questions with varying depths, 12 reasoning types and 78 relations. To obtain high-quality textual evidences for answering complex question. Additional experiment on supervised and unsupervised retrieval fully indicates the significance of ReasonChainQA. Dataset and codes will be made publicly available upon acceptance.

Submitted 17 October, 2022; originally announced October 2022.

Comments: 5 pages

Journal ref: CAC 2022

arXiv:2210.05953 [pdf, other]

Classification by estimating the cumulative distribution function for small data

Authors: Meng-Xian Zhu, Yuan-Hai Shao

Abstract: In this paper, we study the classification problem by estimating the conditional probability function of the given data. Different from the traditional expected risk estimation theory on empirical data, we calculate the probability via the Fredholm equation, which leads to estimating the distribution of the data. Based on the Fredholm equation, a new expected risk estimation theory by estimating the cumulative distribution function is presented. The main characteristic of the new expected risk estimation is to measure the risk on the distribution of the input space. The corresponding empirical risk estimation is also presented, and an $\varepsilon$-insensitive $L_{1}$ cumulative support vector machines ($\varepsilon$-$L_{1}VSVM$) is proposed by introducing an insensitive loss. It is worth mentioning that the classification models and the classification evaluation indicators based on the new mechanism are different from the traditional ones. Experimental results show the effectiveness of the proposed $\varepsilon$-$L_{1}VSVM$ and the corresponding cumulative distribution function indicator on validity and interpretability of small data classification.

Submitted 12 October, 2022; v1 submitted 12 October, 2022; originally announced October 2022.

Comments: 39 pages, 34 figures, references added

arXiv:2209.12586 [pdf, other]

Learning Critical Scenarios in Feedback Control Systems for Automated Driving

Authors: Mengjia Zhu, Alberto Bemporad, Maximilian Kneissl, Hasan Esen

Abstract: Testing is essential for verifying and validating control designs, especially in safety-critical applications. In particular, the control system governing an automated driving vehicle must be proven reliable enough for its acceptance on the market. Recently, much research has focused on scenario-based methods. However, the number of possible driving scenarios to test is in principle infinite. In this paper, we formalize a learning-based optimization framework to generate corner test-cases, where we take into account the operational design domain. We examine the approach on the case of a feedback control system for automated driving, for which we suggest the design of the objective function expressing the criticality of scenarios. Numerical tests on two logical scenarios of the case study demonstrate that the approach can identify critical scenarios within a limited number of closed-loop experiments.

Submitted 8 September, 2023; v1 submitted 26 September, 2022; originally announced September 2022.

arXiv:2209.11483 [pdf, ps, other]

doi 10.1109/TAP.2023.3286099

Enhanced Effective Aperture Distribution Function for Characterizing Large-Scale Antenna Arrays

Authors: Xuesong Cai, Meifang Zhu, Aleksei Fedorov, Fredrik Tufvesson

Abstract: Accurate characterization of large-scale antenna arrays is growing in importance and complexity for the fifth-generation (5G) and beyond systems, as they feature more antenna elements and require increased overall performance. The full 3D patterns of all antenna elements in the array need to be characterized because they are in general different due to construction inaccuracy, coupling, antenna array's asymmetry, etc. The effective aperture distribution function (EADF) can provide an analytic description of an antenna array based on a full-sphere measurement of the array in an anechoic chamber. However, as the array aperture increases, denser spatial samples are needed for EADF due to large distance offsets of array elements from the reference point in the anechoic chamber, leading to a prohibitive measurement time and increased complexity of EADF. In this paper, we present the EADF applied to large-scale arrays and highlight issues caused by the large array aperture. To overcome the issues, an enhanced EADF is proposed with a low complexity that is intrinsically determined by the characteristic of each array element rather than the array aperture. The enhanced EADF is validated using experimental measurements conducted at 27-30 GHz frequency band with a relatively large planar array.

Submitted 7 June, 2023; v1 submitted 23 September, 2022; originally announced September 2022.

Comments: 10 pages. To appear in IEEE Transactions on Antennas and Propagation

arXiv:2209.09795 [pdf, other]

Multi-Robot-Assisted Human Crowd Evacuation using Navigation Velocity Fields

Authors: Tongjia Zheng, Zhenyuan Yuan, Mollik Nayyar, Alan R. Wagner, Minghui Zhu, Hai Lin

Abstract: This work studies a robot-assisted crowd evacuation problem where we control a small group of robots to guide a large human crowd to safe locations. The challenge lies in how to model human-robot interactions and design robot controls to indirectly control a human population that significantly outnumbers the robots. To address the challenge, we treat the crowd as a continuum and formulate the evacuation objective as driving the crowd density to target locations. We propose a novel mean-field model which consists of a family of microscopic equations that explicitly model how human motions are locally guided by the robots and an associated macroscopic equation that describes how the crowd density is controlled by the navigation velocity fields generated by all robots. Then, we design density feedback controllers for the robots to dynamically adjust their states such that the generated navigation velocity fields drive the crowd density to a target density. Stability guarantees of the proposed controllers are proven. Agent-based simulations are included to evaluate the proposed evacuation algorithms.

Submitted 20 September, 2022; originally announced September 2022.

arXiv:2209.09104 [pdf, other]

VS-CAM: Vertex Semantic Class Activation Mapping to Interpret Vision Graph Neural Network

Authors: Zhenpeng Feng, Xiyang Cui, Hongbing Ji, Mingzhe Zhu, Ljubisa Stankovic

Abstract: Graph convolutional neural network (GCN) has drawn increasing attention and attained good performance in various computer vision tasks, however, there lacks a clear interpretation of GCN's inner mechanism. For standard convolutional neural networks (CNNs), class activation mapping (CAM) methods are commonly used to visualize the connection between CNN's decision and image region by generating a heatmap. Nonetheless, such heatmap usually exhibits semantic-chaos when these CAMs are applied to GCN directly. In this paper, we proposed a novel visualization method particularly applicable to GCN, Vertex Semantic Class Activation Mapping (VS-CAM). VS-CAM includes two independent pipelines to produce a set of semantic-probe maps and a semantic-base map, respectively. Semantic-probe maps are used to detect the semantic information from semantic-base map to aggregate a semantic-aware heatmap. Qualitative results show that VS-CAM can obtain heatmaps where the highlighted regions match the objects much more precisely than CNN-based CAM. The quantitative evaluation further demonstrates the superiority of VS-CAM.

Submitted 15 September, 2022; originally announced September 2022.

Comments: 10 pages, 10 figures

arXiv:2209.09027 [pdf]

doi 10.1016/j.jsv.2022.117175

Wave analysis in the complex Fourier transform domain: A new method to obtain the Green's functions of dispersive linear partial differential equations

Authors: Minjiang Zhu

Abstract: This paper provides a new analytical method to obtain Green's functions of linear dispersive partial differential equations. The Euler-Bernoulli beam equation and the one-dimensional heat conduction equation (dissipation equation) under impulses in space and time are solved as examples. The complex infinite-domain Green's function of the Euler-Bernoulli beam is derived. A new approach is proposed to obtain the finite-domain Green's function from the infinite-domain Green's function by the reflection and transmission analysis in the complex Fourier transform domain. It is found that the solution obtained by this approach converges much better at short response times compared with that obtained by the traditional modal analysis. Besides, by applying the geometric summation formula for matrix series, a new modal expansion solution requiring no calculation of each mode's inner product is derived, which analytically proves the wave-mode duality and simplifies the calculation. The semi-infinite-domain cases and the coupled-domain cases are also derived by the newly developed method to show its validity and simplicity. It is found that the non-propagating waves also possess wave speed, and heat conduction can also be treated as propagating waves.

Submitted 16 September, 2022; originally announced September 2022.

Comments: 14 pages, 9 figures

arXiv:2209.08688 [pdf, ps, other]

On Relaxed Locally Decodable Codes for Hamming and Insertion-Deletion Errors

Authors: Alex Block, Jeremiah Blocki, Kuan Cheng, Elena Grigorescu, Xin Li, Yu Zheng, Minshen Zhu

Abstract: Locally Decodable Codes (LDCs) are error-correcting codes $C:Σ^n\rightarrow Σ^m$ with super-fast decoding algorithms. They are important mathematical objects in many areas of theoretical computer science, yet the best constructions so far have codeword length $m$ that is super-polynomial in $n$, for codes with constant query complexity and constant alphabet size. In a very surprising result, Ben-Sasson et al. showed how to construct a relaxed version of LDCs (RLDCs) with constant query complexity and almost linear codeword length over the binary alphabet, and used them to obtain significantly-improved constructions of Probabilistically Checkable Proofs. In this work, we study RLDCs in the standard Hamming-error setting, and introduce their variants in the insertion and deletion (Insdel) error setting. Insdel LDCs were first studied by Ostrovsky and Paskin-Cherniavsky, and are further motivated by recent advances in DNA random access bio-technologies, in which the goal is to retrieve individual files from a DNA storage database. Our first result is an exponential lower bound on the length of Hamming RLDCs making 2 queries, over the binary alphabet. This answers a question explicitly raised by Gur and Lachish. Our result exhibits a "phase-transition"-type behavior on the codeword length for constant-query Hamming RLDCs. We further define two variants of RLDCs in the Insdel-error setting, a weak and a strong version. On the one hand, we construct weak Insdel RLDCs with parameters matching those of the Hamming variants. On the other hand, we prove exponential lower bounds for strong Insdel RLDCs. These results demonstrate that, while these variants are equivalent in the Hamming setting, they are significantly different in the Insdel setting. Our results also prove a strict separation between Hamming RLDCs and Insdel RLDCs.

Submitted 18 September, 2022; originally announced September 2022.

arXiv:2209.06015 [pdf, other]

Black-box Dataset Ownership Verification via Backdoor Watermarking

Authors: Yiming Li, Mingyan Zhu, Xue Yang, Yong Jiang, Tao Wei, Shu-Tao Xia

Abstract: Deep learning, especially deep neural networks (DNNs), has been widely and successfully adopted in many critical applications for its high effectiveness and efficiency. The rapid development of DNNs has benefited from the existence of some high-quality datasets (e.g., ImageNet), which allow researchers and developers to easily verify the performance of their methods. Currently, almost all existing released datasets require that they can only be adopted for academic or educational purposes rather than commercial purposes without permission. However, there is still no good way to ensure that. In this paper, we formulate the protection of released datasets as verifying whether they are adopted for training a (suspicious) third-party model, where defenders can only query the model while having no information about its parameters and training details. Based on this formulation, we propose to embed external patterns via backdoor watermarking for the ownership verification to protect them. Our method contains two main parts, including dataset watermarking and dataset verification. Specifically, we exploit poison-only backdoor attacks (e.g., BadNets) for dataset watermarking and design a hypothesis-test-guided method for dataset verification. We also provide some theoretical analyses of our methods. Experiments on multiple benchmark datasets of different tasks are conducted, which verify the effectiveness of our method. The code for reproducing main experiments is available at https://github.com/THUYimingLi/DVBW.

Submitted 30 March, 2023; v1 submitted 4 August, 2022; originally announced September 2022.

Comments: This paper is accepted by IEEE TIFS. 15 pages. The preliminary short version of this paper was posted on arXiv (arXiv:2010.05821) and presented in a non-archival NeurIPS Workshop (2020)

arXiv:2209.03490 [pdf, other]

doi 10.1088/1674-4527/ac9028

Evolution of Galaxy Types and HI Gas in Hickson Compact Groups

Authors: Yao Liu, Ming Zhu

Abstract: Compact groups have high galaxy densities and low velocity dispersions, and their group members have experienced numerous and frequent interactions during their lifetimes. They provide a unique environment to study the evolution of galaxies. We examined the galaxy types and HI contents in groups to study galaxy evolution in compact groups. We used the group crossing time as an age indicator for galaxy groups. Our sample is derived from the Hickson Compact Group catalog. We obtained group morphology data from the Hyper-Leda database and the IR classification based on Wide-Field Infrared Survey Explorer (WISE) fluxes from Zucker et al. (2016). By cross-matching the latest released ALFALFA 100% HI source catalog and supplemented by data found in literature, we obtained 40 galaxy groups with HI data available. We confirmed that the weak correlation between HI mass fraction and group crossing time found by Ai & Zhu (2018) in SDSS groups also exists in compact groups. We also found that the group spiral galaxy fraction is correlated with the group crossing time, but the actively star-forming galaxy fraction is not correlated with the group crossing time. These results seem to fit with the hypothesis that the sequential acquisition of neighbors from surrounding larger-scale structures has affected the morphology transition and star formation efficiency in compact groups.

Submitted 7 September, 2022; originally announced September 2022.

Comments: 20 pages, 6 figures. Res. Astron. Astrophys. (2022)

arXiv:2208.12074 [pdf, other]

doi 10.1063/5.0144147

Ionization Induced by the Ponderomotive Force in Intense and High-Frequency Laser Fields

Authors: Mingyu Zhu, Yuxiang Liu, Chunli Wei, Hongcheng Ni, Qi Wei

Abstract: Atomic stabilization is a universal phenomenon that occurs when atoms interact with intense and high-frequency laser fields. In this work, we systematically study the influence of the ponderomotive (PM) force, present around the laser focus, on atomic stabilization. We show that the PM force could induce tunneling and even over-barrier ionization to the otherwise stabilized atoms. Such an effect may outweigh the typical multiphoton ionization under moderate laser intensities. Our work highlights the importance of an improved treatment of atomic stabilization that includes the influence of the PM force.

Submitted 5 May, 2023; v1 submitted 25 August, 2022; originally announced August 2022.

Journal ref: J. Chem. Phys. 158, 164306 (2023)

arXiv:2208.10912 [pdf, other]

Learning Instrumental Variable from Data Fusion for Treatment Effect Estimation

Authors: Anpeng Wu, Kun Kuang, Ruoxuan Xiong, Minqing Zhu, Yuxuan Liu, Bo Li, Furui Liu, Zhihua Wang, Fei Wu

Abstract: The advent of the big data era brought new opportunities and challenges to draw treatment effect in data fusion, that is, a mixed dataset collected from multiple sources (each source with an independent treatment assignment mechanism). Due to possibly omitted source labels and unmeasured confounders, traditional methods cannot estimate individual treatment assignment probability and infer treatment effect effectively. Therefore, we propose to reconstruct the source label and model it as a Group Instrumental Variable (GIV) to implement IV-based Regression for treatment effect estimation. In this paper, we conceptualize this line of thought and develop a unified framework (Meta-EM) to (1) map the raw data into a representation space to construct Linear Mixed Models for the assigned treatment variable; (2) estimate the distribution differences and model the GIV for the different treatment assignment mechanisms; and (3) adopt an alternating training strategy to iteratively optimize the representations and the joint distribution to model GIV for IV regression. Empirical results demonstrate the advantages of our Meta-EM compared with state-of-the-art methods.

Submitted 7 December, 2022; v1 submitted 23 August, 2022; originally announced August 2022.

arXiv:2208.09735 [pdf, other]

How a Small Amount of Data Sharing Benefits Distributed Optimization and Learning

Authors: Mingxi Zhu, Yinyu Ye

Abstract: Distributed optimization algorithms have been widely used in machine learning. While those algorithms have the merits in parallel processing and protecting data security, they often suffer from slow convergence. This paper focuses on how a small amount of data sharing could benefit distributed optimization and learning. Specifically, we examine higher-order optimization algorithms including distributed multi-block alternating direction method of multipliers (ADMM) and preconditioned conjugate gradient method (PCG). The contribution of this paper is threefold. First, in theory, we answer when and why distributed optimization algorithms are slow by identifying the worst data structure. Surprisingly, while PCG algorithm converges slowly under heterogeneous data structure, for distributed ADMM, data homogeneity leads to the worst performance. This result challenges the common belief that data heterogeneity hurts convergence, highlighting the need for a universal approach on altering data structure for different algorithms. Second, in practice, we propose a meta-algorithm of data sharing, with its tailored applications in multi-block ADMM and PCG methods. By only sharing a small amount of prefixed data (e.g. 1%), our algorithms provide good quality estimators in different machine learning tasks within much fewer iterations, while purely distributed optimization algorithms may take hundreds more times of iterations to converge. Finally, in philosophy, we argue that even minimal collaboration can have huge synergy, which is a concept that extends beyond the realm of optimization analysis. We hope that the discovery resulting from this paper would encourage even a small amount of data sharing among different regions to combat difficult global learning problems.

Submitted 2 January, 2024; v1 submitted 20 August, 2022; originally announced August 2022.

MSC Class: 90C06 (Primary); 90C25; 68U04 (Secondary)

Showing 251–300 of 862 results for author: Zhu, M