Search | arXiv e-print repository

Decentralized Federated Policy Gradient with Byzantine Fault-Tolerance and Provably Fast Convergence

Authors: Philip Jordan, Florian Grötschla, Flint Xiaofeng Fan, Roger Wattenhofer

Abstract: In Federated Reinforcement Learning (FRL), agents aim to collaboratively learn a common task, while each agent is acting in its local environment without exchanging raw trajectories. Existing approaches for FRL either (a) do not provide any fault-tolerance guarantees (against misbehaving agents), or (b) rely on a trusted central agent (a single point of failure) for aggregating updates. We provide the first decentralized Byzantine fault-tolerant FRL method. Towards this end, we first propose a new centralized Byzantine fault-tolerant policy gradient (PG) algorithm that improves over existing methods by relying only on assumptions standard for non-fault-tolerant PG. Then, as our main contribution, we show how a combination of robust aggregation and Byzantine-resilient agreement methods can be leveraged in order to eliminate the need for a trusted central entity. Since our results represent the first sample complexity analysis for Byzantine fault-tolerant decentralized federated non-convex optimization, our technical contributions may be of independent interest. Finally, we corroborate our theoretical results experimentally for common RL environments, demonstrating the speed-up of decentralized federations w.r.t. the number of participating agents and resilience against various Byzantine attacks.

Submitted 7 January, 2024; originally announced January 2024.

Comments: Accepted at AAMAS'24
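
The abstract above refers to robust aggregation without naming a specific aggregator. As a minimal sketch of the general idea only, and not the authors' algorithm, the Python below compares a plain mean with two common robust aggregators, the coordinate-wise median and the coordinate-wise trimmed mean. The function names, the `trim` parameter, and the example vectors are all hypothetical.

```python
from statistics import median


def coordinate_wise_median(updates):
    """Aggregate equal-length update vectors by taking the median of each coordinate."""
    return [median(coords) for coords in zip(*updates)]


def trimmed_mean(updates, trim):
    """Per coordinate, drop the `trim` smallest and `trim` largest values, then average the rest."""
    if 2 * trim >= len(updates):
        raise ValueError("trim is too large for the number of updates")
    result = []
    for coords in zip(*updates):
        kept = sorted(coords)[trim:len(coords) - trim]
        result.append(sum(kept) / len(kept))
    return result


# Hypothetical data: four honest agents send similar gradient estimates,
# while one Byzantine agent sends an extreme vector.
honest = [[1.0, -2.0], [1.1, -1.9], [0.9, -2.1], [1.0, -2.0]]
byzantine = [[1000.0, 1000.0]]
updates = honest + byzantine

plain_mean = [sum(coords) / len(coords) for coords in zip(*updates)]
robust_median = coordinate_wise_median(updates)
robust_trimmed = trimmed_mean(updates, trim=1)

print("plain mean:   ", plain_mean)
print("median:       ", robust_median)
print("trimmed mean: ", robust_trimmed)
```

In this toy setting the single outlier drags the plain mean far from the honest values, while both robust aggregators stay close to them. Removing the trusted central aggregator additionally requires the Byzantine-resilient agreement step mentioned in the abstract, which this sketch does not cover.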

arXiv:2401.03474 [pdf, other]

Nontensorial gravitational wave polarizations from the tensorial degrees of freedom: I. Linearized Lorentz-violating theory of gravity with s tensor

Authors: Shaoqi Hou, Xi-Long Fan, Tao Zhu, Zong-Hong Zhu

Abstract: General relativity predicts the existence of only two tensorial gravitational wave polarizations, while generic metric theories of gravity can possess up to four additional polarizations, including two vector and two scalar ones. These vector/scalar polarizations are in general generated by the intrinsic new vector/scalar degrees of freedom of the specific theories of gravity. In this paper, we show that, with the violation of the Lorentz symmetry in the framework of the standard model extension, the additional nontensorial polarizations can be directly excited by the two tensorial degrees of freedom. We consider the diffeomorphism invariant standard model extension in the gravity sector with the Lorentz-violating coefficients $\hat{\boldsymbol s}^{(d)\mu\rho\nu\sigma}$ of the even mass dimension $d\ge4$. In addition to the extra polarizations induced by the tensor modes, the gravitational wave in this theory travels at a speed depending on the propagation direction, experiences dispersion if and only if $d\ge6$, and possesses neither velocity nor amplitude birefringence. The excitation of the extra polarizations is also chiral. The antenna pattern functions of interferometers for this kind of gravitational wave are generally linear combinations of those for all polarizations. Detected by pulsar timing arrays and the Gaia satellite, the stochastic gravitational wave background in this model could induce couplings among cross correlations, of the redshifts of photons and the astrometric deflections of the positions of pulsars, for different polarizations. These characteristics enable the use of interferometers, pulsar timing arrays and the Gaia mission to constrain this model.

Submitted 11 April, 2024; v1 submitted 7 January, 2024; originally announced January 2024.

Comments: 14 pages, 6 figures. Modified according to Referee report

Journal ref: Phys Rev D 109, 084011 (2024)

arXiv:2401.02668 [pdf, other]

Towards Integrated Fine-tuning and Inference when Generative AI meets Edge Intelligence

Authors: Ning Chen, Zhipeng Cheng, Xuwei Fan, Xiaoyu Xia, Lianfen Huang

Abstract: The high-performance generative artificial intelligence (GAI) represents the latest evolution of computational intelligence, while the blessing of future 6G networks also makes edge intelligence (EI) full of development potential. The inevitable encounter between GAI and EI can unleash new opportunities, where GAI's pre-training based on massive computing resources and large-scale unlabeled corpora can provide strong foundational knowledge for EI, while EI can harness fragmented computing resources to aggregate personalized knowledge for GAI. However, the natural contradictory features pose significant challenges to direct knowledge sharing. To address this, in this paper, we propose the GAI-oriented synthetical network (GaisNet), a collaborative cloud-edge-end intelligence framework that buffers contradiction leveraging data-free knowledge relay, where the bidirectional knowledge flow enables GAI's virtuous-cycle model fine-tuning and task inference, achieving mutualism between GAI and EI with seamless fusion and collaborative evolution. Experimental results demonstrate the effectiveness of the proposed mechanisms. Finally, we discuss the future challenges and directions in the interplay between GAI and EI.

Submitted 5 January, 2024; originally announced January 2024.

Comments: 11 pages, 8 figures, and 5 tables

arXiv:2401.02662 [pdf, other]

GainNet: Coordinates the Odd Couple of Generative AI and 6G Networks

Authors: Ning Chen, Jie Yang, Zhipeng Cheng, Xuwei Fan, Zhang Liu, Bangzhen Huang, Yifeng Zhao, Lianfen Huang, Xiaojiang Du, Mohsen Guizani

Abstract: The rapid expansion of AI-generated content (AIGC) reflects the iteration from assistive AI towards generative AI (GAI) with creativity. Meanwhile, the 6G networks will also evolve from the Internet-of-everything to the Internet-of-intelligence with hybrid heterogeneous network architectures. In the future, the interplay between GAI and the 6G will lead to new opportunities, where GAI can learn the knowledge of personalized data from the massive connected 6G end devices, while GAI's powerful generation ability can provide advanced network solutions for 6G network and provide 6G end devices with various AIGC services. However, they seem to be an odd couple, due to the contradiction of data and resources. To achieve a better-coordinated interplay between GAI and 6G, the GAI-native networks (GainNet), a GAI-oriented collaborative cloud-edge-end intelligence framework, is proposed in this paper. By deeply integrating GAI with 6G network design, GainNet realizes the positive closed-loop knowledge flow and sustainable-evolution GAI model optimization. On this basis, the GAI-oriented generic resource orchestration mechanism with integrated sensing, communication, and computing (GaiRom-ISCC) is proposed to guarantee the efficient operation of GainNet. Two simple case studies demonstrate the effectiveness and robustness of the proposed schemes. Finally, we envision the key challenges and future directions concerning the interplay between GAI models and 6G networks.

Submitted 5 January, 2024; originally announced January 2024.

Comments: 10 pages, 5 figures, 1 table

arXiv:2401.02251 [pdf, other]

Nonreciprocal Unconventional Photon Blockade with Kerr Magnons

Authors: Xiao-Hong Fan, Yi-Ning Zhang, Jun-Po Yu, Ming-Yue Liu, Wen-Di He, Hai-Chao Li, Wei Xiong

Abstract: Nonreciprocal devices, which allow the manipulation of one-way signals, are crucial to quantum information processing and quantum networks. Here we propose a nonlinear cavity-magnon system, consisting of a microwave cavity coupled to one or two yttrium-iron-garnet (YIG) spheres supporting magnons with Kerr nonlinearity, to investigate nonreciprocal unconventional photon blockade. The nonreciprocity originates from the direction-dependent Kerr effect, distinctly different from previous proposals with spinning cavities and dissipative couplings. For a single sphere case, nonreciprocal unconventional photon blockade can be realized by manipulating the nonreciprocal destructive interference between two active paths, by varying the Kerr coefficient from positive to negative, or vice versa. By optimizing the system parameters, the perfect and well-tuned nonreciprocal unconventional photon blockade can be predicted. For the case of two spheres with opposite Kerr effects, only reciprocal unconventional photon blockade can be observed when the two cavity-magnon coupling strengths and Kerr strengths are symmetric. However, when coupling strengths or Kerr strengths become asymmetric, nonreciprocal unconventional photon blockade appears. This implies that two-sphere nonlinear cavity-magnon systems can be used to switch the transition between reciprocal and nonreciprocal unconventional photon blockades. Our study offers a potential platform for investigating nonreciprocal photon blockade effect in nonlinear cavity magnonics.

Submitted 25 April, 2024; v1 submitted 4 January, 2024; originally announced January 2024.

Comments: 9 pages, 8 figures. Accepted by Advanced Quantum Technologies

arXiv:2401.01491

A Hybrid Neural Network Model For Predicting The Nitrate Concentration In The Recirculating Aquaculture System

Authors: Xiangyu Fan, Jiaxin Lia, Yingzhe Wang, Yingsha Qu, Hao Li, Keming Qu, Zhengguo Cui

Abstract: This study was groundbreaking in its application of neural network models for nitrate management in the Recirculating Aquaculture System (RAS). A hybrid neural network model was proposed, which accurately predicted daily nitrate concentration and its trends using six water quality parameters. We conducted a 105-day aquaculture experiment, during which we collected 450 samples from five sets of RAS to train our model (C-L-A model) which incorporates Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), and self-Attention. Furthermore, we obtained 90 samples from a standalone RAS as the testing data to evaluate the performance of the model in practical applications. The experimental results proved that the C-L-A model accurately predicted nitrate concentration in RAS and maintained good performance even with a reduced proportion of training data. We recommend using water quality parameters from the past 7 days to forecast future nitrate concentration, as this timeframe allows the model to achieve maximum generalization capability. Additionally, we compared the performance of the C-L-A model with three basic neural network models (CNN, LSTM, self-Attention) as well as three hybrid neural network models (CNN-LSTM, CNN-Attention, LSTM-Attention). The results demonstrated that the C-L-A model (R²=0.956) significantly outperformed the other neural network models (R²=0.901-0.927). Our study suggests that the utilization of neural network models, specifically the C-L-A model, could potentially assist the RAS industry in conserving resources for daily nitrate monitoring.

Submitted 15 January, 2024; v1 submitted 2 January, 2024; originally announced January 2024.

Comments: The content of this paper needs to be further filled and improved

arXiv:2401.01367 [pdf]

Guidelines in Wastewater-based Epidemiology of SARS-CoV-2 with Diagnosis

Authors: Madiha Fatima, Zhihua Cao, Aichun Huang, Shengyuan Wu, Xinxian Fan, Yi Wang, Liu Jiren, Ziyun Zhu, Qiongrou Ye, Yuan Ma, Joseph K. F Chow, Peng Jia, Yangshou Liu, Yubin Lin, Manjun Ye, Tong Wu, Zhixun Li, Cong Cai, Wenhai Zhang, Cheris H. Q. Ding, Yuanzhe Cai, Feijuan Huang

Abstract: With the global spread and increasing transmission rate of SARS-CoV-2, more and more laboratories and researchers are turning their attention to wastewater-based epidemiology (WBE), hoping it can become an effective tool for large-scale testing and provide more accurate predictions of the number of infected individuals. Based on the cases of sewage sampling and testing in some regions such as Hong Kong, Brazil, and the United States, the feasibility of detecting the novel coronavirus in sewage is extremely high. This study reviews domestic and international achievements in detecting SARS-CoV-2 through WBE and summarizes four aspects of COVID-19, including sampling methods, virus decay rate calculation, standardized population coverage of the watershed, algorithm prediction, and provides ideas for combining field modeling with epidemic prevention and control. Moreover, we highlight some diagnostic techniques for detection of the virus from sewage samples. Our review is a new approach to identifying the research gaps in wastewater-based epidemiology and diagnosis, and we also predict the future prospects of our analysis.

Submitted 26 December, 2023; originally announced January 2024.

arXiv:2401.00741 [pdf, other]

ToolEyes: Fine-Grained Evaluation for Tool Learning Capabilities of Large Language Models in Real-world Scenarios

Authors: Junjie Ye, Guanyu Li, Songyang Gao, Caishuang Huang, Yilong Wu, Sixian Li, Xiaoran Fan, Shihan Dou, Qi Zhang, Tao Gui, Xuanjing Huang

Abstract: Existing evaluations of tool learning primarily focus on validating the alignment of selected tools for large language models (LLMs) with expected outcomes. However, these approaches rely on a limited set of scenarios where answers can be pre-determined, diverging from genuine needs. Furthermore, a sole emphasis on outcomes disregards the intricate capabilities essential for LLMs to effectively utilize tools. To tackle this issue, we propose ToolEyes, a fine-grained system tailored for the evaluation of the LLMs' tool learning capabilities in authentic scenarios. The system meticulously examines seven real-world scenarios, analyzing five dimensions crucial to LLMs in tool learning: format alignment, intent comprehension, behavior planning, tool selection, and answer organization. Additionally, ToolEyes incorporates a tool library boasting approximately 600 tools, serving as an intermediary between LLMs and the physical world. Evaluations involving ten LLMs across three categories reveal a preference for specific scenarios and limited cognitive abilities in tool learning. Intriguingly, expanding the model size even exacerbates the hindrance to tool learning. These findings offer instructive insights aimed at advancing the field of tool learning. The data is available at https://github.com/Junjie-Ye/ToolEyes.

Submitted 14 January, 2024; v1 submitted 1 January, 2024; originally announced January 2024.

arXiv:2401.00421 [pdf, other]

From Text to Pixels: A Context-Aware Semantic Synergy Solution for Infrared and Visible Image Fusion

Authors: Xingyuan Li, Yang Zou, Jinyuan Liu, Zhiying Jiang, Long Ma, Xin Fan, Risheng Liu

Abstract: With the rapid progression of deep learning technologies, multi-modality image fusion has become increasingly prevalent in object detection tasks. Despite its popularity, the inherent disparities in how different sources depict scene content make fusion a challenging problem. Current fusion methodologies identify shared characteristics between the two modalities and integrate them within this shared domain using either iterative optimization or deep learning architectures, which often neglect the intricate semantic relationships between modalities, resulting in a superficial understanding of inter-modal connections and, consequently, suboptimal fusion outcomes. To address this, we introduce a text-guided multi-modality image fusion method that leverages the high-level semantics from textual descriptions to integrate semantics from infrared and visible images. This method capitalizes on the complementary characteristics of diverse modalities, bolstering both the accuracy and robustness of object detection. The codebook is utilized to enhance a streamlined and concise depiction of the fused intra- and inter-domain dynamics, fine-tuned for optimal performance in detection tasks. We present a bilevel optimization strategy that establishes a nexus between the joint problem of fusion and detection, optimizing both processes concurrently. Furthermore, we introduce the first dataset of paired infrared and visible images accompanied by text prompts, paving the way for future research. Extensive experiments on several datasets demonstrate that our method not only produces visually superior fusion results but also achieves a higher detection mAP over existing methods, achieving state-of-the-art results.

Submitted 31 December, 2023; originally announced January 2024.

Comments: 10 pages, 12 figures, 3 tables, conference

MSC Class: 68T45 ACM Class: I.4.3

arXiv:2312.17674 [pdf, other]

QoE-oriented Dependent Task Scheduling under Multi-dimensional QoS Constraints over Distributed Networks

Authors: Xuwei Fan, Zhipeng Cheng, Ning Chen, Lianfen Huang, Xianbin Wang

Abstract: Task scheduling as an effective strategy can improve application performance on computing resource-limited devices over distributed networks. However, existing evaluation mechanisms fail to depict the complexity of diverse applications, which involve dependencies among tasks, computing resource requirements, and multi-dimensional quality of service (QoS) constraints. Furthermore, traditional QoS-oriented task scheduling strategies struggle to meet the performance requirements without considering differences in satisfaction and acceptance of application, leading to application failures and resource wastage. To tackle these issues, a quality of experience (QoE) cost model is designed to evaluate application completion, depicting the relationship among application satisfaction, communications, and computing resources in the distributed networks. Specifically, considering the sensitivity and preference of QoS, we model the different dimensional QoS degradation cost functions for dependent tasks, which are then integrated into the QoE cost model. Based on the QoE model, the dependent task scheduling problem is formulated as the minimization of overall QoE cost, aiming to improve the application performance in the distributed networks, which is proven NP-hard. Moreover, a heuristic Hierarchical Multi-queue Task Scheduling Algorithm (HMTSA) is proposed to address the QoE-oriented task scheduling problem among multiple dependent tasks, which utilizes hierarchical multiple queues to determine the optimal task execution order and location according to different dimensional QoS priorities. Finally, extensive experiments demonstrate that the proposed algorithm can significantly improve the satisfaction of applications.

Submitted 29 December, 2023; originally announced December 2023.

arXiv:2312.15668 [pdf, ps, other]

Air-to-Ground Communications Beyond 5G: UAV Swarm Formation Control and Tracking

Authors: Xiao Fan, Peiran Wu, Minghua Xia

Abstract: Unmanned aerial vehicle (UAV) communications have been widely accepted as promising technologies to support air-to-ground communications in the forthcoming sixth-generation (6G) wireless networks. This paper proposes a novel air-to-ground communication model consisting of aerial base stations served by UAVs and terrestrial user equipments (UEs) by integrating the technique of coordinated multi-point (CoMP) transmission with the theory of stochastic geometry. In particular, a CoMP set consisting of multiple UAVs is developed based on the theory of Poisson-Delaunay tetrahedralization. Effective UAV formation control and UAV swarm tracking schemes for two typical scenarios, including static and mobile UEs, are also developed using the multi-agent system theory to ensure that collaborative UAVs can efficiently reach target spatial positions for mission execution. Thanks to the ease of mathematical tractability, this model provides explicit performance expressions for a typical UE's coverage probability and achievable ergodic rate. Extensive simulation and numerical results corroborate that the proposed scheme outperforms UAV communications without CoMP transmission and obtains similar performance to the conventional CoMP scheme while avoiding search overhead.

Submitted 25 December, 2023; originally announced December 2023.

Comments: 14 pages, 9 figures, to appear in IEEE TWC

arXiv:2312.10422 [pdf, other]

Learning Dense Correspondence for NeRF-Based Face Reenactment

Authors: Songlin Yang, Wei Wang, Yushi Lan, Xiangyu Fan, Bo Peng, Lei Yang, Jing Dong

Abstract: Face reenactment is challenging due to the need to establish dense correspondence between various face representations for motion transfer. Recent studies have utilized Neural Radiance Field (NeRF) as fundamental representation, which further enhanced the performance of multi-view face reenactment in photo-realism and 3D consistency. However, establishing dense correspondence between different face NeRFs is non-trivial, because implicit representations lack ground-truth correspondence annotations like mesh-based 3D parametric models (e.g., 3DMM) with index-aligned vertexes. Although aligning 3DMM space with NeRF-based face representations can realize motion control, it is sub-optimal for their limited face-only modeling and low identity fidelity. Therefore, we are inspired to ask: Can we learn the dense correspondence between different NeRF-based face representations without a 3D parametric model prior? To address this challenge, we propose a novel framework, which adopts tri-planes as fundamental NeRF representation and decomposes face tri-planes into three components: canonical tri-planes, identity deformations, and motion. In terms of motion control, our key contribution is proposing a Plane Dictionary (PlaneDict) module, which efficiently maps the motion conditions to a linear weighted addition of learnable orthogonal plane bases. To the best of our knowledge, our framework is the first method that achieves one-shot multi-view face reenactment without a 3D parametric model prior. Extensive experiments demonstrate that we produce better results in fine-grained motion control and identity preservation than previous methods.

Submitted 18 December, 2023; v1 submitted 16 December, 2023; originally announced December 2023.

Comments: Accepted by Proceedings of the AAAI Conference on Artificial Intelligence, 2024

arXiv:2312.09979 [pdf, other]

LoRAMoE: Alleviate World Knowledge Forgetting in Large Language Models via MoE-Style Plugin

Authors: Shihan Dou, Enyu Zhou, Yan Liu, Songyang Gao, Jun Zhao, Wei Shen, Yuhao Zhou, Zhiheng Xi, Xiao Wang, Xiaoran Fan, Shiliang Pu, Jiang Zhu, Rui Zheng, Tao Gui, Qi Zhang, Xuanjing Huang

Abstract: Supervised fine-tuning (SFT) is a crucial step for large language models (LLMs), enabling them to align with human instructions and enhance their capabilities in downstream tasks. Increasing instruction data substantially is a direct solution to align the model with a broader range of downstream tasks or notably improve its performance on a specific task. However, we find that large-scale increases in instruction data can damage the world knowledge previously stored in LLMs. To address this challenge, we propose LoRAMoE, a novel framework that introduces several low-rank adapters (LoRA) and integrates them by using a router network, like a plugin version of Mixture of Experts (MoE). It freezes the backbone model and forces a portion of LoRAs to focus on leveraging world knowledge to solve downstream tasks, to alleviate world knowledge forgetting. Experimental results show that, as the instruction data increases, LoRAMoE can significantly improve the ability to process downstream tasks, while maintaining the world knowledge stored in the LLM.

Submitted 8 March, 2024; v1 submitted 15 December, 2023; originally announced December 2023.

Comments: 14 pages, 7 figures
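
The abstract above describes a frozen backbone whose output is combined with several low-rank adapters weighted by a router. A minimal Python sketch of that general pattern for one linear layer follows. It is an illustration under stated assumptions, not the LoRAMoE implementation: the softmax router, the function names, and the toy matrices are hypothetical, and the mechanism that steers some adapters toward world knowledge is omitted.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def lora_moe_forward(x, frozen_w, experts, router_w):
    """Compute frozen_w @ x + sum_i gate_i * B_i @ (A_i @ x).

    `frozen_w` stands for the backbone weight, which is never updated.
    Each expert is a pair (A_i, B_i) of low-rank matrices.
    `router_w` maps the input to one score per expert (assumed softmax gating).
    """
    gates = softmax(matvec(router_w, x))
    out = matvec(frozen_w, x)
    for gate, (a, b) in zip(gates, experts):
        delta = matvec(b, matvec(a, x))  # rank-limited update from one adapter
        out = [o + gate * d for o, d in zip(out, delta)]
    return out, gates


# Hypothetical toy layer: 2-dimensional input, two rank-1 adapters.
x = [1.0, 2.0]
frozen_w = [[1.0, 0.0], [0.0, 1.0]]
experts = [
    ([[1.0, 0.0]], [[1.0], [0.0]]),  # adapter acting on the first coordinate
    ([[0.0, 1.0]], [[0.0], [1.0]]),  # adapter acting on the second coordinate
]
router_w = [[0.0, 0.0], [0.0, 0.0]]  # zero scores give uniform gates

output, gates = lora_moe_forward(x, frozen_w, experts, router_w)
print(output, gates)
```

Only the adapter matrices and the router would be trained in such a setup. The backbone weight stays fixed, which is the property the abstract relies on to preserve stored knowledge.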

arXiv:2312.09498 [pdf, other]

Neural Gaussian Similarity Modeling for Differential Graph Structure Learning

Authors: Xiaolong Fan, Maoguo Gong, Yue Wu, Zedong Tang, Jieyi Liu

Abstract: Graph Structure Learning (GSL) has demonstrated considerable potential in the analysis of graph-unknown non-Euclidean data across a wide range of domains. However, constructing an end-to-end graph structure learning model poses a challenge due to the impediment of gradient flow caused by the nearest neighbor sampling strategy. In this paper, we construct a differential graph structure learning model by replacing the non-differentiable nearest neighbor sampling with a differentiable sampling using the reparameterization trick. Under this framework, we argue that the act of sampling nearest neighbors may not invariably be essential, particularly in instances where node features exhibit a significant degree of similarity. To alleviate this issue, the bell-shaped Gaussian Similarity (GauSim) modeling is proposed to sample non-nearest neighbors. To adaptively model the similarity, we further propose Neural Gaussian Similarity (NeuralGauSim) with learnable parameters featuring flexible sampling behaviors. In addition, we develop a scalable method by transferring the large-scale graph to the transition graph to significantly reduce the complexity. Experimental results demonstrate the effectiveness of the proposed methods.

Submitted 14 December, 2023; originally announced December 2023.

Comments: Accepted by AAAI 2024

arXiv:2312.08743 [pdf, other]

FAPP: Fast and Adaptive Perception and Planning for UAVs in Dynamic Cluttered Environments

Authors: Minghao Lu, Xiyu Fan, Han Chen, Peng Lu

Abstract: Obstacle avoidance for Unmanned Aerial Vehicles (UAVs) in cluttered environments is significantly challenging. Existing obstacle avoidance for UAVs either focuses on fully static environments or static environments with only a few dynamic objects. In this paper, we take the initiative to consider the obstacle avoidance of UAVs in dynamic cluttered environments in which dynamic objects are the dominant objects. This type of environment poses significant challenges to both perception and planning. Multiple dynamic objects possess various motions, making it extremely difficult to estimate and predict their motions using one motion model. The planning must be highly efficient to avoid cluttered dynamic objects. This paper proposes Fast and Adaptive Perception and Planning (FAPP) for UAVs flying in complex dynamic cluttered environments. A novel and efficient point cloud segmentation strategy is proposed to distinguish static and dynamic objects. To address multiple dynamic objects with different motions, an adaptive estimation method with covariance adaptation is proposed to quickly and accurately predict their motions. Our proposed trajectory optimization algorithm is highly efficient, enabling it to avoid fast objects. Furthermore, an adaptive re-planning method is proposed to address the case when the trajectory optimization cannot find a feasible solution, which is common for dynamic cluttered environments. Extensive validations in both simulation and real-world experiments demonstrate the effectiveness of our proposed system for highly dynamic and cluttered environments.

Submitted 14 December, 2023; originally announced December 2023.

arXiv:2312.08324 [pdf, other]

Bayesian Nonparametric Clustering with Feature Selection for Spatially Resolved Transcriptomics Data

Authors: Bencong Zhu, Guanyu Hu, Yang Xie, Lin Xu, Xiaodan Fan, Qiwei Li

Abstract: The advent of next-generation sequencing-based spatially resolved transcriptomics (SRT) techniques has reshaped genomic studies by enabling high-throughput gene expression profiling while preserving spatial and morphological context. Nevertheless, there are inherent challenges associated with these new high-dimensional spatial data, such as zero-inflation, over-dispersion, and heterogeneity. These challenges pose obstacles to effective clustering, which is a fundamental problem in SRT data analysis. Current computational approaches often rely on heuristic data preprocessing and arbitrary cluster number prespecification, leading to considerable information loss and consequently, suboptimal downstream analysis. In response to these challenges, we introduce BNPSpace, a novel Bayesian nonparametric spatial clustering framework that directly models SRT count data. BNPSpace facilitates the partitioning of the whole spatial domain, which is characterized by substantial heterogeneity, into homogeneous spatial domains with similar molecular characteristics while identifying a parsimonious set of discriminating genes among different spatial domains. Moreover, BNPSpace incorporates spatial information through a Markov random field prior model, encouraging a smooth and biologically meaningful partition pattern.

Submitted 13 December, 2023; originally announced December 2023.

arXiv:2312.06063 [pdf, other]

PCRDiffusion: Diffusion Probabilistic Models for Point Cloud Registration

Authors: Yue Wu, Yongzhe Yuan, Xiaolong Fan, Xiaoshui Huang, Maoguo Gong, Qiguang Miao

Abstract: We propose a new framework that formulates point cloud registration as a denoising diffusion process from noisy transformation to object transformation. During the training stage, the object transformation diffuses from the ground-truth transformation to a random distribution, and the model learns to reverse this noising process. In the sampling stage, the model refines a randomly generated transformation to the output result in a progressive way. We derive the variational bound in closed form for training and provide implementations of the model. Our work provides the following crucial findings: (i) In contrast to most existing methods, our framework, Diffusion Probabilistic Models for Point Cloud Registration (PCRDiffusion), does not require repeatedly updating the source point cloud to refine the predicted transformation. (ii) Point cloud registration, one of the representative discriminative tasks, can be solved in a generative way with a unified probabilistic formulation. Finally, we discuss and provide an outlook on the application of diffusion models in different scenarios for point cloud registration. Experimental results demonstrate that our model achieves competitive performance in point cloud registration. In both correspondence-free and correspondence-based scenarios, PCRDiffusion achieves performance improvements exceeding 50%.

Submitted 10 December, 2023; originally announced December 2023.

arXiv:2312.05824 [pdf]

Third order nonlinear transport properties in topological chiral antiferromagnetic semimetal CoNb3S6

Authors: Junjian Mi, Jialin Li, Miaocong Li, Sheng Xu, Shuang Yu, Zheng Li, Xinyi Fan, Huanfeng Zhu, Qian Tao, Linjun Li, Zhuan Xu

Abstract: The topology between Bloch states in reciprocal space has attracted tremendous attention in recent years. The quantum geometry of the band structure is composed of the quantum metric as the real part and the Berry curvature as the imaginary part. While the Berry curvature, the Berry curvature dipole and the Berry connection polarizability have recently been revealed by the first order anomalous Hall, second order and third order nonlinear Hall effects respectively, the quantum metric induced second order nonlinear transverse and longitudinal response in the topological antiferromagnetic material MnBi2Te4 was only very recently reported. Here we demonstrate similar third order nonlinear transport properties in the topological antiferromagnet CoNb3S6. We observed that the third order nonlinear longitudinal $V^{3\omega}_{xx}$ increases significantly at the antiferromagnetic transition temperature $T_N \sim 29$ K, which was probably induced by the quantum metric without time-reversal symmetry or inversion symmetry. Besides, temperature-dependent nonlinear behaviour was observed in the first order I-V curve below the Néel temperature $T_N$, which was not reported in MnBi2Te4 and FeSn. Such nonlinear I-V behaviour hints at the possible existence of a Charge Density Wave (CDW) state, which has been discovered in its sister material FeNb3S6. Simultaneously, two plateaus in the third order nonlinear longitudinal $V^{3\omega}_{xx}$ versus $I^{\omega}$ curve are observed, which are also speculated to be related to the possible CDW state. However, the genuine mechanism for the first order nonlinear I-V and its relation with the third order nonlinear transport call for more experimental investigations and theoretical interpretation. Our work provides a way to explore third harmonic nonlinear transport and its interaction with magnetic order and CDW.

Submitted 10 December, 2023; originally announced December 2023.

Comments: 16 pages, 4 figures

arXiv:2312.05561 [pdf, other]

doi 10.1103/PhysRevA.109.043512

Nonreciprocal Photon-Phonon Entanglement in Kerr-Modified Spinning Cavity Magnomechanics

Authors: Jiaojiao Chen, Xiao-Gang Fan, Wei Xiong, Dong Wang, Liu Ye

Abstract: Cavity magnomechanics has shown great potential in studying macroscopic quantum effects, especially for quantum entanglement, which is a key resource for quantum information science. Here we propose to realize magnon mediated nonreciprocal photon-phonon entanglement, which exhibits asymmetry when opposite magnetic or driving fields are respectively applied to the magnons with the Kerr effect or the photons with the Sagnac effect. We find that the mean magnon number can selectively exhibit nonreciprocal linear or nonlinear (bistable) behavior with the strength of the strong driving field on the cavity. Assisted by this driving field, the magnon-phonon coupling is greatly enhanced, leading to the nonreciprocal photon-phonon entanglement via the swapping interaction between the magnons and photons. This nonreciprocal entanglement can be significantly enhanced with the magnon Kerr and Sagnac effects. Given the available parameters, the nonreciprocal photon-phonon entanglement can be preserved at $\sim3$ K, showing remarkable resilience against the bath temperature. The result reveals that our paper holds promise in developing various nonreciprocal devices with both the magnon Kerr and Sagnac effects in cavity magnomechanics.

Submitted 11 April, 2024; v1 submitted 9 December, 2023; originally announced December 2023.

Comments: 9 pages, 7 figures

Journal ref: Phys. Rev. A 109 (4), 043512 (2024)

arXiv:2312.04708 [pdf]

Integrated Design of Aluminum-Containing High-entropy Refractory B2 Alloys with Synergy of High Strength and Ductility

Authors: Jie Qi, Xuesong Fan, Diego Ibarra Hoyos, Michael Widom, Peter K. Liaw, Joseph Poon

Abstract: Refractory high-entropy alloys, RHEAs, are promising high-temperature structural materials. Their large compositional space poses great design challenges for phase control and high strength-ductility synergy. The present research pioneers using integrated high-throughput machine learning with Monte Carlo simulations to effectively navigate phase-selection and mechanical-properties predictions, developing aluminum-containing RHEAs in single-phase ordered B2 alloys demonstrating both high strength and ductility. These aluminum-containing RHEAs achieve remarkable mechanical properties, including compressive yield strengths up to 1.6 GPa, fracture strains exceeding 50 percent, and significant high-temperature strength retention. They also demonstrate a tensile yield strength of 1.1 GPa with a tension ductility of 6.3 percent. Besides, we identify a valence-electron-count domain for alloy brittleness with the explanation from density-functional theory and provide crucial insights into elements' influence on atomic ordering and mechanical performance. The work sets forth a strategic blueprint for high-throughput alloy design and reveals fundamental principles that govern the mechanical properties of advanced structural alloys.

Submitted 7 December, 2023; originally announced December 2023.

arXiv:2312.04606 [pdf, other]

Urban Region Representation Learning with Attentive Fusion

Authors: Fengze Sun, Jianzhong Qi, Yanchuan Chang, Xiaoliang Fan, Shanika Karunasekera, Egemen Tanin

Abstract: An increasing number of related urban data sources have brought forth novel opportunities for learning urban region representations, i.e., embeddings. The embeddings describe latent features of urban regions and enable discovering similar regions for urban planning applications. Existing methods learn an embedding for a region using every different type of region feature data, and subsequently fuse all learned embeddings of a region to generate a unified region embedding. However, these studies often overlook the significance of the fusion process. The typical fusion methods rely on simple aggregation, such as summation and concatenation, thereby disregarding correlations within the fused region embeddings. To address this limitation, we propose a novel model named HAFusion. Our model is powered by a dual-feature attentive fusion module named DAFusion, which fuses embeddings from different region features to learn higher-order correlations between the regions as well as between the different types of region features. DAFusion is generic - it can be integrated into existing models to enhance their fusion process. Further, motivated by the effective fusion capability of an attentive module, we propose a hybrid attentive feature learning module named HALearning to enhance the embedding learning from each individual type of region features. Extensive experiments on three real-world datasets demonstrate that our model HAFusion outperforms state-of-the-art methods across three different prediction tasks. Using our learned region embedding leads to consistent and up to 31% improvements in the prediction accuracy.

Submitted 26 April, 2024; v1 submitted 7 December, 2023; originally announced December 2023.

arXiv:2312.04547 [pdf, other]

Digital Life Project: Autonomous 3D Characters with Social Intelligence

Authors: Zhongang Cai, Jianping Jiang, Zhongfei Qing, Xinying Guo, Mingyuan Zhang, Zhengyu Lin, Haiyi Mei, Chen Wei, Ruisi Wang, Wanqi Yin, Xiangyu Fan, Han Du, Liang Pan, Peng Gao, Zhitao Yang, Yang Gao, Jiaqi Li, Tianxiang Ren, Yukun Wei, Xiaogang Wang, Chen Change Loy, Lei Yang, Ziwei Liu

Abstract: In this work, we present Digital Life Project, a framework utilizing language as the universal medium to build autonomous 3D characters, who are capable of engaging in social interactions and expressing with articulated body motions, thereby simulating life in a digital environment. Our framework comprises two primary components: 1) SocioMind: a meticulously crafted digital brain that models personalities with systematic few-shot exemplars, incorporates a reflection process based on psychology principles, and emulates autonomy by initiating dialogue topics; 2) MoMat-MoGen: a text-driven motion synthesis paradigm for controlling the character's digital body. It integrates motion matching, a proven industry technique to ensure motion quality, with cutting-edge advancements in motion generation for diversity. Extensive experiments demonstrate that each module achieves state-of-the-art performance in its respective domain. Collectively, they enable virtual characters to initiate and sustain dialogues autonomously, while evolving their socio-psychological states. Concurrently, these characters can perform contextually relevant bodily movements. Additionally, a motion captioning module further allows the virtual character to recognize and appropriately respond to human players' actions. Homepage: https://digital-life-project.com/

Submitted 7 December, 2023; originally announced December 2023.

Comments: Homepage: https://digital-life-project.com/

arXiv:2312.00851 [pdf, other]

Physics Inspired Criterion for Pruning-Quantization Joint Learning

Authors: Weiying Xie, Xiaoyi Fan, Xin Zhang, Yunsong Li, Jie Lei, Leyuan Fang

Abstract: Pruning-quantization joint learning always facilitates the deployment of deep neural networks (DNNs) on resource-constrained edge devices. However, most existing methods do not jointly learn a global criterion for pruning and quantization in an interpretable way. In this paper, we propose a novel physics inspired criterion for pruning-quantization joint learning (PIC-PQ), which is explored from an analogy we first draw between elasticity dynamics (ED) and model compression (MC). Specifically, derived from Hooke's law in ED, we establish a linear relationship between the filters' importance distribution and the filter property (FP) by a learnable deformation scale in the physics inspired criterion (PIC). Furthermore, we extend PIC with a relative shift variable for a global view. To ensure feasibility and flexibility, available maximum bitwidth and penalty factor are introduced in quantization bitwidth assignment. Experiments on benchmarks of image classification demonstrate that PIC-PQ yields a good trade-off between accuracy and bit-operations (BOPs) compression ratio (e.g., 54.96X BOPs compression ratio in ResNet56 on CIFAR10 with 0.10% accuracy drop and 53.24X in ResNet18 on ImageNet with 0.61% accuracy drop). The code will be available at https://github.com/fanxxxxyi/PIC-PQ.

Submitted 4 June, 2024; v1 submitted 1 December, 2023; originally announced December 2023.

arXiv:2311.13791 [pdf, ps, other]

Existence of the axisymmetric weak solution to the 3D isothermal stationary compressible Navier-Stokes equations

Authors: Xinyu Fan, Song Jiang

Abstract: In this paper, we construct the axisymmetric weak solutions to the 3D isothermal stationary compressible Navier-Stokes equations on the domain D, where the heat ratio equals 1 and the external force g satisfies certain cancellation conditions. We first establish the compactness assertion of the approximation solutions by assuming the total mass of the fluid is finite, then we exclude the trivial solution by imposing another type of restriction on the density. To deal with the singularities near the symmetric axis and at the far field, we derive proper weighted estimates of the L2-norm of the velocity field on the unbounded domain D, which belongs to the critical case of Sobolev's inequality.

Submitted 20 January, 2024; v1 submitted 22 November, 2023; originally announced November 2023.

arXiv:2311.12858 [pdf, other]

RAEDiff: Denoising Diffusion Probabilistic Models Based Reversible Adversarial Examples Self-Generation and Self-Recovery

Authors: Fan Xing, Xiaoyi Zhou, Xuefeng Fan, Zhuo Tian, Yan Zhao

Abstract: Collected and annotated datasets, which are obtained through extensive efforts, are effective for training Deep Neural Network (DNN) models. However, these datasets are susceptible to misuse by unauthorized users, resulting in infringement of Intellectual Property (IP) rights owned by the dataset creators. Reversible Adversarial Examples (RAE) can help to solve the issues of IP protection for datasets. RAEs are adversarially perturbed images that can be restored to the original. As a cutting-edge approach, the RAE scheme can serve the purposes of preventing unauthorized users from engaging in malicious model training, as well as ensuring the legitimate usage of authorized users. Nevertheless, in the existing work, RAEs still rely on embedded auxiliary information for restoration, which may compromise their adversarial abilities. In this paper, a novel self-generation and self-recovery method, named RAEDiff, is introduced for generating RAEs based on a Denoising Diffusion Probabilistic Model (DDPM). It diffuses datasets into a Biased Gaussian Distribution (BGD) and utilizes the prior knowledge of the DDPM for generating and recovering RAEs. The experimental results demonstrate that RAEDiff effectively self-generates adversarial perturbations for DNN models, including Artificial Intelligence Generated Content (AIGC) models, while also exhibiting significant self-recovery capabilities.

Submitted 24 October, 2023; originally announced November 2023.

arXiv:2311.07896 [pdf, other]

Bayesian Conditional Diffusion Models for Versatile Spatiotemporal Turbulence Generation

Authors: Han Gao, Xu Han, Xiantao Fan, Luning Sun, Li-Ping Liu, Lian Duan, Jian-Xun Wang

Abstract: Turbulent flows have historically presented formidable challenges to predictive computational modeling. Traditional numerical simulations often require vast computational resources, making them infeasible for numerous engineering applications. As an alternative, deep learning-based surrogate models have emerged, offering data-driven solutions. However, these are typically constructed within deterministic settings, leading to a shortfall in capturing the innate chaotic and stochastic behaviors of turbulent dynamics. We introduce a novel generative framework grounded in probabilistic diffusion models for versatile generation of spatiotemporal turbulence. Our method unifies both unconditional and conditional sampling strategies within a Bayesian framework, which can accommodate diverse conditioning scenarios, including those with a direct differentiable link between specified conditions and generated unsteady flow outcomes, and scenarios lacking such explicit correlations. A notable feature of our approach is the method proposed for long-span flow sequence generation, which is based on autoregressive gradient-based conditional sampling, eliminating the need for cumbersome retraining processes. We showcase the versatile turbulence generation capability of our framework through a suite of numerical experiments, including: 1) the synthesis of LES simulated instantaneous flow sequences from URANS inputs; 2) holistic generation of inhomogeneous, anisotropic wall-bounded turbulence, whether from given initial conditions, prescribed turbulence statistics, or entirely from scratch; 3) super-resolved generation of high-speed turbulent boundary layer flows from low-resolution data across a range of input resolutions. Collectively, our numerical experiments highlight the merit and transformative potential of the proposed methods, making a significant advance in the field of turbulence generation.

Submitted 13 November, 2023; originally announced November 2023.

Comments: 37 pages, 31 figures

arXiv:2311.07758 [pdf, other]

Synchrophasor Data Anomaly Detection on Grid Edge by 5G Communication and Adjacent Compute

Authors: Chuan Qin, Dexin Wang, Kishan Prudhvi Guddanti, Xiaoyuan Fan, Zhangshuan Hou

Abstract: The fifth-generation mobile communication (5G) technology offers opportunities to enhance the real-time monitoring of grids. The 5G-enabled phasor measurement units (PMUs) feature flexible positioning and cost-effective long-term maintenance without the constraints of fixing wires. This paper is the first to demonstrate the applicability of 5G in PMU communication, and the experiment was carried out at the Verizon non-standalone test-bed at the Pacific Northwest National Laboratory (PNNL) Advanced Wireless Communication lab. The performance of the 5G-enabled PMU communication setup is reviewed and discussed in this paper, and a generalized dynamic linear model (GDLM) based real-time synchrophasor data anomaly detection use-case is presented. Last but not least, the practicability of implementing 5G for wide-area protection strategies is explored and discussed by analyzing the experimental results.

Submitted 13 November, 2023; originally announced November 2023.

Comments: 5 pages, 4 figures

arXiv:2311.06398 [pdf]

Construction of Multi-Dimensional Functions for Optimization of Additive-Manufacturing Process Parameters

Authors: Baldur Steingrimsson, Ankur Agrawal, Xuesong Fan, Anand Kulkarni, Dan Thoma, Peter Liaw

Abstract: The authors present a generic framework for parameter optimization of additive manufacturing (AM) processes, one tailored to a high-throughput experimental methodology (HTEM). Given the large number of parameters, which impact the quality of AM-metallic components, the authors advocate for partitioning the AM parameter set into stages (tiers), based on their relative importance, modeling one tier at a time until successful, and then systematically expanding the framework. The authors demonstrate how the construction of multi-dimensional functions, based on neural networks (NN), can be applied to successfully model relative densities and Rockwell hardness obtained from HTEM testing of the Inconel 718 superalloy fabricated, using a powder-bed approach. The authors analyze the input data set, assess its suitability for predictions, and show how to optimize the framework for the multi-dimensional functional construction, such as to obtain the highest degree of fit with the input data. The novelty of the research work entails the versatile and scalable NN framework presented, suitable for use in conjunction with HTEM, for the AM parameter optimization of superalloys, and beyond.

Submitted 10 November, 2023; originally announced November 2023.

Comments: Submitted to the Journal of Additive Manufacturing on November 10, 2023

arXiv:2311.05098 [pdf, other]

XQz5: A New Ultraluminous z$\sim$5 Quasar Legacy Sample

Authors: Samuel Lai, Christopher Onken, Christian Wolf, Fuyan Bian, Xiaohui Fan

Abstract: Bright quasar samples at high redshift are useful for investigating active galactic nuclei evolution. In this study, we describe XQz5, a sample of 83 ultraluminous quasars in the redshift range $4.5 < z < 5.3$ with optical and near-infrared spectroscopic observations, with unprecedented completeness at the bright end of the quasar luminosity function. The sample is observed with the Southern Astrophysical Research Telescope, the Very Large Telescope, and the ANU 2.3m Telescope, resulting in a high-quality, moderate-resolution spectral atlas of the brightest known quasars within the redshift range. We use established virial mass relations to derive the black hole masses by measuring the observed Mg II $\lambda$2799 Å emission line and we estimate the bolometric luminosity with bolometric corrections to the UV continuum. Comparisons to literature samples show that XQz5 bridges the redshift gap between other X-shooter quasar samples, XQ-100 and XQR-30, and is a brighter sample than both. Luminosity-matched lower-redshift samples host more massive black holes, which indicates that quasars at high redshift are more active than their counterparts at lower redshift, in concordance with recent literature.

Submitted 8 November, 2023; originally announced November 2023.

Comments: 20 pages, 8 figures, 3 tables, accepted for publication

arXiv:2311.03815 [pdf, other]

Integrated Sensing, Communication, and Computing for Cost-effective Multimodal Federated Perception

Authors: Ning Chen, Zhipeng Cheng, Xuwei Fan, Bangzhen Huang, Yifeng Zhao, Lianfen Huang, Xiaojiang Du, Mohsen Guizani

Abstract: Federated learning (FL) is a classic paradigm of 6G edge intelligence (EI), which alleviates privacy leaks and high communication pressure caused by traditional centralized data processing in the artificial intelligence of things (AIoT). The implementation of multimodal federated perception (MFP) services involves three sub-processes, including sensing-based multimodal data generation, communication-based model transmission, and computing-based model training, ultimately relying on available underlying multi-domain physical resources such as time, frequency, and computing power. How to reasonably coordinate the multi-domain resources scheduling among sensing, communication, and computing, therefore, is crucial to the MFP networks. To address the above issues, this paper investigates service-oriented resource management with integrated sensing, communication, and computing (ISCC). With the incentive mechanism of the MFP service market, the resources management problem is redefined as a social welfare maximization problem, where the idea of "expanding resources" and "reducing costs" is used to improve learning performance gain and reduce resource costs. Experimental results demonstrate the effectiveness and robustness of the proposed resource scheduling mechanisms.

Submitted 7 November, 2023; originally announced November 2023.

arXiv:2311.03768 [pdf, other]

PT-Tuning: Bridging the Gap between Time Series Masked Reconstruction and Forecasting via Prompt Token Tuning

Authors: Hao Liu, Jinrui Gan, Xiaoxuan Fan, Yi Zhang, Chuanxian Luo, Jing Zhang, Guangxin Jiang, Yucheng Qian, Changwei Zhao, Huan Ma, Zhenyu Guo

Abstract: Self-supervised learning has been actively studied in the time series domain recently, especially for masked reconstruction. Most of these methods follow the "Pre-training + Fine-tuning" paradigm in which a new decoder replaces the pre-trained decoder to fit a specific downstream task, leading to inconsistency of upstream and downstream tasks. In this paper, we first point out that the unification of task objectives and adaptation for task difficulty are critical for bridging the gap between time series masked reconstruction and forecasting. By reserving the pre-trained mask token during the fine-tuning stage, the forecasting task can be taken as a special case of masked reconstruction, where the future values are masked and reconstructed based on history values. This guarantees the consistency of task objectives, but there is still a gap in task difficulty, because masked reconstruction can utilize contextual information while forecasting can only use historical information to reconstruct. To further mitigate the existing gap, we propose a simple yet effective prompt token tuning (PT-Tuning) paradigm, in which all pre-trained parameters are frozen and only a few trainable prompt tokens are added to extended mask tokens in an element-wise manner. Extensive experiments on real-world datasets demonstrate the superiority of our proposed paradigm with state-of-the-art performance compared to representation learning and end-to-end supervised forecasting methods.

Submitted 7 November, 2023; originally announced November 2023.

arXiv:2311.03681 [pdf, ps, other]

The Iteration Formula of (n,2,d) Full-correlated Multi-component Bell Function and Its Applications

Authors: Hui-Xian Meng, Yu Zhang, Xing-Yan Fan, Jie Zhou, Wei-Min Shang, Jing-Ling Chen

Abstract: It is very difficult and important to construct Bell inequalities for n-partite, k-settings of measurement, and d-dimensional (n,k,d) systems. Inspired by the iteration formula form of the Mermin-Ardehali-Belinskiĭ-Klyshko (MABK) inequality, we generalize the multi-component correlation functions for bipartite d-dimensional systems to n-partite ones, and construct the corresponding Bell inequality. The Collins-Gisin-Linden-Massar-Popescu inequality can be reproduced in this way. The most important result is that for prime d the general Bell function in full-correlated multi-component correlation function form for (n,2,d) systems can be reformulated in iteration formula by two full-correlated multi-component Bell functions for (n-1,2,d) systems. As applications, we recover the MABK inequality and the most robust coincidence Bell inequalities for (3,2,3), (4,2,3), (5,2,3), and (3,2,5) Bell scenarios with this iteration formula. This implies that the iteration formula is an efficient way of constructing multi-partite Bell inequalities. In addition, we also give some new Bell inequalities with the same robustness but inequivalent to the known ones.

Submitted 6 November, 2023; originally announced November 2023.

arXiv:2310.18797 [pdf, other]

New bounds and future prospects for axion force searches at Penning trap experiments

Authors: Xing Fan, Mario Reig

Abstract: In this note we consider Penning trap experiments as probes of axion-mediated forces. We show that the current measurement of the electron's $g$-factor already sets a new exclusion limit for monopole-dipole axion forces acting on the electron spin. We also show that the Penning trap's capability of switching an electron and a positron can isolate the effect of an axion force and suppress systematic effects.

Submitted 14 December, 2023; v1 submitted 28 October, 2023; originally announced October 2023.

Comments: 4 pages, 1 figure

arXiv:2310.15520 [pdf, ps, other]

Large-Time Behavior of the 2D Compressible Navier-Stokes System in Bounded Domains with Large Data and Vacuum

Authors: Xinyu Fan, Jing Li, Xue Wang

Abstract: The large time behavior of the unique strong solution to the barotropic compressible Navier-Stokes system is studied with large external forces and initial data, where the shear viscosity is a positive constant and the bulk one is proportional to a power of the density. Some uniform estimates on the $L^p$-norm of the density are established, from which we deduce that the density converges to its steady state in $L^p$-spaces, which transforms the large external force into a small one in some sense. Moreover, to deal with the obstacles brought by the boundary, the conformal mapping and the pull-back Green function are applied to give a point-wise representation of the effective viscous flux, and slip boundary conditions are then used to cancel out the singularity.

Submitted 24 October, 2023; originally announced October 2023.

Comments: 55 pages

arXiv:2310.15317 [pdf, other]

Exploring the Potential of Large Language Models in Generating Code-Tracing Questions for Introductory Programming Courses

Authors: Aysa Xuemo Fan, Ranran Haoran Zhang, Luc Paquette, Rui Zhang

Abstract: In this paper, we explore the application of large language models (LLMs) for generating code-tracing questions in introductory programming courses. We designed targeted prompts for GPT4, guiding it to generate code-tracing questions based on code snippets and descriptions. We established a set of human evaluation metrics to assess the quality of questions produced by the model compared to those created by human experts. Our analysis provides insights into the capabilities and potential of LLMs in generating diverse code-tracing questions. Additionally, we present a unique dataset of human and LLM-generated tracing questions, serving as a valuable resource for both the education and NLP research communities. This work contributes to the ongoing dialogue on the potential uses of LLMs in educational settings.

Submitted 23 October, 2023; originally announced October 2023.

Comments: Accepted by Findings of EMNLP, 2023

arXiv:2310.12713 [pdf, other]

Learn from the Past: A Proxy Guided Adversarial Defense Framework with Self Distillation Regularization

Authors: Yaohua Liu, Jiaxin Gao, Xianghao Jiao, Zhu Liu, Xin Fan, Risheng Liu

Abstract: Adversarial Training (AT), pivotal in fortifying the robustness of deep learning models, is extensively adopted in practical applications. However, prevailing AT methods, relying on direct iterative updates for target model's defense, frequently encounter obstacles such as unstable training and catastrophic overfitting. In this context, our work illuminates the potential of leveraging the target model's historical states as a proxy to provide effective initialization and defense prior, which results in a general proxy guided defense framework, 'LAST' (Learn from the Past). Specifically, LAST derives response of the proxy model as dynamically learned fast weights, which continuously corrects the update direction of the target model. Besides, we introduce a self-distillation regularized defense objective, ingeniously designed to steer the proxy model's update trajectory without resorting to external teacher models, thereby ameliorating the impact of catastrophic overfitting on performance. Extensive experiments and ablation studies showcase the framework's efficacy in markedly improving model robustness (e.g., up to 9.2% and 20.3% enhancement in robust accuracy on CIFAR10 and CIFAR100 datasets, respectively) and training stability. These improvements are consistently observed across various model architectures, larger datasets, perturbation sizes, and attack modalities, affirming LAST's ability to consistently refine both single-step and multi-step AT strategies. The code will be available at https://github.com/callous-youth/LAST.

Submitted 10 March, 2024; v1 submitted 19 October, 2023; originally announced October 2023.

Comments: 13 Pages

arXiv:2310.11227 [pdf, other]

RealBehavior: A Framework for Faithfully Characterizing Foundation Models' Human-like Behavior Mechanisms

Authors: Enyu Zhou, Rui Zheng, Zhiheng Xi, Songyang Gao, Xiaoran Fan, Zichu Fei, Jingting Ye, Tao Gui, Qi Zhang, Xuanjing Huang

Abstract: Reports of human-like behaviors in foundation models are growing, with psychological theories providing enduring tools to investigate these behaviors. However, current research tends to directly apply these human-oriented tools without verifying the faithfulness of their outcomes. In this paper, we introduce a framework, RealBehavior, which is designed to characterize the humanoid behaviors of models faithfully. Beyond simply measuring behaviors, our framework assesses the faithfulness of results based on reproducibility, internal and external consistency, and generalizability. Our findings suggest that a simple application of psychological tools cannot faithfully characterize all human-like behaviors. Moreover, we discuss the impacts of aligning models with human and social values, arguing for the necessity of diversifying alignment objectives to prevent the creation of models with restricted characteristics.

Submitted 17 October, 2023; originally announced October 2023.

Comments: Accepted to Findings of EMNLP 2023

arXiv:2310.11012 [pdf, other]

A Data-Driven Density Functional Model for Nuclear Systems

Authors: Zu-Xing Yang, Xiao-Hua Fan, Zhi-Pan Li, Haozhao Liang

Abstract: Through ensemble learning with multitasking and complex connection neural networks, we aggregated nuclear properties, including ground state charge radii, binding energies, and single-particle state information obtained from the Kohn-Sham auxiliary single-particle systems. Compared to traditional density functional theory, our model can more accurately characterize nuclear ground state information. Aiming at binding energy, the root mean square error is reduced to 450 keV. Although the complexity involving the nuclear interaction is skipped, the model has not completely devolved into a black box. Leveraging the correlation between densities and binding energies, we calculate the neutron skin thickness of $^{208}$Pb to be 0.223 fm. This model will advance our understanding of nuclear properties and accelerate the integration of machine learning into modern nuclear physics.

Submitted 17 October, 2023; originally announced October 2023.

arXiv:2310.10033 [pdf, other]

Deep Unfolding Network for Image Compressed Sensing by Content-adaptive Gradient Updating and Deformation-invariant Non-local Modeling

Authors: Wenxue Cui, Xiaopeng Fan, Jian Zhang, Debin Zhao

Abstract: Inspired by certain optimization solvers, the deep unfolding network (DUN) has attracted much attention in recent years for image compressed sensing (CS). However, there still exist the following two issues: 1) In existing DUNs, most hyperparameters are usually content independent, which greatly limits their adaptability for different input contents. 2) In each iteration, a plain convolutional neural network is usually adopted, which weakens the perception of wider context prior and therefore depresses the expressive ability. In this paper, inspired by the traditional Proximal Gradient Descent (PGD) algorithm, a novel DUN for image compressed sensing (dubbed DUN-CSNet) is proposed to solve the above two issues. Specifically, for the first issue, a novel content adaptive gradient descent network is proposed, in which a well-designed step size generation sub-network is developed to dynamically allocate the corresponding step sizes for different textures of input image by generating a content-aware step size map, realizing a content-adaptive gradient updating. For the second issue, considering the fact that many similar patches exist in an image but have undergone a deformation, a novel deformation-invariant non-local proximal mapping network is developed, which can adaptively build the long-range dependencies between the nonlocal patches by deformation-invariant non-local modeling, leading to a wider perception on context priors. Extensive experiments manifest that the proposed DUN-CSNet outperforms existing state-of-the-art CS methods by large margins.

Submitted 15 October, 2023; originally announced October 2023.

Comments: 16 pages, 13 figures. Accepted by IEEE Transactions on Multimedia (TMM)

arXiv:2310.09327 [pdf, other]

MAGNIF: A Tentative Lensed Rotating Disk at $z=8.34$ detected by JWST NIRCam WFSS with Dynamical Forward Modeling

Authors: Zihao Li, Zheng Cai, Fengwu Sun, Johan Richard, Maxime Trebitsch, Jakob M. Helton, Jose M. Diego, Masamune Oguri, Nicholas Foo, Xiaojing Lin, Franz Bauer, Chian-Chou Chen, Christopher J. Conselice, Daniel Espada, Eiichi Egami, Xiaohui Fan, Brenda L. Frye, Yoshinobu Fudamoto, Pablo G. Perez-Gonzalez, Kevin Hainline, Tiger Yu-Yang Hsiao, Zhiyuan Ji, Xiangyu Jin, Anton M. Koekemoer, Vasily Kokorev, et al. (17 additional authors not shown)

Abstract: We report galaxy MACS0416-Y3 behind the lensing cluster MACSJ0416.1-2403 as a tentative rotating disk at $z=8.34$ detected through its [OIII]$\lambda5007$ emission in JWST NIRCam wide-field slitless spectroscopic observations. The discovery is based on our new grism dynamical modeling methodology for JWST NIRCam slitless spectroscopy, using the data from "Median-band Astrophysics with the Grism of NIRCam in Frontier Fields" (MAGNIF), a JWST Cycle-2 program. The [OIII]$\lambda5007$ emission line morphology in grism data shows velocity offsets compared to the F480M direct imaging, suggestive of rotation. Assuming a geometrically thin disk model, we constrain the rotation velocity of $v_{\rm rot}=58^{+53}_{-35}$ km s$^{-1}$ via forward modeling of the two-dimensional (2D) spectrum. We obtain the kinematic ratio of $v_{\rm rot}/\sigma_v=1.6^{+1.9}_{-0.9}$, where $\sigma_v$ is the velocity dispersion, in line with a quasi-stable thin disk. The resulting dynamical mass is estimated to be $\log(M_{\rm dyn}/M_{\odot})=8.4^{+0.5}_{-0.7}$. If the rotation is confirmed, our discovery suggests that rotating gaseous disks may have already existed within 600 million years after the Big Bang.

Submitted 13 October, 2023; originally announced October 2023.

Comments: 15 pages, 6 figures. Comments welcome

arXiv:2310.09265 [pdf, other]

PromptRE: Weakly-Supervised Document-Level Relation Extraction via Prompting-Based Data Programming

Authors: Chufan Gao, Xulin Fan, Jimeng Sun, Xuan Wang

Abstract: Relation extraction aims to classify the relationships between two entities into pre-defined categories. While previous research has mainly focused on sentence-level relation extraction, recent studies have expanded the scope to document-level relation extraction. Traditional relation extraction methods heavily rely on human-annotated training data, which is time-consuming and labor-intensive. To mitigate the need for manual annotation, recent weakly-supervised approaches have been developed for sentence-level relation extraction while limited work has been done on document-level relation extraction. Weakly-supervised document-level relation extraction faces significant challenges due to an imbalanced number of "no relation" instances and the failure of directly probing pretrained large language models for document relation extraction. To address these challenges, we propose PromptRE, a novel weakly-supervised document-level relation extraction method that combines prompting-based techniques with data programming. Furthermore, PromptRE incorporates the label distribution and entity types as prior knowledge to improve the performance. By leveraging the strengths of both prompting and data programming, PromptRE achieves improved performance in relation classification and effectively handles the "no relation" problem. Experimental results on ReDocRED, a benchmark dataset for document-level relation extraction, demonstrate the superiority of PromptRE over baseline approaches.

Submitted 13 October, 2023; originally announced October 2023.

arXiv:2310.08804 [pdf, other]

Spiking Semantic Communication for Feature Transmission with HARQ

Authors: Mengyang Wang, Jiahui Li, Mengyao Ma, Xiaopeng Fan

Abstract: In Collaborative Intelligence (CI), the Artificial Intelligence (AI) model is divided between the edge and the cloud, with intermediate features being sent from the edge to the cloud for inference. Several deep learning-based Semantic Communication (SC) models have been proposed to reduce feature transmission overhead and mitigate channel noise interference. Previous research has demonstrated that Spiking Neural Network (SNN)-based SC models exhibit greater robustness on digital channels compared to Deep Neural Network (DNN)-based SC models. However, the existing SNN-based SC models require fixed time steps, resulting in fixed transmission bandwidths that cannot be adaptively adjusted based on channel conditions. To address this issue, this paper introduces a novel SC model called SNN-SC-HARQ, which combines the SNN-based SC model with the Hybrid Automatic Repeat Request (HARQ) mechanism. SNN-SC-HARQ comprises an SNN-based SC model that supports the transmission of features at varying bandwidths, along with a policy model that determines the appropriate bandwidth. Experimental results show that SNN-SC-HARQ can dynamically adjust the bandwidth according to the channel conditions without performance loss.

Submitted 12 October, 2023; originally announced October 2023.

arXiv:2310.06342 [pdf, other]

Contrastive Prompt Learning-based Code Search based on Interaction Matrix

Authors: Yubo Zhang, Yanfang Liu, Xinxin Fan, Yunfeng Lu

Abstract: Code search aims to retrieve the code snippet that highly matches the given query described in natural language. Recently, many code pre-training approaches have demonstrated impressive performance on code search. However, existing code search methods still suffer from two performance constraints: inadequate semantic representation and the semantic gap between natural language (NL) and programming language (PL). In this paper, we propose CPLCS, a contrastive prompt learning-based code search method based on the cross-modal interaction mechanism. CPLCS comprises: (1) PL-NL contrastive learning, which learns the semantic matching relationship between PL and NL representations; (2) a prompt learning design for a dual-encoder structure that can alleviate the problem of inadequate semantic representation; (3) a cross-modal interaction mechanism to enhance the fine-grained mapping between NL and PL. We conduct extensive experiments to evaluate the effectiveness of our approach on a real-world dataset across six programming languages. The experiment results demonstrate the efficacy of our approach in improving semantic representation quality and mapping ability between PL and NL.

Submitted 10 October, 2023; originally announced October 2023.

arXiv:2310.06291 [pdf, other]

Three-Dimensional Medical Image Fusion with Deformable Cross-Attention

Authors: Lin Liu, Xinxin Fan, Chulong Zhang, Jingjing Dai, Yaoqin Xie, Xiaokun Liang

Abstract: Multimodal medical image fusion plays an instrumental role in several areas of medical image processing, particularly in disease recognition and tumor detection. Traditional fusion methods tend to process each modality independently before combining the features and reconstructing the fusion image. However, this approach often neglects the fundamental commonalities and disparities between multimodal information. Furthermore, the prevailing methodologies are largely confined to fusing two-dimensional (2D) medical image slices, leading to a lack of contextual supervision in the fusion images and subsequently, a decreased information yield for physicians relative to three-dimensional (3D) images. In this study, we introduce an innovative unsupervised feature mutual learning fusion network designed to rectify these limitations. Our approach incorporates a Deformable Cross Feature Blend (DCFB) module that facilitates the dual modalities in discerning their respective similarities and differences. We have applied our model to the fusion of 3D MRI and PET images obtained from 660 patients in the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset. Through the application of the DCFB module, our network generates high-quality MRI-PET fusion images. Experimental results demonstrate that our method surpasses traditional 2D image fusion methods in performance metrics such as Peak Signal to Noise Ratio (PSNR) and Structural Similarity Index Measure (SSIM). Importantly, the capacity of our method to fuse 3D images enhances the information available to physicians and researchers, thus marking a significant step forward in the field. The code will soon be available online.

Submitted 10 October, 2023; originally announced October 2023.

arXiv:2310.04992 [pdf, other]

VisionFM: a Multi-Modal Multi-Task Vision Foundation Model for Generalist Ophthalmic Artificial Intelligence

Authors: Jianing Qiu, Jian Wu, Hao Wei, Peilun Shi, Minqing Zhang, Yunyun Sun, Lin Li, Hanruo Liu, Hongyi Liu, Simeng Hou, Yuyang Zhao, Xuehui Shi, Junfang Xian, Xiaoxia Qu, Sirui Zhu, Lijie Pan, Xiaoniao Chen, Xiaojia Zhang, Shuai Jiang, Kebing Wang, Chenlong Yang, Mingqiang Chen, Sujie Fan, Jianhua Hu, Aiguo Lv, et al. (17 additional authors not shown)

Abstract: We present VisionFM, a foundation model pre-trained with 3.4 million ophthalmic images from 560,457 individuals, covering a broad range of ophthalmic diseases, modalities, imaging devices, and demography. After pre-training, VisionFM provides a foundation to foster multiple ophthalmic artificial intelligence (AI) applications, such as disease screening and diagnosis, disease prognosis, subclassification of disease phenotype, and systemic biomarker and disease prediction, with each application enhanced with expert-level intelligence and accuracy. The generalist intelligence of VisionFM outperformed ophthalmologists with basic and intermediate levels in jointly diagnosing 12 common ophthalmic diseases. Evaluated on a new large-scale ophthalmic disease diagnosis benchmark database, as well as a new large-scale segmentation and detection benchmark database, VisionFM outperformed strong baseline deep neural networks. The ophthalmic image representations learned by VisionFM exhibited noteworthy explainability, and demonstrated strong generalizability to new ophthalmic modalities, disease spectrum, and imaging devices. As a foundation model, VisionFM has a large capacity to learn from diverse ophthalmic imaging data and disparate datasets. To be commensurate with this capacity, in addition to the real data used for pre-training, we also generated and leveraged synthetic ophthalmic imaging data. Experimental results revealed that synthetic data that passed visual Turing tests can also enhance the representation learning capability of VisionFM, leading to substantial performance gains on downstream ophthalmic AI tasks. Beyond the ophthalmic AI applications developed, validated, and demonstrated in this work, substantial further applications can be achieved in an efficient and cost-effective manner using VisionFM as the foundation.

Submitted 7 October, 2023; originally announced October 2023.

arXiv:2310.03796 [pdf]

doi 10.3847/1538-4357/ad00b8

Searching for [CII] Emission from the First Sample of $z\sim 6$ OI Absorption-Associated Galaxies with ALMA

Authors: Yunjing Wu, Zheng Cai, Jianan Li, Kristian Finlator, Marcel Neeleman, J. Xavier Prochaska, Bjorn H. C. Emonts, Shiwu Zhang, Feige Wang, Jinyi Yang, Ran Wang, Xiaohui Fan, Dandan Xu, Emmet Golden-Marx, Laura C. Keating, Joseph F. Hennawi

Abstract: We report the first statistical analyses of [CII] and dust continuum observations in six strong OI absorber fields at the end of the reionization epoch obtained by the Atacama Large Millimeter/Submillimeter Array (ALMA). Combined with one [CII] emitter reported in Wu et al. (2021), we detect one OI-associated [CII] emitter in six fields. At redshifts of OI-absorbers in non-detection fields, no emitters are brighter than our detection limit within impact parameters of 50 kpc and velocity offsets between $\pm200\ {\rm km\ s^{-1}}$. The averaged [CII]-detection upper limit is $< 0.06$ Jy ${\rm km\ s^{-1}}$ (3$\sigma$), corresponding to the [CII] luminosity of $L_{\rm [CII]} <5.8\times 10^7\ L_{\odot}$ and the [CII]-based star formation rate of ${\rm SFR_{\rm [CII]}} < 5.5$ $M_\odot$ yr$^{-1}$. Cosmological simulations suggest that only $\sim10^{-2.5}$ [CII] emitters around [OI] absorbers have comparable SFR to our detection limit. Although the detection in one out of six fields is reported, an order of magnitude number excess of emitters obtained from our ALMA observations supports that the contribution of massive galaxies that caused the metal enrichment cannot be ignored. Further, we also found 14 tentative galaxy candidates with S/N of $\approx4.3$ at large impact parameters ($>50$ kpc) and having larger outflow velocities within $\pm 600$ km s$^{-1}$. If these detections are confirmed in the future, then the mechanism of pushing metals at larger distances with higher velocities needs to be further explored from the theoretical side.

Submitted 8 November, 2023; v1 submitted 5 October, 2023; originally announced October 2023.

Comments: Accepted for publication in ApJ. Main text 10 pages, 5 figures

arXiv:2310.00296 [pdf, other]

QUIZ: An Arbitrary Volumetric Point Matching Method for Medical Image Registration

Authors: Lin Liu, Xinxin Fan, Haoyang Liu, Chulong Zhang, Weibin Kong, Jingjing Dai, Yuming Jiang, Yaoqin Xie, Xiaokun Liang

Abstract: Rigid pre-registration involving local-global matching or other large deformation scenarios is crucial. Current popular methods rely on unsupervised learning based on grayscale similarity, but under circumstances where different poses lead to varying tissue structures, or where image quality is poor, these methods tend to exhibit instability and inaccuracies. In this study, we propose a novel method for medical image registration based on arbitrary voxel point of interest matching, called query point quizzer (QUIZ). QUIZ focuses on the correspondence between local-global matching points, specifically employing CNN for feature extraction and utilizing the Transformer architecture for global point matching queries, followed by applying average displacement for local image rigid transformation. We have validated this approach on a large deformation dataset of cervical cancer patients, with results indicating substantially smaller deviations compared to state-of-the-art methods. Remarkably, even for cross-modality subjects, it achieves results surpassing the current state-of-the-art.

Submitted 30 September, 2023; originally announced October 2023.

arXiv:2309.16757 [pdf]

doi 10.3847/2041-8213/acfee3

A SPectroscopic survey of biased halos In the Reionization Era (ASPIRE): JWST Discovers an Overdensity around a Metal Absorption-selected Galaxy at $z\sim5.5$

Authors: Yunjing Wu, Feige Wang, Zheng Cai, Xiaohui Fan, Kristian Finlator, Jinyi Yang, Joseph F. Hennawi, Fengwu Sun, Jaclyn B. Champagne, Xiaojing Lin, Zihao Li, Zuyi Chen, Eduardo Bañados, George D. Becker, Sarah E. I. Bosman, Gustavo Bruzual, Stephane Charlot, Hsiao-Wen Chen, Jacopo Chevallard, Anna-Christina Eilers, Emanuele Paolo Farina, Xiangyu Jin, Hyunsung D. Jun, Koki Kakiichi, Mingyu Li, et al. (5 additional authors not shown)

Abstract: The launch of ${\it JWST}$ opens a new window for studying the connection between metal-line absorbers and galaxies at the end of the Epoch of Reionization (EoR). Previous studies have detected absorber-galaxy pairs in limited quantities through ground-based observations. To enhance our understanding of the relationship between absorbers and their host galaxies at $z>5$, we utilized the NIRCam Wide Field Slitless Spectroscopy (WFSS) to search for absorber-associated galaxies by detecting their rest-frame optical emission lines (e.g., [OIII] + H$β$). We report the discovery of a MgII-associated galaxy at $z=5.428$ using data from the ${\it JWST}$ ASPIRE program. The MgII absorber is detected on the spectrum of quasar J0305--3150 with a rest-frame equivalent width of 0.74$\mathring{A}$. The associated galaxy has an [OIII] luminosity of $10^{42.5}\ {\rm erg\ s^{-1}}$ with an impact parameter of 24.9 proper kiloparsecs (pkpc). The joint ${\it HST}$-${\it JWST}$ spectral energy distribution (SED) implies a stellar mass and star-formation rate of ${\rm M_* \approx 10^{8.8}}$ ${\rm M_{\odot}}$, ${\rm SFR}\approx 10\ {\rm M_{\odot}\ yr^{-1}}$. Its [OIII] equivalent width and stellar mass are typical of [OIII] emitters at this redshift. Furthermore, connecting the outflow starting time to the SED-derived stellar age, the outflow velocity of this galaxy is $\sim300\ {\rm km\ s^{-1}}$, consistent with theoretical expectations.
We identified six additional [OIII] emitters with impact parameters of up to $\sim300$ pkpc at similar redshifts ($|dv|<1000\ {\rm km\ s^{-1}}$). The observed number is consistent with that in cosmological simulations. This pilot study suggests that systematically investigating the absorber-galaxy connection within the ASPIRE program will provide insights into the metal-enrichment history in the early universe.

Submitted 8 November, 2023; v1 submitted 28 September, 2023; originally announced September 2023.

Comments: Accepted for publication in ApJL. Main text 8 pages, 4 figures. For more information of the JWST ASPIRE program please check https://aspire-quasar.github.io/index.html

arXiv:2309.08189 [pdf, ps, other]

Rates of convergence in the distances of Kolmogorov and Wasserstein for standardized martingales

Authors: Xiequan Fan, Zhonggen Su

Abstract: We give some rates of convergence in the distances of Kolmogorov and Wasserstein for standardized martingales with differences having finite variances. For the Kolmogorov distances, we present some exact Berry-Esseen bounds for martingales, which generalize some Berry-Esseen bounds due to Bolthausen. For the Wasserstein distance, with Stein's method and Lindeberg's telescoping sum argument, the rates of convergence in martingale central limit theorems recover the classical rates for sums of i.i.d. random variables, and therefore they are believed to be optimal.

Submitted 15 September, 2023; originally announced September 2023.

Comments: 31 pages

MSC Class: Primary 60G42; 60F05; Secondary 60E15

arXiv:2309.07864 [pdf, other]

The Rise and Potential of Large Language Model Based Agents: A Survey

Authors: Zhiheng Xi, Wenxiang Chen, Xin Guo, Wei He, Yiwen Ding, Boyang Hong, Ming Zhang, Junzhe Wang, Senjie Jin, Enyu Zhou, Rui Zheng, Xiaoran Fan, Xiao Wang, Limao Xiong, Yuhao Zhou, Weiran Wang, Changhao Jiang, Yicheng Zou, Xiangyang Liu, Zhangyue Yin, Shihan Dou, Rongxiang Weng, Wensen Cheng, Qi Zhang, Wenjuan Qin, et al. (4 additional authors not shown)

Abstract: For a long time, humanity has pursued artificial intelligence (AI) equivalent to or surpassing the human level, with AI agents considered a promising vehicle for this pursuit. AI agents are artificial entities that sense their environment, make decisions, and take actions. Many efforts have been made to develop intelligent agents, but they mainly focus on advancement in algorithms or training strategies to enhance specific capabilities or performance on particular tasks. Actually, what the community lacks is a general and powerful model to serve as a starting point for designing AI agents that can adapt to diverse scenarios. Due to the versatile capabilities they demonstrate, large language models (LLMs) are regarded as potential sparks for Artificial General Intelligence (AGI), offering hope for building general AI agents. Many researchers have leveraged LLMs as the foundation to build AI agents and have achieved significant progress. In this paper, we perform a comprehensive survey on LLM-based agents. We start by tracing the concept of agents from its philosophical origins to its development in AI, and explain why LLMs are suitable foundations for agents. Building upon this, we present a general framework for LLM-based agents, comprising three main components: brain, perception, and action, and the framework can be tailored for different applications. Subsequently, we explore the extensive applications of LLM-based agents in three aspects: single-agent scenarios, multi-agent scenarios, and human-agent cooperation.
Following this, we delve into agent societies, exploring the behavior and personality of LLM-based agents, the social phenomena that emerge from an agent society, and the insights they offer for human society. Finally, we discuss several key topics and open problems within the field. A repository for the related papers at https://github.com/WooooDyy/LLM-Agent-Paper-List.

Submitted 19 September, 2023; v1 submitted 14 September, 2023; originally announced September 2023.

Comments: 86 pages, 12 figures

Showing 101–150 of 1,411 results for author: Fan, X