Search | arXiv e-print repository

A Semantic-Aware and Multi-Guided Network for Infrared-Visible Image Fusion

Authors: Xiaoli Zhang, Liying Wang, Libo Zhao, Xiongfei Li, Siwei Ma

Abstract: Multi-modality image fusion aims at fusing specific-modality and shared-modality information from two source images. To tackle the problem of insufficient feature extraction and lack of semantic awareness for complex scenes, this paper focuses on how to model correlation-driven decomposing features and reason high-level graph representation by efficiently extracting complementary features and multi-guided feature aggregation. We propose a three-branch encoder-decoder architecture along with corresponding fusion layers as the fusion strategy. The transformer with Multi-Dconv Transposed Attention and Local-enhanced Feed Forward network is used to extract shallow features after the depthwise convolution. In the three parallel branches encoder, Cross Attention and Invertible Block (CAI) enables to extract local features and preserve high-frequency texture details. Base feature extraction module (BFE) with residual connections can capture long-range dependency and enhance shared-modality expression capabilities. Graph Reasoning Module (GR) is introduced to reason high-level cross-modality relations and extract low-level details features as CAI's specific-modality complementary information simultaneously. Experiments demonstrate that our method has obtained competitive results compared with state-of-the-art methods in visible/infrared image fusion and medical image fusion tasks.
Moreover, we surpass other fusion methods in terms of subsequent tasks, averagely scoring 9.78% [email protected] higher in object detection and 6.46% mIoU higher in semantic segmentation.

Submitted 11 June, 2024; originally announced July 2024.

arXiv:2407.05681 [pdf]

Bulk high-temperature superconductivity in the high-pressure tetragonal phase of bilayer La2PrNi2O7

Authors: Ningning Wang, Gang Wang, Xiaoling Shen, Jun Hou, Jun Luo, ** Ma, Huaixin Yang, Lifen Shi, Jie Dou, Jie Feng, Jie Yang, Yunqing Shi, Zhian Ren, Hanming Ma, Pengtao Yang, Ziyi Liu, Yue Liu, Hua Zhang, Xiaoli Dong, Yuxin Wang, Kun Jiang, Jiang** Hu, Stuart Calder, Jiaqiang Yan, Jian** Sun, et al. (4 additional authors not shown)

Abstract: The Ruddlesden-Popper (R-P) bilayer nickelate, La3Ni2O7, was recently found to show signatures of high-temperature superconductivity (HTSC) at pressures above 14 GPa. Subsequent investigations achieved zero resistance in single- and poly-crystalline samples under hydrostatic pressure conditions. Yet, obvious diamagnetic signals, the other hallmark of superconductors, are still lacking owing to the filamentary nature with low superconducting volume fraction. The presence of a novel "1313" polymorph and competing R-P phases obscured proper identification of the phase for HTSC. Thus, achieving bulk HTSC and identifying the phase at play are the most prominent tasks at present. Here, we address these issues in the praseodymium (Pr)-doped La2PrNi2O7 polycrystalline samples. We find that the substitutions of Pr for La effectively inhibits the intergrowth of different R-P phases, resulting in nearly pure bilayer structure. For La2PrNi2O7, pressure-induced orthorhombic-to-tetragonal structural transition takes place at Pc ~ 11 GPa, above which HTSC emerges gradually upon further compression. The superconducting transition temperatures at 18-20 GPa reach Tconset = 82.5 K and Tczero = 60 K, which are the highest values among known nickelate superconductors. More importantly, bulk HTSC was testified by detecting clear diamagnetic signals below ~75 K corresponding to an estimated superconducting volume fraction ~ 57(5)% at 20 GPa. Our results not only resolve the existing controversies but also illuminate directions for exploring bulk HTSC in the bilayer nickelates.

Submitted 8 July, 2024; originally announced July 2024.

arXiv:2407.05356 [pdf, ps, other]

Extended mean-field control problems with Poissonian common noise: Stochastic maximum principle and Hamiltonian-Jacobi-Bellman equation

Authors: Lijun Bo, **gfei Wang, Xiaoli Wei, Xiang Yu

Abstract: This paper studies the extended mean-field control problems with state-control joint law dependence and Poissonian common noise. We develop the stochastic maximum principle (SMP) and establish the connection to the Hamiltonian-Jacobi-Bellman (HJB) equation on the Wasserstein space. The presence of the conditional joint law in the McKean-Vlasov dynamics and its discontinuity caused by the Poissonian common noise bring us new technical challenges. To develop the SMP when the control domain is not necessarily convex, we first consider a strong relaxed control formulation that allows us to perform the first-order variation. We also propose the technique of extension transformation to overcome the compatibility issues arising from the joint law in the relaxed control formulation. By further establishing the equivalence between the relaxed control formulation and the strict control formulation, we obtain the SMP for the original problem in the strict control formulation. In the part to investigate the HJB equation, we formulate an auxiliary control problem subjecting to a controlled measure-valued dynamics with Poisson jumps, which allows us to derive the HJB equation of the original problem through an equivalence argument. We also show the connection between the SMP and HJB equation and give an illustrative example of linear quadratic extended mean-field control with Poissonian common noise.

Submitted 7 July, 2024; originally announced July 2024.

Comments: Keywords: Extended mean-field control, Poissonian common noise, relaxed control formulation, stochastic maximum principle, HJB equation

arXiv:2407.04521 [pdf, ps, other]

Unified continuous-time q-learning for mean-field game and mean-field control problems

Authors: Xiaoli Wei, Xiang Yu, Fengyi Yuan

Abstract: This paper studies the continuous-time q-learning in the mean-field jump-diffusion models from the representative agent's perspective. To overcome the challenge when the population distribution may not be directly observable, we introduce the integrated q-function in decoupled form (decoupled Iq-function) and establish its martingale characterization together with the value function, which provides a unified policy evaluation rule for both mean-field game (MFG) and mean-field control (MFC) problems. Moreover, depending on the task to solve the MFG or MFC problem, we can employ the decoupled Iq-function by different means to learn the mean-field equilibrium policy or the mean-field optimal policy respectively. As a result, we devise a unified q-learning algorithm for both MFG and MFC problems by utilizing all test policies stemming from the mean-field interactions. For several examples in the jump-diffusion setting, within and beyond the LQ framework, we can obtain the exact parameterization of the decoupled Iq-functions and the value functions, and illustrate our algorithm from the representative agent's perspective with satisfactory performance.

Submitted 5 July, 2024; originally announced July 2024.

arXiv:2407.02098 [pdf, other]

DM3D: Distortion-Minimized Weight Pruning for Lossless 3D Object Detection

Authors: Kaixin Xu, Qingtian Feng, Hao Chen, Zhe Wang, Xue Geng, Xulei Yang, Min Wu, Xiaoli Li, Weisi Lin

Abstract: Applying deep neural networks to 3D point cloud processing has attracted increasing attention due to its advanced performance in many areas, such as AR/VR, autonomous driving, and robotics. However, as neural network models and 3D point clouds expand in size, it becomes a crucial challenge to reduce the computational and memory overhead to meet latency and energy constraints in real-world applications. Although existing approaches have proposed to reduce both computational cost and memory footprint, most of them only address the spatial redundancy in inputs, i.e. removing the redundancy of background points in 3D data. In this paper, we propose a novel post-training weight pruning scheme for 3D object detection that is (1) orthogonal to all existing point cloud sparsifying methods, which determines redundant parameters in the pretrained model that lead to minimal distortion in both locality and confidence (detection distortion); and (2) a universal plug-and-play pruning framework that works with arbitrary 3D detection model. This framework aims to minimize detection distortion of network output to maximally maintain detection precision, by identifying layer-wise sparsity based on second-order Taylor approximation of the distortion. Albeit utilizing second-order information, we introduced a lightweight scheme to efficiently acquire Hessian information, and subsequently perform dynamic programming to solve the layer-wise sparsity.
Extensive experiments on KITTI, Nuscenes and ONCE datasets demonstrate that our approach is able to maintain and even boost the detection precision on pruned model under noticeable computation reduction (FLOPs). Noticeably, we achieve over 3.89x, 3.72x FLOPs reduction on CenterPoint and PVRCNN model, respectively, without mAP decrease, significantly improving the state-of-the-art.

Submitted 2 July, 2024; originally announced July 2024.

arXiv:2407.02068 [pdf, other]

LPViT: Low-Power Semi-structured Pruning for Vision Transformers

Authors: Kaixin Xu, Zhe Wang, Chunyun Chen, Xue Geng, Jie Lin, Xulei Yang, Min Wu, Xiaoli Li, Weisi Lin

Abstract: Vision transformers have emerged as a promising alternative to convolutional neural networks for various image analysis tasks, offering comparable or superior performance. However, one significant drawback of ViTs is their resource-intensive nature, leading to increased memory footprint, computation complexity, and power consumption. To democratize this high-performance technology and make it more environmentally friendly, it is essential to compress ViT models, reducing their resource requirements while maintaining high performance. In this paper, we introduce a new block-structured pruning to address the resource-intensive issue for ViTs, offering a balanced trade-off between accuracy and hardware acceleration. Unlike unstructured pruning or channel-wise structured pruning, block pruning leverages the block-wise structure of linear layers, resulting in more efficient matrix multiplications. To optimize this pruning scheme, our paper proposes a novel hardware-aware learning objective that simultaneously maximizes speedup and minimizes power consumption during inference, tailored to the block sparsity structure. This objective eliminates the need for empirical look-up tables and focuses solely on reducing parametrized layer connections. Moreover, our paper provides a lightweight algorithm to achieve post-training pruning for ViTs, utilizing second-order Taylor approximation and empirical optimization to solve the proposed hardware-aware objective.
Extensive experiments on ImageNet are conducted across various ViT architectures, including DeiT-B and DeiT-S, demonstrating competitive performance with other pruning methods and achieving a remarkable balance between accuracy preservation and power savings. Especially, we achieve up to 3.93x and 1.79x speedups on dedicated hardware and GPUs respectively for DeiT-B, and also observe an inference power reduction by 1.4x on real-world GPUs.

Submitted 6 July, 2024; v1 submitted 2 July, 2024; originally announced July 2024.

arXiv:2407.01941 [pdf, other]

doi 10.1007/s11207-024-02323-w

Origin of the Chromospheric Umbral Waves in Sunspots

Authors: Xinsheng Zhang, Xiaoli Yan, Zhike Xue, **cheng Wang, Zhe Xu, Qiaoling Li, Yang Peng, Li** Yang

Abstract: Oscillations are ubiquitous in sunspots and the associated higher atmospheres. However, it is still unclear whether these oscillations are driven by the external acoustic waves (p-modes) or generated by the internal magnetoconvection. To obtain clues about the driving source of umbral waves in sunspots, we analyzed the spiral wave patterns (SWPs) in two sunspots registered by IRIS MgII 2796 Å slit-jaw images. By tracking the motion of the SWPs, we find for the first time that two one-armed SWPs coexist in the umbra, and they can rotate either in the same or opposite directions. Furthermore, by analyzing the spatial distribution of the oscillation centers of the one-armed SWPs within the umbra (the oscillation center is defined as the location where the SWP first appears), we find that the chromospheric umbral waves repeatedly originate from the regions with high oscillation power and most of the umbral waves occur in the dark nuclei and strong magnetic field regions of the umbra. Our study results indicate that the chromospheric umbral waves are likely excited by the p-mode oscillations.

Submitted 3 July, 2024; v1 submitted 2 July, 2024; originally announced July 2024.

arXiv:2406.19027 [pdf, ps, other]

Multiple solutions for a class of nonhomogeneous elliptic systems with Dirichlet boundary or Neumann boundary

Authors: Xiaoli Yu, Xingyong Zhang

Abstract: In this paper, we mainly establish the existence of at least three non-trivial solutions for a class of nonhomogeneous quasilinear elliptic systems with Dirichlet boundary value or Neumann boundary value in a bounded domain $Ω\subset\mathbb{R}^N $ and $N\geq 1$. We exploit the method which is based on [6]. This method let us obtain the concrete open interval about the parameter $λ$. Since the quasilinear term depends on $u$ and $\nabla u$, it is necessary for our proofs to use the theory of monotone operators and the skill of adding one dimension to space.

Submitted 27 June, 2024; originally announced June 2024.

arXiv:2406.08765 [pdf, other]

LLM-based Knowledge Pruning for Time Series Data Analytics on Edge-computing Devices

Authors: Ruibing **, Qing Xu, Min Wu, Yuecong Xu, Dan Li, Xiaoli Li, Zhenghua Chen

Abstract: Limited by the scale and diversity of time series data, the neural networks trained on time series data often overfit and show unsatisfactory performances. In comparison, large language models (LLMs) recently exhibit impressive generalization in diverse fields. Although massive LLM based approaches are proposed for time series tasks, these methods require to load the whole LLM in both training and reference. This high computational demands limit practical applications in resource-constrained settings, like edge-computing and IoT devices. To address this issue, we propose Knowledge Pruning (KP), a novel paradigm for time series learning in this paper. For a specific downstream task, we argue that the world knowledge learned by LLMs is much redundant and only the related knowledge termed as "pertinent knowledge" is useful. Unlike other methods, our KP targets to prune the redundant knowledge and only distill the pertinent knowledge into the target model. This reduces model size and computational costs significantly. Additionally, different from existing LLM based approaches, our KP does not require to load the LLM in the process of training and testing, further easing computational burdens. With our proposed KP, a lightweight network can effectively learn the pertinent knowledge, achieving satisfactory performances with a low computation cost.
To verify the effectiveness of our KP, two fundamental tasks on edge-computing devices are investigated in our experiments, where eight diverse environments or benchmarks with different networks are used to verify the generalization of our KP. Through experiments, our KP demonstrates effective learning of pertinent knowledge, achieving notable performance improvements in regression (19.7% on average) and classification (up to 13.7%) tasks, showcasing state-of-the-art results.

Submitted 12 June, 2024; originally announced June 2024.

Comments: 12 pages, 5 figures

arXiv:2406.05485 [pdf, other]

Training-Free Robust Interactive Video Object Segmentation

Authors: Xiaoli Wei, Zhaoqing Wang, Yandong Guo, Chunxia Zhang, Tongliang Liu, Mingming Gong

Abstract: Interactive video object segmentation is a crucial video task, having various applications from video editing to data annotating. However, current approaches struggle to accurately segment objects across diverse domains. Recently, Segment Anything Model (SAM) introduces interactive visual prompts and demonstrates impressive performance across different domains. In this paper, we propose a training-free prompt tracking framework for interactive video object segmentation (I-PT), leveraging the powerful generalization of SAM. Although point tracking efficiently captures the pixel-wise information of objects in a video, points tend to be unstable when tracked over a long period, resulting in incorrect segmentation. Towards fast and robust interaction, we jointly adopt sparse points and boxes tracking, filtering out unstable points and capturing object-wise information. To better integrate reference information from multiple interactions, we introduce a cross-round space-time module (CRSTM), which adaptively aggregates mask features from previous rounds and frames, enhancing the segmentation stability. Our framework has demonstrated robust zero-shot video segmentation results on popular VOS datasets with interaction types, including DAVIS 2017, YouTube-VOS 2018, and MOSE 2023, maintaining a good tradeoff between performance and interaction time.

Submitted 8 June, 2024; originally announced June 2024.

arXiv:2406.02635 [pdf, other]

Evidentially Calibrated Source-Free Time-Series Domain Adaptation with Temporal Imputation

Authors: Mohamed Ragab, Peiliang Gong, Emadeldeen Eldele, Wenyu Zhang, Min Wu, Chuan-Sheng Foo, Daoqiang Zhang, Xiaoli Li, Zhenghua Chen

Abstract: Source-free domain adaptation (SFDA) aims to adapt a model pre-trained on a labeled source domain to an unlabeled target domain without access to source data, preserving the source domain's privacy. While SFDA is prevalent in computer vision, it remains largely unexplored in time series analysis. Existing SFDA methods, designed for visual data, struggle to capture the inherent temporal dynamics of time series, hindering adaptation performance. This paper proposes MAsk And imPUte (MAPU), a novel and effective approach for time series SFDA. MAPU addresses the critical challenge of temporal consistency by introducing a novel temporal imputation task. This task involves randomly masking time series signals and leveraging a dedicated temporal imputer to recover the original signal within the learned embedding space, bypassing the complexities of noisy raw data. Notably, MAPU is the first method to explicitly address temporal consistency in the context of time series SFDA. Additionally, it offers seamless integration with existing SFDA methods, providing greater flexibility. We further introduce E-MAPU, which incorporates evidential uncertainty estimation to address the overconfidence issue inherent in softmax predictions. To achieve that, we leverage evidential deep learning to obtain a better-calibrated pre-trained model and adapt the target encoder to map out-of-support target samples to a new feature representation closer to the source domain's support. This fosters better alignment, ultimately enhancing adaptation performance.
Extensive experiments on five real-world time series datasets demonstrate that both MAPU and E-MAPU achieve significant performance gains compared to existing methods. These results highlight the effectiveness of our proposed approaches for tackling various time series domain adaptation problems.

Submitted 12 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

arXiv:2406.01922 [pdf, ps, other]

Performance Analysis of Hybrid Cellular and Cell-free MIMO Network

Authors: Zhuoyin Dai, **gran Xu, Xiaoli Xu, Ruoguang Li, Yong Zeng

Abstract: Cell-free wireless communication is envisioned as one of the most promising network architectures, which can achieve stable and uniform communication performance while improving the system energy and spectrum efficiency. The deployment of cell-free networks is envisioned to be a long-term evolutionary process, in which cell-free access points (APs) will be gradually introduced into the communication network and collaborate with the existing cellular base stations (BSs). To further explore the performance limits of hybrid cellular and cell-free networks, this paper develops a hybrid network model based on stochastic geometric toolkits, which reveals the coupling of the signal and interference from both the cellular and cell-free networks. Specifically, the conjugate beamforming is applied in hybrid cellular and cell-free networks, which enables user equipment (UE) to benefit from both cellular BSs and cell-free APs. The aggregate signal received from the hybrid network is approximated via moment matching, and coverage probability is characterized by deriving the Laplace transform of the interference. The analysis of signal strength and coverage probability is verified by extensive simulations.

Submitted 3 June, 2024; originally announced June 2024.

arXiv:2405.19623 [pdf, other]

A Novel Approach for Automated Design Information Mining from Issue Logs

Authors: Jiuang Zhao, Zitian Yang, Li Zhang, Xiaoli Lian, Donghao Yang

Abstract: Software architectures are usually meticulously designed to address multiple quality concerns and support long-term maintenance. However, due to the imbalance between the cost and value for developers to document design rationales (i.e., the design alternatives and the underlying arguments for making or rejecting decisions), these rationales are often obsolete or even missing. The lack of design knowledge has motivated a number of studies to extract design information from various platforms in recent years. Unfortunately, despite the wealth of discussion records related to design information provided by platforms like open-source communities, existing research often overlooks the underlying arguments behind alternatives due to challenges such as the intricate semantics of discussions and the lack of benchmarks for design rationale extraction. In this paper, we propose a novel method, named by DRMiner, to automatically mine latent design rationales from developers' live discussion in open-source community (i.e., issue logs in Jira). To better identify solutions and the arguments supporting them, DRMiner skillfully decomposes the problem into multiple text classification tasks and tackles them using prompt tuning of language models and customized text-related features. To evaluate DRMiner, we acquire issue logs from Cassandra, Flink, and Solr repositories in Jira, and then annotate and process them under a rigorous scheme, ultimately forming a dataset for design rationale mining.
Experimental results show that DRMiner achieves an F1 score of 65% for mining design rationales, outperforming all baselines with a 7% improvement over GPT-4.0. Furthermore, we investigate the usefulness of the design rationales mined by DRMiner for automated program repair (APR) and find that the design rationales significantly enhance APR, achieving 14 times higher full-match repairs on average.

Submitted 29 May, 2024; originally announced May 2024.

arXiv:2405.15458 [pdf, other]

FedCal: Achieving Local and Global Calibration in Federated Learning via Aggregated Parameterized Scaler

Authors: Hongyi Peng, Han Yu, Xiaoli Tang, Xiaoxiao Li

Abstract: Federated learning (FL) enables collaborative machine learning across distributed data owners, but data heterogeneity poses a challenge for model calibration. While prior work focused on improving accuracy for non-iid data, calibration remains under-explored. This study reveals existing FL aggregation approaches lead to sub-optimal calibration, and theoretical analysis shows despite constraining variance in clients' label distributions, global calibration error is still asymptotically lower bounded. To address this, we propose a novel Federated Calibration (FedCal) approach, emphasizing both local and global calibration. It leverages client-specific scalers for local calibration to effectively correct output misalignment without sacrificing prediction accuracy. These scalers are then aggregated via weight averaging to generate a global scaler, minimizing the global calibration error. Extensive experiments demonstrate FedCal significantly outperforms the best-performing baseline, reducing global calibration error by 47.66% on average.

Submitted 3 June, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

Comments: This paper has been accepted by ICML'24

arXiv:2405.14767 [pdf, other]

FinRobot: An Open-Source AI Agent Platform for Financial Applications using Large Language Models

Authors: Hongyang Yang, Boyu Zhang, Neng Wang, Cheng Guo, Xiaoli Zhang, Likun Lin, Junlin Wang, Tianyu Zhou, Mao Guan, Runjia Zhang, Christina Dan Wang

Abstract: As financial institutions and professionals increasingly incorporate Large Language Models (LLMs) into their workflows, substantial barriers, including proprietary data and specialized knowledge, persist between the finance sector and the AI community. These challenges impede the AI community's ability to enhance financial tasks effectively. Acknowledging financial analysis's critical role, we aim to devise financial-specialized LLM-based toolchains and democratize access to them through open-source initiatives, promoting wider AI adoption in financial decision-making. In this paper, we introduce FinRobot, a novel open-source AI agent platform supporting multiple financially specialized AI agents, each powered by LLM. Specifically, the platform consists of four major layers: 1) the Financial AI Agents layer that formulates Financial Chain-of-Thought (CoT) by breaking sophisticated financial problems down into logical sequences; 2) the Financial LLM Algorithms layer dynamically configures appropriate model application strategies for specific tasks; 3) the LLMOps and DataOps layer produces accurate models by applying training/fine-tuning techniques and using task-relevant data; 4) the Multi-source LLM Foundation Models layer that integrates various LLMs and enables the above layers to access them directly. Finally, FinRobot provides hands-on for both professional-grade analysts and laypersons to utilize powerful AI techniques for advanced financial analysis. We open-source FinRobot at https://github.com/AI4Finance-Foundation/FinRobot.

Submitted 27 May, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

Comments: FinRobot Whitepaper V1.0

arXiv:2405.11223 [pdf, other]

A class of new linear, efficient and high-order implicit-explicit methods for the coupled free flow-porous media system based on nonlinear Lions interface condition

Authors: Xinhui Wang, Xu Guo, Xiaoli Li

Abstract: In this paper, we construct and analyze new first- and second-order implicit-explicit (IMEX) schemes for the unsteady Navier-Stokes-Darcy model to describe the coupled free flow-porous media system, which is based on the scalar auxiliary variable (SAV) approach in time and finite element method in space. The constructed schemes are linear, only require solving a sequence of linear differential equations with constant coefficients at each time step, and can decouple the Navier-Stokes and Darcy systems. The unconditional stability of both the first- and second-order IMEX schemes can be derived for the coupled system equipped with the Lions interface condition, where the key point is that we should construct a new trilinear form to balance the fully explicit discretizations of the nonlinear terms in the complex system. We can also establish rigorous error estimates for the velocity and hydraulic head of the first-order scheme without any time step restriction. Numerical examples are presented to validate the proposed schemes.

Submitted 18 May, 2024; originally announced May 2024.

arXiv:2405.06038 [pdf, other]

From Algorithm to Hardware: A Survey on Efficient and Safe Deployment of Deep Neural Networks

Authors: Xue Geng, Zhe Wang, Chunyun Chen, Qing Xu, Kaixin Xu, Chao Jin, Manas Gupta, Xulei Yang, Zhenghua Chen, Mohamed M. Sabry Aly, Jie Lin, Min Wu, Xiaoli Li

Abstract: Deep neural networks (DNNs) have been widely used in many artificial intelligence (AI) tasks. However, deploying them brings significant challenges due to the huge cost of memory, energy, and computation. To address these challenges, researchers have developed various model compression techniques such as model quantization and model pruning. Recently, there has been a surge in research of compression methods to achieve model efficiency while retaining the performance. Furthermore, more and more works focus on customizing the DNN hardware accelerators to better leverage the model compression techniques. In addition to efficiency, preserving security and privacy is critical for deploying DNNs. However, the vast and diverse body of related works can be overwhelming. This inspires us to conduct a comprehensive survey on recent research toward the goal of high-performance, cost-efficient, and safe deployment of DNNs. Our survey first covers the mainstream model compression techniques such as model quantization, model pruning, knowledge distillation, and optimizations of non-linear operations. We then introduce recent advances in designing hardware accelerators that can adapt to efficient model compression approaches. Additionally, we discuss how homomorphic encryption can be integrated to secure DNN deployment. Finally, we discuss several issues, such as hardware evaluation, generalization, and integration of various compression approaches. Overall, we aim to provide a big picture of efficient DNNs, from algorithm to hardware accelerators and security perspectives.

Submitted 9 May, 2024; originally announced May 2024.

Comments: This manuscript is the accepted version for TNNLS(IEEE Transactions on Neural Networks and Learning Systems)

arXiv:2405.05991 [pdf, other]

Agent-oriented Joint Decision Support for Data Owners in Auction-based Federated Learning

Authors: Xiaoli Tang, Han Yu, Xiaoxiao Li

Abstract: Auction-based Federated Learning (AFL) has attracted extensive research interest due to its ability to motivate data owners (DOs) to join FL through economic means. While many existing AFL methods focus on providing decision support to model users (MUs) and the AFL auctioneer, decision support for data owners remains open. To bridge this gap, we propose a first-of-its-kind agent-oriented joint Pricing, Acceptance and Sub-delegation decision support approach for data owners in AFL (PAS-AFL). By considering a DO's current reputation, pending FL tasks, willingness to train FL models, and its trust relationships with other DOs, it provides a systematic approach for a DO to make joint decisions on AFL bid acceptance, task sub-delegation and pricing based on Lyapunov optimization to maximize its utility. It is the first to enable each DO to take on multiple FL tasks simultaneously to earn higher income for DOs and enhance the throughput of FL tasks in the AFL ecosystem. Extensive experiments based on six benchmarking datasets demonstrate significant advantages of PAS-AFL compared to six alternative strategies, beating the best baseline by 28.77% and 2.64% on average in terms of utility and test accuracy of the resulting FL models, respectively.

Submitted 8 May, 2024; originally announced May 2024.

arXiv:2405.00718 [pdf, other]

Can't say cant? Measuring and Reasoning of Dark Jargons in Large Language Models

Authors: Xu Ji, Jianyi Zhang, Ziyin Zhou, Zhangchi Zhao, Qianqian Qiao, Kaiying Han, Md Imran Hossen, Xiali Hei

Abstract: Ensuring the resilience of Large Language Models (LLMs) against malicious exploitation is paramount, with recent focus on mitigating offensive responses. Yet, the understanding of cant or dark jargon remains unexplored. This paper introduces a domain-specific Cant dataset and CantCounter evaluation framework, employing Fine-Tuning, Co-Tuning, Data-Diffusion, and Data-Analysis stages. Experiments reveal LLMs, including ChatGPT, are susceptible to cant bypassing filters, with varying recognition accuracy influenced by question types, setups, and prompt clues. Updated models exhibit higher acceptance rates for cant queries. Moreover, LLM reactions differ across domains, e.g., reluctance to engage in racism versus LGBT topics. These findings underscore LLMs' understanding of cant and reflect training data characteristics and vendor approaches to sensitive topics. Additionally, we assess LLMs' ability to demonstrate reasoning capabilities. Access to our datasets and code is available at https://github.com/cistineup/CantCounter.

Submitted 25 April, 2024; originally announced May 2024.

arXiv:2404.18567 [pdf, other]

Assessing Cybersecurity Vulnerabilities in Code Large Language Models

Authors: Md Imran Hossen, Jianyi Zhang, Yinzhi Cao, Xiali Hei

Abstract: Instruction-tuned Code Large Language Models (Code LLMs) are increasingly utilized as AI coding assistants and integrated into various applications. However, the cybersecurity vulnerabilities and implications arising from the widespread integration of these models are not yet fully understood due to limited research in this domain. To bridge this gap, this paper presents EvilInstructCoder, a framework specifically designed to assess the cybersecurity vulnerabilities of instruction-tuned Code LLMs to adversarial attacks. EvilInstructCoder introduces the Adversarial Code Injection Engine to automatically generate malicious code snippets and inject them into benign code to poison instruction tuning datasets. It incorporates practical threat models to reflect real-world adversaries with varying capabilities and evaluates the exploitability of instruction-tuned Code LLMs under these diverse adversarial attack scenarios. Through the use of EvilInstructCoder, we conduct a comprehensive investigation into the exploitability of instruction tuning for coding tasks using three state-of-the-art Code LLM models: CodeLlama, DeepSeek-Coder, and StarCoder2, under various adversarial attack scenarios. Our experimental results reveal a significant vulnerability in these models, demonstrating that adversaries can manipulate the models to generate malicious payloads within benign code contexts in response to natural language instructions. For instance, under the backdoor attack setting, by poisoning only 81 samples (0.5% of the entire instruction dataset), we achieve Attack Success Rate at 1 (ASR@1) scores ranging from 76% to 86% for different model families. Our study sheds light on the critical cybersecurity vulnerabilities posed by instruction-tuned Code LLMs and emphasizes the urgent necessity for robust defense mechanisms to mitigate the identified vulnerabilities.

Submitted 29 April, 2024; originally announced April 2024.

arXiv:2404.15381 [pdf, other]

Advances and Open Challenges in Federated Learning with Foundation Models

Authors: Chao Ren, Han Yu, Hongyi Peng, Xiaoli Tang, Anran Li, Yulan Gao, Alysa Ziying Tan, Bo Zhao, Xiaoxiao Li, Zengxiang Li, Qiang Yang

Abstract: The integration of Foundation Models (FMs) with Federated Learning (FL) presents a transformative paradigm in Artificial Intelligence (AI), offering enhanced capabilities while addressing concerns of privacy, data decentralization, and computational efficiency. This paper provides a comprehensive survey of the emerging field of Federated Foundation Models (FedFM), elucidating their synergistic relationship and exploring novel methodologies, challenges, and future directions that the FL research field needs to focus on in order to thrive in the age of foundation models. A systematic multi-tiered taxonomy is proposed, categorizing existing FedFM approaches for model training, aggregation, trustworthiness, and incentivization. Key challenges, including how to enable FL to deal with high complexity of computational demands, privacy considerations, contribution evaluation, and communication efficiency, are thoroughly discussed. Moreover, the paper explores the intricate challenges of communication, scalability and security inherent in training/fine-tuning FMs via FL, highlighting the potential of quantum computing to revolutionize the training, inference, optimization and data encryption processes. This survey underscores the importance of further research to propel innovation in FedFM, emphasizing the need for developing trustworthy solutions. It serves as a foundational guide for researchers and practitioners interested in contributing to this interdisciplinary and rapidly advancing field.

Submitted 29 April, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

Comments: Survey of Federated Foundation Models (FedFM)

arXiv:2404.14283 [pdf, ps, other]

Cascade Radiations of $e^\pm$ from $γγ$-annihilation process as an extra component of the Early Optical/X-Ray Afterglows of Gamma-Ray Bursts

Authors: Ren-Jie Xiong, Xiao-Li Huang, Ze-Rui Wang

Abstract: Chromatic break and/or plateau observed in the early optical and X-ray afterglow lightcurves challenge the conventional external shock models of gamma-ray bursts (GRBs). Detection of TeV gamma-ray afterglows indicates strong gamma-ray production within the afterglow jets. We investigate the cascade radiations of the $e^\pm$ production via the $γγ$ interaction in the jets. Our numerical calculations show that the cascade synchrotron emission can make a significant contribution to the early optical/X-ray afterglows. The combination of the primary and cascade emission fluxes can shape a chromatic break and/or plateau in the early optical/X-ray lightcurves, depending on the jet properties. Applying our model to GRBs 050801 and 080310, we found that their optical plateaus and the late X-ray/optical lightcurves can be explained with our model in reasonable parameter values. We suggest that such a chromatic optical plateau could be a signature of strong $e^\pm$ production in GRB afterglow jets. The TeV gamma-ray flux of such kind GRBs should be significantly reduced, hence tends to be detectable for those GRBs that have a single power-law decaying optical afterglow lightcurve.

Submitted 22 April, 2024; originally announced April 2024.

Comments: 8 pages, 5 figures, accepted for publication in The Astrophysical Journal Letters

arXiv:2404.13244 [pdf, other]

Intelligent Agents for Auction-based Federated Learning: A Survey

Authors: Xiaoli Tang, Han Yu, Xiaoxiao Li, Sarit Kraus

Abstract: Auction-based federated learning (AFL) is an important emerging category of FL incentive mechanism design, due to its ability to fairly and efficiently motivate high-quality data owners to join data consumers' (i.e., servers') FL training tasks. To enhance the efficiency in AFL decision support for stakeholders (i.e., data consumers, data owners, and the auctioneer), intelligent agent-based techniques have emerged. However, due to the highly interdisciplinary nature of this field and the lack of a comprehensive survey providing an accessible perspective, it is a challenge for researchers to enter and contribute to this field. This paper bridges this important gap by providing a first-of-its-kind survey on the Intelligent Agents for AFL (IA-AFL) literature. We propose a unique multi-tiered taxonomy that organises existing IA-AFL works according to 1) the stakeholders served, 2) the auction mechanism adopted, and 3) the goals of the agents, to provide readers with a multi-perspective view into this field. In addition, we analyse the limitations of existing approaches, summarise the commonly adopted performance evaluation metrics, and discuss promising future directions leading towards effective and efficient stakeholder-oriented decision support in IA-AFL ecosystems.

Submitted 19 April, 2024; originally announced April 2024.

arXiv:2404.12060 [pdf, other]

Environment-aware UAV Communications: CKM Construction and Predictive Beamforming

Authors: Shiqi Zeng, Xiaoli Xu, Yong Zeng

Abstract: Predictive millimeter-wave (mmWave) beamforming is a promising technique to enable low-latency and high-rate ground-air communications for cellular-connected unmanned aerial vehicles (UAVs). However, the high vulnerability of mmWave to blockages poses practical challenges to the implementation of such a technology. In this paper, we tackle the challenges by proposing a channel knowledge map (CKM)-assisted predictive beamforming approach based on the echoed joint communication and sensing signal, whereby the line-of-sight (LoS) link identification is performed via hypothesis testing using prior information provided by CKM. Depending on the identification result, extended Kalman filtering (EKF) is adopted to reliably track the target UAV. Furthermore, if the non-line-of-sight (NLoS) state is identified, the target UAV will be immediately connected to a candidate base station (BS), namely a handover will be triggered to alleviate the communication outage. The simulation results show that the proposed method can significantly enhance the UAV tracking and mmWave communication performance compared to the benchmarking schemes without using CKM or LoS identification.

Submitted 18 April, 2024; originally announced April 2024.

arXiv:2404.10233 [pdf, ps, other]

Little Pilot is Needed for Channel Estimation with Integrated Super-Resolution Sensing and Communication

Authors: Jingran Xu, Huizhi Wang, Yong Zeng, Xiaoli Xu

Abstract: Integrated super-resolution sensing and communication (ISSAC) is a promising technology to achieve extremely high sensing performance for critical parameters, such as the angles of the wireless channels. In this paper, we propose an ISSAC-based channel estimation method, which requires little or even no pilot, yet still achieves accurate channel state information (CSI) estimation. The key idea is to exploit the fact that subspace-based super-resolution algorithms such as multiple signal classification (MUSIC) do not require a priori known pilots for accurate parameter estimation. Therefore, in the proposed method, the angles of the multi-path channel components are first estimated in a pilot-free manner while communication data symbols are sent. After that, the multi-path channel coefficients are estimated, where very little pilots are needed. The reasons are two folds. First, compared to the conventional channel estimation methods purely relying on channel training, much fewer parameters need to be estimated once the multi-path angles are accurately estimated. Besides, with angles obtained, the beamforming gain is also enjoyed when pilots are sent to estimate the channel path gains. To rigorously study the performance of the proposed method, we first consider the basic line-of-sight (LoS) channel. By analyzing the minimum mean square error (MMSE) of channel estimation and the resulting beamforming gains, we show that our proposed method significantly outperforms the conventional methods purely based on channel training. We then extend the study to the more general multipath channels. Simulation results are provided to demonstrate our theoretical results.

Submitted 15 April, 2024; originally announced April 2024.

Comments: 6 pages, 5 figures, accepted by IEEE WCNC 2024 workshops

arXiv:2404.08902 [pdf, ps, other]

On a class of higher-order length preserving and energy decreasing IMEX schemes for the Landau-Lifshitz equation

Authors: Xiaoli Li, Nan Zheng, Jie Shen

Abstract: We construct new higher-order implicit-explicit (IMEX) schemes using the generalized scalar auxiliary variable (GSAV) approach for the Landau-Lifshitz equation. These schemes are linear, length preserving and only require solving one elliptic equation with constant coefficients at each time step. We show that numerical solutions of these schemes are uniformly bounded without any restriction on the time step size, and establish rigorous error estimates in $l^{\infty}(0,T;H^1(Ω)) \bigcap l^{2}(0,T;H^2(Ω))$ of orders 1 to 5 in a unified framework.

Submitted 13 April, 2024; originally announced April 2024.

arXiv:2404.08472 [pdf, other]

TSLANet: Rethinking Transformers for Time Series Representation Learning

Authors: Emadeldeen Eldele, Mohamed Ragab, Zhenghua Chen, Min Wu, Xiaoli Li

Abstract: Time series data, characterized by its intrinsic long and short-range dependencies, poses a unique challenge across analytical applications. While Transformer-based models excel at capturing long-range dependencies, they face limitations in noise sensitivity, computational efficiency, and overfitting with smaller datasets. In response, we introduce a novel Time Series Lightweight Adaptive Network (TSLANet), as a universal convolutional model for diverse time series tasks. Specifically, we propose an Adaptive Spectral Block, harnessing Fourier analysis to enhance feature representation and to capture both long-term and short-term interactions while mitigating noise via adaptive thresholding. Additionally, we introduce an Interactive Convolution Block and leverage self-supervised learning to refine the capacity of TSLANet for decoding complex temporal patterns and improve its robustness on different datasets. Our comprehensive experiments demonstrate that TSLANet outperforms state-of-the-art models in various tasks spanning classification, forecasting, and anomaly detection, showcasing its resilience and adaptability across a spectrum of noise levels and data sizes. The code is available at https://github.com/emadeldeen24/TSLANet.

Submitted 6 May, 2024; v1 submitted 12 April, 2024; originally announced April 2024.

Comments: Accepted in ICML 2024

arXiv:2404.08408 [pdf, other]

Seismic First Break Picking in a Higher Dimension Using Deep Graph Learning

Authors: Hongtao Wang, Li Long, Jiangshe Zhang, Xiaoli Wei, Chunxia Zhang, Zhenbo Guo

Abstract: Contemporary automatic first break (FB) picking methods typically analyze 1D signals, 2D source gathers, or 3D source-receiver gathers. Utilizing higher-dimensional data, such as 2D or 3D, incorporates global features, improving the stability of local picking. Despite the benefits, high-dimensional data requires structured input and increases computational demands. Addressing this, we propose a novel approach using deep graph learning called DGL-FB, constructing a large graph to efficiently extract information. In this graph, each seismic trace is represented as a node, connected by edges that reflect similarities. To manage the size of the graph, we develop a subgraph sampling technique to streamline model training and inference. Our proposed framework, DGL-FB, leverages deep graph learning for FB picking. It encodes subgraphs into global features using a deep graph encoder. Subsequently, the encoded global features are combined with local node signals and fed into a ResUNet-based 1D segmentation network for FB detection. Field survey evaluations of DGL-FB show superior accuracy and stability compared to a 2D U-Net-based benchmark method.

Submitted 12 April, 2024; originally announced April 2024.

arXiv:2404.04887 [pdf, other]

A Clinical-oriented Multi-level Contrastive Learning Method for Disease Diagnosis in Low-quality Medical Images

Authors: Qingshan Hou, Shuai Cheng, Peng Cao, Jinzhu Yang, Xiaoli Liu, Osmar R. Zaiane, Yih Chung Tham

Abstract: Representation learning offers a conduit to elucidate distinctive features within the latent space and interpret the deep models. However, the randomness of lesion distribution and the complexity of low-quality factors in medical images pose great challenges for models to extract key lesion features. Disease diagnosis methods guided by contrastive learning (CL) have shown significant advantages in lesion feature representation. Nevertheless, the effectiveness of CL is highly dependent on the quality of the positive and negative sample pairs. In this work, we propose a clinical-oriented multi-level CL framework that aims to enhance the model's capacity to extract lesion features and discriminate between lesion and low-quality factors, thereby enabling more accurate disease diagnosis from low-quality medical images. Specifically, we first construct multi-level positive and negative pairs to enhance the model's comprehensive recognition capability of lesion features by integrating information from different levels and qualities of medical images. Moreover, to improve the quality of the learned lesion embeddings, we introduce a dynamic hard sample mining method based on self-paced learning. The proposed CL framework is validated on two public medical image datasets, EyeQ and Chest X-ray, demonstrating superior performance compared to other state-of-the-art disease diagnostic methods.

Submitted 7 April, 2024; originally announced April 2024.

arXiv:2404.02427 [pdf]

doi 10.1063/5.0097518

In-situ tunable giant electrical anisotropy in a grating gated AlGaN/GaN two-dimensional electron gas

Authors: Ting-Ting Wang, Sining Dong, Chong Li, Wen-Cheng Yue, Yang-Yang Lyu, Chen-Guang Wang, Chang-Kun Zeng, Zixiong Yuan, Wei Zhu, Zhi-Li Xiao, Xiaoli Lu, Bin Liu, Hai Lu, Hua-Bing Wang, Peiheng Wu, Wai-Kwong Kwok, Yong-Lei Wang

Abstract: Materials with in-plane electrical anisotropy have great potential for designing artificial synaptic devices. However, natural materials with strong intrinsic in-plane electrical anisotropy are rare. We introduce a simple strategy to produce extremely large electrical anisotropy via grating gating of a semiconductor two-dimensional electron gas (2DEG) of AlGaN/GaN. We show that periodically modulated electric potential in the 2DEG induces in-plane electrical anisotropy, which is significantly enhanced in a magnetic field, leading to an ultra large electrical anisotropy. This is induced by a giant positive magnetoresistance and a giant negative magnetoresistance under two orthogonally oriented in-plane current flows, respectively. This giant electrical anisotropy is in-situ tunable by tailoring both the grating gate voltage and the magnetic field. Our semiconductor device with controllable giant electrical anisotropy will stimulate new device applications, such as multi-terminal memtransistors and bionic synapses.

Submitted 2 April, 2024; originally announced April 2024.

Journal ref: Appl. Phys. Lett. 121, 092101 (2022)

arXiv:2404.02104 [pdf]

Quantum Hall effect in a CVD-grown oxide

Authors: Oleksandr Zheliuk, Yuliia Kreminska, Qundong Fu, Davide Pizzirani, Andrew A. L. N. Ammerlaan, Ying Wang, Sardar Hameed, Puhua Wan, Xiaoli Peng, Steffen Wiedmann, Zheng Liu, Jianting Ye, Uli Zeitler

Abstract: Two-dimensional electron systems (2DES) are promising for investigating correlated quantum phenomena. In particular, 2D oxides provide a platform that can host various quantum phases such as quantized Hall effect, superconductivity, or magnetism. The realization of such quantum phases in 2D oxides heavily relies on dedicated heterostructure growths. Here we show the integer quantum Hall effect achieved in chemical vapor deposition grown Bi2O2Se - a representative member of a more accessible oxide family. A single or few sub-band 2DES can be prepared in thin films of Bi2O2Se, where the film thickness acts as the sole design parameter and the sub-band occupation is determined by the electric field effect. This new oxide platform exhibits characteristic advantages in structural flexibility due to its layered nature, making it suitable for scalable growth. The unique small mass distinguishes Bi2O2Se from other high-mobility oxides, providing a new platform for exploring quantum Hall physics in 2D oxides.

Submitted 2 April, 2024; originally announced April 2024.

arXiv:2404.00363 [pdf, other]

Recent progresses in strange quark stars

Authors: Xiao-Li Zhang, Yong-Feng Huang, Ze-Cheng Zou

Abstract: According to the hypothesis that strange quark matter may be the true ground state of matter at extremely high densities, strange quark stars should be stable and could exist in the Universe. It is possible that pulsars may actually be strange stars, but not neutron stars. Here we present a short review on recent progresses in the field of strange quark stars. Three popular phenomenological models widely used to describe strange quark matter are introduced, with special attention being paid on the corresponding equation of state in each model. Combining the equation of state with the Tolman-Oppenheimer-Volkov equations, the inner structure and mass-radius relation can be obtained for the whole sequence of strange stars. Strong gravitational wave emissions may be generated by strange stars through various mechanisms, which may help identify strange stars via observations. Especially, close-in strange quark planets with respect to their hosts may provide a unique test for the existence of strange quark objects. Fierce electromagnetic bursts could also be generated by strange stars. The energy may come from the phase transition of neutron stars to strange stars, or from the merger of binary strange stars. The collapse of the strange star crust can also release a huge amount of energy. It is shown that strange quark stars may be involved in short gamma-ray bursts and fast radio bursts.

Submitted 30 March, 2024; originally announced April 2024.

arXiv:2403.16424 [pdf]

An Experiment with the Use of ChatGPT for LCSH Subject Assignment on Electronic Theses and Dissertations

Authors: Eric H. C. Chow, TJ Kao, Xiaoli Li

Abstract: This study delves into the potential use of Large Language Models (LLMs) for generating Library of Congress Subject Headings (LCSH). The authors employed ChatGPT to generate subject headings for electronic theses and dissertations (ETDs) based on their titles and summaries. The results revealed that although some generated subject headings were valid, there were issues regarding specificity and exhaustiveness. The study showcases that LLMs can serve as a strategic response to the backlog of items awaiting cataloging in academic libraries, while also offering a cost-effective approach for promptly generating LCSH. Nonetheless, human catalogers remain essential for verifying and enhancing the validity, exhaustiveness, and specificity of LCSH generated by LLMs.

Submitted 3 April, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

Comments: 20 pages

arXiv:2403.15271 [pdf, other]

doi 10.14722/ndss.2024.241231

From Hardware Fingerprint to Access Token: Enhancing the Authentication on IoT Devices

Authors: Yue Xiao, Yi He, Xiaoli Zhang, Qian Wang, Renjie Xie, Kun Sun, Ke Xu, Qi Li

Abstract: The proliferation of consumer IoT products in our daily lives has raised the need for secure device authentication and access control. Unfortunately, these resource-constrained devices typically use token-based authentication, which is vulnerable to token compromise attacks that allow attackers to impersonate the devices and perform malicious operations by stealing the access token. Using hardware fingerprints to secure their authentication is a promising way to mitigate these threats. However, once attackers have stolen some hardware fingerprints (e.g., via MitM attacks), they can bypass the hardware authentication by training a machine learning model to mimic fingerprints or reusing these fingerprints to craft forged requests. In this paper, we present MCU-Token, a secure hardware fingerprinting framework for MCU-based IoT devices even if the cryptographic mechanisms (e.g., private keys) are compromised. MCU-Token can be easily integrated with various IoT devices by simply adding a short hardware fingerprint-based token to the existing payload. To prevent the reuse of this token, we propose a message mapping approach that binds the token to a specific request via generating the hardware fingerprints based on the request payload. To defeat the machine learning attacks, we mix the valid fingerprints with poisoning data so that attackers cannot train a usable model with the leaked tokens. MCU-Token can defend against an armored adversary who may replay, craft, and offload the requests via MitM or use both hardware (e.g., use identical devices) and software (e.g., machine learning attacks) strategies to mimic the fingerprints. The system evaluation shows that MCU-Token can achieve high accuracy (over 97%) with a low overhead across various IoT devices and application scenarios.

Submitted 22 March, 2024; originally announced March 2024.

arXiv:2403.14734 [pdf, other]

A Survey of Neural Code Intelligence: Paradigms, Advances and Beyond

Authors: Qiushi Sun, Zhirui Chen, Fangzhi Xu, Kanzhi Cheng, Chang Ma, Zhangyue Yin, Jianing Wang, Chengcheng Han, Renyu Zhu, Shuai Yuan, Qipeng Guo, Xipeng Qiu, Pengcheng Yin, Xiaoli Li, Fei Yuan, Lingpeng Kong, Xiang Li, Zhiyong Wu

Abstract: Neural Code Intelligence -- leveraging deep learning to understand, generate, and optimize code -- holds immense potential for transformative impacts on the whole society. Bridging the gap between Natural Language and Programming Language, this domain has drawn significant attention from researchers in both research communities over the past few years. This survey presents a systematic and chronological review of the advancements in code intelligence, encompassing over 50 representative models and their variants, more than 20 categories of tasks, and an extensive coverage of over 680 related works. We follow the historical progression to trace the paradigm shifts across different research phases (e.g., from modeling code with recurrent neural networks to the era of Large Language Models). Concurrently, we highlight the major technical transitions in models, tasks, and evaluations spanning through different stages. For applications, we also observe a co-evolving shift. It spans from initial endeavors to tackling specific scenarios, through exploring a diverse array of tasks during its rapid expansion, to currently focusing on tackling increasingly complex and varied real-world challenges. Building on our examination of the developmental trajectories, we further investigate the emerging synergies between code intelligence and broader machine intelligence, uncovering new cross-domain opportunities and illustrating the substantial influence of code intelligence across various domains. Finally, we delve into both the opportunities and challenges associated with this field, alongside elucidating our insights on the most promising research directions. An ongoing, dynamically updated project and resources associated with this survey have been released at https://github.com/QiushiSun/NCISurvey.

Submitted 23 June, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

Comments: 64 pages, 6 figures, 10 tables, 692 references

arXiv:2403.14097 [pdf, other]

Parcae: Proactive, Liveput-Optimized DNN Training on Preemptible Instances

Authors: Jiangfei Duan, Ziang Song, Xupeng Miao, Xiaoli Xi, Dahua Lin, Harry Xu, Minjia Zhang, Zhihao Jia

Abstract: Deep neural networks (DNNs) are becoming progressively large and costly to train. This paper aims to reduce DNN training costs by leveraging preemptible instances on modern clouds, which can be allocated at a much lower price when idle but may be preempted by the cloud provider at any time. Prior work that supports DNN training on preemptive instances employs a reactive approach to handling instance preemptions and allocations after their occurrence, which only achieves limited performance and scalability. We present Parcae, a system that enables cheap, fast, and scalable DNN training on preemptible instances by proactively adjusting the parallelization strategy of a DNN training job to adapt to predicted resource changes before instance preemptions and allocations really happen, which significantly reduces the cost of handling these events. Parcae optimizes liveput, a novel metric that measures the expected training throughput of a DNN job under various possible preemption scenarios. Compared to existing reactive, throughput-optimized systems, Parcae's proactive, liveput-optimized solution considers both the throughput of a job and its robustness under preemptions. To optimize liveput, Parcae supports lightweight instance migration and uses an availability predictor to forecast future preemptions. It then uses a liveput optimizer to discover an optimal strategy to parallelize DNN training under predicted preemptions. We evaluate Parcae on a variety of DNNs and preemption traces and show that Parcae outperforms existing spot-instance DNN training systems by up to 10$\times$. More importantly, Parcae achieves near-optimal performance for training large DNNs under frequent preemptions, in which case existing approaches cannot make any progress.

Submitted 20 March, 2024; originally announced March 2024.

Comments: NSDI '24

arXiv:2403.12639 [pdf, ps, other]

Imaging and spectroscopic observations of a confined solar filament eruption with two-stage evolution

Authors: Zhe Xu, Xiaoli Yan, Liheng Yang, Zhike Xue, Jincheng Wang, Yian Zhou

Abstract: Solar filament eruptions are often characterized by stepwise evolution due to the involvement of multiple mechanisms, such as magnetohydrodynamic instabilities and magnetic reconnection. In this article, we investigated a confined filament eruption with a distinct two-stage evolution by using the imaging and spectroscopic observations from the Interface Region Imaging Spectrograph (IRIS) and the Solar Dynamics Observatory (SDO). The eruption originated from a kinked filament thread that separated from an active region filament. In the first stage, the filament thread rose slowly and was obstructed due to flux pile-up in its front. This obstruction brought the filament thread into reconnection with a nearby loop-like structure, which enlarged the flux rope and changed its connectivity through the foot-point migration. The newly formed flux rope became more kink unstable and drove the rapid eruption in the second stage. It ascended into the upper atmosphere and initiated the reconnection with the overlying field. Finally, the flux rope was totally disintegrated, producing several solar jets along the overlying field. These observations demonstrate that the external reconnection between the flux rope and overlying field can destroy the flux rope, thus playing a crucial role in confining the solar eruptions.

Submitted 19 March, 2024; originally announced March 2024.

Comments: 9 pages, 5 figures

arXiv:2403.10897 [pdf, other]

Rethinking Multi-view Representation Learning via Distilled Disentangling

Authors: Guanzhou Ke, Bo Wang, Xiaoli Wang, Shengfeng He

Abstract: Multi-view representation learning aims to derive robust representations that are both view-consistent and view-specific from diverse data sources. This paper presents an in-depth analysis of existing approaches in this domain, highlighting a commonly overlooked aspect: the redundancy between view-consistent and view-specific representations. To this end, we propose an innovative framework for multi-view representation learning, which incorporates a technique we term 'distilled disentangling'. Our method introduces the concept of masked cross-view prediction, enabling the extraction of compact, high-quality view-consistent representations from various sources without incurring extra computational overhead. Additionally, we develop a distilled disentangling module that efficiently filters out consistency-related information from multi-view representations, resulting in purer view-specific representations. This approach significantly reduces redundancy between view-consistent and view-specific representations, enhancing the overall efficiency of the learning process. Our empirical evaluations reveal that higher mask ratios substantially improve the quality of view-consistent representations. Moreover, we find that reducing the dimensionality of view-consistent representations relative to that of view-specific representations further refines the quality of the combined representations. Our code is accessible at: https://github.com/Guanzhou-Ke/MRDD.

Submitted 29 March, 2024; v1 submitted 16 March, 2024; originally announced March 2024.

Comments: Accepted by CVPR 2024

arXiv:2403.06737 [pdf, other]

Post-Training Attribute Unlearning in Recommender Systems

Authors: Chaochao Chen, Yizhao Zhang, Yuyuan Li, Dan Meng, Jun Wang, Xiaoli Zheng, Jianwei Yin

Abstract: With the growing privacy concerns in recommender systems, recommendation unlearning is getting increasing attention. Existing studies predominantly use training data, i.e., model inputs, as unlearning target. However, attackers can extract private information from the model even if it has not been explicitly encountered during training. We name this unseen information as \textit{attribute} and treat it as unlearning target. To protect the sensitive attribute of users, Attribute Unlearning (AU) aims to make target attributes indistinguishable. In this paper, we focus on a strict but practical setting of AU, namely Post-Training Attribute Unlearning (PoT-AU), where unlearning can only be performed after the training of the recommendation model is completed. To address the PoT-AU problem in recommender systems, we propose a two-component loss function. The first component is distinguishability loss, where we design a distribution-based measurement to make attribute labels indistinguishable from attackers. We further extend this measurement to handle multi-class attribute cases with efficient computational overhead. The second component is regularization loss, where we explore a function-space measurement that effectively maintains recommendation performance compared to parameter-space regularization. We use stochastic gradient descent algorithm to optimize our proposed loss. Extensive experiments on four real-world datasets demonstrate the effectiveness of our proposed methods.

Submitted 11 March, 2024; originally announced March 2024.

Comments: arXiv admin note: text overlap with arXiv:2310.05847

arXiv:2403.05102 [pdf, other]

Enhancing Texture Generation with High-Fidelity Using Advanced Texture Priors

Authors: Kuo Xu, Maoyu Wang, Muyu Wang, Lincong Feng, Tianhui Zhang, Xiaoli Liu

Abstract: The recent advancements in 2D generation technology have sparked a widespread discussion on using 2D priors for 3D shape and texture content generation. However, these methods often overlook the subsequent user operations, such as texture aliasing and blurring that occur when the user acquires the 3D model and simplifies its structure. Traditional graphics methods partially alleviate this issue, but recent texture synthesis technologies fail to ensure consistency with the original model's appearance and cannot achieve high-fidelity restoration. Moreover, background noise frequently arises in high-resolution texture synthesis, limiting the practical application of these generation technologies. In this work, we propose a high-resolution and high-fidelity texture restoration technique that uses the rough texture as the initial input to enhance the consistency between the synthetic texture and the initial texture, thereby overcoming the issues of aliasing and blurring caused by the user's structure simplification operations. Additionally, we introduce a background noise smoothing technique based on a self-supervised scheme to address the noise problem in current high-resolution texture synthesis schemes. Our approach enables high-resolution texture synthesis, paving the way for high-definition and high-detail texture synthesis technology. Experiments demonstrate that our scheme outperforms currently known schemes in high-fidelity texture recovery under high-resolution conditions.

Submitted 8 March, 2024; originally announced March 2024.

arXiv:2403.03645 [pdf, other]

K-Link: Knowledge-Link Graph from LLMs for Enhanced Representation Learning in Multivariate Time-Series Data

Authors: Yucheng Wang, Ruibing Jin, Min Wu, Xiaoli Li, Lihua Xie, Zhenghua Chen

Abstract: Sourced from various sensors and organized chronologically, Multivariate Time-Series (MTS) data involves crucial spatial-temporal dependencies, e.g., correlations among sensors. To capture these dependencies, Graph Neural Networks (GNNs) have emerged as powerful tools, yet their effectiveness is restricted by the quality of graph construction from MTS data. Typically, existing approaches construct graphs solely from MTS signals, which may introduce bias due to a small training dataset and may not accurately represent underlying dependencies. To address this challenge, we propose a novel framework named K-Link, leveraging Large Language Models (LLMs) to encode extensive general knowledge and thereby providing effective solutions to reduce the bias. Leveraging the knowledge embedded in LLMs, such as physical principles, we extract a \textit{Knowledge-Link graph}, capturing vast semantic knowledge of sensors and the linkage of the sensor-level knowledge. To harness the potential of the knowledge-link graph in enhancing the graph derived from MTS data, we propose a graph alignment module, facilitating the transfer of semantic knowledge within the knowledge-link graph into the MTS-derived graph. By doing so, we can improve the graph quality, ensuring effective representation learning with GNNs for MTS data. Extensive experiments demonstrate the efficacy of our approach for superior performance across various MTS-related downstream tasks.

Submitted 6 March, 2024; originally announced March 2024.

Comments: 12 pages, 7 figures

arXiv:2403.00495 [pdf]

doi 10.1039/D4MA00058G

Low-temperature aqueous solution growth of the acousto-optic TeO2 single crystals

Authors: Lu Han, Chao Liu, Xiaoli Wang, Feiyu Li, Chuanyan Fan, Junjie Zhang

Abstract: $α$-TeO2 is widely used in acousto-optic devices due to its excellent physical properties. Conventionally, $α$-TeO2 single crystals were grown using melt methods. Here, we report for the first time the growth of $α$-TeO2 single crystals using the aqueous solution method below 100 °C. The solubility curve of $α$-TeO2 was measured, and then single crystals with dimensions of 3.5x3.5x2.5 mm3 were successfully grown using seed crystals that were synthesized from spontaneous nucleation. The as-grown single crystals belong to the P41212 space group, evidenced by single crystal X-ray diffraction and Rietveld refinement on powder diffraction. Rocking curve measurements show that the as-grown crystals exhibit high crystallinity with a full width at half maximum (FWHM) of 57.2''. Ultraviolet-Visible absorption spectroscopy indicates the absorption edge is 350 nm and the band gap is estimated to be 3.58 eV. The density and Vickers hardness of as-grown single crystals are measured to be 6.042 g/cm3 and 404 kg/mm2, respectively. Our findings provide an easy-to-access and energy-saving method for growing single crystals of inorganic compounds.

Submitted 1 March, 2024; originally announced March 2024.

Comments: 14 pages, 8 figures

Journal ref: Mater. Adv., 2024, 5, 3022-3028

arXiv:2402.18933 [pdf, other]

Modality-Agnostic Structural Image Representation Learning for Deformable Multi-Modality Medical Image Registration

Authors: Tony C. W. Mok, Zi Li, Yunhao Bai, Jianpeng Zhang, Wei Liu, Yan-Jie Zhou, Ke Yan, Dakai Jin, Yu Shi, Xiaoli Yin, Le Lu, Ling Zhang

Abstract: Establishing dense anatomical correspondence across distinct imaging modalities is a foundational yet challenging procedure for numerous medical image analysis studies and image-guided radiotherapy. Existing multi-modality image registration algorithms rely on statistical-based similarity measures or local structural image representations. However, the former is sensitive to locally varying noise, while the latter is not discriminative enough to cope with complex anatomical structures in multimodal scans, causing ambiguity in determining the anatomical correspondence across scans with different modalities. In this paper, we propose a modality-agnostic structural representation learning method, which leverages Deep Neighbourhood Self-similarity (DNS) and anatomy-aware contrastive learning to learn discriminative and contrast-invariance deep structural image representations (DSIR) without the need for anatomical delineations or pre-aligned training images. We evaluate our method on multiphase CT, abdomen MR-CT, and brain MR T1w-T2w registration. Comprehensive results demonstrate that our method is superior to the conventional local structural representation and statistical-based similarity measures in terms of discriminability and accuracy.

Submitted 31 March, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

Comments: Accepted by CVPR2024

arXiv:2402.15061 [pdf, other]

Fine-tuning Large Language Models for Domain-specific Machine Translation

Authors: Jiawei Zheng, Hanghai Hong, Xiaoli Wang, Jingsong Su, Yonggui Liang, Shikai Wu

Abstract: Large language models (LLMs) have made significant progress in machine translation (MT). However, their potential in domain-specific MT remains under-explored. Current LLM-based MT systems still face several challenges. First, for LLMs with in-context learning, their effectiveness is highly sensitive to input translation examples, and processing them can increase inference costs. They often require extra post-processing due to over-generation. Second, LLMs with fine-tuning on domain-specific data often require high training costs for domain adaptation, and may weaken the zero-shot MT capabilities of LLMs due to over-specialization. The aforementioned methods can struggle to translate rare words in domain transfer scenarios. To address these challenges, this paper proposes a prompt-oriented fine-tuning method, denoted as LlamaIT, to effectively and efficiently fine-tune a general-purpose LLM for domain-specific MT tasks. First, we construct a task-specific mix-domain dataset, which is then used to fine-tune the LLM with LoRA. This can eliminate the need for input translation examples, post-processing, or over-specialization. By zero-shot prompting with instructions, we adapt the MT tasks to the target domain at inference time. To further elicit the MT capability for rare words, we construct new prompts by incorporating domain-specific bilingual vocabulary. We also conduct extensive experiments on both publicly available and self-constructed datasets. The results show that our LlamaIT can significantly enhance the domain-specific MT capabilities of the LLM, meanwhile preserving its zero-shot MT capabilities.

Submitted 22 February, 2024; originally announced February 2024.

Comments: 9 pages, 6 figures, 6 tables

arXiv:2402.12603 [pdf]

Interlayer ferroelectric polarization modulated anomalous Hall effects in four-layer MnBi2Te4 antiferromagnets

Authors: Ziyu Niu, Xiang-Long Yu, Dingfu Shao, Xixiang Jing, Defeng Hou, Xuhong Li, Jing Sun, Junqin Shi, Xiaoli Fan, Tengfei Cao

Abstract: Van der Waals (vdW) assembly could efficiently modulate the symmetry of two-dimensional (2D) materials that ultimately governs their physical properties. Of particular interest is the ferroelectric polarization being introduced by proper vdW assembly that enables the realization of novel electronic, magnetic and transport properties of 2D materials. Four-layer antiferromagnetic MnBi2Te4 (F-MBT) offers an excellent platform to explore ferroelectric polarization effects on magnetic order and topological transport properties of nanomaterials. Here, by applying symmetry analyses and density-functional-theory calculations, the ferroelectric interface effects on magnetic order, anomalous Hall effect (AHE) or even quantum AHE (QAHE) on the F-MBT are analyzed. Interlayer ferroelectric polarization in F-MBT efficiently violates the PT symmetry (the combined symmetry of central inversion (P) and time reversal (T)) of the F-MBT by conferring magnetoelectric couplings, and stabilizes a specific antiferromagnetic order encompassing a ferromagnetic interface in the F-MBT. We predict that engineering an interlayer polarization in the top or bottom interface of F-MBT allows converting F-MBT from a trivial insulator to a Chern insulator. The switching of ferroelectric polarization at the middle interfaces results in a direction reversal of the quantum anomalous Hall current. Additionally, the interlayer polarization of the top and bottom interfaces can be aligned in the same direction, and the switching of polarization direction also reverses the direction of anomalous Hall currents. Overall, our work highlights the occurrence of quantum-transport phenomena in 2D vdW four-layer antiferromagnets through vdW assembly. These phenomena are absent in the bulk or thin-film in bulk-like stacking forms of MnBi2Te4.

Submitted 19 February, 2024; originally announced February 2024.

arXiv:2402.11887 [pdf, other]

Generative Semi-supervised Graph Anomaly Detection

Authors: Hezhe Qiao, Qingsong Wen, Xiaoli Li, Ee-Peng Lim, Guansong Pang

Abstract: This work considers a practical semi-supervised graph anomaly detection (GAD) scenario, where part of the nodes in a graph are known to be normal, contrasting to the extensively explored unsupervised setting with a fully unlabeled graph. We reveal that having access to the normal nodes, even just a small percentage of normal nodes, helps enhance the detection performance of existing unsupervised GAD methods when they are adapted to the semi-supervised setting. However, their utilization of these normal nodes is limited. In this paper, we propose a novel Generative GAD approach (namely GGAD) for the semi-supervised scenario to better exploit the normal nodes. The key idea is to generate pseudo anomaly nodes, referred to as 'outlier nodes', for providing effective negative node samples in training a discriminative one-class classifier. The main challenge here lies in the lack of ground truth information about real anomaly nodes. To address this challenge, GGAD is designed to leverage two important priors about the anomaly nodes -- asymmetric local affinity and egocentric closeness -- to generate reliable outlier nodes that assimilate anomaly nodes in both graph structure and feature representations. Comprehensive experiments on six real-world GAD datasets are performed to establish a benchmark for semi-supervised GAD and show that GGAD substantially outperforms state-of-the-art unsupervised and semi-supervised GAD methods with varying numbers of training normal nodes. Code will be made available at https://github.com/mala-lab/GGAD.

Submitted 28 May, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

Comments: 20 pages, 11 figures

arXiv:2402.11604 [pdf, other]

Self-evolving Autoencoder Embedded Q-Network

Authors: J. Senthilnath, Bangjian Zhou, Zhen Wei Ng, Deeksha Aggarwal, Rajdeep Dutta, Ji Wei Yoon, Aye Phyu Phyu Aung, Keyu Wu, Min Wu, Xiaoli Li

Abstract: In the realm of sequential decision-making tasks, the exploration capability of a reinforcement learning (RL) agent is paramount for achieving high rewards through interactions with the environment. To enhance this crucial ability, we propose SAQN, a novel approach wherein a self-evolving autoencoder (SA) is embedded with a Q-Network (QN). In SAQN, the self-evolving autoencoder architecture adapts and evolves as the agent explores the environment. This evolution enables the autoencoder to capture a diverse range of raw observations and represent them effectively in its latent space. By leveraging the disentangled states extracted from the encoder generated latent space, the QN is trained to determine optimal actions that improve rewards. During the evolution of the autoencoder architecture, a bias-variance regulatory strategy is employed to elicit the optimal response from the RL agent. This strategy involves two key components: (i) fostering the growth of nodes to retain previously acquired knowledge, ensuring a rich representation of the environment, and (ii) pruning the least contributing nodes to maintain a more manageable and tractable latent space. Extensive experimental evaluations conducted on three distinct benchmark environments and a real-world molecular environment demonstrate that the proposed SAQN significantly outperforms state-of-the-art counterparts. The results highlight the effectiveness of the self-evolving autoencoder and its collaboration with the Q-Network in tackling sequential decision-making tasks.

Submitted 18 February, 2024; originally announced February 2024.

Comments: 11 pages, 9 figures, 3 tables

arXiv:2402.09167 [pdf, other]

Evolving Restricted Boltzmann Machine-Kohonen Network for Online Clustering

Authors: J. Senthilnath, Adithya Bhattiprolu, Ankur Singh, Bangjian Zhou, Min Wu, Jón Atli Benediktsson, Xiaoli Li

Abstract: A novel online clustering algorithm is presented where an Evolving Restricted Boltzmann Machine (ERBM) is embedded with a Kohonen Network called ERBM-KNet. The proposed ERBM-KNet efficiently handles streaming data in a single-pass mode using the ERBM, employing a bias-variance strategy for neuron growing and pruning, as well as online clustering based on a cluster update strategy for cluster prediction and cluster center update using KNet. Initially, ERBM evolves its architecture while processing unlabeled image data, effectively disentangling the data distribution in the latent space. Subsequently, the KNet utilizes the feature extracted from ERBM to predict the number of clusters and updates the cluster centers. By overcoming the common challenges associated with clustering algorithms, such as prior initialization of the number of clusters and subpar clustering accuracy, the proposed ERBM-KNet offers significant improvements. Extensive experimental evaluations on four benchmarks and one industry dataset demonstrate the superiority of ERBM-KNet compared to state-of-the-art approaches.

Submitted 14 February, 2024; originally announced February 2024.

Comments: 9 pages, 11 figures, 3 tables

arXiv:2402.04362 [pdf, other]

Neural Networks Learn Statistics of Increasing Complexity

Authors: Nora Belrose, Quintin Pope, Lucia Quirke, Alex Mallen, Xiaoli Fern

Abstract: The distributional simplicity bias (DSB) posits that neural networks learn low-order moments of the data distribution first, before moving on to higher-order correlations. In this work, we present compelling new evidence for the DSB by showing that networks automatically learn to perform well on maximum-entropy distributions whose low-order statistics match those of the training set early in training, then lose this ability later. We also extend the DSB to discrete domains by proving an equivalence between token $n$-gram frequencies and the moments of embedding vectors, and by finding empirical evidence for the bias in LLMs. Finally we use optimal transport methods to surgically edit the low-order statistics of one class to match those of another, and show that early-training networks treat the edited samples as if they were drawn from the target class. Code is available at https://github.com/EleutherAI/features-across-time.

Submitted 13 February, 2024; v1 submitted 6 February, 2024; originally announced February 2024.
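
As a rough illustration of the "maximum-entropy distributions whose low-order statistics match those of the training set" mentioned in the abstract above: for real-valued data with a fixed mean and covariance, the maximum-entropy distribution is the multivariate Gaussian with those moments. The sketch below is not taken from the authors' repository. It assumes "low-order statistics" means first and second moments only, and the helper names (`mean_and_cov`, `cholesky`, `sample_gaussian`) are hypothetical.

```python
import math
import random


def mean_and_cov(data):
    """Return the sample mean vector and unbiased covariance matrix of `data`."""
    n, d = len(data), len(data[0])
    mean = [sum(row[j] for row in data) / n for j in range(d)]
    cov = [
        [
            sum((row[i] - mean[i]) * (row[j] - mean[j]) for row in data) / (n - 1)
            for j in range(d)
        ]
        for i in range(d)
    ]
    return mean, cov


def cholesky(a):
    """Lower-triangular factor L with L @ L.T == a (a must be positive definite)."""
    d = len(a)
    low = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def sample_gaussian(mean, cov, n, rng):
    """Draw n samples from N(mean, cov) via x = mean + L z, with z standard normal."""
    low = cholesky(cov)
    d = len(mean)
    samples = []
    for _ in range(n):
        z = [rng.gauss(0.0, 1.0) for _ in range(d)]
        samples.append(
            [mean[i] + sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(d)]
        )
    return samples


# Toy "training set" (assumed data, for illustration only).
train = [[0.0, 0.0], [1.0, 1.0], [2.0, 1.0], [3.0, 3.0]]
mu, sigma = mean_and_cov(train)

# Synthetic samples that share the training set's mean and covariance
# but carry no higher-order structure.
synthetic = sample_gaussian(mu, sigma, 20000, random.Random(0))
```

Evaluating a model on `synthetic` rather than on `train` would then probe how much of its behaviour depends only on first- and second-order statistics, which is the kind of comparison the abstract describes.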

arXiv:2402.02526 [pdf, other]

CompeteSMoE -- Effective Training of Sparse Mixture of Experts via Competition

Authors: Quang Pham, Giang Do, Huy Nguyen, TrungTin Nguyen, Chenghao Liu, Mina Sartipi, Binh T. Nguyen, Savitha Ramasamy, Xiaoli Li, Steven Hoi, Nhat Ho

Abstract: Sparse mixture of experts (SMoE) offers an appealing solution to scale up the model complexity beyond the means of increasing the network's depth or width. However, effective training of SMoE has proven to be challenging due to the representation collapse issue, which causes parameter redundancy and limited representation potentials. In this work, we propose a competition mechanism to address this fundamental challenge of representation collapse. By routing inputs only to experts with the highest neural response, we show that, under mild assumptions, competition enjoys the same convergence rate as the optimal estimator. We further propose CompeteSMoE, an effective and efficient algorithm to train large language models by deploying a simple router that predicts the competition outcomes. Consequently, CompeteSMoE enjoys strong performance gains from the competition routing policy while having low computation overheads. Our extensive empirical evaluations on two transformer architectures and a wide range of tasks demonstrate the efficacy, robustness, and scalability of CompeteSMoE compared to state-of-the-art SMoE strategies.

Submitted 4 February, 2024; originally announced February 2024.
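
The competition idea in the abstract above, routing each input only to the experts with the highest neural response, can be sketched as follows. This is a minimal illustration and not the CompeteSMoE implementation. It assumes the "neural response" is the L2 norm of an expert's output and that the winners are mixed with softmax weights over their responses; the abstract specifies neither. The names `competition_route` and `make_scaling_expert` are hypothetical.

```python
import math


def l2_norm(vec):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(v * v for v in vec))


def competition_route(x, experts, k=1):
    """Run every expert on x and keep the k experts with the largest response.

    Assumption: the response of an expert is the L2 norm of its output.
    Returns the winning expert indices and the softmax-weighted mix of
    their outputs.
    """
    outputs = [expert(x) for expert in experts]
    responses = [l2_norm(out) for out in outputs]

    # Indices of the k strongest responses, strongest first.
    winners = sorted(range(len(experts)), key=lambda i: responses[i], reverse=True)[:k]

    # Softmax over the winners' responses (shifted by the max for stability).
    top = max(responses[i] for i in winners)
    exps = [math.exp(responses[i] - top) for i in winners]
    total = sum(exps)
    weights = [e / total for e in exps]

    dim = len(outputs[winners[0]])
    mixed = [
        sum(w * outputs[i][d] for w, i in zip(weights, winners)) for d in range(dim)
    ]
    return winners, mixed


def make_scaling_expert(scale):
    """Toy expert that multiplies its input by a constant (illustration only)."""
    return lambda x: [scale * v for v in x]


experts = [make_scaling_expert(s) for s in (0.5, 2.0, 1.0)]
winners, mixed = competition_route([1.0, -1.0], experts, k=1)
```

Running all experts on every input is expensive, which is presumably why the abstract pairs competition with "a simple router that predicts the competition outcomes": a cheap router trained to imitate `winners` would avoid evaluating the losing experts at inference time.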

Showing 1–50 of 671 results for author: Xiali