Search | arXiv e-print repository

arXiv:2404.11996 [pdf, other]

DST-GTN: Dynamic Spatio-Temporal Graph Transformer Network for Traffic Forecasting

Authors: Songtao Huang, Hong** Song, Tianqi Jiang, Akbar Telikani, Jun Shen, Qingguo Zhou, Binbin Yong, Qiang Wu

Abstract: Accurate traffic forecasting is essential for effective urban planning and congestion management. Deep learning (DL) approaches have gained colossal success in traffic forecasting but still face challenges in capturing the intricacies of traffic dynamics. In this paper, we identify and address these challenges by emphasizing that spatial features are inherently dynamic and change over time. A novel in-depth feature representation, called Dynamic Spatio-Temporal (Dyn-ST) features, is introduced, which encapsulates spatial characteristics across varying times. Moreover, a Dynamic Spatio-Temporal Graph Transformer Network (DST-GTN) is proposed by capturing Dyn-ST features and other dynamic adjacency relations between intersections. The DST-GTN can model dynamic ST relationships between nodes accurately and refine the representation of global and local ST characteristics by adopting adaptive weights in low-pass and all-pass filters, enabling the extraction of Dyn-ST features from traffic time-series data. Through numerical experiments on public datasets, the DST-GTN achieves state-of-the-art performance for a range of traffic forecasting tasks and demonstrates enhanced stability.

Submitted 18 April, 2024; originally announced April 2024.

arXiv:2404.11290 [pdf, other]

doi 10.1145/3589334.3645589

Inductive Cognitive Diagnosis for Fast Student Learning in Web-Based Online Intelligent Education Systems

Authors: Shuo Liu, Junhao Shen, Hong Qian, Aimin Zhou

Abstract: Cognitive diagnosis aims to gauge students' mastery levels based on their response logs. Serving as a pivotal module in web-based online intelligent education systems (WOIESs), it plays an upstream and fundamental role in downstream tasks like learning item recommendation and computerized adaptive testing. WOIESs are open learning environments where numerous new students constantly register and complete exercises. In WOIESs, efficient cognitive diagnosis is crucial to fast feedback and accelerating student learning. However, the existing cognitive diagnosis methods always employ intrinsically transductive student-specific embeddings, which become slow and costly due to retraining when dealing with new students who are unseen during training. To this end, this paper proposes an inductive cognitive diagnosis model (ICDM) for fast inference of new students' mastery levels in WOIESs. Specifically, in ICDM, we propose a novel student-centered graph (SCG). Rather than inferring mastery levels through updating student-specific embeddings, we derive the inductive mastery levels as the aggregated outcomes of students' neighbors in SCG. Namely, SCG shifts the task from finding the most suitable student-specific embedding that fits the response logs to finding the most suitable representations for different node types in SCG, and the latter is more efficient since it no longer requires retraining. To obtain this representation, ICDM consists of a construction-aggregation-generation-transformation process to learn the final representation of students, exercises and concepts. Extensive experiments across real-world datasets show that, compared with the existing cognitive diagnosis methods that are always transductive, ICDM is much faster while maintaining competitive inference performance for new students.

Submitted 17 April, 2024; originally announced April 2024.

Comments: WWW 2024

arXiv:2404.09724 [pdf, other]

Privacy-Preserving Federated Unlearning with Certified Client Removal

Authors: Ziyao Liu, Huanyi Ye, Yu Jiang, Jiyuan Shen, Jiale Guo, Ivan Tjuawinata, Kwok-Yan Lam

Abstract: In recent years, Federated Unlearning (FU) has gained attention for addressing the removal of a client's influence from the global model in Federated Learning (FL) systems, thereby ensuring the "right to be forgotten" (RTBF). State-of-the-art methods for unlearning use historical data from FL clients, such as gradients or locally trained models. However, studies have revealed significant information leakage in this setting, with the possibility of reconstructing a user's local data from their uploaded information. Addressing this, we propose Starfish, a privacy-preserving federated unlearning scheme using Two-Party Computation (2PC) techniques and shared historical client data between two non-colluding servers. Starfish builds upon existing FU methods to ensure privacy in unlearning processes. To enhance the efficiency of privacy-preserving FU evaluations, we suggest 2PC-friendly alternatives for certain FU algorithm operations. We also implement strategies to reduce costs associated with 2PC operations and lessen cumulative approximation errors. Moreover, we establish a theoretical bound for the difference between the unlearned global model via Starfish and a global model retrained from scratch for certified client removal. Our theoretical and experimental analyses demonstrate that Starfish achieves effective unlearning with reasonable efficiency, maintaining privacy and security in FL systems.

Submitted 15 April, 2024; originally announced April 2024.

arXiv:2404.08902 [pdf, ps, other]

On a class of higher-order length preserving and energy decreasing IMEX schemes for the Landau-Lifshitz equation

Authors: Xiaoli Li, Nan Zheng, Jie Shen

Abstract: We construct new higher-order implicit-explicit (IMEX) schemes using the generalized scalar auxiliary variable (GSAV) approach for the Landau-Lifshitz equation. These schemes are linear, length preserving and only require solving one elliptic equation with constant coefficients at each time step. We show that numerical solutions of these schemes are uniformly bounded without any restriction on the time step size, and establish rigorous error estimates in $l^{\infty}(0,T;H^1(\Omega)) \cap l^{2}(0,T;H^2(\Omega))$ of orders 1 to 5 in a unified framework.

Submitted 13 April, 2024; originally announced April 2024.

arXiv:2404.07967 [pdf, other]

Spin-Energy Entanglement of a Time-Focused Neutron

Authors: J. C. Leiner, S. J. Kuhn, S. McKay, J. K. Jochum, F. Li, A. A. M. Irfan, F. Funama, D. Mettus, L. Beddrich, C. Franz, J. Shen, S. R. Parnell, R. M. Dalgliesh, M. Loyd, N. Geerits, G. Ortiz, C. Pfleiderer, R. Pynn

Abstract: Intra-particle entanglement of individual particles such as neutrons could enable a new class of scattering probes that are sensitive to entanglement in quantum systems and materials. In this work, we present experimental results demonstrating quantum contextuality as a result of entanglement between the spin and energy modes (i.e., degrees of freedom) of single neutrons in a beam using a pair of resonant radio-frequency neutron spin flippers in the MIEZE configuration (Modulated IntEnsity with Zero Effort). We verified the mode-entanglement by measuring a Clauser-Horne-Shimony-Holt (CHSH) contextuality witness $S$ defined in the spin and energy subsystems, observing a clear breach of the classical bound of $|S| \leq 2$, obtaining $S = 2.40 \pm 0.02$. These entangled beams could enable novel approaches for directly probing dynamics and entanglement in quantum materials whose low-energy excitation scales match those of the incident entangled neutron.

Submitted 11 April, 2024; originally announced April 2024.

Comments: 10 pages, 7 figures

arXiv:2404.07229 [pdf, other]

Personality-affected Emotion Generation in Dialog Systems

Authors: Zhiyuan Wen, Jiannong Cao, Jiaxing Shen, Ruosong Yang, Shuaiqi Liu, Maosong Sun

Abstract: Generating appropriate emotions for responses is essential for dialog systems to provide human-like interaction in various application scenarios. Most previous dialog systems tried to achieve this goal by learning empathetic manners from anonymous conversational data. However, emotional responses generated by those methods may be inconsistent, which will decrease user engagement and service quality. Psychological findings suggest that the emotional expressions of humans are rooted in personality traits. Therefore, we propose a new task, Personality-affected Emotion Generation, to generate emotion based on the personality given to the dialog system and further investigate a solution through the personality-affected mood transition. Specifically, we first construct a daily dialog dataset, Personality EmotionLines Dataset (PELD), with emotion and personality annotations. Subsequently, we analyze the challenges in this task, i.e., (1) heterogeneously integrating personality and emotional factors and (2) extracting multi-granularity emotional information in the dialog context. Finally, we propose to model the personality as the transition weight by simulating the mood transition process in the dialog system and solve the challenges above. We conduct extensive experiments on PELD for evaluation. Results suggest that by adopting our method, the emotion generation performance is improved by 13% in macro-F1 and 5% in weighted-F1 from the BERT-base model.

Submitted 3 April, 2024; originally announced April 2024.

Comments: Accepted by ACM Transactions on Information Systems

arXiv:2404.06486 [pdf, other]

GO4Align: Group Optimization for Multi-Task Alignment

Authors: Jiayi Shen, Cheems Wang, Zehao Xiao, Nanne Van Noord, Marcel Worring

Abstract: This paper proposes GO4Align, a multi-task optimization approach that tackles task imbalance by explicitly aligning the optimization across tasks. To achieve this, we design an adaptive group risk minimization strategy, comprising two crucial techniques in implementation: (i) dynamical group assignment, which clusters similar tasks based on task interactions; (ii) risk-guided group indicators, which exploit consistent task correlations with risk information from previous iterations. Comprehensive experimental results on diverse typical benchmarks demonstrate our method's performance superiority with even lower computational costs.

Submitted 9 April, 2024; originally announced April 2024.

arXiv:2404.04418 [pdf, other]

A moving mesh finite element method for Bernoulli free boundary problems

Authors: **ye Shen, Heng Dai, Weizhang Huang

Abstract: A moving mesh finite element method is studied for the numerical solution of Bernoulli free boundary problems. The method is based on the pseudo-transient continuation with which a moving boundary problem is constructed and its steady-state solution is taken as the solution of the underlying Bernoulli free boundary problem. The moving boundary problem is solved in a split manner at each time step: the moving boundary is updated with the Euler scheme, the interior mesh points are moved using a moving mesh method, and the corresponding initial-boundary value problem is solved using the linear finite element method. The method can take full advantage of both the pseudo-transient continuation and the moving mesh method. In particular, it is able to move the mesh, free of tangling, to fit the varying domain for a variety of geometries, whether they are convex or concave. Moreover, it is convergent towards steady state for a broad class of free boundary problems and initial guesses of the free boundary. Numerical examples for Bernoulli free boundary problems with constant and non-constant Bernoulli conditions and for nonlinear free boundary problems are presented to demonstrate the accuracy and robustness of the method and its ability to deal with various geometries and nonlinearities.

Submitted 5 April, 2024; originally announced April 2024.

Comments: 26 pages

MSC Class: 65M60; 65M50; 35R35; 35R37

arXiv:2404.04407 [pdf, other]

Meshfree finite difference solution of homogeneous Dirichlet problems of the fractional Laplacian

Authors: **ye Shen, Bowen Shi, Weizhang Huang

Abstract: A so-called grid-overlay finite difference method (GoFD) was proposed recently for the numerical solution of homogeneous Dirichlet boundary value problems of the fractional Laplacian on arbitrary bounded domains. It was shown to have advantages of both finite difference and finite element methods, including its efficient implementation through the fast Fourier transform and ability to work for complex domains and with mesh adaptation. The purpose of this work is to study GoFD in a meshfree setting, a key to which is to construct the data transfer matrix from a given point cloud to a uniform grid. Two approaches are proposed, one based on the moving least squares fitting and the other based on the Delaunay triangulation and piecewise linear interpolation. Numerical results obtained for examples with convex and concave domains and various types of point clouds are presented. They show that both approaches lead to comparable results. Moreover, the resulting meshfree GoFD converges at a similar order as GoFD with unstructured meshes and finite element approximation as the number of points in the cloud increases. Furthermore, numerical results show that the method is robust to random perturbations in the location of the points.

Submitted 5 April, 2024; originally announced April 2024.

Comments: 18 pages

MSC Class: 65N06; 35R11

arXiv:2404.03764 [pdf, other]

CONCERT: Covariate-Elaborated Robust Local Information Transfer with Conditional Spike-and-Slab Prior

Authors: Ruqian Zhang, Yijiao Zhang, Annie Qu, Zhongyi Zhu, Juan Shen

Abstract: The popularity of transfer learning stems from the fact that it can borrow information from useful auxiliary datasets. Existing statistical transfer learning methods usually adopt a global similarity measure between the source data and the target data, which may lead to inefficiency when only local information is shared. In this paper, we propose a novel Bayesian transfer learning method named "CONCERT" to allow robust local information transfer for high-dimensional data analysis. A novel conditional spike-and-slab prior is introduced in the joint distribution of target and source parameters for information transfer. By incorporating covariate-specific priors, we can characterize the local similarities and make the sources work collaboratively to help improve the performance on the target. Distinguished from existing work, CONCERT is a one-step procedure, which achieves variable selection and information transfer simultaneously. Variable selection consistency is established for our CONCERT. To make our algorithm scalable, we adopt the variational Bayes framework to facilitate implementation. Extensive experiments and a genetic data analysis demonstrate the validity and the advantage of CONCERT over existing cutting-edge transfer learning methods. We also extend our CONCERT to the logistical models with numerical studies showing its superiority over other methods.

Submitted 30 March, 2024; originally announced April 2024.

Comments: 31 pages, 22 figures

arXiv:2404.03659 [pdf, other]

Federated Unlearning for Human Activity Recognition

Authors: Kongyang Chen, Dong** zhang, Ya** Chai, Weibin Zhang, Shaowei Wang, Jiaxing Shen

Abstract: The rapid evolution of Internet of Things (IoT) technology has spurred the widespread adoption of Human Activity Recognition (HAR) in various daily life domains. Federated Learning (FL) is frequently utilized to build a global HAR model by aggregating user contributions without transmitting raw individual data. Despite substantial progress in user privacy protection with FL, challenges persist. Regulations like the General Data Protection Regulation (GDPR) empower users to request data removal, raising a new query in FL: How can a HAR client request data removal without compromising other clients' privacy? In response, we propose a lightweight machine unlearning method for refining the FL HAR model by selectively removing a portion of a client's training data. Our method employs a third-party dataset unrelated to model training. Using KL divergence as a loss function for fine-tuning, we aim to align the predicted probability distribution on forgotten data with the third-party dataset. Additionally, we introduce a membership inference evaluation method to assess unlearning effectiveness. Experimental results across diverse datasets show our method achieves unlearning accuracy comparable to retraining methods, resulting in speedups ranging from hundreds to thousands.

Submitted 17 January, 2024; originally announced April 2024.

arXiv:2404.02491 [pdf, other]

Measuring Social Norms of Large Language Models

Authors: Ye Yuan, Kexin Tang, Jianhao Shen, Ming Zhang, Chenguang Wang

Abstract: We present a new challenge to examine whether large language models understand social norms. In contrast to existing datasets, our dataset requires a fundamental understanding of social norms to solve. Our dataset features the largest set of social norm skills, consisting of 402 skills and 12,383 questions covering a wide set of social norms ranging from opinions and arguments to culture and laws. We design our dataset according to the K-12 curriculum. This enables the direct comparison of the social understanding of large language models to humans, more specifically, elementary students. While prior work generates nearly random accuracy on our benchmark, recent large language models such as GPT3.5-Turbo and LLaMA2-Chat are able to improve the performance significantly, only slightly below human performance. We then propose a multi-agent framework based on large language models to improve the models' ability to understand social norms. This method further improves large language models to be on par with humans. Given the increasing adoption of large language models in real-world applications, our finding is particularly important and presents a unique direction for future improvements.

Submitted 22 May, 2024; v1 submitted 3 April, 2024; originally announced April 2024.

arXiv:2404.01693 [pdf, other]

HeMeNet: Heterogeneous Multichannel Equivariant Network for Protein Multitask Learning

Authors: Rong Han, Wenbing Huang, Lingxiao Luo, Xinyan Han, Jiaming Shen, Zhiqiang Zhang, Jun Zhou, Ting Chen

Abstract: Understanding and leveraging the 3D structures of proteins is central to a variety of biological and drug discovery tasks. While deep learning has been applied successfully for structure-based protein function prediction tasks, current methods usually employ distinct training for each task. However, each of the tasks is of small size, and such a single-task strategy hinders the models' performance and generalization ability. As some labeled 3D protein datasets are biologically related, combining multi-source datasets for larger-scale multi-task learning is one way to overcome this problem. In this paper, we propose a neural network model to address multiple tasks jointly upon the input of 3D protein structures. In particular, we first construct a standard structure-based multi-task benchmark called Protein-MT, consisting of 6 biologically relevant tasks, including affinity prediction and property prediction, integrated from 4 public datasets. Then, we develop a novel graph neural network for multi-task learning, dubbed Heterogeneous Multichannel Equivariant Network (HeMeNet), which is E(3) equivariant and able to capture heterogeneous relationships between different atoms. Besides, HeMeNet can achieve task-specific learning via the task-aware readout mechanism. Extensive evaluations on our benchmark verify the effectiveness of multi-task learning, and our model generally surpasses state-of-the-art models.

Submitted 2 April, 2024; originally announced April 2024.

arXiv:2403.16040 [pdf, ps, other]

General One-loop Generating Function by IBP relations

Authors: Bo Feng, Chang Hu, Jiyuan Shen, Yaobo Zhang

Abstract: In this paper we have studied the most general generating function of reduction for one loop integrals with arbitrary tensor structure in numerator and arbitrary power distribution of propagators in denominator. Using IBP relations, we have established the partial differential equations for these generating functions and solved them analytically. These results provide useful guidance for applying generating function method to reductions of higher loop integrals.

Submitted 24 March, 2024; originally announced March 2024.

Comments: 50 pages

arXiv:2403.15241 [pdf, other]

IS-Fusion: Instance-Scene Collaborative Fusion for Multimodal 3D Object Detection

Authors: Junbo Yin, Jianbing Shen, Runnan Chen, Wei Li, Ruigang Yang, Pascal Frossard, Wenguan Wang

Abstract: Bird's eye view (BEV) representation has emerged as a dominant solution for describing 3D space in autonomous driving scenarios. However, objects in the BEV representation typically exhibit small sizes, and the associated point cloud context is inherently sparse, which leads to great challenges for reliable 3D perception. In this paper, we propose IS-Fusion, an innovative multimodal fusion framework that jointly captures the Instance- and Scene-level contextual information. IS-Fusion essentially differs from existing approaches that only focus on the BEV scene-level fusion by explicitly incorporating instance-level multimodal information, thus facilitating the instance-centric tasks like 3D object detection. It comprises a Hierarchical Scene Fusion (HSF) module and an Instance-Guided Fusion (IGF) module. HSF applies Point-to-Grid and Grid-to-Region transformers to capture the multimodal scene context at different granularities. IGF mines instance candidates, explores their relationships, and aggregates the local multimodal context for each instance. These instances then serve as guidance to enhance the scene feature and yield an instance-aware BEV representation. On the challenging nuScenes benchmark, IS-Fusion outperforms all the published multimodal works to date. Code is available at: https://github.com/yinjunbo/IS-Fusion.

Submitted 22 March, 2024; originally announced March 2024.

Comments: Accepted to CVPR 2024; Code: https://github.com/yinjunbo/IS-Fusion

arXiv:2403.13380 [pdf, other]

A characteristics-based method for shock-ramp data analysis

Authors: **gxiang Shen, Wei Kang

Abstract: For the data analysis problem of shock-ramp compression, i.e., ramp compression after a relatively strong initial shock, a characteristics-based method that strictly deals with the initial hydrodynamic shock is described in detail. Validation of this analysis method using simulated shock-ramp data generated by molecular dynamics and one-dimensional radiation hydrodynamic code is also presented.

Submitted 20 March, 2024; originally announced March 2024.

arXiv:2403.13273 [pdf, other]

Galaxy Triplets Alignment in Large-scale Filaments

Authors: Yu Rong, **zhi Shen, Zichen Hua

Abstract: Leveraging the datasets of galaxy triplets and large-scale filaments obtained from the Sloan Digital Sky Survey, we scrutinize the alignment of the three sides of the triangles formed by galaxy triplets and the normal vectors of the triplet planes within observed large-scale filaments. Our statistical investigation reveals that the longest and median sides of the galaxy triplets exhibit a robust alignment with the spines of their host large-scale filaments, while the shortest sides show no or only weak alignment with the filaments. Additionally, the normal vectors of triplets tend to be perpendicular to the filaments. The alignment signal diminishes rapidly with the increasing distance from the triplet to the filament spine, and is primarily significant for triplets located within distances shorter than $0.2$~Mpc$/h$, with a confidence level exceeding $20\sigma$. Moreover, in comparison to compact galaxy triplets, the alignment signal is more conspicuous among the loose triplets. This alignment analysis contributes to the formulation of a framework depicting the clustering and relaxation of galaxies within cosmological large-scale filament regimes, providing deeper insights into the intricate interactions between galaxies and their pivotal role in shaping galaxy groups.

Submitted 19 March, 2024; originally announced March 2024.

Comments: Accepted for publication in MNRAS Letters

arXiv:2403.12306 [pdf]

On the interplay of liquid-like and stress-driven dynamics in a metallic glass former observed by temperature scanning XPCS

Authors: Maximilian Frey, Nico Neuber, Sascha Sebastian Riegler, Antoine Cornet, Yuriy Chushkin, Federico Zontone, Lucas Ruschel, Bastian Adam, Mehran Nabahat, Fan Yang, Jie Shen, Fabian Westermeier, Michael Sprung, Daniele Cangialosi, Valerio Di Lisio, Isabella Gallino, Ralf Busch, Beatrice Ruta, Eloi Pineda

Abstract: Modern detector technology and highly brilliant fourth-generation synchrotrons make it possible to improve the temporal resolution in time-resolved diffraction studies. Profiting from this, we applied temperature scanning X-ray photon correlation spectroscopy (XPCS) to probe the dynamics of a Pt-based metallic glass former in the glass, glass transition region, and supercooled liquid, covering up to six orders of magnitude in time scales. Our data demonstrate that the structural alpha-relaxation process is still observable in the glass, although it is partially masked by a faster source of decorrelation observed at atomic scale. We present an approach that interprets these findings as the superposition of heterogeneous liquid-like and stress-driven ballistic-like atomic motions. This work not only extends the dynamical range probed by standard isothermal XPCS, but also clarifies the fate of the alpha-relaxation across the glass transition and provides a new perspective on the anomalous, compressed temporal decay of the density-density correlation functions observed in metallic glasses and many out-of-equilibrium soft materials.

Submitted 22 March, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

arXiv:2403.11886 [pdf, other]

QueryAgent: A Reliable and Efficient Reasoning Framework with Environmental Feedback-based Self-Correction

Authors: Xiang Huang, Sitao Cheng, Shanshan Huang, Jiayu Shen, Yong Xu, Chaoyun Zhang, Yuzhong Qu

Abstract: Employing Large Language Models (LLMs) for semantic parsing has achieved remarkable success. However, we find existing methods fall short in terms of reliability and efficiency when hallucinations are encountered. In this paper, we address these challenges with a framework called QueryAgent, which solves a question step-by-step and performs step-wise self-correction. We introduce an environmental feedback-based self-correction method called ERASER. Unlike traditional approaches, ERASER leverages rich environmental feedback in the intermediate steps to perform selective and differentiated self-correction only when necessary. Experimental results demonstrate that QueryAgent notably outperforms all previous few-shot methods using only one example on GrailQA and GraphQ by 7.0 and 15.0 F1. Moreover, our approach exhibits superiority in terms of efficiency, including runtime, query overhead, and API invocation costs. By leveraging ERASER, we further improve another baseline (i.e., AgentBench) by approximately 10 points, revealing the strong transferability of our approach.

Submitted 13 June, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

Comments: Accepted by ACL 2024 main conference. 22 pages, 7 figures, 13 tables

arXiv:2403.10616 [pdf, other]

DiPaCo: Distributed Path Composition

Authors: Arthur Douillard, Qixuan Feng, Andrei A. Rusu, Adhiguna Kuncoro, Yani Donchev, Rachita Chhaparia, Ionel Gog, Marc'Aurelio Ranzato, Jiajun Shen, Arthur Szlam

Abstract: Progress in machine learning (ML) has been fueled by scaling neural network models. This scaling has been enabled by ever more heroic feats of engineering, necessary for accommodating ML approaches that require high bandwidth communication between devices working in parallel. In this work, we propose a co-designed modular architecture and training approach for ML models, dubbed DIstributed PAth COmposition (DiPaCo). During training, DiPaCo distributes computation by paths through a set of shared modules. Together with a Local-SGD inspired optimization (DiLoCo) that keeps modules in sync with drastically reduced communication, our approach facilitates training across poorly connected and heterogeneous workers, with a design that ensures robustness to worker failures and preemptions. At inference time, only a single path needs to be executed for each input, without the need for any model compression. We consider this approach as a first prototype towards a new paradigm of large-scale learning, one that is less synchronous and more modular. Our experiments on the widely used C4 benchmark show that, for the same amount of training steps but less wall-clock time, DiPaCo exceeds the performance of a 1 billion-parameter dense transformer language model by choosing one of 256 possible paths, each with a size of 150 million parameters.

Submitted 15 March, 2024; originally announced March 2024.

arXiv:2403.09369 [pdf, other]

PreConfig: A Pretrained Model for Automating Network Configuration

Authors: Fuliang Li, Haozhi Lang, Jiajie Zhang, Jiaxing Shen, Xingwei Wang

Abstract: Manual network configuration automation (NCA) tools face significant challenges in versatility and flexibility due to their reliance on extensive domain expertise and manual design, limiting their adaptability to diverse scenarios and complex application needs. This paper introduces PreConfig, an innovative NCA tool that leverages a pretrained language model for automating network configuration tasks. PreConfig is designed to address the complexity and variety of NCA tasks by framing them as text-to-text transformation problems, thus unifying the tasks of configuration generation, translation, and analysis under a single, versatile model. Our approach overcomes existing tools' limitations by utilizing advances in natural language processing to automatically comprehend and generate network configurations without extensive manual re-engineering. We confront the challenges of integrating domain-specific knowledge into pretrained models and the scarcity of supervision data in the network configuration field. Our solution involves constructing a specialized corpus and further pretraining on network configuration data, coupled with a novel data mining technique for generating task supervision data. The proposed model demonstrates robustness in configuration generation, translation, and analysis, outperforming conventional tools in handling complex networking environments. The experimental results validate the effectiveness of PreConfig, establishing a new direction for automating network configuration tasks with pretrained language models.

Submitted 14 March, 2024; originally announced March 2024.

arXiv:2403.08948 [pdf, ps, other]

Model-free Resilient Controller Design based on Incentive Feedback Stackelberg Game and Q-learning

Authors: Jiajun Shen, Fengjun Li, Morteza Hashemi, Huazhen Fang

Abstract: In the swift evolution of Cyber-Physical Systems (CPSs) within intelligent environments, especially in the industrial domain shaped by Industry 4.0, the surge in development brings forth unprecedented security challenges. This paper explores the intricate security issues of Industrial CPSs (ICPSs), with a specific focus on the unique threats presented by intelligent attackers capable of directly compromising the controller, thereby posing a direct risk to physical security. Within the framework of hierarchical control and incentive feedback Stackelberg game, we design a resilient leading controller (leader) that is adaptive to a compromised following controller (follower) such that the compromised follower acts cooperatively with the leader, aligning its strategies with the leader's objective to achieve a team-optimal solution. First, we provide sufficient conditions for the existence of an incentive Stackelberg solution when system dynamics are known. Then, we propose a Q-learning-based Approximate Dynamic Programming (ADP) approach, and corresponding algorithms for the online resolution of the incentive Stackelberg solution without requiring prior knowledge of system dynamics. Last but not least, we prove the convergence of our approach to the optimum.

Submitted 13 March, 2024; originally announced March 2024.

Comments: 8 pages

arXiv:2403.07637 [pdf]

Discovery of a Magnetic Topological Semimetal Eu$_3$In$_2$As$_4$ with a Single Pair of Weyl Points

Authors: Ke Jia, **gyu Yao, Xiaobo He, Yupeng Li, Junze Deng, Ming Yang, Junfeng Wang, Zengwei Zhu, Cuixiang Wang, Dayu Yan, Hai L. Feng, Jie Shen, Yongkang Luo, Zhijun Wang, Youguo Shi

Abstract: Magnetic Weyl semimetal (MWS) is a unique topological state with open surface Fermi arc states and other exotic transport phenomena. However, most reported MWSs show multiple pairs of Weyl points and complicated Fermi surfaces, which increases the difficulty of the investigation into the intrinsic chiral transport property. In this work, we successfully synthesized a soft magnetic Weyl semimetal Eu$_3$In$_2$As$_4$ with a single pair of Weyl points under magnetic fields. The Shubnikov de Haas (SdH) oscillation with a single frequency, as well as a linear Hall resistance with the same carrier density, is observed up to 50 Tesla, indicating a single pair of Weyl points around the Fermi level with a massless fermion ($m^* = 0.121 m_0$, $π$ Berry phase). Such a single pair of Weyl points is further confirmed by the density functional theory calculations. The magnetic ordering and band topology can be easily tuned by the external magnetic field. The field-induced MWS Eu$_3$In$_2$As$_4$ with a single pair of Weyl points is a good platform to detect chiral transport properties, including possible quantum anomalous Hall effect.

Submitted 12 March, 2024; originally announced March 2024.

arXiv:2403.07504 [pdf]

doi 10.1038/s41535-024-00635-5

Two-dimensional phase diagram of the charge density wave in doped CsV$_3$Sb$_5$

Authors: Linwei Huai, Hongyu Li, Yulei Han, Yang Luo, Shuting Peng, Zhiyuan Wei, Jianchang Shen, Bingqian Wang, Yu Miao, Xiupeng Sun, Zhipeng Ou, Bo Liu, Xiaoxiao Yu, Ziji Xiang, Min-Quan Kuang, Zhenhua Qiao, Xianhui Chen, Junfeng He

Abstract: Kagome superconductors AV$_3$Sb$_5$ (A = K, Rb and Cs) have attracted much recent attention due to the coexistence of multiple exotic orders. Among them, the charge density wave (CDW) order has been shown to host various unconventional behaviors. Here, we investigate the CDW order by a combination of both bulk and surface doping methods. While element substitutions in bulk doping change both carriers and the crystal lattice, the surface doping primarily tunes the carrier concentration. As such, our results reveal a two-dimensional phase diagram of the CDW in doped CsV$_3$Sb$_5$. In the lightly bulk doped regime, the existence of CDW order is reversible by tuning the carrier concentration. But excessive bulk doping permanently destroys the CDW, regardless of the carrier doping level. These results provide insights to the origin of the CDW from both electronic and structural degrees of freedom. They also open an avenue for manipulating the exotic CDW order in Kagome superconductors.

Submitted 12 March, 2024; originally announced March 2024.

Comments: 14 pages, 4 figures

Journal ref: npj Quantum Mater. 9, 23 (2024)

arXiv:2403.07333 [pdf, other]

Development of generic no-scale inflation

Authors: Lina Wu, **-Ke Shen, Tianjun Li, Junle Pei

Abstract: We develop generalized no-scale supergravity models of inflation, and then study the corresponding cosmological predictions as well as the formation of primordial black holes (PBHs) and scalar-induced gravitational waves (SIGWs). With a new parameter $0<α\leq 1$, the $α$-generalized no-scale supergravity provides the continuous connections among the generic no-scale supergravity from string theory compactifications. The resulting prediction of the CMB, spectrum index $n_s$, and tensor-to-scalar ratio $r$ can be highly consistent with the latest Planck/BICEP/Keck Array observations. Notably, the models with $α\neq 1$ give a smaller ratio $r\leq 10^{-3}$, which is flexible even under the anticipated tighter observational constraints at the future experiments. Additionally, these models have the potential to generate a broad-band stochastic gravitational wave background, and thus explain the NANOGrav 15yr signal. Furthermore, they predict the formation of PBHs with various mass scales, which could account for a significant portion of dark matter relic density in the Universe.

Submitted 2 June, 2024; v1 submitted 12 March, 2024; originally announced March 2024.

Comments: v2: minor modifications with updated references and benchmark points for PBH production; 19 pages, 12 figures, 2 tables. Comments are welcome

arXiv:2403.07187 [pdf, other]

UPS: Efficiently Building Foundation Models for PDE Solving via Cross-Modal Adaptation

Authors: Junhong Shen, Tanya Marwah, Ameet Talwalkar

Abstract: We present Unified PDE Solvers (UPS), a data- and compute-efficient approach to developing unified neural operators for diverse families of spatiotemporal PDEs from various domains, dimensions, and resolutions. UPS embeds different PDEs into a shared representation space and processes them using a FNO-transformer architecture. Rather than training the network from scratch, which is data-demanding and computationally expensive, we warm-start the transformer from pretrained LLMs and perform explicit alignment to reduce the modality gap while improving data and compute efficiency. The cross-modal UPS achieves state-of-the-art results on a wide range of 1D and 2D PDE families from PDEBench, outperforming existing unified models using 4 times less data and 26 times less compute. Meanwhile, it is capable of few-shot transfer to unseen PDE families and coefficients.

Submitted 23 May, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

arXiv:2403.05799 [pdf, ps, other]

Voiculescu's Theorem in Properly Infinite Factors

Authors: Don Hadwin, Minghui Ma, Junhao Shen

Abstract: This paper investigates Voiculescu's theorem on approximate equivalence in separable properly infinite factors. We establish the norm-denseness of the set of all reducible operators and prove Voiculescu's bicommutant theorem. Additionally, we extend these results to the multiplier algebras within separable type $\mathrm{III}$ factors.

Submitted 9 March, 2024; originally announced March 2024.

arXiv:2403.05689 [pdf, other]

Simulating Charged Defects at Database Scale

Authors: Jimmy-Xuan Shen, Lars F. Voss, Joel Basile Varley

Abstract: Point defects have a strong influence on the physical properties of materials, often dominating the electronic and optical behavior in semiconductors and insulators. The simulation and analysis of point defects is therefore crucial for understanding the growth and operation of materials, especially for optoelectronics applications. In this work, we present a general-purpose Python framework for the analysis of point defects in crystalline materials, as well as a generalized workflow for their treatment with high-throughput simulations. The distinguishing feature of our approach is an emphasis on a unique, unit-cell, structure-only definition of point defects which decouples the defect definition and the specific supercell representation used to simulate the defect. This allows the results of first-principles calculations to be aggregated into a database without extensive provenance information and is a crucial step in building a persistent database of point defects that can grow over time, a key component towards realizing the idea of a "defect genome" that can yield more complex relationships governing the behavior of defects in materials. We demonstrate several examples of the approach for three technologically relevant materials and highlight current pitfalls that must be considered when employing these methodologies, as well as their potential solutions.

Submitted 8 March, 2024; originally announced March 2024.

arXiv:2403.04515 [pdf]

Light-induced giant enhancement of nonreciprocal transport at KTaO3-based interfaces

Authors: Xu Zhang, Tongshuai Zhu, Shuai Zhang, Zhongqiang Chen, Anke Song, Chong Zhang, Rongzheng Gao, Wei Niu, Yequan Chen, Fucong Fei, Yilin Tai, Guoan Li, Binghui Ge, Wenkai Lou, Jie Shen, Haijun Zhang, Kai Chang, Fengqi Song, Rong Zhang, Xuefeng Wang

Abstract: Nonlinear transport is a unique functionality of noncentrosymmetric systems, which reflects profound physics, such as spin-orbit interaction, superconductivity and band geometry. However, it remains highly challenging to enhance the nonreciprocal transport for promising rectification devices. Here, we observe a light-induced giant enhancement of nonreciprocal transport at the superconducting and epitaxial CaZrO$_3$/KTaO$_3$ (111) interfaces. The nonreciprocal transport coefficient undergoes a giant increase with three orders of magnitude up to $10^5$ A$^{-1}$T$^{-1}$. Furthermore, a strong Rashba spin-orbit coupling effective field of 14.7 T is achieved with abundant high-mobility photocarriers under ultraviolet illumination, which accounts for the giant enhancement of nonreciprocal transport coefficient. Our first-principles calculations further disclose the stronger Rashba spin-orbit coupling strength and the longer relaxation time in the photocarrier excitation process, bridging the light-property quantitative relationship. Our work provides an alternative pathway to boost nonreciprocal transport in noncentrosymmetric systems and facilitates the promising applications in opto-rectification devices and spin-orbitronic devices.

Submitted 7 March, 2024; originally announced March 2024.

Comments: 38 pages, 17 figures

Journal ref: Nature Communications (2024)

arXiv:2403.03212 [pdf, other]

Performance of a modular ton-scale pixel-readout liquid argon time projection chamber

Authors: DUNE Collaboration, A. Abed Abud, B. Abi, R. Acciarri, M. A. Acero, M. R. Adames, G. Adamov, M. Adamowski, D. Adams, M. Adinolfi, C. Adriano, A. Aduszkiewicz, J. Aguilar, B. Aimard, F. Akbar, K. Allison, S. Alonso Monsalve, M. Alrashed, A. Alton, R. Alvarez, T. Alves, H. Amar, P. Amedo, J. Anderson, D. A. Andrade , et al. (1340 additional authors not shown)

Abstract: The Module-0 Demonstrator is a single-phase 600 kg liquid argon time projection chamber operated as a prototype for the DUNE liquid argon near detector. Based on the ArgonCube design concept, Module-0 features a novel 80k-channel pixelated charge readout and advanced high-coverage photon detection system. In this paper, we present an analysis of an eight-day data set consisting of 25 million cosmic ray events collected in the spring of 2021. We use this sample to demonstrate the imaging performance of the charge and light readout systems as well as the signal correlations between the two. We also report argon purity and detector uniformity measurements, and provide comparisons to detector simulations.

Submitted 5 March, 2024; originally announced March 2024.

Comments: 47 pages, 41 figures

Report number: FERMILAB-PUB-24-0073-LBNF

arXiv:2403.02603 [pdf, other]

Drug resistance revealed by in silico deep mutational scanning and mutation tracker

Authors: Dong Chen, Gengzhuo Liu, Hongyan Du, Junjie Wee, Rui Wang, Jiahui Chen, Jana Shen, Guo-Wei Wei

Abstract: As COVID-19 enters its fifth year, it continues to pose a significant global health threat, with the constantly mutating SARS-CoV-2 virus challenging drug effectiveness. A comprehensive understanding of virus-drug interactions is essential for predicting and improving drug effectiveness, especially in combating drug resistance during the pandemic. In response, the Path Laplacian Transformer-based Prospective Analysis Framework (PLFormer-PAF) has been proposed, integrating historical data analysis and predictive modeling strategies. This dual-strategy approach utilizes path topology to transform protein-ligand complexes into topological sequences, enabling the use of advanced large language models for analyzing protein-ligand interactions and enhancing its reliability with factual insights garnered from historical data. It has shown unparalleled performance in predicting binding affinity tasks across various benchmarks, including specific evaluations related to SARS-CoV-2, and assesses the impact of virus mutations on drug efficacy, offering crucial insights into potential drug resistance. The predictions align with observed mutation patterns in SARS-CoV-2, indicating that the widespread use of the Pfizer drug has led to viral evolution and reduced drug efficacy. PLFormer-PAF's capabilities extend beyond identifying drug-resistant strains, positioning it as a key tool in drug discovery research and the development of new therapeutic strategies against fast-mutating viruses like COVID-19.

Submitted 4 March, 2024; originally announced March 2024.

arXiv:2403.01854 [pdf, other]

Quantum counterdiabatic driving with local control

Authors: Changhao Li, Jiayu Shen, Ruslan Shaydulin, Marco Pistoia

Abstract: Suppression of diabatic transitions in quantum adiabatic evolution stands as a significant challenge for ground state preparations. Counterdiabatic driving has been proposed to compensate for diabatic losses and achieve shortcut to adiabaticity. However, its implementation necessitates the generation of adiabatic gauge potential, which requires knowledge of the spectral gap of instantaneous Hamiltonians and involves highly non-local drivings in many-body systems. In this work, we consider local counterdiabatic (LCD) driving with approximate adiabatic gauge potential. Using transverse-field Ising model as an example, we present an in-depth study of the performance and optimization of LCD protocols. We then propose a novel two-step protocol based on LCD and simple local single-body control to further improve the performance. The optimization of these LCD-based protocols does not require knowledge of instantaneous Hamiltonians, and only additional local driving is involved. To benchmark the performance of LCD and the proposed local control-enhanced LCD technique, we experimentally implement digitized adiabatic quantum evolution in a trapped-ion system. We characterize the quality of the prepared states and explore the scaling behavior with system size up to 14 qubits. Our demonstration of quantum shortcut to adiabaticity opens a path towards preparing ground states of complex systems with accessible local controls.

Submitted 4 March, 2024; originally announced March 2024.

Comments: 28 pages, 13 figures

arXiv:2403.01544 [pdf, other]

Local weak convergence and its applications

Authors: Sayan Banerjee, Shankar Bhamidi, Jianan Shen, Seth Parker Young

Abstract: Motivated in part by understanding average case analysis of fundamental algorithms in computer science, and in part by the wide array of network data available over the last decade, a variety of random graph models, with corresponding processes on these objects, have been proposed over the last few years. The main goal of this paper is to give an overview of local weak convergence, which has emerged as a major technique for understanding large network asymptotics for a wide array of functionals and models. As opposed to a survey, the main goal is to try to explain some of the major concepts and their use to junior researchers in the field and indicate potential resources for further reading.

Submitted 3 March, 2024; originally announced March 2024.

Comments: 33 pages. Submitted to a special issue in honor of K.R. Parthasarathy

arXiv:2403.01383 [pdf]

Postharvest litchi (Litchi chinensis Sonn.) quality preservation by alginate oligosaccharides

Authors: Jianlie Shen, Shulin Wan, Haidong Tan

Abstract: This study investigates the efficacy of alginate oligosaccharides, derived from a novel alginate lyase expressed in E. coli (Pet21a-alginate lyase), in preserving the postharvest quality of litchi (Litchi chinensis Sonn.) fruits. The alginate lyase, characterized by Huang et al. (2013), was employed to produce AOS through enzymatic degradation of alginate. The resulting oligosaccharides were applied to litchi fruits harvested from Guangzhou Zengcheng to evaluate their impact on various quality parameters under controlled storage conditions. The study focused on measuring the effects of alginate oligosaccharide treatment on the fruits' color retention, water loss rate, hardness, and susceptibility to mold infection, under a set relative humidity and temperature. Results demonstrated significant improvements in the treated fruits, with enhanced color retention, reduced water loss, maintained hardness, and lower rates of mold infection compared to untreated controls. These findings suggest that AOS offer a promising natural alternative for extending the shelf life and maintaining the quality of litchi fruits postharvest.

Submitted 2 March, 2024; originally announced March 2024.

arXiv:2403.00165 [pdf, other]

TELEClass: Taxonomy Enrichment and LLM-Enhanced Hierarchical Text Classification with Minimal Supervision

Authors: Yunyi Zhang, Ruozhen Yang, Xueqiang Xu, Rui Li, **feng Xiao, Jiaming Shen, Jiawei Han

Abstract: Hierarchical text classification aims to categorize each document into a set of classes in a label taxonomy. Most earlier works focus on fully or semi-supervised methods that require a large amount of human annotated data which is costly and time-consuming to acquire. To alleviate human efforts, in this paper, we work on hierarchical text classification with the minimal amount of supervision: using the sole class name of each node as the only supervision. Recently, large language models (LLM) show competitive performance on various tasks through zero-shot prompting, but this method performs poorly in the hierarchical setting, because it is ineffective to include the large and structured label space in a prompt. On the other hand, previous weakly-supervised hierarchical text classification methods only utilize the raw taxonomy skeleton and ignore the rich information hidden in the text corpus that can serve as additional class-indicative features. To tackle the above challenges, we propose TELEClass, Taxonomy Enrichment and LLM-Enhanced weakly-supervised hierarchical text Classification, which (1) automatically enriches the label taxonomy with class-indicative terms to facilitate classifier training and (2) utilizes LLMs for both data annotation and creation tailored for the hierarchical label space. Experiments show that TELEClass can outperform previous weakly-supervised methods and LLM-based zero-shot prompting methods on two public datasets.

Submitted 16 June, 2024; v1 submitted 29 February, 2024; originally announced March 2024.

arXiv:2402.19376 [pdf, other]

OzMAC: An Energy-Efficient Sparsity-Exploiting Multiply-Accumulate-Unit Design for DL Inference

Authors: Harideep Nair, Prabhu Vellaisamy, Tsung-Han Lin, Perry Wang, Shawn Blanton, John Paul Shen

Abstract: General Matrix Multiply (GEMM) hardware, employing large arrays of multiply-accumulate (MAC) units, performs the bulk of the computation in deep learning (DL). Recent trends have established 8-bit integer (INT8) as the most widely used precision for DL inference. This paper proposes a novel MAC design capable of dynamically exploiting bit sparsity (i.e., number of '0' bits within a binary value) in input data to achieve significant improvements in area, power and energy. The proposed architecture, called OzMAC (Omit-zero-MAC), skips over zeros within a binary input value and performs simple shift-and-add-based compute in place of expensive multipliers. We implement OzMAC in SystemVerilog and present post-synthesis performance-power-area (PPA) results using the commercial TSMC N5 (5nm) process node. Using eight pretrained INT8 deep neural networks (DNNs) as benchmarks, we demonstrate the existence of high bit sparsity in real DNN workloads and show that 8-bit OzMAC improves all three metrics of area, power, and energy significantly by 21%, 70%, and 28%, respectively. Similar improvements are achieved when scaling data precisions (4, 8, 16 bits) and clock frequencies (0.5 GHz, 1 GHz, 1.5 GHz). For the 8-bit OzMAC, scaling its frequency to normalize the throughput relative to conventional MAC, it still achieves 30% improvement on both power and energy.

Submitted 29 February, 2024; originally announced February 2024.

arXiv:2402.19350 [pdf, other]

Prompting Explicit and Implicit Knowledge for Multi-hop Question Answering Based on Human Reading Process

Authors: Guangming Huang, Yunfei Long, Cunjin Luo, Jiaxing Shen, Xia Sun

Abstract: Pre-trained language models (PLMs) leverage chains-of-thought (CoT) to simulate human reasoning and inference processes, achieving proficient performance in multi-hop QA. However, a gap persists between PLMs' reasoning abilities and those of humans when tackling complex problems. Psychological studies suggest a vital connection between explicit information in passages and human prior knowledge during reading. Nevertheless, current research has given insufficient attention to linking input passages and PLMs' pre-training-based knowledge from the perspective of human cognition studies. In this study, we introduce a Prompting Explicit and Implicit knowledge (PEI) framework, which uses prompts to connect explicit and implicit knowledge, aligning with the human reading process for multi-hop QA. We consider the input passages as explicit knowledge, employing them to elicit implicit knowledge through unified prompt reasoning. Furthermore, our model incorporates type-specific reasoning via prompts, a form of implicit knowledge. Experimental results show that PEI performs comparably to the state-of-the-art on HotpotQA. Ablation studies confirm the efficacy of our model in bridging and integrating explicit and implicit knowledge.

Submitted 27 June, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

Comments: This paper has been accepted at COLING 2024

arXiv:2402.19220 [pdf, other]

Thermal stabilities Landscape of A$_2$BB$^{\prime}$O$_6$ compounds

Authors: Yateng Wang, Bianca Baldassarri, Jiahong Shen, Jiangang He, Chris Wolverton

Abstract: Perovskite oxides have been extensively studied for their wide range of compositions and structures, as well as their valuable properties for various applications. Expanding from single perovskite ABO$_3$ to double perovskite $A_2BB^{\prime}$O$_6$ significantly enhances the ability to tailor specific physical and chemical properties. However, the vast number of potential compositions of $A_2BB^{\prime}$O$_6$ makes it impractical to explore them all experimentally. In this study, we conducted high-throughput calculations to systematically investigate the structures and stabilities of 4,900 $A_2BB^{\prime}$O$_6$ compositions (with $A$ = Ca, Sr, Ba, and La; $B$ and $B^{\prime}$ representing metal elements) through over 42,000 density functional theory (DFT) calculations. Our analysis led to the discovery of more than 1,500 new synthesizable $A_2BB^{\prime}$O$_6$ compounds, with over 1,100 of them exhibiting double perovskite structures, predominantly in the $P2_1/c$ space group. By leveraging the high-throughput dataset, we developed machine learning models that achieved mean absolute errors of 0.0444 and 0.0330 eV/atom for formation energy and decomposition energy, respectively. Using these models, we identified 803 stable or metastable compositions beyond the chemical space covered in our initial calculations, with 612 of them having DFT-validated decomposition energies below 0.1 eV/atom, resulting in a success rate of 76.2%. This study delineates the stability landscape of $A_2BB^{\prime}$O$_6$ compounds and offers new insights for the exploration of these materials.

Submitted 29 February, 2024; originally announced February 2024.

arXiv:2402.18278 [pdf, other]

EAN-MapNet: Efficient Vectorized HD Map Construction with Anchor Neighborhoods

Authors: Huiyuan Xiong, Jun Shen, Taohong Zhu, Yuelong Pan

Abstract: High-definition (HD) maps are crucial for autonomous driving systems. Most existing works design map element detection heads based on the DETR decoder. However, the initial queries lack explicit incorporation of physical positional information, and vanilla self-attention entails high computational complexity. Therefore, we propose EAN-MapNet for Efficiently constructing HD maps using Anchor Neighborhoods. Firstly, we design query units based on the anchor neighborhoods, allowing non-neighborhood central anchors to effectively assist in fitting the neighborhood central anchors to the target points representing map elements. Then, we propose grouped local self-attention (GL-SA) by leveraging the relative instance relationship among the queries. This facilitates direct feature interaction among queries of the same instances, while innovatively employing local queries as intermediaries for interaction among queries from different instances. Consequently, GL-SA significantly reduces the computational complexity of self-attention while ensuring ample feature interaction among queries. On the nuScenes dataset, EAN-MapNet achieves state-of-the-art performance with 63.0 mAP after training for 24 epochs, surpassing MapTR by 12.7 mAP. Furthermore, it considerably reduces memory consumption by 8198M compared to MapTRv2.

Submitted 7 March, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

arXiv:2402.18077 [pdf, ps, other]

Locating heating channels of the solar corona in a plage region with the aid of high-resolution 10830 Å filtergrams

Authors: Parida Hashim, Fangyu Xu, Ya Wang, Weijie Men, Jinhua Shen, Yingna Su, Jianping Li, Zhenyu Jin, Haisheng Ji

Abstract: In this paper, with a set of high-resolution He I 10830 Å filtergrams, we select an area in a plage, very likely an EUV moss area, as an interface layer to follow the clues of coronal heating channels down to the photosphere. The filtergrams are obtained from the 1-meter aperture New Vacuum Solar Telescope (NVST). We make a distinction between the darker and the brighter regions in the selected area and name the two regions enhanced absorption patches (EAPs) and low absorption patches (LAPs). With well-aligned, nearly simultaneous data from multiple channels of the AIA and the continuum of the HMI on board SDO, we compare the EUV/UV emissions, emission measure, mean temperature, and continuum intensity in the two kinds of regions. The following progress is made: 1) The mean EUV emissions over EAPs are mostly stronger than the corresponding emissions over LAPs except for the emission at 335 Å. The UV emissions at 1600 and 1700 Å fail to capture the difference between the two regions. 2) In the logarithmic temperature range of 5.6-6.2, EAPs have higher EUV emission measure than LAPs, but they have lower mean coronal temperature. 3) The mean continuum intensity over EAPs is lower. Based on the above progress, we suggest that the energy for coronal heating in the moss region can be traced down to some areas in intergranular lanes with enhanced density of both cool and hot material. The lower temperature over the EAPs is due to the greater fraction of cool material over there.

Submitted 28 February, 2024; originally announced February 2024.

Comments: ApJ accepted for publication. 11 pages, 7 figures

arXiv:2402.17244 [pdf]

doi 10.1016/j.ijpt.2024.100020

The status and challenges for prostate SBRT treatments in United States proton therapy centers: An NRG Oncology practice survey

Authors: Jiajian Shen, Paige A. Taylor, Carlos E. Vargas, Minglei Kang, Jatinder Saini, Jun Zhou, Peilong Wang, Wei Liu, Charles B. Simone II, Ying Xiao, Liyong Lin

Abstract: A survey was designed to inquire about the practice of proton SBRT treatment for prostate cancer. The survey was distributed to all 30 proton therapy centers in the United States that participate in the National Clinical Trial Network in Feb. 2023. The survey focused on usage, patient selection criteria, prescriptions, target contours, dose constraints, treatment plan optimization and evaluation methods, patient-specific QA, and IGRT methods. Results: We received responses from 25 centers (83% participation). Only 8 respondent proton centers (32%) reported performing SBRT of the prostate. The remaining 17 centers cited three primary reasons for not offering this treatment: no clinical need, lack of volumetric imaging, and/or lack of clinical evidence. Only 1 center cited the reduction in overall reimbursement as a concern for not offering prostate SBRT. Several common practices among the 8 centers offering SBRT for the prostate were noted, such as using Hydrogel spacers, fiducial markers, and MRI for target delineation. Most proton centers (87.5%) utilized pencil beam scanning (PBS) delivery and completed Imaging and Radiation Oncology Core (IROC) phantom credentialing. Treatment planning typically used parallel opposed lateral beams, and consistent parameters for setup and range uncertainties were used for plan optimization and robustness evaluation. Measurement-based patient-specific QA, beam delivery every other day, fiducial contours for IGRT, and total doses of 35-40 GyRBE were consistent across all centers. However, there was no consensus on the risk levels for patient selection. Conclusion: Prostate SBRT is used in about 1/3 of proton centers in the US. There was significant consistency in practices among proton centers treating with proton SBRT. It is possible that the adoption of proton SBRT may become more common if proton SBRT is more commonly offered in clinical trials.

Submitted 27 February, 2024; originally announced February 2024.

arXiv:2402.17205 [pdf, other]

Measuring Vision-Language STEM Skills of Neural Models

Authors: Jianhao Shen, Ye Yuan, Srbuhi Mirzoyan, Ming Zhang, Chenguang Wang

Abstract: We introduce a new challenge to test the STEM skills of neural models. Real-world problems often require solutions that combine knowledge from STEM (science, technology, engineering, and math). Unlike existing datasets, our dataset requires the understanding of multimodal vision-language information of STEM. Our dataset is one of the largest and most comprehensive for the challenge. It includes 448 skills and 1,073,146 questions spanning all STEM subjects. Compared to existing datasets that often focus on examining expert-level ability, our dataset includes fundamental skills and questions designed based on the K-12 curriculum. We also add state-of-the-art foundation models such as CLIP and GPT-3.5-Turbo to our benchmark. Results show that the recent model advances only help master a very limited number of lower grade-level skills (2.5% in the third grade) in our dataset. In fact, these models are still well below (averaging 54.7%) the performance of elementary students, not to mention near expert-level performance. To understand and increase the performance on our dataset, we teach the models on a training split of our dataset. Even though we observe improved performance, the model performance remains relatively low compared to average elementary students. To solve STEM problems, we will need novel algorithmic innovations from the community.

Submitted 22 May, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

Comments: Accepted in ICLR 2024

arXiv:2402.17149 [pdf, other]

Spatial Distribution of Inertial Particles in Turbulent Taylor-Couette Flow

Authors: Hao Jiang, Zhi-ming Lu, Bo-fu Wang, Xiao-hui Meng, Jie Shen, Kai Leong Chong

Abstract: This study investigates the spatial distribution of inertial particles in turbulent Taylor-Couette flow. Direct numerical simulations are performed using a one-way coupled Eulerian-Lagrangian approach, with a fixed inner wall Reynolds number of 2500 for the carrier flow, while the particle Stokes number varies from 0.034 to 1 for the dispersed phase. We first examine the issue of preferential concentration of particles near the outer wall region. Employing two-dimensional (2D) Voronoi analysis, we observe a pronounced particle clustering with increasing $St$, particularly evident in regions of low fluid velocity. Additionally, we investigate the concentration balance equation, inspired by the work of Johnson et al. (2020), to examine particle radial distribution. We discern the predominant sources of influence, namely biased sampling, turbophoresis, and centrifugal effects. Across all cases, centrifugal force emerges as the primary driver, causing particle migration towards the outer wall. Biased sampling predominantly affects smaller inertial particles, driving them towards the inner wall due to sampling within Taylor rolls with inward radial velocity. Conversely, turbophoresis primarily impacts larger inertial particles, inducing migration towards both walls where turbulent intensity is weaker compared to the bulk. With the revealed physics, our work provides a basis for predicting and controlling particle movement and distribution in industrial applications.

Submitted 26 February, 2024; originally announced February 2024.

Comments: 18 pages, 11 figures

arXiv:2402.16057 [pdf]

Fractal Gripper: Adaptive manipulator with mode switching

Authors: Jiaxin Huang, Jian Shen, Yilin Zheng, Zhigong Song

Abstract: Although the multi-jointed underactuated manipulator is highly dexterous, its grasping capacity does not match that of the parallel jaw gripper. This work introduces a fractal gripper to enhance the grasping capacity of multi-joint underactuated manipulators, preserving their passive clamping features. We describe in detail the working principle and manufacturing process of the fractal gripper. This work, inspired by the 'Fractal Vise' structure, resulted in the invention of a fractal gripper with mode switching capabilities. The fractal gripper inherits the inherent adaptive properties of the fractal structure and realizes the self-resetting function by integrating a spring into the original design, thereby enhancing the efficiency of object grasping tasks. The fractal gripper prevents object damage by distributing pressure evenly and applying it at multiple points through its fractal structure during closure. Objects of various shapes are effectively grasped by the fractal gripper, which ensures a safe and secure grasp. The superior performance is provided by the force distribution characteristics of the fractal gripper. By applying the flexible polymer PDMS, which possesses superior elasticity, to the fractal structure's wrapping surface, potential scratching during grasping is effectively prevented, thus protecting the object's geometric surface. Grab experiments with objects of diverse shapes and sizes confirm the fractal gripper's multi-scale adaptability and superior grasping stability.

Submitted 25 February, 2024; originally announced February 2024.

arXiv:2402.15741 [pdf, other]

Observation of the In-plane Anomalous Hall Effect induced by Octupole in Magnetization Space

Authors: Wenzhi Peng, Zheng Liu, Haolin Pan, Peng Wang, Yulong Chen, Jiachen Zhang, Xuhao Yu, Jinhui Shen, Mingmin Yang, Qian Niu, Yang Gao, Dazhi Hou

Abstract: The Anomalous Hall Effect (AHE) manifests as a transverse voltage proportional to magnetization in ferromagnetic materials under the application of a charge current, being an indispensable tool for probing magnetism, especially in nanoscale devices. However, the AHE is primarily sensitive to out-of-plane magnetization, thereby hindering its capacity to discern the in-plane magnetization, a characteristic prevalent in ferromagnetic films. Here we challenge this conventional understanding by demonstrating the in-plane magnetization-induced AHE in iron and nickel, two ubiquitous ferromagnets. This observation of the in-plane AHE is remarkable as it contradicts existing theories that forbid such phenomena in cubic crystal systems. We trace the origin of this unanticipated phenomenon to a hitherto unconsidered octupole of the anomalous Hall conductivity in the magnetization space, a mechanism we propose could enable the detection of in-plane AHE in a wide range of ferromagnetic materials. This work realizes the in-plane AHE in common ferromagnets by exploiting the anomalous Hall conductivity octupole, revealing a new physical origin of the AHE and promising to revolutionize the design of magnetic devices and sensors.

Submitted 24 February, 2024; originally announced February 2024.

arXiv:2402.15142 [pdf, other]

Higher-Order Energy-Decreasing Exponential Time Differencing Runge-Kutta methods for Gradient Flows

Authors: Zhaohui Fu, Jie Shen, Jiang Yang

Abstract: In this paper, we develop a general framework for constructing higher-order, unconditionally energy-stable exponential time differencing Runge-Kutta methods applicable to a range of gradient flows. Specifically, we identify conditions sufficient for ETDRK schemes to maintain the original energy dissipation. Our analysis reveals that commonly used third-order and fourth-order ETDRK schemes fail to meet these conditions. To address this, we introduce new third-order ETDRK schemes, designed with appropriate stabilization, which satisfy these conditions and thus guarantee the unconditional energy decaying property. We conduct extensive numerical experiments with these new schemes to verify their accuracy, stability, behavior under large time steps, long-term evolution, and adaptive time stepping strategy across various gradient flows. This study is the first to examine the unconditional energy stability of high-order ETDRK methods, and we are optimistic that our framework will enable the development of ETDRK schemes beyond the third order that are unconditionally energy stable.

Submitted 23 February, 2024; originally announced February 2024.

arXiv:2402.14477 [pdf, other]

Pressure tunable magnetic skyrmion phase in Co8Zn8Mn4 single crystals

Authors: Zhun Li, Xinrun Mi, Xinming Wang, Jian Lyu, Na Su, Aifeng Wang, Yisheng Chai, Bao Yuan, Wanju Luo, Hui Cheng, Jianxiang Gao, Hongliang Wang, Lijie Hao, Mingquan He, Junying Shen, Young Sun, Xin Tong

Abstract: In a magnetic skyrmion phase, magnetic moments form vortex-like topological textures which are of both fundamental and industrial interest. In $\beta$-Mn-type Co-Zn-Mn alloys, chiral magnetic skyrmions emerge above room temperature, providing a unique system for studying skyrmion physics and exploring spintronics applications. However, the magnetic skyrmion phase is typically confined to a narrow temperature ($T$) and magnetic field ($H$) range. Here, we demonstrate that hydrostatic pressure can expand the skyrmion phase in the $T-H$ phase diagram of single-crystalline Co$_8$Zn$_8$Mn$_4$. At ambient pressure, signatures of skyrmions are seen within $T\sim302-308$ K and $H\sim50-100$ Oe. Applying a moderate pressure of 6 kbar extends this range to $T\sim300-310$ K and $H\sim50-150$ Oe. However, further escalation of pressure to 10 kbar results in a slight contraction of the skyrmion phase. These findings underscore the sensitivity of the skyrmion phase in Co$_8$Zn$_8$Mn$_4$ to external pressures, and hint at the potential of strain engineering, particularly in $\beta$-Mn-type Co-Zn-Mn thin films, as a promising avenue to customize the skyrmion phase.

Submitted 22 February, 2024; originally announced February 2024.

Comments: 7 pages, 4 figures

arXiv:2402.14008 [pdf, other]

OlympiadBench: A Challenging Benchmark for Promoting AGI with Olympiad-Level Bilingual Multimodal Scientific Problems

Authors: Chaoqun He, Renjie Luo, Yuzhuo Bai, Shengding Hu, Zhen Leng Thai, Junhao Shen, Jinyi Hu, Xu Han, Yujie Huang, Yuxiang Zhang, Jie Liu, Lei Qi, Zhiyuan Liu, Maosong Sun

Abstract: Recent advancements have seen Large Language Models (LLMs) and Large Multimodal Models (LMMs) surpassing general human capabilities in various tasks, approaching the proficiency level of human experts across multiple domains. With traditional benchmarks becoming less challenging for these models, new rigorous challenges are essential to gauge their advanced abilities. In this work, we present OlympiadBench, an Olympiad-level bilingual multimodal scientific benchmark, featuring 8,476 problems from Olympiad-level mathematics and physics competitions, including the Chinese college entrance exam. Each problem is detailed with expert-level annotations for step-by-step reasoning. Evaluating top-tier models on OlympiadBench, we implement a comprehensive assessment methodology to accurately evaluate model responses. Notably, the best-performing model, GPT-4V, attains an average score of 17.97% on OlympiadBench, with a mere 10.74% in physics, highlighting the benchmark's rigor and the intricacy of physical reasoning. Our analysis of GPT-4V points out prevalent issues with hallucinations, knowledge omissions, and logical fallacies. We hope that our challenging benchmark can serve as a valuable resource for helping future AGI research endeavors. The data and evaluation code are available at https://github.com/OpenBMB/OlympiadBench

Submitted 6 June, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

Comments: Accepted by ACL 2024 (main), update

arXiv:2402.13435 [pdf, other]

Learning to Retrieve for Job Matching

Authors: Jianqiang Shen, Yuchin Juan, Shaobo Zhang, Ping Liu, Wen Pu, Sriram Vasudevan, Qingquan Song, Fedor Borisyuk, Kay Qianqi Shen, Haichao Wei, Yunxiang Ren, Yeou S. Chiou, Sicong Kuang, Yuan Yin, Ben Zheng, Muchen Wu, Shaghayegh Gharghabi, Xiaoqing Wang, Huichao Xue, Qi Guo, Daniel Hewlett, Luke Simon, Liangjie Hong, Wenjing Zhang

Abstract: Web-scale search systems typically tackle the scalability challenge with a two-step paradigm: retrieval and ranking. The retrieval step, also known as candidate selection, often involves extracting standardized entities, creating an inverted index, and performing term matching for retrieval. Such traditional methods require manual and time-consuming development of query models. In this paper, we discuss applying learning-to-retrieve technology to enhance LinkedIn's job search and recommendation systems. In the realm of promoted jobs, the key objective is to improve the quality of applicants, thereby delivering value to recruiter customers. To achieve this, we leverage confirmed hire data to construct a graph that evaluates a seeker's qualification for a job, and utilize learned links for retrieval. Our learned model is easy to explain, debug, and adjust. On the other hand, the focus for organic jobs is to optimize seeker engagement. We accomplished this by training embeddings for personalized retrieval, fortified by a set of rules derived from the categorization of member feedback. In addition to a solution based on a conventional inverted index, we developed an on-GPU solution capable of supporting both KNN and term matching efficiently.

Submitted 20 February, 2024; originally announced February 2024.

arXiv:2402.13430 [pdf, other]

LinkSAGE: Optimizing Job Matching Using Graph Neural Networks

Authors: Ping Liu, Haichao Wei, Xiaochen Hou, Jianqiang Shen, Shihai He, Kay Qianqi Shen, Zhujun Chen, Fedor Borisyuk, Daniel Hewlett, Liang Wu, Srikant Veeraraghavan, Alex Tsun, Chengming Jiang, Wenjing Zhang

Abstract: We present LinkSAGE, an innovative framework that integrates Graph Neural Networks (GNNs) into large-scale personalized job matching systems, designed to address the complex dynamics of LinkedIn's extensive professional network. Our approach capitalizes on a novel job marketplace graph, the largest and most intricate of its kind in industry, with billions of nodes and edges. This graph is not merely extensive but also richly detailed, encompassing member and job nodes along with key attributes, thus creating an expansive and interwoven network. A key innovation in LinkSAGE is its training and serving methodology, which effectively combines inductive graph learning on a heterogeneous, evolving graph with an encoder-decoder GNN model. This methodology decouples the training of the GNN model from that of existing Deep Neural Net (DNN) models, eliminating the need for frequent GNN retraining while maintaining up-to-date graph signals in near real time, allowing for the effective integration of GNN insights through transfer learning. The subsequent nearline inference system serves the GNN encoder within a real-world setting, significantly reducing online latency and obviating the need for costly real-time GNN infrastructure. Validated across multiple online A/B tests in diverse product scenarios, LinkSAGE demonstrates marked improvements in member engagement, relevance matching, and member retention, confirming its generalizability and practical impact.

Submitted 20 February, 2024; originally announced February 2024.

Showing 51–100 of 1,350 results for author: Shen, J