Search | arXiv e-print repository

Secure Information Embedding and Extraction in Forensic 3D Fingerprinting

Authors: Canran Wang, Jinwen Wang, Mi Zhou, Vinh Pham, Senyue Hao, Chao Zhou, Ning Zhang, Netanel Raviv

Abstract: The prevalence of 3D printing poses a significant risk to public safety, as any individual with internet access and a commodity printer is able to produce untraceable firearms, keys, counterfeit products, etc. To aid government authorities in combating these new security threats, several approaches have been taken to tag 3D-prints with identifying information. Known as fingerprints, this information is written into the object using various bit embedding techniques; examples include varying the height of the molten thermoplastic layers, and depositing metallic powder with different magnetic properties. Yet, the practicality of these techniques in real-world forensic settings is hindered by the adversarial nature of this problem. That is, the 3D-printing process is out of reach of any law enforcement agencies; it is the adversary who controls all aspects of printing and possesses the printed object. To combat these threats, law enforcement agencies can regulate the manufacturing of 3D printers, on which they may enforce a fingerprinting scheme, and collect adversarially tampered remains (e.g., fragments of a broken 3D-printed firearm) during forensic investigation. Therefore, it is important to devise fingerprinting techniques so that the fingerprint could be extracted even if printing is carried out by the adversary.
To this end, we present SIDE (Secure Information Embedding and Extraction), a fingerprinting framework that tackles the adversarial nature of forensic fingerprinting in 3D prints by offering both secure information embedding and secure information extraction.

Submitted 12 June, 2024; v1 submitted 7 March, 2024; originally announced March 2024.

arXiv:2403.04459 [pdf, ps, other]

An efficient method for calculating resonant modes in biperiodic photonic structures

Authors: Nan Zhang, Ya Yan Lu

Abstract: Many photonic devices, such as photonic crystal slabs, cross gratings, and periodic metasurfaces, are biperiodic structures with two independent periodic directions, and are sandwiched between two homogeneous media. Many applications of these devices are closely related to resonance phenomena. Therefore, efficient computation of resonant modes is crucial in device design and structure analysis. Since resonant modes satisfy outgoing radiation conditions, perfectly matched layers (PMLs) are usually used to truncate the unbounded spatial variable perpendicular to the periodic directions. In this paper, we develop an efficient method without using PMLs to calculate resonant modes in biperiodic structures. We reduce the original eigenvalue problem to a small matrix nonlinear eigenvalue problem which is solved by the contour integral method. Numerical examples show that our method is efficient with respect to memory usage and CPU time, free of spurious solutions, and determines degenerate resonant modes without any difficulty.

Submitted 7 March, 2024; originally announced March 2024.

arXiv:2403.04267 [pdf, other]

Parallel numerical simulation of impact crater with perfect matched layers

Authors: Huacheng Li, Zongyu Yue, Nan Zhang, Jinhai Zhang, Zhongzheng Miao

Abstract: Impact craters are the primary geomorphic features on the surfaces of celestial bodies such as the Moon, and their formation has significant implications for the evolutionary history of the celestial body. The study of the impact crater formation process relies mainly on numerical simulation methods, with two-dimensional simulations capable of reproducing general patterns of impact processes while conserving computational resources. However, to mitigate the artificial reflections of shock waves at numerical boundaries, a common approach involves expanding the computational domain, greatly reducing the efficiency of numerical simulations. In this study, we developed a novel two-dimensional code SALEc-2D that employs the perfectly matched layer (PML) method to suppress artificial reflections at numerical boundaries. This method enhances computational efficiency while ensuring reliable results. Additionally, we implemented MPI parallel algorithms in the new code to further improve computational efficiency. Simulations that would take over ten hours using the conventional iSALE-2D code can now be completed in less than half an hour using our code, SALEc-2D, on a standard computer. We anticipate that our code will find widespread application in numerical simulations of impact craters in the future.

Submitted 7 March, 2024; originally announced March 2024.

Comments: 17 pages, 8 figures

arXiv:2403.03101 [pdf, other]

KnowAgent: Knowledge-Augmented Planning for LLM-Based Agents

Authors: Yuqi Zhu, Shuofei Qiao, Yixin Ou, Shumin Deng, Ningyu Zhang, Shiwei Lyu, Yue Shen, Lei Liang, Jinjie Gu, Huajun Chen

Abstract: Large Language Models (LLMs) have demonstrated great potential in complex reasoning tasks, yet they fall short when tackling more sophisticated challenges, especially when interacting with environments through generating executable actions. This inadequacy primarily stems from the lack of built-in action knowledge in language agents, which fails to effectively guide the planning trajectories during task solving and results in planning hallucination. To address this issue, we introduce KnowAgent, a novel approach designed to enhance the planning capabilities of LLMs by incorporating explicit action knowledge. Specifically, KnowAgent employs an action knowledge base and a knowledgeable self-learning strategy to constrain the action path during planning, enabling more reasonable trajectory synthesis, and thereby enhancing the planning performance of language agents. Experimental results on HotpotQA and ALFWorld based on various backbone models demonstrate that KnowAgent can achieve comparable or superior performance to existing baselines. Further analysis indicates the effectiveness of KnowAgent in mitigating planning hallucinations. Code is available at https://github.com/zjunlp/KnowAgent.

Submitted 5 March, 2024; originally announced March 2024.

Comments: Work in progress. Project page: https://zjunlp.github.io/project/KnowAgent/ Code: https://github.com/zjunlp/KnowAgent

arXiv:2403.02075 [pdf, other]

DiffMOT: A Real-time Diffusion-based Multiple Object Tracker with Non-linear Prediction

Authors: Weiyi Lv, Yuhang Huang, Ning Zhang, Ruei-Sung Lin, Mei Han, Dan Zeng

Abstract: In Multiple Object Tracking, objects often exhibit non-linear motion of acceleration and deceleration, with irregular direction changes. Tracking-by-detection (TBD) trackers with Kalman Filter motion prediction work well in pedestrian-dominant scenarios but fall short in complex situations when multiple objects perform non-linear and diverse motion simultaneously. To tackle the complex non-linear motion, we propose a real-time diffusion-based MOT approach named DiffMOT. Specifically, for the motion predictor component, we propose a novel Decoupled Diffusion-based Motion Predictor (D$^2$MP). It models the entire distribution of various motion presented by the data as a whole. It also predicts an individual object's motion conditioning on an individual's historical motion information. Furthermore, it optimizes the diffusion process with far fewer sampling steps. As a MOT tracker, DiffMOT is real-time at 22.7 FPS, and also outperforms the state-of-the-art on DanceTrack and SportsMOT datasets with $62.3\%$ and $76.2\%$ in HOTA metrics, respectively. To the best of our knowledge, DiffMOT is the first to introduce a diffusion probabilistic model into the MOT to tackle non-linear motion prediction.

Submitted 20 March, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

Comments: CVPR2024

arXiv:2402.18649 [pdf, other]

A New Era in LLM Security: Exploring Security Concerns in Real-World LLM-based Systems

Authors: Fangzhou Wu, Ning Zhang, Somesh Jha, Patrick McDaniel, Chaowei Xiao

Abstract: Large Language Model (LLM) systems are inherently compositional, with individual LLM serving as the core foundation with additional layers of objects such as plugins, sandbox, and so on. Along with the great potential, there are also increasing concerns over the security of such probabilistic intelligent systems. However, existing studies on LLM security often focus on individual LLM, but without examining the ecosystem through the lens of LLM systems with other objects (e.g., Frontend, Webtool, Sandbox, and so on). In this paper, we systematically analyze the security of LLM systems, instead of focusing on the individual LLMs. To do so, we build on top of the information flow and formulate the security of LLM systems as constraints on the alignment of the information flow within LLM and between LLM and other objects. Based on this construction and the unique probabilistic nature of LLM, the attack surface of the LLM system can be decomposed into three key components: (1) multi-layer security analysis, (2) analysis of the existence of constraints, and (3) analysis of the robustness of these constraints. To ground this new attack surface, we propose a multi-layer and multi-step approach and apply it to the state-of-the-art LLM system, OpenAI GPT4. Our investigation exposes several security issues, not just within the LLM model itself but also in its integration with other components. We found that although the OpenAI GPT4 has designed numerous safety constraints to improve its safety features, these safety constraints are still vulnerable to attackers.
To further demonstrate the real-world threats of our discovered vulnerabilities, we construct an end-to-end attack where an adversary can illicitly acquire the user's chat history, all without the need to manipulate the user's input or gain direct access to OpenAI GPT4. Our demo is in the link: https://fzwark.github.io/LLM-System-Attack-Demo/

Submitted 28 February, 2024; originally announced February 2024.

arXiv:2402.18215 [pdf]

Gravity effects on a bio-inspired self-burrowing probe in granular soils

Authors: Bowen Wang, Ningning Zhang, Yuyan Chen, Alejandro Martinez, Raul Fuentes

Abstract: In recent years, self-burrowing probes have been studied since they can be suitable for soil monitoring in locations with limited access such as outer space bodies and underneath existing structures. We study the performance of a self-burrowing probe under different gravity conditions, from low gravity (i.e., 1/6g, 1/3g and 1g) to high gravity (i.e., 5g, 10g and 15g), specifically in terms of penetration distance and energy consumption. Results show that the probe reaches efficient penetration in all gravity conditions and that it achieves larger penetration distances in high gravity conditions. However, the penetration efficiency, shown as unit energy per meter, is higher in low gravity. Additionally, we prove that a simple dimensional analysis provides reasonable scaling factors for first order effects in forces, velocities and energy. The findings in this study give confidence to the potential use of self-burrowing probes in campaigns of soil testing and sensor deployment in outer space or centrifuges in which the gravity conditions can differ from Earth.

Submitted 28 February, 2024; originally announced February 2024.

arXiv:2402.17282 [pdf, other]

doi 10.1051/0004-6361/202449200

Distribution of number of peaks within a long gamma-ray burst

Authors: C. Guidorzi, M. Sartori, R. Maccary, A. Tsvetkova, L. Amati, L. Bazzanini, M. Bulla, A. E. Camisasca, L. Ferro, F. Frontera, C. K. Li, S. L. Xiong, S. N. Zhang

Abstract: The variety of long duration gamma-ray burst (LGRB) light curves (LCs) encode a wealth of information on how LGRB engines release energy following the collapse of the progenitor star. Attempts to characterise GRB LCs focused on a number of properties, such as the minimum variability timescale, power density spectra (both ensemble average and individual), or with different definitions of variability. In parallel, a characterisation as a stochastic process was pursued by studying the distributions of waiting times, peak flux, fluence of individual peaks within GRB time profiles. Yet, the question remains as to whether the diversity of profiles can be described in terms of a common stochastic process. Here we address this issue by studying for the first time the distribution of the number of peaks in a GRB profile. We used four different GRB catalogues: CGRO/BATSE, Swift/BAT, BeppoSAX/GRBM, and Insight-HXMT. The statistically significant peaks were identified by means of the well-tested algorithm MEPSA and further selected by applying a set of thresholds on signal-to-noise ratio. We then extracted the corresponding distributions of number of peaks per GRB.
Among the different models considered (power-law, simple or stretched exponential) only a mixture of two exponentials models all the observed distributions, suggesting the existence of two distinct behaviours: (i) an average number of 2.1 ± 0.1 peaks per GRB ("peak poor") and accounting for about 80% of the observed population of GRBs; (ii) an average number of 8.3 ± 1.0 peaks per GRB ("peak rich") and accounting for the remaining 20% of the observed population. We associate the class of peak-rich GRBs with the presence of sub-second variability, which seems to be absent among peak-poor GRBs. The two classes could result from two different regimes through which GRB engines release energy or through which energy is dissipated into gamma-rays.

Submitted 27 February, 2024; originally announced February 2024.

Comments: 7 pages, 5 figures, accepted by A&A

Journal ref: A&A 685, A34 (2024)
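
Illustrative sketch (not from the paper): the two-component result quoted in the abstract above can be written as a mixture of two exponentials. The snippet below uses the quoted weights (about 80% and 20%) and mean peak counts (2.1 and 8.3). The continuous exponential form and all names are assumptions, since the abstract does not state the exact parameterisation.

```python
import math

# Values quoted in the abstract; the continuous exponential form is an assumption.
PEAK_POOR = (0.8, 2.1)  # (population weight, mean number of peaks)
PEAK_RICH = (0.2, 8.3)
COMPONENTS = (PEAK_POOR, PEAK_RICH)


def mixture_density(n, components=COMPONENTS):
    """Density of a weighted sum of exponentials with the given means."""
    return sum(w / mu * math.exp(-n / mu) for w, mu in components)


def mixture_mean(components=COMPONENTS):
    """Population-wide mean number of peaks implied by the mixture."""
    return sum(w * mu for w, mu in components)


if __name__ == "__main__":
    print("overall mean peaks per GRB:", round(mixture_mean(), 2))
    for n in (1, 5, 10, 20):
        print(n, mixture_density(n))
```

Under these assumptions the population-wide mean is 0.8 × 2.1 + 0.2 × 8.3 ≈ 3.3 peaks per GRB, and the peak-rich component dominates the tail at large peak counts.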

arXiv:2402.16123 [pdf, other]

InstructEdit: Instruction-based Knowledge Editing for Large Language Models

Authors: Ningyu Zhang, Bozhong Tian, Siyuan Cheng, Xiaozhuan Liang, Yi Hu, Kouying Xue, Yanjie Gou, Xi Chen, Huajun Chen

Abstract: Knowledge editing for large language models can offer an efficient solution to alter a model's behavior without negatively impacting the overall performance. However, the current approaches encounter issues with limited generalizability across tasks, necessitating one distinct editor for each task, significantly hindering the broader applications. To address this, we take the first step to analyze the multi-task generalization issue in knowledge editing. Specifically, we develop an instruction-based editing technique, termed InstructEdit, which facilitates the editor's adaptation to various task performances simultaneously using simple instructions. With only one unified editor for each LLM, we empirically demonstrate that InstructEdit can improve the editor's control, leading to an average 14.86% increase in Reliability in the multi-task editing setting. Furthermore, experiments involving a holdout unseen task illustrate that InstructEdit consistently surpasses previous strong baselines. To further investigate the underlying mechanisms of instruction-based knowledge editing, we analyze the principal components of the editing gradient directions, which unveils that instructions can help control optimization direction with stronger OOD generalization. Code and datasets are available at https://github.com/zjunlp/EasyEdit.

Submitted 28 April, 2024; v1 submitted 25 February, 2024; originally announced February 2024.

Comments: IJCAI 2024; the project website is at https://www.zjukg.org/project/InstructEdit/

arXiv:2402.14710 [pdf, other]

IEPile: Unearthing Large-Scale Schema-Based Information Extraction Corpus

Authors: Honghao Gui, Lin Yuan, Hongbin Ye, Ningyu Zhang, Mengshu Sun, Lei Liang, Huajun Chen

Abstract: Large Language Models (LLMs) demonstrate remarkable potential across various domains; however, they exhibit a significant performance gap in Information Extraction (IE). Note that high-quality instruction data is the vital key for enhancing the specific capabilities of LLMs, while current IE datasets tend to be small in scale, fragmented, and lack standardized schema. To this end, we introduce IEPile, a comprehensive bilingual (English and Chinese) IE instruction corpus, which contains approximately 0.32B tokens. We construct IEPile by collecting and cleaning 33 existing IE datasets, and introduce schema-based instruction generation to unearth a large-scale corpus. Experimentally, IEPile enhances the performance of LLMs for IE, with notable improvements in zero-shot generalization. We open-source the resource and pre-trained models, hoping to provide valuable support to the NLP community.

Submitted 26 May, 2024; v1 submitted 22 February, 2024; originally announced February 2024.

Comments: ACL 2024 (short); 21 pages; Github: https://github.com/zjunlp/IEPile

arXiv:2402.14226 [pdf, other]

Broadband noise and quasi-periodic oscillation characteristics of the X-ray pulsar RX J0440.9+4431

Authors: P. P. Li, L. Tao, R. C. Ma, M. Y. Ge, Q. C. Zhao, S. J. Zhao, L. Zhang, Q. C. Bu, L. D. Kong, Y. L. Tuo, L. Ji, S. Zhang, J. L. Qu, S. N. Zhang, Y. Huang, X. Ma, W. T. Ye, Q. C. Shui

Abstract: We present a comprehensive timing analysis on the Be/X-ray binary pulsar RX J0440.9+4431 using observations from NICER and Insight-HXMT during the 2022-2023 outburst. The power density spectrum (PDS) of RX J0440.9+4431 exhibits typical aperiodic variability in X-ray flux across a wide frequency range. During a super-critical accretion state, we detect quasi-periodic oscillations (QPOs) at 0.2-0.5 Hz in the light curves of five pulses for RX J0440.9+4431. The observed QPOs manifest during flares, while the flares appear at the peaks of the pulse profiles on a timescale of seconds and are primarily caused by an increase in hard photons. These flares can be explained by increased material ingestion in the accretion column at a fixed phase, primarily generating hard photons. Alternatively, an increase in accretion rate, independent of phase, may result in highly beamed hard photons within the accretion column, causing the flares. We attribute the origin of the QPOs to instabilities within the accretion flow. Additionally, we find that the break frequencies in the noise power spectra align well with $\propto L_{\mathrm{x}}^{3 / 7}$ across three orders of magnitude in the luminosity, which points to a relatively strong magnetic field in RX J0440.9+4431, estimated to be ~$10^{13}$ G.

Submitted 21 February, 2024; originally announced February 2024.

Comments: 8 pages, 7 figures. Accepted in MNRAS
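
Illustrative sketch (not from the paper): the abstract above reports break frequencies scaling as $L_{\mathrm{x}}^{3/7}$ over three orders of magnitude in luminosity. The snippet below only evaluates what that power law implies for the ratio of break frequencies across such a range. The function name is hypothetical and no normalisation is assumed.

```python
def break_frequency_ratio(luminosity_ratio, index=3 / 7):
    """Ratio of break frequencies for a given luminosity ratio, assuming f_b ~ L**index."""
    return luminosity_ratio ** index


if __name__ == "__main__":
    # Three orders of magnitude in luminosity, as quoted in the abstract.
    print(break_frequency_ratio(1e3))
```

Under this scaling, a factor of 1000 in luminosity corresponds to a factor of $10^{9/7}$, roughly 19, in break frequency.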

arXiv:2402.12802 [pdf, ps, other]

The Minkowski problem for the non-compact convex set with an asymptotic boundary condition

Authors: Ning Zhang

Abstract: In this paper, combining the covolume, we study the Minkowski theory for the non-compact convex set with an asymptotic boundary condition. In particular, the mixed covolume of two non-compact convex sets is introduced and its geometric interpretation is obtained by the Hadamard variational formula. The Brunn-Minkowski and Minkowski inequalities for covolume are established, and the equivalence of these two inequalities is discussed as well. The Minkowski problem for non-compact convex set is proposed and solved under the asymptotic conditions. In the end, we give a solution to the Minkowski problem for $σ$-finite measure on the conic domain $Ω_C$.

Submitted 20 February, 2024; originally announced February 2024.

Comments: 20 pages

MSC Class: 52B45; 52A20; 52A39; 53A15

arXiv:2402.09995 [pdf, other]

iJTyper: An Iterative Type Inference Framework for Java by Integrating Constraint- and Statistically-based Methods

Authors: Zhixiang Chen, Anji Li, Neng Zhang, Jianguo Chen, Yuan Huang, Zibin Zheng

Abstract: Inferring the types of API elements in incomplete code snippets (e.g., those on Q&A forums) is a prerequisite step required to work with the code snippets. Existing type inference methods can be mainly categorized as constraint-based or statistically-based. The former imposes higher requirements on code syntax and often suffers from low recall due to the syntactic limitation of code snippets. The latter relies on the statistical regularities learned from a training corpus and does not take full advantage of the type constraints in code snippets, which may lead to low precision. In this paper, we propose an iterative type inference framework for Java, called iJTyper, by integrating the strengths of both constraint- and statistically-based methods. For a code snippet, iJTyper first applies a constraint-based method and augments the code context with the inferred types of API elements. iJTyper then applies a statistically-based method to the augmented code snippet. The predicted candidate types of API elements are further used to improve the constraint-based method by reducing its pre-built knowledge base. iJTyper iteratively executes both methods and performs code context augmentation and knowledge base reduction until a termination condition is satisfied. Finally, the final inference results are obtained by combining the results of both methods. We evaluated iJTyper on two open-source datasets.
Results show that 1) iJTyper achieves high average precision/recall of 97.31% and 92.52% on both datasets; 2) iJTyper significantly improves the recall of two state-of-the-art baselines, SnR and MLMTyper, by at least 7.31% and 27.44%, respectively; and 3) iJTyper improves the average precision/recall of the popular language model, ChatGPT, by 3.25% and 0.51% on both datasets.

Submitted 15 February, 2024; originally announced February 2024.
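
Illustrative sketch (not the authors' implementation): the loop described in the abstract above alternates a constraint-based pass with a statistical pass, feeding each pass's output to the other. The snippet below mimics that control flow with toy stand-ins. Every function name, the fixed-point termination test, and the rule for combining results are assumptions made for illustration.

```python
def iterate_type_inference(elements, knowledge_base, constraint_infer,
                           statistical_predict, max_rounds=10):
    """Alternate both methods until context and knowledge base stop changing."""
    kb = set(knowledge_base)
    context, predicted = {}, {}
    for _ in range(max_rounds):
        # Constraint-based pass; its results augment the code context.
        new_context = {**context, **constraint_infer(elements, kb)}
        # Statistical pass on the augmented context.
        predicted = statistical_predict(elements, new_context)
        # Reduce the knowledge base to the predicted candidate types.
        candidates = set().union(*predicted.values())
        new_kb = (kb & candidates) or kb
        if new_context == context and new_kb == kb:  # assumed termination condition
            break
        context, kb = new_context, new_kb
    # Assumed combination rule: constraint results first, then unambiguous predictions.
    final = dict(context)
    for name, cands in predicted.items():
        if name not in final and len(cands) == 1:
            final[name] = next(iter(cands))
    return final


def toy_constraint_infer(elements, kb):
    """Toy stand-in: resolve a simple name only if exactly one known type matches."""
    inferred = {}
    for name in elements:
        matches = [t for t in kb if t.rsplit(".", 1)[-1] == name]
        if len(matches) == 1:
            inferred[name] = matches[0]
    return inferred


def toy_statistical_predict(elements, context):
    """Toy stand-in: guess that unresolved names share a package with resolved ones."""
    packages = {t.rsplit(".", 1)[0] for t in context.values()}
    return {
        name: {context[name]} if name in context
        else {f"{p}.{name}" for p in packages}
        for name in elements
    }


if __name__ == "__main__":
    kb = {"java.util.List", "java.awt.List", "java.util.Map"}
    print(iterate_type_inference(["List", "Map"], kb,
                                 toy_constraint_infer, toy_statistical_predict))
```

In this toy run, `List` is ambiguous at first. Once `Map` is resolved, the statistical stand-in narrows the candidates, the knowledge base shrinks, and the next constraint pass resolves `List` as well.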

arXiv:2402.08303

ChatCell: Facilitating Single-Cell Analysis with Natural Language

Authors: Yin Fang, Kangwei Liu, Ningyu Zhang, Xinle Deng, Penghui Yang, Zhuo Chen, Xiangru Tang, Mark Gerstein, Xiaohui Fan, Huajun Chen

Abstract: As Large Language Models (LLMs) rapidly evolve, their influence in science is becoming increasingly prominent. The emerging capabilities of LLMs in task generalization and free-form dialogue can significantly advance fields like chemistry and biology. However, the field of single-cell biology, which forms the foundational building blocks of living organisms, still faces several challenges. High knowledge barriers and limited scalability in current methods restrict the full exploitation of LLMs in mastering single-cell data, impeding direct accessibility and rapid iteration. To this end, we introduce ChatCell, which signifies a paradigm shift by facilitating single-cell analysis with natural language. Leveraging vocabulary adaptation and unified sequence generation, ChatCell has acquired profound expertise in single-cell biology and the capability to accommodate a diverse range of analysis tasks. Extensive experiments further demonstrate ChatCell's robust performance and potential to deepen single-cell insights, paving the way for more accessible and intuitive exploration in this pivotal field. Our project homepage is available at https://zjunlp.github.io/project/ChatCell.

Submitted 19 February, 2024; v1 submitted 13 February, 2024; originally announced February 2024.

Comments: I have decided to temporarily withdraw this draft as I am in the process of making further revisions to improve its content. Code: https://github.com/zjunlp/ChatCell Dataset: https://huggingface.co/datasets/zjunlp/ChatCell-Instructions Demo: https://chat.openai.com/g/g-vUwj222gQ-chatcell

arXiv:2402.05391 [pdf, other]

Knowledge Graphs Meet Multi-Modal Learning: A Comprehensive Survey

Authors: Zhuo Chen, Yichi Zhang, Yin Fang, Yuxia Geng, Lingbing Guo, Xiang Chen, Qian Li, Wen Zhang, Jiaoyan Chen, Yushan Zhu, Jiaqi Li, Xiaoze Liu, Jeff Z. Pan, Ningyu Zhang, Huajun Chen

Abstract: Knowledge Graphs (KGs) play a pivotal role in advancing various AI applications, with the semantic web community's exploration into multi-modal dimensions unlocking new avenues for innovation. In this survey, we carefully review over 300 articles, focusing on KG-aware research in two principal aspects: KG-driven Multi-Modal (KG4MM) learning, where KGs support multi-modal tasks, and Multi-Modal Knowledge Graph (MM4KG), which extends KG studies into the MMKG realm. We begin by defining KGs and MMKGs, then explore their construction progress. Our review includes two primary task categories: KG-aware multi-modal learning tasks, such as Image Classification and Visual Question Answering, and intrinsic MMKG tasks like Multi-modal Knowledge Graph Completion and Entity Alignment, highlighting specific research trajectories. For most of these tasks, we provide definitions, evaluation benchmarks, and additionally outline essential insights for conducting relevant research. Finally, we discuss current challenges and identify emerging trends, such as progress in Large Language Modeling and Multi-modal Pre-training strategies. This survey aims to serve as a comprehensive reference for researchers already involved in or considering delving into KG and multi-modal learning research, offering insights into the evolving landscape of MMKG research and supporting future work.

Submitted 26 February, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

Comments: Ongoing work; 41 pages (Main Text), 55 pages (Total), 11 Tables, 13 Figures, 619 citations; Paper list is available at https://github.com/zjukg/KG-MM-Survey

arXiv:2402.04356 [pdf, other]

Bidirectional Autoregressive Diffusion Model for Dance Generation

Authors: Canyu Zhang, Youbao Tang, Ning Zhang, Ruei-Sung Lin, Mei Han, Jing Xiao, Song Wang

Abstract: Dance serves as a powerful medium for expressing human emotions, but the lifelike generation of dance is still a considerable challenge. Recently, diffusion models have showcased remarkable generative abilities across various domains. They hold promise for human motion generation due to their adaptable many-to-many nature. Nonetheless, current diffusion-based motion generation models often create entire motion sequences directly and unidirectionally, lacking focus on the motion with local and bidirectional enhancement. When choreographing high-quality dance movements, people need to take into account not only the musical context but also the nearby music-aligned dance motions. To authentically capture human behavior, we propose a Bidirectional Autoregressive Diffusion Model (BADM) for music-to-dance generation, where a bidirectional encoder is built to enforce that the generated dance is harmonious in both the forward and backward directions. To make the generated dance motion smoother, a local information decoder is built for local motion enhancement. The proposed framework is able to generate new motions based on the input conditions and nearby motions, which foresees individual motion slices iteratively and consolidates all predictions. To further refine the synchronicity between the generated dance and the beat, the beat information is incorporated as an input to generate better music-aligned dance movements.
Experimental results demonstrate that the proposed model achieves state-of-the-art performance compared to existing unidirectional approaches on the prominent benchmark for music-to-dance generation.

Submitted 22 June, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

arXiv:2402.03190 [pdf, other]

Unified Hallucination Detection for Multimodal Large Language Models

Authors: Xiang Chen, Chenxi Wang, Yida Xue, Ningyu Zhang, Xiaoyan Yang, Qiang Li, Yue Shen, Lei Liang, **jie Gu, Huajun Chen

Abstract: Despite significant strides in multimodal tasks, Multimodal Large Language Models (MLLMs) are plagued by the critical issue of hallucination. The reliable detection of such hallucinations in MLLMs has, therefore, become a vital aspect of model evaluation and the safeguarding of practical application deployment. Prior research in this domain has been constrained by a narrow focus on singular tasks, an inadequate range of hallucination categories addressed, and a lack of detailed granularity. In response to these challenges, our work expands the investigative horizons of hallucination detection. We present a novel meta-evaluation benchmark, MHaluBench, meticulously crafted to facilitate the evaluation of advancements in hallucination detection methods. Additionally, we unveil a novel unified multimodal hallucination detection framework, UNIHD, which leverages a suite of auxiliary tools to validate the occurrence of hallucinations robustly. We demonstrate the effectiveness of UNIHD through meticulous evaluation and comprehensive analysis. We also provide strategic insights on the application of specific tools for addressing various categories of hallucinations.

Submitted 27 May, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

Comments: Accepted by ACL 2024 (main conference)

arXiv:2402.03049 [pdf, other]

EasyInstruct: An Easy-to-use Instruction Processing Framework for Large Language Models

Authors: Yixin Ou, Ningyu Zhang, Honghao Gui, Ziwen Xu, Shuofei Qiao, Yida Xue, Runnan Fang, Kangwei Liu, Lei Li, Zhen Bi, Guozhou Zheng, Huajun Chen

Abstract: In recent years, instruction tuning has gained increasing attention and emerged as a crucial technique to enhance the capabilities of Large Language Models (LLMs). To construct high-quality instruction datasets, many instruction processing approaches have been proposed, aiming to achieve a delicate balance between data quantity and data quality. Nevertheless, due to inconsistencies that persist among various instruction processing methods, there is no standard open-source instruction processing implementation framework available for the community, which hinders practitioners from further developing and advancing. To facilitate instruction processing research and development, we present EasyInstruct, an easy-to-use instruction processing framework for LLMs, which modularizes instruction generation, selection, and prompting, while also considering their combination and interaction. EasyInstruct is publicly released and actively maintained at https://github.com/zjunlp/EasyInstruct, along with an online demo app and a demo video for quick-start, calling for broader research centered on instruction data and synthetic data.

Submitted 23 June, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

Comments: ACL 2024 System Demonstrations; Project website: https://zjunlp.github.io/project/EasyInstruct Code: https://github.com/zjunlp/EasyInstruct Video: https://youtu.be/rfQOWYfziFo Demo: https://huggingface.co/spaces/zjunlp/EasyInstruct

arXiv:2402.03005 [pdf, other]

Topological metal and high-order Dirac point in cubic Rashba model

Authors: Haijiao Ji, Ning Zhang, Noah F. Q. Yuan

Abstract: We investigate the properties of the two-dimensional model with Rashba-type spin-orbit coupling cubic in electron momentum. In the normal phase, edge states emerge on open boundaries. In the superconducting phase, edge states could evolve into gapped fermionic edge states. Applications to realistic materials of interface superconductors are also discussed.

Submitted 5 February, 2024; originally announced February 2024.

Comments: 5 pages, 4 figures, 1 table

arXiv:2402.02085 [pdf, other]

DeCoF: Generated Video Detection via Frame Consistency: The First Benchmark Dataset

Authors: Long Ma, Jiajia Zhang, Hong** Deng, Ningyu Zhang, Qinglang Guo, Haiyang Yu, Yong Liao, Pengyuan Zhou

Abstract: The escalating quality of video generated by advanced video generation methods results in new security challenges, while there have been few relevant research efforts: 1) There is no open-source dataset for generated video detection, 2) No generated video detection method has been proposed so far. To this end, we propose an open-source dataset and a detection method for generated video for the first time. First, we propose a scalable dataset consisting of 964 prompts, covering various forgery targets, scenes, behaviors, and actions, as well as various generation models with different architectures and generation methods, including the most popular commercial models like OpenAI's Sora and Google's Veo. Second, we found via probing experiments that spatial artifact-based detectors lack generalizability. Hence, we propose a simple yet effective detection model based on frame consistency (DeCoF), which focuses on temporal artifacts by eliminating the impact of spatial artifacts during feature learning. Extensive experiments demonstrate the efficacy of DeCoF in detecting videos generated by unseen video generation models and confirm its powerful generalizability across several commercially proprietary models. Our code and dataset will be released at https://github.com/wuwuwuyue/DeCoF.

Submitted 25 June, 2024; v1 submitted 3 February, 2024; originally announced February 2024.

arXiv:2402.01920 [pdf, other]

Preference Poisoning Attacks on Reward Model Learning

Authors: Junlin Wu, Jiongxiao Wang, Chaowei Xiao, Chenguang Wang, Ning Zhang, Yevgeniy Vorobeychik

Abstract: Learning utility, or reward, models from pairwise comparisons is a fundamental component in a number of application domains. These approaches inherently entail collecting preference information from people, with feedback often provided anonymously. Since preferences are subjective, there is no gold standard to compare against; yet, reliance of high-impact systems on preference learning creates a strong motivation for malicious actors to skew data collected in this fashion to their ends. We investigate the nature and extent of this vulnerability systematically by considering a threat model in which an attacker can flip a small subset of preference comparisons with the goal of either promoting or demoting a target outcome. First, we propose two classes of algorithmic approaches for these attacks: a principled gradient-based framework, and several variants of rank-by-distance methods. Next, we demonstrate the efficacy of best attacks in both these classes in successfully achieving malicious goals on datasets from three diverse domains: autonomous control, recommendation system, and textual prompt-response preference learning. We find that the best attacks are often highly successful, achieving in the most extreme case 100% success rate with only 0.3% of the data poisoned. However, which attack is best can vary significantly across domains, demonstrating the value of our comprehensive vulnerability analysis that involves several classes of attack algorithms. In addition, we observe that the simpler and more scalable rank-by-distance approaches are often competitive with the best, and on occasion significantly outperform gradient-based methods. Finally, we show that several state-of-the-art defenses against other classes of poisoning attacks exhibit, at best, limited efficacy in our setting.

Submitted 2 February, 2024; originally announced February 2024.

arXiv:2401.17623 [pdf, other]

Neighboring Perturbations of Knowledge Editing on Large Language Models

Authors: Jun-Yu Ma, Zhen-Hua Ling, Ningyu Zhang, Jia-Chen Gu

Abstract: Despite their exceptional capabilities, large language models (LLMs) are prone to generating unintended text due to false or outdated knowledge. Given the resource-intensive nature of retraining LLMs, there has been a notable increase in the development of knowledge editing. However, current approaches and evaluations rarely explore the perturbation of editing on neighboring knowledge. This paper studies whether updating new knowledge to LLMs perturbs the neighboring knowledge encapsulated within them. Specifically, we seek to figure out whether appending a new answer into an answer list to a factual question leads to catastrophic forgetting of original correct answers in this list, as well as unintentional inclusion of incorrect answers. A metric of additivity is introduced and a benchmark dubbed as Perturbation Evaluation of Appending Knowledge (PEAK) is constructed to evaluate the degree of perturbation to neighboring knowledge when appending new knowledge. Besides, a plug-and-play framework termed Appending via Preservation and Prevention (APP) is proposed to mitigate the neighboring perturbation by maintaining the integrity of the answer list. Experiments demonstrate the effectiveness of APP coupling with four editing methods on four LLMs. The code and data are available at https://github.com/mjy1111/PEAK.

Submitted 26 May, 2024; v1 submitted 31 January, 2024; originally announced January 2024.

Comments: Accepted by ICML 2024

arXiv:2401.17268 [pdf, other]

Weaver: Foundation Models for Creative Writing

Authors: Tiannan Wang, Jiamin Chen, Qingrui Jia, Shuai Wang, Ruoyu Fang, Huilin Wang, Zhaowei Gao, Chunzhao Xie, Chuou Xu, Jihong Dai, Yibin Liu, Jialong Wu, Shengwei Ding, Long Li, Zhiwei Huang, Xinle Deng, Teng Yu, Gangan Ma, Han Xiao, Zixin Chen, Danjun Xiang, Yunxia Wang, Yuanyuan Zhu, Yi Xiao, **g Wang , et al. (21 additional authors not shown)

Abstract: This work introduces Weaver, our first family of large language models (LLMs) dedicated to content creation. Weaver is pre-trained on a carefully selected corpus that focuses on improving the writing capabilities of large language models. We then fine-tune Weaver for creative and professional writing purposes and align it to the preference of professional writers using a suite of novel methods for instruction data synthesis and LLM alignment, making it able to produce more human-like texts and follow more diverse instructions for content creation. The Weaver family consists of models of Weaver Mini (1.8B), Weaver Base (6B), Weaver Pro (14B), and Weaver Ultra (34B) sizes, suitable for different applications and can be dynamically dispatched by a routing agent according to query complexity to balance response quality and computation cost. Evaluation on a carefully curated benchmark for assessing the writing capabilities of LLMs shows Weaver models of all sizes outperform generalist LLMs several times larger than them. Notably, our most-capable Weaver Ultra model surpasses GPT-4, a state-of-the-art generalist LLM, on various writing scenarios, demonstrating the advantage of training specialized LLMs for writing purposes. Moreover, Weaver natively supports retrieval-augmented generation (RAG) and function calling (tool usage). We present various use cases of these abilities for improving AI-assisted writing systems, including integration of external knowledge bases, tools, or APIs, and providing personalized writing assistance. Furthermore, we discuss and summarize a guideline and best practices for pre-training and fine-tuning domain-specific LLMs.

Submitted 30 January, 2024; originally announced January 2024.

arXiv:2401.16292 [pdf, other]

Pilotfish: Distributed Transaction Execution for Lazy Blockchains

Authors: Quentin Kniep, Lefteris Kokoris-Kogias, Alberto Sonnino, Igor Zablotchi, Nuda Zhang

Abstract: Pilotfish is the first scale-out blockchain execution engine able to harness any degree of parallelizability existing in its workload. Pilotfish allows each validator to employ multiple machines, named ExecutionWorkers, under its control to scale its execution layer. Given a sufficiently parallelizable and compute-intensive load, the number of transactions that the validator can execute increases linearly with the number of ExecutionWorkers at its disposal. In addition, Pilotfish maintains the consistency of the state, even when many validators experience simultaneous machine failures. This is possible due to the meticulous co-design of our crash-recovery protocol which leverages the existing fault tolerance in the blockchain's consensus mechanism. Finally, Pilotfish can also be seen as the first distributed deterministic execution engine that provides support for dynamic reads as transactions are not required to provide a fully accurate read and write set. This loosening of requirements would normally reduce the parallelizability available by blocking write-after-write conflicts, but our novel versioned-queues scheduling algorithm circumvents this by exploiting the lazy recovery property of Pilotfish, which only persists consistent state and re-executes any optimistic steps taken before the crash. In order to prove our claims we implemented the common path of Pilotfish with support for the MoveVM and evaluated it against the parallel execution MoveVM of Sui. Our results show that Pilotfish provides good scalability up to 8 ExecutionWorkers for a variety of workloads. In computationally-heavy workloads, Pilotfish's scalability is linear.

Submitted 16 February, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

arXiv:2401.15992 [pdf, other]

Pulsed Iron line Emission from the First Galactic Ultraluminous X-ray Pulsar Swift J0243.6+6124

Authors: Y. X. Xiao, Y. J. Xu, M. Y. Ge, F. J. Lu, S. N. Zhang, S. Zhang, L. Tao, J. L. Qu, P. J. Wang, L. D. Kong, Y. L. Tuo, Y. You, S. J. Zhao, J. Q. Peng, Y. F. Du, Y. H. Zhang, W. T. Ye

Abstract: We report the phase-resolved spectral results of the first Galactic Pulsating Ultra-Luminous X-ray source (PULX) Swift J0243.6+6124, modeling at its 2017-2018 outburst peak using data collected by the Hard X-ray Modulation Telescope (Insight-HXMT). The broad energy coverage of Insight-HXMT allows us to obtain more accurate spectral continuum to reduce the coupling of broad iron line profiles with other components. We use three different continuum spectrum models but obtain similar iron line results. For the first time, we detected the pulse characteristics of the broad iron line in a PULX. The variation in width and intensity of this iron line with $\sigma \sim 1.2-1.5$ keV has a phase offset of about 0.25 from the pulse phase. We suggest that the uneven irradiation of the thick inner disk by the accretion column produces the modulated variation of the broad iron line. In addition, the non-pulsed narrow line is suggested to come from the outer disk region.

Submitted 29 January, 2024; originally announced January 2024.

arXiv:2401.15289 [pdf, other]

SoK: Where's the "up"?! A Comprehensive (bottom-up) Study on the Security of Arm Cortex-M Systems

Authors: Xi Tan, Zheyuan Ma, Sandro Pinto, Le Guan, Ning Zhang, Jun Xu, Zhiqiang Lin, Hongxin Hu, Ziming Zhao

Abstract: Arm Cortex-M processors are the most widely used 32-bit microcontrollers among embedded and Internet-of-Things devices. Despite the widespread usage, there has been little effort in summarizing their hardware security features, characterizing the limitations and vulnerabilities of their hardware and software stack, and systematizing the research on securing these systems. The goals and contributions of this paper are multi-fold. First, we analyze the hardware security limitations and issues of Cortex-M systems. Second, we conducted a deep study of the software stack designed for Cortex-M and revealed its limitations, which is accompanied by an empirical analysis of 1,797 real-world firmware. Third, we categorize the reported bugs in Cortex-M software systems. Finally, we systematize the efforts that aim at securing Cortex-M systems and evaluate them in terms of the protections they offer, runtime performance, required hardware features, etc. Based on the insights, we develop a set of recommendations for the research community and MCU software developers.

Submitted 13 May, 2024; v1 submitted 26 January, 2024; originally announced January 2024.

Comments: To Appear in the 18th USENIX WOOT Conference on Offensive Technologies, August 12-13, 2024

ACM Class: C.0; K.6.5

arXiv:2401.14619 [pdf, other]

Resilient Practical Test-Time Adaptation: Soft Batch Normalization Alignment and Entropy-driven Memory Bank

Authors: Xingzhi Zhou, Zhiliang Tian, Ka Chun Cheung, Simon See, Nevin L. Zhang

Abstract: Test-time domain adaptation effectively adjusts the source domain model to accommodate unseen domain shifts in a target domain during inference. However, the model performance can be significantly impaired by continuous distribution changes in the target domain and non-independent and identically distributed (non-i.i.d.) test samples often encountered in practical scenarios. While existing memory bank methodologies use memory to store samples and mitigate non-i.i.d. effects, they do not inherently prevent potential model degradation. To address this issue, we propose a resilient practical test-time adaptation (ResiTTA) method focused on parameter resilience and data quality. Specifically, we develop a resilient batch normalization with estimation on normalization statistics and soft alignments to mitigate overfitting and model degradation. We use an entropy-driven memory bank that accounts for timeliness, the persistence of over-confident samples, and sample uncertainty for high-quality data in adaptation. Our framework periodically adapts the source domain model using a teacher-student model through a self-training loss on the memory samples, incorporating soft alignment losses on batch normalization. We empirically validate ResiTTA across various benchmark datasets, demonstrating state-of-the-art performance.

Submitted 25 January, 2024; originally announced January 2024.

arXiv:2401.13910 [pdf]

Spatiotemporal optical vortices with controllable radial and azimuthal quantum numbers

Authors: Xin Liu, Qian Cao, Nianjia Zhang, Andy Chong, Yangjian Cai, Qiwen Zhan

Abstract: Optical spatiotemporal vortices with transverse photon orbital angular momentum (OAM) have recently become a focal point of research. In this work we theoretically and experimentally investigate optical spatiotemporal vortices with radial and azimuthal quantum numbers, known as spatiotemporal Laguerre-Gaussian (STLG) wavepackets. These 3D wavepackets exhibit phase singularities and cylinder-shaped edge dislocations, resulting in a multi-ring topology in their spatiotemporal profile. Unlike conventional ST optical vortices, STLG wavepackets with non-zero p and l values carry a composite transverse OAM consisting of two directionally opposite components. We further demonstrate mode conversion between an STLG wavepacket and an ST Hermite-Gaussian (STHG) wavepacket through the application of strong spatiotemporal astigmatism. The converted STHG wavepacket is de-coupled in intensity in the space-time domain, which can be utilized to implement efficient and accurate recognition of ultrafast STLG wavepackets carrying various p and l. This study may offer new insights into high-dimensional quantum information, photonic topology, and nonlinear optics, while promising potential applications in other wave phenomena such as acoustics and electron waves.

Submitted 24 January, 2024; originally announced January 2024.

arXiv:2401.12642 [pdf]

Spatiotemporal Hologram

Authors: Qian Cao, Nianjia Zhang, Andy Chong, Qiwen Zhan

Abstract: Spatiotemporal structured light has opened up new avenues for optics and photonics. Current spatiotemporal manipulation of light mostly relies on phase-only devices such as liquid crystal spatial light modulators to generate spatiotemporal optical fields with unique photonic properties. However, simultaneous manipulation of both amplitude and phase of the complex field for the spatiotemporal light is still lacking, limiting the diversity and richness of achievable photonic properties. In this work, a simple and versatile spatiotemporal holographic method that can arbitrarily sculpture the spatiotemporal light is presented. The capabilities of this simple yet powerful method are demonstrated through the generation of fundamental and higher-order spatiotemporal Bessel wavepackets, spatiotemporal crystal-like and quasi-crystal-like structures, and spatiotemporal flat-top wavepackets. Fully customizable spatiotemporal wavepackets will find broader application in investigating the dynamics of spatiotemporal fields and interactions between ultrafast spatiotemporal pulses and matter, unveiling previously hidden light-matter interactions and unlocking breakthroughs in photonics and beyond.

Submitted 23 January, 2024; originally announced January 2024.

arXiv:2401.11770 [pdf, ps, other]

Evidence for Unfolded Fermi Surfaces in the Charge-Density-Wave State of Kagome Metal FeGe Revealed by de Haas-van Alphen Effect

Authors: Kaixin Tang, Han**g Zhou, Houpu Li, Senyang Pan, Xueliang Wu, Hongyu Li, Nan Zhang, Chuanying Xi, **glei Zhang, Aifeng Wang, Xiangang Wan, Ziji Xiang, Xianhui Chen

Abstract: The antiferromagnetic kagome lattice compound FeGe has been revealed to host an emergent charge-density-wave (CDW) state which manifests complex interplay between the spin, charge and lattice degrees of freedom. Here, we present a comprehensive study of the de Haas-van Alphen effect by measuring torque magnetometry under magnetic fields up to 45.2 T to map Fermi surfaces in this unusual CDW state. For field along the $c$ direction, we resolve four cyclotron orbits; the largest one roughly corresponds to the area of the 2$\times$2 folded Brillouin zone. Three smaller orbits are characterized by light effective cyclotron masses ranging from 0.18 to 0.30 $m_e$. Angle-resolved measurements identify one Fermi surface segment with weak anisotropy. Combined with band structure calculations, our results suggest that features of unfolded Fermi surfaces are robust against CDW reconstruction, corroborating the novel effect of a short-ranged CDW on the electronic structure.

Submitted 22 January, 2024; originally announced January 2024.

Comments: 7 pages, 4 figures, to be published in Phys. Rev. Research

arXiv:2401.05268 [pdf, other]

AutoAct: Automatic Agent Learning from Scratch for QA via Self-Planning

Authors: Shuofei Qiao, Ningyu Zhang, Runnan Fang, Yujie Luo, Wangchunshu Zhou, Yuchen Eleanor Jiang, Chengfei Lv, Huajun Chen

Abstract: Language agents have achieved considerable performance on various complex question-answering tasks by planning with external tools. Despite the incessant exploration in this field, existing language agent systems still struggle with costly, non-reproducible data reliance and face the challenge of compelling a single model for multiple functions. To this end, we introduce AutoAct, an automatic agent learning framework for QA that does not rely on large-scale annotated data and synthetic planning trajectories from closed-source models (e.g., GPT-4). Given limited data with a tool library, AutoAct first automatically synthesizes planning trajectories without any assistance from humans or strong closed-source models. Then, AutoAct leverages a division-of-labor strategy to automatically differentiate based on the target task information and synthesized trajectories, producing a sub-agent group to complete the task. We conduct comprehensive experiments with different LLMs, which demonstrates that AutoAct yields better or parallel performance compared to various strong baselines. Further analysis demonstrates the effectiveness of the division-of-labor strategy, with the trajectory quality generated by AutoAct generally outperforming that of others. Code will be available at https://github.com/zjunlp/AutoAct.

Submitted 26 May, 2024; v1 submitted 10 January, 2024; originally announced January 2024.

Comments: ACL 2024

arXiv:2401.01286 [pdf, other]

A Comprehensive Study of Knowledge Editing for Large Language Models

Authors: Ningyu Zhang, Yunzhi Yao, Bozhong Tian, Peng Wang, Shumin Deng, Mengru Wang, Zekun Xi, Shengyu Mao, **tian Zhang, Yuansheng Ni, Siyuan Cheng, Ziwen Xu, Xin Xu, Jia-Chen Gu, Yong Jiang, Pengjun Xie, Fei Huang, Lei Liang, Zhiqiang Zhang, Xiaowei Zhu, Jun Zhou, Huajun Chen

Abstract: Large Language Models (LLMs) have shown extraordinary capabilities in understanding and generating text that closely mirrors human communication. However, a primary limitation lies in the significant computational demands during training, arising from their extensive parameterization. This challenge is further intensified by the dynamic nature of the world, necessitating frequent updates to LLMs to correct outdated information or integrate new knowledge, thereby ensuring their continued relevance. Note that many applications demand continual model adjustments post-training to address deficiencies or undesirable behaviors. There is an increasing interest in efficient, lightweight methods for on-the-fly model modifications. To this end, recent years have seen a burgeoning in the techniques of knowledge editing for LLMs, which aim to efficiently modify LLMs' behaviors within specific domains while preserving overall performance across various inputs. In this paper, we first define the knowledge editing problem and then provide a comprehensive review of cutting-edge approaches. Drawing inspiration from educational and cognitive research theories, we propose a unified categorization criterion that classifies knowledge editing methods into three groups: resorting to external knowledge, merging knowledge into the model, and editing intrinsic knowledge. Furthermore, we introduce a new benchmark, KnowEdit, for a comprehensive empirical evaluation of representative knowledge editing approaches. Additionally, we provide an in-depth analysis of knowledge location, which can give a deeper understanding of the knowledge structures inherent within LLMs. Finally, we discuss several potential applications of knowledge editing, outlining its broad and impactful implications.

Submitted 28 March, 2024; v1 submitted 2 January, 2024; originally announced January 2024.

Comments: Ongoing work; 52 pages, 282 citations; benchmark is available at https://huggingface.co/datasets/zjunlp/KnowEdit; code is available at https://github.com/zjunlp/EasyEdit; paper list is available at https://github.com/zjunlp/KnowledgeEditingPapers

arXiv:2401.00625 [pdf, ps, other]

Beyond Efficiency: A Systematic Survey of Resource-Efficient Large Language Models

Authors: Guangji Bai, Zheng Chai, Chen Ling, Shiyu Wang, Jiaying Lu, Nan Zhang, Tingwei Shi, Ziyang Yu, Mengdan Zhu, Yifei Zhang, Carl Yang, Yue Cheng, Liang Zhao

Abstract: The burgeoning field of Large Language Models (LLMs), exemplified by sophisticated models like OpenAI's ChatGPT, represents a significant advancement in artificial intelligence. These models, however, bring forth substantial challenges in the high consumption of computational, memory, energy, and financial resources, especially in environments with limited resource capabilities. This survey aims to systematically address these challenges by reviewing a broad spectrum of techniques designed to enhance the resource efficiency of LLMs. We categorize methods based on their optimization focus: computational, memory, energy, financial, and network resources and their applicability across various stages of an LLM's lifecycle, including architecture design, pretraining, finetuning, and system design. Additionally, the survey introduces a nuanced categorization of resource efficiency techniques by their specific resource types, which uncovers the intricate relationships and mappings between various resources and corresponding optimization techniques. A standardized set of evaluation metrics and datasets is also presented to facilitate consistent and fair comparisons across different models and techniques. By offering a comprehensive overview of the current sota and identifying open research avenues, this survey serves as a foundational reference for researchers and practitioners, aiding them in developing more sustainable and efficient LLMs in a rapidly evolving landscape.

Submitted 3 January, 2024; v1 submitted 31 December, 2023; originally announced January 2024.

Comments: Preprint. GitHub repo: https://github.com/tiingweii-shii/Awesome-Resource-Efficient-LLM-Papers

arXiv:2312.15159 [pdf, other]

doi 10.1145/3656177

Understanding the Potential of FPGA-Based Spatial Acceleration for Large Language Model Inference

Authors: Hongzheng Chen, Jiahao Zhang, Yixiao Du, Shaojie Xiang, Zichao Yue, Niansong Zhang, Yaohui Cai, Zhiru Zhang

Abstract: Recent advancements in large language models (LLMs) boasting billions of parameters have generated a significant demand for efficient deployment in inference workloads. The majority of existing approaches rely on temporal architectures that reuse hardware units for different network layers and operators. However, these methods often encounter challenges in achieving low latency due to considerable memory access overhead. This paper investigates the feasibility and potential of model-specific spatial acceleration for LLM inference on FPGAs. Our approach involves the specialization of distinct hardware units for specific operators or layers, facilitating direct communication between them through a dataflow architecture while minimizing off-chip memory accesses. We introduce a comprehensive analytical model for estimating the performance of a spatial LLM accelerator, taking into account the on-chip compute and memory resources available on an FPGA. Through our analysis, we can determine the scenarios in which FPGA-based spatial acceleration can outperform its GPU-based counterpart. To enable more productive implementations of an LLM model on FPGAs, we further provide a library of high-level synthesis (HLS) kernels that are composable and reusable. This library will be made available as open-source. To validate the effectiveness of both our analytical model and HLS library, we have implemented BERT and GPT2 on an AMD Alveo U280 FPGA device. Experimental results demonstrate our approach can achieve up to 13.4x speedup when compared to previous FPGA-based accelerators for the BERT model. For GPT generative inference, we attain a 2.2x speedup compared to DFX, an FPGA overlay, in the prefill stage, while achieving a 1.9x speedup and a 5.7x improvement in energy efficiency compared to the NVIDIA A100 GPU in the decode stage.

Submitted 7 April, 2024; v1 submitted 22 December, 2023; originally announced December 2023.

Comments: Accepted for publication in the FCCM'24 Journal Track and will appear in ACM Transactions on Reconfigurable Technology and Systems (TRETS)

arXiv:2312.09632 [pdf, other]

doi 10.1051/0004-6361/202347718

The bright black hole X-ray binary 4U 1543--47 during 2021 outburst: a thick accretion disk inflated by high luminosity

Authors: S. J. Zhao, L. Tao, P. P. Li, R. Soria, H. Feng, Y. X. Zhang, R. C. Ma, W. D. Zhang, E. L. Qiao, Q. Q. Yin, S. N. Zhang, L. Zhang, Q. C. Bu, X. Ma, Y. Huang, M. Y. Ge, X. B. Li, Q. C. Zhao, J. Q. Peng, Y. X. Xiao

Abstract: The black hole X-ray binary source 4U 1543--47 experienced a super-Eddington outburst in 2021, reaching a peak flux of up to $\sim1.96\times10^{-7}\rm erg\ \rm cm^{-2}\ \rm s^{-1}$ ($\sim 8.2$ Crab) in the 2--10\,keV band. Soon after the outburst began, it rapidly transitioned into the soft state. Our goal is to understand how the accretion disk structure deviates from a standard thin disk when the accretion rate is near Eddington. To do so, we analyzed spectra obtained from quasi-simultaneous observations conducted by the Hard X-ray Modulation Telescope (Insight-HXMT), the Nuclear Spectroscopic Telescope Array (NuSTAR), and the Neil Gehrels Swift Observatory (Swift). These spectra are well-fitted by a model comprising a disk, a weak corona, and a reflection component. We suggest that the reflection component is caused by disk self-irradiation, that is by photons emitted from the inner disk which return to the accretion disk surface, as their trajectories are bent by the strong gravity field. In this scenario, the best-fitting parameters imply that the reflected flux represents more than half of the total flux. Using general relativistic ray-tracing simulations, we show that this scenario is viable when the disk becomes geometrically thick, with a funnel-like shape, as the accretion rate is near or above the Eddington limit. In the specific case of 4U 1543--47, an angle $\gtrsim$ 45 deg between the disk surface and the equatorial plane can explain the required amount of self-irradiation.

Submitted 15 December, 2023; originally announced December 2023.

Comments: Accepted for publication in Astronomy and Astrophysics. 15 pages, 4 tables, 12 figures

Journal ref: A&A 685, A42 (2024)

arXiv:2312.08682 [pdf, other]

High-coherence parallelization in integrated photonics

Authors: Xuguang Zhang, Zixuan Zhou, Yijun Guo, Minxue Zhuang, Warren Jin, Bitao Shen, Yujun Chen, Jiahui Huang, Zihan Tao, Ming Jin, Ruixuan Chen, Zhangfeng Ge, Zhou Fang, Ning Zhang, Yadong Liu, Pengfei Cai, Weiwei Hu, Haowen Shu, Dong Pan, John E. Bowers, Xingjun Wang, Lin Chang

Abstract: Coherent optics has profoundly impacted diverse applications ranging from communications, LiDAR to quantum computations. However, building coherent systems in integrated photonics previously came at great expense in hardware integration and energy efficiency: the lack of a power-efficient way to generate highly coherent light necessitates bulky lasers and amplifiers, while frequency and phase recovery schemes require huge digital signal processing resources. In this work, we demonstrate a high-coherence parallelization strategy that facilitates advanced integrated coherent systems at a minimum price. Using a self-injection locked microcomb to injection lock a distributed feedback laser array, we boost the microcomb power by a record high gain of up to 60 dB on chip with no degradation in coherence. This strategy enables tens of highly coherent channels with an intrinsic linewidth down to the 10 Hz level and power of more than 20 dBm. The overall electrical to optical wall-plug efficiency reaches 19%, comparable with that of the state-of-the-art semiconductor lasers. Driven by this parallel source, we demonstrate a silicon photonic communication link with an unprecedented data rate beyond 60 Tbit/s. Importantly, the high coherence we achieve reduces the coherent-related DSP consumption by 99.999% compared with the traditional III-V laser pump scheme. This work paves a way to realizing scalable, high-performance coherent integrated photonic systems, potentially benefiting numerous applications.

Submitted 14 December, 2023; originally announced December 2023.

arXiv:2312.06259 [pdf, other]

Adaptive Annotation Distribution for Weakly Supervised Point Cloud Semantic Segmentation

Authors: Zhiyi Pan, Nan Zhang, Wei Gao, Shan Liu, Ge Li

Abstract: Weakly supervised point cloud semantic segmentation has attracted a lot of attention due to its ability to alleviate the heavy reliance on fine-grained annotations of point clouds. However, in practice, sparse annotation usually exhibits a distinct non-uniform distribution in point cloud, which poses challenges for weak supervision. To address these issues, we propose an adaptive annotation distribution method for weakly supervised point cloud semantic segmentation. Specifically, we introduce the probability density function into the gradient sampling approximation analysis and investigate the impact of sparse annotations distributions. Based on our analysis, we propose a label-aware point cloud downsampling strategy to increase the proportion of annotations involved in the training stage. Furthermore, we design the multiplicative dynamic entropy as the gradient calibration function to mitigate the gradient bias caused by non-uniformly distributed sparse annotations and explicitly reduce the epistemic uncertainty. Without any prior restrictions and additional information, our proposed method achieves comprehensive performance improvements at multiple label rates with different annotation distributions on S3DIS, ScanNetV2 and SemanticKITTI.

Submitted 11 December, 2023; originally announced December 2023.

arXiv:2312.05275 [pdf, other]

Exploring the Limits of ChatGPT in Software Security Applications

Authors: Fangzhou Wu, Qingzhao Zhang, Ati Priya Bajaj, Tiffany Bao, Ning Zhang, Ruoyu "Fish" Wang, Chaowei Xiao

Abstract: Large language models (LLMs) have undergone rapid evolution and achieved remarkable results in recent times. OpenAI's ChatGPT, backed by GPT-3.5 or GPT-4, has gained instant popularity due to its strong capability across a wide range of tasks, including natural language tasks, coding, mathematics, and engaging conversations. However, the impacts and limits of such LLMs in system security domain are less explored. In this paper, we delve into the limits of LLMs (i.e., ChatGPT) in seven software security applications including vulnerability detection/repair, debugging, debloating, decompilation, patching, root cause analysis, symbolic execution, and fuzzing. Our exploration reveals that ChatGPT not only excels at generating code, which is the conventional application of language models, but also demonstrates strong capability in understanding user-provided commands in natural languages, reasoning about control and data flows within programs, generating complex data structures, and even decompiling assembly code. Notably, GPT-4 showcases significant improvements over GPT-3.5 in most security tasks. Also, certain limitations of ChatGPT in security-related tasks are identified, such as its constrained ability to process long code contexts.

Submitted 7 December, 2023; originally announced December 2023.

arXiv:2311.18717 [pdf, other]

NFT Wash Trading: Direct vs. Indirect Estimation

Authors: Brett Hemenway Falk, Gerry Tsoukalas, Niuniu Zhang

Abstract: Recent studies estimate around 70% of traded value on off-chain crypto exchanges like Binance is wash trading. This paper turns to NFT markets, where the on-chain nature of transactions, a key tenet of Web3 innovation, enables more direct estimation methods to be applied. Focusing on three of the largest NFT marketplaces, we find 30-40% of NFT volume and 25-95% of traded value involve wash trading. We leverage this direct approach to critically evaluate recent indirect estimation methods suggested in the literature, revealing major differences in effectiveness, with some failing altogether. Trade-roundedness filters, as suggested in Cong et al. (2023), emerge as the most accurate indirect estimation method. In fact, we show how direct and indirect approaches can be closely aligned via hyper-parameter fine-tuning. Our findings underscore the crucial role of technological innovation in detecting and regulating financial misconduct in digital finance.

Submitted 5 June, 2024; v1 submitted 30 November, 2023; originally announced November 2023.

arXiv:2311.15717 [pdf, other]

Evidence of spin density waves in La$_3$Ni$_2$O$_{7-δ}$

Authors: Kaiwen Chen, Xiangqi Liu, Jiachen Jiao, Muyuan Zou, Yixuan Luo, Qiong Wu, Ningyuan Zhang, Yanfeng Guo, Lei Shu

Abstract: The recently discovered superconductivity with critical temperature $T_c$ up to 80 K in the double-layer nickelate La$_3$Ni$_2$O$_{7-δ}$ under pressure has drawn great attention. Here we report the positive muon spin relaxation ($μ^+$SR) study of polycrystalline La$_3$Ni$_2$O$_{6.92}$ under ambient pressure. Zero-field $μ^+$SR experiments reveal the existence of magnetic order in La$_3$Ni$_2$O$_{6.92}$ with $T_{N}=154\ \rm{K}$. The weak transverse field $μ^+$SR measurements confirm the bulk nature of magnetism. In addition, a small quantity of oxygen deficiencies can greatly broaden the internal magnetic field distribution sensed by muons.

Submitted 13 May, 2024; v1 submitted 27 November, 2023; originally announced November 2023.

arXiv:2311.14020 [pdf, ps, other]

doi 10.1103/PhysRevA.109.042623

Quantum Simulation of Bound-State-Enhanced Quantum Metrology

Authors: Cheng-Ge Liu, Cong-Wei Lu, Na-Na Zhang, Qing Ai

Abstract: Quantum metrology explores quantum effects to improve the measurement accuracy of some physical quantities beyond the classical limit. However, due to the interaction between the system and the environment, the decoherence can significantly reduce the accuracy of the measurement. Many methods have been proposed to restore the accuracy of the measurement in the long-time limit. Recently, it has been found that the bound state can assist the error-free measurement and recover the $t^{-1}$ scaling [K. Bai, Z. Peng, H. G. Luo, and J. H. An, Phys. Rev. Lett. 123, 040402 (2019)]. Here, by using $N$-qubits, we propose a method to simulate the open quantum dynamics of the hybrid system including one atom and coupled resonators. We find that the error of the measurement can vanish as the time increases due to the existence of the bound state. By both analytical and numerical simulations, we prove the $t^{-1}$ scaling of the measurement error can be recovered when there is a bound state in the hybrid system. Interestingly, we observe that there are perfect oscillations which can be used for the evaluation of the atomic transition frequency. For a finite-$N$, the duration of the perfect oscillations doubles as one more qubit is involved.

Submitted 2 May, 2024; v1 submitted 23 November, 2023; originally announced November 2023.

Comments: 9 pages, 9 figures

Journal ref: Phys. Rev. A 109, 042623 (2024)

arXiv:2311.13948 [pdf, ps, other]

Non-generic bound states in the continuum in waveguides with lateral leakage channels

Authors: Nan Zhang, Ya Yan Lu

Abstract: For optical waveguides with a layered background which itself is a slab waveguide, a guided mode is a bound state in the continuum (BIC), if it coexists with slab modes propagating outwards in the lateral direction; i.e., there are lateral leakage channels. It is known that generic BICs in optical waveguides with lateral leakage channels are robust in the sense that they still exist if the waveguide is perturbed arbitrarily. However, the theory is not applicable to non-generic BICs which can be defined precisely. Near a BIC, the waveguide supports resonant and leaky modes with a complex frequency and a complex propagation constant, respectively. In this paper, we develop a perturbation theory to show that the resonant and leaky modes near a non-generic BIC have an ultra-high $Q$ factor and ultra-low leakage loss, respectively. We also show that a merging-BIC obtained by tuning structural parameters is always a non-generic BIC. Existing studies on merging-BICs are concerned with specific examples and specific parameters. We analyze an arbitrary structural perturbation (to a waveguide supporting a non-generic BIC) given by $δF({\bf r})$, where $F({\bf r})$ is the perturbation profile and $δ$ is the amplitude, and show that the perturbed waveguide has two BICs for $δ>0$ (or $δ<0$) and no BIC for $δ<0$ (or $δ>0$). This implies that a non-generic BIC is a merging-BIC (for any perturbation profile $F$) when $δ$ is regarded as a parameter. Our study indicates that non-generic BICs have interesting special properties that are useful in applications.

Submitted 23 November, 2023; originally announced November 2023.

arXiv:2311.13162 [pdf, other]

Top-L Most Influential Community Detection Over Social Networks (Technical Report)

Authors: Nan Zhang, Yutong Ye, Xiang Lian, Mingsong Chen

Abstract: In many real-world applications such as social network analysis and online marketing/advertising, the community detection is a fundamental task to identify communities (subgraphs) in social networks with high structural cohesiveness. While previous works focus on detecting communities alone, they do not consider the collective influences of users in these communities on other user nodes in social networks. Inspired by this, in this paper, we investigate the influence propagation from some seed communities and their influential effects that result in the influenced communities. We propose a novel problem, named Top-L most Influential Community DEtection (TopL-ICDE) over social networks, which aims to retrieve top-L seed communities with the highest influences, having high structural cohesiveness, and containing user-specified query keywords. In order to efficiently tackle the TopL-ICDE problem, we design effective pruning strategies to filter out false alarms of seed communities and propose an effective index mechanism to facilitate efficient Top-L community retrieval. We develop an efficient TopL-ICDE answering algorithm by traversing the index and applying our proposed pruning strategies. We also formulate and tackle a variant of TopL-ICDE, named diversified top-L most influential community detection (DTopL-ICDE), which returns a set of L diversified communities with the highest diversity score (i.e., collaborative influences by L communities). We prove that DTopL-ICDE is NP-hard, and propose an efficient greedy algorithm with our designed diversity score pruning. Through extensive experiments, we verify the efficiency and effectiveness of our proposed TopL-ICDE and DTopL-ICDE approaches over real/synthetic social networks under various parameter settings.

Submitted 1 March, 2024; v1 submitted 22 November, 2023; originally announced November 2023.

arXiv:2311.09101 [pdf, other]

Towards A Unified View of Answer Calibration for Multi-Step Reasoning

Authors: Shumin Deng, Ningyu Zhang, Nay Oo, Bryan Hooi

Abstract: Large Language Models (LLMs) employing Chain-of-Thought (CoT) prompting have broadened the scope for improving multi-step reasoning capabilities. We generally divide multi-step reasoning into two phases: path generation to generate the reasoning path(s); and answer calibration post-processing the reasoning path(s) to obtain a final answer. However, the existing literature lacks systematic analysis on different answer calibration approaches. In this paper, we summarize the taxonomy of recent answer calibration techniques and break them down into step-level and path-level strategies. We then conduct a thorough evaluation on these strategies from a unified view, systematically scrutinizing step-level and path-level answer calibration across multiple paths. Experimental results reveal that integrating the dominance of both strategies tends to derive optimal outcomes. Our study holds the potential to illuminate key insights for optimizing multi-step reasoning with answer calibration.

Submitted 25 February, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

Comments: Work in Progress

arXiv:2311.08063 [pdf, ps, other]

Enhanced mechanical squeezing in an optomechanical system via backward stimulated Brillouin scattering

Authors: Shan-Shan Chen, Na-Na Zhang, Yong-Rui Guo, Huan Yang, Yong Ma

Abstract: We investigate theoretically the enhancement of mechanical squeezing in a multimode optomechanical system by introducing a coherent phonon-photon interaction via the backward stimulated Brillouin scattering (BSBS) process. The coherent photon-phonon interaction where two optical modes couple to a Brillouin acoustic mode with a large decay rate provides an extra channel for the cooling of a Duffing mechanical oscillator. The squeezing degree and the robustness to the thermal noises of the Duffing mechanical mode can be enhanced greatly. When the Duffing nonlinearity is weak, the squeezing degree of the mechanical mode in the presence of BSBS can be improved more than one order of magnitude compared with the absence of BSBS. Our scheme may be extended to other quantum systems to study novel quantum effects.

Submitted 14 November, 2023; originally announced November 2023.

arXiv:2311.07884 [pdf, other]

Fair Abstractive Summarization of Diverse Perspectives

Authors: Yusen Zhang, Nan Zhang, Yixin Liu, Alexander Fabbri, Junru Liu, Ryo Kamoi, Xiaoxin Lu, Caiming Xiong, Jieyu Zhao, Dragomir Radev, Kathleen McKeown, Rui Zhang

Abstract: People from different social and demographic groups express diverse perspectives and conflicting opinions on a broad set of topics such as product reviews, healthcare, law, and politics. A fair summary should provide a comprehensive coverage of diverse perspectives without underrepresenting certain groups. However, current work in summarization metrics and Large Language Models (LLMs) evaluation has not explored fair abstractive summarization. In this paper, we systematically investigate fair abstractive summarization for user-generated data. We first formally define fairness in abstractive summarization as not underrepresenting perspectives of any groups of people, and we propose four reference-free automatic metrics by measuring the differences between target and source perspectives. We evaluate nine LLMs, including three GPT models, four LLaMA models, PaLM 2, and Claude, on six datasets collected from social media, online reviews, and recorded transcripts. Experiments show that both the model-generated and the human-written reference summaries suffer from low fairness. We conduct a comprehensive analysis of the common factors influencing fairness and propose three simple but effective methods to alleviate unfair summarization. Our dataset and code are available at https://github.com/psunlpgroup/FairSumm.

Submitted 29 March, 2024; v1 submitted 13 November, 2023; originally announced November 2023.

Comments: NAACL 2024

arXiv:2311.06140 [pdf, other]

doi 10.1021/acs.chemmater.3c03228

Atomistic origins of asymmetric charge-discharge kinetics in off-stoichiometric LiNiO$_2$

Authors: Penghao Xiao, Ning Zhang, Harold Smith Perez, Minjoon Park

Abstract: LiNiO$_2$ shows poor Li transport kinetics at the ends of charge and discharge in the first cycle, which significantly reduces its available capacity in practice. The atomistic origins of these kinetic limits have not been fully understood. Here, we examine Li transport in LiNiO$_2$ by first-principles-based kinetic Monte Carlo simulations where both long time scale and large length scale are achieved, enabling direct comparison with experiments. Our results reveal the rate-limiting steps at both ends of the voltage scan and distinguish the differences between charge and discharge at the same Li content. The asymmetric effects of excess Ni in the Li layer (Ni$_\textrm{Li}$) are also captured in our unified modelling framework. In the low voltage region, the first cycle capacity loss due to high overpotential at the end of discharge is reproduced without empirical input. While the Li concentration gradient is found responsible for the low overpotential during charge at this state of charge. Ni$_\textrm{Li}$ increases the overpotential of discharge but not charge because it only impedes Li diffusion in a particular range of Li concentration and does not change the equilibrium voltage profile. The trends from varying the amount of Ni$_\textrm{Li}$ and temperature agree with experiments. In the high voltage region, charge becomes the slower process. The bottleneck becomes moving a Li from the Li-rich phase (H2) into the Li-poor phase (H3), while the Li hopping barriers in both phases are relatively low. The roles of preexisting nucleation sites and Ni$_\textrm{Li}$ are discussed. These results provide new atomistic insights of the kinetic hindrances, paving the road to unleash the full potential of high-Ni layered oxide cathodes.

Submitted 10 November, 2023; originally announced November 2023.

arXiv:2311.05441 [pdf, other]

5d SCFTs from Isolated Complete Intersection Singularities

Authors: Jisheng Mu, Yi-Nan Wang, Hao N. Zhang

Abstract: In this paper, we explore the zoo of 5d superconformal field theories (SCFTs) constructed from M-theory on Isolated Complete Intersection Singularities (ICIS). We systematically investigate the crepant resolution of such singularities, and obtain a classification of rank $\leqslant 10$ models with a smooth crepant resolution and smooth exceptional divisors, as well as a number of infinite sequences with the same smoothness properties. For these models, we study their Coulomb branch properties and compute the flavor symmetry algebra from the resolved CY3 and/or the magnetic quiver. We check the validity of the conjectures relating the properties of the 5d SCFT and the 4d $\mathcal{N}=2$ SCFT from IIB superstring on the same singularity. When the 4d $\mathcal{N}=2$ SCFT has a Lagrangian quiver gauge theory description, one can obtain the magnetic quiver of the 5d theory by gauging flavor symmetry, which encodes the 5d Higgs branch information. Regarding the smoothness of the crepant resolution and integrality of 4d Coulomb branch spectrum, we find examples with a smooth resolved CY3 and smooth exceptional divisors, but fractional 4d Coulomb branch spectrum. Moreover, we compute the discrete (higher)-symmetries of the 5d/4d SCFTs from the link topology for a few examples.

Submitted 28 November, 2023; v1 submitted 9 November, 2023; originally announced November 2023.

Comments: v2, 87 pages

arXiv:2311.05132 [pdf, ps, other]

The non-perturbative stringy interaction between NS-brane & Dp brane

Authors: J. X. Lu, Nan Zhang

Abstract: To our best knowledge, the leading non-perturbative stringy interaction between an NS brane and a Dp brane remains unknown. We here present the non-perturbative stringy amplitudes for a system of an F-string and a Dp brane and a system of an NS 5 brane and a Dp brane for $0 \le p \le 6$. In either case, the F or NS5 and the Dp are placed parallel at a separation. We obtain the respective amplitudes, starting from the amplitude for a system of a D1 and a D3 for the former and that for a system of a D5 and a D3 system for the latter, based on the IIB S-duality and various T-dualities plus the consistency of both, along with the respective known long-range amplitudes. We would like to point out that the amplitude for the D1/D3 or D3/D5 computed from the usual D-brane technique does not take into consideration of the non-perturbative contribution due to the exchange of virtual closed D-string emitted by the D3. As such the resulting amplitudes obtained from this one via the S-duality and followed by various T-dualities are not consistent with the IIB S-duality. We resolve this issue and obtain the corresponding consistent amplitudes. The implications of so obtained amplitudes are also discussed.

Submitted 26 November, 2023; v1 submitted 8 November, 2023; originally announced November 2023.

Comments: 20 pages, 1 table, improved version, two references added

Report number: USTC-ICTS/PCFT-23-34

arXiv:2311.04487 [pdf, ps, other]

Principal specializations of Schubert polynomials, multi-layered permutations and asymptotics

Authors: Ningxin Zhang

Abstract: Let $v(n)$ be the largest principal specialization of Schubert polynomials for layered permutations $v(n) := \max_{w \in \mathcal{L}_n} \mathfrak{S}_w(1,\ldots,1)$. Morales, Pak and Panova proved that there is a limit \[\lim_{n \to \infty} \frac{\log v(n)}{n^2},\] and gave a precise description of layered permutations reaching the maximum. In this paper, we extend Morales, Pak and Panova's results to generalized principal specialization $\mathfrak{S}_w(1,q,q^2,\ldots)$ for multi-layered permutations when $q$ equals a root of unity.

Submitted 8 November, 2023; originally announced November 2023.

Comments: 16 pages

Showing 51–100 of 1,138 results for author: Zhang, N