Search | arXiv e-print repository

Image-Conditional Diffusion Transformer for Underwater Image Enhancement

Authors: Xingyang Nie, Su Pan, Xiaoyu Zhai, Shifei Tao, Fengzhong Qu, Biao Wang, Huilin Ge, Guojie Xiao

Abstract: Underwater image enhancement (UIE) has attracted much attention owing to its importance for underwater operation and marine engineering. Motivated by the recent advance in generative models, we propose a novel UIE method based on image-conditional diffusion transformer (ICDT). Our method takes the degraded underwater image as the conditional input and converts it into latent space where ICDT is applied. ICDT replaces the conventional U-Net backbone in a denoising diffusion probabilistic model (DDPM) with a transformer, and thus inherits favorable properties such as scalability from transformers. Furthermore, we train ICDT with a hybrid loss function involving variances to achieve better log-likelihoods, which meanwhile significantly accelerates the sampling process. We experimentally assess the scalability of ICDTs and compare with prior works in UIE on the Underwater ImageNet dataset. Besides good scaling properties, our largest model, ICDT-XL/2, outperforms all comparison methods, achieving state-of-the-art (SOTA) quality of image enhancement.

Submitted 7 July, 2024; originally announced July 2024.

arXiv:2407.03128 [pdf]

Thorium doped strontium fluoride crystal: a unique candidate for solid nuclear optical clock material

Authors: Qiaorui Gong, Shanming Li, Shulong Zhang, Siliang Tao, Guoliang Deng, Peixiong Zhang, Chengchun Zhao, Yin Hang, Shining Zhu, Longsheng Ma

Abstract: We report a candidate with unique advantages in the cultivation of solid-state nuclear clock material, Th:SrF2 crystal. It not only has a segregation coefficient close to 1, which can achieve highly efficient and uniform doping of Th, but also ensures a high transmittance (~69% at 150 nm) while achieving extremely high doping concentration (232Th > 6×10^20 cm^-3). In addition, SrF2 crystal will not become irradiation-colored under strong α radiation like CaF2 crystal, so Th:SrF2 crystal is expected to fully exploit its high-concentration doping characteristics while ensuring that its transmission performance in the nuclear transition band is not severely affected by 229Th radiation damage.

Submitted 3 July, 2024; originally announced July 2024.

arXiv:2407.01896 [pdf, other]

LogEval: A Comprehensive Benchmark Suite for Large Language Models In Log Analysis

Authors: Tianyu Cui, Shiyu Ma, Ziang Chen, Tong Xiao, Shimin Tao, Yilun Liu, Shenglin Zhang, Duoming Lin, Changchang Liu, Yuzhe Cai, Weibin Meng, Yongqian Sun, Dan Pei

Abstract: Log analysis is crucial for ensuring the orderly and stable operation of information systems, particularly in the field of Artificial Intelligence for IT Operations (AIOps). Large Language Models (LLMs) have demonstrated significant potential in natural language processing tasks. In the AIOps domain, they excel in tasks such as anomaly detection, root cause analysis of faults, operations and maintenance script generation, and alert information summarization. However, the performance of current LLMs in log analysis tasks remains inadequately validated. To address this gap, we introduce LogEval, a comprehensive benchmark suite designed to evaluate the capabilities of LLMs in various log analysis tasks for the first time. This benchmark covers tasks such as log parsing, log anomaly detection, log fault diagnosis, and log summarization. LogEval evaluates each task using 4,000 publicly available log data entries and employs 15 different prompts for each task to ensure a thorough and fair assessment. By rigorously evaluating leading LLMs, we demonstrate the impact of various LLM technologies on log analysis performance, focusing on aspects such as self-consistency and few-shot contextual learning. We also discuss findings related to model quantification, Chinese-English question-answering evaluation, and prompt engineering. These findings provide insights into the strengths and weaknesses of LLMs in multilingual environments and the effectiveness of different prompt strategies.
Various evaluation methods are employed for different tasks to accurately measure the performance of LLMs in log analysis, ensuring a comprehensive assessment. The insights gained from LogEval's evaluation reveal the strengths and limitations of LLMs in log analysis tasks, providing valuable guidance for researchers and practitioners.

Submitted 1 July, 2024; originally announced July 2024.

arXiv:2406.07225 [pdf, other]

A generic and robust quantum agent inspired by deep meta-reinforcement learning

Authors: Zibo Miao, Shihui Zhang, Yu Pan, Sibo Tao, Yu Chen

Abstract: Deep reinforcement learning (deep RL) has enabled human-level or superhuman performance in various applications. Recently, deep RL has also been adopted to improve the performance of quantum control. However, a large volume of data is typically required to train the neural network in deep RL, making it inefficient compared with the traditional optimal quantum control method. Here, we thus develop a new training algorithm inspired by deep meta-reinforcement learning (deep meta-RL), which requires significantly less training data. The trained neural network is adaptive and robust. In addition, we apply the proposed algorithm to design the Hadamard gate and show that, for a wide range of parameters, the infidelity of the obtained gate can be made of the order 0.0001. Our algorithm can also automatically adjust the number of pulses required to generate the target gate, unlike the traditional optimal quantum control method, which typically fixes the number of pulses a priori. The results of this paper can pave the way towards constructing a universally robust quantum agent catering to the different demands in quantum technologies.

Submitted 11 June, 2024; originally announced June 2024.

arXiv:2406.00276 [pdf]

Non-destructive Degradation Pattern Decoupling for Ultra-early Battery Prototype Verification Using Physics-informed Machine Learning

Authors: Shengyu Tao, Mengtian Zhang, Zixi Zhao, Haoyang Li, Ruifei Ma, Yunhong Che, Xin Sun, Lin Su, Xiangyu Chen, Zihao Zhou, Heng Chang, Tingwei Cao, Xiao Xiao, Yaojun Liu, Wenjun Yu, Zhongling Xu, Yang Li, Han Hao, Xuan Zhang, Xiaosong Hu, Guangmin Zhou

Abstract: Manufacturing complexities and uncertainties have impeded the transition from material prototypes to commercial batteries, making prototype verification critical to quality assessment. A fundamental challenge involves deciphering intertwined chemical processes to characterize degradation patterns and their quantitative relationship with battery performance. Here we show that a physics-informed machine learning approach can quantify and visualize temporally resolved losses concerning thermodynamics and kinetics using only electric signals. Our method enables non-destructive degradation pattern characterization, expediting temperature-adaptable predictions of entire lifetime trajectories, rather than end-of-life points. Verification is 25 times faster while maintaining 95.1% accuracy across temperatures. Such advances facilitate more sustainable management of defective prototypes before mass production, establishing a 19.76 billion USD scrap material recycling market by 2060 in China. By incorporating stepwise charge acceptance as a measure of the initial manufacturing variability of nominally identical batteries, we can immediately identify long-term degradation variations. We attribute the predictive power to interpreting machine learning insights using material-agnostic featurization taxonomy for degradation pattern decoupling.
Our findings offer new possibilities for dynamic system analysis, such as battery prototype degradation, demonstrating that complex pattern evolutions can be accurately predicted in a non-destructive and data-driven fashion by integrating physics-informed machine learning.

Submitted 31 May, 2024; originally announced June 2024.

ACM Class: J.2; G.3

arXiv:2405.18643 [pdf, other]

Temperature-Dependent Chirality in Halide Perovskites

Authors: Mike Pols, Geert Brocks, Sofía Calero, Shuxia Tao

Abstract: Using chiral organic cations in two-dimensional metal halide perovskites, chirality can be induced in the metal halide layers, which results in semiconductors with intriguing chiral optical and spin-selective transport properties. The chiral properties strongly depend on temperature, despite the basic crystal symmetry not changing fundamentally. We identify a set of descriptors that characterize the chirality of metal halide perovskites such as MBA$_{2}$PbI$_{4}$, and study their temperature dependence using molecular dynamics simulations with on-the-fly machine-learning force fields obtained from density functional theory calculations. We find that, whereas the organic cations remain chiral upon increasing the temperature, the inorganic framework loses this property more rapidly. We ascribe this to the breaking of hydrogen bonds that link the organic with the inorganic substructures, which leads to a loss of chirality transfer.

Submitted 28 May, 2024; originally announced May 2024.

Comments: 16 pages, 4 figures

arXiv:2405.10681 [pdf, other]

Know in AdVance: Linear-Complexity Forecasting of Ad Campaign Performance with Evolving User Interest

Authors: XiaoYu Wang, YongHui Guo, Hui Sheng, Peili Lv, Chi Zhou, Wei Huang, ShiQin Ta, Dongbo Huang, Xiu** Yang, Lan Xu, Hao Zhou, Yusheng Ji

Abstract: Real-time Bidding (RTB) advertisers wish to know in advance the expected cost and yield of ad campaigns to avoid trial-and-error expenses. However, Campaign Performance Forecasting (CPF), a sequence modeling task involving tens of thousands of ad auctions, poses challenges of evolving user interest, auction representation, and long context, making coarse-grained and static-modeling methods sub-optimal. We propose AdVance, a time-aware framework that integrates local auction-level and global campaign-level modeling. User preference and fatigue are disentangled using a time-positioned sequence of clicked items and a concise vector of all displayed items. Cross-attention, conditioned on the fatigue vector, captures the dynamics of user interest toward each candidate ad. Bidders compete with each other, presenting a complete graph similar to the self-attention mechanism. Hence, we employ a Transformer Encoder to compress each auction into embedding by solving auxiliary tasks. These sequential embeddings are then summarized by a conditional state space model (SSM) to comprehend long-range dependencies while maintaining global linear complexity. Considering the irregular time intervals between auctions, we make SSM's parameters dependent on the current auction embedding and the time interval. We further condition SSM's global predictions on the accumulation of local results. Extensive evaluations and ablation studies demonstrate its superiority over state-of-the-art methods.
AdVance has been deployed on the Tencent Advertising platform, and A/B tests show a remarkable 4.5% uplift in Average Revenue per User (ARPU).

Submitted 17 May, 2024; originally announced May 2024.

Comments: 12 pages, 4 figures, accepted at ACM SIGKDD 2024

arXiv:2405.03379 [pdf, other]

Reverse Forward Curriculum Learning for Extreme Sample and Demonstration Efficiency in Reinforcement Learning

Authors: Stone Tao, Arth Shukla, Tse-kai Chan, Hao Su

Abstract: Reinforcement learning (RL) presents a promising framework to learn policies through environment interaction, but often requires an infeasible amount of interaction data to solve complex tasks from sparse rewards. One direction includes augmenting RL with offline data demonstrating desired tasks, but past work often requires a lot of high-quality demonstration data that is difficult to obtain, especially for domains such as robotics. Our approach consists of a reverse curriculum followed by a forward curriculum. Unique to our approach compared to past work is the ability to efficiently leverage more than one demonstration via a per-demonstration reverse curriculum generated via state resets. The result of our reverse curriculum is an initial policy that performs well on a narrow initial state distribution and helps overcome difficult exploration problems. A forward curriculum is then used to accelerate the training of the initial policy to perform well on the full initial state distribution of the task and improve demonstration and sample efficiency. We show how the combination of a reverse curriculum and forward curriculum in our method, RFCL, enables significant improvements in demonstration and sample efficiency compared against various state-of-the-art learning-from-demonstration baselines, even solving previously unsolvable tasks that require high precision and control.

Submitted 6 May, 2024; originally announced May 2024.

Comments: Accepted at The Twelfth International Conference on Learning Representations (ICLR 2024). Website: https://reverseforward-cl.github.io/

arXiv:2404.17287 [pdf, other]

When to Trust LLMs: Aligning Confidence with Response Quality

Authors: Shuchang Tao, Liuyi Yao, Hanxing Ding, Yuexiang Xie, Qi Cao, Fei Sun, **yang Gao, Huawei Shen, Bolin Ding

Abstract: Despite the success of large language models (LLMs) in natural language generation, much evidence shows that LLMs may produce incorrect or nonsensical text. This limitation highlights the importance of discerning when to trust LLMs, especially in safety-critical domains. Existing methods often express reliability by confidence level; however, their effectiveness is limited by the lack of objective guidance. To address this, we propose the CONfidence-Quality-ORDer-preserving alignment approach (CONQORD), which leverages reinforcement learning guided by a tailored dual-component reward function. This function integrates quality reward and order-preserving alignment reward functions. Specifically, the order-preserving reward incentivizes the model to verbalize greater confidence for responses of higher quality to align the order of confidence and quality. Experiments demonstrate that CONQORD significantly improves the alignment performance between confidence and response accuracy, without causing over-cautiousness. Furthermore, the aligned confidence provided by CONQORD informs when to trust LLMs, and acts as a determinant for initiating the retrieval process of external knowledge. Aligning confidence with response quality ensures more transparent and reliable responses, providing better trustworthiness.

Submitted 9 June, 2024; v1 submitted 26 April, 2024; originally announced April 2024.

Comments: Accepted by ACL 2024

arXiv:2404.16280 [pdf, ps, other]

An Efficient Reconstructed Differential Evolution Variant by Some of the Current State-of-the-art Strategies for Solving Single Objective Bound Constrained Problems

Authors: Sichen Tao, Ruihan Zhao, Kaiyu Wang, Shangce Gao

Abstract: Complex single-objective bounded problems are often difficult to solve. Among evolutionary computation methods, the differential evolution algorithm has been widely studied and developed since its proposal in 1997 due to its simplicity and efficiency. These developments include various adaptive strategies, operator improvements, and the introduction of other search methods. After 2014, LSHADE-based variants have also been widely studied. However, although recently proposed improvement strategies have shown superiority over their previous generation's first performance, adding all new strategies may not necessarily bring the strongest performance. Therefore, we recombine some effective advances based on advanced differential evolution variants in recent years and finally determine an effective combination scheme to further promote the performance of differential evolution. In this paper, we propose a strategy recombination and reconstruction differential evolution algorithm called reconstructed differential evolution (RDE) to solve single-objective bounded optimization problems. Based on the benchmark suite of the 2024 IEEE Congress on Evolutionary Computation (CEC2024), we tested RDE and several other advanced differential evolution variants. The experimental results show that RDE has superior performance in solving complex optimization problems.

Submitted 24 April, 2024; originally announced April 2024.

arXiv:2403.14118 [pdf, other]

From Handcrafted Features to LLMs: A Brief Survey for Machine Translation Quality Estimation

Authors: Haofei Zhao, Yilun Liu, Shimin Tao, Weibin Meng, Yimeng Chen, Xiang Geng, Chang Su, Min Zhang, Hao Yang

Abstract: Machine Translation Quality Estimation (MTQE) is the task of estimating the quality of machine-translated text in real time without the need for reference translations, which is of great importance for the development of MT. After two decades of evolution, QE has yielded a wealth of results. This article provides a comprehensive overview of QE datasets, annotation methods, shared tasks, methodologies, challenges, and future research directions. It begins with an introduction to the background and significance of QE, followed by an explanation of the concepts and evaluation metrics for word-level QE, sentence-level QE, document-level QE, and explainable QE. The paper categorizes the methods developed throughout the history of QE into those based on handcrafted features, deep learning, and Large Language Models (LLMs), with a further division of deep learning-based methods into classic deep learning and those incorporating pre-trained language models (LMs). Additionally, the article details the advantages and limitations of each method and offers a straightforward comparison of different approaches. Finally, the paper discusses the current challenges in QE research and provides an outlook on future research directions.

Submitted 21 March, 2024; originally announced March 2024.

Comments: Accepted by IJCNN 2024

arXiv:2403.09135 [pdf, other]

Towards Proactive Interactions for In-Vehicle Conversational Assistants Utilizing Large Language Models

Authors: Huifang Du, Xue**g Feng, Jun Ma, Meng Wang, Shiyu Tao, Yijie Zhong, Yuan-Fang Li, Haofen Wang

Abstract: Research demonstrates that the proactivity of in-vehicle conversational assistants (IVCAs) can help to reduce distractions and enhance driving safety, better meeting users' cognitive needs. However, existing IVCAs struggle with user intent recognition and context awareness, which leads to suboptimal proactive interactions. Large language models (LLMs) have shown potential for generalizing to various tasks with prompts, but their application in IVCAs and their use for proactive interaction remain under-explored. These gaps raise questions about how LLMs improve proactive interactions for IVCAs and influence user perception. To investigate these questions systematically, we establish a framework with five proactivity levels across two dimensions (assumption and autonomy) for IVCAs. According to the framework, we propose a "Rewrite + ReAct + Reflect" strategy, aiming to empower LLMs to fulfill the specific demands of each proactivity level when interacting with users. Both feasibility and subjective experiments are conducted. The LLM outperforms the state-of-the-art model in success rate and achieves satisfactory results for each proactivity level. Subjective experiments with 40 participants validate the effectiveness of our framework and show the proactive level with strong assumptions and user confirmation is most appropriate.

Submitted 14 March, 2024; originally announced March 2024.

arXiv:2403.05545 [pdf]

Unveiling the influence of behavioural, built environment and socio-economic features on the spatial and temporal variability of bus use using explainable machine learning

Authors: Sui Tao, Francisco Rowe, Hongyu Shan

Abstract: Understanding the variability of people's travel patterns is key to transport planning and policy-making. However, to what extent daily transit use displays geographic and temporal variabilities, and what the contributing factors are, have not been fully addressed. Drawing on smart card data in Beijing, China, this study seeks to address these deficits by adopting new indices to capture the spatial and temporal variability of bus use during peak hours and investigate their associations with relevant contextual features. Using explainable machine learning, our findings reveal non-linear interaction between spatial and temporal variability and trip frequency. Furthermore, greater distance to the urban centres (>10 kilometres) is associated with increased spatial variability of bus use, while greater separation of trip origins and destinations from the subcentres reduces both spatial and temporal variability. Higher availability of bus routes is linked to higher spatial variability but lower temporal variability. Meanwhile, both lower and higher road density is associated with higher spatial variability of bus use, especially in the morning. These findings indicate that different built environment features moderate the flexibility of travel time and locations. Implications are derived to inform more responsive and reliable operation and planning of transit systems.

Submitted 6 February, 2024; originally announced March 2024.

Comments: 58 pages including supplementary material

arXiv:2403.04980 [pdf, other]

Photonic simulation of Majorana-based Jones polynomials

Authors: Jia-Kun Li, Kai Sun, Ze-Yan Hao, Jia-He Liang, Si-**g Tao, Jiannis K. Pachos, **-Shi Xu, Yong-Jian Han, Chuan-Feng Li, Guang-Can Guo

Abstract: Jones polynomials were introduced as a tool to distinguish between topologically different links. Recently, they emerged as the central building block of topological quantum computation: by braiding non-Abelian anyons it is possible to realise quantum algorithms through the computation of Jones polynomials. So far, it has been a formidable task to evaluate Jones polynomials through the control and manipulation of non-Abelian anyons. In this study, a photonic quantum system employing two-photon correlations and non-dissipative imaginary-time evolution is utilized to simulate two inequivalent braiding operations of Majorana zero modes. The resulting amplitudes are shown to be mathematically equivalent to Jones polynomials at a particular value of their parameter. The high fidelity of our optical platform allows us to distinguish between a wide range of links, such as Hopf links, Solomon links, Trefoil knots, Figure Eight knots and Borromean rings, through determining their corresponding Jones polynomials. Our photonic quantum simulator represents a significant step towards executing fault-tolerant quantum algorithms based on topological quantum encoding and manipulation.

Submitted 31 May, 2024; v1 submitted 7 March, 2024; originally announced March 2024.

arXiv:2402.18191 [pdf, other]

Clustering and Ranking: Diversity-preserved Instruction Selection through Expert-aligned Quality Estimation

Authors: Yuan Ge, Yilun Liu, Chi Hu, Weibin Meng, Shimin Tao, Xiaofeng Zhao, Hongxia Ma, Li Zhang, Hao Yang, Tong Xiao

Abstract: With contributions from the open-source community, a vast amount of instruction tuning (IT) data has emerged. Given the significant resource allocation required by training and evaluating models, it is advantageous to have an efficient method for selecting high-quality IT data. However, existing methods for instruction data selection have limitations such as relying on fragile external APIs, being affected by biases in GPT models, or reducing the diversity of the selected instruction dataset. In this paper, we propose an industrial-friendly, expert-aligned and diversity-preserved instruction data selection method: Clustering and Ranking (CaR). CaR consists of two steps. The first step involves ranking instruction pairs using a scoring model that is well aligned with expert preferences (achieving an accuracy of 84.25%). The second step involves preserving dataset diversity through a clustering process. In our experiment, CaR selected a subset containing only 1.96% of Alpaca's IT data, yet the underlying AlpaCaR model trained on this subset outperforms Alpaca by an average of 32.1% in GPT-4 evaluations. Furthermore, our method utilizes small models (355M parameters) and requires only 11.2% of the monetary cost compared to existing methods, making it easily deployable in industrial scenarios.

Submitted 28 February, 2024; originally announced February 2024.

arXiv:2402.15200 [pdf, other]

DeMPT: Decoding-enhanced Multi-phase Prompt Tuning for Making LLMs Be Better Context-aware Translators

Authors: Xinglin Lyu, Junhui Li, Yanqing Zhao, Min Zhang, Daimeng Wei, Shimin Tao, Hao Yang, Min Zhang

Abstract: Generally, decoder-only large language models (LLMs) are adapted to context-aware neural machine translation (NMT) in a concatenating way, where LLMs take the concatenation of the source sentence (i.e., intra-sentence context) and the inter-sentence context as the input, and then generate the target tokens sequentially. This adaptation strategy, i.e., concatenation mode, considers intra-sentence and inter-sentence contexts with the same priority, despite an apparent difference between the two kinds of contexts. In this paper, we propose an alternative adaptation approach, named Decoding-enhanced Multi-phase Prompt Tuning (DeMPT), to make LLMs discriminately model and utilize the inter- and intra-sentence context and more effectively adapt LLMs to context-aware NMT. First, DeMPT divides the context-aware NMT process into three separate phases. During each phase, different continuous prompts are introduced to make LLMs discriminately model various information. Second, DeMPT employs a heuristic way to further discriminately enhance the utilization of the source-side inter- and intra-sentence information at the final decoding phase. Experiments show that our approach significantly outperforms the concatenation method, and further improves the performance of LLMs in discourse modeling.

Submitted 23 February, 2024; originally announced February 2024.

Comments: under reviewing

arXiv:2402.03075 [pdf, ps, other]

Some sharp bounds for Hardy type operators on mixed radial-angular type function spaces

Authors: Ronghui Liu, Yanqi Yang, Shuangping Tao

Abstract: In this paper, we are devoted to studying some sharp bounds for Hardy type operators on mixed radial-angular type function spaces. In addition, we will establish the sharp weak-type estimates for the fractional Hardy operator and its conjugate operator, respectively.

Submitted 5 February, 2024; originally announced February 2024.

Comments: 30 pages

MSC Class: 42B35; 26D10; 46E30; 26D15

arXiv:2401.13953 [pdf, other]

Accelerating Structural Optimization through Fingerprinting Space Integration on the Potential Energy Surface

Authors: Shuo Tao, Xuecheng Shao, Li Zhu

Abstract: Structural optimization has been a crucial component in computational materials research, and structure predictions have relied heavily on this technique in particular. In this study, we introduce a novel method that enhances the efficiency of local optimization by integrating an extra fingerprint space into the optimization process. Our approach utilizes a mixed energy concept in the hyper potential energy surface (PES), combining real energy and a newly introduced fingerprint energy derived from the symmetry of local atomic environment. This method strategically guides the optimization process toward high-symmetry, low-energy structures by leveraging the intrinsic symmetry of atomic configurations. The effectiveness of our approach was demonstrated through structural optimizations of silicon, silicon carbide, and Lennard-Jones cluster systems. Our results show that the fingerprint space biasing technique significantly enhances the performance and probability of discovering energetically favorable, high-symmetry structures, as compared to conventional optimizations. The proposed method is anticipated to streamline the search for new materials and facilitate the discovery of novel, energetically favorable configurations.

Submitted 25 January, 2024; originally announced January 2024.

Comments: 16 pages, 4 figures

arXiv:2401.11382 [pdf, other]

Using Large Language Model for End-to-End Chinese ASR and NER

Authors: Yuang Li, Jiawei Yu, Min Zhang, Mengxin Ren, Yanqing Zhao, Xiaofeng Zhao, Shimin Tao, Jinsong Su, Hao Yang

Abstract: Mapping speech tokens to the same feature space as text tokens has become the paradigm for the integration of speech modality into decoder-only large language models (LLMs). An alternative approach is to use an encoder-decoder architecture that incorporates speech features through cross-attention. This approach, however, has received less attention in the literature. In this work, we connect the Whisper encoder with ChatGLM3 and provide in-depth comparisons of these two approaches using Chinese automatic speech recognition (ASR) and name entity recognition (NER) tasks. We evaluate them not only by conventional metrics like the F1 score but also by a novel fine-grained taxonomy of ASR-NER errors. Our experiments reveal that encoder-decoder architecture outperforms decoder-only architecture with a short context, while decoder-only architecture benefits from a long context as it fully exploits all layers of the LLM. By using LLM, we significantly reduced the entity omission errors and improved the entity ASR accuracy compared to the Conformer baseline. Additionally, we obtained a state-of-the-art (SOTA) F1 score of 0.805 on the AISHELL-NER test set by using chain-of-thought (CoT) NER which first infers long-form ASR transcriptions and then predicts NER labels.

Submitted 6 June, 2024; v1 submitted 20 January, 2024; originally announced January 2024.

Comments: 5 pages, 2 figures, Accepted to InterSpeech 2024

arXiv:2401.05689 [pdf, other]

doi 10.1109/ICASSP49357.2023.10096194

UCorrect: An Unsupervised Framework for Automatic Speech Recognition Error Correction

Authors: Jiaxin Guo, Minghan Wang, Xiaosong Qiao, Daimeng Wei, Hengchao Shang, Zongyao Li, Zhengzhe Yu, Yinglu Li, Chang Su, Min Zhang, Shimin Tao, Hao Yang

Abstract: Error correction techniques have been used to refine the output sentences from automatic speech recognition (ASR) models and achieve a lower word error rate (WER). Previous works usually adopt end-to-end models and has strong dependency on Pseudo Paired Data and Original Paired Data. But when only pre-training on Pseudo Paired Data, previous models have negative effect on correction. While fine-tuning on Original Paired Data, the source side data must be transcribed by a well-trained ASR model, which takes a lot of time and not universal. In this paper, we propose UCorrect, an unsupervised Detector-Generator-Selector framework for ASR Error Correction. UCorrect has no dependency on the training data mentioned before. The whole procedure is first to detect whether the character is erroneous, then to generate some candidate characters and finally to select the most confident one to replace the error character. Experiments on the public AISHELL-1 dataset and WenetSpeech dataset show the effectiveness of UCorrect for ASR error correction: 1) it achieves significant WER reduction, achieves 6.83\% even without fine-tuning and 14.29\% after fine-tuning; 2) it outperforms the popular NAR correction models by a large margin with a competitive low latency; and 3) it is an universal method, as it reduces all WERs of the ASR model with different decoding strategies and reduces all WERs of ASR models trained on different scale datasets.

Submitted 11 January, 2024; originally announced January 2024.

Comments: Accepted in ICASSP 2023

arXiv:2312.04378 [pdf, other]

Operando pair distribution function analysis of nanocrystalline functional materials: the case of $\mathrm{TiO_{2}}$-bronze nanocrystals in Li-ion battery electrodes

Authors: Martin Aaskov Karlsen, Jonas Billet, Songsheng Tao, Isabel Van Driessche, Simon J. L. Billinge, Dorthe B. Ravnsbæk

Abstract: Structural modelling of $operando$ pair distribution function (PDF) data of functional materials can be highly complex. To aid the understanding of complex operando PDF data, we here demonstrate a toolbox for PDF analysis. The tools include the structureMining, similarityMapping, nmfMapping apps available through the online service 'PDF in the cloud' (PDFitc, www.pdfitc.org), as well as noise-filtering using principal component analysis (PCA). The tools are applied to both ex situ and operando PDF data for 3 nm $\mathrm{TiO_{2}}$-bronze nanocrystals, which function as the active electrode material in a Li-ion battery. The tools enable structural modelling of the ex situ and operando PDF data, revealing two pristine $\mathrm{TiO_{2}}$ phases (bronze and anatase) and two lithiated $\mathrm{Li_{x}TiO_{2}}$ phases (lithiated versions of bronze and anatase), and the phase evolution during Galvanostatic cycling is characterized.

Submitted 7 December, 2023; originally announced December 2023.

Comments: Preprint, 82 pages in total (front page: 1 page, abstract: 1 page, paper: 34 pages, supporting information: 40 pages, references: 5 pages, synopsis: 1 page), 35 figures in total (frontpage: 1 figure, paper: 8 figures, supporting information: 26 figures)

arXiv:2311.13246 [pdf, other]

CoachLM: Automatic Instruction Revisions Improve the Data Quality in LLM Instruction Tuning

Authors: Yilun Liu, Shimin Tao, Xiaofeng Zhao, Ming Zhu, Wenbing Ma, Junhao Zhu, Chang Su, Yutai Hou, Miao Zhang, Min Zhang, Hongxia Ma, Li Zhang, Hao Yang, Yanfei Jiang

Abstract: Instruction tuning is crucial for enabling Language Learning Models (LLMs) in responding to human instructions. The quality of instruction pairs used for tuning greatly affects the performance of LLMs. However, the manual creation of high-quality instruction datasets is costly, leading to the adoption of automatic generation of instruction pairs by LLMs as a popular alternative. To ensure the high quality of LLM-generated instruction datasets, several approaches have been proposed. Nevertheless, existing methods either compromise dataset integrity by filtering a large proportion of samples, or are unsuitable for industrial applications. In this paper, instead of discarding low-quality samples, we propose CoachLM, a novel approach to enhance the quality of instruction datasets through automatic revisions on samples in the dataset. CoachLM is trained from the samples revised by human experts and significantly increases the proportion of high-quality samples in the dataset from 17.7% to 78.9%. The effectiveness of CoachLM is further assessed on various real-world instruction test sets. The results show that CoachLM improves the instruction-following capabilities of the instruction-tuned LLM by an average of 29.9%, which even surpasses larger LLMs with nearly twice the number of parameters. Furthermore, CoachLM is successfully deployed in a data management system for LLMs at Huawei, resulting in an efficiency improvement of up to 20% in the cleaning of 40k real-world instruction pairs. We release various assets of CoachLM, including the training data, code and test set (https://github.com/lunyiliu/CoachLM).

Submitted 20 March, 2024; v1 submitted 22 November, 2023; originally announced November 2023.

Comments: Accepted by ICDE 2024

arXiv:2310.17885 [pdf, other]

Sobolev regularity for a class of local fractional new maximal operators

Authors: Rui Li, Shuangping Tao

Abstract: This paper is devoted to studying the regularity properties for the new maximal operator $M_{\varphi}$ and the fractional new maximal operator $M_{\varphi,β}$ in the local case. Some new pointwise gradient estimates of $M_{\varphi,Ω}$ and $M_{\varphi,β,Ω}$ are given. Moreover, the boundedness of $M_{\varphi,Ω}$ and $M_{\varphi,β,Ω}$ on Sobolev space is established. As applications, we also obtain the bounds of the above operators on Sobolev space with zero boundary values.

Submitted 27 October, 2023; originally announced October 2023.

arXiv:2309.14002 [pdf, other]

Calculating the Circular Dichroism of Chiral Halide Perovskites: A Tight-Binding Approach

Authors: Sofia Apergi, Geert Brocks, Shuxia Tao

Abstract: Chiral metal halide perovskites have emerged as promising optoelectronic materials for emission and detection of circular polarized visible light. Despite chirality being realized by adding chiral organic cations or ligands, the chiroptical activity originates from the metal halide framework. The mechanism is not well understood, as an overarching modeling framework is lacking. Capturing chirality requires going beyond electric dipole transitions, the common approximation in condensed matter calculations. We present a density functional theory (DFT) parameterized tight-binding (TB) model, which allows us to calculate optical properties including circular dichroism (CD) at low computational cost. Comparing Pb-based chiral perovskites with different organic cations and halide anions, we find that the structural helicity within the metal halide layers determines the size of the CD. Our results mark an important step in understanding the complex correlations of structural, electronic and optical properties of chiral perovskites, and provide a useful tool to predict new compounds with desired properties for novel optoelectronic applications.

Submitted 25 September, 2023; originally announced September 2023.

Comments: 19 pages, 4 figures

arXiv:2309.13230 [pdf, other]

Unify word-level and span-level tasks: NJUNLP's Participation for the WMT2023 Quality Estimation Shared Task

Authors: Xiang Geng, Zhejian Lai, Yu Zhang, Shimin Tao, Hao Yang, Jiajun Chen, Shujian Huang

Abstract: We introduce the submissions of the NJUNLP team to the WMT 2023 Quality Estimation (QE) shared task. Our team submitted predictions for the English-German language pair on both sub-tasks: (i) sentence- and word-level quality prediction; and (ii) fine-grained error span detection. This year, we further explore pseudo data methods for QE based on NJUQE framework (https://github.com/NJUNLP/njuqe). We generate pseudo MQM data using parallel data from the WMT translation task. We pre-train the XLMR large model on pseudo QE data, then fine-tune it on real QE data. At both stages, we jointly learn sentence-level scores and word-level tags. Empirically, we conduct experiments to find the key hyper-parameters that improve the performance. Technically, we propose a simple method that converts the word-level outputs to fine-grained error span results. Overall, our models achieved the best results in English-German for both word-level and fine-grained error span detection sub-tasks by a considerable margin.

Submitted 11 December, 2023; v1 submitted 22 September, 2023; originally announced September 2023.

Comments: WMT2023 System Paper

Journal ref: https://aclanthology.org/2023.wmt-1.71

arXiv:2309.12003 [pdf, ps, other]

A quaternary analogue of Tang-Ding codes

Authors: Minjia Shi, Sihui Tao, Jon-Lark Kim, Patrick Sole

Abstract: In a recent paper, Tang and Ding introduced a class of binary cyclic codes of rate close to one half with a designed lower bound on their minimum distance. The definition involves the base $2$ expansion of the integers in their defining set. In this paper we propose an analogue for quaternary codes. In addition, the performances of the subfield subcode and of the trace code (two binary cyclic codes) are investigated.

Submitted 21 September, 2023; originally announced September 2023.

arXiv:2309.09588 [pdf, other]

doi 10.1021/acs.jpcc.4c00563

Mixing I and Br in Inorganic Perovskites: Atomistic Insights from Reactive Molecular Dynamics Simulations

Authors: Mike Pols, Adri C. T. van Duin, Sofía Calero, Shuxia Tao

Abstract: All-inorganic halide perovskites have received a lot of attention as attractive alternatives to overcome the stability issues of hybrid halide perovskites that are commonly associated with organic cations. To find a compromise between the optoelectronic properties of CsPbI$_{3}$ and CsPbBr$_{3}$, perovskites with CsPb(Br$_{\rm{x}}$I$_{\rm{1-x}}$)$_{3}$ mixed compositions are commonly used. An additional benefit is that, without sacrificing the optoelectronic properties for applications such as solar cells or LEDs, small amounts of Br in CsPbI$_{3}$ can prevent the inorganic perovskite from degrading to a photoinactive nonperovskite yellow phase. Despite indications that strain in the perovskite lattice plays a role in the stabilization of the material, a full understanding of such strain is lacking. Here we develop a reactive force field (ReaxFF) for perovskites starting from our previous work for CsPbI$_{3}$, we extend this force field to CsPbBr$_{3}$ and mixed CsPb(Br$_{\rm{x}}$I$_{\rm{1-x}}$)$_{3}$ compounds. This force field is used in large-scale molecular dynamics simulations to study perovskite phase transitions and the internal ion dynamics associated with the phase transitions. We find that an increase of the Br content lowers the temperature at which the perovskite reaches a cubic structure. Specifically, by substituting Br for I, the smaller ionic radius of Br induces a strain in the lattice that changes the internal dynamics of the octahedra. Importantly, this effect propagates through the perovskite lattice ranging up to distances of 2 nm, explaining why small concentrations of Br in CsPb(Br$_{\rm{x}}$I$_{\rm{1-x}}$)$_{3}$ (x $\leq$ 1/4) have a significant impact on the phase stability of mixed halide perovskites.

Submitted 29 May, 2024; v1 submitted 18 September, 2023; originally announced September 2023.

Comments: 23 pages, 6 figures

Journal ref: J. Phys. Chem. C 128 (2024), 4111-4118

arXiv:2309.09552 [pdf, other]

A Multitask Training Approach to Enhance Whisper with Contextual Biasing and Open-Vocabulary Keyword Spotting

Authors: Yuang Li, Min Zhang, Chang Su, Yinglu Li, Xiaosong Qiao, Mengxin Ren, Miaomiao Ma, Daimeng Wei, Shimin Tao, Hao Yang

Abstract: The recognition of rare named entities, such as personal names and terminologies, is challenging for automatic speech recognition (ASR) systems, especially when they are not frequently observed in the training data. In this paper, we introduce keyword spotting enhanced Whisper (KWS-Whisper), a novel ASR system that leverages the Whisper model and performs open-vocabulary keyword spotting (OV-KWS) on the hidden states of the Whisper encoder to recognize user-defined named entities. These entities serve as prompts for the Whisper decoder. To optimize the model, we propose a multitask training approach that learns OV-KWS and contextual-ASR tasks. We evaluate our approach on Chinese Aishell hot word subsets and two internal code-switching test sets and show that it significantly improves the entity recall compared to the original Whisper model. Moreover, we demonstrate that the OV-KWS can be a plug-and-play module to enhance the ASR error correction methods and frozen Whisper models.

Submitted 6 June, 2024; v1 submitted 18 September, 2023; originally announced September 2023.

Comments: 5 pages, 2 figures, Accepted to InterSpeech 2024

arXiv:2309.02057 [pdf, other]

Robust Recommender System: A Survey and Future Directions

Authors: Kaike Zhang, Qi Cao, Fei Sun, Yunfan Wu, Shuchang Tao, Huawei Shen, Xueqi Cheng

Abstract: With the rapid growth of information, recommender systems have become integral for providing personalized suggestions and overcoming information overload. However, their practical deployment often encounters "dirty" data, where noise or malicious information can lead to abnormal recommendations. Research on improving recommender systems' robustness against such dirty data has thus gained significant attention. This survey provides a comprehensive review of recent work on recommender systems' robustness. We first present a taxonomy to organize current techniques for withstanding malicious attacks and natural noise. We then explore state-of-the-art methods in each category, including fraudster detection, adversarial training, certifiable robust training against malicious attacks, and regularization, purification, self-supervised learning against natural noise. Additionally, we summarize evaluation metrics and common datasets used to assess robustness. We discuss robustness across varying recommendation scenarios and its interplay with other properties like accuracy, interpretability, privacy, and fairness. Finally, we delve into open issues and future research directions in this emerging field. Our goal is to equip readers with a holistic understanding of robust recommender systems and spotlight pathways for future research and development.

Submitted 5 September, 2023; originally announced September 2023.

arXiv:2308.13961 [pdf, other]

Translate Meanings, Not Just Words: IdiomKB's Role in Optimizing Idiomatic Translation with Language Models

Authors: Shuang Li, Jiangjie Chen, Siyu Yuan, Xinyi Wu, Hao Yang, Shimin Tao, Yanghua Xiao

Abstract: To translate well, machine translation (MT) systems and general-purposed language models (LMs) need a deep understanding of both source and target languages and cultures. Therefore, idioms, with their non-compositional nature, pose particular challenges for Transformer-based systems, as literal translations often miss the intended meaning. Traditional methods, which replace idioms using existing knowledge bases (KBs), often lack scale and context awareness. Addressing these challenges, our approach prioritizes context awareness and scalability, allowing for offline storage of idioms in a manageable KB size. This ensures efficient serving with smaller models and provides a more comprehensive understanding of idiomatic expressions. We introduce a multilingual idiom KB (IdiomKB) developed using large LMs to address this. This KB facilitates better translation by smaller models, such as BLOOMZ (7.1B), Alpaca (7B), and InstructGPT (6.7B), by retrieving idioms' figurative meanings. We present a novel, GPT-4-powered metric for human-aligned evaluation, demonstrating that IdiomKB considerably boosts model performance. Human evaluations further validate our KB's quality.

Submitted 24 December, 2023; v1 submitted 26 August, 2023; originally announced August 2023.

Comments: Accepted to AAAI 2024

arXiv:2308.07610 [pdf, other]

Interpretable Online Log Analysis Using Large Language Models with Prompt Strategies

Authors: Yilun Liu, Shimin Tao, Weibin Meng, Jingyu Wang, Wenbing Ma, Yanqing Zhao, Yuhang Chen, Hao Yang, Yanfei Jiang, Xun Chen

Abstract: Automated log analysis is crucial in modern software-intensive systems for facilitating program comprehension throughout software maintenance and engineering life cycles. Existing methods perform tasks such as log parsing and log anomaly detection by providing a single prediction value without interpretation. However, given the increasing volume of system events, the limited interpretability of analysis results hinders analysts' comprehension of program status and their ability to take appropriate actions. Moreover, these methods require substantial in-domain training data, and their performance declines sharply (by up to 62.5%) in online scenarios involving unseen logs from new domains, a common occurrence due to rapid software updates. In this paper, we propose LogPrompt, a novel interpretable log analysis approach for online scenarios. LogPrompt employs large language models (LLMs) to perform online log analysis tasks via a suite of advanced prompt strategies tailored for log tasks, which enhances LLMs' performance by up to 380.7% compared with simple prompts. Experiments on nine publicly available evaluation datasets across two tasks demonstrate that LogPrompt, despite requiring no in-domain training, outperforms existing approaches trained on thousands of logs by up to 55.9%. We also conduct a human evaluation of LogPrompt's interpretability, with six practitioners possessing over 10 years of experience, who highly rated the generated content in terms of usefulness and readability (averagely 4.42/5). LogPrompt also exhibits remarkable compatibility with open-source and smaller-scale LLMs, making it flexible for practical deployment. Code of LogPrompt is available at https://github.com/lunyiliu/LogPrompt.

Submitted 25 January, 2024; v1 submitted 15 August, 2023; originally announced August 2023.

Comments: Accepted by ICPC 2024

arXiv:2308.04114 [pdf, other]

Collective Human Opinions in Semantic Textual Similarity

Authors: Yuxia Wang, Shimin Tao, Ning Xie, Hao Yang, Timothy Baldwin, Karin Verspoor

Abstract: Despite the subjective nature of semantic textual similarity (STS) and pervasive disagreements in STS annotation, existing benchmarks have used averaged human ratings as the gold standard. Averaging masks the true distribution of human opinions on examples of low agreement, and prevents models from capturing the semantic vagueness that the individual ratings represent. In this work, we introduce USTS, the first Uncertainty-aware STS dataset with ~15,000 Chinese sentence pairs and 150,000 labels, to study collective human opinions in STS. Analysis reveals that neither a scalar nor a single Gaussian fits a set of observed judgements adequately. We further show that current STS models cannot capture the variance caused by human disagreement on individual instances, but rather reflect the predictive confidence over the aggregate dataset.

Submitted 8 August, 2023; originally announced August 2023.

Comments: 16 pages, 7 figures

Journal ref: TACL Submission batch: 7/2022; Revision batch: 1/2023; Published 2023

arXiv:2308.01857 [pdf, other]

iEDA: An Open-Source Intelligent Physical Implementation Toolkit and Library

Authors: Xingquan Li, Simin Tao, Zengrong Huang, Shijian Chen, Zhisheng Zeng, Liwei Ni, Zhipeng Huang, Chunan Zhuang, Hongxi Wu, Weiguo Li, Xueyan Zhao, He Liu, Shuaiying Long, Wei He, Bojun Liu, Sifeng Gan, Zihao Yu, Tong Liu, Yuchi Miao, Zhiyuan Yan, Hao Wang, Jie Zhao, Yifan Li, Ruizhi Liu, Xiaoze Lin, et al. (31 additional authors not shown)

Abstract: Open-source EDA shows promising potential in unleashing EDA innovation and lowering the cost of chip design. This paper presents an open-source EDA project, iEDA, aiming for building a basic infrastructure for EDA technology evolution and closing the industrial-academic gap in the EDA area. iEDA now covers the whole flow of physical design (including Floorplan, Placement, CTS, Routing, Timing Optimization etc.), and part of the analysis tools (Static Timing Analysis and Power Analysis). To demonstrate the effectiveness of iEDA, we implement and tape out three chips of different scales (from 700k to 1.5M gates) on different process nodes (110nm and 28nm) with iEDA. iEDA is publicly available from the project home page http://ieda.oscc.cc.

Submitted 3 August, 2023; originally announced August 2023.

arXiv:2307.04965 [pdf]

Acoustic diagnostics of femtosecond laser filamentation

Authors: Binpeng Shang, Nan Zhang, Pengfei Qi, Shishi Tao, Lie Lin, Weiwei Liu

Abstract: The promising application of femtosecond laser filamentation in atmospheric remote sensing brings imperative demand for diagnosing the spatiotemporal dynamics of filamentation. Acoustic emission (AE) during filamentation opens a door to give the insight into the dynamic evolution of filaments in air. In particular, the frequency features of the acoustic emission provide relevant information on the conversion of laser energy to acoustic energy. Here, the acoustic emission of femtosecond laser filament manipulated by energy and the focal lengths was measured quantitatively by a broadband microphone, and the acoustic parameters were compared and analyzed. Our results showed that the acoustic power presents a squared dependence on the laser energy and the bandwidth of the acoustic spectrum showed a significant positive correlation with laser energy deposition. It was found that the spectrum of the acoustic pulse emitted from the middle of the filament has a larger bandwidth compared to those emitted from the ends of the filament and the spectrum of the acoustic pulse is also an indicator of the filament intensity distribution. These findings are helpful for studying the plasma filament properties and complex dynamic processes through acoustic parameters and allow the optimization of remote applications.

Submitted 10 July, 2023; originally announced July 2023.

Comments: 8 pages, 5 figures

MSC Class: 78A60 ACM Class: J.2.9

arXiv:2306.15266 [pdf, other]

Internal Contrastive Learning for Generalized Out-of-distribution Fault Diagnosis (GOOFD) Framework

Authors: Xingyue Wang, Hanrong Zhang, Ke Ma, Shuting Tao, Peng Peng, Hongwei Wang

Abstract: Fault diagnosis is essential in industrial processes for monitoring the conditions of important machines. With the ever-increasing complexity of working conditions and demand for safety during production and operation, different diagnosis methods are required, and more importantly, an integrated fault diagnosis system that can cope with multiple tasks is highly desired. However, the diagnosis subtasks are often studied separately, and the currently available methods still need improvement for such a generalized system. To address this issue, we propose the Generalized Out-of-distribution Fault Diagnosis (GOOFD) framework to integrate diagnosis subtasks, such as fault detection, fault classification, and novel fault diagnosis. Additionally, a unified fault diagnosis method based on internal contrastive learning is put forward to underpin the proposed generalized framework. The method extracts features utilizing the internal contrastive learning technique and then recognizes the outliers based on the Mahalanobis distance. Experiments are conducted on a simulated benchmark dataset as well as two practical process datasets to evaluate the proposed framework. As demonstrated in the experiments, the proposed method achieves better performance compared with several existing techniques and thus verifies the effectiveness of the proposed framework.

Submitted 27 June, 2023; originally announced June 2023.

arXiv:2306.15150 [pdf]

Femtosecond Laser Filamentation in Atmospheric Turbulence

Authors: Jiewei Guo, Lu Sun, Yuezheng Wang, Jiayun Xue, Zhi Zhang, Haiyi Liu, Shishi Tao, Pengfei Qi, Lie Lin, Weiwei Liu

Abstract: The effects of turbulence intensity and turbulence region on the distribution of femtosecond laser filaments are experimentally elaborated. Through the ultrasonic signals emitted by the filaments, it is observed that increasing turbulence intensity and expanding turbulence active region cause an increase in the start position of the filament, and a decrease in filament length, which can be well explained by the theoretical calculation. It is also observed that the random perturbation of the air refractive index caused by atmospheric turbulence expanded the spot size of the filament. Additionally, when turbulence intensity reaches , multiple filaments are formed. Furthermore, the standard deviation of the transverse displacement of filament is found to be proportional to the square root of turbulent structure constant under the experimental turbulence parameters in this paper. These results contribute to the study of femtosecond laser propagation mechanisms in complex atmospheric turbulence conditions.

Submitted 26 June, 2023; originally announced June 2023.

Comments: 9 pages, 4 figures

arXiv:2306.12904 [pdf]

Coupled air lasing gain and Mie scattering loss: aerosol effect in filament-induced plasma spectroscopy

Authors: Jiayun Xue, Zhi Zhang, Yuezheng Wang, Binpeng Shang, Jiewei Guo, Shishi Tao, Nan Zhang, Lanjun Guo, Pengfei Qi, Lie Lin, Weiwei Liu

Abstract: Femtosecond laser filament-induced plasma spectroscopy (FIPS) demonstrates great potential in remote sensing for identifying atmospheric pollutant molecules. Due to the widespread aerosols in the atmosphere, remote detection based on FIPS would be affected by both the excitation and the propagation of fingerprint fluorescence, which still remain elusive. Here the physical model of filament-induced aerosol fluorescence is established to reveal the combined effect of Mie scattering and amplified spontaneous emission, which is then proved by the experimental results, the dependence of the backward fluorescence on the interaction length between filament and aerosols. These findings provide an insight into the complicated aerosol effect in the overall physical process of FIPS including propagation, excitation and emission, paving the way to its practical application in atmospheric remote sensing.

Submitted 22 June, 2023; originally announced June 2023.

Comments: 7 pages, 4 figures

arXiv:2306.07486 [pdf, other]

Knowledge-Prompted Estimator: A Novel Approach to Explainable Machine Translation Assessment

Authors: Hao Yang, Min Zhang, Shimin Tao, Minghan Wang, Daimeng Wei, Yanfei Jiang

Abstract: Cross-lingual Machine Translation (MT) quality estimation plays a crucial role in evaluating translation performance. GEMBA, the first MT quality assessment metric based on Large Language Models (LLMs), employs one-step prompting to achieve state-of-the-art (SOTA) in system-level MT quality estimation; however, it lacks segment-level analysis. In contrast, Chain-of-Thought (CoT) prompting outperforms one-step prompting by offering improved reasoning and explainability. In this paper, we introduce Knowledge-Prompted Estimator (KPE), a CoT prompting method that combines three one-step prompting techniques, including perplexity, token-level similarity, and sentence-level similarity. This method attains enhanced performance for segment-level estimation compared with previous deep learning models and one-step prompting approaches. Furthermore, supplementary experiments on word-level visualized alignment demonstrate that our KPE method significantly improves token alignment compared with earlier models and provides better interpretability for MT quality estimation. Code will be released upon publication.

Submitted 12 June, 2023; originally announced June 2023.

arXiv:2306.07281 [pdf]

Filament based Ionizing Radiation Sensing Technology

Authors: Weiwei Liu, Jiewei Guo, Nan Zhang, Lu Sun, Haiyi Liu, Shihi Tao, Yuezheng Wang, Binpeng Shang, Pengfei Qi, Lie Lin

Abstract: Accidental exposure to overdose ionizing radiation will inevitably lead to severe biological damage, thus detecting and localizing radiation is essential. Traditional measurement techniques are generally restricted to a limited detection range of a few centimeters, posing a great risk to operators. Its potential in remote sensing makes femtosecond laser filament technology a great candidate for constructively addressing this challenge. Here we propose a novel filament-based ionizing radiation sensing technology (FIRST), and clarify the interaction mechanism between filaments and ionizing radiation. Specifically, it is demonstrated that the energetic electrons and ions produced by α radiation in air can be effectively accelerated within the filament, serving as seed electrons, which will markedly enhance nitrogen fluorescence. The extended nitrogen fluorescence lifetime of ~1 ns is also observed. These findings provide insights into the intricate interaction among ultra-strong light field, plasma and energetic particle beam, and pave the way for the remote sensing of ionizing radiation.

Submitted 9 June, 2023; originally announced June 2023.

Comments: 13 pages, 6 figures

arXiv:2305.15792 [pdf, other]

IDEA: Invariant Defense for Graph Adversarial Robustness

Authors: Shuchang Tao, Qi Cao, Huawei Shen, Yunfan Wu, Bingbing Xu, Xueqi Cheng

Abstract: Despite the success of graph neural networks (GNNs), their vulnerability to adversarial attacks poses tremendous challenges for practical applications. Existing defense methods suffer from severe performance decline under unseen attacks, due to either limited observed adversarial examples or pre-defined heuristics. To address these limitations, we analyze the causalities in graph adversarial attacks and conclude that causal features are key to achieve graph adversarial robustness, owing to their determinedness for labels and invariance across attacks. To learn these causal features, we innovatively propose an Invariant causal DEfense method against adversarial Attacks (IDEA). We derive node-based and structure-based invariance objectives from an information-theoretic perspective. IDEA ensures strong predictability for labels and invariant predictability across attacks, which is provably a causally invariant defense across various attacks. Extensive experiments demonstrate that IDEA attains state-of-the-art defense performance under all five attacks on all five datasets. The implementation of IDEA is available at https://anonymous.4open.science/r/IDEA.

Submitted 25 April, 2024; v1 submitted 25 May, 2023; originally announced May 2023.

Comments: Submitted to Information Sciences

arXiv:2305.05204 [pdf, other]

doi 10.1145/3539618.3591947

Popularity Debiasing from Exposure to Interaction in Collaborative Filtering

Authors: Yuanhao Liu, Qi Cao, Huawei Shen, Yunfan Wu, Shuchang Tao, Xueqi Cheng

Abstract: Recommender systems often suffer from popularity bias, where popular items are overly recommended while sacrificing unpopular items. Existing research generally focuses on ensuring that the recommendation exposure of each item is equal or proportional, using inverse propensity weighting, causal intervention, or adversarial training. However, increasing the exposure of unpopular items may not bring more clicks or interactions, resulting in skewed benefits and failing to achieve reasonable popularity debiasing. In this paper, we propose a new criterion for popularity debiasing, i.e., in an unbiased recommender system, both popular and unpopular items should receive Interactions Proportional to the number of users who Like them, namely the IPL criterion. Under the guidance of the criterion, we then propose a debiasing framework with an IPL regularization term, which is theoretically shown to achieve a win-win situation of both popularity debiasing and recommendation performance. Experiments conducted on four public datasets demonstrate that when equipping two representative collaborative filtering models with our framework, the popularity bias is effectively alleviated while maintaining the recommendation performance.

Submitted 9 May, 2023; originally announced May 2023.

Comments: Published as a SIGIR'23 short paper

arXiv:2304.09615 [pdf, other]

doi 10.1088/1748-0221/18/06/P06028

The construction and characterization of MgO transmission dynodes

Authors: H. W. Chan, V. Prodanović, A. M. M. G. Theulings, S. Tao, J. Smedley, C. W. Hagen, P. M. Sarro, H. v. d. Graaf

Abstract: In this work we demonstrate that ultra-thin (5 and 15 nm) MgO transmission dynodes (tynodes) with sufficiently high transmission electron yield (TEY) can be constructed. These tynodes act as electron amplification stages in a novel vacuum electron multiplier: the Timed Photon Counter (TiPC). The ultra-thin membranes with a diameter of 30 μm are arranged in a square 64-by-64-array. The TEY was determined with a scanning electron microscope (SEM) using primary electrons with primary energies of 0.75 - 5 keV. The method allows us to make a TEY map of the surface while simultaneously imaging the surface. The TEY of individual membranes can be extracted from the TEY map. An averaged maximum TEY of 4.6 +/- 0.2 was achieved by using 1.35 keV primary electrons on a TiN/MgO bi-layer membrane with a layer thickness of 2 and 5 nm, respectively. The TiN/MgO membrane with a layer thickness of 2 and 15 nm, respectively, has a maximum TEY of 3.3 +/- 0.1 (2.35 keV). Furthermore, the effect of the electric field strength on transmission (secondary) electron emission was investigated by placing the emission surface of a tynode in close proximity to a planar collector. By increasing the electric potential between the tynode and the collector, from -50 V to -100 V, the averaged maximum TEY improved from 4.6 +/- 0.2 to 5.0 +/- 0.3 at a primary energy of 1.35 keV with an upper limit of 5.5 on one of the membranes.

Submitted 19 April, 2023; originally announced April 2023.

arXiv:2303.10938 [pdf]

Complete Suppression of Phase Segregation in Mixed-Halide Perovskite Nanocrystals under Periodic Heating

Authors: Shengnan Feng, Rentong Duan, Yu Ju, Shuyi Li, Chunfeng Zhang, Shuxia Tao, Min Xiao, Xiaoyong Wang

Abstract: Under continuous light illumination, it is known that localized domains with segregated halide compositions form in semiconducting mixed-halide perovskites, thus severely limiting their optoelectronic applications due to the negative changes in bandgap energies and charge-carrier characteristics. Here we deposit mixed-halide perovskite CsPbBr1.2I1.8 nanocrystals onto an indium tin oxide substrate, whose temperature can be rapidly changed by ~10 degree in a few seconds by applying or removing an external voltage. Such a sudden temperature change induces a temporary transition of CsPbBr1.2I1.8 nanocrystals from the segregated phase to the mixed phase, the latter of which can be permanently maintained when the light illumination is coupled with periodic heating cycles. These findings mark the emergence of a practical solution to the detrimental phase-segregation problem, given that a small temperature modulation is readily available in various fundamental studies and practical devices using mixed-halide perovskites.

Submitted 20 March, 2023; originally announced March 2023.

Comments: 25 pages, 4 figures

arXiv:2302.12048 [pdf, ps, other]

Frequency bin-wise single channel speech presence probability estimation using multiple DNNs

Authors: Shuai Tao, Himavanth Reddy, Jesper Rindom Jensen, Mads Græsbøll Christensen

Abstract: In this work, we propose a frequency bin-wise method to estimate the single-channel speech presence probability (SPP) with multiple deep neural networks (DNNs) in the short-time Fourier transform domain. Since all frequency bins are typically considered simultaneously as input features for conventional DNN-based SPP estimators, high model complexity is inevitable. To reduce the model complexity and the requirements on the training data, we take a single frequency bin and some of its neighboring frequency bins into account to train separate gate recurrent units. In addition, the noisy speech and the a posteriori probability SPP representation are used to train our model. The experiments were performed on the Deep Noise Suppression challenge dataset. The experimental results show that the speech detection accuracy can be improved when we employ the frequency bin-wise model. Finally, we also demonstrate that our proposed method outperforms most of the state-of-the-art SPP estimation methods in terms of speech detection accuracy and model complexity.

Submitted 23 February, 2023; originally announced February 2023.

Comments: Accepted for ICASSP 2023

arXiv:2302.08051 [pdf, other]

Graph Adversarial Immunization for Certifiable Robustness

Authors: Shuchang Tao, Huawei Shen, Qi Cao, Yunfan Wu, Liang Hou, Xueqi Cheng

Abstract: Despite achieving great success, graph neural networks (GNNs) are vulnerable to adversarial attacks. Existing defenses focus on developing adversarial training or model modification. In this paper, we propose and formulate graph adversarial immunization, i.e., vaccinating part of graph structure to improve certifiable robustness of graph against any admissible adversarial attack. We first propose edge-level immunization to vaccinate node pairs. Unfortunately, such edge-level immunization cannot defend against emerging node injection attacks, since it only immunizes existing node pairs. To this end, we further propose node-level immunization. To avoid computationally intensive combinatorial optimization associated with adversarial immunization, we develop AdvImmune-Edge and AdvImmune-Node algorithms to effectively obtain the immune node pairs or nodes. Extensive experiments demonstrate the superiority of AdvImmune methods. In particular, AdvImmune-Node remarkably improves the ratio of robust nodes by 79%, 294%, and 100%, after immunizing only 5% of nodes. Furthermore, AdvImmune methods show excellent defensive performance against various attacks, outperforming state-of-the-art defenses. To the best of our knowledge, this is the first attempt to improve certifiable robustness from graph data perspective without losing performance on clean graphs, providing new insights into graph adversarial learning.

Submitted 23 September, 2023; v1 submitted 15 February, 2023; originally announced February 2023.

Comments: Published in TKDE. Code: https://github.com/TaoShuchang/AdvImmune_node

arXiv:2302.04659 [pdf, other]

ManiSkill2: A Unified Benchmark for Generalizable Manipulation Skills

Authors: Jiayuan Gu, Fanbo Xiang, Xuanlin Li, Zhan Ling, Xiqiang Liu, Tongzhou Mu, Yihe Tang, Stone Tao, Xinyue Wei, Yunchao Yao, Xiaodi Yuan, Pengwei Xie, Zhiao Huang, Rui Chen, Hao Su

Abstract: Generalizable manipulation skills, which can be composed to tackle long-horizon and complex daily chores, are one of the cornerstones of Embodied AI. However, existing benchmarks, mostly composed of a suite of simulatable environments, are insufficient to push cutting-edge research works because they lack object-level topological and geometric variations, are not based on fully dynamic simulation, or are short of native support for multiple types of manipulation tasks. To this end, we present ManiSkill2, the next generation of the SAPIEN ManiSkill benchmark, to address critical pain points often encountered by researchers when using benchmarks for generalizable manipulation skills. ManiSkill2 includes 20 manipulation task families with 2000+ object models and 4M+ demonstration frames, which cover stationary/mobile-base, single/dual-arm, and rigid/soft-body manipulation tasks with 2D/3D-input data simulated by fully dynamic engines. It defines a unified interface and evaluation protocol to support a wide range of algorithms (e.g., classic sense-plan-act, RL, IL), visual observations (point cloud, RGBD), and controllers (e.g., action type and parameterization). Moreover, it empowers fast visual input learning algorithms so that a CNN-based policy can collect samples at about 2000 FPS with 1 GPU and 16 processes on a regular workstation. It implements a render server infrastructure to allow sharing rendering resources across all environments, thereby significantly reducing memory usage. We open-source all codes of our benchmark (simulator, environments, and baselines) and host an online challenge open to interdisciplinary researchers.

Submitted 9 February, 2023; originally announced February 2023.

Comments: Published as a conference paper at ICLR 2023. Project website: https://maniskill2.github.io/

arXiv:2301.11485 [pdf]

Sub-ppb aerosol detection at a distance of 30 meters by millijoule femtosecond laser pulse filamentation in air

Authors: Jiewei Guo, Zhi Zhang, Nan Zhang, Binpeng Shang, Jiayun Xue, Yuezheng Wang, Shishi Tao, Bofu Xie, Lanjun Guo, Lie Lin, Weiwei Liu

Abstract: In this work, sub-ppb aerosol detection is achieved by femtosecond laser filament with a single pulse energy of 4 mJ at a distance of 30 m. A concave mirror with an open aperture of 41.4 cm is employed in an off-axis optical system to focus the femtosecond laser beam and collect the fluorescence of NaCl aerosol. The simulation and experimental results show that the astigmatism can be greatly reduced when the femtosecond laser beam is incident non-symmetrically on the concave mirror. Compared with the case in which the femtosecond laser strikes the center of the concave mirror, the intensity of the optical filament is increased by 69.5 times, and the limit of detection of sodium chloride aerosol is reduced by 86%, down to 0.32 ppb. The improved excitation scheme in this work utilizes the nonsymmetrical beam spot on the concave mirror to compensate for the non-symmetry induced by the off-axis setup, reducing the astigmatism of the focusing laser beam and improving the aerosol's limit of detection.

Submitted 26 January, 2023; originally announced January 2023.

arXiv:2301.01609 [pdf, other]

Emergent collective intelligence from massive-agent cooperation and competition

Authors: Hanmo Chen, Stone Tao, Jiaxin Chen, Weihan Shen, Xihui Li, Chenghui Yu, Sikai Cheng, Xiaolong Zhu, Xiu Li

Abstract: Inspired by organisms evolving through cooperation and competition between different populations on Earth, we study the emergence of artificial collective intelligence through massive-agent reinforcement learning. To this end, we propose a new massive-agent reinforcement learning environment, Lux, where dynamic and massive agents in two teams scramble for limited resources and fight off the darkness. In Lux, we build our agents through the standard reinforcement learning algorithm in curriculum learning phases and leverage centralized control via a pixel-to-pixel policy network. As agents co-evolve through self-play, we observe several stages of intelligence, from the acquisition of atomic skills to the development of group strategies. Since these learned group strategies arise from individual decisions without an explicit coordination mechanism, we claim that artificial collective intelligence emerges from massive-agent cooperation and competition. We further analyze the emergence of various learned strategies through metrics and ablation studies, aiming to provide insights for reinforcement learning implementations in massive-agent environments.

Submitted 5 January, 2023; v1 submitted 4 January, 2023; originally announced January 2023.

Comments: Published at NeurIPS 2022 Deep RL workshop. Code available at https://github.com/hanmochen/lux-open

arXiv:2212.05830 [pdf, other]

P-Transformer: Towards Better Document-to-Document Neural Machine Translation

Authors: Yachao Li, Junhui Li, Jing Jiang, Shimin Tao, Hao Yang, Min Zhang

Abstract: Directly training a document-to-document (Doc2Doc) neural machine translation (NMT) model via Transformer from scratch, especially on small datasets, usually fails to converge. Our dedicated probing tasks show that 1) both the absolute position and relative position information gets gradually weakened or even vanished once it reaches the upper encoder layers, and 2) the vanishing of absolute position information in encoder output causes the training failure of Doc2Doc NMT. To alleviate this problem, we propose a position-aware Transformer (P-Transformer) to enhance both the absolute and relative position information in both self-attention and cross-attention. Specifically, we integrate absolute positional information, i.e., position embeddings, into the query-key pairs both in self-attention and cross-attention through a simple yet effective addition operation. Moreover, we also integrate relative position encoding in self-attention. The proposed P-Transformer utilizes sinusoidal position encoding and does not require any task-specified position embedding, segment embedding, or attention mechanism. Through the above methods, we build a Doc2Doc NMT model with P-Transformer, which ingests the source document and completely generates the target document in a sequence-to-sequence (seq2seq) way. In addition, P-Transformer can be applied to seq2seq-based document-to-sentence (Doc2Sent) and sentence-to-sentence (Sent2Sent) translation. Extensive experimental results of Doc2Doc NMT show that P-Transformer significantly outperforms strong baselines on 9 widely used document-level datasets in 7 language pairs, covering small, medium, and large scales, and achieves a new state-of-the-art. Experimentation on discourse phenomena shows that our Doc2Doc NMT models improve the translation quality in both BLEU and discourse coherence. We make our code available on Github.

Submitted 12 December, 2022; originally announced December 2022.

Comments: Submitted to TASLP

arXiv:2211.00981 [pdf, other]

Relevance Assessments for Web Search Evaluation: Should We Randomise or Prioritise the Pooled Documents? (CORRECTED VERSION)

Authors: Tetsuya Sakai, Sijie Tao, Zhaohao Zeng

Abstract: In the context of depth-$k$ pooling for constructing web search test collections, we compare two approaches to ordering pooled documents for relevance assessors: the prioritisation strategy (PRI) used widely at NTCIR, and the simple randomisation strategy (RND). In order to address research questions regarding PRI and RND, we have constructed and released the WWW3E8 data set, which contains eight independent relevance labels for 32,375 topic-document pairs, i.e., a total of 259,000 labels. Four of the eight relevance labels were obtained from PRI-based pools; the other four were obtained from RND-based pools. Using WWW3E8, we compare PRI and RND in terms of inter-assessor agreement, system ranking agreement, and robustness to new systems that did not contribute to the pools. We also utilise an assessor activity log we obtained as a byproduct of WWW3E8 to compare the two strategies in terms of assessment efficiency.

Submitted 2 November, 2022; originally announced November 2022.

Comments: 30 pages. This is a corrected version of an open-access TOIS paper ( https://dl.acm.org/doi/pdf/10.1145/3494833 )

Showing 1–50 of 143 results for author: tao, s