doi 10.1145/3564281

On the Robustness of Aspect-based Sentiment Analysis: Rethinking Model, Data, and Training

Authors: Hao Fei, Tat-Seng Chua, Chenliang Li, Donghong Ji, Meishan Zhang, Yafeng Ren

Abstract: Aspect-based sentiment analysis (ABSA) aims at automatically inferring the specific sentiment polarities toward certain aspects of products or services behind social media texts or reviews, which has been a fundamental application to real-world society. Since the early 2010s, ABSA has achieved extraordinarily high accuracy with various deep neural models. However, existing ABSA models with strong in-house performance may fail to generalize to some challenging cases where the contexts are variable, i.e., they show low robustness to real-world environments. In this study, we propose to enhance ABSA robustness by systematically rethinking the bottlenecks from all possible angles, including model, data, and training. First, we strengthen the current best-robust syntax-aware models by further incorporating the rich external syntactic dependencies and the labels with aspect simultaneously with a universal-syntax graph convolutional network. From the corpus perspective, we propose to automatically induce high-quality synthetic training data of various types, allowing models to learn sufficient inductive bias for better robustness. Last, based on the rich pseudo data, we perform adversarial training to enhance the resistance to context perturbation and meanwhile employ contrastive learning to reinforce the representations of instances with contrastive sentiments. Extensive robustness evaluations are conducted. The results demonstrate that our enhanced syntax-aware model achieves better robustness performance than all the state-of-the-art baselines. By additionally incorporating our synthetic corpus, the robust testing results are improved by around 10% accuracy, and are then further improved by installing the advanced training strategies. In-depth analyses are presented to reveal the factors influencing ABSA robustness.

Submitted 19 April, 2023; originally announced April 2023.

Comments: Accepted in ACM Transactions on Information Systems

Journal ref: ACM Transactions on Information Systems, 2022, 41(2): 1-32

arXiv:2304.06248 [pdf, other]

LasUIE: Unifying Information Extraction with Latent Adaptive Structure-aware Generative Language Model

Authors: Hao Fei, Shengqiong Wu, Jingye Li, Bobo Li, Fei Li, Libo Qin, Meishan Zhang, Min Zhang, Tat-Seng Chua

Abstract: Universally modeling all typical information extraction tasks (UIE) with one generative language model (GLM) has shown great potential in the latest studies, where various IE predictions are unified into a linearized hierarchical expression under a GLM. Syntactic structure information, a type of effective feature that has been extensively utilized in the IE community, should also be beneficial to UIE. In this work, we propose a novel structure-aware GLM, fully unleashing the power of syntactic knowledge for UIE. A heterogeneous structure inductor is explored to induce rich heterogeneous structural representations in an unsupervised manner by post-training an existing GLM. In particular, a structural broadcaster is devised to compact various latent trees into explicit high-order forests, helping to guide a better generation during decoding. We finally introduce a task-oriented structure fine-tuning mechanism, further adjusting the learned structures to best coincide with the end task's needs. Over 12 IE benchmarks across 7 tasks, our system shows significant improvements over the baseline UIE system. Further in-depth analyses show that our GLM learns rich task-adaptive structural bias that greatly resolves the UIE crux: the long-range dependence issue and boundary identification. Source codes are open at https://github.com/ChocoWu/LasUIE.

Submitted 13 April, 2023; originally announced April 2023.

Comments: NeurIPS2022 conference paper

arXiv:2211.05705 [pdf, other]

DiaASQ : A Benchmark of Conversational Aspect-based Sentiment Quadruple Analysis

Authors: Bobo Li, Hao Fei, Fei Li, Yuhan Wu, Jinsong Zhang, Shengqiong Wu, Jingye Li, Yijiang Liu, Lizi Liao, Tat-Seng Chua, Donghong Ji

Abstract: The rapid development of aspect-based sentiment analysis (ABSA) within recent decades shows great potential for real-world society. The current ABSA works, however, are mostly limited to the scenario of a single text piece, leaving the study in dialogue contexts unexplored. To bridge the gap between fine-grained sentiment analysis and conversational opinion mining, in this work, we introduce a novel task of conversational aspect-based sentiment quadruple analysis, namely DiaASQ, aiming to detect the quadruple of target-aspect-opinion-sentiment in a dialogue. We manually construct a large-scale high-quality DiaASQ dataset in both Chinese and English languages. We deliberately develop a neural model to benchmark the task, which advances in effectively performing end-to-end quadruple prediction, and manages to incorporate rich dialogue-specific and discourse feature representations for better cross-utterance quadruple extraction. We hope the new benchmark will spur more advancements in the sentiment analysis community.

Submitted 22 May, 2023; v1 submitted 10 November, 2022; originally announced November 2022.

Comments: Accepted to Findings of ACL 2023

arXiv:2210.16541 [pdf, other]

Entity-centered Cross-document Relation Extraction

Authors: Fengqi Wang, Fei Li, Hao Fei, Jingye Li, Shengqiong Wu, Fangfang Su, Wenxuan Shi, Donghong Ji, Bo Cai

Abstract: Relation Extraction (RE) is a fundamental task of information extraction, which has attracted a large amount of research attention. Previous studies focus on extracting the relations within a sentence or document, while researchers have now begun to explore cross-document RE. However, current cross-document RE methods directly utilize text snippets surrounding target entities in multiple given documents, which brings in considerable noisy and non-relevant sentences. Moreover, they utilize all the text paths in a document bag in a coarse-grained way, without considering the connections between these text paths. In this paper, we aim to address both of these shortcomings and push the state-of-the-art for cross-document RE. First, we focus on input construction for our RE model and propose an entity-based document-context filter to retain useful information in the given documents by using the bridge entities in the text paths. Second, we propose a cross-document RE model based on cross-path entity relation attention, which allows the entity relations across text paths to interact with each other. We compare our cross-document RE method with the state-of-the-art methods on the CodRED dataset. Our method outperforms them by at least 10% in F1, thus demonstrating its effectiveness.

Submitted 29 October, 2022; originally announced October 2022.

Comments: This paper was accepted by EMNLP 2022 conference

arXiv:2210.15265 [pdf, other]

Conversation Disentanglement with Bi-Level Contrastive Learning

Authors: Chengyu Huang, Zheng Zhang, Hao Fei, Lizi Liao

Abstract: Conversation disentanglement aims to group utterances into detached sessions, which is a fundamental task in processing multi-party conversations. Existing methods have two main drawbacks. First, they overemphasize pairwise utterance relations but pay inadequate attention to utterance-to-context relation modeling. Second, a huge amount of human-annotated data is required for training, which is expensive to obtain in practice. To address these issues, we propose a general disentanglement model based on bi-level contrastive learning. It brings utterances in the same session closer together while encouraging each utterance to be near its clustered session prototypes in the representation space. Unlike existing approaches, our disentanglement model works both in the supervised setting with labeled data and in the unsupervised setting when no such data is available. The proposed method achieves new state-of-the-art performance in both settings across several public datasets.

Submitted 27 October, 2022; originally announced October 2022.

arXiv:2210.10841 [pdf, other]

Prompting through Prototype: A Prototype-based Prompt Learning on Pretrained Vision-Language Models

Authors: Yue Zhang, Hongliang Fei, Dingcheng Li, Tan Yu, Ping Li

Abstract: Prompt learning is a new learning paradigm which reformulates downstream tasks as similar pretraining tasks on pretrained models by leveraging textual prompts. Recent works have demonstrated that prompt learning is particularly useful for few-shot learning, where there is limited training data. Depending on the granularity of prompts, those methods can be roughly divided into task-level prompting and instance-level prompting. Task-level prompting methods learn one universal prompt for all input samples, which is efficient but ineffective at capturing subtle differences among different classes. Instance-level prompting methods learn a specific prompt for each input, which is effective but inefficient. In this work, we develop a novel prototype-based prompt learning method to overcome the above limitations. In particular, we focus on few-shot image recognition tasks on pretrained vision-language models (PVLMs) and develop a method of prompting through prototype (PTP), where we define $K$ image prototypes and $K$ prompt prototypes. In PTP, the image prototype represents a centroid of a certain image cluster in the latent space and a prompt prototype is defined as a soft prompt in the continuous space. The similarity between a query image and an image prototype determines how much this prediction relies on the corresponding prompt prototype. Hence, in PTP, similar images will utilize similar prompting ways. Through extensive experiments on seven real-world benchmarks, we show that PTP is an effective method to leverage the latent knowledge and is adaptive to various PVLMs. Moreover, through detailed analysis, we discuss pros and cons for prompt learning and parameter-efficient fine-tuning in the context of few-shot learning.

Submitted 19 October, 2022; originally announced October 2022.
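
The PTP abstract above says that the similarity between a query image and each image prototype determines how much the prediction relies on the corresponding prompt prototype. The following is a minimal sketch of that idea only, not the authors' implementation: the use of cosine similarity, the softmax weighting, the `temperature` parameter, and the names `cosine`, `softmax`, and `mix_prompt` are all assumptions introduced for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors (assumed similarity measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mix_prompt(query, image_prototypes, prompt_prototypes, temperature=0.1):
    """Weight each of the K prompt prototypes by the query's similarity
    to the matching image prototype (hypothetical weighting scheme)."""
    sims = [cosine(query, p) / temperature for p in image_prototypes]
    weights = softmax(sims)
    dim = len(prompt_prototypes[0])
    mixed = [
        sum(w * prompt[d] for w, prompt in zip(weights, prompt_prototypes))
        for d in range(dim)
    ]
    return weights, mixed


# Toy data: K = 2 image prototypes and K = 2 soft prompt prototypes.
image_prototypes = [[1.0, 0.0], [0.0, 1.0]]
prompt_prototypes = [[1.0, 1.0, 1.0], [-1.0, -1.0, -1.0]]

# A query close to the first image prototype leans on the first prompt prototype.
weights, mixed = mix_prompt([0.9, 0.1], image_prototypes, prompt_prototypes)
```

Under these assumptions, two query images with similar embeddings receive similar weights and therefore similar mixed prompts, which matches the abstract's statement that similar images utilize similar prompting ways.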

arXiv:2210.09599 [pdf, other]

Denoising Enhanced Distantly Supervised Ultrafine Entity Typing

Authors: Yue Zhang, Hongliang Fei, Ping Li

Abstract: Recently, the task of distantly supervised (DS) ultra-fine entity typing has received significant attention. However, DS data is noisy and often suffers from missing or wrong labeling issues, resulting in low precision and low recall. This paper proposes a novel ultra-fine entity typing model with denoising capability. Specifically, we build a noise model to estimate the unknown labeling noise distribution over input contexts and noisy type labels. With the noise model, more trustworthy labels can be recovered by subtracting the estimated noise from the input. Furthermore, we propose an entity typing model that adopts a bi-encoder architecture and is trained on the denoised data. Finally, the noise model and entity typing model are trained iteratively to enhance each other. We conduct extensive experiments on the Ultra-Fine entity typing dataset as well as the OntoNotes dataset and demonstrate that our approach significantly outperforms other baseline methods.

Submitted 18 October, 2022; originally announced October 2022.

arXiv:2210.07232 [pdf, other]

Decomposing User-APP Graph into Subgraphs for Effective APP and User Embedding Learning

Authors: Tan Yu, Jun Zhi, Yufei Zhang, Jian Li, Hongliang Fei, Ping Li

Abstract: APP-installation information is helpful to describe the user's characteristics. The users with similar APPs installed might share several common interests and behave similarly in some scenarios. In this work, we learn a user embedding vector based on each user's APP-installation information. Since the user APP-installation embedding is learnable without dependency on the historical intra-APP behavioral data of the user, it complements the intra-APP embedding learned within each specific APP. Thus, they considerably help improve the effectiveness of the personalized advertising in each APP, and they are particularly beneficial for the cold start of the new users in the APP. In this paper, we formulate the APP-installation user embedding learning into a bipartite graph embedding problem. The main challenge in learning an effective APP-installation user embedding is the imbalanced data distribution. In this case, graph learning tends to be dominated by the popular APPs, which billions of users have installed. In other words, some niche/specialized APPs might have a marginal influence on graph learning. To effectively exploit the valuable information from the niche APPs, we decompose the APP-installation graph into a set of subgraphs. Each subgraph contains only one APP node and the users who install the APP. For each mini-batch, we only sample the users from the same subgraph in the training process. Thus, each APP can be involved in the training process in a more balanced manner. After integrating the learned APP-installation user embedding into our online personal advertising platform, we obtained a considerable boost in CTR, CVR, and revenue.

Submitted 13 October, 2022; originally announced October 2022.
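
The abstract above describes decomposing the user-APP bipartite graph into subgraphs, each holding one APP node and the users who installed it, and drawing every mini-batch from a single subgraph. A minimal sketch of that sampling step follows. It is not the authors' code: choosing the APP uniformly at random, the toy install list, and the names `build_subgraphs` and `sample_batch` are assumptions made for illustration.

```python
import random


def build_subgraphs(installs):
    """Group (user, app) install edges into one subgraph per APP:
    a mapping from the APP node to the users who installed it."""
    subgraphs = {}
    for user, app in installs:
        subgraphs.setdefault(app, []).append(user)
    return subgraphs


def sample_batch(subgraphs, batch_size, rng):
    """Pick one APP (uniformly, an assumption), then sample users only
    from that APP's subgraph, so a batch never mixes subgraphs."""
    app = rng.choice(sorted(subgraphs))
    users = subgraphs[app]
    k = min(batch_size, len(users))
    return app, rng.sample(users, k)


# Toy data: one popular APP and one niche APP (hypothetical).
installs = [
    ("u1", "popular"), ("u2", "popular"), ("u3", "popular"), ("u4", "popular"),
    ("u1", "niche"), ("u5", "niche"),
]
subgraphs = build_subgraphs(installs)
rng = random.Random(0)
app, batch = sample_batch(subgraphs, 2, rng)
```

With uniform selection over APPs, the niche APP contributes as many batches as the popular one in expectation, which is one simple way to obtain the more balanced involvement the abstract mentions.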

arXiv:2210.03037 [pdf, other]

Conversational Semantic Role Labeling with Predicate-Oriented Latent Graph

Authors: Hao Fei, Shengqiong Wu, Meishan Zhang, Yafeng Ren, Donghong Ji

Abstract: Conversational semantic role labeling (CSRL) is a newly proposed task that uncovers the shallow semantic structures in a dialogue text. Unfortunately, several important characteristics of the CSRL task have been overlooked by existing works, such as structural information integration and near-neighbor influence. In this work, we investigate the integration of a latent graph for CSRL. We propose to automatically induce a predicate-oriented latent graph (POLar) with a predicate-centered Gaussian mechanism, by which words that are nearer to the predicate and more informative are allocated more attention. The POLar structure is then dynamically pruned and refined so as to best fit the task's needs. We additionally introduce an effective dialogue-level pre-trained language model, CoDiaBERT, for better supporting multiple utterance sentences and handling the speaker coreference issue in CSRL. Our system outperforms the best-performing baselines on three benchmark CSRL datasets by big margins, especially achieving over 4% F1 score improvements on cross-utterance argument detection. Further analyses are presented to better understand the effectiveness of our proposed methods.

Submitted 6 October, 2022; originally announced October 2022.

arXiv:2209.11727 [pdf, other]

Boost CTR Prediction for New Advertisements via Modeling Visual Content

Authors: Tan Yu, Zhipeng Jin, Jie Liu, Yi Yang, Hongliang Fei, Ping Li

Abstract: Existing advertisements click-through rate (CTR) prediction models are mainly dependent on behavior ID features, which are learned based on the historical user-ad interactions. Nevertheless, behavior ID features relying on historical user behaviors are not feasible to describe new ads without previous interactions with users. To overcome the limitations of behavior ID features in modeling new ads, we exploit the visual content in ads to boost the performance of CTR prediction models. Specifically, we map each ad into a set of visual IDs based on its visual content. These visual IDs are further used for generating the visual embedding for enhancing CTR prediction models. We formulate the learning of visual IDs into a supervised quantization problem. Due to a lack of class labels for commercial images in advertisements, we exploit image textual descriptions as the supervision to optimize the image extractor for generating effective visual IDs. Meanwhile, since the hard quantization is non-differentiable, we soften the quantization operation to make it support the end-to-end network training. After mapping each image into visual IDs, we learn the embedding for each visual ID based on the historical user-ad interactions accumulated in the past. Since the visual ID embedding depends only on the visual content, it generalizes well to new ads. Meanwhile, the visual ID embedding complements the ad behavior ID embedding. Thus, it can considerably boost the performance of the CTR prediction models previously relying on behavior ID features for both new ads and ads that have accumulated rich user behaviors. After incorporating the visual ID embedding in the CTR prediction model of Baidu online advertising, the average CTR of ads improves by 1.46%, and the total charge increases by 1.10%.

Submitted 23 September, 2022; originally announced September 2022.

arXiv:2209.08759 [pdf, other]

Tree-based Text-Vision BERT for Video Search in Baidu Video Advertising

Authors: Tan Yu, Jie Liu, Yi Yang, Yi Li, Hongliang Fei, Ping Li

Abstract: The advancement of communication technology and the popularity of smartphones have fostered a boom in video ads. Baidu, as one of the leading search engine companies in the world, receives billions of search queries per day. How to pair video ads with the user's search is the core task of Baidu video advertising. Due to the modality gap, query-to-video retrieval is much more challenging than traditional query-to-document retrieval and image-to-image search. Traditionally, query-to-video retrieval is tackled by query-to-title retrieval, which is not reliable when the quality of titles is not high. With the rapid progress achieved in computer vision and natural language processing in recent years, content-based search methods have become promising for query-to-video retrieval. Benefiting from pretraining on large-scale datasets, some visionBERT methods based on cross-modal attention have achieved excellent performance in many vision-language tasks, not only in academia but also in industry. Nevertheless, the expensive computation cost of cross-modal attention makes it impractical for large-scale search in industrial applications. In this work, we present a tree-based combo-attention network (TCAN), which has recently been launched in Baidu's dynamic video advertising platform. It provides a practical solution for deploying heavy cross-modal attention for large-scale query-to-video search. After launching the tree-based combo-attention network, the click-through rate improved by 2.29% and the conversion rate improved by 2.63%.

Submitted 19 September, 2022; originally announced September 2022.

Comments: This revision is based on a manuscript submitted in October 2020, to ICDE 2021. We thank the Program Committee for their valuable comments

arXiv:2209.08358 [pdf]

High-performance chiral all-optical logic gate based on topological edge states of valley photonic crystal

Authors: Xiaorong Wang, Hongming Fei, Han Lin, Min Wu, Lijuan Kang, Mingda Zhang, Xin Liu, Yibiao Yang, Liantuan Xiao

Abstract: For all-optical communication and information processing, it is necessary to develop all-optical logic gates based on photonic structures that can directly perform logic operations. All-optical logic gates have been demonstrated based on conventional waveguides and interferometry, as well as photonic crystal structures. Nonetheless, any defects in those structures will introduce high scattering loss, which compromises the fidelity and contrast ratio of the information process. Based on the spin-valley locking effect that can achieve defect-immune unidirectional transmission of topological edge states in valley photonic crystals (VPCs), we propose a high-performance all-optical logic OR gate based on a VPC structure. By tuning the working bandwidth of the two input channels, we prevent interference between the two channels to achieve a stable and high-fidelity output. The transmittance of both channels is higher than 0.8, and a high contrast ratio of 28.8 dB is achieved. Moreover, the chirality of the logic gate, which originates from the spin-valley locking effect, allows using different circularly polarized light as inputs, representing "1" or "0", which is highly desired in quantum computing. The device's footprint is small, allowing high-density on-chip integration. In addition, this design can be experimentally fabricated using current nanofabrication techniques and will have potential applications in optical communication, information processing, and quantum computing.

Submitted 17 September, 2022; originally announced September 2022.

Comments: 10 pages, 6 figures

arXiv:2209.04112 [pdf, other]

Joint Alignment of Multi-Task Feature and Label Spaces for Emotion Cause Pair Extraction

Authors: Shunjie Chen, Xiaochuan Shi, Jingye Li, Shengqiong Wu, Hao Fei, Fei Li, Donghong Ji

Abstract: Emotion cause pair extraction (ECPE), as one of the derived subtasks of emotion cause analysis (ECA), shares rich inter-related features with emotion extraction (EE) and cause extraction (CE). Therefore, EE and CE are frequently utilized as auxiliary tasks for better feature learning, modeled via a multi-task learning (MTL) framework by prior works to achieve state-of-the-art (SoTA) ECPE results. However, existing MTL-based methods either fail to simultaneously model the specific features and the interactive feature in between, or suffer from inconsistent label predictions. In this work, we address the above challenges for improving ECPE by performing two alignment mechanisms with a novel A^2Net model. We first propose a feature-task alignment to explicitly model the emotion-specific and cause-specific features and the shared interactive feature. In addition, an inter-task alignment is implemented, in which the label distance between ECPE and the combination of EE and CE is learned to be narrowed for better label consistency. Evaluations on benchmarks show that our methods outperform the current best-performing systems on all ECA subtasks. Further analysis demonstrates the importance of our proposed alignment mechanisms for the task.

Submitted 9 September, 2022; originally announced September 2022.

Comments: Accepted by Coling 2022

arXiv:2209.02693 [pdf, other]

OneEE: A One-Stage Framework for Fast Overlapping and Nested Event Extraction

Authors: Hu Cao, Jingye Li, Fangfang Su, Fei Li, Hao Fei, Shengqiong Wu, Bobo Li, Liang Zhao, Donghong Ji

Abstract: Event extraction (EE) is an essential task of information extraction, which aims to extract structured event information from unstructured text. Most prior work focuses on extracting flat events while neglecting overlapped or nested ones. A few models for overlapped and nested EE include several successive stages to extract event triggers and arguments, which suffer from error propagation. Therefore, we design a simple yet effective tagging scheme and model to formulate EE as word-word relation recognition, called OneEE. The relations between trigger or argument words are simultaneously recognized in one stage with parallel grid tagging, thus yielding a very fast event extraction speed. The model is equipped with an adaptive event fusion module to generate event-aware representations and a distance-aware predictor to integrate relative distance information for word-word relation recognition, which are empirically demonstrated to be effective mechanisms. Experiments on 3 overlapped and nested EE benchmarks, namely FewFC, Genia11, and Genia13, show that OneEE achieves state-of-the-art (SOTA) results. Moreover, the inference speed of OneEE is faster than that of the baselines under the same conditions, and can be further substantially improved since it supports parallel inference.

Submitted 6 September, 2022; originally announced September 2022.

Comments: Accepted by COLING'22

arXiv:2207.06057 [pdf, other]

Subband-based Generative Adversarial Network for Non-parallel Many-to-many Voice Conversion

Authors: Jian Ma, Zhedong Zheng, Hao Fei, Feng Zheng, Tat-seng Chua, Yi Yang

Abstract: Voice conversion is to generate a new speech with the source content and a target voice style. In this paper, we focus on one general setting, i.e., non-parallel many-to-many voice conversion, which is close to the real-world scenario. As the name implies, non-parallel many-to-many voice conversion does not require the paired source and reference speeches and can be applied to arbitrary voice transfer. In recent years, Generative Adversarial Networks (GANs) and other techniques such as Conditional Variational Autoencoders (CVAEs) have made considerable progress in this field. However, due to the sophistication of voice conversion, the style similarity of the converted speech is still unsatisfactory. Inspired by the inherent structure of mel-spectrogram, we propose a new voice conversion framework, i.e., Subband-based Generative Adversarial Network for Voice Conversion (SGAN-VC). SGAN-VC converts each subband content of the source speech separately by explicitly utilizing the spatial characteristics between different subbands. SGAN-VC contains one style encoder, one content encoder, and one decoder. In particular, the style encoder network is designed to learn style codes for different subbands of the target speaker. The content encoder network can capture the content information on the source speech. Finally, the decoder generates particular subband content. In addition, we propose a pitch-shift module to fine-tune the pitch of the source speaker, making the converted tone more accurate and explainable. Extensive experiments demonstrate that the proposed approach achieves state-of-the-art performance on VCTK Corpus and AISHELL3 datasets both qualitatively and quantitatively, whether on seen or unseen data. Furthermore, the content intelligibility of SGAN-VC on unseen data even exceeds that of StarGANv2-VC with ASR network assistance.

Submitted 27 July, 2022; v1 submitted 13 July, 2022; originally announced July 2022.

arXiv:2204.04367 [pdf]

Design of wavelength division multiplexing devices based on tunable edge states of valley photonic crystals

Authors: YuHui Han, HongMing Fei, Han Lin, MingDa Zhang, Xin Liu, XiaoRong Wang, BinZhao Cao, YiBiao Yang, LianTuan Xiao

Abstract: Wavelength division multiplexing (WDM) devices are key elements of Photonic integrated circuits (PICs). Conventional WDM devices based on silicon waveguides and photonic crystals have limited transmittance due to high loss introduced by the strong backward scattering from defects. In addition, it is challenging to reduce the footprint of those devices. Here we theoretically demonstrate a WDM device in the telecommunication range based on all-dielectric silicon topological valley photonic crystal (VPC) structures. We tune its effective refractive index by tuning the physical parameters of the lattice in the silicon substrate, which can continuously tune the working wavelength range of the topological edge states, which allows designing WDM devices with different channels. The WDM device has two channels (1470 nm-1523 nm and 1548 nm-1609 nm), with contrast ratios of 22.4 dB and 24.9 dB, respectively. The principle of manipulating the working bandwidth of the topological edge states can be generally applied in designing different integratable photonic devices, thus it will find broad applications.

Submitted 8 April, 2022; originally announced April 2022.

Comments: 13 pages, 6 figures

arXiv:2203.10796 [pdf, other]

Effective Token Graph Modeling using a Novel Labeling Strategy for Structured Sentiment Analysis

Authors: Wenxuan Shi, Fei Li, Jingye Li, Hao Fei, Donghong Ji

Abstract: The state-of-the-art model for structured sentiment analysis casts the task as a dependency parsing problem, which has some limitations: (1) The label proportions for span prediction and span relation prediction are imbalanced. (2) The span lengths of sentiment tuple components may be very large in this task, which will further exacerbate the imbalance problem. (3) Two nodes in a dependency graph cannot have multiple arcs, therefore some overlapped sentiment tuples cannot be recognized. In this work, we propose niche-targeting solutions for these issues. First, we introduce a novel labeling strategy, which contains two sets of token pair labels, namely essential label set and whole label set. The essential label set consists of the basic labels for this task, which are relatively balanced and applied in the prediction layer. The whole label set includes rich labels to help our model capture various token relations, which are applied in the hidden layer to softly influence our model. Moreover, we also propose an effective model to well collaborate with our labeling strategy, which is equipped with the graph attention networks to iteratively refine token representations, and the adaptive multi-label classifier to dynamically predict multiple relations between token pairs. We perform extensive experiments on 5 benchmark datasets in four languages. Experimental results show that our model outperforms previous SOTA models by a large margin.

Submitted 21 March, 2022; originally announced March 2022.

Comments: to appear at the ACL 2022 Main conference

arXiv:2203.07602 [pdf]

doi 10.1109/JLT.2022.3203563

On-chip ultra-compact hexagonal boron nitride topological ring-resonator in visible region

Authors: Min Wu, Yibiao Yang, Hongming Fei, Han Lin, Xiaodan Zhao, Lijuan Kang

Abstract: Ultra-compact topological ring-resonators with chirality are important devices for quantum optics. However, there are limited demonstrations of chiral resonators, especially in the visible region. We proposed a topological photonic ring-resonator based on hexagonal boron nitride (hBN) valley photonic crystal (VPC). The spin-valley locking effect in VPC allows achieving robust unidirectional transmission of edge states in the visible region (600 nm-650 nm). As a result, a high quality factor (679.3) with a free spectral range of 15.2 nm in the visible region can be achieved in a hBN all-pass filter with a compact size. In addition, we investigated the transmission properties of hBN ring-resonators with different shapes and combinations, confirming the flexibility of designing topological ring-resonators based on this principle. This design can be readily integrated with quantum photonic chips for broad applications.

Submitted 14 March, 2022; originally announced March 2022.

Comments: 8 pages, 6 figures

arXiv:2112.10070 [pdf, other]

Unified Named Entity Recognition as Word-Word Relation Classification

Authors: Jingye Li, Hao Fei, Jiang Liu, Shengqiong Wu, Meishan Zhang, Chong Teng, Donghong Ji, Fei Li

Abstract: So far, named entity recognition (NER) has been involved with three major types, including flat, overlapped (aka. nested), and discontinuous NER, which have mostly been studied individually. Recently, a growing interest has been built for unified NER, tackling the above three jobs concurrently with one single model. Current best-performing methods mainly include span-based and sequence-to-sequence models, where unfortunately the former merely focus on boundary identification and the latter may suffer from exposure bias. In this work, we present a novel alternative by modeling the unified NER as word-word relation classification, namely W^2NER. The architecture resolves the kernel bottleneck of unified NER by effectively modeling the neighboring relations between entity words with Next-Neighboring-Word (NNW) and Tail-Head-Word-* (THW-*) relations. Based on the W^2NER scheme we develop a neural framework, in which the unified NER is modeled as a 2D grid of word pairs. We then propose multi-granularity 2D convolutions for better refining the grid representations. Finally, a co-predictor is used to sufficiently reason the word-word relations. We perform extensive experiments on 14 widely-used benchmark datasets for flat, overlapped, and discontinuous NER (8 English and 6 Chinese datasets), where our model beats all the current top-performing baselines, pushing the state-of-the-art performances of unified NER.

Submitted 19 December, 2021; originally announced December 2021.

Comments: Accepted by AAAI'22

arXiv:2110.02001 [pdf, other]

Mastering the Explicit Opinion-role Interaction: Syntax-aided Neural Transition System for Unified Opinion Role Labeling

Authors: Shengqiong Wu, Hao Fei, Fei Li, Donghong Ji, Meishan Zhang, Yijiang Liu, Chong Teng

Abstract: Unified opinion role labeling (ORL) aims to detect all possible opinion structures of 'opinion-holder-target' in one shot, given a text. The existing transition-based unified method, unfortunately, is subject to longer opinion terms and fails to solve the term overlap issue. Current top performance has been achieved by employing the span-based graph model, which however still suffers from both high model complexity and insufficient interaction among opinions and roles. In this work, we investigate a novel solution by revisiting the transition architecture, and augmenting it with a pointer network (PointNet). The framework parses out all opinion structures in linear-time complexity, meanwhile breaks through the limitation of any length of terms with PointNet. To achieve the explicit opinion-role interactions, we further propose a unified dependency-opinion graph (UDOG), co-modeling the syntactic dependency structure and the partial opinion-role structure. We then devise a relation-centered graph aggregator (RCGA) to encode the multi-relational UDOG, where the resulting high-order representations are used to promote the predictions in the vanilla transition system. Our model achieves new state-of-the-art results on the MPQA benchmark. Analyses further demonstrate the superiority of our methods on both efficacy and efficiency.

Submitted 13 December, 2021; v1 submitted 5 October, 2021; originally announced October 2021.

Comments: AAAI2022

arXiv:2109.09373 [pdf, other]

Fast Online Optimization for Terrain-Blind Bipedal Robot Walking with a Decoupled Actuated SLIP Model

Authors: Ke Wang, Hengyi Fei, Petar Kormushev

Abstract: We present a highly reactive controller which enables bipedal robots to blindly walk over various kinds of uneven terrains while resisting pushes. The high level motion planner does fast online optimization for footstep locations and Center of Mass (CoM) height using the decoupled actuated Spring Loaded Inverted Pendulum (aSLIP) model. The decoupled aSLIP model simplifies the original aSLIP with Linear Inverted Pendulum (LIP) dynamics in horizontal states and spring dynamics in the vertical state. The motion planning can be formulated as a discrete-time Model Predictive Control (MPC) and solved at a frequency of 1 kHz. The output of the motion planner using a reduced-order model is fed into an inverse-dynamics based whole body controller for execution on the robot. A key result of this controller is that the foot of the robot is compliant, which further extends the robot's ability to be robust to unobserved terrain changes. We evaluate our method in simulation with the bipedal robot SLIDER. Results show the robot can blindly walk over various uneven terrains including slopes, wave fields and stairs. It can also resist pushes while walking on uneven terrain.

Submitted 20 September, 2021; originally announced September 2021.

Comments: 8 pages, 8 figures, submitted to ICRA 2022

arXiv:2106.15597 [pdf, other]

Segmentation with Multiple Acceptable Annotations: A Case Study of Myocardial Segmentation in Contrast Echocardiography

Authors: Dewen Zeng, Mingqi Li, Yukun Ding, Xiaowei Xu, Qiu Xie, Ruixue Xu, Hongwen Fei, Meiping Huang, Jian Zhuang, Yiyu Shi

Abstract: Most existing deep learning-based frameworks for image segmentation assume that a unique ground truth is known and can be used for performance evaluation. This is true for many applications, but not all. Myocardial segmentation of Myocardial Contrast Echocardiography (MCE), a critical task in automatic myocardial perfusion analysis, is an example. Due to the low resolution and serious artifacts in MCE data, annotations from different cardiologists can vary significantly, and it is hard to tell which one is the best. In this case, how can we find a good way to evaluate segmentation performance and how do we train the neural network? In this paper, we address the first problem by proposing a new extended Dice to effectively evaluate the segmentation performance when multiple accepted ground truth is available. Then based on our proposed metric, we solve the second problem by further incorporating the new metric into a loss function that enables neural networks to flexibly learn general features of myocardium. Experiment results on our clinical MCE data set demonstrate that the neural network trained with the proposed loss function outperforms those existing ones that try to obtain a unique ground truth from multiple annotations, both quantitatively and qualitatively. Finally, our grading study shows that using extended Dice as an evaluation metric can better identify segmentation results that need manual correction compared with using Dice.

Submitted 29 June, 2021; originally announced June 2021.

Comments: 12 pages

arXiv:2105.08267 [pdf, other]

EchoCP: An Echocardiography Dataset in Contrast Transthoracic Echocardiography for Patent Foramen Ovale Diagnosis

Authors: Tianchen Wang, Zhihe Li, Meiping Huang, Jian Zhuang, Shanshan Bi, Jiawei Zhang, Yiyu Shi, Hongwen Fei, Xiaowei Xu

Abstract: Patent foramen ovale (PFO) is a potential separation between the septum primum and septum secundum located in the anterosuperior portion of the atrial septum. PFO is one of the main factors causing cryptogenic stroke which is the fifth leading cause of death in the United States. For PFO diagnosis, contrast transthoracic echocardiography (cTTE) is preferred as being a more robust method compared with others. However, the current PFO diagnosis through cTTE is extremely slow as it is proceeded manually by sonographers on echocardiography videos. Currently there is no publicly available dataset for this important topic in the community. In this paper, we present EchoCP, as the first echocardiography dataset in cTTE targeting PFO diagnosis. EchoCP consists of 30 patients with both rest and Valsalva maneuver videos which covers various PFO grades. We further establish an automated baseline method for PFO diagnosis based on the state-of-the-art cardiac chamber segmentation technique, which achieves 0.89 average mean Dice score, but only 0.60/0.67 mean accuracies for PFO diagnosis, leaving large room for improvement. We hope that the challenging EchoCP dataset can stimulate further research and lead to innovative and generic solutions that would have an impact in multiple domains. Our dataset is released.

Submitted 15 September, 2021; v1 submitted 18 May, 2021; originally announced May 2021.

Comments: MICCAI2021

arXiv:2105.02520 [pdf, other]

Learn from Syntax: Improving Pair-wise Aspect and Opinion Terms Extraction with Rich Syntactic Knowledge

Authors: Shengqiong Wu, Hao Fei, Yafeng Ren, Donghong Ji, Jingye Li

Abstract: In this paper, we propose to enhance the pair-wise aspect and opinion terms extraction (PAOTE) task by incorporating rich syntactic knowledge. We first build a syntax fusion encoder for encoding syntactic features, including a label-aware graph convolutional network (LAGCN) for modeling the dependency edges and labels, as well as the POS tags unifiedly, and a local-attention module encoding POS tags for better term boundary detection. During pairing, we then adopt Biaffine and Triaffine scoring for high-order aspect-opinion term pairing, in the meantime re-harnessing the syntax-enriched representations in LAGCN for syntactic-aware scoring. Experimental results on four benchmark datasets demonstrate that our model outperforms current state-of-the-art baselines, meanwhile yielding explainable predictions with syntactic knowledge.

Submitted 6 May, 2021; originally announced May 2021.

Comments: IJCAI2021

arXiv:2101.00394 [pdf, other]

End-to-end Semantic Role Labeling with Neural Transition-based Model

Authors: Hao Fei, Meishan Zhang, Bobo Li, Donghong Ji

Abstract: End-to-end semantic role labeling (SRL) has received increasing interest. It performs the two subtasks of SRL: predicate identification and argument role labeling, jointly. Recent work is mostly focused on graph-based neural models, while the transition-based framework with neural networks, which has been widely used in a number of closely-related tasks, has not been studied for the joint task yet. In this paper, we present the first work of transition-based neural models for end-to-end SRL. Our transition model incrementally discovers all sentential predicates as well as their arguments by a set of transition actions. The actions of the two subtasks are executed mutually for full interactions. Besides, we suggest high-order compositions to extract non-local features, which can enhance the proposed transition model further. Experimental results on CoNLL09 and Universal Proposition Bank show that our final model can produce state-of-the-art performance, and meanwhile keeps highly efficient in decoding. We also conduct detailed experimental analysis for a deep understanding of our proposed model.

Submitted 2 January, 2021; originally announced January 2021.

Comments: Accepted at AAAI 2021

arXiv:2011.09737 [pdf, other]

doi 10.1109/CVPR46437.2021.00295

Face Forgery Detection by 3D Decomposition

Authors: Xiangyu Zhu, Hao Wang, Hongyan Fei, Zhen Lei, Stan Z. Li

Abstract: Detecting digital face manipulation has attracted extensive attention due to fake media's potential harms to the public. However, recent advances have been able to reduce the forgery signals to a low magnitude. Decomposition, which reversibly decomposes an image into several constituent elements, is a promising way to highlight the hidden forgery details. In this paper, we consider a face image as the production of the intervention of the underlying 3D geometry and the lighting environment, and decompose it in a computer graphics view. Specifically, by disentangling the face image into 3D shape, common texture, identity texture, ambient light, and direct light, we find the devil lies in the direct light and the identity texture. Based on this observation, we propose to utilize facial detail, which is the combination of direct light and identity texture, as the clue to detect the subtle forgery patterns. Besides, we highlight the manipulated region with a supervised attention mechanism and introduce a two-stream structure to exploit both face image and facial detail together as a multi-modality task. Extensive experiments indicate the effectiveness of the extra features extracted from the facial detail, and our method achieves the state-of-the-art performance.

Submitted 19 November, 2020; originally announced November 2020.

arXiv:2009.09174 [pdf, other]

Aggressive Language Detection with Joint Text Normalization via Adversarial Multi-task Learning

Authors: Shengqiong Wu, Hao Fei, Donghong Ji

Abstract: Aggressive language detection (ALD), detecting the abusive and offensive language in texts, is one of the crucial applications in NLP community. Most existing works treat ALD as regular classification with neural models, while ignoring the inherent conflicts of social media text that they are quite unnormalized and irregular. In this work, we target improving the ALD by jointly performing text normalization (TN), via an adversarial multi-task learning framework. The private encoders for ALD and TN focus on the task-specific features retrieving, respectively, and the shared encoder learns the underlying common features over two tasks. During adversarial training, a task discriminator distinguishes the separate learning of ALD or TN. Experimental results on four ALD datasets show that our model outperforms all baselines under differing settings by large margins, demonstrating the necessity of joint learning the TN with ALD. Further analysis is conducted for a better understanding of our method.

Submitted 19 September, 2020; originally announced September 2020.

Comments: accepted at NLPCC2020

arXiv:2009.09173 [pdf, other]

Nominal Compound Chain Extraction: A New Task for Semantic-enriched Lexical Chain

Authors: Bobo Li, Hao Fei, Yafeng Ren, Donghong Ji

Abstract: Lexical chain consists of cohesion words in a document, which implies the underlying structure of a text, and thus facilitates downstream NLP tasks. Nevertheless, existing work focuses on detecting the simple surface lexicons with shallow syntax associations, ignoring the semantic-aware lexical compounds as well as the latent semantic frames, (e.g., topic), which can be much more crucial for real-world NLP applications. In this paper, we introduce a novel task, Nominal Compound Chain Extraction (NCCE), extracting and clustering all the nominal compounds that share identical semantic topics. In addition, we model the task as a two-stage prediction (i.e., compound extraction and chain detection), which is handled via a proposed joint framework. The model employs the BERT encoder to yield contextualized document representation. Also, HowNet is exploited as external resources for offering rich sememe information. The experiments are based on our manually annotated corpus, and the results prove the necessity of the NCCE task as well as the effectiveness of our joint approach.

Submitted 19 September, 2020; originally announced September 2020.

Comments: accepted at NLPCC2020

arXiv:2009.07411 [pdf, other]

Mimic and Conquer: Heterogeneous Tree Structure Distillation for Syntactic NLP

Authors: Hao Fei, Yafeng Ren, Donghong Ji

Abstract: Syntax has been shown useful for various NLP tasks, while existing work mostly encodes singleton syntactic tree using one hierarchical neural network. In this paper, we investigate a simple and effective method, Knowledge Distillation, to integrate heterogeneous structure knowledge into a unified sequential LSTM encoder. Experimental results on four typical syntax-dependent tasks show that our method outperforms tree encoders by effectively integrating rich heterogeneous structure syntax, meanwhile reducing error propagation, and also outperforms ensemble methods, in terms of both the efficiency and accuracy.

Submitted 15 September, 2020; originally announced September 2020.

Comments: To appear at EMNLP2020

arXiv:2009.07408 [pdf, other]

Retrofitting Structure-aware Transformer Language Model for End Tasks

Authors: Hao Fei, Yafeng Ren, Donghong Ji

Abstract: We consider retrofitting structure-aware Transformer-based language model for facilitating end tasks by proposing to exploit syntactic distance to encode both the phrasal constituency and dependency connection into the language model. A middle-layer structural learning strategy is leveraged for structure integration, accomplished with main semantic task training under multi-task learning scheme. Experimental results show that the retrofitted structure-aware Transformer language model achieves improved perplexity, meanwhile inducing accurate syntactic phrases. By performing structure-aware fine-tuning, our model achieves significant improvements for both semantic- and syntactic-dependent tasks.

Submitted 15 September, 2020; originally announced September 2020.

Comments: Accepted as long paper in EMNLP2020 main proceeding

arXiv:2009.06957 [pdf, other]

High-order Refining for End-to-end Chinese Semantic Role Labeling

Authors: Hao Fei, Yafeng Ren, Donghong Ji

Abstract: Current end-to-end semantic role labeling is mostly accomplished via graph-based neural models. However, these all are first-order models, where each decision for detecting any predicate-argument pair is made in isolation with local features. In this paper, we present a high-order refining mechanism to perform interaction between all predicate-argument pairs. Based on the baseline graph model, our high-order refining module learns higher-order features between all candidate pairs via attention calculation, which are later used to update the original token representations. After several iterations of refinement, the underlying token representations can be enriched with globally interacted features. Our high-order model achieves state-of-the-art results on Chinese SRL data, including CoNLL09 and Universal Proposition Bank, meanwhile relieving the long-range dependency issues.

Submitted 15 September, 2020; originally announced September 2020.

Comments: Accepted at AACL2020 as short paper

arXiv:2008.10284 [pdf, other]

Cross-lingual Semantic Role Labeling with Model Transfer

Authors: Hao Fei, Meishan Zhang, Fei Li, Donghong Ji

Abstract: Prior studies show that cross-lingual semantic role labeling (SRL) can be achieved by model transfer under the help of universal features. In this paper, we fill the gap of cross-lingual SRL by proposing an end-to-end SRL model that incorporates a variety of universal features and transfer methods. We study both the bilingual transfer and multi-source transfer, under gold or machine-generated syntactic inputs, pre-trained high-order abstract features, and contextualized multilingual word representations. Experimental results on the Universal Proposition Bank corpus indicate that performances of the cross-lingual SRL can vary by leveraging different cross-lingual features. In addition, whether the features are gold-standard also has an impact on performances. Precisely, we find that gold syntax features are much more crucial for cross-lingual SRL, compared with the automatically-generated ones. Moreover, universal dependency structure features are able to give the best help, and both pre-trained high-order features and contextualized word representations can further bring significant improvements.

Submitted 24 August, 2020; originally announced August 2020.

Comments: Accepted at TASLP

Journal ref: TASLP, 2020

arXiv:2004.06295 [pdf, other]

Cross-Lingual Semantic Role Labeling with High-Quality Translated Training Corpus

Authors: Hao Fei, Meishan Zhang, Donghong Ji

Abstract: Many efforts of research are devoted to semantic role labeling (SRL) which is crucial for natural language understanding. Supervised approaches have achieved impressive performances when large-scale corpora are available for resource-rich languages such as English. For low-resource languages with no annotated SRL dataset, however, it is still challenging to obtain competitive performances. Cross-lingual SRL is one promising way to address the problem, which has achieved great advances with the help of model transferring and annotation projection. In this paper, we propose a novel alternative based on corpus translation, constructing high-quality training datasets for the target languages from the source gold-standard SRL annotations. Experimental results on Universal Proposition Bank show that the translation-based method is highly effective, and the automatic pseudo datasets can improve the target-language SRL performances significantly.

Submitted 6 May, 2020; v1 submitted 14 April, 2020; originally announced April 2020.

Comments: Accepted at ACL 2020

arXiv:1912.07367 [pdf, other]

A Model-driven and Data-driven Fusion Framework for Accurate Air Quality Prediction

Authors: Haolin Fei, Xiaofeng Wu, Chunbo Luo

Abstract: Air quality is closely related to public health. Health issues such as cardiovascular diseases and respiratory diseases, may have connection with long exposure to highly polluted environment. Therefore, accurate air quality forecasts are extremely important to those who are vulnerable. To estimate the variation of several air pollution concentrations, previous researchers used various approaches, such as the Community Multiscale Air Quality model (CMAQ) or neural networks. Although CMAQ model considers a coverage of the historic air pollution data and meteorological variables, extra bias is introduced due to additional adjustment. In this paper, a combination of model-based strategy and data-driven method namely the physical-temporal collection(PTC) model is proposed, aiming to fix the systematic error that traditional models deliver. In the data-driven part, the first components are the temporal pattern and the weather pattern to measure important features that contribute to the prediction performance. The less relevant input variables will be removed to eliminate negative weights in network training. Then, we deploy a long-short-term-memory (LSTM) to fetch the preliminary results, which will be further corrected by a neural network (NN) involving the meteorological index as well as other pollutants concentrations. The data-set we applied for forecasting is from January 1st, 2016 to December 31st, 2016. According to the results, our PTC achieves an excellent performance compared with the baseline model (CMAQ prediction, GRU, DNN and etc.).
This joint model-based data-driven method for air quality prediction can be easily deployed on stations without extra adjustment, providing results with high-time-resolution information for vulnerable members to prevent heavy air pollution ahead.

Submitted 6 December, 2019; originally announced December 2019.

arXiv:1907.10934 [pdf]

doi 10.1039/D0SM00405G

Soft Granular Particles Sheared at a Controlled Volume: Rate-dependent dynamics and the solid-fluid transition

Authors: J. -C. Tsai, M. -R. Chou, P. -C. Huang, H. -T. Fei

Abstract: We study the responses of fluid-immersed soft hydrogel spheres that are sheared under controlled volume fractions. Slippery, deformable particles along with the density-matched interstitial fluid are sandwiched between two opposing rough cones, allowing studies for a wide range of volume fraction $φ$ both above and below the jamming of granular suspension. We utilize sudden cessations of shearing, accompanied by refraction-matched internal imaging, to supplement the conventional flow-curve measurements. At sufficiently high volume fractions, the settling of particles after the cessations exhibits a continuous yet distinct transition over the change of shear rate. Such changes back out the qualitative difference in the state of flowing prior to the cessations: the quasi-static yielding of a tightly packed network, as opposed to the rapid sliding of particles mediated by the interstitial fluid whose dynamics depends on the driving rate. In addition, we determine the solid-fluid transition using two independent methods: the extrapolation of stress residues and the estimated yield stress from high values of $φ$, and the settling of particles upon shear cessations as $φ$ goes across the transition. We also verify the power law on values of characteristic stress with respect to the distance from jamming $φ- φ_c$, with an exponent close to 2.
These results demonstrate a multitude of relaxation timescales behind the dynamics of soft particles, and provoke questions on how we extend existing paradigms on the flow of a densely packed system when the softness is actively involved.

Submitted 13 June, 2020; v1 submitted 25 July, 2019; originally announced July 2019.

Comments: Soft Matter, 2020

arXiv:1902.09249 [pdf]

doi 10.1038/s41467-019-10995-3

High-pressure synthesis of ultraincompressible hard rhenium nitride pernitride Re$_{2}$(N$_{2}$)N$_{2}$ stable at ambient conditions

Authors: Maxim Bykov, Stella Chariton, Hongzhan Fei, Timofey Fedotenko, Georgios Aprilis, Alena V. Ponomareva, Ferenc Tasnádi, Igor A. Abrikosov, Benoit Merle, Patrick Feldner, Sebastian Vogel, Wolfgang Schnick, Vitali B. Prakapenka, Eran Greenberg, Michael Hanfland, Anna Pakhomova, Hanns-Peter Liermann, Tomoo Katsura, Natalia Dubrovinskaia, Leonid Dubrovinsky

Abstract: Here we report the synthesis of metallic, ultraincompressible (bulk modulus $K_{0}$ = 428(10) GPa) and very hard (nanoindentation hardness 36.7(8) GPa) rhenium (V) nitride pernitride Re$_{2}$(N$_{2}$)N$_{2}$. While the empirical chemical formula of the compound, ReN$_{2}$, is the same as for other known transition metals pernitrides, e.g. IrN$_{2}$, PtN$_{2}$, PdN$_{2}$ and OsN$_{2}$, its crystal chemistry is unique. The known pernitrides of transition metals consist of a metal in the oxidation state +IV and pernitride anions N$_{2}^{4-}$. ReN$_{2}$ contains both pernitride N$_{2}^{4-}$ and discrete N$^{3-}$ anions, which explains its exceptional properties. Moreover, in the original experimental synthesis of Re$_{2}$(N$_{2}$)N$_{2}$ performed in a laser-heated diamond anvil cell via a direct reaction between rhenium and nitrogen at pressures from 40 to 90 GPa we observed that the material was recoverable at ambient conditions. Consequently, we developed a route to scale up its synthesis through a reaction between rhenium and ammonium azide, NH$_{4}$N$_{3}$, in a large-volume press at 33 GPa. Our work resulted not only in a discovery of a novel material with unusual crystal chemistry and a set of properties attractive for potential applications, but also demonstrated a feasibility of surmounting conceptions common in material sciences.

Submitted 25 February, 2019; originally announced February 2019.

Journal ref: Nat Commun 10, 2994 (2019)

arXiv:1806.04416 [pdf]

A nanophotonic all-optical diode for non-reciprocal transmission of circularly polarized lights

Authors: Hongming Fei, Min Wu, Han Lin, Xin Liu, Yibiao Yang, Mingda Zhang, Binzhao Cao

Abstract: All optical diodes (AODs) play an important role in quantum optics and information processing, in which the information is encoded by photons. Only circularly polarized lights are able to carry the spin states of photons, which has been intensively used in quantum computing and information processing and enable new research fields, such as chiral quantum optics. An ideal AOD should be able to work with arbitrary polarizations states, including circularly polarized lights, which has not been demonstrated yet. In this paper, we theoretically demonstrate for the first time a nanophotonic AOD that is able to work with circularly polarized lights. The AOD nanostructure is based on a heterostructure of two-dimension silica and silicon photonic crystals (PhCs). By controlling the effective refractive indices of the PhCs and using an inclined interface, we are able to exploit generalized total reflection principle to achieve non-reciprocal transmission of circularly polarized lights. In addition, the nanophotonic AOD is able to achieve high forward transmittance greater than 0.6 and high contrast ratio close to 1 in a broad wavelength range of 1497 nm to 1666 nm. The designed nanophotonic AOD will find broad applications in optical quantum information processing and computing.

Submitted 12 June, 2018; originally announced June 2018.

Comments: 12 pages,6 figures

arXiv:0711.3280 [pdf, ps, other]

Chaotic behavior in the accretion disk

Authors: Liu Lei, Hu Fei

Abstract: The eccentric luminosity variation of quasars is still a mystery. Analytic results of this behavior ranged from multi-periodic behavior to a purely random process. Recently, we have used nonlinear time-series analysis to analyze the light curve of 3C 273 and found its eccentric behavior may be chaos [L. Liu, Chin. J. Astron. Astrophys. \textbf{6}, 663 (2006)]. This result induces us to look for some nonlinear mechanism to explain the eccentric luminosity variation. In this paper, we propose a simple non-linear accretion disk model and find it shows a kind of chaotic behavior under some circumstances. Then we compute the outburst energy $\triangle F$, defined as the difference of the maximum luminosity and the minimum luminosity, and the mean luminosity $<F>$. We find that $\triangle F\sim < F >^α$ in the chaotic domain, where $α\approx 1$. In this domain, we also find that $<F > \sim M^{0.5}$, where $M$ is the mass of central black hole. These results are confirmed by or compatible with some results from the observational data analysis [A. J. Pica and A. G. Smith, Astrophys. J. \textbf{272}, 11 (1983); M. Wold, M. S. Brotherton and Z. Shang, Mon. Not. R. Astron. Soc. \textbf{375}, 989 (2007)].

Submitted 26 March, 2008; v1 submitted 21 November, 2007; originally announced November 2007.

Comments: 14 pages, 9 figures

arXiv:cs/0306093 [pdf]

Grid-Brick Event Processing Framework in GEPS

Authors: Antonio Amorim, Luis Pedro, Han Fei, Nuno Almeida, Paulo Trezentos, Jaime E. Villate

Abstract: Experiments like ATLAS at LHC involve a scale of computing and data management that greatly exceeds the capability of existing systems, making it necessary to resort to Grid-based Parallel Event Processing Systems (GEPS). Traditional Grid systems concentrate the data in central data servers which have to be accessed by many nodes each time an analysis or processing job starts. These systems require very powerful central data servers and make little use of the distributed disk space that is available in commodity computers. The Grid-Brick system, which is described in this paper, follows a different approach. The data storage is split among all grid nodes having each one a piece of the whole information. Users submit queries and the system will distribute the tasks through all the nodes and retrieve the result, merging them together in the Job Submit Server. The main advantage of using this system is the huge scalability it provides, while its biggest disadvantage appears in the case of failure of one of the nodes. A workaround for this problem involves data replication or backup.

Submitted 14 June, 2003; originally announced June 2003.

Comments: 6 pages; document for CHEP'03 conference

ACM Class: C.1.4; C.2.1; C.2.4; D.1.3; D.4.3; D.4.7; H.2.4

Showing 51–89 of 89 results for author: Fei, H