Search | arXiv e-print repository

Enhancing In-Context Learning with Semantic Representations for Relation Extraction

Authors: Peitao Han, Lis Kanashiro Pereira, Fei Cheng, Wan Jou She, Eiji Aramaki

Abstract: In this work, we employ two AMR-enhanced semantic representations for ICL on RE: one that explores the AMR structure generated for a sentence at the subgraph level (shortest AMR path), and another that explores the full AMR structure generated for a sentence. In both cases, we demonstrate that all settings benefit from the fine-grained AMR's semantic structure. We evaluate our model on four RE dat… ▽ More In this work, we employ two AMR-enhanced semantic representations for ICL on RE: one that explores the AMR structure generated for a sentence at the subgraph level (shortest AMR path), and another that explores the full AMR structure generated for a sentence. In both cases, we demonstrate that all settings benefit from the fine-grained AMR's semantic structure. We evaluate our model on four RE datasets. Our results show that our model can outperform the GPT-based baselines, and achieve SOTA performance on two of the datasets, and competitive performance on the other two. △ Less

Submitted 14 June, 2024; originally announced June 2024.

arXiv:2403.18504 [pdf]

AcTED: Automatic Acquisition of Typical Event Duration for Semi-supervised Temporal Commonsense QA

Authors: Felix Virgo, Fei Cheng, Lis Kanashiro Pereira, Masayuki Asahara, Ichiro Kobayashi, Sadao Kurohashi

Abstract: We propose a voting-driven semi-supervised approach to automatically acquire the typical duration of an event and use it as pseudo-labeled data. The human evaluation demonstrates that our pseudo labels exhibit surprisingly high accuracy and balanced coverage. In the temporal commonsense QA task, experimental results show that using only pseudo examples of 400 events, we achieve performance compara… ▽ More We propose a voting-driven semi-supervised approach to automatically acquire the typical duration of an event and use it as pseudo-labeled data. The human evaluation demonstrates that our pseudo labels exhibit surprisingly high accuracy and balanced coverage. In the temporal commonsense QA task, experimental results show that using only pseudo examples of 400 events, we achieve performance comparable to the existing BERT-based weakly supervised approaches that require a significant amount of training examples. When compared to the RoBERTa baselines, our best approach establishes state-of-the-art performance with a 7% improvement in Exact Match. △ Less

Submitted 27 March, 2024; originally announced March 2024.

arXiv:2206.03025 [pdf, other]

OCHADAI at SemEval-2022 Task 2: Adversarial Training for Multilingual Idiomaticity Detection

Authors: Lis Kanashiro Pereira, Ichiro Kobayashi

Abstract: We propose a multilingual adversarial training model for determining whether a sentence contains an idiomatic expression. Given that a key challenge with this task is the limited size of annotated data, our model relies on pre-trained contextual representations from different multi-lingual state-of-the-art transformer-based language models (i.e., multilingual BERT and XLM-RoBERTa), and on adversar… ▽ More We propose a multilingual adversarial training model for determining whether a sentence contains an idiomatic expression. Given that a key challenge with this task is the limited size of annotated data, our model relies on pre-trained contextual representations from different multi-lingual state-of-the-art transformer-based language models (i.e., multilingual BERT and XLM-RoBERTa), and on adversarial training, a training method for further enhancing model generalization and robustness. Without relying on any human-crafted features, knowledge bases, or additional datasets other than the target datasets, our model achieved competitive results and ranked 6th place in SubTask A (zero-shot) setting and 15th place in SubTask A (one-shot) setting. △ Less

Submitted 7 June, 2022; originally announced June 2022.

Comments: arXiv admin note: substantial text overlap with arXiv:2105.05535

Journal ref: SemEval 2022-Task 2

arXiv:2105.05535 [pdf, other]

OCHADAI-KYOTO at SemEval-2021 Task 1: Enhancing Model Generalization and Robustness for Lexical Complexity Prediction

Authors: Yuki Taya, Lis Kanashiro Pereira, Fei Cheng, Ichiro Kobayashi

Abstract: We propose an ensemble model for predicting the lexical complexity of words and multiword expressions (MWEs). The model receives as input a sentence with a target word or MWEand outputs its complexity score. Given that a key challenge with this task is the limited size of annotated data, our model relies on pretrained contextual representations from different state-of-the-art transformer-based lan… ▽ More We propose an ensemble model for predicting the lexical complexity of words and multiword expressions (MWEs). The model receives as input a sentence with a target word or MWEand outputs its complexity score. Given that a key challenge with this task is the limited size of annotated data, our model relies on pretrained contextual representations from different state-of-the-art transformer-based language models (i.e., BERT and RoBERTa), and on a variety of training methods for further enhancing model generalization and robustness:multi-step fine-tuning and multi-task learning, and adversarial training. Additionally, we propose to enrich contextual representations by adding hand-crafted features during training. Our model achieved competitive results and ranked among the top-10 systems in both sub-tasks. △ Less

Submitted 15 June, 2021; v1 submitted 12 May, 2021; originally announced May 2021.

Showing 1–4 of 4 results for author: Pereira, L K