
Showing 1–7 of 7 results for author: Tsukagoshi, H

  1. arXiv:2404.09002  [pdf, other]

    cs.CL

    WikiSplit++: Easy Data Refinement for Split and Rephrase

    Authors: Hayato Tsukagoshi, Tsutomu Hirao, Makoto Morishita, Katsuki Chousa, Ryohei Sasano, Koichi Takeda

    Abstract: The task of Split and Rephrase, which splits a complex sentence into multiple simple sentences with the same meaning, improves readability and enhances the performance of downstream tasks in natural language processing (NLP). However, while Split and Rephrase can be improved using a text-to-text generation approach that applies encoder-decoder models fine-tuned with a large-scale dataset, it still…

    Submitted 13 April, 2024; originally announced April 2024.

    Comments: Accepted at LREC-COLING 2024

  2. arXiv:2402.15132  [pdf, other]

    cs.CL cs.LG

    Improving Sentence Embeddings with an Automatically Generated NLI Dataset

    Authors: Soma Sato, Hayato Tsukagoshi, Ryohei Sasano, Koichi Takeda

    Abstract: Decoder-based large language models (LLMs) have shown high performance on many tasks in natural language processing. This is also true for sentence embedding learning, where a decoder-based model, PromptEOL, has achieved the best performance on semantic textual similarity (STS) tasks. However, PromptEOL makes great use of fine-tuning with a manually annotated natural language inference (NLI) datas…

    Submitted 23 February, 2024; originally announced February 2024.

  3. arXiv:2310.19349  [pdf, other]

    cs.CL

    Japanese SimCSE Technical Report

    Authors: Hayato Tsukagoshi, Ryohei Sasano, Koichi Takeda

    Abstract: We report the development of Japanese SimCSE, Japanese sentence embedding models fine-tuned with SimCSE. Since there is a lack of sentence embedding models for Japanese that can be used as a baseline in sentence embedding research, we conducted extensive experiments on Japanese sentence embeddings involving 24 pre-trained Japanese or multilingual language models, five supervised datasets, and four…

    Submitted 30 October, 2023; originally announced October 2023.

  4. arXiv:2305.12990  [pdf, other]

    cs.CL

    Sentence Representations via Gaussian Embedding

    Authors: Shohei Yoda, Hayato Tsukagoshi, Ryohei Sasano, Koichi Takeda

    Abstract: Recent progress in sentence embedding, which represents the meaning of a sentence as a point in a vector space, has achieved high performance on tasks such as a semantic textual similarity (STS) task. However, sentence representations as a point in a vector space can express only a part of the diverse information that sentences have, such as asymmetrical relationships between sentences. This paper…

    Submitted 20 February, 2024; v1 submitted 22 May, 2023; originally announced May 2023.

    Comments: Accepted to EACL 2024 (Main)

  5. arXiv:2302.11120  [pdf]

    cs.RO eess.SY

    Soft Pneumatic Actuator Capable of Generating Various Bending and Extension Motions Inspired by an Elephant Trunk

    Authors: Peizheng Yuan, Hideyuki Tsukagoshi

    Abstract: Inspired by the dexterous handling ability of an elephant's trunk, we propose a pneumatic actuator that generates diverse bending and extension motions in a flexible arm. The actuator consists of two flexible tubes. Each flexible tube is restrained by a single string with variable length and tilt angle. Even if a single tube can perform only three simple types of motions (bending, extension, and h…

    Submitted 21 February, 2023; originally announced February 2023.

    Comments: 8 pages, 11 figures, submitted to the IEEE Robotics and Automation Letters (RA-L)

  6. arXiv:2202.02990  [pdf, other]

    cs.CL

    Comparison and Combination of Sentence Embeddings Derived from Different Supervision Signals

    Authors: Hayato Tsukagoshi, Ryohei Sasano, Koichi Takeda

    Abstract: There have been many successful applications of sentence embedding methods. However, it has not been well understood what properties are captured in the resulting sentence embeddings depending on the supervision signals. In this paper, we focus on two types of sentence embedding methods with similar architectures and tasks: one fine-tunes pre-trained language models on the natural language inferen…

    Submitted 10 June, 2022; v1 submitted 7 February, 2022; originally announced February 2022.

    Comments: Accepted at *SEM 2022

  7. arXiv:2105.04339  [pdf, other]

    cs.CL

    DefSent: Sentence Embeddings using Definition Sentences

    Authors: Hayato Tsukagoshi, Ryohei Sasano, Koichi Takeda

    Abstract: Sentence embedding methods using natural language inference (NLI) datasets have been successfully applied to various tasks. However, these methods are only available for limited languages due to relying heavily on the large NLI datasets. In this paper, we propose DefSent, a sentence embedding method that uses definition sentences from a word dictionary, which performs comparably on unsupervised se…

    Submitted 9 June, 2021; v1 submitted 10 May, 2021; originally announced May 2021.

    Comments: Accepted at ACL-IJCNLP 2021 main conference