Showing 1–10 of 10 results for author: Honda, U

  1. arXiv:2406.12060

    cs.CL cs.LG

    Not Eliminate but Aggregate: Post-Hoc Control over Mixture-of-Experts to Address Shortcut Shifts in Natural Language Understanding

    Authors: Ukyo Honda, Tatsushi Oka, Peinan Zhang, Masato Mita

    Abstract: Recent models for natural language understanding are inclined to exploit simple patterns in datasets, commonly known as shortcuts. These shortcuts hinge on spurious correlations between labels and latent features existing in the training data. At inference time, shortcut-dependent models are likely to generate erroneous predictions under distribution shifts, particularly when some latent features…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: Accepted to TACL (pre-MIT Press publication version, 21 pages, 5 figures)

  2. arXiv:2405.13541

    cs.CL cs.AI cs.LG

    Annotation-Efficient Preference Optimization for Language Model Alignment

    Authors: Yuu Jinnai, Ukyo Honda

    Abstract: Preference optimization is a standard approach to fine-tuning large language models to align with human preferences. The quality, diversity, and quantity of the preference dataset are critical to the effectiveness of preference optimization. However, obtaining a large amount of high-quality and diverse preference annotations is difficult in many applications. This raises the question of how to use…

    Submitted 22 May, 2024; originally announced May 2024.

  3. arXiv:2405.01280

    cs.CL

    Reinforcement Learning for Edit-Based Non-Autoregressive Neural Machine Translation

    Authors: Hao Wang, Tetsuro Morimura, Ukyo Honda, Daisuke Kawahara

    Abstract: Non-autoregressive (NAR) language models are known for their low latency in neural machine translation (NMT). However, a performance gap exists between NAR and autoregressive models due to the large decoding space and difficulty in capturing dependency between target words accurately. Compounding this, preparing appropriate training data for NAR models is a non-trivial task, often exacerbating exp…

    Submitted 2 July, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

    Comments: NAACL SRW 2024

  4. arXiv:2404.00752

    cs.CL cs.AI

    On the True Distribution Approximation of Minimum Bayes-Risk Decoding

    Authors: Atsumoto Ohashi, Ukyo Honda, Tetsuro Morimura, Yuu Jinnai

    Abstract: Minimum Bayes-risk (MBR) decoding has recently gained renewed attention in text generation. MBR decoding considers texts sampled from a model as pseudo-references and selects the text with the highest similarity to the others. Therefore, sampling is one of the key elements of MBR decoding, and previous studies reported that the performance varies by sampling methods. From a theoretical standpoint,…

    Submitted 31 March, 2024; originally announced April 2024.

    Comments: NAACL 2024 (main conference)

  5. arXiv:2403.14946

    cs.CL cs.AI cs.LG

    A Single Linear Layer Yields Task-Adapted Low-Rank Matrices

    Authors: Hwichan Kim, Shota Sasaki, Sho Hoshino, Ukyo Honda

    Abstract: Low-Rank Adaptation (LoRA) is a widely used Parameter-Efficient Fine-Tuning (PEFT) method that updates an initial weight matrix $W_0$ with a delta matrix $ΔW$ consisting of two low-rank matrices $A$ and $B$. A previous study suggested that there is a correlation between $W_0$ and $ΔW$. In this study, we aim to delve deeper into relationships between $W_0$ and low-rank matrices $A$ and $B$ to further…

    Submitted 22 March, 2024; originally announced March 2024.

    Comments: Accepted at LREC-COLING 2024

  6. arXiv:2401.05054

    cs.CL cs.AI

    Generating Diverse and High-Quality Texts by Minimum Bayes Risk Decoding

    Authors: Yuu Jinnai, Ukyo Honda, Tetsuro Morimura, Peinan Zhang

    Abstract: One of the most important challenges in text generation systems is to produce outputs that are not only correct but also diverse. Recently, Minimum Bayes-Risk (MBR) decoding has gained prominence for generating sentences of the highest quality among the decoding algorithms. However, existing algorithms proposed for generating diverse outputs are predominantly based on beam search or random samplin…

    Submitted 11 June, 2024; v1 submitted 10 January, 2024; originally announced January 2024.

  7. arXiv:2311.05263

    cs.AI cs.CL

    Model-Based Minimum Bayes Risk Decoding for Text Generation

    Authors: Yuu Jinnai, Tetsuro Morimura, Ukyo Honda, Kaito Ariu, Kenshi Abe

    Abstract: Minimum Bayes Risk (MBR) decoding has been shown to be a powerful alternative to beam search decoding in a variety of text generation tasks. MBR decoding selects a hypothesis from a pool of hypotheses that has the least expected risk under a probability model according to a given utility function. Since it is impractical to compute the expected risk exactly over all possible hypotheses, two approx…

    Submitted 11 June, 2024; v1 submitted 9 November, 2023; originally announced November 2023.

  8. arXiv:2308.13696

    cs.CL cs.AI

    On the Depth between Beam Search and Exhaustive Search for Text Generation

    Authors: Yuu Jinnai, Tetsuro Morimura, Ukyo Honda

    Abstract: Beam search and exhaustive search are two extreme ends of text decoding algorithms with respect to the search depth. Beam search is limited in both search width and depth, whereas exhaustive search is a global search that has no such limitations. Surprisingly, beam search is not only computationally cheaper but also performs better than exhaustive search despite its higher search error. Plenty of…

    Submitted 25 August, 2023; originally announced August 2023.

  9. arXiv:2212.03230

    cs.CV cs.CL

    Switching to Discriminative Image Captioning by Relieving a Bottleneck of Reinforcement Learning

    Authors: Ukyo Honda, Taro Watanabe, Yuji Matsumoto

    Abstract: Discriminativeness is a desirable feature of image captions: captions should describe the characteristic details of input images. However, recent high-performing captioning models, which are trained with reinforcement learning (RL), tend to generate overly generic captions despite their high performance in various other criteria. First, we investigate the cause of the unexpectedly low discriminati…

    Submitted 31 December, 2022; v1 submitted 6 December, 2022; originally announced December 2022.

    Comments: WACV 2023 (19 pages, 9 figures; updated appendix)

  10. arXiv:2104.13872

    cs.CL cs.CV

    Removing Word-Level Spurious Alignment between Images and Pseudo-Captions in Unsupervised Image Captioning

    Authors: Ukyo Honda, Yoshitaka Ushiku, Atsushi Hashimoto, Taro Watanabe, Yuji Matsumoto

    Abstract: Unsupervised image captioning is a challenging task that aims at generating captions without the supervision of image-sentence pairs, but only with images and sentences drawn from different sources and object labels detected from the images. In previous work, pseudo-captions, i.e., sentences that contain the detected object labels, were assigned to a given image. The focus of the previous work was…

    Submitted 1 June, 2021; v1 submitted 28 April, 2021; originally announced April 2021.

    Comments: EACL 2021 (11 pages, 3 figures; added references)