Showing 1–1 of 1 results for author: Kwiatowski, R

  1. arXiv:2403.08688  [pdf, other]

    cs.CL cs.AI

    Token Alignment via Character Matching for Subword Completion

    Authors: Ben Athiwaratkun, Shiqi Wang, Mingyue Shang, Yuchen Tian, Zijian Wang, Sujan Kumar Gonugondla, Sanjay Krishna Gouda, Rob Kwiatowski, Ramesh Nallapati, Bing Xiang

    Abstract: Generative models, widely utilized in various applications, can often struggle with prompts corresponding to partial tokens. This struggle stems from tokenization, where partial tokens fall out of distribution during inference, leading to incorrect or nonsensical outputs. This paper examines a technique to alleviate the tokenization artifact on text completion in generative models, maintaining per…

    Submitted 13 March, 2024; originally announced March 2024.