Showing 1–1 of 1 results for author: Nukrai, D

  1. arXiv:2211.00575  [pdf, other]

    cs.CV cs.AI cs.LG

    Text-Only Training for Image Captioning using Noise-Injected CLIP

    Authors: David Nukrai, Ron Mokady, Amir Globerson

    Abstract: We consider the task of image-captioning using only the CLIP model and additional text data at training time, and no additional captioned images. Our approach relies on the fact that CLIP is trained to make visual and textual embeddings similar. Therefore, we only need to learn how to translate CLIP textual embeddings back into text, and we can learn how to do this by learning a decoder for the fr…

    Submitted 1 November, 2022; originally announced November 2022.

    Comments: Will be presented at EMNLP 2022. GitHub: https://github.com/DavidHuji/CapDec

    Journal ref: EMNLP 2022