Showing 1–1 of 1 results for author: Jeon, B E

  1. arXiv:2402.19479  [pdf, other]

    cs.CV

    Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers

    Authors: Tsai-Shien Chen, Aliaksandr Siarohin, Willi Menapace, Ekaterina Deyneka, Hsiang-wei Chao, Byung Eun Jeon, Yuwei Fang, Hsin-Ying Lee, Jian Ren, Ming-Hsuan Yang, Sergey Tulyakov

    Abstract: The quality of the data and annotation upper-bounds the quality of a downstream model. While there exist large text corpora and image-text pairs, high-quality video-text data is much harder to collect. First of all, manual labeling is more time-consuming, as it requires an annotator to watch an entire video. Second, videos have a temporal dimension, consisting of several scenes stacked together, a…

    Submitted 29 February, 2024; originally announced February 2024.

    Comments: CVPR 2024. Project Page: https://snap-research.github.io/Panda-70M