Showing 1–1 of 1 results for author: Nagasawa, H

  1. arXiv:2312.01575  [pdf, other]

    cs.CL cs.CV

    A Challenging Multimodal Video Summary: Simultaneously Extracting and Generating Keyframe-Caption Pairs from Video

    Authors: Keito Kudo, Haruki Nagasawa, Jun Suzuki, Nobuyuki Shimizu

    Abstract: This paper proposes a practical multimodal video summarization task setting and a dataset to train and evaluate the task. The target task involves summarizing a given video into a predefined number of keyframe-caption pairs and displaying them in a listable format to grasp the video content quickly. This task aims to extract crucial scenes from the video in the form of images (keyframes) and gener…

    Submitted 3 December, 2023; originally announced December 2023.