Movie101: A New Movie Understanding Benchmark

Yue, Zihao; Zhang, Qi; Hu, Anwen; Zhang, Liang; Wang, Ziheng; Jin, Qin

arXiv:2305.12140 (cs)

[Submitted on 20 May 2023 (v1), last revised 27 Jun 2023 (this version, v2)]

Abstract: To help the visually impaired enjoy movies, automatic movie narrating systems are expected to narrate accurate, coherent, and role-aware plots during the stretches where no actors are speaking. Existing works benchmark this challenge as an ordinary video captioning task by making simplifications, such as removing role names and evaluating narrations with n-gram-based metrics, which makes it difficult for automatic systems to meet the needs of real application scenarios. To narrow this gap, we construct a large-scale Chinese movie benchmark named Movie101. Closer to real scenarios, the Movie Clip Narrating (MCN) task in our benchmark asks models to generate role-aware narration paragraphs for complete movie clips in which no actors are speaking. External knowledge, such as role information and movie genres, is also provided for better movie understanding. In addition, we propose a new metric called Movie Narration Score (MNScore) for evaluating movie narration, which achieves the best correlation with human evaluation. Our benchmark also supports the Temporal Narration Grounding (TNG) task, which investigates clip localization given text descriptions. For both tasks, our proposed methods make effective use of external knowledge and outperform carefully designed baselines. The dataset and code are released at this https URL.

Comments:	Accepted to ACL 2023
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
Cite as:	arXiv:2305.12140 [cs.CV]
	(or arXiv:2305.12140v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2305.12140

Submission history

From: Zihao Yue
[v1] Sat, 20 May 2023 08:43:51 UTC (9,704 KB)
[v2] Tue, 27 Jun 2023 11:42:44 UTC (9,704 KB)
