-
Automated Video Labelling: Identifying Faces by Corroborative Evidence
Authors:
Andrew Brown,
Ernesto Coto,
Andrew Zisserman
Abstract:
We present a method for automatically labelling all faces in video archives, such as TV broadcasts, by combining multiple evidence sources and multiple modalities (visual and audio). We target the problem of ever-growing online video archives, where an effective, scalable indexing solution cannot require a user to provide manual annotation or supervision. To this end, we make three key contributions: (1) We provide a novel, simple method for determining whether a person is famous using image-search engines. In turn, this enables a face-identity model to be built reliably and robustly, and used for high-precision automatic labelling; (2) We show that even for less-famous people, image-search engines can then be used as corroborative evidence to accurately label faces that are named in the scene or the speech; (3) Finally, we quantitatively demonstrate the benefits of our approach on different video domains and test settings, such as TV shows and news broadcasts. Our method works across three disparate datasets without any explicit domain adaptation, and sets new state-of-the-art results on all the public benchmarks.
Submitted 10 February, 2021;
originally announced February 2021.
-
VoxSRC 2020: The Second VoxCeleb Speaker Recognition Challenge
Authors:
Arsha Nagrani,
Joon Son Chung,
Jaesung Huh,
Andrew Brown,
Ernesto Coto,
Weidi Xie,
Mitchell McLaren,
Douglas A Reynolds,
Andrew Zisserman
Abstract:
We held the second installment of the VoxCeleb Speaker Recognition Challenge in conjunction with Interspeech 2020. The goal of this challenge was to assess how well current speaker recognition technology is able to diarise and recognize speakers in unconstrained or 'in the wild' data. It consisted of: (i) a publicly available speaker recognition and diarisation dataset from YouTube videos together with ground truth annotation and standardised evaluation software; and (ii) a virtual public challenge and workshop held at Interspeech 2020. This paper outlines the challenge, and describes the baselines, methods used, and results. We conclude with a discussion of the progress over the first installment of the challenge.
Submitted 12 December, 2020;
originally announced December 2020.
-
The End-of-End-to-End: A Video Understanding Pentathlon Challenge (2020)
Authors:
Samuel Albanie,
Yang Liu,
Arsha Nagrani,
Antoine Miech,
Ernesto Coto,
Ivan Laptev,
Rahul Sukthankar,
Bernard Ghanem,
Andrew Zisserman,
Valentin Gabeur,
Chen Sun,
Karteek Alahari,
Cordelia Schmid,
Shizhe Chen,
Yida Zhao,
Qin **,
Kaixu Cui,
Hui Liu,
Chen Wang,
Yudong Jiang,
Xiaoshuai Hao
Abstract:
We present a new video understanding pentathlon challenge, an open competition held in conjunction with the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2020. The objective of the challenge was to explore and evaluate new methods for text-to-video retrieval: the task of searching for content within a corpus of videos using natural language queries. This report summarizes the results of the first edition of the challenge together with the findings of the participants.
Submitted 3 August, 2020;
originally announced August 2020.
-
VoxSRC 2019: The first VoxCeleb Speaker Recognition Challenge
Authors:
Joon Son Chung,
Arsha Nagrani,
Ernesto Coto,
Weidi Xie,
Mitchell McLaren,
Douglas A Reynolds,
Andrew Zisserman
Abstract:
The VoxCeleb Speaker Recognition Challenge 2019 aimed to assess how well current speaker recognition technology is able to identify speakers in unconstrained or 'in the wild' data. It consisted of: (i) a publicly available speaker recognition dataset from YouTube videos together with ground truth annotation and standardised evaluation software; and (ii) a public challenge and workshop held at Interspeech 2019 in Graz, Austria. This paper outlines the challenge and provides its baselines, results and discussions.
Submitted 5 December, 2019;
originally announced December 2019.