Search | arXiv e-print repository

An overview on the evaluated video retrieval tasks at TRECVID 2022

Authors: George Awad, Keith Curtis, Asad Butt, Jonathan Fiscus, Afzal Godil, Yooyoung Lee, Andrew Delgado, Eliot Godard, Lukas Diduch, Jeffrey Liu, Yvette Graham, Georges Quenot

Abstract: The TREC Video Retrieval Evaluation (TRECVID) is a TREC-style video analysis and retrieval evaluation with the goal of promoting progress in research and development of content-based exploitation and retrieval of information from digital video via open, tasks-based evaluation supported by metrology. Over the last twenty-one years this effort has yielded a better understanding of how systems can ef… ▽ More The TREC Video Retrieval Evaluation (TRECVID) is a TREC-style video analysis and retrieval evaluation with the goal of promoting progress in research and development of content-based exploitation and retrieval of information from digital video via open, tasks-based evaluation supported by metrology. Over the last twenty-one years this effort has yielded a better understanding of how systems can effectively accomplish such processing and how one can reliably benchmark their performance. TRECVID has been funded by NIST (National Institute of Standards and Technology) and other US government agencies. In addition, many organizations and individuals worldwide contribute significant time and effort. TRECVID 2022 planned for the following six tasks: Ad-hoc video search, Video to text captioning, Disaster scene description and indexing, Activity in extended videos, deep video understanding, and movie summarization. In total, 35 teams from various research organizations worldwide signed up to join the evaluation campaign this year. This paper introduces the tasks, datasets used, evaluation frameworks and metrics, as well as a high-level results overview. △ Less

Submitted 22 June, 2023; originally announced June 2023.

Comments: arXiv admin note: substantial text overlap with arXiv:2104.13473, arXiv:2009.09984

arXiv:2104.13473 [pdf, other]

TRECVID 2020: A comprehensive campaign for evaluating video retrieval tasks across multiple application domains

Authors: George Awad, Asad A. Butt, Keith Curtis, Jonathan Fiscus, Afzal Godil, Yooyoung Lee, Andrew Delgado, Jesse Zhang, Eliot Godard, Baptiste Chocot, Lukas Diduch, Jeffrey Liu, Alan F. Smeaton, Yvette Graham, Gareth J. F. Jones, Wessel Kraaij, Georges Quenot

Abstract: The TREC Video Retrieval Evaluation (TRECVID) is a TREC-style video analysis and retrieval evaluation with the goal of promoting progress in research and development of content-based exploitation and retrieval of information from digital video via open, metrics-based evaluation. Over the last twenty years this effort has yielded a better understanding of how systems can effectively accomplish such… ▽ More The TREC Video Retrieval Evaluation (TRECVID) is a TREC-style video analysis and retrieval evaluation with the goal of promoting progress in research and development of content-based exploitation and retrieval of information from digital video via open, metrics-based evaluation. Over the last twenty years this effort has yielded a better understanding of how systems can effectively accomplish such processing and how one can reliably benchmark their performance. TRECVID has been funded by NIST (National Institute of Standards and Technology) and other US government agencies. In addition, many organizations and individuals worldwide contribute significant time and effort. TRECVID 2020 represented a continuation of four tasks and the addition of two new tasks. In total, 29 teams from various research organizations worldwide completed one or more of the following six tasks: 1. Ad-hoc Video Search (AVS), 2. Instance Search (INS), 3. Disaster Scene Description and Indexing (DSDI), 4. Video to Text Description (VTT), 5. Activities in Extended Video (ActEV), 6. Video Summarization (VSUM). This paper is an introduction to the evaluation framework, tasks, data, and measures used in the evaluation campaign. △ Less

Submitted 27 April, 2021; originally announced April 2021.

Comments: TRECVID 2020 Workshop Overview Paper. arXiv admin note: substantial text overlap with arXiv:2009.09984

arXiv:2009.09984 [pdf, other]

TRECVID 2019: An Evaluation Campaign to Benchmark Video Activity Detection, Video Captioning and Matching, and Video Search & Retrieval

Authors: George Awad, Asad A. Butt, Keith Curtis, Yooyoung Lee, Jonathan Fiscus, Afzal Godil, Andrew Delgado, Jesse Zhang, Eliot Godard, Lukas Diduch, Alan F. Smeaton, Yvette Graham, Wessel Kraaij, Georges Quenot

Abstract: The TREC Video Retrieval Evaluation (TRECVID) 2019 was a TREC-style video analysis and retrieval evaluation, the goal of which remains to promote progress in research and development of content-based exploitation and retrieval of information from digital video via open, metrics-based evaluation. Over the last nineteen years this effort has yielded a better understanding of how systems can effectiv… ▽ More The TREC Video Retrieval Evaluation (TRECVID) 2019 was a TREC-style video analysis and retrieval evaluation, the goal of which remains to promote progress in research and development of content-based exploitation and retrieval of information from digital video via open, metrics-based evaluation. Over the last nineteen years this effort has yielded a better understanding of how systems can effectively accomplish such processing and how one can reliably benchmark their performance. TRECVID has been funded by NIST (National Institute of Standards and Technology) and other US government agencies. In addition, many organizations and individuals worldwide contribute significant time and effort. TRECVID 2019 represented a continuation of four tasks from TRECVID 2018. In total, 27 teams from various research organizations worldwide completed one or more of the following four tasks: 1. Ad-hoc Video Search (AVS) 2. Instance Search (INS) 3. Activities in Extended Video (ActEV) 4. Video to Text Description (VTT) This paper is an introduction to the evaluation framework, tasks, data, and measures used in the workshop. △ Less

Submitted 21 September, 2020; originally announced September 2020.

Comments: TRECVID Workshop overview paper. 39 pages

arXiv:2005.00463 [pdf, other]

doi 10.1145/3372278.3390742

HLVU : A New Challenge to Test Deep Understanding of Movies the Way Humans do

Authors: Keith Curtis, George Awad, Shahzad Rajput, Ian Soboroff

Abstract: In this paper we propose a new evaluation challenge and direction in the area of High-level Video Understanding. The challenge we are proposing is designed to test automatic video analysis and understanding, and how accurately systems can comprehend a movie in terms of actors, entities, events and their relationship to each other. A pilot High-Level Video Understanding (HLVU) dataset of open sourc… ▽ More In this paper we propose a new evaluation challenge and direction in the area of High-level Video Understanding. The challenge we are proposing is designed to test automatic video analysis and understanding, and how accurately systems can comprehend a movie in terms of actors, entities, events and their relationship to each other. A pilot High-Level Video Understanding (HLVU) dataset of open source movies were collected for human assessors to build a knowledge graph representing each of them. A set of queries will be derived from the knowledge graph to test systems on retrieving relationships among actors, as well as reasoning and retrieving non-visual concepts. The objective is to benchmark if a computer system can "understand" non-explicit but obvious relationships the same way humans do when they watch the same movies. This is long-standing problem that is being addressed in the text domain and this project moves similar research to the video domain. Work of this nature is foundational to future video analytics and video understanding technologies. This work can be of interest to streaming services and broadcasters ho** to provide more intuitive ways for their customers to interact with and consume video content. △ Less

Submitted 1 May, 2020; originally announced May 2020.

arXiv:2003.03000 [pdf]

Neural networks approach for mammography diagnosis using wavelets features

Authors: Essam A. Rashed, and Mohamed G. Awad

Abstract: A supervised diagnosis system for digital mammogram is developed. The diagnosis processes are done by transforming the data of the images into a feature vector using wavelets multilevel decomposition. This vector is used as the feature tailored toward separating different mammogram classes. The suggested model consists of artificial neural networks designed for classifying mammograms according to… ▽ More A supervised diagnosis system for digital mammogram is developed. The diagnosis processes are done by transforming the data of the images into a feature vector using wavelets multilevel decomposition. This vector is used as the feature tailored toward separating different mammogram classes. The suggested model consists of artificial neural networks designed for classifying mammograms according to tumor type and risk level. Results are enhanced from our previous study by extracting feature vectors using multilevel decompositions instead of one level of decomposition. Radiologist-labeled images were used to evaluate the diagnosis system. Results are very promising and show possible guide for future work. △ Less

Submitted 5 March, 2020; originally announced March 2020.

Comments: Reprint

Journal ref: First Canadian Student Conference on Biomedical Computing, 2006

arXiv:1810.04401 [pdf, other]

V3C - a Research Video Collection

Authors: Luca Rossetto, Heiko Schuldt, George Awad, Asad A. Butt

Abstract: With the widespread use of smartphones as recording devices and the massive growth in bandwidth, the number and volume of video collections has increased significantly in the last years. This poses novel challenges to the management of these large-scale video data and especially to the analysis of and retrieval from such video collections. At the same time, existing video datasets used for researc… ▽ More With the widespread use of smartphones as recording devices and the massive growth in bandwidth, the number and volume of video collections has increased significantly in the last years. This poses novel challenges to the management of these large-scale video data and especially to the analysis of and retrieval from such video collections. At the same time, existing video datasets used for research and experimentation are either not large enough to represent current collections or do not reflect the properties of video commonly found on the Internet in terms of content, length, or resolution. In this paper, we introduce the Vimeo Creative Commons Collection, in short V3C, a collection of 28'450 videos (with overall length of about 3'800 hours) published under creative commons license on Vimeo. V3C comes with a shot segmentation for each video, together with the resulting keyframes in original as well as reduced resolution and additional metadata. It is intended to be used from 2019 at the International large-scale TREC Video Retrieval Evaluation campaign (TRECVid). △ Less

Submitted 11 October, 2018; v1 submitted 10 October, 2018; originally announced October 2018.

arXiv:1710.10586 [pdf, other]

doi 10.1371/journal.pone.0202789

Evaluation of Automatic Video Captioning Using Direct Assessment

Authors: Yvette Graham, George Awad, Alan Smeaton

Abstract: We present Direct Assessment, a method for manually assessing the quality of automatically-generated captions for video. Evaluating the accuracy of video captions is particularly difficult because for any given video clip there is no definitive ground truth or correct answer against which to measure. Automatic metrics for comparing automatic video captions against a manual caption such as BLEU and… ▽ More We present Direct Assessment, a method for manually assessing the quality of automatically-generated captions for video. Evaluating the accuracy of video captions is particularly difficult because for any given video clip there is no definitive ground truth or correct answer against which to measure. Automatic metrics for comparing automatic video captions against a manual caption such as BLEU and METEOR, drawn from techniques used in evaluating machine translation, were used in the TRECVid video captioning task in 2016 but these are shown to have weaknesses. The work presented here brings human assessment into the evaluation by crowdsourcing how well a caption describes a video. We automatically degrade the quality of some sample captions which are assessed manually and from this we are able to rate the quality of the human assessors, a factor we take into account in the evaluation. Using data from the TRECVid video-to-text task in 2016, we show how our direct assessment method is replicable and robust and should scale to where there many caption-generation techniques to be evaluated. △ Less

Submitted 29 October, 2017; originally announced October 2017.

Comments: 26 pages, 8 figures

Showing 1–7 of 7 results for author: Awad, G