Computer Science > Computer Vision and Pattern Recognition
[Submitted on 20 Apr 2017 (this version), latest version 18 Sep 2017 (v2)]
Title: Temporal Action Detection with Structured Segment Networks
Abstract: Detecting activities in untrimmed videos is an important yet challenging task. In this paper, we tackle the difficulty of accurately locating the start and the end of a long, complex action, a problem that existing methods often encounter. Our key contribution is the structured segment network, a novel framework for temporal action detection that models the temporal structure of each activity instance via a structured temporal pyramid. On top of the pyramid, we further introduce a decomposed discriminative model comprising two classifiers: one for classifying activities and one for determining completeness. This allows the framework to effectively distinguish positive proposals from background or incomplete ones, leading to both accurate recognition and accurate localization. These components are integrated into a unified network that can be trained efficiently in an end-to-end fashion. We also propose a simple yet effective temporal action proposal scheme that generates proposals of considerably higher quality. On two challenging benchmarks, THUMOS14 and ActivityNet, our method outperforms existing state-of-the-art methods by over $10\%$ absolute average mAP, demonstrating superior accuracy and strong adaptivity in handling activities with various temporal structures.
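The decomposed discriminative model described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the two classifier outputs are combined, so the product rule below, the background class at index 0, and the names `softmax`, `sigmoid`, and `score_proposal` are all assumptions made for illustration, not the authors' implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def score_proposal(activity_logits, completeness_logits):
    """Combine the two classifier outputs into one score per activity class.

    Assumptions (not stated in the abstract):
      - activity_logits has K + 1 entries, with index 0 as background;
      - completeness_logits has K entries, one per activity class;
      - the final score is P(class) * P(complete | class).
    """
    if len(activity_logits) != len(completeness_logits) + 1:
        raise ValueError("expected K + 1 activity logits and K completeness logits")
    class_probs = softmax(activity_logits)
    return [
        class_probs[k + 1] * sigmoid(c_logit)
        for k, c_logit in enumerate(completeness_logits)
    ]


# Same activity evidence, different completeness evidence for class 1.
complete = score_proposal([0.0, 3.0, 0.0], [2.0, 0.0])
incomplete = score_proposal([0.0, 3.0, 0.0], [-2.0, 0.0])
# A proposal that the activity classifier considers background.
background = score_proposal([3.0, 0.0, 0.0], [2.0, 0.0])
```

Under these assumptions, a proposal ranks highly only when both classifiers agree: an incomplete or background proposal gets a low score even if one of the two outputs is confident.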
Submission history
From: Yuanjun Xiong
[v1] Thu, 20 Apr 2017 16:51:45 UTC (686 KB)
[v2] Mon, 18 Sep 2017 08:43:11 UTC (730 KB)