TransVisDrone: Spatio-Temporal Transformer for Vision-based Drone-to-Drone Detection in Aerial Videos
Authors:
Tushar Sangam,
Ishan Rajendrakumar Dave,
Waqas Sultani,
Mubarak Shah
Abstract:
Drone-to-drone detection using visual feed has crucial applications, such as detecting drone collisions, detecting drone attacks, or coordinating flight with other drones. However, existing methods are computationally costly, follow non-end-to-end optimization, and have complex multi-stage pipelines, making them less suitable for real-time deployment on edge devices. In this work, we propose a sim…
▽ More
Drone-to-drone detection using visual feed has crucial applications, such as detecting drone collisions, detecting drone attacks, or coordinating flight with other drones. However, existing methods are computationally costly, follow non-end-to-end optimization, and have complex multi-stage pipelines, making them less suitable for real-time deployment on edge devices. In this work, we propose a simple yet effective framework, \textit{TransVisDrone}, that provides an end-to-end solution with higher computational efficiency. We utilize CSPDarkNet-53 network to learn object-related spatial features and VideoSwin model to improve drone detection in challenging scenarios by learning spatio-temporal dependencies of drone motion. Our method achieves state-of-the-art performance on three challenging real-world datasets (Average [email protected]): NPS 0.95, FLDrones 0.75, and AOT 0.80, and a higher throughput than previous methods. We also demonstrate its deployment capability on edge devices and its usefulness in detecting drone-collision (encounter). Project: \url{https://tusharsangam.github.io/TransVisDrone-project-page/}.
△ Less
Submitted 25 August, 2023; v1 submitted 15 October, 2022;
originally announced October 2022.
TinyAction Challenge: Recognizing Real-world Low-resolution Activities in Videos
Authors:
Praveen Tirupattur,
Aayush J Rana,
Tushar Sangam,
Shruti Vyas,
Yogesh S Rawat,
Mubarak Shah
Abstract:
This paper summarizes the TinyAction challenge which was organized in ActivityNet workshop at CVPR 2021. This challenge focuses on recognizing real-world low-resolution activities present in videos. Action recognition task is currently focused around classifying the actions from high-quality videos where the actors and the action is clearly visible. While various approaches have been shown effecti…
▽ More
This paper summarizes the TinyAction challenge which was organized in ActivityNet workshop at CVPR 2021. This challenge focuses on recognizing real-world low-resolution activities present in videos. Action recognition task is currently focused around classifying the actions from high-quality videos where the actors and the action is clearly visible. While various approaches have been shown effective for recognition task in recent works, they often do not deal with videos of lower resolution where the action is happening in a tiny region. However, many real world security videos often have the actual action captured in a small resolution, making action recognition in a tiny region a challenging task. In this work, we propose a benchmark dataset, TinyVIRAT-v2, which is comprised of naturally occuring low-resolution actions. This is an extension of the TinyVIRAT dataset and consists of actions with multiple labels. The videos are extracted from security videos which makes them realistic and more challenging. We use current state-of-the-art action recognition methods on the dataset as a benchmark, and propose the TinyAction Challenge.
△ Less
Submitted 23 July, 2021;
originally announced July 2021.