-
Exploration of Visual Features and their weighted-additive fusion for Video Captioning
Authors:
Praveen S V,
Akhilesh Bharadwaj,
Harsh Raj,
Janhavi Dadhania,
Ganesh Samarth C. A,
Nikhil Pareek,
S R M Prasanna
Abstract:
Video captioning is a popular task that challenges models to describe events in videos using natural language. In this work, we investigate the ability of various visual feature representations derived from state-of-the-art convolutional neural networks to capture high-level semantic context. We introduce the Weighted Additive Fusion Transformer with Memory Augmented Encoders (WAFTM), a captioning model that incorporates memory in a transformer encoder and uses a novel feature-fusion method that ensures due importance is given to the more significant representations. We illustrate the performance gain realized by applying Word-Piece Tokenization and the popular REINFORCE algorithm. Finally, we benchmark our model on two datasets and obtain a CIDEr of 92.4 on MSVD and a METEOR of 0.091 on the ActivityNet Captions Dataset.
Submitted 14 January, 2021;
originally announced January 2021.
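The abstract above does not say how the fusion weights are computed, so the following is only a minimal sketch of one generic reading of "weighted-additive fusion": each feature stream gets a scalar score, the scores are softmax-normalized, and the streams are summed with those weights. The function names, the use of softmax, and the scalar-per-stream scores are all assumptions for illustration, not the WAFTM method itself; in a trained model the scores would be learned parameters.

```python
import math


def softmax(scores):
    """Normalize raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_additive_fusion(features, scores):
    """Fuse equal-length feature vectors as a weighted sum.

    features: list of vectors (hypothetical per-stream visual features).
    scores:   one raw importance score per vector (assumed; learned in practice).
    """
    if len(features) != len(scores):
        raise ValueError("need exactly one score per feature vector")
    dim = len(features[0])
    if any(len(f) != dim for f in features):
        raise ValueError("all feature vectors must have the same length")
    weights = softmax(scores)
    # Element-wise weighted sum across streams.
    return [sum(w * f[i] for w, f in zip(weights, features)) for i in range(dim)]
```

With equal scores the fusion reduces to a plain average; a larger score pulls the fused vector toward the corresponding stream.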
-
Knowledge Fusion Transformers for Video Action Recognition
Authors:
Ganesh Samarth,
Sheetal Ojha,
Nikhil Pareek
Abstract:
We introduce Knowledge Fusion Transformers for video action classification. We present a self-attention based feature enhancer that fuses action knowledge into the 3D-Inception-based spatio-temporal context of the video clip to be classified. We show how using only single-stream networks with little or no pretraining can pave the way for performance close to the current state of the art. Additionally, we show how different self-attention architectures, used at different levels of the network, can be blended in to enhance feature representation. Our architecture is trained and evaluated on the UCF-101 and Charades datasets, where it is competitive with the state of the art. It also outperforms single-stream networks with little or no pretraining by a large margin.
Submitted 29 September, 2020; v1 submitted 29 September, 2020;
originally announced September 2020.
-
A Novel Quasigroup Substitution Scheme for Chaos Based Image Encryption
Authors:
Vinod Patidar,
N. K. Pareek,
G. Purohit
Abstract:
During the last two decades, there has been prolific growth in chaos-based image encryption algorithms. To an extent, these algorithms have provided a secure way to exchange large media files (images and videos) over networks. However, there have been some issues with the practical implementation of chaos-based image ciphers. One of them is a reduced key space, because chaotic behavior is observed only for a certain range of system parameters and initial conditions of the chaotic system used in such algorithms. To overcome this difficulty, we propose a simple, efficient and robust image encryption algorithm based on the combined application of quasigroups and the chaotic standard map. The proposed image cipher is based on the popular substitution-diffusion architecture (Shannon), where a quasigroup of order 256 and the chaotic standard map are used for the substitution and permutation of image pixels, respectively. Because the quasigroup is introduced as part of the secret key, along with the parameter and initial conditions of the chaotic standard map, the key space is increased significantly. The proposed image cipher is very fast because the substitution based on quasigroup operations is very simple and can be executed easily through lookup-table operations on Latin squares (which are the Cayley operation tables of quasigroups), and the permutation is performed row-by-row as well as column-by-column using pseudo-random number sequences generated through the chaotic standard map. The security and performance have been analyzed through histograms, correlation coefficients, information entropy, key sensitivity analysis, differential analysis, key space analysis, etc., and the results prove the efficiency and robustness of the proposed image cipher against possible security threats.
Submitted 19 September, 2017;
originally announced September 2017.
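To make the lookup-table idea in the abstract above concrete, here is a minimal sketch of a quasigroup substitution on byte values, assuming the classical chained transformation c_i = Q[c_{i-1}][p_i] with a leader byte. The exact substitution rule of the paper is not given in the abstract, so this rule, the helper names, and the way the Latin square is generated (shuffling rows, columns, and symbols of the cyclic table with Python's non-cryptographic `random` module) are illustrative assumptions. The chaotic standard map permutation stage is not modeled.

```python
import random


def make_latin_square(order, seed):
    """Build a Latin square of the given order (a quasigroup Cayley table).

    Assumption: derived from the cyclic table (r + c) mod order by shuffling
    rows, columns, and symbols. random.Random is NOT cryptographically secure.
    """
    rng = random.Random(seed)
    rows, cols, syms = (list(range(order)) for _ in range(3))
    rng.shuffle(rows)
    rng.shuffle(cols)
    rng.shuffle(syms)
    return [[syms[(r + c) % order] for c in cols] for r in rows]


def left_inverse(q):
    """Return the table inv with inv[a][q[a][b]] == b, used for decryption."""
    n = len(q)
    inv = [[0] * n for _ in range(n)]
    for a in range(n):
        for b in range(n):
            inv[a][q[a][b]] = b
    return inv


def substitute(q, leader, data):
    """Chained substitution: each output byte is q[previous output][input]."""
    out, prev = [], leader
    for x in data:
        prev = q[prev][x]
        out.append(prev)
    return bytes(out)


def unsubstitute(inv, leader, data):
    """Invert substitute() with the left-inverse table and the same leader."""
    out, prev = [], leader
    for y in data:
        out.append(inv[prev][y])
        prev = y
    return bytes(out)
```

Both directions cost one table lookup per pixel value, which is the property the abstract credits for the cipher's speed. Because each output depends on the previous one, equal input bytes generally map to different output bytes.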
-
Design and Analysis of a Novel Digital Image Encryption Scheme
Authors:
Narendra K Pareek
Abstract:
In this paper, a new image encryption scheme using a 144-bit secret key is proposed. In the substitution process of the scheme, the image is divided into blocks and subsequently into color components. Each color component is modified by a bitwise operation that depends on the secret key as well as a few most significant bits of its previous and next color components. Three rounds are taken to complete the substitution process. To make the cipher more robust, a feedback mechanism is also applied by modifying the secret key after encrypting each block. Further, the resultant image is partitioned into several key-based dynamic sub-images. Each sub-image passes through a scrambling process in which the pixels of the sub-image are reshuffled within it using a generated magic square matrix. Five rounds are taken for the scrambling process. The proposed scheme is simple, fast and sensitive to the secret key. Due to the high order of substitution and permutation, common attacks like linear and differential cryptanalysis are infeasible. The experimental results show that the proposed encryption technique is efficient and has high security features.
Submitted 7 April, 2012;
originally announced April 2012.