Showing 1–2 of 2 results for author: Brady, K
-
Multimodal Sparse Coding for Event Detection
Authors:
Youngjune Gwon,
William Campbell,
Kevin Brady,
Douglas Sturim,
Miriam Cha,
H. T. Kung
Abstract:
Unsupervised feature learning methods have proven effective for classification tasks based on a single modality. We present multimodal sparse coding for learning feature representations shared across multiple modalities. The shared representations are applied to multimedia event detection (MED) and evaluated in comparison to unimodal counterparts, as well as other feature learning methods such as GMM supervectors and sparse RBM. We report the cross-validated classification accuracy and mean average precision of the MED system trained on features learned from our unimodal and multimodal settings for a subset of the TRECVID MED 2014 dataset.
Submitted 17 May, 2016;
originally announced May 2016.
-
How Deep Neural Networks Can Improve Emotion Recognition on Video Data
Authors:
Pooya Khorrami,
Tom Le Paine,
Kevin Brady,
Charlie Dagli,
Thomas S. Huang
Abstract:
We consider the task of dimensional emotion recognition on video data using deep learning. While several previous methods have shown the benefits of training temporal neural network models such as recurrent neural networks (RNNs) on hand-crafted features, few works have considered combining convolutional neural networks (CNNs) with RNNs. In this work, we present a system that performs emotion recognition on video data using both CNNs and RNNs, and we also analyze how much each neural network component contributes to the system's overall performance. We present our findings on videos from the Audio/Visual+Emotion Challenge (AV+EC2015). In our experiments, we analyze the effects of several hyperparameters on overall performance while also achieving superior performance to the baseline and other competing methods.
Submitted 9 January, 2017; v1 submitted 23 February, 2016;
originally announced February 2016.