-
EgoSampling: Wide View Hyperlapse from Egocentric Videos
Authors:
Tavi Halperin,
Yair Poleg,
Chetan Arora,
Shmuel Peleg
Abstract:
The possibility of sharing one's point of view makes use of wearable cameras compelling. These videos are often long, boring and coupled with extreme shake, as the camera is worn on a moving person. Fast forwarding (i.e. frame sampling) is a natural choice for quick video browsing. However, this accentuates the shake caused by natural head motion in an egocentric video, making the fast forwarded v…
▽ More
The possibility of sharing one's point of view makes use of wearable cameras compelling. These videos are often long, boring and coupled with extreme shake, as the camera is worn on a moving person. Fast forwarding (i.e. frame sampling) is a natural choice for quick video browsing. However, this accentuates the shake caused by natural head motion in an egocentric video, making the fast forwarded video useless. We propose EgoSampling, an adaptive frame sampling that gives stable, fast forwarded, hyperlapse videos. Adaptive frame sampling is formulated as an energy minimization problem, whose optimal solution can be found in polynomial time. We further turn the camera shake from a drawback into a feature, enabling the increase in field-of-view of the output video. This is obtained when each output frame is mosaiced from several input frames. The proposed technique also enables the generation of a single hyperlapse video from multiple egocentric videos, allowing even faster video consumption.
△ Less
Submitted 12 January, 2017; v1 submitted 26 April, 2016;
originally announced April 2016.
-
Compact CNN for Indexing Egocentric Videos
Authors:
Yair Poleg,
Ariel Ephrat,
Shmuel Peleg,
Chetan Arora
Abstract:
While egocentric video is becoming increasingly popular, browsing it is very difficult. In this paper we present a compact 3D Convolutional Neural Network (CNN) architecture for long-term activity recognition in egocentric videos. Recognizing long-term activities enables us to temporally segment (index) long and unstructured egocentric videos. Existing methods for this task are based on hand tuned…
▽ More
While egocentric video is becoming increasingly popular, browsing it is very difficult. In this paper we present a compact 3D Convolutional Neural Network (CNN) architecture for long-term activity recognition in egocentric videos. Recognizing long-term activities enables us to temporally segment (index) long and unstructured egocentric videos. Existing methods for this task are based on hand tuned features derived from visible objects, location of hands, as well as optical flow.
Given a sparse optical flow volume as input, our CNN classifies the camera wearer's activity. We obtain classification accuracy of 89%, which outperforms the current state-of-the-art by 19%. Additional evaluation is performed on an extended egocentric video dataset, classifying twice the amount of categories than current state-of-the-art. Furthermore, our CNN is able to recognize whether a video is egocentric or not with 99.2% accuracy, up by 24% from current state-of-the-art. To better understand what the network actually learns, we propose a novel visualization of CNN kernels as flow fields.
△ Less
Submitted 24 November, 2015; v1 submitted 28 April, 2015;
originally announced April 2015.
-
EgoSampling: Fast-Forward and Stereo for Egocentric Videos
Authors:
Yair Poleg,
Tavi Halperin,
Chetan Arora,
Shmuel Peleg
Abstract:
While egocentric cameras like GoPro are gaining popularity, the videos they capture are long, boring, and difficult to watch from start to end. Fast forwarding (i.e. frame sampling) is a natural choice for faster video browsing. However, this accentuates the shake caused by natural head motion, making the fast forwarded video useless.
We propose EgoSampling, an adaptive frame sampling that gives…
▽ More
While egocentric cameras like GoPro are gaining popularity, the videos they capture are long, boring, and difficult to watch from start to end. Fast forwarding (i.e. frame sampling) is a natural choice for faster video browsing. However, this accentuates the shake caused by natural head motion, making the fast forwarded video useless.
We propose EgoSampling, an adaptive frame sampling that gives more stable fast forwarded videos. Adaptive frame sampling is formulated as energy minimization, whose optimal solution can be found in polynomial time.
In addition, egocentric video taken while walking suffers from the left-right movement of the head as the body weight shifts from one leg to another. We turn this drawback into a feature: Stereo video can be created by sampling the frames from the left most and right most head positions of each step, forming approximate stereo-pairs.
△ Less
Submitted 27 April, 2015; v1 submitted 11 December, 2014;
originally announced December 2014.