-
Unsupervised Action Localization Crop in Video Retargeting for 3D ConvNets
Authors:
Prithwish Jana,
Swarnabja Bhaumik,
Partha Pratim Mohanta
Abstract:
Untrimmed videos on social media or those captured by robots and surveillance cameras are of varied aspect ratios. However, 3D CNNs usually require as input a square-shaped video, whose spatial dimension is smaller than the original. Random- or center-cropping may leave out the video's subject altogether. To address this, we propose an unsupervised video cropping approach by shaping this as a retargeting and video-to-video synthesis problem. The synthesized video maintains a 1:1 aspect ratio, is smaller in size and is targeted at video-subject(s) throughout the entire duration. First, action localization is performed on each frame by identifying patches with homogeneous motion patterns. Thus, a single salient patch is pinpointed per frame. But to avoid viewpoint jitters and flickering, any inter-frame scale or position changes among the patches should be performed gradually over time. This issue is addressed with a polyBezier fitting in 3D space that passes through some chosen pivot timestamps and whose shape is influenced by the in-between control timestamps. To corroborate the effectiveness of the proposed method, we evaluate the video classification task by comparing our dynamic cropping technique with random cropping on three benchmark datasets, viz. UCF-101, HMDB-51 and ActivityNet v1.3. The clip and top-1 accuracy for video classification after our cropping, outperform 3D CNN performances for same-sized random-crop inputs, also surpassing some larger random-crop sizes.
Submitted 22 November, 2021; v1 submitted 14 November, 2021;
originally announced November 2021.
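The smoothing idea in the abstract above (a curve that passes through pivot timestamps and is shaped by in-between control timestamps) can be illustrated with a single cubic Bezier segment. This is only a minimal sketch, not the authors' implementation: the function name `cubic_bezier`, the `(t, x, y)` point layout and all the sample coordinates are assumptions made for illustration, and a polyBezier would chain several such segments end to end.

```python
import numpy as np

def cubic_bezier(p0, c1, c2, p3, num=51):
    """Sample one cubic Bezier segment.

    p0 and p3 are the pivot points the curve passes through;
    c1 and c2 are control points that only pull the curve towards them.
    Each point is assumed to be (t, x, y): timestamp and crop-centre position.
    """
    p0, c1, c2, p3 = (np.asarray(p, dtype=float) for p in (p0, c1, c2, p3))
    s = np.linspace(0.0, 1.0, num)[:, None]  # curve parameter, one row per sample
    return ((1 - s) ** 3 * p0
            + 3 * (1 - s) ** 2 * s * c1
            + 3 * (1 - s) * s ** 2 * c2
            + s ** 3 * p3)

# Hypothetical crop centres: pivots at frames 0 and 30, controls in between.
pivot_a = (0.0, 100.0, 80.0)
ctrl_1 = (10.0, 140.0, 60.0)
ctrl_2 = (20.0, 90.0, 120.0)
pivot_b = (30.0, 120.0, 100.0)
path = cubic_bezier(pivot_a, ctrl_1, ctrl_2, pivot_b)
```

The sampled path starts and ends exactly at the two pivots, while the control points bend it without being touched, which is what keeps the crop window from jumping between frames.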
-
Event and Activity Recognition in Video Surveillance for Cyber-Physical Systems
Authors:
Swarnabja Bhaumik,
Prithwish Jana,
Partha Pratim Mohanta
Abstract:
This chapter aims to aid the development of Cyber-Physical Systems (CPS) in automated understanding of events and activities in various applications of video surveillance. These events are mostly captured by drones, CCTVs, or novice and unskilled individuals on low-end devices. Being unconstrained, these videos are immensely challenging due to a number of quality factors. We present an extensive account of the various approaches taken to solve the problem over the years. These range from methods as early as Structure from Motion (SFM) based approaches to recent solution frameworks involving deep neural networks. We show that long-term motion patterns alone play a pivotal role in the task of recognizing an event. Consequently, each video is represented by a fixed number of key-frames using a graph-based approach. Only the temporal features are exploited, using a hybrid Convolutional Neural Network (CNN) + Recurrent Neural Network (RNN) architecture. The results we obtain are encouraging, as they outperform standard temporal CNNs and are on par with those using spatial information along with motion cues. Further exploring multistream models, we conceive a multi-tier fusion strategy for the spatial and temporal wings of a network. A consolidated representation of the respective individual prediction vectors at the video and frame levels is obtained using a biased conflation technique. The fusion strategy yields a greater rise in precision at each stage compared to the state-of-the-art methods, and thus a powerful consensus is achieved in classification. Results are recorded on four benchmark datasets widely used in the domain of action recognition, namely CCV, HMDB, UCF-101 and KCV. It can be inferred that focusing on better classification of the video sequences leads to robust actuation of a system designed for event surveillance and object-cum-activity tracking.
Submitted 3 November, 2021;
originally announced November 2021.
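To make the fusion step in the abstract above more concrete, the sketch below shows plain conflation of two prediction vectors, that is, their element-wise product renormalized to sum to one. The abstract does not say how the "biased" variant is defined, so the exponents `w_spatial` and `w_temporal` are a hypothetical stand-in for a bias and are not taken from the chapter; the function name and the sample vectors are likewise assumptions.

```python
import numpy as np

def conflate(p_spatial, p_temporal, w_spatial=1.0, w_temporal=1.0):
    """Fuse two class-probability vectors by a normalized product.

    With both weights equal to 1 this is ordinary conflation.
    Unequal weights are an assumed way to bias the fusion towards one stream.
    """
    p = np.asarray(p_spatial, dtype=float)
    q = np.asarray(p_temporal, dtype=float)
    fused = p ** w_spatial * q ** w_temporal
    return fused / fused.sum()

# Hypothetical two-class predictions from the spatial and temporal streams.
spatial = [0.7, 0.3]
temporal = [0.6, 0.4]
fused = conflate(spatial, temporal)
```

When both streams favour the same class, the fused vector is more confident in that class than either input, which is one way a consensus can sharpen the final prediction.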
-
Assisting humans to achieve optimal sleep by changing ambient temperature
Authors:
Vivek Gupta,
Siddhant Mittal,
Sandip Bhaumik,
Raj Roy
Abstract:
Environment plays a vital role in the sleep mechanism of a human. Many studies have shown that the sleeping and waking environment, waking time and hours of sleep are highly significant factors, and that poor conditions can result in sleeping disorders and a variety of diseases. This paper finds the sleep cycle of an individual and accordingly changes the ambient temperature to maximize his/her sleep efficiency. We suggest a method which will assist in increasing sleep efficiency. Using the Fast Fourier Transform (FFT) of heart rate signals, we extract heart rate variability data such as the low frequency / high frequency (LF/HF) power ratio. We then detect sleep stages using an automated algorithm and apply a feedback mechanism to alter the ambient temperature depending upon the sleep stage.
Submitted 23 December, 2016;
originally announced December 2016.
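The LF/HF feature mentioned in the abstract above can be sketched as follows. This is an illustrative computation rather than the paper's algorithm: it assumes the heart rate signal has already been resampled to an even rate `fs`, it uses the conventional HRV bands (LF 0.04 to 0.15 Hz, HF 0.15 to 0.40 Hz), which the abstract itself does not state, and the function name and synthetic test signal are made up for the example. How the ratio maps to a sleep stage is not specified in the abstract and is not attempted here.

```python
import numpy as np

LF_BAND = (0.04, 0.15)  # Hz, conventional low-frequency HRV band (assumed)
HF_BAND = (0.15, 0.40)  # Hz, conventional high-frequency HRV band (assumed)

def lf_hf_ratio(signal, fs):
    """Return LF power divided by HF power for an evenly sampled signal."""
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()                      # remove the DC component
    power = np.abs(np.fft.rfft(x)) ** 2   # unnormalized power spectrum
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    lf = power[(freqs >= LF_BAND[0]) & (freqs < LF_BAND[1])].sum()
    hf = power[(freqs >= HF_BAND[0]) & (freqs < HF_BAND[1])].sum()
    return lf / hf

# Synthetic 5-minute signal at 4 Hz: a 0.10 Hz component with twice the
# amplitude of a 0.25 Hz component, so the LF/HF power ratio is about 4.
fs = 4.0
t = np.arange(0, 300, 1 / fs)
demo = 2.0 * np.sin(2 * np.pi * 0.10 * t) + 1.0 * np.sin(2 * np.pi * 0.25 * t)
ratio = lf_hf_ratio(demo, fs)
```

Because only the ratio of the two band powers is used, the FFT normalization constant cancels out and can be left off.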
-
Energy-Efficient Design and Optimization of Wireline Access Networks
Authors:
Sourjya Bhaumik,
David Chuck,
Girija Narlikar,
Gordon Wilfong
Abstract:
Access networks, in particular, Digital Subscriber Line (DSL) equipment, are a significant source of energy consumption for wireline operators. Replacing large monolithic DSLAMs with smaller remote DSLAM units closer to customers can reduce the energy consumption as well as increase the reach of the access network. This paper attempts to formalize the design and optimization of the "last mile" wireline access network with energy as one of the costs to be minimized. In particular, the placement of remote DSLAM units needs to be optimized. We propose solutions for two scenarios. For the scenario where an existing all-copper network from the central office to the customers is to be transformed into a fiber-copper network with remote DSLAM units, we present optimal polynomial-time solutions. In the green-field scenario, both the access network layout and the placement of remote DSLAM units must be determined. We show that this problem is NP-complete. We present an optimal ILP formulation and also design an efficient heuristic-based approach to build a power-and-cost-optimized access network. Our heuristic-based approach yields results that are very close to optimal. We show how the power consumption of the access network can be reduced by carefully laying the access network and introducing remote DSLAM units.
Submitted 14 January, 2011;
originally announced January 2011.