-
SiEVE: Semantically Encoded Video Analytics on Edge and Cloud
Authors:
Tarek Elgamal,
Shu Shi,
Varun Gupta,
Rittwik Jana,
Klara Nahrstedt
Abstract:
Recent advances in computer vision and neural networks have made it possible for more surveillance videos to be automatically searched and analyzed by algorithms rather than humans. This happened in parallel with advances in edge computing where videos are analyzed over hierarchical clusters that contain edge devices, close to the video source. However, the current video analysis pipeline has several disadvantages when dealing with such advances. For example, video encoders have long been designed to please human viewers and to be agnostic of the downstream analysis task (e.g., object detection). Moreover, most video analytics systems leverage a 2-tier architecture where the encoded video is sent to either a remote cloud or a private edge server, but do not efficiently leverage both of them. In response to these advances, we present SIEVE, a 3-tier video analytics system to reduce the latency and increase the throughput of analytics over video streams. In SIEVE, we present a novel technique to detect objects in compressed video streams. We refer to this technique as semantic video encoding because it allows video encoders to be aware of the semantics of the downstream task (e.g., object detection). Our results show that by leveraging semantic video encoding, we achieve close to 100% object detection accuracy while decompressing only 3.5% of the video frames, which results in more than 100x speedup compared to classical approaches that decompress every video frame.
Submitted 1 June, 2020;
originally announced June 2020.
-
Serdab: An IoT Framework for Partitioning Neural Networks Computation across Multiple Enclaves
Authors:
Tarek Elgamal,
Klara Nahrstedt
Abstract:
Recent advances in Deep Neural Networks (DNN) and Edge Computing have made it possible to automatically analyze streams of videos from home/security cameras over hierarchical clusters that include edge devices, close to the video source, as well as remote cloud compute resources. However, preserving the privacy and confidentiality of users' sensitive data as it passes through different devices remains a concern to most users. Private user data is subject to attacks by malicious attackers or misuse by internal administrators who may use the data in activities that are not explicitly approved by the user. To address this challenge, we present Serdab, a distributed orchestration framework for deploying deep neural network computation across multiple secure enclaves (e.g., Intel SGX). Secure enclaves provide a guarantee on the privacy of the data/code deployed inside them. However, their limited hardware resources make them inefficient when solely running an entire deep neural network. To bridge this gap, Serdab presents a DNN partitioning strategy to distribute the layers of the neural network across multiple enclave devices or across an enclave device and other hardware accelerators. Our partitioning strategy achieves up to 4.7x speedup compared to executing the entire neural network in one enclave.
Submitted 12 May, 2020;
originally announced May 2020.
-
Costless: Optimizing Cost of Serverless Computing through Function Fusion and Placement
Authors:
Tarek Elgamal,
Atul Sandur,
Klara Nahrstedt,
Gul Agha
Abstract:
Serverless computing has recently experienced significant adoption by several applications, especially Internet of Things (IoT) applications. In serverless computing, rather than deploying and managing dedicated virtual machines, users are able to deploy individual functions, and pay only for the time that their code is actually executing. However, since serverless platforms are relatively new, they have a completely different pricing model that depends on the memory, duration, and the number of executions of a sequence/workflow of functions. In this paper we present an algorithm that optimizes the price of serverless applications in AWS Lambda. We first describe the factors affecting the price of serverless applications, which include: (1) fusing a sequence of functions, (2) splitting functions across edge and cloud resources, and (3) allocating the memory for each function. We then present an efficient algorithm to explore different function fusion-placement solutions and find the solution that optimizes the application's price while keeping the latency under a certain threshold. Our results on image processing workflows show that the algorithm can find solutions optimizing the price by more than 35%-57% with only 5%-15% increase in latency. We also show that our algorithm can find non-trivial memory configurations that reduce both latency and price.
Submitted 23 November, 2018;
originally announced November 2018.
-
Analysis of PCA Algorithms in Distributed Environments
Authors:
Tarek Elgamal,
Mohamed Hefeeda
Abstract:
Classical machine learning algorithms often face scalability bottlenecks when they are applied to large-scale data. Such algorithms were designed to work with small data that is assumed to fit in the memory of one machine. In this report, we analyze different methods for computing an important machine learning algorithm, namely Principal Component Analysis (PCA), and we comment on its limitations in supporting large datasets. The methods are analyzed and compared across two important metrics: time complexity and communication complexity. We consider the worst-case scenarios for both metrics, and we identify the software libraries that implement each method. The analysis in this report helps researchers and engineers in (i) understanding the main bottlenecks for scalability in different PCA algorithms, (ii) choosing the most appropriate method and software library for a given application and data set characteristics, and (iii) designing new scalable PCA algorithms.
Submitted 13 May, 2015; v1 submitted 17 March, 2015;
originally announced March 2015.