Search | arXiv e-print repository

Smart speaker design and implementation with biometric authentication and advanced voice interaction capability

Authors: Bharath Sudharsan, Peter Corcoran, Muhammad Intizar Ali

Abstract: Advancements in semiconductor technology have reduced dimensions and cost while improving the performance and capacity of chipsets. In addition, advancement in the AI frameworks and libraries brings possibilities to accommodate more AI at the resource-constrained edge of consumer IoT devices. Sensors are nowadays an integral part of our environment which provide continuous data streams to build in… ▽ More Advancements in semiconductor technology have reduced dimensions and cost while improving the performance and capacity of chipsets. In addition, advancement in the AI frameworks and libraries brings possibilities to accommodate more AI at the resource-constrained edge of consumer IoT devices. Sensors are nowadays an integral part of our environment which provide continuous data streams to build intelligent applications. An example could be a smart home scenario with multiple interconnected devices. In such smart environments, for convenience and quick access to web-based service and personal information such as calendars, notes, emails, reminders, banking, etc, users link third-party skills or skills from the Amazon store to their smart speakers. Also, in current smart home scenarios, several smart home products such as smart security cameras, video doorbells, smart plugs, smart carbon monoxide monitors, and smart door locks, etc. are interlinked to a modern smart speaker via means of custom skill addition. Since smart speakers are linked to such services and devices via the smart speaker user's account. They can be used by anyone with physical access to the smart speaker via voice commands. If done so, the data privacy, home security and other aspects of the user get compromised. Recently launched, Tensor Cam's AI Camera, Toshiba's Symbio, Facebook's Portal are camera-enabled smart speakers with AI functionalities. Although they are camera-enabled, yet they do not have an authentication scheme in addition to calling out the wake-word. This paper provides an overview of cybersecurity risks faced by smart speaker users due to lack of authentication scheme and discusses the development of a state-of-the-art camera-enabled, microphone array-based modern Alexa smart speaker prototype to address these risks. △ Less

Submitted 17 July, 2022; originally announced July 2022.

arXiv:2204.10183 [pdf, other]

Multi-Component Optimization and Efficient Deployment of Neural-Networks on Resource-Constrained IoT Hardware

Authors: Bharath Sudharsan, Dineshkumar Sundaram, Pankesh Patel, John G. Breslin, Muhammad Intizar Ali, Schahram Dustdar, Albert Zomaya, Rajiv Ranjan

Abstract: The majority of IoT devices like smartwatches, smart plugs, HVAC controllers, etc., are powered by hardware with a constrained specification (low memory, clock speed and processor) which is insufficient to accommodate and execute large, high-quality models. On such resource-constrained devices, manufacturers still manage to provide attractive functionalities (to boost sales) by following the tradi… ▽ More The majority of IoT devices like smartwatches, smart plugs, HVAC controllers, etc., are powered by hardware with a constrained specification (low memory, clock speed and processor) which is insufficient to accommodate and execute large, high-quality models. On such resource-constrained devices, manufacturers still manage to provide attractive functionalities (to boost sales) by following the traditional approach of programming IoT devices/products to collect and transmit data (image, audio, sensor readings, etc.) to their cloud-based ML analytics platforms. For decades, this online approach has been facing issues such as compromised data streams, non-real-time analytics due to latency, bandwidth constraints, costly subscriptions, recent privacy issues raised by users and the GDPR guidelines, etc. In this paper, to enable ultra-fast and accurate AI-based offline analytics on resource-constrained IoT devices, we present an end-to-end multi-component model optimization sequence and open-source its implementation. Researchers and developers can use our optimization sequence to optimize high memory, computation demanding models in multiple aspects in order to produce small size, low latency, low-power consuming models that can comfortably fit and execute on resource-constrained hardware. The experimental results show that our optimization components can produce models that are; (i) 12.06 x times compressed; (ii) 0.13% to 0.27% more accurate; (iii) Orders of magnitude faster unit inference at 0.06 ms. Our optimization sequence is generic and can be applied to any state-of-the-art models trained for anomaly detection, predictive maintenance, robotics, voice recognition, and machine vision. △ Less

Submitted 20 April, 2022; originally announced April 2022.

arXiv:2010.09687 [pdf, other]

A Demonstration of Smart Doorbell Design Using Federated Deep Learning

Authors: Vatsal Patel, Sarth Kanani, Tapan Pathak, Pankesh Patel, Muhammad Intizar Ali, John Breslin

Abstract: Smart doorbells have been playing an important role in protecting our modern homes. Existing approaches of sending video streams to a centralized server (or Cloud) for video analytics have been facing many challenges such as latency, bandwidth cost and more importantly users' privacy concerns. To address these challenges, this paper showcases the ability of an intelligent smart doorbell based on F… ▽ More Smart doorbells have been playing an important role in protecting our modern homes. Existing approaches of sending video streams to a centralized server (or Cloud) for video analytics have been facing many challenges such as latency, bandwidth cost and more importantly users' privacy concerns. To address these challenges, this paper showcases the ability of an intelligent smart doorbell based on Federated Deep Learning, which can deploy and manage video analytics applications such as a smart doorbell across Edge and Cloud resources. This platform can scale, work with multiple devices, seamlessly manage online orchestration of the application components. The proposed framework is implemented using state-of-the-art technology. We implement the Federated Server using the Flask framework, containerized using Nginx and Gunicorn, which is deployed on AWS EC2 and AWS Serverless architecture. △ Less

Submitted 19 October, 2020; originally announced October 2020.

Comments: 6

arXiv:2010.07680 [pdf, other]

Demonstration of a Cloud-based Software Framework for Video Analytics Application using Low-Cost IoT Devices

Authors: Bhavin Joshi, Tapan Pathak, Vatsal Patel, Sarth Kanani, Pankesh Patel, Muhammad Intizar Ali, John Breslin

Abstract: The design of products and services such as a Smart doorbell, demonstrating video analytics software/algorithm functionality, is expected to address a new kind of requirements such as designing a scalable solution while considering the trade-off between cost and accuracy; a flexible architecture to deploy new AI-based models or update existing models, as user requirements evolve; as well as seamle… ▽ More The design of products and services such as a Smart doorbell, demonstrating video analytics software/algorithm functionality, is expected to address a new kind of requirements such as designing a scalable solution while considering the trade-off between cost and accuracy; a flexible architecture to deploy new AI-based models or update existing models, as user requirements evolve; as well as seamlessly integrating different kinds of user interfaces and devices. To address these challenges, we propose a smart doorbell that orchestrates video analytics across Edge and Cloud resources. The proposal uses AWS as a base platform for implementation and leverages Commercially Available Off-The-Shelf(COTS) affordable devices such as Raspberry Pi in the form of an Edge device. △ Less

Submitted 29 September, 2020; originally announced October 2020.

Comments: arXiv admin note: substantial text overlap with arXiv:2009.09065

arXiv:2009.09065 [pdf, other]

A Distributed Framework to Orchestrate Video Analytics Applications

Authors: Tapan Pathak, Vatsal Patel, Sarth Kanani, Shailesh Arya, Pankesh Patel, Muhammad Intizar Ali, John Breslin

Abstract: The concept of the Internet of Things (IoT) is a reality now. This paradigm shift has caught everyones attention in a large class of applications, including IoT-based video analytics using smart doorbells. Due to its growing application segments, various efforts exist in scientific literature and many video-based doorbell solutions are commercially available in the market. However, contemporary of… ▽ More The concept of the Internet of Things (IoT) is a reality now. This paradigm shift has caught everyones attention in a large class of applications, including IoT-based video analytics using smart doorbells. Due to its growing application segments, various efforts exist in scientific literature and many video-based doorbell solutions are commercially available in the market. However, contemporary offerings are bespoke, offering limited composability and reusability of a smart doorbell framework. Second, they are monolithic and proprietary, which means that the implementation details remain hidden from the users. We believe that a transparent design can greatly aid in the development of a smart doorbell, enabling its use in multiple application domains. To address the above-mentioned challenges, we propose a distributed framework to orchestrate video analytics across Edge and Cloud resources. We investigate trade-offs in the distribution of different software components over a bespoke/full system, where components over Edge and Cloud are treated generically. This paper evaluates the proposed framework as well as the state-of-the-art models and presents comparative analysis of them on various metrics (such as overall model accuracy, latency, memory, and CPU usage). The evaluation result demonstrates our intuition very well, showcasing that the AWS-based approach exhibits reasonably high object-detection accuracy, low memory, and CPU usage when compared to the state-of-the-art approaches, but high latency. △ Less

Submitted 17 September, 2020; originally announced September 2020.

Comments: 9

Showing 1–5 of 5 results for author: Ali, M I