-
Leveraging PointNet and PointNet++ for Lyft Point Cloud Classification Challenge
Authors:
Rajat K. Doshi
Abstract:
This study investigates the application of PointNet and PointNet++ in the classification of LiDAR-generated point cloud data, a critical component for achieving fully autonomous vehicles. Utilizing a modified dataset from the Lyft 3D Object Detection Challenge, we examine the models' capabilities to handle dynamic and complex environments essential for autonomous navigation. Our analysis shows that PointNet and PointNet++ achieved accuracy rates of 79.53% and 84.24%, respectively. These results underscore the models' robustness in interpreting intricate environmental data, which is pivotal for the safety and efficiency of autonomous vehicles. Moreover, the enhanced detection accuracy, particularly in distinguishing pedestrians from other objects, highlights the potential of these models to contribute substantially to the advancement of autonomous vehicle technology.
Submitted 29 April, 2024;
originally announced April 2024.
-
On Hand-Held Grippers and the Morphological Gap in Human Manipulation Demonstration
Authors:
Kiran Doshi,
Yijiang Huang,
Stelian Coros
Abstract:
Collecting manipulation demonstrations with robotic hardware is tedious - and thus difficult to scale. Recording data on robot hardware ensures that it is in the appropriate format for Learning from Demonstrations (LfD) methods. By contrast, humans are proficient manipulators, and recording their actions would be easy to scale, but it is challenging to use that data format with LfD methods. The question we explore is whether there is a method to collect data in a format that can be used with LfD while retaining some of the attractive features of recording human manipulation. We propose equipping humans with hand-held, hand-actuated parallel grippers and a head-mounted camera to record demonstrations of manipulation tasks. Using customised and reproducible grippers, we collect an initial dataset of common manipulation tasks. We show that there are tasks that, against our initial intuition, can be performed using parallel grippers. Qualitative insights are obtained regarding the impact of the difference in morphology on LfD by comparing the strategies used to complete tasks with human hands and grippers. Our data collection method bridges the gap between robot- and human-native manipulation demonstration. By making the design of our gripper prototype available, we hope to reduce other researchers' effort to collect manipulation data.
Submitted 3 November, 2023;
originally announced November 2023.
-
CCTV-Gun: Benchmarking Handgun Detection in CCTV Images
Authors:
Srikar Yellapragada,
Zhenghong Li,
Kevin Bhadresh Doshi,
Purva Makarand Mhasakar,
Heng Fan,
Jie Wei,
Erik Blasch,
Bin Zhang,
Haibin Ling
Abstract:
Gun violence is a critical security problem, and it is imperative for the computer vision community to develop effective gun detection algorithms for real-world scenarios, particularly in Closed Circuit Television (CCTV) surveillance data. Despite significant progress in visual object detection, detecting guns in real-world CCTV images remains a challenging and under-explored task. Firearms, especially handguns, are typically very small in size, non-salient in appearance, and often severely occluded or indistinguishable from other small objects. Additionally, the lack of principled benchmarks and difficulty collecting relevant datasets further hinder algorithmic development. In this paper, we present a meticulously crafted and annotated benchmark, called CCTV-Gun, which addresses the challenges of detecting handguns in real-world CCTV images. Our contribution is three-fold. Firstly, we carefully select and analyze real-world CCTV images from three datasets, manually annotate handguns and their holders, and assign each image with relevant challenge factors such as blur and occlusion. Secondly, we propose a new cross-dataset evaluation protocol in addition to the standard intra-dataset protocol, which is vital for gun detection in practical settings. Finally, we comprehensively evaluate both classical and state-of-the-art object detection algorithms, providing an in-depth analysis of their generalizing abilities. The benchmark will facilitate further research and development on this topic and ultimately enhance security. Code, annotations, and trained models are available at https://github.com/srikarym/CCTV-Gun.
Submitted 11 July, 2023; v1 submitted 19 March, 2023;
originally announced March 2023.
-
Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks
Authors:
Yizhong Wang,
Swaroop Mishra,
Pegah Alipoormolabashi,
Yeganeh Kordi,
Amirreza Mirzaei,
Anjana Arunkumar,
Arjun Ashok,
Arut Selvan Dhanasekaran,
Atharva Naik,
David Stap,
Eshaan Pathak,
Giannis Karamanolakis,
Haizhi Gary Lai,
Ishan Purohit,
Ishani Mondal,
Jacob Anderson,
Kirby Kuznia,
Krima Doshi,
Maitreya Patel,
Kuntal Kumar Pal,
Mehrad Moradshahi,
Mihir Parmar,
Mirali Purohit,
Neeraj Varshney,
Phani Rohitha Kaza
, et al. (15 additional authors not shown)
Abstract:
How well can NLP models generalize to a variety of unseen tasks when provided with task instructions? To address this question, we first introduce Super-NaturalInstructions, a benchmark of 1,616 diverse NLP tasks and their expert-written instructions. Our collection covers 76 distinct task types, including but not limited to classification, extraction, infilling, sequence tagging, text rewriting, and text composition. This large and diverse collection of tasks enables rigorous benchmarking of cross-task generalization under instructions -- training models to follow instructions on a subset of tasks and evaluating them on the remaining unseen ones. Furthermore, we build Tk-Instruct, a transformer model trained to follow a variety of in-context instructions (plain language task definitions or k-shot examples). Our experiments show that Tk-Instruct outperforms existing instruction-following models such as InstructGPT by over 9% on our benchmark despite being an order of magnitude smaller. We further analyze generalization as a function of various scaling parameters, such as the number of observed tasks, the number of instances per task, and model sizes. We hope our dataset and model facilitate future progress towards more general-purpose NLP models.
Submitted 24 October, 2022; v1 submitted 15 April, 2022;
originally announced April 2022.
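The cross-task generalization setup described in the abstract above (train on a subset of tasks, evaluate on the remaining unseen ones) can be illustrated with a minimal sketch. The function name `split_by_task`, the `(task_id, example)` pair layout, and the held-out fraction are assumptions made for illustration only; they are not the benchmark's actual data format or official split.

```python
import random


def split_by_task(instances, held_out_fraction=0.2, seed=0):
    """Split (task_id, example) pairs so that no task appears on both sides.

    Hypothetical helper: the split is made over task ids, not over
    individual instances, so every evaluation task is unseen in training.
    """
    task_ids = sorted({task_id for task_id, _ in instances})
    rng = random.Random(seed)
    rng.shuffle(task_ids)
    n_held_out = max(1, int(len(task_ids) * held_out_fraction))
    unseen = set(task_ids[:n_held_out])
    train = [(t, x) for t, x in instances if t not in unseen]
    test = [(t, x) for t, x in instances if t in unseen]
    return train, test


# Toy data: 10 made-up tasks with 3 examples each.
instances = [(f"task{i}", f"example{j}") for i in range(10) for j in range(3)]
train, test = split_by_task(instances)
```

Splitting over task ids rather than over instances is the point of the sketch: a random instance-level split would leak every task into training and would measure in-task rather than cross-task generalization.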
-
Adversarial Machine Learning Attacks Against Video Anomaly Detection Systems
Authors:
Furkan Mumcu,
Keval Doshi,
Yasin Yilmaz
Abstract:
Anomaly detection in videos is an important computer vision problem with various applications including automated video surveillance. Although adversarial attacks on image understanding models have been heavily investigated, there is not much work on adversarial machine learning targeting video understanding models and no previous work which focuses on video anomaly detection. To this end, we investigate an adversarial machine learning attack against video anomaly detection systems that can be implemented via an easy-to-perform cyber-attack. Since surveillance cameras are usually connected to the server running the anomaly detection model through a wireless network, they are prone to cyber-attacks targeting the wireless connection. We demonstrate how a Wi-Fi deauthentication attack, a notoriously easy-to-perform and effective denial-of-service (DoS) attack, can be utilized to generate adversarial data for video anomaly detection systems. Specifically, we apply several effects caused by the Wi-Fi deauthentication attack on video quality (e.g., slow down, freeze, fast forward, low resolution) to the popular benchmark datasets for video anomaly detection. Our experiments with several state-of-the-art anomaly detection models show that the attackers can significantly undermine the reliability of video anomaly detection systems by causing frequent false alarms and hiding physical anomalies from the surveillance system.
Submitted 6 April, 2022;
originally announced April 2022.
-
TiSAT: Time Series Anomaly Transformer
Authors:
Keval Doshi,
Shatha Abudalou,
Yasin Yilmaz
Abstract:
While anomaly detection in time series has been an active area of research for several years, most recent approaches employ an inadequate evaluation criterion leading to an inflated F1 score. We show that a rudimentary Random Guess method can outperform state-of-the-art detectors in terms of this popular but faulty evaluation criterion. In this work, we propose a proper evaluation metric that measures the timeliness and precision of detecting sequential anomalies. Moreover, most existing approaches are unable to capture temporal features from long sequences. Self-attention based approaches, such as transformers, have been demonstrated to be particularly efficient in capturing long-range dependencies while being computationally efficient during training and inference. We also propose an efficient transformer approach for anomaly detection in time series and extensively evaluate our proposed approach on several popular benchmark datasets.
Submitted 10 March, 2022;
originally announced March 2022.
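The abstract above does not name the evaluation criterion it criticizes. As a sketch only, assume it is the widely used "point adjustment" protocol, in which an entire ground-truth anomaly segment counts as detected if any single point inside it is flagged. The helper names `point_adjust` and `f1_score` and the toy sequences are hypothetical, chosen to show how such a protocol can inflate F1.

```python
def point_adjust(labels, preds):
    """Mark a whole ground-truth anomaly segment as detected
    if at least one point inside it was flagged (assumed protocol)."""
    adjusted = list(preds)
    n = len(labels)
    i = 0
    while i < n:
        if labels[i] == 1:
            j = i
            while j < n and labels[j] == 1:
                j += 1  # j is one past the end of the segment
            if any(preds[i:j]):
                for k in range(i, j):
                    adjusted[k] = 1
            i = j
        else:
            i += 1
    return adjusted


def f1_score(labels, preds):
    """Point-wise F1 score for binary labels and predictions."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy sequence: one 10-point anomaly segment in the middle.
labels = [0] * 10 + [1] * 10 + [0] * 10
preds = [0] * 30
preds[3] = 1   # one false alarm
preds[15] = 1  # a single hit inside the anomaly segment

raw_f1 = f1_score(labels, preds)                            # 1/6
adjusted_f1 = f1_score(labels, point_adjust(labels, preds)) # 20/21
```

In this toy case a detector that flags one point of the segment plus one false alarm moves from an F1 of about 0.17 to about 0.95 after adjustment, which illustrates why sparse or even random flags can score well under such a criterion.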
-
End-to-End Semantic Video Transformer for Zero-Shot Action Recognition
Authors:
Keval Doshi,
Yasin Yilmaz
Abstract:
While video action recognition has been an active area of research for several years, zero-shot action recognition has only recently started gaining traction. In this work, we propose a novel end-to-end trained transformer model which is capable of capturing long-range spatiotemporal dependencies efficiently, contrary to existing approaches which use 3D-CNNs. Moreover, to address a common ambiguity in the existing works about classes that can be considered as previously unseen, we propose a new experimentation setup that satisfies the zero-shot learning premise for action recognition by avoiding overlap between the training and testing classes. The proposed approach significantly outperforms the state of the art in zero-shot action recognition in terms of the top-1 accuracy on the UCF-101, HMDB-51 and ActivityNet datasets. The code and proposed experimentation setup are available on GitHub: https://github.com/Secure-and-Intelligent-Systems-Lab/SemanticVideoTransformer
Submitted 2 December, 2022; v1 submitted 10 March, 2022;
originally announced March 2022.
-
Online Application Guidance for Heterogeneous Memory Systems
Authors:
M. Ben Olson,
Brandon Kammerdiener,
Kshitij A. Doshi,
Terry Jones,
Michael R. Jantz
Abstract:
Many high-end and next-generation computing systems incorporate alternative memory technologies to meet performance goals. Since these technologies present distinct advantages and tradeoffs compared to conventional DDR* SDRAM, such as higher bandwidth with lower capacity or vice versa, they are typically packaged alongside conventional SDRAM in a heterogeneous memory architecture. To utilize the different types of memory efficiently, new data management strategies are needed to match application usage to the best available memory technology. However, current proposals for managing heterogeneous memories are limited because they either: 1) do not consider high-level application behavior when assigning data to different types of memory, or 2) require separate program execution (with a representative input) to collect information about how the application uses memory resources.
This work presents a toolset for addressing the limitations of existing approaches for managing complex memories. It extends the application runtime layer with automated monitoring and management routines that assign application data to the best tier of memory based on previous usage, without any need for source code modification or a separate profiling run. It evaluates this approach on a state-of-the-art server platform with both conventional DDR4 SDRAM and non-volatile Intel Optane DC memory, using both memory-intensive high performance computing (HPC) applications as well as standard benchmarks. Overall, the results show that this approach improves program performance significantly compared to a standard unguided approach across a variety of workloads and system configurations. Additionally, we show that this approach achieves similar performance as a comparable offline profiling-based approach after a short startup period, without requiring separate program execution or offline analysis steps.
Submitted 5 October, 2021;
originally announced October 2021.
-
An Efficient Approach for Anomaly Detection in Traffic Videos
Authors:
Keval Doshi,
Yasin Yilmaz
Abstract:
Due to its relevance in intelligent transportation systems, anomaly detection in traffic videos has recently received much interest. It remains a difficult problem due to a variety of factors influencing the video quality of a real-time traffic feed, such as temperature, perspective, lighting conditions, and so on. Even though state-of-the-art methods perform well on the available benchmark datasets, they need a large amount of external training data as well as substantial computational resources. In this paper, we propose an efficient approach for a video anomaly detection system which is capable of running at the edge devices, e.g., on a roadside camera. The proposed approach comprises a pre-processing module that detects changes in the scene and removes the corrupted frames, a two-stage background modelling module and a two-stage object detector. Finally, a backtracking anomaly detection algorithm computes a similarity statistic and decides on the onset time of the anomaly. We also propose a sequential change detection algorithm that can quickly adapt to a new scene and detect changes in the similarity statistic. Experimental results on the Track 4 test set of the 2021 AI City Challenge show the efficacy of the proposed framework as we achieve an F1-score of 0.9157 along with 8.4027 root mean square error (RMSE) and are ranked fourth in the competition.
Submitted 20 April, 2021;
originally announced April 2021.
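The abstract above mentions a sequential change detection algorithm that detects changes in a similarity statistic, but it does not specify the algorithm. The sketch below substitutes a generic one-sided CUSUM test purely as an illustration of sequential change detection; the function name `cusum_alarm`, the `drift` and `threshold` values, and the toy sequence are all assumptions and are not taken from the paper.

```python
def cusum_alarm(stats, drift, threshold):
    """Return the first index at which a one-sided CUSUM statistic
    exceeds the threshold, or None if it never does.

    stats     : per-frame statistic (here, a made-up "drop in similarity")
    drift     : allowance subtracted at each step (assumed value)
    threshold : decision threshold (assumed value)
    """
    s = 0.0
    for t, x in enumerate(stats):
        # Accumulate evidence of an upward shift; reset at zero.
        s = max(0.0, s + x - drift)
        if s > threshold:
            return t
    return None


# Toy sequence: small values at first, then a sustained increase from index 4.
similarity_drop = [0.1, 0.0, 0.2, 0.1, 0.9, 1.0, 0.8, 0.9]
alarm_index = cusum_alarm(similarity_drop, drift=0.5, threshold=1.0)
```

With these toy values the alarm is raised at index 6, two frames after the shift begins at index 4. Raising `threshold` trades a longer detection delay for fewer false alarms, which is the usual tuning decision in sequential detection.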
-
A Modular and Unified Framework for Detecting and Localizing Video Anomalies
Authors:
Keval Doshi,
Yasin Yilmaz
Abstract:
Anomaly detection in videos has been attracting an increasing amount of attention. Despite the competitive performance of recent methods on benchmark datasets, they typically lack desirable features such as modularity, cross-domain adaptivity, interpretability, and real-time anomalous event detection. Furthermore, current state-of-the-art approaches are evaluated using the standard instance-based detection metric by considering video frames as independent instances, which is not ideal for video anomaly detection. Motivated by these research gaps, we propose a modular and unified approach to the online video anomaly detection and localization problem, called MOVAD, which consists of a novel transfer learning based plug-and-play architecture, a sequential anomaly detector, a mathematical framework for selecting the detection threshold, and a suitable performance metric for real-time anomalous event detection in videos. Extensive performance evaluations on benchmark datasets show that the proposed framework significantly outperforms the current state-of-the-art approaches.
Submitted 21 March, 2021;
originally announced March 2021.
-
Road Damage Detection using Deep Ensemble Learning
Authors:
Keval Doshi,
Yasin Yilmaz
Abstract:
Road damage detection is critical for the maintenance of a road, which traditionally has been performed using expensive high-performance sensors. With the recent advances in technology, especially in computer vision, it is now possible to detect and categorize different types of road damage, which can facilitate efficient maintenance and resource management. In this work, we present an ensemble model for efficient detection and classification of road damage, which we have submitted to the IEEE BigData Cup Challenge 2020. Our solution utilizes a state-of-the-art object detector known as You Only Look Once (YOLO-v4), which is trained on images of various types of road damage from the Czech Republic, Japan and India. Our ensemble approach was extensively tested with several different model versions and it was able to achieve an F1 score of 0.628 on the test 1 dataset and 0.6358 on the test 2 dataset.
Submitted 29 October, 2020;
originally announced November 2020.
-
Online Anomaly Detection in Surveillance Videos with Asymptotic Bounds on False Alarm Rate
Authors:
Keval Doshi,
Yasin Yilmaz
Abstract:
Anomaly detection in surveillance videos is attracting an increasing amount of attention. Despite the competitive performance of recent methods, they lack theoretical performance analysis, particularly due to the complex deep neural network architectures used in decision making. Additionally, online decision making is an important but mostly neglected factor in this domain. Many of the existing methods that claim to be online depend on batch or offline processing in practice. Motivated by these research gaps, we propose an online anomaly detection method in surveillance videos with asymptotic bounds on the false alarm rate, which in turn provides a clear procedure for selecting a proper decision threshold that satisfies the desired false alarm rate. Our proposed algorithm consists of a multi-objective deep learning module along with a statistical anomaly detection module, and its effectiveness is demonstrated on several publicly available data sets where we outperform the state-of-the-art algorithms. All code is available at https://github.com/kevaldoshi17/Prediction-based-Video-Anomaly-Detection-.
Submitted 10 October, 2020;
originally announced October 2020.
-
Timely Detection and Mitigation of Stealthy DDoS Attacks via IoT Networks
Authors:
Keval Doshi,
Yasin Yilmaz,
Suleyman Uludag
Abstract:
Internet of Things (IoT) networks consist of sensors, actuators, mobile and wearable devices that can connect to the Internet. With billions of such devices, which have significant vulnerabilities, already on the market, there is a dangerous threat to Internet services and to some cyber-physical systems that are connected to the Internet. Specifically, due to their existing vulnerabilities, IoT devices are susceptible to being compromised and becoming part of a new type of stealthy Distributed Denial of Service (DDoS) attack, called Mongolian DDoS, which is characterized by its widely distributed nature and small attack size from each source. This study proposes a novel anomaly-based Intrusion Detection System (IDS) that is capable of detecting and mitigating this emerging type of DDoS attack in a timely manner. The proposed IDS's capability of detecting and mitigating stealthy DDoS attacks with even very low attack size per source is demonstrated through numerical and testbed experiments.
Submitted 14 June, 2020;
originally announced June 2020.
-
Continual Learning for Anomaly Detection in Surveillance Videos
Authors:
Keval Doshi,
Yasin Yilmaz
Abstract:
Anomaly detection in surveillance videos has been recently gaining attention. A challenging aspect of high-dimensional applications such as video surveillance is continual learning. While current state-of-the-art deep learning approaches perform well on existing public datasets, they fail to work in a continual learning framework due to computational and storage issues. Furthermore, online decision making is an important but mostly neglected factor in this domain. Motivated by these research gaps, we propose an online anomaly detection method for surveillance videos using transfer learning and continual learning, which in turn significantly reduces the training complexity and provides a mechanism for continually learning from recent data without suffering from catastrophic forgetting. Our proposed algorithm leverages the feature extraction power of neural network-based models for transfer learning, and the continual learning capability of statistical detection methods.
Submitted 15 April, 2020;
originally announced April 2020.
-
Any-Shot Sequential Anomaly Detection in Surveillance Videos
Authors:
Keval Doshi,
Yasin Yilmaz
Abstract:
Anomaly detection in surveillance videos has been recently gaining attention. Even though the performance of state-of-the-art methods on publicly available data sets has been competitive, they demand a massive amount of training data. Also, they lack a concrete approach for continuously updating the trained model once new data is available. Furthermore, online decision making is an important but mostly neglected factor in this domain. Motivated by these research gaps, we propose an online anomaly detection method for surveillance videos using transfer learning and any-shot learning, which in turn significantly reduces the training complexity and provides a mechanism that can detect anomalies using only a few labeled nominal examples. Our proposed algorithm leverages the feature extraction power of neural network-based models for transfer learning and the any-shot learning capability of statistical detection methods.
Submitted 4 April, 2020;
originally announced April 2020.
-
Synthetic Image Augmentation for Improved Classification using Generative Adversarial Networks
Authors:
Keval Doshi
Abstract:
Object detection and recognition has been an ongoing research topic for a long time in the field of computer vision. Even in robotics, detecting the state of an object by a robot remains a challenging task. Moreover, collecting data for each possible state is not feasible. In this work, we use a deep convolutional neural network with an SVM as a classifier to help with recognizing the state of a cooking object. We also study how a generative adversarial network can be used for synthetic data augmentation and for improving the classification accuracy. The main motivation behind this work is to estimate how well a robot could recognize the current state of an object.
Submitted 31 July, 2019;
originally announced July 2019.
-
Hardware Transactional Persistent Memory
Authors:
Ellis Giles,
Kshitij Doshi,
Peter Varman
Abstract:
Emerging Persistent Memory technologies (also PM, Non-Volatile DIMMs, Storage Class Memory or SCM) hold tremendous promise for accelerating popular data-management applications like in-memory databases. However, programmers now need to deal with ensuring the atomicity of transactions on Persistent Memory resident data and maintaining consistency between the order in which processors perform stores and that in which the updated values become durable.
The problem is especially challenging when high-performance isolation mechanisms like Hardware Transactional Memory (HTM) are used for concurrency control. This work shows how HTM transactions can be ordered correctly and atomically into PM by the use of a novel software protocol combined with a Persistent Memory Controller, without requiring changes to processor cache hardware or HTM protocols. In contrast, previous approaches require significant changes to existing processor microarchitectures. Our approach, evaluated using both micro-benchmarks and the STAMP suite, compares well with standard (volatile) HTM transactions. It also yields significant gains in throughput and latency in comparison with persistent transactional locking.
Submitted 22 May, 2018;
originally announced June 2018.