-
Leveraging eBPF and AI for Ransomware Nose Out
Authors:
Arjun Sekar,
Sameer G. Kulkarni,
Joy Kuri
Abstract:
In this work, we propose a two-phased approach for real-time detection and deterrence of ransomware. To achieve this, we leverage the capabilities of eBPF (Extended Berkeley Packet Filter) and artificial intelligence to develop both proactive and reactive methods. In the first phase, we utilize signature-based detection, where we employ custom eBPF programs to trace the execution of new processes and perform hash-based analysis against a known ransomware dataset. In the second phase, we employ a behavior-based technique that uses a custom eBPF program to monitor process activities and uses Natural Language Processing (NLP) to detect the creation of ransom notes, a prominent indicator of ransomware activity. By leveraging the low-level tracing capabilities of eBPF and integrating NLP-based machine learning algorithms, our solution achieves 99.76% accuracy in identifying ransomware incidents within a few seconds of the onset of zero-day attacks.
Submitted 20 June, 2024;
originally announced June 2024.
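The hash-based first phase described in the abstract above can be illustrated with a minimal sketch. It assumes SHA-256 digests and an in-memory signature set; the abstract names neither the hash function nor the dataset format, and the names `KNOWN_RANSOMWARE_SHA256` and `is_known_ransomware` are hypothetical. The eBPF tracing that would supply the executable is not shown.

```python
import hashlib

# Hypothetical signature set: SHA-256 digests of known ransomware binaries.
# The single entry here is the digest of a harmless demo payload.
KNOWN_RANSOMWARE_SHA256 = {hashlib.sha256(b"malicious-sample").hexdigest()}


def sha256_of_bytes(data: bytes) -> str:
    """Return the hex SHA-256 digest of an executable's contents."""
    return hashlib.sha256(data).hexdigest()


def is_known_ransomware(executable_bytes: bytes,
                        signatures=KNOWN_RANSOMWARE_SHA256) -> bool:
    """Signature check: flag the executable if its digest is in the known set."""
    return sha256_of_bytes(executable_bytes) in signatures
```

In a deployment along the lines the abstract describes, a check like this would run whenever a traced process execution is reported; a set lookup keeps the per-process cost constant.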
-
Semantic Augmentation in Images using Language
Authors:
Sahiti Yerramilli,
Jayant Sravan Tamarapalli,
Tanmay Girish Kulkarni,
Jonathan Francis,
Eric Nyberg
Abstract:
Deep Learning models are incredibly data-hungry and require very large labeled datasets for supervised learning. As a consequence, these models often suffer from overfitting, limiting their ability to generalize to real-world examples. Recent advancements in diffusion models have enabled the generation of photorealistic images based on textual inputs. Leveraging the substantial datasets used to train these diffusion models, we propose a technique to utilize generated images to augment existing datasets. This paper explores various strategies for effective data augmentation to improve the out-of-domain generalization capabilities of deep learning models.
Submitted 2 April, 2024;
originally announced April 2024.
-
D-STACK: High Throughput DNN Inference by Effective Multiplexing and Spatio-Temporal Scheduling of GPUs
Authors:
Aditya Dhakal,
Sameer G. Kulkarni,
K. K. Ramakrishnan
Abstract:
Hardware accelerators such as GPUs are required for real-time, low-latency inference with Deep Neural Networks (DNNs). However, due to the inherent limits to the parallelism they can exploit, DNNs often under-utilize the capacity of today's high-end accelerators. Although spatial multiplexing of the GPU leads to higher GPU utilization and higher inference throughput, a number of challenges remain: finding the GPU percentage for right-sizing the GPU for each DNN through profiling, determining an optimal batching of requests to balance throughput improvement while meeting application-specific deadlines and service level objectives (SLOs), and maximizing throughput by appropriately scheduling DNNs. This paper introduces a dynamic and fair spatio-temporal scheduler (D-STACK) that enables multiple DNNs to run in the GPU concurrently. To help allocate the appropriate GPU percentage (we call it the "Knee"), we develop and validate a model that estimates the parallelism each DNN can utilize. We also develop a lightweight optimization formulation to find an efficient batch size for each DNN operating with D-STACK. We bring together our optimizations and our spatio-temporal scheduler to provide a holistic inference framework. We demonstrate its ability to provide high throughput while meeting application SLOs. We compare D-STACK with an ideal scheduler that can allocate the right GPU percentage for every DNN kernel. D-STACK achieves more than 90 percent of the throughput and GPU utilization of the ideal scheduler. We also compare D-STACK with other GPU multiplexing and scheduling methods (e.g., NVIDIA Triton, Clipper, Nexus), using popular DNN models. Our controlled experiments with multiplexing several popular DNN models achieve up to 1.6X improvement in GPU utilization and up to 4X improvement in inference throughput.
Submitted 31 March, 2023;
originally announced April 2023.
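The "Knee" idea in the D-STACK abstract above, that is, the GPU percentage beyond which a DNN gains little, can be illustrated with a simple profiling heuristic. This is not the analytical model the authors develop; it is a sketch that assumes a measured latency profile and a hypothetical `min_gain` threshold, and the profile numbers are made up for illustration.

```python
def find_knee(profile, min_gain=0.05):
    """Return the smallest GPU percentage after which adding more GPU
    reduces latency by less than `min_gain` (as a fraction).

    profile: iterable of (gpu_percent, latency_ms) pairs.
    """
    points = sorted(profile)
    for (pct, lat), (_, next_lat) in zip(points, points[1:]):
        # Relative latency reduction from moving to the next larger share.
        gain = (lat - next_lat) / lat
        if gain < min_gain:
            return pct
    # Latency kept improving at every step: use the largest share profiled.
    return points[-1][0]


# Illustrative (made-up) profile: latency flattens out after 30% of the GPU.
example_profile = [(10, 100.0), (20, 55.0), (30, 40.0),
                   (40, 38.5), (50, 38.0), (100, 37.5)]
knee = find_knee(example_profile)
```

With this profile the heuristic settles on 30 percent, leaving the remaining GPU share free for other models to run concurrently.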
-
Challenges in Adapting ECH in TLS for Privacy Enhancement over the Internet
Authors:
Vinod S. Khandkar,
Manjesh K. Hanawal,
Sameer G Kulkarni
Abstract:
Security and privacy are crucial in modern Internet services. Transport Layer Security (TLS) has largely addressed the issue of security. However, information about the type of service being accessed is sent in plaintext in the initial handshake of vanilla TLS, potentially revealing the activity of users and compromising privacy. The "Encrypted ClientHello" (ECH) extension to TLS 1.3 overcomes this issue by masking all of the information that can potentially reveal the service type, thus addressing the privacy issues in TLS 1.3. However, we notice that Internet services tend to use different versions of TLS for application data (primary connection/channel) and supporting data (side channels), such as scheduling information, during the active session. Although many Internet services have migrated to TLS 1.3, we notice that this is true only for the primary connections, which do benefit from TLS 1.3, while the side channels continue to use lower versions of TLS (e.g., 1.2), which do not support ECH and continue to leak the type of service accessed. We demonstrate that privacy information leaked from the side channels can be used to affect performance on the primary channels, for example by blocking or throttling a specific service on the Internet. Our work demonstrates that adapting ECH on primary channels alone is not sufficient to prevent privacy leaks and attacks on primary channels. Further, we demonstrate that it is necessary for all of the associated side channels also to migrate to TLS 1.3 and adapt the ECH extension in order to offer complete privacy preservation.
Submitted 5 July, 2022;
originally announced July 2022.
-
An Empirical Study on Predictability of Software Code Smell Using Deep Learning Models
Authors:
Himanshu Gupta,
Tanmay G. Kulkarni,
Lov Kumar,
Lalita Bhanu Murthy Neti,
Aneesh Krishna
Abstract:
A code smell, like a bad smell, is a surface indication that something is tainted, in this case in software writing practices. It indicates a deeper problem that lies within the code, one that is apparent to experienced software developers who follow acceptable coding practices. Recent studies have observed that code containing code smells is more likely to change during the software development cycle. In this paper, we developed code smell prediction models using features extracted from source code to predict eight types of code smell. Our work also presents the application of data sampling techniques to handle the class imbalance problem and feature selection techniques to find relevant feature sets. Previous studies had made use of techniques such as Naive Bayes and Random Forest but had not explored deep learning methods to predict code smell. A total of 576 distinct Deep Learning models were trained using the features and datasets mentioned above. The study concluded that the deep learning models trained on data from the Synthetic Minority Oversampling Technique gave better results in terms of accuracy and AUC, with the accuracy of some models improving from 88.47 to 96.84.
Submitted 8 August, 2021;
originally announced August 2021.
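The Synthetic Minority Oversampling Technique mentioned in the abstract above creates new minority-class samples by interpolating between a minority sample and one of its nearest minority neighbours. The following is a minimal pure-Python sketch of that single step, not the authors' pipeline; the function name `smote_sample`, the default `k`, and the toy data are assumptions for illustration.

```python
import random


def smote_sample(minority, k=2, rng=None):
    """Create one synthetic point between a randomly chosen minority sample
    and one of its k nearest minority neighbours (Euclidean distance).

    minority: list of equal-length numeric tuples, at least two of them.
    """
    rng = rng or random.Random(0)
    base = rng.choice(minority)
    # Rank the other minority samples by squared distance to the base point.
    others = [p for p in minority if p is not base]
    others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
    neighbour = rng.choice(others[:k])
    gap = rng.random()  # interpolation factor in [0, 1)
    return tuple(a + gap * (b - a) for a, b in zip(base, neighbour))


# Toy minority class: the four corners of the unit square.
minority_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
synthetic = smote_sample(minority_points, rng=random.Random(42))
```

Because each synthetic point lies on a segment between two real minority samples, repeated sampling fills in the minority region rather than duplicating existing rows.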
-
Analyzing Open-Source Serverless Platforms: Characteristics and Performance
Authors:
Junfeng Li,
Sameer G. Kulkarni,
K. K. Ramakrishnan,
Dan Li
Abstract:
Serverless computing is increasingly popular because of its lower cost and easier deployment. Several cloud service providers (CSPs) offer serverless computing on their public clouds, but this may bring a risk of vendor lock-in. To avoid this limitation, many open-source serverless platforms have emerged to allow developers to freely deploy and manage functions on self-hosted clouds. However, building effective functions requires considerable expertise and a thorough comprehension of the platform frameworks and features that affect performance. It is a challenge for a service developer to differentiate between serverless platforms and select the appropriate one for different demands and scenarios. Thus, we elaborate on the frameworks and event processing models of four popular open-source serverless platforms and identify their salient idiosyncrasies. We analyze the root causes of performance differences between different service exporting and auto-scaling modes on those platforms. Further, we provide several insights for future work, such as auto-scaling and metric collection.
Submitted 4 June, 2021;
originally announced June 2021.
-
Spatial Sharing of GPU for Autotuning DNN models
Authors:
Aditya Dhakal,
Junguk Cho,
Sameer G. Kulkarni,
K. K. Ramakrishnan,
Puneet Sharma
Abstract:
GPUs are used for training, inference, and tuning of machine learning models. However, Deep Neural Networks (DNNs) vary widely in their ability to exploit the full power of high-performance GPUs. Spatial sharing of the GPU enables multiplexing several DNNs on the GPU and can improve GPU utilization, thus improving throughput and lowering latency. DNN models given just the right amount of GPU resources can still provide inference latency as low as when the entire GPU is dedicated to their inference task. One approach to improving DNN inference is tuning of the DNN model. Autotuning frameworks find the optimal low-level implementation for a certain target device based on the trained machine learning model, thus reducing the DNN's inference latency and increasing inference throughput. We observe an interdependency between the tuned model and its inference latency. A DNN model tuned with specific GPU resources provides the best inference latency when inferred with close to the same amount of GPU resources, whereas a model tuned with the maximum amount of the GPU's resources has poorer inference latency once the GPU resources are limited for inference. On the other hand, a model tuned with an appropriate amount of GPU resources still achieves good inference latency across a wide range of GPU resource availability. We explore the causes that impact the tuning of a model at different amounts of GPU resources. We present several techniques to maximize resource utilization and improve tuning performance. We enable controlled spatial sharing of the GPU to multiplex several tuning applications on the GPU. We scale the tuning server instances and shard the tuning model across multiple client instances for concurrent tuning of different operators of a model, achieving better GPU multiplexing. With our improvements, we decrease DNN autotuning time by up to 75 percent and increase throughput by a factor of 5.
Submitted 8 August, 2020;
originally announced August 2020.
-
Understanding Open Source Serverless Platforms: Design Considerations and Performance
Authors:
Junfeng Li,
Sameer G. Kulkarni,
K. K. Ramakrishnan,
Dan Li
Abstract:
Serverless computing is increasingly popular because of the promise of lower cost and the convenience it provides to users who do not need to focus on server management. This has resulted in the availability of a number of proprietary and open-source serverless solutions. We seek to understand how the performance of serverless computing depends on a number of design issues using several popular open-source serverless platforms. We identify the idiosyncrasies affecting performance (throughput and latency) for different open-source serverless platforms. Further, we observe that having only resource-based (CPU and memory) or only workload-based (requests per second (RPS) or concurrent requests) auto-scaling is inadequate to address the needs of serverless platforms.
Submitted 12 December, 2019; v1 submitted 18 November, 2019;
originally announced November 2019.
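The two auto-scaling signals contrasted in the abstract above can be made concrete with a small sketch. The resource-based rule follows the proportional form used by the Kubernetes Horizontal Pod Autoscaler (desired = ceil(current × observed / target)); the workload-based rule, the function names, and the numbers are assumptions for illustration and are not taken from the paper.

```python
import math


def replicas_for_workload(requests_per_second, target_rps_per_replica):
    """Workload-based scaling: enough replicas to keep each at its target RPS."""
    return max(1, math.ceil(requests_per_second / target_rps_per_replica))


def replicas_for_resources(current_replicas, cpu_percent, target_cpu_percent):
    """Resource-based scaling: proportional rule on observed CPU utilization."""
    return max(1, math.ceil(current_replicas * cpu_percent / target_cpu_percent))


# Made-up situation: heavy request load, but only moderately high CPU usage.
by_workload = replicas_for_workload(450, 100)
by_resources = replicas_for_resources(2, 90, 60)
```

Here the two signals disagree (5 replicas versus 3), which shows how a platform relying on only one of them can under- or over-provision.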
-
Deep Learning for Digital Text Analytics: Sentiment Analysis
Authors:
Reshma U,
Barathi Ganesh H B,
Mandar Kale,
Prachi Mankame,
Gouri Kulkarni
Abstract:
In today's scenario, imagining a world without negativity is unrealistic, as bad news spreads more virally than good news. Though it seems impractical in real life, this could be implemented by building a system that uses Machine Learning and Natural Language Processing techniques to identify news items with a negative shade and filter them out, delivering only the news with a positive shade (good news) to the end user. In this work, around two lakh (200,000) news items were used for training and testing with a combination of rule-based and data-driven approaches. VADER, along with a filtration method, was used as an annotation tool, followed by a statistical Machine Learning approach that used a Document Term Matrix (representation) and a Support Vector Machine (classification). Deep Learning algorithms (Doc2Vec) were then introduced to make the system reliable, culminating in a Convolutional Neural Network (CNN) that yielded better results than the other modules tested. It achieved a training accuracy of 96% and a test accuracy above 85% on internal and external news data.
Submitted 10 April, 2018;
originally announced April 2018.
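The Document Term Matrix representation named in the abstract above maps each document to a row of term counts over a shared vocabulary. A minimal sketch follows, assuming whitespace tokenisation and lowercasing; the abstract does not describe the authors' preprocessing, and the example headlines are made up.

```python
from collections import Counter


def document_term_matrix(docs):
    """Return (vocabulary, rows), where rows[i][j] counts how often
    vocabulary[j] occurs in docs[i]."""
    tokenised = [doc.lower().split() for doc in docs]
    vocabulary = sorted({tok for toks in tokenised for tok in toks})
    rows = []
    for toks in tokenised:
        counts = Counter(toks)
        # Counter returns 0 for terms that do not occur in this document.
        rows.append([counts[term] for term in vocabulary])
    return vocabulary, rows


vocabulary, matrix = document_term_matrix(["good news today", "bad news"])
```

Each row can then serve as the feature vector for a classifier such as a Support Vector Machine.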
-
Optimal routing in two-queue polling systems
Authors:
I. J. B. F. Adan,
V. G. Kulkarni,
N. Lee,
A. A. J Lefeber
Abstract:
We consider a polling system with two queues, exhaustive service, no switch-over times and exponential service times. The waiting cost depends on the position of the queue relative to the server: It costs a customer c per time unit to wait in the busy queue (where the server is) and d per time unit in the idle queue (where no server is). Customers arrive according to a Poisson process. We study the control problem of how arrivals should be routed to the two queues in order to minimize expected waiting costs and characterize individually and socially optimal routing policies under three scenarios of available information at decision epochs: no, partial and complete information. In the complete information case, we develop a new iterative algorithm to determine individually optimal policies, and show that such policies can be described by a switching curve. We conjecture that a linear switching curve is socially optimal, and prove that this policy is indeed optimal for the fluid version of the two-queue polling system.
Submitted 10 August, 2016;
originally announced August 2016.
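The cost structure and the switching-curve idea in the abstract above can be sketched as follows. The cost rate uses the per-customer costs c (busy queue) and d (idle queue) from the abstract; the linear routing rule, its `slope` and `intercept` parameters, and the direction of the comparison are hypothetical choices for illustration, not the policy derived in the paper.

```python
def waiting_cost_rate(n_busy, n_idle, c, d):
    """Total waiting cost per time unit: c per customer in the busy queue
    (where the server is) and d per customer in the idle queue."""
    return c * n_busy + d * n_idle


def route_arrival(n_busy, n_idle, slope, intercept):
    """Linear switching curve (illustrative): the line
    n_idle = slope * n_busy + intercept splits the state space into
    two regions, one per routing decision."""
    if n_idle < slope * n_busy + intercept:
        return "idle"
    return "busy"


cost = waiting_cost_rate(3, 2, c=1.0, d=2.0)
first_choice = route_arrival(4, 1, slope=0.5, intercept=0.0)
second_choice = route_arrival(2, 3, slope=0.5, intercept=0.0)
```

A policy of this form is fully described by two numbers, which is what makes a switching-curve characterisation attractive compared with tabulating a decision for every state.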