-
Design and Execution Challenges for Cybersecurity Serious Games: An Overview
Authors:
Gokul Jayakrishnan,
Vijayanand Banahatti,
Sachin Lodha
Abstract:
Serious games are increasingly being used in cybersecurity education to engage and educate users. Several studies of cybersecurity serious games have shown that they succeed in educating users, who also find them fun and engaging. Meanwhile, several studies have reported difficulty in identifying the real-life and long-term effects of these games. Based on our experience with enterprise cybersecurity games and games from recent literature, we discuss a few key challenges that must be considered while designing and evaluating serious games for cybersecurity awareness.
Submitted 17 July, 2023;
originally announced July 2023.
-
PhishMatch: A Layered Approach for Effective Detection of Phishing URLs
Authors:
Harshal Tupsamudre,
Sparsh Jain,
Sachin Lodha
Abstract:
Phishing attacks continue to be a significant threat on the Internet. Prior studies show that it is possible to determine whether a website is phishing or not just by analyzing its URL more carefully. A major advantage of the URL-based approach is that it can identify a phishing website even before the web page is rendered in the browser, thus avoiding other potential problems such as cryptojacking and drive-by downloads. However, traditional URL-based approaches have their limitations. Blacklist-based approaches are prone to zero-hour phishing attacks, advanced machine learning based approaches consume high resources, and other approaches send the URL to a remote server, which compromises the user's privacy. In this paper, we present a layered anti-phishing defense, PhishMatch, which is robust, accurate, inexpensive, and client-side. We design a space-time efficient Aho-Corasick algorithm for exact string matching and an n-gram based indexing technique for approximate string matching to detect various cybersquatting techniques in the phishing URL. To reduce false positives, we use a global whitelist and personalized user whitelists. We also determine the context in which the URL is visited and use that information to classify the input URL more accurately. The last component of PhishMatch involves a machine learning model and controlled search engine queries to classify the URL. A prototype plugin of PhishMatch, developed for the Chrome browser, was found to be fast and lightweight. Our evaluation shows that PhishMatch is both efficient and effective.
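As a rough illustration of n-gram indexing for approximate string matching, here is a minimal Python sketch. It is not PhishMatch's implementation: the function names, the `^`/`$` padding characters, the bigram size, and the 0.4 Jaccard threshold are all assumptions chosen for the example.

```python
from collections import defaultdict


def ngrams(name, n=2):
    """Return the set of character n-grams of a name, padded at both ends."""
    padded = f"^{name}$"  # hypothetical padding so prefixes and suffixes count
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def build_index(protected_names, n=2):
    """Map every n-gram to the protected names that contain it."""
    index = defaultdict(set)
    for name in protected_names:
        for gram in ngrams(name, n):
            index[gram].add(name)
    return index


def lookup(index, query, n=2, threshold=0.4):
    """Return (name, Jaccard score) pairs for names that resemble the query."""
    query_grams = ngrams(query, n)
    shared = defaultdict(int)
    # Only names sharing at least one n-gram with the query are ever scored.
    for gram in query_grams:
        for name in index.get(gram, ()):
            shared[name] += 1
    matches = []
    for name, count in shared.items():
        union = len(query_grams) + len(ngrams(name, n)) - count
        score = count / union
        if score >= threshold:
            matches.append((name, score))
    return sorted(matches, key=lambda pair: -pair[1])


index = build_index(["paypal", "google", "amazon"])
print(lookup(index, "paypa1"))   # a look-alike of "paypal"
print(lookup(index, "example"))  # unrelated name
```

The index means a query is compared only against names that share an n-gram with it, which is what keeps lookups cheap on the client side; a real system would tune the n-gram size and threshold against labeled data.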
Submitted 3 December, 2021;
originally announced December 2021.
-
Gradient-based Data Subversion Attack Against Binary Classifiers
Authors:
Rosni K Vasu,
Sanjay Seetharaman,
Shubham Malaviya,
Manish Shukla,
Sachin Lodha
Abstract:
Machine learning based data-driven technologies have shown impressive performance in a variety of application domains. Most enterprises use data from multiple sources to provide quality applications. The reliability of the external data sources raises concerns for the security of the machine learning techniques adopted. An attacker can tamper with the training or test datasets to subvert the predictions of models generated by these techniques. Data poisoning is one such attack wherein the attacker tries to degrade the performance of a classifier by manipulating the training data.
In this work, we focus on the label contamination attack, in which an attacker poisons the labels of data to compromise the functionality of the system. We develop Gradient-based Data Subversion strategies to achieve model degradation under the assumption that the attacker has limited knowledge of the victim model. We exploit the gradients of a differentiable convex loss function (residual errors) with respect to the predicted label as a warm-start and formulate different strategies to find a set of data instances to contaminate. Further, we analyze the transferability of attacks and the susceptibility of binary classifiers. Our experiments show that the proposed approach outperforms the baselines and is computationally efficient.
Submitted 31 May, 2021;
originally announced May 2021.
-
Influence Based Defense Against Data Poisoning Attacks in Online Learning
Authors:
Sanjay Seetharaman,
Shubham Malaviya,
Rosni KV,
Manish Shukla,
Sachin Lodha
Abstract:
Data poisoning is a type of adversarial attack on training data where an attacker manipulates a fraction of the data to degrade the performance of a machine learning model. Therefore, applications that rely on external data sources for training data are at a significantly higher risk. There are several known defensive mechanisms that can help in mitigating the threat from such attacks. For example, data sanitization is a popular defensive mechanism wherein the learner rejects those data points that are sufficiently far from the set of training instances. Prior work on data poisoning defense primarily focused on the offline setting, wherein all the data is assumed to be available for analysis. Defensive measures for online learning, where data points arrive sequentially, have not garnered similar interest.
In this work, we propose a defense mechanism to minimize the degradation caused by the poisoned training data on a learner's model in an online setup. Our proposed method utilizes an influence function, which is a classic technique in robust statistics. Further, we supplement it with the existing data sanitization methods for filtering out some of the poisoned data points. We study the effectiveness of our defense mechanism on multiple datasets and across multiple attack strategies against an online learner.
Submitted 24 April, 2021;
originally announced April 2021.
-
Passwords: Divided they Stand, United they Fall
Authors:
Harshal Tupsamudre,
Sachin Lodha
Abstract:
Today, offline attacks are one of the most severe threats to password security. These attacks have claimed millions of passwords from prominent websites including Yahoo, LinkedIn, Twitter, Sony, Adobe and many more. Therefore, as a preventive measure, it is necessary to gauge the offline guessing resistance of a password database and to help users choose secure passwords. The rule-based mechanisms that rely on minimum password length and different character classes are too naive to capture the intricate human behavior whereas those based on probabilistic models require the knowledge of an entire password distribution which is not always easy to learn. In this paper, we propose a space partition attack model which uses information from previous leaks, surveys, attacks and other sources to divide the password search space into non-overlapping partitions and learn partition densities. We prove that the expected success of a partition attacker is maximum if the resulting partitions are explored in decreasing order of density. We show that the proposed attack model is more general and various popular attack techniques including probabilistic-based, dictionary-based, grammar-based and brute-force are just different instances of a partition attacker. Later, we introduce bin attacker, another instance of a partition attacker, and measure the guessing resistance of real-world password databases. We demonstrate that the utilized search space is very small and as a result even a weak attacker can cause sufficient damage to the system. We prove that partition attacks can be countered only if partition densities are uniform. We use this result and propose a system that thwarts partition attacker by distributing users across different partitions. Finally, we demonstrate how some of the well-known password schemes can be adapted to help users in choosing passwords from the system assigned partitions.
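The claim that exploring partitions in decreasing order of density maximizes expected success can be illustrated with a small Python sketch. It rests on assumptions not taken from the paper: each partition is a hypothetical `(size, mass)` pair, where `mass` is the fraction of users whose password falls in that partition, density is `mass / size`, and passwords inside a partition are treated as equally likely.

```python
def expected_success(partitions, budget, by_density=True):
    """Expected fraction of accounts cracked with `budget` guesses.

    partitions: list of (size, mass) pairs (hypothetical representation).
    by_density: if True, explore partitions in decreasing order of density;
                otherwise explore them in the order given.
    """
    order = list(partitions)
    if by_density:
        order.sort(key=lambda p: p[1] / p[0], reverse=True)
    success = 0.0
    remaining = budget
    for size, mass in order:
        if remaining <= 0:
            break
        guesses = min(size, remaining)
        # Uniformity assumption: each guess in a partition hits mass/size.
        success += mass * guesses / size
        remaining -= guesses
    return success


# Illustrative numbers only: a small dense partition and two sparse ones.
partitions = [(1000, 0.2), (10, 0.5), (100, 0.3)]
print(expected_success(partitions, budget=60))                    # density order
print(expected_success(partitions, budget=60, by_density=False))  # given order
```

With a fixed guess budget, spending guesses on the densest partition first yields more expected hits per guess, which is also why uniform densities leave the attacker with no preferred order.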
Submitted 11 September, 2020; v1 submitted 7 September, 2020;
originally announced September 2020.
-
A Note on Cryptographic Algorithms for Private Data Analysis in Contact Tracing Applications
Authors:
Rajan M A,
Manish Shukla,
Sachin Lodha
Abstract:
Contact tracing is an important measure to counter the COVID-19 pandemic. In the early phase, many countries employed manual contact tracing to contain the rate of disease spread; however, it has many issues. The manual approach is cumbersome and time consuming, and it requires the active participation of a large number of people. To overcome these drawbacks, digital contact tracing has been proposed, which typically involves deploying a contact tracing application on people's mobile devices that can track their movements and close social interactions. While studies suggest that digital contact tracing is more effective than manual contact tracing, it has been observed that higher adoption rates of the contact tracing app may result in a better controlled epidemic. This also increases the confidence in the accuracy of the collected data and the subsequent analytics. One key reason for the low adoption rate of contact tracing applications is the concern about individual privacy. In fact, several studies report that contact tracing applications deployed in multiple countries are not privacy friendly and have the potential to be used for mass surveillance by the concerned governments. Hence, a privacy-respecting contact tracing application that can lead to highly effective, efficient contact tracing is the need of the hour. As part of this study, we focus on various cryptographic techniques that can help in addressing the Private Set Intersection problem, which lies at the heart of privacy-respecting contact tracing. We analyze the computation and communication complexities of these techniques under the typical client-server architecture utilized by contact tracing applications. Further, we evaluate those computation and communication complexity expressions for the Indian scenario and thus identify cryptographic techniques that can be more suitably deployed there.
Submitted 19 May, 2020;
originally announced May 2020.
-
Privacy Guidelines for Contact Tracing Applications
Authors:
Manish Shukla,
Rajan M A,
Sachin Lodha,
Gautam Shroff,
Ramesh Raskar
Abstract:
Contact tracing is a very powerful method to implement and enforce social distancing to avoid the spread of infectious diseases. The traditional approach to contact tracing is time consuming, manpower intensive, dangerous and prone to error due to fatigue or lack of skill. As a result, mobile-based applications for contact tracing have emerged. These applications primarily utilize a combination of GPS-based absolute location and Bluetooth-based relative location remitted from the user's smartphone to infer various insights. These applications have eased the task of contact tracing; however, they also have severe implications for the user's privacy, for example, mass surveillance, personal information leakage and the revelation of the user's behavioral patterns. This impact on the user's privacy leads to a trust deficit in these applications, and hence defeats their purpose.
In this work we discuss the various scenarios which a contact tracing application should be able to handle. We highlight the privacy handling of some of the prominent contact tracing applications. Additionally, we describe the various threat actors who can disrupt its working, misuse the end user's data, or hamper its mass adoption. Finally, we present privacy guidelines for contact tracing applications from the perspectives of different stakeholders. To the best of our knowledge, this is the first generic work which provides privacy guidelines for contact tracing applications.
Submitted 28 April, 2020;
originally announced April 2020.
-
Extended- Force vs Nudge : Comparing Users' Pattern Choices on SysPal and TinPal
Authors:
Harshal Tupsamudre,
Sukanya Vaddepalli,
Vijayanand Banahatti,
Sachin Lodha
Abstract:
Android's 3X3 graphical pattern lock scheme is one of the most widely used authentication methods on smartphone devices. However, users choose 3X3 patterns from a small subspace of all possible 389,112 patterns. The two recently proposed interfaces, SysPal by Cho et al. and TinPal by the authors, demonstrate that it is possible to influence users' 3X3 pattern choices by making small modifications to the existing interface. While SysPal forces users to include one, two or three system-assigned random dots in their pattern, TinPal employs a highlighting mechanism to inform users about the set of dots reachable from the currently selected dot. Both interfaces improved the security of 3X3 patterns without affecting usability, but no comparison between SysPal and TinPal was presented.
To address this gap, we conduct a new user study with 147 participants and collect patterns on three SysPal interfaces, 1-dot, 2-dot and 3-dot. We also consider original and TinPal patterns collected in our previous user study involving 99 participants. We compare patterns created on five different interfaces, original, TinPal, 1-dot, 2-dot and 3-dot, using a range of security and usability metrics including pattern length, stroke length, guessability, recall time and login attempts. Our study results show that participants in the TinPal group created significantly longer and more complex patterns than participants in the other four groups. Consequently, the guessing resistance of TinPal patterns was the highest among all groups. Further, we did not find any significant difference in memorability between patterns created in the TinPal group and the other groups.
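The 389,112 figure quoted above can be checked by brute-force enumeration. The Python sketch below assumes the standard Android rules, which the abstract does not spell out: a pattern connects 4 to 9 distinct dots, and a stroke may pass over an intermediate dot only if that dot has already been visited. The function name and the row-major dot numbering are choices made for this example.

```python
def count_patterns(min_len=4, max_len=9):
    """Count valid 3x3 lock patterns with min_len..max_len dots."""
    # Dots are numbered 0..8 in row-major order. For each pair of dots that
    # lie on a straight line with exactly one dot between them, record that
    # middle dot.
    middle = {}
    for a in range(9):
        for b in range(9):
            row_sum = a // 3 + b // 3
            col_sum = a % 3 + b % 3
            if a != b and row_sum % 2 == 0 and col_sum % 2 == 0:
                middle[(a, b)] = (row_sum // 2) * 3 + col_sum // 2

    def extend(current, visited, length):
        # Count the pattern ending here (if long enough) plus all extensions.
        total = 1 if length >= min_len else 0
        if length == max_len:
            return total
        for nxt in range(9):
            if nxt in visited:
                continue
            mid = middle.get((current, nxt))
            if mid is not None and mid not in visited:
                continue  # would jump over an unvisited dot
            total += extend(nxt, visited | {nxt}, length + 1)
        return total

    return sum(extend(start, {start}, 1) for start in range(9))


print(count_patterns())
```

Under these assumed rules the enumeration should reproduce the total stated in the abstract, which gives a concrete sense of the space from which users' pattern choices are drawn.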
Submitted 26 December, 2019; v1 submitted 9 December, 2019;
originally announced December 2019.
-
Stamp processing with examplar features
Authors:
Yash Bhalgat,
Mandar Kulkarni,
Shirish Karande,
Sachin Lodha
Abstract:
Document digitization is becoming increasingly crucial. In this work, we propose a shape-based approach for automatic stamp verification/detection in document images using unsupervised feature learning. Given a small set of training images, our algorithm learns an appropriate shape representation using unsupervised clustering. Experimental results demonstrate the effectiveness of our framework in challenging scenarios.
Submitted 16 September, 2016;
originally announced September 2016.
-
Enhanced Circuit Densities in Epitaxially Defined FinFETs (EDFinFETs) over FinFETs
Authors:
Sushant Mittal,
Aneesh Nainani,
M. C. Abraham,
Saurabh Lodha,
Udayan Ganguly
Abstract:
FinFET technology is prone to Line Edge Roughness (LER) based VT variation with scaling. To address this, we proposed an Epitaxially Defined (ED) FinFET (EDFinFET) as an alternative to the FinFET architecture for the 10 nm node and beyond. We showed by statistical simulations that EDFinFET reduces LER based VT variability by 90% and overall variability by 59%. However, EDFinFET consists of wider fins as the fin widths are not constrained by electrostatics and variability (cf. FinFETs have fin width ~ LG/3 where LG is gate-length). This indicates that EDFinFET based circuits may be less dense. In this study we show that wide fins enable taller fin heights. The ability to engineer multiple STI levels on tall fins enables different transistor widths (i.e. various W/Ls e.g. 1-10) in a single fin. This capability ensures that even though individual EDFinFET devices have ~2x larger footprints than FinFETs, EDFinFET may produce equal or higher circuit density for basic building blocks like inverters or NAND gates for W/Ls of 2 and higher.
Submitted 15 April, 2016;
originally announced April 2016.
-
Energy-Efficient Shortest Path Algorithms for Convergecast in Sensor Networks
Authors:
John Augustine,
Qi Han,
Philip Loden,
Sachin Lodha,
Sasanka Roy
Abstract:
We introduce a variant of the capacitated vehicle routing problem that is encountered in sensor networks for scientific data collection. Consider an undirected graph $G=(V \cup \{\mathbf{sink}\},E)$. Each vertex $v \in V$ holds a constant-sized reading normalized to 1 byte that needs to be communicated to the $\mathbf{sink}$. The communication protocol is defined such that readings travel in packets. The packets have a capacity of $k$ bytes. We define a {\em packet hop} to be the communication of a packet from a vertex to its neighbor. Each packet hop drains one unit of energy and therefore, we need to communicate the readings to the $\mathbf{sink}$ with the fewest number of hops.
We show this problem to be NP-hard and counter it with a simple distributed $(2-\frac{3}{2k})$-approximation algorithm called {\tt SPT} that uses the shortest path tree rooted at the $\mathbf{sink}$. We also show that {\tt SPT} is absolutely optimal when $G$ is a tree and asymptotically optimal when $G$ is a grid. Furthermore, {\tt SPT} has two nice properties. Firstly, the readings always travel along a shortest path toward the $\mathbf{sink}$, which makes it an appealing solution to the convergecast problem as it fits the natural intuition. Secondly, each node employs a very elementary packing strategy. Given all the readings that enter into the node, it sends out as many fully packed packets as possible followed by at most 1 partial packet. We show that any solution that has either one of the two properties cannot be a $(2-ε)$-approximation, for any fixed $ε > 0$. This makes {\tt SPT} optimal for the class of algorithms that obey either one of those properties.
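The packing strategy described above can be sketched in Python to count packet hops. This is an illustrative, centralized calculation and not the authors' distributed algorithm: the function name, the adjacency-dictionary input, and the use of breadth-first search to pick one shortest path tree (ties broken by visiting order) are assumptions.

```python
from collections import deque


def spt_packet_hops(adj, sink, k):
    """Total packet hops when readings follow a BFS shortest path tree.

    adj:  dict mapping each vertex to a list of neighbours (undirected graph).
    sink: the vertex that collects all readings.
    k:    packet capacity in readings (1 byte each).
    """
    # Breadth-first search from the sink yields a shortest path tree for an
    # unweighted graph; `order` lists vertices by non-decreasing depth.
    parent = {sink: None}
    order = [sink]
    queue = deque([sink])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                order.append(v)
                queue.append(v)

    readings = {v: 1 for v in order if v != sink}  # each vertex holds one reading
    hops = 0
    # Deepest vertices first, so a vertex has received everything from its
    # subtree before it forwards to its parent.
    for v in reversed(order):
        if v == sink:
            continue
        # Full packets plus at most one partial packet: ceil(readings / k).
        hops += -(-readings[v] // k)
        if parent[v] != sink:
            readings[parent[v]] += readings[v]
    return hops


# A path sink - a - b - c with packets that hold two readings.
path = {"sink": ["a"], "a": ["sink", "b"], "b": ["a", "c"], "c": ["b"]}
print(spt_packet_hops(path, "sink", k=2))
```

Each non-sink vertex forwards the readings of its whole subtree, so its contribution is the ceiling of its subtree size divided by $k$; summing these contributions over the tree gives the total energy cost in packet hops.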
Submitted 20 February, 2009;
originally announced February 2009.
-
Approximation Algorithms for Shortest Descending Paths in Terrains
Authors:
Mustaq Ahmed,
Sandip Das,
Sachin Lodha,
Anna Lubiw,
Anil Maheshwari,
Sasanka Roy
Abstract:
A path from s to t on a polyhedral terrain is descending if the height of a point p never increases while we move p along the path from s to t. No efficient algorithm is known to find a shortest descending path (SDP) from s to t in a polyhedral terrain. We give two approximation algorithms (more precisely, FPTASs) that solve the SDP problem on general terrains. Both algorithms are simple, robust and easy to implement.
Submitted 9 May, 2008;
originally announced May 2008.