-
Systematic Review: Anomaly Detection in Connected and Autonomous Vehicles
Authors:
J. R. V. Solaas,
N. Tuptuk,
E. Mariconti
Abstract:
This systematic review focuses on anomaly detection for connected and autonomous vehicles. The initial database search identified 2160 articles, of which 203 were included in this review after rigorous screening and assessment. This study revealed that the most commonly used Artificial Intelligence (AI) algorithms employed in anomaly detection are neural networks like LSTM, CNN, and autoencoders, alongside one-class SVM. Most anomaly-based models were trained using real-world operational vehicle data, although anomalies, such as attacks and faults, were often injected artificially into the datasets. These models were mostly evaluated using five key metrics: recall, accuracy, precision, F1-score, and false positive rate. The most frequently used combination of metrics was accuracy, precision, recall, and F1-score. This systematic review presents several recommendations. First, there is a need to incorporate multiple evaluation metrics to provide a comprehensive assessment of the anomaly detection models. Second, only a small proportion of the studies have made their models open source, indicating a need to share models publicly to facilitate collaboration within the research community, and to validate and compare findings effectively. Third, there is a need for benchmarking datasets with predefined anomalies or cyberattacks to test and improve the effectiveness of the proposed anomaly-based detection models. Furthermore, there is a need for future research to investigate the deployment of anomaly detection in a vehicle to assess its performance on the road. There is also a notable lack of research on intrusion detection systems for protocols other than CAN, such as Ethernet and FlexRay.
Submitted 4 May, 2024;
originally announced May 2024.
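The five metrics named in the abstract above can all be derived from the four confusion-matrix counts. The following is a minimal sketch, not code from the review; the function name `detection_metrics` and the example counts are hypothetical and chosen only to show why the review recommends reporting several metrics together.

```python
def detection_metrics(tp, fp, tn, fn):
    """Compute five common anomaly-detection metrics from confusion-matrix counts."""
    def ratio(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
        "false_positive_rate": ratio(fp, fp + tn),
    }


# Hypothetical, imbalanced test set: 1000 samples, 50 of them anomalous.
metrics = detection_metrics(tp=40, fp=20, tn=930, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts, accuracy is high because normal samples dominate, while precision is noticeably lower; a single metric would hide that gap.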
-
Gotta Assess `Em All: A Risk Analysis of Criminal Offenses Facilitated through PokemonGO
Authors:
Ashly Fuller,
Martin Lo,
Angelica Holmes,
Lu Lemanski,
Marie Vasek,
Enrico Mariconti
Abstract:
Location-based games have come to the forefront of popularity in casual and mobile gaming over the past six years. However, there is no hard data on crimes that these games enable, ranging from assault to cyberstalking to grooming. Given these potential harms, we conduct a risk assessment and quasi-experiment on the game features of location-based games. Using PokemonGO as a case study, we identify and establish cyber-enabled stalking as the main risk event where in-game features such as an innocent function to share in-game postcards can be exploited by malicious users. Users obtain postcards that are unique to each Pokestop and represent gifts that can be shared with in-game friends. The number of postcards that each user can retain is limited, so they send the excess to their friends with items that boost their friends' game activities. The postcard often also unintentionally leaks the users' commonly visited locations to their in-game friends. We analyze these in-game features using risk assessment and identify cyber-enabled stalking as one of the main threats. We further evaluate the feasibility of this crime through a quasi-experiment. Our results show that participants' routine locations such as home and work can be reliably re-identified within days from the first gift exchange. This exploitation of a previously unconsidered in-game feature enables physical stalking of previously unknown persons which can escalate into more serious crimes. Given current data protection legislation in Europe, further preventive measures are required by Niantic to protect pseudonymized users from being re-identified by in-game features and (potentially) stalked.
Submitted 6 April, 2023;
originally announced April 2023.
-
Waiting for Q: An Exploration of QAnon Users' Online Migration to Poal in the Wake of Voat's Demise
Authors:
Antonis Papasavva,
Enrico Mariconti
Abstract:
Online communities are groups of people who interact primarily via the Internet, often sharing common interests. Some of these groups, particularly supporters of Q who created the far-right conspiracy theory known as QAnon, are highly toxic and controversial. These communities are often banned from various mainstream online social networks due to their controversy. This study examines the deplatforming and subsequent migrations of QAnon adherents, following a two-step process. We analyze Reddit data, finding that users opt for Voat as an alternative following the Reddit bans, particularly influenced by Q's postings on 4chan. Subsequently, upon Voat's shutdown announcement, we observe users recommending Poal. Among several insights, we compare the effects of abrupt permanent bans and announced shutdowns on the migration patterns of these conspiracists. Specifically, we find that almost half of Poal's active users are Voat migrants who registered after the shutdown was announced. This contradicts the patterns observed after the Reddit bans, suggesting that advance warning can facilitate more coordinated migrations. Lastly, our research uncovers evidence of discussions and planning related to the January 6th, 2021, attack on the US Capitol, which emerged shortly after Voat's shutdown, predominantly on Poal. This underscores the continued activity of the conspiracy, albeit at a diminished scale due to various bans and a shutdown, while also exposing Poal as a platform that hosts dangerous individuals.
Submitted 25 May, 2024; v1 submitted 2 February, 2023;
originally announced February 2023.
-
Cerberus: Exploring Federated Prediction of Security Events
Authors:
Mohammad Naseri,
Yufei Han,
Enrico Mariconti,
Yun Shen,
Gianluca Stringhini,
Emiliano De Cristofaro
Abstract:
Modern defenses against cyberattacks increasingly rely on proactive approaches, e.g., to predict the adversary's next actions based on past events. Building accurate prediction models requires knowledge from many organizations; alas, this entails disclosing sensitive information, such as network structures, security postures, and policies, which might often be undesirable or outright impossible. In this paper, we explore the feasibility of using Federated Learning (FL) to predict future security events. To this end, we introduce Cerberus, a system enabling collaborative training of Recurrent Neural Network (RNN) models for participating organizations. The intuition is that FL could potentially offer a middle-ground between the non-private approach where the training data is pooled at a central server and the low-utility alternative of only training local models. We instantiate Cerberus on a dataset obtained from a major security company's intrusion prevention product and evaluate it vis-a-vis utility, robustness, and privacy, as well as how participants contribute to and benefit from the system. Overall, our work sheds light on both the positive aspects and the challenges of using FL for this task and paves the way for deploying federated approaches to predictive security.
Submitted 7 September, 2022;
originally announced September 2022.
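The abstract above does not say which aggregation rule Cerberus uses. As an illustration of the general idea of federated training, the sketch below assumes plain federated averaging (FedAvg), where a server combines locally trained parameters weighted by each participant's number of training samples. The function name, the flat parameter lists, and the numbers are all hypothetical; a real RNN would have many tensors per model.

```python
def federated_average(client_weights, client_sizes):
    """Weighted average of client parameter vectors (FedAvg-style aggregation).

    client_weights: list of equal-length lists of floats, one per organization.
    client_sizes:   number of local training samples behind each vector.
    """
    if len(client_weights) != len(client_sizes):
        raise ValueError("one size is required per client")
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(weights[i] * size for weights, size in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]


# Three hypothetical organizations; raw training data never leaves them,
# only the locally updated parameters are shared with the server.
local_updates = [[0.2, 1.0], [0.4, 0.0], [0.6, 2.0]]
local_sizes = [100, 300, 600]
global_weights = federated_average(local_updates, local_sizes)
print(global_weights)
```

Weighting by sample count means larger participants influence the shared model more, which is one of the contribution and benefit questions the abstract says the paper evaluates.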
-
Shedding Light on the Targeted Victim Profiles of Malicious Downloaders
Authors:
François Labrèche,
Enrico Mariconti,
Gianluca Stringhini
Abstract:
Malware affects millions of users worldwide, impacting the daily lives of many people as well as businesses. Malware infections are increasing in complexity and unfold over a number of stages. A malicious downloader often acts as the starting point as it fingerprints the victim's machine and downloads one or more additional malware payloads. Although previous research was conducted on these malicious downloaders and their Pay-Per-Install networks, limited work has investigated how the profile of the victim machine, e.g., its characteristics and software configuration, affects the targeting choice of cybercriminals.
In this paper, we conduct a large-scale investigation of the relation between the machine profile and the payload downloaded by droppers, through 151,189 executions of malware downloaders over a period of 12 months. We build a fully automated framework which uses Virtual Machines (VMs) in sandboxes to build custom user and machine profiles to test our malicious samples. We then use changepoint analysis to model the behavior of different downloader families, and perform analyses of variance (ANOVA) on the ratio of infections per profile. With this, we identify which machine profile is targeted by cybercriminals at different points in time.
Our results show that several downloaders behave differently depending on specific features of a machine. Notably, a higher number of infections for specific malware families was observed when using different browser profiles, keyboard layouts and operating systems, while one keyboard layout obtained fewer infections of a specific malware family.
Our findings shed light on the importance of the features of a machine running malicious downloader software, particularly for malware research.
Submitted 28 August, 2022;
originally announced August 2022.
-
MaMaDroid2.0 -- The Holes of Control Flow Graphs
Authors:
Harel Berger,
Chen Hajaj,
Enrico Mariconti,
Amit Dvir
Abstract:
Android malware is a continuously expanding threat to billions of mobile users around the globe. Detection systems are updated constantly to address these threats. However, a backlash takes the form of evasion attacks, in which an adversary changes malicious samples such that those samples will be misclassified as benign. This paper fully inspects a well-known Android malware detection system, MaMaDroid, which analyzes the control flow graph of the application. Changes to the proportion of benign samples in the training set and to the models are considered to see their effect on the classifier. The changes in the ratio between benign and malicious samples have a clear effect on each one of the models, resulting in a decrease of more than 40% in their detection rate. Moreover, adopted ML models are implemented as well, including 5-NN, Decision Tree, and Adaboost. Exploration of the six models reveals a typical behavior in different cases, of tree-based models and distance-based models. Moreover, three novel attacks that manipulate the CFG and their detection rates are described for each one of the targeted models. The attacks decrease the detection rate of most of the models to 0%, with regard to different ratios of benign to malicious apps. As a result, a new version of MaMaDroid is engineered. This model fuses the CFG of the app and static analysis of features of the app. This improved model is shown to be robust against evasion attacks targeting both CFG-based models and static analysis models, achieving a detection rate of more than 90% against each one of the attacks.
Submitted 28 February, 2022;
originally announced February 2022.
-
Tiresias: Predicting Security Events Through Deep Learning
Authors:
Yun Shen,
Enrico Mariconti,
Pierre-Antoine Vervier,
Gianluca Stringhini
Abstract:
With the increased complexity of modern computer attacks, there is a need for defenders not only to detect malicious activity as it happens, but also to predict the specific steps that will be taken by an adversary when performing an attack. However, this is still an open research problem, and previous research in predicting malicious events only looked at binary outcomes (e.g., whether an attack would happen or not), but not at the specific steps that an attacker would undertake. To fill this gap, we present Tiresias, a system that leverages Recurrent Neural Networks (RNNs) to predict future events on a machine, based on previous observations. We test Tiresias on a dataset of 3.4 billion security events collected from a commercial intrusion prevention system, and show that our approach is effective in predicting the next event that will occur on a machine with a precision of up to 0.93. We also show that the models learned by Tiresias are reasonably stable over time, and provide a mechanism that can identify sudden drops in precision and trigger a retraining of the system. Finally, we show that the long-term memory typical of RNNs is key in performing event prediction, rendering simpler methods not up to the task.
Submitted 24 May, 2019;
originally announced May 2019.
-
"You Know What to Do": Proactive Detection of YouTube Videos Targeted by Coordinated Hate Attacks
Authors:
Enrico Mariconti,
Guillermo Suarez-Tangil,
Jeremy Blackburn,
Emiliano De Cristofaro,
Nicolas Kourtellis,
Ilias Leontiadis,
Jordi Luque Serrano,
Gianluca Stringhini
Abstract:
Video sharing platforms like YouTube are increasingly targeted by aggression and hate attacks. Prior work has shown how these attacks often take place as a result of "raids," i.e., organized efforts by ad-hoc mobs coordinating from third-party communities. Despite the increasing relevance of this phenomenon, however, online services often lack effective countermeasures to mitigate it. Unlike well-studied problems like spam and phishing, coordinated aggressive behavior both targets and is perpetrated by humans, making defense mechanisms that look for automated activity unsuitable. Therefore, the de-facto solution is to reactively rely on user reports and human moderation.
In this paper, we propose an automated solution to identify YouTube videos that are likely to be targeted by coordinated harassers from fringe communities like 4chan. First, we characterize and model YouTube videos along several axes (metadata, audio transcripts, thumbnails) based on a ground truth dataset of videos that were targeted by raids. Then, we use an ensemble of classifiers to determine the likelihood that a video will be raided with very good results (AUC up to 94%). Overall, our work provides an important first step towards deploying proactive systems to detect and mitigate coordinated hate attacks on platforms like YouTube.
Submitted 23 August, 2019; v1 submitted 21 May, 2018;
originally announced May 2018.
-
A Family of Droids -- Android Malware Detection via Behavioral Modeling: Static vs Dynamic Analysis
Authors:
Lucky Onwuzurike,
Mario Almeida,
Enrico Mariconti,
Jeremy Blackburn,
Gianluca Stringhini,
Emiliano De Cristofaro
Abstract:
Following the increasing popularity of mobile ecosystems, cybercriminals have increasingly targeted them, designing and distributing malicious apps that steal information or cause harm to the device's owner. Aiming to counter them, detection techniques based on either static or dynamic analysis that model Android malware have been proposed. While the pros and cons of these analysis techniques are known, they are usually compared in the context of their limitations, e.g., static analysis is not able to capture runtime behaviors, full code coverage is usually not achieved during dynamic analysis, etc. In this paper, instead, we analyze the performance of static and dynamic analysis methods in the detection of Android malware and attempt to compare them in terms of their detection performance, using the same modeling approach.
To this end, we build on MaMaDroid, a state-of-the-art detection system that relies on static analysis to create a behavioral model from the sequences of abstracted API calls. Then, aiming to apply the same technique in a dynamic analysis setting, we modify CHIMP, a platform recently proposed to crowdsource human inputs for app testing, in order to extract API call sequences from the traces produced while executing the app on a CHIMP virtual device. We call this system AuntieDroid and instantiate it by using both automated (Monkey) and user-generated inputs. We find that combining both static and dynamic analysis yields the best performance, with F-measure reaching 0.92. We also show that static analysis is at least as effective as dynamic analysis, depending on how apps are stimulated during execution, and, finally, investigate the reasons for inconsistent misclassifications across methods.
Submitted 13 July, 2018; v1 submitted 9 March, 2018;
originally announced March 2018.
-
MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral Models (Extended Version)
Authors:
Lucky Onwuzurike,
Enrico Mariconti,
Panagiotis Andriotis,
Emiliano De Cristofaro,
Gordon Ross,
Gianluca Stringhini
Abstract:
As Android has become increasingly popular, so has malware targeting it, thus pushing the research community to propose different detection techniques. However, the constant evolution of the Android ecosystem, and of malware itself, makes it hard to design robust tools that can operate for long periods of time without the need for modifications or costly re-training. Aiming to address this issue, we set out to detect malware from a behavioral point of view, modeled as the sequence of abstracted API calls. We introduce MaMaDroid, a static-analysis based system that abstracts the API calls performed by an app to their class, package, or family, and builds a model from their sequences obtained from the call graph of an app as Markov chains. This ensures that the model is more resilient to API changes and the feature set is of manageable size. We evaluate MaMaDroid using a dataset of 8.5K benign and 35.5K malicious apps collected over a period of six years, showing that it effectively detects malware (with up to 0.99 F-measure) and keeps its detection capabilities for long periods of time (up to 0.87 F-measure two years after training). We also show that MaMaDroid remarkably outperforms DroidAPIMiner, a state-of-the-art detection system that relies on the frequency of (raw) API calls. Aiming to assess whether MaMaDroid's effectiveness mainly stems from the API abstraction or from the sequencing modeling, we also evaluate a variant of it that uses the frequency (instead of sequences) of abstracted API calls. We find that it is not as accurate, failing to capture maliciousness when trained on malware samples that include API calls that are equally or more frequently used by benign apps.
Submitted 2 March, 2019; v1 submitted 20 November, 2017;
originally announced November 2017.
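The modeling step described in the abstract above (abstract each API call, then treat the resulting sequence as a Markov chain whose transition probabilities become features) can be sketched roughly as follows. This is an illustrative simplification, not the authors' implementation: `abstract_to_family` is a hypothetical stand-in that keeps only the first dotted component of a call, and the input is a flat, made-up call sequence rather than sequences extracted from an app's call graph.

```python
from collections import Counter, defaultdict


def abstract_to_family(api_call):
    """Hypothetical abstraction: map a fully qualified call to its top-level name."""
    return api_call.split(".", 1)[0]


def markov_features(call_sequence):
    """Estimate transition probabilities between abstracted states.

    Returns a dict mapping (source_state, destination_state) to the fraction
    of transitions leaving source_state that go to destination_state.
    """
    states = [abstract_to_family(call) for call in call_sequence]
    counts = defaultdict(Counter)
    for src, dst in zip(states, states[1:]):
        counts[src][dst] += 1

    probabilities = {}
    for src, destinations in counts.items():
        total = sum(destinations.values())
        for dst, n in destinations.items():
            probabilities[(src, dst)] = n / total
    return probabilities


# Made-up call sequence, for illustration only.
calls = [
    "java.lang.String.length",
    "android.util.Log.d",
    "java.io.File.exists",
    "android.telephony.SmsManager.sendTextMessage",
    "android.util.Log.d",
]
features = markov_features(calls)
for transition, probability in sorted(features.items()):
    print(transition, probability)
```

Because the states are coarse abstractions rather than raw method names, the number of possible transitions stays small, which is the manageable feature-set size the abstract refers to. The resulting probabilities would then be fed to a classifier.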
-
What's in a Name? Understanding Profile Name Reuse on Twitter
Authors:
Enrico Mariconti,
Jeremiah Onaolapo,
Syed Sharique Ahmad,
Nicolas Nikiforou,
Manuel Egele,
Nick Nikiforakis,
Gianluca Stringhini
Abstract:
Users on Twitter are commonly identified by their profile names. These names are used when directly addressing users on Twitter, are part of their profile page URLs, and can become a trademark for popular accounts, with people referring to celebrities by their real name and their profile name, interchangeably. Twitter, however, has chosen to not permanently link profile names to their corresponding user accounts. In fact, Twitter allows users to change their profile name, and afterwards makes the old profile names available for other users to take. In this paper, we provide a large-scale study of the phenomenon of profile name reuse on Twitter. We show that this phenomenon is not uncommon, investigate the dynamics of profile name reuse, and characterize the accounts that are involved in it. We find that many of these accounts adopt abandoned profile names for questionable purposes, such as spreading malicious content, and using the profile name's popularity for search engine optimization. Finally, we show that this problem is not unique to Twitter (as other popular online social networks also release profile names) and argue that the risks involved with profile-name reuse outnumber the advantages provided by this feature.
Submitted 14 February, 2017;
originally announced February 2017.
-
MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral Models
Authors:
Enrico Mariconti,
Lucky Onwuzurike,
Panagiotis Andriotis,
Emiliano De Cristofaro,
Gordon Ross,
Gianluca Stringhini
Abstract:
The rise in popularity of the Android platform has resulted in an explosion of malware threats targeting it. As both Android malware and the operating system itself constantly evolve, it is very challenging to design robust malware mitigation techniques that can operate for long periods of time without the need for modifications or costly re-training. In this paper, we present MaMaDroid, an Android malware detection system that relies on app behavior. MaMaDroid builds a behavioral model, in the form of a Markov chain, from the sequence of abstracted API calls performed by an app, and uses it to extract features and perform classification. By abstracting calls to their packages or families, MaMaDroid maintains resilience to API changes and keeps the feature set size manageable. We evaluate its accuracy on a dataset of 8.5K benign and 35.5K malicious apps collected over a period of six years, showing that it not only effectively detects malware (with up to 99% F-measure), but also that the model built by the system keeps its detection capabilities for long periods of time (on average, 86% and 75% F-measure, respectively, one and two years after training). Finally, we compare against DroidAPIMiner, a state-of-the-art system that relies on the frequency of API calls performed by apps, showing that MaMaDroid significantly outperforms it.
Submitted 20 November, 2017; v1 submitted 13 December, 2016;
originally announced December 2016.