Search | arXiv e-print repository

Majority Voting Approach to Ransomware Detection

Authors: Simon R Davies, Richard Macfarlane, William J Buchanan

Abstract: Crypto-ransomware remains a significant threat to governments and companies alike, with high-profile cyber security incidents regularly making headlines. Many different detection systems have been proposed as solutions to the ever-changing dynamic landscape of ransomware detection. In the majority of cases, these described systems propose a method based on the result of a single test performed on either the executable code, the process under investigation, its behaviour, or its output. In a small subset of ransomware detection systems, the concept of a scorecard is employed where multiple tests are performed on various aspects of a process under investigation and their results are then analysed using machine learning. The purpose of this paper is to propose a new majority voting approach to ransomware detection by developing a method that uses a cumulative score derived from discrete tests based on calculations using algorithmic rather than heuristic techniques. The paper describes 23 candidate tests, as well as 9 Windows API tests, which are validated to determine both their accuracy and viability for use within a ransomware detection system. Using a cumulative score calculation approach to ransomware detection has several benefits, such as immunity to the occasional inaccuracy of individual tests when making its final classification. The system can also leverage multiple tests that can be both comprehensive and complementary in an attempt to achieve a broader, deeper, and more robust analysis of the program under investigation. Additionally, the use of multiple collaborative tests also significantly hinders ransomware from masking or modifying its behaviour in an attempt to bypass detection.
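The cumulative-score idea described in this abstract can be illustrated with a minimal sketch. The three byte-level tests below, their thresholds, and the function names are hypothetical stand-ins chosen for illustration; they are not the 23 candidate tests or the Windows API tests from the paper.

```python
from typing import Callable, Sequence

# Hypothetical stand-in tests: each returns True when the data "looks encrypted".
def high_byte_diversity(data: bytes) -> bool:
    # Encrypted data tends to use almost every possible byte value.
    return len(set(data)) > 200

def mostly_non_printable(data: bytes) -> bool:
    # Fewer than half of the bytes fall in the printable ASCII range.
    printable = sum(1 for b in data if 32 <= b <= 126)
    return printable / len(data) < 0.5

def mean_near_midpoint(data: bytes) -> bool:
    # Uniformly distributed bytes have a mean close to 127.5.
    return abs(sum(data) / len(data) - 127.5) < 10

def cumulative_score(data: bytes, tests: Sequence[Callable[[bytes], bool]]) -> int:
    # Each test contributes one vote; the score is the number of positive votes.
    return sum(1 for test in tests if test(data))

def majority_vote(data: bytes, tests: Sequence[Callable[[bytes], bool]]) -> bool:
    # Flag the data only when more than half of the tests agree.
    return cumulative_score(data, tests) * 2 > len(tests)

TESTS = [high_byte_diversity, mostly_non_printable, mean_near_midpoint]
```

Because the final decision depends on the vote count, a single test giving the wrong answer does not by itself change the classification, which is the robustness property the abstract highlights.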

Submitted 30 May, 2023; originally announced May 2023.

Comments: 17 pages

arXiv:2210.13376 [pdf, other]

doi 10.3390/e24101503

Comparison of Entropy Calculation Methods for Ransomware Encrypted File Identification

Authors: Simon R Davies, Richard Macfarlane, William J. Buchanan

Abstract: Ransomware is a malicious class of software that utilises encryption to implement an attack on system availability. The target's data remains encrypted and is held captive by the attacker until a ransom demand is met. A common approach used by many crypto-ransomware detection techniques is to monitor file system activity and attempt to identify encrypted files being written to disk, often using a file's entropy as an indicator of encryption. However, often in the description of these techniques, little or no discussion is made as to why a particular entropy calculation technique is selected or any justification given as to why one technique is selected over the alternatives. The Shannon method of entropy calculation is the most commonly used technique when it comes to file encryption identification in crypto-ransomware detection techniques. Overall, correctly encrypted data should be indistinguishable from random data, so apart from the standard mathematical entropy calculations such as Chi-Square, Shannon Entropy and Serial Correlation, the test suites used to validate the output from pseudo-random number generators would also be suited to perform this analysis. The hypothesis is that there is a fundamental difference between different entropy methods and that the best methods may be used to better detect ransomware encrypted files. The paper compares the accuracy of 53 distinct tests in being able to differentiate between encrypted data and other file types. The testing is broken down into two phases: the first phase identifies potential candidate tests, and the second phase thoroughly evaluates these candidates. To ensure that the tests were sufficiently robust, the NapierOne dataset is used. This dataset contains thousands of examples of the most commonly used file types, as well as examples of files that have been encrypted by crypto-ransomware.
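For readers unfamiliar with the Shannon method mentioned in this abstract, a minimal sketch of byte-level Shannon entropy follows. The function name is an assumption made for illustration, and the sketch does not reproduce any of the 53 tests evaluated in the paper.

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    n = len(data)
    # H = -sum(p_i * log2(p_i)) over the observed byte values.
    return -sum((count / n) * math.log2(count / n)
                for count in Counter(data).values())
```

Values close to 8 bits per byte suggest encrypted or compressed content, which is why entropy alone struggles to separate those two classes.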

Submitted 24 October, 2022; originally announced October 2022.

Journal ref: Entropy. 2022; 24(10):1503

arXiv:2201.08154 [pdf, other]

doi 10.1016/j.fsidi.2021.301330

NapierOne: A modern mixed file data set alternative to Govdocs1

Authors: Simon R Davies, Richard Macfarlane, William J Buchanan

Abstract: It was found when reviewing the ransomware detection research literature that almost no proposal provided enough detail on how the test data set was created, or sufficient description of its actual content, to allow it to be recreated by other researchers interested in reconstructing their environment and validating the research results. A modern cybersecurity mixed file data set called NapierOne is presented, primarily aimed at, but not limited to, ransomware detection and forensic analysis research. NapierOne was designed to address this deficiency in reproducibility and improve consistency by facilitating research replication and repeatability. The methodology used in the creation of this data set is also described in detail. The data set was inspired by the Govdocs1 data set and it is intended that NapierOne be used as a complement to this original data set. An investigation was performed with the goal of determining the common file types currently in use. No specific research was found that explicitly provided this information, so an alternative consensus approach was employed. This involved combining the findings from multiple sources of file type usage into an overall ranked list. Subsequently, 5000 real-world example files were gathered, and a specific data subset created, for each of the common file types identified. In some circumstances, multiple data subsets were created for a specific file type, each subset representing a specific characteristic for that file type. For example, there are multiple data subsets for the ZIP file type with each subset containing examples of a specific compression method. Ransomware execution tends to produce files that have high entropy, so examples of file types that naturally have this attribute are also present.

Submitted 20 January, 2022; originally announced January 2022.

Journal ref: Forensic Science International: Digital Investigation, Volume 40, 2022, 301330, ISSN 2666-2817

arXiv:2106.14418 [pdf, other]

doi 10.1016/j.cose.2021.102377

Differential Area Analysis for Ransomware Attack Detection within Mixed File Datasets

Authors: Simon R Davies, Richard Macfarlane, William J Buchanan

Abstract: The threat from ransomware continues to grow both in the number of affected victims as well as the cost incurred by the people and organisations impacted in a successful attack. In the majority of cases, once a victim has been attacked there remain only two courses of action open to them: either pay the ransom or lose their data. One common behaviour shared between all crypto ransomware strains is that at some point during their execution they will attempt to encrypt the users' files. Previous research (Penrose et al., 2013; Zhao et al., 2011) has highlighted the difficulty in differentiating between compressed and encrypted files using Shannon entropy as both file types exhibit similar values. One of the experiments described in this paper shows a unique characteristic for the Shannon entropy of encrypted file header fragments. This characteristic was used to differentiate between encrypted files and other high entropy files such as archives. This discovery was leveraged in the development of a file classification model that used the differential area between the entropy curve of a file under analysis and one generated from random data. When comparing the entropy plot values of a file under analysis against one generated by a file containing purely random numbers, the greater the correlation of the plots is, the higher the confidence that the file under analysis contains encrypted data.
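The differential-area idea in this abstract can be sketched roughly as follows. The fragment step of 8 bytes, the 256-byte header length, the trapezoidal integration, and all function names are assumptions made for illustration; the paper's actual parameters and model are not given in the abstract.

```python
import math
import random
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    # Bits per byte; 0.0 for empty input.
    if not data:
        return 0.0
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in Counter(data).values())

def entropy_curve(data: bytes, step: int = 8, max_len: int = 256) -> list:
    # Entropy of progressively longer header fragments: data[:8], data[:16], ...
    return [shannon_entropy(data[:n]) for n in range(step, max_len + 1, step)]

def differential_area(curve_a: list, curve_b: list, step: int = 8) -> float:
    # Area between two entropy curves (trapezoidal rule on |a - b|).
    diffs = [abs(a - b) for a, b in zip(curve_a, curve_b)]
    return sum((diffs[i] + diffs[i + 1]) / 2 * step for i in range(len(diffs) - 1))

def random_bytes(length: int, seed: int) -> bytes:
    # Seeded pseudo-random bytes, standing in for a file of purely random data.
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(length))

reference_curve = entropy_curve(random_bytes(256, seed=0))
```

Under these assumptions, a smaller area means the file's header entropy curve tracks the random-data curve more closely, which corresponds to higher confidence that the file is encrypted.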

Submitted 28 June, 2021; originally announced June 2021.

Journal ref: Computers & Security, 102377, 2021

arXiv:2012.08487 [pdf, other]

doi 10.1016/j.fsidi.2020.300979

Evaluation of Live Forensic Techniques in Ransomware Attack Mitigation

Authors: Simon R. Davies, Richard Macfarlane, William J. Buchanan

Abstract: Memory was captured from a system infected by ransomware and its contents were examined using live forensic tools, with the intent of identifying the symmetric encryption keys being used. NotPetya, Bad Rabbit and Phobos hybrid ransomware samples were tested during the investigation. If keys were discovered, the following two steps were also performed. Firstly, a timeline was manually created by combining data from multiple sources to illustrate the ransomware's behaviour as well as showing when the encryption keys were present in memory and how long they remained there. Secondly, an attempt was made to decrypt the files encrypted by the ransomware using the found keys. In all cases, the investigation was able to confirm that it was possible to identify the encryption keys used. A description of how these found keys were then used to successfully decrypt files that had been encrypted during the execution of the ransomware is also given. The resulting generated timelines provided an excellent way to visualise the behaviour of the ransomware and the encryption key management practices it employed and, from a forensic investigation and possible mitigation point of view, to show when the encryption keys are in memory.

Submitted 19 December, 2020; v1 submitted 15 December, 2020; originally announced December 2020.

Comments: 11 pages, 10 figures

ACM Class: E.3; K.6.5

Journal ref: Forensic Science International: Digital Investigation. Volume 33, June 2020, 300979

arXiv:2006.01849 [pdf, other]

doi 10.1109/CyberSecurity49315.2020.9138859

Towards Identifying Human Actions, Intent, and Severity of APT Attacks Applying Deception Techniques -- An Experiment

Authors: Joel Chacon, Sean McKeown, Richard Macfarlane

Abstract: Attacks by Advanced Persistent Threats (APTs) have been shown to be difficult to detect using traditional signature- and anomaly-based intrusion detection approaches. Deception techniques such as decoy objects, often called honey items, may be deployed for intrusion detection and attack analysis, providing an alternative to detect APT behaviours. This work explores the use of honey items to classify intrusion interactions, differentiating automated attacks from those which need some human reasoning and interaction towards APT detection. Multiple decoy items are deployed on honeypots in a virtual honey network, some as breadcrumbs to detect indications of a structured manual attack. Monitoring functionality was created around Elastic Stack with a Kibana dashboard created to display interactions with various honey items. APT type manual intrusions are simulated by an experienced pentesting practitioner carrying out simulated attacks. Interactions with honey items are evaluated in order to determine their suitability for discriminating between automated tools and direct human intervention. The results show that it is possible to differentiate automatic attacks from manual structured attacks from the nature of the interactions with the honey items. The use of honey items found in the honeypot, such as in later parts of a structured attack, has been shown to be successful in classification of manual attacks, as well as towards providing an indication of severity of the attacks.

Submitted 2 June, 2020; originally announced June 2020.

arXiv:2002.05126 [pdf, other]

doi 10.6025/jnt/2019/10/4/124-155

Wi-Fi Channel Saturation as a Mechanism to Improve Passive Capture of Bluetooth Through Channel Usage Restriction

Authors: Ian Lowe, William J Buchanan, Richard J Macfarlane, Owen Lo

Abstract: Bluetooth is a short-range wireless technology that provides audio and data links between personal smartphones and playback devices, such as speakers, headsets and car entertainment systems. Since its introduction in 2001, security researchers have suggested that the protocol is weak, and prone to a variety of attacks against its authentication, link management and encryption schemes. Key researchers in the field have suggested that reliable passive sniffing of Bluetooth traffic would enable the practical application of a range of currently hypothesised attacks. Restricting Bluetooth's frequency hopping behaviour by manipulation of the available channels, in order to make brute force attacks more effective, has been a frequently proposed avenue of future research from the literature. This paper has evaluated the proposed approach in a series of experiments using the software defined radio tools and custom hardware developed by the Ubertooth project. The work concludes that the mechanism suggested by previous researchers may not deliver the proposed improvements, but describes an as yet undocumented interaction between Bluetooth and Wi-Fi technologies which may provide a Denial of Service attack mechanism.

Submitted 12 February, 2020; originally announced February 2020.

Journal ref: Journal of Network Technology, 2019

arXiv:1907.10387 [pdf, other]

Privacy Parameter Variation Using RAPPOR on a Malware Dataset

Authors: Peter Aaby, Juanjo Mata De Acuna, Richard Macfarlane, William J Buchanan

Abstract: Stricter data protection regulations and the poor application of privacy protection techniques have resulted in a requirement for data-driven companies to adopt new methods of analysing sensitive user data. The RAPPOR (Randomized Aggregatable Privacy-Preserving Ordinal Response) method adds parameterised noise, which must be carefully selected to maintain adequate privacy without losing analytical value. This paper applies RAPPOR privacy parameter variations against a public dataset containing a list of running Android applications data. The dataset is filtered and sampled into small (10,000); medium (100,000); and large (1,200,000) sample sizes while applying RAPPOR with ε = 10; 1.0; and 0.1 (respectively low; medium; high privacy guarantees). Also, in order to observe detailed variations within high to medium privacy guarantees (ε = 0.5 to 1.0), a second experiment is conducted by progressively.
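As a rough illustration of the parameterised noise this abstract refers to, the sketch below shows only the permanent randomized response step of RAPPOR and the corresponding privacy bound, ε = 2h ln((1 − f/2) / (f/2)), for h hash functions and noise parameter f. The function names are hypothetical, and the sketch omits the Bloom filter encoding and the instantaneous randomized response step; it is not the configuration used in the paper.

```python
import math
import random

def permanent_randomized_response(bits, f, rng):
    """Apply the permanent randomized response step to a list of 0/1 bits.

    Each bit becomes 1 with probability f/2, 0 with probability f/2,
    and keeps its true value with probability 1 - f.
    """
    noisy = []
    for bit in bits:
        u = rng.random()
        if u < f / 2:
            noisy.append(1)
        elif u < f:
            noisy.append(0)
        else:
            noisy.append(bit)
    return noisy

def epsilon_permanent(f, h):
    # Privacy bound of the permanent step for h hash functions; requires 0 < f <= 1.
    return 2 * h * math.log((1 - f / 2) / (f / 2))
```

A larger f injects more noise and yields a smaller ε, which corresponds to a stronger privacy guarantee at the cost of analytical accuracy.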

Submitted 24 July, 2019; originally announced July 2019.

Journal ref: 2018 17th IEEE International Conference On Trust, Security And Privacy In Computing And Communications/12th IEEE International Conference On Big Data Science And Engineering (TrustCom/BigDataSE)

Showing 1–8 of 8 results for author: Macfarlane, R