-
Towards Incident Response Orchestration and Automation for the Advanced Metering Infrastructure
Authors:
Alexios Lekidis,
Vasileios Mavroeidis,
Konstantinos Fysarakis
Abstract:
The threat landscape of industrial infrastructures has expanded exponentially over the last few years. Such infrastructures include services such as the smart meter data exchange that should have real-time availability. Smart meters constitute the main component of the Advanced Metering Infrastructure, and their measurements are also used as historical data for forecasting the energy demand to avoid load peaks that could lead to blackouts within specific areas. Hence, a comprehensive Incident Response plan must be in place to ensure high service availability in case of cyber-attacks or operational errors. Currently, utility operators execute such plans mostly manually, requiring extensive time, effort, and domain expertise, and they are prone to human errors. In this paper, we present a method to provide an orchestrated and highly automated Incident Response plan targeting specific use cases and attack scenarios in the energy sector, including steps for preparedness, detection and analysis, containment, eradication, recovery, and post-incident activity through the use of playbooks. In particular, we use the OASIS Collaborative Automated Course of Action Operations (CACAO) standard to define highly automatable workflows in support of cyber security operations for the Advanced Metering Infrastructure. The proposed method is validated through an Advanced Metering Infrastructure testbed where the most prominent cyber-attacks are emulated, and playbooks are instantiated to ensure rapid response for the containment and eradication of the threat, business continuity on the smart meter data exchange service, and compliance with incident reporting requirements.
Submitted 11 March, 2024;
originally announced March 2024.
-
TIPS: Threat Sharing Information Platform for Enhanced Security
Authors:
Lakshmi Rama Kiran Pasumarthy,
Hisham Ali,
William J Buchanan,
Jawad Ahmad,
Audun Josang,
Vasileios Mavroeidis,
Mouad Lemoudden
Abstract:
There is an increasing need to share threat information for the prevention of widespread cyber-attacks. While threat-related information sharing can be conducted through traditional information exchange methods, such as email communications, these methods are often weak in terms of their trustworthiness and privacy. Additionally, the absence of a trust infrastructure between different information-sharing domains also poses significant challenges. These challenges include redaction of information, the Right-to-be-forgotten, and access control to the information-sharing elements. These access issues could be related to time bounds, the trusted deletion of data, and the location of accesses. This paper presents an abstraction of a trusted information-sharing process which combines Attribute-Based Encryption (ABE), Homomorphic Encryption (HE) and Zero Knowledge Proof (ZKP), integrated into a permissioned ledger, specifically Hyperledger Fabric (HLF). It then provides a protocol exchange between two threat-sharing agents that share encrypted messages through a trusted channel. This trusted channel can only be accessed by those trusted in the sharing and could be enabled for each data-sharing element or set up for long-term sharing.
Submitted 8 March, 2024;
originally announced March 2024.
-
PHOENI2X -- A European Cyber Resilience Framework With Artificial-Intelligence-Assisted Orchestration, Automation and Response Capabilities for Business Continuity and Recovery, Incident Response, and Information Exchange
Authors:
Konstantinos Fysarakis,
Alexios Lekidis,
Vasileios Mavroeidis,
Konstantinos Lampropoulos,
George Lyberopoulos,
Ignasi Garcia-Milà Vidal,
José Carles Terés i Casals,
Eva Rodriguez Luna,
Alejandro Antonio Moreno Sancho,
Antonios Mavrelos,
Marinos Tsantekidis,
Sebastian Pape,
Argyro Chatzopoulou,
Christina Nanou,
George Drivas,
Vangelis Photiou,
George Spanoudakis,
Odysseas Koufopavlou
Abstract:
As digital technologies become more pervasive in society and the economy, cybersecurity incidents become more frequent and impactful. According to the NIS and NIS2 Directives, EU Member States and their Operators of Essential Services must establish a minimum baseline set of cybersecurity capabilities and engage in cross-border coordination and cooperation. However, this is only a small step towards European cyber resilience. In this landscape, preparedness, shared situational awareness, and coordinated incident response are essential for effective cyber crisis management and resilience. Motivated by the above, this paper presents PHOENI2X, an EU-funded project aiming to design, develop, and deliver a Cyber Resilience Framework providing Artificial-Intelligence-assisted orchestration, automation and response capabilities for business continuity and recovery, incident response, and information exchange, tailored to the needs of Operators of Essential Services and the EU Member State authorities entrusted with cybersecurity.
Submitted 18 July, 2023; v1 submitted 13 July, 2023;
originally announced July 2023.
-
The FormAI Dataset: Generative AI in Software Security Through the Lens of Formal Verification
Authors:
Norbert Tihanyi,
Tamas Bisztray,
Ridhi Jain,
Mohamed Amine Ferrag,
Lucas C. Cordeiro,
Vasileios Mavroeidis
Abstract:
This paper presents the FormAI dataset, a large collection of 112,000 AI-generated compilable and independent C programs with vulnerability classification. We introduce a dynamic zero-shot prompting technique constructed to spawn diverse programs utilizing Large Language Models (LLMs). The dataset is generated by GPT-3.5-turbo and comprises programs with varying levels of complexity. Some programs handle complicated tasks like network management, table games, or encryption, while others deal with simpler tasks like string manipulation. Every program is labeled with the vulnerabilities found within the source code, indicating the type, line number, and vulnerable function name. This is accomplished by employing a formal verification method using the Efficient SMT-based Bounded Model Checker (ESBMC), which uses model checking, abstract interpretation, constraint programming, and satisfiability modulo theories to reason over safety/security properties in programs. This approach definitively detects vulnerabilities and offers a formal model known as a counterexample, thus eliminating the possibility of generating false positive reports. We have associated the identified vulnerabilities with Common Weakness Enumeration (CWE) numbers. We make the source code available for the 112,000 programs, accompanied by a separate file containing the vulnerabilities detected in each program, making the dataset ideal for training LLMs and machine learning algorithms. Our study unveiled that according to ESBMC, 51.24% of the programs generated by GPT-3.5 contained vulnerabilities, thereby presenting considerable risks to software safety and security.
Submitted 28 March, 2024; v1 submitted 5 July, 2023;
originally announced July 2023.
-
Reviewing BPMN as a Modeling Notation for CACAO Security Playbooks
Authors:
Mateusz Zych,
Vasileios Mavroeidis,
Konstantinos Fysarakis,
Manos Athanatos
Abstract:
As cyber systems become increasingly complex and cybersecurity threats become more prominent, defenders must prepare, coordinate, automate, document, and share their response methodologies to the extent possible. The CACAO standard was developed to satisfy the above requirements, providing a common machine-readable framework and schema for documenting cybersecurity operations processes, including defensive tradecraft and tactics, techniques, and procedures. Although this approach is compelling, a remaining limitation is that CACAO provides no native modeling notation for graphically representing playbooks, which is crucial for simplifying their creation, modification, and understanding. In contrast, the industry is familiar with BPMN, a standards-based modeling notation for business processes that has also found its place in representing cybersecurity processes. This research examines BPMN and CACAO and explores the feasibility of using the BPMN modeling notation to represent CACAO security playbooks graphically. The results indicate that mapping CACAO and BPMN is attainable at an abstract level; however, conversion from one encoding to another introduces a degree of complexity due to the multiple ways CACAO constructs can be represented in BPMN and the extensions required in BPMN to support CACAO fully.
Submitted 10 September, 2023; v1 submitted 30 May, 2023;
originally announced May 2023.
-
Enhancing the STIX Representation of MITRE ATT&CK for Group Filtering and Technique Prioritization
Authors:
Mateusz Zych,
Vasileios Mavroeidis
Abstract:
In this paper, we enhance the machine-readable representation of the ATT&CK Groups knowledge base provided by MITRE in STIX 2.1 format to make available and queryable additional types of contextual information. Such information includes the motivations of activity groups, the countries they have originated from, and the sectors and countries they have targeted. We demonstrate how to utilize the enhanced model to construct intelligible queries to filter activity groups of interest and retrieve relevant tactical intelligence.
Submitted 26 April, 2022; v1 submitted 24 April, 2022;
originally announced April 2022.
-
Cybersecurity Playbook Sharing with STIX 2.1
Authors:
Vasileios Mavroeidis,
Mateusz Zych
Abstract:
Understanding that interoperable security playbooks will become a fundamental component of defenders' arsenal to decrease attack detection and response times, it is time to consider their position in structured sharing efforts. This report documents the process of extending Structured Threat Information eXpression (STIX) version 2.1, using the available extension definition mechanism, to enable sharing security playbooks, including Collaborative Automated Course of Action Operations (CACAO) playbooks.
Submitted 26 August, 2022; v1 submitted 22 January, 2022;
originally announced March 2022.
-
On the Integration of Course of Action Playbooks into Shareable Cyber Threat Intelligence
Authors:
Vasileios Mavroeidis,
Pavel Eis,
Martin Zadnik,
Marco Caselli,
Bret Jordan
Abstract:
Motivated by the introduction of CACAO, the first open standard that harmonizes the way we document courses of action in a machine-readable format for interoperability, and the benefits for cybersecurity operations derived from utilizing, and coupling and sharing course of action playbooks with cyber threat intelligence, we introduce a uniform metadata template that supports managing and integrating course of action playbooks into knowledge representation and knowledge management systems. We demonstrate the applicability of our approach through two use-case implementations. We utilize the playbook metadata template to introduce functionality and integrate course of action playbooks, such as CACAO, into the MISP threat intelligence platform and the OASIS Threat Actor Context ontology.
Submitted 22 November, 2021; v1 submitted 20 October, 2021;
originally announced October 2021.
-
Data-Driven Threat Hunting Using Sysmon
Authors:
Vasileios Mavroeidis,
Audun Jøsang
Abstract:
Threat actors can be persistent, motivated and agile, and leverage a diversified and extensive set of tactics and techniques to attain their goals. In response to that, defenders establish threat intelligence programs to stay threat-informed and lower risk. Actionable threat intelligence is integrated into security information and event management systems (SIEM) or is accessed via more dedicated tools like threat intelligence platforms. A threat intelligence platform gives access to contextual threat information by aggregating, processing, correlating, and analyzing real-time data and information from multiple sources, and in many cases, it provides centralized analysis and reporting of an organization's security events. Sysmon logs are a data source that has received considerable attention for endpoint visibility. Approaches for threat detection using Sysmon have been proposed, mainly focusing on search engine technologies like NoSQL database systems. This paper demonstrates one of the many use cases of Sysmon and cyber threat intelligence. In particular, we present a threat assessment system that relies on a cyber threat intelligence ontology to automatically classify executed software into different threat levels by analyzing Sysmon log streams. The presented system and approach augment cyber defensive capabilities through situational awareness, prediction, and automated courses of action.
Submitted 28 March, 2021;
originally announced March 2021.
-
Cyber Threat Intelligence Model: An Evaluation of Taxonomies, Sharing Standards, and Ontologies within Cyber Threat Intelligence
Authors:
Vasileios Mavroeidis,
Siri Bromander
Abstract:
Cyber threat intelligence is the provision of evidence-based knowledge about existing or emerging threats. Benefits from threat intelligence include increased situational awareness, efficiency in security operations, and improved prevention, detection, and response capabilities. To process, correlate, and analyze vast amounts of threat information and data and derive intelligence that can be shared and consumed in meaningful times, it is required to utilize structured, machine-readable formats that incorporate the industry-required expressivity while at the same time being unambiguous. To a large extent, this is achieved with technologies like ontologies, schemas, and taxonomies. This research evaluates the coverage and high-level conceptual expressivity of cyber-threat-intelligence-relevant ontologies, sharing standards, and taxonomies pertaining to the who, what, why, where, when, and how elements of threats and attacks in addition to courses of action and technical indicators. The results confirm that little emphasis has been given to developing a comprehensive cyber threat intelligence ontology, with existing efforts being not thoroughly designed, non-interoperable, ambiguous, and lacking proper semantics and axioms for reasoning.
Submitted 28 August, 2023; v1 submitted 5 March, 2021;
originally announced March 2021.
-
Threat Actor Type Inference and Characterization within Cyber Threat Intelligence
Authors:
Vasileios Mavroeidis,
Ryan Hohimer,
Tim Casey,
Audun Jøsang
Abstract:
As the cyber threat landscape is constantly becoming increasingly complex and polymorphic, the more critical it becomes to understand the enemy and its modus operandi for anticipatory threat reduction. Even though the cyber security community has developed a certain maturity in describing and sharing technical indicators for informing defense components, we still struggle with non-uniform, unstructured, and ambiguous higher-level information, such as the threat actor context, thereby limiting our ability to correlate with different sources to derive more contextual, accurate, and relevant intelligence. We see the need to overcome this limitation in order to increase our ability to produce and better operationalize cyber threat intelligence. Our research demonstrates how commonly agreed upon controlled vocabularies for characterizing threat actors and their operations can be used to enrich cyber threat intelligence and infer new information at a higher contextual level that is explicable and queryable. In particular, we present an ontological approach to automatically inferring the types of threat actors based on their personas, understanding their nature, and capturing polymorphism and changes in their behavior and characteristics over time. Such an approach not only enables interoperability by providing a structured way and means for sharing highly contextual cyber threat intelligence but also derives new information at machine speed and minimizes cognitive biases that manual classification approaches entail.
Submitted 20 September, 2021; v1 submitted 3 March, 2021;
originally announced March 2021.
-
Firearm Detection via Convolutional Neural Networks: Comparing a Semantic Segmentation Model Against End-to-End Solutions
Authors:
Alexander Egiazarov,
Fabio Massimo Zennaro,
Vasileios Mavroeidis
Abstract:
Threat detection of weapons and aggressive behavior from live video can be used for rapid detection and prevention of potentially deadly incidents such as terrorism, general criminal offences, or even domestic violence. One way of achieving this is through the use of artificial intelligence and, in particular, machine learning for image analysis. In this paper we conduct a comparison between a traditional monolithic end-to-end deep learning model and a previously proposed model based on an ensemble of simpler neural networks detecting fire-weapons via semantic segmentation. We evaluated both models from different points of view, including accuracy, computational and data complexity, flexibility and reliability. Our results show that a semantic segmentation model provides a considerable amount of flexibility and resilience in the low data environment compared to classical deep models, although its configuration and tuning present a challenge in achieving the same levels of accuracy as an end-to-end model.
Submitted 17 December, 2020;
originally announced December 2020.
-
Firearm Detection and Segmentation Using an Ensemble of Semantic Neural Networks
Authors:
Alexander Egiazarov,
Vasileios Mavroeidis,
Fabio Massimo Zennaro,
Kamer Vishi
Abstract:
In recent years we have seen an upsurge in terror attacks around the world. Such attacks usually happen in public places with large crowds to cause the most damage possible and get the most attention. Even though surveillance cameras are assumed to be a powerful tool, their effect in preventing crime is far from clear due to either limitation in the ability of humans to vigilantly monitor video surveillance or for the simple reason that they are operating passively. In this paper, we present a weapon detection system based on an ensemble of semantic Convolutional Neural Networks that decomposes the problem of detecting and locating a weapon into a set of smaller problems concerned with the individual component parts of a weapon. This approach has computational and practical advantages: a set of simpler neural networks dedicated to specific tasks requires less computational resources and can be trained in parallel; the overall output of the system given by the aggregation of the outputs of individual networks can be tuned by a user to trade-off false positives and false negatives; finally, according to ensemble theory, the output of the overall system will be robust and reliable even in the presence of weak individual models. We evaluated our system running simulations aimed at assessing the accuracy of individual networks and the whole system. The results on synthetic data and real-world data are promising, and they suggest that our approach may have advantages compared to the monolithic approach based on a single deep convolutional neural network.
Submitted 11 February, 2020;
originally announced March 2020.
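The abstract above notes that the aggregated output of the individual networks can be tuned to trade off false positives against false negatives. The listing does not say how the authors aggregate, so the following is only a minimal sketch of one possible rule, a k-of-n vote over per-component confidence scores. The function name `aggregate_detections` and the parameters `score_threshold` and `min_votes` are hypothetical and are not taken from the paper.

```python
def aggregate_detections(part_scores, score_threshold=0.5, min_votes=2):
    """Hypothetical k-of-n vote over per-component detector scores.

    part_scores: confidence in [0, 1] from each component network
                 (one network per weapon part; an assumption of this sketch).
    score_threshold: score at or above which a component counts as a vote.
    min_votes: number of votes needed to flag a weapon. Raising it reduces
               false positives, lowering it reduces false negatives.
    """
    votes = sum(1 for score in part_scores if score >= score_threshold)
    return votes >= min_votes


# Same component scores, two operating points.
scores = [0.9, 0.6, 0.1]
lenient = aggregate_detections(scores, min_votes=2)  # two components agree
strict = aggregate_detections(scores, min_votes=3)   # all three must agree
```

Under this rule, `min_votes` is the user-facing knob: a stricter vote suppresses spurious alarms from a single weak component at the cost of missing partially occluded weapons.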
-
Privacy Issues and Data Protection in Big Data: A Case Study Analysis under GDPR
Authors:
Nils Gruschka,
Vasileios Mavroeidis,
Kamer Vishi,
Meiko Jensen
Abstract:
Big data has become a great asset for many organizations, promising improved operations and new business opportunities. However, big data has increased access to sensitive information that, when processed, can directly jeopardize the privacy of individuals and violate data protection laws. As a consequence, data controllers and data processors may face tough penalties for non-compliance that can even result in bankruptcy. In this paper, we discuss the current state of the legal regulations and analyze different data protection and privacy-preserving techniques in the context of big data analysis. In addition, we present and analyze two real-life research projects as case studies dealing with sensitive data and actions for complying with the data regulation laws. We show which types of information might become a privacy risk, the employed privacy-preserving techniques in accordance with the legal requirements, and the influence of these techniques on the data processing phase and the research results.
Submitted 20 November, 2018;
originally announced November 2018.
-
A Framework for Data-Driven Physical Security and Insider Threat Detection
Authors:
Vasileios Mavroeidis,
Kamer Vishi,
Audun Jøsang
Abstract:
This paper presents PS0, an ontological framework and a methodology for improving physical security and insider threat detection. PS0 can facilitate forensic data analysis and proactively mitigate insider threats by leveraging rule-based anomaly detection. In all too many cases, rule-based anomaly detection can detect employee deviations from organizational security policies. In addition, PS0 can be considered a security provenance solution because of its ability to fully reconstruct attack patterns. Provenance graphs can be further analyzed to identify deceptive actions and overcome analytical mistakes that can result in bad decision-making, such as false attribution. Moreover, the information can be used to enrich the available intelligence (about intrusion attempts) that can form use cases to detect and remediate limitations in the system, such as loosely-coupled provenance graphs that in many cases indicate weaknesses in the physical security architecture. Ultimately, validation of the framework through use cases demonstrates and proves that PS0 can improve an organization's security posture in terms of physical security and insider threat detection.
Submitted 25 September, 2018;
originally announced September 2018.
-
An Evaluation of Score Level Fusion Approaches for Fingerprint and Finger-vein Biometrics
Authors:
Kamer Vishi,
Vasileios Mavroeidis
Abstract:
Biometric systems have to address many requirements, such as large population coverage, demographic diversity, varied deployment environment, as well as practical aspects like performance and spoofing attacks. Traditional unimodal biometric systems do not fully meet the aforementioned requirements, making them vulnerable and susceptible to different types of attacks. In response to that, modern biometric systems combine multiple biometric modalities at different fusion levels. The fused score is decisive in classifying an unknown user as genuine or an impostor. In this paper, we evaluate combinations of score normalization and fusion techniques using two modalities (fingerprint and finger-vein) with the goal of identifying which one achieves the better improvement rate over traditional unimodal biometric systems. The individual scores obtained from finger-veins and fingerprints are combined at score level using three score normalization techniques (min-max, z-score, hyperbolic tangent) and four score fusion approaches (minimum score, maximum score, simple sum, user weighting). The experimental results showed that the combination of the hyperbolic tangent score normalization technique with the simple sum fusion approach achieves the best improvement rate of 99.98%.
Submitted 27 May, 2018;
originally announced May 2018.
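The score-level fusion described in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the hyperbolic tangent normalization here uses the plain mean and standard deviation (the classical formulation uses robust Hampel estimators), user weighting is omitted, and all function names are hypothetical. Each normalizer assumes the input scores are not all identical, since otherwise the denominator is zero.

```python
import math
from statistics import mean, pstdev


def min_max(scores):
    """Min-max normalization: rescale scores linearly into [0, 1]."""
    lo, hi = min(scores), max(scores)
    return [(s - lo) / (hi - lo) for s in scores]


def z_score(scores):
    """Z-score normalization: zero mean, unit (population) standard deviation."""
    mu, sigma = mean(scores), pstdev(scores)
    return [(s - mu) / sigma for s in scores]


def tanh_norm(scores):
    """Simplified tanh normalization into (0, 1).

    Assumption: uses mean/std in place of the Hampel estimators.
    """
    mu, sigma = mean(scores), pstdev(scores)
    return [0.5 * (math.tanh(0.01 * (s - mu) / sigma) + 1.0) for s in scores]


def fuse(scores_a, scores_b, rule="sum"):
    """Combine two normalized score lists element-wise.

    rule: "min" (minimum score), "max" (maximum score) or "sum" (simple sum).
    """
    rules = {"min": min, "max": max, "sum": lambda x, y: x + y}
    combine = rules[rule]
    return [combine(a, b) for a, b in zip(scores_a, scores_b)]


# Hypothetical raw matcher scores for three comparison attempts.
fingerprint = [12.0, 30.0, 48.0]
finger_vein = [0.20, 0.35, 0.80]
fused = fuse(tanh_norm(fingerprint), tanh_norm(finger_vein), rule="sum")
```

Normalizing before fusing matters because the two matchers report scores on different scales; without it, the simple sum would be dominated by whichever modality has the larger numeric range.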
-
The Impact of Quantum Computing on Present Cryptography
Authors:
Vasileios Mavroeidis,
Kamer Vishi,
Mateusz D. Zych,
Audun Jøsang
Abstract:
The aim of this paper is to elucidate the implications of quantum computing in present cryptography and to introduce the reader to basic post-quantum algorithms. In particular the reader can delve into the following subjects: present cryptographic schemes (symmetric and asymmetric), differences between quantum and classical computing, challenges in quantum computing, quantum algorithms (Shor's and Grover's), public key encryption schemes affected, symmetric schemes affected, the impact on hash functions, and post quantum cryptography. Specifically, the section of Post-Quantum Cryptography deals with different quantum key distribution methods and mathematical-based solutions, such as the BB84 protocol, lattice-based cryptography, multivariate-based cryptography, hash-based signatures and code-based cryptography.
Submitted 31 March, 2018;
originally announced April 2018.
-
Automatic Detection of Malware-Generated Domains with Recurrent Neural Models
Authors:
Pierre Lison,
Vasileios Mavroeidis
Abstract:
Modern malware families often rely on domain-generation algorithms (DGAs) to determine rendezvous points to their command-and-control server. Traditional defence strategies (such as blacklisting domains or IP addresses) are inadequate against such techniques due to the large and continuously changing list of domains produced by these algorithms. This paper demonstrates that a machine learning approach based on recurrent neural networks is able to detect domain names generated by DGAs with high precision. The neural models are estimated on a large training set of domains generated by various malware families. Experimental results show that this data-driven approach can detect malware-generated domain names with an F_1 score of 0.971. To put it differently, the model can automatically detect 93% of malware-generated domain names for a false positive rate of 1:100.
Submitted 20 September, 2017;
originally announced September 2017.