License: CC BY 4.0
arXiv:2312.13476v1 [cs.CR] 20 Dec 2023

Fortify Your Defenses: Strategic Budget Allocation to Enhance Power Grid Cybersecurity

The research described in this paper is part of the Resilience Through Data Driven, Intelligently Designed Control (RD2C) Initiative at Pacific Northwest National Laboratory (PNNL). It was conducted under the Laboratory Directed Research and Development Program at PNNL, a multiprogram national laboratory operated by Battelle for the U.S. Department of Energy.

Rounak Meyur, Sumit Purohit, Braden K. Webb
Abstract

The abundance of cyber-physical components in the modern power grid, with their diverse hardware and software vulnerabilities, has made it difficult to protect them from advanced persistent threats (APTs). An attack graph depicting the propagation of potential cyber-attack sequences from the initial access point to the end objective is vital to identify critical weaknesses of any cyber-physical system. Cybersecurity personnel can accordingly plan preventive mitigation measures for the identified weaknesses addressing the cyber-attack sequences. However, limitations on the available cybersecurity budget restrict the choice of mitigation measures. We address this aspect through our framework, which solves the following problem: given potential cyber-attack sequences for a cyber-physical component in the power grid, find the optimal manner to allocate an available budget to implement necessary preventive mitigation measures. We formulate the problem as a mixed integer linear program (MILP) to identify the optimal budget partition and set of mitigation measures which minimize the vulnerability of cyber-physical components to potential attack sequences. We assume that the allocation of budget affects the efficacy of the mitigation measures. We show how altering the budget allocation for tasks such as asset management, cybersecurity infrastructure improvement, incident response planning and employee training affects the choice of the optimal set of preventive mitigation measures and modifies the associated cybersecurity risk. The proposed framework can be used by cyber policymakers and system owners to allocate optimal budgets for various tasks required to improve the overall security of a cyber-physical system.

Introduction

An increased reliance on Information Technology (IT) in various aspects of modern life has created a vast ecosystem of interconnected systems, networks and devices (Bansal and Kumar 2020). This provides a large attack surface for cyber adversaries to target, allowing them to gain unauthorized access, steal data, and disrupt operations. The availability of off-the-shelf hacking tools and malware in the underground market makes their task easier still, allowing them to launch complex attacks without sophisticated programming expertise (Liggett et al. 2019). Sophisticated malware, APTs and multi-stage cyber attacks involving multiple attack vectors are the principal factors that increase the complexity of these attacks and make them harder to detect and mitigate (Li and Liu 2021). This requires system owners and cybersecurity personnel to be well aware of the latest vulnerabilities and to plan to mitigate them effectively.

The modern energy infrastructure is equipped with smart devices, which aid in its monitoring and control. These devices are an integral part of the cyber-physical energy system (CPES). They form the link between the physical power grid and the communication network, allowing system operators to make online decisions and alter system conditions remotely. However, this comes at the cost of increased vulnerability to cyber attacks, where adversaries can gain access to these devices and adversely impact the power grid infrastructure, leading to severe events such as widespread blackouts. A typical CPES consists of multiple smart devices interlinked through a communication network. These smart devices (such as a smart inverter or a protective relay) can be accessed directly by a cyber adversary or via the communication network after a successful intrusion into a centrally situated device (such as a substation automation controller). Therefore, the goal of cybersecurity personnel is to protect these smart devices (or components) from adversarial cyber intrusions. Hereafter, we use the terms ‘component’ and ‘smart device’ interchangeably. In this work, we aim to identify an optimal set of preventive cybersecurity measures for each component in the CPES in order to reduce the risk of adversarial cyber attacks.

A bottom-up approach involves evaluating the risk associated with the failure of a component by assessing the loss of power grid resilience or stability, and thereafter allocating budget towards securing the ‘critical’ components (Zografopoulos et al. 2021). In contrast, a top-down approach focuses on the cyber vulnerabilities of a component, possible adversarial techniques used to exploit them, and preventive strategies to avoid them (Das et al. 2022; Dutta et al. 2022; Subasi et al. 2022). This involves developing and implementing patches for individual vulnerabilities in a timely manner, which has been reported to be almost impossible (Culafi 2021). One of the major roadblocks is the lack of resources required to cover the sheer surface area of diverse hardware and software vulnerabilities. To this end, an organizational effort is required that prioritizes the vulnerabilities and allocates the available budget based on their priorities.

The MITRE Adversarial Tactics, Techniques, and Common Knowledge (MITRE ATT&CK) framework, developed by the MITRE Corporation, serves as a comprehensive and structured database to understand and organize information about cyber threats (MITRE Corporation 2023). The framework consists of tactics, or high-level objectives of an adversary during an attack, and techniques, or specific methods and procedures to accomplish their objectives within each tactic. It also provides detailed information about associated mitigation measures, which refer to strategies that organizations can employ to defend against or reduce the impact of specific techniques. However, the framework includes neither the cost of implementing the preventive mitigation measures nor their efficacy against adversarial techniques. This makes it difficult to evaluate the investment required to implement a proposed mitigation plan. At the same time, it is difficult to quantify how the presence or absence of mitigation measures affects the success rate of an adversarial technique.

In this paper, we approach the evaluation of optimal policies to improve a component’s cybersecurity in the CPES with the aim of reducing its risk from adversarial threats. An important aspect of policy formulation is to prioritize the problems to address and to partition the available budget to implement appropriate solutions. We treat the cybersecurity budget as representative of the labor/staff hours and associated resources required to implement the different mitigation measures. Hence, we identify multiple organizational sectors to segregate the mitigation measures based on the skill or number of staff hours required for implementation (Georgiadou, Mouzakitis, and Askounis 2021). Thereafter, our proposed approach evaluates the high priority mitigation measures required to be implemented and the optimal manner of partitioning the cybersecurity budget to achieve this task. The underlying assumption of our approach is that allocating budget to a particular sector implies prioritizing mitigation measures within that sector. This improves the overall efficacy of the mitigation against all adversarial techniques. The formal problem statement is as follows: Given potential cyber-attack sequences for a cyber-physical component in the power grid, find the optimal manner to allocate an available labor budget to implement necessary preventive mitigation measures, so as to reduce the risk from adversarial threats.

Contributions. The major contributions of this paper are listed below: (i) we use a top-down approach to define the vulnerability of a CPES component based on the adversarial threats and attack sequences to which it is susceptible, (ii) we use the MITRE ATT&CK framework to define the efficacy of mitigation measures against adversarial techniques and thereby, propose analytic expressions to evaluate the success rate of both individual techniques and entire attack sequences, (iii) we formulate a MILP by using these analytic expressions to identify the optimal partitions of a given limited budget to improve mitigation measure efficacy and evaluate the optimal mitigation set required to minimize successful adversarial attack sequences on the cyber component. The proposed holistic framework can be generalized for any cyber-physical system or any component/system with recorded cyber vulnerabilities.

Related Works

Risk assessment of CPES has been studied extensively using various methodologies, where the impact of cyber attacks on specific nodes in the power network is analyzed to identify the resulting damage. This has been done either through low fidelity simulation frameworks (Keliris et al. 2016; Chen et al. 2014; Georg et al. 2013; Dorsch et al. 2014; Queiroz, Mahmood, and Tari 2011) or through high fidelity real time simulation test beds (Vasisht et al. 2022; Sridhar et al. 2017; Haack et al. 2013; Stanovich et al. 2013). These frameworks are useful, since they provide means to identify critical components in terms of their impact on the power grid and enable system planners to focus their cybersecurity countermeasures. However, recent intrusion reports show complex cyber attack sequences, which utilize the interconnected nature of communication systems in order to gain system-wide access (MITRE Corporation 2023). This necessitates a top-down approach, which identifies possible cyber attack sequences and recommends countermeasures to prevent them.

The authors in (Nandi, Medal, and Vadlamani 2016) propose an interdiction plan that deploys countermeasures at an optimal set of edges in an attack graph to minimize losses due to security breaches. However, this work assumes a deterministic graph, with a $100\%$ or $0\%$ breach success rate along the edges, and also considers a fixed budget for countermeasures along each edge in the graph. The MITRE ATT&CK framework has been used in (Das et al. 2022; Dutta et al. 2022) to identify vulnerabilities in several CPES components and generate attack sequences. It provides a list of mitigation measures to prevent adversarial techniques. However, a holistic framework to identify the optimal set of countermeasures to prevent a set of attack sequences is not available in the present literature.

The authors in (Li et al. 2019) have performed an extensive survey to show how investing in various organizational sectors has affected the overall improvement of cybersecurity awareness in several organizations. The association of various mitigation measures specified in the MITRE ATT&CK framework with the cybersecurity culture of various organizations has been presented in (Georgiadou, Mouzakitis, and Askounis 2021). This allows a holistic approach to address cybersecurity gaps in infrastructure and policies for an organization. The allocation of resources to improve cybersecurity in an organization is a pertinent problem due to the contrasting interests of different individuals (Srinidhi, Yan, and Tayi 2015). In this regard, the present literature lacks a framework capable of allocating budget to different organizational sectors serving a common goal of reducing vulnerability to adversarial cyber attacks. This paper aims to address this particular research gap.

Preliminaries

MITRE ATT&CK framework. The MITRE ATT&CK framework (MITRE Corporation 2023) serves as a database of cyber attack scenarios and methods undertaken by adversaries for cyber intrusion. It contains an extensive list of tactics and techniques common to adversarial cyber attacks. The ‘tactics’ denote adversarial motivation and ‘techniques’ represent the instrumental means of achieving those tactical objectives. Moreover, each technique is associated with a list of mitigation measures such that a particular cyber defense system with a given set of mitigation measures will be able to prevent only their corresponding techniques. We denote sets of $N_{\mathscr{T}}$ techniques and $N_{\mathscr{M}}$ mitigation measures by $\mathscr{T}$ and $\mathscr{M}$ respectively and define a mapping $g:\mathscr{T}\rightarrow\mathscr{M}$ to identify the set of mitigation measures $g\left(t\right)\subseteq\mathscr{M}$, which can prevent an adversary from performing technique $t$. The pre-image of a mitigation measure $m$ under $g$ provides the induced mapping $g^{-1}:\mathscr{M}\rightarrow\mathscr{T}$ as the set of techniques $g^{-1}\left(m\right)$ which can be prevented when $m$ is available in the cyber defense system.
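As an illustration of the mapping $g$ and its pre-image $g^{-1}$, the following minimal Python sketch stores $g$ as a dictionary and inverts it. The technique and mitigation identifiers are hypothetical placeholders, not entries taken from the ATT&CK database, and the function name `preimage` is introduced here only for illustration.

```python
from collections import defaultdict

# Hypothetical toy data: technique -> set of mitigations that can prevent it, i.e. g(t).
g = {
    "t1": {"m1", "m2"},
    "t2": {"m2"},
    "t3": set(),  # a technique with no listed mitigation
}

def preimage(g):
    """Invert g: return mitigation -> set of techniques it can prevent, i.e. g^{-1}(m)."""
    g_inv = defaultdict(set)
    for technique, mitigations in g.items():
        for m in mitigations:
            g_inv[m].add(technique)
    return dict(g_inv)

g_inv = preimage(g)
```

In this toy example `g_inv["m2"]` contains both `t1` and `t2`, so deploying `m2` can act against two techniques at once.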

Hybrid Attack Graph (HAG). The cybersecurity risk assessment of a component in the CPES involves identifying possible adversarial techniques that can be performed on it. To this end, we use a mapping framework, which maps common cyber vulnerabilities in the component to common attack patterns that can be executed on these vulnerabilities. The framework is depicted in Fig. 1, where we provide the name of a component as input and the framework retrieves the vulnerabilities from the Common Vulnerabilities and Exposures (CVE) database, identifies the respective weaknesses from the Common Weakness Enumeration (CWE) database, gets attack patterns from the Common Attack Pattern Enumeration and Classification (CAPEC) database and finally maps to the MITRE ATT&CK framework to obtain the list of adversarial techniques. Refer to (Dutta et al. 2022) for details about each of these databases.

Figure 1: Framework to map between various vulnerability and attack pattern databases. This mapping yields the list of adversarial techniques that could be executed on a “smart inverter”.

However, cybersecurity risk assessment also requires us to identify the sequence of techniques which can be performed on the component. A HAG serves as an excellent tool for this purpose. These are synthetically generated graphs which are created based on past instances of cyber attacks reported in openly available cyber intrusion reports (Donald, Meyur, and Purohit 2023). An attack graph describes attack sequences through a set of possible techniques from the MITRE ATT&CK framework. The nodes in the graph represent adversarial techniques and the edges depict consecutive techniques used in an attack sequence.

A single attack sequence is represented as a path in the HAG, describing the progression of techniques. We denote an attack sequence of length $n$ as $\mathscr{A}:=\left\{t_{1},t_{2},\cdots,t_{n}\right\}$. The adversarial techniques $t_{1},t_{2},\cdots,t_{n}$ which comprise the sequence are nodes in the HAG. It is important to mention that we assume the following while creating a HAG: (i) the transition between techniques follows a predefined order of the associated tactics, and (ii) techniques are not repeated. An example HAG is shown in Fig. 2.

Figure 2: A sample HAG describing the possible attack sequences that can be performed on a “smart inverter”. Each node denotes an adversarial technique, and nodes with the same color denote techniques belonging to the same tactic.
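The path view of attack sequences can be sketched in a few lines of Python. The graph below is a hypothetical toy example (node names, edges, and the choice of entry and objective nodes are assumptions made for illustration), and the depth-first enumeration is just one simple way to list the sequences, not necessarily the procedure used to generate the HAG.

```python
# Hypothetical toy HAG: each node is a technique, each edge links consecutive techniques.
hag = {
    "t1": ["t2", "t3"],
    "t2": ["t4"],
    "t3": ["t4"],
    "t4": [],
}

def attack_sequences(graph, source, target, path=None):
    """Enumerate all simple paths from source to target in the graph."""
    path = (path or []) + [source]
    if source == target:
        return [path]
    sequences = []
    for nxt in graph.get(source, []):
        if nxt not in path:  # techniques are not repeated within a sequence
            sequences.extend(attack_sequences(graph, nxt, target, path))
    return sequences

# Two sequences reach t4 from t1: via t2 and via t3.
sequences = attack_sequences(hag, "t1", "t4")
```

Each returned list corresponds to one attack sequence $\mathscr{A}$, that is, an ordered set of techniques along a path.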

Vulnerability of a component. The goal of this work is to harness the MITRE ATT&CK framework to plan mitigation measures for a component. To this end, we need to define a component’s vulnerability in the context of the MITRE ATT&CK framework and the HAG generated for the component. The generated HAG provides us with possible sequences of techniques that adversaries could utilize to perform a successful cyber attack. A planner might seek to choose an optimal set of mitigation measures to minimize the probability of compromising the component through any of the attack sequences identified through the HAG. This objective requires us to minimize the success probability of each and every attack sequence in the HAG. However, for all practical purposes, reducing these success probabilities to a sufficiently small value is acceptable. We term this objective minimizing the number of “highly likely” attack sequences. First, we define what we mean by a “highly likely” attack sequence. Here, we assume that the successful executions of the individual techniques are independent events.

Definition 1.

A sequence $\mathscr{A}_{l}:=\left\{t_{1},t_{2},\cdots\right\}$ is said to be “highly likely” if the probability of its successful execution (or success rate $v_{l}$) exceeds a chosen threshold $\delta>0$, i.e.,

$$v_{l}=\prod_{t\in\mathscr{A}_{l}}r_{t}\geq\delta\quad\iff\quad\sum_{t\in\mathscr{A}_{l}}\log{r_{t}}\geq\log{\delta}\tag{1}$$

where $r_{t}$ is the success rate of technique $t\in\mathscr{A}_{l}$.
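The equivalence in (1) can be checked numerically. The sketch below, with hypothetical success rates and helper names chosen only for illustration, evaluates the product form and the log-sum form. The log form is a sum of per-technique terms, which is convenient when the condition has to be expressed with linear constraints.

```python
import math

def sequence_success_rate(rates):
    """Product of per-technique success rates r_t (independence assumed)."""
    return math.prod(rates)

def is_highly_likely(rates, delta):
    """Log-domain test of Eq. (1): sum(log r_t) >= log(delta)."""
    if any(r == 0.0 for r in rates):
        return False  # a fully blocked technique makes the sequence impossible
    return sum(math.log(r) for r in rates) >= math.log(delta)

rates = [0.9, 0.8, 0.5]           # hypothetical r_t values for one sequence
v = sequence_success_rate(rates)  # roughly 0.36
```

With these values the sequence counts as “highly likely” for a threshold of 0.3 but not for a threshold of 0.4.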

Definition 2.

The vulnerability of the component $\mathrm{D}$ with a set of mitigation measures $\mathscr{M}_{\mathrm{S}}\subseteq\mathscr{M}$ is given by the fraction of “highly likely” attack sequences

$$\operatorname{Vul}\left(\mathrm{D},\mathscr{M}_{\mathrm{S}},\delta\right)=\frac{N_{\mathrm{D}}\left(\mathscr{M}_{\mathrm{S}},\delta\right)}{N_{\mathrm{D}}}\tag{2}$$

where $\delta$ denotes the threshold success rate for an attack sequence to be “highly likely” and $N_{\mathrm{D}}\left(\mathscr{M}_{\mathrm{S}},\delta\right)$ denotes the number of “highly likely” attack sequences.
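A minimal sketch of Definition 2 follows. It assumes the per-technique success rates under a fixed mitigation set are already known; the numeric values and the function name `vulnerability` are hypothetical.

```python
import math

def vulnerability(sequences, rate, delta):
    """Fraction of attack sequences whose success rate is at least delta (Eq. 2)."""
    if not sequences:
        return 0.0
    likely = sum(
        1 for seq in sequences
        if math.prod(rate[t] for t in seq) >= delta
    )
    return likely / len(sequences)

# Hypothetical technique success rates under some fixed mitigation set.
rate = {"t1": 0.9, "t2": 0.8, "t3": 0.2, "t4": 0.7}
sequences = [["t1", "t2", "t4"], ["t1", "t3", "t4"], ["t1", "t2"]]
vul = vulnerability(sequences, rate, delta=0.5)
```

Here two of the three sequences have a success rate above 0.5, so the vulnerability is 2/3. Lowering the success rate of `t1` alone would reduce it for every sequence, since `t1` appears in all of them.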

Table 1: Sets and set elements used
Symbol Description
$\mathscr{T}$ Set of all cyber adversary techniques
$\mathscr{M}$ Set of all mitigation measures
$\mathscr{C}$ Set of cybersecurity budget sectors
$\mathscr{M}_{\mathrm{S}}$ Set of selected mitigation measures
$\mathscr{S}_{\mathrm{D}}$ Set of attack sequences for device $\mathrm{D}$
$N_{\mathscr{T}}$ Number of adversary techniques
$N_{\mathscr{M}}$ Number of mitigation measures
$N_{\mathrm{D}}$ Number of attack sequences in $\mathscr{S}_{\mathrm{D}}$
$N_{\mathscr{C}}$ Number of cybersecurity budget sectors
$t$ An adversarial technique in set $\mathscr{T}$
$m$ A mitigation measure in set $\mathscr{M}$
$\mathscr{A}$ An attack sequence

Proposed Approach

The goal of cybersecurity planning is to identify which mitigation measures to implement to reduce the vulnerability of the cyber-physical component under consideration. We define the optimal defender problem as follows.

Problem 1 (Optimal Defender Problem).

Given a limited budget to enhance the cybersecurity of a component $\mathrm{D}$, find the optimal set of mitigation measures $\mathscr{M}_{\mathrm{S}}$ to minimize its vulnerability $\operatorname{Vul}\left(\mathrm{D},\mathscr{M}_{\mathrm{S}}\right)$ for a given set of attack sequences $\mathscr{S}_{\mathrm{D}}$.

The main challenge arises when evaluating the cost of implementing each mitigation measure, either in terms of monetary investment or required time commitment. Furthermore, the efficacy of each mitigation measure against the adversarial techniques is usually unknown. However, we note that allocating additional budget generally improves the efficacy of mitigation measures. Therefore, we formulate the problem so as to address how a limited cybersecurity budget should be allocated to reduce component vulnerability. We state the strategic cybersecurity budget allocation problem as follows:

Problem 2 (Cybersecurity Budget Allocation Problem).

Given a limited budget to enhance the cybersecurity of a component $\mathrm{D}$, find the optimal way to partition the budget in order to improve efficacy of a set of mitigation measures $\mathscr{M}_{\mathrm{S}}$ and thereby minimize the vulnerability $\operatorname{Vul}\left(\mathrm{D},\mathscr{M}_{\mathrm{S}}\right)$ of the component with a given set of attack sequences $\mathscr{S}_{\mathrm{D}}$.

Budget allocation. Following (Georgiadou, Mouzakitis, and Askounis 2021), we categorize the mitigation measures into the following overlapping sectors (or categories): asset management, business continuity, access and trust, operations, defense, security governance and employee training. We partition the entire available budget into the above $N_{\mathscr{C}}$ sectors. Let $b_{j}$ denote the portion of budget assigned to the $j^{th}$ category and $\sum_{j}{b_{j}}=1$. Further, we define matrix $\boldsymbol{C}\in\left\{0,1\right\}^{N_{\mathscr{M}}\times N_{\mathscr{C}}}$ such that $C_{ij}=1$ if the $i^{th}$ mitigation is included in the $j^{th}$ category, otherwise $C_{ij}=0$.

The underlying assumption is that allocating budget improves mitigation measure efficacy, i.e., the probability that a mitigation measure successfully prevents a technique. In our case, we also assume that for a given mitigation, this probability is uniform for all associated techniques. A mitigation measure $m_{i}\in\mathscr{M}$ belongs to one or more of the $N_{\mathscr{C}}$ sectors. We assume that in order to improve the efficacy of a mitigation, the budget must be allocated to all of the associated sectors to which it belongs. Therefore, we compute the fractional budget $f_{i}$ allocated for improving efficacy of mitigation measure $m_{i}$ as the weighted sum of the category budgets,

$$f_{i}=\frac{\sum_{j=1}^{N_{\mathscr{C}}}C_{ij}b_{j}}{\sum_{j=1}^{N_{\mathscr{C}}}C_{ij}}\tag{3}$$

Let $\boldsymbol{f}$ be the $N_{\mathscr{M}}$-length vector obtained by stacking the $f_{i}$ entries for all mitigation measures. The matrix version of (3) is written as $\boldsymbol{f}=\operatorname{diag}\left(\boldsymbol{C}\boldsymbol{1}\right)^{-1}\boldsymbol{C}\boldsymbol{b}$, where $\boldsymbol{1}$ is a vector of $1$s.
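To make (3) concrete, the sketch below computes $\boldsymbol{f}$ row by row in plain Python, which is equivalent to the matrix form $\operatorname{diag}\left(\boldsymbol{C}\boldsymbol{1}\right)^{-1}\boldsymbol{C}\boldsymbol{b}$. The membership matrix and the budget split are hypothetical values chosen for illustration.

```python
def fractional_budgets(C, b):
    """Eq. (3): f_i = sum_j C_ij * b_j / sum_j C_ij for every mitigation i."""
    f = []
    for row in C:
        n_sectors = sum(row)
        if n_sectors == 0:
            raise ValueError("each mitigation must belong to at least one sector")
        f.append(sum(c * bj for c, bj in zip(row, b)) / n_sectors)
    return f

# Hypothetical example: 3 mitigations, 3 sectors.
C = [
    [1, 0, 0],  # m1 belongs to sector 1 only
    [1, 1, 0],  # m2 belongs to sectors 1 and 2
    [0, 1, 1],  # m3 belongs to sectors 2 and 3
]
b = [0.5, 0.3, 0.2]           # budget fractions, summing to 1
f = fractional_budgets(C, b)  # approximately [0.5, 0.4, 0.25]
```

Note that a mitigation spread over several sectors receives the average of those sectors' budgets, so `m2` ends up with less than `m1` even though it also belongs to the best-funded sector.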

Let $\eta_{i,0}\in\left[0,1\right)$ denote the initial efficacy of the $i^{th}$ mitigation. This depends on the efficacy of the mitigation measures already in place, for example, the strength of a firewall. We define an exponential improvement in the efficacy with increase in the cybersecurity labor budget, such that it approaches a value of $1.0$ asymptotically, based on the following expression:

$$\eta_{i}=1-\left(1-\eta_{i,0}\right)e^{-\lambda f_{i}}\tag{4}$$

where $\eta_{i}$ is the improved efficacy and $\lambda$ is a suitable scaling factor to relate the improvement in efficacy to the overall budget allocation. Fig. 3 shows the exponential improvement in mitigation efficacy for $\lambda=0.1$. An exponential relation mimics the most natural behavior of diminishing returns on investments and has been used in similar models (Kubanek 2017).

Figure 3: Improvement in mitigation efficacy with increase in allocated budget.

In practice, the parameter $\lambda$ denotes the organization’s efficiency in utilizing the allocated budget to improve the overall cybersecurity. A high $\lambda$ means a higher rate of improvement in efficacy for a given increase in budget allocation factor. Further, note that for a given $\lambda$, the maximum improvement in efficacy of mitigation $i$ occurs when budget is allocated to all the associated sectors.
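The role of $\lambda$ in (4) can be explored with a short sketch. The initial efficacy, the grid of budget fractions, and the $\lambda$ values below are hypothetical and serve only to show the trend.

```python
import math

def improved_efficacy(eta0, f, lam):
    """Eq. (4): eta = 1 - (1 - eta0) * exp(-lam * f)."""
    return 1.0 - (1.0 - eta0) * math.exp(-lam * f)

eta0 = 0.2                   # hypothetical initial efficacy
for lam in (0.1, 1.0, 5.0):  # hypothetical efficiency values
    curve = [improved_efficacy(eta0, f, lam) for f in (0.0, 0.5, 1.0)]
    print(lam, [round(e, 3) for e in curve])
```

For every $\lambda$ the curve starts at the initial efficacy when no budget is allocated, increases with the allocated fraction, and stays below 1. Larger $\lambda$ values approach 1 faster.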

Success rate of techniques. Let $p_{i,k}$ be the probability that technique $t_{k}\in\mathscr{T}$ is avoided by a mitigation measure $m_{i}\in\mathscr{M}$. Note that $p_{ik}=0$ for all techniques $t_{k}\in\mathscr{T}$ which cannot be mitigated by $m_{i}$, i.e., $t_{k}\notin g^{-1}\left(m_{i}\right)$. Based on the assumption mentioned in the previous section, we have

$$p_{i,k}=\begin{cases}\eta_{i}, & \textrm{if }t_{k}\in g^{-1}\left(m_{i}\right)\\ 0, & \textrm{otherwise}\end{cases}\tag{5}$$

Let $\boldsymbol{M}\in\left\{0,1\right\}^{N_{\mathscr{M}}\times N_{\mathscr{T}}}$ denote the mitigation-technique relation matrix. The entry $M_{ik}$ along the $i^{th}$ row and $k^{th}$ column of $\boldsymbol{M}$ is $1$ if the technique $t_{k}\in\mathscr{T}$ is mitigated by mitigation measure $m_{i}\in\mathscr{M}$; otherwise the entry is $0$. This matrix is constructed from the MITRE ATT&CK framework. Fig. 4 shows the matrix as a heat-map where the rows denote the mitigation measures and the columns represent techniques. The techniques for each tactic are grouped together. The opacity of every element in the matrix shows the efficacy of the mitigation measure against the adversarial technique. We call this matrix the mitigation profile.

Figure 4: Mitigation-technique relation matrix with efficacy values for each mitigation measure against adversarial techniques.

We want to identify the set of mitigation measures $\mathscr{M}_{\mathrm{S}}\subseteq\mathscr{M}$ which would reduce the vulnerability of the component $\mathrm{D}$ with possible attack sequences listed in $\mathscr{S}_{\mathrm{D}}$. Let $x_{i}\in\left\{0,1\right\}$ denote the absence/presence of mitigation measure $m_{i}$ in the set $\mathscr{M}_{\mathrm{S}}$. Note that mitigation measures which are not present in the cyber system cannot affect the success rate of an adversarial technique. Using (5) we can write

$$p_{ik}=\eta_{i}x_{i}M_{ik}\tag{6}$$

Observe that $1-p_{ik}$ indicates the probability that technique $t_{k}$ is not avoided by mitigation measure $m_{i}$. We consider the events of technique $t_{k}$ being avoided by mitigation $m_{i}$, $\forall m_{i}\in\mathscr{M}$, to be independent. Therefore, the success rate of a technique $t_{k}$ in the cyber system with a set of mitigation measures $\mathscr{M}_{\mathrm{S}}$ is computed as

$$r_{k}=\prod_{i,\,m_{i}\in\mathscr{M}_{\mathrm{S}}}\left(1-p_{ik}\right)=\prod_{i=1}^{N_{\mathscr{M}}}\left(1-x_{i}M_{ik}\eta_{i}\right)\tag{7}$$
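As a concrete illustration of (7), the following minimal Python sketch evaluates the success rate of every technique for a chosen set of mitigation measures. The toy matrix `M`, the efficacies `eta` and the function name `technique_success_rates` are illustrative assumptions and are not taken from the MITRE ATT&CK data used in this paper.

```python
def technique_success_rates(x, M, eta):
    """Evaluate r_k = prod_i (1 - x_i * M_ik * eta_i) for every technique k, as in (7)."""
    n_mit, n_tech = len(M), len(M[0])
    rates = []
    for k in range(n_tech):
        r_k = 1.0
        for i in range(n_mit):
            # A factor differs from 1 only if m_i is selected AND mitigates t_k.
            r_k *= 1.0 - x[i] * M[i][k] * eta[i]
        rates.append(r_k)
    return rates


# Hypothetical toy data: 3 mitigation measures (rows), 2 techniques (columns).
M = [[1, 0],
     [1, 1],
     [0, 1]]
eta = [0.5, 0.2, 0.4]   # assumed efficacies
x = [1, 1, 0]           # m_1 and m_2 are implemented, m_3 is not

rates = technique_success_rates(x, M, eta)
print(rates)
```

With no mitigation selected, every factor equals 1 and each technique keeps a success rate of 1, which matches the base case discussed later in the results.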

The logarithm of (7) is computed as

$$L_{k}=\log\left(r_{k}\right)=\sum_{i=1}^{N_{\mathscr{M}}}\log\left(1-x_{i}M_{ik}\eta_{i}\right)\tag{8}$$

It is interesting to note that

$$\log\left(1-x_{i}M_{ik}\eta_{i}\right)=\begin{cases}\log\left(1-\eta_{i}\right), & \text{if } x_{i}=1,\ M_{ik}=1\\ 0, & \text{otherwise}\end{cases}\tag{9}$$

which helps us simplify the log success rate of a technique $t_{k}$ as

$$L_{k}=\log\left(r_{k}\right)=\sum_{i=1}^{N_{\mathscr{M}}}x_{i}M_{ik}\log\left(1-\eta_{i}\right)\tag{10}$$

Using (4) and (10) we have

$$\log\left(r_{k}\right)=\sum_{i=1}^{N_{\mathscr{M}}}x_{i}M_{ik}\left[\log\left(1-\eta_{i,0}\right)-\lambda f_{i}\right]\tag{11}$$

Let $\boldsymbol{x}$ be the $N_{\mathscr{M}}$-length vector constructed by stacking the $x_{i}$ for all mitigation measures. We define matrix $\boldsymbol{P}\in\mathbb{R}^{N_{\mathscr{M}}\times N_{\mathscr{T}}}$ with element $P_{ik}=M_{ik}\log\left(1-\eta_{i,0}\right)$ along the $i^{th}$ row and $k^{th}$ column. Note that $P_{ik}\leq 0$ for all $i,k$ since each nonzero entry is the logarithm of a fractional value. The logarithm of the success rate of technique $t_{k}$ can therefore be computed as

$$\log\left(r_{k}\right)=\left[\boldsymbol{P}^{T}\boldsymbol{x}\right]_{k}-\lambda\left[\boldsymbol{M}^{T}\operatorname{diag}(\boldsymbol{f})\boldsymbol{x}\right]_{k}\tag{12}$$

where $\left[\boldsymbol{z}\right]_{k}$ denotes the $k^{th}$ element of vector $\boldsymbol{z}$.
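The equivalence between the product form (7) and the linear form (12) can be checked numerically. The sketch below is a minimal Python illustration on hypothetical toy data. It assumes the efficacy model $1-\eta_{i}=(1-\eta_{i,0})e^{-\lambda f_{i}}$, which is what (11) implies when compared with (10); the function names and all numerical values are assumptions made for this example.

```python
import math


def log_rates_direct(x, M, eta0, f, lam):
    """log r_k from the product form (7)."""
    # Assumption: 1 - eta_i = (1 - eta_{i,0}) * exp(-lam * f_i), as implied by (11).
    eta = [1.0 - (1.0 - e0) * math.exp(-lam * fi) for e0, fi in zip(eta0, f)]
    out = []
    for k in range(len(M[0])):
        r_k = 1.0
        for i in range(len(M)):
            r_k *= 1.0 - x[i] * M[i][k] * eta[i]
        out.append(math.log(r_k))
    return out


def log_rates_linear(x, M, eta0, f, lam):
    """log r_k from the matrix form (12): [P^T x]_k - lam * [M^T diag(f) x]_k."""
    n_mit, n_tech = len(M), len(M[0])
    P = [[M[i][k] * math.log(1.0 - eta0[i]) for k in range(n_tech)]
         for i in range(n_mit)]
    out = []
    for k in range(n_tech):
        base = sum(P[i][k] * x[i] for i in range(n_mit))
        budget = sum(M[i][k] * f[i] * x[i] for i in range(n_mit))
        out.append(base - lam * budget)
    return out


# Hypothetical toy data: 3 mitigation measures, 2 techniques.
M = [[1, 0], [1, 1], [0, 1]]
eta0 = [0.5, 0.2, 0.4]   # assumed base efficacies, each strictly below 1
f = [0.3, 0.5, 0.2]      # assumed mitigation-specific budget shares
x = [1, 1, 0]
lam = 1.5                # assumed defender skill level

direct = log_rates_direct(x, M, eta0, f, lam)
linear = log_rates_linear(x, M, eta0, f, lam)
print(direct, linear)
```

Both routes agree up to floating-point rounding, and the linear form is the one that can be embedded in a mixed integer program.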

Next, we evaluate the success rate of an attack sequence. Recall that an attack sequence is a list of techniques. We define the attack sequence and technique relation matrix $\boldsymbol{S}\in\left\{0,1\right\}^{N_{\mathrm{D}}\times N_{\mathscr{T}}}$ where the entry $S_{lk}$ along the $l^{th}$ row and $k^{th}$ column is $1$ if the technique $t_{k}$ is present in the attack sequence $\mathscr{A}_{l}\in\mathscr{S}_{\mathrm{D}}$ and $0$ otherwise. The logarithm of the success rate of an attack sequence can be computed as

$$\log\left(v_{l}\right)=\sum_{t_{k}\in\mathscr{A}_{l}}\log r_{k}=\sum_{k=1}^{N_{\mathscr{T}}}S_{lk}\log r_{k}\tag{13}$$

We can stack $\log\left(v_{l}\right)$ for all the attack sequences into an $N_{\mathrm{D}}$-length vector $\boldsymbol{l}$, which can be expressed as

$$\boldsymbol{l}=\boldsymbol{S}\boldsymbol{P}^{T}\boldsymbol{x}-\lambda\boldsymbol{S}\boldsymbol{M}^{T}\operatorname{diag}(\boldsymbol{f})\boldsymbol{x}\tag{14}$$
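To make the stacking step concrete, the following minimal Python sketch applies (13) row by row to obtain the vector of log success rates of the attack sequences. The technique success rates `r`, the matrix `S` and the function name `sequence_log_success` are hypothetical values chosen for illustration only.

```python
import math


def sequence_log_success(S, log_r):
    """Return the vector with entries sum_k S_lk * log r_k, the element-wise form of (13)."""
    return [sum(s_lk * lr for s_lk, lr in zip(row, log_r)) for row in S]


# Hypothetical toy data: 3 techniques, 2 attack sequences.
r = [0.4, 0.8, 1.0]        # assumed technique success rates
S = [[1, 1, 0],            # sequence 1 uses t_1 and t_2
     [0, 1, 1]]            # sequence 2 uses t_2 and t_3

log_r = [math.log(r_k) for r_k in r]
log_v = sequence_log_success(S, log_r)
v = [math.exp(value) for value in log_v]   # back to success rates
print(v)
```

Exponentiating each entry recovers the product of the technique success rates along the sequence, so the first sequence succeeds with rate $0.4\times 0.8$ and the second with rate $0.8\times 1.0$.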

We can formulate Problem 2 as follows

$$\begin{aligned}\min_{\boldsymbol{b},\boldsymbol{x}}\quad & \operatorname{Vul}\left(\mathrm{D},\mathscr{M}_{\mathrm{S}}\right) && \text{(15a)}\\ \text{s.to}\quad & \boldsymbol{l}=\boldsymbol{S}\boldsymbol{P}^{T}\boldsymbol{x}-\lambda\boldsymbol{S}\boldsymbol{M}^{T}\operatorname{diag}(\boldsymbol{f})\boldsymbol{x} && \text{(15b)}\\ & \boldsymbol{f}=\operatorname{diag}\left(\boldsymbol{C}\boldsymbol{1}\right)^{-1}\boldsymbol{C}\boldsymbol{b} && \text{(15c)}\\ & \mathbf{1}^{T}\boldsymbol{b}=1,\quad\boldsymbol{b}\geq 0 && \text{(15d)}\end{aligned}$$
Table 2: Matrices, vectors and variables used

| Symbol | Description |
|---|---|
| $\boldsymbol{C}$ | Mitigation & budget category relation matrix |
| $\boldsymbol{M}$ | Mitigation & technique relation matrix |
| $\boldsymbol{S}$ | Attack sequence & technique relation matrix |
| $\boldsymbol{x}$ | Vector of mitigation measure indicator |
| $\boldsymbol{y}$ | Vector of attack sequence indicator |
| $\boldsymbol{f}$ | Vector of mitigation specific budget partitions |
| $\boldsymbol{b}$ | Vector of cybersecurity budget partitions |
| $\boldsymbol{l}$ | Vector of log of attack sequence success rate |
| $\eta_{i}$ | Efficacy of mitigation $m_{i}$ |
| $r_{k}$ | Success rate of technique $t_{k}$ |
| $v_{l}$ | Success rate of attack sequence $\mathscr{A}_{l}$ |
| $\lambda$ | Skill level of defender |
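Constraint (15c) converts the budget partition $\boldsymbol{b}$ over categories into a mitigation-specific share $\boldsymbol{f}$. The minimal Python sketch below illustrates this mapping. It assumes that $\boldsymbol{C}$ has one row per mitigation measure and one column per budget category, with $C_{ij}=1$ when measure $m_{i}$ belongs to category $j$ (an orientation inferred from the dimensions in (15c)), and that every measure belongs to at least one category. The toy values and the function name `mitigation_budget` are hypothetical.

```python
def mitigation_budget(C, b):
    """Compute f = diag(C 1)^{-1} C b, i.e. f_i is the mean share of the categories covering m_i."""
    f = []
    for row in C:
        n_cat = sum(row)   # [C 1]_i, number of categories that include m_i
        # Assumption: n_cat > 0, otherwise diag(C 1) would not be invertible.
        f.append(sum(c_ij * b_j for c_ij, b_j in zip(row, b)) / n_cat)
    return f


# Hypothetical toy data: 3 mitigation measures, 3 budget categories.
C = [[1, 0, 0],
     [1, 1, 0],    # m_2 is covered by two categories
     [0, 0, 1]]
b = [0.5, 0.3, 0.2]   # budget partition, sums to 1 as required by (15d)

f = mitigation_budget(C, b)
print(f)
```

Because each $f_{i}$ is an average of entries of $\boldsymbol{b}$, it always lies between 0 and 1 whenever (15d) holds.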

Proposed Optimization Framework

First, we define the variable $h_{i}=f_{i}x_{i}$ to eliminate the bilinear product term in (15b). The corresponding $N_{\mathscr{M}}$-length vector obtained by stacking the $h_{i}$ is denoted by $\boldsymbol{h}$. Since $x_{i}\in\{0,1\}$ and $0\leq f_{i}\leq 1$, we can write the following inequalities

$$\begin{aligned}\boldsymbol{h} &\leq \boldsymbol{x} && \text{(16a)}\\ \boldsymbol{h} &\geq \boldsymbol{0} && \text{(16b)}\\ \boldsymbol{h} &\leq \boldsymbol{f} && \text{(16c)}\\ \boldsymbol{h} &\geq \boldsymbol{x}-\left(\boldsymbol{1}-\boldsymbol{f}\right) && \text{(16d)}\end{aligned}$$

Note that when $x_{i}=0$, we obtain the equality $h_{i}=0$ from the first two inequalities, and when $x_{i}=1$, we obtain $h_{i}=f_{i}$ from the last two inequalities.
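The claim that the four inequalities pin $h_{i}$ to the product $f_{i}x_{i}$ can be verified directly. The following minimal Python sketch computes the interval of values of $h_{i}$ allowed by (16a)-(16d) for a binary $x_{i}$ and shows that it collapses to a single point. Exact rational arithmetic is used so that rounding does not blur the check; the function name `h_interval` and the sample values of $f_{i}$ are assumptions of this example.

```python
from fractions import Fraction


def h_interval(x_i, f_i):
    """Feasible interval for h_i under (16a)-(16d), given binary x_i and 0 <= f_i <= 1."""
    lower = max(Fraction(0), x_i - (1 - f_i))   # (16b) and (16d)
    upper = min(Fraction(x_i), f_i)             # (16a) and (16c)
    return lower, upper


# Sample budget shares chosen for illustration.
for x_i in (0, 1):
    for f_i in (Fraction(0), Fraction(3, 10), Fraction(1)):
        lower, upper = h_interval(x_i, f_i)
        # The interval degenerates to the bilinear product f_i * x_i.
        assert lower == upper == f_i * x_i
        print(x_i, f_i, lower)
```

This exactness relies on $x_{i}$ being binary; for a fractional $x_{i}$ the same inequalities would only bound the product rather than determine it.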

We use the definition of vulnerability described through (2), which leads us to a MILP as discussed below. We define the binary variable $y_{l}\in\left\{0,1\right\}$ to denote whether sequence $\mathscr{A}_{l}\in\mathscr{S}_{\mathrm{D}}$ is “highly likely” or not. Mathematically,

$$y_{l}=\begin{cases}1, & \text{if } \left[\boldsymbol{S}\left(\boldsymbol{P}^{T}\boldsymbol{x}-\lambda\boldsymbol{M}^{T}\boldsymbol{h}\right)\right]_{l}\geq\log\delta\\ 0, & \text{otherwise}\end{cases}\tag{17}$$

Define $\delta^{\prime}=\log\delta$. We can rewrite (17) for all sequences $\mathscr{A}_{l}\in\mathscr{S}_{\mathrm{D}}$ with a large positive constant $K$ using the following inequalities

$$\begin{aligned}K\boldsymbol{y} &\geq \boldsymbol{S}\boldsymbol{P}^{T}\boldsymbol{x}-\lambda\boldsymbol{S}\boldsymbol{M}^{T}\boldsymbol{h}-\delta^{\prime}\mathbf{1} && \text{(18a)}\\ K\left(\mathbf{1}-\boldsymbol{y}\right) &\geq \delta^{\prime}\mathbf{1}-\boldsymbol{S}\boldsymbol{P}^{T}\boldsymbol{x}+\lambda\boldsymbol{S}\boldsymbol{M}^{T}\boldsymbol{h} && \text{(18b)}\end{aligned}$$
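The effect of the two big-$K$ inequalities can be illustrated for a single sequence. In the minimal Python sketch below, `s_l` stands for the $l^{th}$ entry of $\boldsymbol{S}\boldsymbol{P}^{T}\boldsymbol{x}-\lambda\boldsymbol{S}\boldsymbol{M}^{T}\boldsymbol{h}$, and the function lists which values of $y_{l}$ satisfy both (18a) and (18b). The threshold $\delta=0.1$, the constant $K=100$ and the function name `feasible_y` are arbitrary assumptions for this example; in general $K$ has to be at least as large as the largest gap between a sequence's log success rate and $\delta^{\prime}$.

```python
import math


def feasible_y(s_l, delta_prime, K):
    """Return the values of y_l in {0, 1} that satisfy both (18a) and (18b)."""
    feasible = []
    for y in (0, 1):
        ok_a = K * y >= s_l - delta_prime          # (18a)
        ok_b = K * (1 - y) >= delta_prime - s_l    # (18b)
        if ok_a and ok_b:
            feasible.append(y)
    return feasible


delta_prime = math.log(0.1)   # assumed threshold delta = 0.1
K = 100.0                     # assumed big constant, large enough for these toy values

above = feasible_y(math.log(0.5), delta_prime, K)    # success rate above the threshold
below = feasible_y(math.log(0.01), delta_prime, K)   # success rate below the threshold
at_threshold = feasible_y(delta_prime, delta_prime, K)
print(above, below, at_threshold)
```

A sequence above the threshold is forced to $y_{l}=1$ and one below it to $y_{l}=0$. Exactly at the threshold both values are feasible, so a minimizing solver may choose $y_{l}=0$ there; this differs from (17) only on that boundary.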

Our aim is to minimize the number of “highly likely” attack sequences. We can therefore write the optimization problem as

$$\begin{aligned}\boldsymbol{x}^{\star}=\operatorname*{arg\,min}_{\boldsymbol{b}\in\mathbb{R}^{N_{\mathscr{C}}},\,\boldsymbol{x}\in\left\{0,1\right\}^{N_{\mathscr{M}}}}\quad & \mathbf{1}^{T}\boldsymbol{y} && \text{(19a)}\\ \text{s.to}\quad & \boldsymbol{f}=\operatorname{diag}\left(\boldsymbol{C}\boldsymbol{1}\right)^{-1}\boldsymbol{C}\boldsymbol{b} && \text{(19b)}\\ & \mathbf{1}^{T}\boldsymbol{b}=1,\quad\boldsymbol{b}\geq 0 && \text{(19c)}\\ & \text{(16)},\ \text{(18)} && \text{(19d)}\end{aligned}$$
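To make the objective and constraints concrete, the following Python sketch solves a tiny instance by exhaustive search. It is not a MILP solver and not the implementation used for the results in this paper: it enumerates every binary $\boldsymbol{x}$ and a coarse grid of budget partitions $\boldsymbol{b}$ on the simplex, evaluates the indicator in (17) directly, and keeps the combination with the fewest “highly likely” sequences. All matrices, parameter values and function names are hypothetical, and the sketch assumes each base efficacy is below 1 and each mitigation measure belongs to at least one budget category.

```python
import itertools
import math


def count_likely(x, b, C, M, S, eta0, lam, delta):
    """Number of attack sequences with success rate >= delta, i.e. the objective (19a)."""
    n_mit, n_tech = len(M), len(M[0])
    # (19b): f_i is the mean budget share of the categories covering mitigation i.
    f = [sum(c * bj for c, bj in zip(C[i], b)) / sum(C[i]) for i in range(n_mit)]
    # (11): log success rate of every technique.
    log_r = [sum(x[i] * M[i][k] * (math.log(1.0 - eta0[i]) - lam * f[i])
                 for i in range(n_mit)) for k in range(n_tech)]
    # (13): log success rate of every sequence, compared with log(delta) as in (17).
    log_v = [sum(row[k] * log_r[k] for k in range(n_tech)) for row in S]
    return sum(1 for value in log_v if value >= math.log(delta))


def brute_force(C, M, S, eta0, lam, delta, steps=10):
    """Exhaustive search over binary x and a grid of budget partitions b with 1^T b = 1."""
    n_mit, n_cat = len(M), len(C[0])
    grid = [p for p in itertools.product(range(steps + 1), repeat=n_cat)
            if sum(p) == steps]
    best = None
    for x in itertools.product((0, 1), repeat=n_mit):
        for p in grid:
            b = [q / steps for q in p]
            value = count_likely(x, b, C, M, S, eta0, lam, delta)
            if best is None or value < best[0]:
                best = (value, x, b)
    return best


# Hypothetical toy instance: 2 mitigations, 2 categories, 2 techniques, 2 sequences.
C = [[1, 0], [0, 1]]
M = [[1, 0], [0, 1]]
S = [[1, 1], [0, 1]]
eta0 = [0.5, 0.5]

best = brute_force(C, M, S, eta0, lam=2.0, delta=0.3)
print(best)
```

Exhaustive search grows exponentially with the number of mitigation measures, which is why the linearized constraints (16) and (18) matter: they allow the same problem to be handed to a standard MILP solver.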

Results and Discussion

We use the database mapping and the HAG generation frameworks (discussed in Section Preliminaries) to obtain the attack sequences for a component used in the CPES. In this section, we discuss the results obtained for two components: (i) substation automation controller, and (ii) smart inverter. For each component, we identify the set of MITRE ATT&CK adversary techniques which can be executed on it. Thereafter, we use the HAG generation framework to generate 100 sample HAGs. These steps are accomplished using the frameworks described in Section Preliminaries. From these HAGs, we identify possible attack sequences which can be executed on the components. In our case, we identify 397 sequences for substation automation controller and 364 sequences for smart inverter. We select only those attack sequences which contain adversarial techniques included under the “Impact” tactic of the MITRE ATT&CK framework. Therefore, we shortlist the attack sequences which are meaningful in the context of creating an impact in the CPES.

Figure 5: Matrix showing the optimal mitigation strategy for the HAG for ‘smart inverter’ (top) and heat map of technique success rates (bottom) after implementation of the strategy.
Figure 6: Optimal budget allocation in different sectors for improving cybersecurity of “substation automation controller” (left) and “smart inverter” (right). The stacked bar plots denote the percentage budget allocated to each sector and the red line plot denotes the vulnerability of the component with the prescribed budget allocation. Increased skill level of defender allows allocated budget to be efficiently used in improving the efficacy of mitigation and hence we observe reduction in vulnerability.

We assume a base case with the HAG presented in Fig. 2, where no mitigation measures are implemented. Therefore, all adversary techniques in the HAG have a success rate of 100%. Fig. 5 shows the optimal mitigation profile (top plot) and, as a heat map on the HAG, the success rates of the techniques after implementing the optimal mitigation measures (bottom plot). It is evident from the mitigation profile that only a particular set of mitigation measures is selected. The heat map shows the impact of implementing the optimal mitigation strategy in reducing the success rate of adversary techniques in the HAG. We note that the optimal strategy identifies the mitigation measures which reduce the success rate of techniques such that the maximum number of attack sequences are affected. This observation can be validated from the fact that the techniques with the highest out-degree are the most influential nodes in the HAG. These are the techniques with the lowest success rates after the mitigation measures are implemented.

Next, we use the proposed optimization framework to partition the allocated budget into various organizational sectors. We choose 7 sectors and identify the set of mitigation measures in each of the sectors as described in (Georgiadou, Mouzakitis, and Askounis 2021): (i) the assets sector includes hardware and software asset management, network infrastructure management, and improving data security and privacy, (ii) the continuity sector consists of preventive strategies to continue business operations in the event of a data breach, (iii) the access & trust sector deals with policies and practices for account and access management, (iv) the operations sector involves performing system risk assessment through Threat Intelligence programs, (v) the defense sector includes mitigation measures associated with firewall implementation, (vi) the governance sector covers tasks related to audit log management and (vii) the individual sector involves practices that make employees aware of cybersecurity risks through training programs and frequent security skill evaluation.

Fig. 6 shows the results of optimal budget allocation for two components: (i) substation automation controller and (ii) smart inverter. We perform multiple experiments for different skill levels of the defender, solving an optimization problem for each skill level. Recall that parameter $\lambda$ denotes the skill level. Each bar shows the partitions of the budget for a particular defender skill level. Note that the partitions sum up to 100%. Further, we denote the vulnerability of the component to the HAG sequences under the optimal mitigation policy with the red dot on each bar. This is computed using (2) after computing the optimal value of the objective function as follows.

$$\operatorname{Vul}\left(\mathrm{D},\mathscr{M}_{\mathrm{S}},\delta\right)=\frac{N_{\mathrm{D}}\left(\mathscr{M}_{\mathrm{S}},\delta\right)}{N_{\mathrm{D}}}=\frac{\mathbf{1}^{T}\boldsymbol{y}^{\star}}{N_{\mathrm{D}}}\tag{20}$$

where $\boldsymbol{y}^{\star}$ denotes the optimal value of $\boldsymbol{y}$ in (19).
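As a small worked illustration of (20), the Python sketch below computes the vulnerability from a vector of sequence indicators. The indicator values are hypothetical and are not results from this paper; the function name `vulnerability` is likewise an assumption.

```python
def vulnerability(y_star):
    """Fraction of attack sequences flagged as highly likely, as in (20)."""
    return sum(y_star) / len(y_star)


# Hypothetical optimal indicators for N_D = 5 attack sequences.
y_star = [1, 0, 0, 1, 0]
vul = vulnerability(y_star)
print(vul)
```

Here two of the five sequences remain above the threshold, so the vulnerability is two fifths.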

We note that with an increase in defender skill level, the vulnerability of the component to HAG sequences reduces. However, we notice that the budget allocation for each sector does not follow any particular trend. This is because sectors overlap in their coverage of mitigation measures. In the case of “substation automation controller”, we observe that for an unskilled defender, the proposed optimization framework recommends that the budget be allocated mostly towards the “access” sector, which comprises mitigation measures related to access management, account management and password robustness. With a skilled defender, we observe that the budget allocation gets divided among other sectors such as “assets” and “defense”. We note a similar observation for “smart inverter”; however, the budget gets divided between the “assets” and “access” sectors for more skilled defenders.

Conclusion

We propose a generalized framework which performs an optimal partitioning of a limited cybersecurity budget into various organizational sectors in order to improve the cybersecurity of a smart device or component in the CPES. The framework identifies the adversarial threats and possible attack sequences which can be performed to exploit cyber vulnerabilities of the component. Thereafter, we formulate a MILP optimization problem which aims to evaluate the optimal budget partitions in order to minimize the number of highly likely attack sequences. Though we provide results for using the framework in CPES, the proposed methodology can be extended to any cyber-physical system. Such a framework equips managers in an organization to formulate cybersecurity policies and allocate staff budgets in order to improve the overall security and reduce the risk of APTs.

In practice, a significant portion of cybersecurity budget allocation is aimed at improving software and hardware tools to prevent APTs along with hiring skilled cybersecurity personnel. In our paper, we combine aspects of cybersecurity tools and personnel skill through the parameters of efficacy and defender skill in our simplified analytic expressions. We plan to identify dedicated parameters which quantify these aspects in order to infuse realism in our model as part of our future work.

References

  • Bansal and Kumar (2020) Bansal, S.; and Kumar, D. 2020. IoT Ecosystem: A Survey on Devices, Gateways, Operating Systems, Middleware and Communication. International Journal of Wireless Information Networks, 27(3): 340–364.
  • Chen et al. (2014) Chen, B.; Butler-Purry, K. L.; Goulart, A.; and Kundur, D. 2014. Implementing a real-time cyber-physical system test bed in RTDS and OPNET. In 2014 North American Power Symposium (NAPS), 1–6.
  • Culafi (2021) Culafi, A. 2021. Why patching vulnerabilities is still a problem, and how to fix it. https://www.techtarget.com/searchsecurity/news/252503950/Why-patching-vulnerabilities-is-still-a-problem-and-how-to-fix-it.
  • Das et al. (2022) Das, S. S.; Dutta, A.; Purohit, S.; Serra, E.; Halappanavar, M.; and Pothen, A. 2022. Towards Automatic Mapping of Vulnerabilities to Attack Patterns using Large Language Models. In 2022 IEEE International Symposium on Technologies for Homeland Security (HST), 1–7.
  • Donald, Meyur, and Purohit (2023) Donald, S.; Meyur, R.; and Purohit, S. 2023. Hybrid Attack Graph Generation with Graph Convolutional Deep-Q Learning. In The 3rd Workshop on Artificial Intelligence-Enabled Cybersecurity Analytics, KDD 2023. Long Beach, CA, USA.
  • Dorsch et al. (2014) Dorsch, N.; Kurtz, F.; Georg, H.; Hägerling, C.; and Wietfeld, C. 2014. Software-defined networking for Smart Grid communications: Applications, challenges and advantages. In 2014 IEEE International Conference on Smart Grid Communications (SmartGridComm), 422–427.
  • Dutta et al. (2022) Dutta, A.; Purohit, S.; Bhattacharya, A.; and Bel, O. 2022. Cyber Attack Sequences Generation for Electric Power Grid. In The 10th Workshop on Modelling and Simulation of Cyber-Physical Energy Systems (MSCPES), 1–6. IEEE.
  • Georg et al. (2013) Georg, H.; Müller, S. C.; Dorsch, N.; Rehtanz, C.; and Wietfeld, C. 2013. INSPIRE: Integrated co-simulation of power and ICT systems for real-time evaluation. In 2013 IEEE International Conference on Smart Grid Communications (SmartGridComm), 576–581.
  • Georgiadou, Mouzakitis, and Askounis (2021) Georgiadou, A.; Mouzakitis, S.; and Askounis, D. 2021. Assessing MITRE ATT&CK Risk Using a Cyber-Security Culture Framework. Sensors, 21(9): 3267.
  • Haack et al. (2013) Haack, J.; Akyol, B.; Tenney, N.; Carpenter, B.; Pratt, R.; and Carroll, T. 2013. VOLTTRON: An agent platform for integrating electric vehicles and Smart Grid. In 2013 International Conference on Connected Vehicles and Expo (ICCVE), 81–86.
  • Keliris et al. (2016) Keliris, A.; Konstantinou, C.; Tsoutsos, N. G.; Baiad, R.; and Maniatakos, M. 2016. Enabling multi-layer cyber-security assessment of Industrial Control Systems through Hardware-In-The-Loop testbeds. In 2016 21st Asia and South Pacific Design Automation Conference (ASP-DAC), 511–518.
  • Kubanek (2017) Kubanek, J. 2017. Optimal decision making and matching are tied through diminishing returns. Proceedings of the National Academy of Sciences, 114(32): 8499–8504.
  • Li et al. (2019) Li, L.; He, W.; Xu, L.; Ash, I.; Anwar, M.; and Yuan, X. 2019. Investigating the impact of cybersecurity policy awareness on employees’ cybersecurity behavior. International Journal of Information Management, 45: 13–24.
  • Li and Liu (2021) Li, Y.; and Liu, Q. 2021. A comprehensive review study of cyber-attacks and cyber security; Emerging trends and recent developments. Energy Reports, 7: 8176–8186.
  • Liggett et al. (2019) Liggett, R.; Lee, J. R.; Roddy, A. L.; and Wallin, M. A. 2019. The Dark Web as a Platform for Crime: An Exploration of Illicit Drug, Firearm, CSAM, and Cybercrime Markets, 1–27. Cham: Springer International Publishing.
  • MITRE Corporation (2023) MITRE Corporation. 2023. MITRE ATT&CK Framework. Last accessed February 2023.
  • Nandi, Medal, and Vadlamani (2016) Nandi, A. K.; Medal, H. R.; and Vadlamani, S. 2016. Interdicting attack graphs to protect organizations from cyber attacks: A bi-level defender–attacker model. Computers & Operations Research, 75: 118–131.
  • Queiroz, Mahmood, and Tari (2011) Queiroz, C.; Mahmood, A.; and Tari, Z. 2011. SCADASim—A Framework for Building SCADA Simulations. IEEE Transactions on Smart Grid, 2(4): 589–597.
  • Sridhar et al. (2017) Sridhar, S.; Ashok, A.; Mylrea, M.; Pal, S.; Rice, M.; and Gourisetti, S. N. G. 2017. A testbed environment for buildings-to-grid cyber resilience research and development. In 2017 Resilience Week (RWS), 12–17.
  • Srinidhi, Yan, and Tayi (2015) Srinidhi, B.; Yan, J.; and Tayi, G. K. 2015. Allocation of resources to cyber-security: The effect of misalignment of interest between managers and investors. Decision Support Systems, 75: 49–62.
  • Stanovich et al. (2013) Stanovich, M. J.; Leonard, I.; Sanjeev, K.; Steurer, M.; Roth, T. P.; Jackson, S.; and Bruce, M. 2013. Development of a smart-grid cyber-physical systems testbed. In 2013 IEEE PES Innovative Smart Grid Technologies Conference (ISGT), 1–6.
  • Subasi et al. (2022) Subasi, O.; Purohit, S.; Bhattacharya, A.; and Chatterjee, S. 2022. Impact-Driven Sampling Strategies for Hybrid Attack Graphs. In 2022 IEEE International Symposium on Technologies for Homeland Security (HST), 1–7.
  • Vasisht et al. (2022) Vasisht, S.; Rahman, A.; Ramachandran, T.; Bhattacharya, A.; and Adetola, V. 2022. Multi-fidelity Bayesian Optimization for Co-design of Resilient Cyber-Physical Systems. In 2022 ACM/IEEE 13th International Conference on Cyber-Physical Systems (ICCPS), 298–299.
  • Zografopoulos et al. (2021) Zografopoulos, I.; Ospina, J.; Liu, X.; and Konstantinou, C. 2021. Cyber-Physical Energy Systems Security: Threat Modeling, Risk Assessment, Resources, Metrics, and Case Studies. IEEE Access, 9: 29775–29818.