
Benchmarking Unsupervised Online IDS for Masquerade Attacks in CAN

Authors: Pablo Moriano, Steven C. Hespeler, Mingyan Li, Robert A. Bridges

Abstract: Vehicular controller area networks (CANs) are susceptible to masquerade attacks by malicious adversaries. In masquerade attacks, adversaries silence a targeted ID and then send malicious frames with forged content at the expected timing of benign frames. As masquerade attacks could seriously harm vehicle functionality and are the stealthiest attacks to detect in CAN, recent work has devoted attention to compare frameworks for detecting masquerade attacks in CAN. However, most existing works report offline evaluations using CAN logs already collected using simulations that do not comply with domain's real-time constraints. Here we contribute to advance the state of the art by introducing a benchmark study of four different non-deep learning (DL)-based unsupervised online intrusion detection systems (IDS) for masquerade attacks in CAN. Our approach differs from existing benchmarks in that we analyze the effect of controlling streaming data conditions in a sliding window setting. In doing so, we use realistic masquerade attacks being replayed from the ROAD dataset. We show that although benchmarked IDS are not effective at detecting every attack type, the method that relies on detecting changes at the hierarchical structure of clusters of time series produces the best results at the expense of higher computational overhead. We discuss limitations, open challenges, and how the benchmarked methods can be used for practical unsupervised online CAN IDS for masquerade attacks.

Submitted 19 June, 2024; originally announced June 2024.

Comments: 15 pages, 9 figures, 3 tables

arXiv:2405.00636 [pdf, other]

Robustness of graph embedding methods for community detection

Authors: Zhi-Feng Wei, Pablo Moriano, Ramakrishnan Kannan

Abstract: This study investigates the robustness of graph embedding methods for community detection in the face of network perturbations, specifically edge deletions. Graph embedding techniques, which represent nodes as low-dimensional vectors, are widely used for various graph machine learning tasks due to their ability to capture structural properties of networks effectively. However, the impact of perturbations on the performance of these methods remains relatively understudied. The research considers state-of-the-art graph embedding methods from two families: matrix factorization (e.g., LE, LLE, HOPE, M-NMF) and random walk-based (e.g., DeepWalk, LINE, node2vec). Through experiments conducted on both synthetic and real-world networks, the study reveals varying degrees of robustness within each family of graph embedding methods. The robustness is found to be influenced by factors such as network size, initial community partition strength, and the type of perturbation. Notably, node2vec and LLE consistently demonstrate higher robustness for community detection across different scenarios, including networks with degree and community size heterogeneity. These findings highlight the importance of selecting an appropriate graph embedding method based on the specific characteristics of the network and the task at hand, particularly in scenarios where robustness to perturbations is crucial.

Submitted 1 May, 2024; originally announced May 2024.

Comments: 17 pages, 26 figures, 3 tables. Comments are welcome

arXiv:2306.15588 [pdf]

Developing and Deploying Security Applications for In-Vehicle Networks

Authors: Samuel C Hollifield, Pablo Moriano, William L Lambert, Joel Asiamah, Isaac Sikkema, Michael D Iannacone

Abstract: Radiological material transportation is primarily facilitated by heavy-duty on-road vehicles. Modern vehicles have dozens of electronic control units or ECUs, which are small, embedded computers that communicate with sensors and each other for vehicle functionality. ECUs use a standardized network architecture--Controller Area Network or CAN--which presents grave security concerns that have been exploited by researchers and hackers alike. For instance, ECUs can be impersonated by adversaries who have infiltrated an automotive CAN and disable or invoke unintended vehicle functions such as brakes, acceleration, or safety mechanisms. Further, the quality of security approaches varies wildly between manufacturers. Thus, research and development of after-market security solutions have grown remarkably in recent years. Many researchers are exploring deployable intrusion detection and prevention mechanisms using machine learning and data science techniques. However, there is a gap between developing security system algorithms and deploying prototype security appliances in-vehicle. In this paper, we, a research team at Oak Ridge National Laboratory working in this space, highlight challenges in the development pipeline, and provide techniques to standardize methodology and overcome technological hurdles.

Submitted 27 June, 2023; originally announced June 2023.

Comments: 10 pages, PATRAM 22

arXiv:2304.07238 [pdf, other]

doi 10.1103/PhysRevE.108.054302

Robustness of community structure under edge addition

Authors: Moyi Tian, Pablo Moriano

Abstract: Communities often represent key structural and functional clusters in networks. To preserve such communities, it is important to understand their robustness under network perturbations. Previous work in community robustness analysis has focused on studying changes in the community structure as a response of edge rewiring and node or edge removal. However, the impact of increasing connectivity on the robustness of communities in networked systems is relatively unexplored. Studying the limits of community robustness under edge addition is crucial to better understanding the cases in which density expands or false edges erroneously appear. In this paper, we analyze the effect of edge addition on community robustness in synthetic and empirical temporal networks. We study two scenarios of edge addition: random and targeted. We use four community detection algorithms, Infomap, Label Propagation, Leiden, and Louvain, and demonstrate the results in community similarity metrics. The experiments on synthetic networks show that communities are more robust when the initial partition is stronger or the edge addition is random, and the experiments on empirical data also indicate that robustness performance can be affected by the community similarity metric. Overall, our results suggest that the communities identified by the different types of community detection algorithms exhibit different levels of robustness, and so the robustness of communities depends strongly on the choice of detection method.

Submitted 1 November, 2023; v1 submitted 14 April, 2023; originally announced April 2023.

Comments: 17 pages, 30 figures

Journal ref: Phys. Rev. E 108 (2023) 054302

arXiv:2205.01306 [pdf, other]

doi 10.1109/JIOT.2023.3303271

CANShield: Deep Learning-Based Intrusion Detection Framework for Controller Area Networks at the Signal-Level

Authors: Md Hasan Shahriar, Yang Xiao, Pablo Moriano, Wenjing Lou, Y. Thomas Hou

Abstract: Modern vehicles rely on a fleet of electronic control units (ECUs) connected through controller area network (CAN) buses for critical vehicular control. With the expansion of advanced connectivity features in automobiles and the elevated risks of internal system exposure, the CAN bus is increasingly prone to intrusions and injection attacks. As ordinary injection attacks disrupt the typical timing properties of the CAN data stream, rule-based intrusion detection systems (IDS) can easily detect them. However, advanced attackers can inject false data to the signal/semantic level, while looking innocuous by the pattern/frequency of the CAN messages. The rule-based IDS, as well as the anomaly-based IDS, are built merely on the sequence of CAN messages IDs or just the binary payload data and are less effective in detecting such attacks. Therefore, to detect such intelligent attacks, we propose CANShield, a deep learning-based signal-level intrusion detection framework for the CAN bus. CANShield consists of three modules: a data preprocessing module that handles the high-dimensional CAN data stream at the signal level and parses them into time series suitable for a deep learning model; a data analyzer module consisting of multiple deep autoencoder (AE) networks, each analyzing the time-series data from a different temporal scale and granularity, and finally an attack detection module that uses an ensemble method to make the final decision. Evaluation results on two high-fidelity signal-based CAN attack datasets show the high accuracy and responsiveness of CANShield in detecting advanced intrusion attacks.

Submitted 7 October, 2023; v1 submitted 3 May, 2022; originally announced May 2022.

Comments: 17 pages, 13 figures, A version of this paper is accepted by IEEE Internet of Things Journal

arXiv:2201.02665 [pdf, other]

doi 10.14722/autosec.2022.23028

Detecting CAN Masquerade Attacks with Signal Clustering Similarity

Authors: Pablo Moriano, Robert A. Bridges, Michael D. Iannacone

Abstract: Vehicular Controller Area Networks (CANs) are susceptible to cyber attacks of different levels of sophistication. Fabrication attacks are the easiest to administer -- an adversary simply sends (extra) frames on a CAN -- but also the easiest to detect because they disrupt frame frequency. To overcome time-based detection methods, adversaries must administer masquerade attacks by sending frames in lieu of (and therefore at the expected time of) benign frames but with malicious payloads. Research efforts have proven that CAN attacks, and masquerade attacks in particular, can affect vehicle functionality. Examples include causing unintended acceleration, deactivation of vehicle's brakes, as well as steering the vehicle. We hypothesize that masquerade attacks modify the nuanced correlations of CAN signal time series and how they cluster together. Therefore, changes in cluster assignments should indicate anomalous behavior. We confirm this hypothesis by leveraging our previously developed capability for reverse engineering CAN signals (i.e., CAN-D [Controller Area Network Decoder]) and focus on advancing the state of the art for detecting masquerade attacks by analyzing time series extracted from raw CAN frames. Specifically, we demonstrate that masquerade attacks can be detected by computing time series clustering similarity using hierarchical clustering on the vehicle's CAN signals (time series) and comparing the clustering similarity across CAN captures with and without attacks. We test our approach in a previously collected CAN dataset with masquerade attacks (i.e., the ROAD dataset) and develop a forensic tool as a proof of concept to demonstrate the potential of the proposed approach for detecting CAN masquerade attacks.

Submitted 11 March, 2022; v1 submitted 7 January, 2022; originally announced January 2022.

Comments: 8 pages, 5 figures, 3 tables

Journal ref: Workshop on Automotive and Autonomous Vehicle Security (AutoSec) 2022

arXiv:2110.05371 [pdf, other]

doi 10.1371/journal.pone.0284077

Graph-Based Machine Learning Improves Just-in-Time Defect Prediction

Authors: Jonathan Bryan, Pablo Moriano

Abstract: The increasing complexity of today's software requires the contribution of thousands of developers. This complex collaboration structure makes developers more likely to introduce defect-prone changes that lead to software faults. Determining when these defect-prone changes are introduced has proven challenging, and using traditional machine learning (ML) methods to make these determinations seems to have reached a plateau. In this work, we build contribution graphs consisting of developers and source files to capture the nuanced complexity of changes required to build software. By leveraging these contribution graphs, our research shows the potential of using graph-based ML to improve Just-In-Time (JIT) defect prediction. We hypothesize that features extracted from the contribution graphs may be better predictors of defect-prone changes than intrinsic features derived from software characteristics. We corroborate our hypothesis using graph-based ML for classifying edges that represent defect-prone changes. This new framing of the JIT defect prediction problem leads to remarkably better results. We test our approach on 14 open-source projects and show that our best model can predict whether or not a code change will lead to a defect with an F1 score as high as 77.55% and a Matthews correlation coefficient (MCC) as high as 53.16%. This represents a 152% higher F1 score and a 3% higher MCC over the state-of-the-art JIT defect prediction. We describe limitations, open challenges, and how this method can be used for operational JIT defect prediction.

Submitted 14 April, 2023; v1 submitted 11 October, 2021; originally announced October 2021.

Comments: 22 pages, 2 figures, 4 tables; references added; expanded results to match baseline conditions

Journal ref: PLoS ONE 18(4): e0284077, 2023

arXiv:2101.05781 [pdf, other]

doi 10.14722/autosec.2021.23013

Time-Based CAN Intrusion Detection Benchmark

Authors: Deborah H. Blevins, Pablo Moriano, Robert A. Bridges, Miki E. Verma, Michael D. Iannacone, Samuel C Hollifield

Abstract: Modern vehicles are complex cyber-physical systems made of hundreds of electronic control units (ECUs) that communicate over controller area networks (CANs). This inherited complexity has expanded the CAN attack surface which is vulnerable to message injection attacks. These injections change the overall timing characteristics of messages on the bus, and thus, to detect these malicious messages, time-based intrusion detection systems (IDSs) have been proposed. However, time-based IDSs are usually trained and tested on low-fidelity datasets with unrealistic, labeled attacks. This makes difficult the task of evaluating, comparing, and validating IDSs. Here we detail and benchmark four time-based IDSs against the newly published ROAD dataset, the first open CAN IDS dataset with real (non-simulated) stealthy attacks with physically verified effects. We found that methods that perform hypothesis testing by explicitly estimating message timing distributions have lower performance than methods that seek anomalies in a distribution-related statistic. In particular, these "distribution-agnostic" based methods outperform "distribution-based" methods by at least 55% in area under the precision-recall curve (AUC-PR). Our results expand the body of knowledge of CAN time-based IDSs by providing details of these methods and reporting their results when tested on datasets with real advanced attacks. Finally, we develop an after-market plug-in detector using lightweight hardware, which can be used to deploy the best performing IDS method on nearly any vehicle.

Submitted 14 January, 2021; originally announced January 2021.

Comments: 7 pages, 2 figures

Journal ref: Workshop on Automotive and Autonomous Vehicle Security (AutoSec) 2021

arXiv:2012.14600 [pdf, other]

doi 10.1371/journal.pone.0296879

doi 10.5281/zenodo.10462795

A Comprehensive Guide to CAN IDS Data & Introduction of the ROAD Dataset

Authors: Miki E. Verma, Robert A. Bridges, Michael D. Iannacone, Samuel C. Hollifield, Pablo Moriano, Steven C. Hespeler, Bill Kay, Frank L. Combs

Abstract: Although ubiquitous in modern vehicles, Controller Area Networks (CANs) lack basic security properties and are easily exploitable. A rapidly growing field of CAN security research has emerged that seeks to detect intrusions on CANs. Producing vehicular CAN data with a variety of intrusions is out of reach for most researchers as it requires expensive assets and expertise. To assist researchers, we present the first comprehensive guide to the existing open CAN intrusion datasets, including a quality analysis of each dataset and an enumeration of each's benefits, drawbacks, and suggested use case. Current public CAN IDS datasets are limited to real fabrication (simple message injection) attacks and simulated attacks often in synthetic data, which lack fidelity. In general, the physical effects of attacks on the vehicle are not verified in the available datasets. Only one dataset provides signal-translated data but not a corresponding raw binary version. Overall, the available data pigeon-holes CAN IDS works into testing on limited, often inappropriate data (usually with attacks that are too easily detectable to truly test the method), and this lack of data has stymied comparability and reproducibility of results. As our primary contribution, we present the ROAD (Real ORNL Automotive Dynamometer) CAN Intrusion Dataset, consisting of over 3.5 hours of one vehicle's CAN data. ROAD contains ambient data recorded during a diverse set of activities, and attacks of increasing stealth with multiple variants and instances of real fuzzing, fabrication, and unique advanced attacks, as well as simulated masquerade attacks. To facilitate benchmarking CAN IDS methods that require signal-translated inputs, we also provide the signal time series format for many of the CAN captures. Our contributions aim to facilitate appropriate benchmarking and needed comparability in the CAN IDS field.

Submitted 7 February, 2024; v1 submitted 28 December, 2020; originally announced December 2020.

Comments: title changed and author added from original version

Journal ref: PLoS one 19, no. 1 (2024): e0296879

arXiv:1905.05835 [pdf, other]

doi 10.1016/j.comnet.2021.107835

Using Bursty Announcements for Detecting BGP Routing Anomalies

Authors: Pablo Moriano, Raquel Hill, L. Jean Camp

Abstract: Despite the robust structure of the Internet, it is still susceptible to disruptive routing updates that prevent network traffic from reaching its destination. Our research shows that BGP announcements that are associated with disruptive updates tend to occur in groups of relatively high frequency, followed by periods of infrequent activity. We hypothesize that we may use these bursty characteristics to detect anomalous routing incidents. In this work, we use manually verified ground truth metadata and volume of announcements as a baseline measure, and propose a burstiness measure that detects prior anomalous incidents with high recall and better precision than the volume baseline. We quantify the burstiness of inter-arrival times around the date and times of four large-scale incidents: the Indosat hijacking event in April 2014, the Telecom Malaysia leak in June 2015, the Bharti Airtel Ltd. hijack in November 2015, and the MainOne leak in November 2018; and three smaller scale incidents that led to traffic interception: the Belarusian traffic direction in February 2013, the Icelandic traffic direction in July 2013, and the Russian telecom that hijacked financial services in April 2017. Our method leverages the burstiness of disruptive update messages to detect these incidents. We describe limitations, open challenges, and how this method can be used for routing anomaly detection.

Submitted 29 January, 2021; v1 submitted 14 May, 2019; originally announced May 2019.

Comments: 16 pages, 13 figures, 4 tables

Journal ref: Comput. Netw. vol. 188, pp. 107835, 2021

arXiv:1301.4192 [pdf, ps, other]

doi 10.1088/1742-5468/2013/06/P06010

On the formation of structure in growing networks

Authors: P. Moriano, J. Finke

Abstract: Based on the formation of triad junctions, the proposed mechanism generates networks that exhibit extended rather than single power law behavior. Triad formation guarantees strong neighborhood clustering and community-level characteristics as the network size grows to infinity. The asymptotic behavior is of interest in the study of directed networks in which (i) the formation of links cannot be described according to the principle of preferential attachment; (ii) the in-degree distribution fits a power law for nodes with a high degree and an exponential form otherwise; (iii) clustering properties emerge at multiple scales and depend on both the number of links that newly added nodes establish and the probability of forming triads; and (iv) groups of nodes form modules that feature less links to the rest of the nodes.

Submitted 30 April, 2013; v1 submitted 17 January, 2013; originally announced January 2013.

Comments: 17 pages, 9 figures, we apply the proposed mechanism to generate network realizations that resemble the degree distribution and clustering properties of an empirical network with no directed cycles (i.e., when the model parameter n=0), updated references

MSC Class: 90B15; 91D30; 94C15

Journal ref: Journal of Statistical Mechanics: Theory and Experiment 2013 (06), P06010

arXiv:1110.0751 [pdf, ps, other]

doi 10.1209/0295-5075/99/18002

Power-law weighted networks from local attachments

Authors: P. Moriano, J. Finke

Abstract: This letter introduces a mechanism for constructing, through a process of distributed decision-making, substrates for the study of collective dynamics on extended power-law weighted networks with both a desired scaling exponent and a fixed clustering coefficient. The analytical results show that the connectivity distribution converges to the scaling behavior often found in social and engineering systems. To illustrate the approach of the proposed framework we generate network substrates that resemble steady state properties of the empirical citation distributions of (i) publications indexed by the Institute for Scientific Information from 1981 to 1997; (ii) patents granted by the U.S. Patent and Trademark Office from 1975 to 1999; and (iii) opinions written by the Supreme Court and the cases they cite from 1754 to 2002.

Submitted 4 June, 2012; v1 submitted 4 October, 2011; originally announced October 2011.

Comments: 18 pages, 3 figures; Proceedings of the IEEE Conference on Decision and Control and the European Control Conference, Orlando, FL, Dec. 2011; Added references; We modified the model in order to take into account extended power-law distributions which better fit to the citations data sets; Added proofs of theorems; Shorten version; Updated plot

Journal ref: Europhysics Letters, vol. 99, no. 1, p. 18002, 2012

Showing 1–12 of 12 results for author: Moriano, P