Search | arXiv e-print repository

Work-in-Progress: Crash Course: Can (Under Attack) Autonomous Driving Beat Human Drivers?

Authors: Francesco Marchiori, Alessandro Brighente, Mauro Conti

Abstract: Autonomous driving is a research direction that has gained enormous traction in the last few years thanks to advancements in Artificial Intelligence (AI). Depending on the level of independence from the human driver, several studies show that Autonomous Vehicles (AVs) can reduce the number of on-road crashes and decrease overall fuel emissions by improving efficiency. However, security research on… ▽ More Autonomous driving is a research direction that has gained enormous traction in the last few years thanks to advancements in Artificial Intelligence (AI). Depending on the level of independence from the human driver, several studies show that Autonomous Vehicles (AVs) can reduce the number of on-road crashes and decrease overall fuel emissions by improving efficiency. However, security research on this topic is mixed and presents some gaps. On one hand, these studies often neglect the intrinsic vulnerabilities of AI algorithms, which are known to compromise the security of these systems. On the other, the most prevalent attacks towards AI rely on unrealistic assumptions, such as access to the model parameters or the training dataset. As such, it is unclear if autonomous driving can still claim several advantages over human driving in real-world applications. This paper evaluates the inherent risks in autonomous driving by examining the current landscape of AVs and establishing a pragmatic threat model. Through our analysis, we develop specific claims highlighting the delicate balance between the advantages of AVs and potential security challenges in real-world scenarios. Our evaluation serves as a foundation for providing essential takeaway messages, guiding both researchers and practitioners at various stages of the automation pipeline. In doing so, we contribute valuable insights to advance the discourse on the security and viability of autonomous driving in real-world applications. △ Less

Submitted 14 May, 2024; originally announced May 2024.

Comments: Accepted at ACSW 2024

arXiv:2404.12991 [pdf, other]

RedactBuster: Entity Type Recognition from Redacted Documents

Authors: Mirco Beltrame, Mauro Conti, Pierpaolo Guglielmin, Francesco Marchiori, Gabriele Orazi

Abstract: The widespread exchange of digital documents in various domains has resulted in abundant private information being shared. This proliferation necessitates redaction techniques to protect sensitive content and user privacy. While numerous redaction methods exist, their effectiveness varies, with some proving more robust than others. As such, the literature proposes several deanonymization technique… ▽ More The widespread exchange of digital documents in various domains has resulted in abundant private information being shared. This proliferation necessitates redaction techniques to protect sensitive content and user privacy. While numerous redaction methods exist, their effectiveness varies, with some proving more robust than others. As such, the literature proposes several deanonymization techniques, raising awareness of potential privacy threats. However, while none of these methods are successful against the most effective redaction techniques, these attacks only focus on the anonymized tokens and ignore the sentence context. In this paper, we propose RedactBuster, the first deanonymization model using sentence context to perform Named Entity Recognition on reacted text. Our methodology leverages fine-tuned state-of-the-art Transformers and Deep Learning models to determine the anonymized entity types in a document. We test RedactBuster against the most effective redaction technique and evaluate it using the publicly available Text Anonymization Benchmark (TAB). Our results show accuracy values up to 0.985 regardless of the document nature or entity type. In raising awareness of this privacy issue, we propose a countermeasure we call character evasion that helps strengthen the secrecy of sensitive information. Furthermore, we make our model and testbed open-source to aid researchers and practitioners in evaluating the resilience of novel redaction techniques and enhancing document privacy. △ Less

Submitted 19 April, 2024; originally announced April 2024.

arXiv:2404.12689 [pdf, other]

Can LLMs Understand Computer Networks? Towards a Virtual System Administrator

Authors: Denis Donadel, Francesco Marchiori, Luca Pajola, Mauro Conti

Abstract: Recent advancements in Artificial Intelligence, and particularly Large Language Models (LLMs), offer promising prospects for aiding system administrators in managing the complexity of modern networks. However, despite this potential, a significant gap exists in the literature regarding the extent to which LLMs can understand computer networks. Without empirical evidence, system administrators migh… ▽ More Recent advancements in Artificial Intelligence, and particularly Large Language Models (LLMs), offer promising prospects for aiding system administrators in managing the complexity of modern networks. However, despite this potential, a significant gap exists in the literature regarding the extent to which LLMs can understand computer networks. Without empirical evidence, system administrators might rely on these models without assurance of their efficacy in performing network-related tasks accurately. In this paper, we are the first to conduct an exhaustive study on LLMs' comprehension of computer networks. We formulate several research questions to determine whether LLMs can provide correct answers when supplied with a network topology and questions on it. To assess them, we developed a thorough framework for evaluating LLMs' capabilities in various network-related tasks. We evaluate our framework on multiple computer networks employing private (e.g., GPT4) and open-source (e.g., Llama2) models. Our findings demonstrate promising results, with the best model achieving an average accuracy of 79.3%. Private LLMs achieve noteworthy results in small and medium networks, while challenges persist in comprehending complex network topologies, particularly for open-source models. Moreover, we provide insight into how prompt engineering can enhance the accuracy of some tasks. △ Less

Submitted 19 April, 2024; originally announced April 2024.

ACM Class: C.2.1; C.2.5; I.2.1

arXiv:2404.04648 [pdf, other]

doi 10.1145/3649403.3656486

CANEDERLI: On The Impact of Adversarial Training and Transferability on CAN Intrusion Detection Systems

Authors: Francesco Marchiori, Mauro Conti

Abstract: The growing integration of vehicles with external networks has led to a surge in attacks targeting their Controller Area Network (CAN) internal bus. As a countermeasure, various Intrusion Detection Systems (IDSs) have been suggested in the literature to prevent and mitigate these threats. With the increasing volume of data facilitated by the integration of Vehicle-to-Vehicle (V2V) and Vehicle-to-I… ▽ More The growing integration of vehicles with external networks has led to a surge in attacks targeting their Controller Area Network (CAN) internal bus. As a countermeasure, various Intrusion Detection Systems (IDSs) have been suggested in the literature to prevent and mitigate these threats. With the increasing volume of data facilitated by the integration of Vehicle-to-Vehicle (V2V) and Vehicle-to-Infrastructure (V2I) communication networks, most of these systems rely on data-driven approaches such as Machine Learning (ML) and Deep Learning (DL) models. However, these systems are susceptible to adversarial evasion attacks. While many researchers have explored this vulnerability, their studies often involve unrealistic assumptions, lack consideration for a realistic threat model, and fail to provide effective solutions. In this paper, we present CANEDERLI (CAN Evasion Detection ResiLIence), a novel framework for securing CAN-based IDSs. Our system considers a realistic threat model and addresses the impact of adversarial attacks on DL-based detection systems. Our findings highlight strong transferability properties among diverse attack methodologies by considering multiple state-of-the-art attacks and model architectures. We analyze the impact of adversarial training in addressing this threat and propose an adaptive online adversarial training technique outclassing traditional fine-tuning methodologies with F1 scores up to 0.941. By making our framework publicly available, we aid practitioners and researchers in assessing the resilience of IDSs to a varied adversarial landscape. △ Less

Submitted 6 April, 2024; originally announced April 2024.

Comments: Accepted at WiseML 2024

arXiv:2403.17494 [pdf, other]

doi 10.1007/978-3-031-64171-8_26

FaultGuard: A Generative Approach to Resilient Fault Prediction in Smart Electrical Grids

Authors: Emad Efatinasab, Francesco Marchiori, Alessandro Brighente, Mirco Rampazzo, Mauro Conti

Abstract: Predicting and classifying faults in electricity networks is crucial for uninterrupted provision and kee** maintenance costs at a minimum. Thanks to the advancements in the field provided by the smart grid, several data-driven approaches have been proposed in the literature to tackle fault prediction tasks. Implementing these systems brought several improvements, such as optimal energy consumpti… ▽ More Predicting and classifying faults in electricity networks is crucial for uninterrupted provision and kee** maintenance costs at a minimum. Thanks to the advancements in the field provided by the smart grid, several data-driven approaches have been proposed in the literature to tackle fault prediction tasks. Implementing these systems brought several improvements, such as optimal energy consumption and quick restoration. Thus, they have become an essential component of the smart grid. However, the robustness and security of these systems against adversarial attacks have not yet been extensively investigated. These attacks can impair the whole grid and cause additional damage to the infrastructure, deceiving fault detection systems and disrupting restoration. In this paper, we present FaultGuard, the first framework for fault type and zone classification resilient to adversarial attacks. To ensure the security of our system, we employ an Anomaly Detection System (ADS) leveraging a novel Generative Adversarial Network training layer to identify attacks. Furthermore, we propose a low-complexity fault prediction model and an online adversarial training technique to enhance robustness. We comprehensively evaluate the framework's performance against various adversarial attacks using the IEEE13-AdvAttack dataset, which constitutes the state-of-the-art for resilient fault prediction benchmarking. Our model outclasses the state-of-the-art even without considering adversaries, with an accuracy of up to 0.958. Furthermore, our ADS shows attack detection capabilities with an accuracy of up to 1.000. Finally, we demonstrate how our novel training layers drastically increase performances across the whole framework, with a mean increase of 154% in ADS accuracy and 118% in model accuracy. △ Less

Submitted 26 March, 2024; originally announced March 2024.

arXiv:2401.10736 [pdf, other]

A Survey and Comparative Analysis of Security Properties of CAN Authentication Protocols

Authors: Alessandro Lotto, Francesco Marchiori, Alessandro Brighente, Mauro Conti

Abstract: The large number of Electronic Control Units (ECUs) mounted on modern cars and their expansive communication capabilities create a substantial attack surface for potential exploitation. Despite the evolution of automotive technology, the continued use of the originally insecure Controller Area Network (CAN) bus leaves in-vehicle communications inherently non-secure. In response to the absence of s… ▽ More The large number of Electronic Control Units (ECUs) mounted on modern cars and their expansive communication capabilities create a substantial attack surface for potential exploitation. Despite the evolution of automotive technology, the continued use of the originally insecure Controller Area Network (CAN) bus leaves in-vehicle communications inherently non-secure. In response to the absence of standardized authentication protocols within the automotive domain, researchers propose diverse solutions, each with unique strengths and vulnerabilities. However, the continuous influx of new protocols and potential oversights in meeting security requirements and essential operational features further complicate the implementability of these protocols. This paper comprehensively reviews and compares the 15 most prominent authentication protocols for the CAN bus. Our analysis emphasizes their strengths and weaknesses, evaluating their alignment with critical security requirements for automotive authentication. Additionally, we evaluate protocols based on essential operational criteria that contribute to ease of implementation in predefined infrastructures, enhancing overall reliability and reducing the probability of successful attacks. Our study reveals a prevalent focus on defending against external attackers in existing protocols, exposing vulnerabilities to internal threats. Notably, authentication protocols employing hash chains, Mixed Message Authentication Codes, and asymmetric encryption techniques emerge as the most effective approaches. Through our comparative study, we classify the considered protocols based on their security attributes and suitability for implementation, providing valuable insights for future developments in the field. △ Less

Submitted 19 January, 2024; originally announced January 2024.

Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2310.02655 [pdf, other]

AGIR: Automating Cyber Threat Intelligence Reporting with Natural Language Generation

Authors: Filippo Perrina, Francesco Marchiori, Mauro Conti, Nino Vincenzo Verde

Abstract: Cyber Threat Intelligence (CTI) reporting is pivotal in contemporary risk management strategies. As the volume of CTI reports continues to surge, the demand for automated tools to streamline report generation becomes increasingly apparent. While Natural Language Processing techniques have shown potential in handling text data, they often struggle to address the complexity of diverse data sources a… ▽ More Cyber Threat Intelligence (CTI) reporting is pivotal in contemporary risk management strategies. As the volume of CTI reports continues to surge, the demand for automated tools to streamline report generation becomes increasingly apparent. While Natural Language Processing techniques have shown potential in handling text data, they often struggle to address the complexity of diverse data sources and their intricate interrelationships. Moreover, established paradigms like STIX have emerged as de facto standards within the CTI community, emphasizing the formal categorization of entities and relations to facilitate consistent data sharing. In this paper, we introduce AGIR (Automatic Generation of Intelligence Reports), a transformative Natural Language Generation tool specifically designed to address the pressing challenges in the realm of CTI reporting. AGIR's primary objective is to empower security analysts by automating the labor-intensive task of generating comprehensive intelligence reports from formal representations of entity graphs. AGIR utilizes a two-stage pipeline by combining the advantages of template-based approaches and the capabilities of Large Language Models such as ChatGPT. We evaluate AGIR's report generation capabilities both quantitatively and qualitatively. The generated reports accurately convey information expressed through formal language, achieving a high recall value (0.99) without introducing hallucination. Furthermore, we compare the fluency and utility of the reports with state-of-the-art approaches, showing how AGIR achieves higher scores in terms of Syntactic Log-Odds Ratio (SLOR) and through questionnaires. By using our tool, we estimate that the report writing time is reduced by more than 40%, therefore streamlining the CTI production of any organization and contributing to the automation of several CTI tasks. △ Less

Submitted 4 October, 2023; originally announced October 2023.

Comments: 10 pages, 7 figures

arXiv:2309.03607 [pdf, other]

Your Battery Is a Blast! Safeguarding Against Counterfeit Batteries with Authentication

Authors: Francesco Marchiori, Mauro Conti

Abstract: Lithium-ion (Li-ion) batteries are the primary power source in various applications due to their high energy and power density. Their market was estimated to be up to 48 billion U.S. dollars in 2022. However, the widespread adoption of Li-ion batteries has resulted in counterfeit cell production, which can pose safety hazards to users. Counterfeit cells can cause explosions or fires, and their pre… ▽ More Lithium-ion (Li-ion) batteries are the primary power source in various applications due to their high energy and power density. Their market was estimated to be up to 48 billion U.S. dollars in 2022. However, the widespread adoption of Li-ion batteries has resulted in counterfeit cell production, which can pose safety hazards to users. Counterfeit cells can cause explosions or fires, and their prevalence in the market makes it difficult for users to detect fake cells. Indeed, current battery authentication methods can be susceptible to advanced counterfeiting techniques and are often not adaptable to various cells and systems. In this paper, we improve the state of the art on battery authentication by proposing two novel methodologies, DCAuth and EISthentication, which leverage the internal characteristics of each cell through Machine Learning models. Our methods automatically authenticate lithium-ion battery models and architectures using data from their regular usage without the need for any external device. They are also resilient to the most common and critical counterfeit practices and can scale to several batteries and devices. To evaluate the effectiveness of our proposed methodologies, we analyze time-series data from a total of 20 datasets that we have processed to extract meaningful features for our analysis. Our methods achieve high accuracy in battery authentication for both architectures (up to 0.99) and models (up to 0.96). Moreover, our methods offer comparable identification performances. By using our proposed methodologies, manufacturers can ensure that devices only use legitimate batteries, guaranteeing the operational state of any system and safety measures for the users. △ Less

Submitted 7 September, 2023; originally announced September 2023.

Comments: 18 pages, 11 figures

arXiv:2306.15363 [pdf, other]

Your Attack Is Too DUMB: Formalizing Attacker Scenarios for Adversarial Transferability

Authors: Marco Alecci, Mauro Conti, Francesco Marchiori, Luca Martinelli, Luca Pajola

Abstract: Evasion attacks are a threat to machine learning models, where adversaries attempt to affect classifiers by injecting malicious samples. An alarming side-effect of evasion attacks is their ability to transfer among different models: this property is called transferability. Therefore, an attacker can produce adversarial samples on a custom model (surrogate) to conduct the attack on a victim's organ… ▽ More Evasion attacks are a threat to machine learning models, where adversaries attempt to affect classifiers by injecting malicious samples. An alarming side-effect of evasion attacks is their ability to transfer among different models: this property is called transferability. Therefore, an attacker can produce adversarial samples on a custom model (surrogate) to conduct the attack on a victim's organization later. Although literature widely discusses how adversaries can transfer their attacks, their experimental settings are limited and far from reality. For instance, many experiments consider both attacker and defender sharing the same dataset, balance level (i.e., how the ground truth is distributed), and model architecture. In this work, we propose the DUMB attacker model. This framework allows analyzing if evasion attacks fail to transfer when the training conditions of surrogate and victim models differ. DUMB considers the following conditions: Dataset soUrces, Model architecture, and the Balance of the ground truth. We then propose a novel testbed to evaluate many state-of-the-art evasion attacks with DUMB; the testbed consists of three computer vision tasks with two distinct datasets each, four types of balance levels, and three model architectures. Our analysis, which generated 13K tests over 14 distinct attacks, led to numerous novel findings in the scope of transferable attacks with surrogate models. In particular, mismatches between attackers and victims in terms of dataset source, balance levels, and model architecture lead to non-negligible loss of attack performance. △ Less

Submitted 27 June, 2023; originally announced June 2023.

Comments: Accepted at RAID 2023

arXiv:2306.05923 [pdf, other]

When Authentication Is Not Enough: On the Security of Behavioral-Based Driver Authentication Systems

Authors: Emad Efatinasab, Francesco Marchiori, Denis Donadel, Alessandro Brighente, Mauro Conti

Abstract: Many research papers have recently focused on behavioral-based driver authentication systems in vehicles. Pushed by Artificial Intelligence (AI) advancements, these works propose powerful models to identify drivers through their unique biometric behavior. However, these models have never been scrutinized from a security point of view, rather focusing on the performance of the AI algorithms. Severa… ▽ More Many research papers have recently focused on behavioral-based driver authentication systems in vehicles. Pushed by Artificial Intelligence (AI) advancements, these works propose powerful models to identify drivers through their unique biometric behavior. However, these models have never been scrutinized from a security point of view, rather focusing on the performance of the AI algorithms. Several limitations and oversights make implementing the state-of-the-art impractical, such as their secure connection to the vehicle's network and the management of security alerts. Furthermore, due to the extensive use of AI, these systems may be vulnerable to adversarial attacks. However, there is currently no discussion on the feasibility and impact of such attacks in this scenario. Driven by the significant gap between research and practical application, this paper seeks to connect these two domains. We propose the first security-aware system model for behavioral-based driver authentication. We develop two lightweight driver authentication systems based on Random Forest and Recurrent Neural Network architectures designed for our constrained environments. We formalize a realistic system and threat model reflecting a real-world vehicle's network for their implementation. When evaluated on real driving data, our models outclass the state-of-the-art with an accuracy of up to 0.999 in identification and authentication. Moreover, we are the first to propose attacks against these systems by develo** two novel evasion attacks, SMARTCAN and GANCAN. We show how attackers can still exploit these systems with a perfect attack success rate (up to 1.000). Finally, we discuss requirements for deploying driver authentication systems securely. Through our contributions, we aid practitioners in safely adopting these systems, help reduce car thefts, and enhance driver security. △ Less

Submitted 10 June, 2024; v1 submitted 9 June, 2023; originally announced June 2023.

arXiv:2303.09999 [pdf, other]

doi 10.1145/3600160.3600182

STIXnet: A Novel and Modular Solution for Extracting All STIX Objects in CTI Reports

Authors: Francesco Marchiori, Mauro Conti, Nino Vincenzo Verde

Abstract: The automatic extraction of information from Cyber Threat Intelligence (CTI) reports is crucial in risk management. The increased frequency of the publications of these reports has led researchers to develop new systems for automatically recovering different types of entities and relations from textual data. Most state-of-the-art models leverage Natural Language Processing (NLP) techniques, which… ▽ More The automatic extraction of information from Cyber Threat Intelligence (CTI) reports is crucial in risk management. The increased frequency of the publications of these reports has led researchers to develop new systems for automatically recovering different types of entities and relations from textual data. Most state-of-the-art models leverage Natural Language Processing (NLP) techniques, which perform greatly in extracting a few types of entities at a time but cannot detect heterogeneous data or their relations. Furthermore, several paradigms, such as STIX, have become de facto standards in the CTI community and dictate a formal categorization of different entities and relations to enable organizations to share data consistently. This paper presents STIXnet, the first solution for the automated extraction of all STIX entities and relationships in CTI reports. Through the use of NLP techniques and an interactive Knowledge Base (KB) of entities, our approach obtains F1 scores comparable to state-of-the-art models for entity extraction (0.916) and relation extraction (0.724) while considering significantly more types of entities and relations. Moreover, STIXnet constitutes a modular and extensible framework that manages and coordinates different modules to merge their contributions uniquely and exhaustively. With our approach, researchers and organizations can extend their Information Extraction (IE) capabilities by integrating the efforts of several techniques without needing to develop new tools from scratch. △ Less

Submitted 8 September, 2023; v1 submitted 17 March, 2023; originally announced March 2023.

Comments: 11 pages, 3 figures

Showing 1–11 of 11 results for author: Marchiori, F