Search | arXiv e-print repository

Transfer Learning in Pre-Trained Large Language Models for Malware Detection Based on System Calls

Authors: Pedro Miguel Sánchez Sánchez, Alberto Huertas Celdrán, Gérôme Bovet, Gregorio Martínez Pérez

Abstract: In the current cybersecurity landscape, protecting military devices such as communication and battlefield management systems against sophisticated cyber attacks is crucial. Malware exploits vulnerabilities through stealth methods, often evading traditional detection mechanisms such as software signatures. The application of ML/DL in vulnerability detection has been extensively explored in the lite… ▽ More In the current cybersecurity landscape, protecting military devices such as communication and battlefield management systems against sophisticated cyber attacks is crucial. Malware exploits vulnerabilities through stealth methods, often evading traditional detection mechanisms such as software signatures. The application of ML/DL in vulnerability detection has been extensively explored in the literature. However, current ML/DL vulnerability detection methods struggle with understanding the context and intent behind complex attacks. Integrating large language models (LLMs) with system call analysis offers a promising approach to enhance malware detection. This work presents a novel framework leveraging LLMs to classify malware based on system call data. The framework uses transfer learning to adapt pre-trained LLMs for malware detection. By retraining LLMs on a dataset of benign and malicious system calls, the models are refined to detect signs of malware activity. Experiments with a dataset of over 1TB of system calls demonstrate that models with larger context sizes, such as BigBird and Longformer, achieve superior accuracy and F1-Score of approximately 0.86. The results highlight the importance of context size in improving detection rates and underscore the trade-offs between computational complexity and performance. This approach shows significant potential for real-time detection in high-stakes environments, offering a robust solution to evolving cyber threats. △ Less

Submitted 15 May, 2024; originally announced May 2024.

Comments: Submitted to IEEE MILCOM 2024

arXiv:2401.13320 [pdf, other]

A Big Data Architecture for Early Identification and Categorization of Dark Web Sites

Authors: Javier Pastor-Galindo, Hông-Ân Sandlin, Félix Gómez Mármol, Gérôme Bovet, Gregorio Martínez Pérez

Abstract: The dark web has become notorious for its association with illicit activities and there is a growing need for systems to automate the monitoring of this space. This paper proposes an end-to-end scalable architecture for the early identification of new Tor sites and the daily analysis of their content. The solution is built using an Open Source Big Data stack for data serving with Kubernetes, Kafka… ▽ More The dark web has become notorious for its association with illicit activities and there is a growing need for systems to automate the monitoring of this space. This paper proposes an end-to-end scalable architecture for the early identification of new Tor sites and the daily analysis of their content. The solution is built using an Open Source Big Data stack for data serving with Kubernetes, Kafka, Kubeflow, and MinIO, continuously discovering onion addresses in different sources (threat intelligence, code repositories, web-Tor gateways, and Tor repositories), downloading the HTML from Tor and deduplicating the content using MinHash LSH, and categorizing with the BERTopic modeling (SBERT embedding, UMAP dimensionality reduction, HDBSCAN document clustering and c-TF-IDF topic keywords). In 93 days, the system identified 80,049 onion services and characterized 90% of them, addressing the challenge of Tor volatility. A disproportionate amount of repeated content is found, with only 6.1% unique sites. From the HTML files of the dark sites, 31 different low-topics are extracted, manually labeled, and grouped into 11 high-level topics. The five most popular included sexual and violent content, repositories, search engines, carding, cryptocurrencies, and marketplaces. During the experiments, we identified 14 sites with 13,946 clones that shared a suspiciously similar mirroring rate per day, suggesting an extensive common phishing network. Among the related works, this study is the most representative characterization of onion services based on topics to date. △ Less

Submitted 24 January, 2024; originally announced January 2024.

arXiv:2311.05270 [pdf, other]

Evaluation of Data Processing and Machine Learning Techniques in P300-based Authentication using Brain-Computer Interfaces

Authors: Eduardo López Bernal, Sergio López Bernal, Gregorio Martínez Pérez, Alberto Huertas Celdrán

Abstract: Brain-Computer Interfaces (BCIs) are used in various application scenarios allowing direct communication between the brain and computers. Specifically, electroencephalography (EEG) is one of the most common techniques for obtaining evoked potentials resulting from external stimuli, as the P300 potential is elicited from known images. The combination of Machine Learning (ML) and P300 potentials is… ▽ More Brain-Computer Interfaces (BCIs) are used in various application scenarios allowing direct communication between the brain and computers. Specifically, electroencephalography (EEG) is one of the most common techniques for obtaining evoked potentials resulting from external stimuli, as the P300 potential is elicited from known images. The combination of Machine Learning (ML) and P300 potentials is promising for authenticating subjects since the brain waves generated by each person when facing a particular stimulus are unique. However, existing authentication solutions do not extensively explore P300 potentials and fail when analyzing the most suitable processing and ML-based classification techniques. Thus, this work proposes i) a framework for authenticating BCI users using the P300 potential; ii) the validation of the framework on ten subjects creating an experimental scenario employing a non-invasive EEG-based BCI; and iii) the evaluation of the framework performance defining two experiments (binary and multiclass ML classification) and three testing configurations incrementally analyzing the performance of different processing techniques and the differences between classifying with epochs or statistical values. This framework achieved a performance close to 100\% f1-score in both experiments for the best classifier, highlighting its effectiveness in accurately authenticating users and demonstrating the feasibility of performing EEG-based authentication using P300 potentials. △ Less

Submitted 9 November, 2023; originally announced November 2023.

arXiv:2308.05978 [pdf, other]

CyberForce: A Federated Reinforcement Learning Framework for Malware Mitigation

Authors: Chao Feng, Alberto Huertas Celdran, Pedro Miguel Sanchez Sanchez, Jan Kreischer, Jan von der Assen, Gerome Bovet, Gregorio Martinez Perez, Burkhard Stiller

Abstract: Recent research has shown that the integration of Reinforcement Learning (RL) with Moving Target Defense (MTD) can enhance cybersecurity in Internet-of-Things (IoT) devices. Nevertheless, the practicality of existing work is hindered by data privacy concerns associated with centralized data processing in RL, and the unsatisfactory time needed to learn right MTD techniques that are effective agains… ▽ More Recent research has shown that the integration of Reinforcement Learning (RL) with Moving Target Defense (MTD) can enhance cybersecurity in Internet-of-Things (IoT) devices. Nevertheless, the practicality of existing work is hindered by data privacy concerns associated with centralized data processing in RL, and the unsatisfactory time needed to learn right MTD techniques that are effective against a rising number of heterogeneous zero-day attacks. Thus, this work presents CyberForce, a framework that combines Federated and Reinforcement Learning (FRL) to collaboratively and privately learn suitable MTD techniques for mitigating zero-day attacks. CyberForce integrates device fingerprinting and anomaly detection to reward or penalize MTD mechanisms chosen by an FRL-based agent. The framework has been deployed and evaluated in a scenario consisting of ten physical devices of a real IoT platform affected by heterogeneous malware samples. A pool of experiments has demonstrated that CyberForce learns the MTD technique mitigating each attack faster than existing RL-based centralized approaches. In addition, when various devices are exposed to different attacks, CyberForce benefits from knowledge transfer, leading to enhanced performance and reduced learning time in comparison to recent works. Finally, different aggregation algorithms used during the agent learning process provide CyberForce with notable robustness to malicious attacks. △ Less

Submitted 8 September, 2023; v1 submitted 11 August, 2023; originally announced August 2023.

Comments: 11 pages, 8 figures

arXiv:2307.11730 [pdf, other]

Mitigating Communications Threats in Decentralized Federated Learning through Moving Target Defense

Authors: Enrique Tomás Martínez Beltrán, Pedro Miguel Sánchez Sánchez, Sergio López Bernal, Gérôme Bovet, Manuel Gil Pérez, Gregorio Martínez Pérez, Alberto Huertas Celdrán

Abstract: The rise of Decentralized Federated Learning (DFL) has enabled the training of machine learning models across federated participants, fostering decentralized model aggregation and reducing dependence on a server. However, this approach introduces unique communication security challenges that have yet to be thoroughly addressed in the literature. These challenges primarily originate from the decent… ▽ More The rise of Decentralized Federated Learning (DFL) has enabled the training of machine learning models across federated participants, fostering decentralized model aggregation and reducing dependence on a server. However, this approach introduces unique communication security challenges that have yet to be thoroughly addressed in the literature. These challenges primarily originate from the decentralized nature of the aggregation process, the varied roles and responsibilities of the participants, and the absence of a central authority to oversee and mitigate threats. Addressing these challenges, this paper first delineates a comprehensive threat model focused on DFL communications. In response to these identified risks, this work introduces a security module to counter communication-based attacks for DFL platforms. The module combines security techniques such as symmetric and asymmetric encryption with Moving Target Defense (MTD) techniques, including random neighbor selection and IP/port switching. The security module is implemented in a DFL platform, Fedstellar, allowing the deployment and monitoring of the federation. A DFL scenario with physical and virtual deployments have been executed, encompassing three security configurations: (i) a baseline without security, (ii) an encrypted configuration, and (iii) a configuration integrating both encryption and MTD techniques. The effectiveness of the security module is validated through experiments with the MNIST dataset and eclipse attacks. The results showed an average F1 score of 95%, with the most secure configuration resulting in CPU usage peaking at 68% (+-9%) in virtual deployments and network traffic reaching 480.8 MB (+-18 MB), effectively mitigating risks associated with eavesdrop** or eclipse attacks. △ Less

Submitted 9 December, 2023; v1 submitted 21 July, 2023; originally announced July 2023.

arXiv:2306.15559 [pdf, other]

RansomAI: AI-powered Ransomware for Stealthy Encryption

Authors: Jan von der Assen, Alberto Huertas Celdrán, Janik Luechinger, Pedro Miguel Sánchez Sánchez, Gérôme Bovet, Gregorio Martínez Pérez, Burkhard Stiller

Abstract: Cybersecurity solutions have shown promising performance when detecting ransomware samples that use fixed algorithms and encryption rates. However, due to the current explosion of Artificial Intelligence (AI), sooner than later, ransomware (and malware in general) will incorporate AI techniques to intelligently and dynamically adapt its encryption behavior to be undetected. It might result in inef… ▽ More Cybersecurity solutions have shown promising performance when detecting ransomware samples that use fixed algorithms and encryption rates. However, due to the current explosion of Artificial Intelligence (AI), sooner than later, ransomware (and malware in general) will incorporate AI techniques to intelligently and dynamically adapt its encryption behavior to be undetected. It might result in ineffective and obsolete cybersecurity solutions, but the literature lacks AI-powered ransomware to verify it. Thus, this work proposes RansomAI, a Reinforcement Learning-based framework that can be integrated into existing ransomware samples to adapt their encryption behavior and stay stealthy while encrypting files. RansomAI presents an agent that learns the best encryption algorithm, rate, and duration that minimizes its detection (using a reward mechanism and a fingerprinting intelligent detection system) while maximizing its damage function. The proposed framework was validated in a ransomware, Ransomware-PoC, that infected a Raspberry Pi 4, acting as a crowdsensor. A pool of experiments with Deep Q-Learning and Isolation Forest (deployed on the agent and detection system, respectively) has demonstrated that RansomAI evades the detection of Ransomware-PoC affecting the Raspberry Pi 4 in a few minutes with >90% accuracy. △ Less

Submitted 27 June, 2023; originally announced June 2023.

arXiv:2306.09750 [pdf, other]

doi 10.1016/j.eswa.2023.122861

Fedstellar: A Platform for Decentralized Federated Learning

Authors: Enrique Tomás Martínez Beltrán, Ángel Luis Perales Gómez, Chao Feng, Pedro Miguel Sánchez Sánchez, Sergio López Bernal, Gérôme Bovet, Manuel Gil Pérez, Gregorio Martínez Pérez, Alberto Huertas Celdrán

Abstract: In 2016, Google proposed Federated Learning (FL) as a novel paradigm to train Machine Learning (ML) models across the participants of a federation while preserving data privacy. Since its birth, Centralized FL (CFL) has been the most used approach, where a central entity aggregates participants' models to create a global one. However, CFL presents limitations such as communication bottlenecks, sin… ▽ More In 2016, Google proposed Federated Learning (FL) as a novel paradigm to train Machine Learning (ML) models across the participants of a federation while preserving data privacy. Since its birth, Centralized FL (CFL) has been the most used approach, where a central entity aggregates participants' models to create a global one. However, CFL presents limitations such as communication bottlenecks, single point of failure, and reliance on a central server. Decentralized Federated Learning (DFL) addresses these issues by enabling decentralized model aggregation and minimizing dependency on a central entity. Despite these advances, current platforms training DFL models struggle with key issues such as managing heterogeneous federation network topologies. To overcome these challenges, this paper presents Fedstellar, a platform extended from p2pfl library and designed to train FL models in a decentralized, semi-decentralized, and centralized fashion across diverse federations of physical or virtualized devices. The Fedstellar implementation encompasses a web application with an interactive graphical interface, a controller for deploying federations of nodes using physical or virtual devices, and a core deployed on each device which provides the logic needed to train, aggregate, and communicate in the network. The effectiveness of the platform has been demonstrated in two scenarios: a physical deployment involving single-board devices such as Raspberry Pis for detecting cyberattacks, and a virtualized deployment comparing various FL approaches in a controlled environment using MNIST and CIFAR-10 datasets. In both scenarios, Fedstellar demonstrated consistent performance and adaptability, achieving F1 scores of 91%, 98%, and 91.2% using DFL for detecting cyberattacks and classifying MNIST and CIFAR-10, respectively, reducing training time by 32% compared to centralized approaches. △ Less

Submitted 8 April, 2024; v1 submitted 16 June, 2023; originally announced June 2023.

arXiv:2306.08495 [pdf, other]

Single-board Device Individual Authentication based on Hardware Performance and Autoencoder Transformer Models

Authors: Pedro Miguel Sánchez Sánchez, Alberto Huertas Celdrán, Gérôme Bovet, Gregorio Martínez Pérez

Abstract: The proliferation of the Internet of Things (IoT) has led to the emergence of crowdsensing applications, where a multitude of interconnected devices collaboratively collect and analyze data. Ensuring the authenticity and integrity of the data collected by these devices is crucial for reliable decision-making and maintaining trust in the system. Traditional authentication methods are often vulnerab… ▽ More The proliferation of the Internet of Things (IoT) has led to the emergence of crowdsensing applications, where a multitude of interconnected devices collaboratively collect and analyze data. Ensuring the authenticity and integrity of the data collected by these devices is crucial for reliable decision-making and maintaining trust in the system. Traditional authentication methods are often vulnerable to attacks or can be easily duplicated, posing challenges to securing crowdsensing applications. Besides, current solutions leveraging device behavior are mostly focused on device identification, which is a simpler task than authentication. To address these issues, an individual IoT device authentication framework based on hardware behavior fingerprinting and Transformer autoencoders is proposed in this work. This solution leverages the inherent imperfections and variations in IoT device hardware to differentiate between devices with identical specifications. By monitoring and analyzing the behavior of key hardware components, such as the CPU, GPU, RAM, and Storage on devices, unique fingerprints for each device are created. The performance samples are considered as time series data and used to train outlier detection transformer models, one per device and aiming to model its normal data distribution. Then, the framework is validated within a spectrum crowdsensing system leveraging Raspberry Pi devices. After a pool of experiments, the model from each device is able to individually authenticate it between the 45 devices employed for validation. An average True Positive Rate (TPR) of 0.74+-0.13 and an average maximum False Positive Rate (FPR) of 0.06+-0.09 demonstrate the effectiveness of this approach in enhancing authentication, security, and trust in crowdsensing applications. △ Less

Submitted 11 November, 2023; v1 submitted 14 June, 2023; originally announced June 2023.

arXiv:2302.13784 [pdf, other]

Solution for the EPO CodeFest on Green Plastics: Hierarchical multi-label classification of patents relating to green plastics using deep learning

Authors: Tingting Qiao, Gonzalo Moro Perez

Abstract: This work aims at hierarchical multi-label patents classification for patents disclosing technologies related to green plastics. This is an emerging field for which there is currently no classification scheme, and hence, no labeled data is available, making this task particularly challenging. We first propose a classification scheme for this technology and a way to learn a machine learning model t… ▽ More This work aims at hierarchical multi-label patents classification for patents disclosing technologies related to green plastics. This is an emerging field for which there is currently no classification scheme, and hence, no labeled data is available, making this task particularly challenging. We first propose a classification scheme for this technology and a way to learn a machine learning model to classify patents into the proposed classification scheme. To achieve this, we come up with a strategy to automatically assign labels to patents in order to create a labeled training dataset that can be used to learn a classification model in a supervised learning setting. Using said training dataset, we come up with two classification models, a SciBERT Neural Network (SBNN) model and a SciBERT Hierarchical Neural Network (SBHNN) model. Both models use a BERT model as a feature extractor and on top of it, a neural network as a classifier. We carry out extensive experiments and report commonly evaluation metrics for this challenging classification problem. The experiment results verify the validity of our approach and show that our model sets a very strong benchmark for this problem. We also interpret our models by visualizing the word importance given by the trained model, which indicates the model is capable to extract high-level semantic information of input documents. Finally, we highlight how our solution fulfills the evaluation criteria for the EPO CodeFest and we also outline possible directions for future work. Our code has been made available at https://github.com/epo/CF22-Green-Hands △ Less

Submitted 22 February, 2023; originally announced February 2023.

arXiv:2302.09844 [pdf, other]

FederatedTrust: A Solution for Trustworthy Federated Learning

Authors: Pedro Miguel Sánchez Sánchez, Alberto Huertas Celdrán, Ning Xie, Gérôme Bovet, Gregorio Martínez Pérez, Burkhard Stiller

Abstract: The rapid expansion of the Internet of Things (IoT) and Edge Computing has presented challenges for centralized Machine and Deep Learning (ML/DL) methods due to the presence of distributed data silos that hold sensitive information. To address concerns regarding data privacy, collaborative and privacy-preserving ML/DL techniques like Federated Learning (FL) have emerged. However, ensuring data pri… ▽ More The rapid expansion of the Internet of Things (IoT) and Edge Computing has presented challenges for centralized Machine and Deep Learning (ML/DL) methods due to the presence of distributed data silos that hold sensitive information. To address concerns regarding data privacy, collaborative and privacy-preserving ML/DL techniques like Federated Learning (FL) have emerged. However, ensuring data privacy and performance alone is insufficient since there is a growing need to establish trust in model predictions. Existing literature has proposed various approaches on trustworthy ML/DL (excluding data privacy), identifying robustness, fairness, explainability, and accountability as important pillars. Nevertheless, further research is required to identify trustworthiness pillars and evaluation metrics specifically relevant to FL models, as well as to develop solutions that can compute the trustworthiness level of FL models. This work examines the existing requirements for evaluating trustworthiness in FL and introduces a comprehensive taxonomy consisting of six pillars (privacy, robustness, fairness, explainability, accountability, and federation), along with over 30 metrics for computing the trustworthiness of FL models. Subsequently, an algorithm named FederatedTrust is designed based on the pillars and metrics identified in the taxonomy to compute the trustworthiness score of FL models. A prototype of FederatedTrust is implemented and integrated into the learning process of FederatedScope, a well-established FL framework. Finally, five experiments are conducted using different configurations of FederatedScope to demonstrate the utility of FederatedTrust in computing the trustworthiness of FL models. Three experiments employ the FEMNIST dataset, and two utilize the N-BaIoT dataset considering a real-world IoT security use case. △ Less

Submitted 6 July, 2023; v1 submitted 20 February, 2023; originally announced February 2023.

arXiv:2212.14677 [pdf, other]

Adversarial attacks and defenses on ML- and hardware-based IoT device fingerprinting and identification

Authors: Pedro Miguel Sánchez Sánchez, Alberto Huertas Celdrán, Gérôme Bovet, Gregorio Martínez Pérez

Abstract: In the last years, the number of IoT devices deployed has suffered an undoubted explosion, reaching the scale of billions. However, some new cybersecurity issues have appeared together with this development. Some of these issues are the deployment of unauthorized devices, malicious code modification, malware deployment, or vulnerability exploitation. This fact has motivated the requirement for new… ▽ More In the last years, the number of IoT devices deployed has suffered an undoubted explosion, reaching the scale of billions. However, some new cybersecurity issues have appeared together with this development. Some of these issues are the deployment of unauthorized devices, malicious code modification, malware deployment, or vulnerability exploitation. This fact has motivated the requirement for new device identification mechanisms based on behavior monitoring. Besides, these solutions have recently leveraged Machine and Deep Learning techniques due to the advances in this field and the increase in processing capabilities. In contrast, attackers do not stay stalled and have developed adversarial attacks focused on context modification and ML/DL evaluation evasion applied to IoT device identification solutions. This work explores the performance of hardware behavior-based individual device identification, how it is affected by possible context- and ML/DL-focused attacks, and how its resilience can be improved using defense techniques. In this sense, it proposes an LSTM-CNN architecture based on hardware performance behavior for individual device identification. Then, previous techniques have been compared with the proposed architecture using a hardware performance dataset collected from 45 Raspberry Pi devices running identical software. The LSTM-CNN improves previous solutions achieving a +0.96 average F1-Score and 0.8 minimum TPR for all devices. Afterward, context- and ML/DL-focused adversarial attacks were applied against the previous model to test its robustness. A temperature-based context attack was not able to disrupt the identification. However, some ML/DL state-of-the-art evasion attacks were successful. Finally, adversarial training and model distillation defense techniques are selected to improve the model resilience to evasion attacks, without degrading its performance. △ Less

Submitted 30 December, 2022; originally announced December 2022.

arXiv:2212.14647 [pdf, other]

doi 10.1109/TIFS.2024.3402055

RL and Fingerprinting to Select Moving Target Defense Mechanisms for Zero-day Attacks in IoT

Authors: Alberto Huertas Celdrán, Pedro Miguel Sánchez Sánchez, Jan von der Assen, Timo Schenk, Gérôme Bovet, Gregorio Martínez Pérez, Burkhard Stiller

Abstract: Cybercriminals are moving towards zero-day attacks affecting resource-constrained devices such as single-board computers (SBC). Assuming that perfect security is unrealistic, Moving Target Defense (MTD) is a promising approach to mitigate attacks by dynamically altering target attack surfaces. Still, selecting suitable MTD techniques for zero-day attacks is an open challenge. Reinforcement Learnin… ▽ More Cybercriminals are moving towards zero-day attacks affecting resource-constrained devices such as single-board computers (SBC). Assuming that perfect security is unrealistic, Moving Target Defense (MTD) is a promising approach to mitigate attacks by dynamically altering target attack surfaces. Still, selecting suitable MTD techniques for zero-day attacks is an open challenge. Reinforcement Learning (RL) could be an effective approach to optimize the MTD selection through trial and error, but the literature fails when i) evaluating the performance of RL and MTD solutions in real-world scenarios, ii) studying whether behavioral fingerprinting is suitable for representing SBC's states, and iii) calculating the consumption of resources in SBC. To improve these limitations, the work at hand proposes an online RL-based framework to learn the correct MTD mechanisms mitigating heterogeneous zero-day attacks in SBC. The framework considers behavioral fingerprinting to represent SBCs' states and RL to learn MTD techniques that mitigate each malicious state. It has been deployed on a real IoT crowdsensing scenario with a Raspberry Pi acting as a spectrum sensor. More in detail, the Raspberry Pi has been infected with different samples of command and control malware, rootkits, and ransomware to later select between four existing MTD techniques. A set of experiments demonstrated the suitability of the framework to learn proper MTD techniques mitigating all attacks (except a harmfulness rootkit) while consuming <1 MB of storage and utilizing <55% CPU and <80% RAM. △ Less

Submitted 30 December, 2022; originally announced December 2022.

arXiv:2212.03169 [pdf, other]

When Brain-Computer Interfaces Meet the Metaverse: Landscape, Demonstrator, Trends, Challenges, and Concerns

Authors: Sergio López Bernal, Mario Quiles Pérez, Enrique Tomás Martínez Beltrán, Gregorio Martínez Pérez, Alberto Huertas Celdrán

Abstract: The metaverse has gained tremendous popularity in recent years, allowing the interconnection of users worldwide. However, current systems in metaverse scenarios, such as virtual reality glasses, offer a partial immersive experience. In this context, Brain-Computer Interfaces (BCIs) can introduce a revolution in the metaverse, although a study of the applicability and implications of BCIs in these… ▽ More The metaverse has gained tremendous popularity in recent years, allowing the interconnection of users worldwide. However, current systems in metaverse scenarios, such as virtual reality glasses, offer a partial immersive experience. In this context, Brain-Computer Interfaces (BCIs) can introduce a revolution in the metaverse, although a study of the applicability and implications of BCIs in these virtual scenarios is required. Based on the absence of literature, this work reviews, for the first time, the applicability of BCIs in the metaverse, analyzing the current status of this integration based on different categories related to virtual worlds and the evolution of BCIs in these scenarios in the medium and long term. This work also proposes the design and implementation of a general framework that integrates BCIs with different data sources from sensors and actuators (e.g., VR glasses) based on a modular design to be easily extended. This manuscript also validates the framework in a demonstrator consisting of driving a car within a metaverse, using a BCI for neural data acquisition, a VR headset to provide realism, and a steering wheel and pedals. Four use cases (UCs) are selected, focusing on cognitive and emotional assessment of the driver, detection of drowsiness, and driver authentication while using the vehicle. Moreover, this manuscript offers an analysis of BCI trends in the metaverse, also identifying future challenges that the intersection of these technologies will face. Finally, it reviews the concerns that using BCIs in virtual world applications could generate according to different categories: accessibility, user inclusion, privacy, cybersecurity, physical safety, and ethics. △ Less

Submitted 16 November, 2023; v1 submitted 6 December, 2022; originally announced December 2022.

arXiv:2211.08413 [pdf, other]

doi 10.1109/COMST.2023.3315746

Decentralized Federated Learning: Fundamentals, State of the Art, Frameworks, Trends, and Challenges

Authors: Enrique Tomás Martínez Beltrán, Mario Quiles Pérez, Pedro Miguel Sánchez Sánchez, Sergio López Bernal, Gérôme Bovet, Manuel Gil Pérez, Gregorio Martínez Pérez, Alberto Huertas Celdrán

Abstract: In recent years, Federated Learning (FL) has gained relevance in training collaborative models without sharing sensitive data. Since its birth, Centralized FL (CFL) has been the most common approach in the literature, where a central entity creates a global model. However, a centralized approach leads to increased latency due to bottlenecks, heightened vulnerability to system failures, and trustwo… ▽ More In recent years, Federated Learning (FL) has gained relevance in training collaborative models without sharing sensitive data. Since its birth, Centralized FL (CFL) has been the most common approach in the literature, where a central entity creates a global model. However, a centralized approach leads to increased latency due to bottlenecks, heightened vulnerability to system failures, and trustworthiness concerns affecting the entity responsible for the global model creation. Decentralized Federated Learning (DFL) emerged to address these concerns by promoting decentralized model aggregation and minimizing reliance on centralized architectures. However, despite the work done in DFL, the literature has not (i) studied the main aspects differentiating DFL and CFL; (ii) analyzed DFL frameworks to create and evaluate new solutions; and (iii) reviewed application scenarios using DFL. Thus, this article identifies and analyzes the main fundamentals of DFL in terms of federation architectures, topologies, communication mechanisms, security approaches, and key performance indicators. Additionally, the paper at hand explores existing mechanisms to optimize critical DFL fundamentals. Then, the most relevant features of the current DFL frameworks are reviewed and compared. After that, it analyzes the most used DFL application scenarios, identifying solutions based on the fundamentals and frameworks previously defined. Finally, the evolution of existing DFL solutions is studied to provide a list of trends, lessons learned, and open challenges. △ Less

Submitted 13 September, 2023; v1 submitted 15 November, 2022; originally announced November 2022.

arXiv:2210.11517 [pdf, other]

A Security and Trust Framework for Decentralized 5G Marketplaces

Authors: José María Jorquera Valero, Manuel Gil Pérez, Gregorio Martínez Pérez

Abstract: 5G networks intend to cover user demands through multi-party collaborations in a secure and trustworthy manner. To this end, marketplaces play a pivotal role as enablers for network service consumers and infrastructure providers to offer, negotiate, and purchase 5G resources and services. Nevertheless, marketplaces often do not ensure trustworthy networking by analyzing the security and trust of t… ▽ More 5G networks intend to cover user demands through multi-party collaborations in a secure and trustworthy manner. To this end, marketplaces play a pivotal role as enablers for network service consumers and infrastructure providers to offer, negotiate, and purchase 5G resources and services. Nevertheless, marketplaces often do not ensure trustworthy networking by analyzing the security and trust of their members and offers. This paper presents a security and trust framework to enable the selection of reliable third-party providers based on their history and reputation. In addition, it also introduces a reward and punishment mechanism to continuously update trust scores according to security events. Finally, we showcase a real use case in which the security and trust framework is being applied. △ Less

Submitted 20 October, 2022; originally announced October 2022.

Journal ref: Proceedings of the VII Jornadas Nacionales de Investigación en Ciberseguridad, pp. 237-240, Bilbao, Spain (2022)

arXiv:2210.11501 [pdf, other]

Trust-as-a-Service: A reputation-enabled trust framework for 5G networks

Authors: José María Jorquera Valero, Pedro Miguel Sánchez Sánchez, Manuel Gil Pérez, Alberto Huertas Celdrán, Gregorio Martínez Pérez

Abstract: Trust, security, and privacy are three of the major pillars to assemble the fifth generation network and beyond. Despite such pillars are principally interconnected, they arise a multitude of challenges to be addressed separately. 5G ought to offer flexible and pervasive computing capabilities across multiple domains according to user demands and assuring trustworthy network providers. Distributed… ▽ More Trust, security, and privacy are three of the major pillars to assemble the fifth generation network and beyond. Despite such pillars are principally interconnected, they arise a multitude of challenges to be addressed separately. 5G ought to offer flexible and pervasive computing capabilities across multiple domains according to user demands and assuring trustworthy network providers. Distributed marketplaces expect to boost the trading of heterogeneous resources so as to enable the establishment of pervasive service chains between cross-domains. Nevertheless, the need for reliable parties as ``marketplace operators'' plays a pivotal role to achieving a trustworthy ecosystem. One of the principal blockages in managing foreseeable networks is the need of adapting previous trust models to accomplish the new network and business requirements. In this regard, this article is centered on trust management of 5G multi-party networks. The design of a reputation-based trust framework is proposed as a Trust-as-a-Service (TaaS) solution for any distributed multi-stakeholder environment where zero trust and zero-touch principles should be met. Besides, a literature review is also conducted to recognize the network and business requirements currently envisaged. Finally, the validation of the proposed trust framework is performed in a real research environment, the 5GBarcelona testbed, leveraging 12% of a 2.1GHz CPU with 20 cores and 2% of the 30GiB memory. In this regard, these outcomes reveal the feasibility of the TaaS solution in the context of determining reliable network operators. △ Less

Submitted 20 October, 2022; originally announced October 2022.

arXiv:2210.11061 [pdf, other]

Analyzing the Robustness of Decentralized Horizontal and Vertical Federated Learning Architectures in a Non-IID Scenario

Authors: Pedro Miguel Sánchez Sánchez, Alberto Huertas Celdrán, Enrique Tomás Martínez Beltrán, Daniel Demeter, Gérôme Bovet, Gregorio Martínez Pérez, Burkhard Stiller

Abstract: Federated learning (FL) allows participants to collaboratively train machine and deep learning models while protecting data privacy. However, the FL paradigm still presents drawbacks affecting its trustworthiness since malicious participants could launch adversarial attacks against the training process. Related work has studied the robustness of horizontal FL scenarios under different attacks. How… ▽ More Federated learning (FL) allows participants to collaboratively train machine and deep learning models while protecting data privacy. However, the FL paradigm still presents drawbacks affecting its trustworthiness since malicious participants could launch adversarial attacks against the training process. Related work has studied the robustness of horizontal FL scenarios under different attacks. However, there is a lack of work evaluating the robustness of decentralized vertical FL and comparing it with horizontal FL architectures affected by adversarial attacks. Thus, this work proposes three decentralized FL architectures, one for horizontal and two for vertical scenarios, namely HoriChain, VertiChain, and VertiComb. These architectures present different neural networks and training protocols suitable for horizontal and vertical scenarios. Then, a decentralized, privacy-preserving, and federated use case with non-IID data to classify handwritten digits is deployed to evaluate the performance of the three architectures. Finally, a set of experiments computes and compares the robustness of the proposed architectures when they are affected by different data poisoning based on image watermarks and gradient poisoning adversarial attacks. The experiments show that even though particular configurations of both attacks can destroy the classification performance of the architectures, HoriChain is the most robust one. △ Less

Submitted 20 October, 2022; originally announced October 2022.

arXiv:2210.07719 [pdf, other]

doi 10.1109/ICC45041.2023.10278951

A Lightweight Moving Target Defense Framework for Multi-purpose Malware Affecting IoT Devices

Authors: Jan von der Assen, Alberto Huertas Celdrán, Pedro Miguel Sánchez Sánchez, Jordan Cedeño, Gérôme Bovet, Gregorio Martínez Pérez, Burkhard Stiller

Abstract: Malware affecting Internet of Things (IoT) devices is rapidly growing due to the relevance of this paradigm in real-world scenarios. Specialized literature has also detected a trend towards multi-purpose malware able to execute different malicious actions such as remote control, data leakage, encryption, or code hiding, among others. Protecting IoT devices against this kind of malware is challengi… ▽ More Malware affecting Internet of Things (IoT) devices is rapidly growing due to the relevance of this paradigm in real-world scenarios. Specialized literature has also detected a trend towards multi-purpose malware able to execute different malicious actions such as remote control, data leakage, encryption, or code hiding, among others. Protecting IoT devices against this kind of malware is challenging due to their well-known vulnerabilities and limitation in terms of CPU, memory, and storage. To improve it, the moving target defense (MTD) paradigm was proposed a decade ago and has shown promising results, but there is a lack of IoT MTD solutions dealing with multi-purpose malware. Thus, this work proposes four MTD mechanisms changing IoT devices' network, data, and runtime environment to mitigate multi-purpose malware. Furthermore, it presents a lightweight and IoT-oriented MTD framework to decide what, when, and how the MTD mechanisms are deployed. Finally, the efficiency and effectiveness of the framework and MTD mechanisms are evaluated in a real-world scenario with one IoT spectrum sensor affected by multi-purpose malware. △ Less

Submitted 14 October, 2022; originally announced October 2022.

arXiv:2209.04048 [pdf, other]

Studying Drowsiness Detection Performance while Driving through Scalable Machine Learning Models using Electroencephalography

Authors: José Manuel Hidalgo Rogel, Enrique Tomás Martínez Beltrán, Mario Quiles Pérez, Sergio López Bernal, Gregorio Martínez Pérez, Alberto Huertas Celdrán

Abstract: - Background / Introduction: Driver drowsiness is a significant concern and one of the leading causes of traffic accidents. Advances in cognitive neuroscience and computer science have enabled the detection of drivers' drowsiness using Brain-Computer Interfaces (BCIs) and Machine Learning (ML). However, the literature lacks a comprehensive evaluation of drowsiness detection performance using a het… ▽ More - Background / Introduction: Driver drowsiness is a significant concern and one of the leading causes of traffic accidents. Advances in cognitive neuroscience and computer science have enabled the detection of drivers' drowsiness using Brain-Computer Interfaces (BCIs) and Machine Learning (ML). However, the literature lacks a comprehensive evaluation of drowsiness detection performance using a heterogeneous set of ML algorithms, and it is necessary to study the performance of scalable ML models suitable for groups of subjects. - Methods: To address these limitations, this work presents an intelligent framework employing BCIs and features based on electroencephalography for detecting drowsiness in driving scenarios. The SEED-VIG dataset is used to evaluate the best-performing models for individual subjects and groups. - Results: Results show that Random Forest (RF) outperformed other models used in the literature, such as Support Vector Machine (SVM), with a 78% f1-score for individual models. Regarding scalable models, RF reached a 79% f1-score, demonstrating the effectiveness of these approaches. This publication highlights the relevance of exploring a diverse set of ML algorithms and scalable approaches suitable for groups of subjects to improve drowsiness detection systems and ultimately reduce the number of accidents caused by driver fatigue. - Conclusions: The lessons learned from this study show that not only SVM but also other models not sufficiently explored in the literature are relevant for drowsiness detection. Additionally, scalable approaches are effective in detecting drowsiness, even when new subjects are evaluated. Thus, the proposed framework presents a novel approach for detecting drowsiness in driving scenarios using BCIs and ML. △ Less

Submitted 30 October, 2023; v1 submitted 8 September, 2022; originally announced September 2022.

arXiv:2204.08516 [pdf, other]

doi 10.1016/j.iot.2023.100764

LwHBench: A low-level hardware component benchmark and dataset for Single Board Computers

Authors: Pedro Miguel Sánchez Sánchez, José María Jorquera Valero, Alberto Huertas Celdrán, Gérôme Bovet, Manuel Gil Pérez, Gregorio Martínez Pérez

Abstract: In today's computing environment, where Artificial Intelligence (AI) and data processing are moving toward the Internet of Things (IoT) and Edge computing paradigms, benchmarking resource-constrained devices is a critical task to evaluate their suitability and performance. Between the employed devices, Single-Board Computers arise as multi-purpose and affordable systems. The literature has explore… ▽ More In today's computing environment, where Artificial Intelligence (AI) and data processing are moving toward the Internet of Things (IoT) and Edge computing paradigms, benchmarking resource-constrained devices is a critical task to evaluate their suitability and performance. Between the employed devices, Single-Board Computers arise as multi-purpose and affordable systems. The literature has explored Single-Board Computers performance when running high-level benchmarks specialized in particular application scenarios, such as AI or medical applications. However, lower-level benchmarking applications and datasets are needed to enable new Edge-based AI solutions for network, system and service management based on device and component performance, such as individual device identification. Thus, this paper presents LwHBench, a low-level hardware benchmarking application for Single-Board Computers that measures the performance of CPU, GPU, Memory and Storage taking into account the component constraints in these types of devices. LwHBench has been implemented for Raspberry Pi devices and run for 100 days on a set of 45 devices to generate an extensive dataset that allows the usage of AI techniques in scenarios where performance data can help in the device management process. Besides, to demonstrate the inter-scenario capability of the dataset, a series of AI-enabled use cases about device identification and context impact on performance are presented as exploration of the published data. Finally, the benchmark application has been adapted and applied to an agriculture-focused scenario where three RockPro64 devices are present. △ Less

Submitted 24 October, 2022; v1 submitted 18 April, 2022; originally announced April 2022.

arXiv:2202.00137 [pdf, other]

doi 10.1109/TDSC.2022.3204535

Studying the Robustness of Anti-adversarial Federated Learning Models Detecting Cyberattacks in IoT Spectrum Sensors

Authors: Pedro Miguel Sánchez Sánchez, Alberto Huertas Celdrán, Timo Schenk, Adrian Lars Benjamin Iten, Gérôme Bovet, Gregorio Martínez Pérez, Burkhard Stiller

Abstract: Device fingerprinting combined with Machine and Deep Learning (ML/DL) report promising performance when detecting cyberattacks targeting data managed by resource-constrained spectrum sensors. However, the amount of data needed to train models and the privacy concerns of such scenarios limit the applicability of centralized ML/DL-based approaches. Federated learning (FL) addresses these limitations… ▽ More Device fingerprinting combined with Machine and Deep Learning (ML/DL) report promising performance when detecting cyberattacks targeting data managed by resource-constrained spectrum sensors. However, the amount of data needed to train models and the privacy concerns of such scenarios limit the applicability of centralized ML/DL-based approaches. Federated learning (FL) addresses these limitations by creating federated and privacy-preserving models. However, FL is vulnerable to malicious participants, and the impact of adversarial attacks on federated models detecting spectrum sensing data falsification (SSDF) attacks on spectrum sensors has not been studied. To address this challenge, the first contribution of this work is the creation of a novel dataset suitable for FL and modeling the behavior (usage of CPU, memory, or file system, among others) of resource-constrained spectrum sensors affected by different SSDF attacks. The second contribution is a pool of experiments analyzing and comparing the robustness of federated models according to i) three families of spectrum sensors, ii) eight SSDF attacks, iii) four scenarios dealing with unsupervised (anomaly detection) and supervised (binary classification) federated models, iv) up to 33% of malicious participants implementing data and model poisoning attacks, and v) four aggregation functions acting as anti-adversarial mechanisms to increase the models robustness. △ Less

Submitted 31 January, 2022; originally announced February 2022.

arXiv:2201.05410 [pdf, other]

doi 10.1109/TDSC.2023.3252918

CyberSpec: Intelligent Behavioral Fingerprinting to Detect Attacks on Crowdsensing Spectrum Sensors

Authors: Alberto Huertas Celdrán, Pedro Miguel Sánchez Sánchez, Gérôme Bovet, Gregorio Martínez Pérez, Burkhard Stiller

Abstract: Integrated sensing and communication (ISAC) is a novel paradigm using crowdsensing spectrum sensors to help with the management of spectrum scarcity. However, well-known vulnerabilities of resource-constrained spectrum sensors and the possibility of being manipulated by users with physical access complicate their protection against spectrum sensing data falsification (SSDF) attacks. Most recent li… ▽ More Integrated sensing and communication (ISAC) is a novel paradigm using crowdsensing spectrum sensors to help with the management of spectrum scarcity. However, well-known vulnerabilities of resource-constrained spectrum sensors and the possibility of being manipulated by users with physical access complicate their protection against spectrum sensing data falsification (SSDF) attacks. Most recent literature suggests using behavioral fingerprinting and Machine/Deep Learning (ML/DL) for improving similar cybersecurity issues. Nevertheless, the applicability of these techniques in resource-constrained devices, the impact of attacks affecting spectrum data integrity, and the performance and scalability of models suitable for heterogeneous sensors types are still open challenges. To improve limitations, this work presents seven SSDF attacks affecting spectrum sensors and introduces CyberSpec, an ML/DL-oriented framework using device behavioral fingerprinting to detect anomalies produced by SSDF attacks affecting resource-constrained spectrum sensors. CyberSpec has been implemented and validated in ElectroSense, a real crowdsensing RF monitoring platform where several configurations of the proposed SSDF attacks have been executed in different sensors. A pool of experiments with different unsupervised ML/DL-based models has demonstrated the suitability of CyberSpec detecting the previous attacks within an acceptable timeframe. △ Less

Submitted 14 January, 2022; originally announced January 2022.

arXiv:2111.14434 [pdf, other]

doi 10.1007/s10586-022-03949-w

Robust Federated Learning for execution time-based device model identification under label-flip** attack

Authors: Pedro Miguel Sánchez Sánchez, Alberto Huertas Celdrán, José Rafael Buendía Rubio, Gérôme Bovet, Gregorio Martínez Pérez

Abstract: The computing device deployment explosion experienced in recent years, motivated by the advances of technologies such as Internet-of-Things (IoT) and 5G, has led to a global scenario with increasing cybersecurity risks and threats. Among them, device spoofing and impersonation cyberattacks stand out due to their impact and, usually, low complexity required to be launched. To solve this issue, seve… ▽ More The computing device deployment explosion experienced in recent years, motivated by the advances of technologies such as Internet-of-Things (IoT) and 5G, has led to a global scenario with increasing cybersecurity risks and threats. Among them, device spoofing and impersonation cyberattacks stand out due to their impact and, usually, low complexity required to be launched. To solve this issue, several solutions have emerged to identify device models and types based on the combination of behavioral fingerprinting and Machine/Deep Learning (ML/DL) techniques. However, these solutions are not appropriated for scenarios where data privacy and protection is a must, as they require data centralization for processing. In this context, newer approaches such as Federated Learning (FL) have not been fully explored yet, especially when malicious clients are present in the scenario setup. The present work analyzes and compares the device model identification performance of a centralized DL model with an FL one while using execution time-based events. For experimental purposes, a dataset containing execution-time features of 55 Raspberry Pis belonging to four different models has been collected and published. Using this dataset, the proposed solution achieved 0.9999 accuracy in both setups, centralized and federated, showing no performance decrease while preserving data privacy. Later, the impact of a label-flip** attack during the federated model training is evaluated, using several aggregation mechanisms as countermeasure. Zeno and coordinate-wise median aggregation show the best performance, although their performance greatly degrades when the percentage of fully malicious clients (all training samples poisoned) grows over 50%. △ Less

Submitted 29 November, 2021; originally announced November 2021.

arXiv:2106.15543 [pdf, other]

BOTTER: A framework to analyze social bots in Twitter

Authors: Javier Pastor-Galindo, Félix Gómez Mármol, Gregorio Martínez Pérez

Abstract: Social networks have triumphed in communicating people online, but they have also been exploited to launch influence operations for manipulating society. The deployment of software-controlled accounts (e.g., social bots) has proven to be one of the most effective enablers for that purpose, and tools for their detection have been developed and widely adopted. However, the way to analyze these accou… ▽ More Social networks have triumphed in communicating people online, but they have also been exploited to launch influence operations for manipulating society. The deployment of software-controlled accounts (e.g., social bots) has proven to be one of the most effective enablers for that purpose, and tools for their detection have been developed and widely adopted. However, the way to analyze these accounts and measure their impact is heterogeneous in the literature, where each case study performs unique measurements. To unify these efforts, we propose a common framework to analyze the interference of social bots in Twitter. The methodology compares the non-authentic actors with the rest of users from different perspectives, thus building objective metrics to measure their actual impact. We validate the framework by applying it to a dataset of Twitter iterations dated in the weeks preceding the 2019 Spanish general election. In this sense, we check that our framework facilitates the quantitative evaluation of unauthentic groups, particularly discovering that social bots changed the natural dynamics of the network in these days, but did not have a significant impact. We also consider this methodology as a practical tool for the qualitative interpretation of experimental results, particularly suggesting within the aforementioned electoral context that semi-automated accounts are potentially more threatening than fully automated ones. △ Less

Submitted 15 July, 2021; v1 submitted 29 June, 2021; originally announced June 2021.

arXiv:2106.08209 [pdf, other]

doi 10.1016/j.jnca.2022.103579

A methodology to identify identical single-board computers based on hardware behavior fingerprinting

Authors: Pedro Miguel Sánchez Sánchez, José María Jorquera Valero, Alberto Huertas Celdrán, Gérôme Bovet, Manuel Gil Pérez, Gregorio Martínez Pérez

Abstract: The connectivity and resource-constrained nature of single-board devices open the door to cybersecurity concerns affecting Internet of Things (IoT) scenarios. One of the most important issues is the presence of unauthorized IoT devices that want to impersonate legitimate ones by using identical hardware and software specifications. This situation can provoke sensitive information leakages, data po… ▽ More The connectivity and resource-constrained nature of single-board devices open the door to cybersecurity concerns affecting Internet of Things (IoT) scenarios. One of the most important issues is the presence of unauthorized IoT devices that want to impersonate legitimate ones by using identical hardware and software specifications. This situation can provoke sensitive information leakages, data poisoning, or privilege escalation in IoT scenarios. Combining behavioral fingerprinting and Machine/Deep Learning (ML/DL) techniques is a promising approach to identify these malicious spoofing devices by detecting minor performance differences generated by imperfections in manufacturing. However, existing solutions are not suitable for single-board devices since they do not consider their hardware and software limitations, underestimate critical aspects such as fingerprint stability or context changes, and do not explore the potential of ML/DL techniques. To improve it, this work first identifies the essential properties for single-board device identification: uniqueness, stability, diversity, scalability, efficiency, robustness, and security. Then, a novel methodology relies on behavioral fingerprinting to identify identical single-board devices and meet the previous properties. The methodology leverages the different built-in components of the system and ML/DL techniques, comparing the device internal behavior with each other to detect manufacturing variations. The methodology validation has been performed in a real environment composed of 15 identical Raspberry Pi 4 B and 10 Raspberry Pi 3 B+ devices, obtaining a 91.9% average TPR and identifying all devices by setting a 50% threshold in the evaluation process. Finally, a discussion compares the proposed solution with related work, highlighting the fingerprint properties not met, and provides important lessons learned and limitations. △ Less

Submitted 22 June, 2022; v1 submitted 15 June, 2021; originally announced June 2021.

arXiv:2106.04968 [pdf, other]

Eight Reasons Why Cybersecurity on Novel Generations of Brain-Computer Interfaces Must Be Prioritized

Authors: Sergio López Bernal, Alberto Huertas Celdrán, Gregorio Martínez Pérez

Abstract: This article presents eight neural cyberattacks affecting spontaneous neural activity, inspired by well-known cyberattacks from the computer science domain: Neural Flooding, Neural Jamming, Neural Scanning, Neural Selective Forwarding, Neural Spoofing, Neural Sybil, Neural Sinkhole and Neural Nonce. These cyberattacks are based on the exploitation of vulnerabilities existing in the new generation… ▽ More This article presents eight neural cyberattacks affecting spontaneous neural activity, inspired by well-known cyberattacks from the computer science domain: Neural Flooding, Neural Jamming, Neural Scanning, Neural Selective Forwarding, Neural Spoofing, Neural Sybil, Neural Sinkhole and Neural Nonce. These cyberattacks are based on the exploitation of vulnerabilities existing in the new generation of Brain-Computer Interfaces. After presenting their formal definitions, the cyberattacks have been implemented over a neuronal simulation. To evaluate the impact of each cyberattack, they have been implemented in a Convolutional Neural Network (CNN) simulating a portion of a mouse's visual cortex. This implementation is based on existing literature indicating the similarities that CNNs have with neuronal structures from the visual cortex. Some conclusions are also provided, indicating that Neural Nonce and Neural Jamming are the most impactful cyberattacks for short-term effects, while Neural Scanning and Neural Nonce are the most damaging for long-term effects. △ Less

Submitted 9 June, 2021; originally announced June 2021.

arXiv:2105.10997 [pdf, other]

Neuronal Jamming Cyberattack over Invasive BCI Affecting the Resolution of Tasks Requiring Visual Capabilities

Authors: Sergio López Bernal, Alberto Huertas Celdrán, Gregorio Martínez Pérez

Abstract: Invasive Brain-Computer Interfaces (BCI) are extensively used in medical application scenarios to record, stimulate, or inhibit neural activity with different purposes. An example is the stimulation of some brain areas to reduce the effects generated by Parkinson's disease. Despite the advances in recent years, cybersecurity on BCI is an open challenge since attackers can exploit the vulnerabiliti… ▽ More Invasive Brain-Computer Interfaces (BCI) are extensively used in medical application scenarios to record, stimulate, or inhibit neural activity with different purposes. An example is the stimulation of some brain areas to reduce the effects generated by Parkinson's disease. Despite the advances in recent years, cybersecurity on BCI is an open challenge since attackers can exploit the vulnerabilities of invasive BCIs to induce malicious stimulation or treatment disruption, affecting neuronal activity. In this work, we design and implement a novel neuronal cyberattack, called Neuronal Jamming (JAM), which prevents neurons from producing spikes. To implement and measure the JAM impact, and due to the lack of realistic neuronal topologies in mammalians, we have defined a use case with a Convolutional Neural Network (CNN) trained to allow a mouse to exit a particular maze. The resulting model has been translated to a neural topology, simulating a portion of a mouse's visual cortex. The impact of JAM on both biological and artificial networks is measured, analyzing how the attacks can both disrupt the spontaneous neural signaling and the mouse's capacity to exit the maze. Besides, we compare the impacts of both JAM and FLO (an existing neural cyberattack) demonstrating that JAM generates a higher impact in terms of neuronal spike rate. Finally, we discuss on whether and how JAM and FLO attacks could induce the effects of neurodegenerative diseases if the implanted BCI had a comprehensive electrode coverage of the targeted brain regions. △ Less

Submitted 23 May, 2021; originally announced May 2021.

arXiv:2008.03343 [pdf, other]

doi 10.1109/COMST.2021.3064259

A Survey on Device Behavior Fingerprinting: Data Sources, Techniques, Application Scenarios, and Datasets

Authors: Pedro Miguel Sánchez Sánchez, Jose María Jorquera Valero, Alberto Huertas Celdrán, Gérôme Bovet, Manuel Gil Pérez, Gregorio Martínez Pérez

Abstract: In the current network-based computing world, where the number of interconnected devices grows exponentially, their diversity, malfunctions, and cybersecurity threats are increasing at the same rate. To guarantee the correct functioning and performance of novel environments such as Smart Cities, Industry 4.0, or crowdsensing, it is crucial to identify the capabilities of their devices (e.g., senso… ▽ More In the current network-based computing world, where the number of interconnected devices grows exponentially, their diversity, malfunctions, and cybersecurity threats are increasing at the same rate. To guarantee the correct functioning and performance of novel environments such as Smart Cities, Industry 4.0, or crowdsensing, it is crucial to identify the capabilities of their devices (e.g., sensors, actuators) and detect potential misbehavior that may arise due to cyberattacks, system faults, or misconfigurations. With this goal in mind, a promising research field emerged focusing on creating and managing fingerprints that model the behavior of both the device actions and its components. The article at hand studies the recent growth of the device behavior fingerprinting field in terms of application scenarios, behavioral sources, and processing and evaluation techniques. First, it performs a comprehensive review of the device types, behavioral data, and processing and evaluation techniques used by the most recent and representative research works dealing with two major scenarios: device identification and device misbehavior detection. After that, each work is deeply analyzed and compared, emphasizing its characteristics, advantages, and limitations. This article also provides researchers with a review of the most relevant characteristics of existing datasets as most of the novel processing techniques are based on machine learning and deep learning. Finally, it studies the evolution of these two scenarios in recent years, providing lessons learned, current trends, and future research challenges to guide new solutions in the area. △ Less

Submitted 3 March, 2021; v1 submitted 7 August, 2020; originally announced August 2020.

arXiv:2007.09466 [pdf, other]

Cyberattacks on Miniature Brain Implants to Disrupt Spontaneous Neural Signaling

Authors: Sergio López Bernal, Alberto Huertas Celdrán, Lorenzo Fernández Maimó, Michael Taynnan Barros, Sasitharan Balasubramaniam, Gregorio Martínez Pérez

Abstract: Brain-Computer Interfaces (BCI) arose as systems that merge computing systems with the human brain to facilitate recording, stimulation, and inhibition of neural activity. Over the years, the development of BCI technologies has shifted towards miniaturization of devices that can be seamlessly embedded into the brain and can target single neuron or small population sensing and control. We present a… ▽ More Brain-Computer Interfaces (BCI) arose as systems that merge computing systems with the human brain to facilitate recording, stimulation, and inhibition of neural activity. Over the years, the development of BCI technologies has shifted towards miniaturization of devices that can be seamlessly embedded into the brain and can target single neuron or small population sensing and control. We present a motivating example highlighting vulnerabilities of two promising micron-scale BCI technologies, demonstrating the lack of security and privacy principles in existing solutions. This situation opens the door to a novel family of cyberattacks, called neuronal cyberattacks, affecting neuronal signaling. This paper defines the first two neural cyberattacks, Neuronal Flooding (FLO) and Neuronal Scanning (SCA), where each threat can affect the natural activity of neurons. This work implements these attacks in a neuronal simulator to determine their impact over the spontaneous neuronal behavior, defining three metrics: number of spikes, percentage of shifts, and dispersion of spikes. Several experiments demonstrate that both cyberattacks produce a reduction of spikes compared to spontaneous behavior, generating a rise in temporal shifts and a dispersion increase. Mainly, SCA presents a higher impact than FLO in the metrics focused on the number of spikes and dispersion, where FLO is slightly more damaging, considering the percentage of shifts. Nevertheless, the intrinsic behavior of each attack generates a differentiation on how they alter neuronal signaling. FLO is adequate to generate an immediate impact on the neuronal activity, whereas SCA presents higher effectiveness for damages to the neural signaling in the long-term. △ Less

Submitted 10 September, 2020; v1 submitted 18 July, 2020; originally announced July 2020.

arXiv:2004.07877 [pdf, other]

doi 10.1016/j.cose.2020.102168

AuthCODE: A Privacy-preserving and Multi-device Continuous Authentication Architecture based on Machine and Deep Learning

Authors: Pedro Miguel Sánchez Sánchez, Alberto Huertas Celdrán, Lorenzo Fernández Maimó, Gregorio Martínez Pérez

Abstract: The authentication field is evolving towards mechanisms able to keep users continuously authenticated without the necessity of remembering or possessing authentication credentials. While existing continuous authentication systems have demonstrated their suitability for single-device scenarios, the Internet of Things and next generation of mobile networks (5G) are enabling novel multi-device scenar… ▽ More The authentication field is evolving towards mechanisms able to keep users continuously authenticated without the necessity of remembering or possessing authentication credentials. While existing continuous authentication systems have demonstrated their suitability for single-device scenarios, the Internet of Things and next generation of mobile networks (5G) are enabling novel multi-device scenarios -- such as Smart Offices -- where continuous authentication is still an open challenge. The paper at hand, proposes an AI-based, privacy-preserving and multi-device continuous authentication architecture called AuthCODE. A realistic Smart Office scenario with several users, interacting with their mobile devices and personal computer, has been used to create a set of single- and multi-device behavioural datasets and validate AuthCODE. A pool of experiments with machine and deep learning classifiers measured the impact of time in authentication accuracy and improved the results of single-device approaches by considering multi-device behaviour profiles. The f1-score average reached for XGBoost on multi-device profiles based on 1-minute windows was 99.33%, while the best performance achieved for single devices was lower than 97.39%. The inclusion of temporal information in the form of vector sequences classified by a Long-Short Term Memory Network, allowed the identification of additional complex behaviour patterns associated to each user, resulting in an average f1-score of 99.02% on identification of long-term behaviours. △ Less

Submitted 30 November, 2020; v1 submitted 16 April, 2020; originally announced April 2020.

arXiv:2004.00931 [pdf, other]

doi 10.1109/TNSM.2020.3031573

Spotting political social bots in Twitter: A use case of the 2019 Spanish general election

Authors: Javier Pastor-Galindo, Mattia Zago, Pantaleone Nespoli, Sergio López Bernal, Alberto Huertas Celdrán, Manuel Gil Pérez, José A. Ruipérez-Valiente, Gregorio Martínez Pérez, Félix Gómez Mármol

Abstract: While social media has been proved as an exceptionally useful tool to interact with other people and massively and quickly spread helpful information, its great potential has been ill-intentionally leveraged as well to distort political elections and manipulate constituents. In the paper at hand, we analyzed the presence and behavior of social bots on Twitter in the context of the November 2019 Sp… ▽ More While social media has been proved as an exceptionally useful tool to interact with other people and massively and quickly spread helpful information, its great potential has been ill-intentionally leveraged as well to distort political elections and manipulate constituents. In the paper at hand, we analyzed the presence and behavior of social bots on Twitter in the context of the November 2019 Spanish general election. Throughout our study, we classified involved users as social bots or humans, and examined their interactions from a quantitative (i.e., amount of traffic generated and existing relations) and qualitative (i.e., user's political affinity and sentiment towards the most important parties) perspectives. Results demonstrated that a non-negligible amount of those bots actively participated in the election, supporting each of the five principal political parties. △ Less

Submitted 12 October, 2020; v1 submitted 2 April, 2020; originally announced April 2020.

arXiv:1908.03536 [pdf, other]

doi 10.1145/3427376

Security in Brain-Computer Interfaces: State-of-the-art, opportunities, and future challenges

Authors: Sergio López Bernal, Alberto Huertas Celdrán, Gregorio Martínez Pérez, Michael Taynnan Barros, Sasitharan Balasubramaniam

Abstract: BCIs have significantly improved the patients' quality of life by restoring damaged hearing, sight, and movement capabilities. After evolving their application scenarios, the current trend of BCI is to enable new innovative brain-to-brain and brain-to-the-Internet communication paradigms. This technological advancement generates opportunities for attackers since users' personal information and phy… ▽ More BCIs have significantly improved the patients' quality of life by restoring damaged hearing, sight, and movement capabilities. After evolving their application scenarios, the current trend of BCI is to enable new innovative brain-to-brain and brain-to-the-Internet communication paradigms. This technological advancement generates opportunities for attackers since users' personal information and physical integrity could be under tremendous risk. This work presents the existing versions of the BCI life-cycle and homogenizes them in a new approach that overcomes current limitations. After that, we offer a qualitative characterization of the security attacks affecting each phase of the BCI cycle to analyze their impacts and countermeasures documented in the literature. Finally, we reflect on lessons learned, highlighting research trends and future challenges concerning security on BCIs. △ Less

Submitted 2 October, 2020; v1 submitted 9 August, 2019; originally announced August 2019.

arXiv:1405.7831 [pdf, other]

ROMEO: ReputatiOn Model Enhancing OpenID Simulator

Authors: Ginés Dólera Tormo, Félix Gómez Mármol, Gregorio Martínez Pérez

Abstract: OpenID is a standard decentralized initiative aimed at allowing Internet users to use the same personal account to access different services. Since it does not rely on any central authority, it is hard for such users or other entities to validate the trust level of each entity deployed in the system. Some research has been conducted to handle this issue, defining a reputation framework to determin… ▽ More OpenID is a standard decentralized initiative aimed at allowing Internet users to use the same personal account to access different services. Since it does not rely on any central authority, it is hard for such users or other entities to validate the trust level of each entity deployed in the system. Some research has been conducted to handle this issue, defining a reputation framework to determine the trust level of a relying party based on past experiences. However, this framework has been proposed in a theoretical way and some deeper analysis and validation is still missing. Our main contribution in this paper consist of a simulation environment able to validate the feasibility of the reputation framework and analyze its behaviour within different scenarios. △ Less

Submitted 30 May, 2014; originally announced May 2014.

Showing 1–33 of 33 results for author: Pérez, G M