-
TelecomGPT: A Framework to Build Telecom-Specific Large Language Models
Authors:
Hang Zou,
Qiyang Zhao,
Yu Tian,
Lina Bariah,
Faouzi Bader,
Thierry Lestable,
Merouane Debbah
Abstract:
Large Language Models (LLMs) have the potential to revolutionize Sixth Generation (6G) communication networks. However, current mainstream LLMs generally lack specialized knowledge of the telecom domain. In this paper, for the first time, we propose a pipeline to adapt any general-purpose LLM into a telecom-specific LLM. We collect and build a telecom-specific pre-training dataset, instruction dataset, and preference dataset to perform continual pre-training, instruction tuning, and alignment tuning, respectively. In addition, because the telecom domain lacks widely accepted evaluation benchmarks, we extend existing benchmarks and propose three new ones: Telecom Math Modeling, Telecom Open QnA, and Telecom Code Tasks. These new benchmarks provide a holistic evaluation of LLM capabilities in the telecom domain, including math modeling, open-ended question answering, and code generation, infilling, summarization, and analysis. Our fine-tuned LLM, TelecomGPT, significantly outperforms state-of-the-art (SOTA) LLMs including GPT-4, Llama-3, and Mistral on the Telecom Math Modeling benchmark and achieves comparable performance on various evaluation benchmarks such as TeleQnA, 3GPP technical document classification, and telecom code summarization, generation, and infilling.
Submitted 12 July, 2024;
originally announced July 2024.
-
SecureFalcon: Are We There Yet in Automated Software Vulnerability Detection with LLMs?
Authors:
Mohamed Amine Ferrag,
Ammar Battah,
Norbert Tihanyi,
Ridhi Jain,
Diana Maimut,
Fatima Alwahedi,
Thierry Lestable,
Narinderjit Singh Thandi,
Abdechakour Mechri,
Merouane Debbah,
Lucas C. Cordeiro
Abstract:
Software vulnerabilities can cause numerous problems, including crashes, data loss, and security breaches. These issues greatly compromise quality and can negatively impact the market adoption of software applications and systems. Traditional bug-fixing methods, such as static analysis, often produce false positives. While bounded model checking, a form of Formal Verification (FV), can provide more accurate outcomes compared to static analyzers, it demands substantial resources and significantly hinders developer productivity. Can Machine Learning (ML) achieve accuracy comparable to FV methods and be used in popular instant code completion frameworks in near real-time? In this paper, we introduce SecureFalcon, an innovative model architecture with only 121 million parameters derived from the Falcon-40B model and explicitly tailored for classifying software vulnerabilities. To achieve the best performance, we trained our model using two datasets, namely the FormAI dataset and the FalconVulnDB. The FalconVulnDB is a combination of recent public datasets, namely the SySeVR framework, Draper VDISC, Bigvul, Diversevul, SARD Juliet, and ReVeal datasets. These datasets contain the top 25 most dangerous software weaknesses, such as CWE-119, CWE-120, CWE-476, CWE-122, CWE-190, CWE-121, CWE-78, CWE-787, CWE-20, and CWE-762. SecureFalcon achieves 94% accuracy in binary classification and up to 92% in multiclassification, with instant CPU inference times. It outperforms existing models such as BERT, RoBERTa, CodeBERT, and traditional ML algorithms, promising to push the boundaries of software vulnerability detection and instant code completion frameworks.
Submitted 29 May, 2024; v1 submitted 13 July, 2023;
originally announced July 2023.
-
Revolutionizing Cyber Threat Detection with Large Language Models: A privacy-preserving BERT-based Lightweight Model for IoT/IIoT Devices
Authors:
Mohamed Amine Ferrag,
Mthandazo Ndhlovu,
Norbert Tihanyi,
Lucas C. Cordeiro,
Merouane Debbah,
Thierry Lestable,
Narinderjit Singh Thandi
Abstract:
The field of Natural Language Processing (NLP) is currently undergoing a revolutionary transformation driven by the power of pre-trained Large Language Models (LLMs) based on groundbreaking Transformer architectures. As the frequency and diversity of cybersecurity attacks continue to rise, the importance of incident detection has significantly increased. IoT devices are expanding rapidly, resulting in a growing need for efficient techniques to autonomously identify network-based attacks in IoT networks with both high precision and minimal computational requirements. This paper presents SecurityBERT, a novel architecture that leverages the Bidirectional Encoder Representations from Transformers (BERT) model for cyber threat detection in IoT networks. During the training of SecurityBERT, we incorporated a novel privacy-preserving encoding technique called Privacy-Preserving Fixed-Length Encoding (PPFLE). We effectively represented network traffic data in a structured format by combining PPFLE with the Byte-level Byte-Pair Encoder (BBPE) Tokenizer. Our research demonstrates that SecurityBERT outperforms traditional Machine Learning (ML) and Deep Learning (DL) methods, such as Convolutional Neural Networks (CNNs) or Recurrent Neural Networks (RNNs), in cyber threat detection. Employing the Edge-IIoTset cybersecurity dataset, our experimental analysis shows that SecurityBERT achieved an impressive 98.2% overall accuracy in identifying fourteen distinct attack types, surpassing previous records set by hybrid solutions such as GAN-Transformer-based architectures and CNN-LSTM models. With an inference time of less than 0.15 seconds on an average CPU and a compact model size of just 16.7MB, SecurityBERT is ideally suited for real-life traffic analysis and a suitable choice for deployment on resource-constrained IoT devices.
Submitted 8 February, 2024; v1 submitted 25 June, 2023;
originally announced June 2023.
-
An Incremental Gray-box Physical Adversarial Attack on Neural Network Training
Authors:
Rabiah Al-qudah,
Moayad Aloqaily,
Bassem Ouni,
Mohsen Guizani,
Thierry Lestable
Abstract:
Neural networks have demonstrated remarkable success in learning and solving complex tasks in a variety of fields. Nevertheless, the rise of those networks in modern computing has been accompanied by concerns regarding their vulnerability to adversarial attacks. In this work, we propose a novel gradient-free, gray-box, incremental attack that targets the training process of neural networks. The proposed attack, which implicitly poisons the intermediate data structures that retain the training instances between training epochs, acquires its high-risk property from attacking data structures that are typically unobserved by professionals. Hence, the attack goes unnoticed despite the damage it can cause. Moreover, the attack can be executed without the attackers' knowledge of the neural network structure or training data, making it more dangerous. The attack was tested in a sensitive application of secure cognitive cities, namely biometric authentication. The conducted experiments showed that the proposed attack is effective and stealthy. Its effectiveness was concluded from the fact that it was able to flip the sign of the loss gradient in the conducted experiments to become positive, which indicated noisy and unstable training. Moreover, the attack decreased the inference probability in the poisoned networks compared to their unpoisoned counterparts by 15.37%, 14.68%, and 24.88% for DenseNet, VGG, and Xception, respectively. Finally, the attack retained its stealthiness despite its high effectiveness: it did not cause a notable increase in training time, and the F-score values dropped by an average of only 1.2%, 1.9%, and 1.5% for the poisoned DenseNet, VGG, and Xception, respectively.
Submitted 20 February, 2023;
originally announced March 2023.