Attention Meets UAVs:
A Comprehensive Evaluation of DDoS Detection in Low-Cost UAVs

Ashish Sharma¹, SVSLN Surya Suhas Vaddhiparthy², Sai Usha Goparaju³, Deepak Gangadharan⁴ and Harikumar Kandath⁵

¹Ashish Sharma, Computer Systems Group, International Institute of Information Technology Hyderabad, India. [email protected]
²SVSLN Surya Suhas Vaddhiparthy, Computer Systems Group, International Institute of Information Technology Hyderabad, India. [email protected]
³Sai Usha Goparaju, Computer Systems Group, International Institute of Information Technology Hyderabad, India. [email protected]
⁴Deepak Gangadharan, Computer Systems Group, International Institute of Information Technology Hyderabad, India. [email protected]
⁵Harikumar Kandath, Robotics Research Center, International Institute of Information Technology Hyderabad, India. [email protected]

Abstract

This paper addresses the problem of strengthening the cybersecurity of low-cost, Wi-Fi-based Unmanned Aerial Vehicles (UAVs) against Distributed Denial of Service (DDoS) attacks. We consider three variants of DDoS attacks, namely Transmission Control Protocol (TCP), Internet Control Message Protocol (ICMP), and TCP + ICMP flooding attacks, and develop a detection mechanism that runs on the companion computer of the UAV system. As part of the detection mechanism, we evaluate various machine learning and deep learning algorithms, such as XGBoost, Isolation Forest, Long Short-Term Memory (LSTM), Bidirectional-LSTM (Bi-LSTM), LSTM with attention, Bi-LSTM with attention, and Time Series Transformer (TST), in terms of various classification metrics. Our evaluation reveals that algorithms with attention mechanisms generally outperform their counterparts, and TST stands out as the best-performing model, with an inference time of roughly 0.1 seconds on the companion computer. TST demonstrates an F1 score of 0.999, 0.997, and 0.943 for TCP, ICMP, and TCP + ICMP flooding attacks, respectively. We present the steps required to build an on-board DDoS detection mechanism. Further, we present an ablation study to identify the best TST hyperparameters for DDoS detection, and we underscore the advantage of adopting learnable positional embeddings in TST for DDoS detection, with an improvement in F1 score from 0.94 to 0.99.

Index Terms:

UAV, DDoS Attacks, Cyber Security, DDoS Detection

I Introduction

Technological innovations have recently spread their wings through Unmanned Aerial Vehicles (UAVs), marking a dawn of advancements in aerial ingenuity. These advances have resulted in unparalleled growth in UAV adoption, particularly of low-cost UAVs, whose market is anticipated to reach USD 42.8 billion by 2025 [1]. The autonomy and cost-effectiveness, combined with the agile nature of low-cost UAVs, have drastically increased their visibility in various applications, such as Traffic Management, Healthcare, Disaster Response and Mega Sporting Events [2].

As the use of UAVs proliferates across various applications, scalability and reliable connectivity requirements become paramount for a UAV-based approach. Amidst the available approaches, Wi-Fi-based UAV connectivity emerges as one potential solution to enable the full capabilities of a UAV system [3]. One can establish resilient networks supporting swarm deployments by leveraging Wi-Fi technology [4].

Figure 1 illustrates a general Wi-Fi-based UAV system involving a Central Router, Ground Control Station (GCS), and a UAV system. A typical UAV system uses the Micro Air Vehicle Link (MAVLink), a lightweight open-source communication protocol, for seamless two-way communication between the GCS and the UAV system. MAVLink adopts a binary serialization approach for low-overhead communication between the devices in the network. Despite its widespread usage and development, this protocol is vulnerable to numerous security attacks, such as spoofing, Denial of Service (DoS), and message foraging attacks [5] [6].

Figure 1: General Wi-Fi-based UAV System

According to eSentire's 2023 Official Cybercrime report [7], the global annual cost of cybercrime is predicted to reach USD 9.5 trillion in 2024 and is projected to reach USD 10.5 trillion by 2025. The report highlights that this heightened risk can be attributed to the explosion in mobile, cloud, and Internet of Things (IoT) usage, and in remote tools. The Cybersecurity statistics report by Cobalt [8] highlights that Distributed Denial of Service (DDoS), Ransomware, and Phishing attacks hold a predominant share as the attackers' first choices. The report particularly emphasizes that resource-constrained edge applications such as IoT devices and UAVs fall prey to DDoS attacks significantly, making DDoS one of the most calamitous forms of cybersecurity attack [9].

On the other hand, multiple stakeholders opted to integrate AI with cybersecurity detection [6] in the endeavor to find a solution. Nevertheless, these efforts were promptly halted as this approach raised privacy and security issues according to the General Data Protection Regulation (GDPR) [8], as the detection models required data to be transferred outside the resource-constrained system for processing. Considering the UAV system’s resource-constrained nature, it is vital to deploy a resource-efficient mechanism on the UAV system to hinder any potential DDoS attacks. A general-purpose UAV system is equipped with a companion computer [6] capable of running a routine or a lightweight task. Therefore, an efficient approach can involve deploying a DDoS Detection model on the companion computer of the UAV system. Based on this notion, in the current work, we conduct a comprehensive evaluation considering various algorithms for DDoS Detection.

The contributions of our work are as follows.

1. We present the necessary pre-processing steps to collect a comprehensive dataset while conducting flooding attacks on UAVs, as there is a scarcity of diverse datasets tailored explicitly for such attacks.

2. We evaluate the feasibility of deploying Machine Learning (ML) and Deep Learning (DL) architectures on resource-constrained UAV companion computers for DDoS detection. Specifically, we have evaluated various algorithms, such as XGBoost (XGB), Isolation Forest (IF), Long Short-Term Memory (LSTM), Bidirectional-LSTM (Bi-LSTM), LSTM with attention (LSTM-A), Bi-LSTM with attention (BLSTM-A), and Time Series Transformer (TST) in terms of various classification metrics against three flooding attacks: TCP, ICMP, and TCP with ICMP.

3. To the best of our knowledge, for the first time, we have explored the capability of TST and compared it with other attention mechanisms for DDoS attack detection in UAVs. We have also conducted an ablation study to identify the best TST hyperparameters and presented the significance of learnable positional embeddings in improving the F1 score.

II Related Works

This section presents the UAV network attack scenario, followed by literature on DDoS attacks and detection mechanisms.

II-A UAV Network Attack Scenario

Numerous studies [10][6][11] have outlined the importance and benefits of adopting Wi-Fi-based UAVs. UAV systems use intermediate network nodes in either a centralized or decentralized manner for data exchange. A centralized network uses a router-based approach to control and maintain network devices, whereas a decentralized network relies on other network nodes for data exchange. A centralized network node can act as a single point of vulnerability because it can be exploited in collaborative attacks such as DDoS, making it crucial to analyze [12]. Existing works have adopted ML and DL algorithms to detect TCP and ICMP flooding attacks. However, they have not considered a mix of two DDoS flooding attacks [13].

II-B DDoS Attacks

Due to their increased complexity and frequency, DDoS attacks have become a prominent research problem. Lee et al. presented the first taxonomy and classification details of DDoS attacks and their effects [14]. These attacks can significantly compromise the network and computation resources of the target, which makes them calamitous for edge and UAV devices. DDoS attacks can often be characterized by fluctuating network traffic, which can be statistically analyzed [15]. An efficient DDoS attack usually involves network tools to generate attacks such as TCP and ICMP flooding, which ML algorithms can detect. However, these approaches can fail when a hybrid DDoS attack involving a mix of TCP and ICMP flooding is considered. The current work evaluates the performance of various ML and DL algorithms for TCP, ICMP, and TCP with ICMP attack modes in a Wi-Fi-based low-cost UAV system.

II-C DDoS Detection

As emphasized in several works [16] [9] [6] [17], ML and DL have demonstrated the capabilities of efficient DDoS detection. For instance, the work by Zhang et al. [16] has proposed DDoS detection based on the dynamic behavior of TCP and UDP protocols. The work introduces a two-step process where the first step analyzes the traffic, followed by a mathematical representation. Although efficient, the work has considered a simulation-based approach and has not been tested on a practical UAV system. The work by Cam et al. [9] has efficiently summarised various available approaches for DDoS detection in IoT devices. The work has surveyed wavelet-based approaches followed by ML algorithms, such as Random Forest. The work has also explored DL algorithms such as autoencoders and CNN-LSTMs. However, the work has not considered the effect of longer attack sequences, which can lead to increased chances of incorrect predictions.

The work by Tlili et al. [18] developed a general fault detection mechanism using LSTM, Bi-LSTM, and Gated Recurrent Unit algorithms. The authors emphasized and demonstrated the advantages of using Bi-LSTM for multi-fault-class detection scenarios. Further, the authors noted the need to study longer sequences to detect potential anomalies in a UAV scenario. Although these recurrent approaches are efficient, the work by Vaswani et al. [19] showed that transformers, with positional encoding and a multi-head self-attention mechanism, can in general capture long-term dependencies better than LSTM- and Bi-LSTM-based approaches. In the current work, we have adopted the Time Series Transformer (TST) [20], a modified version of the original transformer architecture that is better suited to time series analysis, for DDoS detection.

The rest of the paper is organized as follows. Section III describes the DDoS attack framework and scenario, Section IV presents the detection mechanism, Section V details the experimental setup, and Section VI reports the results. Finally, Section VII concludes the paper and outlines ideas for future research.

III DDoS Attack Framework and Scenario

In the current work, three DDoS flooding attacks are considered to analyze the effect of a DDoS attack on a Wi-Fi-based UAV system. The overall system architecture and the different flooding attacks are elaborated below.

III-A Overall Flooding Architecture

Figure 2 illustrates the overall DDoS architecture with three flooding approaches: TCP flooding, ICMP flooding, and TCP with ICMP flooding. A typical DDoS attack generates numerous DoS messages through single or multiple compromised devices. Each DoS message is assigned unique Internet Protocol (IP) and Media Access Control (MAC) addresses to mask their origins. These random messages then target the UAV system’s IP address, ultimately leading to compromised network bandwidth for the UAV system. The central router is the primary focal point for receiving the flood of illegitimate traffic and authentic information from the GCS. This information is then forwarded to the target UAV system.

III-B DDoS Flooding Approaches

TCP flooding and ICMP flooding are common approaches used in DDoS attacks to compromise the network bandwidth. The specific and general input parameters used in the Hping3 tool (an open-source packet generator) [21] for TCP and ICMP flooding are listed in Table I.

S.No.	Parameter	Description	Usage
1	U (TCP)	Set Urgent flag (TCP mode)	-U
2	p (TCP)	Target port	-p 80
3	1 (ICMP)	Set ICMP mode	-1
4	data	Size of packet body	--data 1000
5	n	Do not resolve hostnames	-n
6	flood	Send packets as fast as possible	--flood
7	target IP	Destination IP address	10.42.0.34

TABLE I: Hping3 Parameters for flooding attacks.

III-B1 TCP Flooding

TCP flooding involves transmitting many TCP connection requests to the target system from single or multiple sources, making it difficult for the target system to differentiate between legitimate and malicious traffic. The Hping3 parameters from Table I used for the TCP flooding attack are as follows. The -U parameter sets the Urgent flag, indicating that the receiver should prioritize the data following it. The -p parameter specifies the target port, and the data parameter specifies the size of the packet body in bytes. The -n parameter ensures that host names are not resolved, increasing the attack efficiency. The flood parameter sends packets as fast as possible, using all the available system resources, and the target IP argument indicates the destination IP address.

III-B2 ICMP Flooding

ICMP flooding involves sending a barrage of incorrectly defined ICMP packets to the target system. The target system attempts to respond to each received request, ultimately leading to exhausted network bandwidth. Apart from the TCP-specific parameters -U and -p, ICMP flooding attacks use the same parameters as TCP flooding attacks, along with the -1 flag to select ICMP mode in the Hping3 tool.

III-B3 TCP With ICMP Flooding

This hybrid attack involves transmitting many TCP connection requests to the target while simultaneously sending a barrage of incorrectly defined ICMP packets to the target system. Two parallel instances of the Hping3 tool can be used for generating this hybrid attack.

III-C Preliminary observations of a Flood Attack

Table II presents an analysis of packet inter-arrival times over a 10-minute TCP and ICMP flood attack. It can be observed that TCP packets have a higher mean inter-arrival time than ICMP packets, and hence a lower arrival rate. This can be explained by TCP's handshake process, which ensures data reliability at the cost of increased overhead. In contrast, ICMP is designed for speedy message exchange, which accounts for its quicker packet delivery.

Statistic	ICMP	TCP
Mean	$5.6\times 10^{-3}$ s	$7.7\times 10^{-3}$ s
Standard Deviation	$4.48\times 10^{-2}$ s	$4.52\times 10^{-2}$ s
25th Percentile	$1.1\times 10^{-5}$ s	$1.2\times 10^{-5}$ s
50th Percentile	$5.55\times 10^{-4}$ s	$4.9\times 10^{-5}$ s
75th Percentile	$1.475\times 10^{-3}$ s	$4.126\times 10^{-3}$ s
Arrival Rate	$1.7857\times 10^{2}$ pac/s	$1.2987\times 10^{2}$ pac/s

TABLE II: Inter-Arrival Times Statistics for Packets
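The arrival rates in Table II are consistent with the reciprocal of the mean inter-arrival times. The sketch below reproduces that relationship, assuming the rate is defined as 1 / mean inter-arrival time (the text does not state the formula explicitly); `arrival_rate` is a hypothetical helper name used only for illustration.

```python
def arrival_rate(mean_inter_arrival_s: float) -> float:
    """Packets per second, assuming rate = 1 / mean inter-arrival time."""
    return 1.0 / mean_inter_arrival_s


# Mean inter-arrival times reported in Table II.
icmp_rate = arrival_rate(5.6e-3)
tcp_rate = arrival_rate(7.7e-3)

print(f"ICMP: {icmp_rate:.2f} packets/s")  # 178.57, matching Table II
print(f"TCP:  {tcp_rate:.2f} packets/s")   # 129.87, matching Table II
```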

IV DDoS Detection

In this section, we delve into the DDoS Detection mechanism that runs on the companion computer of the UAV system. Figure 3 illustrates the pipeline for DDoS attack detection on the companion computer of the UAV system using Pixhawk flight control hardware. The network interface of the companion computer receives the MAVLink communication packets along with DDoS flood packets. All the incoming packets are then captured and forwarded to the packet queue for pre-processing, after which the classification algorithm is used for DDoS attack detection. MAVProxy, an open-source tool [22], acts as a proxy bridge to enable Wi-Fi-based data communication for the UAV system and is elaborated further in Section V-A.

IV-A Data Collection for Model Training and Evaluation

The incoming network traffic is captured using Wireshark, an open-source packet analyzer [23]. The packet capture component of Wireshark captures the relative arrival time, packet description, protocol name, and the sequence length of the incoming traffic.

Training Data: The training data is collected as a continuous packet capture of normal traffic followed by a DDoS attack. Here, a uniform window of 10 minutes of normal network traffic is followed by a 10-minute capture of DDoS-infested traffic. This is repeated separately for each of the two DDoS attacks (TCP and ICMP). The total packet counts and their capture durations are tabulated in Table III.

Name	Scenario	Packet Count	Duration (min)
Normal + TCP	Benign	60,401	10
Normal + TCP	Malicious	121,401	10
Normal + ICMP	Benign	73,663	10
Normal + ICMP	Malicious	86,929	10

TABLE III: Train Data Description

Testing Data: Similar to train data generation, a test permutation involves 10-minute normal network traffic followed by a 10-minute DDoS traffic. Here three permutations are used to validate the performance of various DDoS detection algorithms: Normal + TCP Data, Normal + ICMP Data, and Normal + mix of TCP and ICMP Data. Table IV summarises these various modes along with their packet count and durations.

Name	Scenario	Packet Count	Duration (min)
Normal + TCP	Benign	54,203	10
Normal + TCP	Malicious	110,945	10
Normal + ICMP	Benign	54,203	10
Normal + ICMP	Malicious	165,073	10
Normal + (TCP+ICMP)	Benign	36,025	10
Normal + (TCP+ICMP)	Malicious	441,442	10

TABLE IV: Test Data Description

IV-B Data Pre-Processing

In the current work, the MAVLink packet count is used as the classification feature to identify potential DDoS attacks. A sliding window of 0.1 seconds is used to aggregate the number of MAVLink packets that arrived, based on their relative arrival times. The aggregated data is then labeled with a binary encoding, with zero as the normal condition and one as the DDoS condition. Figure 4 illustrates the variation in total MAVLink packet count over windows of 0.1 seconds. It can be observed that the MAVLink packet count decreases significantly during a DDoS attack, making it a useful indicator for classification. However, a time series of the variation in MAVLink packet count can identify a malicious scenario better than a single window.
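The aggregation step can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes consecutive, non-overlapping 0.1-second windows, and the function name `count_per_window` and its arguments are hypothetical.

```python
from collections import Counter


def count_per_window(arrival_times, is_mavlink, window=0.1):
    """Count MAVLink packets in consecutive windows of `window` seconds.

    arrival_times: relative arrival times in seconds (non-negative).
    is_mavlink:    parallel list of booleans, True for MAVLink packets.
    """
    if not arrival_times:
        return []

    def bin_of(t):
        # The small epsilon keeps times such as 0.3 from landing in bin 2
        # because of floating-point division.
        return int(t / window + 1e-9)

    counts = Counter(
        bin_of(t) for t, m in zip(arrival_times, is_mavlink) if m
    )
    n_bins = bin_of(max(arrival_times)) + 1
    # Windows without any MAVLink packet are reported as zero.
    return [counts.get(i, 0) for i in range(n_bins)]


times = [0.01, 0.05, 0.12, 0.18, 0.19, 0.35]
flags = [True, True, True, False, True, True]
print(count_per_window(times, flags))  # [2, 2, 0, 1]
```

Each entry of the returned list corresponds to one 0.1-second window, so a drop towards zero over many consecutive entries is the pattern the classifiers are expected to pick up.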

Using the training data, which contains normal, TCP, and ICMP traffic information, as the reference, a Standard Scaler is applied to each of the testing conditions: the mean value is subtracted from each sample, and the result is divided by the standard deviation of the data.
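A minimal sketch of this standardization is given below. It assumes that the mean and standard deviation are computed once on the training counts and then reused for every test condition, which is one reading of "keeping the training data as a reference"; the helper names `fit_scaler` and `transform` are hypothetical. The population standard deviation is used, which is also what scikit-learn's StandardScaler computes.

```python
import statistics


def fit_scaler(train_counts):
    """Return (mean, std) of the training counts (population std)."""
    mean = statistics.fmean(train_counts)
    std = statistics.pstdev(train_counts)
    if std == 0:
        std = 1.0  # avoid division by zero for a constant series
    return mean, std


def transform(counts, mean, std):
    """Standardize counts with previously fitted statistics."""
    return [(c - mean) / std for c in counts]


train = [10, 12, 8, 10]   # illustrative per-window packet counts
test = [10, 2, 0]         # counts collapse during an attack
mean, std = fit_scaler(train)
scaled = transform(test, mean, std)
print(mean, round(std, 4), [round(v, 3) for v in scaled])
```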

IV-C Proposed and Comparison Algorithms to Detect DDoS

Keeping the data pre-processing approach the same, six algorithms (XGBoost, Isolation Forest, LSTM, Bi-LSTM, LSTM with attention, and Bi-LSTM with attention) are adopted as comparison algorithms to evaluate the classification performance of our proposed Time Series Transformer. The description and the necessary hyper-parameters of these models are elaborated below.

XGBoost: Extreme Gradient Boosting is a supervised learning algorithm that uses a gradient boosting approach to minimize the overall training loss. XGBoost is a tree-based ensemble model that is widely used for classification problems [24]. Grid Search is used for hyper-parameter tuning, and the tuned parameters are: n_estimators equal to 100, max_depth equal to 4, and subsample equal to 1. Here, n_estimators indicates the number of tree estimators used for classification, max_depth indicates the tree depth, and a subsample value of 1 indicates that each estimator is trained on the whole dataset.

LSTM and Bi-LSTM Attention Models: In this study, we use LSTM and its extensions, including Bi-LSTM equipped with an attention mechanism, to process MAVLink count data. LSTMs excel in retaining information from the MAVLink count data over extended periods, making them adept at modeling sequential information. Bi-LSTMs expand upon this by processing the input sequence in both forward and reverse order, thereby enriching the model's understanding with additional context, which can significantly boost performance in sequence classification tasks [25]. Integrating an attention mechanism [26] further refines the model's capability by focusing on crucial features within the MAVLink count data, enabling the selective emphasis of important information and improving its ability to detect DDoS attacks. Our attention mechanism highlights crucial information by adjusting weights: it takes the dot product of the weights and the inputs, adds a bias, and then applies a tanh function followed by a softmax layer to refine the focus on significant features. In this work, we focused on optimizing the models by tuning the sequence length, the number of layers, the number of units in the dense layer, and the dropout rate.
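The attention computation described above (dot product of weights and inputs, bias, tanh, then softmax) can be sketched in plain Python as follows. This is an illustration under assumptions rather than the authors' layer: the shapes (one weight per feature, one bias per time step), the final weighted sum into a context vector, and the name `attention_pool` are all assumed for the example, and in a trained model the weights and biases would be learned.

```python
import math


def attention_pool(hidden_states, weights, bias):
    """Attention over a sequence of hidden-state vectors.

    hidden_states: list of T vectors, each of length d (e.g. LSTM outputs).
    weights:       list of d attention weights.
    bias:          list of T per-time-step biases.
    Returns (attention_weights, context_vector).
    """
    # Score each time step: tanh(w . h_t + b_t).
    scores = [
        math.tanh(sum(w * x for w, x in zip(weights, h)) + b)
        for h, b in zip(hidden_states, bias)
    ]
    # Softmax over time steps (shifted by the max for numerical stability).
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]
    # Context vector: attention-weighted sum of the hidden states.
    d = len(hidden_states[0])
    context = [
        sum(a * h[j] for a, h in zip(alphas, hidden_states))
        for j in range(d)
    ]
    return alphas, context


states = [[0.1, 0.2], [0.9, 0.8], [0.1, 0.0]]
alphas, context = attention_pool(states, weights=[1.0, 1.0],
                                 bias=[0.0, 0.0, 0.0])
print([round(a, 3) for a in alphas])
```

In this toy input the second time step has the largest score, so it receives the largest attention weight and dominates the context vector.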

IV-D Time Series Transformer

The TST by George et al. [20] generalizes the transformer architecture for time series analysis. Unlike the baseline transformer architecture by Vaswani et al. [19], it uses an encoder-only approach to perform classification and regression analysis. Figure 5 illustrates the TST architecture adopted in the current work. The input data sequence is projected by a 1D convolution layer to match the dimension of the encoder layer (d_model). In contrast to the sinusoidal embeddings in the original transformer, here, fully learnable positional encodings are added to the output of the 1D convolution layer, and the result is given as the keys, queries, and values to the multi-head self-attention layer. Multi-head self-attention helps retain information across the longer sequences of DDoS data. The output of the attention layer is batch-normalized, after which it is provided as an input to the feed-forward network. A sigmoid function is used to threshold the output for classification.

In the current work, we have considered six parameters, namely seq_len, d_model, n_heads, d_ff, n_layers, and dropout, for tuning the classification performance of the TST model. The parameter seq_len determines the length of the input sequence. Since each entry in the data covers 0.1 seconds, a sequence length of 400 translates to the past 40 seconds of variation in MAVLink packet count. The parameter d_model determines the dimension of the input to the encoder, n_heads is the number of attention heads used to compute the attention output, d_ff is the size of the feed-forward network, which is present before the classification head, n_layers is the number of cascaded encoder layers before the classification head, and dropout randomly drops units during the training phase to prevent over-fitting. The rest of the hyper-parameters are listed in Table V. Section VI-C elaborates on the parameter ablations and analysis.

S.No.	Parameter	Description	Parameter Value
1	seq_len	Length of I/P sequence	400
2	d_model	Dimension of Encoder	64
3	n_heads	Number of Attention Heads	32
4	d_ff	Dimension of Feed-Forward N/W	64
5	n_layers	No. of Encoder Layers	2
6	dropout	Dropout in Encoder and FF layer	0.1
7	bs	Batch Size	4
8	d_k	Key Dimensions	2
9	d_v	Values Dimensions	2
10	le	Learned Embeddings	True
11	act	activation	gelu
12	seed	random seed value	42

TABLE V: Hyperparameters of TST Algorithm
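The relationship between seq_len, the 0.1-second window, and the per-head dimensions in Table V can be checked with a short sketch. The windowing helper `make_sequences` is hypothetical, and labelling each sequence with the label of its last time step is an assumption, since the text does not state the labelling rule.

```python
def make_sequences(series, labels, seq_len):
    """Slice a 1-D series into overlapping windows of length seq_len.

    Each window takes the label of its last time step (an assumption).
    """
    xs, ys = [], []
    for end in range(seq_len, len(series) + 1):
        xs.append(series[end - seq_len:end])
        ys.append(labels[end - 1])
    return xs, ys


SEQ_LEN, WINDOW_S = 400, 0.1   # seq_len from Table V, 0.1 s per entry
D_MODEL, N_HEADS = 64, 32      # d_model and n_heads from Table V

history_s = SEQ_LEN * WINDOW_S   # seconds of traffic seen per sample
head_dim = D_MODEL // N_HEADS    # per-head size, matches d_k = d_v = 2

# Illustrative series: 600 benign entries followed by 400 attack entries.
series = list(range(1000))
labels = [0] * 600 + [1] * 400
xs, ys = make_sequences(series, labels, SEQ_LEN)
print(round(history_s, 1), head_dim, len(xs))  # 40.0 2 601
```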

V Experimental Setup

This section sheds light on various hardware and software components used in the current work for DDoS Attack and Detection. The description of the UAV system, Hardware, and Software components are as follows.

V-A UAV System Description

Figure 6(a) illustrates the different onboard components of the custom-built quad-copter used in the current work. PixHawk 2.4.8 is used as the flight controller. It provides various sensors capable of sensing positional information through the integrated IMU and magnetometer. The other onboard components include a Radio Telemetry for control information exchange with the GCS and a UBlox Neo-M8N Module [27].

The UAV System includes a Raspberry Pi 4B [28] as an onboard companion computer. MAVProxy, an open-source GCS, acts as a proxy bridge to enable Wi-Fi-based data communication for the UAV system and also enables companion computing [22]. MAVProxy uses the MAVLink 2.0 protocol encapsulated in a TCP/UDP header for data communication between the UAV system and the GCS.

V-B Hardware Components

The various hardware components used in the current work are as follows.

GCS and Attacker Systems: GCS and Attacker systems use identical configurations. The system is a Ryzen 7 5825U Octa-core processor with Radeon Graphics clocked at 2000 MHz. The system runs with 16 GB of RAM and 512 GB of Solid-State Storage, and an 802.11ac WiFi Transceiver.

Companion Computer: As illustrated in Figure 6(b), Raspberry Pi 4B, a general purpose and versatile system, is chosen as the Companion Computer and the Central Router. This 64-bit Quadcore SoC runs at 1.5GHz with support for 802.11 b/g/n/ac standards [28] [29].

External Wi-Fi Adapter: Figure 6(c) displays a TP-Link AC600 high-gain wireless dual-band USB adapter [30]. This adapter facilitates a peak transfer rate of 200 Mbps with a 5 dBi high-gain antenna, enabling powerful DDoS traffic injection.

V-C Software Components

The various software components used in the current work are as follows.

Wireshark: Wireshark is an open-source network packet analyzer. This tool enables users to monitor network traffic in real-time. This tool is mainly used for DDoS data generation for the model training phase in the current work [23].

Hping3: Hping3 is an open-source packet generator tool capable of producing a network flooding attack. This tool aims to generate sufficient network packets leading to compromised network bandwidth [21].

QGroundControl: QGroundControl, a user-friendly GCS, is adopted for analyzing the UAV state and for control [27].

VI Experiments and Results

In this section, we evaluate the classification performance of various algorithms for DDoS detection. We then present the memory utilization and inference time for these algorithms. Following that, we present ablation study results to evaluate the best hyperparameters for the TST.

All the algorithms are trained and tested on a Colab notebook using an Nvidia T4 GPU. Inference time for the algorithms is calculated on an RPI-4B with 8 GB of RAM. The various results are as follows.

VI-A Performance Evaluation

In this section, we present the performance evaluation of XGB, IF, LSTM, Bi-LSTM, LSTM-A, BLSTM-A, and TST for three kinds of flooding attacks (TCP, ICMP, and TCP with ICMP), as shown in Tables VI, VII, and VIII. The models' effectiveness is evaluated using performance metrics including F1 score, accuracy, recall, and precision.
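For reference, the four metrics can be computed from binary predictions as sketched below. This uses the standard binary definitions with the DDoS class (label 1) as the positive class; the averaging convention behind the reported tables is not stated in the text, so the sketch is illustrative only, and `classification_metrics` is a hypothetical helper name.

```python
def classification_metrics(y_true, y_pred):
    """Precision, recall, accuracy and F1 for binary labels (1 = DDoS)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall,
            "accuracy": accuracy, "f1": f1}


y_true = [0, 0, 1, 1, 1, 0, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)  # every metric equals 0.75 for this toy example
```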

	Precision	Recall	Accuracy	F1 score
		TCP Flooding
XGB	0.90	0.96	0.94	0.92
IF	0.68	0.76	0.67	0.64
LSTM	0.93	0.97	0.96	0.95
Bi-LSTM	0.94	0.96	0.97	0.95
LSTM-A	0.94	0.96	0.97	0.95
BLSTM-A	0.99	0.98	0.99	0.99
TST	0.99	0.99	0.99	0.99

TABLE VI: Performance evaluation for TCP Flooding

TCP Flooding: Table VI presents the performance for the TCP flooding attack. The gap in classification accuracy between XGB and IF and the other algorithms can be explained by the time-series nature of the LSTM- and TST-based algorithms. It can be observed that adopting the attention mechanism improves TCP flood detection, most visibly for BLSTM-A, owing to the extended capability of the attention mechanism to analyze longer sequences. The TST algorithm stands out with the maximum performance across all the metrics due to its capability to handle longer sequences with learned positional embeddings.

	Precision	Recall	Accuracy	F1 score
		ICMP Flooding
XGB	0.89	0.95	0.93	0.91
IF	0.65	0.72	0.61	0.60
LSTM	0.71	0.92	0.85	0.75
Bi-LSTM	0.76	0.95	0.90	0.81
LSTM-A	0.78	0.95	0.91	0.84
BLSTM-A	0.89	0.98	0.97	0.93
TST	0.99	0.99	0.99	0.99

TABLE VII: Performance evaluation for ICMP Flooding

ICMP Flooding: Table VII presents the performance for the ICMP flooding attack. A trend similar to that of the TCP flooding evaluation can be observed for ICMP flood detection. However, unlike in TCP flooding, the F1 score of the BLSTM-A algorithm drops (from 0.99 to 0.93), while TST presents a consistent performance. This can be explained by the multiple encoder layers of the TST architecture adopted in the current work. Multiple encoder layers can efficiently help identify the hidden trends in the data.

	Precision	Recall	Accuracy	F1 score
		TCP + ICMP Flooding
XGB	0.74	0.90	0.83	0.77
IF	0.67	0.81	0.70	0.65
LSTM	0.69	0.90	0.82	0.72
Bi-LSTM	0.72	0.92	0.86	0.76
LSTM-A	0.74	0.94	0.88	0.79
BLSTM-A	0.79	0.96	0.92	0.85
TST	0.92	0.98	0.97	0.94

TABLE VIII: Performance evaluation for TCP+ICMP Flooding

TCP + ICMP Flooding: The TST algorithm remains consistently effective in classifying a hybrid DDoS scenario involving TCP and ICMP attacks, as shown in Table VIII. This performance can be explained by the capability of the transformer architecture to better capture the trends and variations in the data due to the presence of multi-head self-attention along with multiple encoders, which are absent in the other algorithms.

VI-B Analysis of Memory and Prediction Time

S.No.	Algorithm	Memory(MB)	Inference Time (s)
1	XGB	7.14	0.0005
2	IF	13.89	0.0001
3	LSTM	253.62	0.0005
4	Bi-LSTM	255.71	0.0022
5	LSTM-A	257.54	0.0019
6	BLSTM-A	259.94	0.0061
7	TST	371.32	0.1339

TABLE IX: Computational analysis of test data on RPI-4B

We analyzed the memory usage and inference time for each algorithm on the companion computer of the UAV system. Table IX shows that algorithms with attention mechanisms require more memory and inference time than their counterparts without attention, as the mechanism requires additional computation to handle the input data. While the TST model outperforms the other algorithms, it incurs higher memory usage and inference time due to the presence of multiple encoder layers for DDoS detection. Although TST has the highest inference time, it remains practical in a flight scenario, as its inference time is on the order of 0.1 seconds (0.1339 s in Table IX).
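Per-sample inference time of the kind reported in Table IX can be measured with a simple timing harness such as the one below. The measurement procedure used for the table is not described in the text, so this is only one plausible approach; `dummy_predict` is a placeholder threshold rule standing in for a trained model, and its threshold of 5 is arbitrary.

```python
import time


def mean_inference_time(predict, sample, repeats=50):
    """Average wall-clock seconds per call of predict(sample)."""
    predict(sample)  # warm-up call, excluded from the timing
    start = time.perf_counter()
    for _ in range(repeats):
        predict(sample)
    return (time.perf_counter() - start) / repeats


def dummy_predict(window):
    # Placeholder for a trained model: flags an attack (1) when the
    # mean MAVLink count in the window falls below a threshold.
    return int(sum(window) / len(window) < 5)


latency = mean_inference_time(dummy_predict, [10] * 400)
print(f"mean latency: {latency:.6f} s")
```

Comparing the measured latency with the 0.1-second aggregation window indicates how many windows elapse while one prediction is being computed.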

VI-C TST Ablation Study

Sequence Length	F1 (TCP)	F1 (ICMP)	Encoder Size	F1 (TCP)	F1 (ICMP)
100	0.96	0.89	64	0.99	0.99
200	0.80	0.85	128	0.81	0.84
300	0.90	0.91	256	0.84	0.71
400	0.99	0.99	512	0.99	0.98
500	0.86	0.83

TABLE X: F1 score vs Sequence Length, F1 score vs Encoder Size

Num of heads	F1 (TCP)	F1 (ICMP)	FF size	F1 (TCP)	F1 (ICMP)
8	0.99	0.98	32	0.97	0.96
16	0.98	0.97	64	0.99	0.99
32	0.99	0.99	128	0.97	0.98

TABLE XI: F1 score vs Num of heads, F1 score vs Feed forward size

Num of encoders	F1 (TCP)	F1 (ICMP)	Dropout	F1 (TCP)	F1 (ICMP)
1	0.96	0.96	0.1	0.99	0.99
2	0.99	0.99	0.2	0.98	0.97
3	0.96	0.90	0.3	0.97	0.97

TABLE XII: F1 score vs Num of encoders, F1 score vs Dropout

Tables X, XI, and XII present the variation in F1 score with sequence length, encoder size, number of attention heads, feed-forward size, number of encoder layers, and dropout. The TST model goes from a sub-optimal F1 score to the optimal score at a sequence length of 400, which can be explained by the dependency of DDoS detection on longer sequences. TST reaches the optimal F1 score for a smaller encoder and feed-forward network size of 64, which can be explained by the simplicity of the input data. The higher number of attention heads and encoder layers can be explained by the need for attention mechanisms over longer sequences for DDoS detection. The dropout rate reaches the optimal F1 score at a lower value of 0.1, owing to the smaller feed-forward network and encoder size. Table V presents the optimal hyper-parameters of the proposed TST algorithm for efficient DDoS detection.

VI-D Effect of Learnable Positional Embedding

Table XIII highlights the advantage of adopting learnable positional encoding (LPE) in TST for a DDoS detection scenario. LPE enables the TST algorithm to better capture the sequential relationships between input data elements using learned weights for each sequence position. LPE can help better adjust the attention weights, which ultimately determine the classification outcome.

		Effect of Learnable
		Positional Embeddings
	Precision	Recall	Accuracy	F1 score
with LPE	0.99	0.99	0.99	0.99
without LPE	0.91	0.95	0.96	0.94

TABLE XIII: Effect of Learnable Positional Embeddings
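The difference between the two positional schemes compared in Table XIII can be sketched as follows. The sinusoidal table follows the formula of the original transformer; the learnable table is shown only at initialisation, with a uniform initialisation range of 0.02 that is an assumption for illustration, and its values would be updated by gradient descent during training (not shown). All function names are hypothetical.

```python
import math
import random


def sinusoidal_encoding(seq_len, d_model):
    """Fixed positional encoding of the original transformer."""
    pe = [[0.0] * d_model for _ in range(seq_len)]
    for pos in range(seq_len):
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            pe[pos][i] = math.sin(angle)        # even dimensions
            if i + 1 < d_model:
                pe[pos][i + 1] = math.cos(angle)  # odd dimensions
    return pe


def learnable_encoding(seq_len, d_model, seed=42, scale=0.02):
    """Trainable position table at initialisation (random values)."""
    rng = random.Random(seed)
    return [[rng.uniform(-scale, scale) for _ in range(d_model)]
            for _ in range(seq_len)]


def add_positions(embedded, pos_table):
    """Element-wise sum of projected inputs and positional table."""
    return [[x + p for x, p in zip(row, prow)]
            for row, prow in zip(embedded, pos_table)]


fixed = sinusoidal_encoding(4, 8)
learned = learnable_encoding(4, 8)
embedded = [[1.0] * 8 for _ in range(4)]  # stand-in for projected inputs
out = add_positions(embedded, learned)
print(fixed[0][:2], len(learned), len(learned[0]))
```

In both cases the table is added to the projected input before the encoder; the only difference is whether its entries are fixed by a formula or treated as trainable parameters.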

VII Conclusion and Future Scope

Low-cost UAV systems have become prominent due to their ease of use. A typical low-cost UAV uses Wi-Fi-based information exchange, making it vulnerable to various cybersecurity attacks such as Distributed Denial of Service (DDoS) attacks. In the current work, we first present suitable steps and pre-processing to handle DDoS network data. We have then evaluated various ML and DL algorithms, such as XGBoost, Isolation Forest, Long Short-Term Memory (LSTM), Bidirectional-LSTM (Bi-LSTM), LSTM with attention, Bi-LSTM with attention, and Time Series Transformer (TST), for DDoS detection considering three variants of DDoS attacks, namely TCP, ICMP, and TCP + ICMP attacks. Our evaluation indicates that the proposed TST model outperforms the other ML and DL algorithms. TST has demonstrated an F1 score of 0.999, 0.997, and 0.943 for TCP, ICMP, and TCP + ICMP flooding attacks, respectively. We have also conducted a TST ablation analysis for fine-tuning the hyperparameters and emphasized the advantage of adopting learnable positional embeddings in TST for DDoS detection. As part of our future work, we plan to extend this work to build a DDoS mitigation algorithm for various UAV flight scenarios.

References

[1] C.-H. Fu, M.-W. Tsao, L.-P. Chi, and Z.-Y. Zhuang, “On the dominant factors of civilian-use drones: A thorough study and analysis of cross-group opinions using a triple helix model (thm) with the analytic hierarchy process (ahp),” Drones, vol. 5, no. 2, 2021.
[2] K. AL-Dosari, Z. Hunaiti, and W. Balachandran, “Systematic review on civilian drones in safety and security applications,” Drones, vol. 7, no. 3, p. 210, 2023.
[3] A. Guillen-Perez, R. Sanchez-Iborra, M.-D. Cano, J. C. Sanchez-Aarnoutse, and J. Garcia-Haro, “Wifi networks on drones,” in 2016 ITU Kaleidoscope: ICTs for a Sustainable World, pp. 1–8, 2016.
[4] O. Shrit, S. Martin, K. Alagha, and G. Pujolle, “A new approach to realize drone swarm using ad-hoc network,” in 2017 16th Annual Mediterranean Ad Hoc Networking Workshop (Med-Hoc-Net), pp. 1–5, IEEE, 2017.
[5] N. A. Sabuwala and R. D. Daruwala, “An approach to enhance the security of unmanned aerial vehicles (uavs),” The Journal of Supercomputing, pp. 1–31, 2023.
[6] A. Koubâa, A. Allouch, M. Alajlan, Y. Javed, A. Belghith, and M. Khalgui, “Micro air vehicle link (mavlink) in a nutshell: A survey,” IEEE Access, vol. 7, pp. 87658–87680, 2019.
[7] ESENTIRE, “esentire 2023-official-cybercrime-report,” 2024.
[8] COBALT, “Cobalt cybersecurity-statistics-2024,” 2024.
[9] N. T. Cam and N. G. Trung, “An intelligent approach to improving the performance of threat detection in iot,” IEEE Access, vol. 11, pp. 44319–44334, 2023.
[10] M. Dai, N. Huang, Y. Wu, J. Gao, and Z. Su, “Unmanned-aerial-vehicle-assisted wireless networks: Advancements, challenges, and solutions,” IEEE Internet of Things Journal, vol. 10, no. 5, pp. 4117–4147, 2022.
[11] J. Gordon, V. Kraj, J. H. Hwang, and A. Raja, “A security assessment for consumer wifi drones,” in 2019 IEEE International Conference on Industrial Internet (ICII), pp. 1–5, IEEE, 2019.
[12] O. Ceviz, P. Sadioğlu, and S. Sen, “A survey of security in uavs and fanets: Issues, threats, analysis of attacks, and solutions,” 06 2023.
[13] V. Shrivastava and A. K. Chaturvedi, “A survey on intrusion detection system based on machine learning and deep learning,” in 2023 14th International Conference on Computing Communication and Networking Technologies (ICCCNT), pp. 1–6, IEEE, 2023.
[14] R. B. Lee, “Taxonomies of distributed denial of service networks, attacks, tools, and countermeasures,” 2004.
[15] V. Borgiani, P. Moratori, J. F. Kazienko, E. R. Tubino, and S. E. Quincozes, “Toward a distributed approach for detection and mitigation of denial-of-service attacks within industrial internet of things,” IEEE Internet of Things Journal, vol. 8, no. 6, pp. 4569–4578, 2020.
[16] R. Zhang, J.-P. Condomines, N. Larrieu, and R. Chemali, “Design of a novel network intrusion detection system for drone communications,” in 2018 IEEE/AIAA 37th Digital Avionics Systems Conference (DASC), pp. 1–10, 2018.
[17] O. Bouhamed, O. Bouachir, M. Aloqaily, and I. Al Ridhawi, “Lightweight ids for uav networks: A periodic deep reinforcement learning-based approach,” in 2021 IFIP/IEEE International Symposium on Integrated Network Management, pp. 1032–1037, IEEE, 2021.
[18] F. Tlili, S. Ayed, and L. C. Fourati, “A new hybrid adaptive deep learning-based framework for uavs faults and attacks detection,” IEEE Transactions on Services Computing, 2023.
[19] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention is all you need,” in Advances in Neural Information Processing Systems (I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, eds.), vol. 30, Curran Associates, Inc., 2017.
[20] G. Zerveas, S. Jayaraman, D. Patel, A. Bhamidipaty, and C. Eickhoff, “A transformer-based framework for multivariate time series representation learning,” in Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, KDD ’21, (New York, NY, USA), pp. 2114–2124, Association for Computing Machinery, 2021.
[21] K. L. Tools, “H**3,” 2024.
[22] MAVProxy, “Mavproxy - enables companion computing,” 2024.
[23] WireShark, “Wireshark network protocol analyzer,” 2024.
[24] Sklearn, “Xgboost,” 2024.
[25] S. U. Nagasri Goparaju, L. Lakshmanan, A. N, R. B, L. B, D. Gangadharan, and A. M. Hussain, “Time series-based driving event recognition for two wheelers,” in 2023 Design, Automation & Test in Europe Conference & Exhibition (DATE), pp. 1–2, 2023.
[26] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin, “Attention is all you need,” CoRR, vol. abs/1706.03762, 2017.
[27] S. S. S. V, G. Sreya, P. R. Turlapati, D. Gangadharan, and H. Kandath, “A comprehensive evaluation on the impact of various spoofing scenarios on gps sensors in a low-cost uav,” in 2023 IEEE 19th International Conference on Automation Science and Engineering, pp. 1–6, 2023.
[28] RaspberryPi, “Rpi4b companion computer,” 2024.
[29] C. Gudla, M. S. Rana, and A. H. Sung, “Defense techniques against cyber attacks on unmanned aerial vehicles,” in Proceedings of the international conference on embedded systems, cyber-physical systems, and applications (ESCS), pp. 110–116, The Steering Committee of The World Congress in Computer Science, Computer …, 2018.
[30] W.-F. Adapter, “Tplink - a600 antenna,” 2024.
