NetTiSA: Extended IP Flow with Time-series Features for Universal Bandwidth-constrained High-speed Network Traffic Classification
Authors:
Josef Koumar,
Karel Hynek,
Jaroslav Pešek,
Tomáš Čejka
Abstract:
Network traffic monitoring based on IP Flows is a standard monitoring approach that can be deployed to various network infrastructures, even the large IPS-based networks connecting millions of people. Since flow records traditionally contain only limited information (addresses, transport ports, and amount of exchanged data), they are also commonly extended for additional features that enable netwo…
▽ More
Network traffic monitoring based on IP Flows is a standard monitoring approach that can be deployed to various network infrastructures, even the large IPS-based networks connecting millions of people. Since flow records traditionally contain only limited information (addresses, transport ports, and amount of exchanged data), they are also commonly extended for additional features that enable network traffic analysis with high accuracy. Nevertheless, the flow extensions are often too large or hard to compute, which limits their deployment only to smaller-sized networks. This paper proposes a novel extended IP flow called NetTiSA (Network Time Series Analysed), which is based on the analysis of the time series of packet sizes. By thoroughly testing 25 different network classification tasks, we show the broad applicability and high usability of NetTiSA, which often outperforms the best-performing related works. For practical deployment, we also consider the sizes of flows extended for NetTiSA and evaluate the performance impacts of its computation in the flow exporter. The novel feature set proved universal and deployable to high-speed ISP networks with 100\,Gbps lines; thus, it enables accurate and widespread network security protection.
△ Less
Submitted 9 October, 2023;
originally announced October 2023.
Active Learning Framework to Automate NetworkTraffic Classification
Authors:
Jaroslav Pešek,
Dominik Soukup,
Tomáš Čejka
Abstract:
Recent network traffic classification methods benefitfrom machine learning (ML) technology. However, there aremany challenges due to use of ML, such as: lack of high-qualityannotated datasets, data-drifts and other effects causing aging ofdatasets and ML models, high volumes of network traffic etc. Thispaper argues that it is necessary to augment traditional workflowsof ML training&deployment and…
▽ More
Recent network traffic classification methods benefitfrom machine learning (ML) technology. However, there aremany challenges due to use of ML, such as: lack of high-qualityannotated datasets, data-drifts and other effects causing aging ofdatasets and ML models, high volumes of network traffic etc. Thispaper argues that it is necessary to augment traditional workflowsof ML training&deployment and adapt Active Learning concepton network traffic analysis. The paper presents a novel ActiveLearning Framework (ALF) to address this topic. ALF providesprepared software components that can be used to deploy an activelearning loop and maintain an ALF instance that continuouslyevolves a dataset and ML model automatically. The resultingsolution is deployable for IP flow-based analysis of high-speed(100 Gb/s) networks, and also supports research experiments ondifferent strategies and methods for annotation, evaluation, datasetoptimization, etc. Finally, the paper lists some research challengesthat emerge from the first experiments with ALF in practice.
△ Less
Submitted 26 October, 2022;
originally announced November 2022.