Search | arXiv e-print repository

Exploring Layerwise Adversarial Robustness Through the Lens of t-SNE

Authors: Inês Valentim, Nuno Antunes, Nuno Lourenço

Abstract: Adversarial examples, designed to trick Artificial Neural Networks (ANNs) into producing wrong outputs, highlight vulnerabilities in these models. Exploring these weaknesses is crucial for developing defenses, and so, we propose a method to assess the adversarial robustness of image-classifying ANNs. The t-distributed Stochastic Neighbor Embedding (t-SNE) technique is used for visual inspection, and a metric, which compares the clean and perturbed embeddings, helps pinpoint weak spots in the layers. Analyzing two ANNs on CIFAR-10, one designed by humans and another via NeuroEvolution, we found that differences between clean and perturbed representations emerge early on, in the feature extraction layers, affecting subsequent classification. The findings with our metric are supported by the visual analysis of the t-SNE maps.

Submitted 20 June, 2024; originally announced June 2024.

arXiv:2308.16570 [pdf, other]

MONDEO: Multistage Botnet Detection

Authors: Duarte Dias, Bruno Sousa, Nuno Antunes

Abstract: Mobile devices have become the most widely used piece of technology. Due to their characteristics, they have become major targets for botnet-related malware. FluBot is one example of botnet malware that infects mobile devices. In particular, FluBot is a DNS-based botnet that uses Domain Generation Algorithms (DGA) to establish communication with the Command and Control Server (C2). MONDEO is a multistage mechanism with a flexible design to detect DNS-based botnet malware. MONDEO is lightweight and can be deployed without requiring the deployment of software, agents, or configuration in mobile devices, allowing easy integration in core networks. MONDEO comprises four detection stages: Blacklisting/Whitelisting, Query rate analysis, DGA analysis, and Machine learning evaluation. It was created with the goal of processing streams of packets to identify attacks with high efficiency in each of the distinct stages. MONDEO was tested against several datasets to measure its efficiency and performance, achieving high performance with RandomForest classifiers. The implementation is available on GitHub.

Submitted 31 August, 2023; originally announced August 2023.

arXiv:2308.03404 [pdf, other]

Applied metamodelling for ATM performance simulations

Authors: Christoffer Riis, Francisco N. Antunes, Tatjana Bolić, Gérald Gurtner, Andrew Cook, Carlos Lima Azevedo, Francisco Câmara Pereira

Abstract: The use of air traffic management (ATM) simulators for planning and operations can be challenging due to their modelling complexity. This paper presents XALM (eXplainable Active Learning Metamodel), a three-step framework integrating active learning and SHAP (SHapley Additive exPlanations) values into simulation metamodels for supporting ATM decision-making. XALM efficiently uncovers hidden relationships among input and output variables in ATM simulators, those usually of interest in policy analysis. Our experiments show XALM's predictive performance comparable to the XGBoost metamodel with fewer simulations. Additionally, XALM exhibits superior explanatory capabilities compared to non-active learning metamodels. Using the 'Mercury' (flight and passenger) ATM simulator, XALM is applied to a real-world scenario in Paris Charles de Gaulle airport, extending an arrival manager's range and scope by analysing six variables. This case study illustrates XALM's effectiveness in enhancing simulation interpretability and understanding variable interactions. By addressing computational challenges and improving explainability, XALM complements traditional simulation-based analyses. Lastly, we discuss two practical approaches for reducing the computational burden of the metamodelling further: we introduce a stopping criterion for active learning based on the inherent uncertainty of the metamodel, and we show how the simulations used for the metamodel can be reused across key performance indicators, thus decreasing the overall number of simulations needed.

Submitted 7 August, 2023; originally announced August 2023.

arXiv:2207.05451 [pdf, other]

Adversarial Robustness Assessment of NeuroEvolution Approaches

Authors: Inês Valentim, Nuno Lourenço, Nuno Antunes

Abstract: NeuroEvolution automates the generation of Artificial Neural Networks through the application of techniques from Evolutionary Computation. The main goal of these approaches is to build models that maximize predictive performance, sometimes with an additional objective of minimizing computational complexity. Although the evolved models achieve competitive results performance-wise, their robustness to adversarial examples, which becomes a concern in security-critical scenarios, has received limited attention. In this paper, we evaluate the adversarial robustness of models found by two prominent NeuroEvolution approaches on the CIFAR-10 image classification task: DENSER and NSGA-Net. Since the models are publicly available, we consider white-box untargeted attacks, where the perturbations are bounded by either the L2 or the L∞ norm. Similarly to manually-designed networks, our results show that when the evolved models are attacked with iterative methods, their accuracy usually drops to, or close to, zero under both distance metrics. The DENSER model is an exception to this trend, showing some resistance under the L2 threat model, where its accuracy only drops from 93.70% to 18.10% even with iterative attacks. Additionally, we analyzed the impact of pre-processing applied to the data before the first layer of the network. Our observations suggest that some of these techniques can exacerbate the perturbations added to the original inputs, potentially harming robustness. Thus, this choice should not be neglected when automatically designing networks for applications where adversarial attacks are prone to occur.

Submitted 12 July, 2022; originally announced July 2022.

arXiv:2006.08811 [pdf, other]

A Model-Based Approach to Anomaly Detection Trading Detection Time and False Alarm Rate

Authors: Charles F. Gonçalves, Daniel S. Menasché, Alberto Avritzer, Nuno Antunes, Marco Vieira

Abstract: The complexity and ubiquity of modern computing systems are fertile ground for anomalies, including security and privacy breaches. In this paper, we propose a new methodology that addresses the practical challenges of implementing anomaly detection approaches. Specifically, it is challenging to define normal behavior comprehensively and to acquire data on anomalies in diverse cloud environments. To tackle those challenges, we focus on anomaly detection approaches based on system performance signatures. In particular, performance signatures have the potential of detecting zero-day attacks, as those approaches are based on detecting performance deviations and do not require detailed knowledge of attack history. The proposed methodology leverages an analytical performance model and experimentation, and makes it possible to control the rate of false positives in a principled manner. The methodology is evaluated using the TPCx-V workload, which was profiled during a set of executions using resource exhaustion anomalies that emulate the effects of anomalies affecting system performance. The proposed approach was able to successfully detect the anomalies, with a low number of false positives (precision 90%-98%).

Submitted 15 June, 2020; originally announced June 2020.

Comments: 2020 Mediterranean Communication and Computer Networking Conference (MedComNet)

ACM Class: C.4

arXiv:1910.02321 [pdf, other]

The Impact of Data Preparation on the Fairness of Software Systems

Authors: Inês Valentim, Nuno Lourenço, Nuno Antunes

Abstract: Machine learning models are widely adopted in scenarios that directly affect people. The development of software systems based on these models raises societal and legal concerns, as their decisions may lead to the unfair treatment of individuals based on attributes like race or gender. Data preparation is key in any machine learning pipeline, but its effect on fairness is yet to be studied in detail. In this paper, we evaluate how the fairness and effectiveness of the learned models are affected by the removal of the sensitive attribute, the encoding of the categorical attributes, and instance selection methods (including cross-validators and random undersampling). We used the Adult Income and the German Credit Data datasets, which are widely studied and known to have fairness concerns. We applied each data preparation technique individually to analyse the difference in predictive performance and fairness, using statistical parity difference, disparate impact, and the normalised prejudice index. The results show that fairness is affected by transformations made to the training data, particularly in imbalanced datasets. Removing the sensitive attribute is insufficient to eliminate all the unfairness in the predictions, as expected, but it is key to achieving fairer models. Additionally, the standard random undersampling with respect to the true labels is sometimes more prejudicial than performing no random undersampling.

Submitted 5 October, 2019; originally announced October 2019.

arXiv:1810.01300 [pdf, other]

Sampling-based Estimation of In-degree Distribution with Applications to Directed Complex Networks

Authors: Nelson Antunes, Shankar Bhamidi, Tianjian Guo, Vladas Pipiras, Bang Wang

Abstract: The focus of this work is on estimation of the in-degree distribution in directed networks from sampling network nodes or edges. A number of sampling schemes are considered, including random sampling with and without replacement, and several approaches based on random walks with possible jumps. When sampling nodes, it is assumed that only the out-edges of that node are visible, that is, the in-degree of that node is not observed. The suggested estimation of the in-degree distribution is based on two approaches. The inversion approach exploits the relation between the original and sample in-degree distributions, and can estimate the bulk of the in-degree distribution, but not the tail of the distribution. The tail of the in-degree distribution is estimated through an asymptotic approach, which itself has two versions: one assuming a power-law tail and the other for a tail of general form. The two estimation approaches are examined on synthetic and real networks, with good performance results, especially striking for the asymptotic approach.

Submitted 2 October, 2018; originally announced October 2018.

Comments: 30 pages, 6 figures

arXiv:1410.1160 [pdf, other]

On Benchmarking Intrusion Detection Systems in Virtualized Environments

Authors: Aleksandar Milenkoski, Samuel Kounev, Alberto Avritzer, Nuno Antunes, Marco Vieira

Abstract: Modern intrusion detection systems (IDSes) for virtualized environments are deployed in the virtualization layer with components inside the virtual machine monitor (VMM) and the trusted host virtual machine (VM). Such IDSes can monitor at the same time the network and host activities of all guest VMs running on top of a VMM, while being isolated from malicious users of these VMs. We refer to IDSes for virtualized environments as VMM-based IDSes. In this work, we analyze state-of-the-art intrusion detection techniques applied in virtualized environments and architectures of VMM-based IDSes. Further, we identify challenges that apply specifically to benchmarking VMM-based IDSes, focussing on workloads and metrics. For example, we discuss the challenge of defining representative baseline benign workload profiles as well as the challenge of defining malicious workloads containing attacks targeted at the VMM. We also discuss the impact of on-demand resource provisioning features of virtualized environments (e.g., CPU and memory hotplugging, memory ballooning) on IDS benchmarking measures such as capacity and attack detection accuracy. Finally, we outline future research directions in the area of benchmarking VMM-based IDSes and of intrusion detection in virtualized environments in general.

Submitted 5 October, 2014; originally announced October 2014.

Comments: SPEC (Standard Performance Evaluation Corporation) Research Group --- IDS Benchmarking Working Group

Report number: SPEC-RG-2013-002

arXiv:1410.1158 [pdf, other]

Technical Information on Vulnerabilities of Hypercall Handlers

Authors: Aleksandar Milenkoski, Marco Vieira, Bryan D. Payne, Nuno Antunes, Samuel Kounev

Abstract: Modern virtualized service infrastructures expose attack vectors that enable attacks of high severity, such as attacks targeting hypervisors. A malicious user of a guest VM (virtual machine) may execute an attack against the underlying hypervisor via hypercalls, which are software traps from a kernel of a fully or partially paravirtualized guest VM to the hypervisor. The exploitation of a vulnerability of a hypercall handler may have severe consequences such as altering the hypervisor's memory, which may result in the execution of malicious code with hypervisor privilege. Despite the importance of vulnerabilities of hypercall handlers, there is not much publicly available information on them. This significantly hinders advances towards securing hypercall interfaces. In this work, we provide in-depth technical information on publicly disclosed vulnerabilities of hypercall handlers. Our vulnerability analysis is based on reverse engineering the released patches fixing the considered vulnerabilities. For each analyzed vulnerability, we provide background information essential for understanding the vulnerability, and information on the vulnerable hypercall handler and the error causing the vulnerability. We also show how the vulnerability can be triggered and discuss the state of the targeted hypervisor after the vulnerability has been triggered.

Submitted 5 October, 2014; originally announced October 2014.

Comments: SPEC (Standard Performance Evaluation Corporation) Research Group --- IDS Benchmarking Working Group

Report number: SPEC-RG-2014-001

arXiv:1002.2385 [pdf, other]

Traffic Capacity of Large WDM Passive Optical Networks

Authors: Nelson Antunes, Christine Fricker, Philippe Robert, James Roberts

Abstract: As passive optical networks (PON) are increasingly deployed to provide high speed Internet access, it is important to understand their fundamental traffic capacity limits. The paper discusses performance models applicable to wavelength division multiplexing (WDM) EPONs and GPONs under the assumption that users access the fibre via optical network units equipped with tunable transmitters. The considered stochastic models are based on multiserver polling systems for which explicit analytical results are not known. A large system asymptotic, mean-field approximation, is used to derive closed form solutions of these complex systems. Convergence of the mean field dynamics is proved in the case of a simple network configuration. Simulation results show that, for a realistic sized PON, the mean field approximation is accurate.

Submitted 27 March, 2010; v1 submitted 11 February, 2010; originally announced February 2010.

arXiv:1002.2384 [pdf, other]

Upstream traffic capacity of a WDM EPON under online GATE-driven scheduling

Authors: Nelson Antunes, Christine Fricker, Philippe Robert, James Roberts

Abstract: Passive optical networks are increasingly used for access to the Internet and it is important to understand the performance of future long-reach, multi-channel variants. In this paper we discuss requirements on the dynamic bandwidth allocation (DBA) algorithm used to manage the upstream resource in a WDM EPON and propose a simple novel DBA algorithm that is considerably more efficient than classical approaches. We demonstrate that the algorithm emulates a multi-server polling system and derive capacity formulas that are valid for general traffic processes. We evaluate delay performance by simulation demonstrating the superiority of the proposed scheduler. The proposed scheduler offers considerable flexibility and is particularly efficient in long-reach access networks where propagation times are high.

Submitted 11 February, 2010; originally announced February 2010.

arXiv:cs/0601016 [pdf, ps, other]

doi 10.1016/j.peva.2005.07.006

Integration of streaming services and TCP data transmission in the Internet

Authors: Nelson Antunes, Christine Fricker, Fabrice Guillemin, Philippe Robert

Abstract: We study in this paper the integration of elastic and streaming traffic on the same link in an IP network. We are specifically interested in the computation of the mean bit rate obtained by a data transfer. For this purpose, we consider that the bit rate offered by streaming traffic is low, of the order of magnitude of a small parameter $\varepsilon \ll 1$ and related to an auxiliary stationary Markovian process $(X(t))$. Under the assumption that data transfers are exponentially distributed, arrive according to a Poisson process, and share the available bandwidth according to the ideal processor sharing discipline, we derive the mean bit rate of a data transfer as a power series expansion in $\varepsilon$. Since the system can be described by means of an M/M/1 queue with a time-varying server rate, which depends upon the parameter $\varepsilon$ and process $(X(t))$, the key issue is to compute an expansion of the area swept under the occupation process of this queue in a busy period. We obtain closed formulas for the power series expansion in $\varepsilon$ of the mean bit rate, which allow us to verify the validity of the so-called reduced service rate at the first order. The second order term yields more insight into the negative impact of the variability of streaming flows.

Submitted 6 January, 2006; originally announced January 2006.

Journal ref: Performance Evaluation 62, 1-4 (2006) 263-277

arXiv:cs/0601015 [pdf, ps, other]

Perturbation Analysis of a Variable M/M/1 Queue: A Probabilistic Approach

Authors: Nelson Antunes, Christine Fricker, Fabrice Guillemin, Philippe Robert

Abstract: Motivated by the problem of the coexistence on transmission links of telecommunication networks of elastic and unresponsive traffic, we study in this paper the impact on the busy period of an M/M/1 queue of a small perturbation in the server rate. The perturbation depends upon an independent stationary process $(X(t))$ and is quantified by means of a parameter $\varepsilon \ll 1$. We specifically compute the first two terms of the power series expansion in $\varepsilon$ of the mean value of the busy period duration. This allows us to study the validity of the Reduced Service Rate (RSR) approximation, which consists in comparing the perturbed M/M/1 queue with the M/M/1 queue where the service rate is constant and equal to the mean value of the perturbation. For the first term of the expansion, the two systems are equivalent. For the second term, the situation is more complex and it is shown that the correlations of the environment process $(X(t))$ play a key role.

Submitted 6 January, 2006; originally announced January 2006.

Journal ref: Advances in Applied Probability 38, 1 (2006) 263-283

arXiv:cs/0512088 [pdf, ps, other]

Analysis of loss networks with routing

Authors: Nelson Antunes, Christine Fricker, Philippe Robert, Danielle Tibi

Abstract: This paper analyzes stochastic networks consisting of finite capacity nodes with different classes of requests which move according to some routing policy. The Markov processes describing these networks do not, in general, have reversibility properties, so the explicit expression of their invariant distribution is not known. Kelly's limiting regime is considered: the arrival rates of calls as well as the capacities of the nodes are proportional to a factor going to infinity. It is proved that, in the limit, the associated rescaled Markov process converges to a deterministic dynamical system with a unique equilibrium point characterized by a nonstandard fixed point equation.

Submitted 14 February, 2007; v1 submitted 22 December, 2005; originally announced December 2005.

Comments: Published at http://dx.doi.org/10.1214/105051606000000466 in the Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute of Mathematical Statistics (http://www.imstat.org)

MSC Class: 60K35 (Primary) 60K25 (Secondary)

Journal ref: Annals of Applied Probability 16, 4 (2006) 2007-2026

Showing 1–14 of 14 results for author: Antunes, N