ModSec-Learn: Boosting ModSecurity with Machine Learning

Authors: Christian Scano, Giuseppe Floris, Biagio Montaruli, Luca Demetrio, Andrea Valenza, Luca Compagna, Davide Ariu, Luca Piras, Davide Balzarotti, Battista Biggio

Abstract: ModSecurity is widely recognized as the standard open-source Web Application Firewall (WAF), maintained by the OWASP Foundation. It detects malicious requests by matching them against the Core Rule Set (CRS), identifying well-known attack patterns. Each rule is manually assigned a weight based on the severity of the corresponding attack, and a request is blocked if the sum of the weights of matched rules exceeds a given threshold. However, we argue that this strategy is largely ineffective against web attacks, as detection is only based on heuristics and not customized on the application to protect. In this work, we overcome this issue by proposing a machine-learning model that uses the CRS rules as input features. Through training, ModSec-Learn is able to tune the contribution of each CRS rule to predictions, thus adapting the severity level to the web applications to protect. Our experiments show that ModSec-Learn achieves a significantly better trade-off between detection and false positive rates. Finally, we analyze how sparse regularization can reduce the number of rules that are relevant at inference time, by discarding more than 30% of the CRS rules. We release our open-source code and the dataset at https://github.com/pralab/modsec-learn and https://github.com/pralab/http-traffic-dataset, respectively.

Submitted 19 June, 2024; originally announced June 2024.

Comments: arXiv admin note: text overlap with arXiv:2308.04964
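
The scoring scheme described in the abstract above (sum the weights of matched rules, block when the sum exceeds a threshold) and the idea of reusing rule matches as input features can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the rule names, manual weights, threshold, and "learned" weights are all hypothetical, and the linear scorer only stands in for whatever model the paper actually trains.

```python
# Illustrative sketch only: rule names, weights, and thresholds are hypothetical,
# not taken from the CRS or from the ModSec-Learn code.
RULES = ["rule_sqli", "rule_xss", "rule_lfi"]
MANUAL_WEIGHTS = {"rule_sqli": 5, "rule_xss": 3, "rule_lfi": 2}


def anomaly_score(matched, weights):
    """Sum the manually assigned weights of the rules that matched a request."""
    return sum(weights[name] for name in matched)


def is_blocked(matched, weights, threshold):
    """Block the request when the summed weight exceeds the threshold."""
    return anomaly_score(matched, weights) > threshold


def to_features(matched, rules):
    """Encode the matched rules as a binary vector, one entry per rule."""
    matched = set(matched)
    return [1 if name in matched else 0 for name in rules]


def linear_score(features, learned_weights, bias):
    """Score a request with per-rule weights that would come from training."""
    return bias + sum(w * x for w, x in zip(learned_weights, features))


matched = ["rule_sqli", "rule_lfi"]
features = to_features(matched, RULES)

print(anomaly_score(matched, MANUAL_WEIGHTS))
print(is_blocked(matched, MANUAL_WEIGHTS, threshold=5))
print(features)

# Hypothetical learned weights; a weight of exactly 0.0 (as sparse
# regularization tends to produce) means that rule can be skipped at inference.
print(linear_score(features, [1.5, 0.0, -0.5], bias=-1.0))
```

In this sketch, the only difference between the two approaches is where the per-rule weights come from: fixed by hand in `MANUAL_WEIGHTS`, or fitted to traffic and passed to `linear_score`.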

arXiv:2405.19970 [pdf, other]

Strategies to Counter Artificial Intelligence in Law Enforcement: Cross-Country Comparison of Citizens in Greece, Italy and Spain

Authors: Petra Saskia Bayerl, Babak Akhgar, Ernesto La Mattina, Barbara Pirillo, Ioana Cotoi, Davide Ariu, Matteo Mauri, Jorge Garcia, Dimitris Kavallieros, Antonia Kardara, Konstantina Karagiorgou

Abstract: This paper investigates citizens' counter-strategies to the use of Artificial Intelligence (AI) by law enforcement agencies (LEAs). Based on information from three countries (Greece, Italy and Spain) we demonstrate disparities in the likelihood of ten specific counter-strategies. We further identified factors that increase the propensity for counter-strategies. Our study provides an important new perspective to societal impacts of security-focused AI applications by illustrating the conscious, strategic choices by citizens when confronted with AI capabilities for LEAs.

Submitted 30 May, 2024; originally announced May 2024.

Comments: 20th International Conference on Information and Knowledge Engineering (IKE'21), 3 pages, 1 figure

ACM Class: I.2.0; K.4.1

arXiv:2308.04964 [pdf, other]

Adversarial ModSecurity: Countering Adversarial SQL Injections with Robust Machine Learning

Authors: Biagio Montaruli, Luca Demetrio, Andrea Valenza, Luca Compagna, Davide Ariu, Luca Piras, Davide Balzarotti, Battista Biggio

Abstract: ModSecurity is widely recognized as the standard open-source Web Application Firewall (WAF), maintained by the OWASP Foundation. It detects malicious requests by matching them against the Core Rule Set, identifying well-known attack patterns. Each rule in the CRS is manually assigned a weight, based on the severity of the corresponding attack, and a request is detected as malicious if the sum of the weights of the firing rules exceeds a given threshold. In this work, we show that this simple strategy is largely ineffective for detecting SQL injection (SQLi) attacks, as it tends to block many legitimate requests, while also being vulnerable to adversarial SQLi attacks, i.e., attacks intentionally manipulated to evade detection. To overcome these issues, we design a robust machine learning model, named AdvModSec, which uses the CRS rules as input features, and it is trained to detect adversarial SQLi attacks. Our experiments show that AdvModSec, being trained on the traffic directed towards the protected web services, achieves a better trade-off between detection and false positive rates, improving the detection rate of the vanilla version of ModSecurity with CRS by 21%. Moreover, our approach is able to improve its adversarial robustness against adversarial SQLi attacks by 42%, thereby taking a step forward towards building more robust and trustworthy WAFs.

Submitted 17 August, 2023; v1 submitted 9 August, 2023; originally announced August 2023.

arXiv:1811.09985 [pdf, other]

Poisoning Behavioral Malware Clustering

Authors: Battista Biggio, Konrad Rieck, Davide Ariu, Christian Wressnegger, Igino Corona, Giorgio Giacinto, Fabio Roli

Abstract: Clustering algorithms have become a popular tool in computer security to analyze the behavior of malware variants, identify novel malware families, and generate signatures for antivirus systems. However, the suitability of clustering algorithms for security-sensitive settings has been recently questioned by showing that they can be significantly compromised if an attacker can exercise some control over the input data. In this paper, we revisit this problem by focusing on behavioral malware clustering approaches, and investigate whether and to what extent an attacker may be able to subvert these approaches through a careful injection of samples with poisoning behavior. To this end, we present a case study on Malheur, an open-source tool for behavioral malware clustering. Our experiments not only demonstrate that this tool is vulnerable to poisoning attacks, but also that it can be significantly compromised even if the attacker can only inject a very small percentage of attacks into the input data. As a remedy, we discuss possible countermeasures and highlight the need for more secure clustering algorithms.

Submitted 25 November, 2018; originally announced November 2018.

Journal ref: 2014 ACM CCS Workshop on Artificial Intelligent and Security, AISec '14, pages 27-36, New York, NY, USA, 2014. ACM

arXiv:1811.09982 [pdf, ps, other]

Is Data Clustering in Adversarial Settings Secure?

Authors: Battista Biggio, Ignazio Pillai, Samuel Rota Bulò, Davide Ariu, Marcello Pelillo, Fabio Roli

Abstract: Clustering algorithms have been increasingly adopted in security applications to spot dangerous or illicit activities. However, they have not been originally devised to deal with deliberate attack attempts that may aim to subvert the clustering process itself. Whether clustering can be safely adopted in such settings remains thus questionable. In this work we propose a general framework that allows one to identify potential attacks against clustering algorithms, and to evaluate their impact, by making specific assumptions on the adversary's goal, knowledge of the attacked system, and capabilities of manipulating the input data. We show that an attacker may significantly poison the whole clustering process by adding a relatively small percentage of attack samples to the input data, and that some attack samples may be obfuscated to be hidden within some existing clusters. We present a case study on single-linkage hierarchical clustering, and report experiments on clustering of malware samples and handwritten digits.

Submitted 25 November, 2018; originally announced November 2018.

Journal ref: Proceedings of the 2013 ACM Workshop on Artificial Intelligence and Security, AISec '13, pages 87-98, New York, NY, USA, 2013. ACM

arXiv:1707.00317 [pdf, other]

DeltaPhish: Detecting Phishing Webpages in Compromised Websites

Authors: Igino Corona, Battista Biggio, Matteo Contini, Luca Piras, Roberto Corda, Mauro Mereu, Guido Mureddu, Davide Ariu, Fabio Roli

Abstract: The large-scale deployment of modern phishing attacks relies on the automatic exploitation of vulnerable websites in the wild, to maximize profit while hindering attack traceability, detection and blacklisting. To the best of our knowledge, this is the first work that specifically leverages this adversarial behavior for detection purposes. We show that phishing webpages can be accurately detected by highlighting HTML code and visual differences with respect to other (legitimate) pages hosted within a compromised website. Our system, named DeltaPhish, can be installed as part of a web application firewall, to detect the presence of anomalous content on a website after compromise, and eventually prevent access to it. DeltaPhish is also robust against adversarial attempts in which the HTML code of the phishing page is carefully manipulated to evade detection. We empirically evaluate it on more than 5,500 webpages collected in the wild from compromised websites, showing that it is capable of detecting more than 99% of phishing webpages, while only misclassifying less than 1% of legitimate pages. We further show that the detection rate remains higher than 70% even under very sophisticated attacks carefully designed to evade our system.

Submitted 2 July, 2017; originally announced July 2017.

Comments: Preprint version of the work accepted at ESORICS 2017

Showing 1–6 of 6 results for author: Ariu, D