Search | arXiv e-print repository

arXiv:2001.05571 [pdf, other]

On Model Evaluation under Non-constant Class Imbalance

Authors: Jan Brabec, Tomáš Komárek, Vojtěch Franc, Lukáš Machlica

Abstract: Many real-world classification problems are significantly class-imbalanced to detriment of the class of interest. The standard set of proper evaluation metrics is well-known but the usual assumption is that the test dataset imbalance equals the real-world imbalance. In practice, this assumption is often broken for various reasons. The reported results are then often too optimistic and may lead to wrong conclusions about industrial impact and suitability of proposed techniques. We introduce methods focusing on evaluation under non-constant class imbalance. We show that not only the absolute values of commonly used metrics, but even the order of classifiers in relation to the evaluation metric used is affected by the change of the imbalance rate. Finally, we demonstrate that using subsampling in order to get a test dataset with class imbalance equal to the one observed in the wild is not necessary, and eventually can lead to significant errors in classifier's performance estimate.

Submitted 15 April, 2020; v1 submitted 15 January, 2020; originally announced January 2020.

Comments: Accepted for proceedings of ICCS 2020. Supplementary code at: https://github.com/CiscoCTA/nci_eval

arXiv:1906.09084 [pdf, other]

doi 10.1007/s10994-019-05789-z

Joint Detection of Malicious Domains and Infected Clients

Authors: Paul Prasse, Rene Knaebel, Lukas Machlica, Tomas Pevny, Tobias Scheffer

Abstract: Detection of malware-infected computers and detection of malicious web domains based on their encrypted HTTPS traffic are challenging problems, because only addresses, timestamps, and data volumes are observable. The detection problems are coupled, because infected clients tend to interact with malicious domains. Traffic data can be collected at a large scale, and antivirus tools can be used to identify infected clients in retrospect. Domains, by contrast, have to be labeled individually after forensic analysis. We explore transfer learning based on sluice networks; this allows the detection models to bootstrap each other. In a large-scale experimental study, we find that the model outperforms known reference models and detects previously unknown malware, previously unknown malware families, and previously unknown malicious domains.

Submitted 21 June, 2019; originally announced June 2019.

Comments: Mach Learn (2019)

arXiv:1812.01388 [pdf, other]

Bad practices in evaluation methodology relevant to class-imbalanced problems

Authors: Jan Brabec, Lukas Machlica

Abstract: For research to go in the right direction, it is essential to be able to compare and quantify performance of different algorithms focused on the same problem. Choosing a suitable evaluation metric requires deep understanding of the pursued task along with all of its characteristics. We argue that in the case of applied machine learning, proper evaluation metric is the basic building block that should be in the spotlight and put under thorough examination. Here, we address tasks with class imbalance, in which the class of interest is the one with much lower number of samples. We encountered non-insignificant amount of recent papers, in which improper evaluation methods are used, borrowed mainly from the field of balanced problems. Such bad practices may heavily bias the results in favour of inappropriate algorithms and give false expectations of the state of the field.

Submitted 4 December, 2018; originally announced December 2018.

Comments: Accepted to Critiquing and Correcting Trends in Machine Learning workshop at NeurIPS 2018 (https://ml-critique-correct.github.io/)

arXiv:1702.02530 [pdf, other]

Learning detectors of malicious web requests for intrusion detection in network traffic

Authors: Lukas Machlica, Karel Bartos, Michal Sofka

Abstract: This paper proposes a generic classification system designed to detect security threats based on the behavior of malware samples. The system relies on statistical features computed from proxy log fields to train detectors using a database of malware samples. The behavior detectors serve as basic reusable building blocks of the multi-level detection architecture. The detectors identify malicious communication exploiting encrypted URL strings and domains generated by a Domain Generation Algorithm (DGA) which are frequently used in Command and Control (C&C), phishing, and click fraud. Surprisingly, very precise detectors can be built given only a limited amount of information extracted from a single proxy log. This way, the computational requirements of the detectors are kept low which allows for deployment on a wide range of security devices and without depending on traffic context such as DNS logs, Whois records, webpage content, etc. Results on several weeks of live traffic from 100+ companies having 350k+ hosts show correct detection with a precision exceeding 95% of malicious flows, 95% of malicious URLs and 90% of infected hosts. In addition, a comparison with a signature and rule-based solution shows that our system is able to detect significant amount of new threats.

Submitted 8 February, 2017; originally announced February 2017.

Showing 1–4 of 4 results for author: Machlica, L