-
AI ATAC 1: An Evaluation of Prominent Commercial Malware Detectors
Authors:
Robert A. Bridges,
Brian Weber,
Justin M. Beaver,
Jared M. Smith,
Miki E. Verma,
Savannah Norem,
Kevin Spakes,
Cory Watson,
Jeff A. Nichols,
Brian Jewell,
Michael D. Iannacone,
Chelsey Dunivan Stahl,
Kelly M. T. Huffer,
T. Sean Oesch
Abstract:
This work presents an evaluation of six prominent commercial endpoint malware detectors, a network malware detector, and a file-conviction algorithm from a cyber technology vendor. The evaluation was administered as the first of the Artificial Intelligence Applications to Autonomous Cybersecurity (AI ATAC) prize challenges, funded by / completed in service of the US Navy. The experiment employed 100K files (50/50% benign/malicious) with a stratified distribution of file types, including ~1K zero-day program executables (increasing experiment size two orders of magnitude over previous work). We present an evaluation process of delivering a file to a fresh virtual machine donning the detection technology, waiting 90s to allow static detection, then executing the file and waiting another period for dynamic detection; this allows greater fidelity in the observational data than previous experiments, in particular, resource and time-to-detection statistics. To execute all 800K trials (100K files $\times$ 8 tools), a software framework was designed to choreograph the experiment into a completely automated, time-synced, and reproducible workflow with substantial parallelization. A cost-benefit model was configured to integrate the tools' recall, precision, time to detection, and resource requirements into a single comparable quantity by simulating costs of use. This provides a ranking methodology for cyber competitions and a lens through which to reason about the varied statistical viewpoints of the results. These statistical and cost-model results provide insights on the state of commercial malware detection.
Submitted 28 August, 2023;
originally announced August 2023.
-
Testing SOAR Tools in Use
Authors:
Robert A. Bridges,
Ashley E. Rice,
Sean Oesch,
Jeff A. Nichols,
Cory Watson,
Kevin Spakes,
Savannah Norem,
Mike Huettel,
Brian Jewell,
Brian Weber,
Connor Gannon,
Olivia Bizovi,
Samuel C Hollifield,
Samantha Erwin
Abstract:
Modern security operation centers (SOCs) rely on operators and a tapestry of logging and alerting tools with large scale collection and query abilities. SOC investigations are tedious as they rely on manual efforts to query diverse data sources, overlay related logs, and correlate the data into information and then document results in a ticketing system. Security orchestration, automation, and response (SOAR) tools are a new technology that promise to collect, filter, and display needed data; automate common tasks that require SOC analysts' time; facilitate SOC collaboration; and improve both efficiency and consistency of SOCs. SOAR tools have never been tested in practice to evaluate their effect and understand them in use. In this paper, we design and administer the first hands-on user study of SOAR tools, involving 24 participants and 6 commercial SOAR tools. Our contributions include the experimental design, itemizing six characteristics of SOAR tools and a methodology for testing them. We describe configuration of the test environment in a cyber range, including network, user, and threat emulation; a full SOC tool suite; and creation of artifacts allowing multiple representative investigation scenarios to permit testing. We present the first research results on SOAR tools. We found that SOAR configuration is critical, as it involves creative design for data display and automation. We found that SOAR tools increased efficiency and reduced context switching during investigations, although ticket accuracy and completeness (indicating investigation quality) decreased with SOAR use. Our findings indicated that user preferences are slightly negatively correlated with their performance with the tool; overautomation was a concern of senior analysts, and SOAR tools that balanced automation with assisting a user to make decisions were preferred.
Submitted 14 February, 2023; v1 submitted 11 August, 2022;
originally announced August 2022.
-
Assembling a Cyber Range to Evaluate Artificial Intelligence / Machine Learning (AI/ML) Security Tools
Authors:
Jeffrey A. Nichols,
Kevin D. Spakes,
Cory L. Watson,
Robert A. Bridges
Abstract:
In this case study, we describe the design and assembly of a cyber security testbed at Oak Ridge National Laboratory in Oak Ridge, TN, USA. The range is designed to provide agile reconfigurations to facilitate a wide variety of experiments for evaluations of cyber security tools -- particularly those involving AI/ML. In particular, the testbed provides realistic test environments while permitting control and programmatic observations/data collection during the experiments. We have designed in the ability to repeat the evaluations, so additional tools can be evaluated and compared at a later time. The system is one that can be scaled up or down for experiment sizes. At the time of the conference we will have completed two full-scale, national, government challenges on this range. These challenges are evaluating the performance and operating costs for AI/ML-based cyber security tools for application into large, government-sized networks. These evaluations will be described as examples providing motivation and context for various design decisions and adaptations we have made. The first challenge measured end-point security tools against 100K file samples (benignware and malware) chosen across a range of file types. The second is an evaluation of network intrusion detection systems' efficacy in identifying multi-step adversarial campaigns -- involving reconnaissance, penetration and exploitations, lateral movement, etc. -- with varying levels of covertness in a high-volume business network. The scale of each of these challenges requires automation systems to repeat, or simultaneously mirror, identical experiments for each ML tool under test. Providing an array of easy-to-difficult malicious activity for sussing out the true abilities of the AI/ML tools has been a particularly interesting and challenging aspect of designing and executing these challenge events.
Submitted 20 January, 2022;
originally announced January 2022.
-
Coarse reduced model selection for nonlinear state estimation
Authors:
James A. Nichols
Abstract:
State estimation is the task of approximately reconstructing a solution $u$ of a parametric partial differential equation when the parameter vector $y$ is unknown and the only information is $m$ linear measurements of $u$. In [Cohen et al., 2021] the authors proposed a method to use a family of linear reduced spaces as a generalised nonlinear reduced model for state estimation. A computable surrogate distance is used to evaluate which linear estimate lies closest to a true solution of the PDE problem. In this paper we propose a strategy of coarse computation of the surrogate distance while maintaining a fine mesh reduced model, as the computational cost of the surrogate distance is large relative to the reduced modelling task. We demonstrate numerically that the error induced by the coarse distance is dominated by other approximation errors.
Submitted 5 March, 2021;
originally announced March 2021.
-
Beyond the Hype: A Real-World Evaluation of the Impact and Cost of Machine Learning-Based Malware Detection
Authors:
Robert A. Bridges,
Sean Oesch,
Miki E. Verma,
Michael D. Iannacone,
Kelly M. T. Huffer,
Brian Jewell,
Jeff A. Nichols,
Brian Weber,
Justin M. Beaver,
Jared M. Smith,
Daniel Scofield,
Craig Miles,
Thomas Plummer,
Mark Daniell,
Anne M. Tall
Abstract:
In this paper, we present a scientific evaluation of four prominent malware detection tools to assist an organization with two primary questions: To what extent do ML-based tools accurately classify previously- and never-before-seen files? Is it worth purchasing a network-level malware detector? To identify weaknesses, we tested each tool against 3,536 total files (2,554 or 72\% malicious, 982 or 28\% benign) of a variety of file types, including hundreds of malicious zero-days, polyglots, and APT-style files, delivered on multiple protocols. We present statistical results on detection time and accuracy, consider complementary analysis (using multiple tools together), and provide two novel applications of the recent cost-benefit evaluation procedure of Iannacone \& Bridges. While the ML-based tools are more effective at detecting zero-day files and executables, the signature-based tool may still be an overall better option. Both network-based tools provide substantial (simulated) savings when paired with either host tool, yet both show poor detection rates on protocols other than HTTP or SMTP. Our results show that all four tools have near-perfect precision but alarmingly low recall, especially on file types other than executables and office files -- 37\% of malware tested, including all polyglot files, were undetected. Priorities for researchers and takeaways for end users are given.
Submitted 17 August, 2022; v1 submitted 16 December, 2020;
originally announced December 2020.
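The precision and recall figures quoted in the abstract above can be illustrated with a small calculation. This is a sketch only: the malicious-file count (2,554) and the 37% miss rate come from the abstract, but treating that 37% as a single aggregate miss rate is an assumption, and the false-positive count is a purely hypothetical placeholder chosen to show what "near-perfect precision" looks like. The function name `precision_recall` is also illustrative.

```python
def precision_recall(true_pos, false_pos, false_neg):
    """Return (precision, recall) from raw detection counts."""
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return precision, recall


# Counts taken from the abstract.
malicious_files = 2554
missed = round(0.37 * malicious_files)   # malware that went undetected
detected = malicious_files - missed      # malware that was flagged

# HYPOTHETICAL: the abstract reports no false-positive count; 5 is a
# placeholder for "very few benign files flagged".
false_alarms = 5

precision, recall = precision_recall(detected, false_alarms, missed)
print(f"precision = {precision:.3f}, recall = {recall:.3f}")
```

Under these assumptions precision stays close to 1 while recall sits near 0.63, which is the pattern the abstract describes: few false alarms, but a large share of malware missed.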
-
NWChem: Past, Present, and Future
Authors:
E. Aprà,
E. J. Bylaska,
W. A. de Jong,
N. Govind,
K. Kowalski,
T. P. Straatsma,
M. Valiev,
H. J. J. van Dam,
Y. Alexeev,
J. Anchell,
V. Anisimov,
F. W. Aquino,
R. Atta-Fynn,
J. Autschbach,
N. P. Bauman,
J. C. Becca,
D. E. Bernholdt,
K. Bhaskaran-Nair,
S. Bogatko,
P. Borowski,
J. Boschen,
J. Brabec,
A. Bruner,
E. Cauët,
Y. Chen
, et al. (89 additional authors not shown)
Abstract:
Specialized computational chemistry packages have permanently reshaped the landscape of chemical and materials science by providing tools to support and guide experimental efforts and for the prediction of atomistic and electronic properties. In this regard, electronic structure packages have played a special role by using first-principle-driven methodologies to model complex chemical and materials processes. Over the last few decades, the rapid development of computing technologies and the tremendous increase in computational power have offered a unique chance to study complex transformations using sophisticated and predictive many-body techniques that describe correlated behavior of electrons in molecular and condensed phase systems at different levels of theory. In enabling these simulations, novel parallel algorithms have been able to take advantage of computational resources to address the polynomial scaling of electronic structure methods. In this paper, we briefly review the NWChem computational chemistry suite, including its history, design principles, parallel tools, current capabilities, outreach and outlook.
Submitted 26 May, 2020; v1 submitted 24 April, 2020;
originally announced April 2020.
-
Subdiffusive discrete time random walks via Monte Carlo and subordination
Authors:
J. A. Nichols,
B. I. Henry,
C. N. Angstmann
Abstract:
A class of discrete time random walks has recently been introduced to provide a stochastic process based numerical scheme for solving fractional order partial differential equations, including the fractional subdiffusion equation. Here we develop a Monte Carlo method for simulating discrete time random walks with Sibuya power law waiting times, providing another approximate solution of the fractional subdiffusion equation. The computation time scales as a power law in the number of time steps with a fractional exponent simply related to the order of the fractional derivative. We also provide an explicit form of a subordinator for discrete time random walks with Sibuya power law waiting times. This subordinator transforms from an operational time, in the expected number of random walk steps, to the physical time, in the number of time steps.
Submitted 16 November, 2017;
originally announced November 2017.
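The idea of a discrete time random walk with Sibuya waiting times, as described in the abstract above, can be sketched in a few lines. This is a naive illustration and not the authors' algorithm: it relies on the standard characterisation of the Sibuya distribution as the index of the first success in Bernoulli trials with success probability $α/k$ at trial $k$, and it advances one time step at a time, so its cost is linear in the number of time steps rather than the fractional power law the abstract reports. The function names, the symmetric nearest-neighbour jump rule, and the parameter values are all assumptions made for illustration.

```python
import random


def simulate_walk(alpha, n_steps, rng):
    """Position of one walker on the integer lattice after n_steps time steps.

    The walker waits a Sibuya(alpha)-distributed number of time steps between
    jumps: at the k-th step since its last jump it jumps with probability
    alpha / k, otherwise it stays put.
    """
    position = 0
    k = 1  # time steps elapsed since the last jump (1-based)
    for _ in range(n_steps):
        if rng.random() < alpha / k:
            position += rng.choice((-1, 1))  # assumed unbiased jump
            k = 1
        else:
            k += 1
    return position


def mean_squared_displacement(alpha, n_steps, n_walkers, seed=0):
    """Monte Carlo estimate of <x^2> after n_steps, averaged over walkers."""
    rng = random.Random(seed)
    total = sum(simulate_walk(alpha, n_steps, rng) ** 2 for _ in range(n_walkers))
    return total / n_walkers


if __name__ == "__main__":
    for alpha in (0.5, 0.8, 1.0):
        print(alpha, mean_squared_displacement(alpha, n_steps=1000, n_walkers=2000))
```

With `alpha = 1.0` the walker jumps at every time step and the sketch reduces to an ordinary simple random walk; smaller values of `alpha` give heavier-tailed waiting times and slower spreading.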
-
Malware Detection on General-Purpose Computers Using Power Consumption Monitoring: A Proof of Concept and Case Study
Authors:
Jarilyn M. Hernández Jiménez,
Jeffrey A. Nichols,
Katerina Goseva-Popstojanova,
Stacy Prowell,
Robert A. Bridges
Abstract:
Malware detection is challenging when faced with automatically generated and polymorphic malware, as well as with rootkits, which are exceptionally hard to detect. In an attempt to contribute towards addressing these challenges, we conducted a proof of concept study that explored the use of power consumption for detection of malware presence in a general-purpose computer. The results of our experiments indicate that malware indeed leaves a signal on the power consumption of a general-purpose computer. Specifically, for the case study based on two different rootkits, the data collected at the +12V rails on the motherboard showed the most noticeable increment of the power consumption after the computer was infected. Our future work includes experimenting with more malware examples and workloads, and developing a data analytics approach for automatic malware detection based on power consumption.
Submitted 4 May, 2017;
originally announced May 2017.