-
SIGL: Securing Software Installations Through Deep Graph Learning
Authors:
Xueyuan Han,
Xiao Yu,
Thomas Pasquier,
Ding Li,
Junghwan Rhee,
James Mickens,
Margo Seltzer,
Haifeng Chen
Abstract:
Many users implicitly assume that software can only be exploited after it is installed. However, recent supply-chain attacks demonstrate that application integrity must be ensured during installation itself. We introduce SIGL, a new tool for detecting malicious behavior during software installation. SIGL collects traces of system call activity, building a data provenance graph that it analyzes usi…
▽ More
Many users implicitly assume that software can only be exploited after it is installed. However, recent supply-chain attacks demonstrate that application integrity must be ensured during installation itself. We introduce SIGL, a new tool for detecting malicious behavior during software installation. SIGL collects traces of system call activity, building a data provenance graph that it analyzes using a novel autoencoder architecture with a graph long short-term memory network (graph LSTM) for the encoder and a standard multilayer perceptron for the decoder. SIGL flags suspicious installations as well as the specific installation-time processes that are likely to be malicious. Using a test corpus of 625 malicious installers containing real-world malware, we demonstrate that SIGL has a detection accuracy of 96%, outperforming similar systems from industry and academia by up to 87% in precision and recall and 45% in accuracy. We also demonstrate that SIGL can pinpoint the processes most likely to have triggered malicious behavior, works on different audit platforms and operating systems, and is robust to training data contamination and adversarial attack. It can be used with application-specific models, even in the presence of new software versions, as well as application-agnostic meta-models that encompass a wide range of applications and installers.
△ Less
Submitted 22 June, 2021; v1 submitted 26 August, 2020;
originally announced August 2020.
-
Xanthus: Push-button Orchestration of Host Provenance Data Collection
Authors:
Xueyuan Han,
James Mickens,
Ashish Gehani,
Margo Seltzer,
Thomas Pasquier
Abstract:
Host-based anomaly detectors generate alarms by inspecting audit logs for suspicious behavior. Unfortunately, evaluating these anomaly detectors is hard. There are few high-quality, publicly-available audit logs, and there are no pre-existing frameworks that enable push-button creation of realistic system traces. To make trace generation easier, we created Xanthus, an automated tool that orchestra…
▽ More
Host-based anomaly detectors generate alarms by inspecting audit logs for suspicious behavior. Unfortunately, evaluating these anomaly detectors is hard. There are few high-quality, publicly-available audit logs, and there are no pre-existing frameworks that enable push-button creation of realistic system traces. To make trace generation easier, we created Xanthus, an automated tool that orchestrates virtual machines to generate realistic audit logs. Using Xanthus' simple management interface, administrators select a base VM image, configure a particular tracing framework to use within that VM, and define post-launch scripts that collect and save trace data. Once data collection is finished, Xanthus creates a self-describing archive, which contains the VM, its configuration parameters, and the collected trace data. We demonstrate that Xanthus hides many of the tedious (yet subtle) orchestration tasks that humans often get wrong; Xanthus avoids mistakes that lead to non-replicable experiments.
△ Less
Submitted 10 May, 2020;
originally announced May 2020.
-
UNICORN: Runtime Provenance-Based Detector for Advanced Persistent Threats
Authors:
Xueyuan Han,
Thomas Pasquier,
Adam Bates,
James Mickens,
Margo Seltzer
Abstract:
Advanced Persistent Threats (APTs) are difficult to detect due to their "low-and-slow" attack patterns and frequent use of zero-day exploits. We present UNICORN, an anomaly-based APT detector that effectively leverages data provenance analysis. From modeling to detection, UNICORN tailors its design specifically for the unique characteristics of APTs. Through extensive yet time-efficient graph anal…
▽ More
Advanced Persistent Threats (APTs) are difficult to detect due to their "low-and-slow" attack patterns and frequent use of zero-day exploits. We present UNICORN, an anomaly-based APT detector that effectively leverages data provenance analysis. From modeling to detection, UNICORN tailors its design specifically for the unique characteristics of APTs. Through extensive yet time-efficient graph analysis, UNICORN explores provenance graphs that provide rich contextual and historical information to identify stealthy anomalous activities without pre-defined attack signatures. Using a graph sketching technique, it summarizes long-running system execution with space efficiency to combat slow-acting attacks that take place over a long time span. UNICORN further improves its detection capability using a novel modeling approach to understand long-term behavior as the system evolves. Our evaluation shows that UNICORN outperforms an existing state-of-the-art APT detection system and detects real-life APT scenarios with high accuracy.
△ Less
Submitted 14 January, 2020; v1 submitted 6 January, 2020;
originally announced January 2020.
-
McFly: Time-Travel Debugging for the Web
Authors:
John Vilk,
Emery D. Berger,
James Mickens,
Mark Marron
Abstract:
Time-traveling debuggers offer the promise of simplifying debugging by letting developers freely step forwards and backwards through a program's execution. However, web applications present multiple challenges that make time-travel debugging especially difficult. A time-traveling debugger for web applications must accurately reproduce all network interactions, asynchronous events, and visual state…
▽ More
Time-traveling debuggers offer the promise of simplifying debugging by letting developers freely step forwards and backwards through a program's execution. However, web applications present multiple challenges that make time-travel debugging especially difficult. A time-traveling debugger for web applications must accurately reproduce all network interactions, asynchronous events, and visual states observed during the original execution, both while step** forwards and backwards. This must all be done in the context of a complex and highly multithreaded browser runtime. At the same time, to be practical, a time-traveling debugger must maintain interactive speeds.
This paper presents McFly, the first time-traveling debugger for web applications. McFly departs from previous approaches by operating on a high-level representation of the browser's internal state. This approach lets McFly provide accurate time-travel debugging - maintaining JavaScript and visual state in sync at all times - at interactive speeds. McFly's architecture is browser-agnostic, building on web standards supported by all major browsers. We have implemented McFly as an extension to the Microsoft Edge web browser, and core parts of McFly have been integrated into a time-traveling debugger product from Microsoft.
△ Less
Submitted 28 October, 2018;
originally announced October 2018.