-
CoVE: Towards Confidential Computing on RISC-V Platforms
Authors:
Ravi Sahita,
Atish Patra,
Vedvyas Shanbhogue,
Samuel Ortiz,
Andrew Bresticker,
Dylan Reid,
Atul Khare,
Rajnesh Kanwal
Abstract:
Multi-tenant computing platforms are typically comprised of several software and hardware components including platform firmware, host operating system kernel, virtualization monitor, and the actual tenant payloads that run on them (typically in a virtual machine, container, or application). This model is well established in large scale commercial deployment, but the downside is that all platform…
▽ More
Multi-tenant computing platforms are typically comprised of several software and hardware components including platform firmware, host operating system kernel, virtualization monitor, and the actual tenant payloads that run on them (typically in a virtual machine, container, or application). This model is well established in large scale commercial deployment, but the downside is that all platform components and operators are in the Trusted Computing Base (TCB) of the tenant. This aspect is ill-suited for privacy-oriented workloads that aim to minimize the TCB footprint. Confidential computing presents a good step**-stone towards providing a quantifiable TCB for computing. Confidential computing [1] requires the use of a HW-attested Trusted Execution Environments for data-in-use protection. The RISC-V architecture presents a strong foundation for meeting the requirements for Confidential Computing and other security paradigms in a clean slate manner. This paper describes a reference architecture and discusses ISA, non-ISA and system-on-chip (SoC) requirements for confidential computing on RISC-V Platforms. It discusses proposed ISA and non-ISA Extension for Confidential Virtual Machine for RISC-V platforms, referred to as CoVE.
△ Less
Submitted 12 April, 2023;
originally announced April 2023.
-
Swivel: Hardening WebAssembly against Spectre
Authors:
Shravan Narayan,
Craig Disselkoen,
Daniel Moghimi,
Sunjay Cauligi,
Evan Johnson,
Zhao Gang,
Anjo Vahldiek-Oberwagner,
Ravi Sahita,
Hovav Shacham,
Dean Tullsen,
Deian Stefan
Abstract:
We describe Swivel, a new compiler framework for hardening WebAssembly (Wasm) against Spectre attacks. Outside the browser, Wasm has become a popular lightweight, in-process sandbox and is, for example, used in production to isolate different clients on edge clouds and function-as-a-service platforms. Unfortunately, Spectre attacks can bypass Wasm's isolation guarantees. Swivel hardens Wasm agains…
▽ More
We describe Swivel, a new compiler framework for hardening WebAssembly (Wasm) against Spectre attacks. Outside the browser, Wasm has become a popular lightweight, in-process sandbox and is, for example, used in production to isolate different clients on edge clouds and function-as-a-service platforms. Unfortunately, Spectre attacks can bypass Wasm's isolation guarantees. Swivel hardens Wasm against this class of attacks by ensuring that potentially malicious code can neither use Spectre attacks to break out of the Wasm sandbox nor coerce victim code-another Wasm client or the embedding process-to leak secret data.
We describe two Swivel designs, a software-only approach that can be used on existing CPUs, and a hardware-assisted approach that uses extension available in Intel 11th generation CPUs. For both, we evaluate a randomized approach that mitigates Spectre and a deterministic approach that eliminates Spectre altogether. Our randomized implementations impose under 10.3% overhead on the Wasm-compatible subset of SPEC 2006, while our deterministic implementations impose overheads between 3.3% and 240.2%. Though high on some benchmarks, Swivel's overhead is still between 9x and 36.3x smaller than existing defenses that rely on pipeline fences.
△ Less
Submitted 19 March, 2021; v1 submitted 25 February, 2021;
originally announced February 2021.
-
Towards a Resilient Machine Learning Classifier -- a Case Study of Ransomware Detection
Authors:
Chih-Yuan Yang,
Ravi Sahita
Abstract:
The damage caused by crypto-ransomware, due to encryption, is difficult to revert and cause data losses. In this paper, a machine learning (ML) classifier was built to early detect ransomware (called crypto-ransomware) that uses cryptography by program behavior. If a signature-based detection was missed, a behavior-based detector can be the last line of defense to detect and contain the damages. W…
▽ More
The damage caused by crypto-ransomware, due to encryption, is difficult to revert and cause data losses. In this paper, a machine learning (ML) classifier was built to early detect ransomware (called crypto-ransomware) that uses cryptography by program behavior. If a signature-based detection was missed, a behavior-based detector can be the last line of defense to detect and contain the damages. We find that input/output activities of ransomware and the file-content entropy are unique traits to detect crypto-ransomware. A deep-learning (DL) classifier can detect ransomware with a high accuracy and a low false positive rate. We conduct an adversarial research against the models generated. We use simulated ransomware programs to launch a gray-box analysis to probe the weakness of ML classifiers and to improve model robustness. In addition to accuracy and resiliency, trustworthiness is the other key criteria for a quality detector. Making sure that the correct information was used for inference is important for a security application. The Integrated Gradient method was used to explain the deep learning model and also to reveal why false negatives evade the detection. The approaches to build and to evaluate a real-world detector were demonstrated and discussed.
△ Less
Submitted 13 March, 2020;
originally announced March 2020.
-
Towards resilient machine learning for ransomware detection
Authors:
Li Chen,
Chih-Yuan Yang,
Anindya Paul,
Ravi Sahita
Abstract:
There has been a surge of interest in using machine learning (ML) to automatically detect malware through their dynamic behaviors. These approaches have achieved significant improvement in detection rates and lower false positive rates at large scale compared with traditional malware analysis methods. ML in threat detection has demonstrated to be a good cop to guard platform security. However it i…
▽ More
There has been a surge of interest in using machine learning (ML) to automatically detect malware through their dynamic behaviors. These approaches have achieved significant improvement in detection rates and lower false positive rates at large scale compared with traditional malware analysis methods. ML in threat detection has demonstrated to be a good cop to guard platform security. However it is imperative to evaluate - is ML-powered security resilient enough?
In this paper, we juxtapose the resiliency and trustworthiness of ML algorithms for security, via a case study of evaluating the resiliency of ransomware detection via the generative adversarial network (GAN). In this case study, we propose to use GAN to automatically produce dynamic features that exhibit generalized malicious behaviors that can reduce the efficacy of black-box ransomware classifiers. We examine the quality of the GAN-generated samples by comparing the statistical similarity of these samples to real ransomware and benign software. Further we investigate the latent subspace where the GAN-generated samples lie and explore reasons why such samples cause a certain class of ransomware classifiers to degrade in performance. Our focus is to emphasize necessary defense improvement in ML-based approaches for ransomware detection before deployment in the wild. Our results and discoveries should pose relevant questions for defenders such as how ML models can be made more resilient for robust enforcement of security objectives.
△ Less
Submitted 16 May, 2019; v1 submitted 21 December, 2018;
originally announced December 2018.
-
HeNet: A Deep Learning Approach on Intel$^\circledR$ Processor Trace for Effective Exploit Detection
Authors:
Li Chen,
Salmin Sultana,
Ravi Sahita
Abstract:
This paper presents HeNet, a hierarchical ensemble neural network, applied to classify hardware-generated control flow traces for malware detection. Deep learning-based malware detection has so far focused on analyzing executable files and runtime API calls. Static code analysis approaches face challenges due to obfuscated code and adversarial perturbations. Behavioral data collected during execut…
▽ More
This paper presents HeNet, a hierarchical ensemble neural network, applied to classify hardware-generated control flow traces for malware detection. Deep learning-based malware detection has so far focused on analyzing executable files and runtime API calls. Static code analysis approaches face challenges due to obfuscated code and adversarial perturbations. Behavioral data collected during execution is more difficult to obfuscate but recent research has shown successful attacks against API call based malware classifiers. We investigate control flow based characterization of a program execution to build robust deep learning malware classifiers. HeNet consists of a low-level behavior model and a top-level ensemble model. The low-level model is a per-application behavior model, trained via transfer learning on a time-series of images generated from control flow trace of an execution. We use Intel$^\circledR$ Processor Trace enabled processor for low overhead execution tracing and design a lightweight image conversion and segmentation of the control flow trace. The top-level ensemble model aggregates the behavior classification of all the trace segments and detects an attack. The use of hardware trace adds portability to our system and the use of deep learning eliminates the manual effort of feature engineering. We evaluate HeNet against real-world exploitations of PDF readers. HeNet achieves 100\% accuracy and 0\% false positive on test set, and higher classification accuracy compared to classical machine learning algorithms.
△ Less
Submitted 8 January, 2018;
originally announced January 2018.
-
Semi-supervised classification for dynamic Android malware detection
Authors:
Li Chen,
Mingwei Zhang,
Chih-Yuan Yang,
Ravi Sahita
Abstract:
A growing number of threats to Android phones creates challenges for malware detection. Manually labeling the samples into benign or different malicious families requires tremendous human efforts, while it is comparably easy and cheap to obtain a large amount of unlabeled APKs from various sources. Moreover, the fast-paced evolution of Android malware continuously generates derivative malware fami…
▽ More
A growing number of threats to Android phones creates challenges for malware detection. Manually labeling the samples into benign or different malicious families requires tremendous human efforts, while it is comparably easy and cheap to obtain a large amount of unlabeled APKs from various sources. Moreover, the fast-paced evolution of Android malware continuously generates derivative malware families. These families often contain new signatures, which can escape detection when using static analysis. These practical challenges can also cause traditional supervised machine learning algorithms to degrade in performance.
In this paper, we propose a framework that uses model-based semi-supervised (MBSS) classification scheme on the dynamic Android API call logs. The semi-supervised approach efficiently uses the labeled and unlabeled APKs to estimate a finite mixture model of Gaussian distributions via conditional expectation-maximization and efficiently detects malwares during out-of-sample testing. We compare MBSS with the popular malware detection classifiers such as support vector machine (SVM), $k$-nearest neighbor (kNN) and linear discriminant analysis (LDA). Under the ideal classification setting, MBSS has competitive performance with 98\% accuracy and very low false positive rate for in-sample classification. For out-of-sample testing, the out-of-sample test data exhibit similar behavior of retrieving phone information and sending to the network, compared with in-sample training set. When this similarity is strong, MBSS and SVM with linear kernel maintain 90\% detection rate while $k$NN and LDA suffer great performance degradation. When this similarity is slightly weaker, all classifiers degrade in performance, but MBSS still performs significantly better than other classifiers.
△ Less
Submitted 19 April, 2017;
originally announced April 2017.