-
SinClave: Hardware-assisted Singletons for TEEs
Authors:
Franz Gregor,
Robert Krahn,
Do Le Quoc,
Christof Fetzer
Abstract:
For trusted execution environments (TEEs), remote attestation permits establishing trust in software executed on a remote host. It requires that the measurement of a remote TEE is both complete and fresh: We need to measure all aspects that might determine the behavior of an application, and this measurement has to be reasonably fresh. Performing measurements only at the start of a TEE simplifies…
▽ More
For trusted execution environments (TEEs), remote attestation permits establishing trust in software executed on a remote host. It requires that the measurement of a remote TEE is both complete and fresh: We need to measure all aspects that might determine the behavior of an application, and this measurement has to be reasonably fresh. Performing measurements only at the start of a TEE simplifies the attestation but enables "reuse" attacks of enclaves. We demonstrate how to perform such reuse attacks for different TEE frameworks. We also show how to address this issue by enforcing freshness - through the concept of a singleton enclave - and completeness of the measurements. Completeness of measurements is not trivial since the secrets provisioned to an enclave and the content of the filesystem can both affect the behavior of the software, i.e., can be used to mount reuse attacks. We present mechanisms to include measurements of these two components in the remote attestation. Our evaluation based on real-world applications shows that our approach incurs only negligible overhead ranging from 1.03% to 13.2%.
△ Less
Submitted 5 November, 2023;
originally announced November 2023.
-
Accelerating Transfer Learning with Near-Data Computation on Cloud Object Stores
Authors:
Arsany Guirguis,
Diana Petrescu,
Florin Dinu,
Do Le Quoc,
Javier Picorel,
Rachid Guerraoui
Abstract:
Near-data computation techniques have been successfully deployed to mitigate the cloud network bottleneck between the storage and compute tiers. At Huawei, we are currently looking to get more value from these techniques by broadening their applicability. Machine learning (ML) applications are an appealing and timely target. This paper describes our experience applying near-data computation techni…
▽ More
Near-data computation techniques have been successfully deployed to mitigate the cloud network bottleneck between the storage and compute tiers. At Huawei, we are currently looking to get more value from these techniques by broadening their applicability. Machine learning (ML) applications are an appealing and timely target. This paper describes our experience applying near-data computation techniques to transfer learning (TL), a widely popular ML technique, in the context of disaggregated cloud object stores. Our techniques benefit both cloud providers and users. They improve our operational efficiency while providing users the performance improvements they demand from us. The main practical challenge to consider is that the storage-side computational resources are limited. Our approach is to split the TL deep neural network (DNN) during the feature extraction phase, before the training phase. This reduces the network transfers to the compute tier and further decouples the batch size of feature extraction from the training batch size. This facilitates our second technique, storage-side batch adaptation, which enables increased concurrency in the storage tier while avoiding out-of-memory errors. Guided by these insights, we present HAPI, our processing system for TL that spans the compute and storage tiers while remaining transparent to the user. Our evaluation with several state-of-the-art DNNs, such as ResNet, VGG, and Transformer, shows up to 11x improvement in application runtime and up to 8.3x reduction in the data transferred from the storage to the compute tier compared to running the computation entirely in the compute tier.
△ Less
Submitted 9 January, 2023; v1 submitted 16 October, 2022;
originally announced October 2022.
-
Synergia: Hardening High-Assurance Security Systems with Confidential and Trusted Computing
Authors:
Wojciech Ozga,
Rasha Faqeh,
Do Le Quoc,
Franz Gregor,
Silvio Dragone,
Christof Fetzer
Abstract:
High-assurance security systems require strong isolation from the untrusted world to protect the security-sensitive or privacy-sensitive data they process. Existing regulations impose that such systems must execute in a trustworthy operating system (OS) to ensure they are not collocated with untrusted software that might negatively impact their availability or security. However, the existing techn…
▽ More
High-assurance security systems require strong isolation from the untrusted world to protect the security-sensitive or privacy-sensitive data they process. Existing regulations impose that such systems must execute in a trustworthy operating system (OS) to ensure they are not collocated with untrusted software that might negatively impact their availability or security. However, the existing techniques to attest to the OS integrity fall short due to the cuckoo attack. In this paper, we first show a novel defense mechanism against the cuckoo attack, and we formally prove it. Then, we implement it as part of an integrity monitoring and enforcement framework that attests to the trustworthiness of the OS from 3.7x to 8.5x faster than the existing integrity monitoring systems. We demonstrate its practicality by protecting the execution of a real-world eHealth application, performing micro and macro-benchmarks, and assessing the security risk.
△ Less
Submitted 12 May, 2022;
originally announced May 2022.
-
SecFL: Confidential Federated Learning using TEEs
Authors:
Do Le Quoc,
Christof Fetzer
Abstract:
Federated Learning (FL) is an emerging machine learning paradigm that enables multiple clients to jointly train a model to take benefits from diverse datasets from the clients without sharing their local training datasets. FL helps reduce data privacy risks. Unfortunately, FL still exist several issues regarding privacy and security. First, it is possible to leak sensitive information from the sha…
▽ More
Federated Learning (FL) is an emerging machine learning paradigm that enables multiple clients to jointly train a model to take benefits from diverse datasets from the clients without sharing their local training datasets. FL helps reduce data privacy risks. Unfortunately, FL still exist several issues regarding privacy and security. First, it is possible to leak sensitive information from the shared training parameters. Second, malicious clients can collude with each other to steal data, models from regular clients or corrupt the global training model. To tackle these challenges, we propose SecFL - a confidential federated learning framework that leverages Trusted Execution Environments (TEEs). SecFL performs the global and local training inside TEE enclaves to ensure the confidentiality and integrity of the computations against powerful adversaries with privileged access. SecFL provides a transparent remote attestation mechanism, relying on the remote attestation provided by TEEs, to allow clients to attest the global training computation as well as the local training computation of each other. Thus, all malicious clients can be detected using the remote attestation mechanisms.
△ Less
Submitted 7 October, 2021; v1 submitted 3 October, 2021;
originally announced October 2021.
-
WELES: Policy-driven Runtime Integrity Enforcement of Virtual Machines
Authors:
Wojciech Ozga,
Do Le Quoc,
Christof Fetzer
Abstract:
Trust is of paramount concern for tenants to deploy their security-sensitive services in the cloud. The integrity of VMs in which these services are deployed needs to be ensured even in the presence of powerful adversaries with administrative access to the cloud. Traditional approaches for solving this challenge leverage trusted computing techniques, e.g., vTPM, or hardware CPU extensions, e.g., A…
▽ More
Trust is of paramount concern for tenants to deploy their security-sensitive services in the cloud. The integrity of VMs in which these services are deployed needs to be ensured even in the presence of powerful adversaries with administrative access to the cloud. Traditional approaches for solving this challenge leverage trusted computing techniques, e.g., vTPM, or hardware CPU extensions, e.g., AMD SEV. But, they are vulnerable to powerful adversaries, or they provide only load time (not runtime) integrity measurements of VMs.
We propose WELES, a protocol allowing tenants to establish and maintain trust in VM runtime integrity of software and its configuration. WELES is transparent to the VM configuration and setup. It performs an implicit attestation of VMs during a secure login and binds the VM integrity state with the secure connection. Our prototype's evaluation shows that WELES is practical and incurs low performance overhead.
△ Less
Submitted 30 April, 2021;
originally announced April 2021.
-
Perun: Secure Multi-Stakeholder Machine Learning Framework with GPU Support
Authors:
Wojciech Ozga,
Do Le Quoc,
Christof Fetzer
Abstract:
Confidential multi-stakeholder machine learning (ML) allows multiple parties to perform collaborative data analytics while not revealing their intellectual property, such as ML source code, model, or datasets. State-of-the-art solutions based on homomorphic encryption incur a large performance overhead. Hardware-based solutions, such as trusted execution environments (TEEs), significantly improve…
▽ More
Confidential multi-stakeholder machine learning (ML) allows multiple parties to perform collaborative data analytics while not revealing their intellectual property, such as ML source code, model, or datasets. State-of-the-art solutions based on homomorphic encryption incur a large performance overhead. Hardware-based solutions, such as trusted execution environments (TEEs), significantly improve the performance in inference computations but still suffer from low performance in training computations, e.g., deep neural networks model training, because of limited availability of protected memory and lack of GPU support.
To address this problem, we designed and implemented Perun, a framework for confidential multi-stakeholder machine learning that allows users to make a trade-off between security and performance. Perun executes ML training on hardware accelerators (e.g., GPU) while providing security guarantees using trusted computing technologies, such as trusted platform module and integrity measurement architecture. Less compute-intensive workloads, such as inference, execute only inside TEE, thus at a lower trusted computing base. The evaluation shows that during the ML training on CIFAR-10 and real-world medical datasets, Perun achieved a 161x to 1560x speedup compared to a pure TEE-based approach.
△ Less
Submitted 31 March, 2021;
originally announced March 2021.
-
secureTF: A Secure TensorFlow Framework
Authors:
Do Le Quoc,
Franz Gregor,
Sergei Arnautov,
Roland Kunkel,
Pramod Bhatotia,
Christof Fetzer
Abstract:
Data-driven intelligent applications in modern online services have become ubiquitous. These applications are usually hosted in the untrusted cloud computing infrastructure. This poses significant security risks since these applications rely on applying machine learning algorithms on large datasets which may contain private and sensitive information.
To tackle this challenge, we designed secureT…
▽ More
Data-driven intelligent applications in modern online services have become ubiquitous. These applications are usually hosted in the untrusted cloud computing infrastructure. This poses significant security risks since these applications rely on applying machine learning algorithms on large datasets which may contain private and sensitive information.
To tackle this challenge, we designed secureTF, a distributed secure machine learning framework based on Tensorflow for the untrusted cloud infrastructure. secureTF is a generic platform to support unmodified TensorFlow applications, while providing end-to-end security for the input data, ML model, and application code. secureTF is built from ground-up based on the security properties provided by Trusted Execution Environments (TEEs). However, it extends the trust of a volatile memory region (or secure enclave) provided by the single node TEE to secure a distributed infrastructure required for supporting unmodified stateful machine learning applications running in the cloud.
The paper reports on our experiences about the system design choices and the system deployment in production use-cases. We conclude with the lessons learned based on the limitations of our commercially available platform, and discuss open research problems for the future work.
△ Less
Submitted 20 January, 2021;
originally announced January 2021.
-
A practical approach for updating an integrity-enforced operating system
Authors:
Wojciech Ozga,
Do Le Quoc,
Christof Fetzer
Abstract:
Trusted computing defines how to securely measure, store, and verify the integrity of software controlling a computer. One of the major challenges that make them hard to be applied in practice is the issue with software updates. Specifically, an operating system update causes the integrity violation because it changes the well-known initial state trusted by remote verifiers, such as integrity moni…
▽ More
Trusted computing defines how to securely measure, store, and verify the integrity of software controlling a computer. One of the major challenges that make them hard to be applied in practice is the issue with software updates. Specifically, an operating system update causes the integrity violation because it changes the well-known initial state trusted by remote verifiers, such as integrity monitoring systems. Consequently, the integrity monitoring of remote computers becomes unreliable due to the high amount of false positives. We address this problem by adding an extra level of indirection between the operating system and software repositories. We propose a trusted software repository (TSR), a secure proxy that overcomes the shortcomings of previous approaches by sanitizing software packages. Sanitization consists of modifying unsafe installation scripts and adding digital signatures in a way software packages can be installed in the operating system without violating its integrity. TSR leverages shielded execution, i.e., Intel SGX, to achieve confidentiality and integrity guarantees of the sanitization process. TSR is transparent to package managers, and requires no changes in the software packages building and distributing processes. Our evaluation shows that running TSR inside SGX is practical; since it induces only ~1.18X performance overhead during package sanitization compared to the native execution without SGX. TSR supports 99.76% of packages available in the main and community repositories of Alpine Linux while increasing the total repository size by 3.6%.
△ Less
Submitted 4 January, 2021;
originally announced January 2021.
-
TEEMon: A continuous performance monitoring framework for TEEs
Authors:
Robert Krahn,
Donald Dragoti,
Franz Gregor,
Do Le Quoc,
Valerio Schiavoni,
Pascal Felber,
Clenimar Souza,
Andrey Brito,
Christof Fetzer
Abstract:
Trusted Execution Environments (TEEs), such as Intel Software Guard eXtensions (SGX), are considered as a promising approach to resolve security challenges in clouds. TEEs protect the confidentiality and integrity of application code and data even against privileged attackers with root and physical access by providing an isolated secure memory area, i.e., enclaves. The security guarantees are prov…
▽ More
Trusted Execution Environments (TEEs), such as Intel Software Guard eXtensions (SGX), are considered as a promising approach to resolve security challenges in clouds. TEEs protect the confidentiality and integrity of application code and data even against privileged attackers with root and physical access by providing an isolated secure memory area, i.e., enclaves. The security guarantees are provided by the CPU, thus even if system software is compromised, the attacker can never access the enclave's content. While this approach ensures strong security guarantees for applications, it also introduces a considerable runtime overhead in part by the limited availability of protected memory (enclave page cache). Currently, only a limited number of performance measurement tools for TEE-based applications exist and none offer performance monitoring and analysis during runtime.
This paper presents TEEMon, the first continuous performance monitoring and analysis tool for TEE-based applications. TEEMon provides not only fine-grained performance metrics during runtime, but also assists the analysis of identifying causes of performance bottlenecks, e.g., excessive system calls. Our approach smoothly integrates with existing open-source tools (e.g., Prometheus or Grafana) towards a holistic monitoring solution, particularly optimized for systems deployed through Docker containers or Kubernetes and offers several dedicated metrics and visualizations. Our evaluation shows that TEEMon's overhead ranges from 5% to 17%.
△ Less
Submitted 11 December, 2020;
originally announced December 2020.
-
Trust Management as a Service: Enabling Trusted Execution in the Face of Byzantine Stakeholders
Authors:
Franz Gregor,
Wojciech Ozga,
Sébastien Vaucher,
Rafael Pires,
Do Le Quoc,
Sergei Arnautov,
André Martin,
Valerio Schiavoni,
Pascal Felber,
Christof Fetzer
Abstract:
Trust is arguably the most important challenge for critical services both deployed as well as accessed remotely over the network. These systems are exposed to a wide diversity of threats, ranging from bugs to exploits, active attacks, rogue operators, or simply careless administrators. To protect such applications, one needs to guarantee that they are properly configured and securely provisioned w…
▽ More
Trust is arguably the most important challenge for critical services both deployed as well as accessed remotely over the network. These systems are exposed to a wide diversity of threats, ranging from bugs to exploits, active attacks, rogue operators, or simply careless administrators. To protect such applications, one needs to guarantee that they are properly configured and securely provisioned with the "secrets" (e.g., encryption keys) necessary to preserve not only the confidentiality, integrity and freshness of their data but also their code. Furthermore, these secrets should not be kept under the control of a single stakeholder - which might be compromised and would represent a single point of failure - and they must be protected across software versions in the sense that attackers cannot get access to them via malicious updates. Traditional approaches for solving these challenges often use ad hoc techniques and ultimately rely on a hardware security module (HSM) as root of trust. We propose a more powerful and generic approach to trust management that instead relies on trusted execution environments (TEEs) and a set of stakeholders as root of trust. Our system, PALAEMON, can operate as a managed service deployed in an untrusted environment, i.e., one can delegate its operations to an untrusted cloud provider with the guarantee that data will remain confidential despite not trusting any individual human (even with root access) nor system software. PALAEMON addresses in a secure, efficient and cost-effective way five main challenges faced when develo** trusted networked applications and services. Our evaluation on a range of benchmarks and real applications shows that PALAEMON performs efficiently and can protect secrets of services without any change to their source code.
△ Less
Submitted 31 March, 2020;
originally announced March 2020.
-
TensorSCONE: A Secure TensorFlow Framework using Intel SGX
Authors:
Roland Kunkel,
Do Le Quoc,
Franz Gregor,
Sergei Arnautov,
Pramod Bhatotia,
Christof Fetzer
Abstract:
Machine learning has become a critical component of modern data-driven online services. Typically, the training phase of machine learning techniques requires to process large-scale datasets which may contain private and sensitive information of customers. This imposes significant security risks since modern online services rely on cloud computing to store and process the sensitive data. In the unt…
▽ More
Machine learning has become a critical component of modern data-driven online services. Typically, the training phase of machine learning techniques requires to process large-scale datasets which may contain private and sensitive information of customers. This imposes significant security risks since modern online services rely on cloud computing to store and process the sensitive data. In the untrusted computing infrastructure, security is becoming a paramount concern since the customers need to trust the thirdparty cloud provider. Unfortunately, this trust has been violated multiple times in the past. To overcome the potential security risks in the cloud, we answer the following research question: how to enable secure executions of machine learning computations in the untrusted infrastructure? To achieve this goal, we propose a hardware-assisted approach based on Trusted Execution Environments (TEEs), specifically Intel SGX, to enable secure execution of the machine learning computations over the private and sensitive datasets. More specifically, we propose a generic and secure machine learning framework based on Tensorflow, which enables secure execution of existing applications on the commodity untrusted infrastructure. In particular, we have built our system called TensorSCONE from ground-up by integrating TensorFlow with SCONE, a shielded execution framework based on Intel SGX. The main challenge of this work is to overcome the architectural limitations of Intel SGX in the context of building a secure TensorFlow system. Our evaluation shows that we achieve reasonable performance overheads while providing strong security properties with low TCB.
△ Less
Submitted 12 February, 2019;
originally announced February 2019.
-
Approximate Distributed Joins in Apache Spark
Authors:
Do Le Quoc,
Istemi Ekin Akkus,
Pramod Bhatotia,
Spyros Blanas,
Ruichuan Chen,
Christof Fetzer,
Thorsten Strufe
Abstract:
The join operation is a fundamental building block of parallel data processing. Unfortunately, it is very resource-intensive to compute an equi-join across massive datasets. The approximate computing paradigm allows users to trade accuracy and latency for expensive data processing operations. The equi-join operator is thus a natural candidate for optimization using approximation techniques. Althou…
▽ More
The join operation is a fundamental building block of parallel data processing. Unfortunately, it is very resource-intensive to compute an equi-join across massive datasets. The approximate computing paradigm allows users to trade accuracy and latency for expensive data processing operations. The equi-join operator is thus a natural candidate for optimization using approximation techniques. Although sampling-based approaches are widely used for approximation, sampling over joins is a compelling but challenging task regarding the output quality. Naive approaches, which perform joins over dataset samples, would not preserve statistical properties of the join output.
To realize this potential, we interweave Bloom filter sketching and stratified sampling with the join computation in a new operator, ApproxJoin, that preserves the statistical properties of the join output. ApproxJoin leverages a Bloom filter to avoid shuffling non-joinable data items around the network and then applies stratified sampling to obtain a representative sample of the join output.
Our analysis shows that ApproxJoin scales well and significantly reduces data movement, without sacrificing tight error bounds on the accuracy of the final results. We implemented ApproxJoin in Apache Spark and evaluated ApproxJoin using microbenchmarks and real-world case studies. The evaluation shows that ApproxJoin achieves a speedup of 6-9x over unmodified Spark-based joins with the same sampling rate. Furthermore, the speedup is accompanied by a significant reduction in the shuffled data volume, which is 5-82x less than unmodified Spark-based joins.
△ Less
Submitted 15 May, 2018;
originally announced May 2018.
-
Approximate Edge Analytics for the IoT Ecosystem
Authors:
Zhenyu Wen,
Do Le Quoc,
Pramod Bhatotia,
Ruichuan Chen,
Myung** Lee
Abstract:
IoT-enabled devices continue to generate a massive amount of data. Transforming this continuously arriving raw data into timely insights is critical for many modern online services. For such settings, the traditional form of data analytics over the entire dataset would be prohibitively limiting and expensive for supporting real-time stream analytics. In this work, we make a case for approximate co…
▽ More
IoT-enabled devices continue to generate a massive amount of data. Transforming this continuously arriving raw data into timely insights is critical for many modern online services. For such settings, the traditional form of data analytics over the entire dataset would be prohibitively limiting and expensive for supporting real-time stream analytics. In this work, we make a case for approximate computing for data analytics in IoT settings. Approximate computing aims for efficient execution of workflows where an approximate output is sufficient instead of the exact output. The idea behind approximate computing is to compute over a representative sample instead of the entire input dataset. Thus, approximate computing - based on the chosen sample size - can make a systematic trade-off between the output accuracy and computation efficiency. This motivated the design of APPROXIOT - a data analytics system for approximate computing in IoT. To realize this idea, we designed an online hierarchical stratified reservoir sampling algorithm that uses edge computing resources to produce approximate output with rigorous error bounds. To showcase the effectiveness of our algorithm, we implemented APPROXIOT based on Apache Kafka and evaluated its effectiveness using a set of microbenchmarks and real-world case studies. Our results show that APPROXIOT achieves a speedup 1.3X-9.9X with varying sampling fraction of 80% to 10% compared to simple random sampling.
△ Less
Submitted 15 May, 2018;
originally announced May 2018.
-
Approximate Stream Analytics in Apache Flink and Apache Spark Streaming
Authors:
Do Le Quoc,
Ruichuan Chen,
Pramod Bhatotia,
Christof Fetze,
Volker Hilt,
Thorsten Strufe
Abstract:
Approximate computing aims for efficient execution of workflows where an approximate output is sufficient instead of the exact output. The idea behind approximate computing is to compute over a representative sample instead of the entire input dataset. Thus, approximate computing - based on the chosen sample size - can make a systematic trade-off between the output accuracy and computation efficie…
▽ More
Approximate computing aims for efficient execution of workflows where an approximate output is sufficient instead of the exact output. The idea behind approximate computing is to compute over a representative sample instead of the entire input dataset. Thus, approximate computing - based on the chosen sample size - can make a systematic trade-off between the output accuracy and computation efficiency.
Unfortunately, the state-of-the-art systems for approximate computing primarily target batch analytics, where the input data remains unchanged during the course of sampling. Thus, they are not well-suited for stream analytics. This motivated the design of StreamApprox - a stream analytics system for approximate computing. To realize this idea, we designed an online stratified reservoir sampling algorithm to produce approximate output with rigorous error bounds. Importantly, our proposed algorithm is generic and can be applied to two prominent types of stream processing systems: (1) batched stream processing such as Apache Spark Streaming, and (2) pipelined stream processing such as Apache Flink.
We evaluated StreamApprox using a set of microbenchmarks and real-world case studies. Our results show that Spark- and Flink-based StreamApprox systems achieve a speedup of $1.15\times$-$3\times$ compared to the respective native Spark Streaming and Flink executions, with varying sampling fraction of $80\%$ to $10\%$. Furthermore, we have also implemented an improved baseline in addition to the native execution baseline - a Spark-based approximate computing system leveraging the existing sampling modules in Apache Spark. Compared to the improved baseline, our results show that StreamApprox achieves a speedup $1.1\times$-$2.4\times$ while maintaining the same accuracy level.
△ Less
Submitted 9 September, 2017;
originally announced September 2017.
-
Privacy Preserving Stream Analytics: The Marriage of Randomized Response and Approximate Computing
Authors:
Do Le Quoc,
Martin Beck,
Pramod Bhatotia,
Ruichuan Chen,
Christof Fetzer,
Thorsten Strufe
Abstract:
How to preserve users' privacy while supporting high-utility analytics for low-latency stream processing? To answer this question: we describe the design, implementation, and evaluation of PRIVAPPROX, a data analytics system for privacy-preserving stream processing. PRIVAPPROX provides three properties: (i) Privacy: zero-knowledge privacy guarantees for users, a privacy bound tighter than the stat…
▽ More
How to preserve users' privacy while supporting high-utility analytics for low-latency stream processing? To answer this question: we describe the design, implementation, and evaluation of PRIVAPPROX, a data analytics system for privacy-preserving stream processing. PRIVAPPROX provides three properties: (i) Privacy: zero-knowledge privacy guarantees for users, a privacy bound tighter than the state-of-the-art differential privacy; (ii) Utility: an interface for data analysts to systematically explore the trade-offs between the output accuracy (with error-estimation) and query execution budget; (iii) Latency: near real-time stream processing based on a scalable "synchronization-free" distributed architecture. The key idea behind our approach is to marry two existing techniques together: namely, sampling (used in the context of approximate computing) and randomized response (used in the context of privacy-preserving analytics). The resulting marriage is complementary - it achieves stronger privacy guarantees and also improves performance, a necessary ingredient for achieving low-latency stream analytics.
△ Less
Submitted 5 June, 2017; v1 submitted 19 January, 2017;
originally announced January 2017.