-
Intel TDX Demystified: A Top-Down Approach
Authors:
Pau-Chen Cheng,
Wojciech Ozga,
Enriquillo Valdez,
Salman Ahmed,
Zhongshu Gu,
Hani Jamjoom,
Hubertus Franke,
James Bottomley
Abstract:
Intel Trust Domain Extensions (TDX) is a new architectural extension in the 4th Generation Intel Xeon Scalable Processor that supports confidential computing. TDX allows the deployment of virtual machines in the Secure-Arbitration Mode (SEAM) with encrypted CPU state and memory, integrity protection, and remote attestation. TDX aims to enforce hardware-assisted isolation for virtual machines and minimize the attack surface exposed to host platforms, which are considered untrustworthy or adversarial in confidential computing's new threat model. TDX can be leveraged by regulated industries or sensitive data holders to outsource their computations and data with end-to-end protection in public cloud infrastructure.
This paper aims to provide a comprehensive understanding of TDX to potential adopters, domain experts, and security researchers looking to leverage the technology for their own purposes. We adopt a top-down approach, starting with high-level security principles and moving to low-level technical details of TDX. Our analysis is based on publicly available documentation and source code, offering insights from security researchers outside of Intel.
Submitted 27 March, 2023;
originally announced March 2023.
-
Separation of Powers in Federated Learning
Authors:
Pau-Chen Cheng,
Kevin Eykholt,
Zhongshu Gu,
Hani Jamjoom,
K. R. Jayaram,
Enriquillo Valdez,
Ashish Verma
Abstract:
Federated Learning (FL) enables collaborative training among mutually distrusting parties. Model updates, rather than training data, are concentrated and fused in a central aggregation server. A key security challenge in FL is that an untrustworthy or compromised aggregation process might lead to unforeseeable information leakage. This challenge is especially acute due to recently demonstrated attacks that have reconstructed large fractions of training data from ostensibly "sanitized" model updates.
In this paper, we introduce TRUDA, a new cross-silo FL system, employing a trustworthy and decentralized aggregation architecture to break down information concentration with regard to a single aggregator. Based on the unique computational properties of model-fusion algorithms, all exchanged model updates in TRUDA are disassembled at the parameter-granularity and re-stitched to random partitions designated for multiple TEE-protected aggregators. Thus, each aggregator only has a fragmentary and shuffled view of model updates and is oblivious to the model architecture. Our new security mechanisms can fundamentally mitigate training reconstruction attacks, while still preserving the final accuracy of trained models and keeping performance overheads low.
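The partitioning idea in this abstract can be illustrated with a minimal sketch. It is not TRUDA's implementation: it assumes plain coordinate-wise averaging as the fusion algorithm, a partition derived from a seed shared by the parties but not the aggregators, and it omits TEEs, attestation, and transport entirely. All function names are hypothetical.

```python
import random

def make_partition(num_params, num_aggregators, seed):
    """Shuffle parameter indices and deal them out to the aggregators."""
    rng = random.Random(seed)  # assumption: seed is known only to the parties
    order = list(range(num_params))
    rng.shuffle(order)
    return [order[a::num_aggregators] for a in range(num_aggregators)]

def disassemble(update, partition):
    """Split one party's flat update into one fragment per aggregator."""
    return [[update[i] for i in idxs] for idxs in partition]

def aggregate_fragment(fragments):
    """Run by a single aggregator: average the fragments position by position."""
    n = len(fragments)
    return [sum(vals) / n for vals in zip(*fragments)]

def reassemble(fused_fragments, partition, num_params):
    """Run by a party: put fused values back at their original indices."""
    fused = [0.0] * num_params
    for idxs, frag in zip(partition, fused_fragments):
        for i, value in zip(idxs, frag):
            fused[i] = value
    return fused

def fuse(updates, num_aggregators, seed):
    """End-to-end round: partition, aggregate per fragment, reassemble."""
    num_params = len(updates[0])
    partition = make_partition(num_params, num_aggregators, seed)
    per_party = [disassemble(u, partition) for u in updates]
    fused_fragments = [
        aggregate_fragment([party[a] for party in per_party])
        for a in range(num_aggregators)
    ]
    return reassemble(fused_fragments, partition, num_params)
```

Because averaging acts on each parameter independently, the reassembled result matches what a single aggregator would compute, while each aggregator in the sketch only ever sees a shuffled subset of the parameters.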
Submitted 19 May, 2021;
originally announced May 2021.
-
Reaching Data Confidentiality and Model Accountability on the CalTrain
Authors:
Zhongshu Gu,
Hani Jamjoom,
Dong Su,
Heqing Huang,
Jialong Zhang,
Tengfei Ma,
Dimitrios Pendarakis,
Ian Molloy
Abstract:
Distributed collaborative learning (DCL) paradigms enable building joint machine learning models from distrusting multi-party participants. Data confidentiality is guaranteed by retaining private training data on each participant's local infrastructure. However, this approach to achieving data confidentiality makes today's DCL designs fundamentally vulnerable to data poisoning and backdoor attacks. It also limits DCL's model accountability, which is key to backtracking the responsible "bad" training data instances/contributors. In this paper, we introduce CALTRAIN, a Trusted Execution Environment (TEE) based centralized multi-party collaborative learning system that simultaneously achieves data confidentiality and model accountability. CALTRAIN enforces isolated computation on centrally aggregated training data to guarantee data confidentiality. To support building accountable learning models, we securely maintain the links between training instances and their corresponding contributors. Our evaluation shows that the models generated from CALTRAIN can achieve the same prediction accuracy when compared to the models trained in non-protected environments. We also demonstrate that when malicious training participants tend to implant backdoors during model training, CALTRAIN can accurately and precisely discover the poisoned and mislabeled training data that lead to the runtime mispredictions.
Submitted 7 December, 2018;
originally announced December 2018.
-
Confidential Inference via Ternary Model Partitioning
Authors:
Zhongshu Gu,
Heqing Huang,
Jialong Zhang,
Dong Su,
Hani Jamjoom,
Ankita Lamba,
Dimitrios Pendarakis,
Ian Molloy
Abstract:
Today's cloud vendors are competing to provide various offerings to simplify and accelerate AI service deployment. However, cloud users always have concerns about the confidentiality of their runtime data, which is processed on third-party compute infrastructure. Information disclosure of user-supplied data may jeopardize users' privacy and breach increasingly stringent data protection regulations. In this paper, we systematically investigate the life cycles of inference inputs in deep learning image classification pipelines and understand how the information could be leaked. Based on the discovered insights, we develop a Ternary Model Partitioning mechanism and bring trusted execution environments to mitigate the identified information leakages. Our research prototype consists of two co-operative components: (1) Model Assessment Framework, a local model evaluation and partitioning tool that assists cloud users in deployment preparation; (2) Infenclave, an enclave-based model serving system for online confidential inference in the cloud. We have conducted comprehensive security and performance evaluation on three representative ImageNet-level deep learning models with different network depths and architectural complexity. Our results demonstrate the feasibility of launching confidential inference services in the cloud with maximized confidentiality guarantees and low performance costs.
Submitted 12 August, 2020; v1 submitted 3 July, 2018;
originally announced July 2018.
-
Analysis and Modeling of Social Influence in High Performance Computing Workloads
Authors:
Shuai Zheng,
Zon-Yin Shae,
Xiangliang Zhang,
Hani Jamjoom,
Liana Fong
Abstract:
Social influence among users (e.g., collaboration on a project) creates bursty behavior in the underlying high performance computing (HPC) workloads. Using representative HPC and cluster workload logs, this paper identifies, analyzes, and quantifies the level of social influence across HPC users. We show the existence of a social graph that is characterized by a pattern of dominant users and followers. This pattern also follows a power-law distribution, which is consistent with those observed in mainstream social networks. Given its potential impact on HPC workload prediction and scheduling, we propose a fast-converging, computationally-efficient online learning algorithm for identifying social groups. Extensive evaluation shows that our online algorithm can (1) quickly identify the social relationships by using a small portion of incoming jobs and (2) efficiently track group evolution over time.
Submitted 14 October, 2016;
originally announced October 2016.
-
Quality of Consumption: The Friendlier Side of Quality of Service
Authors:
Murad Kablan,
Hani Jamjoom,
Eric Keller
Abstract:
Cloud services today are increasingly built using functionality from other running services. In this paper, we question whether legacy Quality of Service (QoS) metrics and enforcement techniques are sufficient as they are producer-centric. We argue that, similar to customer rating systems found in banking systems and many sharing economy apps (e.g., Uber and Airbnb), Quality of Consumption (QoC) should be introduced to capture different metrics about service consumers. We show how the combination of QoS and QoC, dubbed QoX, can be used by consumers and providers to improve the security and management of their infrastructure. In addition, we demonstrate how sharing information among other consumers and providers increases the value of QoX. To address the main challenge with sharing information, namely sybil attacks and misinformation, we describe how we can leverage cloud providers as vouching authorities to ensure the integrity of information. We present initial results in prototyping the appropriate abstractions and interfaces in a cloud environment, focusing on the design impact on both service providers and consumers.
Submitted 30 September, 2015;
originally announced September 2015.
-
The Cloud Needs a Reputation System
Authors:
Murad Kablan,
Carlee Joe-Won,
Sangtae Ha,
Hani Jamjoom,
Eric Keller
Abstract:
Today's cloud apps are built from many diverse services that are managed by different parties. At the same time, these parties, which consume and/or provide services, continue to rely on arcane static security and entitlements models. In this paper, we introduce Seit, an inter-tenant framework that manages the interactions between cloud services. Seit is a software-defined reputation-based framework. It consists of two primary components: (1) a set of integration and query interfaces that can be easily integrated into cloud and service providers' management stacks, and (2) a controller that maintains reputation information using a mechanism that is adaptive to the highly dynamic environment of the cloud. We have fully implemented Seit, and integrated it into an SDN controller, a load balancer, a cloud service broker, an intrusion detection system, and a monitoring framework. We evaluate the efficacy of Seit using both an analytical model and a Mininet-based emulated environment. Our analytical model validates the isolation and stability properties of Seit. Using our emulated environment, we show that Seit can provide improved security by isolating malicious tenants, reduced costs by adapting the infrastructure without compromising security, and increased revenues for high quality service providers by enabling reputation to impact discovery.
Submitted 30 September, 2015;
originally announced September 2015.