Search | arXiv e-print repository

WebGPU-SPY: Finding Fingerprints in the Sandbox through GPU Cache Attacks

Authors: Ethan Ferguson, Adam Wilson, Hoda Naghibijouybari

Abstract: Microarchitectural attacks on CPU structures have been studied in native applications, as well as in web browsers. These attacks continue to be a substantial threat to computing systems at all scales. With the proliferation of heterogeneous systems and integration of hardware accelerators in every computing system, modern web browsers provide the support of GPU-based acceleration for the graphic… ▽ More Microarchitectural attacks on CPU structures have been studied in native applications, as well as in web browsers. These attacks continue to be a substantial threat to computing systems at all scales. With the proliferation of heterogeneous systems and integration of hardware accelerators in every computing system, modern web browsers provide the support of GPU-based acceleration for the graphics and rendering processes. Emerging web standards also support the GPU acceleration of general-purpose computation within web browsers. In this paper, we present a new attack vector for microarchitectural attacks in web browsers. We use emerging GPU accelerating APIs in modern browsers (specifically WebGPU) to launch a GPU-based cache side channel attack on the compute stack of the GPU that spies on victim activities on the graphics (rendering) stack of the GPU. Unlike prior works that rely on JavaScript APIs or software interfaces to build timing primitives, we build the timer using GPU hardware resources and develop a cache side channel attack on Intel's integrated GPUs. We leverage the GPU's inherent parallelism at different levels to develop high-resolution parallel attacks. We demonstrate that GPU-based cache attacks can achieve a precision of 90 for website fingerprinting of 100 top websites. We also discuss potential countermeasures against the proposed attack to secure the systems at a critical time when these web standards are being developed and before they are widely deployed. △ Less

Submitted 8 January, 2024; originally announced January 2024.

arXiv:2307.16123 [pdf, other]

Exploiting Parallel Memory Write Requests for Covert Channel Attacks in Integrated CPU-GPU Systems

Authors: Ghadeer Almusaddar, Hoda Naghibijouybari

Abstract: In heterogeneous SoCs, accelerators like integrated GPUs (iGPUs) are integrated on the same chip as CPUs, sharing the memory subsystem. In such systems, the massive memory requests from throughput-oriented accelerators significantly interfere with CPU memory requests. In addition to the large performance impact, this interference provides an attacker with a strong leakage vector for covert attacks… ▽ More In heterogeneous SoCs, accelerators like integrated GPUs (iGPUs) are integrated on the same chip as CPUs, sharing the memory subsystem. In such systems, the massive memory requests from throughput-oriented accelerators significantly interfere with CPU memory requests. In addition to the large performance impact, this interference provides an attacker with a strong leakage vector for covert attacks across the processors, which is hard to achieve across the cores in a multi-core CPU. In this paper, we demonstrate that parallel memory write requests of the iGPU and more specifically, the management policy of the write buffer in the memory controller (MC) can lead to significantly stalling CPU memory read requests in heterogeneous SoCs. We characterize the slowdown on the shared read and write buffers in the memory controller and exploit it to build a cross-processor covert channel in Intel-based integrated CPU-GPU systems. We develop two attack variants that achieve a bandwidth of 1.65 kbps and 4.41 kbps and error rates of 0.49% and 4.32% respectively. △ Less

Submitted 30 July, 2023; originally announced July 2023.

arXiv:2207.01298 [pdf, other]

doi 10.1145/3531437.3539699

Sealer: In-SRAM AES for High-Performance and Low-Overhead Memory Encryption

Authors: **gyao Zhang, Hoda Naghibijouybari, Elaheh Sadredini

Abstract: To provide data and code confidentiality and reduce the risk of information leak from memory or memory bus, computing systems are enhanced with encryption and decryption engine. Despite massive efforts in designing hardware enhancements for data and code protection, existing solutions incur significant performance overhead as the encryption/decryption is on the critical path. In this paper, we pre… ▽ More To provide data and code confidentiality and reduce the risk of information leak from memory or memory bus, computing systems are enhanced with encryption and decryption engine. Despite massive efforts in designing hardware enhancements for data and code protection, existing solutions incur significant performance overhead as the encryption/decryption is on the critical path. In this paper, we present Sealer, a high-performance and low-overhead in-SRAM memory encryption engine by exploiting the massive parallelism and bitline computational capability of SRAM subarrays. Sealer encrypts data before sending it off-chip and decrypts it upon receiving the memory blocks, thus, providing data confidentiality. Our proposed solution requires only minimal modifications to the existing SRAM peripheral circuitry. Sealer can achieve up to two orders of magnitude throughput-per-area improvement while consuming 3x less energy compared to the prior solutions. △ Less

Submitted 16 August, 2022; v1 submitted 4 July, 2022; originally announced July 2022.

Comments: 6 pages, ISLPED 2022

MSC Class: 68Mxx

arXiv:2203.15981 [pdf, other]

Spy in the GPU-box: Covert and Side Channel Attacks on Multi-GPU Systems

Authors: Sankha Baran Dutta, Hoda Naghibijouybari, Arjun Gupta, Nael Abu-Ghazaleh, Andres Marquez, Kevin Barker

Abstract: The deep learning revolution has been enabled in large part by GPUs, and more recently accelerators, which make it possible to carry out computationally demanding training and inference in acceptable times. As the size of machine learning networks and workloads continues to increase, multi-GPU machines have emerged as an important platform offered on High Performance Computing and cloud data cente… ▽ More The deep learning revolution has been enabled in large part by GPUs, and more recently accelerators, which make it possible to carry out computationally demanding training and inference in acceptable times. As the size of machine learning networks and workloads continues to increase, multi-GPU machines have emerged as an important platform offered on High Performance Computing and cloud data centers. As these machines are shared between multiple users, it becomes increasingly important to protect applications against potential attacks. In this paper, we explore the vulnerability of Nvidia's DGX multi-GPU machines to covert and side channel attacks. These machines consist of a number of discrete GPUs that are interconnected through a combination of custom interconnect (NVLink) and PCIe connections. We reverse engineer the cache hierarchy and show that it is possible for an attacker on one GPU to cause contention on the L2 cache of another GPU. We use this observation to first develop a covert channel attack across two GPUs, achieving the best bandwidth of 3.95 MB/s. We also develop a prime and probe attack on a remote GPU allowing an attacker to recover the cache hit and miss behavior of another workload. This basic capability can be used in any number of side channel attacks: we demonstrate a proof of concept attack that fingerprints the application running on the remote GPU, with high accuracy. Our work establishes for the first time the vulnerability of these machines to microarchitectural attacks, and we hope that it guides future research to improve their security. △ Less

Submitted 29 March, 2022; originally announced March 2022.

arXiv:2011.09642 [pdf, other]

Leaky Buddies: Cross-Component Covert Channels on Integrated CPU-GPU Systems

Authors: Sankha Baran Dutta, Hoda Naghibijouybari, Nael Abu-Ghazaleh, Andres Marquez, Kevin Barker

Abstract: Graphics Processing Units (GPUs) are a ubiquitous component across the range of today's computing platforms, from phones and tablets, through personal computers, to high-end server class platforms. With the increasing importance of graphics and video workloads, recent processors are shipped with GPU devices that are integrated on the same chip. Integrated GPUs share some resources with the CPU and… ▽ More Graphics Processing Units (GPUs) are a ubiquitous component across the range of today's computing platforms, from phones and tablets, through personal computers, to high-end server class platforms. With the increasing importance of graphics and video workloads, recent processors are shipped with GPU devices that are integrated on the same chip. Integrated GPUs share some resources with the CPU and as a result, there is a potential for microarchitectural attacks from the GPU to the CPU or vice versa. We believe this type of attack, crossing the component boundary (GPU to CPU or vice versa) is novel, introducing unique challenges, but also providing the attacker with new capabilities that must be considered when we design defenses against microarchitectrual attacks in these environments. Specifically, we consider the potential for covert channel attacks that arise either from shared microarchitectural components (such as caches) or through shared contention domains (e.g., shared buses). We illustrate these two types of channels by develo** two reliable covert channel attacks. The first covert channel uses the shared LLC cache in Intel's integrated GPU architectures. The second is a contention based channel targeting the ring bus connecting the CPU and GPU to the LLC. Cross component channels introduce a number of new challenges that we had to overcome since they occur across heterogeneous components that use different computation models and are interconnected using asymmetric memory hierarchies. We also exploit GPU parallelism to increase the bandwidth of the communication, even without relying on a common clock. The LLC based channel achieves a bandwidth of 120 kbps with a low error rate of 2%, while the contention based channel delivers up to 400 kbps with a 0.8% error rate. △ Less

Submitted 18 November, 2020; originally announced November 2020.

Showing 1–5 of 5 results for author: Naghibijouybari, H