Search | arXiv e-print repository

Testing side-channel security of cryptographic implementations against future microarchitectures

Authors: Gilles Barthe, Marcel Böhme, Sunjay Cauligi, Chitchanok Chuengsatiansup, Daniel Genkin, Marco Guarnieri, David Mateos Romero, Peter Schwabe, David Wu, Yuval Yarom

Abstract: How will future microarchitectures impact the security of existing cryptographic implementations? As we cannot keep reducing the size of transistors, chip vendors have started develo** new microarchitectural optimizations to speed up computation. A recent study (Sanchez Vicarte et al., ISCA 2021) suggests that these optimizations might open the Pandora's box of microarchitectural attacks. Howeve… ▽ More How will future microarchitectures impact the security of existing cryptographic implementations? As we cannot keep reducing the size of transistors, chip vendors have started develo** new microarchitectural optimizations to speed up computation. A recent study (Sanchez Vicarte et al., ISCA 2021) suggests that these optimizations might open the Pandora's box of microarchitectural attacks. However, there is little guidance on how to evaluate the security impact of future optimization proposals. To help chip vendors explore the impact of microarchitectural optimizations on cryptographic implementations, we develop (i) an expressive domain-specific language, called LmSpec, that allows them to specify the leakage model for the given optimization and (ii) a testing framework, called LmTest, to automatically detect leaks under the specified leakage model within the given implementation. Using this framework, we conduct an empirical study of 18 proposed microarchitectural optimizations on 25 implementations of eight cryptographic primitives in five popular libraries. We find that every implementation would contain secret-dependent leaks, sometimes sufficient to recover a victim's secret key, if these optimizations were realized. Ironically, some leaks are possible only because of coding idioms used to prevent leaks under the standard constant-time model. △ Less

Submitted 1 February, 2024; originally announced February 2024.

arXiv:2401.17628 [pdf, other]

Elephants Do Not Forget: Differential Privacy with State Continuity for Privacy Budget

Authors: Jiankai **, Chitchanok Chuengsatiansup, Toby Murray, Benjamin I. P. Rubinstein, Yuval Yarom, Olga Ohrimenko

Abstract: Current implementations of differentially-private (DP) systems either lack support to track the global privacy budget consumed on a dataset, or fail to faithfully maintain the state continuity of this budget. We show that failure to maintain a privacy budget enables an adversary to mount replay, rollback and fork attacks - obtaining answers to many more queries than what a secure system would allo… ▽ More Current implementations of differentially-private (DP) systems either lack support to track the global privacy budget consumed on a dataset, or fail to faithfully maintain the state continuity of this budget. We show that failure to maintain a privacy budget enables an adversary to mount replay, rollback and fork attacks - obtaining answers to many more queries than what a secure system would allow. As a result the attacker can reconstruct secret data that DP aims to protect - even if DP code runs in a Trusted Execution Environment (TEE). We propose ElephantDP, a system that aims to provide the same guarantees as a trusted curator in the global DP model would, albeit set in an untrusted environment. Our system relies on a state continuity module to provide protection for the privacy budget and a TEE to faithfully execute DP code and update the budget. To provide security, our protocol makes several design choices including the content of the persistent state and the order between budget updates and query answers. We prove that ElephantDP provides liveness (i.e., the protocol can restart from a correct state and respond to queries as long as the budget is not exceeded) and DP confidentiality (i.e., an attacker learns about a dataset as much as it would from interacting with a trusted curator). Our implementation and evaluation of the protocol use Intel SGX as a TEE to run the DP code and a network of TEEs to maintain state continuity. Compared to an insecure baseline, we observe only 1.1-2$\times$ overheads and lower relative overheads for larger datasets and complex DP queries. △ Less

Submitted 31 January, 2024; originally announced January 2024.

arXiv:2401.13575 [pdf, other]

CNN architecture extraction on edge GPU

Authors: Peter Horvath, Lukasz Chmielewski, Leo Weissbart, Lejla Batina, Yuval Yarom

Abstract: Neural networks have become popular due to their versatility and state-of-the-art results in many applications, such as image classification, natural language processing, speech recognition, forecasting, etc. These applications are also used in resource-constrained environments such as embedded devices. In this work, the susceptibility of neural network implementations to reverse engineering is ex… ▽ More Neural networks have become popular due to their versatility and state-of-the-art results in many applications, such as image classification, natural language processing, speech recognition, forecasting, etc. These applications are also used in resource-constrained environments such as embedded devices. In this work, the susceptibility of neural network implementations to reverse engineering is explored on the NVIDIA Jetson Nano microcomputer via side-channel analysis. To this end, an architecture extraction attack is presented. In the attack, 15 popular convolutional neural network architectures (EfficientNets, MobileNets, NasNet, etc.) are implemented on the GPU of Jetson Nano and the electromagnetic radiation of the GPU is analyzed during the inference operation of the neural networks. The results of the analysis show that neural network architectures are easily distinguishable using deep learning-based side-channel analysis. △ Less

Submitted 24 January, 2024; originally announced January 2024.

Comments: Will appear at the AIHWS 2024 workshop at ACNS 2024

Report number: AIHWS008

arXiv:2312.07783 [pdf, other]

BarraCUDA: GPUs do Leak DNN Weights

Authors: Peter Horvath, Lukasz Chmielewski, Leo Weissbart, Lejla Batina, Yuval Yarom

Abstract: Over the last decade, applications of neural networks (NNs) have spread to various aspects of our lives. A large number of companies base their businesses on building products that use neural networks for tasks such as face recognition, machine translation, and self-driving cars. Much of the intellectual property underpinning these products is encoded in the exact parameters of the neural networks… ▽ More Over the last decade, applications of neural networks (NNs) have spread to various aspects of our lives. A large number of companies base their businesses on building products that use neural networks for tasks such as face recognition, machine translation, and self-driving cars. Much of the intellectual property underpinning these products is encoded in the exact parameters of the neural networks. Consequently, protecting these is of utmost priority to businesses. At the same time, many of these products need to operate under a strong threat model, in which the adversary has unfettered physical control of the product. In this work, we present BarraCUDA, a novel attack on general purpose Graphic Processing Units (GPUs) that can extract parameters of neural networks running on the popular Nvidia Jetson Nano device. BarraCUDA uses correlation electromagnetic analysis to recover parameters of real-world convolutional neural networks. △ Less

Submitted 27 February, 2024; v1 submitted 12 December, 2023; originally announced December 2023.

arXiv:2307.09001 [pdf, other]

On Borrowed Time -- Preventing Static Power Side-Channel Analysis

Authors: Robert Dumitru, Andrew Wabnitz, Yuval Yarom

Abstract: In recent years, static power side-channel analysis attacks have emerged as a serious threat to cryptographic implementations, overcoming state-of-the-art countermeasures against side-channel attacks. The continued down-scaling of semiconductor process technology, which results in an increase of the relative weight of static power in the total power budget of circuits, will only improve the viabil… ▽ More In recent years, static power side-channel analysis attacks have emerged as a serious threat to cryptographic implementations, overcoming state-of-the-art countermeasures against side-channel attacks. The continued down-scaling of semiconductor process technology, which results in an increase of the relative weight of static power in the total power budget of circuits, will only improve the viability of static power side-channel analysis attacks. Yet, despite the threat posed, limited work has been invested into mitigating this class of attack. In this work we address this gap. We observe that static power side-channel analysis relies on stop** the target circuit's clock over a prolonged period, during which the circuit holds secret information in its registers. We propose Borrowed Time, a countermeasure that hinders an attacker's ability to leverage such clock control. Borrowed Time detects a stopped clock and triggers a reset that wipes any registers containing sensitive intermediates, whose leakages would otherwise be exploitable. We demonstrate the effectiveness of our countermeasure by performing practical Correlation Power Analysis attacks under optimal conditions against an AES implementation on an FPGA target with and without our countermeasure in place. In the unprotected case, we can recover the entire secret key using traces from 1,500 encryptions. Under the same conditions, the protected implementation successfully prevents key recovery even with traces from 1,000,000 encryptions. △ Less

Submitted 18 July, 2023; originally announced July 2023.

arXiv:2305.19586 [pdf, other]

CryptOpt: Automatic Optimization of Straightline Code

Authors: Joel Kuepper, Andres Erbsen, Jason Gross, Owen Conoly, Chuyue Sun, Samuel Tian, David Wu, Adam Chlipala, Chitchanok Chuengsatiansup, Daniel Genkin, Markus Wagner, Yuval Yarom

Abstract: Manual engineering of high-performance implementations typically consumes many resources and requires in-depth knowledge of the hardware. Compilers try to address these problems; however, they are limited by design in what they can do. To address this, we present CryptOpt, an automatic optimizer for long stretches of straightline code. Experimental results across eight hardware platforms show that… ▽ More Manual engineering of high-performance implementations typically consumes many resources and requires in-depth knowledge of the hardware. Compilers try to address these problems; however, they are limited by design in what they can do. To address this, we present CryptOpt, an automatic optimizer for long stretches of straightline code. Experimental results across eight hardware platforms show that CryptOpt achieves a speed-up factor of up to 2.56 over current off-the-shelf compilers. △ Less

Submitted 31 May, 2023; originally announced May 2023.

arXiv:2305.12784 [pdf, other]

Hot Pixels: Frequency, Power, and Temperature Attacks on GPUs and ARM SoCs

Authors: Hritvik Taneja, Jason Kim, Jie Jeff Xu, Stephan van Schaik, Daniel Genkin, Yuval Yarom

Abstract: The drive to create thinner, lighter, and more energy efficient devices has resulted in modern SoCs being forced to balance a delicate tradeoff between power consumption, heat dissipation, and execution speed (i.e., frequency). While beneficial, these DVFS mechanisms have also resulted in software-visible hybrid side-channels, which use software to probe analog properties of computing devices. Suc… ▽ More The drive to create thinner, lighter, and more energy efficient devices has resulted in modern SoCs being forced to balance a delicate tradeoff between power consumption, heat dissipation, and execution speed (i.e., frequency). While beneficial, these DVFS mechanisms have also resulted in software-visible hybrid side-channels, which use software to probe analog properties of computing devices. Such hybrid attacks are an emerging threat that can bypass countermeasures for traditional microarchitectural side-channel attacks. Given the rise in popularity of both Arm SoCs and GPUs, in this paper we investigate the susceptibility of these devices to information leakage via power, temperature and frequency, as measured via internal sensors. We demonstrate that the sensor data observed correlates with both instructions executed and data processed, allowing us to mount software-visible hybrid side-channel attacks on these devices. To demonstrate the real-world impact of this issue, we present JavaScript-based pixel stealing and history sniffing attacks on Chrome and Safari, with all side channel countermeasures enabled. Finally, we also show website fingerprinting attacks, without any elevated privileges. △ Less

Submitted 22 May, 2023; originally announced May 2023.

arXiv:2211.10665 [pdf, other]

CryptOpt: Verified Compilation with Randomized Program Search for Cryptographic Primitives (full version)

Authors: Joel Kuepper, Andres Erbsen, Jason Gross, Owen Conoly, Chuyue Sun, Samuel Tian, David Wu, Adam Chlipala, Chitchanok Chuengsatiansup, Daniel Genkin, Markus Wagner, Yuval Yarom

Abstract: Most software domains rely on compilers to translate high-level code to multiple different machine languages, with performance not too much worse than what developers would have the patience to write directly in assembly language. However, cryptography has been an exception, where many performance-critical routines have been written directly in assembly (sometimes through metaprogramming layers).… ▽ More Most software domains rely on compilers to translate high-level code to multiple different machine languages, with performance not too much worse than what developers would have the patience to write directly in assembly language. However, cryptography has been an exception, where many performance-critical routines have been written directly in assembly (sometimes through metaprogramming layers). Some past work has shown how to do formal verification of that assembly, and other work has shown how to generate C code automatically along with formal proof, but with consequent performance penalties vs. the best-known assembly. We present CryptOpt, the first compilation pipeline that specializes high-level cryptographic functional programs into assembly code significantly faster than what GCC or Clang produce, with mechanized proof (in Coq) whose final theorem statement mentions little beyond the input functional program and the operational semantics of x86-64 assembly. On the optimization side, we apply randomized search through the space of assembly programs, with repeated automatic benchmarking on target CPUs. On the formal-verification side, we connect to the Fiat Cryptography framework (which translates functional programs into C-like IR code) and extend it with a new formally verified program-equivalence checker, incorporating a modest subset of known features of SMT solvers and symbolic-execution engines. The overall prototype is quite practical, e.g. producing new fastest-known implementations of finite-field arithmetic for both Curve25519 (part of the TLS standard) and the Bitcoin elliptic curve secp256k1 for the Intel $12^{th}$ and $13^{th}$ generations. △ Less

Submitted 21 May, 2023; v1 submitted 19 November, 2022; originally announced November 2022.

arXiv:2211.01109 [pdf, other]

The Impostor Among US(B): Off-Path Injection Attacks on USB Communications

Authors: Robert Dumitru, Daniel Genkin, Andrew Wabnitz, Yuval Yarom

Abstract: USB is the most prevalent peripheral interface in modern computer systems and its inherent insecurities make it an appealing attack vector. A well-known limitation of USB is that traffic is not encrypted. This allows on-path adversaries to trivially perform man-in-the-middle attacks. Off-path attacks that compromise the confidentiality of communications have also been shown to be possible. However… ▽ More USB is the most prevalent peripheral interface in modern computer systems and its inherent insecurities make it an appealing attack vector. A well-known limitation of USB is that traffic is not encrypted. This allows on-path adversaries to trivially perform man-in-the-middle attacks. Off-path attacks that compromise the confidentiality of communications have also been shown to be possible. However, so far no off-path attacks that breach USB communications integrity have been demonstrated. In this work we show that the integrity of USB communications is not guaranteed even against off-path attackers.Specifically, we design and build malicious devices that, even when placed outside of the path between a victim device and the host, can inject data to that path. Using our developed injectors we can falsify the provenance of data input as interpreted by a host computer system. By injecting on behalf of trusted victim devices we can circumvent any software-based authorisation policy defences that computer systems employ against common USB attacks. We demonstrate two concrete attacks. The first injects keystrokes allowing an attacker to execute commands. The second demonstrates file-contents replacement including during system install from a USB disk. We test the attacks on 29 USB 2.0 and USB 3.x hubs and find 14 of them to be vulnerable. △ Less

Submitted 2 November, 2022; v1 submitted 2 November, 2022; originally announced November 2022.

Comments: To appear in USENIX Security 2023

arXiv:2201.11377 [pdf, other]

CacheFX: A Framework for Evaluating Cache Security

Authors: Daniel Genkin, William Kosasih, Fangfei Liu, Anna Trikalinou, Thomas Unterluggauer, Yuval Yarom

Abstract: Over the last two decades, the danger of sharing resources between programs has been repeatedly highlighted. Multiple side-channel attacks, which seek to exploit shared components for leaking information, have been devised, mostly targeting shared caching components. In response, the research community has proposed multiple cache designs that aim at curbing the source of side channels. With multip… ▽ More Over the last two decades, the danger of sharing resources between programs has been repeatedly highlighted. Multiple side-channel attacks, which seek to exploit shared components for leaking information, have been devised, mostly targeting shared caching components. In response, the research community has proposed multiple cache designs that aim at curbing the source of side channels. With multiple competing designs, there is a need for assessing the level of security against side-channel attacks that each design offers. In this work we propose CacheFX, a flexible framework for assessing and evaluating the resilience of cache designs to side-channel attacks. CacheFX allows the evaluator to implement various cache designs, victims, and attackers, as well as to exercise them for assessing the leakage of information via the cache. To demonstrate the power of CacheFX, we implement multiple cache designs and replacement algorithms, and devise three evaluation metrics that measure different aspects of the caches:(1) the entropy induced by a memory access; (2) the complexity of building an eviction set; and (3) protection against cryptographic attacks. Our experiments highlight that different security metrics give different insights to designs, making a comprehensive analysis mandatory. For instance, while eviction-set building was fastest for randomized skewed caches, these caches featured lower eviction entropy and higher practical attack complexity. Our experiments show that all non-partitioned designs allow for effective cryptographic attacks. However, in state-of-the-art secure caches, eviction-based attacks are more difficult to mount than occupancy-based attacks, highlighting the need to consider the latter in cache design. △ Less

Submitted 27 January, 2022; originally announced January 2022.

arXiv:2201.09956 [pdf, other]

doi 10.14722/ndss.2022.24093

DRAWNAPART: A Device Identification Technique based on Remote GPU Fingerprinting

Authors: Tomer Laor, Naif Mehanna, Antonin Durey, Vitaly Dyadyuk, Pierre Laperdrix, Clémentine Maurice, Yossi Oren, Romain Rouvoy, Walter Rudametkin, Yuval Yarom

Abstract: Browser fingerprinting aims to identify users or their devices, through scripts that execute in the users' browser and collect information on software or hardware characteristics. It is used to track users or as an additional means of identification to improve security. In this paper, we report on a new technique that can significantly extend the tracking time of fingerprint-based tracking methods… ▽ More Browser fingerprinting aims to identify users or their devices, through scripts that execute in the users' browser and collect information on software or hardware characteristics. It is used to track users or as an additional means of identification to improve security. In this paper, we report on a new technique that can significantly extend the tracking time of fingerprint-based tracking methods. Our technique, which we call DrawnApart, is a new GPU fingerprinting technique that identifies a device based on the unique properties of its GPU stack. Specifically, we show that variations in speed among the multiple execution units that comprise a GPU can serve as a reliable and robust device signature, which can be collected using unprivileged JavaScript. We investigate the accuracy of DrawnApart under two scenarios. In the first scenario, our controlled experiments confirm that the technique is effective in distinguishing devices with similar hardware and software configurations, even when they are considered identical by current state-of-the-art fingerprinting algorithms. In the second scenario, we integrate a one-shot learning version of our technique into a state-of-the-art browser fingerprint tracking algorithm. We verify our technique through a large-scale experiment involving data collected from over 2,500 crowd-sourced devices over a period of several months and show it provides a boost of up to 67% to the median tracking duration, compared to the state-of-the-art method. DrawnApart makes two contributions to the state of the art in browser fingerprinting. On the conceptual front, it is the first work that explores the manufacturing differences between identical GPUs and the first to exploit these differences in a privacy context. On the practical front, it demonstrates a robust technique for distinguishing between machines with identical hardware and software configurations. △ Less

Submitted 24 January, 2022; originally announced January 2022.

Comments: Network and Distributed System Security Symposium, Feb 2022, San Diego, United States

arXiv:2109.11741 [pdf, other]

doi 10.1145/3460120.3485380

Rosita++: Automatic Higher-Order Leakage Elimination from Cryptographic Code

Authors: Madura A. Shelton, Łukasz Chmielewski, Niels Samwel, Markus Wagner, Lejla Batina, Yuval Yarom

Abstract: Side-channel attacks are a major threat to the security of cryptographic implementations, particularly for small devices that are under the physical control of the adversary. While several strategies for protecting against side-channel attacks exist, these often fail in practice due to unintended interactions between values deep within the CPU. To detect and protect from side-channel attacks, seve… ▽ More Side-channel attacks are a major threat to the security of cryptographic implementations, particularly for small devices that are under the physical control of the adversary. While several strategies for protecting against side-channel attacks exist, these often fail in practice due to unintended interactions between values deep within the CPU. To detect and protect from side-channel attacks, several automated tools have recently been proposed; one of their common limitations is that they only support first-order leakage. In this work, we present the first automated tool for detecting and eliminating higher-order leakage from cryptographic implementations. Rosita++ proposes statistical and software-based tools to allow high-performance higher-order leakage detection. It then uses the code rewrite engine of Rosita (Shelton et al. NDSS 2021) to eliminate detected leakage. For the sake of practicality we evaluate Rosita++ against second and third order leakage, but our framework is not restricted to only these orders. We evaluate Rosita++ against second-order leakage with three-share implementations of two ciphers, PRESENT and Xoodoo, and with the second-order Boolean-to-arithmetic masking, a core building block of masked implementations of many cryptographic primitives, including SHA-2, ChaCha and Blake. We show effective second-order leakage elimination at a performance cost of 36% for Xoodoo, 189% for PRESENT, and 29% for the Boolean-to-arithmetic masking. For third-order analysis, we evaluate Rosita++ against the third-order leakage using a four-share synthetic example that corresponds to typical four-share processing. Rosita++ correctly identified this leakage and applied code fixes. △ Less

Submitted 24 September, 2021; originally announced September 2021.

arXiv:2104.08593 [pdf, other]

SoK: Design Tools for Side-Channel-Aware Implementations

Authors: Ileana Buhan, Lejla Batina, Yuval Yarom, Patrick Schaumont

Abstract: Side-channel attacks that leak sensitive information through a computing device's interaction with its physical environment have proven to be a severe threat to devices' security, particularly when adversaries have unfettered physical access to the device. Traditional approaches for leakage detection measure the physical properties of the device. Hence, they cannot be used during the design proces… ▽ More Side-channel attacks that leak sensitive information through a computing device's interaction with its physical environment have proven to be a severe threat to devices' security, particularly when adversaries have unfettered physical access to the device. Traditional approaches for leakage detection measure the physical properties of the device. Hence, they cannot be used during the design process and fail to provide root cause analysis. An alternative approach that is gaining traction is to automate leakage detection by modeling the device. The demand to understand the scope, benefits, and limitations of the proposed tools intensifies with the increase in the number of proposals. In this SoK, we classify approaches to automated leakage detection based on the model's source of truth. We classify the existing tools on two main parameters: whether the model includes measurements from a concrete device and the abstraction level of the device specification used for constructing the model. We survey the proposed tools to determine the current knowledge level across the domain and identify open problems. In particular, we highlight the absence of evaluation methodologies and metrics that would compare proposals' effectiveness from across the domain. We believe that our results help practitioners who want to use automated leakage detection and researchers interested in advancing the knowledge and improving automated leakage detection. △ Less

Submitted 14 June, 2021; v1 submitted 17 April, 2021; originally announced April 2021.

arXiv:2103.04952 [pdf, other]

Prime+Probe 1, JavaScript 0: Overcoming Browser-based Side-Channel Defenses

Authors: Anatoly Shusterman, Ayush Agarwal, Sioli O'Connell, Daniel Genkin, Yossi Oren, Yuval Yarom

Abstract: The "eternal war in cache" has reached browsers, with multiple cache-based side-channel attacks and countermeasures being suggested. A common approach for countermeasures is to disable or restrict JavaScript features deemed essential for carrying out attacks. To assess the effectiveness of this approach, in this work we seek to identify those JavaScript features which are essential for carrying ou… ▽ More The "eternal war in cache" has reached browsers, with multiple cache-based side-channel attacks and countermeasures being suggested. A common approach for countermeasures is to disable or restrict JavaScript features deemed essential for carrying out attacks. To assess the effectiveness of this approach, in this work we seek to identify those JavaScript features which are essential for carrying out a cache-based attack. We develop a sequence of attacks with progressively decreasing dependency on JavaScript features, culminating in the first browser-based side-channel attack which is constructed entirely from Cascading Style Sheets (CSS) and HTML, and works even when script execution is completely blocked. We then show that avoiding JavaScript features makes our techniques architecturally agnostic, resulting in microarchitectural website fingerprinting attacks that work across hardware platforms including Intel Core, AMD Ryzen, Samsung Exynos, and Apple M1 architectures. As a final contribution, we evaluate our techniques in hardened browser environments including the Tor browser, Deter-Fox (Cao el al., CCS 2017), and Chrome Zero (Schwartz et al., NDSS 2018). We confirm that none of these approaches completely defend against our attacks. We further argue that the protections of Chrome Zero need to be more comprehensively applied, and that the performance and user experience of Chrome Zero will be severely degraded if this approach is taken. △ Less

Submitted 8 March, 2021; originally announced March 2021.

arXiv:2007.08707 [pdf, other]

PThammer: Cross-User-Kernel-Boundary Rowhammer through Implicit Accesses

Authors: Zhi Zhang, Yueqiang Cheng, Dongxi Liu, Surya Nepal, Zhi Wang, Yuval Yarom

Abstract: Rowhammer is a hardware vulnerability in DRAM memory, where repeated access to memory can induce bit flips in neighboring memory locations. Being a hardware vulnerability, rowhammer bypasses all of the system memory protection, allowing adversaries to compromise the integrity and confidentiality of data. Rowhammer attacks have shown to enable privilege escalation, sandbox escape, and cryptographic… ▽ More Rowhammer is a hardware vulnerability in DRAM memory, where repeated access to memory can induce bit flips in neighboring memory locations. Being a hardware vulnerability, rowhammer bypasses all of the system memory protection, allowing adversaries to compromise the integrity and confidentiality of data. Rowhammer attacks have shown to enable privilege escalation, sandbox escape, and cryptographic key disclosures. Recently, several proposals suggest exploiting the spatial proximity between the accessed memory location and the location of the bit flip for a defense against rowhammer. These all aim to deny the attacker's permission to access memory locations near sensitive data. In this paper, we question the core assumption underlying these defenses. We present PThammer, a confused-deputy attack that causes accesses to memory locations that the attacker is not allowed to access. Specifically, PThammer exploits the address translation process of modern processors, inducing the processor to generate frequent accesses to protected memory locations. We implement PThammer, demonstrating that it is a viable attack, resulting in a system compromise (e.g., kernel privilege escalation). We further evaluate the effectiveness of proposed software-only defenses showing that PThammer can overcome those. △ Less

Submitted 23 July, 2020; v1 submitted 16 July, 2020; originally announced July 2020.

Comments: Preprint of the work accepted at the International Symposium on Microarchitecture (MICRO) 2020. arXiv admin note: text overlap with arXiv:1912.03076

arXiv:2006.13353 [pdf, other]

CacheOut: Leaking Data on Intel CPUs via Cache Evictions

Authors: Stephan van Schaik, Marina Minkin, Andrew Kwong, Daniel Genkin, Yuval Yarom

Abstract: Recent transient-execution attacks, such as RIDL, Fallout, and ZombieLoad, demonstrated that attackers can leak information while it transits through microarchitectural buffers. Named Microarchitectural Data Sampling (MDS) by Intel, these attacks are likened to "drinking from the firehose", as the attacker has little control over what data is observed and from what origin. Unable to prevent the bu… ▽ More Recent transient-execution attacks, such as RIDL, Fallout, and ZombieLoad, demonstrated that attackers can leak information while it transits through microarchitectural buffers. Named Microarchitectural Data Sampling (MDS) by Intel, these attacks are likened to "drinking from the firehose", as the attacker has little control over what data is observed and from what origin. Unable to prevent the buffers from leaking, Intel issued countermeasures via microcode updates that overwrite the buffers when the CPU changes security domains. In this work we present CacheOut, a new microarchitectural attack that is capable of bypassing Intel's buffer overwrite countermeasures. We observe that as data is being evicted from the CPU's L1 cache, it is often transferred back to the leaky CPU buffers where it can be recovered by the attacker. CacheOut improves over previous MDS attacks by allowing the attacker to choose which data to leak from the CPU's L1 cache, as well as which part of a cache line to leak. We demonstrate that CacheOut can leak information across multiple security boundaries, including those between processes, virtual machines, user and kernel space, and from SGX enclaves. △ Less

Submitted 23 June, 2020; originally announced June 2020.

arXiv:1912.05183 [pdf, other]

doi 10.14722/ndss.2021.23137

Rosita: Towards Automatic Elimination of Power-Analysis Leakage in Ciphers

Authors: Madura A Shelton, Niels Samwel, Lejla Batina, Francesco Regazzoni, Markus Wagner, Yuval Yarom

Abstract: Since their introduction over two decades ago, side-channel attacks have presented a serious security threat. While many ciphers' implementations employ masking techniques to protect against such attacks, they often leak secret information due to unintended interactions in the hardware. We present Rosita, a code rewrite engine that uses a leakage emulator which we amend to correctly emulate the mi… ▽ More Since their introduction over two decades ago, side-channel attacks have presented a serious security threat. While many ciphers' implementations employ masking techniques to protect against such attacks, they often leak secret information due to unintended interactions in the hardware. We present Rosita, a code rewrite engine that uses a leakage emulator which we amend to correctly emulate the micro-architecture of a target system. We use Rosita to automatically protect masked implementations of AES, ChaCha, and Xoodoo. For AES and Xoodoo, we show the absence of observable leakage at 1,000,000 traces with less than 21% penalty to the performance. For ChaCha, which has significantly more leakage, Rosita eliminates over 99% of the leakage, at a performance cost of 64%. △ Less

Submitted 19 November, 2020; v1 submitted 11 December, 2019; originally announced December 2019.

Comments: 17 pages, 16 figures. Accepted in Network and Distributed Systems Security (NDSS) Symposium 2021

arXiv:1905.12701 [pdf, other]

Fallout: Reading Kernel Writes From User Space

Authors: Marina Minkin, Daniel Moghimi, Moritz Lipp, Michael Schwarz, Jo Van Bulck, Daniel Genkin, Daniel Gruss, Frank Piessens, Berk Sunar, Yuval Yarom

Abstract: Recently, out-of-order execution, an important performance optimization in modern high-end processors, has been revealed to pose a significant security threat, allowing information leaks across security domains. In particular, the Meltdown attack leaks information from the operating system kernel to user space, completely eroding the security of the system. To address this and similar attacks, wit… ▽ More Recently, out-of-order execution, an important performance optimization in modern high-end processors, has been revealed to pose a significant security threat, allowing information leaks across security domains. In particular, the Meltdown attack leaks information from the operating system kernel to user space, completely eroding the security of the system. To address this and similar attacks, without incurring the performance costs of software countermeasures, Intel includes hardware-based defenses in its recent Coffee Lake R processors. In this work, we show that the recent hardware defenses are not sufficient. Specifically, we present Fallout, a new transient execution attack that leaks information from a previously unexplored microarchitectural component called the store buffer. We show how unprivileged user processes can exploit Fallout to reconstruct privileged information recently written by the kernel. We further show how Fallout can be used to bypass kernel address space randomization. Finally, we identify and explore microcode assists as a hitherto ignored cause of transient execution. Fallout affects all processor generations we have tested. However, we notice a worrying regression, where the newer Coffee Lake R processors are more vulnerable to Fallout than older generations. △ Less

Submitted 29 May, 2019; originally announced May 2019.

arXiv:1811.07153 [pdf, other]

Robust Website Fingerprinting Through the Cache Occupancy Channel

Authors: Anatoly Shusterman, Lachlan Kang, Yarden Haskal, Yosef Meltser, Prateek Mittal, Yossi Oren, Yuval Yarom

Abstract: Website fingerprinting attacks, which use statistical analysis on network traffic to compromise user privacy, have been shown to be effective even if the traffic is sent over anonymity-preserving networks such as Tor. The classical attack model used to evaluate website fingerprinting attacks assumes an on-path adversary, who can observe all traffic traveling between the user's computer and the Tor… ▽ More Website fingerprinting attacks, which use statistical analysis on network traffic to compromise user privacy, have been shown to be effective even if the traffic is sent over anonymity-preserving networks such as Tor. The classical attack model used to evaluate website fingerprinting attacks assumes an on-path adversary, who can observe all traffic traveling between the user's computer and the Tor network. In this work we investigate these attacks under a different attack model, in which the adversary is capable of running a small amount of unprivileged code on the target user's computer. Under this model, the attacker can mount cache side-channel attacks, which exploit the effects of contention on the CPU's cache, to identify the website being browsed. In an important special case of this attack model, a JavaScript attack is launched when the target user visits a website controlled by the attacker. The effectiveness of this attack scenario has never been systematically analyzed, especially in the open-world model which assumes that the user is visiting a mix of both sensitive and non-sensitive sites. In this work we show that cache website fingerprinting attacks in JavaScript are highly feasible, even when they are run from highly restrictive environments, such as the Tor Browser. Specifically, we use machine learning techniques to classify traces of cache activity. Unlike prior works, which try to identify cache conflicts, our work measures the overall occupancy of the last-level cache. We show that our approach achieves high classification accuracy in both the open-world and the closed-world models. We further show that our techniques are resilient both to network-based defenses and to side-channel countermeasures introduced to modern browsers as a response to the Spectre attack. △ Less

Submitted 21 February, 2019; v1 submitted 17 November, 2018; originally announced November 2018.

arXiv:1810.05345 [pdf, other]

Time Protection: the Missing OS Abstraction

Authors: Qian Ge, Yuval Yarom, Tom Chothia, Gernot Heiser

Abstract: Timing channels enable data leakage that threatens the security of computer systems, from cloud platforms to smartphones and browsers executing untrusted third-party code. Preventing unauthorised information flow is a core duty of the operating system, however, present OSes are unable to prevent timing channels. We argue that OSes must provide time protection in addition to the established memory… ▽ More Timing channels enable data leakage that threatens the security of computer systems, from cloud platforms to smartphones and browsers executing untrusted third-party code. Preventing unauthorised information flow is a core duty of the operating system, however, present OSes are unable to prevent timing channels. We argue that OSes must provide time protection in addition to the established memory protection. We examine the requirements of time protection, present a design and its implementation in the seL4 microkernel, and evaluate its efficacy as well as performance overhead on Arm and x86 processors. △ Less

Submitted 15 October, 2018; v1 submitted 11 October, 2018; originally announced October 2018.

arXiv:1801.01207 [pdf, other]

Meltdown

Authors: Moritz Lipp, Michael Schwarz, Daniel Gruss, Thomas Prescher, Werner Haas, Stefan Mangard, Paul Kocher, Daniel Genkin, Yuval Yarom, Mike Hamburg

Abstract: The security of computer systems fundamentally relies on memory isolation, e.g., kernel address ranges are marked as non-accessible and are protected from user access. In this paper, we present Meltdown. Meltdown exploits side effects of out-of-order execution on modern processors to read arbitrary kernel-memory locations including personal data and passwords. Out-of-order execution is an indispen… ▽ More The security of computer systems fundamentally relies on memory isolation, e.g., kernel address ranges are marked as non-accessible and are protected from user access. In this paper, we present Meltdown. Meltdown exploits side effects of out-of-order execution on modern processors to read arbitrary kernel-memory locations including personal data and passwords. Out-of-order execution is an indispensable performance feature and present in a wide range of modern processors. The attack works on different Intel microarchitectures since at least 2010 and potentially other processors are affected. The root cause of Meltdown is the hardware. The attack is independent of the operating system, and it does not rely on any software vulnerabilities. Meltdown breaks all security assumptions given by address space isolation as well as paravirtualized environments and, thus, every security mechanism building upon this foundation. On affected systems, Meltdown enables an adversary to read memory of other processes or virtual machines in the cloud without any permissions or privileges, affecting millions of customers and virtually every user of a personal computer. We show that the KAISER defense mechanism for KASLR has the important (but inadvertent) side effect of impeding Meltdown. We stress that KAISER must be deployed immediately to prevent large-scale exploitation of this severe information leakage. △ Less

Submitted 3 January, 2018; originally announced January 2018.

arXiv:1801.01203 [pdf, ps, other]

Spectre Attacks: Exploiting Speculative Execution

Authors: Paul Kocher, Daniel Genkin, Daniel Gruss, Werner Haas, Mike Hamburg, Moritz Lipp, Stefan Mangard, Thomas Prescher, Michael Schwarz, Yuval Yarom

Abstract: Modern processors use branch prediction and speculative execution to maximize performance. For example, if the destination of a branch depends on a memory value that is in the process of being read, CPUs will try guess the destination and attempt to execute ahead. When the memory value finally arrives, the CPU either discards or commits the speculative computation. Speculative logic is unfaithful… ▽ More Modern processors use branch prediction and speculative execution to maximize performance. For example, if the destination of a branch depends on a memory value that is in the process of being read, CPUs will try guess the destination and attempt to execute ahead. When the memory value finally arrives, the CPU either discards or commits the speculative computation. Speculative logic is unfaithful in how it executes, can access to the victim's memory and registers, and can perform operations with measurable side effects. Spectre attacks involve inducing a victim to speculatively perform operations that would not occur during correct program execution and which leak the victim's confidential information via a side channel to the adversary. This paper describes practical attacks that combine methodology from side channel attacks, fault attacks, and return-oriented programming that can read arbitrary memory from the victim's process. More broadly, the paper shows that speculative execution implementations violate the security assumptions underpinning numerous software security mechanisms, including operating system process separation, static analysis, containerization, just-in-time (JIT) compilation, and countermeasures to cache timing/side-channel attacks. These attacks represent a serious threat to actual systems, since vulnerable speculative execution capabilities are found in microprocessors from Intel, AMD, and ARM that are used in billions of devices. While makeshift processor-specific countermeasures are possible in some cases, sound solutions will require fixes to processor designs as well as updates to instruction set architectures (ISAs) to give hardware architects and software developers a common understanding as to what computation state CPU implementations are (and are not) permitted to leak. △ Less

Submitted 3 January, 2018; originally announced January 2018.

arXiv:1710.00551 [pdf, other]

Another Flip in the Wall of Rowhammer Defenses

Authors: Daniel Gruss, Moritz Lipp, Michael Schwarz, Daniel Genkin, Jonas Juffinger, Sioli O'Connell, Wolfgang Schoechl, Yuval Yarom

Abstract: The Rowhammer bug allows unauthorized modification of bits in DRAM cells from unprivileged software, enabling powerful privilege-escalation attacks. Sophisticated Rowhammer countermeasures have been presented, aiming at mitigating the Rowhammer bug or its exploitation. However, the state of the art provides insufficient insight on the completeness of these defenses. In this paper, we present novel… ▽ More The Rowhammer bug allows unauthorized modification of bits in DRAM cells from unprivileged software, enabling powerful privilege-escalation attacks. Sophisticated Rowhammer countermeasures have been presented, aiming at mitigating the Rowhammer bug or its exploitation. However, the state of the art provides insufficient insight on the completeness of these defenses. In this paper, we present novel Rowhammer attack and exploitation primitives, showing that even a combination of all defenses is ineffective. Our new attack technique, one-location hammering, breaks previous assumptions on requirements for triggering the Rowhammer bug, i.e., we do not hammer multiple DRAM rows but only keep one DRAM row constantly open. Our new exploitation technique, opcode flip**, bypasses recent isolation mechanisms by flip** bits in a predictable and targeted way in userspace binaries. We replace conspicuous and memory-exhausting spraying and grooming techniques with a novel reliable technique called memory waylaying. Memory waylaying exploits system-level optimizations and a side channel to coax the operating system into placing target pages at attacker-chosen physical locations. Finally, we abuse Intel SGX to hide the attack entirely from the user and the operating system, making any inspection or detection of the attack infeasible. Our Rowhammer enclave can be used for coordinated denial-of-service attacks in the cloud and for privilege escalation on personal computers. We demonstrate that our attacks evade all previously proposed countermeasures for commodity systems. △ Less

Submitted 31 January, 2018; v1 submitted 2 October, 2017; originally announced October 2017.

Comments: Preprint of the work accepted at the 39th IEEE Symposium on Security and Privacy 2018

arXiv:1612.04474 [pdf, other]

Your Processor Leaks Information - and There's Nothing You Can Do About It

Authors: Qian Ge, Yuval Yarom, Frank Li, Gernot Heiser

Abstract: Timing channels are information flows, encoded in the relative timing of events, that bypass the system's protection mechanisms. Any microarchitectural state that depends on execution history and affects the rate of progress of later executions potentially establishes a timing channel, unless explicit steps are taken to close it. Such state includes CPU caches, TLBs, branch predictors and prefetch… ▽ More Timing channels are information flows, encoded in the relative timing of events, that bypass the system's protection mechanisms. Any microarchitectural state that depends on execution history and affects the rate of progress of later executions potentially establishes a timing channel, unless explicit steps are taken to close it. Such state includes CPU caches, TLBs, branch predictors and prefetchers; removing the channels requires that the OS can partition such state or flush it on a switch of security domains. We measure the capacities of channels based on these microarchitectural features on several generations of processors across the two mainstream ISAs, x86 and ARM, and investigate the effectiveness of the flushing mechanisms provided by the respective ISA.We find that in all processors we studied, at least one significant channel remains. This implies that closing all timing channels seems impossible on contemporary mainstream processors. △ Less

Submitted 14 September, 2017; v1 submitted 13 December, 2016; originally announced December 2016.

Showing 1–24 of 24 results for author: Yarom, Y