Search | arXiv e-print repository

Okapi: Efficiently Safeguarding Speculative Data Accesses in Sandboxed Environments

Authors: Philipp Schmitz, Tobias Jauch, Alex Wezel, Mohammad R. Fadiheh, Thore Tiemann, Jonah Heller, Thomas Eisenbarth, Dominik Stoffel, Wolfgang Kunz

Abstract: This paper introduces Okapi, a new hardware/software cross-layer architecture designed to mitigate Transient Execution Side Channel (TES) attacks in modern computing systems. Okapi enforces sandboxing for speculative execution, providing a hardware basis that can replace expensive speculation barriers in software. At its core, Okapi allows for speculative data accesses to a memory page only afte… ▽ More This paper introduces Okapi, a new hardware/software cross-layer architecture designed to mitigate Transient Execution Side Channel (TES) attacks in modern computing systems. Okapi enforces sandboxing for speculative execution, providing a hardware basis that can replace expensive speculation barriers in software. At its core, Okapi allows for speculative data accesses to a memory page only after the page has been accessed non-speculatively at least once by the current trust domain. The granularity of the trust domains can be controlled in software to achieve different security and performance trade-offs. For environments with less stringent security needs, Okapi's features can be deactivated to remove all performance overhead. Without relying on any software modification, the Okapi hardware features already provide full protection against TES breakout attacks at a thread-level granularity. This incurs an average performance overhead of only 3.17% for the SPEC CPU2017 benchmark suite. Okapi introduces the OkapiReset instruction for additional software-level security support. This instruction allows for fine-grained sandboxing with custom program sizes smaller than a thread, resulting in 1.68% performance overhead in our WebAssembly runtime experiment. On top, Okapi provides the possibility to eliminate poisoning attacks. For the highest level of security, the OkapiLoad instruction prevents confidential data from being added to the trust domain after a sequential access, thereby enforcing weak speculative non-interference. In addition, we present a hardware extension that limits the exploitable code space for Spectre gadgets to well-defined sections of the program. By ensuring the absence of gadgets in these sections, developers can tailor their software towards achieving beneficial trade-offs between the size of a trust domain and performance. △ Less

Submitted 23 April, 2024; v1 submitted 13 December, 2023; originally announced December 2023.

arXiv:2312.06515 [pdf, other]

A Golden-Free Formal Method for Trojan Detection in Non-Interfering Accelerators

Authors: Anna Lena Duque Antón, Johannes Müller, Lucas Deutschmann, Mohammad Rahmani Fadiheh, Dominik Stoffel, Wolfgang Kunz

Abstract: The threat of hardware Trojans (HTs) in security-critical IPs like cryptographic accelerators poses severe security risks. The HT detection methods available today mostly rely on golden models and detailed circuit specifications. Often they are specific to certain HT payload types, making pre-silicon verification difficult and leading to security gaps. We propose a novel formal verification method… ▽ More The threat of hardware Trojans (HTs) in security-critical IPs like cryptographic accelerators poses severe security risks. The HT detection methods available today mostly rely on golden models and detailed circuit specifications. Often they are specific to certain HT payload types, making pre-silicon verification difficult and leading to security gaps. We propose a novel formal verification method for HT detection in non-interfering accelerators at the Register Transfer Level (RTL), employing standard formal property checking. Our method guarantees the exhaustive detection of any sequential HT independently of its payload behavior, including physical side channels. It does not require a golden model or a functional specification of the design. The experimental results demonstrate efficient and effective detection of all sequential HTs in accelerators available on Trust-Hub, including those with complex triggers and payloads. △ Less

Submitted 11 December, 2023; originally announced December 2023.

Comments: Accepted for publication at the 27th Design, Automation and Test in Europe Conference (DATE'24), Mar 25-27 2024, Valencia, Spain

arXiv:2309.12925 [pdf, other]

A New Security Threat in MCUs -- SoC-wide timing side channels and how to find them

Authors: Johannes Müller, Anna Lena Duque Antón, Lucas Deutschmann, Dino Mehmedagić, Mohammad Rahmani Fadiheh, Dominik Stoffel, Wolfgang Kunz

Abstract: Microarchitectural timing side channels have been thoroughly investigated as a security threat in hardware designs featuring shared buffers (e.g., caches) and/or parallelism between attacker and victim task execution. Contradicting common intuitions, recent activities demonstrate, however, that this threat is real also in microcontroller SoCs without such features. In this paper, we describe SoC-w… ▽ More Microarchitectural timing side channels have been thoroughly investigated as a security threat in hardware designs featuring shared buffers (e.g., caches) and/or parallelism between attacker and victim task execution. Contradicting common intuitions, recent activities demonstrate, however, that this threat is real also in microcontroller SoCs without such features. In this paper, we describe SoC-wide timing side channels previously neglected by security analysis and present a new formal method to close this gap. In a case study with the RISC-V Pulpissimo SoC platform, our method found a vulnerability to a so far unknown attack variant that allows an attacker to obtain information about a victim's memory access behavior. After implementing a conservative fix, we were able to verify that the SoC is now secure w.r.t. timing side channels. △ Less

Submitted 22 September, 2023; originally announced September 2023.

arXiv:2308.07757 [pdf, other]

doi 10.1109/TCAD.2024.3374249

A Scalable Formal Verification Methodology for Data-Oblivious Hardware

Authors: Lucas Deutschmann, Johannes Mueller, Mohammad Rahmani Fadiheh, Dominik Stoffel, Wolfgang Kunz

Abstract: The importance of preventing microarchitectural timing side channels in security-critical applications has surged in recent years. Constant-time programming has emerged as a best-practice technique for preventing the leakage of secret information through timing. It is based on the assumption that the timing of certain basic machine instructions is independent of their respective input data. Howeve… ▽ More The importance of preventing microarchitectural timing side channels in security-critical applications has surged in recent years. Constant-time programming has emerged as a best-practice technique for preventing the leakage of secret information through timing. It is based on the assumption that the timing of certain basic machine instructions is independent of their respective input data. However, whether or not an instruction satisfies this data-independent timing criterion varies between individual processor microarchitectures. In this paper, we propose a novel methodology to formally verify data-oblivious behavior in hardware using standard property checking techniques. The proposed methodology is based on an inductive property that enables scalability even to complex out-of-order cores. We show that proving this inductive property is sufficient to exhaustively verify data-obliviousness at the microarchitectural level. In addition, the paper discusses several techniques that can be used to make the verification process easier and faster. We demonstrate the feasibility of the proposed methodology through case studies on several open-source designs. One case study uncovered a data-dependent timing violation in the extensively verified and highly secure IBEX RISC-V core. In addition to several hardware accelerators and in-order processors, our experiments also include RISC-V BOOM, a complex out-of-order processor, highlighting the scalability of the approach. △ Less

Submitted 11 March, 2024; v1 submitted 15 August, 2023; originally announced August 2023.

arXiv:2209.14021 [pdf, other]

doi 10.1145/3565053.3565059

A Framework for Formal Verification of DRAM Controllers

Authors: Lukas Steiner, Chirag Sudarshan, Matthias Jung, Dominik Stoffel, Norbert Wehn

Abstract: The large number of recent JEDEC DRAM standard releases and their increasing feature set makes it difficult for designers to rapidly upgrade the memory controller IPs to each new standard. Especially the hardware verification is challenging due to the higher protocol complexity of standards like DDR5, LPDDR5 or HBM3 in comparison with their predecessors. With traditional simulation-based verificat… ▽ More The large number of recent JEDEC DRAM standard releases and their increasing feature set makes it difficult for designers to rapidly upgrade the memory controller IPs to each new standard. Especially the hardware verification is challenging due to the higher protocol complexity of standards like DDR5, LPDDR5 or HBM3 in comparison with their predecessors. With traditional simulation-based verification it is laborious to guarantee the coverage of all possible states, especially for control flow rich memory controllers. This has a direct impact on the time-to-market. A promising alternative is formal verification because it allows to ensure protocol compliance based on mathematical proofs. However, with regard to memory controllers no fully-automated verification process has been presented in the state-of-the-art yet, which means there is still a potential risk of human error. In this paper we present a framework that automatically generates SystemVerilog Assertions for a DRAM protocol. In addition, we show how the framework can be used efficiently for different tasks of memory controller development. △ Less

Submitted 28 September, 2022; originally announced September 2022.

Comments: ACM/IEEE International Symposium on Memory Systems (MEMSYS 2022)

arXiv:2108.01979 [pdf, other]

doi 10.1109/TC.2022.3152666

An Exhaustive Approach to Detecting Transient Execution Side Channels in RTL Designs of Processors

Authors: Mohammad Rahmani Fadiheh, Alex Wezel, Johannes Mueller, Joerg Bormann, Sayak Ray, Jason M. Fung, Subhasish Mitra, Dominik Stoffel, Wolfgang Kunz

Abstract: Hardware (HW) security issues have been emerging at an alarming rate in recent years. Transient execution attacks, in particular, pose a genuine threat to the security of modern computing systems. Despite recent advances, understanding the intricate implications of microarchitectural design decisions on processor security remains a great challenge and has caused a number of update cycles in the pa… ▽ More Hardware (HW) security issues have been emerging at an alarming rate in recent years. Transient execution attacks, in particular, pose a genuine threat to the security of modern computing systems. Despite recent advances, understanding the intricate implications of microarchitectural design decisions on processor security remains a great challenge and has caused a number of update cycles in the past. number of update cycles in the past. This papers addresses the need for a new approach to HW sign-off verification which guarantees the security of processors at the Register Transfer Level (RTL). To this end, we introduce a formal definition of security with respect to transient execution attacks, formulated as a HW property. We present a formal proof methodology based on Unique Program Execution Checking (UPEC) which can be used to systematically detect all vulnerabilities to transient execution attacks in RTL designs. UPEC does not exploit any a priori knowledge on known attacks and can therefore detect also vulnerabilities based on new, so far unknown, types of channels. This is demonstrated by two new attack scenarios discovered in our experiments with UPEC. UPEC scales to a wide range of HW designs, including in-order processors (RocketChip), pipelines with out-of-order writeback (Ariane), and processors with deep out-of-order speculative execution (BOOM). To the best of our knowledge, UPEC is the first RTL verification technique that exhaustively covers transient execution side channels in processors of realistic complexity. △ Less

Submitted 22 February, 2022; v1 submitted 4 August, 2021; originally announced August 2021.

arXiv:2106.10392 [pdf]

Effective Pre-Silicon Verification of Processor Cores by Breaking the Bounds of Symbolic Quick Error Detection

Authors: Karthik Ganesan, Florian Lonsing, Srinivasa Shashank Nuthakki, Eshan Singh, Mohammad Rahmani Fadiheh, Wolfgang Kunz, Dominik Stoffel, Clark Barrett, Subhasish Mitra

Abstract: We present a novel approach to pre-silicon verification of processor designs. The purpose of pre-silicon verification is to find logic bugs in a design at an early stage and thus avoid time- and cost-intensive post-silicon debugging. Our approach relies on symbolic quick error detection (Symbolic QED, or SQED). SQED is targeted at finding logic bugs in a symbolic representation of a design by comb… ▽ More We present a novel approach to pre-silicon verification of processor designs. The purpose of pre-silicon verification is to find logic bugs in a design at an early stage and thus avoid time- and cost-intensive post-silicon debugging. Our approach relies on symbolic quick error detection (Symbolic QED, or SQED). SQED is targeted at finding logic bugs in a symbolic representation of a design by combining bounded model checking (BMC) with QED tests. QED tests are powerful in generating short sequences of instructions (traces) that trigger bugs. We extend an existing SQED approach with symbolic starting states. This way, we enable the BMC tool to select starting states arbitrarily when generating a trace. To avoid false positives, (e.g., traces starting in unreachable states that may not be-have in accordance with the processor instruction-set architecture), we define constraints to restrict the set of possible starting states. We demonstrate that these constraints, togeth-er with reasonable assumptions about the system behavior, allow us to avoid false positives. Using our approach, we discovered previously unknown bugs in open-source RISC-V processor cores that existing methods cannot detect. Moreover, our novel approach out-performs existing ones in the detection of bugs having long traces and in the detection of hardware Trojans, i.e., unauthorized modifications of a design. △ Less

Submitted 18 June, 2021; originally announced June 2021.

Comments: This article has the full author list which was missing in arXiv:1908.06757. arXiv admin note: substantial text overlap with arXiv:1908.06757

arXiv:2004.14191 [pdf, other]

Efficient Binary-Level Coverage Analysis

Authors: M. Ammar Ben Khadra, Dominik Stoffel, Wolfgang Kunz

Abstract: Code coverage analysis plays an important role in the software testing process. More recently, the remarkable effectiveness of coverage feedback has triggered a broad interest in feedback-guided fuzzing. In this work, we introduce bcov, a tool for binary-level coverage analysis. Our tool statically instruments x86-64 binaries in the ELF format without compiler support. We implement several techniq… ▽ More Code coverage analysis plays an important role in the software testing process. More recently, the remarkable effectiveness of coverage feedback has triggered a broad interest in feedback-guided fuzzing. In this work, we introduce bcov, a tool for binary-level coverage analysis. Our tool statically instruments x86-64 binaries in the ELF format without compiler support. We implement several techniques to improve efficiency and scale to large real-world software. First, we bring Agrawal's probe pruning technique to binary-level instrumentation and effectively leverage its superblocks to reduce overhead. Second, we introduce sliced microexecution, a robust technique for jump table analysis which improves CFG precision and enables us to instrument jump table entries. Additionally, smaller instructions in x86-64 pose a challenge for inserting detours. To address this challenge, we aggressively exploit padding bytes and systematically host detours in neighboring basic blocks. We evaluate bcov on a corpus of 95 binaries compiled from eight popular and well-tested packages like FFmpeg and LLVM. Two instrumentation policies, with different edge-level precision, are used to patch all functions in this corpus - over 1.6 million functions. Our precise policy has average performance and memory overheads of 14% and 22% respectively. Instrumented binaries do not introduce any test regressions. The reported coverage is highly accurate with an average F-score of 99.86%. Finally, our jump table analysis is comparable to that of IDA Pro on gcc binaries and outperforms it on clang binaries. △ Less

Submitted 10 September, 2020; v1 submitted 29 April, 2020; originally announced April 2020.

Comments: Accepted to the ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering - ESEC/FSE 2020

ACM Class: D.2.5

arXiv:1902.01494 [pdf]

doi 10.23919/DATE.2019.8715271

Symbolic QED Pre-silicon Verification for Automotive Microcontroller Cores: Industrial Case Study

Authors: Eshan Singh, Keerthikumara Devarajegowda, Sebastian Simon, Ralf Schnieder, Karthik Ganesan, Mohammad R. Fadiheh, Dominik Stoffel, Wolfgang Kunz, Clark Barrett, Wolfgang Ecker, Subhasish Mitra

Abstract: We present an industrial case study that demonstrates the practicality and effectiveness of Symbolic Quick Error Detection (Symbolic QED) in detecting logic design flaws (logic bugs) during pre-silicon verification. Our study focuses on several microcontroller core designs (~1,800 flip-flops, ~70,000 logic gates) that have been extensively verified using an industrial verification flow and used fo… ▽ More We present an industrial case study that demonstrates the practicality and effectiveness of Symbolic Quick Error Detection (Symbolic QED) in detecting logic design flaws (logic bugs) during pre-silicon verification. Our study focuses on several microcontroller core designs (~1,800 flip-flops, ~70,000 logic gates) that have been extensively verified using an industrial verification flow and used for various commercial automotive products. The results of our study are as follows: 1. Symbolic QED detected all logic bugs in the designs that were detected by the industrial verification flow (which includes various flavors of simulation-based verification and formal verification). 2. Symbolic QED detected additional logic bugs that were not recorded as detected by the industrial verification flow. (These additional bugs were also perhaps detected by the industrial verification flow.) 3. Symbolic QED enables significant design productivity improvements: (a) 8X improved (i.e., reduced) verification effort for a new design (8 person-weeks for Symbolic QED vs. 17 person-months using the industrial verification flow). (b) 60X improved verification effort for subsequent designs (2 person-days for Symbolic QED vs. 4-7 person-months using the industrial verification flow). (c) Quick bug detection (runtime of 20 seconds or less), together with short counterexamples (10 or fewer instructions) for quick debug, using Symbolic QED. △ Less

Submitted 4 February, 2019; originally announced February 2019.

arXiv:1812.04975 [pdf, other]

Processor Hardware Security Vulnerabilities and their Detection by Unique Program Execution Checking

Authors: Mohammad Rahmani Fadiheh, Dominik Stoffel, Clark Barrett, Subhasish Mitra, Wolfgang Kunz

Abstract: Recent discovery of security attacks in advanced processors, known as Spectre and Meltdown, has resulted in high public alertness about security of hardware. The root cause of these attacks is information leakage across "covert channels" that reveal secret data without any explicit information flow between the secret and the attacker. Many sources believe that such covert channels are intrinsic to… ▽ More Recent discovery of security attacks in advanced processors, known as Spectre and Meltdown, has resulted in high public alertness about security of hardware. The root cause of these attacks is information leakage across "covert channels" that reveal secret data without any explicit information flow between the secret and the attacker. Many sources believe that such covert channels are intrinsic to highly advanced processor architectures based on speculation and out-of-order execution, suggesting that such security risks can be avoided by staying away from high-end processors. This paper, however, shows that the problem is of wider scope: we present new classes of covert channel attacks which are possible in average-complexity processors with in-order pipelining, as they are mainstream in applications ranging from Internet-of-Things to Autonomous Systems. We present a new approach as a foundation for remedy against covert channels: while all previous attacks were found by clever thinking of human attackers, this paper presents an automated and exhaustive method called "Unique Program Execution Checking" which detects and locates vulnerabilities to covert channels systematically, including those to covert channels unknown so far. △ Less

Submitted 5 December, 2018; originally announced December 2018.

Showing 1–10 of 10 results for author: Stoffel, D