License: CC BY 4.0
arXiv:2309.05740v2 [cs.CR] 24 Mar 2024

ReverSim: A Game-Based Environment to Study Human Aspects in Hardware Reverse Engineering

Steffen Becker*†, René Walendy*†, Markus Weber*, Carina Wiesen*†, Nikol Rummel*, and Christof Paar†
*Ruhr University Bochum †Max Planck Institute for Security and Privacy
Abstract

Hardware Reverse Engineering (HRE) is a technique for analyzing Integrated Circuits. Experts employ HRE for security-critical tasks, such as detecting Trojans or intellectual property violations. They rely not only on their experience and customized tools but also on their cognitive abilities. Conducting controlled experiments to assess the cognitive processes involved in HRE can open new avenues for hardware protection. However, HRE experts are largely unavailable for empirical research in real-world settings. To address this challenge, we have developed ReverSim, a game-based environment that mimics realistic HRE subprocesses and can integrate standardized cognitive tests. ReverSim enables quantitative studies with easier-to-recruit non-experts to uncover cognitive factors relevant to HRE, which can subsequently be validated with small expert samples. To evaluate the design of ReverSim, the minimum requirements for successful participation, and its measurement capabilities, we conducted two studies: First, we performed semi-structured interviews with 14 professionals and researchers from the HRE domain, who attested to the comparability of ReverSim to real-world HRE problems. Second, we conducted an online user study with 109 participants, demonstrating that they could engage in ReverSim with low domain-specific prior knowledge. We provide refined screening criteria, derive fine-grained performance metrics, and successfully perform a cognitive test for mental speed in ReverSim, thus contributing an important piece of the puzzle for the development of innovative hardware protection mechanisms.

1 Introduction

Understanding the inner workings of Integrated Circuits (ICs), also known as Hardware Reverse Engineering (HRE), is of great importance for various security-critical tasks, e. g., detecting manipulations such as hardware Trojans, checking for intellectual property infringements, or verifying cryptographic hardware implementations [23]. All of these tasks have the common goal of increasing trust in Integrated Circuits and, thus, in a vast range of digital systems from traditional computers to autonomous cars and medical implants. The recently passed European Chips Act [10] and the US CHIPS and Science Act [28] are massive investments with the ultimate goal to increase trust in ICs. Despite its technical and political importance, HRE is a relatively poorly understood process [11].

In particular, because HRE processes cannot be fully automated, human interaction is critical to their successful execution [11]. A few recent exploratory studies have already investigated which sensemaking [4] or problem-solving strategies [44] and cognitive factors [4] seem to be relevant in HRE. Although these preliminary results point to important initial insights – such as a potential correlation between HRE performance and working memory – the explanatory power of those studies is limited by a methodological problem that we address through the development of a new research method presented in this paper.

Methodological Challenge.

Researchers who aim to conduct studies exploring human aspects in HRE face the methodological challenge that HRE experts are largely unavailable to participate in empirical studies involving large realistic problem settings [4, 44]. Prior studies used an approach based on an HRE training that enabled students with relevant background knowledge to acquire a sufficient amount of HRE skills and tool usage within 14 weeks [43, 42]. Although this approach allowed for an exploratory investigation of human aspects in HRE, the sample size of nine participants was quite limited. Even if several HRE experts were available to participate in a research study, the problem remains that they may be accustomed to different HRE tools – making their performance in the lab strongly dependent on the tool used in the study. That, in turn, would make it difficult to measure their expertise or performance under the same conditions.

Methodological Approach.

To address this challenge, we developed and evaluated a well-structured, simplified lab environment, called ReverSim, that standardizes and reduces the requirements for participants. Real-world HRE problems and tools played a crucial role in the design considerations for ReverSim, and the development process was accompanied by continuous expert feedback and piloting. To further increase participant engagement in ReverSim, we adopted a game-based approach to designing the environment. We also included a number connection test, which measures mental speed, an intelligence factor. In this paper, we aim to demonstrate that ReverSim is suitable for investigating both HRE problem solving and specific cognitive factors, regardless of participants’ domain-specific prior knowledge or experience with specific tools. We expect that experts will be more willing to participate in experiments conducted with ReverSim rather than with large realistic problems because playing ReverSim is not only a challenging and engaging experience but also very time-efficient.

Our main contributions include:

  • Development of ReverSim (Section 3). We created ReverSim, a game-based HRE simulation that aims to model real-world HRE problems and can integrate well-evaluated cognitive tests. We provide a playable version online (see Appendix A) and will release ReverSim under an open-source license after peer review, such that it may be used and adapted for future experiments on human aspects in HRE.

  • Expert Assessment (Section 4). During its development, we presented ReverSim to 14 researchers and professionals in the field of HRE. Participants acknowledged the comparability of ReverSim with real-world netlist reverse engineering processes and gave us feedback that we incorporated to further improve ReverSim.

  • Assessment with Non-experts (Section 5). From a user study with 109 participants with low domain-specific prior knowledge, we found that a) a university entrance qualification and previous IT experience are sufficient prerequisites for successful participation in ReverSim, b) the tasks contained in ReverSim cover a wide range of difficulty, and c) we can administer standardized cognitive tests in ReverSim.

Envisioned Applications of ReverSim.

We consider ReverSim as a framework to study human aspects in HRE with large samples in a controlled environment. Foremost, we envision studies that will pave the way for the development of novel hardware protection mechanisms. Secondary benefits of ReverSim may include educational applications and HRE skill assessments.

2 Background

In this section, we present relevant background on technical and human aspects in HRE that guides the development of our methodological approach. We end with an overview of our research design and questions.

2.1 Hardware Reverse Engineering

Reverse engineering is the process of extracting knowledge or design information from anything human-made to comprehend its inner structure [26]. In the hardware security context, there are several applications for reverse engineering: Security engineers are often forced to perform reverse engineering for failure analysis or to identify counterfeited ICs, security vulnerabilities, or malicious manipulations such as hardware Trojans [23, 38, 22]. HRE is also commonly used in research or for competitor analysis, which is legal in many countries [11]. At the same time, HRE is associated with illegitimate actions, such as intellectual property infringement, the weakening of security functions, or the injection of hardware Trojans [23, 11].

There are two distinct stages during a full-scale HRE process [2]: In the first stage, a gate-level netlist is obtained directly from a physical device or by intercepting design information. Prior research has shown that netlists can be reliably extracted by experienced specialists [35, 24]. Therefore, our work focuses on the second, sense-making stage of HRE, also called netlist analysis or netlist reverse engineering.¹ In this stage, an analyst transforms the netlist into higher levels of abstraction that enable detailed analysis and understanding [32, 12, 2]. This often involves identifying blocks of interest through module recognition [6, 14, 30, 32, 1], detailed analysis of Boolean subcircuits [31, 20], or fully customized approaches [34].

¹ For simplicity, we use HRE and the terms netlist analysis or netlist reverse engineering synonymously in this paper.

However, despite the advances in algorithmic approaches, fully automated tools do not exist [11], and reverse engineering of netlists relies heavily on human ingenuity, sense-making, and in many cases customized computing solutions. Thus, the success of HRE depends largely on the experience and cognitive skills of the analyst.

2.2 Human Aspects in HRE

A few recent works describe the first important findings about the underlying human aspects in HRE. Lee and Johnson-Laird performed five experiments in a laboratory setting to analyze how participants without prior domain-specific knowledge solved simple reverse engineering problems based on Boolean algebra [17]. Based on these experiments, Lee and Johnson-Laird defined reverse engineering of Boolean systems as a specific but poorly understood kind of human problem solving in which participants had to determine how the mechanics of a specific system worked, which components influenced relevant outputs, and how strongly the components depended on each other. However, as the tasks in their experiments were extremely simple, it is questionable to what extent the results are transferable to real-world HRE.

Subsequent research examined problem-solving processes in realistic HRE tasks. In an exploratory study, Becker et al. observed the technical processes of hardware reverse engineers analyzing an unknown gate-level netlist [4]. Based on their observations, the authors postulated a three-phase model, in which reverse engineers apply both manual analyses (e. g., visual search and identification of components in the netlist) and semi-automated steps (e. g., verification of crucial netlist components). Becker et al. also provided initial insight into cognitive processes in HRE: The descriptive data from their study suggests that working memory, an intelligence subfactor, may play a role in time-efficient problem solving of HRE tasks. Subsequently, Wiesen et al. conducted a detailed exploration of the problem-solving processes of eight intermediate reverse engineers and one HRE expert [44]. Their analysis led to a detailed hierarchical problem-solving model that yielded insights into problem-solving strategies and expertise-related differences. The authors conclude that superior performance in solving realistic HRE tasks may be a function of both expertise and intelligence [44]. A drawback of this particular work is that it relies on in-person lab tests for cognitive assessment, making the procedure demanding and time consuming not only for participants but also for investigators, and thus limiting sample sizes.

Validating these insights with larger samples by examining performance and cognitive factors in a standardized environment accessible to a broad population is a logical next step. Also, uncovering further cognitive factors relevant to HRE in large non-expert samples and validating their impact with small expert samples seems promising. To enable such validations with controlled experimental studies, we propose a methodological approach in the form of an HRE simulation called ReverSim, which includes a means for administering standardized, unsupervised psychometric tests.

2.3 Research Design and Questions

In Section 3, we outline the development process and underlying design considerations of ReverSim. To empirically evaluate whether ReverSim is indeed suitable for studying human aspects in HRE, we conduct two studies, which are guided by the following research questions:

RQ1

To what extent does ReverSim model real-world netlist reverse engineering?

RQ2

What are the minimum requirements to participate in ReverSim?

RQ3

Does the difficulty of tasks contained in ReverSim cover a wide range of solution probabilities and times?

RQ4

Can we perform standardized cognitive tests in ReverSim?

To answer RQ1, we evaluate 14 interviews with HRE experts who assess the comparability of ReverSim with real-world HRE processes in Section 4. To answer RQ2, RQ3, and RQ4, we analyze data from a user study with 109 participants with low domain-specific prior knowledge and varying levels of experience in Section 5.

3 An Overview of ReverSim

We followed two principles when designing ReverSim:

  1. ReverSim is intended to model important subprocesses of real-world netlist reverse engineering in order to encourage participants – the problem solvers – to apply solution strategies comparable to those in the real world.

  2. ReverSim should be suitable for a broad range of HRE non-experts and experts, without requiring knowledge of any specific tools used in the IC industry.

We implemented ReverSim as a web application, thus providing a high degree of flexibility for study settings: Not only can ReverSim be used in a laboratory environment, participants can also interact with it in a fully remote setting only requiring a web browser. The simulation client is supported by a central game server supplying the individual levels and recording detailed transcripts of participants’ interaction with the game. This approach enables central administration and data collection for multiple participants simultaneously.

To obtain a practical picture of the simulation, we provide a playable version of ReverSim online (see Appendix A).

Development Process.

In the following sections, we describe the present version of ReverSim in the configuration used for the studies presented in this work. The development process was accompanied by several rounds of internal and external piloting, e. g., with HCI and security researchers and with attendees from industry and academia at computer security conferences, which led to various changes and improvements to the initial prototype. As we continue open-source development, we plan to introduce further features that were proposed in this iterative process.

3.1 Basic Elements

The core of ReverSim consists of eight basic elements for Boolean circuits that together form a level: At all inputs to the circuit, a battery provides current and is always connected to a switch, which is the only basic element the player can operate. The experimenter can define whether each switch is initially open or closed. Alternatively, ReverSim can select a random starting position. Through a mouse click, the player can close and open switches in order to pass or stop the current flow. Each output is realized by a lamp or danger sign.

The objective of each level is to determine the correct switch settings to yield the desired circuit output: For a valid solution, all lamps must be supplied with current so that they light up. The danger signs, on the other hand, must not be supplied with current; otherwise, an electric discharge is displayed at the respective output, indicating an incorrect solution. Figure 1 illustrates the lamp and danger sign with and without supplied current.

[Figure 1 image: danger sign (top row) and lamp (bottom row), each shown with no current (left) and with current (right).]
Figure 1: Lamp and danger sign indicate the expected output of a circuit. The player solves the level by supplying current to the lamp, turning it on (bottom right), and by ensuring that no current is supplied to the danger sign (top left).

The Boolean circuit itself is implemented through the three basic types of combinational gates, namely AND, OR, and NOT. Current flows through the circuit according to the Boolean functionality of these three gates and the wires connecting them. Figure 2 shows a trivial example level that makes use of all three gate types.
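The level objective described above can be sketched in a few lines of Python. This is a minimal illustration only: the dictionary-based level encoding, the wire names, and the functions `evaluate` and `is_solved` are assumptions made for this example and do not reflect ReverSim's actual level format or implementation.

```python
# Hypothetical level encoding (not ReverSim's actual format):
# each gate name maps to (gate type, list of input wire names).
GATES = {
    "g1": ("AND", ["s1", "s2"]),
    "g2": ("NOT", ["s3"]),
    "g3": ("OR", ["g1", "g2"]),
}
LAMPS = ["g3"]         # outputs that must be supplied with current
DANGER_SIGNS = ["g1"]  # outputs that must not be supplied with current


def evaluate(gates, switches):
    """Return the Boolean value of every wire for the given switch setting."""
    values = dict(switches)

    def value(name):
        # Evaluate a gate lazily, the first time its output is needed.
        if name not in values:
            kind, inputs = gates[name]
            bits = [value(i) for i in inputs]
            if kind == "AND":
                values[name] = all(bits)
            elif kind == "OR":
                values[name] = any(bits)
            elif kind == "NOT":
                values[name] = not bits[0]
            else:
                raise ValueError(f"unknown gate type: {kind}")
        return values[name]

    for name in gates:
        value(name)
    return values


def is_solved(gates, lamps, danger_signs, switches):
    """A setting is valid if all lamps are lit and no danger sign has current."""
    values = evaluate(gates, switches)
    return all(values[l] for l in lamps) and not any(values[d] for d in danger_signs)


print(is_solved(GATES, LAMPS, DANGER_SIGNS, {"s1": True, "s2": False, "s3": False}))
print(is_solved(GATES, LAMPS, DANGER_SIGNS, {"s1": True, "s2": True, "s3": True}))
```

In this toy level, closing both `s1` and `s2` supplies current to the danger sign, so any valid solution must leave at least one of them open while keeping `s3` open to light the lamp.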

Figure 2: The interface for each level of ReverSim consists of a Boolean circuit diagram with three inputs and at least one output. The player interacts with the circuit by opening and closing the switches on the left. Annotations can be drawn onto the circuit using the drawing tools on the very left. At the top, the player’s progress statistics are displayed.

Design considerations.

With our choice of basic elements, we are able to represent any combinational circuit as a game level. Those circuits – e. g., in the form of logic functions located between memory elements of a large-scale netlist – are an essential target for real-world analyses [20]. The gameplay of ReverSim aims to model the sense-making processes that occur when reverse engineering such combinational subcircuits in the real world. The subcircuits’ inputs and outputs, i. e., memory cells or external connections of the IC, correspond to the switches, lamps, and danger signs in our levels.

Real-world Application Specific Integrated Circuit (ASIC) manufacturing processes tend to use additional compound gate types for improved power and space efficiency [25]. However, their functionality can generally be decomposed into the three basic gate types available in ReverSim. Although memory elements such as flip-flops could also be implemented using those basic gates, we decided to exclude sequential logic from ReverSim for the time being to lower the entry barrier for players with little prior knowledge.

3.2 Obfuscated Gates

In addition to the three basic gates, we implemented two types of obfuscated gates in ReverSim. Both are derived from circuit obfuscation techniques recently proposed in the literature [8, 29] and share the common goal of making their functionality difficult to determine through chip-level reverse engineering [3].

  • Camouflaged gates [8] are a special set of gates designed for use in ASICs. They implement the essential logic gates introduced in Section 3.1 with the special property that all camouflaged gates look strikingly similar under microscopy. However, their appearance is clearly different from standard Boolean logic gates. While this makes them easy to identify as camouflaged gates, uncovering their individual Boolean functionality is hard. In ReverSim, we represent such gates as an ink blot, drawing participants’ attention to the fact that the gate is obfuscated, but not revealing its actual functionality.

  • Covert gates [29] have the additional property that they are not easily identifiable as being obfuscated. Hence, a gate that appears to be an OR gate to the problem solver may actually be an inverter, where only one of the two inputs is effective while the second is ignored. In ReverSim, we represent such gates with the symbol of the gate that they pretend to implement.

Figure 3 shows how the camouflaged and covert gates are visually represented in a simulation level and provides examples of the actual functionality that such gates may implement.
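The difference between the two obfuscation types can be illustrated by separating what a gate displays from what it computes. The following is a sketch under stated assumptions: the `Gate` class, the helper functions, and the "?" placeholder symbol are hypothetical names chosen for this example, not part of ReverSim.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Sequence

# Boolean behavior of the three basic gate types.
BASIC = {
    "AND": all,
    "OR": any,
    "NOT": lambda bits: not bits[0],
}


@dataclass(frozen=True)
class Gate:
    displayed: str                            # symbol the player sees
    actual: Callable[[Sequence[bool]], bool]  # hidden Boolean function


def plain(kind):
    """A regular gate: displayed symbol and behavior agree."""
    return Gate(kind, BASIC[kind])


def camouflaged(actual_kind):
    """Visibly obfuscated: shown as a placeholder, behavior hidden."""
    return Gate("?", BASIC[actual_kind])


def covert(displayed_kind, actual):
    """Looks like a regular gate but may compute something else."""
    return Gate(displayed_kind, actual)


def truth_table(gate, n_inputs):
    """Outputs for all input combinations, in lexicographic order."""
    return [gate.actual(bits) for bits in product([False, True], repeat=n_inputs)]


# The example from the text: appears to be an OR gate, but acts as an
# inverter on its first input and ignores the second.
covert_or = covert("OR", lambda bits: not bits[0])

print(covert_or.displayed, truth_table(covert_or, 2))
print(plain("OR").displayed, truth_table(plain("OR"), 2))
```

Both gates print the same symbol, yet their truth tables differ, which is exactly what a problem solver has to discover from the circuit's observable outputs.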

[Figure 3 image: rows "camouflaged" and "covert"; columns "displayed gate" and "example actual functionality".]
Figure 3: The visualization of a camouflaged and a covert gate (left) and an example of their hidden functionality (right).

Design considerations.

Obfuscation primitives such as camouflaged or covert gates are used in real-world designs to impede reverse engineering, as indicated by several existing patents [16, 7, 39]. Including obfuscated gates in ReverSim allows us to explore approaches employed by analysts to reverse engineer individual obfuscation components, providing initial insight into the effective use of hardware obfuscation.

3.3 Gameplay Mechanics

The objective of each level is to choose the correct setting of each switch to control the respective outputs. Once players think they have set the correct switch positions, they can click the Confirm button at the bottom right. Players are explicitly encouraged to only submit their solution once certain that it is correct, rather than trying all possible switch positions. This is reinforced by a score in each level that gradually decreases with each incorrect submission, as well as a gradually increasing time delay before a new attempt can be made. The effects of the current switch positions on the outputs are visualized as shown in Figure 1 only after clicking the Confirm button. During the tutorial and qualification phases only, current-carrying wires are additionally highlighted in yellow, making mistakes easier to spot. If the solution was correct, the player will move to the next level by clicking the Next button. If the solution was incorrect, the resulting circuit outputs are shown to the player and they can start a new attempt by modifying their solution. Should a player submit multiple incorrect solutions for a level, or reach a specified timeout, they are offered the option to skip to the next level.
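The submission rules above can be sketched as a small state object. The text does not state concrete values, so the starting score, the penalty per incorrect submission, the delay increment, and the skip threshold below are purely illustrative assumptions, as are the class and field names.

```python
from dataclasses import dataclass


@dataclass
class LevelAttempts:
    # All numeric defaults are hypothetical; the paper only states that the
    # score decreases and the delay increases with each incorrect submission.
    start_score: int = 100
    penalty: int = 10
    base_delay_s: float = 2.0
    skip_after: int = 3
    wrong: int = 0

    def submit(self, correct: bool) -> dict:
        """Process one click on Confirm and report the resulting state."""
        if not correct:
            self.wrong += 1
        return {
            "solved": correct,
            # Score shrinks with every incorrect submission, never below zero.
            "score": max(0, self.start_score - self.penalty * self.wrong),
            # Waiting time before the next attempt grows with each mistake.
            "delay_s": 0.0 if correct else self.base_delay_s * self.wrong,
            # After several incorrect submissions, skipping is offered.
            "may_skip": (not correct) and self.wrong >= self.skip_after,
        }


attempts = LevelAttempts()
for answer in (False, False, True):
    print(attempts.submit(answer))
```

A linear penalty and delay are used here only for simplicity; any monotone schedule would match the description equally well.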

Design considerations.

Determining (intermediate) values of individual wires is an essential subprocess in netlist reverse engineering [32]. A central part of our game rules is not allowing brute force and discouraging such trial-and-error strategies by imposing a score penalty and a time delay. In real-world HRE, using (automated) brute force may be a valid strategy to overcome some sufficiently small problems. However, an analyst requires the cognitive capabilities to develop such custom automation and apply it to a particular netlist [44]. Disallowing brute force in ReverSim may allow us to observe these cognitive aspects of HRE problem solving.

Furthermore, it is often beneficial to analyze and validate the behavior of a netlist or part thereof during reverse engineering using dynamic circuit analysis techniques without the need for costly experiments on a physical device [15]. Displaying the output state of the circuit even when an incorrect solution is submitted is therefore closely related to the dynamic analysis methods used in real-world HRE.

The above mechanisms have been tuned through extensive piloting to identify sensible defaults. However, each can be configured in detail to meet the exact requirements of various study objectives, environments, and participant groups.

3.4 Level Design

The ReverSim software is accompanied by a library of levels for use in the experiment phase, for which we provide summary information in Appendix D. Each level contains one HRE task consisting of the basic elements introduced in Section 3.1 in a circuit optimized to fit on a single screen. We provide five types of levels, which include three different complexities – low, medium, and high – and two level types containing obfuscated gates. Appendix C provides an example level for each of the different complexities.

Level complexity scales primarily with the number of outputs: Low complexity levels have only one output, medium complexity levels have two, and high complexity levels have three outputs. As the number of outputs increases, so does the number of gates in the circuit. Each level contains three switches, regardless of complexity. To ensure that outputs cannot be trivially determined and to introduce an additional measure of complexity, we also decided to require a minimum nonlinearity² between inputs and outputs. The same is required between all outputs, ensuring that no output follows trivially from another. Levels containing obfuscated gates are based on medium complexity levels, where a single gate has been replaced by a camouflaged or a covert gate. ReverSim also features a graphical level editor that allows researchers to extend and customize the level library.

² The nonlinearity of a Boolean function is defined as the minimum Hamming distance to any linear or affine function [27]. The greater the distance, the more non-linear the function.
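For the three-switch levels described here, the nonlinearity of an output can be computed by exhaustive comparison against all affine functions. The following sketch applies the definition directly; the function name `nonlinearity` and the example functions are chosen for illustration and are not taken from ReverSim's level library or tooling.

```python
from itertools import product


def nonlinearity(f, n):
    """Minimum Hamming distance between f and any affine function on n bits.

    f takes n arguments in {0, 1} and returns 0 or 1.
    """
    inputs = list(product([0, 1], repeat=n))
    table = [f(*x) for x in inputs]
    best = len(inputs)
    # Enumerate all affine functions a1*x1 XOR ... XOR an*xn XOR c.
    for a in product([0, 1], repeat=n):
        for c in (0, 1):
            affine = [
                (sum(ai & xi for ai, xi in zip(a, x)) + c) % 2
                for x in inputs
            ]
            distance = sum(t != l for t, l in zip(table, affine))
            best = min(best, distance)
    return best


# Three example outputs over three switches.
print(nonlinearity(lambda a, b, c: a ^ b ^ c, 3))    # purely linear
print(nonlinearity(lambda a, b, c: (a & b) | c, 3))
print(nonlinearity(lambda a, b, c: (a & b) ^ c, 3))
```

With three inputs there are only 16 affine functions and 8 input combinations, so brute-force enumeration is sufficient; a Walsh-Hadamard transform would be the usual choice for larger functions.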

Design considerations.

Our levels correspond to small (sub)modules of a larger netlist and have, for example, similarities to the transition logic of Finite State Machines [12] or to obfuscated micronetlists [7]. However, the number of netlist components – inputs, outputs, and gates – is limited in ReverSim, where the entire netlist is displayed on the screen in a fixed size. Together with eliminating the need to navigate between modules, we are able to reduce the level of abstraction to a degree that is suitable for HRE non-experts.

3.5 Drawing Tools

The players can annotate each level with a few simple drawing tools provided on the left-hand side using their mouse. This includes pens in three colors, as well as an eraser and the possibility to delete all previous annotations. We chose colors with high and diverging contrasts to make them accessible.

Design considerations.

Annotating a netlist has long been a central component of the sense-making process involved in HRE and is therefore a useful aspect to observe. Even before the advent of purpose-built reverse engineering software with annotation features, reverse engineers printed microscope photographs of ICs and then manually traced wires and gates [35] with a pen. Modern reverse engineering software provides advanced features for grouping, naming, or visually highlighting gates [13, 37]. We opted for a simple drawing tool to avoid introducing additional complexity into the user interface. This way, ReverSim remains accessible to participants who are not familiar with advanced netlist annotation features.

3.6 Interactive Tutorial

To ensure that ReverSim is accessible to participants with little prior knowledge of digital circuits, an interactive tutorial introduces all relevant elements and the objective of the game levels to the participants. For each circuit element from Section 3.1, the tutorial provides a textual and visual description. It then encourages players to individually try the elements out in minimal training circuits. Here, all inputs to the respective gate are connected directly to a switch and battery, allowing the player to directly manipulate each input. Contrary to the general gameplay mechanics, current-carrying wires are highlighted immediately, such that players can observe the output behavior of the respective gate in real time. This tutorial is specifically focused on digital logic and circuit elements, as well as the game interface itself. However, we deliberately chose not to present any HRE concepts or specific solution strategies for the game levels that follow.

Design considerations.

To successfully engage with an HRE problem, a minimum knowledge of digital circuits and Boolean logic operators is required. The main objective of the tutorial is therefore to level the playing field across different degrees of understanding of digital circuits and Boolean logic operators. In particular, to be able to accurately observe strategies during HRE problem solving, it is important to not introduce concrete approaches to solving the following levels. Doing so could bias participants towards the strategies included in the tutorial, and alter the decision-making process of which strategy to apply to a problem instance.

3.7 Psychometric Test Integration

To test whether we can perform standardized cognitive tests in ReverSim (RQ4), we implemented a simple number connection test very similar to the Trail Making Test [5]. The test we chose is a non-verbal intelligence test called ZVT (German: “Zahlen-Verbindungs-Test”; number connection test) [21], which reliably and validly measures mental speed, a component of intelligence [36]. In the original paper-and-pencil test, participants use a pen to connect numbers from 1 to 90 on four standardized paper matrices as quickly as possible. The implementation in ReverSim includes an initial instruction page, two example matrices, and the four actual test matrices, and participants used their mouse instead of a pen to click the numbers in order. The two example matrices, with numbers from 1 to 20, allowed participants to familiarize themselves with the task and interface. Between the sample trials, participants were instructed to sit comfortably and click the numbers as quickly as possible. To prevent participants from mentally going through the entire number sequence in advance and thus gaining a speed advantage, only the numbers 1 to 3 are displayed before the first mouse click.
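The click-in-order rule and the restricted preview can be sketched as follows. This is an illustration only: the class and method names are hypothetical, and two behaviors the text leaves open are assumed here, namely that any first click reveals the full matrix and that out-of-order clicks are simply ignored.

```python
class NumberConnectionTrial:
    """One matrix of a click-based number connection test (illustrative)."""

    def __init__(self, highest, preview=3):
        self.highest = highest    # e.g. 20 for an example matrix, 90 for a test matrix
        self.preview = preview    # numbers shown before the first click
        self.next_expected = 1
        self.started = False

    def visible_numbers(self):
        """Only the first few numbers are shown until the first click."""
        upper = self.highest if self.started else min(self.preview, self.highest)
        return list(range(1, upper + 1))

    def click(self, number):
        """Return True if the click continues the sequence correctly."""
        self.started = True       # assumption: any first click reveals the matrix
        if number != self.next_expected:
            return False          # assumption: wrong clicks are ignored
        self.next_expected += 1
        return True

    @property
    def done(self):
        return self.next_expected > self.highest


trial = NumberConnectionTrial(highest=20)
print(trial.visible_numbers())
print(trial.click(1), trial.click(3), trial.click(2))
print(len(trial.visible_numbers()), trial.done)
```

A real implementation would additionally record timestamps per click, since the test measures speed; that part is omitted here.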

Design considerations.

We opted for the number connection test as it can be performed without experimenter intervention and without requiring audio or video capture, thus ensuring wide applicability and scalability. When implementing the number connection test in ReverSim we carefully transferred all instructions from the test manual. As participants can easily track their progress on a pen-and-paper matrix, we made sure that, in our digital version, all correctly clicked numbers are clearly visually identifiable. To make performance comparable across participants, we required participants to use a mouse when working on the number connection test. Thus, we avoid any performance influence resulting from the use of different input devices.

3.8 Sequence of ReverSim

ReverSim is divided into the four main phases described in Figure 4, which enable controlled experimental studies.

Optional Psychometric Test: Seamlessly embeds psychometric tests, such as the number connection test [21], which measures cognitive processing speed, into the game environment.

Interactive Tutorial: Introduces and explains game elements, gameplay mechanics, and the overall objective. Participants can try out individual elements in minimal training circuits, particularly supporting those with little prior HRE knowledge.

Qualification Phase: Four levels of lowest possible difficulty (see Figure 2 for an example) verifying participants’ basic understanding of the game. Participants need to solve each level with a maximum of two attempts. Successfully completing the qualification allows participants to enter the experiment phase; otherwise, they must revisit the tutorial and retry. It is also possible to repeat the tutorial voluntarily. When doing so, users can navigate freely between the individual elements of the tutorial.

Experiment Phase: Set of levels tailored to the study at hand (see Section 3.4). Levels can be assigned per participant. This phase supports randomization and, optionally, a means for participants to skip levels in case they become stuck.
Figure 4: Explanation of the four phases of ReverSim.
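The qualification rule in Figure 4 can be sketched in a few lines of Python. This is a minimal illustration, not the ReverSim implementation: the phase names, the function names, and the encoding of unsolved levels as `None` are assumptions made for this example; only the four levels and the two-attempt limit come from the text.

```python
from typing import List, Optional

MAX_ATTEMPTS_PER_LEVEL = 2    # from the text: at most two attempts per level
NUM_QUALIFICATION_LEVELS = 4  # from the text: four qualification levels


def qualification_passed(attempts_per_level: List[Optional[int]]) -> bool:
    """Return True if every qualification level was solved within the limit.

    attempts_per_level[i] is the number of attempts needed to solve level i,
    or None if the level was not solved (hypothetical encoding).
    """
    if len(attempts_per_level) != NUM_QUALIFICATION_LEVELS:
        raise ValueError("expected one entry per qualification level")
    return all(
        a is not None and a <= MAX_ATTEMPTS_PER_LEVEL for a in attempts_per_level
    )


def next_phase(current: str, qualified: bool = False) -> str:
    """Return the phase that follows `current` (phase names are hypothetical)."""
    if current == "psychometric_test":
        return "tutorial"
    if current == "tutorial":
        return "qualification"
    if current == "qualification":
        # Failing the qualification sends the participant back to the tutorial.
        return "experiment" if qualified else "tutorial"
    raise ValueError(f"no successor defined for phase {current!r}")
```

For example, a participant who needs three attempts on one level fails `qualification_passed` and is routed back to the tutorial by `next_phase("qualification", qualified=False)`.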

4 Interview Study

To assess to what extent ReverSim models real-world HRE processes (RQ1), we conducted semi-structured interviews with 14 researchers and professionals in hardware and netlist reverse engineering.

4.1 Methods

4.1.1 Ethical Considerations & Data Protection

Our institution’s ethics committee and data protection officer approved the interview study. All interviewees participated voluntarily, provided informed consent that included details of study procedures and data handling, and were free to withdraw from the study at any time. Participants received €40 in monetary compensation for time spent on study-related tasks.

4.1.2 Participants

Our 14 study participants were professionals and researchers in the field of HRE. They were recruited through the authors’ professional networks and worked in academia (8 participants from 3 institutions), federal agencies (2 participants from 2 institutions), or international companies (4 participants from 3 institutions) in four different countries. Twelve of the participants self-identified as male, two participants as female, and all had a university degree.

All participants self-rated their general HRE expertise on a scale of 1 (novice) to 5 (expert) with a mean of $M=4.0$ ($SD=.73$). Overall, they had $M=4.3$ years ($SD=1.69$) of experience with HRE. Half of the participants reported that they usually spend 20-30 hours a week or more on HRE tasks. In addition, all participants self-rated their prior theoretical HRE knowledge (e. g., in Boolean algebra, ICs, or hardware obfuscation) and practical HRE skills (e. g., analysis of data flows in netlists, or dynamic analysis of netlists) on a 5-point Likert scale ranging from 1 (very low) to 5 (very high). We calculated two scales for self-ratings (see Appendix E for subitems). The theoretical-knowledge scale had 10 items, and the practical-skill scale had 11 items. Both scales had acceptably high Cronbach’s alphas ($\alpha=.79$ for theoretical knowledge and $\alpha=.87$ for practical skills) [9, 33]. Overall, the 14 participants self-rated their prior theoretical knowledge with a mean of $M=3.9$ ($SD=.58$), and their practical skills in HRE with a mean of $M=3.7$ ($SD=.78$). In Appendix F, we report detailed demographic and expertise data at the individual participant level.

4.1.3 Study Procedure

Researchers from hardware security and psychology collaborated to create and revise a semi-structured interview guide. The final guide consisted of five main questions focused on answering RQ1. All interviews were conducted remotely by two researchers and lasted an average of 75 minutes. The first interviewer had a background in cognitive science and experience in semi-structured interviews, the second in HRE.

After being introduced to the procedure and objective of the study, i. e., the evaluation of ReverSim by domain experts, the interviewees explored all main components of the HRE simulation (i. e., the interactive tutorial, the qualification phase, and the experiment phase). The experiment phase consisted of a total of five levels designed to provide the participants with a comprehensive overview of ReverSim’s capabilities: One level each with low, medium and high complexity and two levels containing obfuscated gates (see Section 3.2).

At the beginning of the interview, participants were asked about their first impressions of ReverSim. The interviewers then asked participants to compare the ReverSim levels with netlists they encounter in their professional work and how their applied strategies would compare. The last interview question was about suggestions for improvements or future research with ReverSim.

After the interviews were completed, participants answered an online questionnaire on demographics and prior knowledge as presented in Section 4.1.2. Both the interview guide and questionnaire can be accessed online via Appendix B.

4.1.4 Data Analysis

We transcribed the interviews verbatim and analyzed the transcripts based on the concept of qualitative content analysis [18] with two coders – the interviewers – in a two-stage process. In the first step, Coder 1 coded eight interviews to create an initial codebook, and Coder 2 familiarized themselves with the transcripts. In collaboration, both coders iteratively created a final codebook (see Appendix H) by defining code descriptions and rules as well as developing code categories. In the second step, Coder 2 coded all 14 interviews, while Coder 1 revised their coding of the first eight and coded the remaining six interviews according to the final codebook. Finally, both coders compared their coding and discussed all deviations until consensus was reached.

4.2 Results

Three main categories emerged from our qualitative content analysis of the interviews:

Category 1:

General Impression

Category 2:

Comparing ReverSim with Reality

Category 3:

Future Research and Additional Features

Below we describe the three categories in detail and support key statements with verbatim quotes from our participants.

4.2.1 Category 1: General Impression

The first category contains two core themes: Positive Feedback, and Constructive Criticism.

When asked for their first impression, all participants gave Positive Feedback on ReverSim, stating that they enjoyed solving the levels and liked the game design (e. g., they found the drawing tools helpful). Furthermore, they mentioned that the game elements, rules, and objectives were clear and that ReverSim was thoroughly implemented. Some interviewees reported that the abstraction in ReverSim was realized very successfully overall (e. g., the implementation of the danger sign and lamp symbol as outputs of the circuits). Many participants explicitly mentioned that the interactive tutorial was didactically well structured and indicated that they felt well guided by the tutorial.

Some participants also raised Constructive Criticism. A few mentioned that an additional explanation of the gates and their functions should be available throughout the game. In their opinion, this type of backup would be very helpful especially for non-experts, so that they could look up the functions, e. g., of an AND gate. Furthermore, some participants suggested revising the game levels containing obfuscated elements. Most of the participants solved these obfuscated levels without having to consider or identify the obfuscated elements. Accordingly, these participants suggested that obfuscated gates be placed in the critical path of analysis. Lastly, some of the participants suggested improvements in the handling of user input, particularly in the use of the drawing tools.

4.2.2 Category 2: Comparing ReverSim with Reality

In the second category, five main themes emerged from our analysis: Elements Recognized from Real-World Netlist Reverse Engineering, Real-World Elements Missing in ReverSim, Real-World Approaches Covered by ReverSim, Real-World Approaches Missing in ReverSim, and Evaluation of Gameplay Mechanics.

Regarding the identification of Elements Recognized from Real-World Netlist Reverse Engineering, all participants recognized components in the simulation that they knew from analyzing real-world netlists (e. g., logic gates such as AND or OR gates; input-output relations). One of the participants summarized this point with the following words: “The concept of making the light bulb glow is the same as making a bit in a netlist one; and setting the bit to zero in a netlist is the same like for the danger symbol in the game. That actually occurs in netlists, because you often have an output behavior, where you just want to know ‘Okay, what do I need now as input configuration, so that a certain output is zero and a certain output is one?’.”

Several interviewees also mentioned Real-World Elements Missing in ReverSim. A few participants said that they usually have to analyze netlists that also contain further logic gates such as NANDs, NORs, or XORs, as well as sequential logic elements (e. g., flip flops) and that these did not appear in the game levels. Some participants mentioned that the complexity of the simulation was only partly comparable to the complexity of real netlists with up to millions of gates. However, those participants also reported that they typically use semi-automated tools or write scripts to reduce complexity in real netlists. Finally, a few participants added that they had never seen obfuscated elements in real-world netlists. However, they also indicated that solving the obfuscated levels reminded them of dealing with erroneous netlists.

We identified several answers mentioning Real-World Approaches Covered by ReverSim. Most participants rediscovered problem-solving processes they usually apply in real-world netlist analysis when solving levels in ReverSim. For example, interviewees indicated that they proceeded backwards from the outputs to the inputs. Such output-driven approaches and back justifications are – according to the participants – common practice in HRE. Some participants hypothesized about how a particular output depends on specific inputs in ReverSim and then annotated them with the drawing tools. According to the experts, this procedure is comparable to the procedure in real netlists, where hypotheses about input-output relations are formulated and tested. One participant stated: “I wasn’t quite sure whether the switch had to be set to one or zero or whether there was another possibility. And then I just set an additional input to zero or one and then I looked if that can work or not. And if it doesn’t work out, then it must be the other one.” Another participant noted that they usually apply annotations comparable to the annotation processes in ReverSim: “So in HRE practice, I would just do something like that by annotating any signals on it. But do the whole thing in a framework and not on the screen. But the concept is the same…it’s just a bit more work with the mouse if you don’t have a tablet and a pen.”

Some participants reported Real-World Approaches Missing in ReverSim. A few indicated they would regularly use forward analysis or brute-force approaches in real netlists. However, applying a brute-force approach in ReverSim was discouraged by deducting points. In addition, some interviewees indicated that they would develop semi-automated solution approaches, especially for increasingly complex real-world netlists. Nevertheless, participants added that our game levels have practical relevance, explaining that the levels represent the manual analysis of the reverse-engineering process after the complexity of the circuit had been reduced by semi-automated steps and scripts.

Participants also provided an Evaluation of Gameplay Mechanics. Most participants stated that the game rules did not force them to apply unrealistic steps that they would not apply in real-world netlist analysis. However, the game rule that brute forcing was penalized was viewed ambivalently, as some participants would have liked to be free in their approaches without deduction of points. On the contrary, some participants drew a comparison and would consider brute force inefficient in real netlists, as summarized by this participant: “If I transfer the whole thing to real problems, i. e., large netlists, then brute forcing – with a correspondingly large input space – would not be possible.”

4.2.3 Category 3: Future Research and Features

The third category includes statements about possible future research and features for ReverSim and consists of two main topics: Add Further Gates and Netlist Components and Additional Objectives for Game Levels.

Some participants mentioned that future levels of ReverSim could include further combinational gates (e. g., NAND, XOR), sequential gates (e. g., flip flops), or high-level components such as adders. However, most participants concurred that incorporating additional components could be also more difficult for participants with little prior domain-specific knowledge. Further, a few participants expressed the idea to incorporate Additional Objectives for Game Levels. For example, different netlist modules could be presented, from which the one implementing a certain functionality should be selected. One participant described this idea as follows: “Maybe you give like a list of possible modules like multiplexer, addition, [ …] some basic function. And then you ask the user: Which one applies to your circuit?”

4.3 Discussion

In light of RQ1, our results from the interview study suggest that ReverSim is comparable to real-world netlist reverse engineering in several aspects. The positive assessments provided by the experts in the field indicate that the game is both enjoyable and didactically appropriate and that it reflects several of the problem-solving processes and strategies used in HRE. In particular, how ReverSim challenges players to analyze circuits was considered comparable to real-world problems. Additionally, many participants rated the gameplay mechanics as realistic and complimented the drawing tools. However, its comparability has some limitations, particularly in capturing the complexities and nuances of real-world netlist analysis. The criticism of the comparatively low complexity of the levels leads us to the conclusion that ReverSim is comparable to specific subprocesses of HRE and that this should be taken into account with regard to the validity of the studies. According to the participants, these specific subprocesses are certainly relevant in practice, namely when reverse engineers want to understand a specific section of a netlist precisely and have already reduced the complexity in advance (see our design considerations in Section 3.4). Another point of criticism was that some experts also partially apply brute force to analyze parts of a netlist, which is actively discouraged in ReverSim. As outlined in Section 3.3, this was a deliberate design decision, as we wanted the players to carefully think about how individual outputs came about. In this way, we aim to represent the relevant thought processes in HRE. Future research and development of ReverSim could provide a more realistic simulation of the diverse challenges faced in real-world netlist analysis, e. g., by incorporating additional components and objectives for game levels.

5 User Study

To assess the minimum requirements to participate in ReverSim (RQ2), to gain insight into the contained HRE tasks (RQ3), and to investigate the feasibility of performing standardized cognitive tests in ReverSim (RQ4), we conducted a user study with 109 participants.

5.1 Methods

5.1.1 Ethical Considerations & Data Protection

The user study was approved by the ethics committee and data protection officer of our institution. All participants took part voluntarily, provided informed consent that included details of study procedures and data handling, and were free to withdraw from the study at any time. To link participants’ answers in the questionnaires to their log files, we assigned randomly generated pseudonyms. We also ensured encrypted communication between the game clients and the server and stored the study-related materials on an internal server to which only the researchers involved in the study had access.

5.1.2 Participants

We recruited a total of 131 participants from the US and UK via Prolific (https://www.prolific.com/), divided into a pilot sample of 20 and a main sample of 111 participants, of which we had to exclude two due to technical issues. The eligibility criteria for participation in the study were a minimum age of 18 years, fluency in English, and a minimum level of education equivalent to a university entrance qualification. Participants were further required to use a desktop PC or laptop PC with a mouse. Out of 109 participants, 10 dropped out while working with ReverSim, leaving us with 99 complete data sets (see Figure 5). Participants received £22.50 as compensation for completing the study and spent a median of $69.5$ minutes doing so.

Of our final sample ($n=99$), 70 participants identified themselves as male and 29 as female. Participants were between 18 and 72 years old, with a mean age of $M=38$ ($SD=12.3$). 72 participants had a university degree, eight had a professional degree, 18 had a high school degree or equivalent, and one participant preferred not to disclose.

5.1.3 Study Procedure

[Flow diagram: Pre Survey ($n=109$) → Number Connection Test → Interactive Tutorial → Qualification Phase → Experiment with 12 Tasks → End of Game → Post Survey ($n=99$). The stages between the two surveys form the ReverSim Game Environment. Side paths: repeat tutorial voluntarily or if not qualified (84); timeout (2); timeout (22); dropout (10). Participant counts shown in the diagram: 109, 108, 108, 98, 75, 99, 1, 8, 1.]
Figure 5: Overview of the flow of our user study. Excluding two invalid datasets, 109 participants started the study. 84 revisited the tutorial at least once. Ten participants dropped out during the different phases of the study, particularly during the qualification phase, resulting in a total of 99 valid and complete datasets. 24 participants did not finish the game within the 75-minute time limit before proceeding to the post survey.

Participants proceeded through the study as shown in Figure 5. After voluntarily consenting to participate in the study, all participants answered a pre-study questionnaire. First, the questionnaire screened whether they met the eligibility criteria. If so, we asked them about further demographics, including their academic and professional education and their experience in computer science and the IC industry. Participants then self-rated their prior knowledge in 16 relevant areas (e. g., Boolean algebra, logic gates, reverse engineering; see Appendix G for a complete list) on a scale of $0$ (“none”) to $5$ (“very high”). Afterwards, they completed the four phases of ReverSim (see Section 3.8), starting with the number connection test [21]. During the experiment phase, participants engaged in 12 HRE tasks of increasing difficulty, which we selected based on mean and variance of solution times and participants’ feedback gathered during game development: Two low complexity tasks (Group A), four medium complexity tasks (Group B), four high complexity tasks (Group C), and two obfuscated medium complexity tasks (Group D) (see Section 3.4 and Appendix D for detailed information about each task). We randomized the order of the tasks within each group to limit order bias; for reasons of fair payment, the total time spent in the game was limited to 75 minutes. After working with ReverSim, we asked study participants to provide feedback on positive and negative aspects of the game.
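The within-group randomization described above could look roughly like the following Python sketch. The group sizes (2, 4, 4, 2) and the fixed group order A to D come from the text; the task labels beyond those named in the paper, the function name, and the use of a per-participant seed are assumptions for illustration.

```python
import random
from typing import List, Optional

# Group sizes follow the text; individual labels are illustrative.
TASK_GROUPS = {
    "A": ["A1", "A2"],
    "B": ["B1", "B2", "B3", "B4"],
    "C": ["C1", "C2", "C3", "C4"],
    "D": ["D1", "D2"],
}


def task_order(seed: Optional[int] = None) -> List[str]:
    """Shuffle tasks within each group while keeping the group order A-D."""
    rng = random.Random(seed)  # hypothetical: one seed per participant
    order: List[str] = []
    for group in ("A", "B", "C", "D"):
        tasks = list(TASK_GROUPS[group])  # copy so the template stays intact
        rng.shuffle(tasks)
        order.extend(tasks)
    return order
```

Shuffling only inside each group preserves the overall progression in difficulty while varying which task of a given complexity a participant sees first.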

5.1.4 Measures and Variables

Prior Knowledge.

To assess participants’ domain-specific prior knowledge, we aggregated their responses across all 16 prior-knowledge topics into a single mean score. This was possible due to their high internal consistency, as evidenced by the extremely high Cronbach’s $\alpha$ [9] of $0.96$, indicating that these items measure the same underlying construct [33].
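For readers unfamiliar with the statistic, Cronbach’s $\alpha$ for $k$ items is $\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_i \sigma_i^2}{\sigma_T^2}\right)$, where $\sigma_i^2$ is the variance of item $i$ and $\sigma_T^2$ the variance of the summed score. The sketch below computes it with the Python standard library; the function names and the data layout (one row per participant) are assumptions, and the paper does not state which software was used.

```python
from statistics import mean, variance
from typing import List, Sequence


def cronbach_alpha(responses: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a participants-by-items table of ratings."""
    k = len(responses[0])                      # number of items
    items = list(zip(*responses))              # transpose to items-by-participants
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_var_sum / total_var)


def prior_knowledge_score(ratings: List[float]) -> float:
    """Aggregate one participant's item ratings into a single mean score."""
    return mean(ratings)
```

When all items rise and fall together across participants, the item variances account for only a small share of the total-score variance and $\alpha$ approaches 1.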

HRE Performance Variables.

To measure participant performance, we used log files recorded during the experiment phase. From this data, we calculated the following set of variables for each of the 12 tasks:

The Task Solved variable captures whether or not a participant has solved a task. Tasks may remain unsolved if the participant skips the task or if the study times out after 75 minutes. To avoid stalling participants, tasks become skippable after four unsuccessful attempts, or after exceeding a predetermined time within the task. All time limits are listed in Appendix D. To ensure data quality, we consider a task that has been solved by brute force, i. e., by rapidly trying arbitrary switch combinations without thinking, as not solved. We identify a task as brute-forced if we observe an average attempt rate of more than one submission within 10 seconds.

For all tasks that were solved, we calculated the following two variables: The variable Time in Task measures the time until a participant submits a correct solution. The variable Number of Attempts measures the number of attempts it takes the participant to produce this correct solution.
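The three performance variables can be derived from a per-task submission log along the following lines. This is a sketch under stated assumptions: the log structure, the names `TaskLog`, `TaskResult`, and `evaluate`, and the choice to apply the brute-force rule only from the second attempt onward are all hypothetical. The 10-second threshold is taken from the text, and the attempt rate is read as attempts divided by time in task.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

BRUTE_FORCE_WINDOW_S = 10.0  # from the text: more than one submission per 10 s


@dataclass
class TaskLog:
    start: float                           # task start time in seconds
    submissions: List[Tuple[float, bool]]  # (timestamp, was_correct), in time order


@dataclass
class TaskResult:
    solved: bool
    time_in_task: Optional[float] = None   # only set for solved tasks
    attempts: Optional[int] = None         # only set for solved tasks


def evaluate(log: TaskLog, min_attempts_for_brute_force: int = 2) -> TaskResult:
    """Compute Task Solved, Time in Task, and Number of Attempts for one task."""
    for index, (timestamp, correct) in enumerate(log.submissions, start=1):
        if correct:
            elapsed = timestamp - log.start
            # Average rate above one submission per 10 s
            # is equivalent to elapsed / attempts < 10 s.
            brute_forced = (
                index >= min_attempts_for_brute_force  # assumption, see lead-in
                and elapsed < BRUTE_FORCE_WINDOW_S * index
            )
            if brute_forced:
                return TaskResult(solved=False)
            return TaskResult(solved=True, time_in_task=elapsed, attempts=index)
    # Skipped or timed-out tasks have no correct submission.
    return TaskResult(solved=False)
```

For instance, a correct third submission after only six seconds is counted as not solved, whereas a correct second submission after 60 seconds yields a Time in Task of 60 and two attempts.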

Number Connection Test.

To evaluate whether the integration of the number connection test (see Section 3.7) into ReverSim was successful, we compared participants’ performance with the norm values and corresponding Intelligence Quotient (IQ) values from the original test manual. We determined the Time per Matrix by measuring the seconds between clicking ’1’ and completing each of the four sequences. We then assessed participants’ performance by calculating the mean Time per Matrix for each individual.
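As a small illustration of the Time per Matrix measure, the following sketch derives it from click timestamps. The input format (one list of correct-click timestamps per matrix, starting with the click on ’1’) and the function names are assumptions; the conversion to IQ values relies on the norm tables of the test manual and is not reproduced here.

```python
from statistics import mean
from typing import List, Sequence


def time_per_matrix(click_times: Sequence[float]) -> float:
    """Seconds between clicking '1' (first entry) and the last number."""
    return click_times[-1] - click_times[0]


def mean_time_per_matrix(matrices: List[Sequence[float]]) -> float:
    """Mean Time per Matrix across the matrices of one participant."""
    return mean(time_per_matrix(clicks) for clicks in matrices)
```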

5.2 Results

5.2.1 Participants’ Prior Knowledge & Experience

Our 109 participants (including dropouts) self-rated their prior knowledge with a mean of $M=1.10$ ($SD=1.08$, $median=0.75$), corresponding to “very low”. The distribution of prior knowledge among our participants, broken down by those completing the study and those who dropped out, is shown in Figure 6. 42 participants reported practical experience in computing and nine reported practical experience with microchips.

Refer to caption
Figure 6: Distribution of prior-knowledge scores among our 109 participants. The 99 participants who completed the study had a median score of $0.75$, while the ten participants who dropped out had a median score of $0.31$. Nine of the ten dropouts had a prior knowledge score of $1$ or less.

5.2.2 Participant Engagement & Performance

Of our 109 participants, 99 – or 91% – completed the study and ten participants dropped out. As shown in Figure 5, eight participants dropped out during the qualification phase, one dropped out during the number connection test, and one dropped out during the experiment phase. Of the ten dropouts, only one participant had a background in computing, and none had experience with microchips.

In the following, we only consider the 99 participants who completed the study, 24 of whom reached the time limit of 75 minutes and thus missed a median of two tasks. Participants solved a mean of eight (out of twelve) tasks with any number of attempts, and a mean of $M=4.5$ tasks on first attempt, which is a very strict measure of success. Figure 7 shows the number of tasks solved on first attempt, which ranges from 0 to 11, by fraction of participants. Remarkably, the 27 participants requiring more than three qualification attempts solved a mean of just $M=2.5$ tasks on first attempt, which is only marginally better than random guessing, while the 70 participants requiring three or fewer qualification attempts solved a mean of $5.5$ tasks.

Refer to caption
Figure 7: Number of HRE tasks solved on the first attempt by fraction of participants, broken down into those requiring a maximum of three attempts to qualify and those requiring more than three attempts.

5.2.3 Participants’ Feedback

Below we report the feedback provided by all 99 non-dropout participants in the post-survey. No participants reported accessibility issues when using ReverSim. On a scale of 1 (“fully disagree”) to 5 (“fully agree”), participants agreed ($M=4.06$, $SD=1.10$) that they enjoyed playing the game. They also agreed that they understood the game rules ($M=4.36$, $SD=0.81$) but were undecided as to whether the scoring was motivating ($M=3.39$, $SD=1.14$). Participants appreciated the opportunity to repeat the tutorial ($M=4.35$, $SD=0.79$) and agreed that it was easy to understand ($M=4.10$, $SD=1.04$). A total of 74 participants indicated having used the drawing tools, agreeing that they are useful ($M=4.16$, $SD=0.95$).

5.2.4 Task Statistics

Figure 8 shows the fractions of participants who solved each task, broken down by the number of attempts. If we ignore the number of attempts, then Group A tasks were solved by 91 to 96% of the participants, Group B tasks by 73 to 83%, Group C tasks by 34 to 61%, and Group D tasks by 44 to 45%. If we only consider successful solutions in the first attempt, then the solution probabilities for Group A tasks are 69 to 80%, for Group B tasks 20 to 60%, for Group C tasks 18 to 31% and for Group D tasks 14 to 24%.

Refer to caption
Figure 8: Fractions of participants who solved each task by the number of solution attempts.

Figure 9 shows the distribution of times to correct solutions for each HRE task. Median times were $22$ to $24$ seconds for Group A tasks, $82$ to $126$ seconds for Group B tasks, $155$ to $226$ seconds for Group C tasks, and $126$ to $175$ seconds for Group D tasks.

Refer to caption
Figure 9: Time required by the participants to solve each HRE task. To make the results comparable across tasks, only participants who saw all 12 tasks were considered. Of these 75 participants, only the times of those participants who solved each individual task correctly were included, i. e., sample size varies between 74 participants for Task A1 and 30 participants for Task C3.

5.2.5 Number Connection Test Results

108 participants completed all four matrices of the number connection test. Participants solved Matrix 1 in a mean time of $M=66$ seconds (range: $39$ to $134$), Matrix 2 in $M=66$ seconds (range: $37$ to $137$), Matrix 3 in $M=72$ seconds (range: $42$ to $467$), and Matrix 4 in $M=99$ seconds (range: $43$ to $610$). All distributions show a positive skew, i. e., an above-average number of participants with short solution times, which is also reported in the test manual. A Welch’s ANOVA [40] ($F=1.20$, $df=3$, $p=0.31$) and Bonferroni-corrected pairwise t-tests revealed a significant difference in means between Matrix 4 and all other matrices ($p<.001$ to $p<.05$). Therefore, we evaluated the mean times of Matrix 1-3 and the mean times of Matrix 1-4 separately. Using Matrix 1-3, the mean IQ of our participants – as measured by the number connection test – is about $115$, which is one standard deviation above average (i. e., $100$). Using Matrix 1-4, the mean IQ of our participants is approximately $108$, which is close to average.
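To make the test statistic concrete, the sketch below computes Welch’s $F$ and its degrees of freedom for $k$ groups with unequal variances, using only the Python standard library. It is an illustration of the textbook formula, not the authors’ analysis script: the function name is an assumption, and the p-value is omitted because it requires the CDF of the F distribution, which the standard library does not provide.

```python
from statistics import mean, variance
from typing import Sequence, Tuple


def welch_anova(groups: Sequence[Sequence[float]]) -> Tuple[float, int, float]:
    """Return (F, df1, df2) of Welch's one-way ANOVA."""
    k = len(groups)
    n = [len(g) for g in groups]
    m = [mean(g) for g in groups]
    # Weight each group by the inverse squared standard error of its mean.
    w = [n_i / variance(g) for n_i, g in zip(n, groups)]
    w_total = sum(w)
    weighted_mean = sum(w_i * m_i for w_i, m_i in zip(w, m)) / w_total
    between = sum(w_i * (m_i - weighted_mean) ** 2 for w_i, m_i in zip(w, m)) / (k - 1)
    lam = sum((1 - w_i / w_total) ** 2 / (n_i - 1) for w_i, n_i in zip(w, n))
    f_stat = between / (1 + 2 * (k - 2) / (k ** 2 - 1) * lam)
    df1 = k - 1
    df2 = (k ** 2 - 1) / (3 * lam)
    return f_stat, df1, df2
```

With four matrices, `df1` equals 3, matching the degrees of freedom reported above. For two groups, the statistic reduces to the square of Welch’s t statistic.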

5.3 Discussion

5.3.1 RQ2: Minimum Requirements for Participation

Despite their very low prior knowledge in relevant areas, many participants were able to meaningfully engage with ReverSim, as indicated by a low dropout rate of less than 10% and a high mean number of eight tasks solved per participant. While we did not find any significant influence of prior knowledge and experience on participants’ performance – as might be expected given very low levels of prior knowledge overall – we did make the following observation about dropouts: Both participants with a prior knowledge score above $1$ and participants with experience in computing or microchips had a very low probability of dropping out. Future studies with ReverSim could therefore pre-select participants accordingly.

While about three quarters of the participants revisited the tutorial, they still agreed that it was easy to understand. We conclude that although the rules of the simulation are easy to understand, it is still difficult to develop successful HRE strategies. Even those participants who took the qualification up to three times were successful in the experiment phase under the strict metric Task Solved on First Attempt. However, participants who attempted to qualify more than three times were largely unable to successfully engage in ReverSim, which leads us to recommend that such participants be screened out early (while compensating them proportionally for their participation in the study). After applying these screening criteria to our sample, the number of tasks solved on first attempt is nearly centered on the available range while achieving an adequate variance, resulting in a high expressiveness of the resulting data.

5.3.2 RQ3: Assessment of ReverSim’s Tasks

As a central part of the user study, we measured solution probability and solution time. We observed large differences between the tasks for both metrics, which we discuss below.

From Group A to D, we observed a strong decrease in solution probability, indicating that our selection of groups covers a wide range of task difficulty. Whether a task was solved on first attempt appears to be a particularly promising metric in this regard: The observed data range from 80 to 14%, which indicates neither ceiling nor floor effects in any of the tasks. This allows ReverSim to capture participant performance without losing accuracy at either end of the spectrum. Within Group B, we make an interesting secondary observation: Even though all four tasks are designed based on our medium-complexity design criteria (Section 3.4), solution probability on first attempt varies by 40 percentage points from task B1 to B4. Hence, factors beyond the number of outputs and Boolean nonlinearity appear to influence task difficulty. We suggest that circuit layout, number of connections, and gate types may be of interest for future research into what exactly makes a circuit difficult to reverse engineer.

We further calculated distributions of solution times for all tasks. Comparing the tasks within Group B shows that solution time is not directly related to solution probability; that is, a task solved by more participants is not necessarily also solved faster, making both metrics equally relevant. Between the non-obfuscated levels, i. e. groups A to C, we notice that the number of outputs in each task (A: 1, B: 2, C: 3) has a large impact on the mean and variance of solution times: While Group A tasks feature rather little variance, solution times in Group C vary by multiple minutes. A possible reason may be that, as the problem space grows, the identification of relevant components and an efficient order to traverse them becomes more important, so that the choice of an efficient strategy has a large impact on solution time.
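
To make the two metrics concrete, the sketch below computes a per-task first-attempt solution probability together with the mean and sample variance of solution times. It is an illustration under assumptions: the log format, the task labels, and all values are invented for this example and are not data from the user study, and the exact definition of solution time used in the study may differ.

```python
from statistics import mean, variance

# Hypothetical attempt log (not study data):
# (participant, task, solved_on_first_attempt, solution_time_in_seconds)
attempts = [
    ("p1", "A1", True, 30.0),
    ("p2", "A1", True, 40.0),
    ("p1", "B1", False, 120.0),
    ("p2", "B1", True, 90.0),
]


def task_metrics(records):
    """Aggregate first-attempt probability and time statistics per task."""
    by_task = {}
    for _, task, first_try, seconds in records:
        by_task.setdefault(task, []).append((first_try, seconds))

    metrics = {}
    for task, rows in by_task.items():
        times = [t for _, t in rows]
        metrics[task] = {
            "p_first_attempt": sum(ok for ok, _ in rows) / len(rows),
            "mean_time_s": mean(times),
            # Sample variance needs at least two observations.
            "var_time_s2": variance(times) if len(times) > 1 else 0.0,
        }
    return metrics


metrics = task_metrics(attempts)
print(metrics["A1"])  # {'p_first_attempt': 1.0, 'mean_time_s': 35.0, 'var_time_s2': 50.0}
print(metrics["B1"])  # {'p_first_attempt': 0.5, 'mean_time_s': 105.0, 'var_time_s2': 450.0}
```

Keeping probability and time as separate fields mirrors the observation above that a task solved by more participants is not necessarily solved faster.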

Lastly, we observed that reverse engineering the obfuscated circuits in Group D yields solution probabilities and times that are between those of groups B and C. Introducing a single obfuscated gate does not appear to increase the problem space more than the addition of a third output.

5.3.3 RQ4: Performing Cognitive Tests in ReverSim

As only one participant dropped out during the number connection test and the range of the participants’ Time per Matrix corresponds to the norm values, we can assume that the implementation of the number connection test in ReverSim was successful. We also observed mean solution times and distributions of times that were consistent with the norm values from the test manual. As our sample of participants is well educated, the above-average IQ scores are in line with our expectations. We conclude that the number connection task in ReverSim is suitable for its intended purpose; i. e., measuring mental speed in HRE problem solving. Moreover, the implementation of other cognitive tests (e. g., n-back task to measure working memory capacity) to measure further cognitive factors in HRE seems feasible.

6 Conclusion and Outlook

Hardware Reverse Engineering (HRE) is an essential tool for increasing trust in ICs, a primary objective of the recent US and EU microchips initiatives [28, 10]. As HRE cannot be fully automated, human problem-solving competencies and cognitive abilities are critical to its effective execution [4, 44]. To enable quantitative studies of human aspects in HRE, which have been poorly understood to date [11], we developed and evaluated ReverSim, a game-based HRE lab environment that allows integrating standardized, unsupervised cognitive tests. 14 professionals and researchers from the HRE field attested to the comparability of ReverSim to real-world HRE problems. Based on a user study with 109 participants, we proposed screening criteria for participants who are likely to engage in ReverSim successfully, identified relevant performance metrics, and evaluated a set of 12 HRE tasks as well as our implementation of a standardized cognitive test. Against this background, we consider ReverSim a promising tool for studying cognitive factors in HRE.

6.1 Limitations

ReverSim.

We have designed ReverSim from scratch to enable controlled experimental studies that can accurately assess the impact of cognitive factors on HRE problem-solving processes. Naturally, an HRE simulation cannot represent all processes of HRE. However, based on the results of the interview study, we suggest that ReverSim successfully models important aspects and thought processes of HRE and also lays the foundation for simulating other important HRE subprocesses, such as sequential circuit analysis.

Prior Knowledge.

In both studies, we asked participants to self-rate their prior knowledge. Although we have no evidence that participants systematically over- or underestimated their prior knowledge, these self-assessments may be prone to bias and therefore may not accurately reflect participants’ actual knowledge. In our interview study, where a high level of HRE expertise was essential, we therefore included additional metrics to assess the prior knowledge of the interviewees. For future studies aimed at assessing the cognitive factors that influence HRE success, further evidence-based determination of prior knowledge, beyond experience in computing or microchips, may be required.

Cognitive Assessment.

When evaluating the number connection task, we observed a slight but insignificant performance loss in Matrix 3 and a significant performance loss in Matrix 4. Although an extensive meta-analysis revealed no differences in participants’ performance between computerized and paper-and-pencil tests of cognitive abilities [19], it could be argued that solving the matrices on a screen is more cognitively demanding. As we do not know the reasons for the observed performance loss, we want to point out that individual performances should be interpreted cautiously. However, the test implementation in ReverSim was not intended for psychological diagnosis of individual participants, but rather as a means of comparing their performance with respect to differing cognitive speeds. For this purpose, an exact match to the paper-and-pencil version is not required; thus we consider the test applicable.

6.2 Outlook

Below we briefly outline research avenues enabled by our HRE simulation as a controlled environment: In future studies, ReverSim can be used to empirically determine which characteristics of a circuit affect the cognitive load of reverse engineering it. In this way, ReverSim can support the development of design patterns for “cognitively obfuscated” ICs [41]. As a second line of defense to traditional obfuscation techniques, this type of hardware protection aims to create circuitry that deliberately slows down a human attacker’s problem-solving process. Conversely, understanding how essential HRE skills are acquired and where problem solvers struggle may also inform the development of a training environment and educational guidelines for hardware security analysts. Hence, ReverSim may be a useful tool in meeting the growing need for hardware security professionals in the face of the US’ and EU’s massive investments into trustworthy semiconductors [28, 10]. We emphasize that it is important to complement quantitative results obtained with ReverSim by validating them, where possible, with HRE experts in a more complex real-world environment such as HAL [13].

Acknowledgements

We are very grateful to Zehra Karadag, Max Hoffmann, and Jannik Schmöle for their support in the development of ReverSim. A big thanks also goes to Malte Elson and Franziska Herbert for discussions and recommendations on our analysis methods. Last but not least, we would like to thank all expert and non-expert participants for their time and effort invested in our study. This work was supported by the PhD School “SecHuman – Security for Humans in Cyberspace” by the federal state of NRW, Germany and by Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy – EXC 2092 CASA – 390781972.

References

  • [1] Nils Albartus, Max Hoffmann, Sebastian Temme, Leonid Azriel, and Christof Paar. DANA universal dataflow analysis for gate-level netlist reverse engineering. IACR Transactions on Cryptographic Hardware and Embedded Systems (TCHES), 2020(4):309–336, 2020.
  • [2] Leonid Azriel, Julian Speith, Nils Albartus, Ran Ginosar, Avi Mendelson, and Christof Paar. A survey of algorithmic methods in IC reverse engineering. Journal of Cryptographic Engineering, 11(3):299–315, 2021.
  • [3] Georg T Becker, Marc Fyrbiak, and Christian Kison. Hardware obfuscation: Techniques and open challenges. In Foundations of Hardware IP Protection, pages 105–123. Springer, 2017.
  • [4] Steffen Becker, Carina Wiesen, Nils Albartus, Nikol Rummel, and Christof Paar. An exploratory study of hardware reverse engineering — technical and cognitive processes. In Sixteenth Symposium on Usable Privacy and Security, SOUPS 2020, August 7-11, 2020, pages 285–300. USENIX Association, 2020.
  • [5] Christopher R Bowie and Philip D Harvey. Administration and interpretation of the trail making test. Nature Protocols, 1(5):2277–2281, December 2006.
  • [6] Gregory H. Chisholm, Steven T. Eckmann, Christopher M. Lain, and Robert Veroff. Understanding integrated circuits. IEEE Design & Test of Computers, 16(2):26–37, 1999.
  • [7] Lap Wai Chow, Bryan J Wang, James P Baukus, and Ronald P Cocchi. Secure logic locking and configuration with camouflaged programmable micro netlists, June 23 2020. US Patent 10,691,860.
  • [8] Ronald P. Cocchi, James P. Baukus, Lap Wai Chow, and Bryan J. Wang. Circuit camouflage integration for hardware IP protection. In The 51st Annual Design Automation Conference 2014, DAC ’14, San Francisco, CA, USA, June 1-5, 2014, pages 153:1–153:5. ACM, 2014.
  • [9] Lee J Cronbach. Coefficient alpha and the internal structure of tests. Psychometrika, 16(3):297–334, 1951.
  • [10] European Commission. A Chips Act for Europe – Commission Staff Working Document, May 2022.
  • [11] Marc Fyrbiak, Sebastian Strauss, Christian Kison, Sebastian Wallat, Malte Elson, Nikol Rummel, and Christof Paar. Hardware reverse engineering: Overview and open challenges. In IEEE 2nd International Verification and Security Workshop, IVSW 2017, Thessaloniki, Greece, July 3-5, 2017, pages 88–94. IEEE, 2017.
  • [12] Marc Fyrbiak, Sebastian Wallat, Jonathan Déchelotte, Nils Albartus, Sinan Böcker, Russell Tessier, and Christof Paar. On the difficulty of FSM-based hardware obfuscation. IACR Transactions on Cryptographic Hardware and Embedded Systems (TCHES), 2018(3):293–330, 2018.
  • [13] Marc Fyrbiak, Sebastian Wallat, Pawel Swierczynski, Max Hoffmann, Sebastian Hoppach, Matthias Wilhelm, Tobias Weidlich, Russell Tessier, and Christof Paar. HAL – the missing piece of the puzzle for hardware reverse engineering, trojan detection and insertion. IEEE Transactions on Dependable and Secure Computing, 16(3):498–510, 2019.
  • [14] Mark C. Hansen, Hakan Yalcin, and John P. Hayes. Unveiling the ISCAS-85 benchmarks: A case study in reverse engineering. IEEE Design & Test of Computers, 16(3):72–80, 1999.
  • [15] Adam G. Kimura, Adam R. Waite, Jon Scholl, James Schaffranek, Matt Sutter, and Glen D. Via. From silicon to simulation: A full decomposition of a fabricated 130 nm serial peripheral interface for establishing an assurance baseline root-of-trust. In 2020 IEEE Physical Assurance and Inspection of Electronics (PAINE). IEEE, December 2020.
  • [16] Thomas Kuenemund. Semiconductor chip using logic circuitry including complementary fets for reverse engineering protection, July 9 2019. US Patent 10,347,630.
  • [17] N.Y. Louis Lee and P.N. Johnson-Laird. A theory of reverse engineering and its application to Boolean systems. Journal of Cognitive Psychology, 25(4):365–389, 2013.
  • [18] Philipp Mayring. Qualitative content analysis: Demarcation, varieties, developments. Forum: Qualitative Social Research, 20(3):1–26, 2019.
  • [19] Alan D. Mead and Fritz Drasgow. Equivalence of computerized and paper-and-pencil cognitive ability tests: A meta-analysis. Psychological Bulletin, 114(3):449–458, November 1993.
  • [20] Travis Meade, Shaojie Zhang, and Yier Jin. Netlist reverse engineering for high-level functionality reconstruction. In 21st Asia and South Pacific Design Automation Conference, ASP-DAC 2016, Macao, Macao, January 25-28, 2016, pages 655–660. IEEE, 2016.
  • [21] Wolf Dieter Oswald. Zahlen-Verbindungs-Test ZVT. 3., überarbeitete und neu normierte Auflage. Hogrefe, 2016.
  • [22] Endres Puschner, Thorben Moos, Steffen Becker, Christian Kison, Amir Moradi, and Christof Paar. Red team vs. blue team: A real-world hardware trojan detection case study across four modern CMOS technology generations. In 2023 IEEE Symposium on Security and Privacy (SP), pages 56–74, Los Alamitos, CA, USA, May 2023. IEEE Computer Society.
  • [23] Shahed E. Quadir, Junlin Chen, Domenic Forte, Navid Asadizanjani, Sina Shahbazmohamadi, Lei Wang, John A. Chandy, and Mark Mohammad Tehranipoor. A survey on chip to system reverse engineering. ACM Journal on Emerging Technologies in Computing Systems, 13(1):6:1–6:34, 2016.
  • [24] Raul Quijada, Roger Dura, Jofre Pallares, Xavier Formatje, Salvador Hidalgo, and Francisco Serra-Graells. Large-area automated layout extraction methodology for full-IC reverse engineering. Journal of Hardware and Systems Security, 2(4):322–332, 2018.
  • [25] Mohammad Rahman, Ryan Afonso, Hiran Tennakoon, and Carl Sechen. Power reduction via separate synthesis and physical libraries. In Proceedings of the 48th Design Automation Conference, DAC 2011, San Diego, California, USA, June 5-10, 2011, pages 627–632. ACM, 2011.
  • [26] M. G. Rekoff. On reverse engineering. IEEE Transactions on Systems, Man, and Cybernetics, 15(2):244–252, 1985.
  • [27] Palash Sarkar and Subhamoy Maitra. Nonlinearity bounds and constructions of resilient Boolean functions. In Advances in Cryptology - CRYPTO 2000, 20th Annual International Cryptology Conference, Santa Barbara, California, USA, August 20-24, 2000, Proceedings, pages 515–532. Springer, 2000.
  • [28] Senate of the United States. CHIPS and Science Act 2022 (P.L. 117-167), July 2022.
  • [29] Bicky Shakya, Hao-Ting Shen, Mark Mohammad Tehranipoor, and Domenic Forte. Covert gates: Protecting integrated circuits with undetectable camouflaging. IACR Transactions on Cryptographic Hardware and Embedded Systems (TCHES), 2019(3):86–118, 2019.
  • [30] Yiqiong Shi, Bah-Hwee Gwee, Ye Ren, Thet Khaing Phone, and Chan Wai Ting. Extracting functional modules from flattened gate-level netlist. In International Symposium on Communications and Information Technologies, ISCIT 2012, Gold Coast, Australia, October 2-5, 2012, pages 538–543. IEEE, 2012.
  • [31] Yiqiong Shi, Chan Wai Ting, Bah-Hwee Gwee, and Ye Ren. A highly efficient method for extracting FSMs from flattened gate-level netlist. In International Symposium on Circuits and Systems (ISCAS 2010), May 30 - June 2, 2010, Paris, France, pages 2610–2613. IEEE, 2010.
  • [32] Pramod Subramanyan, Nestan Tsiskaridze, Wenchao Li, Adrià Gascón, Wei Yang Tan, Ashish Tiwari, Natarajan Shankar, Sanjit A. Seshia, and Sharad Malik. Reverse engineering digital circuits using structural and functional analyses. IEEE Transactions on Emerging Topics in Computing, 2(1):63–80, 2014.
  • [33] Mohsen Tavakol and Reg Dennick. Making sense of Cronbach’s alpha. International Journal of Medical Education, 2:53, 2011.
  • [34] Olivier Thomas and Dmitry Nedospasov. On the impact of automating the IC analysis process. Technical report, Texplained SARL, 2015.
  • [35] Randy Torrance and Dick James. The state-of-the-art in IC reverse engineering. In Cryptographic Hardware and Embedded Systems - CHES 2009, 11th International Workshop, Lausanne, Switzerland, September 6-9, 2009, Proceedings, pages 363–381. Springer, 2009.
  • [36] Philip A. Vernon. Der Zahlen-Verbindungs-Test and other trail-making correlates of general intelligence. Personality and Individual Differences, 14(1):35–40, January 1993.
  • [37] Sebastian Wallat, Nils Albartus, Steffen Becker, Max Hoffmann, Maik Ender, Marc Fyrbiak, Adrian Drees, Sebastian Maaßen, and Christof Paar. Highway to HAL: Open-sourcing the first extendable gate-level netlist reverse engineering framework. In Proceedings of the 16th ACM International Conference on Computing Frontiers, CF 2019, Alghero, Italy, April 30 - May 2, 2019, pages 392–397. ACM, 2019.
  • [38] Sebastian Wallat, Marc Fyrbiak, Moritz Schlögel, and Christof Paar. A look at the dark side of hardware reverse engineering — a case study. In IEEE 2nd International Verification and Security Workshop, IVSW 2017, Thessaloniki, Greece, July 3-5, 2017, pages 95–100. IEEE, 2017.
  • [39] Bryan J Wang, Lap Wai Chow, James P Baukus, and Ronald P Cocchi. Method and apparatus for camouflaging an integrated circuit using virtual camouflage cells, October 27 2020. US Patent 10,817,638.
  • [40] B. L. Welch. On the comparison of several mean values: An alternative approach. Biometrika, 38(3/4):330, December 1951.
  • [41] Carina Wiesen, Nils Albartus, Max Hoffmann, Steffen Becker, Sebastian Wallat, Marc Fyrbiak, Nikol Rummel, and Christof Paar. Towards cognitive obfuscation: Impeding hardware reverse engineering based on psychological insights. In Proceedings of the 24th Asia and South Pacific Design Automation Conference, ASPDAC 2019, Tokyo, Japan, January 21-24, 2019, pages 104–111. ACM, 2019.
  • [42] Carina Wiesen, Steffen Becker, Nils Albartus, Christof Paar, and Nikol Rummel. Promoting the acquisition of hardware reverse engineering skills. In IEEE Frontiers in Education Conference, FIE 2019, Cincinnati, OH, USA, October 16-19, 2019, pages 1–9. IEEE, 2019.
  • [43] Carina Wiesen, Steffen Becker, Marc Fyrbiak, Nils Albartus, Malte Elson, Nikol Rummel, and Christof Paar. Teaching hardware reverse engineering: Educational guidelines and practical insights. In IEEE International Conference on Teaching, Assessment, and Learning for Engineering, TALE 2018, Wollongong, Australia, December 4-7, 2018, pages 438–445. IEEE, 2018.
  • [44] Carina Wiesen, Steffen Becker, René Walendy, Christof Paar, and Nikol Rummel. The anatomy of hardware reverse engineering: An exploration of human factors during problem solving. ACM Trans. Comput.-Hum. Interact., 30(4), September 2023.

Appendix A ReverSim Prototype

Our HRE simulation is accessible at https://hrestudy.com/review, including the study settings introduced in Section 4.1 and Section 5.1. We have anonymized ReverSim for review. We will never intentionally collect or analyze usage data on this review version, but we note that your IP address is logged and retained for 7 days solely for network security purposes when you access the URL.

Appendix B Complete Study Materials

We provide digital artifacts such as the complete questionnaires and interview guide used in both studies at https://osf.io/vcuyg/?view_only=241797dcfdf54c36b3d9b29b18453c8b.

Appendix C Example Levels for Each Complexity Type

(a) A low-complexity level features a single output and between 2 and 5 gates. All low-complexity levels have three correct solutions.

(b) A medium-complexity level features two outputs and between 7 and 11 gates. There is only one correct solution.

(c) A high-complexity level features three outputs and between 10 and 18 gates. There is only one correct solution.
Figure 10: Levels of increasing complexity differ in the size of the circuit and the number of outputs. All levels feature exactly three inputs and can contain light bulbs and danger signs as outputs. The three examples each show a correct solution. Current-carrying wires are highlighted in yellow, as they would be after the players submitted a correct solution. The colors have been optimized for printing; see Figure 2 for an example of the in-game representation.

Appendix D User Study Task Parameters

Table 1: Parameters for the four qualification levels and the 12 levels of the four complexity groups. The binary strings in the Target Outputs column describe the output types: A ‘1’ stands for a lamp and a ‘0’ stands for a danger sign, for example the string ‘101’ stands for the three outputs lamp, danger sign, lamp (cf. 9(c)). Likewise, the binary strings in the Solutions column indicate the switch positions for the correct solution(s). A ‘1’ stands for a closed switch and a ‘0’ stands for an open switch, for example the string ‘010’ stands for the three switch positions open, closed, open (cf. 9(c)).
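
| Group | Level ID | AND | OR | Inverters | Camouflaged | Total gates | Target Outputs | # Outputs | Solutions |
|---|---|---|---|---|---|---|---|---|---|
| Qualification | 1 | 2 | 0 | 0 | 0 | 2 | 1 | 1 | 111 |
| Qualification | 2 | 1 | 1 | 1 | 0 | 3 | 1 | 1 | 010 |
| Qualification | 3 | 0 | 2 | 2 | 0 | 4 | 0 | 1 | 110 |
| Qualification | 4 | 1 | 2 | 0 | 0 | 3 | 0 | 1 | 000, 100 |
| Low (Group A) | 1 | 1 | 1 | 1 | 0 | 3 | 1 | 1 | 001, 100, 101 |
| Low (Group A) | 2 | 1 | 1 | 1 | 0 | 3 | 1 | 1 | 000, 100, 010 |
| Medium (Group B) | 1 | 3 | 3 | 3 | 0 | 9 | 00 | 2 | 000 |
| Medium (Group B) | 2 | 1 | 4 | 4 | 0 | 9 | 00 | 2 | 101 |
| Medium (Group B) | 3 | 2 | 3 | 2 | 0 | 7 | 11 | 2 | 011 |
| Medium (Group B) | 4 | 3 | 3 | 2 | 0 | 8 | 10 | 2 | 110 |
| High (Group C) | 1 | 3 | 5 | 4 | 0 | 12 | 001 | 3 | 011 |
| High (Group C) | 2 | 2 | 5 | 5 | 0 | 12 | 100 | 3 | 100 |
| High (Group C) | 3 | 7 | 3 | 8 | 0 | 18 | 100 | 3 | 110 |
| High (Group C) | 4 | 7 | 3 | 6 | 0 | 16 | 011 | 3 | 100 |
| Camouflaged Medium (Group D) | 1 | 2 | 3 | 2 | 1 | 8 | 01 | 2 | 111 |
| Camouflaged Medium (Group D) | 2 | 2 | 3 | 2 | 1 | 8 | 01 | 2 | 010 |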
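
The encoding used in Table 1 can be illustrated with a small sketch that evaluates a circuit for all eight switch positions and collects those matching the target outputs. The circuit below is a hypothetical example, not the actual ReverSim level: it was chosen only because it uses one AND gate, one OR gate, and one inverter and yields the same solution set as the first Group A row. The function names are likewise assumptions. Exhaustive enumeration is used here purely to explain the notation; it is not how participants are expected to solve a level.

```python
from itertools import product


def example_circuit(s1, s2, s3):
    """Hypothetical low-complexity circuit: one inverter, one OR, one AND."""
    not_s2 = 1 - s2               # inverter on switch 2
    s1_or_s3 = s1 | s3            # OR gate on switches 1 and 3
    return (not_s2 & s1_or_s3,)   # AND gate drives the single output


def find_solutions(circuit, target):
    """Return all switch strings whose outputs equal the target string.

    In the target, '1' denotes a lamp and '0' a danger sign; in a solution,
    '1' denotes a closed switch and '0' an open switch.
    """
    wanted = tuple(int(bit) for bit in target)
    solutions = []
    for switches in product((0, 1), repeat=3):  # all levels have three inputs
        if circuit(*switches) == wanted:
            solutions.append("".join(str(s) for s in switches))
    return solutions


solutions = find_solutions(example_circuit, "1")
print(solutions)  # ['001', '100', '101']
```

With the target string "1" (a single lamp), the sketch returns three switch strings, in line with the statement that low-complexity levels have three correct solutions.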

Appendix E Items of Theoretical-Knowledge Scale and Practical-Skill Scale from the Interview Study

Table 2: Self-rating items for the interview study participants on a scale ranging from 1 (very low) to 5 (very high)
(a) Subitems of the theoretical-knowledge scale.

| Items | M | SD | N |
|---|---|---|---|
| Boolean Algebra | 4.29 | 0.72 | 14 |
| Integrated Circuits (IC) | 3.86 | 0.77 | 14 |
| FPGAs | 3.86 | 1.02 | 14 |
| Hardware Design | 3.57 | 0.85 | 14 |
| Processes and Methods of Netlist Extraction from ICs | 4.50 | 0.85 | 14 |
| Processes and Methods of Netlist Extraction from FPGAs | 3.50 | 1.45 | 14 |
| Methods of Netlist Analysis | 4.07 | 0.91 | 14 |
| Intellectual Property Protection | 3.79 | 0.97 | 14 |
| Hardware Obfuscation | 3.86 | 0.77 | 14 |
| Hardware Trojans | 3.71 | 1.20 | 14 |

(b) Subitems of the practical-skill scale.

| Items | M | SD | N |
|---|---|---|---|
| Netlist Extraction from ICs | 3.86 | 1.23 | 14 |
| Netlist Extraction from FPGAs | 3.14 | 1.46 | 14 |
| Usage of HRE Tools | 4.00 | 1.10 | 14 |
| Reverse Engineering of Netlists | 4.21 | 1.05 | 14 |
| Detection of Finite State Machines (FSMs) | 3.43 | 1.22 | 14 |
| Detection of Specific Combinational Blocks | 3.64 | 1.27 | 14 |
| Analysis of Data Flow | 3.86 | 1.23 | 14 |
| Netlist Simulation | 3.79 | 1.25 | 14 |
| Object-Oriented Programming | 4.07 | 0.91 | 14 |
| Procedural Programming | 3.64 | 1.08 | 14 |
| Hardware Description Languages | 3.57 | 1.08 | 14 |

Appendix F Description of the Interview Study Sample

Table 3: Demographic data and information on the technical and HRE expertise of the 14 interviewees.
Columns: ID, Age group, Highest degree, Employer sector, Years in HRE, Weekly hours for tasks (HRE), Weekly hours for tasks (other technical), Systems analyzed, Experience self-rating (general), Experience self-rating (theor.), Experience self-rating (practical)
1 18-29 Master ECON 2 10-20 20-30 0-3 4 4.1 4.36
2 18-29 Bachelor 3 40 4-6 4 3.8 3.45
3 18-29 Master 4 20-30 10-20 7-10 5 4.7 4.64
4 30-39 PhD IC 7 5-10 40 51-100 4 4.7 4.64
5 18-29 Master 5 20-30 10-20 11-25 4 4.1 4.09
6 18-29 Master IC 4 10-20 10-20 7-10 3 3.3 3.00
7 40-49 PhD IC 7 10-20 30-40 51-100 5 4.7 4.45
8 30-39 Master IC 4 20-30 20-30 0-3 4 3.6 2.55
9 30-39 PhD PUB 7 20-30 20-30 7-10 5 4.1 3.55
10 30-39 Master IC 5 20-30 30-40 4-6 5 4.1 4.18
11 18-29 Bachelor 2 5-10 4-6 3 3.8 3.91
12 30-39 Master MANU 4 5-10 4-6 3 2.9 2.00
13 18-29 Master ECON 4 30-40 5-10 11-25 4 3.7 3.73
14 18-29 Master CS 3 20-30 10-20 4-6 4 3.0 3.91
Employer sectors: CS = Computer security; MANU = Manufacturing/production of goods, other industry; IC = Information and communication; PUB = Public administration, defense, education, health care and social services; ECON = Scientific/technical services, other economic services.

Appendix G Items of Prior-Knowledge Scale from the User Study

Table 4: Subitems of the domain-specific prior-knowledge scale calculated for the user study participants ranging from 0 (none) to 5 (very high).

| Items | M | SD | N |
|---|---|---|---|
| Transfer: Propositional Logic | 0.91 | 1.28 | 99 |
| Transfer: Boolean Algebra | 1.17 | 1.5 | 99 |
| Transfer: Argumentation Logic | 0.79 | 1.13 | 99 |
| Transfer: Set Algebra | 1.19 | 1.41 | 99 |
| Transfer: Switching Algebra | 0.9 | 1.3 | 99 |
| General: Digital Electronics | 1.52 | 1.54 | 99 |
| General: Algorithms | 1.74 | 1.49 | 99 |
| General: IT Systems | 1.96 | 1.57 | 99 |
| General: Flowcharts | 2.33 | 1.39 | 99 |
| Concrete: Logic Gates | 1.08 | 1.45 | 99 |
| Concrete: Digital Circuits | 0.89 | 1.24 | 99 |
| Expert: Object-oriented Programming Languages | 1.16 | 1.55 | 99 |
| Expert: Procedural Programming Languages | 1.14 | 1.44 | 99 |
| Expert: Hardware Description Languages | 0.7 | 1.11 | 99 |
| Expert: Software Reverse Engineering | 0.7 | 1.22 | 99 |
| Expert: Hardware Reverse Engineering | 0.38 | 0.91 | 99 |

Appendix H Detailed Codebook

The final codebook used to annotate the content of each interview is presented below. The codebook is divided into three parts that reflect the three categories identified in the interview analysis. In the following sections we present the themes and open codes per category, and provide a short description for each code. Category 3 was not further divided into subthemes and instead contains open codes at the top level.

H.1 Category 1: General Impression

Positive Feedback
  • Fun to play: Participant stated that the simulation was fun to play.

  • Obfuscation particularly interesting: Participant noted that the obfuscated levels were interesting to solve.

  • Annotation tools helpful: Participant mentioned that it was helpful to use the drawing tools to solve the levels.

  • Didactically well structured: Participant said that the interactive tutorial was didactically well structured.

  • Intuitive design: Participant stated that they felt well guided throughout the simulation.

Constructive Criticism
  • Provide additional support: Participant would like to have more support in solving the levels, especially for non-expert participants.

  • Redesign obfuscated levels: Participant suggested redesigning the obfuscated levels, as the obfuscated element was not in the critical analysis path and could be ignored.

  • Comparable strategies for analyzing obfuscated and unobfuscated simulation levels: To solve the obfuscated levels, the participant did not change the strategy they had applied to the unobfuscated levels.

  • Varying difficulties of interactive tutorial and game levels: Participant stated that the difference in complexity between the interactive tutorial and the simulation levels was substantial.

H.2 Category 2: Comparing ReverSim with Reality

Elements Recognized from Real-World Netlist Reverse Engineering
  • Found similar obfuscation schemes in real netlists: Participant identified obfuscation principles (e. g., covert gates) in the simulation that they knew from real netlists.

  • Switches: Participant recognized switches in the simulation as (global) inputs from real netlists.

  • Wires: Participant stated that the electrical connections in a netlist are comparable to the wires depicted in the simulation.

  • Outputs: Participant said that the outputs of the circuits in the simulation (i. e., lamp, danger sign) are comparable to the concept of (global) outputs in real netlists.

  • Logic zero and one: Participant considers the concept of binary values in simulation comparable to that in real (digital) netlists.

  • Basic combinational gates: Participants identified basic combinational gates (i. e., AND, OR, NOT) that they also find in real netlists.

Real-World Elements Missing in ReverSim
  • Specific basic elements missing: Participant said that specific basic elements (e. g., NAND, NOR, XOR) were not included in the simulation but do occur in real netlist analysis.

  • Overall complexity of circuits too low: Participant stated that the overall complexity of the levels in the simulation is too low in comparison to real HRE.

  • No sequential logic: Participant mentioned that in real-world netlist analysis, they also analyze sequential logic that was not part of the simulation.

  • Obfuscated gates never seen in reality: Participant noted that they have never seen obfuscated gates in real netlists.

Real-World Approaches Covered by ReverSim
  • Analysis of modules: Participants recognized parallels between the analysis of simulation levels and the analysis of (sub)modules in real netlists.

  • Annotation, pen and paper: Participant stated that they used the drawing tools to make annotations and recognized parallels to the annotation process in real netlists.

  • Output-driven approach, back justification: Participant stated that they solved the levels by focusing on the outputs and said that this approach is also common in real HRE.

  • Hypothesis-driven approach: Participant formulated a hypothesis about an element or an input-output behavior and started to analyze the circuit based on this hypothesis. Furthermore, the participant said that this approach is comparable to real HRE.

  • Simulation: Participant said that they simulated specific outputs of the circuit (e. g., with the drawing tools) which is comparable to key processes of simulation in real netlists.

  • “I saw the solution”: Participant said that they just saw the correct solution for a level, which may also occur in real netlists of low complexity.

Real-World Approaches Missing in ReverSim
  • No semi-automated tools: Participant said that the usage of semi-automated tools is a common practice in HRE but missing in the simulation.

  • No (automated) dynamic analysis: Participant mentioned that dynamic analysis (i. e., simulation, which, in this context, refers to the visualization of current flow in a netlist) is commonly used in netlist analysis, but was not mapped in ReverSim.

  • No modularization of netlists: Participant noted that modularization of netlist elements is a common practice in real HRE but was not included in the simulation.

  • Specific solution approaches missing: Participant stated that a few specific solution approaches they usually apply in HRE (e. g., input-driven approach, trial and error, brute-force) were not part of the simulation.

Evaluation of Gameplay Mechanics
  • Rules force you to think: Participant said that the simulation mechanics and rules forced them to think and to apply specific strategies.

  • Brute-force approach effectively prevented: Participant stated that the rules of the simulation successfully prevented brute-force approaches, as brute-force is inefficient in real netlists.

  • Rules feel natural: Participant mentioned that the rules of the simulation forced them to apply strategies and procedures that they would also apply when analyzing real netlists, and that the rules therefore did not feel unnatural in the HRE context.

H.3 Category 3: Future Research & Additional Features

  • Add further combinational gates: Participant suggested adding further combinational elements such as NAND or XOR in future studies with the simulation.

  • Add sequential logic: Participant suggested adding sequential logic such as flip-flops in future studies with the simulation.

  • Add high-level components: Participant suggested adding high-level components such as adders in future studies with the simulation.

  • Add waveforms: Participant suggested adding waveforms in future studies with the simulation.

  • Additional objectives for game levels: Participant suggested adding objectives to the levels, e. g., determining gate functionality based on a presented circuit.