-
Scalable Performance Evaluation of Byzantine Fault-Tolerant Systems Using Network Simulation
Authors:
Christian Berger,
Sadok Ben Toumia,
Hans P. Reiser
Abstract:
Recent Byzantine fault-tolerant (BFT) state machine replication (SMR) protocols increasingly focus on scalability to meet the requirements of distributed ledger technology (DLT). Validating the performance of scalable BFT protocol implementations requires careful evaluation. Our solution uses network simulations to forecast the performance of BFT protocols while experimentally scaling the environment. Our method seamlessly plug-and-plays existing BFT implementations into the simulation without requiring code modification or re-implementation, which is often time-consuming and error-prone. Furthermore, our approach is also significantly cheaper than experiments with real large-scale cloud deployments. In this paper, we first explain our simulation architecture, which enables scalable performance evaluations of BFT systems through high performance network simulations. We validate the accuracy of these simulations for predicting the performance of BFT systems by comparing simulation results with measurements of real systems deployed on cloud infrastructures. We found that simulation results display a reasonable approximation at a larger system scale, because the network eventually becomes the dominating factor limiting system performance. In the second part of our paper, we use our simulation method to evaluate the performance of PBFT and BFT protocols from the blockchain generation, such as HotStuff and Kauri, in large-scale and realistic wide-area network scenarios, as well as under induced faults.
Submitted 29 September, 2023;
originally announced September 2023.
-
Chasing the Speed of Light: Low-Latency Planetary-Scale Adaptive Byzantine Consensus
Authors:
Christian Berger,
Lívio Rodrigues,
Hans P. Reiser,
Vinicius Cogo,
Alysson Bessani
Abstract:
Blockchain technology has sparked renewed interest in planetary-scale Byzantine fault-tolerant (BFT) state machine replication (SMR). While recent works have mainly focused on improving the scalability and throughput of these protocols, few have addressed latency. We present FlashConsensus, a novel transformation for optimizing the latency of quorum-based BFT consensus protocols. FlashConsensus uses an adaptive resilience threshold that enables faster transaction ordering when the system contains few faulty replicas. Our construction exploits adaptive weighted replication to automatically assign high voting power to the fastest replicas, forming small quorums that significantly speed up consensus. Even when using such quorums with a smaller resilience threshold, FlashConsensus still satisfies the standard SMR safety and liveness guarantees with optimal resilience, thanks to the judicious integration of abortable SMR and BFT forensics techniques. Our experiments with tens of replicas spread across all continents show that FlashConsensus can order transactions with finality in less than 0.4s, half the time of a PBFT-like protocol (with optimal consensus latency) in the same network, and matching the latency of this protocol running on the theoretically best possible internet links (transmitting at 67% of the speed of light).
Submitted 24 May, 2023;
originally announced May 2023.
-
SoK: Scalability Techniques for BFT Consensus
Authors:
Christian Berger,
Signe Schwarz-Rüsch,
Arne Vogel,
Kai Bleeke,
Leander Jehl,
Hans P. Reiser,
Rüdiger Kapitza
Abstract:
With the advancement of blockchain systems, many recent research works have proposed distributed ledger technology (DLT) that employs Byzantine fault-tolerant (BFT) consensus protocols to decide which block to append next to the ledger. Notably, BFT consensus can offer high performance, energy efficiency, and provable correctness properties, and it is thus considered a promising building block for creating highly resilient and performant blockchain infrastructures. Yet, a major ongoing challenge is to make BFT consensus applicable to large-scale environments. A large body of recent work addresses this challenge by developing novel ideas to improve the scalability of BFT consensus, thus opening the path for a new generation of BFT protocols tailored to the needs of blockchain. In this survey, we create a systematization of knowledge about the novel scalability-enhancing techniques that state-of-the-art BFT consensus protocols use. For our comparison, we closely analyze the efforts, assumptions, and trade-offs these protocols make.
Submitted 20 March, 2023;
originally announced March 2023.
-
SmartKex: Machine Learning Assisted SSH Keys Extraction From The Heap Dump
Authors:
Christofer Fellicious,
Stewart Sentanoe,
Michael Granitzer,
Hans P. Reiser
Abstract:
Digital forensics is the process of extracting, preserving, and documenting evidence in digital devices. A commonly used method in digital forensics is to extract data from the main memory of a digital device. However, the main challenge is identifying the important data to be extracted. Several pieces of crucial information reside in the main memory, like usernames, passwords, and cryptographic keys such as SSH session keys. In this paper, we propose SmartKex, a machine-learning assisted method to extract session keys from heap memory snapshots of an OpenSSH process. In addition, we release an openly available dataset and the corresponding toolchain for creating additional data. Finally, we compare SmartKex with naive brute-force methods and empirically show that SmartKex can extract the session keys with high accuracy and high throughput. With the provided resources, we intend to strengthen the research on the intersection between digital forensics, cybersecurity, and machine learning.
Submitted 13 September, 2022; v1 submitted 12 September, 2022;
originally announced September 2022.
-
Simulating BFT Protocol Implementations at Scale
Authors:
Christian Berger,
Sadok Ben Toumia,
Hans P. Reiser
Abstract:
The novel blockchain generation of Byzantine fault-tolerant (BFT) state machine replication (SMR) protocols focuses on scalability and performance to meet requirements of distributed ledger technology (DLT), e.g., decentralization and geographic dispersion. Validating scalability and performance of BFT protocol implementations requires careful evaluation. While experiments with real protocol deployments usually offer the best realism, they are costly and time-consuming. In this paper, we explore simulation of unmodified BFT protocol implementations as a method for cheap and rapid protocol evaluation: We can accurately forecast the performance of a BFT protocol while experimentally scaling its environment, i.e., by varying the number of nodes or geographic dispersion. Our approach is resource-friendly and preserves application-realism, since existing BFT frameworks can be simply plugged into the simulation engine without requiring code modifications or re-implementation.
Submitted 6 September, 2022; v1 submitted 31 August, 2022;
originally announced August 2022.
-
Automatic Integration of BFT State-Machine Replication into IoT Systems
Authors:
Christian Berger,
Hans P. Reiser,
Franz J. Hauck,
Florian Held,
Jörg Domaschka
Abstract:
Byzantine fault tolerance (BFT) can preserve the availability and integrity of IoT systems where single components may suffer from random data corruption or attacks that can expose them to malicious behavior. While state-of-the-art BFT state-machine replication (SMR) libraries are often tailored to fit a standard request-response interaction model with dedicated client-server roles, in our design, we employ an IoT-fit interaction model that assumes a loosely coupled, event-driven interaction between arbitrarily wired IoT components. In this paper, we explore the possibility of automating and streamlining the complete process of integrating BFT SMR into a component-based IoT execution environment. Our main goal is providing simplicity for the developer: We strive to decouple the specification of a logical application architecture from the difficulty of incorporating BFT replication mechanisms into it. Thus, our contributions address the automated configuration, re-wiring and deployment of IoT components, and their replicas, within a component-based, event-driven IoT platform.
Submitted 6 July, 2022; v1 submitted 1 July, 2022;
originally announced July 2022.
-
Evaluating Blockchain Application Requirements and their Satisfaction in Hyperledger Fabric
Authors:
Sadok Ben Toumia,
Christian Berger,
Hans P. Reiser
Abstract:
Blockchain applications may offer better fault-tolerance, integrity, traceability and transparency compared to centralized solutions. Despite these benefits, few businesses switch to blockchain-based applications. Industries worry that the current blockchain implementations do not meet their requirements, e.g., when it comes to scalability, throughput or latency. Hyperledger Fabric (HLF) is a permissioned blockchain infrastructure that aims to meet enterprise needs and provides a highly modular and well-conceived architecture. In this paper, we survey and analyse requirements of blockchain applications with respect to their underlying infrastructure, focusing mainly on performance and resilience characteristics. Subsequently, we discuss to what extent Fabric's current design allows it to meet these requirements. We further evaluate the performance of Hyperledger Fabric 2.2 by simulating different use case scenarios, comparing single with multi ordering service performance and conducting an evaluation with mixed workloads.
Submitted 30 November, 2021;
originally announced November 2021.
-
A Survey on Resilience in the IoT: Taxonomy, Classification and Discussion of Resilience Mechanisms
Authors:
Christian Berger,
Philipp Eichhammer,
Hans P. Reiser,
Jörg Domaschka,
Franz J. Hauck,
Gerhard Habiger
Abstract:
Internet-of-Things (IoT) ecosystems tend to grow both in scale and complexity as they consist of a variety of heterogeneous devices, which span over multiple architectural IoT layers (e.g., cloud, edge, sensors). Further, IoT systems increasingly demand the resilient operability of services as they become part of critical infrastructures. This leads to a broad variety of research works that aim to increase the resilience of these systems. In this paper, we create a systematization of knowledge about existing scientific efforts of making IoT systems resilient. In particular, we first discuss the taxonomy and classification of resilience and resilience mechanisms and subsequently survey state-of-the-art resilience mechanisms that have been proposed by research work and are applicable to IoT. As part of the survey, we also discuss questions that focus on the practical aspects of resilience, e.g., which constraints resilience mechanisms impose on developers when designing resilient systems by incorporating a specific mechanism into IoT systems.
Submitted 6 September, 2021;
originally announced September 2021.
-
Making Reads in BFT State Machine Replication Fast, Linearizable, and Live
Authors:
Christian Berger,
Hans P. Reiser,
Alysson Bessani
Abstract:
Practical Byzantine Fault Tolerance (PBFT) is a seminal state machine replication protocol that achieves a performance comparable to non-replicated systems in realistic environments. A reason for such high performance is the set of optimizations introduced in the protocol. One of these optimizations is read-only requests, a particular type of client request which avoids running the three-step agreement protocol and allows replicas to respond directly, thus reducing the latency of reads from five to two communication steps. Given PBFT's broad influence, its design and optimizations influenced many BFT protocols and systems that followed, e.g., BFT-SMaRt. We show, for the first time, that the read-only request optimization introduced in PBFT more than 20 years ago can violate its liveness. Notably, the problem affects not only the optimized read-only operations but also standard, totally-ordered operations. We show this weakness by presenting an attack in which a malicious leader blocks correct clients and present two solutions for patching the protocol, making read-only operations fast and correct. The two solutions were implemented on BFT-SMaRt and evaluated in different scenarios, showing their effectiveness in preventing the identified attack.
Submitted 23 July, 2021;
originally announced July 2021.
-
AWARE: Adaptive Wide-Area Replication for Fast and Resilient Byzantine Consensus
Authors:
Christian Berger,
Hans P. Reiser,
João Sousa,
Alysson Bessani
Abstract:
With upcoming blockchain infrastructures, world-spanning Byzantine consensus is getting practical and necessary. In geographically distributed systems, the pace at which consensus is achieved is limited by the heterogeneous latencies of connections between replicas. If deployed on a wide-area network, consensus-based systems benefit from weighted replication, an approach that utilizes extra replicas and assigns higher voting power to well-connected replicas. This enables more choice in quorum formation and replicas can leverage proportionally smaller quorums to advance, thus decreasing consensus latency. However, the system needs a solution to autonomously adjust to its environment if network conditions change or faults occur. We present Adaptive Wide-Area REplication (AWARE), a mechanism which improves the geographical scalability of consensus with nodes being widely spread across the world. Essentially, AWARE is an automated and dynamic voting weight tuning and leader positioning scheme, which supports the emergence of fast quorums in the system. It employs a reliable self-monitoring process and provides a prediction model seeking to minimize the system's consensus latency. In experiments using several AWS EC2 regions, AWARE dynamically optimizes consensus latency by self-reliantly finding a fast weight configuration yielding latency gains observed by clients located across the globe.
Submitted 3 November, 2020;
originally announced November 2020.
-
Analysis of Static and Dynamic Configurability of Existing Group Communication Systems
Authors:
Johannes Köstler,
Hans P. Reiser
Abstract:
Active replication following the state machine replication (SMR) approach is a way to make existing systems and services more reliable and fault-tolerant. The additional communication overhead has a negative impact on the system's throughput and overall request latency. Today's systems should be highly optimized to their execution environment and usage scenario in order to remedy the performance loss introduced by such group communication systems (GCS). In addition, systems should be able to adapt to changing environmental conditions. This report analyzes the available configuration options of three existing GCSs. To this end, it explains the available configuration parameters and describes the given reconfiguration mechanisms. The parameters found are then classified in a parameter scheme.
Submitted 17 August, 2017;
originally announced August 2017.
-
SecureSMART: A Security Architecture for BFT Replication Libraries
Authors:
Benedikt Höfling,
Hans P. Reiser
Abstract:
Several research projects have shown that Byzantine fault tolerance (BFT) is practical today in terms of performance. Deficiencies in other aspects might still be an obstacle to a more widespread deployment in real-world applications. One of these aspects is an overall security architecture beyond the low-level protocol. This paper proposes the security architecture SecureSMART, which provides dynamic key distribution, internal and external integrity and confidentiality measures, as well as mechanisms for availability and access control. For this purpose, it implements security mechanisms among clients, nodes and an external trust center.
Submitted 11 April, 2012;
originally announced April 2012.