-
Motorway: Seamless high speed BFT
Authors:
Neil Giridharan,
Florian Suri-Payer,
Ittai Abraham,
Lorenzo Alvisi,
Natacha Crooks
Abstract:
Today's practical, high performance Byzantine Fault Tolerant (BFT) consensus protocols operate in the partial synchrony model. However, existing protocols are inefficient when deployments are indeed partially synchronous. They deliver either low latency during fault-free, synchronous periods (good intervals) or robust recovery from events that interrupt progress (blips). At one end, traditional, v…
▽ More
Today's practical, high performance Byzantine Fault Tolerant (BFT) consensus protocols operate in the partial synchrony model. However, existing protocols are inefficient when deployments are indeed partially synchronous. They deliver either low latency during fault-free, synchronous periods (good intervals) or robust recovery from events that interrupt progress (blips). At one end, traditional, view-based BFT protocols optimize for latency during good intervals, but, when blips occur, can suffer from performance degradation (hangovers) that can last beyond the return of a good interval. At the other end, modern DAG-based BFT protocols recover more gracefully from blips, but exhibit lackluster latency during good intervals. To close the gap, this work presents Motorway, a novel high-throughput BFT protocol that offers both low latency and seamless recovery from blips. By combining a highly parallel asynchronous data dissemination layer with a low-latency, partially synchronous consensus mechanism, Motorway (i) avoids the hangovers incurred by traditional BFT protocols and (ii) matches the throughput of state of the art DAG-based BFT protocols while cutting their latency in half, matching the latency of traditional BFT protocols.
△ Less
Submitted 9 May, 2024; v1 submitted 18 January, 2024;
originally announced January 2024.
-
Bullshark: The Partially Synchronous Version
Authors:
Alexander Spiegelman,
Neil Giridharan,
Alberto Sonnino,
Lefteris Kokoris-Kogias
Abstract:
The purpose of this manuscript is to describe the deterministic partially synchronous version of Bullshark in a simple and clean way. This result is published in CCS 2022, however, the description there is less clear because it uses the terminology of the full asynchronous Bullshark. The CCS version ties the description of the asynchronous and partially synchronous versions of Bullshark since it t…
▽ More
The purpose of this manuscript is to describe the deterministic partially synchronous version of Bullshark in a simple and clean way. This result is published in CCS 2022, however, the description there is less clear because it uses the terminology of the full asynchronous Bullshark. The CCS version ties the description of the asynchronous and partially synchronous versions of Bullshark since it targets an academic audience. Due to the recent interest in DAG-based BFT protocols, we provide a separate and simple description of the partially synchronous version that targets a more general audience. We focus here on the DAG ordering logic. For more details about the asynchronous version, garbage collection, fairness, proofs, related work, evaluation, and efficient DAG implementation please refer to the fullpaper. An intuitive extended summary can be found in the "DAG meets BFT" blogpost.
△ Less
Submitted 12 September, 2022;
originally announced September 2022.
-
BeeGees: stayin' alive in chained BFT
Authors:
Ittai Abraham,
Natacha Crooks,
Neil Giridharan,
Heidi Howard,
Florian Suri-Payer
Abstract:
Modern chained Byzantine Fault Tolerant (BFT) systems leverage a combination of pipelining and leader rotation to obtain both efficiency and fairness. These protocols, however, require a sequence of three or four consecutive honest leaders to commit operations. Therefore, even simple leader failures such as crashes can weaken liveness both theoretically and practically. Obtaining a chained BFT pro…
▽ More
Modern chained Byzantine Fault Tolerant (BFT) systems leverage a combination of pipelining and leader rotation to obtain both efficiency and fairness. These protocols, however, require a sequence of three or four consecutive honest leaders to commit operations. Therefore, even simple leader failures such as crashes can weaken liveness both theoretically and practically. Obtaining a chained BFT protocol that reaches decisions even if the sequence of honest leaders is non-consecutive, remains an open question. To resolve this question we present BeeGees, a novel chained BFT protocol that successfully commits blocks even with non-consecutive honest leaders. It does this while also maintaining quadratic word complexity with threshold signatures, linear word complexity with SNARKs, and responsiveness between consecutive honest leaders. BeeGees reduces the expected commit latency of HotStuff by a factor of three under failures, and the worst-case latency by a factor of seven.
△ Less
Submitted 26 January, 2023; v1 submitted 23 May, 2022;
originally announced May 2022.
-
Bullshark: DAG BFT Protocols Made Practical
Authors:
Alexander Spiegelman,
Neil Giridharan,
Alberto Sonnino,
Lefteris Kokoris-Kogias
Abstract:
We present Bullshark, the first directed acyclic graph (DAG) based asynchronous Byzantine Atomic Broadcast protocol that is optimized for the common synchronous case. Like previous DAG-based BFT protocols, Bullshark requires no extra communication to achieve consensus on top of building the DAG. That is, parties can totally order the vertices of the DAG by interpreting their local view of the DAG…
▽ More
We present Bullshark, the first directed acyclic graph (DAG) based asynchronous Byzantine Atomic Broadcast protocol that is optimized for the common synchronous case. Like previous DAG-based BFT protocols, Bullshark requires no extra communication to achieve consensus on top of building the DAG. That is, parties can totally order the vertices of the DAG by interpreting their local view of the DAG edges. Unlike other asynchronous DAG-based protocols, Bullshark provides a practical low latency fast-path that exploits synchronous periods and deprecates the need for notoriously complex view-change mechanisms. Bullshark achieves this while maintaining all the desired properties of its predecessor DAG-Rider. Namely, it has optimal amortized communication complexity, it provides fairness and asynchronous liveness, and safety is guaranteed even under a quantum adversary. In order to show the practicality and simplicity of our approach, we also introduce a standalone partially synchronous version of Bullshark which we evaluate against the state of the art. The implemented protocol is embarrassingly simple (200 LOC on top of an existing DAG-based mempool implementation (Narwhal & Tusk). It is highly efficient, achieving for example, 125,000 transaction per second with a 2 seconds latency for a deployment of 50 parties. In the same setting the state of the art pays a steep 50% latency increase as it optimizes for asynchrony.
△ Less
Submitted 7 September, 2022; v1 submitted 14 January, 2022;
originally announced January 2022.
-
Scaling Replicated State Machines with Compartmentalization [Technical Report]
Authors:
Michael Whittaker,
Ailidani Ailijiang,
Aleksey Charapko,
Murat Demirbas,
Neil Giridharan,
Joseph M. Hellerstein,
Heidi Howard,
Ion Stoica,
Adriana Szekeres
Abstract:
State machine replication protocols, like MultiPaxos and Raft, are a critical component of many distributed systems and databases. However, these protocols offer relatively low throughput due to several bottlenecked components. Numerous existing protocols fix different bottlenecks in isolation but fall short of a complete solution. When you fix one bottleneck, another arises. In this paper, we int…
▽ More
State machine replication protocols, like MultiPaxos and Raft, are a critical component of many distributed systems and databases. However, these protocols offer relatively low throughput due to several bottlenecked components. Numerous existing protocols fix different bottlenecks in isolation but fall short of a complete solution. When you fix one bottleneck, another arises. In this paper, we introduce compartmentalization, the first comprehensive technique to eliminate state machine replication bottlenecks. Compartmentalization involves decoupling individual bottlenecks into distinct components and scaling these components independently. Compartmentalization has two key strengths. First, compartmentalization leads to strong performance. In this paper, we demonstrate how to compartmentalize MultiPaxos to increase its throughput by 6x on a write-only workload and 16x on a mixed read-write workload. Unlike other approaches, we achieve this performance without the need for specialized hardware. Second, compartmentalization is a technique, not a protocol. Industry practitioners can apply compartmentalization to their protocols incrementally without having to adopt a completely new protocol.
△ Less
Submitted 16 May, 2021; v1 submitted 31 December, 2020;
originally announced December 2020.
-
Matchmaker Paxos: A Reconfigurable Consensus Protocol [Technical Report]
Authors:
Michael Whittaker,
Neil Giridharan,
Adriana Szekeres,
Joseph M. Hellerstein,
Heidi Howard,
Faisal Nawab,
Ion Stoica
Abstract:
State machine replication protocols, like MultiPaxos and Raft, are at the heart of nearly every strongly consistent distributed database. To tolerate machine failures, these protocols must replace failed machines with live machines, a process known as reconfiguration. Reconfiguration has become increasingly important over time as the need for frequent reconfiguration has grown. Despite this, recon…
▽ More
State machine replication protocols, like MultiPaxos and Raft, are at the heart of nearly every strongly consistent distributed database. To tolerate machine failures, these protocols must replace failed machines with live machines, a process known as reconfiguration. Reconfiguration has become increasingly important over time as the need for frequent reconfiguration has grown. Despite this, reconfiguration has largely been neglected in the literature. In this paper, we present Matchmaker Paxos and Matchmaker MultiPaxos, a reconfigurable consensus and state machine replication protocol respectively. Our protocols can perform a reconfiguration with little to no impact on the latency or throughput of command processing; they can perform a reconfiguration in one round trip (theoretically) and a few milliseconds (empirically); they provide a number of theoretical insights; and they present a framework that can be generalized to other replication protocols in a way that previous reconfiguration techniques can not. We provide proofs of correctness for the protocols and optimizations, and present empirical results from an open source implementation.
△ Less
Submitted 20 July, 2020; v1 submitted 18 July, 2020;
originally announced July 2020.
-
Bipartisan Paxos: A Modular State Machine Replication Protocol
Authors:
Michael Whittaker,
Neil Giridharan,
Adriana Szekeres,
Joseph M. Hellerstein,
Ion Stoica
Abstract:
There is no shortage of state machine replication protocols. From Generalized Paxos to EPaxos, a huge number of replication protocols have been proposed that achieve high throughput and low latency. However, these protocols all have two problems. First, they do not scale. Many protocols actually slow down when you scale them, instead of speeding up. For example, increasing the number of MultiPaxos…
▽ More
There is no shortage of state machine replication protocols. From Generalized Paxos to EPaxos, a huge number of replication protocols have been proposed that achieve high throughput and low latency. However, these protocols all have two problems. First, they do not scale. Many protocols actually slow down when you scale them, instead of speeding up. For example, increasing the number of MultiPaxos acceptors increases quorum sizes and slows down the protocol. Second, they are too complicated. This is not a secret; state machine replication is notoriously difficult to understand.
In this paper, we tackle both problems with a single solution: modularity. We present Bipartisan Paxos (BPaxos), a modular state machine replication protocol. Modularity yields high throughput via scaling. We note that while many replication protocol components do not scale, some do. By modularizing BPaxos, we are able to disentangle the two and scale the bottleneck components to increase the protocol's throughput. Modularity also yields simplicity. BPaxos is divided into a number of independent modules that can be understood and proven correct in isolation.
△ Less
Submitted 29 February, 2020;
originally announced March 2020.