-
2P-BFT-Log: 2-Phase Single-Author Append-Only Log for Adversarial Environments
Authors:
Erick Lavoie
Abstract:
Replicated append-only logs sequentially order messages from the same author such that their ordering can be eventually recovered even with out-of-order and unreliable dissemination of individual messages. They are widely used for implementing replicated services in both clouds and peer-to-peer environments because they provide simple and efficient incremental reconciliation. However, existing designs of replicated append-only logs assume replicas faithfully maintain the sequential properties of logs and do not provide eventual consistency when malicious participants fork their logs by disseminating different messages to different replicas for the same index, which may result in partitioning of replicas according to which branch was first replicated.
In this paper, we present 2P-BFT-Log, a two-phase replicated append-only log that provides eventual consistency in the presence of forks from malicious participants such that all correct replicas will eventually agree either on the most recent message of a valid log (first phase) or on the earliest point at which a fork occurred as well as on an irrefutable proof that it happened (second phase). We provide definitions, algorithms, and proofs of the key properties of the design, and explain one way to implement the design onto Git, an eventually consistent replicated database originally designed for distributed version control.
Our design enables correct replicas to faithfully implement the happens-before relationship first introduced by Lamport that underpins most existing distributed algorithms, with eventual detection of forks from malicious participants to exclude the latter from further progress. This opens the door to adaptations of existing distributed algorithms to a cheaper detect and repair paradigm, rather than the more common and expensive systematic prevention of incorrect behaviour.
Submitted 28 July, 2023; v1 submitted 17 July, 2023;
originally announced July 2023.
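The fork condition described in the 2P-BFT-Log abstract above (different messages disseminated for the same index) can be illustrated with a minimal sketch. This is not the paper's algorithm: the `Message` and `Replica` names, the hash-chain layout, and the tie-breaking rule are assumptions made for illustration, and signatures, which a real irrefutable proof would need, are left out.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Message:
    """Hypothetical log entry; field names are illustrative assumptions."""
    author: str
    index: int       # position in the author's log, starting at 0
    prev_hash: str   # digest of the previous message ("" for index 0)
    payload: str

    def digest(self) -> str:
        data = f"{self.author}|{self.index}|{self.prev_hash}|{self.payload}"
        return hashlib.sha256(data.encode("utf-8")).hexdigest()


class Replica:
    """Stores one author's messages and records the earliest fork it has seen."""

    def __init__(self) -> None:
        self.by_index: Dict[int, Message] = {}
        self.fork_proof: Optional[Tuple[Message, Message]] = None

    def receive(self, msg: Message) -> None:
        known = self.by_index.get(msg.index)
        if known is None:
            self.by_index[msg.index] = msg
        elif known.digest() != msg.digest():
            # Two distinct messages claim the same index: this pair is the
            # evidence of a fork. Keep only the fork with the lowest index.
            if self.fork_proof is None or msg.index < self.fork_proof[0].index:
                first, second = sorted((known, msg), key=lambda m: m.digest())
                self.fork_proof = (first, second)


# Usage: two replicas receive the conflicting branches in opposite orders.
m0 = Message("alice", 0, "", "hello")
branch_a = Message("alice", 1, m0.digest(), "pay bob")
branch_b = Message("alice", 1, m0.digest(), "pay carol")

r1, r2 = Replica(), Replica()
for m in (m0, branch_a, branch_b):
    r1.receive(m)
for m in (m0, branch_b, branch_a):
    r2.receive(m)
```

Ordering the pair by digest is one simple way to make both replicas hold the same evidence regardless of arrival order; the sketch only covers the case of two conflicting messages per index.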
-
GOC-Ledger: State-based Conflict-Free Replicated Ledger from Grow-Only Counters
Authors:
Erick Lavoie
Abstract:
Conventional blockchains use consensus algorithms that totally order updates across all accounts, which is stronger than necessary to implement a replicated ledger. This makes updates slower and more expensive than necessary. More recent consensus-free replicated ledgers forego consensus algorithms, with significant increase in performance and decrease in infrastructure costs. However, current designs are based around reliable broadcast of update operations to all replicas which require reliable message delivery and reasoning over operation histories to establish convergence and safety.
In this paper, we present a replicated ledger as a state-based conflict-free replicated data type (CRDT) based on grow-only counters. This design provides two major benefits: 1) it requires a weaker eventual transitive delivery of the latest state rather than reliable broadcast of all update operations to all replicas; 2) eventual convergence and safety properties can be proven easily without having to reason over operation histories: convergence comes from the composition of grow-only counters, themselves CRDTs, and safety properties can be expressed over the state of counters, locally and globally. In addition, applications that tolerate temporary negative balances require no additional mechanisms and applications that require strictly non-negative balances can be supported by enforcing sequential updates to the same account across replicas.
Our design is sufficient when executing on replicas that might crash and recover, as common in deployments in which all replicas are managed by trusted entities. It may also provide a good foundation to explore new mechanisms for tolerating adversarial replicas.
Submitted 26 May, 2023;
originally announced May 2023.
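The idea in the GOC-Ledger abstract above, that convergence follows from composing grow-only counters, can be sketched as follows. This is a simplified illustration and not the paper's design: the `created` and `given` counter names and the balance formula are assumptions, and each counter is assumed to be incremented only by the owner of the sending account.

```python
from collections import defaultdict
from typing import DefaultDict, Tuple


class Ledger:
    """State-based ledger sketch built only from grow-only counters."""

    def __init__(self) -> None:
        # Tokens created for each account (hypothetical counter, never decreases).
        self.created: DefaultDict[str, int] = defaultdict(int)
        # Total tokens ever given from sender to receiver (never decreases).
        self.given: DefaultDict[Tuple[str, str], int] = defaultdict(int)

    def create(self, account: str, amount: int) -> None:
        self.created[account] += amount

    def give(self, sender: str, receiver: str, amount: int) -> None:
        # No balance check: temporary negative balances are tolerated here.
        self.given[(sender, receiver)] += amount

    def balance(self, account: str) -> int:
        received = sum(v for (_, r), v in self.given.items() if r == account)
        sent = sum(v for (s, _), v in self.given.items() if s == account)
        return self.created[account] + received - sent

    def merge(self, other: "Ledger") -> None:
        # Pointwise maximum: commutative, associative, and idempotent.
        for key, value in other.created.items():
            self.created[key] = max(self.created[key], value)
        for pair, value in other.given.items():
            self.given[pair] = max(self.given[pair], value)


# Usage: two replicas update independently, then exchange their latest state.
r1, r2 = Ledger(), Ledger()
r1.create("alice", 10)
r1.give("alice", "bob", 4)
r2.merge(r1)
r2.give("bob", "carol", 1)
r1.merge(r2)
r2.merge(r1)
```

Only the latest state needs to reach the other replica; intermediate states can be lost without affecting the merged result.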
-
State-Based $\infty$P-Set Conflict-Free Replicated Data Type
Authors:
Erick Lavoie
Abstract:
***** This design is a duplicate of a Causal Length Set (see notes in the comments). We nonetheless leave the original paper here because another submission refers to its proofs. *****
The 2P-Set Conflict-Free Replicated Data Type (CRDT) supports two phases for each possible element: in the first phase an element can be added to the set and the subsequent additions are ignored; in the second phase an element can be removed after which it will stay removed forever regardless of subsequent additions and removals. We generalize the 2P-Set to support an infinite sequence of alternating additions and removals of the same element. In the presence of concurrent additions and removals on different replicas, all replicas will eventually converge to the longest sequence of alternating additions and removals that follows causal history.
The idea of converging on the longest-causal sequence of opposite operations had already been suggested in the context of an undo-redo framework but the design was neither given a name nor fully developed. In this paper, we present the full design directly, using nothing more than the basic formulation of state-based CRDTs. We also show the connection between the set-based definition of 2P-Set and the counter-based definition of the $\infty$P-Set with simple reasoning. We then give detailed proofs of convergence. The underlying \textit{grow-only dictionary of grow-only counters} on which the $\infty$P-Set is built may be used to build other state-based CRDTs. In addition, this paper should be useful as a pedagogical example for designing state-based CRDTs, and might help raise the profile of CRDTs based on \textit{longest sequence wins}.
Submitted 26 May, 2023; v1 submitted 4 April, 2023;
originally announced April 2023.
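The counter-based formulation mentioned in the abstract above, a grow-only dictionary of grow-only counters, can be sketched in a few lines. The class and method names are assumptions for illustration; the rule that an odd counter means "present" and an even counter means "absent" follows the usual causal-length convention.

```python
from typing import Dict, Hashable


class InfPSet:
    """Sketch of a set whose elements can be added and removed repeatedly."""

    def __init__(self) -> None:
        # Per-element grow-only counter: length of the add/remove sequence.
        self.counts: Dict[Hashable, int] = {}

    def contains(self, element: Hashable) -> bool:
        # Odd length: the last operation was an add. Even (or 0): a remove.
        return self.counts.get(element, 0) % 2 == 1

    def add(self, element: Hashable) -> None:
        if not self.contains(element):
            self.counts[element] = self.counts.get(element, 0) + 1

    def remove(self, element: Hashable) -> None:
        if self.contains(element):
            self.counts[element] += 1

    def merge(self, other: "InfPSet") -> None:
        # Longest sequence wins: keep the larger counter for each element.
        for element, count in other.counts.items():
            self.counts[element] = max(self.counts.get(element, 0), count)


# Usage: replica b removes "x" once while replica a removes and re-adds it.
a, b = InfPSet(), InfPSet()
a.add("x")          # a: counter 1
b.merge(a)          # b: counter 1
b.remove("x")       # b: counter 2
a.remove("x")       # a: counter 2
a.add("x")          # a: counter 3
a.merge(b)
b.merge(a)
```

After the exchange both replicas hold counter 3 for "x", so the element is present on both: the longer sequence of alternating operations wins.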
-
Decentralized Learning Made Practical with Client Sampling
Authors:
Martijn de Vos,
Akash Dhasade,
Anne-Marie Kermarrec,
Erick Lavoie,
Johan Pouwelse,
Rishi Sharma
Abstract:
Decentralized learning (DL) leverages edge devices for collaborative model training while avoiding coordination by a central server. Due to privacy concerns, DL has become an attractive alternative to centralized learning schemes since training data never leaves the device. In a round of DL, all nodes participate in model training and exchange their model with some other nodes. Performing DL in large-scale heterogeneous networks results in high communication costs and prolonged round durations due to slow nodes, effectively inflating the total training time. Furthermore, current DL algorithms also assume all nodes are available for training and aggregation at all times, diminishing the practicality of DL. This paper presents Plexus, an efficient, scalable, and practical DL system. Plexus (1) avoids network-wide participation by introducing a decentralized peer sampler that selects small subsets of available nodes that train the model each round and, (2) aggregates the trained models produced by nodes every round. Plexus is designed to handle joining and leaving nodes (churn). We extensively evaluate Plexus by incorporating realistic traces for compute speed, pairwise latency, network capacity, and availability of edge devices in our experiments. Our experiments on four common learning tasks empirically show that Plexus reduces time-to-accuracy by 1.2-8.3x, communication volume by 2.4-15.3x and training resources needed for convergence by 6.4-370x compared to baseline DL algorithms.
Submitted 7 May, 2024; v1 submitted 27 February, 2023;
originally announced February 2023.
-
Refined Convergence and Topology Learning for Decentralized SGD with Heterogeneous Data
Authors:
Batiste Le Bars,
Aurélien Bellet,
Marc Tommasi,
Erick Lavoie,
Anne-Marie Kermarrec
Abstract:
One of the key challenges in decentralized and federated learning is to design algorithms that efficiently deal with highly heterogeneous data distributions across agents. In this paper, we revisit the analysis of the popular Decentralized Stochastic Gradient Descent algorithm (D-SGD) under data heterogeneity. We exhibit the key role played by a new quantity, called neighborhood heterogeneity, on the convergence rate of D-SGD. By coupling the communication topology and the heterogeneity, our analysis sheds light on the poorly understood interplay between these two concepts. We then argue that neighborhood heterogeneity provides a natural criterion to learn data-dependent topologies that reduce (and can even eliminate) the otherwise detrimental effect of data heterogeneity on the convergence time of D-SGD. For the important case of classification with label skew, we formulate the problem of learning such a good topology as a tractable optimization problem that we solve with a Frank-Wolfe algorithm. As illustrated over a set of simulated and real-world experiments, our approach provides a principled way to design a sparse topology that balances the convergence speed and the per-iteration communication costs of D-SGD under data heterogeneity.
Submitted 21 October, 2022; v1 submitted 9 April, 2022;
originally announced April 2022.
-
D-Cliques: Compensating for Data Heterogeneity with Topology in Decentralized Federated Learning
Authors:
Aurélien Bellet,
Anne-Marie Kermarrec,
Erick Lavoie
Abstract:
The convergence speed of machine learning models trained with Federated Learning is significantly affected by heterogeneous data partitions, even more so in a fully decentralized setting without a central server. In this paper, we show that the impact of label distribution skew, an important type of data heterogeneity, can be significantly reduced by carefully designing the underlying communication topology. We present D-Cliques, a novel topology that reduces gradient bias by grouping nodes in sparsely interconnected cliques such that the label distribution in a clique is representative of the global label distribution. We also show how to adapt the updates of decentralized SGD to obtain unbiased gradients and implement an effective momentum with D-Cliques. Our extensive empirical evaluation on MNIST and CIFAR10 demonstrates that our approach provides a convergence speed similar to that of a fully-connected topology, which provides the best convergence in a data heterogeneous setting, with a significant reduction in the number of edges and messages. In a 1000-node topology, D-Cliques require 98% fewer edges and 96% fewer total messages, with further possible gains using a small-world topology across cliques.
Submitted 4 November, 2021; v1 submitted 15 April, 2021;
originally announced April 2021.
-
Genet: A Quickly Scalable Fat-Tree Overlay for Personal Volunteer Computing using WebRTC
Authors:
Erick Lavoie,
Laurie Hendren,
Fréderic Desprez,
Miguel Correia
Abstract:
WebRTC enables browsers to exchange data directly but the number of possible concurrent connections to a single source is limited. We overcome the limitation by organizing participants in a fat-tree overlay: when the maximum number of connections of a tree node is reached, the new participants connect to the node's children. Our design quickly scales when a large number of participants join in a short amount of time, by relying on a novel scheme that only requires local information to route connection messages: the destination is derived from the hash value of the combined identifiers of the message's source and of the node that is holding the message. The scheme provides deterministic routing of a sequence of connection messages from a single source and probabilistic balancing of newer connections among the leaves. We show that this design puts at least 83% of nodes at the same depth as a deterministic algorithm, can connect a thousand browser windows in 21-55 seconds in a local network, and can be deployed for volunteer computing to tap into 320 cores in less than 30 seconds on a local network to increase the total throughput on the Collatz application by two orders of magnitude compared to a single core.
Submitted 25 April, 2019;
originally announced April 2019.
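The routing rule described in the Genet abstract above (the destination is derived from the hash of the combined identifiers of the message's source and of the node holding the message) can be sketched as follows. The function name, the use of SHA-256, the `:` separator, and the reduction modulo the number of children are all assumptions; the abstract does not specify them.

```python
import hashlib


def next_child(source_id: str, node_id: str, num_children: int) -> int:
    """Pick the child slot that should receive a connection message.

    Only local information is used: the source of the message and the
    identifier of the node currently holding it.
    """
    if num_children <= 0:
        raise ValueError("num_children must be positive")
    combined = f"{source_id}:{node_id}".encode("utf-8")
    digest = hashlib.sha256(combined).digest()
    # Interpret the first 8 bytes as an integer and map it to a child slot.
    return int.from_bytes(digest[:8], "big") % num_children


# Usage: every message from the same source held by the same node
# is forwarded to the same child.
first = next_child("browser-42", "node-root", 10)
second = next_child("browser-42", "node-root", 10)
```

Because the choice depends only on the two identifiers, repeated connection messages from one source follow the same path, while different sources are spread across children according to the hash.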
-
Personal Volunteer Computing
Authors:
Erick Lavoie,
Laurie Hendren
Abstract:
We propose personal volunteer computing, a novel paradigm to encourage technical solutions that leverage personal devices, such as smartphones and laptops, for personal applications that require significant computations, such as animation rendering and image processing. The paradigm requires no investment in additional hardware, relying instead on devices that are already owned by users and their community, and favours simple tools that can be implemented part-time by a single developer. We show that samples of personal devices of today are competitive with a top-of-the-line laptop from two years ago. We also propose new directions to extend the paradigm.
Submitted 19 March, 2019; v1 submitted 4 April, 2018;
originally announced April 2018.
-
Pando: Personal Volunteer Computing in Browsers
Authors:
Erick Lavoie,
Laurie Hendren,
Frederic Desprez,
Miguel Correia
Abstract:
The large penetration and continued growth in ownership of personal electronic devices represent a freely available and largely untapped source of computing power. To leverage those, we present Pando, a new volunteer computing tool based on a declarative concurrent programming model and implemented using JavaScript, WebRTC, and WebSockets. This tool enables a dynamically varying number of failure-prone personal devices contributed by volunteers to parallelize the application of a function on a stream of values, by using the devices' browsers. We show that Pando can provide throughput improvements compared to a single personal device, on a variety of compute-bound applications including animation rendering and image processing. We also show the flexibility of our approach by deploying Pando on personal devices connected over a local network, on Grid5000, a France-wide computing grid in a virtual private network, and seven PlanetLab nodes distributed in a wide area network over Europe.
Submitted 6 September, 2019; v1 submitted 22 March, 2018;
originally announced March 2018.
-
A Formalization for Specifying and Implementing Correct Pull-Stream Modules
Authors:
Erick Lavoie,
Laurie Hendren
Abstract:
Pull-stream is a JavaScript demand-driven functional design pattern based on callback functions that enables the creation and easy composition of independent modules that are used to create streaming applications. It is used in popular open source projects and the community around it has created over a hundred compatible modules. While the description of the pull-stream design pattern may seem simple, it does exhibit complicated termination cases. Despite the popularity and large uptake of the pull-stream design pattern, there was no existing formal specification that could help programmers reason about the correctness of their implementations.
Thus, the main contribution of this paper is to provide a formalization for specifying and implementing correct pull-stream modules based on the following: (1) we show the pull-stream design pattern is a form of declarative concurrent programming; (2) we present an event-based protocol language that supports our formalization, independently of JavaScript; (3) we provide the first precise and explicit definition of the expected sequences of events that happen at the interface of two modules, which we call the pull-stream protocol; (4) we specify reference modules that exhibit the full range of behaviors of the pull-stream protocol; (5) we validate our definitions against the community expectations by testing the existing core pull-stream modules against them and identify unspecified behaviors in existing modules.
Our approach helps to better understand the pull-stream protocol, to ensure interoperability of community modules, and to concisely and precisely specify new pull-stream abstractions in papers and documentation.
Submitted 18 January, 2018;
originally announced January 2018.
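The callback-based, demand-driven pattern described in the pull-stream abstract above can be sketched as follows. Pull-stream itself is a JavaScript pattern; this is a Python transliteration for illustration only, restricted to synchronous sources, and the helper names (`values`, `map_through`, `collect`) are assumptions that merely mirror common pull-stream module names.

```python
from typing import Any, Callable, Iterable, List

# A callback receives (end, data); a truthy `end` means the stream is over.
Callback = Callable[[Any, Any], None]
# A source is read by calling it with (abort, callback).
Read = Callable[[Any, Callback], None]


def values(items: Iterable[Any]) -> Read:
    """Source: answers each request with the next item, then signals the end."""
    iterator = iter(items)

    def read(abort: Any, cb: Callback) -> None:
        if abort:
            cb(abort, None)      # downstream asked to stop early
            return
        try:
            item = next(iterator)
        except StopIteration:
            cb(True, None)       # normal termination
            return
        cb(None, item)

    return read


def map_through(fn: Callable[[Any], Any]) -> Callable[[Read], Read]:
    """Through: transforms each value and forwards termination unchanged."""
    def through(read: Read) -> Read:
        def mapped(abort: Any, cb: Callback) -> None:
            def on_value(end: Any, data: Any) -> None:
                if end:
                    cb(end, None)
                else:
                    cb(None, fn(data))
            read(abort, on_value)
        return mapped
    return through


def collect(read: Read) -> List[Any]:
    """Sink: keeps requesting values until the source signals the end."""
    results: List[Any] = []
    done = False

    def on_value(end: Any, data: Any) -> None:
        nonlocal done
        if end:
            done = True
        else:
            results.append(data)

    while not done:
        read(None, on_value)
    return results


doubled = collect(map_through(lambda x: x * 2)(values([1, 2, 3])))
```

The sink drives the computation: nothing is produced until it asks, which is the demand-driven aspect. The early-abort path in `values` hints at the termination cases the paper sets out to specify.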