-
Asynchronous BFT Asset Transfer: Quasi-Anonymous, Light, and Consensus-Free
Authors:
Timothé Albouy,
Emmanuelle Anceaume,
Davide Frey,
Mathieu Gestin,
Arthur Rauch,
Michel Raynal,
François Taïani
Abstract:
This article introduces a new asynchronous Byzantine-tolerant asset transfer system (cryptocurrency) with three noteworthy properties: quasi-anonymity, lightness, and consensus-freedom. Quasi-anonymity means no information is leaked regarding the receivers and amounts of the asset transfers. Lightness means that the underlying cryptographic schemes are succinct, and each process only stores data polylogarithmic in the number of its own transfers. Consensus-freedom means the system does not rely on a total order of asset transfers. The proposed algorithm is the first asset transfer system that simultaneously fulfills all these properties in the presence of asynchrony and Byzantine processes. To obtain them, the paper adopts a modular approach combining a new distributed object called agreement proofs and well-known techniques such as vector commitments, universal accumulators, and zero-knowledge proofs. The paper also presents a new non-trivial universal accumulator implementation that does not need knowledge of the underlying accumulated set to generate (non-)membership proofs, which could benefit other crypto-based applications.
Submitted 28 May, 2024;
originally announced May 2024.
-
Overcoming the Challenges of Batch Normalization in Federated Learning
Authors:
Rachid Guerraoui,
Rafael Pinot,
Geovani Rizk,
John Stephan,
François Taiani
Abstract:
Batch normalization has proven to be a very beneficial mechanism to accelerate the training and improve the accuracy of deep neural networks in centralized environments. Yet, the scheme faces significant challenges in federated learning, especially under high data heterogeneity. Essentially, the main challenges arise from external covariate shifts and inconsistent statistics across clients. We introduce in this paper Federated BatchNorm (FBN), a novel scheme that restores the benefits of batch normalization in federated learning. Essentially, FBN ensures that the batch normalization during training is consistent with what would be achieved in a centralized execution, hence preserving the distribution of the data, and providing running statistics that accurately approximate the global statistics. FBN thereby reduces the external covariate shift and matches the evaluation performance of the centralized setting. We also show that, with a slight increase in complexity, we can robustify FBN to mitigate erroneous statistics and potentially adversarial attacks.
Submitted 23 May, 2024;
originally announced May 2024.
-
Low-Cost Privacy-Aware Decentralized Learning
Authors:
Sayan Biswas,
Davide Frey,
Romaric Gaudel,
Anne-Marie Kermarrec,
Dimitri Lerévérend,
Rafael Pires,
Rishi Sharma,
François Taïani
Abstract:
This paper introduces ZIP-DL, a novel privacy-aware decentralized learning (DL) algorithm that exploits correlated noise to provide strong privacy protection against a local adversary while yielding efficient convergence guarantees for a low communication cost. The progressive neutralization of the added noise during the distributed aggregation process results in ZIP-DL fostering a high model accuracy under privacy guarantees. ZIP-DL further uses a single communication round between each gradient descent, thus minimizing communication overhead. We provide theoretical guarantees for both convergence speed and privacy guarantees, thereby making ZIP-DL applicable to practical scenarios. Our extensive experimental study shows that ZIP-DL significantly outperforms the state-of-the-art in terms of vulnerability/accuracy trade-off. In particular, ZIP-DL (i) reduces the efficacy of linkability attacks by up to 52 percentage points compared to baseline DL, (ii) improves accuracy by up to 37 percent w.r.t. the state-of-the-art privacy-preserving mechanism operating under the same threat model as ours, when configured to provide the same protection against membership inference attacks, and (iii) reduces communication by up to 10.5x against the same competitor for the same level of protection.
Submitted 25 June, 2024; v1 submitted 18 March, 2024;
originally announced March 2024.
-
Under manipulations, are some AI models harder to audit?
Authors:
Augustin Godinot,
Gilles Tredan,
Erwan Le Merrer,
Camilla Penzo,
Francois Taïani
Abstract:
Auditors need robust methods to assess the compliance of web platforms with the law. However, since they hardly ever have access to the algorithm, implementation, or training data used by a platform, the problem is harder than a simple metric estimation. Within the recent framework of manipulation-proof auditing, we study in this paper the feasibility of robust audits in realistic settings, in which models exhibit large capacities. We first prove a constraining result: if a web platform uses models that may fit any data, no audit strategy -- whether active or not -- can outperform random sampling when estimating properties such as demographic parity. To better understand the conditions under which state-of-the-art auditing techniques may remain competitive, we then relate the manipulability of audits to the capacity of the targeted models, using the Rademacher complexity. We empirically validate these results on popular models of increasing capacities, thus confirming experimentally that large-capacity models, which are commonly used in practice, are particularly hard to audit robustly. These results refine the limits of the auditing problem, and open up enticing questions on the connection between model capacity and the ability of platforms to manipulate audit attempts.
Submitted 14 February, 2024;
originally announced February 2024.
-
Towards Optimal Communication Byzantine Reliable Broadcast under a Message Adversary
Authors:
Timothé Albouy,
Davide Frey,
Ran Gelles,
Carmit Hazay,
Michel Raynal,
Elad Michael Schiller,
Francois Taiani,
Vassilis Zikas
Abstract:
We address the problem of Reliable Broadcast in asynchronous message-passing systems with $n$ nodes, of which up to $t$ are malicious (faulty), in addition to a message adversary that can drop some of the messages sent by correct (non-faulty) nodes.
We present a Message-Adversary-Tolerant Byzantine Reliable Broadcast (MBRB) algorithm that communicates an almost optimal amount of $O(|m|+n^2κ)$ bits per node, where $|m|$ represents the length of the application message and $κ=Ω(\log n)$ is a security parameter. This improves upon the state-of-the-art MBRB solution (Albouy, Frey, Raynal, and Taïani, SSS 2021), which incurs communication of $O(n|m|+n^2κ)$ bits per node.
Our solution sends at most $4n^2$ messages overall, which is asymptotically optimal. Reduced communication is achieved by employing coding techniques that replace the need for all nodes to (re-)broadcast the entire message $m$. Instead, nodes forward authenticated fragments of the encoding of $m$ using an erasure-correcting code. Under the cryptographic assumptions of PKI and collision-resistant hash, and assuming $n > 3t + 2d$, where the adversary drops at most $d$ messages per broadcast, our algorithm allows most of the correct nodes to reconstruct $m$, despite missing fragments caused by the malicious nodes and the message adversary.
Submitted 25 December, 2023;
originally announced December 2023.
-
Process-Commutative Distributed Objects: From Cryptocurrencies to Byzantine-Fault-Tolerant CRDTs
Authors:
Davide Frey,
Lucie Guillou,
Michel Raynal,
François Taïani
Abstract:
This paper explores the territory that lies between best-effort Byzantine-Fault-Tolerant Conflict-free Replicated Data Types (BFT CRDTs) and totally ordered distributed ledgers, such as those implemented by Blockchains. It formally characterizes a novel class of distributed objects that only requires a First In First Out (FIFO) order on the object operations from each process (taken individually). The formalization leverages Mazurkiewicz traces to define legal sequences of operations and ensure both Strong Eventual Consistency (SEC) and Pipeline Consistency (PC). The paper presents a generic algorithm that implements this novel class of distributed objects in both crash and Byzantine settings. It also illustrates the practical interest of the proposed approach using four instances of this class of objects, namely money transfer, Petri nets, multi-sets, and concurrent work stealing dequeues.
Submitted 8 March, 2024; v1 submitted 23 November, 2023;
originally announced November 2023.
-
Context Adaptive Cooperation
Authors:
Timothé Albouy,
Davide Frey,
Mathieu Gestin,
Michel Raynal,
François Taïani
Abstract:
Reliable broadcast and consensus are the two pillars that support a lot of non-trivial fault-tolerant distributed middleware and fault-tolerant distributed systems. While they have close definitions, they strongly differ in the underlying assumptions needed to implement each of them. Reliable broadcast can be implemented in asynchronous systems in the presence of crash or Byzantine failures while consensus cannot. This key difference stems from the fact that consensus involves synchronization between multiple processes that concurrently propose values, while reliable broadcast simply involves delivering a message from a predefined sender. This paper strikes a balance between these two agreement abstractions in the presence of Byzantine failures. It proposes CAC, a novel agreement abstraction that enables multiple processes to broadcast messages simultaneously, while guaranteeing that (despite potential conflicts, asynchrony, and Byzantine behaviors) the non-faulty processes will agree on message deliveries. We show that this novel abstraction can enable more efficient algorithms for a variety of applications (such as money transfer where several people can share the same account). This is obtained by focusing the need for synchronization only on the processes that actually need to synchronize.
Submitted 15 November, 2023;
originally announced November 2023.
-
Good-case Early-Stopping Latency of Synchronous Byzantine Reliable Broadcast: The Deterministic Case (Extended Version)
Authors:
Timothé Albouy,
Davide Frey,
Michel Raynal,
François Taïani
Abstract:
This paper considers the good-case latency of Byzantine Reliable Broadcast (BRB), i.e., the time taken by correct processes to deliver a message when the initial sender is correct. This time plays a crucial role in the performance of practical distributed systems. Although significant strides have been made in recent years on this question, progress has mainly focused on either asynchronous or randomized algorithms. By contrast, the good-case latency of deterministic synchronous BRB under a majority of Byzantine faults has been little studied. In particular, it was not known whether a good-case latency below the worst-case bound of t + 1 rounds could be obtained. This work answers this open question positively and proposes a deterministic synchronous Byzantine reliable broadcast that achieves a good-case latency of max(2, t + 3 - c) rounds, where t is the upper bound on the number of Byzantine processes and c the number of effectively correct processes.
Submitted 10 March, 2023; v1 submitted 9 March, 2023;
originally announced March 2023.
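As a rough illustration of the bound quoted in the abstract above, the following Python sketch evaluates max(2, t + 3 - c) and compares it with the worst-case bound of t + 1 rounds. The function names are hypothetical and chosen for this example only; the code restates the formula and is not the broadcast algorithm itself.

```python
def good_case_latency(t: int, c: int) -> int:
    """Good-case latency in rounds, max(2, t + 3 - c), as stated in the abstract.

    t: upper bound on the number of Byzantine processes.
    c: number of effectively correct processes.
    """
    if t < 0 or c < 0:
        raise ValueError("t and c must be non-negative")
    return max(2, t + 3 - c)


def worst_case_bound(t: int) -> int:
    """Worst-case bound of t + 1 rounds mentioned in the abstract."""
    return t + 1


if __name__ == "__main__":
    t = 5
    for c in (3, 4, 6):
        # Print c, the good-case latency, and the worst-case bound side by side.
        print(c, good_case_latency(t, c), worst_case_bound(t))
```

With t = 5, the formula gives 5 rounds for c = 3 and 2 rounds once c is at least 6, against a worst-case bound of 6 rounds.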
-
Asynchronous Byzantine Reliable Broadcast With a Message Adversary
Authors:
Timothé Albouy,
Davide Frey,
Michel Raynal,
François Taïani
Abstract:
This paper considers the problem of reliable broadcast in asynchronous authenticated systems, in which n processes communicate using signed messages and up to t processes may behave arbitrarily (Byzantine processes). In addition, for each message m broadcast by a correct (i.e., non-Byzantine) process, a message adversary may prevent up to d correct processes from receiving m. (This message adversary captures network failures such as transient disconnections, silent churn, or message losses.) Considering such a "double" adversarial context and assuming n > 3t + 2d, a reliable broadcast algorithm is presented. Interestingly, when there is no message adversary (i.e., d = 0), the algorithm terminates in two communication steps (so, in this case, this algorithm is optimal in terms of both Byzantine tolerance and time efficiency). It is then shown that the condition n > 3t + 2d is necessary for implementing reliable broadcast in the presence of both Byzantine processes and a message adversary (whether the underlying system is enriched with signatures or not).
Submitted 20 May, 2022;
originally announced May 2022.
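The resilience condition n > 3t + 2d from the abstract above can be checked with a few lines of Python. This is a minimal sketch: the helper names are hypothetical, and the code only evaluates the inequality, not the broadcast algorithm.

```python
def mbrb_condition_holds(n: int, t: int, d: int) -> bool:
    """Return True when n > 3t + 2d, the condition stated in the abstract.

    n: number of processes, t: Byzantine processes tolerated,
    d: power of the message adversary.
    """
    if n <= 0 or t < 0 or d < 0:
        raise ValueError("n must be positive; t and d must be non-negative")
    return n > 3 * t + 2 * d


def max_message_adversary_power(n: int, t: int) -> int:
    """Largest d with n > 3t + 2d, or -1 if the condition fails even for d = 0."""
    if n <= 3 * t:
        return -1
    # n > 3t + 2d  <=>  2d <= n - 3t - 1 (all quantities are integers).
    return (n - 3 * t - 1) // 2
```

For instance, with n = 10 and t = 2 the condition holds for d = 1 (10 > 8) but not for d = 2 (10 > 10 is false), so the largest tolerable d under this inequality is 1.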
-
Coping with Byzantine Processes and a Message Adversary: Modularity Helps!
Authors:
Davide Frey,
Michel Raynal,
François Taïani,
Timothé Albouy
Abstract:
This paper explores how reliable broadcast can be implemented when facing a dual adversary that can both corrupt processes and remove messages. More precisely, we consider an asynchronous $n$-process message-passing system in which up to $t_b$ processes are Byzantine and where, at the network level, for each message broadcast by a correct process, an adversary can prevent up to $t_m$ processes from receiving it (the integer $t_m$ defines the power of the message adversary). So, differently from previous works, this work considers that not only computing entities can be faulty (Byzantine processes), but also that the network can lose messages. To this end, the paper first introduces a new basic communication abstraction denoted $k\ell$-cast, and studies its properties in this new bi-dimensional adversary context. Then, the paper deconstructs existing Byzantine-tolerant asynchronous broadcast algorithms and, with the help of the $k\ell$-cast communication abstraction, reconstructs versions of them that tolerate both Byzantine processes and message adversaries. Interestingly, these reconstructed algorithms are also more efficient than the Byzantine-tolerant-only algorithms from which they originate. The paper also shows that the condition $n>3t_b+2t_m$ is necessary and sufficient (with signatures) to design such reliable broadcast algorithms.
Submitted 2 June, 2022; v1 submitted 28 April, 2022;
originally announced April 2022.
-
The Graph Neural Networking Challenge: A Worldwide Competition for Education in AI/ML for Networks
Authors:
José Suárez-Varela,
Miquel Ferriol-Galmés,
Albert López,
Paul Almasan,
Guillermo Bernárdez,
David Pujol-Perich,
Krzysztof Rusek,
Loïck Bonniot,
Christoph Neumann,
François Schnitzler,
François Taïani,
Martin Happ,
Christian Maier,
Jia Lei Du,
Matthias Herlich,
Peter Dorfinger,
Nick Vincent Hainke,
Stefan Venz,
Johannes Wegener,
Henrike Wissing,
Bo Wu,
Shihan Xiao,
Pere Barlet-Ros,
Albert Cabellos-Aparicio
Abstract:
During the last decade, Machine Learning (ML) has increasingly become a hot topic in the field of Computer Networks and is expected to be gradually adopted for a plethora of control, monitoring and management tasks in real-world deployments. This poses the need to count on new generations of students, researchers and practitioners with a solid background in ML applied to networks. During 2020, the International Telecommunication Union (ITU) has organized the "ITU AI/ML in 5G challenge", an open global competition that has introduced to a broad audience some of the current main challenges in ML for networks. This large-scale initiative has gathered 23 different challenges proposed by network operators, equipment manufacturers and academia, and has attracted a total of 1300+ participants from 60+ countries. This paper narrates our experience organizing one of the proposed challenges: the "Graph Neural Networking Challenge 2020". We describe the problem presented to participants, the tools and resources provided, some organization aspects and participation statistics, an outline of the top-3 awarded solutions, and a summary with some lessons learned during all this journey. As a result, this challenge leaves a curated set of educational resources openly available to anyone interested in the topic.
Submitted 26 July, 2021;
originally announced July 2021.
-
BASALT: A Rock-Solid Foundation for Epidemic Consensus Algorithms in Very Large, Very Open Networks
Authors:
Alex Auvolat,
Yérom-David Bromberg,
Davide Frey,
François Taïani
Abstract:
Recent works have proposed new Byzantine consensus algorithms for blockchains based on epidemics, a design which enables highly scalable performance at a low cost. These methods however critically depend on a secure random peer sampling service: a service that provides a stream of random network nodes where no attacking entity can become over-represented. To ensure this security property, current epidemic platforms use a Proof-of-Stake system to select peer samples. However, such a system limits the openness of the system as only nodes with significant stake can participate in the consensus, leading to an oligopoly situation. Moreover, this design introduces a complex interdependency between the consensus algorithm and the cryptocurrency built upon it. In this paper, we propose a radically different security design for the peer sampling service, based on the distribution of IP addresses to prevent Sybil attacks. We propose a new algorithm, BASALT, that implements our design using a stubborn chaotic search to counter attackers' attempts at becoming over-represented. We show in theory and using Monte Carlo simulations that BASALT provides samples which are extremely close to the optimal distribution even in adversarial scenarios such as tentative Eclipse attacks. Live experiments on a production cryptocurrency platform confirm that the samples obtained using BASALT are equitably distributed amongst nodes, allowing for a system which is both open and where no single entity can gain excessive power.
Submitted 8 February, 2021;
originally announced February 2021.
-
Cluster-and-Conquer: When Randomness Meets Graph Locality
Authors:
George Giakkoupis,
Anne-Marie Kermarrec,
Olivier Ruas,
François Taïani
Abstract:
K-Nearest-Neighbors (KNN) graphs are central to many emblematic data mining and machine-learning applications. Some of the most efficient KNN graph algorithms are incremental and local: they start from a random graph, which they incrementally improve by traversing neighbors-of-neighbors links. Paradoxically, this random start is also one of the key weaknesses of these algorithms: nodes are initially connected to dissimilar neighbors, that lie far away according to the similarity metric. As a result, incremental algorithms must first laboriously explore spurious potential neighbors before they can identify similar nodes, and start converging. In this paper, we remove this drawback with Cluster-and-Conquer ($C^2$ for short). Cluster-and-Conquer boosts the starting configuration of greedy algorithms thanks to a novel lightweight clustering mechanism, dubbed FastRandomHash. FastRandomHash leverages randomness and recursion to pre-cluster similar nodes at a very low cost. Our extensive evaluation on real datasets shows that Cluster-and-Conquer significantly outperforms existing approaches, including LSH, yielding speed-ups of up to x4.42 while incurring only a negligible loss in terms of KNN quality.
Submitted 22 October, 2020;
originally announced October 2020.
-
Spores: Stateless Predictive Onion Routing for E-Squads
Authors:
Daniel Bosk,
Yérom-David Bromberg,
Sonja Buchegger,
Adrien Luxey,
François Taïani
Abstract:
Mass surveillance of the population by state agencies and corporate parties is now a well-known fact. Journalists and whistle-blowers still lack means to circumvent global spying for the sake of their investigations. With Spores, we propose a way for journalists and their sources to plan a posteriori file exchanges when they physically meet. We leverage the multiplication of personal devices per capita to provide a lightweight, robust and fully anonymous decentralised file transfer protocol between users. Spores hinges on our novel concept of e-squads: one's personal devices, rendered intelligent by gossip communication protocols, can provide private and dependable services to their user. People's e-squads are federated into a novel onion routing network, able to withstand the inherent unreliability of personal appliances while providing reliable routing. Spores' performance is competitive, and the privacy properties of its communication outperform state-of-the-art onion routing strategies.
Submitted 2 July, 2020;
originally announced July 2020.
-
Money Transfer Made Simple: a Specification, a Generic Algorithm, and its Proof
Authors:
Alex Auvolat,
Davide Frey,
Michel Raynal,
François Taïani
Abstract:
It has recently been shown that, contrary to a common belief, money transfer in the presence of faulty (Byzantine) processes does not require strong agreement such as consensus. This article goes one step further: namely, it first proposes a non-sequential specification of the money-transfer object, and then presents a generic algorithm based on a simple FIFO order between each pair of processes that implements it. The genericity dimension lies in the underlying reliable broadcast abstraction which must be suited to the appropriate failure model. Interestingly, whatever the failure model, the money transfer algorithm only requires adding a single sequence number to its messages as control information. Moreover, as a side effect of the proposed algorithm, it follows that money transfer is a weaker problem than the construction of a safe/regular/atomic read/write register in the asynchronous message-passing crash-prone model.
Submitted 17 February, 2021; v1 submitted 18 June, 2020;
originally announced June 2020.
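The idea summarized in the abstract above (per-sender FIFO order enforced with a single sequence number, and no consensus) can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the paper's algorithm: the `Replica` class and its method names are hypothetical, the reliable broadcast layer is assumed to exist and is simulated here by calling `deliver` directly, and the rule of deferring a transfer until the sender's locally known balance covers it is an assumption made for this example.

```python
from collections import defaultdict


class Replica:
    """Local view of all account balances at one process (hypothetical sketch)."""

    def __init__(self, initial_balances):
        self.balance = dict(initial_balances)
        # Next sequence number expected from each sender (FIFO per sender).
        self.next_sn = defaultdict(lambda: 1)
        # Transfers delivered by the broadcast layer but not yet applicable.
        self.pending = []

    def deliver(self, sender, sn, receiver, amount):
        """Called when the (assumed) reliable broadcast delivers a transfer."""
        if amount <= 0:
            raise ValueError("amount must be positive")
        self.pending.append((sender, sn, receiver, amount))
        self._apply_ready()

    def _apply_ready(self):
        """Apply every pending transfer that is in order and funded."""
        progress = True
        while progress:
            progress = False
            for msg in list(self.pending):
                sender, sn, receiver, amount = msg
                in_order = sn == self.next_sn[sender]
                funded = self.balance.get(sender, 0) >= amount
                if in_order and funded:
                    self.balance[sender] -= amount
                    self.balance[receiver] = self.balance.get(receiver, 0) + amount
                    self.next_sn[sender] += 1
                    self.pending.remove(msg)
                    progress = True


# Two replicas receive the same transfers in different orders.
r1 = Replica({"a": 10, "b": 0})
r2 = Replica({"a": 10, "b": 0})
transfers = [("a", 1, "b", 7), ("b", 1, "a", 5), ("a", 2, "b", 6)]
for msg in transfers:
    r1.deliver(*msg)
for msg in reversed(transfers):
    r2.deliver(*msg)
```

In this toy run both replicas end with the same balances although they saw the transfers in opposite orders, which is the intuition behind avoiding a total order: only transfers from the same sender need to be sequenced.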
-
FLeet: Online Federated Learning via Staleness Awareness and Performance Prediction
Authors:
Georgios Damaskinos,
Rachid Guerraoui,
Anne-Marie Kermarrec,
Vlad Nitu,
Rhicheek Patra,
Francois Taiani
Abstract:
Federated Learning (FL) is very appealing for its privacy benefits: essentially, a global model is trained with updates computed on mobile devices while keeping the data of users local. Standard FL infrastructures are however designed to have no energy or performance impact on mobile devices, and are therefore not suitable for applications that require frequent (online) model updates, such as news recommenders.
This paper presents FLeet, the first Online FL system, acting as a middleware between the Android OS and the machine learning application. FLeet combines the privacy of Standard FL with the precision of online learning thanks to two core components: (i) I-Prof, a new lightweight profiler that predicts and controls the impact of learning tasks on mobile devices, and (ii) AdaSGD, a new adaptive learning algorithm that is resilient to delayed updates.
Our extensive evaluation shows that Online FL, as implemented by FLeet, can deliver a 2.3x quality boost compared to Standard FL, while only consuming 0.036% of the battery per day. I-Prof can accurately control the impact of learning tasks by improving the prediction accuracy up to 3.6x (computation time) and up to 19x (energy). AdaSGD outperforms alternative FL approaches by 18.4% in terms of convergence speed on heterogeneous data.
Submitted 3 December, 2020; v1 submitted 12 June, 2020;
originally announced June 2020.
-
DiagNet: towards a generic, Internet-scale root cause analysis solution
Authors:
Loïck Bonniot,
Christoph Neumann,
François Taïani
Abstract:
Diagnosing problems in Internet-scale services remains particularly difficult and costly for both content providers and ISPs. Because the Internet is decentralized, the cause of such problems might lie anywhere between an end-user's device and the service datacenters. Further, the set of possible problems and causes is not known in advance, making it impossible in practice to train a classifier with all combinations of problems, causes and locations. In this paper, we explore how different machine learning techniques can be used for Internet-scale root cause analysis using measurements taken from end-user devices. We show how to build generic models that (i) are agnostic to the underlying network topology, (ii) do not require defining the full set of possible causes during training, and (iii) can be quickly adapted to diagnose new services. Our solution, DiagNet, adapts concepts from image processing research to handle network and system metrics. We evaluate DiagNet with a multi-cloud deployment of online services with injected faults and emulated clients with automated browsers. We demonstrate promising root cause analysis capabilities, with a recall of 73.9% including causes only being introduced at inference time.
Submitted 7 April, 2020;
originally announced April 2020.
-
PnyxDB: a Lightweight Leaderless Democratic Byzantine Fault Tolerant Replicated Datastore
Authors:
Loïck Bonniot,
Christoph Neumann,
François Taïani
Abstract:
Byzantine-Fault-Tolerant (BFT) systems are rapidly emerging as a viable technology for production-grade systems, notably in closed consortia deployments for financial and supply-chain applications. Unfortunately, most algorithms proposed so far to coordinate these systems suffer from substantial scalability issues, and lack important features to implement Internet-scale governance mechanisms. In this paper, we observe that many application workloads offer little concurrency, and propose PnyxDB, an eventually-consistent Byzantine Fault Tolerant replicated datastore that exhibits both high scalability and low latency. Our approach is based on conditional endorsements, which allow nodes to specify the set of transactions that must not be committed for the endorsement to be valid. In addition to its high scalability, PnyxDB supports application-level voting, i.e., individual nodes are able to endorse or reject a transaction according to application-defined policies without compromising consistency. We provide a comparison against BFTSMaRt and Tendermint, two competitors with different design aims, and show that our implementation speeds up commit latencies by a factor of 11, remaining below 5 seconds in a worldwide geodistributed deployment of 180 nodes.
Submitted 8 November, 2019;
originally announced November 2019.
-
Dietcoin: shortcutting the Bitcoin verification process for your smartphone
Authors:
Davide Frey,
Marc X. Makkes,
Pierre-Louis Roman,
François Taïani,
Spyros Voulgaris
Abstract:
Blockchains have a storage scalability issue. Their size is not bounded and they grow indefinitely as time passes. As of August 2017, the Bitcoin blockchain is about 120 GiB in size, whereas it was only 75 GiB in August 2016. To benefit from Bitcoin's full security model, a bootstrapping node has to download and verify the entirety of the 120 GiB. This poses a challenge for low-resource devices such as smartphones. Thankfully, an alternative exists for such devices, which consists of downloading and verifying just the header of each block. This partial block verification enables devices to reduce their bandwidth requirements from 120 GiB to 35 MiB. However, this drastic decrease comes with a safety cost implied by partial block verification. In this work, we enable low-resource devices to fully verify subchains of blocks without having to pay the onerous price of a full chain download and verification; a few additional MiB of bandwidth suffice. To do so, we propose the design of diet nodes that can securely query full nodes for shards of the UTXO set, which is needed to perform full block verification and can otherwise only be built by sequentially parsing the chain.
Submitted 28 March, 2018;
originally announced March 2018.
-
Vertex Coloring with Communication and Local Memory Constraints in Synchronous Broadcast Networks
Authors:
Hicham Lakhlef,
Michel Raynal,
François Taïani
Abstract:
The vertex coloring problem has received a lot of attention in the context of synchronous round-based systems where, at each round, a process can send a message to all its neighbors, and receive a message from each of them. Hence, this communication model is particularly suited to point-to-point communication channels. Several vertex coloring algorithms suited to these systems have been proposed. They differ mainly in the number of rounds they require and the number of colors they use. This paper considers a broadcast/receive communication model in which message collisions and message conflicts can occur (a collision occurs when, during the same round, messages are sent to the same process by too many neighbors; a conflict occurs when a process and one of its neighbors broadcast during the same round). This communication model is suited to systems where processes share communication bandwidths. More precisely, the paper considers the case where, during a round, a process may either broadcast a message to its neighbors or receive a message from at most $m$ of them. This captures communication-related constraints or a local memory constraint stating that, whatever the number of neighbors of a process, its local memory allows it to receive and store at most $m$ messages during each round. The paper first defines the corresponding generic vertex multi-coloring problem (a vertex can have several colors). It then focuses on tree networks, for which it presents a lower bound on the number of colors $K$ that are necessary (namely, $K=\lceil\frac{Δ}{m}\rceil+1$, where $Δ$ is the maximal degree of the communication graph), and an associated coloring algorithm, which is optimal with respect to $K$.
Submitted 12 April, 2016;
originally announced April 2016.
-
Fisheye Consistency: Keeping Data in Synch in a Georeplicated World
Authors:
Roy Friedman,
Michel Raynal,
François Taïani
Abstract:
Over the last thirty years, numerous consistency conditions for replicated data have been proposed and implemented. Popular examples of such conditions include linearizability (or atomicity), sequential consistency, causal consistency, and eventual consistency. These consistency conditions are usually defined independently from the computing entities (nodes) that manipulate the replicated data; i.e., they do not take into account how computing entities might be linked to one another, or geographically distributed. To address this lack, as a first contribution, this paper introduces the notion of proximity graph between computing nodes. If two nodes are connected in this graph, their operations must satisfy a strong consistency condition, while the operations invoked by other nodes are allowed to satisfy a weaker condition. The second contribution is the use of such a graph to provide a generic approach to the hybridization of data consistency conditions into the same system. We illustrate this approach on sequential consistency and causal consistency, and present a model in which all data operations are causally consistent, while operations by neighboring processes in the proximity graph are sequentially consistent. The third contribution of the paper is the design and the proof of a distributed algorithm based on this proximity graph, which combines sequential consistency and causal consistency (the resulting condition is called fisheye consistency). In doing so the paper not only extends the domain of consistency conditions, but provides a generic provably correct solution of direct relevance to modern georeplicated systems.
Submitted 22 October, 2015; v1 submitted 24 November, 2014;
originally announced November 2014.