-
StreamBed: capacity planning for stream processing
Authors:
Guillaume Rosinosky,
Donatien Schmitz,
Etienne Rivière
Abstract:
StreamBed is a capacity planning system for stream processing. It predicts, ahead of any production deployment, the resources that a query will require to process an incoming data rate sustainably, and the appropriate configuration of these resources. StreamBed builds a capacity planning model by piloting a series of runs of the target query in a small-scale, controlled testbed. We implement StreamBed for the popular Flink DSP engine. Our evaluation with large-scale queries of the Nexmark benchmark demonstrates that StreamBed can effectively and accurately predict capacity requirements for jobs spanning more than 1,000 cores using a testbed of only 48 cores.
Submitted 28 September, 2023; v1 submitted 6 September, 2023;
originally announced September 2023.
-
Content Censorship in the InterPlanetary File System
Authors:
Srivatsan Sridhar,
Onur Ascigil,
Navin Keizer,
François Genon,
Sébastien Pierre,
Yiannis Psaras,
Etienne Rivière,
Michał Król
Abstract:
The InterPlanetary File System (IPFS) is currently the largest decentralized storage solution in operation, with thousands of active participants and millions of daily content transfers. IPFS is used as remote data storage for numerous blockchain-based smart contracts, Non-Fungible Tokens (NFT), and decentralized applications.
We present a content censorship attack that can be executed with minimal effort and cost, and that prevents the retrieval of any chosen content in the IPFS network. The attack exploits a conceptual issue in a core component of IPFS, the Kademlia Distributed Hash Table (DHT), which is used to resolve content IDs to peer addresses. We provide efficient detection and mitigation mechanisms for this vulnerability. Our mechanisms achieve a 99.6% detection rate and mitigate 100% of the detected attacks with minimal signaling and computational overhead. We followed responsible disclosure procedures, and our countermeasures are scheduled for deployment in future versions of IPFS.
Submitted 4 December, 2023; v1 submitted 22 July, 2023;
originally announced July 2023.
-
Data Availability Sampling in Ethereum: Analysis of P2P Networking Requirements
Authors:
Michał Król,
Onur Ascigil,
Sergi Rene,
Etienne Rivière,
Matthieu Pigaglio,
Kaleem Peeroo,
Vladimir Stankovic,
Ramin Sadre,
Felix Lange
Abstract:
Despite their increasing popularity, blockchains still suffer from severe scalability limitations. Recently, Ethereum proposed a novel approach to block validation based on Data Availability Sampling (DAS) that has the potential to improve its transactions-per-second rate by more than two orders of magnitude. DAS should also significantly reduce per-transaction validation costs. At the same time, DAS introduces new communication patterns in the Ethereum Peer-to-Peer (P2P) network. These drastically increase the amount of exchanged data and impose stringent latency objectives. In this paper, we review the new requirements for P2P networking associated with DAS, discuss open challenges, and identify new research directions.
Submitted 20 June, 2023;
originally announced June 2023.
-
RAPTEE: Leveraging trusted execution environments for Byzantine-tolerant peer sampling services
Authors:
Matthieu Pigaglio,
Joachim Bruneau-Queyreix,
David Bromberg,
Davide Frey,
Etienne Rivière,
Laurent Réveillère
Abstract:
Peer sampling is a first-class abstraction used in distributed systems for overlay management and information dissemination. The goal of peer sampling is to continuously build and refresh a partial and local view of the full membership of a dynamic, large-scale distributed system. Malicious nodes under the control of an adversary may aim at being over-represented in the views of correct nodes, increasing their impact on the proper operation of protocols built over peer sampling. State-of-the-art Byzantine-resilient peer sampling protocols reduce this bias as long as Byzantine nodes are not overly present. This paper studies the benefits brought to the resilience of peer sampling services when considering that a small portion of trusted nodes can run code whose authenticity and integrity can be assessed within a trusted execution environment, specifically Intel's Software Guard Extensions (SGX) technology. We present RAPTEE, a protocol that builds and leverages trusted gossip-based communications to hamper an adversary's ability to increase its system-wide representation in the views of all nodes. We apply RAPTEE to BRAHMS, the most resilient peer sampling protocol to date. Experiments with 10,000 nodes show that with only 1% of SGX-capable devices, RAPTEE can reduce the proportion of Byzantine IDs in the view of honest nodes by up to 17% when the system contains 10% of Byzantine nodes. In addition, the security guarantees of RAPTEE hold even in the presence of a powerful attacker attempting to identify trusted nodes and injecting view-poisoned trusted nodes.
Submitted 8 March, 2022;
originally announced March 2022.
-
Shard Scheduler: object placement and migration in sharded account-based blockchains
Authors:
Michał Król,
Onur Ascigil,
Sergi Rene,
Alberto Sonnino,
Mustafa Al-Bassam,
Etienne Rivière
Abstract:
We propose Shard Scheduler, a system for object placement and migration in account-based sharded blockchains. Our system calculates optimal placement, decides on object migrations across shards, and supports complex multi-account transactions caused by smart contracts. Placement and migration decisions made by Shard Scheduler are fully deterministic, verifiable, and can be made part of the consensus protocol. Shard Scheduler reduces the number of costly cross-shard transactions, ensures balanced load distribution, and maximizes the number of processed transactions for the blockchain as a whole. It leverages a novel incentive model motivating miners to maximize the global throughput of the entire blockchain rather than the throughput of a specific shard. Shard Scheduler reduces the number of costly cross-shard transactions by half in our simulations, ensuring equal load and increasing throughput threefold when using 60 shards. We also implement and evaluate Shard Scheduler on Chainspace, more than doubling its throughput and reducing user-perceived latency by 70% when using 10 shards.
Submitted 15 July, 2021;
originally announced July 2021.
-
Fair and Efficient Gossip in Hyperledger Fabric
Authors:
Nicolae Berendea,
Hugues Mercier,
Emanuel Onica,
Etienne Rivière
Abstract:
Permissioned blockchains are supported by identified but individually untrustworthy nodes, collectively maintaining a replicated ledger whose content is trusted. The Hyperledger Fabric permissioned blockchain system targets high-throughput transaction processing. Fabric uses a set of nodes tasked with the ordering of transactions using consensus. Additional peers endorse and validate transactions, and maintain a copy of the ledger. The ability to quickly disseminate new transaction blocks from ordering nodes to all peers is critical for both performance and consistency. Broadcast is handled by a gossip protocol, using randomized exchanges of blocks between peers. We show that the current implementation of gossip in Fabric leads to heavy tail distributions of block propagation latencies, impacting performance, consistency, and fairness. We contribute a novel design for gossip in Fabric that simultaneously optimizes propagation time, tail latency and bandwidth consumption. Using a 100-node cluster, we show that our enhanced gossip allows the dissemination of blocks to all peers more than 10 times faster than with the original implementation, while decreasing the overall network bandwidth consumption by more than 40%. With a high throughput and concurrent application, this results in 17% to 36% fewer invalidated transactions for different block sizes.
Submitted 15 April, 2020;
originally announced April 2020.
-
PASTRAMI: Privacy-preserving, Auditable, Scalable & Trustworthy Auctions for Multiple Items
Authors:
Michał Król,
Alberto Sonnino,
Argyrios Tasiopoulos,
Ioannis Psaras,
Etienne Rivière
Abstract:
Decentralised cloud computing platforms enable individuals to offer and rent resources in a peer-to-peer fashion. They must assign resources from multiple sellers to multiple buyers and derive prices that match the interests and capacities of both parties. The assignment process must be decentralised, fair and transparent, but also protect the privacy of buyers. We present PASTRAMI, a decentralised platform enabling trustworthy assignments of items and prices between a large number of sellers and bidders, through the support of multi-item auctions. PASTRAMI uses threshold blind signatures and commitment schemes to provide strong privacy guarantees while making bidders accountable. It leverages the Ethereum blockchain for auditability, combining efficient off-chain computations with novel, on-chain proofs of misbehaviour. Our evaluation of PASTRAMI using Filecoin workloads shows its ability to efficiently produce trustworthy assignments between thousands of buyers and sellers.
Submitted 16 December, 2020; v1 submitted 14 April, 2020;
originally announced April 2020.
-
EL PASSO: Privacy-preserving, Asynchronous Single Sign-On
Authors:
Zhiyi Zhang,
Michał Król,
Alberto Sonnino,
Lixia Zhang,
Etienne Rivière
Abstract:
We introduce EL PASSO, a privacy-preserving, asynchronous Single Sign-On (SSO) system. It enables personal authentication while protecting users' privacy against both identity providers and relying parties, and allows selective attribute disclosure. EL PASSO is based on anonymous credentials, yet it supports users' accountability. Selected authorities may recover the identity of allegedly misbehaving users, and users can prove properties about their identity without revealing it in the clear. EL PASSO does not require specific secure hardware or a third party (other than existing participants in SSO). The generation and use of authentication credentials are asynchronous, allowing users to sign on when identity providers are temporarily unavailable. We evaluate EL PASSO in a distributed environment and prove its low computational cost, yielding faster sign-on operations than OIDC from a regular laptop, one-second user-perceived latency from a low-power device, and scaling to more than 50 sign-on operations per second at a relying party using a single 4-core server in the cloud.
Submitted 3 June, 2020; v1 submitted 24 February, 2020;
originally announced February 2020.
-
Reliable Messaging to Millions of Users with MigratoryData
Authors:
Mihai Rotaru,
Florentin Olariu,
Emanuel Onica,
Etienne Rivière
Abstract:
Web-based notification services are used by a large range of businesses to selectively distribute live updates to customers, following the publish/subscribe (pub/sub) model. Typical deployments can involve millions of subscribers expecting ordering and delivery guarantees together with low latencies. Notification services must be vertically and horizontally scalable, and adopt replication to provide a reliable service. We report our experience building and operating MigratoryData, a highly-scalable notification service. We discuss the typical requirements of MigratoryData customers, and describe the architecture and design of the service, focusing on scalability and fault tolerance. Our evaluation demonstrates the ability of MigratoryData to handle millions of concurrent connections and support a reliable notification service despite server failures and network disconnections.
Submitted 28 December, 2017;
originally announced December 2017.
-
Confidentiality-Preserving Publish/Subscribe: A Survey
Authors:
Emanuel Onica,
Pascal Felber,
Hugues Mercier,
Etienne Rivière
Abstract:
Publish/subscribe (pub/sub) is an attractive communication paradigm for large-scale distributed applications running across multiple administrative domains. Pub/sub allows event-based information dissemination based on constraints on the nature of the data rather than on pre-established communication channels. It is a natural fit for deployment in untrusted environments such as public clouds linking applications across multiple sites. However, pub/sub in untrusted environments leads to major confidentiality concerns stemming from the content-centric nature of the communications. This survey classifies and analyzes different approaches to confidentiality preservation for pub/sub, from applications of trust and access control models to novel encryption techniques. It provides an overview of the current challenges posed by confidentiality concerns and points to future research directions in this promising field.
Submitted 25 May, 2017;
originally announced May 2017.
-
On Using Micro-Clouds to Deliver the Fog
Authors:
Yehia Elkhatib,
Barry Porter,
Heverson B. Ribeiro,
Mohamed Faten Zhani,
Junaid Qadir,
Etienne Rivière
Abstract:
Cloud computing has demonstrated itself to be a scalable and cost-efficient solution for many real-world applications. However, its modus operandi is not ideally suited to resource-constrained environments that are characterized by limited network bandwidth and high latencies. With the increasing proliferation and sophistication of edge devices, the idea of fog computing proposes to offload some of the computation to the edge. To this end, micro-clouds, which are modular and portable assemblies of small single-board computers, have started to gain attention as infrastructures to support fog computing by offering isolated resource provisioning at the edge in a cost-effective way. We investigate the feasibility and readiness of micro-clouds for delivering the vision of fog computing. Through a number of experiments, we showcase the potential of micro-clouds formed by collections of Raspberry Pi computers to host a range of fog-related applications, particularly for locations where network bandwidth is limited and latencies are long.
Submitted 6 February, 2017;
originally announced March 2017.
-
A Practical Distributed Universal Construction with Unknown Participants
Authors:
Pierre Sutra,
Etienne Rivière,
Pascal Felber
Abstract:
Modern distributed systems employ atomic read-modify-write primitives to coordinate concurrent operations. Such primitives are typically built on top of a central server, or rely on an agreement protocol. Both approaches provide a universal construction, that is, a general mechanism to construct atomic and responsive objects. These two techniques are, however, known to be inherently costly. As a consequence, they may result in bottlenecks in applications using them for coordination. In this paper, we investigate another direction to implement a universal construction. Our idea is to delegate the implementation of the universal construction to the clients, and solely implement a distributed shared atomic memory on the server side. The construction we propose is obstruction-free. It can be implemented in a purely asynchronous manner, and it does not assume knowledge of the participants. It is built on top of grafarius and racing objects, two novel shared abstractions that we introduce in detail. To assess the benefits of our approach, we present a prototype implementation on top of the Cassandra data store, and compare it empirically to the ZooKeeper coordination service.
Submitted 21 May, 2014; v1 submitted 11 September, 2013;
originally announced September 2013.