Search | arXiv e-print repository

UnUnlearning: Unlearning is not sufficient for content regulation in advanced generative AI

Authors: Ilia Shumailov, Jamie Hayes, Eleni Triantafillou, Guillermo Ortiz-Jimenez, Nicolas Papernot, Matthew Jagielski, Itay Yona, Heidi Howard, Eugene Bagdasaryan

Abstract: Exact unlearning was first introduced as a privacy mechanism that allowed a user to retract their data from machine learning models on request. Shortly after, inexact schemes were proposed to mitigate the impractical costs associated with exact unlearning. More recently unlearning is often discussed as an approach for removal of impermissible knowledge i.e. knowledge that the model should not poss… ▽ More Exact unlearning was first introduced as a privacy mechanism that allowed a user to retract their data from machine learning models on request. Shortly after, inexact schemes were proposed to mitigate the impractical costs associated with exact unlearning. More recently unlearning is often discussed as an approach for removal of impermissible knowledge i.e. knowledge that the model should not possess such as unlicensed copyrighted, inaccurate, or malicious information. The promise is that if the model does not have a certain malicious capability, then it cannot be used for the associated malicious purpose. In this paper we revisit the paradigm in which unlearning is used for in Large Language Models (LLMs) and highlight an underlying inconsistency arising from in-context learning. Unlearning can be an effective control mechanism for the training phase, yet it does not prevent the model from performing an impermissible act during inference. We introduce a concept of ununlearning, where unlearned knowledge gets reintroduced in-context, effectively rendering the model capable of behaving as if it knows the forgotten knowledge. As a result, we argue that content filtering for impermissible knowledge will be required and even exact unlearning schemes are not enough for effective content regulation. We discuss feasibility of ununlearning for modern LLMs and examine broader implications. △ Less

Submitted 27 June, 2024; originally announced July 2024.

arXiv:2406.17455 [pdf, other]

Smart Casual Verification of CCF's Distributed Consensus and Consistency Protocols

Authors: Heidi Howard, Markus A. Kuppe, Edward Ashton, Amaury Chamayou, Natacha Crooks

Abstract: The Confidential Consortium Framework (CCF) is an open-source platform for develo** trustworthy and reliable cloud applications. CCF powers Microsoft's Azure Confidential Ledger service and as such it is vital to build confidence in the correctness of CCF's design and implementation. This paper reports our experiences applying smart casual verification to validate the correctness of CCF's novel… ▽ More The Confidential Consortium Framework (CCF) is an open-source platform for develo** trustworthy and reliable cloud applications. CCF powers Microsoft's Azure Confidential Ledger service and as such it is vital to build confidence in the correctness of CCF's design and implementation. This paper reports our experiences applying smart casual verification to validate the correctness of CCF's novel distributed protocols, focusing on its unique distributed consensus protocol and its custom client consistency model. We use the term smart casual verification to describe our hybrid approach, which combines the rigor of formal specification and model checking with the pragmatism of automated testing, in our case binding the formal specification in TLA+ to the C++ implementation. While traditional formal methods approaches require substantial buy-in and are often one-off efforts by domain experts, we have integrated our smart casual verification approach into CCF's continuous integration pipeline, allowing contributors to continuously validate CCF as it evolves. We describe the challenges we faced in applying smart casual verification to a complex existing codebase and how we overcame them to find subtle bugs in the design and implementation before they could impact production. △ Less

Submitted 25 June, 2024; originally announced June 2024.

arXiv:2404.18048 [pdf, ps, other]

Scalable, Interpretable Distributed Protocol Verification by Inductive Proof Slicing

Authors: William Schultz, Edward Ashton, Heidi Howard, Stavros Tripakis

Abstract: Many techniques for automated inference of inductive invariants for distributed protocols have been developed over the past several years, but their performance can still be unpredictable and their failure modes opaque for large-scale verification tasks. In this paper, we present inductive proof slicing, a new automated, compositional technique for inductive invariant inference that scales effecti… ▽ More Many techniques for automated inference of inductive invariants for distributed protocols have been developed over the past several years, but their performance can still be unpredictable and their failure modes opaque for large-scale verification tasks. In this paper, we present inductive proof slicing, a new automated, compositional technique for inductive invariant inference that scales effectively to large distributed protocol verification tasks. Our technique is built on a core, novel data structure, the inductive proof graph, which explicitly represents the lemma and action dependencies of an inductive invariant and is built incrementally during the inference procedure, backwards from a target safety property. We present an invariant inference algorithm that integrates localized syntax-guided lemma synthesis routines at nodes of this graph, which are accelerated by computation of localized grammar and state variable slices. Additionally, in the case of failure to produce a complete inductive invariant, maintenance of this proof graph structure allows failures to be localized to small sub-components of this graph, enabling fine-grained failure diagnosis and repair by a user. We evaluate our technique on several complex distributed and concurrent protocols, including a large scale specification of the Raft consensus protocol, which is beyond the capabilities of modern distributed protocol verification tools, and also demonstrate how its interpretability features allow effective diagnosis and repair in cases of initial failure. △ Less

Submitted 27 April, 2024; originally announced April 2024.

arXiv:2404.01593 [pdf, other]

doi 10.1145/3639257

Optimizing Distributed Protocols with Query Rewrites [Technical Report]

Authors: David Chu, Rithvik Panchapakesan, Shadaj Laddad, Lucky Katahanas, Chris Liu, Kaushik Shivakumar, Natacha Crooks, Joseph M. Hellerstein, Heidi Howard

Abstract: Distributed protocols such as 2PC and Paxos lie at the core of many systems in the cloud, but standard implementations do not scale. New scalable distributed protocols are developed through careful analysis and rewrites, but this process is ad hoc and error-prone. This paper presents an approach for scaling any distributed protocol by applying rule-driven rewrites, borrowing from query optimizatio… ▽ More Distributed protocols such as 2PC and Paxos lie at the core of many systems in the cloud, but standard implementations do not scale. New scalable distributed protocols are developed through careful analysis and rewrites, but this process is ad hoc and error-prone. This paper presents an approach for scaling any distributed protocol by applying rule-driven rewrites, borrowing from query optimization. Distributed protocol rewrites entail a new burden: reasoning about spatiotemporal correctness. We leverage order-insensitivity and data dependency analysis to systematically identify correct coordination-free scaling opportunities. We apply this analysis to create preconditions and mechanisms for coordination-free decoupling and partitioning, two fundamental vertical and horizontal scaling techniques. Manual rule-driven applications of decoupling and partitioning improve the throughput of 2PC by $5\times$ and Paxos by $3\times$, and match state-of-the-art throughput in recent work. These results point the way toward automated optimizers for distributed protocols based on correct-by-construction rewrite rules. △ Less

Submitted 2 April, 2024; v1 submitted 3 January, 2024; originally announced April 2024.

Comments: Technical report of paper accepted at SIGMOD 2024

arXiv:2403.13793 [pdf, other]

Evaluating Frontier Models for Dangerous Capabilities

Authors: Mary Phuong, Matthew Aitchison, Elliot Catt, Sarah Cogan, Alexandre Kaskasoli, Victoria Krakovna, David Lindner, Matthew Rahtz, Yannis Assael, Sarah Hodkinson, Heidi Howard, Tom Lieberum, Ramana Kumar, Maria Abi Raad, Albert Webson, Lewis Ho, Sharon Lin, Sebastian Farquhar, Marcus Hutter, Gregoire Deletang, Anian Ruoss, Seliem El-Sayed, Sasha Brown, Anca Dragan, Rohin Shah , et al. (2 additional authors not shown)

Abstract: To understand the risks posed by a new AI system, we must understand what it can and cannot do. Building on prior work, we introduce a programme of new "dangerous capability" evaluations and pilot them on Gemini 1.0 models. Our evaluations cover four areas: (1) persuasion and deception; (2) cyber-security; (3) self-proliferation; and (4) self-reasoning. We do not find evidence of strong dangerous… ▽ More To understand the risks posed by a new AI system, we must understand what it can and cannot do. Building on prior work, we introduce a programme of new "dangerous capability" evaluations and pilot them on Gemini 1.0 models. Our evaluations cover four areas: (1) persuasion and deception; (2) cyber-security; (3) self-proliferation; and (4) self-reasoning. We do not find evidence of strong dangerous capabilities in the models we evaluated, but we flag early warning signs. Our goal is to help advance a rigorous science of dangerous capability evaluation, in preparation for future models. △ Less

Submitted 5 April, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

arXiv:2403.05530 [pdf, other]

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love , et al. (1092 additional authors not shown)

Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February… ▽ More In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February version on the great majority of capabilities and benchmarks; (2) Gemini 1.5 Flash, a more lightweight variant designed for efficiency with minimal regression in quality. Gemini 1.5 models achieve near-perfect recall on long-context retrieval tasks across modalities, improve the state-of-the-art in long-document QA, long-video QA and long-context ASR, and match or surpass Gemini 1.0 Ultra's state-of-the-art performance across a broad set of benchmarks. Studying the limits of Gemini 1.5's long-context ability, we find continued improvement in next-token prediction and near-perfect retrieval (>99%) up to at least 10M tokens, a generational leap over existing models such as Claude 3.0 (200k) and GPT-4 Turbo (128k). Finally, we highlight real-world use cases, such as Gemini 1.5 collaborating with professionals on completing their tasks achieving 26 to 75% time savings across 10 different job categories, as well as surprising new capabilities of large language models at the frontier; when given a grammar manual for Kalamang, a language with fewer than 200 speakers worldwide, the model learns to translate English to Kalamang at a similar level to a person who learned from the same content. △ Less

Submitted 14 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

arXiv:2312.11805 [pdf, other]

Gemini: A Family of Highly Capable Multimodal Models

Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee , et al. (1325 additional authors not shown)

Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr… ▽ More This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultra model advances the state of the art in 30 of 32 of these benchmarks - notably being the first model to achieve human-expert performance on the well-studied exam benchmark MMLU, and improving the state of the art in every one of the 20 multimodal benchmarks we examined. We believe that the new capabilities of the Gemini family in cross-modal reasoning and language understanding will enable a wide variety of use cases. We discuss our approach toward post-training and deploying Gemini models responsibly to users through services including Gemini, Gemini Advanced, Google AI Studio, and Cloud Vertex AI. △ Less

Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

arXiv:2311.09929 [pdf, other]

Mutating etcd Towards Edge Suitability

Authors: Andrew Jeffery, Heidi Howard, Richard Mortier

Abstract: In the edge environment servers are no longer being co-located away from clients, instead they are being co-located with clients away from other servers, focusing on reliable and performant operation. Orchestration platforms, such as Kubernetes, are a key system being transitioned to the edge but they remain unsuited to the environment, stemming primarily from their critical key-value stores. In t… ▽ More In the edge environment servers are no longer being co-located away from clients, instead they are being co-located with clients away from other servers, focusing on reliable and performant operation. Orchestration platforms, such as Kubernetes, are a key system being transitioned to the edge but they remain unsuited to the environment, stemming primarily from their critical key-value stores. In this work we derive requirements from the edge environment showing that, fundamentally, the design of distributed key-value datastores, such as etcd, is unsuited to meet them. Using these requirements, we explore the design space for distributed key-value datastores and implement two successive mutations of etcd for different points: mergeable-etcd and dismerge, trading linearizability for causal consistency based on CRDTs. mergeable-etcd retains the linear revision history but encounters inherent shortcomings, whilst dismerge embraces the causal model. Both stores are local-first, maintaining reliable performance under network partitions and variability, drastically surpassing etcd's performance, whilst maintaining competitive performance in reliable settings. △ Less

Submitted 16 November, 2023; originally announced November 2023.

arXiv:2310.11559 [pdf, other]

Confidential Consortium Framework: Secure Multiparty Applications with Confidentiality, Integrity, and High Availability

Authors: Heidi Howard, Fritz Alder, Edward Ashton, Amaury Chamayou, Sylvan Clebsch, Manuel Costa, Antoine Delignat-Lavaud, Cedric Fournet, Andrew Jeffery, Matthew Kerner, Fotios Kounelis, Markus A. Kuppe, Julien Maffre, Mark Russinovich, Christoph M. Wintersteiger

Abstract: Confidentiality, integrity protection, and high availability, abbreviated to CIA, are essential properties for trustworthy data systems. The rise of cloud computing and the growing demand for multiparty applications however means that building modern CIA systems is more challenging than ever. In response, we present the Confidential Consortium Framework (CCF), a general-purpose foundation for deve… ▽ More Confidentiality, integrity protection, and high availability, abbreviated to CIA, are essential properties for trustworthy data systems. The rise of cloud computing and the growing demand for multiparty applications however means that building modern CIA systems is more challenging than ever. In response, we present the Confidential Consortium Framework (CCF), a general-purpose foundation for develo** secure stateful CIA applications. CCF combines centralized compute with decentralized trust, supporting deployment on untrusted cloud infrastructure and transparent governance by mutually untrusted parties. CCF leverages hardware-based trusted execution environments for remotely verifiable confidentiality and code integrity. This is coupled with state machine replication backed by an auditable immutable ledger for data integrity and high availability. CCF enables each service to bring its own application logic, custom multiparty governance model, and deployment scenario, decoupling the operators of nodes from the consortium that governs them. CCF is open-source and available now at https://github.com/microsoft/CCF. △ Less

Submitted 17 October, 2023; originally announced October 2023.

Comments: 16 pages, 9 figures. To appear in the Proceedings of the VLDB Endowment, Volume 17

arXiv:2209.10571 [pdf, other]

Subspace Diagonalization on Quantum Computers using Eigenvector Continuation

Authors: Akhil Francis, Anjali A. Agrawal, Jack H. Howard, Efekan Kökcü, A. F. Kemper

Abstract: Quantum subspace diagonalization (QSD) methods are quantum-classical hybrid methods, commonly used to find ground and excited state energies by projecting the Hamiltonian to a smaller subspace. In applying these, the choice of subspace basis is critical from the perspectives of basis completeness and efficiency of implementation on quantum computers. In this work, we present Eigenvector Continuati… ▽ More Quantum subspace diagonalization (QSD) methods are quantum-classical hybrid methods, commonly used to find ground and excited state energies by projecting the Hamiltonian to a smaller subspace. In applying these, the choice of subspace basis is critical from the perspectives of basis completeness and efficiency of implementation on quantum computers. In this work, we present Eigenvector Continuation (EC) as a QSD method, where low-energy states of the Hamiltonian at different points in parameter space are chosen as the subspace basis. This unique choice enables rapid evaluation of low-energy spectra, including ground and nearby excited states, with minimal hardware effort. As a particular advantage, EC is able to capture the spectrum across ground state crossovers corresponding to different symmetry sectors of the problem. We demonstrate this method for interacting spin models and molecules. △ Less

Submitted 21 September, 2022; originally announced September 2022.

Comments: 13 pages, 10 figures

arXiv:2205.11652 [pdf, other]

BeeGees: stayin' alive in chained BFT

Authors: Ittai Abraham, Natacha Crooks, Neil Giridharan, Heidi Howard, Florian Suri-Payer

Abstract: Modern chained Byzantine Fault Tolerant (BFT) systems leverage a combination of pipelining and leader rotation to obtain both efficiency and fairness. These protocols, however, require a sequence of three or four consecutive honest leaders to commit operations. Therefore, even simple leader failures such as crashes can weaken liveness both theoretically and practically. Obtaining a chained BFT pro… ▽ More Modern chained Byzantine Fault Tolerant (BFT) systems leverage a combination of pipelining and leader rotation to obtain both efficiency and fairness. These protocols, however, require a sequence of three or four consecutive honest leaders to commit operations. Therefore, even simple leader failures such as crashes can weaken liveness both theoretically and practically. Obtaining a chained BFT protocol that reaches decisions even if the sequence of honest leaders is non-consecutive, remains an open question. To resolve this question we present BeeGees, a novel chained BFT protocol that successfully commits blocks even with non-consecutive honest leaders. It does this while also maintaining quadratic word complexity with threshold signatures, linear word complexity with SNARKs, and responsiveness between consecutive honest leaders. BeeGees reduces the expected commit latency of HotStuff by a factor of three under failures, and the worst-case latency by a factor of seven. △ Less

Submitted 26 January, 2023; v1 submitted 23 May, 2022; originally announced May 2022.

arXiv:2203.03058 [pdf, ps, other]

Relaxed Paxos: Quorum Intersection Revisited (Again)

Authors: Heidi Howard, Richard Mortier

Abstract: Distributed consensus, the ability to reach agreement in the face of failures, is a fundamental primitive for constructing reliable distributed systems. The Paxos algorithm is synonymous with consensus and widely utilized in production. Paxos uses two phases: phase one and phase two, each requiring a quorum of acceptors, to reach consensus during a round of the protocol. Traditionally, Paxos requi… ▽ More Distributed consensus, the ability to reach agreement in the face of failures, is a fundamental primitive for constructing reliable distributed systems. The Paxos algorithm is synonymous with consensus and widely utilized in production. Paxos uses two phases: phase one and phase two, each requiring a quorum of acceptors, to reach consensus during a round of the protocol. Traditionally, Paxos requires that all quorums, regardless of phase or round, intersect and majorities are often used for this purpose. Flexible Paxos proved that it is only necessary for phase one quorum of a given round to intersect with the phase two quorums of all previous rounds. In this paper, we re-examine how Paxos approaches the problem of consensus. We look again at quorum intersection in Flexible Paxos and observe that quorum intersection can be safely weakened further. Most notably, we observe that if a proposer learns that a value was proposed in some previous round then its phase one no longer needs to intersect with the phase two quorums from that round or from any previous rounds. Furthermore, in order to provide an intuitive explanation of our results, we propose a novel abstraction for reasoning about Paxos which utilizes write-once registers. △ Less

Submitted 6 March, 2022; originally announced March 2022.

Comments: to be published in the 9th Workshop on Principles and Practice of Consistency for Distributed Data (PaPoC'22)

arXiv:2104.04102 [pdf, other]

Read-Write Quorum Systems Made Practical

Authors: Michael Whittaker, Aleksey Charapko, Joseph M. Hellerstein, Heidi Howard, Ion Stoica

Abstract: Quorum systems are a powerful mechanism for ensuring the consistency of replicated data. Production systems usually opt for majority quorums due to their simplicity and fault tolerance, but majority quorum systems provide poor throughput and scalability. Alternatively, researchers have invented a number of theoretically "optimal" quorum systems, but the underlying theory ignores many practical com… ▽ More Quorum systems are a powerful mechanism for ensuring the consistency of replicated data. Production systems usually opt for majority quorums due to their simplicity and fault tolerance, but majority quorum systems provide poor throughput and scalability. Alternatively, researchers have invented a number of theoretically "optimal" quorum systems, but the underlying theory ignores many practical complexities such as machine heterogeneity and workload skew. In this paper, we conduct a pragmatic re-examination of quorum systems. We enrich the current theory on quorum systems with a number of practical refinements to find quorum systems that provide higher throughput, lower latency, and lower network load. We also develop a library Quoracle that precisely quantifies the available trade-offs between quorum systems to empower engineers to find optimal quorum systems, given a set of objectives for specific deployments and workloads. Our tool is available online at: https://github.com/mwhittaker/quoracle. △ Less

Submitted 8 April, 2021; originally announced April 2021.

Comments: To be published in PaPoC 2021 (https://papoc-workshop.github.io/2021/)

arXiv:2104.02423 [pdf, other]

doi 10.1145/3434770.3459730

Rearchitecting Kubernetes for the Edge

Authors: Andrew Jeffery, Heidi Howard, Richard Mortier

Abstract: Recent years have seen Kubernetes emerge as a primary choice for container orchestration. Kubernetes largely targets the cloud environment but new use cases require performant, available and scalable orchestration at the edge. Kubernetes stores all cluster state in etcd, a strongly consistent key-value store. We find that at larger etcd cluster sizes, offering higher availability, write request la… ▽ More Recent years have seen Kubernetes emerge as a primary choice for container orchestration. Kubernetes largely targets the cloud environment but new use cases require performant, available and scalable orchestration at the edge. Kubernetes stores all cluster state in etcd, a strongly consistent key-value store. We find that at larger etcd cluster sizes, offering higher availability, write request latency significantly increases and throughput decreases similarly. Coupled with approximately 30% of Kubernetes requests being writes, this directly impacts the request latency and availability of Kubernetes, reducing its suitability for the edge. We revisit the requirement of strong consistency and propose an eventually consistent approach instead. This enables higher performance, availability and scalability whilst still supporting the broad needs of Kubernetes. This aims to make Kubernetes much more suitable for performance-critical, dynamically-scaled edge solutions. △ Less

Submitted 6 April, 2021; originally announced April 2021.

Comments: 6 pages. Accepted in EdgeSys '21

Journal ref: Proceedings of the 4th International Workshop on Edge Systems, Analytics and Networking (2021) 7-12

arXiv:2012.15762 [pdf, other]

Scaling Replicated State Machines with Compartmentalization [Technical Report]

Authors: Michael Whittaker, Ailidani Ailijiang, Aleksey Charapko, Murat Demirbas, Neil Giridharan, Joseph M. Hellerstein, Heidi Howard, Ion Stoica, Adriana Szekeres

Abstract: State machine replication protocols, like MultiPaxos and Raft, are a critical component of many distributed systems and databases. However, these protocols offer relatively low throughput due to several bottlenecked components. Numerous existing protocols fix different bottlenecks in isolation but fall short of a complete solution. When you fix one bottleneck, another arises. In this paper, we int… ▽ More State machine replication protocols, like MultiPaxos and Raft, are a critical component of many distributed systems and databases. However, these protocols offer relatively low throughput due to several bottlenecked components. Numerous existing protocols fix different bottlenecks in isolation but fall short of a complete solution. When you fix one bottleneck, another arises. In this paper, we introduce compartmentalization, the first comprehensive technique to eliminate state machine replication bottlenecks. Compartmentalization involves decoupling individual bottlenecks into distinct components and scaling these components independently. Compartmentalization has two key strengths. First, compartmentalization leads to strong performance. In this paper, we demonstrate how to compartmentalize MultiPaxos to increase its throughput by 6x on a write-only workload and 16x on a mixed read-write workload. Unlike other approaches, we achieve this performance without the need for specialized hardware. Second, compartmentalization is a technique, not a protocol. Industry practitioners can apply compartmentalization to their protocols incrementally without having to adopt a completely new protocol. △ Less

Submitted 16 May, 2021; v1 submitted 31 December, 2020; originally announced December 2020.

Comments: Technical Report

arXiv:2012.00472 [pdf, other]

Byzantine Eventual Consistency and the Fundamental Limits of Peer-to-Peer Databases

Authors: Martin Kleppmann, Heidi Howard

Abstract: Sybil attacks, in which a large number of adversary-controlled nodes join a network, are a concern for many peer-to-peer database systems, necessitating expensive countermeasures such as proof-of-work. However, there is a category of database applications that are, by design, immune to Sybil attacks because they can tolerate arbitrary numbers of Byzantine-faulty nodes. In this paper, we characteri… ▽ More Sybil attacks, in which a large number of adversary-controlled nodes join a network, are a concern for many peer-to-peer database systems, necessitating expensive countermeasures such as proof-of-work. However, there is a category of database applications that are, by design, immune to Sybil attacks because they can tolerate arbitrary numbers of Byzantine-faulty nodes. In this paper, we characterize this category of applications using a consistency model we call Byzantine Eventual Consistency (BEC). We introduce an algorithm that guarantees BEC based on Byzantine causal broadcast, prove its correctness, and demonstrate near-optimal performance in a prototype implementation. △ Less

Submitted 1 December, 2020; originally announced December 2020.

arXiv:2008.02671 [pdf, other]

Fast Flexible Paxos: Relaxing Quorum Intersection for Fast Paxos

Authors: Heidi Howard, Aleksey Charapko, Richard Mortier

Abstract: Paxos, the de facto standard approach to solving distributed consensus, operates in two phases, each of which requires an intersecting quorum of nodes. Multi-Paxos reduces this to one phase by electing a leader but this leader is also a performance bottleneck. Fast Paxos bypasses the leader but has stronger quorum intersection requirements. In this paper we observe that Fast Paxos' intersection… ▽ More Paxos, the de facto standard approach to solving distributed consensus, operates in two phases, each of which requires an intersecting quorum of nodes. Multi-Paxos reduces this to one phase by electing a leader but this leader is also a performance bottleneck. Fast Paxos bypasses the leader but has stronger quorum intersection requirements. In this paper we observe that Fast Paxos' intersection requirements can be safely relaxed, reducing to just one additional intersection requirement between phase-1 quorums and any pair of fast round phase-2 quorums. We thus find that the quorums used with Fast Paxos are larger than necessary, allowing alternative quorum systems to obtain new tradeoffs between performance and fault-tolerance. △ Less

Submitted 16 October, 2020; v1 submitted 6 August, 2020; originally announced August 2020.

Comments: To be published in the Proceedings of International Conference on Distributed Computing and Networking 2021 (ICDCN '21)

arXiv:2007.09468 [pdf, other]

Matchmaker Paxos: A Reconfigurable Consensus Protocol [Technical Report]

Authors: Michael Whittaker, Neil Giridharan, Adriana Szekeres, Joseph M. Hellerstein, Heidi Howard, Faisal Nawab, Ion Stoica

Abstract: State machine replication protocols, like MultiPaxos and Raft, are at the heart of nearly every strongly consistent distributed database. To tolerate machine failures, these protocols must replace failed machines with live machines, a process known as reconfiguration. Reconfiguration has become increasingly important over time as the need for frequent reconfiguration has grown. Despite this, recon… ▽ More State machine replication protocols, like MultiPaxos and Raft, are at the heart of nearly every strongly consistent distributed database. To tolerate machine failures, these protocols must replace failed machines with live machines, a process known as reconfiguration. Reconfiguration has become increasingly important over time as the need for frequent reconfiguration has grown. Despite this, reconfiguration has largely been neglected in the literature. In this paper, we present Matchmaker Paxos and Matchmaker MultiPaxos, a reconfigurable consensus and state machine replication protocol respectively. Our protocols can perform a reconfiguration with little to no impact on the latency or throughput of command processing; they can perform a reconfiguration in one round trip (theoretically) and a few milliseconds (empirically); they provide a number of theoretical insights; and they present a framework that can be generalized to other replication protocols in a way that previous reconfiguration techniques can not. We provide proofs of correctness for the protocols and optimizations, and present empirical results from an open source implementation. △ Less

Submitted 20 July, 2020; v1 submitted 18 July, 2020; originally announced July 2020.

arXiv:2004.05074 [pdf, ps, other]

doi 10.1145/3380787.3393681

Paxos vs Raft: Have we reached consensus on distributed consensus?

Authors: Heidi Howard, Richard Mortier

Abstract: Distributed consensus is a fundamental primitive for constructing fault-tolerant, strongly-consistent distributed systems. Though many distributed consensus algorithms have been proposed, just two dominate production systems: Paxos, the traditional, famously subtle, algorithm; and Raft, a more recent algorithm positioned as a more understandable alternative to Paxos. In this paper, we consider t… ▽ More Distributed consensus is a fundamental primitive for constructing fault-tolerant, strongly-consistent distributed systems. Though many distributed consensus algorithms have been proposed, just two dominate production systems: Paxos, the traditional, famously subtle, algorithm; and Raft, a more recent algorithm positioned as a more understandable alternative to Paxos. In this paper, we consider the question of which algorithm, Paxos or Raft, is the better solution to distributed consensus? We analyse both to determine exactly how they differ by describing a simplified Paxos algorithm using Raft's terminology and pragmatic abstractions. We find that both Paxos and Raft take a very similar approach to distributed consensus, differing only in their approach to leader election. Most notably, Raft only allows servers with up-to-date logs to become leaders, whereas Paxos allows any server to be leader provided it then updates its log to ensure it is up-to-date. Raft's approach is surprisingly efficient given its simplicity as, unlike Paxos, it does not require log entries to be exchanged during leader election. We surmise that much of the understandability of Raft comes from the paper's clear presentation rather than being fundamental to the underlying algorithm being presented. △ Less

Submitted 27 April, 2020; v1 submitted 10 April, 2020; originally announced April 2020.

Comments: To be published in the 7th Workshop on Principles and Practice of Consistency for Distributed Data (PaPoC)

arXiv:1902.06776 [pdf, ps, other]

A Generalised Solution to Distributed Consensus

Authors: Heidi Howard, Richard Mortier

Abstract: Distributed consensus, the ability to reach agreement in the face of failures and asynchrony, is a fundamental primitive for constructing reliable distributed systems from unreliable components. The Paxos algorithm is synonymous with distributed consensus, yet it performs poorly in practice and is famously difficult to understand. In this paper, we re-examine the foundations of distributed consens… ▽ More Distributed consensus, the ability to reach agreement in the face of failures and asynchrony, is a fundamental primitive for constructing reliable distributed systems from unreliable components. The Paxos algorithm is synonymous with distributed consensus, yet it performs poorly in practice and is famously difficult to understand. In this paper, we re-examine the foundations of distributed consensus. We derive an abstract solution to consensus, which utilises immutable state for intuitive reasoning about safety. We prove that our abstract solution generalises over Paxos as well as the Fast Paxos and Flexible Paxos algorithms. The surprising result of this analysis is a substantial weakening to the quorum requirements of these widely studied algorithms. △ Less

Submitted 18 February, 2019; originally announced February 2019.

arXiv:1608.06696 [pdf, other]

Flexible Paxos: Quorum intersection revisited

Authors: Heidi Howard, Dahlia Malkhi, Alexander Spiegelman

Abstract: Distributed consensus is integral to modern distributed systems. The widely adopted Paxos algorithm uses two phases, each requiring majority agreement, to reliably reach consensus. In this paper, we demonstrate that Paxos, which lies at the foundation of many production systems, is conservative. Specifically, we observe that each of the phases of Paxos may use non-intersecting quorums. Majority qu… ▽ More Distributed consensus is integral to modern distributed systems. The widely adopted Paxos algorithm uses two phases, each requiring majority agreement, to reliably reach consensus. In this paper, we demonstrate that Paxos, which lies at the foundation of many production systems, is conservative. Specifically, we observe that each of the phases of Paxos may use non-intersecting quorums. Majority quorums are not necessary as intersection is required only across phases. Using this weakening of the requirements made in the original formulation, we propose Flexible Paxos, which generalizes over the Paxos algorithm to provide flexible quorums. We show that Flexible Paxos is safe, efficient and easy to utilize in existing distributed systems. We conclude by discussing the wide reaching implications of this result. Examples include improved availability from reducing the size of second phase quorums by one when the number of acceptors is even and utilizing small disjoint phase-2 quorums to speed up the steady-state. △ Less

Submitted 23 August, 2016; originally announced August 2016.

ACM Class: C.2.4

arXiv:1501.04737 [pdf, ps, other]

Personal Data: Thinking Inside the Box

Authors: Hamed Haddadi, Heidi Howard, Amir Chaudhry, Jon Crowcroft, Anil Madhavapeddy, Richard Mortier

Abstract: We propose there is a need for a technical platform enabling people to engage with the collection, management and consumption of personal data; and that this platform should itself be personal, under the direct control of the individual whose data it holds. In what follows, we refer to this platform as the Databox, a personal, networked service that collates personal data and can be used to make t… ▽ More We propose there is a need for a technical platform enabling people to engage with the collection, management and consumption of personal data; and that this platform should itself be personal, under the direct control of the individual whose data it holds. In what follows, we refer to this platform as the Databox, a personal, networked service that collates personal data and can be used to make those data available. While your Databox is likely to be a virtual platform, in that it will involve multiple devices and services, at least one instance of it will exist in physical form such as on a physical form-factor computing device with associated storage and networking, such as a home hub. △ Less

Submitted 20 January, 2015; originally announced January 2015.

Showing 1–22 of 22 results for author: Howard, H