-
Trees and Turtles: Modular Abstractions for State Machine Replication Protocols
Authors:
Natalie Neamtu,
Haobin Ni,
Robbert van Renesse
Abstract:
We present two abstractions for designing modular state machine replication (SMR) protocols: trees and turtles. A tree captures the set of possible state machine histories, while a turtle represents a subprotocol that tries to find agreement in this tree. We showcase the applicability of these abstractions by constructing crash-tolerant SMR protocols out of abstract tree turtles and providing examples of tree turtle implementations. Tree turtles can also be extended to be made Byzantine fault-tolerant (BFT). The modularity of tree turtles allows a generic approach for adding a leader for liveness. We expect that these abstractions will simplify reasoning and formal verification of SMR protocols as well as facilitate innovation in protocol designs.
Submitted 6 May, 2023; v1 submitted 16 April, 2023;
originally announced April 2023.
-
Heterogeneous Paxos: Technical Report
Authors:
Isaac Sheff,
Xinwen Wang,
Robbert van Renesse,
Andrew C. Myers
Abstract:
In distributed systems, a group of $\textit{learners}$ achieve $\textit{consensus}$ when, by observing the output of some $\textit{acceptors}$, they all arrive at the same value. Consensus is crucial for ordering transactions in failure-tolerant systems. Traditional consensus algorithms are homogeneous in three ways:
- all learners are treated equally,
- all acceptors are treated equally, and
- all failures are treated equally.
These assumptions, however, are unsuitable for cross-domain applications, including blockchains, where not all acceptors are equally trustworthy, and not all learners have the same assumptions and priorities. We present the first consensus algorithm to be heterogeneous in all three respects. Learners set their own mixed failure tolerances over differently trusted sets of acceptors. We express these assumptions in a novel $\textit{Learner Graph}$, and demonstrate sufficient conditions for consensus. We present $\textit{Heterogeneous Paxos}$: an extension of Byzantine Paxos. Heterogeneous Paxos achieves consensus for any viable Learner Graph in best-case three message sends, which is optimal. We present a proof-of-concept implementation, and demonstrate how tailoring for heterogeneous scenarios can save resources and latency.
Submitted 8 December, 2020; v1 submitted 16 November, 2020;
originally announced November 2020.
-
CedrusDB: Persistent Key-Value Store with Memory-Mapped Lazy-Trie
Authors:
Maofan Yin,
Hongbo Zhang,
Robbert van Renesse,
Emin Gün Sirer
Abstract:
As a result of RAM becoming cheaper, there has been a trend in key-value store design towards maintaining a fast in-memory index (such as a hash table) while logging user operations to disk, allowing high performance under failure-free conditions while still being able to recover from failures. This design, however, comes at the cost of long recovery times or expensive checkpoint operations. This paper presents a new in-memory index that is also storage-friendly. A "lazy-trie" is a variant of the hash-trie data structure that achieves near-optimal height, has practical storage overhead, and can be maintained on-disk with standard write-ahead logging.
We implemented CedrusDB, a persistent key-value store based on a lazy-trie. The lazy-trie is kept on disk while made available in memory using standard memory-mapping. The lazy-trie organization in virtual memory allows CedrusDB to better leverage concurrent processing than other on-disk index schemes (LSMs, B+-trees). CedrusDB achieves comparable or superior performance to recent log-based in-memory key-value stores in mixed workloads while being able to recover quickly from failures.
Submitted 21 July, 2021; v1 submitted 27 May, 2020;
originally announced May 2020.
-
Asynchronous Consensus Without Rounds
Authors:
Robbert van Renesse
Abstract:
Fault tolerant consensus protocols usually involve ordered rounds of voting between a collection of processes. In this paper, we derive a general specification of fault tolerant asynchronous consensus protocols and present a class of consensus protocols that refine this specification without using rounds. Crash-tolerant protocols in this class use 3f+1 processes, while Byzantine-tolerant protocols use 5f+1 processes.
Submitted 28 August, 2019;
originally announced August 2019.
-
Scalable and Probabilistic Leaderless BFT Consensus through Metastability
Authors:
Team Rocket,
Maofan Yin,
Kevin Sekniqi,
Robbert van Renesse,
Emin Gün Sirer
Abstract:
This paper introduces a family of leaderless Byzantine fault tolerance protocols, built around a metastable mechanism via network subsampling. These protocols provide a strong probabilistic safety guarantee in the presence of Byzantine adversaries while their concurrent and leaderless nature enables them to achieve high throughput and scalability. Unlike blockchains that rely on proof-of-work, they are quiescent and green. Unlike traditional consensus protocols where one or more nodes typically process linear bits in the number of total nodes per decision, no node processes more than logarithmic bits. They do not require accurate knowledge of all participants and expose new possible tradeoffs and improvements in safety and liveness for building consensus protocols.
The paper describes the Snow protocol family, analyzes its guarantees, and describes how it can be used to construct the core of an internet-scale electronic payment system called Avalanche, which is evaluated in a large scale deployment. Experiments demonstrate that the system can achieve high throughput (3400 tps), provide low confirmation latency (1.35 sec), and scale well compared to existing systems that deliver similar functionality. For our implementation and setup, the bottleneck of the system is in transaction verification.
Submitted 24 August, 2020; v1 submitted 20 June, 2019;
originally announced June 2019.
-
Charlotte: Composable Authenticated Distributed Data Structures, Technical Report
Authors:
Isaac Sheff,
Xinwen Wang,
Haobin Ni,
Robbert van Renesse,
Andrew C. Myers
Abstract:
We present Charlotte, a framework for composable, authenticated distributed data structures. Charlotte data is stored in blocks that reference each other by hash. Together, all Charlotte blocks form a directed acyclic graph, the blockweb; all observers and applications use subgraphs of the blockweb for their own data structures. Unlike prior systems, Charlotte data structures are composable: applications and data structures can operate fully independently when possible, and share blocks when desired. To support this composability, we define a language-independent format for Charlotte blocks and a network API for Charlotte servers.
An authenticated distributed data structure guarantees that data is immutable and self-authenticating: data referenced will be unchanged when it is retrieved. Charlotte extends these guarantees by allowing applications to plug in their own mechanisms for ensuring availability and integrity of data structures. Unlike most traditional distributed systems, including distributed databases, blockchains, and distributed hash tables, Charlotte supports heterogeneous trust: different observers may have their own beliefs about who might fail, and how. Despite heterogeneity of trust, Charlotte presents each observer with a consistent, available view of data.
We demonstrate the flexibility of Charlotte by implementing a variety of integrity mechanisms, including consensus and proof of work. We study the power of disentangling availability and integrity mechanisms by building a variety of applications. The results from these examples suggest that developers can use Charlotte to build flexible, fast, composable applications with strong guarantees.
Submitted 9 May, 2019;
originally announced May 2019.
-
A Web of Blocks
Authors:
Isaac Sheff,
Xinwen Wang,
Andrew C. Myers,
Robbert van Renesse
Abstract:
Blockchains offer a useful abstraction: a trustworthy, decentralized log of totally ordered transactions. Traditional blockchains have problems with scalability and efficiency, preventing their use for many applications. These limitations arise from the requirement that all participants agree on the total ordering of transactions. To address this fundamental shortcoming, we introduce Charlotte, a system for maintaining decentralized, authenticated data structures, including transaction logs. Each data structure -- indeed, each block -- specifies its own availability and integrity properties, allowing Charlotte applications to retain the full benefits of permissioned or permissionless blockchains. In Charlotte, a block can be atomically appended to multiple logs, allowing applications to be interoperable when they want to, without inefficiently forcing all applications to share one big log. We call this open graph of interconnected blocks a blockweb. We allow new kinds of blockweb applications that operate beyond traditional chains. We demonstrate the viability of Charlotte applications with proof-of-concept servers running interoperable blockchains. Using performance data from our prototype, we estimate that when compared with traditional blockchains, Charlotte offers multiple orders of magnitude improvement in speed and energy efficiency.
Submitted 18 June, 2018;
originally announced June 2018.
-
Decentralization in Bitcoin and Ethereum Networks
Authors:
Adem Efe Gencer,
Soumya Basu,
Ittay Eyal,
Robbert van Renesse,
Emin Gün Sirer
Abstract:
Blockchain-based cryptocurrencies have demonstrated how to securely implement traditionally centralized systems, such as currencies, in a decentralized fashion. However, there have been few measurement studies on the level of decentralization they achieve in practice. We present a measurement study on various decentralization metrics of two of the leading cryptocurrencies with the largest market capitalization and user base, Bitcoin and Ethereum. We investigate the extent of decentralization by measuring the network resources of nodes and the interconnection among them, the protocol requirements affecting the operation of nodes, and the robustness of the two systems against attacks. In particular, we adapted existing internet measurement techniques and used the Falcon Relay Network as a novel measurement tool to obtain our data. We discovered that neither Bitcoin nor Ethereum has strictly better properties than the other. We also provide concrete suggestions for improving both systems.
Submitted 29 March, 2018; v1 submitted 11 January, 2018;
originally announced January 2018.
-
Consus: Taming the Paxi
Authors:
Robert Escriva,
Robbert van Renesse
Abstract:
Consus is a strictly serializable geo-replicated transactional key-value store. The key contribution of Consus is a new commit protocol that reduces the cost of executing a transaction to three wide area message delays in the common case. Augmenting the commit protocol are multiple Paxos implementations optimized for different purposes. Together the different implementations and optimizations comprise a cohesive system that provides low latency, high availability, and strong guarantees. This paper describes the techniques implemented in the open source release of Consus, and lays the groundwork for evaluating Consus once the system implementation is sufficiently robust for a thorough evaluation.
Submitted 11 December, 2016;
originally announced December 2016.
-
Service-Oriented Sharding with Aspen
Authors:
Adem Efe Gencer,
Robbert van Renesse,
Emin Gün Sirer
Abstract:
The rise of blockchain-based cryptocurrencies has led to an explosion of services using distributed ledgers as their underlying infrastructure. However, due to inherently single-service oriented blockchain protocols, such services can bloat the existing ledgers, fail to provide sufficient security, or completely forego the property of trustless auditability. Security concerns, trust restrictions, and scalability limits regarding the resource requirements of users hamper the sustainable development of loosely-coupled services on blockchains.
This paper introduces Aspen, a sharded blockchain protocol designed to securely scale with an increasing number of services. Aspen shares the same trust model as Bitcoin in a peer-to-peer network that is prone to extreme churn containing Byzantine participants. It enables introduction of new services without compromising the security, leveraging the trust assumptions, or flooding users with irrelevant messages.
Submitted 21 November, 2016;
originally announced November 2016.
-
Moving Participants Turtle Consensus
Authors:
Stavros Nikolaou,
Robbert van Renesse
Abstract:
We present Moving Participants Turtle Consensus (MPTC), an asynchronous consensus protocol for crash and Byzantine-tolerant distributed systems. MPTC uses various moving target defense strategies to tolerate certain Denial-of-Service (DoS) attacks issued by an adversary capable of compromising a bounded portion of the system. MPTC supports on the fly reconfiguration of the consensus strategy as well as of the processes executing this strategy when solving the problem of agreement. It uses existing cryptographic techniques to ensure that reconfiguration takes place in an unpredictable fashion thus eliminating the adversary's advantage on predicting protocol and execution-specific information that can be used against the protocol.
We implement MPTC as well as a State Machine Replication protocol and evaluate our design under different attack scenarios. Our evaluation shows that MPTC approximates best case scenario performance even under a well-coordinated DoS attack.
Submitted 21 November, 2016; v1 submitted 10 November, 2016;
originally announced November 2016.
-
Safe Serializable Secure Scheduling: Transactions and the Trade-off Between Security and Consistency
Authors:
Isaac Sheff,
Tom Magrino,
Jed Liu,
Andrew C. Myers,
Robbert van Renesse
Abstract:
Modern applications often operate on data in multiple administrative domains. In this federated setting, participants may not fully trust each other. These distributed applications use transactions as a core mechanism for ensuring reliability and consistency with persistent data. However, the coordination mechanisms needed for transactions can both leak confidential information and allow unauthorized influence.
By implementing a simple attack, we show these side channels can be exploited. However, our focus is on preventing such attacks. We explore secure scheduling of atomic, serializable transactions in a federated setting. While we prove that no protocol can guarantee security and liveness in all settings, we establish conditions for sets of transactions that can safely complete under secure scheduling. Based on these conditions, we introduce staged commit, a secure scheduling protocol for federated transactions. This protocol avoids insecure information channels by dividing transactions into distinct stages. We implement a compiler that statically checks code to ensure it meets our conditions, and a system that schedules these transactions using the staged commit protocol. Experiments on this implementation demonstrate that realistic federated transactions can be scheduled securely, atomically, and efficiently.
Submitted 21 August, 2016; v1 submitted 16 August, 2016;
originally announced August 2016.
-
Bitcoin-NG: A Scalable Blockchain Protocol
Authors:
Ittay Eyal,
Adem Efe Gencer,
Emin Gun Sirer,
Robbert van Renesse
Abstract:
Cryptocurrencies, based on and led by Bitcoin, have shown promise as infrastructure for pseudonymous online payments, cheap remittance, trustless digital asset exchange, and smart contracts. However, Bitcoin-derived blockchain protocols have inherent scalability limits that trade off between throughput and latency and withhold the realization of this potential.
This paper presents Bitcoin-NG, a new blockchain protocol designed to scale. Based on Bitcoin's blockchain protocol, Bitcoin-NG is Byzantine fault tolerant, is robust to extreme churn, and shares the same trust model obviating qualitative changes to the ecosystem.
In addition to Bitcoin-NG, we introduce several novel metrics of interest in quantifying the security and efficiency of Bitcoin-like blockchain protocols. We implement Bitcoin-NG and perform large-scale experiments at 15% the size of the operational Bitcoin system, using unchanged clients of both protocols. These experiments demonstrate that Bitcoin-NG scales optimally, with bandwidth limited only by the capacity of the individual nodes and latency limited only by the propagation time of the network.
Submitted 11 November, 2015; v1 submitted 7 October, 2015;
originally announced October 2015.
-
Distributed Protocols and Heterogeneous Trust: Technical Report
Authors:
Isaac C. Sheff,
Robbert van Renesse,
Andrew C. Myers
Abstract:
The robustness of distributed systems is usually phrased in terms of the number of failures of certain types that they can withstand. However, these failure models are too crude to describe the different kinds of trust and expectations of participants in the modern world of complex, integrated systems extending across different owners, networks, and administrative domains. Modern systems often exist in an environment of heterogeneous trust, in which different participants may have different opinions about the trustworthiness of other nodes, and a single participant may consider other nodes to differ in their trustworthiness. We explore how to construct distributed protocols that meet the requirements of all participants, even in heterogeneous trust environments. The key to our approach is using lattice-based information flow to analyse and prove protocol properties. To demonstrate this approach, we show how two earlier distributed algorithms can be generalized to work in the presence of heterogeneous trust: first, Heterogeneous Fast Consensus, an adaptation of the earlier Bosco Fast Consensus protocol; and second, Nysiad, an algorithm for converting crash-tolerant protocols to be Byzantine-tolerant. Through simulations, we show that customizing a protocol to a heterogeneous trust configuration yields performance improvements over the conventional protocol designed for homogeneous trust.
Submitted 9 December, 2014;
originally announced December 2014.
-
Cache Serializability: Reducing Inconsistency in Edge Transactions
Authors:
Ittay Eyal,
Ken Birman,
Robbert van Renesse
Abstract:
Read-only caches are widely used in cloud infrastructures to reduce access latency and load on backend databases. Operators view coherent caches as impractical at genuinely large scale and many client-facing caches are updated in an asynchronous manner with best-effort pipelines. Existing solutions that support cache consistency are inapplicable to this scenario since they require a round trip to the database on every cache transaction.
Existing incoherent cache technologies are oblivious to transactional data access, even if the backend database supports transactions. We propose T-Cache, a novel caching policy for read-only transactions in which inconsistency is tolerable (won't cause safety violations) but undesirable (has a cost). T-Cache improves cache consistency despite asynchronous and unreliable communication between the cache and the database. We define cache-serializability, a variant of serializability that is suitable for incoherent caches, and prove that with unbounded resources T-Cache implements this new specification. With limited resources, T-Cache allows the system manager to choose a trade-off between performance and consistency.
Our evaluation shows that T-Cache detects many inconsistencies with only nominal overhead. We use synthetic workloads to demonstrate the efficacy of T-Cache when data accesses are clustered and its adaptive reaction to workload changes. With workloads based on the real-world topologies, T-Cache detects 43-70% of the inconsistencies and increases the rate of consistent transactions by 33-58%.
Submitted 26 April, 2015; v1 submitted 29 September, 2014;
originally announced September 2014.
-
Vive la Différence: Paxos vs. Viewstamped Replication vs. Zab
Authors:
Robbert Van Renesse,
Nicolas Schiper,
Fred B. Schneider
Abstract:
Paxos, Viewstamped Replication, and Zab are replication protocols that ensure high-availability in asynchronous environments with crash failures. Various claims have been made about similarities and differences between these protocols. But how does one determine whether two protocols are the same, and if not, how significant the differences are?
We propose to address these questions using refinement mappings, where protocols are expressed as succinct specifications that are progressively refined to executable implementations. Doing so enables a principled understanding of the correctness of the different design decisions that went into implementing the various protocols. Additionally, it allowed us to identify key differences that have a significant impact on performance.
Submitted 27 February, 2014; v1 submitted 22 September, 2013;
originally announced September 2013.
-
Secure Abstraction with Code Capabilities
Authors:
Robbert van Renesse,
Håvard Johansen,
Nihar Naigaonkar,
Dag Johansen
Abstract:
We propose embedding executable code fragments in cryptographically protected capabilities to enable flexible discretionary access control in cloud-like computing infrastructures. We are developing this as part of a sports analytics application that runs on a federation of public and enterprise clouds. The capability mechanism is implemented completely in user space. Using a novel combination of X.509 certificates and JavaScript code, the capabilities support restricted delegation, confinement, revocation, and rights amplification for secure abstraction.
Submitted 19 October, 2012;
originally announced October 2012.
-
Nerio: Leader Election and Edict Ordering
Authors:
Robbert van Renesse,
Fred B. Schneider,
Johannes Gehrke
Abstract:
Coordination in a distributed system is facilitated if there is a unique process, the leader, to manage the other processes. The leader creates edicts and sends them to other processes for execution or forwarding to other processes. The leader may fail, and when this occurs a leader election protocol selects a replacement. This paper describes Nerio, a class of such leader election protocols.
Submitted 26 September, 2011; v1 submitted 23 September, 2011;
originally announced September 2011.