-
On the Design of Ethereum Data Availability Sampling: A Comprehensive Simulation Study
Authors:
Arunima Chaudhuri,
Sudipta Basak,
Csaba Kiraly,
Dmitriy Ryajov,
Leonardo Bautista-Gomez
Abstract:
This paper presents an in-depth exploration of Data Availability Sampling (DAS) and sharding mechanisms within decentralized systems through simulation-based analysis. DAS, a pivotal concept in blockchain technology and decentralized networks, is thoroughly examined to unravel its intricacies and assess its impact on system performance. Through the development of a simulator tailored explicitly for DAS, we embark on a comprehensive investigation into the parameters that influence system behavior and efficiency. A series of experiments are conducted within the simulated environment to validate theoretical formulations and dissect the interplay of DAS parameters. This includes an exploration of approaches such as custody by row, variations in validators per node, and malicious nodes. The outcomes of these experiments furnish insights into the efficacy of DAS protocols and pave the way for the formulation of optimization strategies geared towards enhancing decentralized network performance. Moreover, the findings serve as guidelines for future research endeavors, offering a nuanced understanding of the complexities inherent in decentralized systems. This study not only contributes to the theoretical understanding of DAS but also offers practical implications for the design, implementation, and optimization of decentralized systems.
Submitted 25 July, 2024;
originally announced July 2024.
-
Exploring Correlation Patterns in the Ethereum Validator Network
Authors:
Simon Brown,
Leonardo Bautista-Gomez
Abstract:
There have been several studies into measuring the level of decentralization in Ethereum through applying various indices to indicate the relative dominance of entities in different domains in the ecosystem. However, these indices do not capture any correlation between those different entities that could potentially make them subject to external coercion or covert collusion. We propose an index that measures the relative dominance of entities based on the application of correlation factors. We posit that this approach produces a more nuanced and accurate index of decentralization.
Submitted 22 March, 2024;
originally announced April 2024.
-
Scalability limitations of Kademlia DHTs when enabling Data Availability Sampling in Ethereum
Authors:
Mikel Cortes-Goicoechea,
Csaba Kiraly,
Dmitriy Ryajov,
Jose Luis Muñoz-Tapia,
Leonardo Bautista-Gomez
Abstract:
Scalability in blockchain remains a significant challenge, especially when prioritizing decentralization and security. The Ethereum community has proposed comprehensive data-sharding techniques to overcome storage, computational, and network processing limitations. In this context, the propagation and availability of large blocks become the subject of research to achieve scalable data-sharding. This paper provides insights after exploring the usage of a Kademlia-based DHT to enable Data Availability Sampling (DAS) in Ethereum. It presents a DAS-DHT simulator to study this problem and validates the results of the simulator with experiments in a real DHT network, IPFS. Our results help us understand what parts of DAS can be achieved based on existing Kademlia DHT solutions and which ones cannot. We discuss the limitations of DHT solutions and discuss other alternatives.
Submitted 15 February, 2024;
originally announced February 2024.
-
Can we run our Ethereum nodes at home?
Authors:
Mikel Cortes-Goicoechea,
Tarun Mohandas-Daryanani,
Jose L. Muñoz-Tapia,
Leonardo Bautista-Gomez
Abstract:
Scalability is a common issue among the most used permissionless blockchains, and several approaches have been proposed to solve this issue. Tackling scalability while preserving the security and decentralization of the network is a significant challenge. To deliver effective scaling solutions, Ethereum achieved a major protocol improvement, including a change in the consensus mechanism towards Proof of Stake. This improvement aimed at a vast reduction of the hardware requirements to run a node, leading to significant sustainability benefits with a lower network energy consumption. This work analyzes the resource usage behavior of different clients running as Ethereum consensus nodes, comparing their performance under different configurations and analyzing their differences. Our results show higher requirements than claimed initially and how different clients react to network perturbations. Furthermore, we discuss the differences between the consensus clients, including their strong points and limitations.
Submitted 9 November, 2023;
originally announced November 2023.
-
Unveiling Ethereum's Hidden Centralization Incentives: Does Connectivity Impact Performance?
Authors:
Mikel Cortes-Goicoechea,
Tarun Mohandas-Daryanani,
Jose Luis Munoz-Tapia,
Leonardo Bautista-Gomez
Abstract:
Modern public blockchains like Ethereum rely on p2p networks to run distributed and censorship-resistant applications. With its wide adoption, it operates as a highly critical public ledger. On its transition to become more scalable and sustainable, shifting to PoS without sacrificing the security and resilience of PoW, Ethereum offers a range of consensus clients to participate in the network. In this paper, we present a methodology to measure the performance of the consensus clients based on the latency to receive messages from the p2p network. The paper includes a study that identifies the incentives and limitations that the network experiences, presenting insights about the latency impact derived from running the software in different locations.
Submitted 23 September, 2023;
originally announced September 2023.
-
Autopsy of Ethereum's Post-Merge Reward System
Authors:
Mikel Cortes-Goicoechea,
Tarun Mohandas-Daryanani,
Jose Luis Muñoz-Tapia,
Leonardo Bautista-Gomez
Abstract:
Like most modern blockchain networks, Ethereum has relied on economic incentives to promote honest participation in the chain's consensus. The distributed character of the platform, together with the "randomness" or "luck" factor that both proof of work (PoW) and proof of stake (PoS) provide when electing the next block proposer, pushed the industry to model and improve the reward system of the platform. With several improvements to predict PoW block proposal rewards and to maximize the rewards extractable from them, Ethereum's final transition to PoS, applied in the Paris Hard-Fork and more generally known as "The Merge", has meant a significant modification of the reward system in the platform. In this paper, we aim to break down both theoretically and empirically the new reward system in this post-merge era. We present a highly detailed description of the different rewards and their share among validators' rewards. Ultimately, we offer a study that uses the presented reward model to analyze the performance of the network during this transition.
Submitted 17 March, 2023;
originally announced March 2023.
-
Discovering the Ethereum2 P2P Network
Authors:
Mikel Cortes-Goicoechea,
Leonardo Bautista-Gomez
Abstract:
Achieving an equilibrium between scalability, sustainability, and security while keeping decentralization has been the target for decentralized blockchain applications over the last years. Several approaches have been proposed by multiple blockchain teams to achieve it, Ethereum being among them. Ethereum is on the path of a major protocol improvement called Ethereum 2.0 (Eth2), implementing Sharding and introducing Proof-of-Stake (PoS). As the change of consensus mechanism is a delicate matter, this improvement will be achieved through different phases, the first of which is the implementation of the Beacon Chain. Like Ethereum1, Eth2 relies on a decentralized peer-to-peer (p2p) network for message distribution. To date, we estimate that there are around 5,000 geographically distributed nodes in the Eth2 main net. However, the topology of this network remains unknown. In this paper, we present the results obtained from the analysis we performed on the Eth2 p2p network, describing the topology of the network as well as the possible hazards that it implies.
Submitted 22 September, 2022; v1 submitted 29 December, 2020;
originally announced December 2020.
-
Resource Analysis of Ethereum 2.0 Clients
Authors:
Mikel Cortes-Goicoechea,
Luca Franceschini,
Leonardo Bautista-Gomez
Abstract:
Scalability is a common issue among the most used permissionless blockchains, and several approaches have been proposed accordingly. As Ethereum is set to be a solid foundation for a decentralized Internet web, the need for tackling scalability issues while preserving the security of the network is an important challenge. In order to successfully deliver effective scaling solutions, Ethereum is on the path of a major protocol improvement called Ethereum 2.0 (Eth2), which implements sharding. As the change of consensus mechanism is an extremely delicate matter, this improvement will be achieved through different phases, the first of which is the implementation of the Beacon Chain. For this, a specification has been developed and multiple groups have implemented clients to run the new protocol. In this work, we analyse the resource usage behaviour of different clients running as Eth2 nodes, comparing their performance and analysing differences. Our results show multiple network perturbations and how different clients react to them.
Submitted 29 December, 2020;
originally announced December 2020.
-
On the Applicability of PEBS based Online Memory Access Tracking for Heterogeneous Memory Management at Scale
Authors:
Aleix Roca Nonell,
Balazs Gerofi,
Leonardo Bautista-Gomez,
Dominique Martinet,
Vicenç Beltran Querol,
Yutaka Ishikawa
Abstract:
Operating systems have historically had to manage only a single type of memory device. The imminent availability of heterogeneous memory devices based on emerging memory technologies confronts the classic single memory model and opens a new spectrum of possibilities for memory management. Transparent data movement between different memory devices based on access patterns of applications is a desired feature to make optimal use of such devices and to hide the complexity of memory management to the end-user. However, capturing memory access patterns of an application at runtime comes at a cost, which is particularly challenging for large scale parallel applications that may be sensitive to system noise.
In this work, we focus on the access pattern profiling phase prior to the actual memory relocation. We study the feasibility of using Intel's Processor Event-Based Sampling (PEBS) feature to record memory accesses by sampling at runtime and study the overhead at scale. We have implemented a custom PEBS driver in the IHK/McKernel lightweight multi-kernel operating system, one of whose advantages is minimal system interference due to the lightweight kernel's simple design compared to other OS kernels such as Linux. We present the PEBS overhead of a set of scientific applications and show the access patterns identified in noise-sensitive HPC applications. Our results show that clear access patterns can be captured with a 10% overhead in the worst-case and 1% in the best case when running on up to 128k CPU cores (2,048 Intel Xeon Phi Knights Landing nodes). We conclude that online memory access profiling using PEBS at large scale is promising for memory management in heterogeneous memory environments.
Submitted 26 November, 2020;
originally announced November 2020.
-
Resiliency in Numerical Algorithm Design for Extreme Scale Simulations
Authors:
Emmanuel Agullo,
Mirco Altenbernd,
Hartwig Anzt,
Leonardo Bautista-Gomez,
Tommaso Benacchio,
Luca Bonaventura,
Hans-Joachim Bungartz,
Sanjay Chatterjee,
Florina M. Ciorba,
Nathan DeBardeleben,
Daniel Drzisga,
Sebastian Eibl,
Christian Engelmann,
Wilfried N. Gansterer,
Luc Giraud,
Dominik Goeddeke,
Marco Heisig,
Fabienne Jezequel,
Nils Kohl,
Xiaoye Sherry Li,
Romain Lion,
Miriam Mehl,
Paul Mycek,
Michael Obersteiner,
Enrique S. Quintana-Orti,
et al. (11 additional authors not shown)
Abstract:
This work is based on the seminar titled "Resiliency in Numerical Algorithm Design for Extreme Scale Simulations", held March 1-6, 2020 at Schloss Dagstuhl, which was attended by all the authors.
Naive versions of conventional resilience techniques will not scale to the exascale regime: with a main memory footprint of tens of Petabytes, synchronously writing checkpoint data all the way to background storage at frequent intervals will create intolerable overheads in runtime and energy consumption. Forecasts show that the mean time between failures could be lower than the time to recover from such a checkpoint, so that large calculations at scale might not make any progress if robust alternatives are not investigated.
More advanced resilience techniques must be devised. The key may lie in exploiting both advanced system features as well as specific application knowledge. Research will face two essential questions: (1) what are the reliability requirements for a particular computation and (2) how do we best design the algorithms and software to meet these requirements? One avenue would be to refine and improve on system- or application-level checkpointing and rollback strategies in the case an error is detected. Developers might use fault notification interfaces and flexible runtime systems to respond to node failures in an application-dependent fashion. Novel numerical algorithms or more stochastic computational approaches may be required to meet accuracy requirements in the face of undetectable soft errors.
The goal of this Dagstuhl Seminar was to bring together a diverse group of scientists with expertise in exascale computing to discuss novel ways to make applications resilient against detected and undetected faults. In particular, participants explored the role that algorithms and applications play in the holistic approach needed to tackle this challenge.
Submitted 26 October, 2020;
originally announced October 2020.
-
Extending the OpenCHK Model with Advanced Checkpoint Features
Authors:
Marcos Maroñas,
Sergi Mateo,
Kai Keller,
Leonardo Bautista-Gomez,
Eduard Ayguadé,
Vicenç Beltran
Abstract:
One of the major challenges in using extreme scale systems efficiently is to mitigate the impact of faults. Application-level checkpoint/restart (CR) methods provide the best trade-off between productivity, robustness, and performance. There are many solutions implementing CR at the application level. They all provide advanced I/O capabilities to minimize the overhead introduced by CR. Nevertheless, there is still room for improvement in terms of programmability and flexibility, because end-users must manually serialize and deserialize application state using low-level APIs, modify the flow of the application to consider restarts, or rewrite CR code whenever the backend library changes. In this work, we propose a set of compiler directives and clauses that allow users to specify CR operations in a simple way. Our approach supports the common CR features provided by all the CR libraries. However, it can also be extended to support advanced features that are only available in some CR libraries, such as differential checkpointing, the use of HDF5 format, and the possibility of using fault-tolerance-dedicated threads. The result of our evaluation revealed a high increase in programmability. On average, we reduced the number of lines of code by 71%, 94%, and 64% for FTI, SCR, and VeloC, respectively, and no additional overhead was perceived using our solution compared to using the backend libraries directly. Finally, portability is enhanced because our programming model allows the use of any backend library without changing any code.
Submitted 1 July, 2020; v1 submitted 30 June, 2020;
originally announced June 2020.
-
Checkpoint/restart approaches for a thread-based MPI runtime
Authors:
Julien Adam,
Maxime Kermarquer,
Jean-Baptiste Besnard,
Leonardo Bautista-Gomez,
Marc Perache,
Patrick Carribault,
Julien Jaeger,
Allen D. Malony,
Sameer Shende
Abstract:
Fault-tolerance has always been an important topic when it comes to running massively parallel programs at scale. Statistically, hardware and software failures are expected to occur more often on systems gathering millions of computing units. Moreover, the larger jobs are, the more computing hours would be wasted by a crash. In this paper, we describe the work done in our MPI runtime to enable both transparent and application-level checkpointing mechanisms. Unlike the MPI 4.0 User-Level Failure Mitigation (ULFM) interface, our work targets solely Checkpoint/Restart and ignores other features such as resiliency. We show how existing checkpointing methods can be practically applied to a thread-based MPI implementation given sufficient runtime collaboration. The two main contributions are the preservation of high-speed network performance during transparent C/R and the over-subscription of checkpoint data replication thanks to a dedicated user-level scheduler support. These techniques are measured on MPI benchmarks such as IMB, Lulesh and Heatdis, and associated overhead and trade-offs are discussed.
Submitted 12 June, 2019;
originally announced June 2019.