What Blocks My Blockchain’s Throughput?
Developing a Generalizable Approach for Identifying Bottlenecks in Permissioned Blockchains

Orestis Papageorgiou, 0000-0003-3412-5082, SnT – Interdisciplinary Centre for Security, Reliability and Trust, University of Luxembourg, Luxembourg, Luxembourg, [email protected]; Lasse Börtzler, Dept. of Economics and Management, Karlsruhe Institute of Technology, Karlsruhe, Germany, [email protected]; Egor Ermolaev, 0000-0003-3412-5082, SnT – Interdisciplinary Centre for Security, Reliability and Trust, University of Luxembourg, Luxembourg, Luxembourg, [email protected]; Jyoti Kumari, SnT – Interdisciplinary Centre for Security, Reliability and Trust, University of Luxembourg, Luxembourg, Luxembourg, [email protected]; and Johannes Sedlmeir, 0000-0003-2631-8749, SnT – Interdisciplinary Centre for Security, Reliability and Trust, University of Luxembourg, Luxembourg, Luxembourg, [email protected]

Abstract.

Permissioned blockchains have been proposed for a variety of use cases that require decentralization yet address enterprise requirements that permissionless blockchains to date cannot satisfy – particularly in terms of performance. However, popular permissioned blockchains still exhibit a relatively low maximum throughput in comparison to established centralized systems. Consequently, researchers have conducted several benchmarking studies on different permissioned blockchains to identify their limitations and – in some cases – their bottlenecks in an attempt to find avenues for improvement. Yet, these approaches are highly heterogeneous, difficult to compare, and require a high level of expertise in the implementation of the underlying specific blockchain. In this paper, we develop a more unified and graphical approach for identifying bottlenecks in permissioned blockchains based on a systematic review of related work, experiments with the Distributed Ledger Performance Scan (DLPS), and an extension of its graphical evaluation functionalities. We conduct in-depth case studies on Hyperledger Fabric and Quorum, two widely used permissioned blockchains with distinct architectural designs, demonstrating the adaptability of our framework across different blockchains. We provide researchers and practitioners working on evaluating or improving permissioned blockchains with a toolkit, guidelines on what data to document, and insights on how to proceed in the search process for bottlenecks.

PVLDB Reference Format:
PVLDB, 14(1): XXX-XXX, 2020.
doi:XX.XX/XXX.XX This work is licensed under the Creative Commons BY-NC-ND 4.0 International License. Visit https://creativecommons.org/licenses/by-nc-nd/4.0/ to view a copy of this license. For any use beyond those covered by this license, obtain permission by emailing [email protected]. Copyright is held by the owner/author(s). Publication rights licensed to the VLDB Endowment.
Proceedings of the VLDB Endowment, Vol. 14, No. 1 ISSN 2150-8097.

PVLDB Artifact Availability:
The source code, data, and/or other artifacts have been made available at https://github.com/orepapas/What_Blocks_My_Blockchains_Throughput_Data.

IoT: Internet of Things
DLPS: distributed ledger performance scan
CPU: central processing unit
tps: transactions per second
$f_{\text{req}}$: request frequency
$f_{\text{resp}}$: response frequency
VSCC: validation system chaincode
MVCC: multi version concurrency control
EDA: exploratory data analysis
KDE: kernel density estimate
2PC: two-phase commit
PoW: proof of work
PoS: proof of stake
RAFT: Reliable, Replicated, Redundant, And Fault-Tolerant
IBFT: Istanbul Byzantine Fault Tolerant
QBFT: Quorum Byzantine Fault Tolerant
EVM: Ethereum Virtual Machine
DApp: decentralized application
API: application programming interface
DLT: distributed ledger technology
TX: transaction
RPC: remote procedure call

1. Introduction

Since its inception by Nakamoto in 2008 (Nakamoto, 2008), blockchain technology has been explored across various industries far beyond its original use in decentralized payment systems. Researchers and practitioners have explored its potential in a variety of applications, including its use as a database (Peng et al., 2020; Nathan et al., 2019; Ge et al., 2022; Hong et al., 2023), in the Internet of Things (IoT) (Dai et al., 2019; Tseng et al., 2020; Lockl et al., 2020), in cross-organizational workflows (Fridgen et al., 2018), in supply chain management (Guggenberger et al., 2020), and in several other applications where a neutral platform is desirable (Sedlmeir et al., 2022a). In this context, organizations that aimed to use distributed ledger technology (DLT)-based systems also explored permissioned blockchains that restrict not only data visibility but also participation in consensus. This approach removes the limitations in terms of storage, bandwidth, and computing that permissionless systems impose to maintain a high degree of decentralization. Moreover, permissioned blockchains use different consensus mechanisms for replicated state machines (Vukolić, 2015), for instance, PBFT (Castro and Liskov, 2002), which allow for lower latency compared to common consensus algorithms used in permissionless systems, such as proof of work (PoW) and proof of stake (PoS).

However, the transition from experimental blockchain projects to practical applications in business operations faces challenges. Many projects fail to advance beyond the pilot stage due to technical complexities (Toufaily et al., 2021). This has resulted in a slower adoption rate compared to other technologies developed around the same time (Gartner, 2021). One of the primary reasons behind this is the technology’s inferior performance when compared to that of centralized systems. The resource-intensive nature of replication and consensus in blockchains hinders efficient scaling and parallelization, causing centralized architectures to outperform by far even permissioned blockchains (Sedlmeir et al., 2022b). As a result, a substantial part of computer science research on blockchain technology focuses on performance characteristics of permissioned blockchains (Fan et al., 2020).

Blockchain benchmarking research primarily focuses on high-level performance indicators, such as overall throughput and average latency. This is very useful when comparing different blockchain types (Dinh et al., 2017) or when analyzing the effect of deployment parameters like the number of nodes, network delays, block size, and transaction type (Androulaki et al., 2018; Nguyen et al., 2019; Thakkar et al., 2018; Kuzlu et al., 2019; Guggenberger et al., 2022). Hardware utilization or networking-related aspects are frequently relegated to the background, included as aggregated data like maximum values, or considered as a configuration parameter (Dinh et al., 2017; Fan et al., 2020). Additionally, blockchain benchmarking research remains heavily fragmented, blockchain-specific, and requires a high level of expertise to exercise.

This paper seeks to address these shortcomings by analyzing the behavior of hardware components within blockchain nodes, their interrelationships, and their impact on the network’s throughput in a systematic and graphical way, providing a general and illustrative method to detect bottlenecks. To achieve this, we examine Hyperledger Fabric (Fabric) and Quorum, as prior research has already provided many valuable insights into them and they seem to be the most widely studied examples (see Section 3). We survey related work, which we use to ground our method, and conduct and evaluate our own experiments, using and extending the distributed ledger performance scan (DLPS), an open-source blockchain benchmarking framework that allows for the easy deployment and benchmarking of a broad range of blockchains and already supports limited graphical analyses (Sedlmeir et al., 2021). We also draw parallels between our results and those of other researchers in an attempt to determine the accuracy of our method and findings.

The remainder of this paper is structured as follows: Section 2 covers the fundamental characteristics of Fabric, Quorum, and the DLPS. Section 3 presents and structures related work in the field and points out the research gap to which we aim to contribute. Section 4 then describes our bottleneck identification method and the corresponding analyses of Fabric and Quorum. We conclude the paper with a summary of our findings and discuss limitations and opportunities for future research in Section 5.

2. Background

2.1. Hyperledger Fabric

Fabric has risen to prominence as one of the industry’s leading permissioned blockchains (Zheng, 2019; Guggenberger et al., 2022). Compared to other permissioned blockchains, its unique architecture provides novel opportunities for improving performance and eliminates the need for domain-specific and deterministic smart contract (called chaincode in the case of Fabric) programming languages (Androulaki et al., 2018). Fabric’s architecture relies on the execute-order-validate paradigm. Under this approach, nodes first simulate (execute) transactions in any order, which are then batched and ordered in consensus. Finally, nodes validate the individual transactions and ensure that no conflicts have emerged owing to the potentially different ordering in the execution phase (Androulaki et al., 2018). This approach is useful if cross-checking a computation-heavy transaction requires weaker agreement than an honest (super-)majority (Guggenberger et al., 2022). At the ordering and/or validation stages, nodes can verify whether sufficient signatures for agreement, according to previously defined rules, are present.

In Fabric, nodes are grouped into organizations, and a node can take at least one of the roles of a peer node (peer) or an orderer (orderer) (Androulaki et al., 2018). Only peers maintain an append-only ledger, whereas orderers create and broadcast blocks. We detail the characteristics of the execute-order-validate architecture below:

(1) Execution Phase: A client sends a cryptographically signed transaction proposal to the peers. Upon receiving the transaction proposal, peers simulate the transaction, i.e., they run the required chaincode on their own copy of the ledger without updating their ledger or sending a corresponding update to other peers. After the simulation, peers respond to the client with an endorsement. The endorsement includes the peer’s digital certificate chain, which confirms its organization membership, the peer’s digital signature on the transaction proposal, the transaction’s read-write set at the time of simulation (which details the version numbers but not the values of the variables the transaction interacts with (Hyperledger Fabric Architecture Reference, 2020)), and the outcome of the simulation.

(2) Ordering Phase: Once a client has collected sufficient endorsements for a proposal to be deemed acceptable by honest peers, it packs these endorsements into a transaction and forwards it to the ordering service. The orderers use a consensus protocol, such as Kafka (Kreps et al., 2011) or Reliable, Replicated, Redundant, And Fault-Tolerant (RAFT) (Ongaro and Ousterhout, 2014), to order the transactions, grouping them into batches (blocks) and signing them without evaluating their validity. Upon reaching consensus on a block, they broadcast it to a subset of peers (one anchor peer for each organization) for validation.

(3) Validation Phase: Once a peer receives a block – either directly from an orderer or via a gossip protocol from a fellow peer belonging to the same organization – it validates the content in three steps (Foschini et al., 2020):

  (a) First, every transaction within a block is subjected to parallel verification through the validation system chaincode (VSCC), which ensures that it has accumulated the appropriate number of endorsements as dictated by the endorsement policy. The VSCC also verifies that all executions came to the same result. If a transaction fails any of these tests, the VSCC marks it as invalid.

  (b) Next, valid transactions undergo multi version concurrency control (MVCC) – a sequential check of whether the simulations were conducted on compatible versions of the ledger. Should any variable within the transaction’s read-set have been altered in the peer’s ledger since its endorsement, the transaction is marked as invalid.

  (c) Once a transaction in a block has passed the VSCC and MVCC checks, the peer updates the corresponding variables in its state database in what is called the commit step. Invalid transactions, while not influencing the ledger’s state, are nevertheless recorded in the ledger.
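
The sequential MVCC check and the commit step described above can be illustrated with a minimal sketch. It is not Fabric's implementation: the `Transaction` class, the `validate_block_mvcc` function, and the use of plain integer counters as version numbers are assumptions made purely for illustration, and the VSCC step is assumed to have already passed.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Transaction:
    """Hypothetical, simplified transaction (not Fabric's data model)."""
    tx_id: str
    read_set: Dict[str, int]   # key -> version observed during simulation
    write_set: Dict[str, str]  # key -> value to be written on commit


def validate_block_mvcc(
    block: List[Transaction],
    state: Dict[str, Tuple[int, str]],
) -> List[Tuple[str, bool]]:
    """Check transactions one after another against the current state.

    `state` maps each key to (version, value). Integer versions are an
    illustrative simplification of how versions are tracked.
    """
    results = []
    for tx in block:  # sequential: earlier commits affect later checks
        valid = all(
            state.get(key, (0, ""))[0] == version
            for key, version in tx.read_set.items()
        )
        if valid:
            # Commit step: apply the writes and bump the versions.
            for key, value in tx.write_set.items():
                current_version = state.get(key, (0, ""))[0]
                state[key] = (current_version + 1, value)
        # Invalid transactions are still recorded, but leave the state as-is.
        results.append((tx.tx_id, valid))
    return results
```

In this sketch, two transactions in the same block that both read the same key at the same version conflict: the first commits and bumps the version, so the second is marked invalid.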

2.2. Quorum

Quorum is another blockchain that has emerged as a significant player among the industry’s permissioned blockchains (Alastria, 2024; VAKT, 2024). Quorum is based on Geth, an Ethereum client, and, as a result, leverages many of its characteristics. In contrast to Fabric, it utilizes a single, deterministic smart contract language, Solidity. This choice streamlines development, as Solidity is the primary language used in the broader, permissionless blockchain space (DeFiLlama, 2024), making it simpler to transfer applications from the public permissionless Ethereum blockchain to Quorum. Quorum relies on the more common order-execute architecture under which transactions are first ordered and batched into blocks using a consensus mechanism and only subsequently executed.

Quorum supports four different consensus mechanisms: RAFT, Quorum Byzantine Fault Tolerant (QBFT), Istanbul Byzantine Fault Tolerant (IBFT), and Clique (Consensys, 2024). The details around the architecture vary slightly depending on the consensus mechanism used. Under RAFT (the consensus mechanism used later in our analysis), each node can take only one of two roles: leader (minter) or follower (verifier). There is only one leader in each network. Below, we detail the characteristics of the order-execute architecture based on the RAFT consensus mechanism:

(1) Ordering Phase: Clients send cryptographically signed transactions to the nodes. Subsequently, the nodes perform preliminary validations on the transactions, such as verification of the transactions’ attributes (e.g., nonce, syntax, signatures, etc.). Once verified, the nodes disseminate the pre-validated transactions to other nodes via a gossip protocol, allowing them to be added to each node’s transaction pool (mempool). Once the transaction reaches the leader, it orders valid transactions according to the network’s established priority rules. Finally, the ordered transactions are compiled into a block, which is disseminated across the network to follower nodes.

(2) Execution Phase: Upon receiving the block, followers attach the block to their version of the blockchain and send a message to the leader to confirm acceptance of the block. After receiving acceptance messages from the majority of nodes, the block is considered valid and becomes the new head of the blockchain.

2.3. Distributed Ledger Performance Scan

The DLPS is designed as an end-to-end pipeline in which users can define the benchmarking specifications both for the configuration of the blockchain and client network and for the experiment settings (Sedlmeir et al., 2021). All the necessary parameters can be set using a single config file, and we further simplified the deployment of the DLPS across different computing environments by dockerizing its launch on local machines. Additionally, we extended its graphical capabilities to allow for a more in-depth analysis. The DLPS framework is composed of three Python packages, BlockchainFormation, DAppFormation, and ChainLab, which automatically set up the blockchain networks’ nodes, smart contracts, and clients, iteratively execute measurements, and collect and analyze the data based on which they steer the benchmarking process. The benchmarking follows a recursive localization of maximum throughput with growing resolution (for more details, see Sedlmeir et al., 2021).

This process is implemented by gradually increasing the request rate directed at the blockchain to push the network to its throughput limit (Sedlmeir et al., 2021). The first localization run starts by sending asynchronous requests at a specified base rate. The DLPS measures the time at which clients send their transaction requests and receive successful notifications of their execution and determines average request frequency ($f_{\text{req}}$) and response frequency ($f_{\text{resp}}$) across all clients. If the network manages to keep up with $f_{\text{req}}$ for a certain duration, this test concludes, and the next one starts with an increased request rate. If $f_{\text{resp}}$ falls behind $f_{\text{req}}$ by more than a predetermined threshold (e.g., 5 %), a specified number of retries is performed. Should all retries fail, the ramping-up sequence is terminated, and the next ramping-up sequence for localization commences, starting from a base rate slightly lower than the maximum throughput achieved previously (e.g., 80 %). Further rampings can follow with smaller increments. This increases the granularity of measurements and data collection around the rate at which the system failed initially, allowing for the determination of a network’s behavior close to maximum sustainable throughput. Once the experiment is complete, the DLPS generates multiple figures summarizing the gathered information. The DLPS also stores the benchmarking data in CSV files, allowing users to perform their own analyses. We make heavy use of these CSV files for the subsequent analysis.
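
The ramping logic described above can be sketched as follows. This is a simplified illustration rather than the DLPS code: the function names, the `measure` callback (which is assumed to run one test at a target rate and return the observed request and response frequencies), and the default parameters, including halving the increment between rounds, are assumptions chosen to mirror the description.

```python
from typing import Callable, List, Tuple

# Hypothetical callback: runs one test at the given target rate and returns
# the observed (f_req, f_resp) averaged across all clients.
Measure = Callable[[float], Tuple[float, float]]


def ramp_up(measure: Measure, base_rate: float, step: float,
            threshold: float = 0.05, retries: int = 2,
            max_steps: int = 100) -> Tuple[float, List[Tuple[float, float]]]:
    """Increase the request rate until the network can no longer keep up.

    Returns the highest sustained f_resp and the list of all measurements.
    """
    best = 0.0
    history: List[Tuple[float, float]] = []
    rate = base_rate
    for _ in range(max_steps):
        sustained = False
        for _attempt in range(1 + retries):  # first try plus retries
            f_req, f_resp = measure(rate)
            history.append((f_req, f_resp))
            if f_resp >= (1 - threshold) * f_req:
                sustained = True
                best = max(best, f_resp)
                break
        if not sustained:
            break  # all retries failed: end this ramping-up sequence
        rate += step
    return best, history


def localize_max_throughput(measure: Measure, base_rate: float, step: float,
                            rounds: int = 3, restart_fraction: float = 0.8,
                            refine: float = 0.5) -> float:
    """Repeat the ramp-up with a lower start and a smaller increment."""
    best_overall = 0.0
    for _ in range(rounds):
        best, _history = ramp_up(measure, base_rate, step)
        if best == 0.0:
            break  # not even the base rate could be sustained
        best_overall = max(best_overall, best)
        base_rate = restart_fraction * best  # e.g., 80 % of the previous maximum
        step *= refine                       # finer resolution near the limit
    return best_overall
```

With a toy system that simply caps its response rate at a fixed capacity, each successive round samples the region around that capacity more densely, which is the effect the recursive localization aims for.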

3. Related work

Evaluating the performance and assessing the capabilities of permissioned blockchains is vital (Guggenberger et al., 2022). Yet, this subject has received little attention, and the available material tends to be highly fragmented and heterogeneous, making comparison and reproducibility challenging (Geyer et al., 2023). For instance, in the case of Fabric, publications investigate different versions, different consensus mechanisms, and heterogeneous network configurations and hardware. The use of different benchmarking tools aggravates these inconsistencies (Javaid et al., 2019), making it very difficult for practitioners to utilize the results for their blockchain projects. The majority of research also focuses on evaluating overall blockchain performance characteristics and puts less emphasis on identifying possible bottlenecks (Guggenberger et al., 2022; Baliga et al., 2018a).

Figure 1. Overview of our systematic literature review.

To gain a full overview of the academic benchmarking literature, we started by performing a systematic literature review to identify publications that evaluate blockchain performance in general and the subset that provides insights into bottlenecks through in-depth analyses. We used the broad search string (blockchain OR “distributed ledger technology”) AND (performance OR throughput OR latency) AND (benchmarking OR measurement OR evaluation OR analysis) on Google Scholar, ACM Digital Library, IEEE Xplore, and arXiv. Our search yielded 4,248 results. Google Scholar alone provided approximately 57,000 results, but due to the limitations of the platform (Gusenbauer, 2019), we could only access 980 of those. After filtering the results through a manual review of titles and abstracts and removing duplicates, 57 publications remained. For these, we performed a full-text screening and excluded the articles that do not provide empirical insights on blockchain performance, for example, because they focus on performance modeling, and articles in which performance evaluation is only a supplement and where the results contradict those in publications with a heavy benchmarking focus. Finally, we ended up with 46 relevant publications relating to five different blockchains.
We found 36 publications that include measurements on Fabric (Xu et al., 2021; Wang and Chu, 2020; Ben Toumia et al., 2022; Thakkar et al., 2018; Thakkar and Natarajan, 2021; Sharma et al., 2018; Shalaby et al., 2020; Sedlmeir et al., 2021; Samy et al., 2021; Pongnumkul et al., 2017; Nguyen et al., 2019; Nasirifard et al., 2019; Nasir et al., 2018; Nakaike et al., 2020; Monrat et al., 2020; Liu et al., 2021; Kuzlu et al., 2019; Klenik and Kocsis, 2021; Javaid et al., 2019; Hao et al., 2018; Guggenberger et al., 2022; Gorenflo et al., 2020; Geyer et al., 2019; Foschini et al., 2020; Dreyer et al., 2020; Ruan et al., 2020; Dinh et al., 2018; Dabbagh et al., 2020; Chacko et al., 2021; Bergman et al., 2020; Baliga et al., 2018a; Androulaki et al., 2019, 2018; Geneiatakis et al., 2020; Wang et al., 2019; Dinh et al., 2017; Sukhwani et al., 2018), 9 on private Ethereum (Geth and Parity) (Dinh et al., 2017; Leal et al., 2020; Monrat et al., 2020; Pongnumkul et al., 2017; Rouhani and Deters, 2017; Sedlmeir et al., 2021; Toyoda et al., 2020; Schäffer et al., 2019; Benahmed et al., 2019), 5 on Quorum (Baliga et al., 2018b; Mazzoni et al., 2021; Monrat et al., 2020; Sedlmeir et al., 2021; Shapiro et al., 2020), 4 on Hyperledger Sawtooth (Benahmed et al., 2019; Sedlmeir et al., 2021; Moschou et al., 2020; Shi et al., 2019) and one on Hyperledger Indy (Sedlmeir et al., 2021). Figure 1 features an overview of our literature research.

In the case of Fabric, we do not take into consideration research performed on Fabric v0.6 since it used the order-execute architecture. For the subsequent versions, several papers provide key insights on potential bottlenecks. For Fabric v1.0, in one of the first detailed benchmarking attempts, Androulaki et al. (Androulaki et al., 2018) identify the validation phase and, in particular, the VSCC as a major bottleneck. Thakkar et al. (Thakkar et al., 2018), using different Fabric configurations, find three major bottlenecks of v1.0, all of which relate to the validation step: the process of verifying certificates in the VSCC to ensure that the endorsement policy is fulfilled, the sequential validation of transactions (MVCC), and the multiple calls to the StateDB (when using CouchDB) during the validation and commit steps. Ruan et al. (Ruan et al., 2020), using v1.3, again point toward the validation phase as the bottleneck, without specifying which component, especially when many unserializable transactions are included in the ledger, leading to higher latency and slowing down validation.

Wang and Chu (Wang and Chu, 2020) find that for v1.4, the VSCC execution during the validation phase remains the bottleneck as parallelization is limited and, consequently, throughput does not scale well with the number of cores. Chacko et al. (Chacko et al., 2021) find that transaction failures point towards two systemic issues independent of configuration, with the validation stage acting as a likely bottleneck for v1.4. Specifically, they find that MVCC read conflicts, which occur when changes in the world state impact a transaction after it has already gained the required endorsements, result in transaction failure, necessitating a return to the execution phase for a new round of endorsements. The second issue lies with endorsement policy failures caused by inconsistencies between ledger copies across different peers, significantly slowing down the VSCC and, as a result, transaction processing. For the same Fabric version, Gorenflo et al. (Gorenflo et al., 2020) and Thakkar and Natarajan (Thakkar and Natarajan, 2021) provide evidence that significant changes to the validation process could considerably improve throughput, further supporting the notion that the bottleneck lies within the validation phase. In summary, across all examined versions of Fabric, the consensus among researchers is that the main bottleneck lies within the validation stage. It is also worth noting that we could not find research focusing on identifying bottlenecks in Fabric v2.0 or higher.

When it comes to Quorum, Mazzoni et al. (Mazzoni et al., 2021) posit that a potential bottleneck lies with the node’s remote procedure call (RPC) server buffers being capped at 128 KB. While this limitation suggests a maximum transaction size of 128 KB, it is improbable that this limit constitutes the main bottleneck under normal circumstances, as the average transaction size usually amounts to only a few hundred bytes. For Geth, Toyoda et al. (Toyoda et al., 2020) find that the primary bottleneck stems from the poor utilization of the nodes’ multi-threading and calls of crypto.Ecrecover – a function for retrieving public keys from signatures that is called every time a transaction reaches a node. We found no study on private Ethereum with Aura consensus (Parity), Hyperledger Sawtooth, or Hyperledger Indy that aimed to determine their performance bottlenecks.

4. Method and Results

We developed the overall approach by analyzing the experiments described in related work, using the DLPS to collect data and exploratory data analysis (EDA) to identify bottlenecks. We collected a large volume of data on factors influencing node performance, ensuring a comprehensive overview of their resource utilization. After cleaning and validating the data, we used EDA in search of irregularities or drops in the blockchain’s performance. Upon identifying these issues, we conducted a detailed examination of the relevant metrics at a finer granularity to determine the root causes of these behaviors. We divided our analysis into two major components. The first part identifies potential bottleneck candidates by examining the different node resources that can impact their performance. In the second part, we examine the relationship between these candidates and the blockchain’s throughput in varying degrees of resolution, for instance, narrowing down the time window or selection of components. So far, most enterprise applications seem to have focused on Fabric and Quorum, and related work indicates that the bottleneck analysis of both blockchains is intricate, so we decided to detail our bottleneck identification method for both blockchains.

For our analysis, we selected a particular experiment for Fabric v2.0, drawing upon the findings of Guggenberger et al. (Guggenberger et al., 2022) to determine a configuration that seems robust under modifications, and extended it to Quorum v23.4. For Fabric, the network configuration comprised 16 clients, 8 peers, and 4 orderers, with four organizations comprising two peers each. In contrast, the Quorum configuration consisted of 16 clients and 8 nodes. For both blockchains, we opted for the RAFT consensus mechanism due to its minimal overhead, as previous publications suggest that consensus is not the bottleneck in this case (Guggenberger et al., 2022; Mazzoni et al., 2021). We conducted the experiment on the AWS cloud platform (Amazon EC2), where each node was configured in an independent EC2 instance allocated with 16 vCPUs, 64 GB of RAM, and 1 Gbps of bandwidth running Ubuntu Server 18.04 LTS (HVM).

4.1. Fabric: Resource utilization

The initial analysis focuses on understanding the impact of incremental increases in the request rate on the resource utilization of each node. This is achieved by plotting a data point for each node every second during a 14-second time frame within a 20-second experiment (excluding the initial and final 3 seconds, where the ramp-up and ramp-down of $f_{\text{req}}$ may cause temporary effects), corresponding to different request frequencies (Figure 2). This method aims to uncover potential trends and correlations between resource usage and increased $f_{\text{req}}$. This initial step already indicates that peers exhibit the highest strain in central processing unit (CPU) usage, as demonstrated by the maximum utilization percentage across all cores on each node, while ordering and client nodes display significantly lower CPU utilization rates.
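
The trimming of the measurement window can be sketched as below. The function names and the sample layout (one `(second, value)` pair per second of the experiment) are assumptions for illustration and do not reflect the actual layout of the DLPS CSV files.

```python
from typing import List, Tuple

Sample = Tuple[int, float]  # (second since experiment start, measured value)


def steady_state_samples(samples: List[Sample], duration: int = 20,
                         trim: int = 3) -> List[Sample]:
    """Drop the first and last `trim` seconds of a `duration`-second run."""
    return [(t, v) for t, v in samples if trim <= t < duration - trim]


def steady_state_mean(samples: List[Sample], duration: int = 20,
                      trim: int = 3) -> float:
    """Average a metric over the retained window only."""
    kept = steady_state_samples(samples, duration, trim)
    return sum(v for _, v in kept) / len(kept)
```

For a 20-second run sampled once per second, trimming 3 seconds at each end leaves the 14 data points per node that enter the plots.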

In terms of memory usage, both clients and ordering nodes appear largely unaffected by increases in the request rate. However, peers show signs of impact, albeit without a clear pattern. Regarding network and hardware utilization, client nodes maintain stable performance, whereas peer and orderer nodes demonstrate a moderate correlation between increased utilization and $f_{\text{req}}$, but as before, the exact trend is not clear. Note that at this early stage of the analysis, any anomalies could also suggest inaccuracies in the experimental setup. Therefore, it is crucial to eliminate this possibility before conducting a more detailed analysis to determine the structural causes behind these observations. Upon ruling out experimental errors, a separate analysis is conducted for each component.

4.1.1. CPU

The analysis begins by focusing on the average CPU utilization across a 14-second period, with a subsequent focus on more detailed time intervals. We observe that as the request rate increases, the linear correlation between the request rate and CPU utilization of peers breaks at $f_{\text{req}} = 1600\,\text{s}^{-1}$, indicating a potential saturation point or limitation in CPU capacity. This poses a potential bottleneck that requires further examination.
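
One simple way to locate such a break numerically, complementing the graphical inspection, is to fit a line to the low-rate measurements and report the last request rate that still follows it. The sketch below is an illustrative heuristic, not the procedure used in this paper; the function name, the number of points used for the fit, and the 10 % tolerance are assumptions.

```python
from typing import List, Optional


def last_linear_rate(rates: List[float], utilization: List[float],
                     n_fit: int = 3, tolerance: float = 0.1) -> Optional[float]:
    """Return the last rate at which utilization still follows the linear
    trend fitted (by least squares) to the first `n_fit` points.

    Returns None if utilization never falls below the extrapolated line by
    more than `tolerance` (relative).
    """
    xs, ys = rates[:n_fit], utilization[:n_fit]
    mean_x = sum(xs) / n_fit
    mean_y = sum(ys) / n_fit
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x

    last_ok = rates[n_fit - 1]
    for x, y in zip(rates[n_fit:], utilization[n_fit:]):
        predicted = slope * x + intercept
        if y < (1 - tolerance) * predicted:
            return last_ok  # the trend breaks after this rate
        last_ok = x
    return None
```

Applied to mean CPU utilization per request rate, a returned value marks the candidate saturation point that the finer-grained analysis should then examine.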

Looking into the CPU usage across all individual cores (Figure 3) reveals limited utilization across orderers and clients that is not plateauing at higher request rates. Focusing on peers, we observe a significant variance in CPU utilization among peers, ranging from 25% to 80% at higher request rates. Such variability could be an indicator that the network is distributing its resources unevenly, with some peers carrying more workload than others, or reflect an uneven allocation between nodes’ cores.

First, we investigate the CPU utilization per peer in Figure 4a, which suggests an equitable distribution of computational resources among the peers, with all of them showing comparable levels of computational effort, with peer 5 falling behind slightly. Subsequently, the investigation shifts to analyzing the mean CPU utilization of individual cores for a single peer (peer 0 in this case) in Figure 4b. It is apparent that this also is not the cause of the high fluctuations in CPU usage since all cores indicate similar utilization.

Following the inconclusive results of both possible explanations, the analysis progresses to evaluate the temporal evolution of CPU usage across individual cores at the highest $f_{\text{req}}$ before the utilization plateaus (Figure 5). This $f_{\text{req}}$ represents the peak stress on the network prior to any performance degradation. Here, we observe that individual cores can fluctuate as much as 30 % within a single run. At this point, this analysis comes to a halt as it is impossible to deduce the reasons behind these fluctuations from CPU utilization data only. However, the mean utilization illustrates that these fluctuations balance themselves out over time, with all cores clocking in at a similar average. This suggests that the high fluctuations in CPU usage are not the reason behind the plateauing.
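
The contrast between large short-term fluctuations and similar long-run averages can be checked with a small helper such as the one below. The function name and the input layout (a mapping from core label to a list of per-second utilization percentages) are assumptions for illustration.

```python
from typing import Dict, List


def core_statistics(series: Dict[str, List[float]]) -> Dict[str, Dict[str, float]]:
    """Compute, per core, the mean utilization and the spread (max - min)."""
    return {
        core: {
            "mean": sum(values) / len(values),
            "range": max(values) - min(values),
        }
        for core, values in series.items()
    }
```

If the ranges are large while the means are close to each other, the fluctuations average out over the run, which matches the observation made for Figure 5.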

It is also worth noting that Figure 4 shows average CPU usage plateauing at around 50 % across all cores on all peers. This marks a significant improvement over prior versions of Fabric. However, it is important to highlight that in scenarios with a higher number of vCPUs (16 in our case), Fabric v2.0 still has a mediocre mean utilization for a resource that can limit the entire system, which leaves room for further improvement.

4.1.2. Network

The network-related part of the analysis begins by looking into the mean network utilization, distinguishing between inbound and outbound traffic for the different types of nodes. Unsurprisingly, there is a strong correlation between the request rate and mean traffic. In the examination of orderers, we observe that both incoming and outgoing traffic do not plateau at high request rates, suggesting that they are not the bottleneck (Figure 6a). Notably, orderer 0 broadcasts a disproportionate amount of traffic compared to the rest, which indicates that orderer 0 is the leader in RAFT consensus and, as a result, has the additional task of broadcasting new blocks to each following orderer. In our experiments, we did not simulate crashing nodes, so orderer 0 remained the leader for the whole duration of the experiments (cf. (Guggenberger et al., 2022)), making this discrepancy logical.

A more detailed analysis of the network traffic allows its decomposition into individual components, such as the traffic generated by block propagation. According to Fabric’s architecture, the outbound traffic of followers among orderers predominantly consists of sending blocks to the peers, and Figure 6a shows that the inbound and outbound traffic of the followers overlap. This overlap, combined with the fact that followers receive each block once from the leader, suggests that each follower forwards a received block only once. To evaluate the validity of this hypothesis, we compute the ratio of the leader’s outbound traffic to the followers’ outbound traffic (Figure 6b). The ratio consistently stands around four, aligning with our hypothesis and the Fabric architecture, where the RAFT leader distributes the block to each follower (three times in our case) and once more to a peer, totaling four times. Our analysis does not take into consideration the traffic generated by consensus-related messages, such as AppendEntries or heartbeat messages (Hyperledger, 2023), since differentiating between them and the traffic from block propagation is challenging. Nonetheless, the consensus-related messages generate significantly less traffic than block dissemination, and as a result, we expect that the outbound traffic of follower orderers provides a viable approximation of the traffic associated with block propagation.
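The expected ratio of four follows from simple counting. The sketch below assumes, as described above, that the leader sends each block once to every follower orderer and once to a single peer, while each follower forwards the block once to its peer. Consensus-related messages are ignored, and the function name and parameters are illustrative.

```python
def expected_outbound_ratio(num_orderers: int, peers_per_orderer: int = 1) -> float:
    """Expected leader-to-follower outbound traffic ratio from block propagation alone."""
    if num_orderers < 2:
        raise ValueError("need at least one follower orderer")
    followers = num_orderers - 1
    # The leader sends one copy per follower plus one copy per peer it serves.
    leader_copies = followers + peers_per_orderer
    # A follower only forwards the block to the peer(s) it serves.
    follower_copies = peers_per_orderer
    return leader_copies / follower_copies


# Four orderers: three followers plus one peer served by the leader.
print(expected_outbound_ratio(4))  # 4.0
```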

Regarding peers (Figure 7), we see that inbound traffic scales linearly with $f_{\text{req}}$ for all of them throughout the experiment, indicating that it is not a bottleneck. This uniformity aligns with the architecture in our experiment, as all peers receive the same number of transaction proposals (i.e., endorsement requests) from clients and the same number of blocks. Concerning outbound traffic, we observe that it plateaus for some peers while remaining unaffected for others (Figure 7a). This divergence allows for the classification of peers into two main clusters, color-coded as blue and orange, based on their outbound traffic. The two groups distribute communication workload differently, with blue peers sending more than twice as much data as the orange peers. The blockchain’s architecture explains this divergence since blue peers are the gossip leaders of their respective organizations, with one gossip follower each. For instance, organization 0 consists of orderer 0, peer 0, and peer 1, with peer 0 acting as the anchor peer. As such, peer 0 receives all new blocks from orderer 0 and forwards them to peer 1.

The primary distinction in outbound traffic among peers stems from the block propagation between them. To confirm that blue peers correspond to gossip leaders, and to check our previous assumptions regarding the traffic generated by block propagation, we subtract the block propagation traffic obtained from the preceding analysis of the ordering service from the total outbound traffic of the gossip leader peers (Figure 7b). The resulting traffic is relatively uniform across all peers, confirming that the blue peers are the gossip leaders. Furthermore, we observe a significant overlap in outbound traffic among peers, particularly at lower request rates, while some discrepancies exist at higher rates. This is expected, as traffic stemming from consensus messages increases at higher request rates, making our approximation less accurate. The remaining outbound traffic of peers consists of the requested endorsements and the confirmations, sent to clients, that the requested transactions have been committed to the blockchain. Since the outbound traffic of peers plateaus at high $f_{\text{req}}$, these components are potential bottlenecks.
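The subtraction step can be sketched as follows. This is an illustration under stated assumptions rather than the DLPS implementation: the block propagation traffic is approximated by the outbound traffic of a follower orderer at the same request rate, and all numbers are invented placeholder values in arbitrary traffic units.

```python
def residual_outbound(leader_peer_out, block_traffic):
    """Subtract the estimated block-propagation traffic from a gossip leader's
    outbound traffic, separately for each request rate present in both inputs."""
    return {
        rate: leader_peer_out[rate] - block_traffic[rate]
        for rate in leader_peer_out
        if rate in block_traffic
    }


# Invented values: outbound traffic per request rate.
gossip_leader_out = {400: 30, 800: 60, 1200: 90}
follower_orderer_out = {400: 16, 800: 32, 1200: 48}  # proxy for block propagation

print(residual_outbound(gossip_leader_out, follower_orderer_out))
# {400: 14, 800: 28, 1200: 42}
```

If the residual of the gossip leaders matches the outbound traffic of the gossip followers, the classification of peers and the block-traffic approximation are mutually consistent.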

To pinpoint the specific bottleneck, we focus on client traffic (Figure 8). The inbound traffic of clients, although it exhibits a drop at higher request rates, is generated by the same components as the outbound traffic of peers, leaving us with the same potential bottlenecks as before. The outbound traffic of clients is comprised of transaction proposals submitted to peers and endorsed transactions sent to the orderers. Since the traffic does not plateau, it suggests that none of them is the bottleneck. Overall, we have seen that the traffic generated by the endorsed transactions the clients send to the ordering service and the traffic generated by the blocks the service sends to the peers never plateaus. This indicates that the execution and ordering phases are not the bottleneck. Since the endorsements that peers send back to the clients come in between the execution and ordering phases, this suggests that they are not the bottleneck, leaving us only with the transaction confirmations that peers send to clients as the possible culprit. These are sent after the validation phase, suggesting that this remains the primary bottleneck even in Fabric v2.0.

4.1.3. Memory & Hard Drive

From Figure 2, we see that the utilization of memory and hard drive is limited, indicating that neither of them is the bottleneck. Despite this, we examine the resource utilization of peers and orderers further. We do not examine clients since their utilization remains unaffected by the increasing number of requests. Starting with memory usage (Figure 9), we see that peers and orderers exhibit low and non-plateauing usage levels, suggesting that memory constraints are not the bottleneck. We observe some minor differences in memory utilization across orderers, but given the low overall utilization, it is highly unlikely that they constitute a bottleneck, and we do not investigate them further.

Moving to hard drive utilization (Figure 10), we observe that the orderers’ usage increases linearly with $f_{\text{req}}$ and, as a result, likely does not pose a problem. Most of the peers’ I/O operations (such as ledger updates or block processing) occur during or after the validation phase. As a result, even though their utilization plateaus, it does not provide us with any new information. Additionally, due to the variety of read/write operations that peers execute, it is impossible to differentiate between them with only resource utilization data at hand. Moreover, with the peak utilization at approximately 1 %, it is evident that hard drive usage is far from reaching capacity, indicating that the constraint lies within a different component, which in turn limits hard drive utilization.

As mentioned in Section 2, the validation phase is comprised of three steps: VSCC, MVCC, and each peer updating its database. As the peers’ hard drive utilization is minimal, this leaves only VSCC and MVCC as the potential bottlenecks in the validation phase.

4.2. Fabric: Throughput

In the second part of our analysis, we attempt to find correlations between Fabric’s throughput and the components highlighted as potential bottlenecks in the first part, namely peer CPU utilization and peer network traffic. We start by gaining an overview of how the request rate affects throughput. This is achieved by plotting the throughput as a rolling mean across different window sizes (Figure 11). A one-second window size results in each data point being plotted individually, revealing how unstable the throughput of this Fabric network becomes beyond $f_{\text{req}} = 1200\,\text{s}^{-1}$, with fluctuations in $f_{\text{resp}}$ reaching up to $1000\,\text{s}^{-1}$ within a single run. To see the overall network performance trend, we increase the window size and consider three cases: 3 seconds, 8 seconds, and the full run. In these cases, the mean is calculated over more data points, and outliers are smoothed out. Here, we see that, on average, the system keeps up with the request rate until reaching approximately $f_{\text{req}} = 1600\,\text{s}^{-1}$. Beyond this point, the relationship breaks down, marking this as the peak throughput observed in the experiment.
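The effect of the window size can be illustrated with a few lines of code. The sketch below assumes one throughput sample per second and a trailing window. The text does not specify whether the window is trailing or centered, so this choice and the sample values are assumptions.

```python
from collections import deque


def rolling_mean(samples, window):
    """Trailing rolling mean over the last `window` samples.

    With one sample per second, `window` corresponds to the window size in
    seconds. The first values are averaged over fewer samples.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    buf = deque(maxlen=window)
    total = 0.0
    result = []
    for value in samples:
        if len(buf) == window:
            total -= buf[0]  # this value is about to be evicted from the window
        buf.append(value)
        total += value
        result.append(total / len(buf))
    return result


# Invented per-second throughput samples with strong fluctuations.
throughput = [1000, 2000, 1000, 2000, 1000, 2000]
print(rolling_mean(throughput, 1))  # every data point individually
print(rolling_mean(throughput, 2))  # fluctuations smoothed out
```

A window of one second reproduces the raw samples, whereas larger windows average out the fluctuations and expose the overall trend.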

To obtain insights into the link between peer CPU utilization and network traffic, we examine Figure 12, which plots the two resources against the network’s throughput employing a rolling average with a 3-second window. We selected this interval as it matches the average duration required for a transaction to be committed to the blockchain under high request rates. This is significant because queuing effects become prominent at elevated $f_{\text{req}}$, and opting for a shorter time window could underestimate throughput.

For CPU usage, we observe an initial linear increase with throughput until the characteristic plateau is reached. Both variables seem to increase at a similar rate at the beginning, implying that increased throughput proportionally strains the CPU. Additionally, we observe that the instability in CPU utilization starts at around $f_{\text{resp}} = 1200\,\text{s}^{-1}$, which is also the point at which the correlation between $f_{\text{req}}$ and $f_{\text{resp}}$ starts breaking down (Figure 11a). This indicates that CPU usage is more closely correlated to throughput than to the request rate. Similarly, although network traffic increases with throughput, it does so at a lower rate and begins to exhibit instability already at around $f_{\text{resp}} = 600\,\text{s}^{-1}$, where the throughput still manages to keep up with the request rate.

Digging deeper, we focus on the throughput of individual peers (Figure 13). Here, we see that every peer contributes similarly to the overall throughput, with all of them moving in unison with minor variations at high $f_{\text{req}}$. Given our findings in Section 4.1.2, where we noted significant differences in network traffic between gossip leaders and followers, this outcome suggests that network traffic is not a significant factor in determining throughput compared to CPU utilization. If it were, we would expect noticeable differences in throughput between gossip leaders and followers. Consequently, this leaves us with peer CPU utilization as the primary factor behind throughput leveling off.

The crucial role of peer CPU utilization further supports the findings in Section 4.1 that VSCC and MVCC are the main bottlenecks in Fabric as both depend on peer CPU. Given that validations in VSCC are parallelized and we have noted that the mean core utilization is limited and mean CPU utilization is similar across all peers, it appears that the main bottleneck is VSCC. However, the sequential execution of concurrency controls in MVCC, combined with the fluctuating utilization per core featured in Figure 5, makes it impossible to determine its role as a bottleneck. As a result, while VSCC appears to be the main culprit, we cannot conclusively confirm this with the available data. This also highlights a shortcoming of the DLPS’ resource utilization-based approach, which was chosen to make it blockchain agnostic. However, as a trade-off for this generalized approach, the resulting data are less specific than they could be. Therefore, further research would require taking into account and evaluating more precise monitoring data, e.g., using SoundCloud’s Prometheus that is incorporated in Fabric (Hyperledger Fabric Operations Guides, 2020).

4.3. Quorum: Resource utilization

The Quorum analysis, as before, starts by gaining an overview of the four resources in relation to $f_{\text{req}}$ (Figure 14). To analyze Quorum’s performance in more detail, we refine our analysis by increasing the request rate in increments of 100 requests per second. As with the case of Fabric, even at this early stage, it is apparent that CPU and network utilization are closely correlated with the request rate. Memory and hard drive usage are limited, with I/O operations exhibiting high fluctuations but generally registering less than 1 % usage. Across all resources, client utilization appears to be minimal, and it is either unaffected or grows linearly with the request rate. As a result, in the following sections, we concentrate on the resource utilization of nodes.

4.3.1. CPU

We start the CPU analysis by looking into the utilization across all cores, where again we note significant fluctuations among the nodes as the request rates increase (Figure 15). Examining the mean node CPU utilization (Figure 16a), we see that nodes can be grouped into three distinct categories. Node 0 (green) consistently exhibits the highest utilization across all request rates, indicating it is the RAFT leader. This is attributed to the leader’s additional operations, such as transaction ordering and compiling transactions into a block. Nodes 1, 2, and 3 (blue) display slightly lower utilization levels, indicating their role in receiving and pre-validating transactions alongside node 0. In contrast, the remaining nodes (orange) exhibit significantly lower CPU usage, with their primary role being the appending of blocks to their local version of the blockchain.

Examining the utilization per core of node 0 (Figure 16b), we observe that one core (core 15) bears a higher workload across all request rates. This is because, with the exception of pre-validation, all other tasks of the leader are executed sequentially, leading to a disproportionate strain on one core. Following this, we examine the temporal evolution of CPU usage across individual cores at $f_{\text{req}} = 2300\,\text{s}^{-1}$, which is the highest request rate before CPU utilization plateaus (Figure 17). Here, we see fluctuations of as much as 20 % for both the leader and the follower, with disparities of up to 10 % between individual cores, excluding core 15 for node 0. Notably, the main difference between the mean utilization of node 0 and nodes 1, 2, and 3 comes only from core 15. These results suggest that parallel processing in Quorum is even more limited than that of Fabric, with each core’s average usage not exceeding 40 % and specific tasks of the leader burdening only one core.

With respect to bottleneck detection, nodes 0, 1, 2, and 3 reach an apparent plateau at $f_{\text{req}} = 2400\,\text{s}^{-1}$, while for the remaining nodes, some of them plateau while others do not. Additionally, we observe that there is a considerable decrease in CPU usage at $f_{\text{req}} = 1800\,\text{s}^{-1}$, which, even though it does not necessarily indicate a bottleneck, could provide clues for identifying factors contributing to the decline in CPU utilization. From Figure 16a, we observe that node 0 and nodes 1, 2, and 3 show similar behavior after reaching a plateau. This similarity suggests that transaction ordering and block building, which are the main unique operations performed by the leader, are likely not the sources of the bottleneck. If they were, we would expect the leader’s utilization pattern to diverge from that of the other nodes after the plateau. Examining Figure 16b, we see a rapid decline in the utilization of core 15 at high request rates. Considering that the remaining sequential operations, such as ordering the transactions, committing the block to the chain, and propagating it through the nodes, typically require minimal CPU resources, this sharp drop cannot be justified by these processes, and it appears that another component limits CPU utilization.

4.3.2. Network

Continuing the analysis by looking into the mean network utilization of the nodes (Figure 18), we can categorize the nodes into three groups as before. Due to the complexity of the Quorum network traffic, we are not able to decompose the traffic into individual components with adequate accuracy. Therefore, we rely entirely on the architectural design to identify the reasons behind the plateau. As with the case of Fabric, the network analysis does not take into consideration the traffic generated by consensus messages since they generate minimal traffic.

Starting with the blue nodes, we note that their outbound traffic levels off at $f_{\text{req}} = 2400\,\text{s}^{-1}$, while signs of plateauing in their inbound traffic appear at $f_{\text{req}} = 1800\,\text{s}^{-1}$. The inbound traffic primarily consists of transactions received either directly from clients or through gossip and blocks from the leader, whereas their outbound traffic originates from the dissemination of pre-validated transactions. This suggests that these operations are the limiting factors at their respective request rates. While we see similar patterns for the orange nodes, the patterns of the leader are essentially the opposite of those of the other nodes. The leader’s inbound traffic comprises all the transactions that are broadcast to the network and reach the leader through gossip or directly from the clients, and it plateaus at $f_{\text{req}} = 2400\,\text{s}^{-1}$. Its outbound traffic mainly involves block dissemination to the other nodes and plateaus at $f_{\text{req}} = 1800\,\text{s}^{-1}$. Since the leader’s inbound traffic plateaus only at $f_{\text{req}} = 2400\,\text{s}^{-1}$, the leader keeps receiving the transactions from the blue nodes normally up until that point, leaving us with block dissemination as the main bottleneck at $f_{\text{req}} = 1800\,\text{s}^{-1}$ and transaction propagation as the main issue at $f_{\text{req}} = 2400\,\text{s}^{-1}$.

Combining these findings with the CPU utilization results, the rapid drop in CPU utilization for core 15 at $f_{\text{req}} = 2400\,\text{s}^{-1}$ (Figure 16b) occurs because the leader does not receive enough transactions from the other nodes, not because the CPU cannot keep up with the processes. At $f_{\text{req}} = 1800\,\text{s}^{-1}$, the decline could be attributed to either block propagation or one of the processes preceding it, such as transaction pre-validation and adding the block to the chain, since we already excluded transaction ordering and block building as potential bottlenecks in Section 4.3.1. Considering that appending the block to the blockchain is not CPU intensive, the primary issues likely lie with block propagation or pre-validation. Using the same line of reasoning for $f_{\text{req}} = 2400\,\text{s}^{-1}$, the bottleneck appears to be either transaction propagation, as already identified, or pre-validation, as it is the only operation that precedes propagation.

4.3.3. Memory & Hard Drive

Similar to our observations with Fabric, Figure 14 indicates that neither memory nor hard drive utilization is likely to be a bottleneck. Nevertheless, we proceed to examine the utilization metrics of the nodes. The analysis reveals consistent behavior in memory utilization across all nodes, with no signs of plateauing (Figure 19a). This uniformity suggests that memory is not contributing to performance degradation. Despite the significant peaks in hard drive utilization in Figure 14, the mean utilization remains below 1 % (Figure 19b), indicating that it is not a constraining factor either. The large fluctuations are probably related to the writing of the block into each node’s database, but since this is unlikely to cause a bottleneck, we refrain from further investigation.

Concluding the first part of the Quorum analysis, we identify two primary points where there is a significant drop in performance. The first one appears at $f_{\text{req}} = 1800\,\text{s}^{-1}$, with the main factors likely being block propagation or the pre-validation of transactions. The second one is observed at $f_{\text{req}} = 2400\,\text{s}^{-1}$, which appears to stem from the propagation of transactions through the network or the pre-validation of transactions.

4.4. Quorum: Throughput

The throughput analysis of Quorum starts by examining its correlation with the request rate. We see that even for small window sizes, throughput remains relatively stable but starts to exhibit higher fluctuations at $f_{\text{req}} = 1800\,\text{s}^{-1}$. Examining the full run, we observe that throughput plateaus at around $f_{\text{resp}} = 2100\,\text{s}^{-1}$, which is significantly lower than the maximum request rate. This suggests that the overall performance of the blockchain started to decline long before reaching the highest request rate, and that the performance degradation noted at $f_{\text{req}} = 1800\,\text{s}^{-1}$ for both CPU and network utilization may be more critical in identifying the bottleneck.

Examining throughput against the CPU and network utilization (Figure 21), we see similar growth rates and patterns. Both resources keep up with throughput at the beginning and experience higher performance fluctuations at around $f_{\text{resp}} = 1800\,\text{s}^{-1}$. Beyond this point, the differences in throughput between request rates become less pronounced, becoming indistinguishable as it approaches $f_{\text{resp}} = 2100\,\text{s}^{-1}$.

Since both resources exhibit similar patterns, we search for a common source behind the blockchain’s performance degradation. Considering that block and validated transaction propagation are network-related, if they were the bottleneck, they would mainly impact network traffic. As a result, they are less likely to be the bottleneck. This leaves only transaction pre-validation as the possible constraining factor, which can impact both CPU and network as it represents the first step in processing a transaction.

Looking further into pre-validation, we deduce that the bottleneck is either pre-validation itself, i.e., the nodes do not validate the incoming transactions fast enough, which in turn slows down the network, or the nodes do not receive enough transactions. From Figure 14, we see that the network utilization of clients increases linearly with the request rate, which suggests that they send the proper number of transactions to the nodes. Examining further the interaction of nodes with the incoming transactions, we look into the number of rejected transactions (Figure 22). By rejected transactions, we refer to those transactions that nodes decline to propagate through the network, resulting in clients receiving an almost immediate (within less than 50 ms) notification of transaction failure after submission. Initially, the count of rejected transactions remains minimal but begins to surge at $f_{\text{req}} = 1800\,\text{s}^{-1}$, culminating in approximately 7000 rejections by $f_{\text{req}} = 2500\,\text{s}^{-1}$. This escalation translates to a rejection rate of 14 % before reaching a plateau. This significant increase in rejections prompts us to investigate the underlying causes.
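The definition of a rejected transaction given above can be expressed as a simple filter over client-side records. This is a minimal sketch: the `TxRecord` structure and its field names are hypothetical and do not reflect the DLPS data format, and only the 50 ms threshold is taken from the text.

```python
from dataclasses import dataclass

REJECTION_LATENCY_S = 0.050  # failure notification within less than 50 ms


@dataclass
class TxRecord:
    submitted_at: float  # client-side submission time in seconds
    responded_at: float  # time the client received a response, in seconds
    succeeded: bool      # whether the response reported success


def is_rejected(tx: TxRecord) -> bool:
    """A transaction counts as rejected if it fails almost immediately."""
    return (not tx.succeeded) and (tx.responded_at - tx.submitted_at) < REJECTION_LATENCY_S


def rejection_rate(records) -> float:
    """Share of rejected transactions among all submitted transactions."""
    if not records:
        return 0.0
    return sum(is_rejected(tx) for tx in records) / len(records)


# Invented records: two fast failures, one success, one slow failure.
records = [
    TxRecord(0.0, 0.01, False),
    TxRecord(0.0, 2.5, True),
    TxRecord(1.0, 4.0, False),   # slow failure, not counted as a rejection
    TxRecord(2.0, 2.02, False),
]
print(rejection_rate(records))  # 0.5
```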

In Quorum, transactions are categorized as either executable or non-executable. Executable transactions are the ones that can be immediately included in a block, while non-executable transactions are out of nonce order and must wait for preceding transactions with lower nonce to execute first. Investigating the reasons behind the increasing rejection rates, we identify two main constraints in Quorum’s transaction handling. First, Quorum caps the transaction pool at 4096 executable and 10000 non-executable transactions (Consensys, 2022). However, according to Figure 23a, the transaction pool’s capacity is never fully utilized, suggesting that this is not the cause behind the rejections. The periodic drop in the number of transactions in the pool aligns with the end of each ramp-up phase of DLPS, where nodes commit pending transactions to blocks and clear the pool before the new ramp-up commences.
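The distinction between executable and non-executable transactions can be illustrated with a nonce-based split for a single account. The sketch below is a simplified model of the rule described above and not Quorum's implementation. The function name is illustrative, and stale nonces are simply ignored.

```python
def split_by_nonce(next_nonce, pending_nonces):
    """Split one account's pending nonces into executable and non-executable.

    `next_nonce` is the next nonce the account is expected to use. Nonces that
    continue this sequence without a gap are executable. Nonces after a gap
    must wait for the missing lower nonces and are non-executable.
    """
    executable, non_executable = [], []
    expected = next_nonce
    for nonce in sorted(set(pending_nonces)):
        if nonce < next_nonce:
            continue  # nonce already used; ignored in this sketch
        if nonce == expected:
            executable.append(nonce)
            expected += 1
        else:
            non_executable.append(nonce)
    return executable, non_executable


# Nonce 7 is missing, so 8 and 9 cannot be included in a block yet.
print(split_by_nonce(5, [5, 6, 8, 9]))  # ([5, 6], [8, 9])
```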

The second limitation involves the clients, who are limited to 16 executable and 500 non-executable transactions in the pool at any given time (Consensys, 2022). Examining Figure 23b, we see that at $f_{\text{req}} = 1800\,\text{s}^{-1}$, transaction submission by six clients begins to exhibit instability, diverging from the previous linear relationship with the request rate, and most clients diverge at $f_{\text{req}} = 2400\,\text{s}^{-1}$. This pattern suggests that clients are reaching the limit of executable transactions in the pool, leading to an increased rate of rejections. To answer this definitively, we would have to separate the client transactions into executable and non-executable ones and examine which type is getting rejected. This is not possible due to the blockchain-agnostic nature of DLPS, which does not allow us to distinguish between the different transaction types. However, the sharp increase in the number of rejected transactions at $f_{\text{req}} = 1800\,\text{s}^{-1}$, combined with the observed reduction in CPU and network utilization starting at the same point and the subsequent plateau at $f_{\text{req}} = 2400\,\text{s}^{-1}$, strongly suggests that the primary bottleneck is the limit on the number of executable transactions per client.

5. Conclusion

This paper introduces a general illustrative method for blockchain bottleneck identification, demonstrated through an in-depth analysis of a 12-node Fabric network and an 8-node Quorum network. Our method leverages EDA to analyze blockchain performance metrics, highlighting their specific characteristics and bottlenecks. By employing a combination of proportional analysis and the study of plateau-shaped trends in resource utilization versus transaction metrics, we uncover anomalies across varying time windows. The approach allows us to draw conclusions by comparing the correlation between data trends, the request rates ($f_{\text{req}}$), and response rates ($f_{\text{resp}}$).

For the case of Fabric, using our method, we were able to identify that the validation phase is the main bottleneck in Fabric’s performance, even for v2.0, with VSCC being the most likely component behind the bottleneck. We were also able to showcase the average degree of parallelization within Fabric, which leaves ample room for improvement. These findings are in line with previous investigations (Thakkar and Natarajan, 2021). Further studies could also attempt to dive deeper into the validation phase using tools tailored to Fabric to identify with higher confidence the role of VSCC and MVCC for the blockchain’s performance bottlenecks.

For Quorum, we determined that the bottleneck stems from the restriction on the number of executable transactions a client can have in the transaction pool, which results in a significant number of rejected transactions. Additionally, our findings illustrate Quorum’s relatively limited capacity for parallel processing. Future research could again rely on more specialized tools to examine thoroughly the role of executable and non-executable transactions in the network’s performance.

Our study is not without limitations. Specifically, in the network configurations we studied, all nodes were located in the same data center, while in real-life scenarios, nodes would likely be scattered around the globe. Related work has already analyzed performance degradation with significant network latencies (Guggenberger et al., 2022). As a result, our analysis represents the best-case scenario for network conditions, and additional research is needed to evaluate our approach under real-life conditions. Moreover, while our approach offers a streamlined analysis, a deep understanding of blockchain-specific mechanisms remains crucial for accurate diagnoses.

We believe our work, including the publication and the open-source code, provides researchers and practitioners with a clear and illustrative starting point for simpler blockchain performance and bottleneck analysis. It can also serve to analyze other blockchains or as a stepping stone for researchers to develop their own approaches while offering a deeper understanding of the individual components of the underlying blockchain and their interrelationships.

Acknowledgements.

This work was supported in part by the Luxembourg National Research Fund (FNR) in the FiReSpARX (ref. 14783405) and PABLO (ref. 16326754) projects and by PayPal, PEARL grant (ref. 1334293). Additional funding was provided by the Bavarian Ministry of Economic Affairs, Regional Development and Energy for the Fraunhofer Blockchain Center project (ref. 20-3066-2-6-14).

References

Alastria (2024) Alastria. 2024. How to Install a Node in Alastria Red-T. https://github.com/alastria/alastria-node-quorum
Androulaki et al. (2018) Elli Androulaki, Artem Barger, Vita Bortnikov, Christian Cachin, Konstantinos Christidis, Angelo De Caro, David Enyeart, Christopher Ferris, Gennady Laventman, Yacov Manevich, Srinivasan Muralidharan, Chet Murthy, Binh Nguyen, Manish Sethi, Gari Singh, Keith Smith, Alessandro Sorniotti, Chrysoula Stathakopoulou, Marko Vukolić, Sharon Weed Cocco, and Jason Yellick. 2018. Hyperledger Fabric: A Distributed Operating System for Permissioned Blockchains. In Proceedings of the 13th EuroSys Conference. ACM. https://doi.org/10.1145/3190508.3190538
Androulaki et al. (2019) Elli Androulaki, Angelo De Caro, Matthias Neugschwandtner, and Alessandro Sorniotti. 2019. Endorsement in Hyperledger Fabric. In Proceedings of the International Conference on Blockchain. IEEE, 510–519. https://doi.org/10.1109/Blockchain.2019.00077
Baliga et al. (2018a) Arati Baliga, Nitesh Solanki, Shubham Verekar, Amol Pednekar, Pandurang Kamat, and Siddhartha Chatterjee. 2018a. Performance Characterization of Hyperledger Fabric. In Proceedings of the Crypto Valley Conference on Blockchain Technology. 65–74. https://doi.org/10.1109/CVCBT.2018.00013
Baliga et al. (2018b) Arati Baliga, I. Subhod, Pandurang Kamat, and Siddhartha Chatterjee. 2018b. Performance Evaluation of the Quorum Blockchain Platform. http://arxiv.org/abs/1809.03421
Ben Toumia et al. (2022) Sadok Ben Toumia, Christian Berger, and Hans P. Reiser. 2022. An Evaluation of Blockchain Application Requirements and their Satisfaction in Hyperledger Fabric: A Practical Experience Report. In Distributed Applications and Interoperable Systems: 22nd International Conference. Springer, 3–20. https://doi.org/10.1007/978-3-031-16092-9_1
Benahmed et al. (2019) Sofiane Benahmed, Ivan Pidikseev, Rasheed Hussain, JooYoung Lee, S.M. Ahsan Kazmi, Alma Oracevic, and Fatima Hussain. 2019. A Comparative Analysis of Distributed Ledger Technologies for Smart Contract Development. In Proceedings of the 30th Annual International Symposium on Personal, Indoor and Mobile Radio Communications. IEEE. https://doi.org/10.1109/PIMRC.2019.8904256
Bergman et al. (2020) Sara Bergman, Mikael Asplund, and Simin Nadjm‐Tehrani. 2020. Permissioned Blockchains and Distributed Databases: A Performance Study. Concurrency and Computation: Practice and Experience 32, 12 (2020). https://doi.org/10.1002/cpe.5227
Castro and Liskov (2002) Miguel Castro and Barbara Liskov. 2002. Practical Byzantine Fault Tolerance and Proactive Recovery. ACM Transactions on Computer Systems 20, 4 (2002), 398–461. https://doi.org/10.1145/571637.571640
Chacko et al. (2021) Jeeta Ann Chacko, Ruben Mayer, and Hans-Arno Jacobsen. 2021. Why do my Blockchain Transactions Fail? A Study of Hyperledger Fabric. In Proceedings of the International Conference on Management of Data. ACM, 221–234. https://doi.org/10.1145/3448016.3452823
Consensys (2022) Consensys. 2022. Quorum Documentation. https://github.com/Consensys/quorum/blob/6cf0f5aa6d870b6d727d80a7b0d074c444369b81/cmd/utils/flags.go
Consensys (2024) Consensys. 2024. Quorum GitHub Repository. https://github.com/Consensys/quorum/blob/master/README.md
Dabbagh et al. (2020) Mohammad Dabbagh, Mohsen Kakavand, Mohammad Tahir, and Angela Amphawan. 2020. Performance Analysis of Blockchain Platforms: Empirical Evaluation of Hyperledger Fabric and Ethereum. In Proceedings of the 2nd International Conference on Artificial Intelligence in Engineering and Technology. IEEE. https://doi.org/10.1109/IICAIET49801.2020.9257811
Dai et al. (2019) Hong-Ning Dai, Zibin Zheng, and Yan Zhang. 2019. Blockchain for Internet of Things: A Survey. IEEE Internet of Things Journal 6, 5 (2019), 8076–8094. https://doi.org/10.1109/JIOT.2019.2920987
DeFiLlama (2024) DeFiLlama. 2024. Smart Contract Language Dominance. https://defillama.com/languages
Dinh et al. (2018) Tien Tuan Anh Dinh, Rui Liu, Meihui Zhang, Gang Chen, Beng Chin Ooi, and Ji Wang. 2018. Untangling Blockchain: A Data Processing View of Blockchain Systems. IEEE Transactions on Knowledge and Data Engineering 30, 7 (2018), 1366–1385. https://doi.org/10.1109/TKDE.2017.2781227
Dinh et al. (2017) Tien Tuan Anh Dinh, Ji Wang, Gang Chen, Rui Liu, Beng Chin Ooi, and Kian-Lee Tan. 2017. Blockbench: A Framework for Analyzing Private Blockchains. In Proceedings of the International Conference on Management of Data. ACM, 1085–1100. https://doi.org/10.1145/3035918.3064033
Dreyer et al. (2020) Julian Dreyer, Marten Fischer, and Ralf Tönjes. 2020. Performance Analysis of Hyperledger Fabric 2.0 Blockchain Platform. In Proceedings of the Workshop on Cloud Continuum Services for Smart IoT Systems. ACM, 32–38. https://doi.org/10.1145/3417310.3431398
Fan et al. (2020) Caixiang Fan, Sara Ghaemi, Hamzeh Khazaei, and Petr Musilek. 2020. Performance Evaluation of Blockchain Systems: A Systematic Survey. IEEE Access 8 (2020), 126927–126950. https://doi.org/10.1109/ACCESS.2020.3006078
Foschini et al. (2020) Luca Foschini, Andrea Gavagna, Giuseppe Martuscelli, and Rebecca Montanari. 2020. Hyperledger Fabric Blockchain: Chaincode Performance Analysis. In Proceedings of the International Conference on Communications. https://doi.org/10.1109/ICC40277.2020.9149080
Fridgen et al. (2018) Gilbert Fridgen, Sven Radszuwill, Nils Urbach, and Lena Utz. 2018. Cross-Organizational Workflow Management Using Blockchain Technology – Towards Applicability, Auditability, and Automation. In Proceedings of the 51st Hawaii International Conference on System Sciences. 3507–3516. https://doi.org/10.24251/hicss.2018.444
Gartner (2021) Gartner. 2021. Emerging Technology Roadmap for Large Enterprises 2021 – 2023. https://emtemp.gcom.cloud/ngw/globalassets/en/publications/documents/le-emerging-tech-roadmap-2021-2023.pdf
Ge et al. (2022) Zerui Ge, Dumitrel Loghin, Beng Chin Ooi, Pingcheng Ruan, and Tianwen Wang. 2022. Hybrid Blockchain Database Systems: Design and Performance. Proceedings of the VLDB Endowment 15, 5 (2022), 1092–1104. https://doi.org/10.14778/3510397.3510406
Geneiatakis et al. (2020) Dimitris Geneiatakis, Yannis Soupionis, Gary Steri, Ioannis Kounelis, Ricardo Neisse, and Igor Nai-Fovino. 2020. Blockchain Performance Analysis for Supporting Cross-Border E-Government Services. IEEE Transactions on Engineering Management 67, 4 (2020), 1310–1322. https://doi.org/10.1109/TEM.2020.2979325
Geyer et al. (2019) Fabien Geyer, Holger Kinkelin, Hendrik Leppelsack, Stefan Liebald, Dominik Scholz, Georg Carle, and Dominic Schupke. 2019. Performance Perspective on Private Distributed Ledger Technologies for Industrial Networks. In International Conference on Networked Systems. https://doi.org/10.1109/NetSys.2019.8854512
Geyer et al. (2023) Frank Christian Geyer, Hans-Arno Jacobsen, Ruben Mayer, and Peter Mandl. 2023. An End-to-End Performance Comparison of Seven Permissioned Blockchain Systems. In Proceedings of the 24th International Middleware Conference. 71–84. https://doi.org/10.1145/3590140.3629106
Gorenflo et al. (2020) Christian Gorenflo, Stephen Lee, Lukasz Golab, and Srinivasan Keshav. 2020. FastFabric: Scaling Hyperledger Fabric to 20,000 Transactions Per Second. International Journal of Network Management 30, 5 (2020), e2099. https://doi.org/10.1109/BLOC.2019.8751452
Guggenberger et al. (2020) Tobias Guggenberger, André Schweizer, and Nils Urbach. 2020. Improving Interorganizational Information Sharing for Vendor Managed Inventory: Toward a Decentralized Information Hub Using Blockchain Technology. IEEE Transactions on Engineering Management 67, 4 (2020), 1074–1085. https://doi.org/10.1109/TEM.2020.2978628
Guggenberger et al. (2022) Tobias Guggenberger, Johannes Sedlmeir, Gilbert Fridgen, and André Luckow. 2022. An In-Depth Investigation of the Performance Characteristics of Hyperledger Fabric. Computers & Industrial Engineering 173 (2022), 108716. https://doi.org/10.1016/j.cie.2022.108716
Gusenbauer (2019) Michael Gusenbauer. 2019. Google Scholar to Overshadow them All? Comparing the Sizes of 12 Academic Search Engines and Bibliographic Databases. Scientometrics 118, 1 (2019), 177–214. https://doi.org/10.1007/s11192-018-2958-5
Hao et al. (2018) Yue Hao, Yi Li, Xinghua Dong, Li Fang, and Ping Chen. 2018. Performance Analysis of Consensus Algorithm in Private Blockchain. In Intelligent Vehicles Symposium. IEEE, 280–285. https://doi.org/10.1109/IVS.2018.8500557
Hong et al. (2023) Zicong Hong, Song Guo, Enyuan Zhou, Wuhui Chen, Huawei Huang, and Albert Zomaya. 2023. GriDB: Scaling Blockchain Database via Sharding and Off-Chain Cross-Shard Mechanism. Proceedings of the VLDB Endowment 16, 7 (2023), 1685–1698. https://doi.org/10.14778/3587136.3587143
Hyperledger (2023) Hyperledger. 2023. RAFT Configuration. https://github.com/hyperledger/fabric/blob/111cff51600d26d4b4b05f52825da11e7629e971/docs/source/raft_configuration.md?plain=1#L22
Hyperledger Fabric Architecture Reference (2020) Hyperledger Fabric Architecture Reference. 2020. Read-Write Set Semantics. https://hyperledger-fabric.readthedocs.io/en/release-2.2/readwrite.html
Hyperledger Fabric Operations Guides (2020) Hyperledger Fabric Operations Guides. 2020. Metrics Reference. https://hyperledger-fabric.readthedocs.io/en/release-2.2/metrics_reference.html
Javaid et al. (2019) Haris Javaid, Chengchen Hu, and Gordon Brebner. 2019. Optimizing Validation Phase of Hyperledger Fabric. In 27th International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems. IEEE, 269–275. https://doi.org/10.1109/MASCOTS.2019.00038
Klenik and Kocsis (2021) Attila Klenik and Imre Kocsis. 2021. Porting a Benchmark with a Classic Workload to Blockchain: TPC-C on Hyperledger Fabric. https://doi.org/10.1145/3477314.3507006
Kreps et al. (2011) Jay Kreps, Neha Narkhede, Jun Rao, et al. 2011. Kafka: A Distributed Messaging System for Log Processing. In Proceedings of the NetDB, Vol. 11. https://api.semanticscholar.org/CorpusID:18534081
Kuzlu et al. (2019) Murat Kuzlu, Manisa Pipattanasomporn, Levent Gurses, and Saifur Rahman. 2019. Performance Analysis of a Hyperledger Fabric Blockchain Framework: Throughput, Latency and Scalability. In International Conference on Blockchain. IEEE, 536–540. https://doi.org/10.1109/Blockchain.2019.00003
Leal et al. (2020) Fátima Leal, Adriana E. Chis, and Horacio González-Vélez. 2020. Performance Evaluation of Private Ethereum Networks. SN Computer Science 1, 5 (2020), 285. https://doi.org/10.1007/s42979-020-00289-7
Liu et al. (2021) Chuan-Ming Liu, Mukesh Badigineni, and Sheng Wen Lu. 2021. Adaptive Blocksize for IoT Payload Data on Fabric Blockchain. In 30th Wireless and Optical Communications Conference. 92–96. https://doi.org/10.1109/WOCC53213.2021.9602935
Lockl et al. (2020) Jannik Lockl, Vincent Schlatt, André Schweizer, Nils Urbach, and Natascha Harth. 2020. Toward Trust in Internet of Things (IoT) Ecosystems: Design Principles for Blockchain-Based IoT Applications. IEEE Transactions on Engineering Management 67, 4 (2020), 1256–1270. https://doi.org/10.1109/TEM.2020.2978014
Mazzoni et al. (2021) Marco Mazzoni, Antonio Corradi, and Vincenzo Di Nicola. 2021. Performance Evaluation of Permissioned Blockchains for Financial Applications: The ConsenSys Quorum Case Study. Blockchain: Research and Applications (2021), 100026. https://doi.org/10.1016/j.bcra.2021.100026
Monrat et al. (2020) Ahmed Afif Monrat, Olov Schelén, and Karl Andersson. 2020. Performance Evaluation of Permissioned Blockchain Platforms. In Asia-Pacific Conference on Computer Science and Data Engineering. IEEE. https://doi.org/10.1109/CSDE50874.2020.9411380
Moschou et al. (2020) Konstantinos Moschou, Anastasia Theodouli, Sofia Terzi, Konstantinos Votis, Dimitrios Tzovaras, Dimitrios Karamitros, and Sotiris Diamantopoulos. 2020. Performance Evaluation of Different Hyperledger Sawtooth Transaction Processors for Blockchain Log Storage with Varying Workloads. In International Conference on Blockchain. IEEE, 476–481. https://doi.org/10.1109/Blockchain50366.2020.00069
Nakaike et al. (2020) Takuya Nakaike, Qi Zhang, Yohei Ueda, Tatsushi Inagaki, and Moriyoshi Ohara. 2020. Hyperledger Fabric Performance Characterization and Optimization Using GoLevelDB Benchmark. In International Conference on Blockchain and Cryptocurrency. IEEE. https://doi.org/10.1109/ICBC48266.2020.9169454
Nakamoto (2008) Satoshi Nakamoto. 2008. Bitcoin: A Peer-to-Peer Electronic Cash System. https://bitcoin.org/bitcoin.pdf
Nasir et al. (2018) Qassim Nasir, Ilham A. Qasse, Manar Abu Talib, and Ali Bou Nassif. 2018. Performance Analysis of Hyperledger Fabric Platforms. Security and Communication Networks 2018 (Sept. 2018). https://doi.org/10.1155/2018/3976093
Nasirifard et al. (2019) Pezhman Nasirifard, Ruben Mayer, and Hans-Arno Jacobsen. 2019. FabricCRDT: A Conflict-Free Replicated Datatypes Approach to Permissioned Blockchains. In Proceedings of the 20th International Middleware Conference. ACM, 110–122. https://doi.org/10.1145/3361525.3361540
Nathan et al. (2019) Senthil Nathan, Chander Govindarajan, Adarsh Saraf, Manish Sethi, and Praveen Jayachandran. 2019. Blockchain Meets Database: Design and Implementation of a Blockchain Relational Database. Proceedings of the VLDB Endowment 12, 11 (2019), 1539–1552. https://doi.org/10.14778/3342263.3342632
Nguyen et al. (2019) Thanh Son Lam Nguyen, Guillaume Jourjon, Maria Potop-Butucaru, and Kim Loan Thai. 2019. Impact of Network Delays on Hyperledger Fabric. In IEEE INFOCOM 2019 - IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS). 222–227. https://doi.org/10.1109/INFCOMW.2019.8845168
Ongaro and Ousterhout (2014) Diego Ongaro and John Ousterhout. 2014. In Search of an Understandable Consensus Algorithm. In USENIX Annual Technical Conference. 305–319. https://raft.github.io/raft.pdf
Peng et al. (2020) Yanqing Peng, Min Du, Feifei Li, Raymond Cheng, and Dawn Song. 2020. FalconDB: Blockchain-Based Collaborative Database. In Proceedings of the ACM SIGMOD International Conference on Management of Data. 637–652. https://doi.org/10.1145/3318464.3380594
Pongnumkul et al. (2017) Suporn Pongnumkul, Chaiyaphum Siripanpornchana, and Suttipong Thajchayapong. 2017. Performance Analysis of Private Blockchain Platforms in Varying Workloads. In 26th International Conference on Computer Communication and Networks. https://doi.org/10.1109/ICCCN.2017.8038517
Rouhani and Deters (2017) Sara Rouhani and Ralph Deters. 2017. Performance Analysis of Ethereum Transactions in Private Blockchain. In 8th International Conference on Software Engineering and Service Science. IEEE, 70–74. https://doi.org/10.1109/ICSESS.2017.8342866
Ruan et al. (2020) Pingcheng Ruan, Dumitrel Loghin, Quang-Trung Ta, Meihui Zhang, Gang Chen, and Beng Chin Ooi. 2020. A Transactional Perspective on Execute-Order-Validate Blockchains. In Proceedings of the ACM SIGMOD International Conference on Management of Data (Portland, OR, USA) (SIGMOD ’20). 543–557. https://doi.org/10.1145/3318464.3389693
Samy et al. (2021) Hossam Samy, Ashraf Tammam, Ahmed Fahmy, and Bahaa Hasan. 2021. Enhancing the Performance of the Blockchain Consensus Algorithm Using Multithreading Technology. Ain Shams Engineering Journal 12, 3 (2021), 2709–2716. https://doi.org/10.1016/j.asej.2021.01.019
Schäffer et al. (2019) Markus Schäffer, Monika di Angelo, and Gernot Salzer. 2019. Performance and Scalability of Private Ethereum Blockchains. In Business Process Management: Blockchain and Central and Eastern Europe Forum. Vol. 361. Springer, 103–118. https://doi.org/10.1007/978-3-030-30429-4_8
Sedlmeir et al. (2022a) Johannes Sedlmeir, Jonathan Lautenschlager, Gilbert Fridgen, and Nils Urbach. 2022a. The Transparency Challenge of Blockchain in Organizations. Electronic Markets 32, 3 (2022), 1779–1794. https://doi.org/10.1007/s12525-022-00536-0
Sedlmeir et al. (2021) Johannes Sedlmeir, Philipp Ross, André Luckow, Jannik Lockl, Daniel Miehle, and Gilbert Fridgen. 2021. The DLPS: A New Framework for Benchmarking Blockchains. In 54th Hawaii International Conference on System Sciences. 6855–6864. https://hdl.handle.net/10993/45620
Sedlmeir et al. (2022b) Johannes Sedlmeir, Tim Wagner, Emil Djerekarov, Ryan Green, Johannes Klepsch, and Shruthi Rao. 2022b. A Serverless Distributed Ledger for Enterprises. In 55th Hawaii International Conference on System Sciences. 7382–7391. https://arxiv.org/abs/2110.09221
Shalaby et al. (2020) Salma Shalaby, Alaa Awad Abdellatif, Abdulla Al-Ali, Amr Mohamed, Aiman Erbad, and Mohsen Guizani. 2020. Performance Evaluation of Hyperledger Fabric. In Proceedings of the International Conference on Informatics, IoT, and Enabling Technologies. IEEE, 608–613. https://doi.org/10.1109/ICIoT48696.2020.9089614
Shapiro et al. (2020) Gary Shapiro, Christopher Natoli, and Vincent Gramoli. 2020. The Performance of Byzantine Fault Tolerant Blockchains. In 19th International Symposium on Network Computing and Applications. IEEE. https://doi.org/10.1109/NCA51143.2020.9306742
Sharma et al. (2018) Ankur Sharma, Felix Martin Schuhknecht, Divya Agrawal, and Jens Dittrich. 2018. How to Databasify a Blockchain: The Case of Hyperledger Fabric. http://arxiv.org/abs/1810.13177
Shi et al. (2019) Zeshun Shi, Huan Zhou, Yang Hu, Surbiryala Jayachander, Cees de Laat, and Zhiming Zhao. 2019. Operating Permissioned Blockchain in Clouds: A Performance Study of Hyperledger Sawtooth. In Proceedings of the 18th International Symposium on Parallel and Distributed Computing. IEEE, 50–57. https://doi.org/10.1109/ISPDC.2019.00010
Sukhwani et al. (2018) Harish Sukhwani, Nan Wang, Kishor S. Trivedi, and Andy Rindos. 2018. Performance Modeling of Hyperledger Fabric (Permissioned Blockchain Network). In Proceedings of the 17th International Symposium on Network Computing and Applications. IEEE. https://doi.org/10.1109/NCA.2018.8548070
Thakkar and Natarajan (2021) Parth Thakkar and Senthilnathan Natarajan. 2021. Scaling Blockchains Using Pipelined Execution and Sparse Peers. In Proceedings of the ACM Symposium on Cloud Computing. ACM, 489–502. https://doi.org/10.1145/3472883.3486975
Thakkar et al. (2018) Parth Thakkar, Senthil Nathan, and Balaji Viswanathan. 2018. Performance Benchmarking and Optimizing Hyperledger Fabric Blockchain Platform. In Proceedings of the 26th International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems. IEEE, 264–276. https://doi.org/10.1109/MASCOTS.2018.00034
Toufaily et al. (2021) Elissar Toufaily, Tatiana Zalan, and Soumaya Ben Dhaou. 2021. A Framework of Blockchain Technology Adoption: An Investigation of Challenges and Expected Value. Information & Management 58, 3 (2021), 103444. https://doi.org/10.1016/j.im.2021.103444
Toyoda et al. (2020) Kentaroh Toyoda, Koji Machi, Yutaka Ohtake, and Allan N. Zhang. 2020. Function-Level Bottleneck Analysis of Private Proof-of-Authority Ethereum Blockchain. IEEE Access 8 (2020), 141611–141621. https://doi.org/10.1109/ACCESS.2020.3011876
Tseng et al. (2020) Lewis Tseng, Xinyu Yao, Safa Otoum, Moayad Aloqaily, and Yaser Jararweh. 2020. Blockchain-Based Database in an IoT Environment: Challenges, Opportunities, and Analysis. Cluster Computing 23, 3 (2020), 2151–2165. https://doi.org/10.1007/s10586-020-03138-7
VAKT (2024) VAKT. 2024. Technology. https://www.vakt.com/technology
Vukolić (2015) Marko Vukolić. 2015. The Quest for Scalable Blockchain Fabric: Proof-of-Work vs. BFT Replication. In International Workshop on Open Problems in Network Security. Springer, 112–125. https://doi.org/10.1007/978-3-319-39028-4_9
Wang and Chu (2020) Canhui Wang and Xiaowen Chu. 2020. Performance Characterization and Bottleneck Analysis of Hyperledger Fabric. In Proceedings of the 40th International Conference on Distributed Computing Systems. 1281–1286. https://doi.org/10.1109/ICDCS47774.2020.00165
Wang et al. (2019) Rui Wang, Kejiang Ye, and Cheng-Zhong Xu. 2019. Performance Benchmarking and Optimization for Blockchain Systems: A Survey. In Second International Conference on Blockchain. Springer. https://doi.org/10.1007/978-3-030-23404-1
Xu et al. (2021) Xiaoqiong Xu, Gang Sun, Long Luo, Huilong Cao, Hongfang Yu, and Athanasios V. Vasilakos. 2021. Latency Performance Modeling and Analysis for Hyperledger Fabric Blockchain Network. Information Processing & Management 58, 1 (2021), 102436. https://doi.org/10.1016/j.ipm.2020.102436
Zheng (2019) Steven Zheng. 2019. Nearly 75% of Fortune 100 Firms Have Explored Blockchain Initiatives. http://www.theblockresearch.com/nearly-75-of-fortune-100-firms-have-explored-blockchain-initiatives-68506
