READ ME File For ‘panther-evaluation.xlsx’, ‘sherlock-throughput.xlsx’ and ‘sherlock-latency.xlsx’

Dataset DOI: 10.5258/SOTON/D3096

ReadMe Author: GILBERTO ZANFINO, University of Southampton
ORCID ID 0000-0002-5576-3246

This dataset supports the thesis entitled ‘Enhancing Privacy and Scalability of Permissioned Blockchain’
AWARDED BY: University of Southampton
DATE OF AWARD: 2024

DESCRIPTION OF THE DATA

This dataset is used to create Table 5.2 in Chapter 5 of the thesis, which reports the performance of PANTHER and the overhead it adds to Hyperledger Fabric. PANTHER is a privacy-preserving permissioned blockchain presented in Chapter 5. It uses the Multi-Key Homomorphic Encryption (MKHE) model to improve data privacy in blockchain systems. In particular, PANTHER enables blockchain users to store their encrypted data in the ledger and to invoke MKHE-based smart contract computations on it, even though the data is encrypted under different keys. The result of an MKHE computation is itself encrypted. To decrypt it, PANTHER runs a Secure Multiparty Computation (MPC) protocol involving only the users who contributed an input ciphertext to the computation. In an MPC decryption, each involved user sends her own partial decryption to the others and, by merging it with those received from them, obtains the computation result in the clear.

To realise PANTHER and collect the data, we implemented an MKHE scheme and integrated it into Hyperledger Fabric. We set up a blockchain network (i.e., a channel in Fabric jargon) of 4 nodes and implemented 4 client applications for 4 different users. The clients connect to the blockchain, send ciphertexts to be stored in the ledger, and invoke MKHE-based smart contracts to add, multiply or subtract data from the ledger. When a smart contract output is delivered, the requesting client initiates an MPC decryption with the other clients involved, and together they obtain the computation result.
These experiments were performed on a machine with 1 Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz and 12GB RAM running Ubuntu 22.04 LTS.

Furthermore, this dataset is used to generate Figure 6.5 and Figure 6.6 in Chapter 6 of the thesis, which compare the performance of SHERLOCK with that of PBFT. SHERLOCK is a novel permissioned blockchain presented in Chapter 6. It uses sharding to improve the scalability of the nodes engaged in the consensus protocol. PBFT is the Byzantine fault-tolerant consensus protocol commonly used in permissioned blockchain platforms. The comparison is made in terms of throughput and latency. In particular, Figure 6.5 shows the throughput trend of SHERLOCK and PBFT according to ‘sherlock-throughput.xlsx’, whereas Figure 6.6 shows the latency trend of SHERLOCK and PBFT according to ‘sherlock-latency.xlsx’.

To collect the data, we set up a network of 12 consensus nodes for both SHERLOCK and PBFT:
* In SHERLOCK, nodes are arranged in 3 committees of 4 nodes each, running 3 parallel instances of the consensus protocol;
* In PBFT, there is 1 committee of 12 nodes, running a single instance of the consensus protocol.

We then implemented a client that generates a workload of 1000 transactions per second for 5 minutes and submits it to both systems. These experiments were performed on a machine with 8 Intel(R) Xeon(R) E5-2695 v3 CPUs @ 2.30GHz and 128GB RAM running Ubuntu 22.04 LTS.

The types of data collected for each file are detailed below.

The ‘panther-evaluation.xlsx’ file contains:
Keys Generation: Time in milliseconds that the MKHE key generation algorithm takes in a client application to generate user keys.
Encryption: Time in milliseconds that the MKHE encryption algorithm takes in a client application to encrypt user data.
Addition: Time in milliseconds that the MKHE addition algorithm takes in a smart contract to add the requested data from the ledger.
Multiplication: Time in milliseconds that the MKHE multiplication algorithm takes in a smart contract to multiply the requested data from the ledger.
Subtraction: Time in milliseconds that the MKHE subtraction algorithm takes in a smart contract to subtract the requested data from the ledger.
MPC - Partial Decryption: Time in milliseconds that the MKHE partial decryption algorithm takes in a client application to partially decrypt a ciphertext.
MPC - Broadcast: Time in milliseconds that a client application takes to broadcast a partial decryption to the users involved in the MPC decryption.
MPC - Merge: Time in milliseconds that the MKHE merge algorithm takes to merge the partial decryptions and finally decrypt the ciphertext.
MPC - Total Time: Total time in milliseconds that an MPC decryption instance takes.
Average Time (milliseconds): Average of the time values in a column (e.g., of ‘MPC - Partial Decryption’).

The ‘sherlock-throughput.xlsx’ file contains:
Time (seconds): Time of the evaluation in seconds. For each second, the other columns report the number of transactions processed in SHERLOCK and PBFT.
SHERLOCK - Processed Transactions: Number of transactions that SHERLOCK processes in each second of the evaluation.
SHERLOCK - Smoothed: A moving average applied to the data in column ‘SHERLOCK - Processed Transactions’. For the first 9 column entries, the function sums the current value (i.e., transactions processed in the current second) with the preceding ones and then divides by the number of values taken. For example, the 3rd column entry is calculated by summing the first 3 values of ‘SHERLOCK - Processed Transactions’ and dividing by 3. From the 10th column entry onwards, the function sums the current value (i.e., transactions processed in the current second) with the preceding 10 and divides by 10.
For example, the 15th column entry is calculated by summing the values of ‘SHERLOCK - Processed Transactions’ ranging from the 5th to the 15th and dividing by 10.
PBFT - Processed Transactions: Number of transactions that PBFT processes in each second of the evaluation.
PBFT - Smoothed: A moving average applied to the data in column ‘PBFT - Processed Transactions’. For the first 9 column entries, the function sums the current value (i.e., transactions processed in the current second) with the preceding ones and then divides by the number of values taken. For example, the 3rd column entry is calculated by summing the first 3 values of ‘PBFT - Processed Transactions’ and dividing by 3. From the 10th column entry onwards, the function sums the current value (i.e., transactions processed in the current second) with the preceding 10 and divides by 10. For example, the 15th column entry is calculated by summing the values of ‘PBFT - Processed Transactions’ ranging from the 5th to the 15th and dividing by 10.

The ‘sherlock-latency.xlsx’ file contains:
Time (seconds): Time of the evaluation in seconds. For each second, the other columns report the average latency of the transactions processed in SHERLOCK and PBFT.
SHERLOCK - Average Latency: Average latency (in seconds) of the transactions processed by SHERLOCK in each second.
SHERLOCK - Smoothed: A moving average applied to the data in column ‘SHERLOCK - Average Latency’. For the first 9 column entries, the function sums the current value (i.e., average latency of transactions processed in the current second) with the preceding ones and then divides by the number of values taken. For example, the 3rd column entry is calculated by summing the first 3 values of ‘SHERLOCK - Average Latency’ and dividing by 3. From the 10th column entry onwards, the function sums the current value (i.e., average latency of transactions processed in the current second) with the preceding 10 and divides by 10.
For example, the 15th column entry is calculated by summing the values of ‘SHERLOCK - Average Latency’ ranging from the 5th to the 15th and dividing by 10.
PBFT - Average Latency: Average latency (in seconds) of the transactions processed by PBFT in each second.
PBFT - Smoothed: A moving average applied to the data in column ‘PBFT - Average Latency’. For the first 9 column entries, the function sums the current value (i.e., average latency of transactions processed in the current second) with the preceding ones and then divides by the number of values taken. For example, the 3rd column entry is calculated by summing the first 3 values of ‘PBFT - Average Latency’ and dividing by 3. From the 10th column entry onwards, the function sums the current value (i.e., average latency of transactions processed in the current second) with the preceding 10 and divides by 10. For example, the 15th column entry is calculated by summing the values of ‘PBFT - Average Latency’ ranging from the 5th to the 15th and dividing by 10.

Date of data collection: from August 2023 to April 2024
Licence: CC BY
Date that the file was created: June 2024
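
As a minimal sketch, the rule described above for the ‘Smoothed’ columns can be written in Python as follows. The function name `smooth` and its parameters (`lookback`, `divisor`, `warmup`) are hypothetical and are not taken from the thesis or the spreadsheets. The defaults follow the wording of this README literally (current value plus the preceding 10, divided by 10), so they should be checked against the spreadsheet formulas before relying on them.

```python
def smooth(values, lookback=10, divisor=10, warmup=9):
    """Trailing smoothing, following the rule as worded in this README.

    Assumptions (not confirmed by the dataset): entries are numbered from 1;
    the first `warmup` entries are plain running means; later entries sum the
    current value with up to `lookback` preceding values and divide by `divisor`.
    """
    smoothed = []
    for i in range(1, len(values) + 1):  # i is the 1-based entry number
        if i <= warmup:
            # Running mean over everything seen so far.
            window = values[:i]
            smoothed.append(sum(window) / len(window))
        else:
            # Current value plus up to `lookback` preceding values.
            start = max(0, i - 1 - lookback)
            window = values[start:i]
            smoothed.append(sum(window) / divisor)
    return smoothed


if __name__ == "__main__":
    # Illustrative input only, not values from the dataset.
    sample = list(range(1, 21))
    for second, value in enumerate(smooth(sample), start=1):
        print(second, value)
```

With the defaults, the 15th entry sums the 5th to 15th values, as in the examples above. Passing `lookback=9` instead makes every entry from the 10th onwards an average over exactly 10 values, which is the conventional 10-point moving average.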