Table of Links
- Preliminaries
- Selfish Mining Attack
2 PRELIMINARIES
2.1 System Model
We define the formal model of blockchain systems that we consider throughout this work. The model describes how block mining proceeds and is governed by several parameters; different instantiations of these parameters yield formal models of blockchains based on PoW, PoStake, or PoST protocols. We assume that the miners in the blockchain protocol are either adversarial or honest. Honest miners all follow the prescribed protocol, whereas the adversarial miners pool their resources and work together as a coalition.
System model. We assume that block mining proceeds in discrete time steps, as in many discrete-time models of PoW [17, 24] and PoStake [9] protocols. For a miner that owns a fraction p ∈ [0, 1] of the total resources in the blockchain and that can mine on up to k > 0 blocks at any given time step, we define (p, k)-mining as follows: the probability of mining a block at a given time step is proportional to p · k, and the maximum number of blocks available for mining is k. One can think of k as a further constraint on the number of blocks a miner can mine on at any given point in time, e.g. due to the number of VDFs they own in PoST. Hence, (p, 1)-mining corresponds to mining in PoW-based blockchains, (p, k)-mining for k < ∞ to mining in PoST-based blockchains with k VDFs, and (p, ∞)-mining to mining in PoStake-based blockchains.
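As a sanity check, (p, k)-mining can be sketched as a simple Monte-Carlo process. This is an illustrative toy, not the paper's formal model: it assumes the per-step success probability is exactly min(1, p · k), whereas the model only fixes it up to proportionality.

```python
import random

def mine_step(p: float, k: float, rng: random.Random) -> bool:
    """One time step of (p, k)-mining: the miner succeeds with
    probability proportional to p * k.  The proportionality constant
    (here: 1, capped at probability 1) is a modelling assumption of
    this sketch, not fixed by the text."""
    return rng.random() < min(1.0, p * k)

def blocks_mined(p: float, k: float, steps: int, seed: int = 0) -> int:
    """Count successful steps over a horizon -- a toy stand-in for a
    miner's block production under (p, k)-mining."""
    rng = random.Random(seed)
    return sum(mine_step(p, k, rng) for _ in range(steps))
```

With p = 0.3 and k = 1 (PoW-style mining), roughly 30% of time steps yield a block.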
Adversarial and broadcast model. Let α ∈ [0, 1] denote the fraction of resources in the chain owned by the adversarial coalition. We assume the adversarial coalition participates in (α, k)-mining and the honest miners participate in (1 − α, 1)-mining, where the only block the honest miners mine on at any time step is the most recent block of the longest public chain. When there is a tie, i.e. when two longest chains are gossiped through the network, an honest miner mines on the chain that was gossiped to them first. In such situations, γ ∈ [0, 1] denotes the switching probability, i.e. the probability of honest miners switching to the adversary's chain.
2.2 Selfish Mining Objective
Chain quality. Chain quality is a measure of the number of honest blocks in any consecutive segment of the blockchain. A blockchain protocol is said to satisfy (μ, ℓ)-chain quality if for any segment of the chain of length ℓ, the fraction of honest blocks in that segment is at least μ [16].
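The (μ, ℓ)-chain quality condition is easy to check mechanically on a finished chain. A minimal sketch, assuming blocks are encoded as a string of 'H' (honest) and 'A' (adversarial) labels, an encoding chosen here purely for illustration:

```python
def satisfies_chain_quality(blocks: str, mu: float, ell: int) -> bool:
    """Check (mu, ell)-chain quality: every window of ell consecutive
    blocks must contain at least a mu fraction of honest blocks.
    'H' marks an honest block, 'A' an adversarial one."""
    for i in range(len(blocks) - ell + 1):
        window = blocks[i:i + ell]
        if window.count('H') / ell < mu:
            return False
    return True
```

For instance, the chain HAAH violates (0.5, 2)-chain quality because the window AA contains no honest block.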
Selfish mining objective. Let π denote an adversarial mining strategy. The selfish mining objective we analyse in our work is the expected relative revenue of the adversary. Formally, let revenueA and revenueH denote the numbers of adversarial and honest blocks in the main chain, respectively. The expected relative revenue (ERRev) of the adversary under strategy π is defined as

ERRev(π) = 𝔼^π [ revenueA / (revenueA + revenueH) ].
Note that the chain quality of the blockchain under an adversarial mining strategy π is simply 1 − ERRev(π), hence our selfish mining objective captures the expected change in the chain quality.
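On a concrete, finished chain the complementary relationship between relative revenue and chain quality can be made explicit in a few lines; the 'A'/'H' labelling is again an illustrative encoding, not from the paper:

```python
def relative_revenue(blocks: str) -> float:
    """Adversary's relative revenue on a finished chain:
    revenueA / (revenueA + revenueH), with blocks labelled
    'A' (adversarial) / 'H' (honest)."""
    return blocks.count('A') / len(blocks)

def chain_quality(blocks: str) -> float:
    """Fraction of honest blocks in the chain; by definition this
    equals 1 - relative_revenue(blocks)."""
    return 1.0 - relative_revenue(blocks)
```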
2.3 Markov Decision Processes
As mentioned in Section 1, we will reduce the problem of computing optimal selfish mining strategies for the expected relative revenue objective to solving MDPs with mean-payoff objectives. In what follows, we recall the necessary notions on MDPs and formally define the mean-payoff objectives. For a finite set X, we use 𝒟(X) to denote the set of all probability distributions over X.
Markov decision process. A Markov Decision Process (MDP) [26] is a tuple M = (S, A, P, s₀) where
• S is a finite set of states, with s₀ ∈ S being the initial state,
• A is a finite set of actions, overloaded to specify for each state s ∈ S the set of available actions A(s) ⊆ A, and
• P : S × A → 𝒟(S) is a transition function, prescribing to each (s, a) ∈ S × A a probability distribution over successor states.
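The tuple (S, A, P, s₀) translates directly into a data structure. A minimal sketch, where the field names and the toy instance are illustrative, not taken from the paper:

```python
from dataclasses import dataclass

@dataclass
class MDP:
    """A finite MDP (S, A, P, s0) in the spirit of the definition
    above; this encoding is a modelling choice of the sketch."""
    states: list       # S
    actions: dict      # s -> list of available actions A(s)
    P: dict            # (s, a) -> {successor state: probability}
    s0: object         # initial state

    def successors(self, s, a) -> dict:
        """Distribution over successor states for the pair (s, a)."""
        return self.P[(s, a)]

# A two-state toy instance:
toy = MDP(
    states=['s0', 's1'],
    actions={'s0': ['stay', 'go'], 's1': ['stay']},
    P={('s0', 'stay'): {'s0': 1.0},
       ('s0', 'go'):   {'s1': 0.9, 's0': 0.1},
       ('s1', 'stay'): {'s1': 1.0}},
    s0='s0',
)
```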
A strategy in M is a recipe for choosing an action given a history of states and actions, i.e. it is a function π : (S × A)* × S → 𝒟(A). In general, a strategy may use both randomization and memory. A positional strategy uses neither randomization nor memory, i.e. it is a function π : S → A. We denote by Σ and Σ^p the sets of all strategies and of all positional strategies in M, respectively. A strategy π induces a probability measure ℙ^π[·] in M with an associated expectation operator 𝔼^π[·].
Mean-payoff objective. A reward function in M is a function r : S × A × S → ℝ which prescribes a real-valued reward to each state-action-state triple. Given an MDP M, a reward function r and a strategy π, we define the mean-payoff of π with respect to r via

MP(π) = lim inf_{T→∞} 𝔼^π [ (1/T) Σ_{t=0}^{T−1} r_t ],

where r_t = r(s_t, a_t, s_{t+1}) is the reward incurred at step t.
The mean-payoff MDP problem is to compute the maximal mean-payoff that a strategy in the MDP can achieve. A classic result in MDP analysis is that there always exists a positional strategy π* achieving the maximal mean-payoff, i.e. there is π* ∈ Σ^p such that MP(π*) = max_{π∈Σ^p} MP(π) = sup_{π∈Σ} MP(π) [26]. Furthermore, an optimal positional strategy and the mean-payoff it achieves can be computed in polynomial time [15, 26].
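The mean-payoff of a fixed positional strategy can also be estimated empirically by averaging rewards along a single long run. The exact polynomial-time algorithms of [15, 26] work differently (e.g. via linear programming); the Monte-Carlo sketch below, on a hypothetical two-state MDP, is only an illustration:

```python
import random

# A hypothetical two-state MDP used only for illustration:
# transition[(s, a)] lists (successor, probability) pairs, and
# reward[(s, a, s')] is the reward of that transition.
transition = {
    (0, 'a'): [(0, 0.5), (1, 0.5)],
    (0, 'b'): [(1, 1.0)],
    (1, 'a'): [(0, 1.0)],
    (1, 'b'): [(1, 1.0)],
}
reward = {(0, 'a', 0): 1.0, (0, 'a', 1): 0.0, (0, 'b', 1): 2.0,
          (1, 'a', 0): 0.0, (1, 'b', 1): 0.5}

def sample(pairs, rng):
    """Sample a successor from a list of (state, probability) pairs."""
    u, acc = rng.random(), 0.0
    for nxt, prob in pairs:
        acc += prob
        if u < acc:
            return nxt
    return pairs[-1][0]

def mean_payoff(policy, s0=0, horizon=200_000, seed=1):
    """Estimate MP(policy) for a positional strategy (a dict s -> a)
    by averaging the rewards along one long run."""
    rng, s, total = random.Random(seed), s0, 0.0
    for _ in range(horizon):
        a = policy[s]
        nxt = sample(transition[(s, a)], rng)
        total += reward[(s, a, nxt)]
        s = nxt
    return total / horizon
```

Under the positional strategy that always plays 'b', the run is absorbed in state 1 and the estimate converges to 0.5.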
Authors:
(1) Krishnendu Chatterjee, IST Austria, Austria ([email protected]);
(2) Amirali Ebrahimzadeh, Sharif University of Technology, Iran ([email protected]);
(3) Mehrdad Karrabi, IST Austria, Austria ([email protected]);
(4) Krzysztof Pietrzak, IST Austria, Austria ([email protected]);
(5) Michelle Yeo, National University of Singapore, Singapore ([email protected]);
(6) ΓorΔe Ε½ikeliΔ, Singapore Management University, Singapore ([email protected]).