3.3 Formal Analysis

Goal of the analysis. We now show how to compute the optimal expected relative revenue, together with an adversarial strategy achieving it, in the MDP M = (𝑆, 𝐴, 𝑃, 𝑠0), up to an arbitrary precision parameter 𝜖 > 0. Formally, for each strategy 𝜎 in M, let ERRev(𝜎) be the expected relative revenue under adversarial strategy 𝜎, i.e. the expected fraction of the blocks accepted on the main chain that belong to the adversary rather than to honest miners. Moreover, let

ERRev* = sup_𝜎 ERRev(𝜎)

be the optimal expected relative revenue that an adversarial strategy can attain. Given a precision parameter 𝜖 > 0, our goal is to compute a value within 𝜖 of ERRev*, together with a strategy 𝜎 such that ERRev(𝜎) ≥ ERRev* − 𝜖.

We do this by defining a class of reward functions in the MDP M and showing that, for any value of the precision parameter 𝜖 > 0, we can compute the above by solving the mean-payoff MDP with respect to a reward function belonging to this class. Our analysis draws insight from that of [27], which considered selfish mining in PoW blockchains and also reduced reasoning about expected relative revenue to solving mean-payoff MDPs with respect to suitably defined reward functions. However, in contrast to [27], we consider selfish mining in efficient proof systems in which the adversary can mine on multiple blocks, meaning that our design of reward functions and the analysis require additional care.

Reward function definition. The key challenge in designing the reward function is that the main chain, and the blocks on it, may change whenever the adversary publishes a private fork. Hence, we design the reward function to assign a positive (resp. negative) reward whenever a block owned by the adversary (resp. honest miners) is accepted at a depth strictly greater than 𝑑 in the main chain. Since the adversary only mines and publishes private forks that extend blocks up to depth 𝑑 in the main chain, blocks beyond depth 𝑑 are guaranteed to remain on the main chain, so a reward assigned for such a block never has to be revoked.

Formally, for each 𝛽 ∈ [0, 1], we define 𝑟𝛽 : 𝑆 × 𝐴 × 𝑆 → ℝ to be a reward function in M which to each state-action-state triple (𝑠, 𝑎, 𝑠′) assigns the reward:

• 1−𝛽, for each block belonging to the adversary accepted at depth greater than 𝑑 as a result of performing the action;

• −𝛽, for each block belonging to honest miners accepted at depth greater than 𝑑 as a result of performing the action.
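The role of 𝛽 can be illustrated with a minimal Python sketch. The function names (`reward`, `total_reward`, `relative_revenue`) and the example trace are hypothetical and chosen only for illustration. Each trace entry is assumed to record how many adversary and honest blocks were accepted beyond depth 𝑑 in one transition. Under these assumptions, the accumulated reward is zero exactly when 𝛽 equals the adversary's fraction of accepted blocks, positive for smaller 𝛽, and negative for larger 𝛽.

```python
from fractions import Fraction

def reward(beta, adversary_blocks, honest_blocks):
    """Reward of one transition: (1 - beta) per adversary block and
    -beta per honest block accepted beyond depth d (hypothetical helper)."""
    if not 0 <= beta <= 1:
        raise ValueError("beta must lie in [0, 1]")
    return (1 - beta) * adversary_blocks - beta * honest_blocks

def total_reward(beta, trace):
    """Sum of rewards over a trace of (adversary_blocks, honest_blocks) pairs."""
    return sum(reward(beta, a, h) for a, h in trace)

def relative_revenue(trace):
    """Fraction of accepted blocks in the trace that belong to the adversary."""
    adv = sum(a for a, _ in trace)
    hon = sum(h for _, h in trace)
    return Fraction(adv, adv + hon)

# Assumed example trace: 3 adversary blocks and 4 honest blocks in total.
trace = [(1, 0), (0, 2), (2, 1), (0, 1)]
share = relative_revenue(trace)

print(share)                                # adversary's share of accepted blocks
print(total_reward(share, trace))           # reward balances out at beta = share
print(total_reward(Fraction(1, 4), trace))  # beta below the share
print(total_reward(Fraction(1, 2), trace))  # beta above the share
```

Exact rational arithmetic (`Fraction`) is used here so that the balance point can be checked without floating-point error.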

Formal analysis. Our formal analysis is based on the following theorem. For clarity of exposition, we defer the proof of the theorem to Appendix C. For every 𝜖 > 0, the theorem shows how to relate the optimal expected relative revenue in the MDP and 𝜖-optimal strategies to the optimal mean-payoff and 𝜖-optimal strategies under the reward function 𝑟𝛽 for a suitably chosen value of 𝛽.
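To give some intuition for how a suitable 𝛽 might be located, here is a hedged Python sketch that bisects on the sign of the optimal mean payoff. The bisection is an assumption made for illustration and is not necessarily the procedure the theorem establishes. The mean-payoff MDP solver is also replaced by a toy stand-in: each candidate strategy is summarized by assumed long-run rates (adversary blocks per step, honest blocks per step), so its mean payoff under 𝑟𝛽 is (1 − 𝛽)·a − 𝛽·h. The names `optimal_mean_payoff`, `bisect_beta` and `rates` are hypothetical.

```python
def optimal_mean_payoff(beta, strategy_rates):
    """Toy stand-in for a mean-payoff MDP solver: best mean payoff over a
    finite set of strategies given as (adversary_rate, honest_rate) pairs."""
    return max((1 - beta) * a - beta * h for a, h in strategy_rates)

def bisect_beta(strategy_rates, eps):
    """Largest beta (up to eps) at which the optimal mean payoff is still
    non-negative. Assumes every strategy has a positive honest rate, so the
    payoff is non-negative at beta = 0 and negative at beta = 1."""
    lo, hi = 0.0, 1.0
    while hi - lo > eps:
        mid = (lo + hi) / 2
        if optimal_mean_payoff(mid, strategy_rates) >= 0:
            lo = mid   # some strategy still breaks even: beta can be raised
        else:
            hi = mid   # every strategy loses: beta is too large
    return lo

# Assumed rates for three hypothetical strategies; their adversary shares
# a / (a + h) are 0.30, 1/3 and 0.20 respectively.
rates = [(0.30, 0.70), (0.25, 0.50), (0.10, 0.40)]
beta_star = bisect_beta(rates, eps=1e-6)
print(beta_star)
```

In this toy setting the returned value lies within `eps` below the best adversary share among the listed strategies, because the optimal mean payoff is non-increasing in 𝛽 and changes sign at that share.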

Authors:

(1) Krishnendu Chatterjee, IST Austria, Austria;

(2) Amirali Ebrahimzadeh, Sharif University of Technology, Iran;

(3) Mehrdad Karrabi, IST Austria, Austria;

(4) Krzysztof Pietrzak, IST Austria, Austria;

(5) Michelle Yeo, National University of Singapore, Singapore;

(6) Ðorđe Žikelić, Singapore Management University, Singapore.

This paper is available on arXiv under the CC BY 4.0 DEED license.

The Math Behind Selfish Mining and Reward Allocation
