

3. Proposed Rank-Based Game

3.1 Rank-Based Game

3.1.1 Two Player Rank-Based Game with Complete Information

In this section, we describe the simplest case of the rank-based game. There are two players in this game, denoted by N1 and N2. Each player needs its opponent to exchange a certain number of encoded blocks. For each player i, the cost of generating a new encoded block and uploading it to its opponent is ci. The gain of receiving new encoded blocks from its opponent is evaluated by the rewarded rank, which has three components: the unpolluted probability Pji, the coefficient of expected rank from a specific opponent rji, and the typical coefficient of expected rank at rank k, Cik, where k is the rank number of player i’s independently encoded blocks. The unpolluted probability Pji is the probability of receiving an unpolluted block from player j. The coefficient of expected rank rji is the expected rank reward when player i receives a new encoded block from player j. The coefficient of expected rank at rank k, Cik, is the expected rank income when player i receives a randomly chosen encoded block. Together, these coefficients of gain measure the expected rank reward of an incoming encoded block. Let Bi be the total number of blocks that player i will offer to exchange with others. The strategy aij means that player i offers aij encoded blocks to player j. The utility is calculated by the following formula.

(10)

where ai is the strategy set of player i, denoted by ai=(ai1,ai2), aii is the number of exchanged blocks that are stored but not used, and δi is the reward coefficient for storing an exchanged block that is not used. We assume that δi must satisfy

(11)


where L is the minimum utility over all possible strategies whose utility is larger than zero.

Now, we will start to analyze the two-player complete information rank-based game. For player 1, its utility function is shown in (12)

(12)

For player 2, its utility function is shown in (13)

(13)

We can therefore plot both utilities on two coordinate axes, as in figure 3.1.1-1.

Figure 3.1.1-1: The coordinate of utility

In figure 3.1.1-1, the vertical axis denotes player 2’s utility and the horizontal axis denotes player 1’s utility. The possible strategy pairs lie inside the convex hull of {(0, 0), (-c1*B1, P21*C2k*B1), (P12*C1k*B2 - c1*B1, P21*C2k*B1 - c2*B2), (P12*C1k*B2, -c2*B2)}. However, each player wants its own utility to be positive, so the acceptable strategy pairs lie inside the gray area in figure 3.1.1-1.
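The utility region described above can be explored numerically. The following is a minimal sketch, not the formula (10) of this work: it assumes linear utilities without the storage term, where the hypothetical names g1 and g2 stand for each player’s gain per received block and c1 and c2 for the cost per uploaded block. Under that assumption, the utility pairs of the four extreme offers are the corners of the region, and a pair of offers is acceptable only when both utilities are positive. The coefficients are those of example 1 in Table 3.1.2-1, used purely as an illustration.

```python
def utilities(a12, a21, g1, c1, g2, c2):
    """Utility pair (u1, u2) for offers a12 (player 1 to 2) and a21 (player 2 to 1).

    Assumption: linear utilities, storage term omitted.
    g1, g2: gain per received block; c1, c2: cost per uploaded block.
    """
    u1 = g1 * a21 - c1 * a12
    u2 = g2 * a12 - c2 * a21
    return u1, u2


def corner_points(B1, B2, g1, c1, g2, c2):
    """Utility pairs at the four extreme offers; their convex hull bounds the region."""
    return [utilities(a12, a21, g1, c1, g2, c2)
            for a12 in (0, B1) for a21 in (0, B2)]


def is_acceptable(a12, a21, g1, c1, g2, c2):
    """True when both players end up with strictly positive utility."""
    u1, u2 = utilities(a12, a21, g1, c1, g2, c2)
    return u1 > 0 and u2 > 0


# Illustrative coefficients taken from example 1 (Table 3.1.2-1).
coeffs = dict(g1=0.7, c1=0.28, g2=0.52, c2=0.29)
corners = corner_points(1000, 1000, **coeffs)
```

Because the assumed utilities are linear in the offers, the image of the offer rectangle is a parallelogram, which is why four corner points are enough to describe it.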

As mentioned above, there are many possible strategy sets, but not all of them are equally good. Next we show how to select the better strategy set and find the Nash equilibrium. In our study, we refine the strategy sets with the optimality criterion of proportional fairness, which reduces them to a unique point that we call the Nash equilibrium.

According to [14], [15] and [16], the optimality criterion that maximizes the product of both utilities is the solution of the bargaining game. The solution is a determination of how much it should be worth to each of these individuals to have this opportunity to bargain [15]. To qualify as a bargaining game, the game must have the following three properties:

1.

2.

3. U is convex, bounded and closed

where U denotes the set of attainable utility pairs, ui denotes the utility of player i, d is the pair of utilities that the players receive if they fail to reach an agreement, and di is the utility that player i receives in that case. In figure 3.1.1-1, it is clear that our proposed rank-based game satisfies the first and second properties of the bargaining game with d = (0,0). The set of utility pairs in figure 3.1.1-1 is also convex, bounded and closed, so our proposed rank-based game satisfies the third property as well.


Accordingly, our proposed rank-based game is a bargaining game. The selected strategy set is proportionally fair if u12 (a1, a2)*u21 (a1, a2) is maximized. The Nash equilibrium is derived as follows.

(14)

where ai* is the strategy set of player i at the Nash equilibrium. In the complete information game, because we know the opponent’s private information such as Pji, rji, and ci, we can calculate the Nash equilibrium immediately. However, a player may not readily reveal its private information. In the next section, we introduce how to estimate the opponent’s private information in the incomplete information game.
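As an illustration of the proportional fairness criterion, the sketch below searches a coarse grid of offers for the pair that maximizes the product of both utilities. It is a brute-force stand-in, not equation (14): it assumes linear utilities without the storage term, with the hypothetical names g1, g2 for the gain per received block and c1, c2 for the cost per uploaded block, and it only considers offers that leave both players with positive utility.

```python
def best_offer_pair(g1, c1, g2, c2, B1, B2, step=50):
    """Grid search for the offers (a12, a21) that maximize u1 * u2.

    Assumption: u1 = g1*a21 - c1*a12 and u2 = g2*a12 - c2*a21.
    Returns (product, a12, a21), or None if no offer gives both players
    a positive utility.
    """
    best = None
    for a12 in range(0, B1 + 1, step):
        for a21 in range(0, B2 + 1, step):
            u1 = g1 * a21 - c1 * a12
            u2 = g2 * a12 - c2 * a21
            if u1 <= 0 or u2 <= 0:
                continue  # outside the acceptable (gray) region
            product = u1 * u2
            if best is None or product > best[0]:
                best = (product, a12, a21)
    return best


# Coefficients of example 1 and example 2 (Tables 3.1.2-1 and 3.1.2-3).
example1 = best_offer_pair(0.7, 0.28, 0.52, 0.29, 1000, 1000)
example2 = best_offer_pair(0.5, 0.16, 0.83, 0.14, 1000, 1000)
```

For both sets of coefficients the search selects a12 = a21 = 1000, the point at which both negotiation tables of Section 3.1.2 end.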

3.1.2 Two Player Rank-Based Game with Incomplete Information

In this section, we introduce the algorithm for estimating the opponent’s private information. In the incomplete information game, a player knows only its own private information and not that of its opponent. Before introducing the estimation algorithm, we must describe the negotiation algorithm with incomplete information, that is, how a player responds with a strategy when it receives the opponent’s strategy.

According to the optimality criterion of proportional fairness, the product of both utilities is shown in (15). We will describe the impact in chapter 3.3. To find the maximum of the product, we take the partial derivative of (15) with respect to a12 and a21 separately and set each to zero as follows.

With variable a12:


With variable a21:

(18)

Then

(19)

According to (17) and (19), we obtain the reaction functions of player 1 and player 2 as follows.

For player 1:

(20)

where a21,t-1 is the strategy with which player 2 responded to player 1 at time t-1, and a1,t is the strategy set that player 1 calculates at time t according to a21,t-1.

For player 2:

(21)


where a12,t-1 is the strategy with which player 1 responded to player 2 at time t-1, and a2,t is the strategy set that player 2 calculates at time t according to a12,t-1.
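The alternating use of reaction functions can be sketched in code. The closed forms below are not equations (20) and (21); they are derived here under the assumption of linear utilities u1 = g1*a21 - c1*a12 and u2 = g2*a12 - c2*a21 with the storage term omitted, where g1, g2, c1, c2 are hypothetical names for the gain per received block and the cost per uploaded block. Setting the partial derivatives of u1*u2 to zero gives a12 = a21*(g1*g2 + c1*c2)/(2*c1*g2) and a21 = a12*(g1*g2 + c1*c2)/(2*c2*g1). Each offer is additionally capped at the total number of blocks Bi, and both players are assumed to know all coefficients.

```python
def reaction_gains(g1, c1, g2, c2):
    """Slopes of the two best-response lines under the assumed linear utilities."""
    common = g1 * g2 + c1 * c2
    k1 = common / (2 * c1 * g2)  # player 1: a12 = k1 * a21
    k2 = common / (2 * c2 * g1)  # player 2: a21 = k2 * a12
    return k1, k2


def negotiate(g1, c1, g2, c2, B1, B2, a21_start=1.0, max_rounds=1000):
    """Alternate best responses until the offers stop changing.

    In each round player 1 answers the previous a21, then player 2 answers
    the new a12. Returns the list of (a12, a21) pairs, one per round.
    """
    k1, k2 = reaction_gains(g1, c1, g2, c2)
    a21 = a21_start
    history = []
    for _ in range(max_rounds):
        a12 = min(float(B1), k1 * a21)  # capped at player 1's total blocks
        a21 = min(float(B2), k2 * a12)  # capped at player 2's total blocks
        if history and (a12, a21) == history[-1]:
            break  # fixed point reached
        history.append((a12, a21))
    return history


# Coefficients of example 1 (Table 3.1.2-1), with a21,0 = 1.
rounds = negotiate(0.7, 0.28, 0.52, 0.29, 1000, 1000)
```

With these coefficients the sketch reaches the cap of 1000 blocks for both players at round 14. Its early offers differ from those of the incomplete information negotiation in Table 3.1.2-2, because that negotiation must also estimate the unknown coefficients.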

Now, let us introduce the algorithm for estimating private information. We can rewrite (20) and (21) in terms of the unknown information as follows.

(22)

where estimationi,t is the private information estimated at time t.

Consider the situation in which player 1 sends the strategy set a1,t-1=(0,a12,t-1) to player 2, and player 2 then responds with the strategy set a2,t-1=(a21,t-1,0) to player 1.

Player 1 can then compute estimation1,t as follows.

(23)

How accurately and rapidly the private information can be estimated will be verified in chapter 3.3.3.

Now, let us compare the two-player complete information game with the two-player incomplete information game. The following two examples show that the Nash equilibrium obtained in the incomplete information game is the same as in the complete information game. Without loss of generality, we assume that player 1 starts the estimation algorithm and that a12,0 = a21,0 = 1.


Table 3.1.2-1: The coefficients of the two-player game: example 1

            Pji*Cik*rji       ci      Pji*Cik*rji/ci   Bi
            j=1     j=2
player 1    0       0.7       0.28    2.5              1000
player 2    0.52    0         0.29    1.793103448      1000

According to (14), the Nash equilibrium of the two-player complete information game is (a12, a21) = (1000, 1000).

Table 3.1.2-2: The negotiation process of the two-player incomplete information game: example 1

Negotiation   t=1     t=2      t=3      t=4      t=5      t=6      t=7
a12,t         1.45    2.57     4.32     7.23     12.13    20.33    34.09
a21,t         1.7     2.82     4.73     7.93     13.3     22.3     37.38

Negotiation   t=8     t=9      t=10     t=11     t=12     t=13     t=14
a12,t         57.15   95.81    160.62   269.27   451.42   756.79   1000
a21,t         62.67   105.06   176.13   295.27   495.01   829.86   1000

In table 3.1.2-2, we obtain the Nash equilibrium, (a12, a21) = (1000, 1000), at t=14.

Table 3.1.2-3: The coefficients of the two-player game: example 2

            Pji*Cik*rji       ci      Pji*Cik*rji/ci   Bi
            j=1     j=2
player 1    0       0.5       0.16    3.125            1000
player 2    0.83    0         0.14    5.928571429      1000

According to (14), the Nash equilibrium of the two-player complete information game is (a12, a21) = (1000, 1000).

Table 3.1.2-4: The negotiation process of the two-player incomplete information game: example 2

Negotiation   t=1     t=2      t=3      t=4      t=5
a12,t         1.72    8.66     44.55    229.23   1000
a21,t         5.25    27.05    139.19   716.18   1000

In table 3.1.2-4, we obtain the Nash equilibrium, (a12, a21) = (1000, 1000), at t=5.

According to the above examples, in the two-player game, negotiation with incomplete information reaches a unique Nash equilibrium, and this equilibrium is the same as that of the complete information game.

3.1.3 Multi Player Rank-Based Game with Complete Information

In this section, we describe the multi-player rank-based game. There are m players in this game, denoted by (N1, N2,…, Nm). Each player needs its opponents to exchange a certain number of encoded blocks at the next exchange stage. For each player i, the cost of generating a new encoded block and uploading it to an opponent is ci. The gain of receiving new encoded blocks from an opponent is evaluated by the rewarded rank, which has three components: the unpolluted probability Pji, the coefficient of expected rank of a specific opponent rji, and the typical coefficient of expected rank at rank k, Cik, where k is the rank number of player i’s independently encoded blocks. The unpolluted probability Pji is the probability of receiving an unpolluted block from player j. The expected rank coefficient rji is the expected rank reward when player i receives a new encoded block from player j.

Let Bi be the total number of blocks that player i will offer to exchange with others, and let Bij be the number of blocks that player i will offer to exchange with player j. The strategy aij means that player i offers aij encoded blocks to player j at the next exchange stage. The coefficient of gain measures the expected rank reward of an incoming encoded block. Because the multi-player game is built on the two-player game, the utility in the multi-player game is equal to the sum of the utilities obtained from each player.

The utility function of player i in multi-player game can be calculated as follows.

(24)

where ai is the strategy set of player i, denoted by ai=(ai1,ai2,…, aim), aii is the number of exchanged blocks that are stored but not used, and δi is the reward coefficient for storing an exchanged block that is not used. The definition of δi is the same as in the two-player game.

Because of the high dimensionality, it is too difficult to plot all utilities in a coordinate system like figure 3.1.1-1. From figure 3.1.1-1, we can infer that the multi-player game, like the two-player game, has many possible strategies.

In the multi-player game, we also need an optimality criterion to refine the possible strategy sets. The optimality criterion of proportional fairness selects the strategy set that maximizes the product of all utilities as follows.

(25)

In the two-player game, this optimality criterion reduces the strategy sets to a unique point, and the Nash equilibrium is easily obtained by (14). In the multi-player game, the criterion also reduces them to a unique point, but finding it by full search requires prohibitive computing time. To avoid this, we propose a method for obtaining a suboptimal Nash equilibrium of the multi-player game: the proportional distribution strategy.

In the proportional distribution strategy, a player distributes its upload bandwidth among its opponents according to the potential contribution of each opponent. We now define how to calculate the potential contribution. Intuitively, the potential contribution is the ratio of the reward of a received block to the cost of an uploaded block. In the personal reaction function described in chapter 3.1.2, a higher ratio means that the player is willing to offer a better strategy to a specific opponent, and also that this opponent may give better resources to the player. Accordingly, we define the potential contribution as (26).

Let us take player i as an example.


(26)

where vij is the value of potential contribution for player j. The distribution of upload bandwidth can be calculated as follows.

(27)

With the upload bandwidth distributed in this way, player i plays a separate two-player game with each opponent.
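A minimal sketch of the proportional distribution idea follows. It assumes that the potential contribution is the plain ratio of the gain per received block to the cost per uploaded block, and that the upload bandwidth is split in proportion to these values; these are assumptions consistent with the description above, not reproductions of (26) and (27). The function names and the gain values in the example are hypothetical.

```python
def potential_contributions(gains, cost):
    """Ratio of the gain per received block to the cost per uploaded block.

    gains: mapping opponent j -> assumed gain per block received from j.
    cost:  cost of player i per uploaded block (must be positive).
    """
    return {j: gain / cost for j, gain in gains.items()}


def distribute_bandwidth(total_blocks, contributions):
    """Split total_blocks among opponents in proportion to their contribution."""
    total = sum(contributions.values())
    if total == 0:
        return {j: 0.0 for j in contributions}
    return {j: total_blocks * v / total for j, v in contributions.items()}


# Hypothetical player with cost 0.2 per block and 30 blocks to distribute.
v = potential_contributions({2: 0.6, 3: 0.3, 4: 0.1}, cost=0.2)
shares = distribute_bandwidth(30, v)
```

In this example the opponents receive shares in the ratio 6:3:1, that is about 18, 9 and 3 blocks. A real implementation would still need a rounding rule, since block counts are integers.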

3.1.4 Multi Player Rank-Based Game with Incomplete Information

In this section, we introduce how to negotiate in the multi-player game with incomplete information.

The flow chart of proportional distribution strategy is shown in figure 3.1.4-1.


Figure 3.1.4-1: The flow charts of multi player incomplete information game

First, the player distributes its upload bandwidth evenly among its opponents. Next, the player exchanges offers once with each opponent and runs the algorithm for estimating private information. Then, according to the estimated private information, the player redistributes the upload bandwidth among its opponents. Finally, the player repeats these steps until its offers no longer change. The resulting offer is the suboptimal Nash equilibrium.

3.1.5 Rank-Based Game with Malicious Players and Cheating Players

Because we assume that the players are selfish, we believe a player may cheat if it can gain more utility through cheating behavior.

Let us describe the cheating behavior. We assume the player 2 will perform the respond the high offer to player 2. According to above mentioned, we know that the cheating behavior is effective.

We classify cheating behavior into two categories: knowledgeable and unknowledgeable. In unknowledgeable cheating, the cheating player only knows that responding with a lower offer is better. In knowledgeable cheating, the cheating player knows that responding with an offer computed from lower private information is better. To reduce the impact of cheating behavior, we propose two methods to detect it.

First, according to the proof of the algorithm for estimating private information in chapter 3.3.3, each estimated value must be closer to the real value than the previous estimate and must never cross the real value. An example follows.

Table 3.1.5-1: The normal situation of the estimation process

            real private information
player 1    0.453333
player 2    0.205128

Number of estimating   t=1        2          3          4
estimation1,t          0.184445   0.204948   0.205127   0.205128
estimation2,t          0.457624   0.45337    0.453334   0.453333

An unknowledgeable cheating player responds with a lower offer, obtained by multiplying the original offer by a parameter p ∈ [0.5, 1). An example follows, in which we assume that player 2 is the unknowledgeable cheating player.

Table 3.1.5-2: The cheating situation of the estimation process

            real private information
player 1    0.390805
player 2    0.159574

Number of estimating   t=1        2          3          4
estimation1,t          0.219755   0.166896   0.230669   0.197642
estimation2,t          0.381825   0.38969    0.38024    0.385076

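In table 3.1.5-2, player 1’s estimate estimation1,t decreases from t = 1 to t = 2 (0.219755 to 0.166896) but increases from t = 2 to t = 3 (0.166896 to 0.230669), so it no longer approaches the real value monotonically. Player 1 can therefore detect the cheating behavior of player 2.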
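The first detection method can be sketched as a check on the sequence of estimates. Since the observing player does not know the real value, this sketch uses a reversal of direction as an observable proxy for "approaches the real value and never crosses it". The function name is hypothetical, and the data are the estimation1,t rows of Tables 3.1.5-1 and 3.1.5-2.

```python
def approaches_monotonically(estimates):
    """True if successive estimates never reverse direction.

    A sequence that converges from one side, as a normal estimation process
    should, only rises or only falls. A reversal is treated as a sign of
    cheating.
    """
    steps = [b - a for a, b in zip(estimates, estimates[1:])]
    steps = [s for s in steps if s != 0]  # ignore repeated values
    return all(s > 0 for s in steps) or all(s < 0 for s in steps)


normal = [0.184445, 0.204948, 0.205127, 0.205128]    # Table 3.1.5-1
cheating = [0.219755, 0.166896, 0.230669, 0.197642]  # Table 3.1.5-2

normal_ok = approaches_monotonically(normal)
cheating_ok = approaches_monotonically(cheating)
```

The normal sequence passes the check, while the sequence observed against the cheating player fails it at the step from t = 2 to t = 3.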

Another detection method is as follows. Every possible strategy set must satisfy (28).

(28)

Proof of (28):

And ∵

However, neither of the above methods can detect all cheating behaviors. A knowledgeable cheating player can respond with an offer calculated from the product of private information and the reciprocal of private information. This cheating behavior is undetectable by the methods mentioned above.

As analyzed above, in the two-player game some cheating behaviors are undetectable. In the multi-player game, however, the impact of cheating may be reduced. According to the above analysis, the knowledgeable cheating player responds with a lower offer computed from the product of a cheating parameter and its private information. A normal player then estimates a fake private information value equal to the product of the cheater’s cheating parameter and the cheater’s private information. This estimated private information is always smaller than the real one. So, to make the potential contribution account for cheating behavior, we modify (26) as follows,

(29-1)

(29-2)

where m denotes all the players that player i can negotiate with.

We use (29) to estimate the potential contribution of player j. As mentioned above, the potential contribution depends on the ratio of the reward of a received block to the cost of an uploaded block, and on the opponent’s private information. The difference between (26) and (29) is that (29) takes the opponent’s private information into account. We use the estimated private information as the weight in a weighted sum because we believe that considering the advantage of both players leads to better utility. In (29-1), the opponent’s private information is considered because the utility of the game also depends on it. In (29-2), we normalize all the estimated private information and take the weighted sum with the ratio of the reward of a received block to the cost of an uploaded block. If an opponent cheats, its estimated private information decreases, because effective cheating responds with fake private information that is lower than the real one. The potential contribution of a cheating player therefore decreases when it responds with a lower offer based on lower private information, and a lower potential contribution leads to a smaller share of bandwidth in the proportional distribution strategy.

We can detect an unknowledgeable cheating player with the above detection methods and use (29) to reduce the impact of cheating behavior. The potential contribution evaluated by (26) does not consider the impact of cheating behavior. As analyzed above, the potential contribution evaluated by (29) reduces the impact not only of cheating behaviors but also of malicious behaviors.

In the following, the two algorithms mentioned above are compared. The simulations have four players in the game and show full search and the proportional distribution strategy. We assume that all players are normal players, so both the unpolluted probability and the expected rank coefficient are one. The cost coefficients and total upload bandwidths are shown in table 3.1.5-3. In this program, one section is encoded into 40 encoded blocks. Each player initially has 15 encoded blocks in situation A and 10 blocks in situation B. In figure 3.1.5-1, we can observe that the full search algorithm obtains the highest product of estimated utility and real utility. The proportional distribution strategy gives a similar result in this program.

Table 3.1.5-3: The initial coefficients of the 4-player game

player i    ci      Bi
Player 1    0.29    30
Player 2    0.15    35
Player 3    0.3     30
Player 4    0.11    35


Table 3.1.5-4: The result of 4-player game

full_search j=1 j=2 j=3 j=4 utility product rank_income_A rank_income_B

i=1 0 16 6 8 25.3

proportional-(29-1) j=1 j=2 j=3 j=4 utility product rank_income_A rank_income_B

i=1 0 10 7 13 18.3

proportional-(29-2) j=1 j=2 j=3 j=4 utility product rank_income_A rank_income_B

i=1 0 11 5 14 14.3

Figure 3.1.5-1: The product of utility with different algorithm

3.2 Proposed System Architecture

In this section, the proposed architecture is described in detail. The proposed architecture and the flow chart of a peer are shown in figure 3.2-1. Each peer in our proposed architecture goes through three stages: the multi-player rank-based game, the exchange stage and the update stage. Each peer in the peer-to-peer system is regarded as a player in a game.

First, the peers play the multi-player rank-based game. A player negotiates with the other players until all players in the game accept their strategies. When all players have accepted their strategies, they enter the exchange stage, in which each player exchanges its encoded blocks with the others according to the accepted strategy. When the exchange stage is finished, the update stage begins, in which the players exchange the reputation scores they obtained by observing the exchange stage.

Figure 3.2-1: The proposed architecture
