(1)

Game Theory at Grace Baptist Church

Spring 2006

Professor Hsueh-I Lu (呂學一)

National Taiwan University

(2)

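Important rules (you will be reminded every week)

– Except for plain water, no eating or drinking in the classroom. Violators immediately lose the right to audit or take the course.
– Pay attention to road safety, and do not rush when crossing the street. We do not take attendance, so being late or absent is fine; just never "race against time with your life."
– Whenever possible, park your bicycle on campus and walk over, so as not to clutter the surroundings of Grace Baptist Church (懷恩堂).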

(3)

Outline

Extensive games with imperfect information
– Terminology and definitions
– Mixed strategy versus behavioral strategy
– Nash equilibrium

(4)

Perfect information

[Figure: game tree with perfect information. Action labels: in, out; fight, acquiesce; attack, withdraw (at two nodes).]

(5)

Imperfect information

[Figure: game tree with imperfect information. Action labels: in, out; fight, acquiesce; attack, withdraw (at two nodes). Payoffs shown: (2, 2), (1, 3), (3, 1), (1, 1).]

(6)

An absent-minded driver

[Figure: the absent-minded driver's route, with two exits labeled L.]

(7)

A 1-player extensive game

[Figure: one-player game tree with action labels L, R at both decision nodes; one payoff shown: 0.]

(8)

Four components

An extensive game with perfect information has four components:
1. Players
2. Terminal histories

(9)

(General) extensive game

Besides the components of an extensive game with perfect information that allows chance moves, we still have to specify the following item for each player i of the game:
– The information partition $\mathcal{I}_i$ of the histories h with P(h) = i. Each member of the partition $\mathcal{I}_i$ is called an information set.

(10)

Imperfect information

[Figure: game tree with imperfect information. Action labels: in, out; fight, acquiesce; attack, withdraw (at two nodes).]

(11)

Google

The information partition of Google consists of the following two information sets:
– {∅};
– {(in, fight), (in, acquiesce)}.

(12)

Microsoft

The information partition of Microsoft has one singleton information set: {in}.

(13)

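Chance move

[Figure: game tree starting with a chance move (probabilities 0.7 and 0.3). Action labels: fight, acquiesce; attack, withdraw (at two nodes). Payoffs shown: (2, 2), (1, 3), (3, 1), (1, 1).]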

(14)

Simultaneous move?

Simultaneous moves can be "implemented" by non-singleton information sets.

(15)

Simultaneous moves

[Figure: game tree with a chance move (probabilities 0.7 and 0.3). Action labels: fight, acquiesce; attack, withdraw (at two nodes). Payoffs shown: (2, 2), (1, 3), (3, 1), (1, 1).]

(16)

Actions

– The actions available after all histories in the same information set should be the same.
– If $I_i$ is an information set of the i-th player, let $A(I_i)$ consist of the actions available to player i after each history in $I_i$.

(17)

Actions

[Figure: game tree with a chance move (probabilities 0.7 and 0.3). Action labels: fight, acquiesce; attack, withdraw (at two nodes). Payoffs shown: (2, 2), (1, 3), (3, 1), (1, 1).]

(18)

Strategies

– Pure strategies
– Mixed strategies
– Behavioral strategies

(19)

Pure strategies

A pure strategy of player i in an extensive game is a function that assigns to each information set $I_i \in \mathcal{I}_i$ an action in $A(I_i)$.

(20)

Pure strategies

[Figure: game tree with action labels in, out; fight, acquiesce; attack, withdraw (at two nodes).]

(21)

Pure strategies

Microsoft
– fight
– acquiesce
Google
– (in, attack)
– (in, withdraw)
– (out, attack)
– (out, withdraw)
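
The list above can be reproduced mechanically: a pure strategy picks one action per information set, so the pure strategies of a player are the Cartesian product of the action sets. The following is a minimal sketch; the information-set labels ("after in", "start", "after Microsoft moves") are illustrative names that do not appear on the slides.

```python
from itertools import product

# Information sets of each player and the actions available at each one.
# The string labels of the information sets are hypothetical names.
INFO_SETS = {
    "Microsoft": {"after in": ["fight", "acquiesce"]},
    "Google": {
        "start": ["in", "out"],
        "after Microsoft moves": ["attack", "withdraw"],
    },
}

def pure_strategies(player):
    """Return every pure strategy of `player` as a tuple with one action per information set."""
    action_lists = INFO_SETS[player].values()
    return list(product(*action_lists))

microsoft = pure_strategies("Microsoft")
google = pure_strategies("Google")
print(microsoft)
print(google)
```

Note that Google has four pure strategies even though the second choice is irrelevant after "out": a pure strategy must specify an action at every information set.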

(22)

Mixed strategies

A mixed strategy of a player in an extensive game is simply a probability distribution over the pure strategies of that player.

(23)

Mixed strategies

[Figure: game tree with action labels in, out; fight, acquiesce; attack, withdraw (at two nodes). Payoffs shown: (2, 2), (1, 3), (3, 1), (1, 1).]

(24)

Mixed strategy

Microsoft
– fight with probability p
– acquiesce with probability 1 – p.
Google
– (in, attack) with probability x
– (in, withdraw) with probability y
– (out, attack) with probability z
– (out, withdraw) with probability 1 – x – y – z.

(25)

Behavioral strategies

A behavioral strategy of player i in an extensive game is a collection of probability distributions, one for each $I_i \in \mathcal{I}_i$. The probability distribution for $I_i$ is over $A(I_i)$.

(26)

Behavioral strategies

[Figure: game tree with action labels in, out; fight, acquiesce; attack, withdraw (at two nodes).]

(27)

Behavioral strategies

Microsoft
Information set {in}:
– fight with probability p
– acquiesce with probability 1 – p.
Google
Information set {∅}:
– in with probability q
– out with probability 1 – q.
Information set {(in, fight), (in, acquiesce)}:
– attack with probability r
– withdraw with probability 1 – r.

(28)

"Mixed" vs. "Behavioral"

Mixed strategy
– Exactly one randomized decision, made at the beginning of the game.
Behavioral strategy
– One randomized decision at each history h with P(h) = i.

Question: Are these two types of randomized strategies fundamentally different?

(29)

Behavioral strategies

[Figure: game tree with action labels in, out; fight, acquiesce; attack, withdraw (at two nodes). Payoffs shown: (2, 2), (1, 3), (3, 1), (1, 1).]

(30)

Microsoft

Mixed strategy
– fight with probability p
– acquiesce with probability 1 – p.
Behavioral strategy
Information set {in}:
– fight with probability p
– acquiesce with probability 1 – p.

Are they different?

(31)

Google

Mixed strategy
– (in, attack) with probability x
– (in, withdraw) with probability y
– (out, attack) with probability z
– (out, withdraw) with probability 1 – x – y – z.
Behavioral strategy
Information set {∅}:
– in with probability q
– out with probability 1 – q.
Information set {(in, fight), (in, acquiesce)}:
– attack with probability r
– withdraw with probability 1 – r.

Are they different?

(32)

From behavioral to mixed:

Mixed strategy
– (in, attack) with probability x = q * r
– (in, withdraw) with probability y = q * (1 – r)
– (out, attack) with probability z = (1 – q) * r
– (out, withdraw) with probability 1 – x – y – z = (1 – q) * (1 – r)
Behavioral strategy
Information set {∅}:
– in with probability q
– out with probability 1 – q.
Information set {(in, fight), (in, acquiesce)}:
– attack with probability r
– withdraw with probability 1 – r.

(33)

From mixed to behavioral:

Mixed strategy
– (in, attack) with probability x
– (in, withdraw) with probability y
– (out, attack) with probability z
– (out, withdraw) with probability 1 – x – y – z
Behavioral strategy
Information set {∅}:
– in with probability q = x + y
– out with probability 1 – q = 1 – x – y
Information set {(in, fight), (in, acquiesce)}:
– attack with probability r = x / (x + y)
– withdraw with probability 1 – r = y / (x + y)
(If x + y = 0, the second information set is never reached and r may be chosen arbitrarily.)
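
The two conversions for Google can be checked with exact arithmetic. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, and the value 1/2 used for r when x + y = 0 is an arbitrary choice. It also shows that two different mixed strategies (differing only in z) map to the same behavioral strategy, because z affects nothing after "out".

```python
from fractions import Fraction as F

def behavioral_to_mixed(q, r):
    """Probabilities of (in, attack), (in, withdraw), (out, attack), (out, withdraw)."""
    return (q * r, q * (1 - r), (1 - q) * r, (1 - q) * (1 - r))

def mixed_to_behavioral(x, y, z):
    """Return (q, r). z is accepted for completeness but does not influence the result."""
    q = x + y
    r = x / q if q > 0 else F(1, 2)  # arbitrary when "in" is never played
    return (q, r)

# Round trip: behavioral -> mixed -> behavioral recovers (q, r).
x, y, z, w = behavioral_to_mixed(F(3, 5), F(1, 4))
round_trip = mixed_to_behavioral(x, y, z)

# Two mixed strategies that differ only in z yield the same behavioral strategy.
same_a = mixed_to_behavioral(F(1, 2), F(0), F(0))
same_b = mixed_to_behavioral(F(1, 2), F(0), F(1, 2))
print(round_trip, same_a, same_b)
```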

(34)

"Mixed" vs. "Behavioral"

Mixed strategy
– Exactly one randomized decision, made at the beginning of the game.
Behavioral strategy
– One randomized decision at each history h with P(h) = i.

Question: Are these two types of randomized strategies fundamentally different?

(35)

An absent-minded driver

[Figure: the absent-minded driver's route, with two exits labeled L.]

(36)

A 1-player extensive game

[Figure: one-player game tree with action labels L, R at both decision nodes; one payoff shown: 0.]

(37)

Mixed strategies

– L: always take the exit
– R: always ignore the exit

No mixed strategy can take him home. That is, the expected payoff of ANY mixed strategy for this one-player extensive game is 0.

[Figure: one-player game tree with action labels L, R at both decision nodes; one payoff shown: 0.]

(38)

Behavioral strategies

Throw a coin at each decision node. If the absent-minded driver uses the behavioral strategy (0.5, 0.5), the expected payoff is 0.25.

[Figure: one-player game tree with action labels L, R at both decision nodes and payoffs 0, 1, 0.]
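
The comparison between the two kinds of randomization can be worked out numerically. The sketch below assumes the payoffs read off the figure: exiting (L) at the first node gives 0, ignoring the first exit and taking the second gives 1, and ignoring both gives 0. The function names are hypothetical.

```python
def behavioral_payoff(p_exit):
    """Expected payoff when the driver plays L with probability p_exit at each decision node.

    Both nodes lie in one information set, so the same probability applies at each.
    """
    p_stay = 1.0 - p_exit
    exit_first = p_exit * 0.0             # leaves at the first exit
    exit_second = p_stay * p_exit * 1.0   # ignores the first exit, takes the second
    never_exit = p_stay * p_stay * 0.0    # ignores both exits
    return exit_first + exit_second + never_exit

def mixed_payoff(prob_L):
    """Expected payoff of a mixed strategy: pure L with prob_L, pure R otherwise."""
    payoff_L = behavioral_payoff(1.0)  # pure L always exits at the first exit
    payoff_R = behavioral_payoff(0.0)  # pure R never exits
    return prob_L * payoff_L + (1.0 - prob_L) * payoff_R

print(behavioral_payoff(0.5))
print([mixed_payoff(k / 4) for k in range(5)])
```

A mixed strategy commits to one pure strategy before the game starts, so the driver repeats the same action at both nodes; only a behavioral strategy can take different actions at the two nodes.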

(39)

Question:

Are behavioral strategies always better than mixed strategies?

(40)

Kuhn: Not necessarily…

[Figure: players labeled Carl and Alice.]

(41)

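Kuhn's game

A 2-player zero-sum extensive game

[Figure: game tree starting at a chance node (probabilities 1/2 and 1/2). Action labels: S, C (at two nodes); E, K (at two nodes). Payoffs shown: -2, 0, 0, 2, -1, 1.]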

(42)

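Pure strategies

Expected payoff to Alice + Bob (rows: pure strategies of Alice + Bob; columns: pure strategies of Carl):

             S       C
(S, K)       0     -0.5
(S, E)       0      0.5
(C, K)       0.5    0
(C, E)      -0.5    0

[Figure: Kuhn's game tree with players Carl and Alice + Bob. Chance node with probabilities 1/2 and 1/2; action labels S, C (at two nodes) and E, K (at two nodes); payoffs -1, 1, -2, 0, 0, 2.]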

(43)

A mixed strategy with positive expected payoff

             S       C
(S, E)       0      0.5
(C, K)       0.5    0

The mixed strategy ((S, E): 0.5; (C, K): 0.5) for Alice + Bob ensures that their expected payoff is at least 0.25, independent of the strategy of Carl.

[Figure: Kuhn's game tree with players Carl and Alice + Bob. Chance node with probabilities 1/2 and 1/2; action labels S, C (at two nodes) and E, K (at two nodes); payoffs -1, 1, -2, 0, 0, 2.]
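
The 0.25 guarantee can be recomputed from the pure-strategy payoff table. This is a small sketch with hypothetical names (`PAYOFF`, `guaranteed_payoff`); it checks only Carl's pure strategies, which suffices because the expected payoff is linear in Carl's own randomization.

```python
# Expected payoff to Alice + Bob, copied from the pure-strategy table.
# Outer key: pure strategy of Alice + Bob; inner key: pure strategy of Carl.
PAYOFF = {
    ("S", "K"): {"S": 0.0, "C": -0.5},
    ("S", "E"): {"S": 0.0, "C": 0.5},
    ("C", "K"): {"S": 0.5, "C": 0.0},
    ("C", "E"): {"S": -0.5, "C": 0.0},
}

def guaranteed_payoff(mixed):
    """Worst-case expected payoff of a mixed strategy over Carl's pure strategies."""
    return min(
        sum(prob * PAYOFF[s][carl] for s, prob in mixed.items())
        for carl in ("S", "C")
    )

half_half = {("S", "E"): 0.5, ("C", "K"): 0.5}
print(guaranteed_payoff(half_half))
print({s: guaranteed_payoff({s: 1.0}) for s in PAYOFF})
```

No single pure strategy guarantees more than 0; the positive guarantee comes from correlating the two choices, playing (S, E) or (C, K) as a pair.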

(44)

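Behavioral strategies

Let p (respectively, q) be the probability that Bob (respectively, Alice) chooses S (respectively, K). The expected payoff is

$$\begin{cases} 0.5\,\bigl(p + (1-p)\,2q\bigr) - 0.5 & \text{if Carl chooses } S, \\ 0.5\,\bigl(p + (1-p)\,2q\bigr) - q & \text{if Carl chooses } C, \end{cases}$$

i.e.,

$$\begin{cases} (1-p)(q-0.5) & \text{if Carl chooses } S, \\ p\,(0.5-q) & \text{if Carl chooses } C. \end{cases}$$

[Figure: Kuhn's game tree. Chance node with probabilities 1/2 and 1/2; action labels S, C (at two nodes) and E, K (at two nodes); payoffs -1, 1, -2, 0, 0, 2.]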

(45)

Therefore, Alice and Bob cannot guarantee a positive expected payoff using any behavioral strategy.

In other words, no matter what p and q (the choices of Bob and Alice) are, Carl has a strategy to ensure a non-positive expected payoff for Alice and Bob: if q ≥ 0.5, Carl chooses C, so the payoff is p(0.5 – q) ≤ 0; if q < 0.5, Carl chooses S, so the payoff is (1 – p)(q – 0.5) ≤ 0.
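
The claim can be sanity-checked on a grid of behavioral strategies. The sketch below uses the two simplified payoff expressions, (1 – p)(q – 0.5) and p(0.5 – q); the function name and the grid resolution are arbitrary choices, and a grid check illustrates the claim rather than proving it.

```python
def worst_case(p, q):
    """Payoff to Alice + Bob when Carl best-responds to the behavioral strategy (p, q)."""
    if_carl_s = (1 - p) * (q - 0.5)
    if_carl_c = p * (0.5 - q)
    return min(if_carl_s, if_carl_c)

grid = [k / 20 for k in range(21)]  # 0.0, 0.05, ..., 1.0
best_guarantee = max(worst_case(p, q) for p in grid for q in grid)
print(best_guarantee)
```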

(46)

Extensive games with perfect recall

(47)

We will show…

In an extensive game with perfect recall,
– any mixed strategy has an equivalent behavioral strategy, and
– any behavioral strategy has an equivalent mixed strategy.

(48)

Perfect recall

An extensive game has perfect recall if each player always remembers what she knew in the past.

(49)

Notation: $X_i(h)$

If h is a history of an extensive game, then let $X_i(h)$ denote the record of player i's experience. More precisely, $X_i(h)$ consists of
– the sequence of information sets encountered by h, and

(50)

Illustration

[Figure: a history h with three nodes labeled i.]

(51)

Comments

– $X_i(h)$ includes the encountered information sets of all players.
– $X_i(h)$ does NOT include the actions taken

(52)

Definition

An extensive game has perfect recall if the following condition holds for each player i:

If histories $h$ and $\hat{h}$ are in the same information set of player i, then $X_i(h) = X_i(\hat{h})$ holds.

Comment: Therefore, each extensive game with perfect information has perfect recall.

(53)

Perfect recall?

[Figure: game tree with a chance node (probabilities 1/2 and 1/2) and nodes labeled Player 1, Player 1, Player 2.]

(54)

Perfect recall?

[Figure: game tree with a chance node (probabilities 1/2 and 1/2) and nodes labeled Player 1, Player 2, Player 3.]

(55)

Perfect recall?

[Figure: game tree with a chance node (probabilities 1/2 and 1/2) and nodes labeled Player 1, Player 1, Player 2.]

(56)

Perfect recall?

(57)

Perfect recall?

[Figure: game tree with node labels 1, 1, 1, c.]

(58)

Perfect recall?

[Figure: game tree with node labels 1, 1.]

(59)

Theorem 1

The following statements hold for any extensive game with perfect recall.
– (a) For any behavioral strategy $\beta_i$ of the i-th player, there is an outcome-equivalent mixed strategy $\sigma_i$ of the i-th player.
– (b) For any mixed strategy $\sigma_i$ of the i-th player, there is an outcome-equivalent behavioral strategy $\beta_i$ of the i-th player.

(60)

Outcome

For any profile $\sigma = (\sigma_i)_{i \in N}$ of either mixed or behavioral strategies in an extensive game, the outcome of $\sigma$ is the probability distribution over the terminal histories that results when each player i follows the precepts of $\sigma_i$.

(61)

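[Figure: game tree with player labels 1, 2 and action labels x, y, ℓ, r.]

$\sigma = (x : \tfrac{1}{3},\; y : \tfrac{2}{3},\; \ell : \tfrac{1}{3},\; r : \tfrac{2}{3})$

Outcome of $\sigma$ is $(\tfrac{1}{9}, \tfrac{2}{9}, \tfrac{2}{9}, \tfrac{4}{9})$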
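
The outcome in this example is the product of the probabilities along each terminal history. The sketch below assumes, since the tree itself is not recoverable here, that each terminal history consists of one choice between x and y followed by one independent choice between ℓ and r; the variable names are hypothetical and ℓ is written as "l".

```python
from fractions import Fraction
from itertools import product

# Assumed structure: one choice in {x, y}, then one choice in {l, r}.
first_move = {"x": Fraction(1, 3), "y": Fraction(2, 3)}
second_move = {"l": Fraction(1, 3), "r": Fraction(2, 3)}

# Probability of each terminal history = product of the probabilities along it.
outcome = {
    (a1, a2): p1 * p2
    for (a1, p1), (a2, p2) in product(first_move.items(), second_move.items())
}
print(outcome)
```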

(62)

Outcome-equivalent

Two strategies (mixed or behavioral) of a player are outcome-equivalent if, for every collection of pure strategies of the other players, the two strategies induce the same outcome.

(63)

Two outcome-equivalent strategies of player 1

[Figure: game tree with player labels 1, 1, 2 and action labels x, y, a, b, ℓ, r.]

Behavioral: (x : 0.5; y : 0.5; ℓ : 0.5; r : 0.5)

(64)

Proving Theorem 1(a)

We show that the following statement holds for any linear game:
– (a) For any behavioral strategy $\beta_i$ of the i-th player, there is an outcome-equivalent mixed strategy $\sigma_i$ of the i-th player.

(65)

Linear game

No history encounters the same information set more than once.

(66)

Linear?

[Figure: game tree with a chance node (probabilities 1/2 and 1/2) and nodes labeled Player 1, Player 1, Player 2.]

(67)

Linear?

(68)

Linear?

[Figure: game tree with node labels 1, 1.]

(69)

Perfect recall implies linear

Suppose that the game is not linear. Then there is a history h that encounters an information set $I_i$ twice. That is, h = h′h″, where h″ is non-empty and $I_i$ contains both h and h′.

Clearly, $X_i(h)$ and $X_i(h′)$ are different, so the game does not have perfect recall.

(70)

Proving Theorem 1(a)

We show that the following statement holds for any linear game:
– (a) For any behavioral strategy $\beta_i$ of the i-th player, there is an outcome-equivalent mixed strategy $\sigma_i$ of the i-th player.

Define $\sigma_i$ as follows. Let $s_i$ be a pure strategy of the i-th player; the probability assigned to $s_i$ by $\sigma_i$ is

$$\sigma_i(s_i) = \prod_{I_i \in \mathcal{I}_i} \beta_i(I_i)\bigl(s_i(I_i)\bigr),$$

where $\beta_i(I_i)$ is the probability distribution over the actions in $A(I_i)$ specified by $\beta_i$, and $s_i(I_i)$ is the action in $A(I_i)$ taken by $s_i$. One can easily verify that $\sigma_i$ is outcome-equivalent to $\beta_i$.
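
The product construction can be written generically. The sketch below is one possible encoding, not the lecture's own: a behavioral strategy is a dictionary mapping each information set to a dictionary of action probabilities, and the function name, the information-set labels, and the sample probabilities are all hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import prod

def behavioral_to_mixed(beta):
    """Build the mixed strategy sigma_i(s_i) = product over I_i of beta_i(I_i)(s_i(I_i)).

    `beta` maps each information set to {action: probability}. The result maps
    each pure strategy (a tuple with one action per information set) to its probability.
    """
    info_sets = list(beta)
    mixed = {}
    for actions in product(*(beta[info_set] for info_set in info_sets)):
        mixed[actions] = prod(
            beta[info_set][action] for info_set, action in zip(info_sets, actions)
        )
    return mixed

# Google's behavioral strategy with q = 3/5 and r = 1/4 (sample values).
google_beta = {
    "start": {"in": Fraction(3, 5), "out": Fraction(2, 5)},
    "after Microsoft moves": {"attack": Fraction(1, 4), "withdraw": Fraction(3, 4)},
}
google_sigma = behavioral_to_mixed(google_beta)
print(google_sigma)
```

The resulting probabilities agree with x = q * r, y = q * (1 – r), z = (1 – q) * r, and (1 – q) * (1 – r) from the Google example.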

(71)

Illustration

[Figure: a history h with three nodes labeled i.]

(72)

Statement (b) for linear game?

Does the following statement also hold for any linear game?
– (b) For any mixed strategy $\sigma_i$ of the i-th player, there is an outcome-equivalent behavioral strategy $\beta_i$ of the i-th player.

(73)

Linear? Perfect recall?

[Figure: game tree with player labels 1, 1 and action labels x, y, ℓ, r.]

(74)

Proving Theorem 1(b)

We show that the following statement holds for any game with perfect recall.
– (b) For any mixed strategy $\sigma_i$ of the i-th player, there is an outcome-equivalent behavioral strategy $\beta_i$ of the i-th player.

(75)

Terminology

We say that a strategy (pure or mixed) $s_i$ of player i is consistent with a history h if, under the condition that the other players do not deviate from h, the probability of yielding history h according to $s_i$ is non-zero.

(76)

Illustration

[Figure: a history h with three nodes labeled i.]

(77)

Notation

For any history h of the game, let $\pi_i(h)$ denote the sum of the probabilities, according to $\sigma_i$, of all the pure strategies of player i that are consistent with h. That is, $\pi_i(h)$ is the sum of $\sigma_i(s_i)$ over all pure strategies $s_i$ that are consistent with h.

(78)

Observation 1

If $h$ and $\hat{h}$ are two histories in the same information set, then $\pi_i(h) = \pi_i(\hat{h})$.

(79)

Observation 1

If $h$ and $\hat{h}$ are two histories in the same information set, then $\pi_i(h) = \pi_i(\hat{h})$.

(Reason) Since $X_i(h) = X_i(\hat{h})$ holds by the definition of perfect recall, the actions taken by the i-th player to reach $h$ are exactly the same as those to reach $\hat{h}$.

(80)

Observation 2

If $h$ and $\hat{h}$ are in the same information set $I_i$, then $\pi_i(ha) = \pi_i(\hat{h}a)$ holds for any action a in $A(I_i)$.

(81)

Defining $\beta_i$

For each history h in each information set $I_i$ of $\mathcal{I}_i$, we define the behavioral strategy $\beta_i$ of the i-th player as follows:
– If $\pi_i(h) = 0$, then let $\beta_i(I_i)$ be an arbitrary probability distribution.
– If $\pi_i(h) > 0$, then let
$$\beta_i(I_i)(a) = \frac{\pi_i(ha)}{\pi_i(h)}$$
for each action a of $A(I_i)$.
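
The definition can be tried out on Google's mixed strategy. The sketch below makes several assumptions that are not in the slides: a pure strategy is encoded as a tuple of (information set, action) pairs, a history is summarized by the list of player i's own moves along it, the uniform distribution is used as the "arbitrary" choice when $\pi_i(h) = 0$, and all names and sample probabilities are hypothetical.

```python
from fractions import Fraction

def pi(sigma, own_moves):
    """Sum of sigma(s) over the pure strategies s consistent with a history.

    `own_moves` lists the (information set, action) pairs that player i chose
    along the history; s is consistent if it prescribes every one of them.
    """
    total = Fraction(0)
    for strategy, prob in sigma.items():
        choices = dict(strategy)
        if all(choices[info_set] == action for info_set, action in own_moves):
            total += prob
    return total

def behavioral_at(sigma, own_moves, info_set, actions):
    """beta_i(I_i)(a) = pi_i(ha) / pi_i(h), or uniform (arbitrary) when pi_i(h) = 0."""
    reach = pi(sigma, own_moves)
    if reach == 0:
        return {a: Fraction(1, len(actions)) for a in actions}
    return {a: pi(sigma, own_moves + [(info_set, a)]) / reach for a in actions}

# Google's mixed strategy with x = 1/2, y = 1/4, z = 1/8 (sample values).
sigma = {
    (("start", "in"), ("later", "attack")): Fraction(1, 2),
    (("start", "in"), ("later", "withdraw")): Fraction(1, 4),
    (("start", "out"), ("later", "attack")): Fraction(1, 8),
    (("start", "out"), ("later", "withdraw")): Fraction(1, 8),
}
beta_start = behavioral_at(sigma, [], "start", ["in", "out"])
beta_later = behavioral_at(sigma, [("start", "in")], "later", ["attack", "withdraw"])
print(beta_start, beta_later)
```

The result matches q = x + y and r = x / (x + y). With perfect recall, Observations 1 and 2 ensure that the value does not depend on which history of the information set is used.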

(82)

Equivalence (1)

Let $s_{-i}$ denote the pure strategy profile of the other players. Let h be an arbitrary terminal history.

Observe that if h is not consistent with some pure strategy in $s_{-i}$, then the probabilities of h in the outcomes of both $(s_{-i}, \sigma_i)$ and $(s_{-i}, \beta_i)$ are zero.

Therefore, it remains to consider the case that all pure strategies specified by $s_{-i}$ are consistent with h.

(83)

Illustration

[Figure: a history h with three nodes labeled i.]

(84)

Equivalence (2)

Suppose that h and $\sigma_i$ are not consistent. That is, $\sigma_i$ assigns probability 0 to all pure strategies of the i-th player that are consistent with the terminal history h.

Then, there must exist a subhistory $\hat{h}a$ of h with $\pi_i(\hat{h}) > 0$ and $\pi_i(\hat{h}a) = 0$. Therefore, $\beta_i(I_i)(a) = 0$, where $I_i$ is the information set of the i-th player that contains $\hat{h}$.

As a result, the probability of h obtained from the outcome of

(85)

Illustration

[Figure: a history h with three nodes labeled i, a subhistory $\hat{h}$, and an action a.]

(86)

Equivalence (3)

Now $(s_{-i}, \sigma_i)$ and h are consistent. Clearly, each subhistory $\hat{h}a$ of h such that $\hat{h}$ belongs to an information set of the i-th player has to satisfy $\pi_i(\hat{h}) > 0$ and $\pi_i(\hat{h}a) > 0$.

One can verify that the product of these probabilities $\dfrac{\pi_i(\hat{h}a)}{\pi_i(\hat{h})}$

(87)

Illustration

[Figure: a history h with three nodes labeled i, a subhistory $\hat{h}$, and an action a.]

(88)

Theorem 1

The following statements hold for any extensive game with perfect recall.
– (a) For any behavioral strategy $\beta_i$ of the i-th player, there is an outcome-equivalent mixed strategy $\sigma_i$ of the i-th player.
– (b) For any mixed strategy $\sigma_i$ of the i-th player, there is an outcome-equivalent behavioral strategy $\beta_i$ of the i-th player.
