Academic year: 2022


1 Extensive Game

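• An $n$-player extensive game $\Gamma$ with player set $N = \{1, 2, \dots, n\}$ is a 7-tuple $(X, E, P, W, C, p, U)$:

$(X, E)$ is the game tree and $Z \subseteq X$ is its set of leaves; $P = (P_i)$ specifies which nodes each player owns; $W$ is the information partition; $C$ is the choice partition; $p$ is the probability distribution that nature assigns at each node of $P_0$; and $U$ gives the utilities.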

• The game tree $(X, E)$ is finite. Each play of the game is a path from the root $r$ to a leaf $z \in Z$.

• Player 0 is also known as nature.

• $\Gamma$ is an extensive game with perfect information if, for each $i \in N$, each information set $w \in W_i$ contains exactly one node of $X - Z$ (the tree drawing has no dashed edges); otherwise $\Gamma$ has imperfect information.

• $\Gamma$ is an extensive game with perfect recall if, for each $i \in N$ and any two information sets $w$ and $w'$ in $W_i$, the following condition holds: if a node $x \in w'$ comes after a choice $c \in C_w$, then all nodes of $w'$ come after $c$. (The player remembers every choice she made earlier.)

2 Strategies

• Pure strategy: $a_i$ with $a_i(w) \in C_w$ for each $w \in W_i$. Pure strategy profile: $a = (a_1, a_2, \dots, a_n) \in A$.

• Mixed strategy: $s_i \in S_i$ is a probability distribution over $A_i$. Mixed strategy profile: $s = (s_1, s_2, \dots, s_n) \in S$.

• Behavior strategy: $b_i$ of player $i$ is a mapping that assigns to each information set $w \in W_i$ a probability distribution over $C_w$. For each choice $c \in C_w$, $b_i(c)$ is the probability that player $i$ chooses $c$ when she reaches information set $w$. Requirement: for any two distinct information sets $w$ and $w'$ in $W_i$, the probability distributions that $b_i$ assigns on $C_w$ and $C_{w'}$ must be independent.

• Expected payoff: $u_i(q) = \sum_{z \in Z} p(z, q) \cdot U_i(z)$.

• $Q_i = A_i \cup S_i \cup B_i$.
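As a minimal sketch of the expected payoff formula, the following Python snippet evaluates $u_i(q) = \sum_{z \in Z} p(z, q) \cdot U_i(z)$ for a behavior strategy profile. The tree, its payoffs, and the names `LEAF_PAYOFFS`, `leaf_probability`, and `expected_payoff` are hypothetical and chosen only for illustration; the sketch also assumes there are no nature moves, so $p(z, q)$ is just the product of the choice probabilities on the path to $z$.

```python
from fractions import Fraction

# Hypothetical tree: player 0 picks "L" or "R" at the root; after "L",
# player 1 picks "l" or "r". Each leaf is keyed by the choices leading to it.
LEAF_PAYOFFS = {
    ("L", "l"): (3, 1),
    ("L", "r"): (0, 0),
    ("R",): (2, 2),
}


def leaf_probability(leaf, behavior):
    """p(z, q): product of the probabilities of the choices on the path to z."""
    prob = Fraction(1)
    for choice in leaf:
        prob *= behavior[choice]
    return prob


def expected_payoff(player, behavior):
    """u_i(q) = sum over leaves z of p(z, q) * U_i(z)."""
    return sum(
        leaf_probability(leaf, behavior) * payoffs[player]
        for leaf, payoffs in LEAF_PAYOFFS.items()
    )


# Behavior strategy profile: one probability per choice (assumed values).
behavior = {
    "L": Fraction(1, 2), "R": Fraction(1, 2),
    "l": Fraction(1, 4), "r": Fraction(3, 4),
}
print(expected_payoff(0, behavior), expected_payoff(1, behavior))
```

Exact fractions are used so that the leaf probabilities can be checked to sum to one without rounding error.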

3 Equivalence of Strategies

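• Strong (high-bar) equivalence, "equivalent": $s_i(a_i) = b_i(a_i)$ holds for all $a_i \in A_i$, where $b_i(a_i) = \prod_{w \in W_i} b_i(a_i(w))$.

• For every behavior strategy there is a strongly equivalent mixed strategy, but the converse need not hold. (Example: "stop, switch, continue, stay".)

• By the independence requirement, under perfect recall (no path passes through the same information set twice), if $b_i$ and $s_i$ are strongly equivalent, then $p(x, (\hat{s}_{-i}, s_i)) = p(x, (\hat{s}_{-i}, b_i))$ holds for every node $x$ and every mixed strategy profile $\hat{s}_{-i}$.

• Medium (mid-bar) equivalence, "realization equivalent": $s_i$ and $b_i$ induce the same probability distribution over the set $X$ of nodes.

That is, $p(x, (\hat{s}_{-i}, s_i)) = p(x, (\hat{s}_{-i}, b_i))$ holds for all $x \in X$ and all $\hat{s}_{-i} \in S_{-i}$.

• (Under perfect recall, strong ⇒ medium.) Because a root-to-leaf path never passes through the same information set twice, $p(x, q) = \sum_{a \in A} p(x, a) \cdot q(a)$, where $q(a)$ is the probability that $q$ plays $a$. Under $(\hat{s}_{-i}, s_i)$, the probability of playing $a$ equals the probability that $\hat{s}_{-i}$ plays $a_{-i}$ times $s_i(a_i)$, and strong equivalence gives $s_i(a_i) = b_i(a_i)$, so medium equivalence holds.

• Weak (low-bar) equivalence, "payoff equivalent": $s_i$ and $b_i$ induce the same expected payoffs, that is, $u(\hat{s}_{-i}, s_i) = u(\hat{s}_{-i}, b_i)$ holds for each $\hat{s}_{-i} \in S_{-i}$.

• (Medium ⇒ weak.) Every node $x$, and in particular every leaf, is reached with the same probability under $b_i$ as under $s_i$, so the expected payoffs coincide.

• Kuhn's Theorem: Let $\Gamma$ be an extensive game with perfect recall and let $i \in N$. If $s_i \in S_i$ is a mixed strategy of player $i$ in $G_\Gamma$, then there is a behavior strategy $b_i$ of player $i$ in $\Gamma$ such that $s_i$ and $b_i$ are realization equivalent. (Proof to be added.)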
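The product formula $b_i(a_i) = \prod_{w \in W_i} b_i(a_i(w))$ from the strong-equivalence bullet can be sketched in Python as follows. This covers only the easy direction (behavior strategy to mixed strategy), not the construction in Kuhn's Theorem. The function name `induced_mixed_strategy`, the information set labels, and the probabilities are assumptions made for the example.

```python
from fractions import Fraction
from itertools import product


def induced_mixed_strategy(behavior):
    """Map a behavior strategy {info_set: {choice: prob}} to the mixed
    strategy {pure strategy: prob} given by the product formula."""
    info_sets = sorted(behavior)
    mixed = {}
    # A pure strategy picks one choice at every information set.
    for choices in product(*(behavior[w] for w in info_sets)):
        prob = Fraction(1)
        for w, c in zip(info_sets, choices):
            prob *= behavior[w][c]
        mixed[choices] = prob
    return mixed


# Hypothetical player with two information sets, w1 and w2.
behavior = {
    "w1": {"stop": Fraction(1, 3), "go": Fraction(2, 3)},
    "w2": {"x": Fraction(1, 2), "y": Fraction(1, 2)},
}
mixed = induced_mixed_strategy(behavior)
for pure_strategy, prob in mixed.items():
    print(pure_strategy, prob)
```

Each key of `mixed` lists the chosen action at `w1` and then at `w2`, so `("stop", "x")` receives probability $\tfrac{1}{3} \cdot \tfrac{1}{2}$.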

4 Nash Equilibrium

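• Pure Nash: $u_i(a) \ge u_i(a_{-i}, a_i')$ holds for each $i \in N$ and each pure strategy $a_i' \in A_i$.

• Mixed Nash: $u_i(s) \ge u_i(s_{-i}, s_i')$ holds for each $i \in N$ and each $s_i' \in S_i$. (Found by applying matrix-game methods to $G_\Gamma$.)

• Behavior Nash: $u_i(b) \ge u_i(b_{-i}, b_i')$ holds for each $i \in N$ and each $b_i' \in B_i$.

• Observation 1: under perfect recall (PR), $a \in A$ is a pure Nash equilibrium ⇔ it is a mixed Nash equilibrium ⇔ it is a behavior Nash equilibrium. Proof: M ⇒ B ⇒ P, because $B_i \subseteq S_i$ holds in the sense that every $b_i$ has an equivalent $s_i$ (and every pure strategy is a degenerate behavior strategy), so no better deviation exists in the smaller strategy set. P ⇒ M, because $u_i(a_{-i}, s_i) = \sum_{a_i \in A_i} s_i(a_i) \cdot u_i(a_{-i}, a_i)$ is a weighted average of pure-deviation payoffs.

• Observation 2: PR ⇒ there exist at least one mixed Nash equilibrium $s$ and at least one behavior Nash equilibrium $b$. (Proof: Nash's theorem plus Kuhn's Theorem.)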
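The pure Nash condition $u_i(a) \ge u_i(a_{-i}, a_i')$ can be checked by brute force on a strategic form. The sketch below does this for a two-player bimatrix game; the function name `pure_nash_equilibria` and the prisoner's-dilemma payoffs are illustrative assumptions and do not come from the notes.

```python
def pure_nash_equilibria(payoffs):
    """payoffs[r][c] = (u_1, u_2) when the row player picks r and the column
    player picks c. Returns every cell from which no unilateral pure
    deviation is strictly profitable."""
    rows, cols = len(payoffs), len(payoffs[0])
    equilibria = []
    for r in range(rows):
        for c in range(cols):
            u1, u2 = payoffs[r][c]
            row_ok = all(payoffs[r2][c][0] <= u1 for r2 in range(rows))
            col_ok = all(payoffs[r][c2][1] <= u2 for c2 in range(cols))
            if row_ok and col_ok:
                equilibria.append((r, c))
    return equilibria


# Hypothetical prisoner's dilemma: index 0 = cooperate, index 1 = defect.
prisoners_dilemma = [
    [(-1, -1), (-3, 0)],
    [(0, -3), (-2, -2)],
]
print(pure_nash_equilibria(prisoners_dilemma))
```

By Observation 1, a cell found this way is also a mixed and a behavior Nash equilibrium when the underlying extensive game has perfect recall.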

5 Subgame Perfect Equilibrium

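• The "nonsensical" game (grab, fight, abandon, concede).

• Let $\Gamma$ be an extensive game. A subgame perfect equilibrium (SGPE) of $\Gamma$ is a behavior strategy profile $b \in B$ of $\Gamma$ such that the subprofile $b_x$ is a Nash equilibrium of the subgame $\Gamma_x$ for each internal node $x \in X - Z$ at which $\Gamma$ can be decomposed.

• Theorem 1: Every extensive game $\Gamma$ with perfect information has at least one pure strategy profile $a \in A$ that is a subgame perfect equilibrium of $\Gamma$. (Proof: by induction on the number of nodes. Pick a lowest internal node $x$ and contract it; by the induction hypothesis the contracted game $\Gamma'$ has an SGPE $a' = a_{-x}$, so it suffices to show that $a = (a_x, a_{-x})$ is an SGPE, that is, that $a_y$ is a Nash equilibrium of $\Gamma_y$ for every $y$. There are two cases: $y$ is or is not an ancestor of $x$.)

• Corollary: Every extensive game with perfect information has at least one pure Nash equilibrium (because every SGPE is a Nash equilibrium). (Example: the Centipede game.)

• Theorem 2 (one-shot deviation principle): Let $\Gamma$ be an extensive game with perfect information. For any behavior strategy profile $b \in B$, $b$ is an SGPE of $\Gamma$ if and only if no one-shot deviation from $b$ is profitable.

• Generalization (Theorem 3): Every extensive game with perfect recall has at least one behavior strategy profile $b \in B$ that is an SGPE of $\Gamma$. (Proof: induct on the nodes at which the game can be decomposed.)

• Without perfect recall, a behavior Nash equilibrium need not exist.

• (Middle, Back) is nonsensical (backward induction).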
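Backward induction, the idea behind Theorem 1, can be sketched as a recursion over a perfect-information tree. The node layout (dictionaries with `name`, `player`, `children`, and `payoffs` keys), the function name `backward_induction`, and the entry-game payoffs below are all assumptions made for this example; ties are broken in favor of the first choice listed, which is one arbitrary convention among several.

```python
def backward_induction(node, plan=None):
    """Return (payoff vector, plan), where plan maps each internal node's
    name to the choice selected there."""
    if plan is None:
        plan = {}
    if "payoffs" in node:  # leaf: nothing to choose
        return node["payoffs"], plan
    mover = node["player"]
    best_choice, best_value = None, None
    for choice, child in node["children"].items():
        value, _ = backward_induction(child, plan)
        # Keep the first choice that is strictly best for the mover.
        if best_value is None or value[mover] > best_value[mover]:
            best_choice, best_value = choice, value
    plan[node["name"]] = best_choice
    return best_value, plan


# Hypothetical entry game: player 0 (entrant) moves first, player 1 responds.
entry_game = {
    "name": "entrant", "player": 0,
    "children": {
        "out": {"payoffs": (0, 2)},
        "in": {
            "name": "incumbent", "player": 1,
            "children": {
                "fight": {"payoffs": (-1, -1)},
                "accommodate": {"payoffs": (1, 1)},
            },
        },
    },
}
value, plan = backward_induction(entry_game)
print(value, plan)
```

The returned plan prescribes a choice at every internal node, including nodes off the equilibrium path, which is what distinguishes a subgame perfect profile from an arbitrary Nash equilibrium.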

6 Sequential Equilibrium

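• An assessment is a pair $(b, \mu)$, where $\mu$ is a system of beliefs: a probability distribution over the nodes of each information set.

• $(b, \mu)$ is a sequential equilibrium if $(b, \mu)$ is sequentially rational and consistent.

• Sequentially rational: for each player $i$ and each information set $w \in W_i$, replacing $b_i$ in $(b, \mu)$ by any other $\hat{b}_i$ does not give player $i$ a higher expected payoff, computed from $w$ onward under the resulting assessment.

• Consistency: the probability distributions of the belief system must agree with the conditional probabilities of reaching each node of $w$ when the game is played according to the behavior strategy profile $b$. Definition: $(b, \mu)$ is consistent if there is a sequence $\{b^n\}$ of completely mixed behavior strategy profiles such that $\lim_{n \to \infty} (b^n, \mu^{b^n}) = (b, \mu)$. (Example: with Down, Front, Right, beliefs of 0.5/0.5 cannot be achieved.)

• Observation 1: under perfect information, SGPE ⇔ SeqE.

• Observation 2: SeqE ⇒ SGPE, because when a single node $x$ forms an information set by itself, sequential rationality guarantees exactly that $b_x$ is a Nash equilibrium of $\Gamma_x$.

• Theorem: under perfect recall, at least one SeqE exists.

• (Middle, Back) is nonsensical (forward induction); adding the "In" move makes a difference.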
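The consistency condition can be illustrated numerically: along a completely mixed sequence $\{b^n\}$ every information set is reached with positive probability, so the beliefs $\mu^{b^n}$ follow from Bayes' rule, and the limit picks out the beliefs at information sets that $b$ itself never reaches. The sketch below assumes a hypothetical information set $\{x, y\}$ in which $x$ is reached after one tremble of size `eps` and $y$ after two; the function names are also assumptions.

```python
from fractions import Fraction


def beliefs(reach_probabilities):
    """Bayes' rule: condition the reach probabilities of the nodes of one
    information set on that information set being reached."""
    total = sum(reach_probabilities.values())
    return {node: p / total for node, p in reach_probabilities.items()}


def beliefs_along_sequence(eps):
    """Beliefs at the hypothetical information set {x, y} when x needs one
    tremble of size eps and y needs two trembles (probability eps ** 2)."""
    return beliefs({"x": eps, "y": eps * eps})


for n in (1, 2, 3):
    eps = Fraction(1, 10 ** n)
    print(n, beliefs_along_sequence(eps))
```

As `eps` shrinks, the belief on `x` tends to 1 and the belief on `y` tends to 0, so in this assumed example the only consistent belief at the unreached information set is $(1, 0)$.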


7 Perfect Equilibrium

• A tremble in $\Gamma$ is a vector $\eta = (\eta_1, \dots, \eta_n)$ such that, for each $i \in N$, $\eta_i$ is a function from the set $C_i$ of the choices of player $i$ to $\mathbb{R}$ with $\eta_i(c) > 0$ for each choice $c \in C_i$ and $\sum_{c \in C_w} \eta_i(c) < 1$ for each information set $w \in W_i$.

• Let $T(\Gamma)$ denote the set of trembles in $\Gamma$.

• The $\eta$-perturbation of $\Gamma$ is the extensive game $(\Gamma, \eta)$, whose only difference from $\Gamma$ is that, for each player $i \in N$, the set $B_i$ of behavior strategies is reduced to $B_i^\eta = \{ b_i \in B_i \mid b_i(c) \ge \eta_i(c) \text{ holds for each } w \in W_i \text{ and each } c \in C_w \}$. Every tremble is strictly positive, so all of these are completely mixed behavior strategies.

• Perfect equilibrium: $b$ is a perfect equilibrium if there are sequences $\{\eta^k\}$ and $\{b^k\}$ such that (a) $\lim_{k \to \infty} \eta^k = (0, \dots, 0)$, (b) $\lim_{k \to \infty} b^k = b$, and (c) for each sufficiently large index $k$, the behavior strategy profile $b^k$ is a Nash equilibrium of the $\eta^k$-perturbation $(\Gamma, \eta^k)$ of $\Gamma$.

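• (Middle, Back) is a perfect equilibrium. (High, Down) is sequential but not perfect in the game with payoffs (High, Up) [1, 1]; (High, Down) [1, 1]; (Low, Up) [2, 0]; (Low, Down) [−1, −1]. Example: "to make money you still have to try".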
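The two conditions in the definition of a tremble, and membership in the restricted set $B_i^\eta$, are simple enough to check mechanically. The following Python sketch does so for one player; the dictionary layout, the function names `is_tremble` and `in_perturbed_set`, and the numbers are assumptions for illustration only.

```python
from fractions import Fraction


def is_tremble(eta, info_sets):
    """eta maps each choice to its minimum probability; info_sets maps each
    information set w to the list of choices in C_w."""
    positive = all(value > 0 for value in eta.values())
    below_one = all(
        sum(eta[c] for c in choices) < 1 for choices in info_sets.values()
    )
    return positive and below_one


def in_perturbed_set(b, eta):
    """b belongs to the restricted set iff b(c) >= eta(c) for every choice c."""
    return all(b[c] >= eta[c] for c in eta)


# Hypothetical player with a single information set w and two choices.
info_sets = {"w": ["Up", "Down"]}
eta = {"Up": Fraction(1, 100), "Down": Fraction(1, 100)}
b_mixed = {"Up": Fraction(99, 100), "Down": Fraction(1, 100)}
b_pure = {"Up": Fraction(1), "Down": Fraction(0)}

print(is_tremble(eta, info_sets))
print(in_perturbed_set(b_mixed, eta), in_perturbed_set(b_pure, eta))
```

The pure strategy `b_pure` is excluded from the perturbed game because it gives `Down` probability 0, below the tremble bound; this is why every strategy in the restricted set is completely mixed.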

8 Proper Equilibrium

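• Let $0 < \epsilon < 1$. A behavior strategy profile $b \in B$ is $\epsilon$-proper if "$u_i(b_{-w}, c) < u_i(b_{-w}, c') \Rightarrow b(c) \le \epsilon \cdot b(c')$" holds for each player $i$, each information set $w \in W_i$, and each pair of choices $c, c' \in C_w$. In words: if, against $b_{-w}$, $c$ is a worse response than $c'$ in $C_w$, then $b$ must give $c$ a smaller probability than it gives $c'$; how much smaller is governed by the error coefficient $\epsilon$.

• Proper equilibrium: $b$ is a proper equilibrium if there are sequences $\{\epsilon^k\}$ and $\{b^k\}$ such that (a) $\lim_{k \to \infty} \epsilon^k = 0$, (b) $\lim_{k \to \infty} b^k = b$, and (c) each $b^k$ with sufficiently large $k$ is an $\epsilon^k$-proper strategy profile of $\Gamma$.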

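• (Middle, Back) is perfect but not proper: Red Rabbit should give Up a higher probability than Down. Green Rabbit should therefore give Front a higher probability than Back, so the limit cannot be Back.

• Types of equilibria, with payoffs (Up, Front) [3, 1]; (Up, Back) [1, 0]; Middle [2, 2]; (Down, Front) [0, 0]; (Down, Back) [v, 1]: when $v < 2$, (Middle, Back) is a perfect equilibrium. If $v \le 1$, (Middle, Back) is not proper. If $1 < v < 2$, it is both.

• (Middle, Back) is not proper, and (Middle, Front) is not proper either, but (Try, Middle, Back) is proper.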
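The $\epsilon$-proper condition at a single information set can be checked directly from its definition. In the sketch below, `payoff[c]` stands for $u_i(b_{-w}, c)$ and `prob[c]` for $b(c)$; the function name `is_epsilon_proper_at` and all numbers are hypothetical and are not taken from the rabbit example.

```python
from fractions import Fraction


def is_epsilon_proper_at(payoff, prob, eps):
    """Check, for every pair of choices (c, d) at one information set, that
    payoff[c] < payoff[d] implies prob[c] <= eps * prob[d]."""
    return all(
        prob[c] <= eps * prob[d]
        for c in payoff
        for d in payoff
        if payoff[c] < payoff[d]
    )


# Hypothetical payoffs of three choices against the others' strategies.
payoff = {"Up": 3, "Middle": 2, "Down": 0}
# Probabilities that shrink by a factor of 10 with each step down in payoff.
prob = {
    "Up": Fraction(100, 111),
    "Middle": Fraction(10, 111),
    "Down": Fraction(1, 111),
}

print(is_epsilon_proper_at(payoff, prob, Fraction(1, 10)))
print(is_epsilon_proper_at(payoff, prob, Fraction(1, 20)))
```

The same probabilities pass for $\epsilon = 1/10$ and fail for $\epsilon = 1/20$, which shows how a smaller error coefficient forces worse responses to become much rarer than better ones.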

9 Mixed OOXX Equilibrium

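• A mixed tremble in $\Gamma$ is a vector $\eta = (\eta_1, \dots, \eta_n)$ such that, for each $i \in N$, $\eta_i$ is a function from the set $A_i$ of the pure strategies of player $i$ to $\mathbb{R}$ with $\eta_i(a_i) > 0$ for each $a_i \in A_i$ and $\sum_{a_i \in A_i} \eta_i(a_i) < 1$.

Let $T_m(\Gamma)$ denote the set of mixed trembles in $\Gamma$.

• The $\eta$-mixed perturbation of $\Gamma$ is the strategic game $(E(G_\Gamma), \eta)$, which differs only in restricting the mixed strategies of player $i$ to $S_i^\eta = \{ s_i \in S_i \mid s_i(a_i) \ge \eta_i(a_i) \text{ for each } a_i \in A_i \}$.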

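• Both (Try, Middle, Back) and (Middle, Back) are mixed perfect equilibria, but neither is mixed proper. However, (Middle, Back) with 1.5 is still mixed proper.

• Payoffs: Joy [2]; (Anger, Sorrow) [1]; (Anger, Delight) [0]. (Joy, Delight) is mixed proper but not subgame perfect.

• The ultimate tool: start from the definition of mixed proper. If $\Gamma$ has perfect recall, Kuhn's Theorem gives each $s^k$ a realization-equivalent $b^k$; if $\{b^k\}$ converges, then $b = \lim_{k \to \infty} b^k$ is a limit behavior strategy profile induced by $s$.

• This limit profile is always a sequential equilibrium.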

10 Bayesian Game

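• An $n$-player Bayesian game with set $N$ of players is a 4-tuple $BG = (\Theta, \rho, A, u)$ whose elements are as follows.

$\Theta = (\Theta_i)$, where $\Theta_i$ contains the types (states) of player $i$. $\rho$ is the common prior, a probability distribution over $\Theta$.

• Pure strategy: $\hat{a}_i : \Theta_i \to A_i$.

• Behavior strategy: a mapping $\Theta_i \to \Delta A_i$.

• $\hat{u}_i(\theta_i, \hat{a}) = \sum_{\theta_{-i} \in \Theta_{-i}} \rho(\theta_{-i} \mid \theta_i) \cdot u_i(\theta, \hat{a}(\theta))$.

• $\rho(\theta_{-i}) = \sum_{\theta_i \in \Theta_i} \rho(\theta_i) \cdot \rho(\theta_{-i} \mid \theta_i)$.

• $\hat{a} \in \hat{A}$ is a pure Nash equilibrium of $BG$ if $\hat{u}_i(\theta_i, \hat{a}) \ge \hat{u}_i(\theta_i, (\hat{a}_{-\theta_i}, a_i))$ holds for each $i$, $\theta_i$, and $a_i$.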
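The interim expected utility $\hat{u}_i(\theta_i, \hat{a}) = \sum_{\theta_{-i}} \rho(\theta_{-i} \mid \theta_i) \cdot u_i(\theta, \hat{a}(\theta))$ can be sketched in Python by conditioning the common prior on player $i$'s own type. The data layout, the function name `interim_utility`, and the two-player example with its types, actions, and payoffs are assumptions made for illustration.

```python
from fractions import Fraction


def interim_utility(i, theta_i, strategies, prior, utility):
    """prior maps type profiles (tuples) to probabilities; strategies[j] maps
    each type of player j to an action; utility(i, theta, actions) is u_i."""
    # Keep only the type profiles in which player i has type theta_i.
    consistent = {th: p for th, p in prior.items() if th[i] == theta_i}
    marginal = sum(consistent.values())  # rho(theta_i)
    total = Fraction(0)
    for theta, p in consistent.items():
        actions = tuple(strategies[j][theta[j]] for j in range(len(theta)))
        total += (p / marginal) * utility(i, theta, actions)
    return total


# Hypothetical game: player 0 has one type; player 1 is strong or weak.
prior = {("t", "strong"): Fraction(1, 3), ("t", "weak"): Fraction(2, 3)}
strategies = [{"t": "attack"}, {"strong": "fight", "weak": "yield"}]


def utility(i, theta, actions):
    """Assumed payoffs: attacking a fighter costs 1, attacking a yielder
    gains 1; player 1 receives the negative of player 0's payoff."""
    u0 = -1 if actions == ("attack", "fight") else 1
    return u0 if i == 0 else -u0


print(interim_utility(0, "t", strategies, prior, utility))
print(interim_utility(1, "strong", strategies, prior, utility))
```

Checking the pure Nash condition would repeat this computation for every player, every type, and every alternative action $a_i$.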

11 More general setting

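• An $n$-player Bayesian game with set $N$ of players is a 6-tuple $BG = (\Omega, \Theta, f, \rho, A, u)$. $\Omega$: the set of all external states. $\Theta_i$: the signals that player $i$ may receive. $f$: the signal functions, $f_i : \Omega \to \Theta_i$. $\rho$: each player has his own belief about the probability distribution over $\Omega$. $u$: the payoff functions, determined on $\Omega \times A$.

• Each type of player $i$ has its own view of the world: its belief about the probability distribution over $\Omega$ is no longer $\rho_i(\cdot)$ but $\rho_i(\cdot \mid \theta_i)$.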

12 Coalitional game

• $N$: the set of players. Each subset $S \subseteq N$ is a coalition, and $N$ is the grand coalition. $\mathbb{R}^S$ is the space of payoff vectors $x$ for coalition $S$.

• A coalitional game is a pair $(N, V)$, where $V$ assigns to each non-empty coalition $S$ a non-empty set $V(S) \subseteq \mathbb{R}^S$ of actions (the payoff vectors of its different actions).

• Let $x \in V(N)$ and $y \in V(S)$. We say that $y$ strictly dominates $x$ if $y_i > x_i$ for all $i \in S$.

• $x \in V(N)$ is coalitionally rational if no non-empty, non-grand coalition $S$ has a $y \in V(S)$ that strictly dominates $x$.

• The core of $G = (N, V)$ is the set of actions of $N$ that are coalitionally rational.

• With transferable payoff, the game is $(N, v)$, where $v$ assigns to each non-empty coalition $S \subseteq N$ a non-negative number $v(S) \in \mathbb{R}$.

Any division within $S$ whose total is $v(S)$ is a legal action; that is, a division $x$ satisfies $\sum_{i \in S} x_i = v(S)$.
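With transferable payoff, a coalition $S$ has a division $y$ that strictly dominates $x$ exactly when $\sum_{i \in S} x_i < v(S)$, because the surplus can then be shared so that every member gains. Under that reading, membership in the core can be sketched as a brute-force check over all coalitions. The function name `in_core` and both example games are assumptions for illustration.

```python
from itertools import combinations


def in_core(x, v):
    """x maps each player to a payoff; v maps a frozenset of players to the
    worth of that coalition. Returns True iff x is a division of v(N) that
    no proper non-empty coalition can strictly improve upon."""
    players = sorted(x)
    if sum(x.values()) != v(frozenset(players)):
        return False  # x is not a legal action of the grand coalition
    for size in range(1, len(players)):
        for coalition in combinations(players, size):
            if sum(x[i] for i in coalition) < v(frozenset(coalition)):
                return False  # this coalition can strictly dominate x
    return True


def majority(S):
    """Hypothetical game: any two of the three players can secure 1."""
    return 1 if len(S) >= 2 else 0


def convex(S):
    """Hypothetical game whose worth grows with the square of coalition size."""
    return len(S) ** 2


print(in_core({1: 1, 2: 0, 3: 0}, majority))
print(in_core({1: 3, 2: 3, 3: 3}, convex))
```

The enumeration visits every proper non-empty coalition, so its cost grows exponentially with the number of players; it is meant for small examples only.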

13 Shapley Value

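• Let $\pi$ be a permutation of the players, and suppose that $i$ is the $j$-th player in $\pi$. Then the marginal contribution of player $i$ with respect to $\pi$ is $m_i^\pi(v) = v(S_j^\pi) - v(S_{j-1}^\pi)$, the difference between the worth of the first $j$ players and the worth of the first $j - 1$ players in $\pi$.

• The Shapley value for $(N, v)$ is $\Phi_i = \frac{1}{n!} \sum_{\pi} m_i^\pi(v)$.

• Quality seal 1 (Efficiency): $\Phi$ is an allocation rule of $(N, v)$ such that $\sum_{i \in N} \Phi_i = v(N)$.

• Quality seal 2 (Null player): if $i$ is a null player, then $\Phi_i = 0$.

• Quality seal 3 (Symmetric players): $i$ and $j$ are symmetric if $v(S \cup \{i\}) = v(S \cup \{j\})$ holds for all $S \subseteq N - \{i, j\}$. If $i$ and $j$ are symmetric, then $\Phi_i = \Phi_j$.

• Quality seal 4 (Additivity): Let $\Phi^1$ be the Shapley value of the game $(N, u)$ and $\Phi^2$ the Shapley value of the game $(N, v)$. Then $\Phi = \Phi^1 + \Phi^2$ is the Shapley value of the game $(N, u + v)$.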
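The permutation formula $\Phi_i = \frac{1}{n!} \sum_{\pi} m_i^\pi(v)$ translates directly into a short brute-force computation. The sketch below is one way to write it; the function name `shapley_value` and the three-player glove game are assumptions chosen for illustration, and the enumeration of all $n!$ orderings is practical only for small $n$.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def shapley_value(players, v):
    """Average each player's marginal contribution over all orderings.
    v maps a frozenset of players to the worth of that coalition."""
    totals = {i: Fraction(0) for i in players}
    for order in permutations(players):
        coalition = frozenset()
        for i in order:
            with_i = coalition | {i}
            totals[i] += v(with_i) - v(coalition)  # marginal contribution
            coalition = with_i
    n_factorial = factorial(len(players))
    return {i: total / n_factorial for i, total in totals.items()}


def glove_game(S):
    """Hypothetical game: player 1 owns a left glove, players 2 and 3 each
    own a right glove; a coalition is worth 1 iff it holds a matching pair."""
    return 1 if 1 in S and (2 in S or 3 in S) else 0


phi = shapley_value([1, 2, 3], glove_game)
print(phi)
```

In this example the result illustrates two of the quality seals: the values sum to $v(N) = 1$ (Efficiency), and the symmetric players 2 and 3 receive the same value.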
