1 Extensive Game

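• An $n$-player extensive game $\Gamma$ with a set $N = \{1, 2, \ldots, n\}$ of players is a 7-tuple $(X, E, P, W, C, p, U)$:

$(X, E)$ is the game tree and $Z$ is its set of leaves; $P = (P_i)$ specifies which nodes each player owns; $W$ is the information partition; $C$ is the choice partition; $p$ is the probability distribution that nature assigns at each node in $P_0$; $U$ gives the utilities.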

• The game tree $(X, E)$ is finite. Each play of the game is a path from the root $r$ to a leaf node $z \in Z$.

• Player 0 is also known as nature.

• $\Gamma$ is an extensive game with perfect information if, for each $i \in N$, each information set $w \in W_i$ contains exactly one node of $X - Z$ (no dashed edges in the tree diagram). Otherwise $\Gamma$ has imperfect information.

• $\Gamma$ is an extensive game with perfect recall if, for each $i \in N$ and any two information sets $w$ and $w'$ in $W_i$, the following condition holds: if a node $x \in w'$ comes after a choice $c \in C_w$, then all nodes of $w'$ come after $c$. (I remember every choice I made earlier.)

2 Strategies

• Pure strategy: $a_i$ assigns to each information set $w \in W_i$ a choice $a_i(w) \in C_w$. Pure strategy profile: $a = (a_1, a_2, \ldots, a_n) \in A$.

• Mixed strategy: $s_i \in S_i$ is a probability distribution over $A_i$. Mixed strategy profile: $s = (s_1, s_2, \ldots, s_n) \in S$.

• Behavior strategy: $b_i$ of the $i$-th player is a mapping that assigns to each information set $w \in W_i$ a probability distribution over $C_w$. For each choice $c \in C_w$, $b_i(c)$ is the probability that the $i$-th player chooses $c$ when she reaches information set $w$. Requirement: for any two distinct information sets $w$ and $w'$ in $W_i$, the probability distributions that $b_i$ assigns on $C_w$ and $C_{w'}$ must be independent.

• Expected payoff: $u_i(q) = \sum_{z \in Z} p(z, q) \cdot U_i(z)$.

• $Q_i = A_i \cup S_i \cup B_i$
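
The expected-payoff formula can be tried out on a toy tree. The following is a minimal sketch, assuming a hypothetical two-stage game (the information set names `w1`, `w2`, the choices and the leaf payoffs are invented for illustration and do not come from these notes). It computes $p(z, b)$ as the product of the probabilities that a behavior profile assigns along the path to $z$, then sums $p(z, b) \cdot U_i(z)$ over the leaves.

```python
from fractions import Fraction

# Hypothetical tree (an assumption for illustration):
# player 1 picks L or R at information set "w1";
# after L, player 2 picks l or r at information set "w2".
# Each leaf is given by the choices on its root-to-leaf path and its payoffs (U_1, U_2).
LEAVES = [
    ((("w1", "L"), ("w2", "l")), (3, 1)),
    ((("w1", "L"), ("w2", "r")), (0, 0)),
    ((("w1", "R"),), (2, 2)),
]


def leaf_probability(path, b):
    """p(z, b): product of the probabilities b assigns to the choices on the path."""
    prob = Fraction(1)
    for info_set, choice in path:
        prob *= b[info_set][choice]
    return prob


def expected_payoff(b, player):
    """u_i(b) = sum over leaves z of p(z, b) * U_i(z); player is a 0-based index."""
    return sum(leaf_probability(path, b) * payoffs[player] for path, payoffs in LEAVES)


# A behavior strategy profile: one distribution per information set.
b = {
    "w1": {"L": Fraction(1, 2), "R": Fraction(1, 2)},
    "w2": {"l": Fraction(3, 4), "r": Fraction(1, 4)},
}

print(expected_payoff(b, 0), expected_payoff(b, 1))
```

Exact fractions are used so that the leaf probabilities can be checked to sum to 1 without rounding noise.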

3 Equivalence of Strategies

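• High-standard equivalence (equivalent): $s_i(a_i) = b_i(a_i)$ holds for all $a_i \in A_i$, where $b_i(a_i) = \prod_{w \in W_i} b_i(a_i(w))$.

• For every behavior strategy there is a high-standard-equivalent mixed strategy, but the converse need not hold. (Example: "stop, switch, continue, stay".)

• By the requirement above, under perfect recall (no path passes through any information set twice), if $b_i$ and $s_i$ are high-standard equivalent, then $p(x, (\hat{s}_{-i}, s_i)) = p(x, (\hat{s}_{-i}, b_i))$ holds for every node $x$ and every mixed strategy profile $\hat{s}_{-i}$.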

• Mid-standard equivalence (realization equivalent): they induce the same probability distribution over the set $X$ of nodes.

That is, $p(x, (\hat{s}_{-i}, s_i)) = p(x, (\hat{s}_{-i}, b_i))$ holds for all $x \in X$ and all $\hat{s}_{-i} \in S_{-i}$.

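• (Under perfect recall, high-standard ⇒ mid-standard.) Because a root-to-leaf path never passes through the same information set twice, $p(x, q) = \sum_{a \in A} p(x, a) \cdot q(a)$, where $q(a)$ is the probability that $a$ is played under $q$. High-standard equivalence guarantees that the probability of playing $a$ under $(\hat{s}_{-i}, s_i)$ equals the probability of playing $a_{-i}$ under $\hat{s}_{-i}$ times $s_i(a_i)$, and $s_i(a_i) = b_i(a_i)$, so mid-standard equivalence holds.

• Low-standard equivalence (payoff equivalent): they induce the same expected payoff, i.e. $u(\hat{s}_{-i}, s_i) = u(\hat{s}_{-i}, b_i)$ holds for each $\hat{s}_{-i} \in S_{-i}$.

• (Mid-standard ⇒ low-standard.) For every node $x$, the probability of reaching $x$ is the same whether the behavior strategy or the mixed strategy is used, so the payoffs coincide.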

• Kuhn's Theorem: Let $\Gamma$ be an extensive game with perfect recall and let $i \in N$. If $s_i \in S_i$ is a mixed strategy of the $i$-th player in $G_\Gamma$, then there is a behavior strategy $b_i$ of the $i$-th player in $\Gamma$ such that $s_i$ and $b_i$ are realization equivalent. (Proof to be added.)
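
The easy direction (from a behavior strategy to an equivalent mixed strategy) is just the product formula $b_i(a_i) = \prod_{w \in W_i} b_i(a_i(w))$. Below is a minimal sketch of that conversion, assuming a hypothetical player with two information sets `w` and `w'` and made-up choice names; it does not implement the harder direction given by Kuhn's Theorem.

```python
from fractions import Fraction
from itertools import product


def behavior_to_mixed(b_i):
    """Return s_i with s_i(a_i) = product over w of b_i(a_i(w)).

    b_i maps each information set to a distribution over its choices.
    A pure strategy a_i is represented as a tuple of (information set, choice) pairs.
    """
    info_sets = sorted(b_i)
    mixed = {}
    for choices in product(*(sorted(b_i[w]) for w in info_sets)):
        prob = Fraction(1)
        for w, c in zip(info_sets, choices):
            prob *= b_i[w][c]
        mixed[tuple(zip(info_sets, choices))] = prob
    return mixed


# Hypothetical behavior strategy (names and numbers are illustrative assumptions).
b_i = {
    "w": {"c1": Fraction(1, 3), "c2": Fraction(2, 3)},
    "w'": {"d1": Fraction(1, 4), "d2": Fraction(3, 4)},
}
s_i = behavior_to_mixed(b_i)
for pure_strategy, prob in s_i.items():
    print(pure_strategy, prob)
```

Because the choices at different information sets are drawn independently, the resulting mixed strategy always has product form; a mixed strategy that correlates choices across information sets cannot be produced this way.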

4 Nash Equilibrium

• Pure Nash: $u_i(a^*) \ge u_i(a^*_{-i}, a_i)$ holds for each $i \in N$ and each pure strategy $a_i \in A_i$.

• Mixed Nash: $u_i(s^*) \ge u_i(s^*_{-i}, s_i)$. (Found on $G_\Gamma$ with matrix-game methods.)

• Behavior Nash: $u_i(b^*) \ge u_i(b^*_{-i}, b_i)$.

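• Observation 1: under perfect recall (PR), $a^* \in A$ is a pure Nash equilibrium ⇔ it is a mixed Nash equilibrium ⇔ it is a behavior Nash equilibrium. Proof: M → B → P → M. M → B holds because $B_i \subseteq S_i$ (every $b_i$ has an equivalent $s_i$, so no better strategy is available). B → P holds because every pure strategy is also a behavior strategy. P → M holds because $u_i(a^*_{-i}, s_i) = \sum_{a_i \in A_i} s_i(a_i) \cdot u_i(a^*_{-i}, a_i)$.

• Observation 2: PR ⇒ at least one mixed Nash equilibrium $s^*$ and at least one behavior Nash equilibrium $b^*$ exist. (Proof: Nash + Kuhn.)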
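
The pure Nash condition can be checked by brute force on a small strategic form. The sketch below assumes a hypothetical two-player coordination game (the payoff table is invented for illustration); it tests every pure profile against all unilateral pure deviations, which by Observation 1's formula for $u_i(a^*_{-i}, s_i)$ is enough to rule out mixed deviations as well.

```python
from itertools import product


def pure_nash_equilibria(payoffs):
    """payoffs[(a1, a2)] = (u1, u2).

    Return all pure profiles from which no player gains by a unilateral deviation.
    """
    rows = sorted({a1 for a1, _ in payoffs})
    cols = sorted({a2 for _, a2 in payoffs})
    equilibria = []
    for a1, a2 in product(rows, cols):
        u1, u2 = payoffs[(a1, a2)]
        best_for_1 = all(payoffs[(d, a2)][0] <= u1 for d in rows)
        best_for_2 = all(payoffs[(a1, d)][1] <= u2 for d in cols)
        if best_for_1 and best_for_2:
            equilibria.append((a1, a2))
    return equilibria


# Hypothetical coordination game (an assumption, not taken from the notes).
game = {
    ("A", "A"): (2, 1),
    ("A", "B"): (0, 0),
    ("B", "A"): (0, 0),
    ("B", "B"): (1, 2),
}
print(pure_nash_equilibria(game))
```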

5 Subgame Perfect Equilibrium

• The nonsensical game (grab, fight, give up, yield).

• Let $\Gamma$ be an extensive game. A subgame perfect equilibrium of $\Gamma$ is a behavior strategy profile $b \in B$ of $\Gamma$ such that the subprofile $b_x$ is a Nash equilibrium of the subgame $\Gamma_x$ for each internal node $x \in X - Z$ at which $\Gamma$ can be decomposed.

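• Theorem 1: Every extensive game $\Gamma$ with perfect information has at least one pure strategy profile $a \in A$ that is a subgame perfect equilibrium (SGPE) in $\Gamma$. (Proof by induction on the number of nodes: each time pick a lowest internal node $x$ and contract it to obtain $\Gamma'$. By the induction hypothesis there is an SGPE $a' = a_{-x}$ of $\Gamma'$, so it remains to show that $a = (a_x, a_{-x})$ is an SGPE, i.e. that $a_y$ is a Nash equilibrium (NE) of $\Gamma_y$ for every $y$. Two cases: $y$ is or is not an ancestor of $x$.)

• Corollary: Every extensive game with perfect information has at least one pure Nash equilibrium (because every SGPE is an NE). (Example: Centipede game.)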

• Theorem 2 (One-shot deviation principle): Let $\Gamma$ be an extensive game with perfect information. For any behavior strategy profile $b \in B$, $b$ is an SGPE of $\Gamma$ iff no one-shot deviation from $b$ is profitable.

• Generalization (Theorem 3): Every extensive game with perfect recall has at least one behavior strategy profile $b \in B$ that is an SGPE of $\Gamma$. (Induct on the nodes at which the game can be decomposed.)

• Without perfect recall, a behavior Nash equilibrium need not exist.

• (Middle, Back) is nonsensical (Backward Induction).
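
The inductive proof of Theorem 1 is essentially backward induction, which can be sketched in a few lines. The example below assumes a hypothetical entry game (node names, choices and payoffs are illustrative, not the game from these notes) and a simple tuple encoding of the tree; ties are broken in favor of the first listed choice, which is an arbitrary assumption.

```python
def backward_induction(node):
    """Return (payoff_vector, plan) for the subgame rooted at node.

    node is either ("leaf", payoffs) or ("move", name, player, {choice: child}),
    where player is a 0-based index and node names are assumed to be unique.
    plan maps every decision node's name to the choice selected there.
    """
    if node[0] == "leaf":
        return node[1], {}
    _, name, player, children = node
    plan = {}
    best_choice, best_payoffs = None, None
    for choice, child in children.items():
        payoffs, sub_plan = backward_induction(child)
        plan.update(sub_plan)
        # Keep the choice that is best for the player who moves at this node.
        if best_payoffs is None or payoffs[player] > best_payoffs[player]:
            best_choice, best_payoffs = choice, payoffs
    plan[name] = best_choice
    return best_payoffs, plan


# Hypothetical perfect-information entry game (an assumption for illustration).
entry_game = (
    "move", "entrant", 0, {
        "Out": ("leaf", (0, 2)),
        "In": ("move", "incumbent", 1, {
            "Fight": ("leaf", (-1, -1)),
            "Yield": ("leaf", (1, 1)),
        }),
    },
)
payoffs, plan = backward_induction(entry_game)
print(payoffs, plan)
```

The returned plan specifies a choice at every decision node, including nodes off the equilibrium path, which is what distinguishes a subgame perfect profile from a mere Nash outcome.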

6 Sequential Equilibrium

• An assessment is a pair $(b, \mu)$, where $\mu$ is a system of beliefs: a probability distribution on each information set (IS).

• (b, µ) is a sequential equilibrium if (b, µ) is sequentially rational and consistent.

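• Sequentially rational: for each player $i$ and each IS $w \in W_i$, replacing $b_i$ in $(b, \mu)$ by any other $\hat{b}_i$ gives an assessment whose expected payoff for that player, computed from IS $w$ onward, is no better.

• Consistency: the probability distributions of the belief system must agree with the conditional probabilities of reaching each node of $w$ when the game is played according to the behavior strategy profile $b$. Definition: $(b, \mu)$ is consistent if there is a sequence $\{b^n\}$ of completely mixed behavior strategy profiles such that $\lim_{n \to \infty} (b^n, \mu^{b^n}) = (b, \mu)$. (Down, Front, Right paired with beliefs 0.5/0.5 cannot be achieved.)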

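• Observation 1: under perfect information, SGPE ⇔ SeqE.

• Observation 2: SeqE ⇒ SGPE, because when a single node $x$ forms an IS by itself, sequential rationality guarantees that $b_x$ is an NE of $\Gamma_x$.

• Theorem: under perfect recall at least one SeqE exists.

• (Middle, Back) is nonsensical (Forward Induction); adding the "Enter" move makes a difference.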
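
The beliefs $\mu^{b^n}$ in the consistency definition come from Bayes' rule applied inside each information set. A minimal sketch follows, assuming a hypothetical information set with two nodes `x_M` and `x_D` that are reached with trembling probabilities $\epsilon$ and $\epsilon^2$ (these reach probabilities are an assumption chosen for illustration). It shows that the limit belief depends on the relative rates of the trembles.

```python
from fractions import Fraction


def beliefs(reach_probs):
    """Given p(x, b) for each node x of one information set w, return mu^b on w."""
    total = sum(reach_probs.values())
    if total == 0:
        raise ValueError("information set reached with probability 0; Bayes' rule does not apply")
    return {x: p / total for x, p in reach_probs.items()}


def belief_along_sequence(n):
    """Beliefs under the n-th completely mixed profile, with eps = 1/n (hypothetical)."""
    eps = Fraction(1, n)
    return beliefs({"x_M": eps, "x_D": eps ** 2})


for n in (2, 10, 1000):
    print(n, belief_along_sequence(n))
```

Here the belief on `x_M` equals $n/(n+1)$, so it tends to 1 even though both reach probabilities tend to 0. A belief system that cannot be obtained as such a limit fails consistency.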

7 Perfect Equilibrium

• A tremble in $\Gamma$ is a vector $\eta = (\eta_1, \ldots, \eta_n)$ such that, for each $i \in N$, $\eta_i$ is a function from the set $C_i$ of the choices of the $i$-th player to $\mathbb{R}$ such that $\eta_i(c) > 0$ for each choice $c \in C_i$ and $\sum_{c \in C_w} \eta_i(c) < 1$ holds for each IS $w \in W_i$.

• Let $T(\Gamma)$ be the set of trembles in $\Gamma$.

• The $\eta$-perturbation of $\Gamma$ is the extensive game $(\Gamma, \eta)$, whose only difference from $\Gamma$ is that, for each player $i \in N$, his set $B_i$ of behavior strategies is reduced to $B^\eta_i = \{b_i \in B_i \mid b_i(c) \ge \eta_i(c) \text{ holds for each } w \in W_i \text{ and each } c \in C_w\}$. Every tremble is strictly positive, so these are all completely mixed behavior strategies.

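• Perfect equilibrium: $b$ is a perfect equilibrium of $\Gamma$ if there are sequences $\{\eta^k\}$ in $T(\Gamma)$ and $\{b^k\}$ such that (a) $\lim_{k \to \infty} \eta^k = (0, \ldots, 0)$, (b) $\lim_{k \to \infty} b^k = b$, and (c) for each sufficiently large index $k$, the behavior strategy profile $b^k$ is a Nash equilibrium of the $\eta^k$-perturbation $(\Gamma, \eta^k)$ of $\Gamma$.

• (Middle, Back) is a perfect equilibrium. (High, Down) is sequential but not perfect (payoffs: High-Up [1, 1]; High-Down [1, 1]; Low-Up [2, 0]; Low-Down [−1, −1]). To make money, one still has to "Try".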

8 Proper Equilibrium

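• Let $0 < \epsilon < 1$. A behavior strategy profile $b \in B$ is $\epsilon$-proper if "$u_i(b_{-w}, c') < u_i(b_{-w}, c) \Rightarrow b(c') \le \epsilon \cdot b(c)$" holds for each player $i$, each IS $w \in W_i$ and each pair of choices $c, c' \in C_w$. That is, if $c'$ is a worse response than $c$ in $C_w$ relative to $b_{-w}$, then the probability $b$ assigns to $c'$ must be smaller than the probability it assigns to $c$; how much smaller is governed by the error coefficient $\epsilon$.

• Proper equilibrium: $b$ is a proper equilibrium of $\Gamma$ if there are sequences $\{\epsilon^k\}$ and $\{b^k\}$ such that (a) $\lim_{k \to \infty} \epsilon^k = 0$, (b) $\lim_{k \to \infty} b^k = b$, and (c) each $b^k$ with sufficiently large $k$ is an $\epsilon^k$-proper strategy profile of $\Gamma$.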

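• (Middle, Back) is perfect but not proper: for Red Rabbit, the probability given to "Up" should be higher than the probability given to "Down". Hence for Green Rabbit the probability given to "Front" should be higher than the probability given to "Back", so the limit cannot be "Back".

• Types of equilibrium: Up-Front [3, 1]; Up-Back [1, 0]; Middle [2, 2]; Down-Front [0, 0]; Down-Back [$v$, 1]. When $v < 2$, (Middle, Back) is a perfect equilibrium. If $v \le 1$, (Middle, Back) is not proper. If $1 < v < 2$, it is both.

• (Middle, Back) is not proper, and (Middle, Front) is not proper either, but "(Try, Middle, Back)" is proper.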
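
The $\epsilon$-proper condition can be checked mechanically when each player has a single information set. The sketch below rests on several assumptions that are not stated in the notes: player 1 chooses among `U`, `M`, `D` (Up, Middle, Down), player 2 chooses between `F` and `B` (Front, Back) at one information set, `M` yields [2, 2] whatever player 2 does, and $v = 0$. Under those assumptions it tests one profile near (Middle, Back) and one near (Up, Front).

```python
from fractions import Fraction

V = 0  # assumed value of v (a case with v <= 1)

# payoffs[(c1, c2)] = (u1, u2), using the payoff list of this section.
PAYOFFS = {
    ("U", "F"): (3, 1), ("U", "B"): (1, 0),
    ("M", "F"): (2, 2), ("M", "B"): (2, 2),  # assumption: M ends the game
    ("D", "F"): (0, 0), ("D", "B"): (V, 1),
}


def is_eps_proper(b1, b2, eps):
    """Check u_i(b_-w, c') < u_i(b_-w, c)  =>  b(c') <= eps * b(c) for both players."""
    def u(player, own_choice):
        if player == 0:
            return sum(b2[c2] * PAYOFFS[(own_choice, c2)][0] for c2 in b2)
        return sum(b1[c1] * PAYOFFS[(c1, own_choice)][1] for c1 in b1)

    for player, b in ((0, b1), (1, b2)):
        for c in b:
            for c_bad in b:
                if u(player, c_bad) < u(player, c) and b[c_bad] > eps * b[c]:
                    return False
    return True


eps = Fraction(1, 10)

# Near (Middle, Back): player 1's trembles respect the ordering M > U > D.
near_mb = ({"U": Fraction(80, 1000), "M": Fraction(912, 1000), "D": Fraction(8, 1000)},
           {"F": Fraction(1, 10), "B": Fraction(9, 10)})
# Near (Up, Front).
near_uf = ({"U": Fraction(100, 111), "M": Fraction(10, 111), "D": Fraction(1, 111)},
           {"F": Fraction(10, 11), "B": Fraction(1, 11)})

print(is_eps_proper(*near_mb, eps), is_eps_proper(*near_uf, eps))
```

In the first profile player 1 trembles more toward `U` than toward `D`, so `F` is strictly better than `B` for player 2 and the heavy weight on `B` violates the condition. This mirrors the Red Rabbit and Green Rabbit argument above.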

9 Mixed OOXX Equilibrium

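• A mixed tremble in $\Gamma$ is a vector $\eta = (\eta_1, \ldots, \eta_n)$ such that, for each $i \in N$, $\eta_i$ is a function from the set $A_i$ of the pure strategies of the $i$-th player to $\mathbb{R}$ such that $\eta_i(a_i) > 0$ holds for each $a_i \in A_i$ and $\sum_{a_i \in A_i} \eta_i(a_i) < 1$.

Let $T_m(\Gamma)$ be the set of mixed trembles in $\Gamma$.

• The $\eta$-mixed perturbation of $\Gamma$ is the strategic game $(E(G_\Gamma), \eta)$, which differs from $E(G_\Gamma)$ only in the restriction $S_i^\eta = \{s_i \in S_i \mid s_i(a_i) \ge \eta_i(a_i) \text{ for each } a_i \in A_i\}$.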

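• Both "(Try, Middle, Back)" and "(Middle, Back)" are mixed perfect equilibria, but neither is mixed proper. With $v = 1.5$, however, (Middle, Back) is still mixed proper.

• Joy [2]; Anger-Sorrow [1]; Anger-Delight [0]. (Joy, Delight) is mixed proper but not subgame perfect.

• The ultimate tool: start from the definition of mixed proper. If $\Gamma$ has perfect recall, Kuhn's Theorem gives each $s^k$ a realization-equivalent $b^k$. If $\{b^k\}$ converges, then $b = \lim_{k \to \infty} b^k$ is a limit behavior strategy profile induced by $s$.

• It is always a sequential equilibrium.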

10 Bayesian Game

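• An $n$-player Bayesian game with set $N$ of players is a 4-tuple $BG = (\Theta, \rho, A, u)$ whose elements are as follows.

$\Theta = (\Theta_i)$, where $\Theta_i$ contains the possible states of the $i$-th player. $\rho$: the common prior, a probability distribution over $\Theta$.

• Pure strategy: $\hat{a}_i : \Theta_i \to A_i$.

• Behavior strategy: a mapping $\Theta_i \to \Delta A_i$.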

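• $\hat{u}_i(\theta_i, \hat{a}) = \sum_{\theta_{-i} \in \Theta_{-i}} \rho(\theta_{-i} \mid \theta_i) \cdot u_i(\theta, \hat{a}(\theta))$.

• $\rho(\theta_{-i}) = \sum_{\theta_i \in \Theta_i} \rho(\theta_i) \cdot \rho(\theta_{-i} \mid \theta_i)$.

• $\hat{a} \in \hat{A}$ is a pure Nash equilibrium of $BG$ if $\hat{u}_i(\theta_i, \hat{a}) \ge \hat{u}_i(\theta_i, \hat{a}_{-\theta_i}, a_i)$ holds for each $i$, $\theta_i$, $a_i$.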
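
The interim expected utility $\hat{u}_i(\theta_i, \hat{a})$ is a conditional expectation under the common prior. The following is a minimal sketch, assuming a hypothetical two-player game in which the type names, the prior, the pure strategies and the payoff table are all invented for illustration (the payoffs here also ignore the types, purely to keep the example short).

```python
from fractions import Fraction


def interim_utility(i, theta_i, strategies, prior, u):
    """u_hat_i(theta_i, a_hat) = sum over theta_-i of rho(theta_-i | theta_i) * u_i(theta, a_hat(theta)).

    prior maps full type profiles (tuples) to probabilities;
    strategies[j] maps player j's type to an action; u(i, theta, actions) is player i's payoff.
    """
    matching = {theta: p for theta, p in prior.items() if theta[i] == theta_i}
    marginal = sum(matching.values())  # rho(theta_i)
    total = Fraction(0)
    for theta, p in matching.items():
        actions = tuple(strategies[j][theta[j]] for j in range(len(theta)))
        total += (p / marginal) * u(i, theta, actions)  # p / marginal = rho(theta_-i | theta_i)
    return total


# Hypothetical data (assumptions for illustration only).
prior = {("a", "x"): Fraction(1, 4), ("a", "y"): Fraction(1, 4), ("b", "x"): Fraction(1, 2)}
strategies = [{"a": "U", "b": "D"}, {"x": "L", "y": "R"}]
TABLE = {("U", "L"): (2, 1), ("U", "R"): (1, 0), ("D", "L"): (0, 0), ("D", "R"): (0, 2)}


def u(i, theta, actions):
    return TABLE[actions][i]


print(interim_utility(0, "a", strategies, prior, u))
print(interim_utility(1, "x", strategies, prior, u))
```

A pure Nash check for the Bayesian game would compare this value with the one obtained after changing only the action that type $\theta_i$ plays.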

11 More general setting

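• An $n$-player Bayesian game with set $N$ of players is a 6-tuple $BG = (\Omega, \Theta, f, \rho, A, u)$. $\Omega$: the set of all external states. $\Theta_i$: the signals that the $i$-th player may receive. $f$: the signal functions, $f_i : \Omega \to \Theta_i$. $\rho$: the players' beliefs about the probability distribution over $\Omega$, which differ from player to player. $u$: the payoff function, determined by $\Omega \times A$.

• Each type of player $i$ has its own worldview, namely its belief about the probability distribution over $\Omega$, which is no longer $\rho_i(\cdot)$ but $\rho_i(\cdot \mid \theta_i)$.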

12 Coalitional game

• $N$: the set of players. Each subset $S \subseteq N$ is a coalition. $N$ is the grand coalition. $\mathbb{R}^S$ is the set of payoff vectors $x$ for coalition $S$.

• Coalitional game $(N, V)$: $V$ assigns to each non-empty coalition $S$ a non-empty set $V(S)$ of actions in $\mathbb{R}^S$ (the payoffs of the different actions).

• Let $x \in V(N)$ and $y \in V(S)$. We say $y$ strictly dominates $x$ if $y_i > x_i$ for all $i \in S$.

• $x \in V(N)$ is coalitionally rational if no non-empty, non-grand coalition $S$ has a $y \in V(S)$ that strictly dominates $x$.

• The core of $G = (N, V)$ is the set of actions of $N$ that are coalitionally rational.

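• With transferable payoff, $(N, v)$: $v$ assigns to each non-empty coalition $S \subseteq N$ a non-negative number $v(S) \in \mathbb{R}$.

Any way of dividing the payoff within $S$ is a legal action as long as the total is $v(S)$: a division $x$ satisfies $\sum_{i \in S} x_i = v(S)$.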
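
With transferable payoff, a coalition $S$ has a division that strictly dominates $x$ exactly when $\sum_{i \in S} x_i < v(S)$, so core membership reduces to checking one inequality per proper coalition. Below is a minimal sketch of that check, assuming a hypothetical three-player game whose worths are made up for illustration.

```python
from itertools import combinations


def in_core(x, v, players):
    """x: dict player -> payoff; v: dict frozenset -> worth.

    x is in the core if it divides v(N) exactly and no proper non-empty
    coalition S can guarantee its members strictly more, i.e. sum_{i in S} x_i >= v(S).
    """
    if sum(x[i] for i in players) != v[frozenset(players)]:
        return False
    for size in range(1, len(players)):
        for S in combinations(players, size):
            if sum(x[i] for i in S) < v[frozenset(S)]:
                return False
    return True


# Hypothetical transferable-payoff game (worths are illustrative assumptions).
players = [1, 2, 3]
v = {
    frozenset({1}): 0, frozenset({2}): 0, frozenset({3}): 0,
    frozenset({1, 2}): 4, frozenset({1, 3}): 3, frozenset({2, 3}): 2,
    frozenset({1, 2, 3}): 6,
}

print(in_core({1: 3, 2: 2, 3: 1}, v, players))
print(in_core({1: 6, 2: 0, 3: 0}, v, players))
```

The second division fails because coalition {2, 3} receives 0 in total but is worth 2 on its own.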

13 Shapley Value

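• Let $\pi$ be a permutation of the players. Suppose that $i$ is the $j$-th player in $\pi$. Then the marginal contribution of player $i$ with respect to $\pi$ is $m^\pi_i(v) = v(S^\pi_j) - v(S^\pi_{j-1})$, the difference in worth between the first $j$ players and the first $j - 1$ players of $\pi$.

• The Shapley value for $(N, v)$ is $\Phi_i = \frac{1}{n!} \sum_{\pi} m^\pi_i(v)$.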

• Quality seal 1 (Efficiency): $\Phi$ is an allocation rule of $(N, v)$ such that $\sum_{i \in N} \Phi_i = v(N)$.

• Quality seal 2 (Null player): if $i$ is a null player then $\Phi_i = 0$.

• Quality seal 3 (Symmetric players): $i, j$ are symmetric if $v(S \cup \{i\}) = v(S \cup \{j\})$ holds for all $S \subseteq N - \{i, j\}$. If $i, j$ are symmetric then $\Phi_i = \Phi_j$.

• Quality seal 4 (Additivity): Let $\Phi^1$ be the Shapley value of game $(N, u)$ and $\Phi^2$ the Shapley value of game $(N, v)$. Then $\Phi = \Phi^1 + \Phi^2$ is the Shapley value of game $(N, u + v)$.
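
The permutation formula for the Shapley value translates directly into code for small games. The sketch below assumes a hypothetical "glove game" (players 1 and 2 each hold a left glove, player 3 holds a right glove, and a coalition is worth 1 if it can form a pair); the game is an illustration, not an example from these notes. Enumerating all $n!$ orders is only practical for a handful of players.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def shapley_values(players, v):
    """Phi_i = (1/n!) * sum over permutations pi of v(S_j) - v(S_{j-1}),
    where S_j is the set of the first j players of pi and i is the j-th."""
    totals = {i: Fraction(0) for i in players}
    for order in permutations(players):
        coalition = frozenset()
        for i in order:
            with_i = coalition | {i}
            totals[i] += v(with_i) - v(coalition)  # marginal contribution of i
            coalition = with_i
    n_fact = factorial(len(players))
    return {i: total / n_fact for i, total in totals.items()}


def glove(S):
    """Hypothetical worth function: 1 if S holds a left glove (1 or 2) and the right glove (3)."""
    return 1 if 3 in S and (1 in S or 2 in S) else 0


players = [1, 2, 3]
phi = shapley_values(players, glove)
print(phi)
```

In this example the result satisfies efficiency (the values sum to $v(N) = 1$) and symmetry (players 1 and 2 receive the same value).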
