1 Extensive Game
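• An n-player extensive game Γ with a set N = {1, 2, . . . , n} of players is a 7-tuple (X, E, P, W, C, p, U):
(X, E) is the game tree, with Z its set of leaves; P = (P_i) specifies which nodes each player owns; W is the information partition; C is the choice partition; p is the probability distribution that nature assigns at each node in P_0; U is the utilities.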
• The game tree (X, E) is finite. Each play of the game is a path from the root r to a leaf node z ∈ Z.
• The 0-th player is also known as nature.
• Γ is an extensive game with perfect information if, for each i ∈ N, each information set w ∈ W_i contains exactly one node of X − Z (the tree has no dashed edges). Otherwise Γ has imperfect information.
• Γ is an extensive game with perfect recall if, for each i ∈ N and any two information sets w and w′ in W_i, the following condition holds: if a node x ∈ w′ comes after a choice c ∈ C_w, then all nodes of w′ come after c. (I remember every choice I made earlier.)
2 Strategies
• Pure strategy: a_i(w) ∈ C_w. Pure strategy profile: a = (a_1, a_2, . . . , a_n) ∈ A.
• Mixed strategy: s_i ∈ S_i is a probability distribution over A_i. Mixed strategy profile: s = (s_1, s_2, . . . , s_n) ∈ S.
• Behavior strategy: b_i of the i-th player is a mapping that assigns to each information set w ∈ W_i a probability distribution over C_w. For each choice c ∈ C_w, b_i(c) is the probability that the i-th player chooses c when she reaches information set w. Requirement: for any two distinct information sets w and w′ in W_i, the probability distributions that b_i assigns on C_w and C_w′ must be independent.
• Expected payoff: u_i(q) = Σ_{z∈Z} p(z, q) · U_i(z).
• Q_i = A_i ∪ S_i ∪ B_i.
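As a rough illustration of the expected-payoff formula for a behavior strategy profile, the sketch below computes p(z, b) as the product of the choice probabilities on the path to each leaf. The toy tree, the choice labels, and the utilities are assumptions made up for this example.

```python
from math import prod

def leaf_probability(path, b):
    """p(z, b): product of the probabilities of the choices on the path to z."""
    return prod(b[c] for c in path)

def expected_payoff(leaves, b, i):
    """u_i(b) = sum over leaves z of p(z, b) * U_i(z)."""
    return sum(leaf_probability(path, b) * U[i] for path, U in leaves)

# Toy tree (illustrative): player 1 picks L or R; after L, player 2 picks l or r.
# Each leaf is (path of choices, utilities per player).
leaves = [
    (("L", "l"), {1: 3, 2: 1}),
    (("L", "r"), {1: 1, 2: 0}),
    (("R",),     {1: 2, 2: 2}),
]
# Behavior strategy profile: probability of each choice at its information set.
b = {"L": 0.5, "R": 0.5, "l": 0.25, "r": 0.75}

print(expected_payoff(leaves, b, 1))  # 0.125*3 + 0.375*1 + 0.5*2
print(expected_payoff(leaves, b, 2))  # 0.125*1 + 0.375*0 + 0.5*2
```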
3 Equivalence of Strategies
• High-bar equivalence (equivalent): s_i(a_i) = b_i(a_i) holds for all a_i ∈ A_i, where b_i(a_i) = ∏_{w∈W_i} b_i(a_i(w)).
• For any behavior strategy there exists a high-bar equivalent mixed strategy, but the converse need not hold. (Example: "stop, switch, continue, stay".)
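• By the independence requirement on behavior strategies, under perfect recall (no information set is visited twice on any path), if b_i and s_i are high-bar equivalent, then p(x, (ŝ_{−i}, s_i)) = p(x, (ŝ_{−i}, b_i)) holds for every node x and every mixed strategy profile ŝ_{−i}.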
• Medium-bar equivalence (realization equivalent): s_i and b_i induce the same probability distribution over the set X of nodes. That is, p(x, (ŝ_{−i}, s_i)) = p(x, (ŝ_{−i}, b_i)) holds for all x ∈ X and ŝ_{−i} ∈ S_{−i}.
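• (Under perfect recall, high-bar ⇒ medium-bar.) Because a root-to-leaf path never passes through the same information set twice, p(x, q) = Σ_{a∈A} p(x, a) · q(a), where q(a) is the probability that play under q follows a. High-bar equivalence guarantees that the probability of following a under (ŝ_{−i}, s_i) equals the probability of following a_{−i} under ŝ_{−i} times s_i(a_i), and s_i(a_i) = b_i(a_i), so medium-bar equivalence holds.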
• Low-bar equivalence (payoff equivalent): s_i and b_i induce the same expected payoff, i.e., u(ŝ_{−i}, s_i) = u(ŝ_{−i}, b_i) holds for each ŝ_{−i} ∈ S_{−i}.
• (Medium-bar ⇒ low-bar.) For every node x, the probability of reaching x is the same under b_i as under s_i, so the payoffs coincide.
• Kuhn's Theorem: Let Γ be an extensive game with perfect recall and let i ∈ N. If s_i ∈ S_i is a mixed strategy of the i-th player in G_Γ, then there is a behavior strategy b_i of the i-th player in Γ such that s_i and b_i are realization equivalent. (Proof to be added.)
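The easy direction (from a behavior strategy to a high-bar equivalent mixed strategy) follows directly from the product formula b_i(a_i) = ∏_{w∈W_i} b_i(a_i(w)). A minimal sketch, assuming a hypothetical dict encoding of b_i and made-up information set and choice names:

```python
from itertools import product
from math import prod

def behavior_to_mixed(b_i):
    """Build s_i with s_i(a_i) = product over w of b_i(a_i(w)).

    b_i: dict info_set -> dict choice -> probability.
    Returns a dict mapping each pure strategy, written as a tuple of
    (info_set, choice) pairs, to its probability.
    """
    info_sets = sorted(b_i)
    s_i = {}
    # A pure strategy picks exactly one choice at every information set.
    for choices in product(*(sorted(b_i[w]) for w in info_sets)):
        a_i = tuple(zip(info_sets, choices))
        s_i[a_i] = prod(b_i[w][c] for w, c in a_i)
    return s_i

# Illustrative behavior strategy over two information sets.
b_i = {"w1": {"L": 0.5, "R": 0.5}, "w2": {"l": 0.25, "r": 0.75}}
s_i = behavior_to_mixed(b_i)
for a_i, prob in s_i.items():
    print(a_i, prob)
```

The independence requirement on behavior strategies is what makes the product the right probability for each pure strategy.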
4 Nash Equilibrium
• Pure Nash: u_i(a∗) ≥ u_i(a∗_{−i}, a_i) holds for each i ∈ N and each pure strategy a_i ∈ A_i.
• Mixed Nash: u_i(s∗) ≥ u_i(s∗_{−i}, s_i). (Found on G_Γ with matrix-game methods.)
• Behavior Nash: u_i(b∗) ≥ u_i(b∗_{−i}, b_i).
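• Observation 1: under perfect recall (PR), a∗ ∈ A is a pure Nash equilibrium ⇔ it is a mixed Nash equilibrium ⇔ it is a behavior Nash equilibrium. Proof: M ⇒ B ⇒ P, because B_i ⊆ S_i holds (every b_i has an equivalent s_i), so no better strategy becomes available. P ⇒ M: u_i(a∗_{−i}, s_i) = Σ_{a_i∈A_i} s_i(a_i) · u_i(a∗_{−i}, a_i), a weighted average of payoffs that are each at most u_i(a∗).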
• Observation 2: PR ⇒ there exist at least one mixed Nash equilibrium s∗ and one behavior Nash equilibrium b∗. (Proof: Nash + Kuhn.)
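The pure Nash condition and the averaging identity used in the P ⇒ M step can be checked on a small strategic game. The sketch below uses an illustrative 2x2 game (a prisoner's dilemma) that does not come from the notes; players are indexed from 0 in the code.

```python
from itertools import product

def is_pure_nash(payoff, actions, profile):
    """Check u_i(a*) >= u_i(a*_{-i}, a_i) for every player i and action a_i."""
    for i, acts in enumerate(actions):
        for dev in acts:
            deviated = profile[:i] + (dev,) + profile[i + 1:]
            if payoff[deviated][i] > payoff[profile][i]:
                return False
    return True

def pure_nash_equilibria(payoff, actions):
    """Enumerate all pure strategy profiles that pass the Nash check."""
    return [a for a in product(*actions) if is_pure_nash(payoff, actions, a)]

def mixed_deviation_payoff(payoff, profile, i, s_i):
    """u_i(a*_{-i}, s_i) = sum over a_i of s_i(a_i) * u_i(a*_{-i}, a_i)."""
    return sum(p * payoff[profile[:i] + (a_i,) + profile[i + 1:]][i]
               for a_i, p in s_i.items())

# Illustrative game, not taken from the notes.
actions = [["C", "D"], ["C", "D"]]
payoff = {("C", "C"): (2, 2), ("C", "D"): (0, 3),
          ("D", "C"): (3, 0), ("D", "D"): (1, 1)}

print(pure_nash_equilibria(payoff, actions))
# A mixed deviation is a weighted average of pure deviations, so it cannot
# beat the equilibrium payoff of player 0 at ("D", "D").
print(mixed_deviation_payoff(payoff, ("D", "D"), 0, {"C": 0.5, "D": 0.5}))
```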
5 Subgame Perfect Equilibrium
• The "nonsensical" game (grab / fight / give up / yield).
• Let Γ be an extensive game. A subgame perfect equilibrium (SGPE) of Γ is a behavior strategy profile b ∈ B of Γ such that the subprofile b_x is a Nash equilibrium of the subgame Γ_x for each internal node x ∈ X − Z at which Γ can be decomposed.
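• Theorem 1: Every extensive game Γ with perfect information has at least one pure strategy profile a ∈ A that is a subgame perfect equilibrium of Γ. (Proof: by induction on the number of nodes. Pick a lowest internal node x and contract it to obtain Γ′; by the induction hypothesis Γ′ has an SGPE a′ = a_{−x}. It remains to show that a = (a_x, a_{−x}) is an SGPE, i.e., that a_y is an NE of Γ_y for every y. There are two cases: y is an ancestor of x, or it is not.)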
• Corollary: Every extensive game with perfect information has at least one pure Nash equilibrium, because every SGPE is an NE. (Example: the Centipede game.)
• Theorem 2 (One-shot deviation principle): Let Γ be an extensive game with perfect information. For any behavior strategy profile b ∈ B, b is an SGPE of Γ iff no one-shot deviation from b is profitable.
• Generalization (Theorem 3): Every extensive game with perfect recall has at least one behavior strategy profile b ∈ B that is an SGPE of Γ. (Proof: induct on the nodes at which the game can be decomposed.)
• Without perfect recall, a behavior Nash equilibrium need not exist.
• (Middle, Back) is "nonsensical" (backward induction).
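For perfect-information games, the contraction argument behind Theorem 1 is essentially backward induction. The sketch below is one way to write it, assuming a hypothetical nested-dict tree encoding with 0-based player indices; the short centipede-style game and its payoffs are made up for illustration, and ties are broken in favor of the first listed choice.

```python
def backward_induction(node, plan=None):
    """Solve a finite perfect-information tree from the leaves upward.

    node: {"payoff": (u0, u1, ...)} for a leaf, or
          {"name": str, "player": i, "children": {choice: subtree}}.
    Returns (payoff vector reached, dict node name -> chosen action).
    """
    if plan is None:
        plan = {}
    if "payoff" in node:
        return node["payoff"], plan
    i = node["player"]
    best_choice, best_payoff = None, None
    for choice, child in node["children"].items():
        payoff, _ = backward_induction(child, plan)
        # The owner of the node keeps the choice that maximizes her own payoff.
        if best_payoff is None or payoff[i] > best_payoff[i]:
            best_choice, best_payoff = choice, payoff
    plan[node["name"]] = best_choice
    return best_payoff, plan

# Illustrative three-move centipede-style game (payoffs are assumptions).
game = {"name": "x1", "player": 0, "children": {
    "Stop": {"payoff": (1, 0)},
    "Go": {"name": "x2", "player": 1, "children": {
        "Stop": {"payoff": (0, 2)},
        "Go": {"name": "x3", "player": 0, "children": {
            "Stop": {"payoff": (3, 1)},
            "Go": {"payoff": (2, 4)},
        }},
    }},
}}

payoff, plan = backward_induction(game)
print(payoff, plan)
```

In this toy game every node's owner stops, even though both players would do better further down the tree, which is the usual point of the Centipede example.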
6 Sequential Equilibrium
• An assessment is a pair (b, µ), where µ is a system of beliefs: a probability distribution on each information set (IS).
• (b, µ) is a sequential equilibrium if (b, µ) is sequentially rational and consistent.
• Sequentially rational: for each player i and each information set w ∈ W_i, replacing b_i in (b, µ) by any other b̂_i does not yield a higher expected payoff for player i when play starts from w.
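• Consistency: the beliefs must agree with the conditional probabilities of reaching each node of w when the game is played according to the behavior strategy profile b. Definition: (b, µ) is consistent if there is a sequence {b^n} of completely mixed behavior strategy profiles such that lim_{n→∞} (b^n, µ^{b^n}) = (b, µ). (Example: (Down, Front) paired with 0.5/0.5 beliefs on the right cannot be obtained this way.)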
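• Observation 1: under perfect information, SGPE ⇔ SeqE.
• Observation 2: SeqE ⇒ SGPE, because when a single node x forms an information set by itself, sequential rationality guarantees exactly that b_x is an NE of Γ_x.
• Theorem: Under perfect recall there exists at least one SeqE.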
• (Middle, Back) is "nonsensical" (forward induction); adding the move "In" makes a difference.
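The consistency condition can be made concrete by computing beliefs with Bayes' rule along a sequence of completely mixed profiles. The sketch below skips the strategies themselves and feeds in the reach probabilities directly; the information set {x, y} and the tremble rates ε and ε² are assumptions chosen for illustration.

```python
def beliefs(reach_probs):
    """Bayes' rule on one information set: mu(x) = p(x, b) / sum_y p(y, b)."""
    total = sum(reach_probs.values())
    return {x: p / total for x, p in reach_probs.items()}

# An information set w = {x, y} that is reached with probability 0 in the limit.
# Node x needs one tremble to be reached, node y needs two.
for eps in (0.1, 0.01, 0.001):
    mu = beliefs({"x": eps, "y": eps ** 2})
    print(eps, mu["x"], mu["y"])  # mu(x) = 1 / (1 + eps), which tends to 1

# Equal beliefs arise only if both nodes are reached at the same rate.
print(beliefs({"x": 0.01, "y": 0.01}))
```

Which limit beliefs are attainable therefore depends on how the reach probabilities of the nodes compare along the sequence, not only on the limit profile b.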
7 Perfect Equilibrium
• A tremble in Γ is a vector η = (η_1, . . . , η_n) such that for each i ∈ N, η_i is a function from the set C_i of the choices of the i-th player to R such that η_i(c) > 0 for each choice c ∈ C_i and Σ_{c∈C_w} η_i(c) < 1 holds for each IS w ∈ W_i.
• Let T(Γ) be the set of trembles in Γ.
• The η-perturbation of Γ is the extensive game (Γ, η), whose only difference from Γ is that, for each player i ∈ N, the set B_i of behavior strategies is reduced to B_i^η = {b_i ∈ B_i | b_i(c) ≥ η_i(c) holds for each w ∈ W_i and each c ∈ C_w}. Every tremble is strictly positive, so all strategies in B_i^η are completely mixed behavior strategies.
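• Perfect equilibrium: b ∈ B is a perfect equilibrium of Γ if there are sequences {η^k} of trembles and {b^k} of behavior strategy profiles such that (a) lim_{k→∞} η^k = (0, . . . , 0), (b) lim_{k→∞} b^k = b, and (c) for each sufficiently large index k, b^k is a Nash equilibrium of the η^k-perturbation (Γ, η^k) of Γ.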
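• (Middle, Back) is a perfect equilibrium. (High, Down) is sequential but not perfect (payoffs: High-Up [1, 1]; High-Down [1, 1]; Low-Up [2, 0]; Low-Down [−1, −1]). To make money, one still has to try.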
8 Proper Equilibrium
• Let 0 < ϵ < 1. A behavior strategy profile b ∈ B is ϵ-proper if "u_i(b_{−w}, c′) < u_i(b_{−w}, c) ⇒ b(c′) ≤ ϵ · b(c)" holds for each player i, each IS w ∈ W_i, and each pair of choices c, c′ ∈ C_w. That is, if c′ is a worse response than c in C_w against b_{−w}, then the probability b assigns to c′ must be smaller than the probability it assigns to c; how much smaller is governed by the error coefficient ϵ.
• Proper equilibrium: b is a proper equilibrium if there are sequences {ϵ_k} and {b^k} such that (a) lim_{k→∞} ϵ_k = 0, (b) lim_{k→∞} b^k = b, and (c) each b^k with sufficiently large k is an ϵ_k-proper strategy profile of Γ.
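• (Middle, Back) is perfect but not proper: Red Rabbit should put higher probability on "Up" than on "Down". Green Rabbit should therefore put higher probability on "Front" than on "Back", so the limit cannot be "Back".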
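• Types of equilibria. Payoffs: Up-Front [3, 1]; Up-Back [1, 0]; Middle [2, 2]; Down-Front [0, 0]; Down-Back [v, 1]. When v < 2, (Middle, Back) is a perfect equilibrium. If v ≤ 1, (Middle, Back) is not proper. If 1 < v < 2, it is both.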
• (Middle, Back) is not proper and (Middle, Front) is not proper either, but "Try, Middle, Back" is proper.
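The ϵ-proper condition can be tested numerically on the rabbit game. The sketch below assumes a particular game structure that the notes only imply: Red Rabbit chooses Up, Middle, or Down; Middle ends the game; Green Rabbit chooses Front or Back without observing whether Up or Down was played. The trial profiles near (Middle, Back) are also assumptions.

```python
def normalize(weights):
    """Turn positive weights into a probability distribution."""
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}

def is_eps_proper(red, green, v, eps):
    """Check: a strictly worse choice c_bad gets at most eps times b(c)."""
    f, k = green["Front"], green["Back"]
    # Payoff of each pure choice against the other player's mixture.
    u_red = {"Up": 3 * f + 1 * k, "Middle": 2.0, "Down": 0 * f + v * k}
    u_green = {
        "Front": red["Up"] * 1 + red["Middle"] * 2 + red["Down"] * 0,
        "Back": red["Up"] * 0 + red["Middle"] * 2 + red["Down"] * 1,
    }
    for probs, payoff in ((red, u_red), (green, u_green)):
        for c in probs:
            for c_bad in probs:
                if payoff[c_bad] < payoff[c] and probs[c_bad] > eps * probs[c]:
                    return False
    return True

eps = 0.1
green_back = normalize({"Front": eps ** 2, "Back": 1.0})

# v = 1: Down is worse than Up for Red, so Up must outweigh Down, which makes
# Front better than Back for Green; a profile near Back then fails the test.
red_v1 = normalize({"Middle": 1.0, "Up": eps ** 2, "Down": eps ** 4})
print(is_eps_proper(red_v1, green_back, v=1.0, eps=eps))

# v = 1.5: Down beats Up for Red, so Down may outweigh Up and Back is fine.
red_v15 = normalize({"Middle": 1.0, "Down": eps ** 2, "Up": eps ** 4})
print(is_eps_proper(red_v15, green_back, v=1.5, eps=eps))
```

A single failing profile does not prove that (Middle, Back) is not proper; the check only mirrors the argument given in the bullets for why no ϵ-proper sequence can converge to Back when v ≤ 1.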
9 Mixed Perfect / Proper Equilibrium
• A mixed tremble in Γ is a vector η = (η_1, . . . , η_n) such that for each i ∈ N, η_i is a function from the set A_i of the pure strategies of the i-th player to R such that η_i(a_i) > 0 holds for each a_i ∈ A_i and Σ_{a_i∈A_i} η_i(a_i) < 1.
Let T_m(Γ) be the set of mixed trembles in Γ.
• The η-mixed perturbation of Γ is the strategic game (E(G_Γ), η), which differs only in the restriction S_i^η = {s_i ∈ S_i | s_i(a_i) ≥ η_i(a_i) for each a_i ∈ A_i}.
• "Try, Middle, Back" and (Middle, Back) are both mixed perfect equilibria, but neither is mixed proper. With v = 1.5, however, (Middle, Back) is still mixed proper.
• Joy [2]; Anger-Sorrow [1]; Anger-Delight [0]. (Joy, Delight) is mixed proper but not subgame perfect.
• The ultimate tool: start from the definition of mixed proper. If Γ has perfect recall, Kuhn's Theorem gives each s^k a realization-equivalent b^k; if {b^k} converges, then b = lim_{k→∞} b^k is a limit behavior strategy profile induced by s.
• Such a b is always a sequential equilibrium.
10 Bayesian Game
• An n-player Bayesian game with set N of players is a 4-tuple BG = (Θ, ρ, A, u) whose elements are as follows.
Θ = (Θ_i), where Θ_i contains the states of the i-th player. ρ, the common prior, is a probability distribution over Θ.
• Pure strategy: â_i : Θ_i → A_i.
• Behavior strategy: a mapping Θ_i → Δ(A_i).
• û_i(θ_i, â) = Σ_{θ_{−i}∈Θ_{−i}} ρ(θ_{−i} | θ_i) · u_i(θ, â(θ)).
• ρ(θ_{−i}) = Σ_{θ_i∈Θ_i} ρ(θ_i) · ρ(θ_{−i} | θ_i).
• â ∈ Â is a pure Nash equilibrium of BG if û_i(θ_i, â) ≥ û_i(θ_i, â_{−θ_i}, a_i) holds for each i, θ_i, and a_i.
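The interim expected utility û_i(θ_i, â) can be computed directly from the common prior. The sketch below is restricted to two players (indexed 0 and 1 in the code); the type names, the prior, the strategies, and the coordination payoff are all made-up assumptions.

```python
def conditional(rho, i, theta_i):
    """rho(theta_{-i} | theta_i) for a two-player prior over (theta_0, theta_1)."""
    marginal = sum(p for th, p in rho.items() if th[i] == theta_i)
    return {th[1 - i]: p / marginal for th, p in rho.items() if th[i] == theta_i}

def interim_utility(rho, u, strategies, i, theta_i):
    """Sum over theta_{-i} of rho(theta_{-i} | theta_i) * u_i(theta, a(theta))."""
    total = 0.0
    for theta_other, p in conditional(rho, i, theta_i).items():
        theta = (theta_i, theta_other) if i == 0 else (theta_other, theta_i)
        # Each player's pure strategy maps her own type to an action.
        actions = tuple(strategies[j][theta[j]] for j in range(2))
        total += p * u(i, theta, actions)
    return total

# Illustrative common prior with positively correlated types.
rho = {("s", "s"): 0.4, ("s", "w"): 0.1, ("w", "s"): 0.1, ("w", "w"): 0.4}
# Both players play "A" when strong ("s") and "B" when weak ("w").
strategies = [{"s": "A", "w": "B"}, {"s": "A", "w": "B"}]

def u(i, theta, actions):
    """Coordination payoff: 2 if the actions match, 0 otherwise (an assumption)."""
    return 2.0 if actions[0] == actions[1] else 0.0

print(interim_utility(rho, u, strategies, 0, "s"))  # 0.8 * 2 + 0.2 * 0
```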
11 More general setting
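• An n-player Bayesian game with set N of players is a 6-tuple BG = (Ω, Θ, f, ρ, A, u). Ω: the set of all external states. Θ_i: the signals the i-th player may receive. f: the signal function, f_i : Ω → Θ_i. ρ: each player may hold a different belief about the probability distribution over Ω. u: the payoff function, determined by Ω × A.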
• Each type of player i has its own view of the world, namely its belief about the probability distribution over Ω, which is no longer ρ_i(·) but ρ_i(· | θ_i).
12 Coalitional game
• N: the set of players. Each subset S ⊆ N is a coalition. N is the grand coalition. R^S is the set of payoff vectors x for coalition S.
• A coalitional game is a pair (N, V), where V assigns to each non-empty coalition S a non-empty set of actions in R^S (the payoffs of the different actions).
• Let x ∈ V(N) and y ∈ V(S). We say y strictly dominates x if y_i > x_i for all i ∈ S.
• x ∈ V(N) is coalitionally rational if no non-empty, non-grand coalition S has a y ∈ V(S) that strictly dominates x.
• The core of G = (N, V) is the set of actions of N that are coalitionally rational.
• With transferable payoff, the game is (N, v), where v assigns to each non-empty coalition S ⊆ N a non-negative number v(S) ∈ R. Any division within S whose total is v(S) is a legal action: a division x satisfies Σ_{i∈S} x_i = v(S).
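With transferable payoff, a coalition S has a division that strictly dominates x exactly when v(S) > Σ_{i∈S} x_i, so core membership reduces to a list of inequalities. The sketch below checks them by brute force; the dict-of-frozensets encoding of v and the three-player example games are assumptions for illustration.

```python
from itertools import combinations

def in_core(x, v, tol=1e-9):
    """x: dict player -> payoff; v: dict frozenset(coalition) -> worth."""
    players = sorted(x)
    # x must be a legal division of v(N).
    if abs(sum(x.values()) - v[frozenset(players)]) > tol:
        return False
    # No proper non-empty coalition may be able to do strictly better.
    for size in range(1, len(players)):
        for S in combinations(players, size):
            if sum(x[i] for i in S) < v[frozenset(S)] - tol:
                return False
    return True

def three_player_game(pair_worth):
    """Hypothetical game: singletons get 0, pairs get pair_worth, N gets 1."""
    v = {frozenset({i}): 0.0 for i in (1, 2, 3)}
    for S in ((1, 2), (1, 3), (2, 3)):
        v[frozenset(S)] = pair_worth
    v[frozenset({1, 2, 3})] = 1.0
    return v

equal_split = {1: 1 / 3, 2: 1 / 3, 3: 1 / 3}
print(in_core(equal_split, three_player_game(0.5)))  # pairs get 2/3 >= 0.5
print(in_core(equal_split, three_player_game(1.0)))  # pairs get 2/3 < 1
```

The enumeration is exponential in the number of players, so this is only meant for very small games.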
13 Shapley Value
• Let π be a permutation of the players and suppose that i is the j-th player in π. The marginal contribution of player i with respect to π is m_i^π(v) = v(S_j^π) − v(S_{j−1}^π), the difference in worth between the first j players and the first j − 1 players of π.
• The Shapley value for (N, v) is Φ_i = (1/n!) Σ_π m_i^π(v).
• Quality seal 1 (Efficiency): Φ is an allocation rule of (N, v) such that Σ_{i∈N} Φ_i = v(N).
• Quality seal 2 (Null player): if i is a null player, then Φ_i = 0.
• Quality seal 3 (Symmetric players): i and j are symmetric if v(S ∪ {i}) = v(S ∪ {j}) holds for all S ⊆ N − {i, j}. If i and j are symmetric, then Φ_i = Φ_j.
• Quality seal 4 (Additivity): Let Φ^1 be the Shapley value of the game (N, u) and Φ^2 the Shapley value of the game (N, v). Then Φ = Φ^1 + Φ^2 is the Shapley value of the game (N, u + v).
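The permutation formula for the Shapley value translates into a short brute-force computation. The sketch below averages marginal contributions over all n! orderings using exact fractions; the three-player game (player 1 is needed together with at least one partner) is an assumption chosen for illustration.

```python
from fractions import Fraction
from itertools import permutations

def shapley_values(players, v):
    """Phi_i = (1/n!) * sum over permutations pi of m_i^pi(v).

    v: function from a frozenset of players to the coalition's worth,
    with v(frozenset()) == 0.
    """
    totals = {i: Fraction(0) for i in players}
    count = 0
    for pi in permutations(players):
        count += 1
        before = frozenset()
        for i in pi:
            after = before | {i}
            # Marginal contribution of i: worth with i minus worth without i.
            totals[i] += Fraction(v(after)) - Fraction(v(before))
            before = after
    return {i: t / count for i, t in totals.items()}

def v(S):
    """Illustrative game: worth 1 iff player 1 is in S with at least one partner."""
    return 1 if 1 in S and len(S) >= 2 else 0

phi = shapley_values([1, 2, 3], v)
print(phi)
print(sum(phi.values()) == v(frozenset({1, 2, 3})))  # efficiency
```

In this example players 2 and 3 are symmetric and receive the same value, and the values add up to v(N), in line with the efficiency and symmetry properties listed above.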