Amortized Analysis

(1)

Analysis

Michael Tsai 2013/11/14

(2)

為什麼需要Amortized Analysis

O(f(n))

n operations =O(n f(n)) ??

running time

估得太鬆了!

(3)

Amortized Analysis

 Amortized:

 Reduce or extinguish (a debt) by money regularly put aside

 攤銷

 可以想成是把一連串的operation一起考慮它的花費(也就是執行時間)

 一起考慮的時候有時候估計比較準確(bound比較緊)

(4)

Aggregated Analysis

 T(n): n個operation最差狀況下所需花的時間

 因此, 每個operation攤分的所需時間(amortized cost)為T(n)/n

 在aggregated analysis中, 我們不區分不同

operation的所需時間 (也就是amortized cost對不同operation都一樣)

(5)

一: Stack

 Stack的基本動作:

 Push(S,x)

 Pop(S)

 Multipop(S,k)

while not STACK_EMPTY(S) and k>0 POP(S)

k=k-1

 使用平常的方法:

𝑇 𝑛 = 𝑂 𝑛²

𝑂(1) 𝑂(1)

min 𝑠, 𝑘 = 𝑂(𝑛)

栗子

(6)

 但是事實上，雖然單一Multipop operation要花很多時間

 任何n個連續的operation最多只會花𝑂 𝑛 .

 因為stack裡面最多只會放n個(n個push後)

 而不管是pop或Multipop, 最多真正pop的數目也只會是n

 因此總共花的時間不會超過𝑂 𝑛 .

 𝑇 𝑛 = 𝑂 𝑛

 所以平均每個operation所花時間為^{𝑂 𝑛}_𝑛 = 𝑂 1

 在aggregated analysis裡面, 我們把每個operation的 amortized cost設定為average cost.

 在這個例子裡,也就說三種operation的amortized cost都是𝑂 1 .

(7)

二: 二進位計數器

 A[0..k-1]放二進位數的bits

 k位數

INCREMENT(A) i=0

while A[i]==1 and i<A.length A[i]=0

i=i+1

if i<A.length A[i]=1

 n個operation要花多少時間?

A[5] A[4] A[3] A[2] A[1] A[0]

1 0 0 0 1 1

栗子

(8)

使用平常的方法

 執行一次INCREMENT最多花Θ 𝑘

 最糟的狀況: 每一位都是1, 全部清成0

 因此n次INCREMENT要花O 𝑛𝑘

 正確的bound, 但是太鬆

(9)

Aggregated Analysis

Cou nter

A[7] A[6] A[5] A[4] A[3] A[2] A[1] A[0] Cost

0 0 0 0 0 0 0 0 0 0

1 0 0 0 0 0 0 0 1 1

2 0 0 0 0 0 0 1 0 3

3 0 0 0 0 0 0 1 1 4

4 0 0 0 0 0 1 0 0 7

5 0 0 0 0 0 1 0 1 8

6 0 0 0 0 0 1 1 0 10

7 0 0 0 0 0 1 1 1 11

8 0 0 0 0 1 0 0 0 15

翻n次翻 ^𝑛

4 次

翻 ^𝑛

2 次翻 ^𝑛

8 次第i位數翻 ₂^𝑛_𝑖 次

(10)

Aggregated Analysis

 所以n個operation的cost應為:

 𝑇 𝑛 = ^𝑘−1_𝑖=0 ₂^𝑛_𝑖 < 𝑛 ^∞_𝑖=0 ₂¹_𝑖 = 2𝑛 = 𝑂 𝑛

 Amortized cost for each operation:

 ^{𝑇 𝑛}

𝑛 = ^{𝑂 𝑛}_𝑛 = 𝑂 1

(11)

The accounting method

 “發明”一種給每個operation不同”虛擬cost”的方法

 (用這種方法可能分析T(n)比較容易)

 每個operation給不一樣的amortized cost

 每個operation的amortized cost可能比實際的 cost少一點或多一點

 可以把amortized cost > 實際cost想成是存錢

 可以把amortized cost < 實際cost想成是花錢

 有些operation可能多存一些錢, 供給後面的其他花錢的operation使用

(12)

限制

 如果我們想要用amortized cost總和來分析T(n)最大有多少, 則必須滿足

 ⁿ_i=1𝑐_𝑖 ≥ ⁿ_i=1 𝑐_𝑖

 這樣的話, 如果 ⁿ_i=1 𝑐_𝑖是𝑂 𝑓 𝑛 則 ⁿ_i=1𝑐_𝑖也是 𝑂 𝑓 𝑛

 注意上面的條件是”任何n個operation”都要符合喔!

 如果不符合: 則不能保障”如果 ⁿ_i=1𝑐_𝑖是𝑂 𝑓 𝑛 則 𝑐_𝑖

ni=1 也是𝑂 𝑓 𝑛 ”

 (也就是說, 不可以使某個operation多花的錢大於之前存的錢)

𝑐 :amortized 花費 𝑐_𝑖 _𝑖: 實際上的花費

(13)

 假設我們設定以下的amortized cost:

 Push: 2

 Pop: 0

 Multipop: 0

 注意multipop的amortized cost與actual cost是不同的

O(1) versus O(n)

 這樣, push先存下的錢, 足夠讓pop和multipop花嗎?

 Pop&Multipop的amortized cost設成0, 因此會花先存下來的錢

一: Stack

(14)

存錢和花錢的故事

1 2

一塊錢付掉push本身花的時間

1

一塊錢先存起來

(跟著push進去stack的item)

2

Push: 付兩塊錢

一塊錢付掉push本身花的時間一塊錢先存起來

(跟著push進去stack的item)

存起來的錢, 是用來給之後 pop出來的花的 (不管是pop or Multipop)

也就是說, 不管n個operation的順序是怎麼樣,

永遠不會有情形是花錢花超過已經存下來的錢.

每個已經在stack裡面的item都已經預先存下了錢給pop or Multipop 用.

the total amortized cost is always a upper bound of the total actual cost.

(15)

Total Amortized Cost

 然後呢?

 Total amortized cost=2×push operation數目=O(n)

 因此total actual cost也是O(n)

(16)

 假設我們這樣定義amortized cost:

 設定一個bit為1的時候: 2

 其他都是0

 這樣的設計會不會有錢不夠花的狀況?

(17)

存錢和花錢的故事 Part II

A[5] A[4] A[3] A[2] A[1] A[0]

0 0 0 0 0 0

A[5] A[4] A[3] A[2] A[1] A[0]

0 0 0 0 0 1

A[0]

A[5] A[4] A[3] A[2] A[1] A[0]

0 0 0 0 1 0 A[1]

設A[0]=1: 付兩塊錢

一塊錢付掉設A[0]=1本身花的時間

一塊錢先存起來 (跟著A[1]的bit 1) 設A[0]=0: 不付錢用存起來的一塊錢付

設A[1]=1: 付兩塊錢一塊錢付掉A[1]=1本身花的時間

一塊錢先存起來 (跟著A[0]的bit 1)

也就是說, 不管n個operation的順序是怎麼樣,永遠不會有情形是花錢花超過已經存下來的錢.每個bit=1都已經預先存下了

錢給set bit=0的時候用. the total amortized cost is always a upper bound of the total actual cost.

(18)

Total Amortized Cost

 Total amortized cost=n個INCREMENT設定bit=1 的次數

 每次最多只設定一個bit為1

 因此n次最多設定n個bit為1

 Total amortized cost=O(n)

 Total actual cost=O(n)

(19)

The potential method

 比較:

 Accounting method是使用”先存後花”的方式

 Potential method則是用potential來計算”之前的 operation使目前的資料結構有殘存多少能量”

 物理定義:Potential energy is the energy stored in a body or in a system due to its position in a force field or due to its configuration*

*http://en.wikipedia.org/wiki/Potential_energy

(20)

定義

 𝐷_𝑖:做完第i個operation之後的data structure

 (一開始為𝐷₀,結束n個operation後為𝐷_𝑛)

 Φ(𝐷_𝑖): D_𝑖的potential (自己定義)

 則第i個operation的amortized cost為

 𝑐 = 𝑐_𝑖 _𝑖 + Φ 𝐷_𝑖 − Φ 𝐷_𝑖−1

 amortized cost=actual cost加上potential的改變

 ^𝑛_𝑖=1 𝑐_𝑖 = ^𝑛_𝑖=1(𝑐_𝑖+Φ 𝐷_𝑖 − Φ 𝐷_𝑖−1 ) = 𝑐_𝑖

𝑛𝑖=1 + Φ 𝐷_𝑛 − Φ(𝐷₀)

(21)

限制

 要怎麼確定total amortized cost是total actual cost的 upper bound呢?

 要確定:

 ^𝑛_𝑖=1𝑐_𝑖 ≥ ^𝑛_𝑖=1𝑐_𝑖

 ^𝑛_𝑖=1𝑐_𝑖 + Φ 𝐷_𝑛 − Φ 𝐷₀ ≥ ^𝑛_𝑖=1𝑐_𝑖

 也就是Φ 𝐷_𝑛 ≥ Φ 𝐷₀

 通常不知道n是多少

 所以也就是必須對於任何i

 Φ 𝐷_𝑖 ≥ Φ 𝐷₀

 通常會希望Φ 𝐷₀ = 0, Φ 𝐷_𝑖 ≥ 0

𝑐 :amortized 花費 𝑐_𝑖 _𝑖: 實際上的花費

Φ(𝐷_𝑖): D_𝑖的potential

(22)

概念 ^𝑐^{= 𝑐}^𝑖 ^𝑖 ^{+ Φ 𝐷}^𝑖 ^{− Φ 𝐷}^𝑖−1

Φ 𝐷_𝑖 − Φ 𝐷_𝑖−1 ≥ 0 amortized cost ≥ actual cost 則可以想成是存錢.

(存到potential裡面去)

Φ 𝐷_𝑖 − Φ 𝐷_𝑖−1 < 0 amortized cost < actual cost 則可以想成是花錢.

(從potential裡面拿出錢來)

(23)

 定義Φ: stack裡面有多少個item

 則Φ 𝐷₀ = 0 (一開始stack沒有東西)

 且Φ 𝐷_𝑖 ≥ Φ 𝐷₀ = 0

 total amortized cost是total actual cost的upper bound

一: Stack

(24)

 如果第i個operation是push, 目前stack有s個item

 則Φ 𝐷_𝑖 − Φ 𝐷_𝑖−1 = 𝑠 + 1 − 𝑠 = 1

 𝑐 = 𝑐_𝑖 _𝑖 + Φ 𝐷_𝑖 − Φ 𝐷_𝑖−1 = 1 + 1 = 2

 如果第i個operation是pop, 目前stack有s個item

 則Φ 𝐷_𝑖 − Φ 𝐷_𝑖−1 = 𝑠 − 1 − 𝑠 = −1

 𝑐 = 𝑐_𝑖 _𝑖 + Φ 𝐷_𝑖 − Φ 𝐷_𝑖−1 = 1 − 1 = 0

 如果第i個operation是Multipop(S,k) , 目前stack有s個 item

 (請同學練習看看)

 設𝑘^′ = min 𝑘, 𝑠

 則Φ 𝐷_𝑖 − Φ 𝐷_𝑖−1 = 𝑠 − 𝑘^′ − 𝑠 = −𝑘′

 𝑐 = 𝑐_𝑖 _𝑖 + Φ 𝐷_𝑖 − Φ 𝐷_𝑖−1 = k^′ − k^′ = 0

(25)

Total actual cost

 因為三個operation的amortized cost都是O(1)

 因此n個operation (由三種operation組成)的 amortized cost是O(n)

 Total amortized cost是total actual cost的upper bound

 因此total actual cost也是O(n)

(26)

 定義Φ: A[]裡面有幾個bit是1

 假設𝑏_𝑖為做完第i個operation後A[]裡面bit是1的數目

 假設第i個operation把𝑡_𝑖個bit從1設成0

 則第i個operation的實際花費最多為𝑡_𝑖 + 1

 (𝑡_𝑖個10, 最多1個01) (想想什麼時候沒有01)

 如果𝑏_𝑖 = 0, 則𝑏_𝑖−1 = 𝑡_𝑖

 如果𝑏_𝑖 > 0, 則𝑏_𝑖 = 𝑏_𝑖−1 − 𝑡_𝑖 + 1

 𝑏_𝑖 ≤ 𝑏_𝑖−1 − 𝑡_𝑖 + 1

 Φ 𝐷_𝑖 − Φ 𝐷_𝑖−1 = 𝑏_𝑖 − 𝑏_𝑖−1 ≤ 1 − 𝑡_𝑖

(27)

Amortized costs

 𝑐 = 𝑐_𝑖 _𝑖 + Φ 𝐷_𝑖 − Φ 𝐷_𝑖−1 ≤ 𝑡_𝑖 + 1 + 1 − 𝑡_𝑖 = 2

 如果一開始counter=0, 則Φ(𝐷_𝑖) ≥ Φ 𝐷₀ = 0 (所有i)

 total amortized cost為total actual cost之upper bound

 n個INCREMENT之total amortized cost為O(n)

 所以total actual cost也是O(n)

(28)

 如果一開始我們不從0開始count呢?

 𝑏₀ ≠ 0, Φ 𝐷_𝑖 ≥ Φ 𝐷₀ 不一定成立, total amortized cost不一定是total actual cost的upper bound

 使用原本的定義的話

 𝑐 = 𝑐_𝑖 _𝑖 + Φ 𝐷_𝑖 − Φ 𝐷_𝑖−1 ≤ 𝑡_𝑖 + 1 + 1 − 𝑡_𝑖 = 2

 ^𝑛_𝑖=1 𝑐_𝑖 = ^𝑛_𝑖=1𝑐_𝑖 − Φ 𝐷_𝑛 + Φ 𝐷₀

≤ 2𝑛 − Φ 𝐷_𝑛 + Φ 𝐷₀

= 2𝑛 − 𝑏_𝑛 + 𝑏₀

 因為只有k位數, 0 ≤ 𝑏₀, 𝑏_𝑛 ≤ 𝑘

 所以當k=𝑂(𝑛)的時候

 我們知道total actual cost =𝑂 𝑛 !

(29)

來一個實際一點的例子…

 大家很不愛用malloc, realloc, delete

 陣列一開始就開很大

 浪費記憶體空間

 Dynamic table: 怎麼隨時動態調整table大小, 使得不會有太多空間浪費, 平均起來也不會浪費很多時間?

 load-factor: 𝛼 𝑇 = ^{𝑇.𝑛𝑢𝑚}_{𝑇.𝑠𝑖𝑧𝑒} (跟hash table的定義一樣)

(30)

只有insert的case

Table_Insert(T,x) if T.size==0

allocate T.table with 1 slot T.size=1

if T.num==T.size

allocate new-table with 2*T.size slots insert all items in T.table into new- table

free T.table

T.table=new-table T.size=2*T.size insert x into T.table T.num=T.num+1

這邊insert了T.size次

這邊insert了1次

假設主要花費時間是這些

insertion

(31)

原本的方法 (非amortized analysis)

 𝑐_𝑖 =?

 如果原本的table還有空間, 則𝑐_𝑖 = 1

 如果原本的table沒有空間, 則𝑐_𝑖 = 𝑖

 (1個新的insertion跟i-1個搬移insert到新的table)

 所以一個operation的cost為𝑂(𝑛)

 n個operation的cost為𝑂 𝑛²

 一樣; 正確但是不tight

(32)

Aggregate analysis

 𝑐_𝑖 = 𝑖 1

 ^𝑛_𝑖=1 𝑐_𝑖 ≤ n + _𝑗=0^{log 𝑛} 2^𝑗 = 𝑛 + 2 ^{log 𝑛 +1} ≤ 𝑛 + 2𝑛 = 3𝑛

 所以n個operation的cost=3n=O(n)

 平均每個operation的cost=3=O(1)

, if i-1 is an exact power of 2 otherwise

(33)

Accounting method

 我們設計每個table-insert的amortized cost是3

size/2 insert的時候付3塊錢

一塊錢把最新的item insert到table的時候花掉

一塊錢給之前已經在裡面的其中一個item (準備當之後滿了需要移動的時候可以花掉) 一塊錢給自己

(準備當之後滿了需要移動的時候可以花掉) 當全滿的時候, 則每一

個item都有預存了一塊錢可以拿來搬移(insert 到新的table)

(34)

Potential method

 定義Φ 𝑇 = 2𝑇. 𝑛𝑢𝑚 − 𝑇. 𝑠𝑖𝑧𝑒

 則Table-insert沒有造成把table變大的時候:

 𝑐 = 𝑐_𝑖 _𝑖 + Φ 𝐷_𝑖 − Φ 𝐷_𝑖−1 =

1 + 2𝑛𝑢𝑚_𝑖 − 𝑠𝑖𝑧𝑒_𝑖 − 2𝑛𝑢𝑚_𝑖−1 − 𝑠𝑖𝑧𝑒_𝑖−1 = 1 + 2𝑛𝑢𝑚_𝑖 − 𝑠𝑖𝑧𝑒_𝑖 − 2 𝑛𝑢𝑚_𝑖 − 1 − 𝑠𝑖𝑧𝑒_𝑖 = 3

 當Table-insert造成把table變大的時候:

 𝑐 = 𝑐_𝑖 _𝑖 + Φ 𝐷_𝑖 − Φ 𝐷_𝑖−1 = 𝑛𝑢𝑚_𝑖 +

2𝑛𝑢𝑚_𝑖 − 𝑠𝑖𝑧𝑒_𝑖 − 2𝑛𝑢𝑚_𝑖−1 − 𝑠𝑖𝑧𝑒_𝑖−1 = 𝑛𝑢𝑚_𝑖 + 2𝑛𝑢𝑚_𝑖 − 2 𝑛𝑢𝑚_𝑖 − 1 −

2 𝑛𝑢𝑚_𝑖 − 1 − 𝑛𝑢𝑚_𝑖 − 1 = 3

(35)

Reading assignment

 Section 17.4.2

 可以長大也可以縮小的dynamic table

 作業會有17.4-2 (延續你的閱讀的題目)