Analysis Tools for Data Structures and Algorithms

Academic year: 2022

Hsuan-Tien Lin

Dept. of CSIE, NTU

March 24, 2020

Motivation

Properties of Good Programs

• meet requirements, correctness: basic

• clear usage document (external), readability (internal), etc.

Resource Usage (Performance)

• efficient use of computation resources (CPU, FPU, GPU, etc.)?

time complexity

• efficient use of storage resources (memory, disk, etc.)?

space complexity

need: “language” for describing the complexity

Space Complexity of List Summing

LIST-SUM(float array list, integer length n)
  total ← 0
  for i ← 0 to n − 1 do
    total ← total + list[i]
  end for
  return total

• array list: size of a pointer, often 8 bytes

• integer n: often 4 bytes

• float total: 4 bytes

• integer i: commonly 4 bytes

• float return place: 4 bytes

total space: 24 bytes (constant); the space used during execution does not depend on n

H.-T. Lin (NTU CSIE) Analysis Tools for DSA 3/23

Space Complexity of Recursive List Summing

RECURSIVE-LIST-SUM(float array list, integer length n)
  if n = 0
    return 0
  else
    return list[n − 1] + RECURSIVE-LIST-SUM(list, n − 1)
  end if

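• array list: size of a pointer, often 8 bytes

• integer n: often 4 bytes

• float return place: 4 bytes

only 16 bytes per call, better than the previous one? (hint: every nested call keeps its own copy, and up to n + 1 calls are active at once)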
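
To make the question above concrete, here is a minimal Python sketch that records how many calls are active at the deepest point of the recursion. The `depth` and `stats` parameters are hypothetical additions for illustration only, and the sketch assumes 0-based indexing, so the last of the first n entries is `values[n - 1]`.

```python
def recursive_list_sum(values, n, depth=1, stats=None):
    """Sum the first n entries of values, recording the deepest call level."""
    if stats is not None:
        # depth counts how many calls are active right now (illustration only)
        stats["max_depth"] = max(stats.get("max_depth", 0), depth)
    if n == 0:
        return 0.0
    # values[n - 1] is the last of the first n entries (0-based indexing)
    return values[n - 1] + recursive_list_sum(values, n - 1, depth + 1, stats)


stats = {}
total = recursive_list_sum([1.0, 2.0, 3.0, 4.0], 4, stats=stats)
print(total, stats["max_depth"])  # 10.0 5
```

With n = 4 there are 5 nested calls at the deepest point, so the per-call space is multiplied by n + 1 and the total space grows linearly with n.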

Time Complexity of Matrix Addition

MATRIX-ADD(integer matrix a, b, result integer matrix c, integer rows, cols)
  for i ← 0 to rows − 1 do
    for j ← 0 to cols − 1 do
      c[i][j] ← a[i][j] + b[i][j]
    end for
  end for

• inner for: R = P · cols + Q

• total: (S + R) · rows + T

total time needed: P · rows · cols + (Q + S) · rows + T
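
The dominant term P · rows · cols can be observed directly by counting how often the innermost assignment runs. The following is a small Python sketch; the function name `matrix_add_with_count` and the step counter are assumptions made for illustration, not part of the slides.

```python
def matrix_add_with_count(a, b):
    """Add two equally sized matrices and count inner-loop executions."""
    rows, cols = len(a), len(a[0])
    c = [[0] * cols for _ in range(rows)]
    inner_steps = 0
    for i in range(rows):
        for j in range(cols):
            c[i][j] = a[i][j] + b[i][j]
            inner_steps += 1  # one unit of the "P" cost per (i, j) pair
    return c, inner_steps


a = [[1, 2, 3], [4, 5, 6]]
b = [[10, 20, 30], [40, 50, 60]]
c, steps = matrix_add_with_count(a, b)
print(c)      # [[11, 22, 33], [44, 55, 66]]
print(steps)  # 6, i.e. rows * cols = 2 * 3
```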

Rough Time Complexity of Matrix Addition

P · rows · cols + (Q + S) · rows + T

P, Q, R, S, T are hard to keep track of and do not matter much

MATRIX-ADD(integer matrix a, b, result integer matrix c, integer rows, cols)
  for i ← 0 to rows − 1 do
    for j ← 0 to cols − 1 do
      c[i][j] ← a[i][j] + b[i][j]
    end for
  end for

• inner for: R = P · cols + Q = rough(cols)

• total: (S + R) · rows + T = rough(rough(cols) · rows)

rough time needed: rough(rows · cols)

Asymptotic Notation

Representing “Rough” by Asymptotic Notation

• goal: rough rather than exact steps

• why rough? constants do not matter much

(when the input size is large)

compare two complexity functions f(n) and g(n): the growth of the functions matters

(when n is large, $n^3$ is eventually bigger than $1126n$)

rough ⇔ asymptotic behavior
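
As a quick check of the "eventually bigger" claim, the sketch below searches for the first n at which $n^3$ exceeds $1126n$. The function name is a hypothetical helper for illustration.

```python
def first_n_where_cubic_wins(coefficient=1126):
    """Return the smallest positive n with n**3 > coefficient * n."""
    n = 1
    while n ** 3 <= coefficient * n:
        n += 1
    return n


print(first_n_where_cubic_wins())  # 34
```

From n = 34 on, $n^2 > 1126$ keeps holding, so $n^3$ stays above $1126n$ for every larger n.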

Asymptotic Notations: Rough Upper Bound

big-O: rough upper bound

• f(n) grows slower than or similar to g(n): f(n) = O(g(n))

n grows slower than $n^2$: $n = O(n^2)$

3n grows similar to n: 3n = O(n)

• asymptotic intuition (rigorous math later):

$\lim_{n \to \infty} \frac{f(n)}{g(n)} \le c$

big-O: arguably the most used “language” for complexity

More Intuitions on Big-O

$f(n) = O(g(n)) \Leftarrow \lim_{n \to \infty} \frac{f(n)}{g(n)} \le c$ (not rigorously, yet)

• “= O(·)” more like “∈”

n = O(n)

n = O(10n)

n = O(0.3n)

$n = O(n^5)$

• “= O(·)” also like “≤”

$n = O(n^2)$

$n^2 = O(n^{2.5})$

$n = O(n^{2.5})$

• 1126n = O(n): coefficient does not matter

• $n + \sqrt{n} + \log n = O(n)$: lower-order terms do not matter

intuitions (properties) to be proved later

Formal Definition of Big-O

Consider positive functions f (n) and g(n),

f(n) = O(g(n)) iff there exist positive constants c and $n_0$ such that $f(n) \le c \cdot g(n)$ for all $n \ge n_0$

• covers the lim intuition if limit exists

• covers other situations without “limit”

e.g. | sin(n)| = O(1)
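
A candidate pair (c, n0) for the formal definition can be sanity-checked numerically. The sketch below is only a finite check up to a chosen `n_max`, not a proof; `bound_holds` and its parameters are hypothetical names introduced for illustration.

```python
import math


def bound_holds(f, g, c, n0, n_max):
    """Check f(n) <= c * g(n) for every integer n in [n0, n_max]."""
    return all(f(n) <= c * g(n) for n in range(n0, n_max + 1))


def abs_sin(n):
    return abs(math.sin(n))


def constant_one(n):
    return 1


def mixed(n):
    return n + math.sqrt(n) + math.log(n)


def linear(n):
    return n


print(bound_holds(abs_sin, constant_one, c=1, n0=1, n_max=10_000))  # True
print(bound_holds(mixed, linear, c=3, n0=1, n_max=10_000))          # True
print(bound_holds(mixed, linear, c=1, n0=1, n_max=10_000))          # False
```

The last call fails because c = 1 is too small: already at n = 1 the left side is 2. A failing check only rules out that particular pair, since a larger c or n0 may still work.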

next: prove that lim intuition ⇒ formal definition

lim Intuition ⇒ Formal Definition

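For positive functions f and g, if $\lim_{n \to \infty} \frac{f(n)}{g(n)} \le c$, then f(n) = O(g(n)).

• by the definition of limit, writing $L \le c$ for the value of the limit, for any fixed $\epsilon > 0$ there exists $n_0$ such that for all $n \ge n_0$, $\left| \frac{f(n)}{g(n)} - L \right| < \epsilon$.

• That is, for all $n \ge n_0$, $\frac{f(n)}{g(n)} < L + \epsilon \le c + \epsilon$.

• Let $c' = c + \epsilon$ and $n_0' = n_0$; the big-O definition is satisfied with $(c', n_0')$. QED.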

important to not just have intuition (building), but know definition (building block)
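
A worked instance of the argument, with functions chosen purely for illustration (f(n) = 3n + 5 and g(n) = n are assumptions, not from the slides):

```latex
\begin{align*}
  f(n) &= 3n + 5, \qquad g(n) = n, \qquad
  \lim_{n \to \infty} \frac{f(n)}{g(n)} = 3 \\
  \text{choose } \epsilon &= 1: \quad
  \left| \frac{3n + 5}{n} - 3 \right| = \frac{5}{n} < 1
  \quad \text{for all } n \ge 6 \\
  \text{so with } c' &= 3 + 1 = 4,\; n_0' = 6: \quad
  3n + 5 \le 4n \quad \text{for all } n \ge 6
\end{align*}
```

The pair (4, 6) therefore witnesses 3n + 5 = O(n) under the formal definition.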

More on Asymptotic Notations

Asymptotic Notations: Definitions

• f (n) grows slower than or similar to g(n): (“≤”)

f(n) = O(g(n)) iff there exist c, $n_0$ such that $f(n) \le c \cdot g(n)$ for all $n \ge n_0$

• f(n) grows faster than or similar to g(n): (“≥”)

f(n) = Ω(g(n)) iff there exist c, $n_0$ such that $f(n) \ge c \cdot g(n)$ for all $n \ge n_0$

• f(n) grows similar to g(n): (“≈”)

f(n) = Θ(g(n)) iff f(n) = O(g(n)) and f(n) = Ω(g(n))

let’s see how to use them

The Seven Functions as g

g(n) =?

• 1: constant

• $\log n$: logarithmic (does base matter?)

• $n$: linear

• $n \log n$

• $n^2$: square

• $n^3$: cubic

• $2^n$: exponential (does base matter?)

will often encounter them in future classes

Analysis of Sequential Search

Sequential Search

for i ← 0 to n − 1 do
  if list[i] == num
    return i
  end if
end for
return −1

• best case (e.g. num at 0): time Θ(1)

• worst case (e.g. num at last or not found): time Θ(n)

often just say O(n)-algorithm (linear complexity)
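
The best and worst cases can be observed by counting comparisons. This is a minimal Python sketch; returning the comparison count alongside the index is an addition for illustration.

```python
def sequential_search(values, num):
    """Return (index of num or -1, number of element comparisons made)."""
    comparisons = 0
    for i, value in enumerate(values):
        comparisons += 1
        if value == num:
            return i, comparisons
    return -1, comparisons


data = [7, 3, 9, 4]
print(sequential_search(data, 7))  # (0, 1)   best case: one comparison
print(sequential_search(data, 4))  # (3, 4)   worst case: n comparisons
print(sequential_search(data, 5))  # (-1, 4)  not found: n comparisons
```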

Analysis of Binary Search

Binary Search

(assumes list is sorted in ascending order)

left ← 0, right ← n − 1
while left ≤ right do
  mid ← floor((left + right)/2)
  if list[mid] > num
    right ← mid − 1
  else if list[mid] < num
    left ← mid + 1
  else
    return mid
  end if
end while
return −1

• best case (e.g. num at mid): time Θ(1)

• worst case (e.g. num not found): because the range (right − left) is halved in each WHILE iteration, Θ(log n) iterations are needed to decrease the range to 0, so time Θ(log n)

often just say O(log n)-algorithm (logarithmic complexity)
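
The halving argument can be checked by counting loop iterations. The sketch below assumes the input is sorted in ascending order; the iteration counter is added for illustration only.

```python
import math


def binary_search(values, num):
    """Return (index of num or -1, loop iterations) for ascending values."""
    left, right = 0, len(values) - 1
    iterations = 0
    while left <= right:
        iterations += 1
        mid = (left + right) // 2
        if values[mid] > num:
            right = mid - 1   # num can only be in the left half
        elif values[mid] < num:
            left = mid + 1    # num can only be in the right half
        else:
            return mid, iterations
    return -1, iterations


values = list(range(0, 32, 2))    # 16 sorted even numbers
print(binary_search(values, 14))  # (7, 1)   best case: found at first mid
print(binary_search(values, 31))  # (-1, 5)  worst case: not found
print(math.floor(math.log2(len(values))) + 1)  # 5
```

For n = 16 the unsuccessful search uses 5 iterations, matching floor(log2 n) + 1.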

Sequential and Binary Search

• Input: any integer array list with size n, an integer num

• Output: if num not within list, −1; otherwise, +1126

DIRECT-SEQ-SEARCH(list, n, num)
  for i ← 0 to n − 1 do
    if list[i] == num
      return +1126
    end if
  end for
  return −1

SORT-AND-BIN-SEARCH(list, n, num)
  SEL-SORT(list, n)
  return BIN-SEARCH(list, n, num) ≥ 0 ? +1126 : −1

• DIRECT-SEQ-SEARCH: O(n) time

• SORT-AND-BIN-SEARCH: $O(n^2)$ time for SEL-SORT and O(log n) time for BIN-SEARCH

next: operations for “combining” asymptotic complexity
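
To see why the two costs need to be combined, the sketch below counts element comparisons for both approaches. It assumes SEL-SORT is a textbook selection sort and counts one comparison per binary-search probe; both choices, and the Python function names, are assumptions for illustration.

```python
def direct_seq_search(values, num):
    """Return (+1126 or -1, comparisons) by scanning the unsorted input."""
    comparisons = 0
    for value in values:
        comparisons += 1
        if value == num:
            return 1126, comparisons
    return -1, comparisons


def sort_and_bin_search(values, num):
    """Return (+1126 or -1, comparisons) by selection sort + binary search."""
    items = list(values)  # sort a copy, leave the caller's list untouched
    n = len(items)
    comparisons = 0
    for i in range(n - 1):            # selection sort (assumed SEL-SORT)
        smallest = i
        for j in range(i + 1, n):
            comparisons += 1
            if items[j] < items[smallest]:
                smallest = j
        items[i], items[smallest] = items[smallest], items[i]
    left, right = 0, n - 1
    while left <= right:              # binary search on the sorted copy
        mid = (left + right) // 2
        comparisons += 1
        if items[mid] == num:
            return 1126, comparisons
        if items[mid] > num:
            right = mid - 1
        else:
            left = mid + 1
    return -1, comparisons


data = list(range(100, 0, -1))       # 100, 99, ..., 1
print(direct_seq_search(data, 0))    # (-1, 100)
print(sort_and_bin_search(data, 0))  # (-1, 4956)
```

For n = 100 the sort alone costs 100 · 99 / 2 = 4950 comparisons, so the 6 binary-search probes barely register in the total.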

Properties of Asymptotic Notations

Some Properties of Big-O I

Theorem (closure)

if $f_1(n) = O(g_2(n))$ and $f_2(n) = O(g_2(n))$, then $f_1(n) + f_2(n) = O(g_2(n))$

• When $n \ge n_1$, $f_1(n) \le c_1 g_2(n)$

• When $n \ge n_2$, $f_2(n) \le c_2 g_2(n)$

• So, when $n \ge \max(n_1, n_2)$, $f_1(n) + f_2(n) \le (c_1 + c_2) g_2(n)$

Theorem (transitivity)

if $f_1(n) = O(g_1(n))$ and $g_1(n) = O(g_2(n))$, then $f_1(n) = O(g_2(n))$

• When $n \ge n_1$, $f_1(n) \le c_1 g_1(n)$

• When $n \ge n_2$, $g_1(n) \le c_2 g_2(n)$

• So, when $n \ge \max(n_1, n_2)$, $f_1(n) \le c_1 c_2 g_2(n)$

Some Properties of Big-O II

Theorem (absorption)

if $f_1(n) = O(g_1(n))$, $f_2(n) = O(g_2(n))$ and $g_1(n) = O(g_2(n))$, then $f_1(n) + f_2(n) = O(g_2(n))$

Proof: use the two theorems above (transitivity gives $f_1(n) = O(g_2(n))$, then closure applies).

Theorem

If $f(n) = a_m n^m + \cdots + a_1 n + a_0$, then $f(n) = O(n^m)$

Proof: use the theorem above.

similar proof for Ω and Θ
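
A worked instance of the polynomial theorem, using a polynomial chosen only for illustration (it does not appear in the slides):

```latex
\begin{align*}
  f(n) &= 3n^2 + 5n + 7 \\
  \text{for } n \ge 1: \quad 5n &\le 5n^2 \quad \text{and} \quad 7 \le 7n^2 \\
  \Rightarrow\; f(n) &\le 3n^2 + 5n^2 + 7n^2 = 15\,n^2
  \quad \text{for all } n \ge 1
\end{align*}
```

So the formal definition holds with c = 15 and $n_0 = 1$, giving $f(n) = O(n^2)$: each lower-order term is absorbed into the leading one.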

Some More on Big-O

RECURSIVE-BIN-SEARCH is O(log n) time and O(log n) space

• by transitivity, time also O(n)

• time also O(n log n)

• time also $O(n^2)$

• also $O(2^n)$

• · · ·

prefer the tightest Big-O!

Practical Complexity

some input sizes are time-wise infeasible for some algorithms, even at 1 billion steps per second

input n   $n$      $n \log_2 n$   $n^2$    $n^3$    $n^4$     $n^{10}$            $2^n$
10        0.01µs   0.03µs         0.1µs    1µs      10µs      10s                 1µs
20        0.02µs   0.09µs         0.4µs    8µs      160µs     2.84h               1ms
30        0.03µs   0.15µs         0.9µs    27µs     810µs     6.83d               1s
40        0.04µs   0.21µs         1.6µs    64µs     2.56ms    121d                18m
50        0.05µs   0.28µs         2.5µs    125µs    6.25ms    3.1y                13d
100       0.10µs   0.66µs         10µs     1ms      100ms     3171y               $4 \cdot 10^{13}$y
$10^3$    1µs      9.96µs         1ms      1s       16.67m    $3 \cdot 10^{13}$y  $3 \cdot 10^{284}$y
$10^4$    10µs     130µs          100ms    1000s    115.7d    $3 \cdot 10^{23}$y
$10^5$    100µs    1.66ms         10s      11.57d   3171y     $3 \cdot 10^{33}$y
$10^6$    1ms      19.92ms        16.67m   32y      $3 \cdot 10^{7}$y  $3 \cdot 10^{43}$y

note: similar for space complexity,

e.g. store an N by N double matrix when N = 50000?
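
Both the time entries and the space question can be estimated with a few lines of arithmetic. The sketch below assumes 1 billion steps per second, as in the table, and 8 bytes per double, a common but platform-dependent size; the function names are hypothetical.

```python
STEPS_PER_SECOND = 1_000_000_000  # assumption taken from the table above


def seconds_needed(steps):
    """Running time in seconds for the given number of steps."""
    return steps / STEPS_PER_SECOND


def matrix_bytes(n, bytes_per_entry=8):
    """Bytes needed for an n-by-n matrix; 8 bytes per double is assumed."""
    return n * n * bytes_per_entry


print(seconds_needed(10 ** 10))  # 10.0  (n = 10 with n^10 steps)
print(seconds_needed(2 ** 30))   # about 1.07  (n = 30 with 2^n steps)
print(matrix_bytes(50_000))      # 20000000000, i.e. 20 GB
```

Under these assumptions a 50000 by 50000 double matrix needs about 20 GB, which exceeds the main memory of many machines.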
