Analysis Tools for Data Structures and Algorithms
Hsuan-Tien Lin
Dept. of CSIE, NTU
March 24, 2020
Motivation
Properties of Good Programs
• meet requirements, correctness: basic
• clear usage document (external), readability (internal), etc.
Resource Usage (Performance)
• efficient use of computation resources (CPU, FPU, GPU, etc.)?
time complexity
• efficient use of storage resources (memory, disk, etc.)?
space complexity
need: “language” for describing the complexity
Space Complexity of List Summing
LIST-SUM(float array list, integer length n)
  total ← 0
  for i ← 0 to n − 1 do
    total ← total + list[i]
  end for
  return total
• array list: size of a pointer, often 8 bytes
• integer n: often 4 bytes
• float total: 4 bytes
• integer i: commonly 4 bytes
• float return place: 4 bytes
total space: 24 bytes (constant); the space used during execution does not depend on n
H.-T. Lin (NTU CSIE) Analysis Tools for DSA 3/23
Space Complexity of Recursive List Summing
RECURSIVE-LIST-SUM(float array list, integer length n)
  if n = 0
    return 0
  else
    return list[n − 1] + RECURSIVE-LIST-SUM(list, n − 1)
  end if
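• array list: size of a pointer, often 8 bytes
• integer n: often 4 bytes
• float return place: 4 bytes
only 16 bytes per call, better than the previous one? (not in total: the calls nest up to n + 1 deep, and each active call needs its own 16 bytes, so the space grows with n)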
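The comparison between the two summing routines can be sketched in Python. This is only an illustration: the function names and the `depth` parameter are assumptions added here, and Python call frames are much larger than the 16 bytes estimated on the slide, so only the growth pattern carries over. The recursive version indexes with n − 1 so that it sums the first n elements.

```python
def list_sum(lst):
    """Iterative sum: a fixed set of local variables, whatever len(lst) is."""
    total = 0.0
    for x in lst:
        total += x
    return total


def recursive_list_sum(lst, n, depth=1):
    """Recursive sum of the first n elements of lst.

    Returns (sum, deepest_call_level). `depth` is a hypothetical extra
    parameter, added only to expose how many calls are active at once.
    """
    if n == 0:
        return 0.0, depth
    rest, deepest = recursive_list_sum(lst, n - 1, depth + 1)
    return lst[n - 1] + rest, deepest


data = [1.0, 2.0, 3.0, 4.0]
print(list_sum(data))                       # 10.0
print(recursive_list_sum(data, len(data)))  # (10.0, 5)
```

With 4 elements the recursion reaches 5 nested calls, so the space in use at the deepest point is proportional to n + 1 rather than constant.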
Time Complexity of Matrix Addition
MATRIX-ADD(integer matrix a, b, result integer matrix c, integer rows, cols)
  for i ← 0 to rows − 1 do
    for j ← 0 to cols − 1 do
      c[i][j] ← a[i][j] + b[i][j]
    end for
  end for
• inner for: R = P · cols + Q
• total: (S + R) · rows + T
total time needed: P · rows · cols + (Q + S) · rows + T
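One way to check the formula is to count steps with made-up costs. In the sketch below, P, Q, S, T are treated as assumed constant costs (the default value 1 is arbitrary, not a measurement), and the function names are illustrative.

```python
def matrix_add_steps(rows, cols, P=1, Q=1, S=1, T=1):
    """Count the 'steps' of MATRIX-ADD under assumed constant costs."""
    steps = T                  # one-time cost outside the loops
    for _ in range(rows):
        steps += S             # per-row overhead of the outer loop
        steps += Q             # per-row overhead of the inner loop
        for _ in range(cols):
            steps += P         # one addition and assignment
    return steps


def closed_form(rows, cols, P=1, Q=1, S=1, T=1):
    """The formula P * rows * cols + (Q + S) * rows + T."""
    return P * rows * cols + (Q + S) * rows + T


print(matrix_add_steps(3, 4, P=2, Q=5, S=7, T=11))  # 71
print(closed_form(3, 4, P=2, Q=5, S=7, T=11))       # 71
```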
Rough Time Complexity of Matrix Addition
P · rows · cols + (Q + S) · rows + T
P, Q, R, S, T are hard to keep track of and do not matter much
MATRIX-ADD(integer matrix a, b, result integer matrix c, integer rows, cols)
  for i ← 0 to rows − 1 do
    for j ← 0 to cols − 1 do
      c[i][j] ← a[i][j] + b[i][j]
    end for
  end for
• inner for: R = P · cols + Q = rough(cols)
• total: (S + R) · rows + T = rough(rough(cols) · rows)
rough time needed: rough(rows · cols)
Asymptotic Notation
Representing “Rough” by Asymptotic Notation
• goal: rough rather than exact steps
• why rough? constants do not matter much
  when the input size is large
• compare two complexity functions f(n) and g(n): growth of functions matters
  when n is large, n^3 is eventually bigger than 1126n
rough ⇔ asymptotic behavior
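The "eventually bigger" claim can be made concrete with a short search for the crossover point. The function name is an assumption made for this sketch.

```python
def first_n_where_cubic_wins(coeff=1126):
    """Smallest positive integer n with n^3 > coeff * n."""
    n = 1
    while n ** 3 <= coeff * n:
        n += 1
    return n


print(first_n_where_cubic_wins())  # 34
```

From n = 34 onward the cubic stays ahead, since n^3 > 1126n is equivalent to n^2 > 1126 for positive n.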
Asymptotic Notations: Rough Upper Bound
big-O: rough upper bound
• f(n) grows slower than or similar to g(n): f(n) = O(g(n))
• n grows slower than n^2: n = O(n^2)
• 3n grows similar to n: 3n = O(n)
• asymptotic intuition (rigorous math later):
  lim_{n→∞} f(n)/g(n) ≤ c
big-O: arguably the most used “language” for complexity
More Intuitions on Big-O
f(n) = O(g(n)) ⇐ lim_{n→∞} f(n)/g(n) ≤ c (not rigorously, yet)
• “= O(·)” is more like “∈”
  • n = O(n)
  • n = O(10n)
  • n = O(0.3n)
  • n = O(n^5)
• “= O(·)” is also like “≤”
  • n = O(n^2)
  • n^2 = O(n^2.5)
  • n = O(n^2.5)
• 1126n = O(n): coefficients do not matter
• n + √n + log n = O(n): lower-order terms do not matter
intuitions (properties) to be proved later
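The lower-order-term intuition can be checked numerically by watching the ratio f(n)/g(n). This is a sketch only: the helper name `ratio` is an assumption, and the natural logarithm is used for log n.

```python
import math


def ratio(n):
    """(n + sqrt(n) + log n) / n, the f(n)/g(n) ratio for g(n) = n."""
    return (n + math.sqrt(n) + math.log(n)) / n


for n in (10, 1_000, 1_000_000):
    print(n, round(ratio(n), 4))
```

The printed ratios shrink toward 1 as n grows, which is the limit intuition for n + √n + log n = O(n).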
Formal Definition of Big-O
Consider positive functions f(n) and g(n).
f(n) = O(g(n)) iff there exist constants c > 0 and n_0 such that f(n) ≤ c · g(n) for all n ≥ n_0
• covers the lim intuition if the limit exists
• covers other situations without a “limit”
  e.g. |sin(n)| = O(1)
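The definition asks for a pair of witnesses (c, n_0). A small checker can test a candidate pair over a finite range; this gives evidence for a choice of witnesses, not a proof. The function name and the cutoff `n_max` are assumptions of this sketch.

```python
import math


def witnesses_hold(f, g, c, n0, n_max=10_000):
    """Check f(n) <= c * g(n) for every integer n with n0 <= n <= n_max."""
    return all(f(n) <= c * g(n) for n in range(n0, n_max + 1))


# 3n + 10 = O(n), with witnesses c = 4 and n_0 = 10
print(witnesses_hold(lambda n: 3 * n + 10, lambda n: n, c=4, n0=10))       # True

# |sin(n)| = O(1), with witnesses c = 1 and n_0 = 1
print(witnesses_hold(lambda n: abs(math.sin(n)), lambda n: 1, c=1, n0=1))  # True

# c = 100 is not a valid witness for n^2 against n: it fails once n > 100
print(witnesses_hold(lambda n: n * n, lambda n: n, c=100, n0=1))           # False
```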
next: prove that lim intuition ⇒ formal definition
lim Intuition ⇒ Formal Definition
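For positive functions f and g, if lim_{n→∞} f(n)/g(n) ≤ c, then f(n) = O(g(n)).
• Let L = lim_{n→∞} f(n)/g(n), so L ≤ c. By the definition of limit, for any fixed ε > 0 there exists n_0 such that for all n ≥ n_0,
  |f(n)/g(n) − L| < ε.
• That is, for all n ≥ n_0, f(n)/g(n) < L + ε ≤ c + ε.
• Let c' = c + ε and n_0' = n_0; since g(n) > 0, f(n) ≤ c' · g(n) for all n ≥ n_0', so the big-O definition is satisfied with (c', n_0'). QED.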
important to not just have intuition (building), but know definition (building block)
More on Asymptotic Notations
Asymptotic Notations: Definitions
• f(n) grows slower than or similar to g(n): (“≤”)
  f(n) = O(g(n)) iff there exist constants c > 0 and n_0 such that f(n) ≤ c · g(n) for all n ≥ n_0
• f(n) grows faster than or similar to g(n): (“≥”)
  f(n) = Ω(g(n)) iff there exist constants c > 0 and n_0 such that f(n) ≥ c · g(n) for all n ≥ n_0
• f(n) grows similar to g(n): (“≈”)
  f(n) = Θ(g(n)) iff f(n) = O(g(n)) and f(n) = Ω(g(n))
let’s see how to use them
The Seven Functions as g
g(n) =?
• 1: constant
• log n: logarithmic (does base matter?)
• n: linear
• n log n
• n^2: square
• n^3: cubic
• 2^n: exponential (does base matter?)
will often encounter them in future classes
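The two "does base matter?" questions can be explored numerically. In this sketch (function names are assumptions), the ratio of two logarithms with different bases stays constant, while the ratio of two exponentials with different bases keeps growing.

```python
import math


def log_base_ratio(n):
    """log2(n) / log10(n): always the constant log2(10), about 3.32."""
    return math.log2(n) / math.log10(n)


def exp_base_ratio(n):
    """3^n / 2^n = 1.5^n: grows without bound as n grows."""
    return 1.5 ** n


for n in (10, 20, 40):
    print(n, round(log_base_ratio(n), 3), exp_base_ratio(n))
```

So the base of a logarithm only changes a constant factor and does not matter inside Θ(·), whereas 2^n = O(3^n) holds but 3^n = O(2^n) does not.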
Analysis of Sequential Search
Sequential Search
  for i ← 0 to n − 1 do
    if list[i] == num
      return i
    end if
  end for
  return −1
• best case (e.g. num at 0): time Θ(1)
• worst case (e.g. num at last or not found): time Θ(n)
often just say O(n)-algorithm (linear complexity)
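A Python sketch of sequential search that also counts element comparisons makes the best and worst cases visible. Returning the comparison count alongside the index is an addition for illustration, not part of the pseudocode.

```python
def sequential_search(lst, num):
    """Return (index of num or -1, number of element comparisons made)."""
    comparisons = 0
    for i, value in enumerate(lst):
        comparisons += 1
        if value == num:
            return i, comparisons
    return -1, comparisons


data = [7, 3, 9, 1, 5]
print(sequential_search(data, 7))  # (0, 1)   best case: one comparison
print(sequential_search(data, 5))  # (4, 5)   worst case: num at last position
print(sequential_search(data, 8))  # (-1, 5)  worst case: num not found
```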
Analysis of Binary Search
Binary Search
  (assumes list is sorted in ascending order)
  left ← 0, right ← n − 1
  while left ≤ right do
    mid ← floor((left + right)/2)
    if list[mid] < num
      left ← mid + 1
    else if list[mid] > num
      right ← mid − 1
    else
      return mid
    end if
  end while
  return −1
• best case (e.g. num at mid): time Θ(1)
• worst case (e.g. num not found): because the range (right − left) is halved in each WHILE iteration, Θ(log n) iterations are needed to decrease the range to 0, so time Θ(log n)
often just say O(log n)-algorithm (logarithmic complexity)
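The halving argument can be observed with a Python sketch that counts loop iterations. It assumes the list is sorted in ascending order; the iteration counter is an addition for illustration.

```python
def binary_search(lst, num):
    """Search an ascending-sorted lst; return (index or -1, loop iterations)."""
    left, right = 0, len(lst) - 1
    iterations = 0
    while left <= right:
        iterations += 1
        mid = (left + right) // 2
        if lst[mid] < num:
            left = mid + 1       # num can only be in the right half
        elif lst[mid] > num:
            right = mid - 1      # num can only be in the left half
        else:
            return mid, iterations
    return -1, iterations


data = list(range(0, 2048, 2))         # 1024 sorted even numbers
print(binary_search(data, data[511]))  # (511, 1): best case, num at the first mid
index, iterations = binary_search(data, 1)   # 1 is odd, so it is not in data
print(index, iterations)               # -1 and at most 11 iterations for n = 1024
```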
Sequential and Binary Search
• Input: any integer array list with size n, an integer num
• Output: if num not within list, −1; otherwise, +1126
DIRECT-SEQ-SEARCH(list, n, num)
  for i ← 0 to n − 1 do
    if list[i] == num
      return +1126
    end if
  end for
  return −1
SORT-AND-BIN-SEARCH(list, n, num)
  SEL-SORT(list, n)
  return (BIN-SEARCH(list, n, num) ≥ 0) ? +1126 : −1
• DIRECT-SEQ-SEARCH: O(n) time
• SORT-AND-BIN-SEARCH: O(n^2) time for SEL-SORT and O(log n) time for BIN-SEARCH
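Both strategies can be sketched in Python to confirm that they return the same answers while doing very different amounts of work. The selection sort and binary search below are plain textbook versions written for this sketch, and copying the list before sorting is an assumption made here so that the caller's list is left unchanged.

```python
FOUND = 1126


def direct_seq_search(lst, num):
    """O(n): scan until num is seen."""
    for value in lst:
        if value == num:
            return FOUND
    return -1


def sel_sort(lst):
    """In-place selection sort into ascending order, O(n^2) comparisons."""
    n = len(lst)
    for i in range(n - 1):
        smallest = i
        for j in range(i + 1, n):
            if lst[j] < lst[smallest]:
                smallest = j
        lst[i], lst[smallest] = lst[smallest], lst[i]


def bin_search(lst, num):
    """O(log n) search on an ascending-sorted list; index or -1."""
    left, right = 0, len(lst) - 1
    while left <= right:
        mid = (left + right) // 2
        if lst[mid] < num:
            left = mid + 1
        elif lst[mid] > num:
            right = mid - 1
        else:
            return mid
    return -1


def sort_and_bin_search(lst, num):
    """O(n^2) sorting followed by O(log n) searching."""
    work = list(lst)           # copy, so the input order is preserved
    sel_sort(work)
    return FOUND if bin_search(work, num) >= 0 else -1


data = [42, 7, 19, 3, 88]
print(direct_seq_search(data, 19), sort_and_bin_search(data, 19))  # 1126 1126
print(direct_seq_search(data, 20), sort_and_bin_search(data, 20))  # -1 -1
```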
next: operations for “combining” asymptotic complexity
Properties of Asymptotic Notations
Some Properties of Big-O I
Theorem (closure law)
if f_1(n) = O(g_2(n)) and f_2(n) = O(g_2(n)), then f_1(n) + f_2(n) = O(g_2(n))
• When n ≥ n_1, f_1(n) ≤ c_1 · g_2(n)
• When n ≥ n_2, f_2(n) ≤ c_2 · g_2(n)
• So, when n ≥ max(n_1, n_2), f_1(n) + f_2(n) ≤ (c_1 + c_2) · g_2(n)
Theorem (transitive law)
if f_1(n) = O(g_1(n)) and g_1(n) = O(g_2(n)), then f_1(n) = O(g_2(n))
• When n ≥ n_1, f_1(n) ≤ c_1 · g_1(n)
• When n ≥ n_2, g_1(n) ≤ c_2 · g_2(n)
• So, when n ≥ max(n_1, n_2), f_1(n) ≤ c_1 · c_2 · g_2(n)
Some Properties of Big-O II
Theorem (absorption law)
if f_1(n) = O(g_1(n)), f_2(n) = O(g_2(n)), and g_1(n) = O(g_2(n)), then f_1(n) + f_2(n) = O(g_2(n))
Proof: use the two theorems above (the second gives f_1(n) = O(g_2(n)); the first then bounds the sum).
Theorem
If f(n) = a_m n^m + · · · + a_1 n + a_0, then f(n) = O(n^m)
Proof: use the theorem above.
similar proof for Ω and Θ
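For the polynomial theorem, one explicit choice of witnesses is c = |a_m| + · · · + |a_0| with n_0 = 1, because n^k ≤ n^m whenever n ≥ 1 and k ≤ m. This particular choice is an assumption of the sketch below (the slides only say to use the previous theorem), and the helper names are illustrative.

```python
def poly(coeffs, n):
    """Evaluate a_0 + a_1 n + ... + a_m n^m, with coeffs = [a_0, ..., a_m]."""
    return sum(a * n ** k for k, a in enumerate(coeffs))


def big_o_witness(coeffs):
    """Return (c, n0) with poly(coeffs, n) <= c * n^m for all n >= n0."""
    return sum(abs(a) for a in coeffs), 1


coeffs = [7, -2, 5]                  # f(n) = 5n^2 - 2n + 7, so m = 2
c, n0 = big_o_witness(coeffs)
m = len(coeffs) - 1
print(c, n0)                                                      # 14 1
print(all(poly(coeffs, n) <= c * n ** m for n in range(n0, 1000)))  # True
```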
Some More on Big-O
RECURSIVE-BIN-SEARCH is O(log n) time and O(log n) space
• by the transitive law, time also O(n)
• time also O(n log n)
• time also O(n^2)
• also O(2^n)
• · · ·
prefer the tightest Big-O!
Practical Complexity
some input sizes are time-wise infeasible for some algorithms, assuming 1 billion steps per second
(m = minutes, h = hours, d = days, y = years)

n      n        n log2 n   n^2      n^3      n^4        n^10        2^n
10     0.01µs   0.03µs     0.1µs    1µs      10µs       10s         1µs
20     0.02µs   0.09µs     0.4µs    8µs      160µs      2.84h       1ms
30     0.03µs   0.15µs     0.9µs    27µs     810µs      6.83d       1s
40     0.04µs   0.21µs     1.6µs    64µs     2.56ms     121d        18m
50     0.05µs   0.28µs     2.5µs    125µs    6.25ms     3.1y        13d
100    0.10µs   0.66µs     10µs     1ms      100ms      3171y       4·10^13 y
10^3   1µs      9.96µs     1ms      1s       16.67m     3·10^13 y   3·10^284 y
10^4   10µs     130µs      100ms    1000s    115.7d     3·10^23 y
10^5   100µs    1.66ms     10s      11.57d   3171y      3·10^33 y
10^6   1ms      19.92ms    16.67m   32y      3·10^7 y   3·10^43 y
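Entries of this kind can be reproduced with a few lines of arithmetic. The sketch below assumes exactly 10^9 steps per second, as stated above; the function name is illustrative.

```python
import math

STEPS_PER_SECOND = 10**9   # the 1-billion-steps-per-second assumption


def seconds_needed(steps):
    """Running time in seconds for the given number of steps."""
    return steps / STEPS_PER_SECOND


print(seconds_needed(100**3))                  # 0.001, i.e. 1 ms for n^3 at n = 100
print(seconds_needed(2**50) / 86_400)          # about 13 days for 2^n at n = 50
print(seconds_needed(1000 * math.log2(1000)))  # just under 10 µs for n log2 n at n = 10^3
```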
note: similar for space complexity,
e.g. store an N by N double matrix when N = 50000?
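The matrix question can be answered with direct arithmetic. The sketch assumes 8 bytes per double (the usual IEEE 754 double-precision size) and a dense layout with no overhead; the function name is illustrative.

```python
def matrix_bytes(n, bytes_per_entry=8):
    """Bytes needed for a dense n-by-n matrix of fixed-size entries."""
    return n * n * bytes_per_entry


size = matrix_bytes(50_000)
print(size)            # 20000000000
print(size / 10**9)    # 20.0  (GB)
print(size / 2**30)    # about 18.6 (GiB)
```

At roughly 20 GB, such a matrix does not fit in the main memory of many machines, so space complexity can make an input size infeasible just as time complexity can.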