# Analysis Tools for Data Structures and Algorithms


Hsuan-Tien Lin

Dept. of CSIE, NTU

March 24, 2020


Motivation

## Properties of Good Programs

• meet requirements, correctness: basic

• clear usage document (external), readability (internal), etc.

Resource Usage (Performance)

• efficient use of computation resources (CPU, FPU, GPU, etc.)? time complexity

• efficient use of storage resources (memory, disk, etc.)? space complexity

need: a “language” for describing complexity


## Space Complexity of List Summing

    LIST-SUM(float array list, integer length n)
        total ← 0
        for i ← 0 to n − 1 do
            total ← total + list[i]
        end for
        return total

• array list: size of pointer, often 8 bytes

• integer n: often 4 bytes

• float total: 4 bytes

• integer i: commonly 4 bytes

• float return place: 4 bytes

total space: 24 bytes (constant) within algorithm execution; does not depend on n

H.-T. Lin (NTU CSIE) Analysis Tools for DSA 3/23


## Space Complexity of Recursive List Summing

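    RECURSIVE-LIST-SUM(float array list, integer length n)
        if n = 0
            return 0
        else
            return list[n − 1] + RECURSIVE-LIST-SUM(list, n − 1)
        end if

• array list: size of pointer, often 8 bytes

• integer n: often 4 bytes

• float return place: 4 bytes

only 16 bytes per call: better than the previous one? (not necessarily: up to n + 1 calls are active at once, and each needs its own 16 bytes)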
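The question above can be probed with a small sketch. The Python below is an illustrative translation of RECURSIVE-LIST-SUM with 0-based indexing (so the last element is `lst[n - 1]`), not the slide's own code. The `depth` argument and the `stats` dictionary are hypothetical additions that only record how many calls are active at once, and `BYTES_PER_CALL` reuses the slide's rough 16-byte estimate as an assumption.

```python
def recursive_list_sum(lst, n, depth=1, stats=None):
    """Sum lst[0..n-1] recursively; record the deepest call in stats['max_depth']."""
    if stats is not None:
        stats["max_depth"] = max(stats.get("max_depth", 0), depth)
    if n == 0:
        return 0.0
    return lst[n - 1] + recursive_list_sum(lst, n - 1, depth + 1, stats)


BYTES_PER_CALL = 16  # assumption from the slide: 8 (pointer) + 4 (n) + 4 (return place)

stats = {}
total = recursive_list_sum([1.0, 2.0, 3.0, 4.0], 4, stats=stats)
print(total)                                   # 10.0
print(stats["max_depth"])                      # 5 nested calls for n = 4
print(BYTES_PER_CALL * stats["max_depth"])     # 80 bytes under the 16-byte assumption
```

Under these assumptions the space used grows linearly with n, in contrast to the constant space of the loop version.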


## Time Complexity of Matrix Addition

    MATRIX-ADD(integer matrix a, b, result integer matrix c, integer rows, cols)
        for i ← 0 to rows − 1 do
            for j ← 0 to cols − 1 do
                c[i][j] ← a[i][j] + b[i][j]
            end for
        end for

• inner for: R = P · cols + Q

• total: (S + R) · rows + T

total time needed: P · rows · cols + (Q + S) · rows + T
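As a sanity check on the formula above, the following sketch simulates the two loops and charges one constant per event. The reading of the constants is an assumption inferred from the formulas: P per inner-loop iteration, Q as fixed overhead of each inner loop, S as per-row overhead of the outer loop, and T as one-time overhead. The numeric values passed in are made up for illustration.

```python
def matrix_add_cost(rows, cols, P, Q, S, T):
    """Simulate the loops of MATRIX-ADD, charging the assumed constants."""
    cost = T                      # one-time overhead outside the outer loop
    for _ in range(rows):
        cost += S                 # per-row overhead of the outer loop
        cost += Q                 # fixed overhead of one inner loop
        for _ in range(cols):
            cost += P             # one inner-loop iteration (add and store)
    return cost


def closed_form(rows, cols, P, Q, S, T):
    """The closed form: P * rows * cols + (Q + S) * rows + T."""
    return P * rows * cols + (Q + S) * rows + T


print(matrix_add_cost(3, 4, P=5, Q=2, S=3, T=7))   # 82
print(closed_form(3, 4, P=5, Q=2, S=3, T=7))       # 82
```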


## Rough Time Complexity of Matrix Addition

P · rows · cols + (Q + S) · rows + T

P, Q, R, S, T are hard to keep track of and do not matter much

    MATRIX-ADD(integer matrix a, b, result integer matrix c, integer rows, cols)
        for i ← 0 to rows − 1 do
            for j ← 0 to cols − 1 do
                c[i][j] ← a[i][j] + b[i][j]
            end for
        end for

• inner for: R = P · cols + Q = rough(cols)

• total: (S + R) · rows + T = rough(rough(cols) · rows)

rough time needed: rough(rows · cols)


## Asymptotic Notation


## Representing “Rough” by Asymptotic Notation

• goal: rough rather than exact steps

• why rough? constants do not matter much when the input size is large

• compare two complexity functions f(n) and g(n): growth of functions matters; when n is large, $n^3$ is eventually bigger than 1126n

rough ⇔ asymptotic behavior
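To make "eventually bigger" concrete, here is a minimal sketch that searches for the first n at which one function overtakes another. The helper name `first_crossover` and its `limit` parameter are hypothetical, chosen only for this illustration.

```python
def first_crossover(f, g, limit=10**6):
    """Return the smallest n >= 1 with f(n) > g(n), or None if none below limit."""
    for n in range(1, limit):
        if f(n) > g(n):
            return n
    return None


n_star = first_crossover(lambda n: n**3, lambda n: 1126 * n)
print(n_star)  # 34, since 34**2 = 1156 > 1126 while 33**2 = 1089 < 1126
```

For this particular pair the inequality keeps holding for every larger n, because $n^3 > 1126n$ is equivalent to $n^2 > 1126$. In general a single crossover found by search does not by itself prove asymptotic dominance.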


## Asymptotic Notations: Rough Upper Bound

big-O: rough upper bound

• f(n) grows slower than or similar to g(n): f(n) = O(g(n))

n grows slower than $n^2$: $n = O(n^2)$

3n grows similar to n: 3n = O(n)

• asymptotic intuition (rigorous math later):

$\lim_{n\to\infty} \frac{f(n)}{g(n)} \le c$

big-O: arguably the most used “language” for complexity


## More Intuitions on Big-O

$f(n) = O(g(n)) \Leftarrow \lim_{n\to\infty} \frac{f(n)}{g(n)} \le c$ (not rigorously, yet)

• “= O(·)” is more like “∈”

n = O(n)

n = O(10n)

n = O(0.3n)

$n = O(n^5)$

• “= O(·)” is also like “≤”

$n = O(n^2)$

$n^2 = O(n^{2.5})$

$n = O(n^{2.5})$

• 1126n = O(n): coefficient does not matter

• $n + \sqrt{n} + \log n = O(n)$: lower-order terms do not matter

intuitions (properties) to be proved later


## Formal Definition of Big-O

Consider positive functions f (n) and g(n),

f(n) = O(g(n)) iff there exist c, $n_0$ such that $f(n) \le c \cdot g(n)$ for all $n \ge n_0$

• covers the lim intuition if limit exists

• covers other situations without “limit”

e.g. |sin(n)| = O(1)

next: prove that lim intuition ⇒ formal definition
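The definition can be explored numerically with a small sketch. The helper `holds_on_range` and its `n_max` cutoff are hypothetical names for this illustration. A finite check like this can refute a candidate pair (c, n0), but it can never prove a big-O claim, which needs the inequality for all n ≥ n0.

```python
import math


def holds_on_range(f, g, c, n0, n_max=10_000):
    """Check f(n) <= c * g(n) for every integer n in [n0, n_max]."""
    return all(f(n) <= c * g(n) for n in range(n0, n_max + 1))


# 3n + 10 = O(n): (c, n0) = (4, 10) works, since 3n + 10 <= 4n iff n >= 10.
print(holds_on_range(lambda n: 3 * n + 10, lambda n: n, c=4, n0=10))       # True
# The same c fails when n0 is too small (n = 1 gives 13 > 4).
print(holds_on_range(lambda n: 3 * n + 10, lambda n: n, c=4, n0=1))        # False
# |sin(n)| = O(1) with (c, n0) = (1, 1), although |sin(n)| has no limit.
print(holds_on_range(lambda n: abs(math.sin(n)), lambda n: 1, c=1, n0=1))  # True
```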


## lim Intuition ⇒ Formal Definition

For positive functions f and g, if $\lim_{n\to\infty} \frac{f(n)}{g(n)} \le c$, then f(n) = O(g(n)).

• with definition of limit, there exist $\epsilon > 0$ and $n_0$ such that for all $n \ge n_0$, $\left|\frac{f(n)}{g(n)} - c\right| < \epsilon$.

• That is, for all $n \ge n_0$, $\frac{f(n)}{g(n)} < c + \epsilon$.

• Let $c' = c + \epsilon$, $n_0' = n_0$; big-O definition satisfied with $(c', n_0')$. QED.

important to not just have intuition (building), but know definition (building block)


## More on Asymptotic Notations


## Asymptotic Notations: Definitions

• f(n) grows slower than or similar to g(n): (“≤”)

f(n) = O(g(n)) iff there exist c, $n_0$ such that $f(n) \le c \cdot g(n)$ for all $n \ge n_0$

• f(n) grows faster than or similar to g(n): (“≥”)

f(n) = Ω(g(n)) iff there exist c, $n_0$ such that $f(n) \ge c \cdot g(n)$ for all $n \ge n_0$

• f(n) grows similar to g(n): (“≈”)

f(n) = Θ(g(n)) iff f(n) = O(g(n)) and f(n) = Ω(g(n))

let’s see how to use them


## The Seven Functions as g

g(n) =?

• 1: constant

• log n: logarithmic (does base matter?)

• n: linear

• n log n

• $n^2$: square

• $n^3$: cubic

• $2^n$: exponential (does base matter?)

will often encounter them in future classes
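One way to answer the two "does base matter?" questions is the short derivation below. The bases a, b > 1 and the choice of 3 versus 2 are illustrative symbols, not taken from the slides.

```latex
\log_a n = \frac{\log_b n}{\log_b a} = \frac{1}{\log_b a} \cdot \log_b n
\quad\Rightarrow\quad \log_a n = \Theta(\log_b n),
\qquad
\frac{3^n}{2^n} = \left(\frac{3}{2}\right)^n \to \infty
\quad\Rightarrow\quad 3^n \ne O(2^n).
```

So changing the base of a logarithm only changes a constant factor, while changing the base of an exponential changes the growth rate itself.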


## Analysis of Sequential Search

Sequential Search

    for i ← 0 to n − 1 do
        if list[i] == num
            return i
        end if
    end for
    return −1

• best case (e.g. num at 0): time Θ(1)

• worst case (e.g. num at last or not found): time Θ(n)

often just say O(n)-algorithm (linear complexity)
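The best and worst cases can be observed directly by counting comparisons. The sketch below is an illustrative Python version of the pseudocode; the returned comparison count is an addition for this example only.

```python
def sequential_search(lst, num):
    """Return (index, comparisons); index is -1 when num is absent."""
    comparisons = 0
    for i, value in enumerate(lst):
        comparisons += 1
        if value == num:
            return i, comparisons
    return -1, comparisons


data = [7, 3, 9, 1, 5]
print(sequential_search(data, 7))   # (0, 1): best case, one comparison
print(sequential_search(data, 5))   # (4, 5): worst case, found last
print(sequential_search(data, 8))   # (-1, 5): worst case, not found
```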


## Analysis of Binary Search

Binary Search (list sorted in ascending order)

    left ← 0, right ← n − 1
    while left ≤ right do
        mid ← floor((left + right)/2)
        if list[mid] < num
            left ← mid + 1
        else if list[mid] > num
            right ← mid − 1
        else
            return mid
        end if
    end while
    return −1

• best case (e.g. num at mid): time Θ(1)

• worst case (e.g. num not found): because range (right − left) is halved in each WHILE iteration, needs Θ(log n) iterations to decrease range to 0, so time Θ(log n)

often just say O(log n)-algorithm (logarithmic complexity)
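A minimal sketch of the same idea in Python, assuming the list is sorted in ascending order. The iteration counter is an addition for illustration, so the logarithmic bound can be observed rather than just stated.

```python
def binary_search(lst, num):
    """Search ascending-sorted lst; return (index or -1, loop iterations)."""
    left, right = 0, len(lst) - 1
    iterations = 0
    while left <= right:
        iterations += 1
        mid = (left + right) // 2
        if lst[mid] < num:
            left = mid + 1        # target can only be in the right half
        elif lst[mid] > num:
            right = mid - 1       # target can only be in the left half
        else:
            return mid, iterations
    return -1, iterations


data = list(range(0, 2000, 2))       # 1000 sorted even numbers
print(binary_search(data, 998))      # (499, 1): best case, num at the first mid
print(binary_search(data, 999))      # not found; at most 10 iterations for n = 1000
```

For n = 1000 the loop runs at most floor(log2(1000)) + 1 = 10 times, whichever number is searched for.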


## Sequential and Binary Search

• Input: any integer array list with size n, an integer num

• Output: if num not within list, −1; otherwise, +1126

    DIRECT-SEQ-SEARCH(list, n, num)
        for i ← 0 to n − 1 do
            if list[i] == num
                return +1126
            end if
        end for
        return −1

    SORT-AND-BIN-SEARCH(list, n, num)
        SEL-SORT(list, n)
        return BIN-SEARCH(list, n, num) ≥ 0 ? +1126 : −1

• DIRECT-SEQ-SEARCH: O(n) time

• SORT-AND-BIN-SEARCH: $O(n^2)$ time for SEL-SORT and O(log n) time for BIN-SEARCH

next: operations for “combining” asymptotic complexity
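The two approaches can be sketched side by side in Python. This is an illustrative version rather than the course's implementation: `sel_sort` is a plain selection sort written for this example, the binary search step uses the standard library's `bisect_left` in place of BIN-SEARCH, and the sort works on a copy so the caller's list is left unchanged.

```python
from bisect import bisect_left


def direct_seq_search(lst, num):
    """O(n): scan once, return +1126 if found, else -1."""
    for value in lst:
        if value == num:
            return 1126
    return -1


def sel_sort(lst):
    """O(n^2) selection sort, in place."""
    n = len(lst)
    for i in range(n - 1):
        smallest = min(range(i, n), key=lst.__getitem__)
        lst[i], lst[smallest] = lst[smallest], lst[i]


def sort_and_bin_search(lst, num):
    """O(n^2) sort followed by an O(log n) search on the sorted copy."""
    work = list(lst)
    sel_sort(work)
    pos = bisect_left(work, num)     # standard-library binary search
    return 1126 if pos < len(work) and work[pos] == num else -1


data = [42, 7, 19, 3, 88]
print(direct_seq_search(data, 19), sort_and_bin_search(data, 19))   # 1126 1126
print(direct_seq_search(data, 20), sort_and_bin_search(data, 20))   # -1 -1
```

Both functions give the same answers, but the second spends most of its time in the sort.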


## Properties of Asymptotic Notations


## Some Properties of Big-O I

Theorem (closure)

if $f_1(n) = O(g_2(n))$, $f_2(n) = O(g_2(n))$ then $f_1(n) + f_2(n) = O(g_2(n))$

• When $n \ge n_1$, $f_1(n) \le c_1 g_2(n)$

• When $n \ge n_2$, $f_2(n) \le c_2 g_2(n)$

• So, when $n \ge \max(n_1, n_2)$, $f_1(n) + f_2(n) \le (c_1 + c_2) g_2(n)$

Theorem (transitivity)

if $f_1(n) = O(g_1(n))$, $g_1(n) = O(g_2(n))$ then $f_1(n) = O(g_2(n))$

• When $n \ge n_1$, $f_1(n) \le c_1 g_1(n)$

• When $n \ge n_2$, $g_1(n) \le c_2 g_2(n)$

• So, when $n \ge \max(n_1, n_2)$, $f_1(n) \le c_1 c_2 g_2(n)$


## Some Properties of Big-O II

Theorem (absorption)

if $f_1(n) = O(g_1(n))$, $f_2(n) = O(g_2(n))$ and $g_1(n) = O(g_2(n))$ then $f_1(n) + f_2(n) = O(g_2(n))$

Proof: use the two theorems above.

Theorem

If $f(n) = a_m n^m + \cdots + a_1 n + a_0$, then $f(n) = O(n^m)$

Proof: use the theorem above.

similar proof for Ω and Θ
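For the polynomial theorem, one convenient witness pair can be written down directly. The sketch below is a direct bound, not the slide's proof (which goes through the absorption theorem): it takes c as the sum of the absolute coefficients and n0 = 1, using the fact that $n^k \le n^m$ when n ≥ 1 and k ≤ m. The helper names and the sample coefficients are hypothetical.

```python
def poly(coeffs, n):
    """Evaluate a_m n^m + ... + a_1 n + a_0, with coeffs = [a_0, a_1, ..., a_m]."""
    return sum(a * n**k for k, a in enumerate(coeffs))


def big_o_witness(coeffs):
    """Return (c, n0) such that poly(n) <= c * n^m for all n >= n0."""
    return sum(abs(a) for a in coeffs), 1


coeffs = [7, -2, 0, 3]               # f(n) = 3n^3 - 2n + 7
c, n0 = big_o_witness(coeffs)
m = len(coeffs) - 1
print(c, n0)                         # 12 1
print(all(poly(coeffs, n) <= c * n**m for n in range(n0, 1000)))  # True
```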


## Some More on Big-O

RECURSIVE-BIN-SEARCH is O(log n) time and O(log n) space

• by transitivity, time also O(n)

• time also O(n log n)

• time also $O(n^2)$

• also $O(2^n)$

• · · ·

prefer the tightest Big-O!


## Practical Complexity

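some input sizes are time-wise infeasible for some algorithms, assuming 1 billion steps per second

| input size n | $n$ | $n \log_2 n$ | $n^2$ | $n^3$ | $n^4$ | $n^{10}$ | $2^n$ |
|---|---|---|---|---|---|---|---|
| 10 | 0.01µs | 0.03µs | 0.1µs | 1µs | 10µs | 10s | 1µs |
| 20 | 0.02µs | 0.09µs | 0.4µs | 8µs | 160µs | 2.84h | 1ms |
| 30 | 0.03µs | 0.15µs | 0.9µs | 27µs | 810µs | 6.83d | 1s |
| 40 | 0.04µs | 0.21µs | 1.6µs | 64µs | 2.56ms | 121d | 18m |
| 50 | 0.05µs | 0.28µs | 2.5µs | 125µs | 6.25ms | 3.1y | 13d |
| 100 | 0.10µs | 0.66µs | 10µs | 1ms | 100ms | 3171y | $4 \cdot 10^{13}$y |
| $10^3$ | 1µs | 9.96µs | 1ms | 1s | 16.67m | $3 \cdot 10^{13}$y | $3 \cdot 10^{284}$y |
| $10^4$ | 10µs | 130µs | 100ms | 1000s | 115.7d | $3 \cdot 10^{23}$y | |
| $10^5$ | 100µs | 1.66ms | 10s | 11.57d | 3171y | $3 \cdot 10^{33}$y | |
| $10^6$ | 1ms | 19.92ms | 16.67m | 32y | $3 \cdot 10^7$y | $3 \cdot 10^{43}$y | |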

note: similar for space complexity,

e.g. store an N by N double matrix when N = 50000?
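The figures above can be reproduced with a few lines of arithmetic. This sketch assumes the stated speed of one billion steps per second and, for the storage question, 8 bytes per double; both helper functions are hypothetical names for this illustration.

```python
STEPS_PER_SECOND = 1e9               # the assumed machine speed


def seconds_needed(steps):
    """Time in seconds to run the given number of steps."""
    return steps / STEPS_PER_SECOND


def matrix_bytes(n, bytes_per_double=8):
    """Storage for an n-by-n matrix of doubles (8 bytes each is an assumption)."""
    return n * n * bytes_per_double


SECONDS_PER_YEAR = 365 * 24 * 3600

print(seconds_needed(2**30))                        # about 1.07 s (2^n at n = 30)
print(seconds_needed(100**10) / SECONDS_PER_YEAR)   # about 3171 years (n^10 at n = 100)
print(matrix_bytes(50_000) / 1e9)                   # 20.0, i.e. about 20 GB
```

Under these assumptions, a 50000 by 50000 double matrix needs about 20 GB, which exceeds the memory of many machines.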
