## Analysis Tools for Data Structures and Algorithms

Hsuan-Tien Lin

Dept. of CSIE, NTU

March 24, 2020

## Motivation


## Properties of Good Programs

• meet requirements, correctness: basic

• clear usage document (external), readability (internal), etc.

Resource Usage (Performance)

• efficient use of computation resources (CPU, FPU, GPU, etc.)? → time complexity

• efficient use of storage resources (memory, disk, etc.)? → space complexity

need: a “language” for describing complexity


## Space Complexity of List Summing

LIST-SUM(float array list, integer length n)

total ← 0

**for i ← 0 to n − 1 do**
total ← total + list[i]

**end for**
**return total**

• array list: size of a pointer, often 8 bytes

• integer n: often 4 bytes

• float total: 4 bytes

• integer i: commonly 4 bytes

• float return place: 4 bytes

total space: 24 bytes (constant); the space used during execution does not depend on n
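The tally above can be sketched in Python, under the assumption that the slide's sizes refer to C types on a typical 64-bit platform. `ctypes` reports the platform's actual sizes, so the total may differ from 24 on other machines; the name `extra_space` is introduced here only for illustration.

```python
import ctypes

def list_sum(values):
    """LIST-SUM: add up the elements with a single running total."""
    total = 0.0
    for i in range(len(values)):
        total += values[i]
    return total

# Sizes of the C types that the slide's tally assumes (platform dependent).
sizes = {
    "list (pointer)": ctypes.sizeof(ctypes.c_void_p),
    "n (int)": ctypes.sizeof(ctypes.c_int),
    "total (float)": ctypes.sizeof(ctypes.c_float),
    "i (int)": ctypes.sizeof(ctypes.c_int),
    "return place (float)": ctypes.sizeof(ctypes.c_float),
}
extra_space = sum(sizes.values())  # the same tally for every n

print(list_sum([1.0, 2.5, 3.5]), extra_space)
```

The point of the sketch is that `extra_space` is computed without looking at the input length at all.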

H.-T. Lin (NTU CSIE) Analysis Tools for DSA 3/23


## Space Complexity of Recursive List Summing

RECURSIVE-LIST-SUM(float array list, integer length n)
**if n = 0**

**return 0**
**else**

**return list[n − 1] + RECURSIVE-LIST-SUM(list, n − 1)**
**end if**

• array list: size of a pointer, often 8 bytes

• integer n: often 4 bytes

• float return place: 4 bytes

only 16 bytes, better than the previous one?
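One way to probe the question is to count how many calls are pending at the same time, since each pending call keeps its own copy of the items listed above. The sketch below is an illustration only: the `depth` and `stats` parameters are hypothetical additions used for bookkeeping and are not part of the slide's algorithm.

```python
def recursive_list_sum(values, n, depth=1, stats=None):
    """Sum values[0..n-1] recursively; record the deepest call level reached."""
    if stats is not None:
        stats["max_depth"] = max(stats.get("max_depth", 0), depth)
    if n == 0:
        return 0.0
    return values[n - 1] + recursive_list_sum(values, n - 1, depth + 1, stats)

stats = {}
data = [1.0, 2.0, 3.0, 4.0]
print(recursive_list_sum(data, len(data), stats=stats))  # 10.0
print(stats["max_depth"])  # 5 nested calls for n = 4, i.e. n + 1
```

With n + 1 calls alive at the deepest point, the space in use grows linearly with n, so the 16 bytes per call do not make the recursive version cheaper overall.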


## Time Complexity of Matrix Addition

MATRIX-ADD(integer matrix a, b, result integer matrix c, integer rows, cols)
**for i ← 0 to rows − 1 do**

**for j ← 0 to cols − 1 do**
c[i][j] ← a[i][j] + b[i][j]

**end for**
**end for**

• inner for: R = P · cols + Q (P: cost of one inner iteration, Q: inner loop setup)

• total: (S + R) · rows + T (S: overhead per row, T: outer loop setup)

total time needed: P · rows · cols + (Q + S) · rows + T
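The formula can be checked against a direct count. In this sketch the unit costs P, Q, S, T are hypothetical numbers chosen for the example (read here as: one inner iteration, inner loop setup, overhead per row, outer loop setup); the slide does not assign them values.

```python
def matrix_add_steps(a, b, rows, cols, P=1, Q=1, S=1, T=1):
    """Add two matrices and tally an abstract step count.

    P, Q, S, T are hypothetical unit costs: one inner iteration, inner-loop
    setup, per-row overhead, and outer-loop setup.
    """
    c = [[0] * cols for _ in range(rows)]
    steps = T                      # outer loop setup
    for i in range(rows):
        steps += S + Q             # per-row overhead plus inner loop setup
        for j in range(cols):
            c[i][j] = a[i][j] + b[i][j]
            steps += P             # one inner iteration
    return c, steps

a = [[1, 2, 3], [4, 5, 6]]
b = [[10, 20, 30], [40, 50, 60]]
c, steps = matrix_add_steps(a, b, rows=2, cols=3, P=5, Q=2, S=3, T=7)
print(c)      # [[11, 22, 33], [44, 55, 66]]
print(steps)  # 5*2*3 + (2+3)*2 + 7 = 47
```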


## Rough Time Complexity of Matrix Addition

P · rows · cols + (Q + S) · rows + T

P, Q, R, S, T are hard to keep track of and do not matter much

MATRIX-ADD(integer matrix a, b, result integer matrix c, integer rows, cols)
**for i ← 0 to rows − 1 do**

**for j ← 0 to cols − 1 do**
c[i][j] ← a[i][j] + b[i][j]

**end for**
**end for**

• inner for: R = P · cols + Q = rough(cols)

• total: (S + R) · rows + T = rough(rough(cols) · rows)

rough time needed: rough(rows · cols)

## Asymptotic Notation


## Representing “Rough” by Asymptotic Notation

• goal: rough rather than exact steps

• why rough? constants do not matter much when the input size is large

compare two complexity functions f(n) and g(n):

the growth of the functions matters: when n is large, n^{3} is eventually bigger than 1126n

rough ⇔ asymptotic behavior
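The n^{3} versus 1126n comparison can be made concrete by searching for the first n where the cubic overtakes the linear function. This is a small sketch; `first_crossover` is a helper name made up for the example.

```python
def first_crossover(f, g, limit=10**6):
    """Return the smallest n in [1, limit] with f(n) > g(n), or None."""
    for n in range(1, limit + 1):
        if f(n) > g(n):
            return n
    return None

cubic = lambda n: n ** 3
linear = lambda n: 1126 * n

n_star = first_crossover(cubic, linear)
print(n_star)  # 34, since 34**2 = 1156 > 1126 while 33**2 = 1089 < 1126
```

Below the crossover the large coefficient wins; from n = 34 on, the faster-growing function stays ahead.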


## Asymptotic Notations: Rough Upper Bound

big-O: rough upper bound

• f(n) grows slower than or similar to g(n): f(n) = O(g(n))

• n grows slower than n^{2}: n = O(n^{2})

• 3n grows similar to n: 3n = O(n)

• asymptotic intuition (rigorous math later):

$\lim_{n \to \infty} \frac{f(n)}{g(n)} \le c$

big-O: arguably the most used “language” for complexity


## More Intuitions on Big-O

f(n) = O(g(n)) ⇐ $\lim_{n \to \infty} \frac{f(n)}{g(n)} \le c$ (not rigorously, yet)

• “= O(·)” more like “∈”

• n = O(n)

• n = O(10n)

• n = O(0.3n)

• n = O(n^{5})

• “= O(·)” also like “≤”

• n = O(n^{2})

• n^{2}=O(n^{2.5})

• n = O(n^{2.5})

• 1126n = O(n): coefficients do not matter

• n + √n + log n = O(n): lower-order terms do not matter

intuitions (properties) to be proved later


## Formal Definition of Big-O

Consider positive functions f (n) and g(n),

f(n) = O(g(n)), iff there exist positive constants c and n_{0} such that f(n) ≤ c · g(n) for all n ≥ n_{0}

• covers the lim intuition if the limit exists

• covers other situations without a “limit”

e.g. |sin(n)| = O(1)

next: prove that lim intuition ⇒ formal definition
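The definition asks for a witness pair (c, n_{0}). A short sketch, assuming we only test a finite range of n, shows how a proposed witness can be tried out; `bounded_on_range` is a hypothetical helper, and passing the check is evidence rather than a proof.

```python
import math

def bounded_on_range(f, g, c, n0, n_max=10_000):
    """Check f(n) <= c * g(n) for every integer n in [n0, n_max].

    A finite check like this can refute a proposed witness (c, n0) but
    cannot prove the bound for all n; that still needs an argument.
    """
    return all(f(n) <= c * g(n) for n in range(n0, n_max + 1))

# 3n = O(n) with witness c = 3, n0 = 1.
print(bounded_on_range(lambda n: 3 * n, lambda n: n, c=3, n0=1))             # True
# |sin(n)| = O(1) with witness c = 1, n0 = 1, although |sin(n)| has no limit.
print(bounded_on_range(lambda n: abs(math.sin(n)), lambda n: 1, c=1, n0=1))  # True
# The candidate witness c = 100, n0 = 1 for n^2 = O(n) breaks at n = 101.
print(bounded_on_range(lambda n: n * n, lambda n: n, c=100, n0=1))           # False
```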


## lim Intuition ⇒ Formal Definition

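For positive functions f and g, if $\lim_{n \to \infty} \frac{f(n)}{g(n)} \le c$, then f(n) = O(g(n)).

• By the definition of limit, for any fixed ε > 0 there exists n_{0} such that for all n ≥ n_{0}, $\left|\frac{f(n)}{g(n)} - c\right| < \epsilon$ (taking the limit to be c itself; a smaller limit only makes the next bound easier).

• That is, for all n ≥ n_{0}, $\frac{f(n)}{g(n)} < c + \epsilon$, so f(n) < (c + ε) · g(n) because g(n) > 0.

• Let c′ = c + ε and n′_{0} = n_{0}; the big-O definition is satisfied with (c′, n′_{0}). QED.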

important to not just have intuition (building), but know definition (building block)
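As a worked instance of the argument, take the hypothetical pair f(n) = 3n + 5 and g(n) = n (chosen only for illustration) and pick ε = 1:

```latex
\begin{align*}
f(n) &= 3n + 5, \quad g(n) = n, \quad \lim_{n \to \infty} \frac{f(n)}{g(n)} = 3 = c \\
\epsilon = 1:\quad & \left|\frac{3n + 5}{n} - 3\right| = \frac{5}{n} < 1 \iff n > 5, \text{ so take } n_0 = 6 \\
c' = c + \epsilon = 4:\quad & 3n + 5 < 4n \text{ for all } n \ge 6, \text{ hence } 3n + 5 = O(n)
\end{align*}
```

Any other ε > 0 would work as well; it only changes which n_{0} is needed.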

## More on Asymptotic Notations


## Asymptotic Notations: Definitions

• f(n) grows slower than or similar to g(n): (“≤”)

f(n) = O(g(n)), iff there exist positive constants c and n_{0} such that f(n) ≤ c · g(n) for all n ≥ n_{0}

• f(n) grows faster than or similar to g(n): (“≥”)

f(n) = Ω(g(n)), iff there exist positive constants c and n_{0} such that f(n) ≥ c · g(n) for all n ≥ n_{0}

• f(n) grows similar to g(n): (“≈”)

f(n) = Θ(g(n)), iff f(n) = O(g(n)) and f(n) = Ω(g(n))

let’s see how to use them
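The three definitions combine into a two-sided bound: Θ asks for one constant below and one above. The sketch below tries out such a pair on a finite range; the function 3n^{2} + 2n and the helper name `theta_witness_holds` are assumptions made for the example.

```python
def theta_witness_holds(f, g, c_low, c_high, n0, n_max=10_000):
    """Check c_low * g(n) <= f(n) <= c_high * g(n) for n0 <= n <= n_max.

    The lower inequality is the Omega condition, the upper one the big-O
    condition; only a finite range is tested, so this is evidence, not proof.
    """
    return all(c_low * g(n) <= f(n) <= c_high * g(n)
               for n in range(n0, n_max + 1))

f = lambda n: 3 * n * n + 2 * n   # hypothetical example function
g = lambda n: n * n

print(theta_witness_holds(f, g, c_low=3, c_high=5, n0=1))             # True
print(theta_witness_holds(f, lambda n: n, c_low=1, c_high=50, n0=1))  # False
```

The second call fails at n = 17, where 3n^{2} + 2n first exceeds 50n; no larger constant would rescue it, since the quadratic term eventually outgrows any multiple of n.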


## The Seven Functions as g

g(n) =?

• 1: constant

• log n: logarithmic (does base matter?)

• n: linear

• n log n

• n^{2}: square

• n^{3}: cubic

• 2^{n}: exponential (does base matter?)

will often encounter them in future classes
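A quick numerical look at the seven functions, with a sketch of the two "does base matter?" questions. Base 2 is used for the logarithm here purely as a choice for the example.

```python
import math

# The seven reference functions, evaluated as plain Python callables.
seven = {
    "1": lambda n: 1,
    "log n": lambda n: math.log2(n),
    "n": lambda n: n,
    "n log n": lambda n: n * math.log2(n),
    "n^2": lambda n: n ** 2,
    "n^3": lambda n: n ** 3,
    "2^n": lambda n: 2 ** n,
}

for n in (8, 16, 32):
    print(n, [round(g(n), 1) for g in seven.values()])

# Logarithm base: log2(n) / log10(n) is the same constant for every n,
# so changing the base only changes a coefficient.
log_ratios = [math.log2(n) / math.log10(n) for n in (10, 1000, 10**6)]

# Exponential base: 3^n / 2^n = 1.5^n keeps growing, so the base matters.
exp_ratios = [3 ** n / 2 ** n for n in (10, 20, 40)]
print(log_ratios, exp_ratios)
```

In big-O terms, log_{10} n = Θ(log_{2} n), while 3^{n} is not O(2^{n}).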


## Analysis of Sequential Search

Sequential Search

**for i ← 0 to n − 1 do**
**if list[i] == num**

**return i**
**end if**
**end for**
**return −1**

• best case (e.g. num at 0): time Θ(1)

• worst case (e.g. num at last or not found): time Θ(n)

often just say O(n)-algorithm (linear complexity)
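The best and worst cases can be observed by counting comparisons. A minimal sketch; the comparison counter is an addition for illustration and is not part of the slide's pseudocode.

```python
def sequential_search(values, num):
    """Return (index of num or -1, number of element comparisons made)."""
    comparisons = 0
    for i in range(len(values)):
        comparisons += 1
        if values[i] == num:
            return i, comparisons
    return -1, comparisons

data = [7, 3, 9, 4, 1]
print(sequential_search(data, 7))   # (0, 1): best case, one comparison
print(sequential_search(data, 1))   # (4, 5): worst case, found at the last slot
print(sequential_search(data, 8))   # (-1, 5): worst case, not found
```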


## Analysis of Binary Search

Binary Search (list sorted in ascending order)

left ← 0, right ← n − 1
**while left ≤ right do**

mid ← floor((left + right)/2)
**if list[mid] > num**

right ← mid − 1
**else if list[mid] < num**

left ← mid + 1
**else**

**return mid**
**end if**

**end while**
**return −1**

• best case (e.g. num at mid): time Θ(1)

• worst case (e.g. num not found): because the range (right − left) is halved in each WHILE iteration, Θ(log n) iterations are needed to shrink the range to 0, so time Θ(log n)

often just say O(log n)-algorithm (logarithmic complexity)
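A runnable sketch of the loop, assuming the list is sorted in ascending order, with an iteration counter added so the logarithmic worst case can be observed. The counter and the test data are illustrative choices.

```python
import math

def binary_search(values, num):
    """Search an ascending list; return (index or -1, loop iterations)."""
    left, right = 0, len(values) - 1
    iterations = 0
    while left <= right:
        iterations += 1
        mid = (left + right) // 2
        if values[mid] > num:
            right = mid - 1      # num can only be left of mid
        elif values[mid] < num:
            left = mid + 1       # num can only be right of mid
        else:
            return mid, iterations
    return -1, iterations

data = list(range(0, 2000, 2))          # 1000 even numbers, ascending
print(binary_search(data, data[499]))   # (499, 1): best case, num at first mid
index, iterations = binary_search(data, 1999)   # odd number, not in the list
print(index, iterations, math.floor(math.log2(len(data))) + 1)
```

The last line prints the iteration count next to floor(log_{2} n) + 1, which bounds the number of halvings for a list of n elements.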


## Sequential and Binary Search

• Input: any integer array list with size n, an integer num

• Output: if num not within list, −1; otherwise, +1126

DIRECT-SEQ-SEARCH(list, n, num)

**for i ← 0 to n − 1 do**
**if list[i] == num**

**return +1126**
**end if**

**end for**
**return −1**

SORT-AND-BIN-SEARCH(list, n, num)

SEL-SORT(list, n)
**return** (BIN-SEARCH(list, n, num) ≥ 0) ? +1126 : −1

• DIRECT-SEQ-SEARCH: O(n) time

• SORT-AND-BIN-SEARCH: O(n^{2}) time for SEL-SORT and O(log n) time for BIN-SEARCH

next: operations for “combining” asymptotic complexity
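Both strategies can be sketched side by side. The slide does not spell out SEL-SORT or BIN-SEARCH, so the selection sort and binary search below are standard textbook versions assumed for the example; sorting a copy (instead of sorting in place) is also a choice made here so that the caller's list stays unchanged.

```python
FOUND = 1126  # the slide's "found" marker

def direct_seq_search(values, num):
    """O(n): scan the unsorted list once."""
    for v in values:
        if v == num:
            return FOUND
    return -1

def sel_sort(values):
    """O(n^2) selection sort, in place."""
    n = len(values)
    for i in range(n - 1):
        smallest = i
        for j in range(i + 1, n):
            if values[j] < values[smallest]:
                smallest = j
        values[i], values[smallest] = values[smallest], values[i]

def bin_search(values, num):
    """O(log n) search of an ascending list; index or -1."""
    left, right = 0, len(values) - 1
    while left <= right:
        mid = (left + right) // 2
        if values[mid] > num:
            right = mid - 1
        elif values[mid] < num:
            left = mid + 1
        else:
            return mid
    return -1

def sort_and_bin_search(values, num):
    """O(n^2) + O(log n): sort a copy, then binary search it."""
    ordered = list(values)   # copy so the caller's list is left untouched
    sel_sort(ordered)
    return FOUND if bin_search(ordered, num) >= 0 else -1

data = [42, 7, 19, 3, 88]
print(direct_seq_search(data, 19), sort_and_bin_search(data, 19))  # 1126 1126
print(direct_seq_search(data, 5), sort_and_bin_search(data, 5))    # -1 -1
```

Both functions return the same answers; they differ only in how much work they do to get there.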

## Properties of Asymptotic Notations


## Some Properties of Big-O I

Theorem (closure)

if f_{1}(n) = O(g_{2}(n)), f_{2}(n) = O(g_{2}(n)) then f_{1}(n) + f_{2}(n) = O(g_{2}(n))

• When n ≥ n_{1}, f_{1}(n) ≤ c_{1}g_{2}(n)

• When n ≥ n_{2}, f_{2}(n) ≤ c_{2}g_{2}(n)

• So, when n ≥ max(n_{1},n_{2}), f_{1}(n) + f_{2}(n) ≤ (c_{1}+c_{2})g_{2}(n)

Theorem (transitivity)

if f_{1}(n) = O(g_{1}(n)), g_{1}(n) = O(g_{2}(n)) then f_{1}(n) = O(g_{2}(n))

• When n ≥ n_{1}, f_{1}(n) ≤ c_{1}g_{1}(n)

• When n ≥ n_{2}, g_{1}(n) ≤ c_{2}g_{2}(n)

• So, when n ≥ max(n_{1},n_{2}), f_{1}(n) ≤ c_{1}c_{2}g_{2}(n)


## Some Properties of Big-O II

Theorem (absorption)

if f_{1}(n) = O(g_{1}(n)), f_{2}(n) = O(g_{2}(n)) and g_{1}(n) = O(g_{2}(n)) then f_{1}(n) + f_{2}(n) = O(g_{2}(n))

Proof: use the two theorems above (the second gives f_{1}(n) = O(g_{2}(n)), then the first applies).

Theorem

If f(n) = a_{m}n^{m} + · · · + a_{1}n + a_{0}, then f(n) = O(n^{m})

Proof: use the theorem above.

similar proof for Ω and Θ
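For the polynomial theorem, an explicit witness can be written down directly: every term a_{k}n^{k} is at most |a_{k}| · n^{m} once n ≥ 1. The sketch below uses that observation; the example polynomial and the helper names are assumptions for illustration.

```python
def poly(coeffs, n):
    """Evaluate a_0 + a_1*n + ... + a_m*n^m for coeffs = [a_0, ..., a_m]."""
    return sum(a * n ** k for k, a in enumerate(coeffs))

def poly_big_o_witness(coeffs):
    """Return (c, n0) with poly(n) <= c * n^m for all n >= n0.

    Each term satisfies a_k * n^k <= |a_k| * n^m once n >= 1, so
    c = |a_0| + ... + |a_m| and n0 = 1 always work.
    """
    return sum(abs(a) for a in coeffs), 1

coeffs = [7, 2, 0, 4]             # hypothetical example: 4n^3 + 2n + 7
c, n0 = poly_big_o_witness(coeffs)
m = len(coeffs) - 1
print(c, n0)                      # 13 1
print(all(poly(coeffs, n) <= c * n ** m for n in range(n0, 5000)))  # True
```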


## Some More on Big-O

RECURSIVE-BIN-SEARCH is O(log n) time and O(log n) space

• by transitivity, time also O(n)

• time also O(n log n)

• time also O(n^{2})

• also O(2^{n})

• · · ·

prefer the tightest Big-O!
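RECURSIVE-BIN-SEARCH is not written out on the slides, so the following is an assumed textbook version for an ascending list. It returns the call depth alongside the result, which is one way to see where the O(log n) space comes from.

```python
def recursive_bin_search(values, num, left, right, depth=1):
    """Search ascending values[left..right]; return (index or -1, call depth)."""
    if left > right:
        return -1, depth
    mid = (left + right) // 2
    if values[mid] > num:
        return recursive_bin_search(values, num, left, mid - 1, depth + 1)
    if values[mid] < num:
        return recursive_bin_search(values, num, mid + 1, right, depth + 1)
    return mid, depth

data = list(range(1024))
index, depth = recursive_bin_search(data, -5, 0, len(data) - 1)
print(index, depth)
```

For 1024 elements the depth stays near log_{2} 1024 = 10, far below bounds such as n or n^{2} that are also valid but much looser.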


## Practical Complexity

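some input sizes are time-wise infeasible for some algorithms, assuming 1 billion steps per second

(units: µs, ms, s, m = minutes, h = hours, d = days, y = years)

| input size n | n | n log_{2}n | n^{2} | n^{3} | n^{4} | n^{10} | 2^{n} |
|---|---|---|---|---|---|---|---|
| 10 | 0.01µs | 0.03µs | 0.1µs | 1µs | 10µs | 10s | 1µs |
| 20 | 0.02µs | 0.09µs | 0.4µs | 8µs | 160µs | 2.84h | 1ms |
| 30 | 0.03µs | 0.15µs | 0.9µs | 27µs | 810µs | 6.83d | 1s |
| 40 | 0.04µs | 0.21µs | 1.6µs | 64µs | 2.56ms | 121d | 18m |
| 50 | 0.05µs | 0.28µs | 2.5µs | 125µs | 6.25ms | 3.1y | 13d |
| 100 | 0.10µs | 0.66µs | 10µs | 1ms | 100ms | 3171y | 4 · 10^{13}y |
| 10^{3} | 1µs | 9.96µs | 1ms | 1s | 16.67m | 3 · 10^{13}y | 3 · 10^{284}y |
| 10^{4} | 10µs | 130µs | 100ms | 1000s | 115.7d | 3 · 10^{23}y | |
| 10^{5} | 100µs | 1.66ms | 10s | 11.57d | 3171y | 3 · 10^{33}y | |
| 10^{6} | 1ms | 19.92ms | 16.67m | 32y | 3 · 10^{7}y | 3 · 10^{43}y | |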

note: similar for space complexity,

e.g. store an N by N double matrix when N = 50000?
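Both estimates are simple arithmetic. The sketch assumes the slide's rate of 1 billion steps per second and, for the matrix, 8 bytes per double (common, but platform dependent).

```python
STEPS_PER_SECOND = 10 ** 9   # the slide's assumption

def seconds_needed(steps):
    """Running time in seconds for a given number of steps."""
    return steps / STEPS_PER_SECOND

n = 30
print(seconds_needed(n ** 4))   # 0.00081 s, i.e. 810 microseconds
print(seconds_needed(2 ** n))   # about 1.07 s

# Space: an N-by-N matrix of 8-byte doubles.
N = 50_000
matrix_bytes = N * N * 8
print(matrix_bytes / 10 ** 9)   # 20.0, i.e. 20 GB
```

At roughly 20 GB, such a matrix exceeds the main memory of many ordinary machines, which is the space-side analogue of the infeasible running times in the table.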
