Algorithm Design and Analysis


Divide and Conquer (1)


#ADA2021

Algorithm Design Strategy

• The focus is not on specific algorithms

• but on general strategies for designing algorithms

• First Skill: Divide-and-Conquer


Outline

• Recurrence

• Divide-and-Conquer

• D&C #1: Tower of Hanoi

• D&C #2: Merge Sort

• D&C #3: Bitonic Champion

• D&C #4: Maximum Subarray

• Solving Recurrences

• Substitution Method

• Recursion-Tree Method

• Master Method

• D&C #5: Matrix Multiplication

• D&C #6: Selection Problem

• D&C #7: Closest Pair of Points Problem

Divide-and-Conquer: Marvelous Techniques

Divide-and-Conquer: Part One


What is Divide-and-Conquer?

• Solve a problem recursively

• Apply three steps at each level of the recursion

1. Divide the problem into a number of subproblems that are smaller instances of the same problem

2. Conquer the subproblems by solving them recursively

• Base case: if a subproblem is small enough, solve it directly

• Recursive case: otherwise, solve it recursively with the same procedure

3. Combine the solutions to the subproblems into the solution for the original problem


Divide-and-Conquer Benefits

• Makes difficult problems easier to solve

• Thinking: solve the easiest case + combine smaller solutions into the original solution

• Often leads to an efficient algorithm

• Better time complexity

• Suitable for parallel computing (multi-core systems)

• More efficient memory access

• Subprograms and their data can be kept in cache instead of accessing main memory

Recurrence

Recurrence Relation

• Definition

A recurrence is an equation or inequality that describes a function in terms of its value on smaller inputs.

• Example

Fibonacci sequence

• Base case: F(0) = F(1) = 1

• Recursive case: F(n) = F(n-1) + F(n-2)

n     0  1  2  3  4  5  6   7   8   …
F(n)  1  1  2  3  5  8  13  21  34  …


Recurrent Neural Network (RNN)


Recurrence Benefits

• Easy & Clear

• Define base case and recursive case

• Define a long sequence

Base case + recursive case → F(0), F(1), F(2), … (unlimited sequence) → a program for solving F(n)

Fibonacci(n) // recursive function: a function that calls itself
    if n < 2 // base case: termination condition (important; otherwise the program cannot stop)
        return 1
    // recursive case: call itself for solving subproblems
    return Fibonacci(n-1) + Fibonacci(n-2)


Recurrence vs. Non-Recurrence

Recursive function

Fibonacci(n)
    if n < 2 // base case
        return 1
    // recursive case
    return Fibonacci(n-1) + Fibonacci(n-2)

• Clear structure

• Poor efficiency (the same subproblems are recomputed many times)

Non-recursive function

Fibonacci(n)
    if n < 2
        return 1
    a[0] <- 1
    a[1] <- 1
    for i = 2 … n
        a[i] = a[i-1] + a[i-2]
    return a[n]

• Better efficiency (each F(i) is computed once)

• Unclear structure
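The contrast between the two versions can be sketched in Python. This is only an illustration under the slide's convention F(0) = F(1) = 1; the function names `fib_recursive` and `fib_iterative` are hypothetical and not part of the course material.

```python
def fib_recursive(n: int) -> int:
    """Direct translation of the recurrence; recomputes subproblems (exponential time)."""
    if n < 2:  # base case: F(0) = F(1) = 1
        return 1
    # recursive case: the function calls itself on smaller inputs
    return fib_recursive(n - 1) + fib_recursive(n - 2)


def fib_iterative(n: int) -> int:
    """Bottom-up version; each F(i) is computed exactly once (linear time)."""
    prev, curr = 1, 1  # F(0), F(1)
    for _ in range(2, n + 1):
        prev, curr = curr, prev + curr
    return curr


print([fib_iterative(i) for i in range(9)])  # [1, 1, 2, 3, 5, 8, 13, 21, 34]
```

Both functions return the same values; only the amount of repeated work differs.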


Recurrence Benefits

• Easy & Clear

• Define base case and recursive case

• Define a long sequence

Hanoi(n) is not easy to solve.

✓ It is easy to solve when n is small (base case)

✓ We can find the relation between Hanoi(n) and Hanoi(n-1) (recursive case)

If a problem can be simplified into a base case and a recursive case, then we can find an algorithm that solves this problem.


D&C #1: Tower of Hanoi


Tower of Hanoi

• Problem: move n disks from A to C

• Rules

• Move one disk at a time

• Cannot place a larger disk onto a smaller disk

[Figure: three pegs labeled A, B, and C]

Hanoi(1)

• Move 1 from A to C

→ 1 move in total (base case)

Hanoi(2)

• Move 1 from A to B

• Move 2 from A to C

• Move 1 from B to C

→ 3 moves in total

Hanoi(3)

• How to move 3 disks?

• How many moves in total?


Hanoi(n)

• How to move n disks?

• How many moves in total?

Hanoi(n)

• To move n disks from A to C (for n > 1):

1. Move Disks 1 to n-1 from A to B

2. Move Disk n from A to C

3. Move Disks 1 to n-1 from B to C

→ 2·Hanoi(n-1) + 1 moves in total (recursive case)

Pseudocode for Hanoi

Hanoi(n, src, dest, spare)
    if n == 1 // base case
        Move disk from src to dest
    else // recursive case
        Hanoi(n-1, src, spare, dest)
        Move disk from src to dest
        Hanoi(n-1, spare, dest, src)

No need to combine the results in this case

• Call tree

Hanoi(3, A, C, B)
    Hanoi(2, A, B, C)
    Hanoi(2, B, C, A)
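The pseudocode maps almost line for line onto Python. The sketch below records each move in a list instead of printing it; the `moves` parameter and the `(disk, from, to)` tuple format are assumptions made for illustration.

```python
def hanoi(n, src, dest, spare, moves):
    """Append to `moves` the steps that transfer disks 1..n from src to dest."""
    if n == 1:  # base case: a single disk moves directly
        moves.append((1, src, dest))
    else:  # recursive case
        hanoi(n - 1, src, spare, dest, moves)   # clear the way for disk n
        moves.append((n, src, dest))            # move the largest disk
        hanoi(n - 1, spare, dest, src, moves)   # put the smaller disks back on top
    return moves


moves = hanoi(3, "A", "C", "B", [])
for disk, a, b in moves:
    print(f"Move disk {disk} from {a} to {b}")
print(len(moves))  # 7
```

For n = 2 the recorded moves are exactly the three steps listed on the Hanoi(2) slide.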


Algorithm Time Complexity

• T(n) = #moves with n disks

• Base case: T(1) = 1

• Recursive case (n > 1): T(n) = 2T(n-1) + 1

• We will learn how to derive T(n) later
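As a preview, the recurrence can be unrolled by hand. The following is a sketch of that expansion (not the formal derivation the course covers later), using only the base case and recursive case stated above:

```latex
\begin{aligned}
T(n) &= 2T(n-1) + 1 \\
     &= 2\bigl(2T(n-2) + 1\bigr) + 1 = 4T(n-2) + 2 + 1 \\
     &= \cdots \\
     &= 2^{k}\,T(n-k) + \sum_{i=0}^{k-1} 2^{i} \\
     &= 2^{n-1}\,T(1) + \bigl(2^{n-1} - 1\bigr) \qquad (k = n-1) \\
     &= 2^{n} - 1 .
\end{aligned}
```

This agrees with the small cases: T(1) = 1 and T(2) = 3.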


Further Questions

• Q1: Is O(2^n) tight for Hanoi? Can T(n) < 2^n - 1?

• Q2: What about more than 3 pegs?

• Q3: Double-color Hanoi problem

• Input: 2 interleaved-color towers

• Output: 2 same-color towers


D&C #2: Merge Sort

Textbook Chapter 2.3.1 – The divide-and-conquer approach


Sorting Problem

Input: unsorted list of size n

6 3 5 1 8 7 2 4

Output: sorted list of size n

1 2 3 4 5 6 7 8

What are the base case and recursive case?

Divide-and-Conquer

• Base case (n = 1)

• Directly output the list

• Recursive case (n > 1)

• Divide the list into two sub-lists (2 sublists of size n/2)

• Sort each sub-list recursively

• Merge the two sorted lists (# of comparisons = Θ(n)). How?

Example: the sorted sub-lists 1 3 5 6 and 2 4 7 8 merge into 1 2 3 4 5 6 7 8
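One way to answer "How?" is the two-pointer merge sketched below: repeatedly compare the heads of the two sorted lists and take the smaller one. The comparison counter is an addition for illustration only; the function name `merge` and its return format are assumptions.

```python
def merge(left, right):
    """Merge two sorted lists; return (merged_list, number_of_comparisons)."""
    merged = []
    i = j = comparisons = 0
    while i < len(left) and j < len(right):
        comparisons += 1
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # One list is exhausted; the rest of the other is already sorted.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, comparisons


merged, count = merge([1, 3, 5, 6], [2, 4, 7, 8])
print(merged, count)  # [1, 2, 3, 4, 5, 6, 7, 8] 6
```

Each comparison places one element, so merging n elements in total needs at most n - 1 comparisons.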


Illustration for n = 10

[Figure: merge sort on the input 6 3 5 1 8 9 7 2 10 4. The list is split recursively into halves, and the sorted halves are merged back into 1 2 3 4 5 6 7 8 9 10.]

Pseudocode for Merge Sort

MergeSort(A, p, r)
    // base case
    if p == r
        return
    // recursive case
    // divide
    q = floor((p+r)/2)
    // conquer
    MergeSort(A, p, q)
    MergeSort(A, q+1, r)
    // combine
    Merge(A, p, q, r)

• Base case (n = 1)

• Return itself

• Recursive case (n > 1)

1. Divide: divide a list of size n into 2 sublists of size n/2

2. Conquer: sort 2 sublists recursively using merge sort

3. Combine: merge 2 sorted sublists into one sorted list in linear time
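A runnable sketch of the three steps, assuming 0-based inclusive indices `p` and `r` (the slides do not fix an indexing convention) and an in-place `merge` helper written here for illustration:

```python
def merge(a, p, q, r):
    """Merge the sorted runs a[p..q] and a[q+1..r] (inclusive) back into a."""
    left, right = a[p:q + 1], a[q + 1:r + 1]
    i = j = 0
    for k in range(p, r + 1):
        # Take from `left` if `right` is exhausted or `left` has the smaller head.
        if j >= len(right) or (i < len(left) and left[i] <= right[j]):
            a[k] = left[i]
            i += 1
        else:
            a[k] = right[j]
            j += 1


def merge_sort(a, p, r):
    if p >= r:               # base case: zero or one element
        return
    q = (p + r) // 2         # divide
    merge_sort(a, p, q)      # conquer
    merge_sort(a, q + 1, r)
    merge(a, p, q, r)        # combine


data = [6, 3, 5, 1, 8, 9, 7, 2, 10, 4]
merge_sort(data, 0, len(data) - 1)
print(data)  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```

The helper copies both runs before merging, which keeps the code short at the cost of Θ(n) extra space per merge.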

(33)

#ADA2021

Time Complexity for Merge Sort

• T(n) = time for running MergeSort(A, p, r) with r - p + 1 = n

• Base case: return itself

• Recursive case

1. Divide: divide a list of size n into 2 sublists of size n/2

2. Conquer: sort 2 sublists recursively using merge sort

3. Combine: merge 2 sorted sublists into one sorted list in linear time

Time Complexity for Merge Sort

• Simplify recurrences

• Ignore floors and ceilings (boundary conditions)

• Assume base cases are constant (for small n)

[Derivation: the recurrence is expanded step by step: 1st expansion, 2nd expansion, …, k-th expansion]
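The expansion itself did not survive extraction. A sketch of what such an expansion looks like is given below, assuming the recurrence implied by the previous slide (two subproblems of size n/2 plus a linear-time merge, written as at most cn for a hypothetical constant c) and n a power of 2:

```latex
\begin{aligned}
T(n) &\le 2\,T\!\left(\tfrac{n}{2}\right) + cn \\
     &\le 2\left(2\,T\!\left(\tfrac{n}{4}\right) + c\,\tfrac{n}{2}\right) + cn
        = 4\,T\!\left(\tfrac{n}{4}\right) + 2cn
        && \text{(1st expansion)} \\
     &\le 8\,T\!\left(\tfrac{n}{8}\right) + 3cn
        && \text{(2nd expansion)} \\
     &\;\;\vdots \\
     &\le 2^{k}\,T\!\left(\tfrac{n}{2^{k}}\right) + k\,cn
        && \text{($k$ halvings)} \\
     &= n\,T(1) + cn\log_2 n = O(n \log n)
        && (k = \log_2 n) .
\end{aligned}
```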

(35)

#ADA2021

Theorem 1

• Theorem

• Proof

• There exist positive constants a, b s.t.

• Use induction to prove

• n = 1, trivial

• n > 1,

Inductive hypothesis
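The formulas on this slide were lost in extraction. The block below is a reconstruction under the assumption that Theorem 1 is the standard bound for the merge sort recurrence; the exact statement and constants used in the lecture may differ.

```latex
\textbf{Theorem (assumed form).}\quad
T(n) = \begin{cases} O(1) & n = 1 \\ 2\,T(n/2) + O(n) & n \ge 2 \end{cases}
\;\Longrightarrow\; T(n) = O(n \log n).

\textbf{Proof sketch.}\quad
\text{Pick constants } a, b > 0 \text{ with } T(1) \le a \text{ and }
T(n) \le 2\,T(n/2) + bn \text{ for } n \ge 2.
\text{ Claim: } T(n) \le bn\log_2 n + an.

n = 1:\quad T(1) \le a = b \cdot 1 \cdot \log_2 1 + a \cdot 1.

n > 1:\quad
\begin{aligned}
T(n) &\le 2\,T(n/2) + bn \\
     &\le 2\left[\, b\,\tfrac{n}{2}\log_2\tfrac{n}{2} + a\,\tfrac{n}{2} \right] + bn
        && \text{(inductive hypothesis)} \\
     &= bn(\log_2 n - 1) + an + bn \\
     &= bn\log_2 n + an .
\end{aligned}
```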


How to Solve Recurrence Relations?

1. Substitution Method

• Guess a bound and then prove by induction

2. Recursion-Tree Method

• Expand the recurrence into a tree and sum up the cost

3. Master Method (the "plug into the formula" method)

• Apply Master Theorem to a specific form of recurrences

Let’s see more examples first and come back to this later


D&C #3: Bitonic Champion Problem


Bitonic Champion Problem

A bitonic sequence is “increasing before the champion and decreasing after the champion”

Example: 3 7 9 17 35 28 21 18 6 4 (the champion is 35)

Bitonic Champion Problem Complexity

Why not Ω(n)?

Why?


Bitonic Champion Problem Complexity

• When there are n inputs, any solution has n different outputs

• Any comparison-based algorithm needs Ω(log n) time in the worst case

Lower bound: Ω(log n)

Bitonic Champion Problem Complexity


Divide-and-Conquer

• Idea: divide A into two subproblems and then find the final champion based on the champions from two subproblems

Champion(i, j)
    if i == j // base case
        return i
    else // recursive case
        k = floor((i+j)/2)
        l = Champion(i, k)
        r = Champion(k+1, j)
        if A[l] > A[r]
            return l
        if A[l] < A[r]
            return r

Output = Champion(1, n)
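A Python sketch of the same recursion, assuming 0-based inclusive indices and passing the array explicitly (the pseudocode treats A as global). Ties are resolved toward the right half here, which is an assumption; a strictly bitonic input has no ties at the champion.

```python
def champion(a, i, j):
    """Return the index of the largest element of a[i..j] (inclusive)."""
    if i == j:  # base case: one element is its own champion
        return i
    k = (i + j) // 2               # divide
    l = champion(a, i, k)          # conquer: champion of the left half
    r = champion(a, k + 1, j)      # conquer: champion of the right half
    return l if a[l] > a[r] else r  # combine: a single comparison


a = [3, 7, 9, 17, 35, 28, 21, 18, 6, 4]
idx = champion(a, 0, len(a) - 1)
print(idx, a[idx])  # 4 35
```

Note that this version never uses the bitonic property; it finds the maximum of any array.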


Illustration for n = 10

[Figure: recursion tree of Champion on the input 3 7 9 17 35 28 21 18 6 4]

Proof of Correctness

• Practice by yourself!

Hint: use induction on (j - i) to prove that Champion(i, j) returns the champion of A[i … j]

Algorithm Time Complexity

• T(n) = time for running Champion(i, j) with j - i + 1 = n

• Base case: return itself

• Recursive case

1. Divide: divide a list of size n into 2 sublists of size n/2

2. Conquer: find champions from 2 sublists recursively

3. Combine: choose the final champion by a single comparison

Theorem 2

• Theorem

• Proof

• n = 1, trivial

• n > 1,
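The statement and proof on this slide did not survive extraction. The block below is a reconstruction assuming Theorem 2 bounds the recurrence of Champion described on the previous slide (two half-size subproblems plus one comparison); the constants a and b are hypothetical.

```latex
\textbf{Theorem (assumed form).}\quad
T(n) = \begin{cases} O(1) & n = 1 \\ 2\,T(n/2) + O(1) & n \ge 2 \end{cases}
\;\Longrightarrow\; T(n) = O(n).

\textbf{Proof sketch.}\quad
\text{Pick constants } a, b > 0 \text{ with } T(1) \le a \text{ and }
T(n) \le 2\,T(n/2) + b \text{ for } n \ge 2.
\text{ Claim: } T(n) \le (a + b)\,n - b.

n = 1:\quad T(1) \le a = (a + b)\cdot 1 - b.

n > 1:\quad
\begin{aligned}
T(n) &\le 2\,T(n/2) + b \\
     &\le 2\left[(a + b)\tfrac{n}{2} - b\right] + b
        && \text{(inductive hypothesis)} \\
     &= (a + b)\,n - b .
\end{aligned}
```

Subtracting the lower-order term b in the claim is what makes the induction step go through.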


Bitonic Champion Problem Complexity

Can we have a better algorithm by using the bitonic sequence property?


Improved Algorithm

Champion-2(i, j)
    if i == j // base case
        return i
    else // recursive case
        k = floor((i+j)/2)
        if A[k] > A[k+1]
            return Champion-2(i, k)
        if A[k] < A[k+1]
            return Champion-2(k+1, j)
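The improved idea can be sketched in Python as follows, assuming 0-based inclusive indices and a strictly bitonic input (no equal neighbors). The name `champion2` is a stand-in for Champion-2.

```python
def champion2(a, i, j):
    """Return the index of the peak of the bitonic range a[i..j] (inclusive)."""
    if i == j:  # base case
        return i
    k = (i + j) // 2
    if a[k] > a[k + 1]:
        # Already descending at k, so the peak is at k or to its left.
        return champion2(a, i, k)
    # Still ascending at k, so the peak is to the right of k.
    return champion2(a, k + 1, j)


a = [3, 7, 9, 17, 35, 28, 21, 18, 6, 4]
idx = champion2(a, 0, len(a) - 1)
print(idx, a[idx])  # 4 35
```

Only one half is explored per call, which is where the logarithmic running time comes from.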


Illustration for n = 10

[Figure: Champion-2 on the input 3 7 9 17 35 28 21 18 6 4 narrows the range to 3 7 9 17 35, then to 17 35, then to 35]

Correctness Proof

• Practice by yourself!

Output = Champion-2(1, n)

Two crucial observations:

• If A[1 … n] is bitonic, then so is A[i … j] for any indices i and j with 1 ≤ i ≤ j ≤ n.

• For any indices i, j, and k with 1 ≤ i ≤ k < j ≤ n, we know that A[k] > A[k + 1] if and only if the maximum of A[i … j] lies in A[i … k].

Algorithm Time Complexity

• T(n) = time for running Champion-2(i, j) with j - i + 1 = n

• Base case: return itself

• Recursive case

1. Divide: divide a list of size n into 2 sublists of size n/2

2. Conquer: find the champion from 1 sublist recursively

3. Combine: return the champion

Algorithm Time Complexity

• T(n) = time for running Champion-2(i, j) with j - i + 1 = n

The algorithm time complexity is O(log n)

• each recursive call halves the size of (j - i)

• there are O(log n) levels

• each level takes O(1)

Theorem 3

• Theorem

• Proof: practice proving it by induction

Bitonic Champion Problem Complexity


D&C #4: Maximum Subarray

Textbook Chapter 4.1 – The maximum-subarray problem


Coding Efficiency

• How can we find the most efficient time interval for continuous coding?

[Figure: bar chart of coding power (K) for each hour from 5pm to 3am, with values ranging from -4 to 4. The highlighted interval 7pm-2:59am has a total coding power of 8K.]

Maximum Subarray Problem

3 7 9 17 5 28 21 18 6 4

-3 7 -9 17 -5 28 -21 18 -6 4


O(n³) Brute Force Algorithm

MaxSubarray-1(i, j)
    for i = 1,…,n
        for j = 1,…,n
            S[i][j] = -∞
    for i = 1,…,n
        for j = i,i+1,…,n
            S[i][j] = A[i] + A[i+1] + … + A[j]
    return Champion(S)
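A compact Python sketch of the cubic brute force. Unlike the pseudocode, it tracks the best sum on the fly instead of filling the table S and calling Champion(S); that simplification, the 0-based indices, and the name `max_subarray_cubic` are assumptions for illustration.

```python
def max_subarray_cubic(a):
    """Try every (i, j) and sum a[i..j] from scratch: Θ(n³) time overall."""
    best = (0, 0, a[0])  # (low, high, sum); assumes a is non-empty
    n = len(a)
    for i in range(n):
        for j in range(i, n):
            s = sum(a[i:j + 1])  # Θ(j - i + 1) work for each pair
            if s > best[2]:
                best = (i, j, s)
    return best


print(max_subarray_cubic([-3, 7, -9, 17, -5, 28, -21, 18, -6, 4]))  # (3, 5, 40)
```

The result corresponds to the subarray 17, -5, 28 of the second example input.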


O(n²) Brute Force Algorithm

MaxSubarray-2(i, j)
    for i = 1,…,n
        for j = 1,…,n
            S[i][j] = -∞
    R[0] = 0
    for i = 1,…,n
        R[i] = R[i-1] + A[i]
    for i = 1,…,n
        for j = i,i+1,…,n // start at j = i so that single-element subarrays are included
            S[i][j] = R[j] - R[i-1]
    return Champion(S)

R[n] is the sum over A[1…n]
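The prefix-sum trick can be sketched in Python as below. The list `prefix` plays the role of R, shifted to 0-based indexing; as in the previous sketch, the best sum is tracked directly rather than stored in a table, which is an assumption made to keep the example short.

```python
def max_subarray_quadratic(a):
    """Use prefix sums so that each subarray sum costs O(1): Θ(n²) time overall."""
    n = len(a)
    prefix = [0] * (n + 1)  # prefix[i] = a[0] + ... + a[i-1]
    for i in range(n):
        prefix[i + 1] = prefix[i] + a[i]
    best = (0, 0, a[0])  # (low, high, sum); assumes a is non-empty
    for i in range(n):
        for j in range(i, n):  # j starts at i: single elements count too
            s = prefix[j + 1] - prefix[i]  # sum of a[i..j]
            if s > best[2]:
                best = (i, j, s)
    return best


print(max_subarray_quadratic([-3, 7, -9, 17, -5, 28, -21, 18, -6, 4]))  # (3, 5, 40)
```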


Max Subarray Problem Complexity


Divide-and-Conquer

• Base case (n = 1)

• Return itself (maximum subarray)

• Recursive case (n > 1)

• Divide the array into two sub-arrays

• Find the maximum subarray of each sub-array recursively

• Merge the results (how?)

Where is the Solution?

• The maximum subarray for any input must be in one of the following cases:

Case 1: left, i.e. MaxSub(A, i, j) = MaxSub(A, i, k)

Case 2: right, i.e. MaxSub(A, i, j) = MaxSub(A, k+1, j)

Case 3: cross the middle

Case 3: Cross the Middle

• Goal: find the maximum subarray A[x … y] that crosses the middle (x ≤ k < y)

• Observation

• The sum of A[x … k] must be the maximum among the sums of A[p … k] for i ≤ p ≤ k (left part)

• The sum of A[k + 1 … y] must be the maximum among the sums of A[k + 1 … q] for k < q ≤ j (right part)

• Solvable in linear time → Θ(n)

(1) Start from the middle to find the left maximum subarray

(2) Start from the middle to find the right maximum subarray

The solution of Case 3 is the combination of (1) and (2)

Divide-and-Conquer Algorithm

MaxCrossSubarray(A, i, k, j)
    left_sum = -∞
    sum = 0
    for p = k downto i
        sum = sum + A[p]
        if sum > left_sum
            left_sum = sum
            max_left = p
    right_sum = -∞
    sum = 0
    for q = k+1 to j
        sum = sum + A[q]
        if sum > right_sum
            right_sum = sum
            max_right = q
    return (max_left, max_right, left_sum + right_sum)
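A Python sketch of the crossing step, assuming 0-based inclusive indices with i ≤ k < j. The returned tuple (low, high, sum) follows what the caller MaxSubarray expects; the variable name `total` replaces `sum` only to avoid shadowing Python's built-in.

```python
def max_cross_subarray(a, i, k, j):
    """Best subarray a[x..y] with i <= x <= k < y <= j, as (x, y, sum)."""
    left_sum, total, max_left = float("-inf"), 0, k
    for p in range(k, i - 1, -1):  # scan leftward from the middle
        total += a[p]
        if total > left_sum:
            left_sum, max_left = total, p
    right_sum, total, max_right = float("-inf"), 0, k + 1
    for q in range(k + 1, j + 1):  # scan rightward from the middle
        total += a[q]
        if total > right_sum:
            right_sum, max_right = total, q
    return max_left, max_right, left_sum + right_sum


a = [-3, 7, -9, 17, -5, 28, -21, 18, -6, 4]
print(max_cross_subarray(a, 0, 4, 9))  # (3, 5, 40)
```

Each element of a[i..j] is visited once, so the step takes Θ(n) time.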

Divide-and-Conquer Algorithm

MaxSubarray(A, i, j)
    if i == j // base case
        return (i, j, A[i])
    else // recursive case
        // divide
        k = floor((i + j) / 2)
        // conquer
        (l_low, l_high, l_sum) = MaxSubarray(A, i, k)
        (r_low, r_high, r_sum) = MaxSubarray(A, k+1, j)
        // combine
        (c_low, c_high, c_sum) = MaxCrossSubarray(A, i, k, j)
        if l_sum >= r_sum and l_sum >= c_sum // case 1
            return (l_low, l_high, l_sum)
        else if r_sum >= l_sum and r_sum >= c_sum // case 2
            return (r_low, r_high, r_sum)
        else // case 3
            return (c_low, c_high, c_sum)
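The whole algorithm can be sketched in a few lines of Python. To keep the block self-contained, the crossing case is computed inline with `itertools.accumulate` (running sums outward from the middle) rather than by a separate MaxCrossSubarray call; that choice and the 0-based inclusive indices are assumptions for illustration.

```python
from itertools import accumulate


def max_subarray(a, i, j):
    """Return (low, high, sum) of a maximum subarray of a[i..j] (inclusive)."""
    if i == j:  # base case
        return i, j, a[i]
    k = (i + j) // 2                    # divide
    left = max_subarray(a, i, k)        # conquer
    right = max_subarray(a, k + 1, j)
    # combine: running sums going left from k and going right from k + 1
    suffix_sums = list(accumulate(reversed(a[i:k + 1])))
    prefix_sums = list(accumulate(a[k + 1:j + 1]))
    ls, rs = max(suffix_sums), max(prefix_sums)
    cross = (k - suffix_sums.index(ls), k + 1 + prefix_sums.index(rs), ls + rs)
    # max() keeps the first maximal candidate, so ties prefer case 1, then case 2.
    return max(left, right, cross, key=lambda t: t[2])


a = [-3, 7, -9, 17, -5, 28, -21, 18, -6, 4]
print(max_subarray(a, 0, len(a) - 1))  # (3, 5, 40)
```

The combine step is linear in the range length, so the recursion has the same shape as merge sort.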


Algorithm Time Complexity

• T(n) = time for running MaxSubarray(A, i, j) with j - i + 1 = n

• Base case: return itself

• Recursive case

1. Divide: divide a list of size n into 2 subarrays of size n/2

2. Conquer: find MaxSub for each subarray

3. Combine: find MaxCrossSub for the original list, then pick the subarray with the maximum sum among the 3 subarrays

Theorem 1

• Theorem

• Proof

• n = 1, trivial

• n > 1,


Theorem 1 (Simplified)

• Theorem

• Proof

• n = 1, trivial

• n > 1,


Max Subarray Problem Complexity


Max Subarray Problem Complexity

Exercise 4.1-5 page 75 of textbook

Next topic!


To Be Continued…

Questions?

Important announcements will be sent to your @ntu.edu.tw mailbox and posted to the course website

Course Website: http://ada.miulab.tw