Algorithm Design and Analysis Divide and Conquer (3)

(1)


Yun-Nung (Vivian) Chen

http://ada.miulab.tw

(2)

Outline

• Recurrence

• Divide-and-Conquer

• D&C #1: Tower of Hanoi

• D&C #2: Merge Sort

• D&C #3: Bitonic Champion

• D&C #4: Maximum Subarray

• Solving Recurrences

• Substitution Method

• Recursion-Tree Method

• Master Method

• D&C #5: Matrix Multiplication

• D&C #6: Selection Problem

• D&C #7: Closest Pair of Points Problem

The Marvels of Divide-and-Conquer

Divide-and-Conquer: Part One

(3)

D&C #5: Matrix Multiplication


Textbook Chapter 4.2 – Strassen’s algorithm for matrix multiplication

(4)

Matrix Multiplication Problem

(5)

Naïve Algorithm

• Each entry takes 𝑛 multiplications

• There are 𝑛² entries in total, so the naïve algorithm takes Θ(𝑛³) time
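The counting argument above can be sketched in Python. This is a minimal illustration, assuming square matrices stored as lists of lists; the name `naive_matmul` is hypothetical and not taken from the slides.

```python
def naive_matmul(a, b):
    """Multiply two n x n matrices given as lists of lists."""
    n = len(a)
    c = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Each of the n^2 entries needs n scalar multiplications.
            c[i][j] = sum(a[i][k] * b[k][j] for k in range(n))
    return c
```

The three nested loops over `i`, `j`, and `k` are where the Θ(𝑛³) running time comes from.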


(6)

Matrix Multiplication Problem Complexity

Why?

(7)

Divide-and-Conquer

• We can assume that 𝑛 = 2^𝑘 for simplicity

• Otherwise, we can increase 𝑛 s.t. 𝑛 = 2^⌈log₂ 𝑛⌉

• After this modification, 𝑛 is less than twice its original value

Partition each matrix into four 𝑛/2 × 𝑛/2 submatrices:

A = [A₁₁ A₁₂; A₂₁ A₂₂], B = [B₁₁ B₁₂; B₂₁ B₂₂], C = [C₁₁ C₁₂; C₂₁ C₂₂]

(8)

Algorithm Time Complexity

MatrixMultiply(n, A, B)
    // base case
    if n == 1
        return AB
    // recursive case
    // Divide
    Divide A and B into n/2 by n/2 submatrices
    // Conquer (8 recursive calls) and Combine (4 matrix additions)
    C₁₁ = MatrixMultiply(n/2,A₁₁,B₁₁) + MatrixMultiply(n/2,A₁₂,B₂₁)
    C₁₂ = MatrixMultiply(n/2,A₁₁,B₁₂) + MatrixMultiply(n/2,A₁₂,B₂₂)
    C₂₁ = MatrixMultiply(n/2,A₂₁,B₁₁) + MatrixMultiply(n/2,A₂₂,B₂₁)
    C₂₂ = MatrixMultiply(n/2,A₂₁,B₁₂) + MatrixMultiply(n/2,A₂₂,B₂₂)
    return C

▪ 𝑇(𝑛) = time for running MatrixMultiply(n, A, B)

▪ 8 recursive calls on 𝑛/2 × 𝑛/2 subproblems plus Θ(𝑛²) work for the additions: 𝑇(𝑛) = 8𝑇(𝑛/2) + Θ(𝑛²) = Θ(𝑛³)

(9)

Strassen’s Technique

• Important theoretical breakthrough by Volker Strassen in 1969

• Reduces the running time from Θ(𝑛³) to Θ(𝑛^(log₂ 7)) ≈ Θ(𝑛^2.807)

• The key idea is to reduce the number of recursive calls

• From 8 recursive calls to 7 recursive calls

• At the cost of extra addition and subtraction operations

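Intuition: an expression that takes 4 multiplications and 3 additions, e.g. ac + ad + bc + bd, can be rewritten to take 1 multiplication and 2 additions, e.g. (a + b)(c + d)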

(10)

Strassen’s Algorithm

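• 𝐶 = 𝐴 × 𝐵

M₁ = (A₁₁ + A₂₂)(B₁₁ + B₂₂)    [2 +, 1 ×]
M₂ = (A₂₁ + A₂₂)B₁₁    [1 +, 1 ×]
M₃ = A₁₁(B₁₂ − B₂₂)    [1 −, 1 ×]
M₄ = A₂₂(B₂₁ − B₁₁)    [1 −, 1 ×]
M₅ = (A₁₁ + A₁₂)B₂₂    [1 +, 1 ×]
M₆ = (A₂₁ − A₁₁)(B₁₁ + B₁₂)    [1 +, 1 −, 1 ×]
M₇ = (A₁₂ − A₂₂)(B₂₁ + B₂₂)    [1 +, 1 −, 1 ×]

C₁₁ = M₁ + M₄ − M₅ + M₇    [2 +, 1 −]
C₁₂ = M₃ + M₅    [1 +]
C₂₁ = M₂ + M₄    [1 +]
C₂₂ = M₁ − M₂ + M₃ + M₆    [2 +, 1 −]

Total: 12 additions, 6 subtractions, 7 multiplications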

(11)

Verification of Strassen’s Algorithm

• Practice


(12)

Strassen’s Algorithm Time Complexity

Strassen(n, A, B)
    // base case
    if n == 1
        return AB
    // recursive case
    // Divide
    Divide A and B into n/2 by n/2 submatrices
    // Conquer (7 recursive calls)
    M₁ = Strassen(n/2, A₁₁+A₂₂, B₁₁+B₂₂)
    M₂ = Strassen(n/2, A₂₁+A₂₂, B₁₁)
    M₃ = Strassen(n/2, A₁₁, B₁₂-B₂₂)
    M₄ = Strassen(n/2, A₂₂, B₂₁-B₁₁)
    M₅ = Strassen(n/2, A₁₁+A₁₂, B₂₂)
    M₆ = Strassen(n/2, A₂₁-A₁₁, B₁₁+B₁₂)
    M₇ = Strassen(n/2, A₁₂-A₂₂, B₂₁+B₂₂)
    // Combine
    C₁₁ = M₁ + M₄ - M₅ + M₇
    C₁₂ = M₃ + M₅
    C₂₁ = M₂ + M₄
    C₂₂ = M₁ – M₂ + M₃ + M₆
    return C

▪ 𝑇(𝑛) = time for running Strassen(n,A,B)

▪ 7 recursive calls on 𝑛/2 × 𝑛/2 subproblems plus Θ(𝑛²) work for the additions and subtractions: 𝑇(𝑛) = 7𝑇(𝑛/2) + Θ(𝑛²) = Θ(𝑛^(log₂ 7))
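As a rough check of the seven-product scheme, here is a minimal Python sketch. It assumes 𝑛 is a power of 2 and that matrices are lists of lists; the helper names `add`, `sub`, and `split` are hypothetical. It uses M₆ = (A₂₁ − A₁₁)(B₁₁ + B₁₂), the sign for which C₂₂ = M₁ − M₂ + M₃ + M₆ equals A₂₁B₁₂ + A₂₂B₂₂.

```python
def add(x, y):
    """Entrywise sum of two equally sized matrices."""
    return [[p + q for p, q in zip(r, s)] for r, s in zip(x, y)]

def sub(x, y):
    """Entrywise difference of two equally sized matrices."""
    return [[p - q for p, q in zip(r, s)] for r, s in zip(x, y)]

def split(m):
    """Return the four n/2 x n/2 quadrants (m11, m12, m21, m22)."""
    h = len(m) // 2
    return ([row[:h] for row in m[:h]], [row[h:] for row in m[:h]],
            [row[:h] for row in m[h:]], [row[h:] for row in m[h:]])

def strassen(a, b):
    """Multiply two n x n matrices, n a power of 2 (assumption)."""
    if len(a) == 1:  # base case
        return [[a[0][0] * b[0][0]]]
    a11, a12, a21, a22 = split(a)  # divide
    b11, b12, b21, b22 = split(b)
    m1 = strassen(add(a11, a22), add(b11, b22))  # conquer: 7 calls
    m2 = strassen(add(a21, a22), b11)
    m3 = strassen(a11, sub(b12, b22))
    m4 = strassen(a22, sub(b21, b11))
    m5 = strassen(add(a11, a12), b22)
    m6 = strassen(sub(a21, a11), add(b11, b12))
    m7 = strassen(sub(a12, a22), add(b21, b22))
    c11 = add(sub(add(m1, m4), m5), m7)  # combine
    c12 = add(m3, m5)
    c21 = add(m2, m4)
    c22 = add(add(sub(m1, m2), m3), m6)
    top = [r1 + r2 for r1, r2 in zip(c11, c12)]
    bottom = [r1 + r2 for r1, r2 in zip(c21, c22)]
    return top + bottom
```

Comparing the result against the naïve triple loop on a few small integer matrices is a quick way to do the verification exercise by machine.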

(13)

Practicability of Strassen’s Algorithm

• Disadvantages

1. A larger constant factor than the naïve approach

2. Less numerically stable than the naïve approach

• Larger errors accumulate in non-integer computation due to limited precision

3. The submatrices at each level of recursion consume space

4. Faster algorithms exist for sparse matrices

• Advantages: find the crossover point and combine the two approaches, i.e., switch to the naïve algorithm once the subproblems are small enough

(14)

Matrix Multiplication Upper Bounds

• Each algorithm gives an upper bound

Current lowest upper bound

(15)

Matrix Multiplication Problem Complexity


(16)

D&C #6: Selection Problem

Textbook Chapter 9.3 – Selection in worst-case linear time

(17)

Selection Problem


(18)

n = 10, k = 5

3 7 9 17 5 2 21 18 33 4

(19)

Selection Problem ≦ Sorting Problem

• If the sorting problem can be solved in 𝑂(𝑓(𝑛)), then so can the selection problem, using the following algorithm

• Step 1: sort A into increasing order

• Step 2: output 𝐴[𝑛 − 𝑘 + 1]
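The two steps above fit in a few lines of Python. This is a minimal sketch; the function name `select_by_sorting` is hypothetical, and `k` is 1-indexed with `k = 1` meaning the largest element.

```python
def select_by_sorting(a, k):
    """Return the k-th largest element of a (k is 1-indexed)."""
    s = sorted(a)         # Step 1: sort into increasing order
    return s[len(a) - k]  # Step 2: A[n - k + 1] in 1-indexed notation
```

For the example input 3 7 9 17 5 2 21 18 33 4 with k = 5, the sorted order is 2 3 4 5 7 9 17 18 21 33, so the result is 9.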


(20)

Selection Problem Complexity

Can we make the upper bound better if we do not sort them?

(21)

Divide-and-Conquer

• Idea

• Select a pivot and divide the inputs into two subproblems, where 𝑋_> denotes the elements larger than the pivot

• If 𝑘 ≤ |𝑋_>|, we find the 𝑘-th largest among 𝑋_>

• If 𝑘 > |𝑋_>|, we find the (𝑘 − |𝑋_>|)-th largest among the rest

We want these subproblems to have similar size

→ The best pivot is the median of the input array

(22)

(1) Five Guys per Group

(23)

(2) A Median per Group


small number → large number

(24)

(3) Median of Medians (MoM)

small number → large number

(25)

(4) Partition via MoM


Larger than MoM Smaller than MoM

MoM

(26)

(5) Recursion

• Three cases

1. If 𝑘 ≤ |𝑋_>|, then output the 𝑘-th largest number in 𝑋_>

2. If 𝑘 = |𝑋_>| + 1, then output MoM

3. If 𝑘 > |𝑋_>| + 1, then output the (𝑘 − |𝑋_>| − 1)-th largest number in 𝑋_<

• Practice: prove correctness by induction

(Figure: elements smaller than MoM, MoM, elements larger than MoM)

(27)

Two Recursive Steps

• Step (2): Determining MoM

• Step (5): Selection in X_< or X_>


(28)

Divide-and-Conquer for Selection

Selection(X, k)
    // base case
    if |X| <= 4
        sort X in decreasing order and return X[k]
    // recursive case
    Divide X into ⌈|X|/5⌉ groups of size at most 5
    M[i] = median from group i
    MoM = Selection(M, ⌈|M|/2⌉)
    for i = 1 … |X|
        if X[i] > MoM
            insert X[i] into X2
        else if X[i] < MoM
            insert X[i] into X1
    if |X2| == k – 1
        return MoM
    if |X2| > k – 1
        return Selection(X2, k)
    return Selection(X1, k - |X2| - 1)
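A runnable Python sketch of the median-of-medians procedure follows. It assumes all elements are distinct (duplicates of the pivot would need an extra case), `k` is 1-indexed with `k = 1` meaning the largest element, and the name `select` is hypothetical.

```python
def select(x, k):
    """Return the k-th largest element of x; elements assumed distinct."""
    if len(x) <= 4:  # base case: sort directly
        return sorted(x, reverse=True)[k - 1]
    # (1) groups of five, (2) one median per group
    groups = [x[i:i + 5] for i in range(0, len(x), 5)]
    medians = [sorted(g)[len(g) // 2] for g in groups]
    # (3) median of medians, found recursively
    mom = select(medians, (len(medians) + 1) // 2)
    # (4) partition around MoM
    larger = [v for v in x if v > mom]
    smaller = [v for v in x if v < mom]
    # (5) recurse into the side that contains the answer
    if len(larger) == k - 1:
        return mom
    if len(larger) > k - 1:
        return select(larger, k)
    return select(smaller, k - len(larger) - 1)
```

The two recursive calls correspond to the two recursive steps: one on the list of medians and one on either `larger` or `smaller`.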

(29)

Candidates for Consideration

• If 𝑘 ≤ |𝑋_>|, then output the 𝑘-th largest number in 𝑋_>

• If 𝑘 > |𝑋_>| + 1, then output the (𝑘 − |𝑋_>| − 1)-th largest number in 𝑋_<

In either case, we delete at least (𝑛/5) ÷ 2 × 3 = (3/10)𝑛 guys

(30)

D&C Algorithm Complexity

• 𝑇(𝑛) = time for running Selection(X, k) with |X| = n

• Intuition

(31)

Theorem

• Theorem

• Proof

• There exist positive constants 𝑎, 𝑏 s.t.

• Use induction to prove

• n = 1, 𝑎 > 𝑐

• n > 1,


Inductive hypothesis

select 𝑐 > 10𝑏

(32)

Selection Problem Complexity

(33)

D&C #7: Closest Pair of Points


Textbook Chapter 33.4 – Finding the closest pair of points

(34)

Closest Pair of Points Problem

• Input: 𝑛 ≥ 2 points, where 𝑝_𝑖 = (𝑥_𝑖, 𝑦_𝑖) for 0 ≤ 𝑖 < 𝑛

• Output: two points 𝑝_𝑖 and 𝑝_𝑗 that are closest

• “Closest”: smallest Euclidean distance

• Euclidean distance between 𝑝_𝑖 and 𝑝_𝑗: √((𝑥_𝑖 − 𝑥_𝑗)² + (𝑦_𝑖 − 𝑦_𝑗)²)

▪ Brute-force algorithm

▪ Check all pairs of points: Θ(C(𝑛, 2)) = Θ(𝑛²)
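The brute-force check can be written directly with the standard library. This is a minimal sketch, assuming points are `(x, y)` tuples and Python 3.8+ for `math.dist`; the function name is hypothetical.

```python
from itertools import combinations
from math import dist

def closest_pair_brute_force(points):
    """Return (distance, p, q) for the closest pair among n >= 2 points."""
    # combinations(points, 2) enumerates all C(n, 2) unordered pairs.
    return min((dist(p, q), p, q) for p, q in combinations(points, 2))
```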

(35)

Closest Pair of Points Problem

• 1D:

• Sort all points

• Scan the sorted points to find the closest pair in one pass

• We only need to examine the adjacent points

• 2D:
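For the 1D case described above, sorting followed by one pass over adjacent points is enough. A minimal sketch, assuming the input is a list of at least two numbers; the name `closest_pair_1d` is hypothetical.

```python
def closest_pair_1d(xs):
    """Return (gap, a, b) for the two closest numbers in xs (len >= 2)."""
    s = sorted(xs)
    # After sorting, the closest pair must be adjacent.
    return min((b - a, a, b) for a, b in zip(s, s[1:]))
```

The sort dominates the running time, so this takes O(𝑛 log 𝑛).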


(36)

Divide-and-Conquer Algorithm

• Divide: divide points evenly along x-coordinate

• Conquer: find closest pair in each region recursively

• Combine: find the closest pair with one point in each region, and return the best of three solutions

Example: left-min = 10, right-min = 13, cross-min = 7

(37)

Cross Two Regions

• Algo 1: check all pairs that cross two regions → 𝑛/2 × 𝑛/2 combinations

• Algo 2: only consider points within 𝛿 of the cut, 𝛿 = min{left-min, right-min}

• Any cross pair that includes a point farther than 𝛿 from the cut has distance larger than 𝛿

Example: left-min = 10, right-min = 13, cross-min = 7, so 𝛿 = 10

Shrink the search range!

(38)

Cross Two Regions

• Algo 3: only consider pairs within 𝛿 × 2𝛿 blocks

• Obs 1: every pair with distance smaller than 𝛿 must appear in a 𝛿 × 2𝛿 block

What if we are unlucky and all the points cluster inside a single 𝛿 × 2𝛿 block?

Shrink the search range!

(39)

Cross Two Regions

• Obs 2: there are at most 8 points in a 𝛿 × 2𝛿 block

• Each 𝛿/2 × 𝛿/2 block contains at most 1 point; otherwise the left or right region would have returned a distance smaller than 𝛿

(40)

Cross Two Regions

• Obs 2: there are at most 8 points in a 𝛿 × 2𝛿 block

(Figure: a sorted point p_i and the following points p_i+2, p_i+3, p_i+4, p_i+5 inside the strip)

Find-closest-pair-across-regions

1. Sort the points within 𝛿 of the cut (yellow region) by y-value

2. For each sorted point 𝑝_𝑖, compute the distance with 𝑝_𝑖+1, 𝑝_𝑖+2, …, 𝑝_𝑖+7

3. Return the smallest one

At most 7 distance calculations needed per point

(41)

Algorithm Complexity

• 𝑇(𝑛) = time for running Closest-Pair(P) with |P| = n

Closest-Pair(P)
    // termination condition (base case)
    if |P| <= 3
        brute-force finding closest pair and return it
    // Divide
    find a vertical line L s.t. both half-planes contain half of the points
    // Conquer (by recursion)
    left-pair, left-min = Closest-Pair(points in the left)
    right-pair, right-min = Closest-Pair(points in the right)
    // Combine
    delta = min{left-min, right-min}
    remove points that are delta or more away from L // Obs 1
    sort remaining points by y-coordinate into p₀, …, p_k
    for point p_i:
        compute distances with p_i+1, p_i+2, …, p_i+7 // Obs 2
        update delta if a closer pair is found
    return the closest pair and its distance

• Sorting inside every call costs 𝑂(𝑛 log 𝑛), so 𝑇(𝑛) = 2𝑇(𝑛/2) + 𝑂(𝑛 log 𝑛) = 𝑂(𝑛 log² 𝑛)

Exercise 4.6-2
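A compact Python sketch of this version of Closest-Pair is given below. It sorts the strip by y inside every call, so it corresponds to the variant without preprocessing. The names `closest_pair` and `_closest` are hypothetical, points are assumed to be `(x, y)` tuples with at least two in the input, and `math.dist` needs Python 3.8+.

```python
from itertools import combinations
from math import dist

def closest_pair(points):
    """Return (distance, p, q) for the closest pair of (x, y) points."""
    return _closest(sorted(points))  # sort by x once, then recurse

def _closest(px):
    n = len(px)
    if n <= 3:  # base case: brute force
        return min((dist(p, q), p, q) for p, q in combinations(px, 2))
    mid = n // 2
    line_x = px[mid][0]  # x-coordinate of the vertical line L
    # Conquer: solve both halves, keep the better result.
    best = min(_closest(px[:mid]), _closest(px[mid:]))
    delta = best[0]
    # Obs 1: only points within delta of L can form a closer cross pair.
    strip = sorted((p for p in px if abs(p[0] - line_x) < delta),
                   key=lambda p: p[1])
    for i, p in enumerate(strip):
        for q in strip[i + 1:i + 8]:  # Obs 2: the next 7 points suffice
            d = dist(p, q)
            if d < best[0]:
                best = (d, p, q)
    return best
```

Because both halves always contain at least two points when 𝑛 ≥ 4, the brute-force base case never sees fewer than two points.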

(42)

Preprocessing

• Idea: do not sort inside the recursive case

// preprocessing, done once before the first call
sort P by x- and y-coordinate and store in Px and Py

Closest-Pair(P)
    // termination condition (base case)
    if |P| <= 3
        brute-force finding closest pair and return it
    // Divide
    find a vertical line L s.t. both half-planes contain half of the points
    // Conquer (by recursion)
    left-pair, left-min = Closest-Pair(points in the left)
    right-pair, right-min = Closest-Pair(points in the right)
    // Combine
    delta = min{left-min, right-min}
    remove points that are delta or more away from L // Obs 1
    for point p_i in sorted candidates
        compute distances with p_i+1, p_i+2, …, p_i+7 // Obs 2
        update delta if a closer pair is found
    return the closest pair and its distance

• With presorted points the combine step takes 𝑂(𝑛), so 𝑇(𝑛) = 2𝑇(𝑛/2) + 𝑂(𝑛) = 𝑂(𝑛 log 𝑛)

(43)

Closest Pair of Points Problem

• 𝑂(𝑛) expected-time algorithm

• Taking advantage of randomization

• Chapter 13.7 of Algorithm Design by Kleinberg & Tardos

• Samir Khuller and Yossi Matias. 1995. A simple randomized sieve algorithm for the closest-pair problem. Inf. Comput. 118, 1 (April 1995), 34-37.

(44)

Concluding Remarks

• When to use D&C

• Whether the problem with small inputs can be solved directly

• Whether subproblem solutions can be combined into the original solution

• Whether the overall complexity is better than naïve

• Note

• Try different ways of dividing

• D&C may be suboptimal due to repetitive computations

• Example.

• D&C algo for Fibonacci:

Fibonacci(n)
    if n < 2
        return 1
    return Fibonacci(n-1) + Fibonacci(n-2)

• Bottom-up algo for Fibonacci:

Fibonacci(n)
    a[0]=1
    a[1]=1
    for i = 2 … n
        a[i]=a[i-1]+a[i-2]
    return a[n]

(Figure: the three D&C steps: 1. Divide, 2. Conquer, 3. Combine)
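The contrast between the two Fibonacci algorithms can be tried out in Python. This is a minimal sketch with hypothetical names `fib_dc` and `fib_bottom_up`, using the same convention as the pseudocode (the values at indices 0 and 1 are both 1).

```python
def fib_dc(n):
    """Divide-and-conquer version: recomputes the same subproblems many times."""
    if n < 2:
        return 1
    return fib_dc(n - 1) + fib_dc(n - 2)

def fib_bottom_up(n):
    """Bottom-up version: each value is computed exactly once."""
    a = [1, 1]
    for i in range(2, n + 1):
        a.append(a[i - 1] + a[i - 2])
    return a[n]
```

Both return the same values, but `fib_dc` makes exponentially many calls in `n`, while `fib_bottom_up` performs a linear number of additions.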

Our next topic: Dynamic Programming

“a technique for solving problems with overlapping subproblems”

(45)

Questions?

Important announcements will be sent to your @ntu.edu.tw mailbox and posted to the course website

Course Website: http://ada.miulab.tw Email: [email protected]