Algorithm Design and Analysis Divide and Conquer (3)
Yun-Nung (Vivian) Chen
http://ada.miulab.tw
Outline
• Recurrence
• Divide-and-Conquer
• D&C #1: Tower of Hanoi
• D&C #2: Merge Sort
• D&C #3: Bitonic Champion
• D&C #4: Maximum Subarray
• Solving Recurrences
• Substitution Method
• Recursion-Tree Method
• Master Method
• D&C #5: Matrix Multiplication
• D&C #6: Selection Problem
• D&C #7: Closest Pair of Points Problem
The Wondrous Art of Divide-and-Conquer
Divide-and-Conquer: First Installment
D&C #5: Matrix Multiplication
Textbook Chapter 4.2 – Strassen’s algorithm for matrix multiplication
Matrix Multiplication Problem
Naïve Algorithm
• Each entry takes 𝑛 multiplications
• There are 𝑛² entries in total
• Total: 𝑛 × 𝑛² multiplications, i.e. Θ(𝑛³) time
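The counting argument above can be made concrete with a minimal sketch of the naïve algorithm. The function name `naive_matmul` and the list-of-lists matrix representation are assumptions made for illustration, not part of the slides.

```python
def naive_matmul(A, B):
    """Multiply two n x n matrices given as lists of lists."""
    n = len(A)
    C = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Each of the n^2 entries is a dot product: n multiplications.
            for k in range(n):
                C[i][j] += A[i][k] * B[k][j]
    return C
```

The three nested loops of length 𝑛 are where the Θ(𝑛³) bound comes from.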
Matrix Multi. Problem Complexity
Why?
Divide-and-Conquer
• We can assume that 𝑛 = 2^𝑘 for simplicity
• Otherwise, we can increase 𝑛 to 2^⌈log₂ 𝑛⌉ (pad the matrices with zeros)
• The padded 𝑛 is less than twice the original 𝑛, so the asymptotic bound is unchanged
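A small sketch of the padding step, assuming zero padding and list-of-lists matrices. The helper names `next_power_of_two` and `pad_to_power_of_two` are hypothetical, introduced only for this example.

```python
def next_power_of_two(n):
    """Smallest power of two that is >= n, for n >= 1."""
    p = 1
    while p < n:
        p *= 2
    return p


def pad_to_power_of_two(A):
    """Embed an n x n matrix in the top-left corner of an m x m zero matrix."""
    n = len(A)
    m = next_power_of_two(n)
    padded = [[0] * m for _ in range(m)]
    for i in range(n):
        for j in range(n):
            padded[i][j] = A[i][j]
    return padded
```

Because the padded size is always below 2𝑛, padding changes the running time by at most a constant factor.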
[C11 C12; C21 C22] = [A11 A12; A21 A22] × [B11 B12; B21 B22]
Algorithm Time Complexity
MatrixMultiply(n, A, B)
    // base case
    if n == 1
        return AB
    // recursive case
    // Divide
    Divide A and B into n/2 by n/2 submatrices
    // Conquer (8 recursive calls) and Combine (4 matrix additions)
    C11 = MatrixMultiply(n/2,A11,B11) + MatrixMultiply(n/2,A12,B21)
    C12 = MatrixMultiply(n/2,A11,B12) + MatrixMultiply(n/2,A12,B22)
    C21 = MatrixMultiply(n/2,A21,B11) + MatrixMultiply(n/2,A22,B21)
    C22 = MatrixMultiply(n/2,A21,B12) + MatrixMultiply(n/2,A22,B22)
    return C
▪ 𝑇(𝑛) = time for running MatrixMultiply(n, A, B)
▪ 𝑇(𝑛) = 8𝑇(𝑛/2) + Θ(𝑛²) for 𝑛 > 1, so 𝑇(𝑛) = Θ(𝑛³)
Strassen’s Technique
• Important theoretical breakthrough by Volker Strassen in 1969
• Reduces the running time from Θ(𝑛³) to Θ(𝑛^(log₂ 7)) ≈ Θ(𝑛^2.807)
• The key idea is to reduce the number of recursive calls
• From 8 recursive calls to 7 recursive calls
• At the cost of extra addition and subtraction operations
Intuition: ac + ad + bc + bd (4 multiplications, 3 additions) = (a + b)(c + d) (1 multiplication, 2 additions)
Strassen’s Algorithm
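• 𝐶 = 𝐴 × 𝐵
Product / block                  Operations
M1 = (A11 + A22)(B11 + B22)      2 +, 1 ×
M2 = (A21 + A22) B11             1 +, 1 ×
M3 = A11 (B12 − B22)             1 −, 1 ×
M4 = A22 (B21 − B11)             1 −, 1 ×
M5 = (A11 + A12) B22             1 +, 1 ×
M6 = (A21 − A11)(B11 + B12)      1 +, 1 −, 1 ×
M7 = (A12 − A22)(B21 + B22)      1 +, 1 −, 1 ×
C11 = M1 + M4 − M5 + M7          2 +, 1 −
C12 = M3 + M5                    1 +
C21 = M2 + M4                    1 +
C22 = M1 − M2 + M3 + M6          2 +, 1 −
Total                            12 +, 6 −, 7 ×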
Verification of Strassen’s Algorithm
• Practice
Strassen’s Algorithm Time Complexity
Strassen(n, A, B)
    // base case
    if n == 1
        return AB
    // recursive case
    // Divide
    Divide A and B into n/2 by n/2 submatrices
    // Conquer (7 recursive calls)
    M1 = Strassen(n/2, A11+A22, B11+B22)
    M2 = Strassen(n/2, A21+A22, B11)
    M3 = Strassen(n/2, A11, B12-B22)
    M4 = Strassen(n/2, A22, B21-B11)
    M5 = Strassen(n/2, A11+A12, B22)
    M6 = Strassen(n/2, A21-A11, B11+B12)
    M7 = Strassen(n/2, A12-A22, B21+B22)
    // Combine
    C11 = M1 + M4 - M5 + M7
    C12 = M3 + M5
    C21 = M2 + M4
    C22 = M1 - M2 + M3 + M6
    return C
▪ 𝑇(𝑛) = time for running Strassen(n, A, B)
▪ 𝑇(𝑛) = 7𝑇(𝑛/2) + Θ(𝑛²) for 𝑛 > 1, so 𝑇(𝑛) = Θ(𝑛^(log₂ 7))
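As a way to do the verification exercise numerically, here is a minimal Python sketch of the recursion. It assumes 𝑛 is a power of two and matrices are lists of lists. The helpers `add`, `sub` and `split` are hypothetical names. The sketch uses M6 = (A21 − A11)(B11 + B12), the sign that makes C22 = M1 − M2 + M3 + M6 equal A21·B12 + A22·B22.

```python
def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def sub(X, Y):
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def split(X):
    """Split a square matrix of even size into its four quadrants."""
    h = len(X) // 2
    return ([r[:h] for r in X[:h]], [r[h:] for r in X[:h]],
            [r[:h] for r in X[h:]], [r[h:] for r in X[h:]])


def strassen(A, B):
    """Multiply n x n matrices (n a power of two) with 7 recursive calls."""
    n = len(A)
    if n == 1:                                   # base case
        return [[A[0][0] * B[0][0]]]
    A11, A12, A21, A22 = split(A)                # Divide
    B11, B12, B21, B22 = split(B)
    M1 = strassen(add(A11, A22), add(B11, B22))  # Conquer
    M2 = strassen(add(A21, A22), B11)
    M3 = strassen(A11, sub(B12, B22))
    M4 = strassen(A22, sub(B21, B11))
    M5 = strassen(add(A11, A12), B22)
    M6 = strassen(sub(A21, A11), add(B11, B12))
    M7 = strassen(sub(A12, A22), add(B21, B22))
    C11 = add(sub(add(M1, M4), M5), M7)          # Combine
    C12 = add(M3, M5)
    C21 = add(M2, M4)
    C22 = add(add(sub(M1, M2), M3), M6)
    top = [left + right for left, right in zip(C11, C12)]
    bottom = [left + right for left, right in zip(C21, C22)]
    return top + bottom
```

Comparing the result with a direct triple-loop product on a few small integer matrices is a quick sanity check on the seven formulas.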
Practicability of Strassen’s Algorithm
• Disadvantages
1. Larger constant factor than the naïve approach
2. Less numerically stable than the naïve approach
• Larger errors accumulate in non-integer computation due to limited precision
3. The submatrices at each level of recursion consume extra space
4. Faster algorithms exist for sparse matrices
• Advantages: find the crossover point and combine the two algorithms (recurse with Strassen’s algorithm on large matrices and switch to the naïve algorithm below the crossover size)
Matrix Multiplication Upper Bounds
• Each algorithm gives an upper bound
Current lowest upper bound
Matrix Multi. Problem Complexity
D&C #6: Selection Problem
Textbook Chapter 9.3 – Selection in worst-case linear time
Selection Problem
Example: 𝑛 = 10, 𝑘 = 5, 𝐴 = [3, 7, 9, 17, 5, 2, 21, 18, 33, 4] → the 5th largest number is 9
Selection Problem ≦ Sorting Problem
• If the sorting problem can be solved in 𝑂(𝑓(𝑛)) time, so can the selection problem, using the following algorithm
• Step 1: sort A into increasing order
• Step 2: output 𝐴[𝑛 − 𝑘 + 1]
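The two steps above translate directly into a short sketch. The function name `select_by_sorting` is an assumption for illustration. The slide's 1-based index 𝐴[𝑛 − 𝑘 + 1] becomes the 0-based index 𝑛 − 𝑘 in Python.

```python
def select_by_sorting(A, k):
    """Return the k-th largest element of A, for 1 <= k <= len(A)."""
    s = sorted(A)            # Step 1: sort into increasing order
    return s[len(A) - k]     # Step 2: A[n - k + 1] in 1-based indexing
```

With a comparison sort this takes 𝑂(𝑛 log 𝑛) time, which gives the upper bound that the divide-and-conquer approach then tries to beat.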
Selection Problem Complexity
Can we make the upper bound better if we do not sort them?
Divide-and-Conquer
• Idea
• Select a pivot and divide the inputs into two subproblems: 𝑋> (numbers larger than the pivot) and 𝑋< (the remaining numbers)
• If 𝑘 ≤ |𝑋>|, we find the 𝑘-th largest in 𝑋>
• If 𝑘 > |𝑋>|, we find the (𝑘 − |𝑋>|)-th largest in 𝑋<
We want these subproblems to have similar size
→ The best pivot is the median of the input array
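(1) Five Guys per Group: divide the 𝑛 numbers into groups of five
(2) A Median per Group: find the median of each group (in the figure, each group is ordered small number → large number)
(3) Median of Medians (MoM): find the median of the group medians
(4) Partition via MoM: split the input into numbers larger than MoM and numbers smaller than MoM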
(5) Recursion
• Three cases
1. If 𝑘 ≤ |𝑋>|, then output the 𝑘-th largest number in 𝑋>
2. If 𝑘 = |𝑋>| + 1, then output MoM
3. If 𝑘 > |𝑋>| + 1, then output the (𝑘 − |𝑋>| − 1)-th largest number in 𝑋<
• Practice: prove correctness by induction
Two Recursive Steps
• Step (2): Determining MoM
• Step (5): Selection in X< or X>
Divide-and-Conquer for Selection
Selection(X, k)
    // base case
    if |X| <= 4
        sort X in decreasing order and return X[k]
    // recursive case
    Divide X into ⌈|X|/5⌉ groups of size 5 (the last group may be smaller)
    M[i] = median from group i
    MoM = Selection(M, ⌈|M|/2⌉)
    // X2 = numbers larger than MoM, X1 = numbers smaller than MoM
    for i = 1 … |X|
        if X[i] > MoM
            insert X[i] into X2
        else if X[i] < MoM
            insert X[i] into X1
    if |X2| == k - 1
        return MoM
    if |X2| > k - 1
        return Selection(X2, k)
    return Selection(X1, k - |X2| - 1)
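The pseudocode above can be sketched in Python as follows. This is an illustrative version under a few assumptions: the function name `select` is hypothetical, the base case is extended to lists of up to five elements, and copies of MoM are counted explicitly so that repeated values are handled.

```python
def select(X, k):
    """Return the k-th largest element of X, for 1 <= k <= len(X)."""
    if len(X) <= 5:                                   # base case
        return sorted(X, reverse=True)[k - 1]
    # Steps (1)-(2): groups of five and the median of each group.
    groups = [X[i:i + 5] for i in range(0, len(X), 5)]
    medians = [sorted(g)[len(g) // 2] for g in groups]
    # Step (3): median of medians, found recursively.
    mom = select(medians, (len(medians) + 1) // 2)
    # Step (4): partition around MoM.
    larger = [x for x in X if x > mom]
    smaller = [x for x in X if x < mom]
    equal = len(X) - len(larger) - len(smaller)       # copies of MoM
    # Step (5): recurse only into the side that holds the answer.
    if k <= len(larger):
        return select(larger, k)
    if k <= len(larger) + equal:
        return mom
    return select(smaller, k - len(larger) - equal)
```

Both recursive calls work on strictly smaller lists, because MoM itself is never passed down, so the recursion always terminates.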
Candidates for Consideration
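• If 𝑘 ≤ |𝑋>|, then output the 𝑘-th largest number in 𝑋>
• If 𝑘 > |𝑋>| + 1, then output the (𝑘 − |𝑋>| − 1)-th largest number in 𝑋<
• In either case, about half of the 𝑛/5 group medians lie on the discarded side of MoM, and each of them takes at least 3 numbers of its group with it
• Deleting at least (𝑛/5 ÷ 2) × 3 = (3/10)𝑛 guys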
D&C Algorithm Complexity
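• 𝑇(𝑛) = time for running Selection(X, k) with |X| = n
• 𝑇(𝑛) ≤ 𝑇(𝑛/5) + 𝑇(7𝑛/10) + Θ(𝑛): 𝑇(𝑛/5) for determining MoM, 𝑇(7𝑛/10) for the selection in 𝑋< or 𝑋>
• Intuition: 𝑛/5 + 7𝑛/10 = 9𝑛/10 < 𝑛, so the subproblem sizes shrink geometrically and the total work is 𝑂(𝑛)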
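Theorem
• Theorem: 𝑇(𝑛) ≤ 𝑇(𝑛/5) + 𝑇(7𝑛/10) + Θ(𝑛) implies 𝑇(𝑛) = 𝑂(𝑛)
• Proof
• There exist positive constants 𝑎, 𝑏 s.t. 𝑇(𝑛) ≤ 𝑎 if 𝑛 = 1, and 𝑇(𝑛) ≤ 𝑇(𝑛/5) + 𝑇(7𝑛/10) + 𝑏𝑛 if 𝑛 > 1
• Use induction to prove 𝑇(𝑛) ≤ 𝑐𝑛 for a suitable constant 𝑐
• 𝑛 = 1: 𝑇(1) ≤ 𝑎 ≤ 𝑐, select 𝑐 > 𝑎
• 𝑛 > 1: 𝑇(𝑛) ≤ 𝑐𝑛/5 + 7𝑐𝑛/10 + 𝑏𝑛 (inductive hypothesis) = (9𝑐/10 + 𝑏)𝑛 ≤ 𝑐𝑛, select 𝑐 > 10𝑏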
Selection Problem Complexity
D&C #7: Closest Pair of Points
Textbook Chapter 33.4 – Finding the closest pair of points
Closest Pair of Points Problem
• Input: 𝑛 ≥ 2 points, where 𝑝ᵢ = (𝑥ᵢ, 𝑦ᵢ) for 0 ≤ 𝑖 < 𝑛
• Output: two points 𝑝ᵢ and 𝑝ⱼ that are closest
• “Closest”: smallest Euclidean distance
• Euclidean distance between 𝑝ᵢ and 𝑝ⱼ: √((𝑥ᵢ − 𝑥ⱼ)² + (𝑦ᵢ − 𝑦ⱼ)²)
▪ Brute-force algorithm
▪ Check all pairs of points: Θ(C(𝑛, 2)) = Θ(𝑛²)
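A minimal sketch of the brute-force algorithm, assuming points are `(x, y)` tuples. The function name `closest_pair_brute_force` is hypothetical. `math.dist` requires Python 3.8 or later.

```python
from itertools import combinations
from math import dist


def closest_pair_brute_force(points):
    """Return (distance, p, q) for the closest pair among >= 2 points."""
    best = None
    # combinations(points, 2) enumerates all C(n, 2) unordered pairs.
    for p, q in combinations(points, 2):
        d = dist(p, q)
        if best is None or d < best[0]:
            best = (d, p, q)
    return best
```

This version is also what the divide-and-conquer algorithm falls back to in its base case.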
Closest Pair of Points Problem
• 1D:
• Sort all points
• Scan the sorted points to find the closest pair in one pass
• We only need to examine the adjacent points
• 2D:
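The 1D procedure above (sort, then scan adjacent points) fits in a few lines. The function name `closest_pair_1d` is an assumption made for this example.

```python
def closest_pair_1d(xs):
    """Return (gap, a, b) for the two closest numbers in xs, len(xs) >= 2."""
    s = sorted(xs)
    # After sorting, the closest pair must be adjacent, so one pass suffices.
    return min((b - a, a, b) for a, b in zip(s, s[1:]))
```

Sorting dominates the cost, so the 1D problem takes 𝑂(𝑛 log 𝑛) time. In 2D the closest pair need not be adjacent in either coordinate order, which is why a different approach is needed.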
Divide-and-Conquer Algorithm
• Divide: divide points evenly along x-coordinate
• Conquer: find closest pair in each region recursively
• Combine: find the closest pair with one point in each region, and return the best of the three solutions
Example (figure): left-min = 10, right-min = 13, cross-min = 7 → return the cross pair with distance 7
Cross Two Regions
• Algo 1: check all pairs that cross two regions → 𝑛/2 × 𝑛/2 combinations
• Algo 2: only consider points within 𝛿 of the cut, 𝛿 = min{left-min, right-min}
• Other pairs of points must have distance larger than 𝛿
Narrow the search range!
• Algo 3: only consider pairs within 𝛿 × 2𝛿 blocks
• Obs 1: every pair with smaller than 𝛿 distance must appear in a 𝛿 × 2𝛿 block
What if we are unlucky and all the points cluster inside a single 𝛿 × 2𝛿 block?
• Obs 2: there are at most 8 points in a 𝛿 × 2𝛿 block
• Each 𝛿/2 × 𝛿/2 block contains at most 1 point; otherwise two points on the same side of the cut would be less than 𝛿 apart, contradicting 𝛿 = min{left-min, right-min}
Find-closest-pair-across-regions
1. Sort the points within 𝛿 of the cut (yellow region) by y-value
2. For each sorted point 𝑝ᵢ, compute its distance to 𝑝ᵢ₊₁, 𝑝ᵢ₊₂, …, 𝑝ᵢ₊₇
3. Return the smallest one
At most 7 distance calculations needed per point
Algorithm Complexity
Closest-Pair(P)
    // termination condition (base case)
    if |P| <= 3
        brute-force finding closest pair and return it
    // Divide
    find a vertical line L s.t. both half-planes contain half of the points
    // Conquer (by recursion)
    left-pair, left-min = Closest-Pair(points in the left)
    right-pair, right-min = Closest-Pair(points in the right)
    // Combine
    delta = min{left-min, right-min}
    remove points that are delta or more away from L // Obs 1
    sort remaining points by y-coordinate into p0, …, pk
    for point pi:
        compute distances with pi+1, pi+2, …, pi+7 // Obs 2
        update delta if a closer pair is found
    return the closest pair and its distance
• 𝑇(𝑛) = time for running Closest-Pair(P) with |P| = n
• 𝑇(𝑛) = 2𝑇(𝑛/2) + 𝑂(𝑛 log 𝑛), so 𝑇(𝑛) = 𝑂(𝑛 log² 𝑛) (Exercise 4.6-2)
Preprocessing
• Idea: do not sort inside the recursive case
Closest-Pair(P)
    // preprocessing (done once, before the recursion)
    sort P by x- and y-coordinate and store in Px and Py
    // termination condition (base case)
    if |P| <= 3
        brute-force finding closest pair and return it
    // Divide
    find a vertical line L s.t. both half-planes contain half of the points
    // Conquer (by recursion)
    left-pair, left-min = Closest-Pair(points in the left)
    right-pair, right-min = Closest-Pair(points in the right)
    // Combine
    delta = min{left-min, right-min}
    remove points that are delta or more away from L // Obs 1
    for point pi in sorted candidates
        compute distances with pi+1, pi+2, …, pi+7 // Obs 2
        update delta if a closer pair is found
    return the closest pair and its distance
• With presorting, 𝑇(𝑛) = 2𝑇(𝑛/2) + 𝑂(𝑛) after an 𝑂(𝑛 log 𝑛) preprocessing step, so the total time is 𝑂(𝑛 log 𝑛)
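One way to realize the presorting idea is sketched below. It assumes the input points are distinct `(x, y)` tuples, so that splitting the y-sorted list by comparison with the first point of the right half matches the split of the x-sorted list. The names `closest_distance` and `_solve` are hypothetical. For brevity the sketch returns only the distance.

```python
from math import dist


def _solve(px, py):
    """px: points sorted by x (ties by y); py: the same points sorted by y."""
    n = len(px)
    if n <= 3:                                        # base case
        return min(dist(p, q) for i, p in enumerate(px) for q in px[i + 1:])
    mid = n // 2
    pivot = px[mid]                                   # first point of the right half
    # Split py in one linear pass; both halves stay sorted by y.
    left_y = [p for p in py if p < pivot]
    right_y = [p for p in py if p >= pivot]
    delta = min(_solve(px[:mid], left_y), _solve(px[mid:], right_y))
    # Candidates near the cut are already in y order: no sorting here.
    strip = [p for p in py if abs(p[0] - pivot[0]) < delta]
    for i, p in enumerate(strip):
        for q in strip[i + 1:i + 8]:                  # next 7 points
            delta = min(delta, dist(p, q))
    return delta


def closest_distance(points):
    """Smallest pairwise distance among >= 2 distinct (x, y) tuples."""
    px = sorted(points)                               # preprocessing, done once
    py = sorted(points, key=lambda p: (p[1], p[0]))
    return _solve(px, py)
```

Every step inside `_solve` is linear in the number of points, so the only sorting cost is the one-time preprocessing.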
Closest Pair of Points Problem
• 𝑂(𝑛) algorithm
• Taking advantage of randomization
• Chapter 13.7 of Algorithm Design by Kleinberg & Tardos
• Samir Khuller and Yossi Matias. 1995. A simple randomized sieve algorithm for the closest-pair problem. Inf. Comput. 118, 1 (April 1995), 34-37.
Concluding Remarks
• When to use D&C
• Whether the problem with small inputs can be solved directly
• Whether subproblem solutions can be combined into the original solution
• Whether the overall complexity is better than naïve
• Note
• Try different ways of dividing
• D&C may be suboptimal due to repetitive computations
• Example.
• D&C algo for Fibonacci:
Fibonacci(n)
    if n < 2
        return 1
    return Fibonacci(n-1) + Fibonacci(n-2)
• Bottom-up algo for Fibonacci:
Fibonacci(n)
    a[0] = 1
    a[1] = 1
    for i = 2 … n
        a[i] = a[i-1] + a[i-2]
    return a[n]
(Figure: the three D&C steps: 1. Divide, 2. Conquer, 3. Combine)
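To make the cost of repeated computation visible, the following sketch counts the recursive calls of the D&C version and compares it with the bottom-up version. The names `fib_dc` and `fib_bottom_up` and the one-element `counter` list are assumptions made for illustration. Both functions follow the slide's convention that Fibonacci(0) = Fibonacci(1) = 1.

```python
def fib_dc(n, counter):
    """Divide-and-conquer Fibonacci; counter[0] accumulates the number of calls."""
    counter[0] += 1
    if n < 2:
        return 1
    return fib_dc(n - 1, counter) + fib_dc(n - 2, counter)


def fib_bottom_up(n):
    """Bottom-up Fibonacci: each value is computed exactly once."""
    a = [1, 1]
    for i in range(2, n + 1):
        a.append(a[i - 1] + a[i - 2])
    return a[n]


calls = [0]
value = fib_dc(10, calls)
print(value, calls[0])        # prints: 89 177
print(fib_bottom_up(10))      # prints: 89
```

The D&C version makes 177 calls to compute a value that the bottom-up loop reaches in 9 additions, because the same subproblems are solved again and again.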
Our next topic: Dynamic Programming
“a technique for solving problems with overlapping subproblems”
Questions?
Important announcements will be sent to your @ntu.edu.tw mailbox and posted on the course website
Course Website: http://ada.miulab.tw
Email: ada-ta@csie.ntu.edu.tw