Algorithm Design and Analysis
Slides credited from Hsueh-I Lu, Hsu-Chun Hsiao, & Michael Tsai
Midterm Feedback
• Mini-HW
• NTU COOL
• Helpful TAs
• Course recordings (access the channel here)
• Instant feedback
• Grade release
• Course recordings (two classes & last year)
• Homework hints
• TA hour changes
Outline
• DFS Applications
• Connected Components
• Strongly Connected Components
• Topological Sorting
• Minimal Spanning Trees (MST)
• Borůvka’s Algorithm
• Kruskal’s Algorithm
• Prim’s Algorithm
Depth-First Search
Textbook Chapter 22.3 – Depth-first search
Depth-First Search (DFS)
• Search as deep as possible and then backtrack until finding a new path
Connected Components
Connected Components Problem
• Input: a graph 𝐺 = (𝑉, 𝐸)
• Output: a connected component of 𝐺
• a maximal subset 𝑈 of 𝑉 s.t. any two nodes in 𝑈 are connected in 𝐺
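A minimal sketch of finding all connected components with DFS, assuming an undirected graph whose vertices are numbered 0 to n − 1. The function name `connected_components` and the edge-list input format are illustrative choices, not part of the slides.

```python
from collections import defaultdict


def connected_components(n, edges):
    """Return the connected components of an undirected graph.

    Assumption of this sketch: vertices are 0..n-1 and `edges` is a
    list of (u, v) pairs.
    """
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    seen = [False] * n
    components = []
    for s in range(n):
        if seen[s]:
            continue
        # An iterative DFS from s collects exactly the vertices connected to s.
        seen[s] = True
        stack, comp = [s], []
        while stack:
            u = stack.pop()
            comp.append(u)
            for v in adj[u]:
                if not seen[v]:
                    seen[v] = True
                    stack.append(v)
        components.append(sorted(comp))
    return components
```

Each vertex is pushed once and each edge is scanned twice, so the running time is linear in the size of the graph.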
Connected Components
Time Complexity: 𝑂(𝑛 + 𝑚) for a graph with 𝑛 nodes and 𝑚 edges
BFS and DFS both find the connected components with the same complexity
Problem Complexity
Strongly Connected Components
Textbook Chapter 22.5 – Strongly connected components
Strongly Connected Components
• Input: a directed graph 𝐺 = (𝑉, 𝐸)
• Output: a strongly connected component of 𝐺
• a maximal subset 𝑈 of 𝑉 s.t. any two nodes in 𝑈 are mutually reachable in 𝐺
Why must the strongly connected components of a graph be disjoint?
Algorithm
• Step 1: Run DFS on 𝐺 to obtain the finish time 𝑣. 𝑓 for 𝑣 ∈ 𝑉.
• Step 2: Run DFS on the transpose of 𝐺 where the vertices 𝑉 are processed in the decreasing order of their finish time.
• Step 3: output the vertex partition by the second DFS
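The three steps above can be sketched as follows. This is an illustrative version under stated assumptions: vertices are numbered 0 to n − 1, the graph is given as a list of directed edges, and both passes use an explicit stack instead of recursion. The name `strongly_connected_components` is hypothetical.

```python
def strongly_connected_components(n, edges):
    """Two-pass DFS for SCCs; vertices 0..n-1, edges = list of (u, v)."""
    adj = [[] for _ in range(n)]
    radj = [[] for _ in range(n)]  # adjacency lists of the transpose graph
    for u, v in edges:
        adj[u].append(v)
        radj[v].append(u)

    # Step 1: DFS on G; `order` lists vertices by increasing finish time.
    visited = [False] * n
    order = []
    for s in range(n):
        if visited[s]:
            continue
        visited[s] = True
        stack = [(s, iter(adj[s]))]
        while stack:
            u, neighbors = stack[-1]
            advanced = False
            for v in neighbors:  # resumes where this iterator stopped
                if not visited[v]:
                    visited[v] = True
                    stack.append((v, iter(adj[v])))
                    advanced = True
                    break
            if not advanced:
                stack.pop()
                order.append(u)  # u is finished

    # Step 2: DFS on the transpose in decreasing order of finish time.
    comp_id = [-1] * n
    components = []
    for s in reversed(order):
        if comp_id[s] != -1:
            continue
        comp_id[s] = len(components)
        stack, members = [s], []
        while stack:
            u = stack.pop()
            members.append(u)
            for v in radj[u]:
                if comp_id[v] == -1:
                    comp_id[v] = comp_id[s]
                    stack.append(v)
        # Step 3: each tree of the second DFS is one component.
        components.append(sorted(members))
    return components
```

For example, on the edges 0→1, 1→2, 2→0, 2→3, 3→4, 4→3 the sketch groups {0, 1, 2} and {3, 4}.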
Transpose of a Graph
(Figure: a directed graph 𝐺 and its transpose 𝐺ᵀ, in which every edge is reversed.)
Example Illustration
(Figure: example run of the algorithm on a directed graph with nodes numbered 1 to 6.)
Algorithm Correctness
Lemma
Let 𝐶 be the strongly connected component of 𝐺 (and 𝐺ᵀ) that contains the node 𝑢 with the largest finish time 𝑢.𝑓. Then 𝐶 cannot have any incoming edge from any node of 𝐺 not in 𝐶.
• Proof by contradiction
• Assume that (𝑣, 𝑤) is an incoming edge to 𝐶, i.e., 𝑣 is not in 𝐶 and 𝑤 is in 𝐶.
• Since 𝐶 is a strongly connected component of 𝐺, there cannot be any path from any node of 𝐶 to 𝑣 in 𝐺; otherwise 𝑣 would belong to 𝐶.
• Therefore, the finish time of 𝑣 has to be larger than that of any node in 𝐶, including 𝑢, which contradicts the choice of 𝑢.
Algorithm Correctness
Theorem
By continuing the process from the vertex 𝑢∗ whose finish time 𝑢∗.𝑓 is the largest excluding those in 𝐶, the algorithm returns the strongly connected components.
• Practice: prove the theorem by induction
(Figure: the component 𝐶 containing 𝑢, shown in 𝐺 and in 𝐺ᵀ.)
Example
(Figure: example directed graph with nodes numbered 1 to 6.)
Time Complexity
• Step 1: Run DFS on 𝐺 to obtain the finish time 𝑣. 𝑓 for 𝑣 ∈ 𝑉.
• Step 2: Run DFS on the transpose of 𝐺 where the vertices 𝑉 are processed in the decreasing order of their finish time.
• Step 3: output the vertex partition by the second DFS
Time Complexity: Θ(𝑛 + 𝑚)
Problem Complexity
Topological Sort
Textbook Chapter 22.4 – Topological sort
Directed Graph
(Figure: two example directed graphs with nodes numbered 1 to 6.)
Directed Acyclic Graph (DAG)
• Definition
• a directed graph without any directed cycle
Topological Sort Problem
• Taking courses should follow the specific order
• How to find a course taking order?
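(Figure: course prerequisite graph over Computer Programming, Data Structures, Algorithms, Introduction to Computer Science, Operating Systems, Computer Networks, Calculus I, Calculus II, Probability, and Computer Organization.)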
Topological Sort Problem
• Input: a directed acyclic graph 𝐺 = (𝑉, 𝐸)
• Output: a linear order of 𝑉 s.t. all edges of 𝐺 go from lower-indexed nodes to higher-indexed nodes (left → right)
Algorithm
• Run DFS on the input DAG G.
• Output the nodes in decreasing order of their finish time.
DFS(G)
  for each vertex u in G.V
    u.color = WHITE
    u.pi = NIL
  time = 0
  for each vertex u in G.V
    if u.color == WHITE
      DFS-VISIT(G, u)

DFS-VISIT(G, u)
  time = time + 1
  u.d = time // discovery time
  u.color = GRAY
  for each v in G.Adj[u] // outgoing edges
    if v.color == WHITE
      v.pi = u
      DFS-VISIT(G, v)
  u.color = BLACK
  time = time + 1
  u.f = time // finish time
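A runnable sketch of the same idea in Python, assuming the DAG is given as a dictionary that maps every vertex to the list of its out-neighbors (every vertex must appear as a key). The name `topological_sort` and the input format are assumptions of this example. Instead of storing finish times, it records each vertex at the moment it finishes and reverses the list at the end.

```python
def topological_sort(graph):
    """Return the vertices of a DAG in decreasing order of DFS finish time."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {u: WHITE for u in graph}
    finished = []  # vertices in increasing order of finish time

    def visit(u):
        color[u] = GRAY
        for v in graph[u]:
            if color[v] == WHITE:
                visit(v)
        color[u] = BLACK
        finished.append(u)  # u finishes here

    for u in graph:
        if color[u] == WHITE:
            visit(u)

    finished.reverse()  # decreasing finish time
    return finished
```

The recursion mirrors DFS-VISIT directly; for very deep graphs Python's recursion limit would require an explicit stack instead.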
Example Illustration
(Figure: two example runs of the algorithm on a DAG with nodes a to f.)
Time Complexity
• Run DFS on the input DAG G.
• Output the nodes in decreasing order of their finish time.
• As each vertex is finished, insert it onto the front of a linked list
• Return the linked list of vertices
Time Complexity: Θ(𝑛 + 𝑚)
Algorithm Correctness
Lemma 22.11
A directed graph is acyclic ⟺ a DFS yields no back edges.
White Path Theorem: In a DFS forest of 𝐺, 𝑣 is a descendant of 𝑢 ⟺ at the time 𝑢.𝑑 that the search discovers 𝑢, there is a path from 𝑢 to 𝑣 in 𝐺 consisting entirely of WHITE vertices.
• Proof
• →: by contrapositive, suppose there is a back edge (𝑢, 𝑣)
• 𝑣 is an ancestor of 𝑢 in the DFS forest
• There is a path from 𝑣 to 𝑢 in 𝐺, and (𝑢, 𝑣) completes the cycle
• ←: by contrapositive, suppose there is a cycle 𝑐
• Let 𝑣 be the first vertex in 𝑐 to be discovered and let 𝑢 be the predecessor of 𝑣 in 𝑐
• At time 𝑣.𝑑, the vertices of 𝑐 form a path of WHITE vertices from 𝑣 to 𝑢
• By the white-path theorem, vertex 𝑢 becomes a descendant of 𝑣 in the DFS forest
• Therefore, (𝑢, 𝑣) is a back edge
Algorithm Correctness
Theorem 22.12
The algorithm produces a topological sort of the input DAG. That is, if (𝑢, 𝑣) is a directed edge (from 𝑢 to 𝑣) of 𝐺, then 𝑢.𝑓 > 𝑣.𝑓.
• Proof
• When (𝑢, 𝑣) is being explored, 𝑢 is GRAY and there are three cases for 𝑣:
• Case 1 – GRAY
• (𝑢, 𝑣) is a back edge (contradicting Lemma 22.11), so 𝑣 cannot be GRAY
• Case 2 – WHITE
• 𝑣 becomes a descendant of 𝑢
• 𝑣 will be finished before 𝑢, so 𝑣.𝑓 < 𝑢.𝑓
• Case 3 – BLACK
• 𝑣 is already finished while 𝑢 is not, so 𝑣.𝑓 < 𝑢.𝑓
Problem Complexity
Discussion
• Since cycle detection becomes back edge detection (Lemma 22.11), DFS can be used to test whether a graph is a DAG
• Is there a topological order for cyclic graphs?
• Given a topological order, is there always a DFS traversal that produces such an order?
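The first discussion point can be illustrated with a short sketch: a DFS that reports a cycle as soon as it explores an edge into a GRAY vertex, which is exactly a back edge. The function name `has_cycle` and the dictionary input (every vertex mapped to its list of out-neighbors) are assumptions of this example.

```python
def has_cycle(graph):
    """Return True iff the directed graph contains a directed cycle."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {u: WHITE for u in graph}

    def visit(u):
        color[u] = GRAY
        for v in graph[u]:
            if color[v] == GRAY:
                return True  # (u, v) is a back edge, so a cycle exists
            if color[v] == WHITE and visit(v):
                return True
        color[u] = BLACK
        return False

    # The generator is evaluated lazily, so colors are checked as DFS proceeds.
    return any(color[u] == WHITE and visit(u) for u in graph)
```

A graph is a DAG exactly when `has_cycle` returns `False`.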
Minimal Spanning Tree (MST)
Textbook Chapter 23 – Minimal Spanning Trees
Spanning Tree
• Definition
• a subgraph that is a tree and connects all vertices
• Exactly 𝑛 − 1 edges
• Acyclic
• There can be many spanning trees of a graph
• BFS and DFS also generate spanning trees
• BFS tree is typically “short and bushy”
• DFS tree is typically “long and stringy”
Minimal Spanning Tree Problem
• Input: a connected 𝑛-node 𝑚-edge graph 𝐺 with edge weights 𝑤
• Output: a spanning tree 𝑇 of 𝐺 with minimum 𝑤(𝑇)
WLOG: we may assume that all edge weights are distinct
Minimal Spanning Tree Problem
• Q: What if the graph is unweighted?
• A: Trivial; every spanning tree has exactly 𝑛 − 1 edges, so any spanning tree is minimal
• Q: What if the graph contains edges with negative weights?
• A: Add a large constant to every edge; an MST remains the same
Uniqueness of MST
Theorem: MST is unique if all edge weights are distinct
• Proof by contradiction
• Suppose there are two distinct MSTs 𝐴 and 𝐵
• Let 𝑒 be the least-weight edge that is in 𝐴 ∪ 𝐵 but not in both
• WLOG, assume 𝑒 is in 𝐴
• Add 𝑒 to 𝐵; 𝐵 ∪ {𝑒} contains a cycle 𝐶
• 𝐶 includes at least one edge 𝑒′ that is in 𝐵 but not in 𝐴, since 𝐴 is acyclic
• By the choice of 𝑒 and the distinct weights, 𝑤(𝑒′) > 𝑤(𝑒)
• Replacing 𝑒′ with 𝑒 in 𝐵 yields a spanning tree with less cost than 𝐵, a contradiction
If edge weights are not all distinct, then the (multi-)set of weights in MST is unique
Borůvka’s Algorithm
Inventor of MST
• Otakar Borůvka
• Czech scientist
• Introduced the problem
• Gave an 𝑂(𝑚 log 𝑛) time algorithm
• The original paper was written in Czech in 1926
• The purpose was to efficiently provide electric coverage of Bohemia
Borůvka’s Algorithm
• Repeat the below procedure until the resulting graph becomes a single node
• For each node 𝑢, mark its lightest incident edge
• From the marked edges form a forest 𝐹, add the edges of 𝐹 into the set of edges to be reported
• Contract each maximal subtree of 𝐹 into a single node
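A compact sketch of the procedure above. Instead of contracting the graph explicitly, this version tracks the contracted nodes with a union-find structure, which is a substitution made for brevity and not the slide's formulation. It assumes a connected graph, vertices numbered 0 to n − 1, and distinct edge weights; the name `boruvka_mst` and the `(weight, u, v)` edge format are illustrative.

```python
def boruvka_mst(n, edges):
    """Borůvka-style MST; edges = list of (weight, u, v), weights distinct."""
    parent = list(range(n))

    def find(x):
        # Representative of the contracted node containing x (path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst = []
    num_components = n
    while num_components > 1:
        # Mark the lightest edge leaving each current component.
        cheapest = {}
        for edge in edges:
            w, u, v = edge
            ru, rv = find(u), find(v)
            if ru == rv:
                continue  # edge lies inside a contracted node
            for r in (ru, rv):
                if r not in cheapest or w < cheapest[r][0]:
                    cheapest[r] = edge
        if not cheapest:
            raise ValueError("graph is not connected")
        # Add the marked edges and merge ("contract") their endpoints.
        for w, u, v in cheapest.values():
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv
                mst.append((w, u, v))
                num_components -= 1
    return mst
```

Each pass of the `while` loop corresponds to one phase: every component marks one edge, so the number of components at least halves per phase.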
Borůvka’s Algorithm Illustration
(Figure: example run of Borůvka’s algorithm on a weighted graph.)
Algorithm Correctness
Claim: If (𝑢, 𝑣) is the lightest edge incident to 𝑢 in 𝐺, then (𝑢, 𝑣) must belong to any MST of 𝐺
• Proof via contradiction
• Suppose an MST 𝑇 of 𝐺 does not contain (𝑢, 𝑣)
• 𝑇 ∪ {(𝑢, 𝑣)} contains a cycle 𝐶, and 𝐶 contains another edge (𝑢, 𝑤) incident to 𝑢 that has larger weight than (𝑢, 𝑣)
• 𝑇′ = (𝑇 ∪ {(𝑢, 𝑣)}) \ {(𝑢, 𝑤)} must be a spanning tree of 𝐺 lighter than 𝑇
Time Complexity
• The recurrence relation: 𝑇(𝑚, 𝑛) ≤ 𝑇(𝑚, 𝑛/2) + 𝑂(𝑚)
• We check all edges in each phase
• After each contraction phase, the number of nodes is reduced by at least one half
• Time complexity: 𝑂(𝑚 log 𝑛)
Cycle Property
Let 𝐶 be any cycle in the graph 𝐺, and let 𝑒 be an edge with the maximum weight on 𝐶. Then the MST does not contain 𝑒.
• For simplicity, assume all edge weights are distinct
• Proof by contradiction
• Suppose 𝑒 is in the MST
• Removing 𝑒 disconnects the MST into two components T1 and T2
• There exists another edge 𝑒′ in 𝐶 that can reconnect T1 and T2
• Since 𝑤(𝑒′) < 𝑤(𝑒), the new tree has a lower weight
• Contradiction!
Cut Property
Let 𝐶 be a cut in the graph, and let 𝑒 be the edge with the minimum cost in 𝐶. Then the MST contains 𝑒.
• Cut = a partition of the vertices into two sets; the edges in 𝐶 are those with one endpoint in each set
• For simplicity, assume all edge weights are distinct
• Proof by contradiction
• Suppose 𝑒 is not in the MST
• Adding 𝑒 creates a cycle in the MST
• The cycle contains another edge 𝑒′ that crosses the cut 𝐶; removing 𝑒′ breaks the cycle
• Since 𝑤(𝑒′) > 𝑤(𝑒), the new tree has a lower weight
• Contradiction!
Kruskal’s Algorithm
Textbook Chapter 23.2 – The algorithms of Kruskal and Prim
Kruskal’s Algorithm
• For each node 𝑢
• Make-set(𝑢): create a set consisting of 𝑢
• For each edge (𝑢, 𝑣), taken in non-decreasing order by weights
• if Find-set(𝑢) ≠ Find-set(𝑣) (i.e., 𝑢 and 𝑣 are not in the same set) then
• Output edge (𝑢, 𝑣)
• Union(𝑢, 𝑣): union the sets containing 𝑢 and 𝑣 into a single set
Kruskal’s Algorithm Illustration
(Figure: example run of Kruskal’s algorithm on a weighted graph.)
Kruskal’s Algorithm Correctness
The lightest edge incident to a vertex must be in the MST
Kruskal’s Algorithm Correctness
• Consider whether adding 𝑒 creates a cycle:
• If adding 𝑒 to 𝑇 creates a cycle 𝐶
• Then 𝑒 is the max weight edge in 𝐶
• The cycle property ensures that 𝑒 is not in the MST
• If adding 𝑒 = (𝑢, 𝑣) to 𝑇 does not create a cycle
• Before adding 𝑒, 𝑢 and 𝑣 lie in two different trees T1 and T2 of the current forest, with 𝑢 in T1 and 𝑣 in T2
• 𝑒 is the minimum-cost edge on the cut of T1 and T2
• The cut property ensures that 𝑒 is in the MST
Kruskal’s Time Complexity
• Disjoint-set data structure with union-by-rank (Textbook Ch. 21)
• MAKE-SET: 𝑂(1)
• FIND-SET: 𝑂(log 𝑛)
• UNION: 𝑂(log 𝑛)
• The amortized cost of 𝑚 operations on 𝑛 elements (Exercise 21.4-4): 𝑂(𝑚 log 𝑛)
• Total complexity: 𝑂(𝑚 log 𝑛)
MST-KRUSKAL(G, w) // w = weights
  A = empty // edge set of MST
  for v in G.V
    MAKE-SET(v)
  sort edges of G.E into non-decreasing order by weight w
  for (u, v) in G.E, taken in non-decreasing order by weight
    if FIND-SET(u) ≠ FIND-SET(v)
      A = A ∪ {(u, v)}
      UNION(u, v)
  return A
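A runnable sketch of MST-KRUSKAL with a union-by-rank disjoint-set forest (no path compression, matching the data structure named on this slide). Vertices numbered 0 to n − 1, the `(weight, u, v)` edge format, and the name `kruskal_mst` are assumptions of this example.

```python
def kruskal_mst(n, edges):
    """Kruskal's algorithm; edges = list of (weight, u, v), vertices 0..n-1."""
    parent = list(range(n))  # MAKE-SET for every vertex
    rank = [0] * n

    def find_set(x):
        while parent[x] != x:
            x = parent[x]
        return x

    def union(x, y):
        rx, ry = find_set(x), find_set(y)
        if rx == ry:
            return
        # Union by rank: attach the shallower tree below the deeper one.
        if rank[rx] < rank[ry]:
            rx, ry = ry, rx
        parent[ry] = rx
        if rank[rx] == rank[ry]:
            rank[rx] += 1

    mst = []
    for w, u, v in sorted(edges):  # non-decreasing order by weight
        if find_set(u) != find_set(v):
            mst.append((w, u, v))
            union(u, v)
    return mst
```

Sorting dominates the running time; the disjoint-set operations only decide whether an edge would close a cycle.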
Prim’s Algorithm
Textbook Chapter 23.2 – The algorithms of Kruskal and Prim
Prim’s Algorithm
• Let 𝑇 consist of an arbitrary node
• For 𝑖 = 1 to 𝑛 − 1
• add the least-weighted edge incident to the current subtree 𝑇 that does not incur a cycle
Prim’s Algorithm Illustration
(Figure: step-by-step run of Prim’s algorithm on a weighted example graph, growing the tree by one edge per step.)
Prim’s Algorithm Correctness
The lightest edge incident to a vertex must be in the MST
Prim’s Time Complexity
• Binary min-heap (Textbook Ch. 6)
MST-PRIM(G, w, r) // w = weights, r = root
  for u in G.V
    u.key = ∞
    u.π = NIL
  r.key = 0
  Q = G.V
  while Q ≠ empty
    u = EXTRACT-MIN(Q)
    for v in G.adj[u]
      if v ∈ Q and w(u, v) < v.key
        v.π = u
        v.key = w(u, v) // DECREASE-KEY
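A sketch of Prim's algorithm using Python's `heapq` module. Since `heapq` offers no DECREASE-KEY, this version pushes candidate edges and skips stale entries when they are popped ("lazy deletion"); that is a common substitute, not the pseudocode's exact mechanism. The adjacency format (vertex mapped to a list of `(weight, neighbor)` pairs), integer vertex labels, and the name `prim_mst` are assumptions of this example.

```python
import heapq


def prim_mst(adj, root):
    """Prim's algorithm on a connected undirected graph.

    adj: dict mapping each vertex to a list of (weight, neighbor) pairs.
    Returns the tree edges as (weight, u, v) in the order they are added.
    """
    in_tree = {root}
    mst = []
    # Heap of candidate edges (weight, tree endpoint, outside endpoint).
    heap = [(w, root, v) for w, v in adj[root]]
    heapq.heapify(heap)
    while heap and len(in_tree) < len(adj):
        w, u, v = heapq.heappop(heap)
        if v in in_tree:
            continue  # stale entry: v was reached by a lighter edge earlier
        in_tree.add(v)
        mst.append((w, u, v))
        for w2, x in adj[v]:
            if x not in in_tree:
                heapq.heappush(heap, (w2, v, x))
    return mst
```

With lazy deletion the heap can hold up to one entry per edge, so this variant still runs in 𝑂(𝑚 log 𝑛) time but uses 𝑂(𝑚) heap space.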
Prim’s Time Complexity
• Fibonacci heap (Textbook Ch. 19)
• BUILD-MIN-HEAP: 𝑂(𝑛)
• EXTRACT-MIN: 𝑂(log 𝑛) (amortized)
• DECREASE-KEY: 𝑂(1) (amortized)
• Total complexity: 𝑂(𝑚 + 𝑛 log 𝑛)
Concluding Remarks
• Minimal Spanning Trees (MST)
• Borůvka’s Algorithm: 𝑂(𝑚 log 𝑛)
• Kruskal’s Algorithm: 𝑂(𝑚 log 𝑛)
• Prim’s Algorithm: 𝑂(𝑚 log 𝑛) with binary min-heap
• Prim’s Algorithm: 𝑂(𝑚 + 𝑛 log 𝑛) with Fibonacci heap
Questions?
Important announcements will be sent to your @ntu.edu.tw mailbox and posted to the course website
Course Website: http://ada.miulab.tw
Email: ada-ta@csie.ntu.edu.tw