Algorithm Design and Analysis: Graph Algorithms (2)

(1)

Algorithm Design and Analysis

Slides credited from Hsueh-I Lu, Hsu-Chun Hsiao, & Michael Tsai

(2)

Midterm Feedback

• Mini-HW

• NTU COOL

• Helpful TAs

• Course recordings (access the channel here)

• Instant feedback

• Grade release

• Course recordings (two classes & last year)

• Homework hints

• TA hour changes


(3)

Outline

• DFS Applications

• Connected Components

• Strongly Connected Components

• Topological Sorting

• Minimal Spanning Trees (MST)

• Borůvka’s Algorithm

• Kruskal’s Algorithm

• Prim’s Algorithm

(4)

Depth-First Search


Textbook Chapter 22.3 – Depth-first search

(5)

Depth-First Search (DFS)

• Search as deep as possible and then backtrack until finding a new path

[Figure: DFS on an example graph with nodes numbered 1 to 14]

(6)

Connected Components


(7)

Connected Components Problem

• Input: a graph 𝐺 = (𝑉, 𝐸)

• Output: a connected component of 𝐺

• a maximal subset 𝑈 of 𝑉 s.t. any two nodes in 𝑈 are connected in 𝐺
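The definition above can be sketched in Python as follows. This is a minimal illustration, not the course's reference code: the function name `connected_components` and the edge-list input format are assumptions, and the traversal is a stack-based (iterative) form of DFS. It returns every component rather than a single one.

```python
from collections import defaultdict


def connected_components(vertices, edges):
    """Return the connected components of an undirected graph as a list of sets."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    seen = set()
    components = []
    for s in vertices:
        if s in seen:
            continue
        # A traversal from s collects exactly the vertices connected to s.
        seen.add(s)
        component = {s}
        stack = [s]
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    component.add(v)
                    stack.append(v)
        components.append(component)
    return components
```

Each vertex is pushed once and each edge is scanned twice, so the running time is linear in the size of the graph.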

(8)

Connected Components

[Figure: example graph on nodes 1 to 10 and its connected components]

Time Complexity: 𝑂(|𝑉| + |𝐸|)

BFS and DFS both find the connected components with the same complexity

(9)

Problem Complexity

(10)

Strongly Connected Components


Textbook Chapter 22.5 – Strongly connected components

(11)

Strongly Connected Components

• Input: a directed graph 𝐺 = (𝑉, 𝐸)

• Output: a strongly connected component of 𝐺

• a maximal subset 𝑈 of 𝑉 s.t. any two nodes in 𝑈 are reachable from each other in 𝐺

[Figure: example directed graph and its strongly connected components]

Why must the strongly connected components of a graph be disjoint?

(12)

Algorithm

• Step 1: Run DFS on 𝐺 to obtain the finish time 𝑣. 𝑓 for 𝑣 ∈ 𝑉.

• Step 2: Run DFS on the transpose of 𝐺 where the vertices 𝑉 are processed in the decreasing order of their finish time.

• Step 3: output the vertex partition by the second DFS
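The three steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name `strongly_connected_components` and the edge-list input are hypothetical, and the recursive DFS is only suitable for small graphs because of Python's recursion limit.

```python
from collections import defaultdict


def strongly_connected_components(vertices, edges):
    """Return the strongly connected components of a directed graph as a list of lists."""
    adj = defaultdict(list)    # G
    radj = defaultdict(list)   # transpose of G: every edge reversed
    for u, v in edges:
        adj[u].append(v)
        radj[v].append(u)

    # Step 1: DFS on G, recording vertices in increasing order of finish time.
    visited = set()
    finish_order = []

    def dfs_finish(u):
        visited.add(u)
        for v in adj[u]:
            if v not in visited:
                dfs_finish(v)
        finish_order.append(u)

    for u in vertices:
        if u not in visited:
            dfs_finish(u)

    # Step 2: DFS on the transpose, roots taken in decreasing finish time.
    visited.clear()
    components = []

    def dfs_collect(u, component):
        visited.add(u)
        component.append(u)
        for v in radj[u]:
            if v not in visited:
                dfs_collect(v, component)

    for u in reversed(finish_order):
        if u not in visited:
            component = []
            dfs_collect(u, component)
            # Step 3: each tree of the second DFS is one component.
            components.append(component)
    return components
```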


(13)

Transpose of A Graph

[Figure: a directed graph and its transpose, which has the same vertices with every edge reversed]

(14)

Example Illustration

[Figure: the algorithm illustrated on an example directed graph on nodes 1 to 6 and on its transpose]

(15)

Algorithm Correctness

Lemma

Let 𝐶 be the strongly connected component of 𝐺 (and 𝐺^𝑇) that contains the node 𝑢 with the largest finish time 𝑢.𝑓. Then 𝐶 cannot have any incoming edge from any node of 𝐺 not in 𝐶.

• Proof by contradiction

• Assume that (𝑣, 𝑤) is an incoming edge to 𝐶, i.e., 𝑣 ∉ 𝐶 and 𝑤 ∈ 𝐶.

• Since 𝐶 is a strongly connected component of 𝐺, there cannot be any path from any node of 𝐶 to 𝑣 in 𝐺; otherwise 𝑣 would belong to 𝐶.

• Therefore, the finish time of 𝑣 has to be larger than that of any node in 𝐶, including 𝑢, which contradicts the choice of 𝑢.

[Figure: component 𝐶 of 𝐺 containing 𝑢, with an incoming edge (𝑣, 𝑤) from 𝑣 outside 𝐶 to 𝑤 in 𝐶]

(16)

Algorithm Correctness

Theorem

By continuing the process from the vertex 𝑢^∗ whose finish time 𝑢^∗.𝑓 is the largest excluding those in 𝐶, the algorithm returns the strongly connected components.

• Practice: prove the theorem by induction

[Figure: the component 𝐶 containing 𝑢, shown in 𝐺 and in 𝐺^𝑇]

(17)

Example

[Figure: the algorithm on an example directed graph]

(18)

Example

[Figure: the algorithm on an example directed graph on nodes 1 to 6 (continued)]

(19)

Time Complexity

• Step 1: Run DFS on 𝐺 to obtain the finish time 𝑣. 𝑓 for 𝑣 ∈ 𝑉.

• Step 2: Run DFS on the transpose of 𝐺 where the vertices 𝑉 are processed in the decreasing order of their finish time.

• Step 3: output the vertex partition by the second DFS

Time Complexity: 𝑂(|𝑉| + |𝐸|) (two DFS passes plus building the transpose, each linear)

(20)

Problem Complexity


(21)

Topological Sort

Textbook Chapter 22.4 – Topological sort

(22)

Directed Graph

[Figure: two example directed graphs on nodes 1 to 6]

(23)

Directed Acyclic Graph (DAG)

• Definition

• a directed graph without any directed cycle

[Figure: example DAG]

(24)

Topological Sort Problem

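• Courses must be taken in an order that respects their prerequisites

• How to find a valid order in which to take the courses?

[Figure: course prerequisite graph over Computer Programming, Data Structures, Algorithms, Introduction to Computer Science, Operating Systems, Computer Networks, Calculus I, Calculus II, Probability, and Computer Organization]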

(25)

Topological Sort Problem

• Input: a directed acyclic graph 𝐺 = (𝑉, 𝐸)

• Output: a linear order of 𝑉 s.t. every edge of 𝐺 goes from a lower-indexed node to a higher-indexed node (left → right)

[Figure: example DAG on nodes a to f and one of its topological orders]

(26)

Algorithm

• Run DFS on the input DAG G.

• Output the nodes in decreasing order of their finish time.

DFS(G)
  for each vertex u in G.V
    u.color = WHITE
    u.pi = NIL
  time = 0
  for each vertex u in G.V
    if u.color == WHITE
      DFS-VISIT(G, u)

DFS-VISIT(G, u)
  time = time + 1
  u.d = time // discovery time
  u.color = GRAY
  for each v in G.Adj[u] // outgoing edges
    if v.color == WHITE
      v.pi = u
      DFS-VISIT(G, v)
  u.color = BLACK
  time = time + 1
  u.f = time // finish time

(27)

Example Illustration

[Figure: DFS on the example DAG on nodes a to f and the resulting topological order]

(28)

Example Illustration

[Figure: a different DFS on the same example DAG, producing another topological order]

(29)

Time Complexity

• Run DFS on the input DAG G.

• Output the nodes in decreasing order of their finish time.

• As each vertex is finished, insert it onto the front of a linked list

• Return the linked list of vertices
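The bullets above can be sketched in Python as follows. This is a minimal illustration, assuming the input is a DAG given as a dictionary from each vertex to its out-neighbors; the name `topological_sort` is hypothetical, and a `collections.deque` stands in for the linked list since `appendleft` inserts at the front in constant time.

```python
from collections import deque


def topological_sort(vertices, adj):
    """Return the vertices of a DAG in decreasing order of DFS finish time."""
    visited = set()
    order = deque()

    def visit(u):
        visited.add(u)
        for v in adj.get(u, []):
            if v not in visited:
                visit(v)
        # u is finished: put it in front of everything finished earlier.
        order.appendleft(u)

    for u in vertices:
        if u not in visited:
            visit(u)
    return list(order)
```

Inserting at the front as each vertex finishes avoids a separate sort by finish time, so the whole procedure stays linear.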


Time Complexity: 𝑂(|𝑉| + |𝐸|)

(30)

Algorithm Correctness

Lemma 22.11

A directed graph is acyclic ⟺ a DFS yields no back edges.

• Proof

• ⟹ (by contrapositive): suppose there is a back edge (𝑢, 𝑣)

• 𝑣 is an ancestor of 𝑢 in the DFS forest

• There is a path from 𝑣 to 𝑢 in 𝐺, and (𝑢, 𝑣) completes the cycle

• ⟸ (by contrapositive): suppose there is a cycle 𝑐

• Let 𝑣 be the first vertex in 𝑐 to be discovered, and let (𝑢, 𝑣) be the edge of 𝑐 that enters 𝑣

• At time 𝑣.𝑑, the vertices of 𝑐 form a path of WHITE vertices from 𝑣 to 𝑢

• By the white-path theorem, vertex 𝑢 becomes a descendant of 𝑣 in the DFS forest

• Therefore, (𝑢, 𝑣) is a back edge

White Path Theorem: In a DFS forest of 𝐺, 𝑣 is a descendant of 𝑢 in the forest ⟺ at the time 𝑢.𝑑 that the search discovers 𝑢, there is a path from 𝑢 to 𝑣 in 𝐺 consisting entirely of WHITE vertices
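Lemma 22.11 turns cycle detection into back-edge detection: an edge that leads to a GRAY vertex is a back edge. A minimal Python sketch follows, assuming the graph is a dictionary from each vertex to its out-neighbors and that every neighbor also appears in `vertices`; the name `has_cycle` is hypothetical.

```python
WHITE, GRAY, BLACK = 0, 1, 2


def has_cycle(vertices, adj):
    """Return True iff the directed graph contains a directed cycle."""
    color = {u: WHITE for u in vertices}

    def visit(u):
        color[u] = GRAY
        for v in adj.get(u, []):
            if color[v] == GRAY:
                # (u, v) leads to an ancestor still being explored: a back edge.
                return True
            if color[v] == WHITE and visit(v):
                return True
        color[u] = BLACK
        return False

    return any(color[u] == WHITE and visit(u) for u in vertices)
```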

(31)

Algorithm Correctness

Theorem 22.12

The algorithm produces a topological sort of the input DAG. That is, if (𝑢, 𝑣) is a directed edge (from 𝑢 to 𝑣) of 𝐺, then 𝑢.𝑓 > 𝑣.𝑓.

• Proof

• When (𝑢, 𝑣) is being explored, 𝑢 is GRAY and there are three cases for 𝑣:

• Case 1 – GRAY

• (𝑢, 𝑣) is a back edge (contradicting Lemma 22.11), so 𝑣 cannot be GRAY

• Case 2 – WHITE

• 𝑣 becomes a descendant of 𝑢

• 𝑣 will be finished before 𝑢, so 𝑣.𝑓 < 𝑢.𝑓

• Case 3 – BLACK

• 𝑣 is already finished while 𝑢 is not, so 𝑣.𝑓 < 𝑢.𝑓

(32)

Problem Complexity


(33)

Discussion

• Since cycle detection becomes back edge detection (Lemma 22.11), DFS can be used to test whether a graph is a DAG

• Is there a topological order for cyclic graphs?

• Given a topological order, is there always a DFS traversal that produces such an order?

(34)

Minimal Spanning Tree (MST)

Textbook Chapter 23 – Minimum Spanning Trees

(35)

Spanning Tree

• Definition

• a subgraph that is a tree and connects all vertices

• Exactly 𝑛 − 1 edges

• Acyclic

• There can be many spanning trees of a graph

• BFS and DFS also generate spanning trees

• BFS tree is typically “short and bushy”

• DFS tree is typically “long and stringy”

[Figure: example weighted graph illustrating spanning trees]

(36)

Minimal Spanning Tree Problem

• Input: a connected 𝑛-node 𝑚-edge graph 𝐺 with edge weights 𝑤

• Output: a spanning tree 𝑇 of 𝐺 with minimum 𝑤(𝑇)

[Figure: example weighted graph and a minimal spanning tree]

WLOG: we may assume that all edge weights are distinct

(37)

Minimal Spanning Tree Problem

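• Q: What if the graph is unweighted?

• Trivial: every spanning tree has 𝑛 − 1 edges, so any spanning tree (e.g., a BFS or DFS tree) is minimal

• Q: What if the graph contains edges with negative weights?

• Add a large constant to every edge; an MST remains the same, since every spanning tree has exactly 𝑛 − 1 edges and all tree weights shift by the same amount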

(38)

Uniqueness of MST

Theorem: MST is unique if all edge weights are distinct

• Proof by contradiction

• Suppose there are two distinct MSTs 𝐴 and 𝐵

• Let 𝑒 be the least-weight edge that is in 𝐴 ∪ 𝐵 but not in both

• WLOG, assume 𝑒 is in 𝐴

• Add 𝑒 to 𝐵; 𝐵 ∪ {𝑒} contains a cycle 𝐶

• Since 𝐴 is acyclic, 𝐶 includes at least one edge 𝑒′ of 𝐵 that is not in 𝐴

• By the choice of 𝑒 and distinct weights, 𝑤(𝑒) < 𝑤(𝑒′), so replacing 𝑒′ with 𝑒 in 𝐵 yields a spanning tree with less cost, a contradiction

If edge weights are not all distinct, then the (multi-)set of weights in MST is unique

(39)

Borůvka’s Algorithm

(40)

Inventor of MST

• Otakar Borůvka

• Czech scientist

• Introduced the problem

• Gave an 𝑂(𝑚 log 𝑛) time algorithm

• The original paper was written in Czech in 1926

• The purpose was to efficiently provide electric coverage of Moravia


(41)

Borůvka’s Algorithm

• Repeat the below procedure until the resulting graph becomes a single node

• For each node 𝑢, mark its lightest incident edge

• From the marked edges form a forest 𝐹, add the edges of 𝐹 into the set of edges to be reported

• Contract each maximal subtree of 𝐹 into a single node
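The procedure above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the slides' implementation: contraction is simulated with a union-find structure instead of rebuilding the graph, nodes are labeled 0 to n − 1, edges are `(weight, u, v)` tuples, and ties are broken by edge index so that the marked edges always form a forest. The name `boruvka_mst` is hypothetical.

```python
def boruvka_mst(n, edges):
    """Return the MST edges of a connected graph with nodes 0..n-1."""
    parent = list(range(n))

    def find(x):
        # Representative of the contracted node that contains x.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst = []
    num_components = n
    while num_components > 1:
        # Mark the lightest edge leaving each contracted node.
        lightest = {}
        for i, (w, u, v) in enumerate(edges):
            cu, cv = find(u), find(v)
            if cu == cv:
                continue  # both endpoints are already contracted together
            for c in (cu, cv):
                if c not in lightest or (w, i) < (edges[lightest[c]][0], lightest[c]):
                    lightest[c] = i
        if not lightest:
            raise ValueError("graph is not connected")
        # Report the marked edges and contract along them.
        for i in set(lightest.values()):
            w, u, v = edges[i]
            cu, cv = find(u), find(v)
            if cu != cv:
                parent[cu] = cv
                mst.append((w, u, v))
                num_components -= 1
    return mst
```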

(42)

Borůvka’s Algorithm Illustration

[Figure: Borůvka’s algorithm illustrated phase by phase on a weighted example graph]

(43)

Algorithm Correctness

Claim: If (𝑢, 𝑣) is the lightest edge incident to 𝑢 in 𝐺, then (𝑢, 𝑣) must belong to any MST of 𝐺

• Proof via contradiction

• Suppose an MST 𝑇 of 𝐺 does not contain (𝑢, 𝑣)

• 𝑇 ∪ {(𝑢, 𝑣)} contains a cycle 𝐶, and 𝐶 contains another edge (𝑢, 𝑤) incident to 𝑢 that has larger weight than (𝑢, 𝑣)

• 𝑇′ = (𝑇 ∪ {(𝑢, 𝑣)}) \ {(𝑢, 𝑤)} must be a spanning tree of 𝐺 lighter than 𝑇, a contradiction

(44)

Time Complexity

• The recurrence relation: 𝑇(𝑚, 𝑛) = 𝑇(𝑚, 𝑛/2) + 𝑂(𝑚)

• We check all edges in each phase

• After each contraction phase, the number of nodes is reduced by at least one half

• Time complexity: 𝑂(𝑚 log 𝑛)
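The two observations on this slide (each phase scans all 𝑚 edges, and each contraction at least halves the number of nodes) give a recurrence that can be unrolled as sketched below. The constant 𝑐 is an assumption introduced for illustration, standing for the per-edge cost of one phase, and 𝑇(𝑚, 1) is taken to be 𝑂(1) because a single node needs no further work.

```latex
\begin{align*}
T(m, n) &\le T(m, n/2) + c\,m \\
        &\le T(m, n/4) + 2c\,m \\
        &\;\;\vdots \\
        &\le T(m, 1) + c\,m \log_2 n \\
        &= O(m \log n).
\end{align*}
```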

(45)

Cycle Property

Let 𝐶 be any cycle in the graph 𝐺, and let 𝑒 be an edge with the maximum weight on 𝐶. Then the MST does not contain 𝑒.

• For simplicity, assume all edge weights are distinct

• Proof by contradiction

• Suppose 𝑒 is in the MST

• Removing 𝑒 disconnects the MST into two components T1 and T2

• There exists another edge 𝑒′ in 𝐶 that can reconnect T1 and T2

• Since 𝑤(𝑒′) < 𝑤(𝑒), the new tree has a lower weight

• Contradiction!

(46)

Cut Property

Let 𝐶 be a cut in the graph, and let 𝑒 be the edge with the minimum cost among the edges crossing 𝐶. Then the MST contains 𝑒.

• Cut = a partition of the vertices into two parts

• For simplicity, assume all edge weights are distinct

• Proof by contradiction

• Suppose 𝑒 is not in the MST

• Adding 𝑒 creates a cycle in the MST

• There exists another edge 𝑒′ on this cycle that also crosses 𝐶; removing 𝑒′ breaks the cycle

• Since 𝑤(𝑒′) > 𝑤(𝑒), the new tree has a lower weight

• Contradiction!

(47)

Kruskal’s Algorithm

Textbook Chapter 23.2 – The algorithms of Kruskal and Prim

(48)

Kruskal’s Algorithm

• For each node 𝑢

• Make-set(𝑢): create a set consisting of 𝑢

• For each edge (𝑢, 𝑣), taken in non-decreasing order by weight

• if Find-set(𝑢) ≠ Find-set(𝑣) (i.e., 𝑢 and 𝑣 are not in the same set) then

• Output edge (𝑢, 𝑣)

• Union(𝑢, 𝑣): union the sets containing 𝑢 and 𝑣 into a single set
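The steps above can be sketched in Python as follows. This is a minimal illustration, assuming nodes are labeled 0 to n − 1 and edges are `(weight, u, v)` tuples; the name `kruskal_mst` is hypothetical, and the disjoint-set forest uses union by rank without path compression.

```python
def kruskal_mst(n, edges):
    """Return the MST (or spanning forest) edges in the order Kruskal adds them."""
    parent = list(range(n))  # Make-set for every node
    rank = [0] * n

    def find_set(x):
        while parent[x] != x:
            x = parent[x]
        return x

    def union(x, y):
        rx, ry = find_set(x), find_set(y)
        if rx == ry:
            return
        # Union by rank: hang the shallower tree under the deeper one.
        if rank[rx] < rank[ry]:
            rx, ry = ry, rx
        parent[ry] = rx
        if rank[rx] == rank[ry]:
            rank[rx] += 1

    mst = []
    for w, u, v in sorted(edges):  # non-decreasing order by weight
        if find_set(u) != find_set(v):
            mst.append((w, u, v))
            union(u, v)
    return mst
```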


(49)

Kruskal’s Algorithm Illustration

[Figure: Kruskal’s algorithm illustrated on a weighted example graph]

(50)

Kruskal’s Algorithm Correctness


The lightest edge incident to a vertex must be in the MST

(51)

Kruskal’s Algorithm Correctness

• Consider whether adding 𝑒 creates a cycle:

• If adding 𝑒 to 𝑇 creates a cycle 𝐶

• Then 𝑒 is the max weight edge in 𝐶, since edges are considered in non-decreasing order of weight

• The cycle property ensures that 𝑒 is not in the MST

• If adding 𝑒 = (𝑢, 𝑣) to 𝑇 does not create a cycle

• Before adding 𝑒, 𝑢 and 𝑣 lie in two different trees T1 and T2 of the current forest, with 𝑢 in T1 and 𝑣 in T2

• 𝑒 is the minimum-cost edge crossing the cut that separates T1 from the rest of the graph (including T2)

• The cut property ensures that 𝑒 is in the MST

(52)

Kruskal’s Time Complexity

• Disjoint-set data structure with union-by-rank (Textbook Ch. 21)

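• MAKE-SET: 𝑂(1)

• FIND-SET: 𝑂(log 𝑛)

• UNION: 𝑂(log 𝑛)

• The amortized cost of 𝑚 operations on 𝑛 elements (Exercise 21.4-4): 𝑂(𝑚 log 𝑛)

• Total complexity: 𝑂(𝑚 log 𝑛), dominated by sorting the edges and the disjoint-set operations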

MST-KRUSKAL(G, w) // w = weights
  A = empty // edge set of MST
  for v in G.V
    MAKE-SET(v)
  sort edges of G.E into non-decreasing order by weight w
  for (u, v) in G.E, taken in non-decreasing order by weight
    if FIND-SET(u) ≠ FIND-SET(v)
      A = A ∪ {(u, v)}
      UNION(u, v)
  return A

(53)

Prim’s Algorithm

Textbook Chapter 23.2 – The algorithms of Kruskal and Prim

(54)

Prim’s Algorithm

• Let 𝑇 consist of an arbitrary node

• For 𝑖 = 1 to 𝑛 − 1

• add the least-weighted edge incident to the current subtree 𝑇 that does not incur a cycle
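The loop above can be sketched in Python with the standard `heapq` module. This is a minimal illustration under stated assumptions: the graph is connected and undirected, given as a dictionary from each node to a list of `(weight, neighbor)` pairs, and the name `prim_mst` is hypothetical. Instead of DECREASE-KEY, this variant pushes candidate edges and skips stale ones when they are popped, a swapped-in technique that keeps the same 𝑂(𝑚 log 𝑛) bound.

```python
import heapq


def prim_mst(adj, root):
    """Return the MST edges as (weight, tree_node, new_node), in the order they are added."""
    in_tree = {root}
    heap = [(w, root, v) for w, v in adj[root]]
    heapq.heapify(heap)
    mst = []
    while heap and len(in_tree) < len(adj):
        w, u, v = heapq.heappop(heap)
        if v in in_tree:
            continue  # both endpoints are in T: adding the edge would create a cycle
        in_tree.add(v)
        mst.append((w, u, v))
        # New candidate edges leaving the enlarged subtree T.
        for w2, x in adj[v]:
            if x not in in_tree:
                heapq.heappush(heap, (w2, v, x))
    return mst
```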


(55)

Prim’s Algorithm Illustration

[Figure: Prim’s algorithm illustrated step by step on a weighted example graph]


(64)

Prim’s Algorithm Correctness


The lightest edge incident to a vertex must be in the MST

(65)

Prim’s Time Complexity

• Binary min-heap (Textbook Ch. 6)

MST-PRIM(G, w, r) // w = weights, r = root
  for u in G.V
    u.key = ∞
    u.π = NIL
  r.key = 0
  Q = G.V
  while Q ≠ empty
    u = EXTRACT-MIN(Q)
    for v in G.adj[u]
      if v ∈ Q and w(u, v) < v.key
        v.π = u
        v.key = w(u, v) // DECREASE-KEY

(66)

Prim’s Time Complexity

• Fibonacci heap (Textbook Ch. 19)

• BUILD-MIN-HEAP: 𝑂(𝑛)

• EXTRACT-MIN: 𝑂(log 𝑛) (amortized)

• DECREASE-KEY: 𝑂(1) (amortized)

• Total complexity: 𝑂(𝑚 + 𝑛 log 𝑛)


(67)

Concluding Remarks

• Minimal Spanning Trees (MST)

• Borůvka’s Algorithm: 𝑂(𝑚 log 𝑛)

• Kruskal’s Algorithm: 𝑂(𝑚 log 𝑛)

• Prim’s Algorithm: 𝑂(𝑚 log 𝑛) with binary min-heap

• Prim’s Algorithm: 𝑂(𝑚 + 𝑛 log 𝑛) with Fibonacci heap

(68)

Questions?

Important announcements will be sent to your @ntu.edu.tw mailbox and posted on the course website

Course Website: http://ada.miulab.tw

Email: ada-ta@csie.ntu.edu.tw