Slides credited from Hsueh-I Lu, Hsu-Chun Hsiao, & Michael Tsai
Mini-HW
NTU COOL
TA hours
Course recordings
Instant feedback
Classroom (crowded, sleepy, etc.)
Homework due time
Pseudo code
Mini-HW 7 released
Due on 11/29 (Thur) 14:20
Homework 3 released soon
Due on 12/13 (Thur) 14:20 (three weeks)
Frequently check the website for the updated information!
Graph Basics
Graph Theory
Graph Representations
Graph Traversal
Breadth-First Search (BFS)
Depth-First Search (DFS)
DFS Applications
Connected Components
Strongly Connected Components
Topological Sorting
A graph 𝐺 = (𝑉, 𝐸) is defined as
V: a finite, nonempty set of vertices
E: a set of edges / pairs of vertices
Graph type
Undirected: edge (𝑢, 𝑣) = (𝑣, 𝑢)
Directed: edge (𝑢, 𝑣) goes from vertex 𝑢 to vertex 𝑣; (𝑢, 𝑣) ≠ (𝑣, 𝑢)
Weighted: edges are associated with weights
How many edges at most can an undirected (or directed) graph have?
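One way to approach this question, assuming simple graphs (no self-loops and no parallel edges, an assumption the slide does not state), is to count the possible vertex pairs:

```latex
\[
  |E| \le \binom{|V|}{2} = \frac{|V|\,(|V|-1)}{2} \quad \text{(undirected)},
  \qquad
  |E| \le |V|\,(|V|-1) \quad \text{(directed)}.
\]
% Example with |V| = 5: at most 10 undirected edges, or 20 directed edges.
```

If self-loops are allowed in a directed graph, the bound becomes $|V|^2$.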
Adjacent
If there is an edge (𝑢, 𝑣), then 𝑢 and 𝑣 are adjacent.
Incident
If there is an edge (𝑢, 𝑣), the edge (𝑢, 𝑣) is incident from 𝑢 and is incident to 𝑣.
Subgraph
If a graph 𝐺′ = (𝑉′, 𝐸′) is a subgraph of 𝐺 = (𝑉, 𝐸), then 𝑉′ ⊆ 𝑉 and 𝐸′ ⊆ 𝐸
Degree
The degree of a vertex 𝑢 is the number of edges incident on 𝑢
In-degree of 𝑢: #edges (𝑥, 𝑢) in a directed graph
Out-degree of 𝑢: #edges (𝑢, 𝑥) in a directed graph
Degree = in-degree + out-degree
Isolated vertex: degree = 0
$|E| = \frac{\sum_{i} d_i}{2}$, where $d_i$ is the degree of vertex $i$
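The relation between the edge count and the degree sum can be checked on a small example. The sketch below assumes an undirected graph given as a list of edges; the helper name `degrees` and the example graph are hypothetical, not from the slides.

```python
from collections import defaultdict

def degrees(edges):
    """Return a dict mapping each vertex to its degree in an undirected graph."""
    deg = defaultdict(int)
    for u, v in edges:
        # Each edge contributes 1 to the degree of both of its endpoints.
        deg[u] += 1
        deg[v] += 1
    return dict(deg)

# Hypothetical example: the path 1-2-3 plus the edge 2-4.
edges = [(1, 2), (2, 3), (2, 4)]
deg = degrees(edges)

# The degree sum counts every edge exactly twice.
assert sum(deg.values()) == 2 * len(edges)
```

Because each edge is counted once at each endpoint, the degree sum is always even.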
Path
A sequence of edges that connect a sequence of vertices
If there is a path from 𝑢 (source) to 𝑣 (target), there is a sequence of edges (𝑢, 𝑖_1), (𝑖_1, 𝑖_2), …, (𝑖_{𝑘−1}, 𝑖_𝑘), (𝑖_𝑘, 𝑣)
Reachable: 𝑣 is reachable from 𝑢 if there exists a path from 𝑢 to 𝑣
Simple Path
All vertices on the path, except possibly 𝑢 and 𝑣, are distinct
Cycle
Connected
Two vertices are connected if there is a path between them
A connected graph has a path from every vertex to every other
Tree
a connected, acyclic, undirected graph
Forest
an acyclic, undirected but possibly disconnected graph
Theorem. Let 𝐺 = (𝑉, 𝐸) be an undirected graph. The following statements are equivalent:
𝐺 is a tree
Any two vertices in 𝐺 are connected by a unique simple path
𝐺 is connected, but if any edge is removed from 𝐸, the resulting graph is disconnected
𝐺 is connected and |𝐸| = |𝑉| − 1
𝐺 is acyclic and |𝐸| = |𝑉| − 1
𝐺 is acyclic, but if any edge is added to 𝐸, the resulting graph contains a cycle
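The characterization "connected and |E| = |V| − 1" gives a simple tree test. The following is a minimal sketch, assuming vertices are numbered 0 to n − 1 and the graph is given as an undirected edge list; the function name `is_tree` is hypothetical.

```python
def is_tree(n, edges):
    """Check whether an undirected graph on vertices 0..n-1 is a tree,
    using the criterion: connected and |E| = |V| - 1."""
    if n == 0 or len(edges) != n - 1:
        return False
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    # Explore everything reachable from vertex 0 with an explicit stack.
    seen = {0}
    stack = [0]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    # The graph is connected exactly when every vertex was reached.
    return len(seen) == n

print(is_tree(4, [(0, 1), (1, 2), (1, 3)]))  # True
print(is_tree(4, [(0, 1), (1, 2), (2, 0)]))  # False: vertex 3 is isolated
```

The second call has the right number of edges but is disconnected, which shows why the edge count alone is not enough.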
How can all the bridges be traversed if each one can be crossed only once?
Euler path
Can you traverse each edge in a connected graph exactly once without lifting the pen from the paper?
Euler tour
Can you finish where you started?
Solved by Leonhard Euler in 1736
A connected graph 𝐺 has an Euler path iff 𝐺 has exactly 0 or 2 odd vertices
A connected graph 𝐺 has an Euler tour iff all vertices of 𝐺 are even vertices
Is it possible to determine whether a graph has an Euler path or an Euler tour without having to find one explicitly?
Even vertices = vertices with even degree
Odd vertices = vertices with odd degree
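The degree-parity criterion can be evaluated without constructing any path. The sketch below assumes the graph is connected (connectivity is not checked) and is given as an undirected edge list that may contain parallel edges. The function name `euler_status` is hypothetical, and the bridge layout is the classical Königsberg configuration rather than something recovered from the slide figure.

```python
from collections import Counter

def euler_status(edges):
    """Report whether a connected undirected multigraph has an Euler path
    or an Euler tour, based only on the number of odd-degree vertices."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    odd = sum(1 for d in deg.values() if d % 2 == 1)
    return {"euler_path": odd in (0, 2), "euler_tour": odd == 0}

# Classical Königsberg layout (assumed): seven bridges over four land masses.
konigsberg = [("A", "B"), ("A", "B"), ("A", "C"), ("A", "C"),
              ("A", "D"), ("B", "D"), ("C", "D")]
print(euler_status(konigsberg))  # {'euler_path': False, 'euler_tour': False}
```

All four land masses have odd degree (5, 3, 3, 3), so neither an Euler path nor an Euler tour exists.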
Hamiltonian Path
A path that visits each vertex exactly once
Hamiltonian Cycle
A Hamiltonian path where the start and destination are the same
Deciding whether a graph has a Hamiltonian path, or a Hamiltonian cycle, is NP-complete in both cases
Modeling applications using graph theory
What do the vertices represent?
What do the edges represent?
Undirected or directed?
How to represent a graph in computer programs?
Two standard ways to represent a graph 𝐺 = (𝑉, 𝐸)
Adjacency matrix
Adjacency list
Matrix
Adjacency matrix = |𝑉| × |𝑉| matrix 𝐴 with 𝐴[𝑢][𝑣] = 1 if (𝑢, 𝑣) is an edge, and 0 otherwise
[Figure: an example graph on vertices 1 to 6 and its 6 × 6 adjacency matrix]
• For undirected graphs, 𝐴 is symmetric; i.e., 𝐴 = 𝐴ᵀ
• If weighted, store weights instead of bits in 𝐴
Matrix
Space:
Time for querying an edge:
Time for inserting an edge:
Time for deleting an edge:
Time for listing all neighbors of a vertex:
Time for identifying all edges:
Time for finding in-degree and out-degree of a vertex?
List
Adjacency lists = vertex-indexed array of lists
One list per vertex, where for 𝑢 ∈ 𝑉, 𝐴[𝑢] consists of all vertices adjacent to 𝑢
[Figure: an example graph on vertices 1 to 6 and its adjacency lists]
If weighted, store weights also in adjacency lists
List
Space:
Time for querying an edge:
Time for inserting an edge:
Time for deleting an edge:
Time for listing all neighbors of a vertex:
Time for identifying all edges:
Time for finding in-degree and out-degree of a vertex?
Matrix representation is suitable for dense graphs
List representation is suitable for sparse graphs
Besides graph density, you may also choose a data structure based on the performance of other operations
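To make the two representations concrete, here is a minimal sketch that builds both from the same edge list. It assumes an undirected, unweighted graph with vertices numbered 0 to n − 1; the function names and the example graph are hypothetical.

```python
def build_matrix(n, edges):
    """Adjacency matrix: A[u][v] == 1 iff (u, v) is an edge."""
    A = [[0] * n for _ in range(n)]
    for u, v in edges:
        A[u][v] = 1
        A[v][u] = 1  # undirected, so the matrix is symmetric
    return A

def build_lists(n, edges):
    """Adjacency lists: adj[u] holds all vertices adjacent to u."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj

edges = [(0, 1), (0, 2), (2, 3)]
A = build_matrix(4, edges)
adj = build_lists(4, edges)

has_edge_matrix = A[0][2] == 1  # single array lookup
has_edge_list = 2 in adj[0]     # scans the list of vertex 0
print(has_edge_matrix, has_edge_list)  # True True
print(adj)                             # [[1, 2], [0], [0, 3], [2]]
```

The matrix always uses n × n cells regardless of how many edges exist, while the lists use space proportional to the number of vertices plus edges, which is the trade-off behind the dense versus sparse guideline.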
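Adjacency Matrix vs. Adjacency List (assuming unsorted adjacency lists)
Space: adjacency matrix Θ(|𝑉|²); adjacency list Θ(|𝑉| + |𝐸|)
Query an edge (𝑢, 𝑣): adjacency matrix O(1); adjacency list O(deg(𝑢))
Insert an edge: adjacency matrix O(1); adjacency list O(1)
Delete an edge (𝑢, 𝑣): adjacency matrix O(1); adjacency list O(deg(𝑢))
List a vertex’s neighbors: adjacency matrix Θ(|𝑉|); adjacency list Θ(deg(𝑢))
Identify all edges: adjacency matrix Θ(|𝑉|²); adjacency list Θ(|𝑉| + |𝐸|)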
Textbook Chapter 22 – Elementary Graph Algorithms
From a source vertex, systematically follow the edges of a graph to visit all reachable vertices of the graph
Useful to discover the structure of a graph
Standard graph-searching algorithms
Breadth-First Search (BFS, 廣度優先搜尋)
Depth-First Search (DFS, 深度優先搜尋)
Textbook Chapter 22.2 – Breadth-first search
[Figure: BFS explores outward from the source 𝑠 layer by layer: layer 1, layer 2, ...]
Input: directed/undirected graph 𝐺 = (𝑉, 𝐸) and source 𝑠
Output: a breadth-first tree 𝑇_BFS with root 𝑠 that contains all reachable vertices
𝑣.𝑑: distance from 𝑠 to 𝑣, for all 𝑣 ∈ 𝑉
Distance is the length of a shortest path in 𝐺
𝑣.𝑑 = ∞ if 𝑣 is not reachable from 𝑠
𝑣.𝑑 is also the depth of 𝑣 in 𝑇_BFS
𝑣.𝜋 = 𝑢 if (𝑢, 𝑣) is the last edge on a shortest path to 𝑣
𝑢 is 𝑣’s predecessor in 𝑇_BFS
Initially 𝑇_BFS contains only 𝑠
As 𝑣 is discovered from 𝑢, 𝑣 and (𝑢, 𝑣) are added to 𝑇_BFS
𝑇_BFS is not explicitly stored; it can be reconstructed from 𝑣.𝜋
Implemented via a FIFO queue
Color the vertices to keep track of progress:
GRAY: discovered (first time encountered)
BLACK: finished (all adjacent vertices discovered)
WHITE: undiscovered
BFS(G, s)
  for each vertex u in G.V - {s}
    u.color = WHITE
    u.d = ∞
    u.pi = NIL
  s.color = GRAY
  s.d = 0
  s.pi = NIL
  Q = {}
  ENQUEUE(Q, s)
  while Q != {}
    u = DEQUEUE(Q)
    for each v in G.Adj[u]
      if v.color == WHITE
        v.color = GRAY
        v.d = u.d + 1
        v.pi = u
        ENQUEUE(Q, v)
    u.color = BLACK
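The pseudocode above can be sketched in Python as follows. This is an illustrative version, not the course's reference implementation: it assumes the graph is a dict mapping each vertex to its adjacency list, and it folds the WHITE test into the check `d[v] == math.inf` instead of storing explicit colors. The example graph is hypothetical.

```python
from collections import deque
import math

def bfs(adj, s):
    """Breadth-first search from s. Returns (d, pi): distances and predecessors."""
    d = {u: math.inf for u in adj}
    pi = {u: None for u in adj}
    d[s] = 0
    queue = deque([s])  # FIFO queue of discovered, unfinished vertices
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if d[v] == math.inf:  # v is undiscovered (WHITE)
                d[v] = d[u] + 1
                pi[v] = u
                queue.append(v)
    return d, pi

# Hypothetical undirected example; "z" is unreachable from "s".
adj = {
    "s": ["a", "b"],
    "a": ["s", "c"],
    "b": ["s", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
    "z": [],
}
d, pi = bfs(adj, "s")
print(d["c"], d["d"], d["z"])  # 2 3 inf
```

Each vertex is enqueued at most once and each adjacency list is scanned once, so the running time is O(|V| + |E|) with adjacency lists.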
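[Figure: BFS trace on an example graph. Queue snapshots (vertex: 𝑑 value) include ⟨𝑠: 0⟩, ⟨𝑤: 1, 𝑟: 1⟩, ⟨𝑟: 1, 𝑡: 2, 𝑥: 2⟩, ⟨𝑡: 2, 𝑥: 2, 𝑣: 2⟩, ⟨𝑢: 3, 𝑦: 3⟩, ⟨𝑦: 3⟩]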
Definition of 𝛿(𝑠, 𝑣): the shortest-path distance from 𝑠 to 𝑣 = the minimum number of edges in any path from 𝑠 to 𝑣
If there is no path from 𝑠 to 𝑣, then 𝛿(𝑠, 𝑣) = ∞
The BFS algorithm finds the shortest-path distance to each reachable vertex in a graph 𝐺 from a given source vertex 𝑠 ∈ 𝑉.
Lemma 22.1
Let 𝐺 = (𝑉, 𝐸) be a directed or undirected graph, and let 𝑠 ∈ 𝑉 be an arbitrary vertex. Then, for any edge (𝑢, 𝑣) ∈ 𝐸, 𝛿(𝑠, 𝑣) ≤ 𝛿(𝑠, 𝑢) + 1.
In words: the shortest-path distance from 𝑠 to 𝑣 is at most the shortest-path distance from 𝑠 to 𝑢 plus 1
Proof
Case 1: 𝑢 is reachable from 𝑠
A shortest path from 𝑠 to 𝑢 followed by the edge (𝑢, 𝑣) is a path from 𝑠 to 𝑣 with length 𝛿(𝑠, 𝑢) + 1
Hence, 𝛿(𝑠, 𝑣) ≤ 𝛿(𝑠, 𝑢) + 1
Case 2: 𝑢 is unreachable from 𝑠
Then 𝛿(𝑠, 𝑢) = ∞.
Hence, the inequality still holds.
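Lemma 22.2
Let 𝐺 = (𝑉, 𝐸) be a directed or undirected graph, and suppose BFS is run on 𝐺 from a given source vertex 𝑠 ∈ 𝑉. Then upon termination, for each vertex 𝑣 ∈ 𝑉, the value 𝑣.𝑑 computed by BFS satisfies 𝑣.𝑑 ≥ 𝛿(𝑠, 𝑣).
In words: the 𝑑 value computed by BFS is always at least the true distance
Proof by induction on the number of ENQUEUE ops
Inductive hypothesis: 𝑣.𝑑 ≥ 𝛿(𝑠, 𝑣) for all 𝑣 ∈ 𝑉 after 𝑛 ENQUEUE ops
Holds when 𝑛 = 1: 𝑠 is in the queue with 𝑠.𝑑 = 0 = 𝛿(𝑠, 𝑠), and 𝑣.𝑑 = ∞ for all 𝑣 ∈ 𝑉 − {𝑠}
After 𝑛 + 1 ENQUEUE ops, consider a white vertex 𝑣 that is discovered during the search from a vertex 𝑢
By the inductive hypothesis, 𝑢.𝑑 ≥ 𝛿(𝑠, 𝑢), so 𝑣.𝑑 = 𝑢.𝑑 + 1 ≥ 𝛿(𝑠, 𝑢) + 1 ≥ 𝛿(𝑠, 𝑣) by Lemma 22.1
𝑣 is then colored GRAY and is never enqueued again, so 𝑣.𝑑 does not change afterwards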
Lemma 22.3
Suppose that during the execution of BFS on a graph 𝐺 = (𝑉, 𝐸), the queue 𝑄 contains the vertices ⟨𝑣_1, 𝑣_2, …, 𝑣_𝑟⟩, where 𝑣_1 is the head of 𝑄 and 𝑣_𝑟 is the tail. Then 𝑣_𝑟.𝑑 ≤ 𝑣_1.𝑑 + 1 and 𝑣_𝑖.𝑑 ≤ 𝑣_{𝑖+1}.𝑑 for 1 ≤ 𝑖 < 𝑟.
• The 𝑑 value of the last vertex in 𝑄 ≤ the 𝑑 value of the first vertex in 𝑄 + 1
• The 𝑑 value of the 𝑖-th vertex in 𝑄 ≤ the 𝑑 value of the (𝑖+1)-th vertex in 𝑄
Proof by induction on the number of queue ops
Inductive hypothesis, after 𝑛 queue ops with 𝑄 = ⟨𝑣_1, 𝑣_2, …, 𝑣_𝑟⟩:
(H1) 𝑣_𝑟.𝑑 ≤ 𝑣_1.𝑑 + 1
(H2) 𝑣_𝑖.𝑑 ≤ 𝑣_{𝑖+1}.𝑑 for 1 ≤ 𝑖 < 𝑟
Holds when 𝑄 = ⟨𝑠⟩.
Consider two operations for the inductive step:
Dequeue op: 𝑄 = ⟨𝑣_1, 𝑣_2, …, 𝑣_𝑟⟩ and 𝑣_1 is dequeued, so 𝑣_2 becomes the new head
H1 holds: 𝑣_𝑟.𝑑 ≤ 𝑣_1.𝑑 + 1 (induction hypothesis H1) ≤ 𝑣_2.𝑑 + 1 (induction hypothesis H2)
H2 holds: the remaining inequalities are unchanged
Enqueue op: 𝑄 = ⟨𝑣_1, 𝑣_2, …, 𝑣_𝑟⟩ and 𝑣_{𝑟+1} is enqueued
Let 𝑢 be 𝑣_{𝑟+1}’s predecessor, so 𝑣_{𝑟+1}.𝑑 = 𝑢.𝑑 + 1
Since 𝑢 has been removed from 𝑄, the new head 𝑣_1 satisfies 𝑢.𝑑 ≤ 𝑣_1.𝑑 (induction hypothesis H2)
H1 holds: 𝑣_{𝑟+1}.𝑑 = 𝑢.𝑑 + 1 ≤ 𝑣_1.𝑑 + 1
H2 holds: 𝑣_𝑟.𝑑 ≤ 𝑢.𝑑 + 1 (induction hypothesis H1) = 𝑣_{𝑟+1}.𝑑
Corollary 22.4
Suppose that vertices 𝑣_𝑖 and 𝑣_𝑗 are enqueued during the execution of BFS, and that 𝑣_𝑖 is enqueued before 𝑣_𝑗. Then 𝑣_𝑖.𝑑 ≤ 𝑣_𝑗.𝑑 at the time that 𝑣_𝑗 is enqueued.
In words: if 𝑣_𝑖 enters the queue earlier than 𝑣_𝑗, then 𝑣_𝑖.𝑑 ≤ 𝑣_𝑗.𝑑
Proof
Lemma 22.3 proves that 𝑣_𝑖.𝑑 ≤ 𝑣_{𝑖+1}.𝑑 for 1 ≤ 𝑖 < 𝑟
Each vertex receives a finite 𝑑 value at most once during the course of BFS
Hence, the corollary follows.
Theorem 22.5 – BFS Correctness
Let 𝐺 = (𝑉, 𝐸) be a directed or undirected graph, and suppose that BFS is run on 𝐺 from a given source vertex 𝑠 ∈ 𝑉.
1) BFS discovers every vertex 𝑣 ∈ 𝑉 that is reachable from the source 𝑠
2) Upon termination, 𝑣.𝑑 = 𝛿(𝑠, 𝑣) for all 𝑣 ∈ 𝑉
3) For any vertex 𝑣 ≠ 𝑠 that is reachable from 𝑠, one of the shortest paths from 𝑠 to 𝑣 is a shortest path from 𝑠 to 𝑣.𝜋 followed by the edge (𝑣.𝜋, 𝑣)
Proof of (1), given (2)
All vertices 𝑣 reachable from 𝑠 must be discovered; otherwise they would have 𝑣.𝑑 = ∞ > 𝛿(𝑠, 𝑣), contradicting (2)
Proof of (2) by contradiction
Assume some vertex receives a 𝑑 value not equal to its shortest-path distance
Let 𝑣 be the vertex with minimum 𝛿(𝑠, 𝑣) that receives such an incorrect 𝑑 value; clearly 𝑣 ≠ 𝑠
By Lemma 22.2, 𝑣.𝑑 ≥ 𝛿(𝑠, 𝑣), thus 𝑣.𝑑 > 𝛿(𝑠, 𝑣) (so 𝑣 must be reachable from 𝑠; otherwise 𝛿(𝑠, 𝑣) = ∞ ≥ 𝑣.𝑑)
Let 𝑢 be the vertex immediately preceding 𝑣 on a shortest path from 𝑠 to 𝑣, so 𝛿(𝑠, 𝑣) = 𝛿(𝑠, 𝑢) + 1
Because 𝛿(𝑠, 𝑢) < 𝛿(𝑠, 𝑣) and 𝑣 has the minimum 𝛿(𝑠, 𝑣) among vertices with an incorrect 𝑑 value, we have 𝑢.𝑑 = 𝛿(𝑠, 𝑢)
Hence 𝑣.𝑑 > 𝛿(𝑠, 𝑣) = 𝛿(𝑠, 𝑢) + 1 = 𝑢.𝑑 + 1
When dequeuing 𝑢 from 𝑄, vertex 𝑣 is either WHITE, GRAY, or BLACK
WHITE: BFS sets 𝑣.𝑑 = 𝑢.𝑑 + 1, contradiction
BLACK: it was already removed from the queue
By Corollary 22.4, we have 𝑣.𝑑 ≤ 𝑢.𝑑, contradiction
GRAY: it was painted GRAY upon dequeuing some vertex 𝑤, which was removed from 𝑄 earlier than 𝑢
Thus 𝑣.𝑑 = 𝑤.𝑑 + 1 (by construction)
By Corollary 22.4, 𝑤.𝑑 ≤ 𝑢.𝑑, so 𝑣.𝑑 = 𝑤.𝑑 + 1 ≤ 𝑢.𝑑 + 1, contradiction
(3) For any vertex 𝑣 ≠ 𝑠 that is reachable from 𝑠, one of the shortest paths from 𝑠 to 𝑣 is a shortest path from 𝑠 to 𝑣.𝜋 followed by the edge (𝑣.𝜋, 𝑣)
Proof of (3)
If 𝑣.𝜋 = 𝑢, then 𝑣.𝑑 = 𝑢.𝑑 + 1. Thus, we can obtain a shortest path from 𝑠 to 𝑣 by taking a shortest path from 𝑠 to 𝑣.𝜋 and then traversing the edge (𝑣.𝜋, 𝑣).
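Statement (3) suggests how to recover an actual shortest path: follow the predecessor links from the target back to the source. The sketch below assumes predecessors are stored in a dict `pi` with `None` for the source and for unreachable vertices; the function name `shortest_path` and the sample predecessor map are hypothetical.

```python
def shortest_path(pi, s, v):
    """Rebuild a shortest path from s to v by walking predecessor links.
    Returns None if v is not reachable from s."""
    path = [v]
    while path[-1] != s:
        parent = pi.get(path[-1])
        if parent is None:  # ran out of predecessors before reaching s
            return None
        path.append(parent)
    path.reverse()  # links were followed from v back to s
    return path

# Hypothetical predecessor map, as a BFS from "s" might produce it.
pi = {"s": None, "a": "s", "b": "s", "c": "a", "d": "c", "z": None}
print(shortest_path(pi, "s", "d"))  # ['s', 'a', 'c', 'd']
print(shortest_path(pi, "s", "z"))  # None
```

The walk visits one vertex per edge of the path, so it takes time proportional to the path length.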
BFS(G, s) forms a BFS tree with all reachable 𝑣 from 𝑠
We can extend the algorithm to find a BFS forest that contains every vertex in 𝐺
BFS-Visit(G, s)
  s.color = GRAY
  s.d = 0
  s.π = NIL
  Q = empty
  ENQUEUE(Q, s)
  while Q ≠ empty
    u = DEQUEUE(Q)
    for v in G.adj[u]
      if v.color == WHITE
        v.color = GRAY
        v.d = u.d + 1
        v.π = u
        ENQUEUE(Q, v)
    u.color = BLACK

// explores the full graph and builds up a collection of BFS trees
BFS(G)
  for u in G.V
    u.color = WHITE
    u.d = ∞
    u.π = NIL
  for s in G.V
    if s.color == WHITE
      BFS-Visit(G, s)
Textbook Chapter 22.3 – Depth-first search
Search as deep as possible and then backtrack until finding a new path
[Figure: an example tree whose vertices are numbered 1 to 14 in the order DFS visits them]
Implemented via recursion (stack)
Color the vertices to keep track of progress:
GRAY: discovered (first time encountered)
BLACK: finished (all adjacent vertices discovered)
WHITE: undiscovered
// Explores the full graph and builds up a collection of DFS trees
DFS(G)
  for each vertex u in G.V
    u.color = WHITE
    u.pi = NIL
  time = 0 // global timestamp
  for each vertex u in G.V
    if u.color == WHITE
      DFS-VISIT(G, u)

DFS-VISIT(G, u)
  time = time + 1
  u.d = time // discover time
  u.color = GRAY
  for each v in G.Adj[u]
    if v.color == WHITE
      v.pi = u
      DFS-VISIT(G, v)
  u.color = BLACK
  time = time + 1
  u.f = time // finish time
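A direct Python rendering of the DFS pseudocode might look like the sketch below. It assumes the graph is a dict mapping each vertex to its adjacency list; the example directed graph is illustrative and not taken from the slides. Because it recurses, very deep graphs could exceed Python's default recursion limit.

```python
def dfs(adj):
    """Depth-first search over the whole graph.
    Returns (d, f, pi): discovery times, finish times, and predecessors."""
    color = {u: "WHITE" for u in adj}
    d, f = {}, {}
    pi = {u: None for u in adj}
    time = 0  # global timestamp shared by all visits

    def visit(u):
        nonlocal time
        time += 1
        d[u] = time  # discovery time
        color[u] = "GRAY"
        for v in adj[u]:
            if color[v] == "WHITE":
                pi[v] = u
                visit(v)
        color[u] = "BLACK"
        time += 1
        f[u] = time  # finish time

    for u in adj:
        if color[u] == "WHITE":
            visit(u)
    return d, f, pi

# Illustrative directed graph with two DFS trees (rooted at "u" and "w").
adj = {"u": ["v", "x"], "v": ["y"], "x": ["v"], "y": ["x"], "w": ["y", "z"], "z": ["z"]}
d, f, pi = dfs(adj)
print(d["u"], f["u"])  # 1 8
print(d["w"], f["w"])  # 9 12
```

The intervals [d, f] of any two vertices are either disjoint or nested, for example [4, 5] for "x" lies inside [1, 8] for "u".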
Parenthesis Theorem
Parenthesis structure: represent the discovery of vertex 𝑢 with a left parenthesis “(𝑢” and represent its finishing by a right parenthesis “𝑢)”. In DFS, the parentheses are properly nested.
Properly nested: (x (y y) x)
Not properly nested: (x (y x) y)
Proof in textbook p. 608
White Path Theorem
In a DFS forest of a directed or undirected graph 𝐺 = (𝑉, 𝐸), vertex 𝑣 is a descendant of vertex 𝑢 in the forest if and only if, at the time 𝑢.𝑑 that the search discovers 𝑢, there is a path from 𝑢 to 𝑣 in 𝐺 consisting entirely of WHITE vertices
Proof (only-if direction).
Since 𝑣 is a proper descendant of 𝑢, 𝑢.𝑑 < 𝑣.𝑑
Hence, 𝑣 is WHITE at time 𝑢.𝑑
In fact, since 𝑣 can be any descendant of 𝑢, every vertex on the path from 𝑢 to 𝑣 in the DFS forest is WHITE at time 𝑢.𝑑
Classification of Edges in 𝐺
Tree Edge (GRAY to WHITE)
Edges in the DFS forest
Found when encountering a new vertex 𝑣 by exploring (𝑢, 𝑣)
Back Edge (GRAY to GRAY)
(𝑢, 𝑣), from descendant 𝑢 to ancestor 𝑣 in a DFS tree
Forward Edge (GRAY to BLACK)
(𝑢, 𝑣), from ancestor 𝑢 to descendant 𝑣. Not a tree edge.
Cross Edge (GRAY to BLACK)
Any other edge between trees or subtrees. Can go between vertices in same DFS tree or in different DFS trees
In an undirected graph, back edge = forward edge.
To avoid ambiguity, classify edge as the first type in the list that applies.
Edge classification by the color of 𝑣 when visiting (𝑢, 𝑣)
WHITE: tree edge
GRAY: back edge
BLACK: forward edge or cross edge
𝑢.𝑑 < 𝑣.𝑑: forward edge
𝑢.𝑑 > 𝑣.𝑑: cross edge
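The color-based rules above can be applied during a DFS of a directed graph. The following is a minimal sketch under that assumption; the function name `classify_edges` and the example graph are hypothetical, and only discovery times are recorded because finish times are not needed for the classification.

```python
def classify_edges(adj):
    """Classify every edge of a directed graph as tree, back, forward, or cross,
    using the color of the head vertex at the moment the edge is explored."""
    color = {u: "WHITE" for u in adj}
    d = {}       # discovery order of each vertex
    kinds = {}   # maps (u, v) to its edge type
    time = 0

    def visit(u):
        nonlocal time
        time += 1
        d[u] = time
        color[u] = "GRAY"
        for v in adj[u]:
            if color[v] == "WHITE":
                kinds[(u, v)] = "tree"
                visit(v)
            elif color[v] == "GRAY":
                kinds[(u, v)] = "back"     # v is an ancestor still being explored
            elif d[u] < d[v]:
                kinds[(u, v)] = "forward"  # v is BLACK and was discovered after u
            else:
                kinds[(u, v)] = "cross"    # v is BLACK and was discovered before u
        color[u] = "BLACK"

    for u in adj:
        if color[u] == "WHITE":
            visit(u)
    return kinds

adj = {"u": ["v", "x"], "v": ["y"], "x": ["v"], "y": ["x"], "w": ["y", "z"], "z": ["z"]}
kinds = classify_edges(adj)
print(kinds[("x", "v")], kinds[("u", "x")], kinds[("w", "y")])  # back forward cross
```

The self-loop ("z", "z") is reported as a back edge, since "z" is GRAY when the edge is explored.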
Theorem 22.10
In DFS of an undirected graph, there are only tree edges and back edges
Connected Components
Strongly Connected Components
Topological Sort
Input: a graph 𝐺 = (𝑉, 𝐸)
Output: a connected component of 𝐺
A connected component is a maximal subset 𝑈 of 𝑉 s.t. any two nodes in 𝑈 are connected in 𝐺
Why must the connected components of a graph be disjoint?
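One way to compute all connected components of an undirected graph is to start a fresh traversal from every vertex that has not been reached yet. The sketch below assumes a dict of adjacency lists and uses an explicit stack; the function name `connected_components` and the example graph are hypothetical.

```python
def connected_components(adj):
    """Return the connected components of an undirected graph,
    each as a sorted list of vertices."""
    seen = set()
    components = []
    for s in adj:
        if s in seen:
            continue  # s already belongs to an earlier component
        seen.add(s)
        comp = [s]
        stack = [s]
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    comp.append(v)
                    stack.append(v)
        components.append(sorted(comp))
    return components

# Hypothetical example with three components.
adj = {1: [2, 5], 2: [1, 5], 5: [1, 2], 3: [4], 4: [3], 6: []}
print(connected_components(adj))  # [[1, 2, 5], [3, 4], [6]]
```

Every vertex is added to `seen` exactly once, so each vertex lands in exactly one component, which mirrors the disjointness question above.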
Course Website: http://ada.miulab.tw
Email: ada-ta@csie.ntu.edu.tw
Important announcements will be sent to your @ntu.edu.tw mailbox and posted to the course website