A Self-Stabilizing Algorithm for Finding Distances in a Distributed System


Tetz C. Huang and Ji-Cherng Lin
Department of Computer Engineering and Science, Yuan-Ze University
135 Yuan-Tung Rd., Chung-Li, Taoyuan 32026, Taiwan
Email: cstetz@cs.yzu.edu.tw, csjclin@cs.yzu.edu.tw

ABSTRACT

In this paper, we propose a self-stabilizing algorithm for finding distances in a distributed system in which a central daemon is assumed. The correctness of the proposed algorithm is proved by using the bounded function technique.

I. Introduction

The notion of self-stabilization in a distributed system was first introduced by E. W. Dijkstra in his classic paper [1] in 1974. According to him, a distributed system is self-stabilizing if, regardless of the initial global state, the system automatically adjusts itself so that it eventually converges to a legitimate state and then stays in a legitimate state thereafter unless it incurs a subsequent transient fault. The main work in this paper is to provide a concise proof of the correctness of a commonly used algorithm for finding distances by employing the bounded function technique. The inspiration for using the bounded function in the proof comes from [2].

The rest of this paper is arranged as follows. In Section II, the algorithm is proposed and the meaning of the legitimate state is explained. In Section III, an example illustrates the execution of the algorithm. In Section IV, the correctness proof of the algorithm is given.

II. The Algorithm

As usual, we use a connected undirected graph $G = (V, E)$ to model a distributed system, with each node $i \in V$ representing a processor in the system and each edge $\{i, j\} \in E$ representing the bidirectional link connecting processors $i$ and $j$. Following Dijkstra [1], the system assumes the presence of a central daemon which can randomly select one among all the privileged processors to make a move; the central daemon need not be fair in any sense.
We should mention here that for the shortest path problem, it suffices to consider only simple graphs, i.e., graphs without loops or multiple edges. In the system, each edge $e = \{i, j\}$ is preassigned a weight (or length) $w(e) = w(i, j)$, which is a positive integer. If $L = (e_1, e_2, \ldots, e_t)$ is a path in $G$, the weight (or length) of $L$, $w(L)$, is defined to be $\sum_{k=1}^{t} w(e_k)$. For any two nodes $i$ and $j$ in $V$, a shortest path between $i$ and $j$ is a path of minimum weight which connects $i$ and $j$; the weight of a shortest path between $i$ and $j$ is called the distance between $i$ and $j$ and is denoted by $d(i, j)$.

The problem of finding distances can be phrased as follows: Suppose a node $r$ in $G$ is specified as the source of the system. We want to find, for each node $i$ in $G$, the distance between $i$ and the source $r$.

The self-stabilizing algorithm for finding distances in the system is given below. Note that in the algorithm, $d(i)$ stands for a local variable of the node $i$ and $N(i) = \{j \in V \mid \{i, j\} \in E\}$ denotes the set of all neighbors of $i$. The value of each local variable $d(i)$ is in the range $\{0, 1, 2, \ldots\}$.

Self-stabilizing algorithm for finding distances in a distributed system

{For the source $r$}
(R0) $d(r) \neq 0 \rightarrow d(r) := 0$.

{For node $i \neq r$}
(R1) $d(i) \neq \min_{j \in N(i)} (d(j) + w(i, j)) \rightarrow d(i) := \min_{j \in N(i)} (d(j) + w(i, j))$.

The legitimate states for the system are defined to be those states in which $d(r) = 0$ and, for all $i \neq r$, $d(i) = \min_{j \in N(i)} (d(j) + w(i, j))$. The meaning of the legitimate states can be seen from the following theorem.

Theorem 1: If the system $G = (V, E)$ is in any legitimate state, then $\forall i \in V$, $d(i) = d(i, r)$.

Proof: First, let each node $v \neq r$ select a neighbor $k$ with $d(k) + w(v, k) = \min_{j \in N(v)} (d(j) + w(v, j))$ to be its predecessor, denoted by $p(v)$. Since $d(p(v)) + w(v, p(v)) = \min_{j \in N(v)} (d(j) + w(v, j)) = d(v)$ and every weight is positive, we have $d(p(v)) + w(v, p(v)) = d(v)$ and $d(p(v)) < d(v)$ for any $v \neq r$. Let $i \neq r$ be an arbitrary node in $V$.
If we trace predecessors from $i$ on, we get a sequence $(v_0, v_1, v_2, \ldots)$ with $v_0 = i$ and $p(v_k) = v_{k+1}$ for $k = 0, 1, 2, \ldots$. If the tracing never reaches the source $r$, then it continues indefinitely, which means that the above sequence is infinite. Since $d(v_k) = d(v_{k+1}) + w(v_k, v_{k+1})$ for any $k$, we then have $d(i) = d(v_0) > d(v_1) > \cdots > 0$, i.e., we get infinitely many integers between $d(i)$ and $0$, which is a contradiction. Therefore the tracing must reach the source $r$ at a certain point and then terminate. Consequently, the sequence $(v_0, v_1, v_2, \ldots)$ is actually a finite one $(v_0, v_1, \ldots, v_t)$ which terminates at $v_t = r$. This sequence of nodes defines a path from $v_0 = i$ to $v_t = r$. Since $d(v_t) = d(r) = 0$, the weight (or length) of the path equals

$$
\begin{aligned}
& w(v_0, v_1) + w(v_1, v_2) + \cdots + w(v_{t-1}, v_t) \\
&= w(v_0, v_1) + w(v_1, v_2) + \cdots + (w(v_{t-1}, v_t) + d(v_t)) \\
&= w(v_0, v_1) + w(v_1, v_2) + \cdots + (w(v_{t-2}, v_{t-1}) + d(v_{t-1})) \\
&= \cdots \\
&= w(v_0, v_1) + d(v_1) \\
&= d(v_0) = d(i).
\end{aligned}
$$

So we get a path from $i$ to $r$ which has the weight $d(i)$. Therefore, $d(i) \geq d(i, r)$.

Next, we need only show that $d(i) \leq d(i, r)$ for any $i$. Let $\{d(i, r) \mid i \in V\} = \{d_0, d_1, \ldots, d_t\}$ with $0 = d_0 < d_1 < \cdots < d_t$. For any node $i$ with $d(i, r) = d_0$, $i$ must be the source $r$ and $d(i) = 0$. Thus $d(i) \leq d(i, r)$. Let $k$ be any integer with $0 \leq k < t$. Assume that for any node $i$ with $d(i, r) \leq d_k$, $d(i) \leq d(i, r)$. Then consider any node $i$ with $d(i, r) = d_{k+1}$. Let $(v_1, v_2, \ldots, v_s)$ be a shortest path connecting node $i$ and the source $r$ with $v_1 = i$ and $v_s = r$. Then $(v_2, v_3, \ldots, v_s)$ is a shortest path connecting node $v_2$ and $r$, and $d(v_2, r) < d(i, r) = d_{k+1}$. Therefore $d(v_2, r) \leq d_k$ and we have $d(v_2) \leq d(v_2, r)$ by the induction hypothesis. But then

$$
d(i) = d(v_1) = \min_{j \in N(i)} (d(j) + w(i, j)) \leq d(v_2) + w(i, v_2) \leq d(v_2, r) + w(i, v_2) = d(i, r).
$$

Thus, we have proved that for any $i \in V$, $d(i) \leq d(i, r)$. Consequently, $d(i) = d(i, r)$ for any $i \in V$.

Thus, as is obvious from the above theorem, there is actually only one legitimate state, and when the system is in the legitimate state, our problem is solved.

III. An Illustration

Figure 1 illustrates the execution of the algorithm. There are six states in Figure 1. In each state, the shaded nodes represent privileged nodes whereas the shaded node with a darkened circle stands for the privileged node selected by the central daemon to make a move.

IV. Correctness Proof

For the sake of presentation, (R1) is split into two rules:

(R1-a) $d(i) < \min_{j \in N(i)} (d(j) + w(i, j)) \rightarrow d(i) := \min_{j \in N(i)} (d(j) + w(i, j))$ and
(R1-b) $d(i) > \min_{j \in N(i)} (d(j) + w(i, j)) \rightarrow d(i) := \min_{j \in N(i)} (d(j) + w(i, j))$.
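The rules (R0), (R1-a) and (R1-b) can be tried out in a small simulation. The following is a minimal sketch, not the authors' implementation: the five-node weighted graph, the initial values, and the names `privileged_rule`, `move` and `stabilize` are all hypothetical, and the central daemon is modelled as a uniformly random choice among the privileged nodes.

```python
import random

# Hypothetical weighted graph (not the one in Figure 1): node -> {neighbor: weight}.
GRAPH = {
    "r": {"i": 4, "s": 1},
    "i": {"r": 4, "s": 2, "j": 2},
    "s": {"r": 1, "i": 2, "k": 3},
    "j": {"i": 2, "k": 5},
    "k": {"s": 3, "j": 5},
}


def privileged_rule(graph, source, d, i):
    """Return the name of the rule enabled at node i, or None if i is not privileged."""
    if i == source:
        return "R0" if d[i] != 0 else None
    target = min(d[j] + w for j, w in graph[i].items())
    if d[i] < target:
        return "R1-a"
    if d[i] > target:
        return "R1-b"
    return None


def move(graph, source, d, i):
    """Execute the rule enabled at node i (the caller ensures i is privileged)."""
    if i == source:
        d[i] = 0
    else:
        d[i] = min(d[j] + w for j, w in graph[i].items())


def stabilize(graph, source, d, rng):
    """Let a randomly choosing central daemon pick moves until no node is privileged."""
    moves = 0
    while True:
        enabled = [i for i in graph if privileged_rule(graph, source, d, i)]
        if not enabled:
            return moves
        move(graph, source, d, rng.choice(enabled))
        moves += 1


# Arbitrary (illegitimate) initial state, chosen for illustration only.
state = {"r": 4, "i": 3, "s": 2, "j": 5, "k": 4}
moves = stabilize(GRAPH, "r", state, random.Random(0))
print(moves, state)
```

For this graph the distances from `r` are 0, 3, 1, 5 and 4 for `r`, `i`, `s`, `j` and `k`, so by Theorem 1 the final `state` should hold exactly these values whatever order the daemon chooses. The number of moves depends on the daemon's choices.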
In view of the algorithm, the following Lemma 2 and Lemma 3 are obvious.

Lemma 2: (No deadlock) The system is deadlock-free in each illegitimate state.

Lemma 3: (Closure) No node is privileged when the system is in the legitimate state.

Next, we want to prove the convergence of the algorithm, that is, we want to show: Starting with any initial state, the system will converge to the legitimate state. So for the following discussion, we let the initial state of the system be fixed. For the sake of presentation in the following proofs, we define some terminology and design three bounded functions.

Since the system $G = (V, E)$ is a connected graph, a spanning tree $T$ of $G$ exists. If we choose the source $r$ to be the root, then $T$ becomes a rooted tree. For each node $i$ in the system, let $d_{init}(i)$ be the value of $d(i)$ in the initial state and let the value $d_u(i)$ be defined recursively by
(1) $d_u(r) = d_{init}(r)$; and
(2) for $i \neq r$, $d_u(i) = \max\{d_{init}(i), d_u(p) + w(i, p)\}$, where $p$ is the parent of $i$ in $T$.

Lemma 4: For each node $i$ in the system, $d(i) \leq d_u(i)$ at any time.

A node $i \neq r$ is called a turn node whenever $d(i) < \min_{j \in N(i)} (d(j) + w(i, j))$; otherwise, it is called a non-turn node. If $i$ is a turn node and $d(i) = k$, then it is called a $k$-turn node. By definition, $A(k)$ is the set of all $k$-turn nodes in the system and $t_k = |A(k)|$ is the cardinality of $A(k)$. Let $m = \max_{i \in V} d_u(i)$ and let $F_1 = (t_0, t_1, \ldots, t_m)$, $F_2 = \sum_{i \in V} d(i)$ and $F = (F_1, F_2)$. Note that all these functions, including $A(k)$, $t_k$, $F_1$, $F_2$ and $F$, have a common domain, the set of all global states. We compare the $F_1$-values as well as the $F$-values by lexicographic order. Thus, for any two global states $S_1$ and $S_2$, $F_1(S_1) < F_1(S_2)$ if and only if there is a $k \in \{0, 1, \ldots, m\}$ such that $t_j(S_1) = t_j(S_2)$ for any $j < k$ and $t_k(S_1) < t_k(S_2)$, whereas $F(S_1) < F(S_2)$ if and only if $F_1(S_1) < F_1(S_2)$ or [$F_1(S_1) = F_1(S_2)$ and $F_2(S_1) < F_2(S_2)$].
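The bounded function $F = (F_1, F_2)$ can be evaluated along an execution to watch it decrease. The sketch below rests on several assumptions: the graph and initial state are hypothetical, the spanning tree $T$ is taken to be a breadth-first tree rooted at the source (the definition allows any spanning tree), and the names `upper_bounds`, `potential` and `run` are illustrative only. Python compares tuples lexicographically, which matches the order on $F_1$ and $F$ defined above.

```python
import random
from collections import deque

# Hypothetical weighted graph: node -> {neighbor: weight}.
GRAPH = {
    "r": {"i": 4, "s": 1},
    "i": {"r": 4, "s": 2, "j": 2},
    "s": {"r": 1, "i": 2, "k": 3},
    "j": {"i": 2, "k": 5},
    "k": {"s": 3, "j": 5},
}
SOURCE = "r"


def target(d, i):
    """min over neighbors j of d(j) + w(i, j)."""
    return min(d[j] + w for j, w in GRAPH[i].items())


def privileged(d, i):
    return d[i] != 0 if i == SOURCE else d[i] != target(d, i)


def upper_bounds(d_init):
    """d_u computed along a breadth-first spanning tree rooted at the source."""
    du = {SOURCE: d_init[SOURCE]}
    queue = deque([SOURCE])
    while queue:
        p = queue.popleft()
        for i, w in GRAPH[p].items():
            if i not in du:  # p becomes the parent of i in the tree
                du[i] = max(d_init[i], du[p] + w)
                queue.append(i)
    return du


def potential(d, m):
    """Return F = (F1, F2) with F1 = (t_0, ..., t_m) and F2 = sum of all d(i)."""
    t = [0] * (m + 1)
    for i in GRAPH:
        if i != SOURCE and d[i] < target(d, i):  # i is a d(i)-turn node
            t[d[i]] += 1
    return (tuple(t), sum(d.values()))


def run(d_init, rng):
    """Simulate the central daemon and record F after every move."""
    d = dict(d_init)
    m = max(upper_bounds(d_init).values())
    values = [potential(d, m)]
    while True:
        enabled = [i for i in GRAPH if privileged(d, i)]
        if not enabled:
            return d, values
        i = rng.choice(enabled)
        d[i] = 0 if i == SOURCE else target(d, i)
        values.append(potential(d, m))


final, values = run({"r": 4, "i": 3, "s": 2, "j": 5, "k": 4}, random.Random(1))
strictly_decreasing = all(b < a for a, b in zip(values, values[1:]))
print(strictly_decreasing, final)
```

Indexing `t` by `d[i]` relies on Lemma 4, which keeps every $d(i)$ at or below $m$. In the final state no node is a turn node, so $F_1$ is the all-zero tuple and $F_2$ is the sum of the distances from the source.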
Obviously, all the $F$-values are bounded below by $(0, 0, \ldots, 0)$, and between any particular $F$-value and $(0, 0, \ldots, 0)$ there can be only finitely many possible $F$-values. The following lemmas lead eventually to Theorem 12, which claims the convergence of the algorithm. Since the proofs of all these lemmas are in the same spirit, we present in detail only that of Lemma 8.

Lemma 5: (1) A node which is a turn node right before the system makes a move cannot contribute to the increase of $t_k$, for any $k$, after the move of the system. (2) The source $r$ cannot contribute to the increase of any $t_k$.

Lemma 6: $F_1$ does not increase each time rule (R0) is executed in the system.

Lemma 7: $F_2$ decreases each time rule (R0) is executed in the system.

Lemma 8: $F_1$ decreases each time rule (R1-a) is executed in the system.

Proof: Let node $i$ be the node of the system which executes the rule and let $d(i) = l$ right before the execution.

(1) Since $i$ is an $l$-turn node right before the execution of rule (R1-a) and is a non-turn node after the execution, $i$ contributes to the decrease of $t_l$ by 1.

(2) By Lemma 5, the source and all those nodes which are turn nodes right before the execution do not contribute to the increase of any $t_k$ after the execution.

(3) If node $j \in V - (N(i) \cup \{i\})$ is a non-turn node right before the execution, then by the same argument as in (2) in the proof of Lemma 5, $j$ remains a non-turn node after the execution and therefore does not affect any $t_k$.

(4) If node $j \in N(i)$ is any non-turn node right before the execution, then $d(j) \geq \min_{k \in N(j)} (d(k) + w(j, k))$ right before the execution. If $d(i) + w(j, i) > \min_{k \in N(j)} (d(k) + w(j, k))$ right before the execution, then after the execution, $d(i)$ increases and hence $\min_{k \in N(j)} (d(k) + w(j, k))$ remains unchanged; therefore, $d(j) \geq \min_{k \in N(j)} (d(k) + w(j, k))$ still holds and $j$ remains a non-turn node. So in this case, $j$ does not affect any $t_k$. If $d(i) + w(j, i) = \min_{k \in N(j)} (d(k) + w(j, k))$ right before the execution, then $d(j) \geq d(i) + w(j, i) > d(i) = l$; therefore, after the execution, $j$ either remains a non-turn node or becomes an $s$-turn node, where $s = d(j) > l$. So in this case, $j$ either does not affect any $t_k$ or can only contribute to the increase of some $t_s$ with $s > l$.

From all of the above, we can see that $t_l$ decreases by at least 1 and $t_k$ remains unchanged for any $k < l$, and therefore $F_1$ decreases after the execution of rule (R1-a).

Lemma 9: $F_1$ does not increase each time rule (R1-b) is executed in the system.

Lemma 10: $F_2$ decreases each time rule (R1-b) is executed in the system.

Theorem 11: $F$ decreases each time rule (R0), (R1-a) or (R1-b) is executed in the system.

Proof: Obvious from Lemmas 6-10 above.

Theorem 12: The algorithm is self-stabilizing.

Proof: The convergence property of the algorithm follows from Lemma 2, Theorem 11, the fact that $F$ is bounded below by $(0, 0, \ldots, 0)$ and the fact that between the initial value and $(0, 0, \ldots, 0)$ there can be only finitely many possible $F$-values; the closure property of the algorithm follows immediately from Lemma 3.

References

[1] E. W. Dijkstra, Self-stabilizing systems in spite of distributed control, Communications of the Association for Computing Machinery, 17, 643-644 (1974).
[2] S. T. Huang and N. S. Chen, A self-stabilizing algorithm for constructing breadth-first trees, Information Processing Letters, 41, 109-117 (1992).

[Figure 1: six successive global states of an example system with source r and nodes i, j, k, s, from "The initial state" to "The legitimate state"; each state lists the privileged nodes and the node picked by the central daemon to make a move.]

Figure 1. An example which illustrates the execution of the algorithm.
