• 沒有找到結果。

MESSAGE COMPLEXITY OF THE TREE QUORUM ALGORITHM

N/A
N/A
Protected

Academic year: 2021

Share "MESSAGE COMPLEXITY OF THE TREE QUORUM ALGORITHM"

Copied!
4
0
0

加載中.... (立即查看全文)

全文

(1)

lEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 6 . NO. 8, AUGUST 1995 887

Short Notes

Message Complexity of the

Tree Quorum Algorithm

Shyan-Ming Yuan and Her-Kun Chang

Abstract-The tree quorum algorithm (TQA) uses a tree structure to generate intersecting (tree) quorums for distributed mutual exclusion. This paper analyzes the number of messages required to acquire a quo- rum in TQA. Let i be the depth of the complete binary tree used in TQA, and let Mi be the number of messages required to acquire a quorum or to determine that no quorum is accessible. We discuss Mi as a function of i and p, where p ( + < p < l ) is the probability that each site is opera- tional. Let Ci denote the average number of sites in the quorum that TQA finds. The analysis shows that, although both Mi and Ci increase without bound as i increases, M i / C i approaches to

9

as i increases. According to the result, an approximate close form for Mi is derived.

Index Terms-Distributed mutual exclusion, tree quorum algorithm, quorum size, message complexity.

I. INTRODUCTION

A distributed system consists of a set of sites which are loosely coupled by a computer network. One advantage of distributed sys- tems is resource sharing. That is the resources in a distributed system can be shared among the sites in the system. Examples of sharable resources are memory, peripheral, CPU, clock, etc. The sites in a distributed system may issue requests to a shared resource at arbitrary times. When two or more sites try to access the same resource at the same time, a conflict occurs. A mechanism is required to synchronize conflicting requests so that at most one site is allowed to access the resource at a time. This problem is known as distributed mutual ex- clusion [I], [2], [3], [4],

[SI.

A survey of various algorithms for mu- tual exclusion can be found in [4] and a simple taxonomy for dis- tributed mutual exclusion algorithms was reported in [5].

A central controller can be used to control mutually exclusive ac- cesses to a shared resource. All requests for the resource are sent to the controller and scheduled by the controller. Using a central con- troller is simple and easy to implement. However, the controller is vulnerable to site failure. When the controller fails, no access to the resource is allowed, i.e., the entire system is halted. It is desirable to reduce the probability that the system is halted by using more than one site to participate in the decision making. For example, majority consensus [ 6 ] can be used to achieve mutual exclusion wherein a site is allowed to access the resource if it can get permissions from a ma- jority of sites.

Quorum consensus is a generalization of majority consensus. Let

U

be the set of sites in a system. A quorum Q is a subset of

U

and each access is allowed to perform if it can get permissions from all sites in a quorum. To synchronize the accesses in a mutually exclu- sive way, the quorums must satisfy the following property:

For each pair of quorums Q l and Q2, Ql n Q2 # 0 .

Manuscript received Mar. 2, 1994; revised July 15, 1994.

S.-M. Yuan and H.-K. Chang are with the Department of Computer and In- formation Science, National Chiao Tung University, 1001 Ta Hsueh Road, Hsinchu 30050, Taiwan; e-mal [email protected].

IEEECS Log Number D95031.

Mutual exclusion is ensured by requiring the accesses to get per- missions from intersecting quorums. Since the quorums intersect with each other, it is impossible that two accesses can get permissions from two quorums at the same time.

The communication cost of a quorum consensus algorithm can be measured by the following metrics:

message complexity-expected number of messages that the al- gorithm uses to acquire a quorum or to determine no quorum is accessible.

quorum size-expected number of sites in the quorum that the algorithm finds.

So far as we know, the communication cost of each quorum consensus algorithm proposed in the literature is estimated by the quorum size. The quorum size in majority consensus is

[?I.

The tree quorum algorithm (TQA) uses a tree structure to generate tree quorums [I]. The size of the tree quorums varies from log N to In general, the communication cost can be measured by message complexity more precisely than by quorum size. It was assumed in [ I ] that the number of messages required to construct a quorum is directly proportional to the size of the quorums. That is (in terms of this paper) message complexity is proportional to quorum size. The assumption motivates us to study the relationship between message complexity and quorum size. Let Mi be the message complexity of an

i level (complete binary) tree, and let C, be the quorum size of the i

level tree. We discuss M i as a function of i and p, where p

(4

< p < 1) is the probability that each site is operational. To verify the assumption, an asymptotic analysis of the ratio of message com- plexity to quorum size, R, = M , / C i , is presented. The analysis shows that, although both Mi and C, increase without bound as

i

increases, M i / C i approaches to

?

as i increases.

Although both Mi and Ci can be computed by recurrence equa- tions, Ci has a close form but

M i

has no close form. An important implication of the analytic result is:

As i increases, Mi/Ci

=+

and

M i

=?C,. That is, an approxi- mate close form for Mi can be derived.

The remainder of the paper is organized as follows. Section I1

briefly reviews TQA. In Section 111, message complexity of TQA is analyzed and an asymptotic analysis of the ratio of message com- plexity to quorum size is presented. Some concluding remarks are given in the final section.

[+I.

11.

TREE

QUORUM

ALGORITHM

The model in the analysis is described as follows. The sites are as- sumed to be fully connected by perfect links. When a request is sent to a site, a reply is sent if the site is operational. If the site has failed, no reply is sent.

The TQA uses a tree structure to generate (tree) quorums. The analysis in this paper consider only complete binary trees. For a bi- nary tree, a tree quorum (recursively) consists of

1 ) the root and a tree quorum of the left or right subtree, or

(2)

888 IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 6, NO. 8, AUGUST 1995

2 ) a tree quorum of the left subtree and a tree quorum of the right

It was shown in [ I ] that there is a nonnull intersection between each pair of (tree) quorums. Thus mutual exclusion is ensured by requiring that each access to get permissions from any quorum of sites.

The sites are either operational or failed. The state of a site (operational or failed) is independent to the others. The probability that a site is operational, is referred to as the availability of the site. The availability of a tree is the probability that a quorum can be ac- quired from the tree.

subtree.

The analysis uses the following notations

p : availability of a single site, 1/2

<

p < 1 , Ai : availability of an

i

level tree, i 2 0, Ci : quorum size of an i level tree, i 2 0, Mi : message complexity of an i level tree,

i

2 0, Ri : M i / C i , i 2 0 .

The availability of a tree is the probability that a quorum can be ac- quired from the tree. Thus the availability of a binary tree is the prob- ability that

1) the root is operational and a tree quorum of the left or right 2 ) the root is failed, a tree quorum of the left subtree is available

subtree is available, or

and a tree quorum of the right subtree is available. Formally, A. = p and Ai, 1 5 i I k , is given as

for all i 2 1 . If p = 1, Ai = 1 , for all i 2 0. So the analysis considers only the case that

3

< p < 1.

The quorum size of a 0 level tree (i.e., a tree consists of only one site) is 1 , i.e.,

CO

= 1. Consider constructing a tree quorum at level

i,

i 2 I . If the root is operational and thus can be included in the quorum, the quorum size is I

+

Ci-,; otherwise, t h e quorum size is 2 C i _ l . Thus, for i 2 1 , Ci can be computed by the following re- currence:

ci

= p(l

+

CiJ

+

(1 - p)2C,-,

( 2 )

= ( 2

-

P)C,-I

+

p 111.

ANALYSIS

OF

TQA

In this section, message complexity of TQA is analyzed. An as- ymptotic analysis of the ratio of message complexity to quorum size, Ri, is presented. It is shown that R; converges to

9

as i increases.

A. Message Complexity

The recursive definition of the tree quorums in the previous sec- tion also implies a quorum construction algorithm. That is, if the root is operational, then the construction algorithm tries to construct a (tree) quorum from left or right subtree; otherwise, it must construct quorums from both left and right subtrees. In other words, the quo- rum construction algorithm first visits the root and then traverses the left and/or right subtrees (in some specified order or randomly).

First consider constructing a tree quorum from a 0 level tree, (i.e., a tree consists onlv one site).

1) If the site is operational, two messages are transmitted-one re- 2 ) Otherwise, only one message is sent-no reply.

quest and one reply.

Thus,

M , = 2 p + ( l - p )

= 1 + p (3)

Consider constructing an

i

level tree quorum, i 2 1. Without loss of generality, we assume that the quorum construction algorithm tries to acquire a quorum (recursively) by the order: root, left subtree and right subtree. If the root is operational, two messages are transmitted; otherwise, only one message is sent. Thus, in average, 1

+

p messages are sent. The messages required to traverse the subtrees are described below:

1) if the root is operational and a quorum of the left subtree is available-only messages for traversing the left subtree are sent, i.e., Mi-] messages are needed;

2 ) if the root is failed and no quorum of the left subtree is avail- able-since it is impossible to acquire a quorum, only messages for traversing the left subtree are sent, i.e., Mi-] messages are needed;

3) otherwise-messages are required to traverse both left and right subtrees, Le., 2 messages are required.

Formally, for i 2 I , Mi = (1

+

p)

B.

Asymptotic Analysis of Ri

LEMMA 1. (Lemma 2 of [ 7 ] ) If0 < axi

<

1,for all i , rhen

(1+ax,)(1 + a x , ) . . . ( l + a x n ) < e u ~ r '

LEMMA 2. (Lemma 3 of [ 7 ] ) For TQA,

9

< p < 1, Ai has thefollowing properties:

1) I - A ; s ( ~ - p ) ( ~ + p - 2 p 2 ) ' , f o r a l l i t ~ .

2 ) (1 - A i ) + ( l - A i + l ) + . . . + ( l +A,+,) <-(I + p - 2 p 2 ) l , f o r all

i,

m 2 0.

LEMMA 3. (Lemma 4 of [ 7 ] ) For TQA, Ci has thefollowing prop- erties:

LEMMA 4. For TQA, i 2 0,

R i 2 1 + p

PROOF. The proof is shown in the appendix.

(3)

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 6, NO. 8. AUGUST 1995 889 By Lemma 4, R;-l 2 1

+

p , we obtain According to Lemma 5 (2p

-

1)(1 - A j ) 1-p2

a;+,

<

(1

+

p ) e p ( 2 - p )

+

-

( 2 - P ) Ci+l Therefore,

IRj+,,,

-

Ril= 16i+,,,

+

6i+,,,-l+...+6;+ll

&

2 p - 1 (1 + p ) e P ( 2 - P ) ( ( 1 - ~ i ) + . . . + ( i - ~ i + ,

_,))

<-

(2 -

P)

c , + l

c,+,

According to Lemma 2 and Lemma 3

4

1 - P (1

+

p - 2p2y 2 p - 1 (2

-

P)

l ~ ~ + ~

- ~ ~-(I l <

+

p ) e A - P )

-

P(2P

-

1) +(I- p 2 ) ? L - 1 1 - p (2-p)'+l where x = 1 + p - 2 p 2 b = l + p 1 y = -2 - P 0

LEMMA 6.

If

5

< p

<

1, IR,+, - RII < ax' +by', for all i 5 0, m 5 1, THEOREM 1. ~ f + < p

c

1,

lim Ri =

-

1 + P where I+- D x = 1

+

p - 2p2 b = l + p y = - 1 2 - P PROOF. Let = Ri+l - R;, then

(1

+

p

+

A; - 2pA;)M;

+

(1

+

p ) M; 6 . = --

c;

1+1 (2

-

PIC' +

P

-

Ci

(2p-1)(1-Aj)Mj + ( l + p ) - p % + P

By Lemma 4, Ri = Mi/Cj 2 1

+

p , we obtain

(2p-1)(1-Aj)Mj ( 1 - p 2 )

+-

(2

-

P)Ci Ci+l

U

PROOF. Let a , x, b and y be defined as in Lemma 6. Since p

>

5,

x = ( l + p - 2 p 2 ) < 1 and y = & < l . For any E > 0, there is a

positive integer N such that If i > Nand m t 1, then

U?;+,,,

-

Ril <ax'+ b y i <

a$/+

b y <

E Thus, the sequence ( R i ) is convergent.

Let 1imi+ R; = Iimj+- Ri-l = y , that is,

C. Discussion

Giving p , the probability that each site is operational, both Mi and Ci increase without bound as i increases. According to Lemma 3, Ci, i t 0, has a close form:

(4)

890 IEEE TRANSACTIONS ON PARALLEL A N D DISTRIBUTED SYSTEMS, VOL. 6. NO. 8, AUGUST 1995

(5)

On the other hand, Mi has no close form, though

M i

can also be com- puted by recurrence relations. Using the analytic result in the previ- ous subsection, an approximate close form for Mi can be derived as

follows. As i increases, M i / C j approaches to

y

and thus Mi can be approximated by

y

C;. From

(3,

we obtain

((2 - PI’ - P ) ( l + P )

M .

.= (6)

PO

-

P )

IV.

CONCLUSION

The communication cost can be measured by message complexity more precisely than by quorum size. So far as we know, quorum size is used to measure the communication cost of each quorum consen- sus proposed in the literature. It was assumed that message complex- ity is directly proportional to quorum size [l].

The assumption motivates us to study the relationship between mes- sage complexity and quorum size. To verify the assumption, an asymp- totic analysis of the ratio of message complexity to quorum size is pre- sented. It is shown that the ratio converges to

y ,

where p is the prob- ability that each site is operational. The result implies two things:

1) Giving p , the probability that each site is operational, the mes- sage complexity is proportional to the quorum size, if the tree is sufficiently large.

2) Since the quorum size can be evaluated by a close form, an ap- proximate close form for the message complexity can be derived.

APPENDIX

PROOF OF LEMMA 4.

The proof is shown by induction.

1) Induction base: Ro = Mi&, = 1

+

p .

2) Induction hypothesis: Rj-l = M,-l/Ci-l 2 1

+

p , i.e.,

3) Induction step: From (2) and (4), we have, for i 2 1,

Mi-,

2 ( 1

+

p)C;-, , i 2 1.

0

ACKNOWLEDGMENTS

The authors would like to express their gratitude to the referees for their helpful comments. This work was supported in part by National Science Council of R.O.C. under the grant NSC82-0408-E009-289.

REFERENCES

[ I ] D. Agrawal, A. El Abbadi, “An efficient and fault-tolerant solution for distributed mutual exclusion,” ACM Trans. Computer System, vol. 9, no. 1, pp. 1-20, 1991.

H. Garcia-Molina, D. Barbara, “How to assign votes in a distributed system,” J. ACM. vol. 32, no. 4, pp. 841-860, 1985.

T. lbaraki, T. Kameda, “A theory of coteries: Mutual exclusion in dis- tributed systems,” IEEE Truns. Parallel & Distributed Systems, vol. 4, no. 7, pp. 779-794, 1993.

M. Raynal, Algorithmfor mutuul exclusion. The MIT Press, 1986. M. Singhal, “A taxonomy of distributed mutual exclusion,” J . Parullel andDistributed Computing, vol. 15, pp. 94-101, May, 1993. R.H. Thomas, “A majority consensus approach to concurrency control for multiple copy databases,” ACM Truns. Darubuse Systems, vol. 4, no. 2, pp. 180-209, 1979.

H.K. Chang, S.M. Yuan, “Message complexity of the tree quorum al- gorithm for distributed mutual exclusion,” Proc. 1994 IEEE Int’l Con$ on Distributed Computing System, pp. 76-80, 1994.

[2] [3] [4] [SI [6] [7]

Performance of Barrier Synchronization

Methods in a Multiaccess Network

Shun Yan Cheung and Vaidy S . Sunderam

Abstracf-Barrier synchronization is a commonly used primitive in parallel processing. In this paper, we present different algorithms for barrier synchronization on the widely prevalent multiaccess bus net- work, and derive analytical performance metrics for each of the pro- posed schemes, which are then compared against simulation results.

Index Terms-Distributed computing, parallel virtual machine

(PVM), barrier synchronization, multiaccess networks, performance evaluation.

I. INTRODUCTION

Barrier synchronization is a well-known and frequently used primitive in parallel processing. A barrier is a powerful mechanism that permits synchronization among a large number of cooperating processes in a parallel program, while being straightforward in terms of programming primitive(s) as well as semantics. Infor- mally, a barrier is a function that causes the invoking process in a parallel program to be suspended until all other processes also invoke the function, at which point all processes are allowed to continue. The simplest form of barrier synchronization assumes a fixed number of related processes in a parallel application that wish to synchronize periodically; in such situations, barriers are pro- vided as parameter-less function calls. However, variants that allow a “quorum” of participants to satisfy the barrier, or those that permit “named” barriers, also exist.

The barrier primitive originally evolved on shared-memory multiprocessors, but are currently used widely on distributed- memory multiprocessors also. Algorithms to implement barriers, as well as studies of their performance have received substantial at-

Manuscript received July 9, 1993; revised Jan. 5 , 1995.

S.-Y. Cheung and V.S. Sunderam are with the Department of Mathematics and Computer Science, Emory University, Atlanta, CA 30322; e-mail: [cheung, vss] @mathcs.emory.edu.

IEEECS Log Number D95016. 1045-9219/95$04.00 0 1995 IEEE

參考文獻

相關文件

• Hence it may surprise you that most of the complexity classes that we have seen so far have maximal elements. a Cook (1971) and

A factorization method for reconstructing an impenetrable obstacle in a homogeneous medium (Helmholtz equation) using the spectral data of the far-field operator was developed

A factorization method for reconstructing an impenetrable obstacle in a homogeneous medium (Helmholtz equation) using the spectral data of the far-eld operator was developed

In particular, we present a linear-time algorithm for the k-tuple total domination problem for graphs in which each block is a clique, a cycle or a complete bipartite graph,

The measurement basis used in the preparation of the financial statements is historical cost except that equity and debt securities managed by the Fund’s

The measurement basis used in the preparation of the financial statements is historical cost except that equity and debt securities managed by the Fund’s

• 在一台每秒可進行

The execution of a comparison-based algorithm can be described by a comparison tree, and the tree depth is the greatest number of comparisons, i.e., the worst-case