Two decentralized algorithms for strong interaction fairness for systems with unbounded speed variability


Yuh-Jzer Joung ∗

Department of Information Management, National Taiwan University, Taipei, Taiwan, ROC

Received September 1997; revised August 1998

Communicated by M. Nivat

Abstract

We present two randomized algorithms, one for message passing and the other for shared memory, that, with probability 1, schedule multiparty interactions in a strongly fair manner. Both algorithms improve upon a previous result by Joung and Smolka (proposed in a shared-memory model, along with a straightforward conversion to the message-passing paradigm) in the following aspects: first, processes’ speeds as well as communication delays need not be bounded by any predetermined constant. Secondly, our algorithms are completely decentralized, and the shared-memory solution makes use of only single-writer variables. Finally, both algorithms are symmetric in the sense that all processes execute the same code, and no unique identifier is used to distinguish processes. © 2000 Elsevier Science B.V. All rights reserved.

Keywords: Randomized algorithm; Strong interaction fairness; Weak interaction fairness; Multiparty interaction

1. Introduction

Since Hoare introduced CSP [13], interactions and nondeterminism have become two fundamental features in many programming languages for distributed computing (e.g., Ada [34], Script [11], Action Systems [3], IP [10], and DisCo [15, 14]) and algebraic models of concurrency (e.g., CCS [24], SCCS [23], LOTOS [7], π-calculus [25, 26]). Interactions serve as a synchronization and communication mechanism: the participating processes of an interaction must synchronize before embarking on any data transmission. Nondeterminism allows a process to choose one interaction to execute, from a set of potential interactions it has specified.

∗ E-mail address: [email protected] (Y. Joung).

Fig. 1. A replica system.

For example, consider a replica system consisting of two client processes C1 and C2, and two replica managers M1 and M2. The two clients C1 and C2 interact with the managers M1 and M2, respectively, to access the database. Moreover, from time to time the two managers interact with each other to update their replica data (Fig. 1). The system can be described by the following program written in CSP’s style, except that input/output commands are now replaced by interactions (where i = 1, 2):

Ci:: ∗ [ accessi → local-computing; ]

Mi:: ∗ [ accessi → local-computing;

gossip → local-computing; ]

In the program, access_i designates the interaction between C_i and M_i, and gossip designates the interaction between M1 and M2. Like CSP’s input/output guards, interactions can also serve as guards in an alternative/repetitive command, and an interaction guard can be executed only if its participating processes are all ready for the interaction. So the replica manager M1 can establish either an interaction with its client C1 or an interaction with its peer M2; if both targets are ready, then the choice is nondeterministic. Interactions and nondeterminism therefore provide a higher level of abstraction by hiding execution-dependent synchronization activities in the implementation level. Note that, although each interaction in the above example involves only two processes (as in CSP and Ada), interactions can also be multiparty, allowing an arbitrary number of processes to establish an interaction. Multiparty interactions provide a higher level of abstraction than biparty interactions as they allow interactions in some applications to be naturally represented as an atomic unit. For example, the natural unit of process interactions in the famous Dining Philosophers problem involves a philosopher and its neighboring chopsticks, i.e., a three-party interaction. More examples can be found in [10], and a taxonomy of programming languages offering linguistic support for multiparty interaction is presented by Joung and Smolka [18].
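As a rough illustration of interaction guards, the replica example can be modelled as a table of participant sets: an interaction is enabled exactly when all of its participants are ready, and a choice among several enabled interactions is left open. The names `PARTICIPANTS`, `enabled`, and `choose`, and the use of a seeded random generator to stand in for nondeterminism, are assumptions made for this sketch only.

```python
import random

# Participant sets of the replica example; the interaction names follow the text.
PARTICIPANTS = {
    "access1": {"C1", "M1"},
    "access2": {"C2", "M2"},
    "gossip": {"M1", "M2"},
}


def enabled(ready):
    """Return (sorted) the interactions whose participants are all in `ready`."""
    return sorted(x for x, ps in PARTICIPANTS.items() if ps <= ready)


def choose(ready, rng):
    """Pick one enabled interaction arbitrarily; return None if none is enabled."""
    candidates = enabled(ready)
    return rng.choice(candidates) if candidates else None


if __name__ == "__main__":
    rng = random.Random(0)  # hypothetical seed, used only for reproducibility
    ready = {"C1", "M1", "M2"}
    print(enabled(ready))      # access1 and gossip both need M1
    print(choose(ready, rng))  # either of the two may be chosen
```

With C1, M1, and M2 ready, both access1 and gossip are enabled, which is the situation in which M1's choice is nondeterministic.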

Intuitively, since a process may be ready for more than one interaction at a time, the implementation of interaction guards must guarantee a certain level of fairness to avoid a prejudicial scheduling that favors a particular process or interaction. For example, the notion of weak interaction fairness (WIF) is usually imposed to ensure that an interaction that is continuously enabled will eventually be executed. (An interaction is enabled if its participants are all ready for the interaction, and is disabled otherwise.) To illustrate, the following execution of the above replica program does not satisfy WIF, as interaction access2 is continuously enabled forever but is never executed (note that, in the program, when a process is ready for interaction, it is ready to execute any interaction of which it is a member):

All four processes are ready for interaction initially, and then the following scenario is repeated forever:

• C1 and M1 establish access1;

• C1 and M1 exit access1 and then respectively become ready again.

WIF has been widely implemented in CSP-like biparty interactions [8, 31, 29, 5, 33], as well as in the multiparty case [28, 4, 27, 20, 17].

Although WIF can ensure some form of liveness, it is sometimes too weak to be useful. For example, consider another execution of the replica program:

All four processes are ready for interaction initially, and then the following scenario is repeated forever:

• C1 and M1 establish access1, and C2 and M2 establish access2;

• the four processes respectively leave their interactions and become ready again.

The computation satisfies WIF because no interaction is continuously enabled forever. (Recall that an enabled interaction becomes disabled when one of its participants executes an interaction.) However, in the computation the two replica managers never establish an interaction, regardless of the infinitely many opportunities they have.

On the other hand, the above execution can be prevented if the implementation were to satisfy strong interaction fairness (SIF), meaning that an interaction that is infinitely often enabled is executed infinitely often. SIF is much stronger than most known fairness notions (including WIF) [2], and therefore induces more liveness properties. Unfortunately, given that (1) a process decides autonomously when it will be ready for interaction, and (2) a process’s readiness for interaction can be known by another only through communications, and the time it takes two processes to communicate is nonnegligible, SIF cannot be implemented by any deterministic algorithm [32, 16]. Note that the impossibility result holds even if interactions are strictly biparty.
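The difference between the two fairness notions can be made concrete by replaying finite prefixes of the two executions discussed above. The sketch below is only an illustration on finite traces (fairness itself is a property of infinite computations). The event encoding, the function `summarize`, and the two round descriptions are assumptions of this sketch; in particular, the second round assumes that access1 and access2 are both established before all four processes become ready again.

```python
PARTICIPANTS = {
    "access1": {"C1", "M1"},
    "access2": {"C2", "M2"},
    "gossip": {"M1", "M2"},
}
ALL = {"C1", "C2", "M1", "M2"}

# One round of each repeated scenario (assumed encoding of the two executions).
WIF_VIOLATION = [("exec", "access1"), ("ready", "C1"), ("ready", "M1")]
SIF_VIOLATION = [("exec", "access1"), ("exec", "access2"),
                 ("ready", "C1"), ("ready", "C2"), ("ready", "M1"), ("ready", "M2")]


def enabled_in(ready):
    return {x for x, ps in PARTICIPANTS.items() if ps <= ready}


def summarize(rounds, round_events):
    """Replay `round_events` `rounds` times, starting with every process ready.

    Returns, per interaction: how often it was executed, how many separate
    times it became enabled, and whether it was enabled in every state seen.
    """
    ready = set(ALL)
    stats = {x: {"executed": 0, "episodes": 1, "always": True} for x in PARTICIPANTS}
    for _ in range(rounds):
        for kind, arg in round_events:
            before = enabled_in(ready)
            if kind == "exec":
                assert arg in before, "only an enabled interaction can be executed"
                ready -= PARTICIPANTS[arg]   # participants leave the ready state
                stats[arg]["executed"] += 1
            else:
                ready.add(arg)               # process `arg` becomes ready again
            after = enabled_in(ready)
            for x in PARTICIPANTS:
                if x not in after:
                    stats[x]["always"] = False
                elif x not in before:
                    stats[x]["episodes"] += 1
    return stats


if __name__ == "__main__":
    print(summarize(5, WIF_VIOLATION)["access2"])  # always enabled, never executed
    print(summarize(5, SIF_VIOLATION)["gossip"])   # enabled again and again, never executed
```

In the first trace access2 stays enabled throughout and is never executed, which is the pattern WIF rules out. In the second trace no interaction stays enabled, so WIF has nothing to say, yet gossip becomes enabled once per round and is never executed, which is the pattern SIF rules out.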

To cope with the impossibility phenomenon, Joung and Smolka [19] propose a randomized algorithm for scheduling multiparty interactions that guarantees SIF with probability 1. That is, if an interaction is enabled infinitely often, then the probability is 1 that it will be executed infinitely often. The algorithm is an extension of Francez and Rodeh’s randomized algorithm [12] for CSP-like biparty interactions to the multiparty case. Both algorithms use a very basic idea, “attempt, wait, and check”, to establish interactions. That is, when a process is ready for interaction, it first “attempts” to establish an interaction by accessing some shared variables, and then “waits” for some δ time before it “checks” if its partners are likewise willing to establish the interaction.¹ Francez and Rodeh were able to claim only weak interaction fairness, and only under the limiting assumption that the time it takes to access a shared variable (i.e., the communication delay) is negligible compared to δ. Joung and Smolka remove the negligible delay assumption, but they require the delay be bounded by some constant η_max so that δ can then be appropriately determined.² The algorithm therefore does not work for systems where such a bound cannot be known in advance. Moreover, given that the algorithm’s time complexity is in linear proportion to δ, the performance may be significantly decreased if the average communication delay is much less than the upper bound η_max.

¹ A similar concept is used by Reif and Spirakis [30], albeit the δ-parameter in their randomized algorithm is more deliberately calculated to meet the real-time response requirement. Like Francez and Rodeh’s algorithm, however, Reif and Spirakis’s algorithm is proposed only for biparty interactions, and guarantees WIF with probability 1.

Moreover, like Francez and Rodeh’s algorithm, Joung and Smolka’s algorithm is presented in a shared-memory model where processes communicate by reading from and writing to shared variables. They also have to use a multi-writer variable (meaning that a shared variable can be read and written by more than one process) for each interaction in order to resolve the mutual exclusion and concurrency problem between the participating processes of the interaction. While it is true that multi-writer variables can be implemented from single-writer variables (where a single-writer variable allows only one process to write),³ some extra cost in efficiency would be required in the conversion.

The main contributions of this paper are two randomized algorithms for the interaction scheduling problem, one for message passing and the other for shared memory. Like Joung and Smolka’s algorithm, our algorithms are presented for a multiparty setting, and use the concept of “attempt, wait, and check” to establish interactions. However, we do not assume any predetermined bound on the length of each process step, where a step is a non-zero finite time interval in which a single instruction is instantaneously executed at the last moment of the interval. (A process’s speed is a measure of the process’s steps such that the slower the speed, the more time it takes to execute a step.) Rather, a process’s δ parameter is dynamically adjusted according to other processes’ speeds. Therefore, the system’s performance is determined by the actual speeds of the processes, not by a worst-case scenario of the system. We show that our algorithms guarantee SIF with probability 1, so long as the following two conditions are satisfied: (A1) processes are not hanging (a process is hanging if it stops executing its instructions, or there exists an infinite sequence of steps of the process with monotonically increasing length),⁴ and (A2) a process’s transition to a state ready for interaction does not depend on the random choices performed by other processes. Note that the no-hanging assumption implies that the length of each process’s step will eventually be bounded throughout an infinite computation of the system. However, unlike Joung and Smolka’s algorithm, this bound may vary from computation to computation and, therefore, no fixed bound is assumed for all possible computations of the system.

² As noted by Joung and Smolka [19], the impossibility result for SIF holds as well even if the communication delay is bounded by some constant.

³ For references on the related issues, see the book Distributed Algorithms by Lynch [22].

⁴ A similar slowdown situation has been addressed by Afek et al. [1] in solving the sequence transmission

Our algorithms are completely decentralized, meaning that no coordinating process is used in either of them. In particular, for the shared-memory algorithm, only single-writer variables are used, as opposed to Joung and Smolka’s algorithm, for which a multi-writer variable has to be associated with each interaction. Our algorithms are also symmetric in the sense that all processes execute the same code, and no unique identifiers are used to distinguish processes. Symmetry is particularly useful if we are to extend the algorithms to an environment where processes can be dynamically created and destroyed. Joung and Smolka have also described how to convert their algorithm into a message-passing paradigm. However, this conversion would also make the algorithm asymmetric, because some processes are distinguished from the others to maintain the multi-writer variables used in their algorithm.

To help understand our algorithms, we have chosen to present the message-passing solution first. The algorithm is simpler because a communication imposes a causal ordering between the initiator (usually the information provider) and its target (the information recipient), and the send and receive commands in the message-passing paradigm implicitly assume this causal ordering in their executions. By contrast, a more sophisticated technique is required in a completely decentralized shared-memory model to ensure that two asynchronous processes engaged in a communication are appropriately synchronized so that the information provider will not overwrite the information before the other process has observed the content. Both algorithms share the same idea in the dynamic adjustment of the δ-parameter.

The rest of the paper is organized as follows. Section 2 presents the multiparty interaction scheduling problem. The message-passing solution is presented in Section 3, and the shared-memory solution in Section 4. Concluding remarks are offered in Section 5.

2. The problem

We assume a fixed set of sequential processes p_1, ..., p_n which interact by engaging in multiparty interactions X_1, ..., X_m. Each multiparty interaction X_i involves a fixed set of processes P(X_i). Initially, each process in the system is in its local computing phase, which does not involve any interaction with other processes. From time to time, a process becomes ready for a set of potential interactions of which it is a member. After executing any one of the potential interactions the process returns to its local computing phase.

Assume that a process starting an interaction will not complete the interaction until all other participants have started the interaction. Assume further that a process will eventually complete an interaction if all other participants have started the interaction. The multiparty interaction scheduling problem is to devise an algorithm to schedule interactions satisfying the following requirements:

Synchronization: If a process p_i starts X, then all other processes in P(X) will eventually start X. Note that by the above two assumptions (that a process will not complete an interaction until all other participants have started the interaction, and that a process will eventually complete an interaction if all other participants have started the interaction), the synchronization requirement implies that when a process starts X, all participants of X will eventually complete an instance of X.

Exclusion: No two interactions can be in execution simultaneously if they have a common member. An interaction X is in execution if every process in P(X ) has started X , but none of them has yet completed its execution of X .

Strong interaction fairness: If an interaction is enabled infinitely often, then it will be executed infinitely often. (Recall that an interaction is enabled if its participants are all ready for the interaction, and becomes disabled when one of them starts an interaction.)

3. A message-passing solution

3.1. The algorithm

We now present our solution for the multiparty interaction scheduling problem in the message-passing paradigm. To help explain our algorithm, we first present a simplified version of the algorithm, which satisfies the synchronization and exclusion requirements of the problem, but does not satisfy strong interaction fairness unless the length of a process step is bounded by some predetermined constant. The restriction will be lifted later when we present the full version of the algorithm.

In the simplified version of the algorithm, each process p_i is associated with a unique token T_i. When p_i is ready for interaction, it randomly chooses one interaction X from the set of potential interactions it is willing to execute, and informs each process in P(X) of p_i’s interest in executing X. To do so, p_i makes |P(X)| copies of T_i, tags them with “X”, and sends one copy to each participant of X (including p_i itself). When all of the recipients have acknowledged the receipt of T_i, p_i waits for some δ time, hoping that every other process in P(X) will also send p_i a copy of its token tagged with “X” in this time interval.

If for each p_j ∈ P(X), p_i does receive a copy of T_j, and each copy is tagged with “X”, then p_i has successfully observed the establishment of X (because the processes in P(X) all agree to execute X). Then p_i changes the tags of the tokens to “success”. When δ expires, p_i retrieves its tokens from each p_j ∈ P(X) by sending p_j a message request, and then starts X when the tokens are returned. (Note that p_i will also receive a copy of T_i tagged with “success” from itself.)

If p_i does not receive copies of tokens tagged with “X” from all processes in P(X) before δ expires, then p_i also retrieves its tokens by sending each p_j a message request. When the tokens are returned, p_i checks if any one of them is tagged with “success”. If so, then the process returning this token has observed the establishment of X. So p_i also starts X. If none of the tokens is tagged with “success”, then p_i must give up on X, discard all duplicated copies of T_i, and return to the beginning of this procedure.
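The two decisions just described (retagging a complete set of matching tokens, and inspecting the returned copies once the wait is over) can be sketched in isolation, leaving out message passing and timing. The `Token` class and the function names below are hypothetical and serve only this illustration.

```python
from dataclasses import dataclass

SUCCESS = "success"


@dataclass
class Token:
    owner: str  # process that issued this copy of its token
    tag: str    # name of the attempted interaction, or SUCCESS


def mark_if_established(token_pool, attempt, participants):
    """Retag the tokens if the pool holds one tagged `attempt` from every participant.

    Returns True when the holder has observed the establishment of `attempt`.
    """
    matching = [t for t in token_pool if t.tag == attempt and t.owner in participants]
    if {t.owner for t in matching} != set(participants):
        return False
    for t in matching:
        t.tag = SUCCESS
    return True


def decide_after_timeout(returned_copies):
    """Start the interaction iff at least one returned copy is tagged SUCCESS."""
    return any(t.tag == SUCCESS for t in returned_copies)


if __name__ == "__main__":
    members = {"p1", "p2", "p3"}
    pool_at_p1 = [Token("p1", "X"), Token("p2", "X"), Token("p3", "X")]
    print(mark_if_established(pool_at_p1, "X", members))  # p1 observes X
    # p2 retrieves its copies: the one held by p1 now says success.
    print(decide_after_timeout([pool_at_p1[1], Token("p2", "X")]))
```

A single copy tagged with success is enough for its owner to start the interaction, which is why it suffices for one participant to observe the complete set of tokens.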

1   ∗[ ¬ready → do local computations; ready := true;

2      ready ∧ ¬commit ∧ attempt = nil →

3          randomly select an interaction X for which p_i is ready;

4          attempt := X;

5          send a copy of T_i tagged with “X” to each p_j ∈ P(X);

6          wait until each p_j ∈ P(X) acknowledges the receipt of the token;

7          init−ck := clock(p_i); /* start timing δ */

8      receive T_j from p_j →

9          add T_j to token−pool;

10         send an acknowledgment to p_j;

11     ∀p_j ∈ P(attempt) : T_j ∈ token−pool ∧ tag(T_j) = attempt →

12         for each such T_j, tag(T_j) := success;

13     receive request from p_j →

14         remove T_j from token−pool and send it back to p_j;

15     clock(p_i) − init−ck ≥ δ → /* δ expires */

16         send each p_j ∈ P(attempt) a message request;

17         wait until each p_j returns its copy of T_i;

18         if any returned T_i is tagged with success

19             then commit := true;

20             else attempt := nil;

21         delete the returned tokens;

22         init−ck := ∞;

23     commit →

24         execute attempt;

25         attempt := nil;

26         commit := false;

27         ready := false;

28  ]

Fig. 2. An algorithm for multiparty-interaction scheduling that may not guarantee strong interaction fairness if the length of a process step is unbounded.

The algorithm to be executed by each p_i is given in Fig. 2 as a CSP-like repetitive command consisting of guarded commands. Each guarded command is of the form “b; message → S”. A guarded command can be executed only if it is enabled; i.e., its boolean guard b evaluates to true and the specified message has arrived. Both the boolean guard and the message guard are optional. The execution receives the message and then the command S is executed. If there is more than one enabled guarded command, then one of them is chosen for execution, and the choice is nondeterministic. We do, however, require that a guarded command that is continuously enabled be executed eventually.
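The execution model of such a repetitive command can be sketched as a loop that repeatedly picks one enabled guarded command. The sketch below is an assumption-laden simplification: message guards are left out, the names `run_guarded` and `COMMANDS` are hypothetical, and the uniform random pick only makes a continuously enabled command execute eventually with probability 1, which is weaker than the requirement stated above.

```python
import random


def run_guarded(commands, state, rng, max_steps):
    """Repeatedly execute one enabled guarded command, chosen at random.

    `commands` is a list of (guard, action) pairs acting on the dict `state`.
    Stops when no guard holds or after `max_steps` steps; returns the number
    of steps taken.
    """
    steps = 0
    while steps < max_steps:
        enabled = [action for guard, action in commands if guard(state)]
        if not enabled:
            break
        rng.choice(enabled)(state)
        steps += 1
    return steps


def become_ready(state):
    state["ready"] = True


def interact(state):
    state["count"] += 1
    state["ready"] = False


# Two guarded commands in the style "b -> S" (boolean guards only).
COMMANDS = [
    (lambda s: not s["ready"], become_ready),
    (lambda s: s["ready"], interact),
]

if __name__ == "__main__":
    state = {"ready": False, "count": 0}
    print(run_guarded(COMMANDS, state, random.Random(0), 10), state)
```

A message guard could be modelled in the same way by a guard that checks whether a matching message is waiting in an inbox held in `state`.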


Fig. 3. A scenario of three processes executing the algorithm. Each non-shaded interval represents the time during which a process is monitoring an interaction.

The variables local to each pi are given as follows:

• ready: a boolean flag indicating if p_i is ready for interaction. It is initialized to false.

• attempt: the interaction that p_i randomly chooses to attempt; it is set to nil if there is none. The initial value of attempt is nil.

• commit: a boolean flag indicating if p_i has committed to an interaction. It is initialized to false.

• token−pool: the set of tokens received by p_i. It is initialized to ∅.

• T_i: p_i’s token. Function tag(T_i) returns the tag associated with T_i.

• init−ck: a temporary variable for p_i to record the time at which it starts waiting for a δ-interval before it determines whether or not its chosen interaction is established. It is initialized to ∞.

Moreover, each process p_i is equipped with a clock, and clock(p_i) returns the content of the clock when the function is executed. We assume that processes’ clocks tick at the same rate. Section 5 discusses how this assumption can be lifted from the algorithm.

From the above description, it is not difficult to see that the algorithm satisfies the synchronization requirement of the multiparty interaction scheduling problem (see Theorem 1). This is because a process can start an interaction X only if it has received a copy of its token tagged with “success”. Since only the process p_k which possesses a set of tokens {T_j | p_j ∈ P(X), tag(T_j) = “X”} can change the tags to “success”, when a process p_j finds that the token returned by p_k is tagged with “success”, all other processes in P(X) will also find that their tokens are tagged with “success” when they retrieve their tokens from p_k, and so will all start X. Moreover, the exclusion requirement is easily satisfied because a process attempts one interaction at a time.

The fairness property depends on an appropriate choice of δ, however. To see this, assume that interaction X involves p1, p2, and p3, which are all ready for X. We say that a process is monitoring X if it, after choosing X, has set up init−ck (line 7 of Fig. 2) and is waiting for its δ-interval to expire (i.e., to execute line 15). Consider the scenario depicted in Fig. 3. In this figure, each non-shaded interval represents the time during which a process is monitoring an interaction. A shaded interval then amounts to the maximum time a process can spend from the time it has executed line 15 until the time it loops back to line 7 to set a new init−ck to monitor another interaction. According to this scenario, p1 is monitoring some interaction from t3 to t7. During this interval, p2 and p3 will also start monitoring some interaction (at t5 and t6, respectively). If the three processes monitor the same interaction, say X, then by t5, p1 will have received p2’s token tagged with “X”,⁵ and by t6, p1 will also have received p3’s token with the same tag. So, by t6, p1 will have collected all three processes’ tokens tagged with “X” (p1’s own token is received prior to t3). So each process, upon receiving its own token returned by p1, will start X.

On the other hand, if a process does not monitor an interaction long enough, then no interaction may be established among processes even if their random choices coincide. For example, consider again Fig. 3. At time t1, p1 has collected tokens from p1 and p2 (assume that they both choose the same interaction X to monitor). Suppose p3 also chooses X to monitor at t2. However, p3’s token is not guaranteed to arrive at p1 before t1, and so p1 may give up on X at t1 when its δ-interval expires.

From the above discussion it can be seen that if there exists a time instance at which all processes in P(X) are monitoring X, then X will be established after the processes finish their monitoring phases. Moreover, suppose that the maximum possible interval during which each p_i ∈ P(X) is ready for interaction but is not monitoring any interaction (i.e., the maximum possible length of a shaded interval in Fig. 3; we shall henceforth refer to each such interval as a “non-monitoring window”, see Section 3.2) is strictly less than $\eta_i$. Suppose further that the processes in P(X) establish their non-monitoring windows, one after another, in the following manner (assume that $P(X) = \{p_1, p_2, \ldots, p_l\}$): $p_1$’s window is $[t,\ t + \eta_1 - \epsilon)$ (where the window is taken to be semi-closed because $p_1$ stops monitoring an interaction at $t$, and starts monitoring a new interaction at $t + \eta_1 - \epsilon$), $p_2$’s window is $[t + \eta_1 - \epsilon,\ t + \eta_1 + \eta_2 - 2\epsilon)$, and so on. Then we see that at no time instance in $[t,\ t + \sum_{p_k \in P(X)} \eta_k - l\epsilon)$ can the processes in P(X) all be monitoring an interaction simultaneously. However, if each $p_i$’s $\delta_i$ satisfies the condition $\delta_i \geq \sum_{p_k \in P(X) - \{p_i\}} \eta_k$, then the processes in P(X) are all monitoring an interaction at $t + \sum_{p_k \in P(X)} \eta_k - l\epsilon$. Note that, on the condition that each $p_i$’s $\delta_i$ is greater than or equal to $\sum_{p_k \in P(X) - \{p_i\}} \eta_k$, the layout of non-monitoring windows described above provides a maximal interval throughout which we cannot find a time instance at which the processes in P(X) are all monitoring an interaction.
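The staggered layout can be checked numerically on a small instance. The sketch below lays the non-monitoring windows end to end, assumes that each process monitors for its full waiting interval immediately before and after its own non-monitoring window, and then tests a grid of time instants. The bounds in `etas`, the slack `eps`, the waiting intervals in `deltas`, and the function names are all hypothetical values chosen for illustration.

```python
from fractions import Fraction


def staggered_layout(etas, eps, t=Fraction(0)):
    """Non-monitoring windows laid end to end; the i-th has length etas[i] - eps."""
    windows, start = [], t
    for eta in etas:
        windows.append((start, start + eta - eps))
        start += eta - eps
    return windows


def all_monitoring(time, windows, deltas):
    """True if every process is inside a monitoring window at `time`.

    Process i is assumed to monitor for deltas[i] right before and right after
    its non-monitoring window (a, b): during [a - delta, a) and [b, b + delta).
    """
    return all(
        a - d <= time < a or b <= time < b + d
        for (a, b), d in zip(windows, deltas)
    )


etas = [Fraction(4), Fraction(3), Fraction(5)]  # hypothetical window bounds
eps = Fraction(1, 2)                            # hypothetical slack
windows = staggered_layout(etas, eps)
end = windows[-1][1]                            # t + sum of bounds - l * eps

# Each waiting interval slightly exceeds the sum of the other processes' bounds.
deltas = [sum(etas) - eta + Fraction(1, 10) for eta in etas]
grid = [end * k / 200 for k in range(200)]      # instants in [t, end)

if __name__ == "__main__":
    print(any(all_monitoring(x, windows, deltas) for x in grid))  # no common instant before `end`
    print(all_monitoring(end, windows, deltas))                   # all monitoring at `end`
```

Shortening one waiting interval below the sum of the other bounds (for example, replacing the first entry of `deltas` by 6) makes the check at `end` fail, because that process has already stopped monitoring by then.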

By the algorithm, when a process is monitoring an interaction, the interaction it is monitoring is determined by the random draw performed prior to the monitoring phase. So when the processes of P(X) are all monitoring interactions, the probability that X will be established after the monitoring phases is given by the probability that a set of random draws, one by each process in P(X), yield the same outcome X. The Law of Large Numbers in probability theory (see, for example, the book by Chung [9]) then tells us that if there are infinitely many points at which all processes in P(X) are monitoring interactions, then the probability is 1 that they will monitor the same interaction X infinitely often and, so, with probability 1 they will establish X infinitely often.

⁵ Recall that p2’s token sent to p1 is acknowledged by p1, and p2 will not start monitoring an interaction
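A small calculation shows why repeated rendezvous points make failure increasingly unlikely. The sketch assumes that the draws of different processes are independent and, purely for the example, that each replica manager chooses uniformly among the interactions it belongs to; the function names are hypothetical.

```python
from fractions import Fraction
from math import prod


def prob_all_choose(x, choice_probs):
    """Probability that independent draws, one per process, all yield interaction x.

    `choice_probs` maps each participant of x to its distribution over interactions.
    """
    return prod(Fraction(dist.get(x, 0)) for dist in choice_probs.values())


def prob_never_in(n, q):
    """Probability that n independent rendezvous points all fail, each succeeding with probability q."""
    return (1 - q) ** n


# Replica example, assuming uniform choices by the two managers.
GOSSIP_DRAWS = {
    "M1": {"access1": Fraction(1, 2), "gossip": Fraction(1, 2)},
    "M2": {"access2": Fraction(1, 2), "gossip": Fraction(1, 2)},
}

if __name__ == "__main__":
    q = prob_all_choose("gossip", GOSSIP_DRAWS)
    print(q)                            # chance that one rendezvous point yields gossip
    print(float(prob_never_in(50, q)))  # chance that 50 rendezvous points all miss
```

Under these assumptions the probability of missing at every one of n rendezvous points is (1 − q)^n, which tends to 0 as n grows whenever q > 0.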

So, strong fairness of the algorithm relies on the assumption that the length of each non-monitoring window is bounded by some $\eta_k$ so that another process’s δ can be determined accordingly. Note that the condition $\delta_i \geq \sum_{p_k \in P(X) - \{p_i\}} \eta_k$ for $p_i$ implies that the δ values chosen by processes need not be the same. Moreover, a temporarily short δ cannot cause the algorithm to err, although it may cause a set of processes to miss a chance for rendezvous.

Based on these observations, we can remove the bounded step assumption by letting processes communicate with each other about the length of their previous non-monitoring windows. Processes then use this information to adjust their next δ-intervals. So long as processes are not hanging and every message will eventually be delivered, the dynamic adjustment of processes’ δ-intervals guarantees that when the participants of X are all ready for X, eventually their δ-intervals will be adjusted to meet the rendezvous requirement (i.e., they will all monitor interactions at the same time). The chance that they will establish X is then determined by their random draws. In this regard, we need not assume any predetermined bound on processes’ steps (speeds) and communication delays; the algorithm will adapt itself to the run-time environment.

So, we can modify the algorithm, yielding that shown in Fig. 4 — the full version of our algorithm for the multiparty interaction scheduling problem. We shall refer to the algorithm as TB (for Token-Based). Algorithm TB adds the following time variables to each pi:

• η: records the maximum of the durations from the time p_i previously stopped monitoring interaction to the time p_i starts monitoring interaction. It is initialized to 0.

• init−η: a temporary variable for p_i to record the time at which it starts to measure η. It is initialized to ∞.

• E[1..n]: E[j], initialized to 0, records the maximum value of p_j’s η sent by p_j.

In the algorithm, p_i measures its η by lines 1.1 and 7.1 (for the first non-monitoring window while p_i is ready for interaction), and by lines 15.1 and 7.1 (for the remaining non-monitoring windows). When p_i has sent out its token to p_j (line 5), p_j acknowledges the receipt of the token by sending its η to p_i (line 10′). Then p_i adjusts its E[j] to the larger value of E[j] and p_j’s new η (lines 6.1–6.2). These E[j]’s are used in line 15′ to time out p_i’s δ-interval.
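The bookkeeping behind this adjustment is small enough to sketch on its own: each acknowledgment can only raise the stored estimate for its sender, and the waiting interval for an attempt is the sum of the estimates for the other participants. The class `WaitEstimator` and its method names are hypothetical; clocks, tokens, and messages are deliberately left out.

```python
class WaitEstimator:
    """One process's record of the window lengths reported by its peers."""

    def __init__(self, me, peers):
        self.me = me
        # E[j] for every other process, initialised to 0 as in the text.
        self.E = {p: 0.0 for p in peers if p != me}

    def on_ack(self, j, reported):
        """Fold the value carried by p_j's acknowledgment into E[j] (keep the maximum)."""
        if j != self.me:
            self.E[j] = max(self.E[j], reported)

    def wait_interval(self, participants):
        """Sum of E[j] over the other participants of the attempted interaction."""
        return sum(self.E[j] for j in participants if j != self.me)


if __name__ == "__main__":
    est = WaitEstimator("p1", ["p1", "p2", "p3"])
    est.on_ack("p2", 2.0)
    est.on_ack("p3", 1.5)
    est.on_ack("p2", 1.0)  # a smaller report does not lower the estimate
    print(est.wait_interval(["p1", "p2", "p3"]))
```

Because the estimates only grow between resets, a process that once reported a long window keeps its partners waiting that long in later attempts.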

1    ∗[ ¬ready → do local computations;

1.1         init−η := clock(p_i); /* start measuring η */

1.2         ready := true;

2       ready ∧ ¬commit ∧ attempt = nil →

3           randomly select an interaction X for which p_i is ready;

4           attempt := X;

5           send a copy of T_i tagged with “X” to each p_j ∈ P(X);

6           wait until each p_j ∈ P(X) acknowledges the receipt of the token;

6.1         let η_j be the timestamp in p_j’s acknowledgment;

6.2         ∀p_j ∈ P(X) − {p_i} : E[j] := max(E[j], η_j);

7           init−ck := clock(p_i); /* start timing δ */

            /* start monitoring interaction */

7.1         η := max(η, clock(p_i) − init−η); /* record a new η */

8       receive T_j from p_j →

9           add T_j to token−pool;

10′         send an acknowledgment with timestamp η to p_j;

11      ∀p_j ∈ P(attempt) : T_j ∈ token−pool ∧ tag(T_j) = attempt →

12          for each such T_j, tag(T_j) := success;

13      receive request from p_j →

14          remove T_j from token−pool and send it back to p_j;

15′     clock(p_i) − init−ck ≥ δ, where δ = Σ_{p_j ∈ P(attempt) − {p_i}} E[j] → /* δ expires */

15.1        init−η := clock(p_i); /* start measuring η */

            /* stop monitoring interaction */

16          send each p_j ∈ P(attempt) a request;

17          wait until each p_j returns its copy of T_i;

18          if any returned T_i is tagged with success

19              then commit := true;

20              else attempt := nil;

21          delete the returned tokens;

22          init−ck := ∞;

23      commit →

24          execute attempt;

25          attempt := nil;

26          commit := false;

27          ready := false;

28   ]

Fig. 4. Algorithm TB.

The system’s performance depends on the lengths of δ-intervals the processes choose, which in turn depend on the values of E[j]’s. From time to time, one may reset each E[j] (and η) after p_i has established an interaction to prevent the system from getting too slow due to some abnormal speed retardation. (Note that the time variables cannot be reset while p_i is attempting to establish an interaction; for, otherwise, the algorithm would not even guarantee weak interaction fairness.) In general, since a temporarily short δ-interval cannot cause the algorithm to err, E[j] can be reset to any value, e.g., the average of the past history of E[j]’s values, or the minimum of them. On the other hand, resetting E[j]’s may also bring an extra load to a stable system. This is because if E[j] is reset to a value smaller than the length of p_j’s next non-monitoring window then, the next time p_i wishes to establish an interaction with p_j, it may not be able to do so because p_i’s δ is too short. Therefore, extra attempts by p_i are needed for p_i to catch up with the length of p_j’s non-monitoring windows. This overhead will be analyzed in Section 3.2.4.
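The two reset choices mentioned above (the average or the minimum of the past values of E[j]) could be expressed as a small helper. The function name `reset_estimate`, the `policy` parameter, and the idea of keeping an explicit history list are assumptions of this sketch, not part of algorithm TB.

```python
from statistics import mean


def reset_estimate(history, policy="mean"):
    """Pick a new value for E[j] from the values it has taken so far.

    `history` is the non-empty list of past E[j] values; `policy` is either
    "mean" (average of the history) or "min" (smallest value seen).
    """
    if not history:
        raise ValueError("history must not be empty")
    if policy == "mean":
        return mean(history)
    if policy == "min":
        return min(history)
    raise ValueError(f"unknown policy: {policy}")


if __name__ == "__main__":
    past = [1.0, 2.0, 6.0]  # hypothetical past values of E[j]
    print(reset_estimate(past), reset_estimate(past, "min"))
```

The minimum recovers fastest from an abnormal slowdown but is the most likely to undershoot the partner's next non-monitoring window, which is the extra load the paragraph above warns about.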

3.2. Analysis of algorithm TB

In this section we prove that TB satisfies the synchronization and exclusion requirements of the multiparty interaction scheduling problem and, with probability 1, is strong interaction fair. We also analyze the expected time TB takes to schedule an interaction.

3.2.1. Definitions

We assume a discrete global time axis where, to an external observer, the events of the system are totally ordered (footnote 6). Moreover, we assume that for any given time instances t_0, t_1, ... on this axis, the usual less-than relation over these instances is well-founded. That is, for any given two time instances t_i and t_j, there are only a finite number of points t_{i1}, t_{i2}, ..., t_{ik} on the global time axis such that t_i < t_{i1} < t_{i2} < ... < t_{ik} < t_j. Accordingly, the phrase "there are infinitely many time instances" refers to the interval [0, ∞].

Recall from TB that a process p_i, after sending out its tokens to the processes in P(X), must wait for a timed period before it decides whether to start or give up on X. We say that p_i starts monitoring X if it has executed line 7 of the algorithm to start timing this waiting period. It stops monitoring X when line 15.1 is executed. Let t_1 and t_2, respectively, be the times at which these two events occur. The semi-closed interval [t_1, t_2) is a monitoring window of p_i, and p_i is monitoring X in this window. If X fails to be established in this monitoring window, then p_i must start another monitoring window. Therefore, from the time (say t_0) p_i becomes ready for interaction until the time (say t_l) p_i stops monitoring an interaction that has been successfully established, the interval [t_0, t_l) contains a sequence of monitoring windows [t_1, t_2), [t_3, t_4), ..., [t_{l-1}, t_l). The interspersed intervals [t_0, t_1), [t_2, t_3), ..., [t_{l-2}, t_{l-1}) are called non-monitoring windows (footnote 7). The length of a window is the difference of the two ends of the interval. Note that all monitoring windows and non-monitoring windows have a non-zero length. The monitoring window of p_i at time t refers to the monitoring window [t_s, t_f) of p_i (if any) such that t_s ≤ t < t_f; similarly for non-monitoring windows.
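The alternation of windows described above can be illustrated with a small sketch. It assumes the event times t_0, t_1, ..., t_l of one process are given as a strictly increasing list with l even; the function name `split_windows` is hypothetical and is not part of algorithm TB.

```python
from typing import List, Tuple

Window = Tuple[float, float]  # semi-closed interval [start, end)


def split_windows(times: List[float]) -> Tuple[List[Window], List[Window]]:
    """Split t_0 < t_1 < ... < t_l (l even) into the two kinds of windows.

    Returns (non_monitoring, monitoring), following the definitions above:
    [t_0, t_1), [t_2, t_3), ... are non-monitoring windows and
    [t_1, t_2), [t_3, t_4), ... are monitoring windows.
    """
    if len(times) < 3 or len(times) % 2 == 0:
        raise ValueError("expected t_0 .. t_l with l even and l >= 2")
    if any(a >= b for a, b in zip(times, times[1:])):
        raise ValueError("every window must have non-zero length")
    pairs = list(zip(times, times[1:]))
    return pairs[0::2], pairs[1::2]


non_monitoring, monitoring = split_windows([0, 2, 5, 6, 9])
```

The strictness check mirrors the remark that every window has a non-zero length.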

Footnote 6. As usual, an event transits a process from one state to another. If an event occurs at time t and it transits p from state s_1 to state s_2, then we say that p is in state s_1 just before t, and is in state s_2 right after t. For p's state to be defined at every time instance, we stipulate that p's state at time t is s_2 if the event occurs at time t.

Footnote 7. There is a latency between the time t_l at which p_i stops monitoring an interaction (line 15.1) and the time t_l′ at which p_i starts executing the interaction (line 24). To simplify the definition, we shall henceforth consider [t_{l-1}, t_l′) rather than [t_{l-1}, t_l) as a monitoring window. As a result, we can say that, from the time p_i becomes ready for interaction until the time it executes an interaction, it spends all of its time in


Note that, if p_i is monitoring X, then every process in P(X) must hold a copy of T_i with a tag "X". Moreover, recall that a process records the measured length of each non-monitoring window in a local variable. Since a process records this value only after it has started monitoring an interaction (line 7.1), the recorded value is slightly larger than the actual length. This is crucial to the correctness of Lemma 4.

If p_i is monitoring X at time t, then the choice of X must be the result of some random draw performed by p_i before t. Let D_{t,p_i} denote the event that is this random draw. We use v(D_{t,p_i}) to denote the outcome of the random draw. The probability that v(D_{t,p_i}) = X is denoted by ρ_{p_i,X}, and the probability is assumed to be independent of t.

Moreover, assume t_s ≤ t_f. We define a set E_{t_s}^{t_f}P(X) of random draw events, at most one by each process p_i in P(X), as follows:

• If p_i remains in a monitoring window throughout [t_s, t_f], or p_i is in a monitoring window at t_s and then starts an interaction after the window terminates, then the random draw event D_{t_s,p_i} is included in E_{t_s}^{t_f}P(X). With respect to E_{t_s}^{t_f}P(X), process p_i is referred to as a type-M process.

• If p_i has a non-monitoring window contained (footnote 8) in [t_s, t_f], then the random draw event performed in the window is included in E_{t_s}^{t_f}P(X), and with respect to E_{t_s}^{t_f}P(X), p_i is referred to as a type-N process. If p_i has more than one non-monitoring window contained in [t_s, t_f], then one of the random draw events performed in these windows is chosen for E_{t_s}^{t_f}P(X). To avoid ambiguity, we give priority to the one performed in the largest window; if there is still a tie, it is broken by giving priority to the one performed the latest.

• Otherwise, no event by p_i is included in E_{t_s}^{t_f}P(X).

So, if |E_{t_s}^{t_f}P(X)| = |P(X)|, then every process in P(X) has a random draw event in E_{t_s}^{t_f}P(X). Furthermore, with respect to E_{t_s}^{t_f}P(X), let Q_N ⊆ P(X) be the set of type-N processes. For each p_i ∈ Q_N, let u_i denote the non-monitoring window in which p_i performs its random draw event chosen for E_{t_s}^{t_f}P(X), and let ‖u_i‖ denote the length of u_i. Then, the set E_{t_s}^{t_f}P(X) is said to be proper if t_f − t_s ≤ Σ_{p_i∈Q_N} ‖u_i‖ and |E_{t_s}^{t_f}P(X)| = |P(X)|.
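As a sanity check on the definition, the two conditions for a set of random draw events to be proper can be written as a short predicate. This is only an illustrative sketch: the function `is_proper` and its parameters are hypothetical names, and the classification of processes into type-N and type-M is assumed to have been done already.

```python
from typing import List


def is_proper(t_s: float, t_f: float,
              type_n_window_lengths: List[float],
              num_type_m: int,
              num_participants: int) -> bool:
    """Check the two conditions in the definition of a proper set.

    type_n_window_lengths holds the length of the chosen non-monitoring
    window u_i for each type-N process; num_type_m counts type-M processes.
    """
    # Condition 1: every process in P(X) contributes exactly one random draw.
    every_process_has_draw = (
        len(type_n_window_lengths) + num_type_m == num_participants
    )
    # Condition 2: the interval is no longer than the total length of the
    # chosen non-monitoring windows of the type-N processes.
    short_enough = (t_f - t_s) <= sum(type_n_window_lengths)
    return every_process_has_draw and short_enough
```

When every participant is type-M and t_s = t_f, the sum over type-N processes is empty and the second condition reads 0 ≤ 0, which is how Case (i) of Lemma 3 obtains a proper set.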

3.2.2. Properties of TB that hold with certainty

We now analyze the correctness of TB. We begin with the synchronization property. For this, it is useful to distinguish between an interaction (a static entity) and an instance of an interaction (a dynamic entity): when an interaction X is established, an instance of X is executed.

Theorem 1 (Synchronization). If a process starts a new instance of X; then all other processes in P(X ) will eventually start the instance of X .

Footnote 8. We say that an interval [t_1, t_2] is contained in [t_3, t_4] if t_3 ≤ t_1 and t_2 ≤ t_4. Two intervals join if they have a common end point, and they overlap if there exists a non-zero length interval contained in both intervals. The terms apply to semi-closed intervals as well. For example, [2, 4) is contained in [1, 4], and [2, 4) and [4, 6) join.


Proof. A process starts an instance of X only if it has sent a copy of its token tagged with “X ” to some pj∈ P(X ), and the token is returned with a tag “success”. Since only

the process which holds the set of tokens {Tj| pj∈ P(X ); tag(Tj) = “X ”} can change

the tags to “success”, and since a process will not give up its attempt to establish X until its tokens are returned, when a process attempting X receives a token tagged with “success”, all other processes in P(X ) will also obtain a token tagged with “success” when they retrieve their tokens. The theorem therefore follows.

Theorem 2 (Exclusion). No two interactions can be in execution simultaneously if they have a common member.

Proof. This follows from the fact that a process attempts one interaction at a time.

3.2.3. Properties of TB that hold with probability 1

We move on to prove the fairness property of TB.

Lemma 3. Suppose that, from time t′ − u to time t′ + u, for each p_i ∈ P(X), if p_i has a non-monitoring window overlapping or joining with [t′ − u, t′ + u], then the length of this window is strictly less than η_i^max. Let Δ_X = Σ_{p_i∈P(X)} η_i^max. If X is enabled at t′ and u ≥ Δ_X, then there exist t_1 and t_2, where t′ − Δ_X < t_1 ≤ t_2 < t′ + Δ_X and t_2 − t_1 < Δ_X, such that E_{t_1}^{t_2}P(X) is proper.

Proof. Since X is enabled at t′, each p_i ∈ P(X) is ready for interaction at t′. So, at t′, p_i is either in a non-monitoring window or in a monitoring window. It is clear that either (i) every p_i ∈ P(X) is in a monitoring window at t′, or (ii) some process in P(X) is in a non-monitoring window at t′.

Consider Case (i). Let t_1 = t_2 = t′. By definition, |E_{t_1}^{t_2}P(X)| = |P(X)|. Since with respect to E_{t_1}^{t_2}P(X) there is no type-N process, the set E_{t_1}^{t_2}P(X) is obviously proper. Moreover, the two time instances t_1 and t_2 we have chosen easily satisfy the condition t′ − Δ_X < t_1 ≤ t_2 < t′ + Δ_X and t_2 − t_1 < Δ_X. So, the lemma is proven for this case.

Consider Case (ii). We begin with the following definition. Let U be a set of intervals [a_j, b_j), where 1 ≤ j ≤ l. Let left(U) = min{a_j | 1 ≤ j ≤ l}, and right(U) = max{b_j | 1 ≤ j ≤ l}. The intervals in U are said to be connected if

∀t: left(U) ≤ t < right(U) ⇒ ∃[a_k, b_k) ∈ U: a_k ≤ t < b_k.

(Intuitively, the intervals are connected if they can be "glued" together to form a single interval. For example, the three intervals in {[3, 7), [5, 9), [9, 10)} are connected, but the two intervals in {[3, 7), [8, 9)} are not.) It follows from the above definition that if the intervals in U are connected, then right(U) − left(U) ≤ Σ_{1≤j≤l} (b_j − a_j).
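The notion of connected intervals, and the bound on right(U) − left(U) that follows from it, can be checked mechanically. The sketch below is an illustration only; the names `is_connected` and `span_within_total_length` are hypothetical, and intervals are assumed to be non-empty semi-closed pairs (a, b) with a < b.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # semi-closed interval [a, b), with a < b


def is_connected(intervals: List[Interval]) -> bool:
    """True if every t in [left(U), right(U)) lies in some interval of U."""
    if not intervals:
        raise ValueError("U must contain at least one interval")
    ordered = sorted(intervals)
    reach = ordered[0][1]  # right end of the region covered so far
    for a, b in ordered[1:]:
        if a > reach:
            # Every t in [reach, a) is covered by no interval: a gap.
            return False
        reach = max(reach, b)
    return True


def span_within_total_length(intervals: List[Interval]) -> bool:
    """Check right(U) - left(U) <= sum of the individual lengths."""
    left = min(a for a, _ in intervals)
    right = max(b for _, b in intervals)
    return right - left <= sum(b - a for a, b in intervals)


examples = [[(3, 7), (5, 9), (9, 10)], [(3, 7), (8, 9)]]
results = [is_connected(u) for u in examples]
```

Two intervals that merely join, such as [3, 7) and [7, 9), count as connected because the point 7 belongs to the second interval.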

Recall that for Case (ii), there exists some process in P(X), say p_1, that is in a non-monitoring window at t′. Let [t_{1,s}, t_{1,f}) be the non-monitoring window of p_1. Define Γ to be a set of pairs ⟨p, u⟩ satisfying the following conditions:

(1) For each ⟨p, u⟩ ∈ Γ, p ∈ P(X) and u is a non-monitoring window of p.
(2) ⟨p_1, [t_{1,s}, t_{1,f})⟩ ∈ Γ.
(3) For each p ∈ P(X), Γ contains at most one pair ⟨q, u⟩ such that p = q.
(4) Let intervals-of(Γ) = {u | ⟨p, u⟩ ∈ Γ}. Then, the intervals in intervals-of(Γ) are connected.
(5) Γ is maximal; that is, there exists no other pair γ such that the set Γ ∪ {γ} satisfies the above four conditions.

(Note that there may be more than one such set.)

Let t_1 = left(intervals-of(Γ)), and let t_2 = right(intervals-of(Γ)). Since the intervals in intervals-of(Γ) are connected and since t_{1,s} ≤ t′ < t_{1,f}, it can be seen that t′ − Δ_X < t_1 ≤ t_2 < t′ + Δ_X and t_2 − t_1 < Δ_X.

Consider E_{t_1}^{t_2}P(X). Let processes-of(Γ) = {p | ⟨p, u⟩ ∈ Γ}. Clearly, with respect to E_{t_1}^{t_2}P(X), each p ∈ processes-of(Γ) is a type-N process.

Let Q = P(X) − processes-of(Γ). We argue that, if Q ≠ ∅, then with respect to E_{t_1}^{t_2}P(X) each q ∈ Q is a type-M process. To see this, observe that t_1 ≤ t′ < t_2 (because t_1 ≤ t_{1,s} ≤ t′ < t_{1,f} ≤ t_2). Since q does not have a non-monitoring window overlapping or joining with [t_1, t_2) (for otherwise, Γ would not be maximal), q is in a monitoring window at t′. Since every monitoring window must be preceded by a non-monitoring window, and since q does not have a non-monitoring window overlapping or joining with [t_1, t_2), either q remains in a monitoring window throughout [t_1, t_2], or q remains in a monitoring window throughout [t_1, t′] and starts an interaction after the window terminates. So, with respect to E_{t_1}^{t_2}P(X), q is a type-M process.

Given that, with respect to E_{t_1}^{t_2}P(X), each p ∈ P(X) is either a type-N or a type-M process, we have |E_{t_1}^{t_2}P(X)| = |P(X)|. So to show that E_{t_1}^{t_2}P(X) is proper it remains to show that t_2 − t_1 ≤ Σ_{p∈processes-of(Γ)} ‖u_p‖, where u_p is the non-monitoring window in which p performs its random draw event chosen for E_{t_1}^{t_2}P(X). For this, let v_p be the non-monitoring window of p such that ⟨p, v_p⟩ ∈ Γ. Note that, because each p ∈ processes-of(Γ) may have more than one non-monitoring window contained in [t_1, t_2], v_p and u_p may not refer to the same window. However, the u_p we have chosen to build up E_{t_1}^{t_2}P(X) guarantees that ‖v_p‖ ≤ ‖u_p‖. Observe that t_2 − t_1 ≤ Σ_{⟨p,v_p⟩∈Γ} ‖v_p‖. So, t_2 − t_1 ≤ Σ_{p∈processes-of(Γ)} ‖u_p‖.

Therefore, the lemma is proven for Case (ii).

Lemma 4. Assume set E_{t_1}^{t_2}P(X) is proper. With respect to E_{t_1}^{t_2}P(X), let Q_N be the set of type-N processes, and Q_M be the set of type-M processes. For each p_i ∈ Q_N, let u_i denote p_i's non-monitoring window from which p_i's random draw event is chosen for E_{t_1}^{t_2}P(X), and let w_i denote p_i's monitoring window immediately following u_i. For each p_i ∈ Q_M, let w_i denote p_i's monitoring window at t_1. If all the random draws in E_{t_1}^{t_2}P(X) yield the same outcome X and, for each p_i ∈ Q_N, ‖w_i‖ > (Σ_{p_l∈Q_N} ‖u_l‖) − ‖u_i‖, then an instance of X will be started when some process p_j ∈ P(X) finishes its monitoring window w_j.

Proof. Since t_2 − t_1 ≤ Σ_{p_l∈Q_N} ‖u_l‖, and since for each p_i ∈ Q_N, p_i's monitoring window w_i has a length strictly greater than (Σ_{p_l∈Q_N} ‖u_l‖) − ‖u_i‖, p_i must still be in the


Recall that every p_j ∈ Q_M either remains in a monitoring window throughout [t_1, t_2], or is monitoring an interaction at t_1 and starts the interaction after it finishes the monitoring window. Suppose first that every p_j ∈ Q_M remains in a monitoring window throughout [t_1, t_2] (where, under the lemma assumptions, this window is w_j). Then, every p_j ∈ Q_M is also monitoring X at t_2. So, at time t_2 each process in P(X) has collected every other process's token tagged with "X" and has changed (or is changing) all the tags to "success". Hence, every process p_k ∈ P(X) will start X when it finishes its monitoring window w_k (and retrieves its tokens).

Suppose otherwise that some p_j ∈ Q_M is monitoring an interaction at t_1 and starts the interaction after it finishes the monitoring window. Since the interaction p_j is monitoring is decided by the outcome of p_j's random draw event in E_{t_1}^{t_2}P(X), by the assumptions of the lemma, the outcome is X. So, p_j will start X when it finishes its w_j (footnote 9).

Note that in Lemma 4 the monitoring window w_i of each p_i ∈ P(X) must overlap or join with the interval [t_1, t_2]. So, if an instance of X is established and the length of each w_i is at most some bound, then the instance will be established no later than that bound after t_2.

For fairness, we first show that TB satisfies weak interaction fairness, for which we need some assumption on the faultless behaviour of the system. We assume that if the communication medium remains connected, then every message will eventually reach its destination. Note that, if processes are not hanging, then they remain active (that is, every process will eventually execute its next instruction unless the instruction is a message receiving command and no message specified in the command has been sent to the process), and starting from any point the time it takes a process to execute an instruction (i.e., the length of the step to execute the instruction) will eventually be bounded.

Theorem 5 (Weak interaction fairness). Assume that processes are not hanging and the communication medium remains connected. If X is enabled at time t then, with probability 1, X will be disabled eventually.

Proof. We show that the probability is 0 that X is continuously enabled from t onward. Observe that since the communication medium remains connected and processes remain active, and since every continuously enabled guarded command will eventually be executed, a process will not be blocked indefinitely from executing its next action. So, the time it takes for each process to measure a new η value (which corresponds to the length of a non-monitoring window, although the measured value is slightly larger) is finite. Moreover, the assumption that processes are not hanging also ensures that, starting from any point, all possible η values measured by a process will eventually be bounded by some constant c. The well-founded ordering of events on the time axis ensures that a process may measure at most a finite number of distinct η values less than c.

Footnote 9. In the algorithm, it is possible that some process p_1 has already started X, but another process is still monitoring X, or is even still in a non-monitoring window. For example, consider the following scenario, and assume that P(X) = {p_1, p_2}: (1) p_1 starts monitoring X; (2) p_2 randomly chooses X and sends p_1 a copy of T_2 tagged with "X"; (3) p_1 receives T_2 and acknowledges the receipt (at this point p_1 has successfully observed the establishment of X); (4) p_1 finishes its monitoring window, retrieves its tokens,

Recall that the length of a monitoring window for p_i to monitor X is determined by the value Σ_{p_j∈P(X)−{p_i}} E[j], where E[j] is the maximum of p_j's previous η values collected between the time p_i becomes ready for interaction and the time p_i starts the monitoring window. Moreover, every time p_i chooses to attempt X, it will learn all other participants' current η values when they acknowledge the receipt of p_i's tokens (see lines 6–6.2 of TB). Since if p_i is continuously ready it will attempt interactions infinitely often, by the law of large numbers (Theorem 6 will explain this law in more detail), p_i will attempt X infinitely often with probability 1. So if X is continuously enabled forever, then by the previous observations on η values, there must exist some t_0 such that, from t_0 onward, for every p_i ∈ P(X), p_i's new η value is no greater than some η_i^max, and p_i's E[j] is equal to η_j^max. It follows that from t_0 onward each p_i's non-monitoring window has a length less than η_i^max, and each p_i's monitoring window to monitor X has a length greater than (footnote 10) or equal to Σ_{p_j∈P(X)−{p_i}} η_j^max.

Let Δ_X = Σ_{p_j∈P(X)} η_j^max. Consider the interval [t_0, t_0 + 2Δ_X). Given that from t_0 onward each p_i's non-monitoring window has a length less than η_i^max, Lemma 3 (with t′ = t_0 + Δ_X and u = Δ_X) ensures that there exist two time instances t_{1,s} and t_{1,f}, where t_0 < t_{1,s} ≤ t_{1,f} < t_0 + 2Δ_X, such that E_{t_{1,s}}^{t_{1,f}}P(X) is a proper set of random draw events. Given that starting from t_0 each p_i's non-monitoring window has a length less than η_i^max, and each p_i's monitoring window to monitor X has a length greater than or equal to Σ_{p_j∈P(X)−{p_i}} η_j^max, Lemma 4 implies that, if the random draws in E_{t_{1,s}}^{t_{1,f}}P(X) yield the same outcome X, then X will be disabled. Note that, even if the random draws do not yield the same outcome, some process in P(X) may still establish another interaction X′ if its random draw coincides with other processes' random draws.

Let ρ_X be the probability that the random draws in E_{t_{1,s}}^{t_{1,f}}P(X) yield the same outcome X. The probability that X remains enabled starting from t up to the point at which the random draws in E_{t_{1,s}}^{t_{1,f}}P(X) are to be made is at most 1, so the probability that these random draws do not cause X to be disabled is no greater than (1 − ρ_X). If X remains enabled after the random draws, then every process in P(X) will perform a new random draw in finite time, and so by Lemma 3 again there exists another proper set of random draws E_{t_{2,s}}^{t_{2,f}}P(X) such that E_{t_{1,s}}^{t_{1,f}}P(X) ∩ E_{t_{2,s}}^{t_{2,f}}P(X) = ∅. The probability that X remains enabled after the new set of random draws is no greater than (1 − ρ_X)^2. In general, the probability that X remains enabled after l mutually disjoint sets of random draws is no greater than (1 − ρ_X)^l. If X continues to be enabled then l will keep increasing and, so, (1 − ρ_X)^l tends to 0. So the probability that X remains enabled forever is 0.

Footnote 10. The length may be greater than Σ_{p_j∈P(X)−{p_i}} E[j] because the condition that the length of p_i's monitoring window equals Σ_{p_j∈P(X)−{p_i}} E[j] only causes the guarded command in line 15′ to be

Theorem 6 (Strong interaction fairness). Assume (A1) that processes are not hanging and the communication medium remains connected, and (A2) that a process's transition to a state ready for interaction does not depend on the random draws performed by other processes. If an interaction X is enabled infinitely often then, with probability 1, the interaction will be executed infinitely often.

Proof. Assume the hypothesis that X is enabled infinitely often. By (A1), there exists some time instance t_0 after which every non-monitoring window of p_k has a length less than η_k^max for each p_k in the system, and every monitoring window of p_k has a length no less than Δ_X − η_k^max, where Δ_X = Σ_{p_j∈P(X)} η_j^max. Because t_0 is finite, from t_0 onward X is still enabled infinitely often. By Lemma 3, there exist infinitely many t_i's, t_{i,1}'s, and t_{i,2}'s, where i > 0, t_i − Δ_X < t_{i,1} ≤ t_{i,2} < t_i + Δ_X and t_{i,2} − t_{i,1} < Δ_X, such that X is enabled at t_i, E_{t_{i,1}}^{t_{i,2}}P(X) is proper, and E_{t_{i,1}}^{t_{i,2}}P(X) ∩ E_{t_{j,1}}^{t_{j,2}}P(X) = ∅ if i ≠ j.

Let I be the set of indices of such t_i's. By Lemma 4, if the random draws in E_{t_{i,1}}^{t_{i,2}}P(X) yield the same outcome X, then an instance of X will be established. So, in the following, we shall show that the probability is 1 that there are infinitely many i's in I such that the random draws in E_{t_{i,1}}^{t_{i,2}}P(X) yield the same outcome X. This then establishes the theorem.

Because I is infinite and there are only a finite number of interactions in the system, there exists an infinite subset J ⊂ I such that, for each p ∈ P(X), p is ready for the same set of interactions A_p at t_i for each i ∈ J. Let ρ_{A_p,X} be the non-zero probability that X is chosen from A_p in a random draw. Let ρ_X = ∏_{p∈P(X)} ρ_{A_p,X}. Consider E_{t_{i,1}}^{t_{i,2}}P(X), where i ∈ J. By Assumption (A2), the random draws in E_{t_{i,1}}^{t_{i,2}}P(X) are independent of the enabledness of X at t_i and, so, are independent of one another. So, the probability that the random draws in E_{t_{i,1}}^{t_{i,2}}P(X) produce the same outcome X is ρ_X.

For each i ∈ J, define random variable E_i to be 1 if the random draws in E_{t_{i,1}}^{t_{i,2}}P(X) produce the same outcome X, and 0 otherwise. Then E_i = 1 also has the probability ρ_X. Let the indices of J be enumerated by j_1, j_2, .... By the law of large numbers in probability theory (see, for example, the book by Chung [9]), for any given ε > 0 we have

\[ \lim_{n\to\infty} P\left( \left| \frac{\sum_{1\le i\le n} E_{j_i}}{n} - \rho_X \right| \le \varepsilon \right) = 1. \]

That is, when n tends to infinity, the probability is 1 that (Σ_{1≤i≤n} E_{j_i})/n tends to ρ_X. Therefore, with probability 1, the set {i | E_{j_i} = 1, i ≥ 1} is infinite. So, with probability 1, there are infinitely many i's in J such that the random draws in E_{t_{i,1}}^{t_{i,2}}P(X) yield the same outcome X; since J ⊂ I, there are likewise infinitely many such i's in I. The theorem is therefore proven (footnote 11).
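The role of the law of large numbers in this argument can be illustrated by a small simulation. The sketch assumes independent random draws, one per participant, where participant p chooses X with a given probability; the function name `same_outcome_fraction`, its parameters, and the fixed seed are all hypothetical choices made for the illustration.

```python
import random
from math import prod
from typing import List, Tuple


def same_outcome_fraction(choice_probs: List[float],
                          rounds: int,
                          seed: int = 0) -> Tuple[float, float]:
    """Simulate `rounds` disjoint sets of independent random draws.

    choice_probs[k] is the probability that participant k chooses X.
    Returns (observed fraction of rounds in which all choose X, product
    of the individual probabilities).
    """
    rng = random.Random(seed)
    expected = prod(choice_probs)
    hits = 0
    for _ in range(rounds):
        # One draw per participant; the round counts only if all pick X.
        if all(rng.random() < p for p in choice_probs):
            hits += 1
    return hits / rounds, expected


observed, expected = same_outcome_fraction([0.5, 0.5], rounds=20_000)
```

With two participants that each choose X with probability 1/2, the observed fraction should settle near 1/4 as the number of rounds grows, so rounds in which all draws agree on X keep occurring.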

Like the algorithm presented in [19], a conspiracy against strong interaction fairness can be devised if Assumption (A2) is dropped from Theorem 6. To see this, consider a system of two processes p1 and p2, and three interactions X1; X2, and X12, where

P(X1) = {p1}; P(X2) = {p2}, and P(X12) = {p1; p2}. Assume that p1 is ready for both

X1 and X12. So it will toss a coin to choose one to attempt. The malicious p2 could

stay in its local computing phase until p1 has randomly selected X1; then p2 becomes

ready for X2 and X12 before p1 executes X1. Since p1’s attempt to execute X1 will

succeed once it selects X1, X12 will not be executed this time. However, X12 is enabled

as soon as p2 becomes ready. Similarly, p1 could also stay in its local computing

phase until p2’s random draw yields X2. So if this scenario is repeated over and over

again, then the resulting computations would not be strong interaction fair. Note that in the resulting computation there exist infinitely many t_{i,1}'s and t_{i,2}'s such that E_{t_{i,1}}^{t_{i,2}}P(X_12) is proper. However, the two random draws in E_{t_{i,1}}^{t_{i,2}}P(X_12) are not mutually independent, because one of them is performed only if the other has outcome X_1 (or X_2).

3.2.4. Time complexity

To measure the time complexity of TB, we wish to know, when an interaction X is enabled, how long it takes a participant of X to execute an interaction, i.e., to disable X (footnote 12). It can be seen from Theorem 5 that a necessary condition for X to be disabled is that processes' speeds will not keep decreasing. So, to simplify the analysis, we shall first consider a stable system where processes' speeds do not vary. Moreover, for subsequent comparison with deterministic algorithms, we shall also simplify the analysis by assuming that each non-monitoring window takes a constant time η − ε for some ε > 0, and each interaction involves m participants. By the algorithm, each monitoring window then must take more than (m − 1)(η − ε) time. Let us assume that it takes (m − 1)η + ε time.

Footnote 11. The law of large numbers cannot be used to prove the theorem if one were to reset the time variables E[j] periodically. This is because although there are infinitely many i's in I such that all the random draws in each E_{t_{i,1}}^{t_{i,2}}P(X) yield the same outcome X, Lemma 4 might not be usable to guarantee the establishment of X, because each process's monitoring window following its random draw in E_{t_{i,1}}^{t_{i,2}}P(X) could incidentally be reset to a value unable to satisfy the condition of Lemma 4. Instead, the second Borel–Cantelli Lemma can be used to prove the theorem. As a consequence of the lemma, it is a well-known fact in measure theory and probability (see, for instance, Example 4.14 of [6]) that, if a coin (with outcome 0 or 1) is tossed an infinite number of times, then given any constant c the probability is 1 that there are infinitely many runs of 1 of length greater than c (where a run of 1 is a sequence of 1's surrounded by two 0's; its length is the number of 1's in the sequence).

Given that the length of a non-monitoring window will eventually be bounded, we can see that, from some point onward, if an interaction X is enabled and each participant of X always chooses X to attempt, then after at most some finite number of attempts X will be established (they failed to establish X in earlier attempts because their monitoring windows were too short to satisfy the condition of Lemma 4). The above fact in measure theory and probability guarantees that, if X is enabled infinitely often, then the probability is 1 that, infinitely often, every participant of X will continuously choose X to attempt for at least some finite number of times. Therefore, the probability is 1 that X will be established infinitely often.

Footnote 12. Given that interactions' membership rosters may overlap, it is clear that no algorithm can guarantee the following: when an interaction is enabled, then this particular instance of interaction must eventually be executed with certainty; for, otherwise, the exclusion requirement of the interaction scheduling would not be satisfied.

Theorem 7 (Time complexity). Suppose each interaction involves m participants. Suppose further that each non-monitoring window has a length η − ε for some ε > 0, and each monitoring window has a length (m − 1)η + ε. Then, once an interaction X is enabled, the expected time it takes for a member of X to start an interaction is no greater than

\[ \frac{m\eta}{\prod_{p_i \in P(X)} \rho_{p_i,X}} + (m-1)\eta + \varepsilon, \]

where ρ_{p_i,X} is the probability that p_i chooses X in its random draw.

Proof. Assume the hypothesis, and that X is enabled at time t. By Lemma 3 (with η_i^max = η, t′ = t, u = Δ_X = mη), there exist two time instances t_1 and t_2, where t − mη < t_1 ≤ t_2 < t + mη and t_2 − t_1 < mη, such that E_{t_1}^{t_2}P(X) is proper. By Lemma 4 (with the hypothesis that each monitoring window has a length (m − 1)η + ε satisfying the condition (m − 1)η + ε > (m − 1)(η − ε)) and the remark following the lemma, if the random draws in E_{t_1}^{t_2}P(X) yield the same outcome X (an event that occurs with probability ρ_X = ∏_{p_i∈P(X)} ρ_{p_i,X}), then an instance of X will be established by time t_2 + (m − 1)η + ε < t + mη + (m − 1)η + ε. Note that if the random draws do not yield the same outcome X but some process's random draw in E_{t_1}^{t_2}P(X) leads to the establishment of some other interaction involving the process, then the process will also start an interaction when it finishes its monitoring window (the one established following the random draw). If neither of these is the case then each process in P(X), after performing its random draw in E_{t_1}^{t_2}P(X), must perform a new random draw in another mη time (which amounts to the length of a non-monitoring window, η − ε, plus the length of a monitoring window, (m − 1)η + ε). That is, there must exist another proper set of random draws E_{t_1+mη}^{t_2+mη}P(X) that is disjoint from E_{t_1}^{t_2}P(X).

Once again, if the new random draws yield the same outcome X or cause some other interaction to be established (with probability no less than (1 − ρ_X)ρ_X), then some interaction involving a member of X will be established by time t_2 + mη + (m − 1)η + ε < t + 2mη + (m − 1)η + ε. Otherwise, there must exist another proper set of random draws E_{t_1+2mη}^{t_2+2mη}P(X) that is disjoint from E_{t_1+mη}^{t_2+mη}P(X), and so on.

In general, if X remains enabled, then there exist mutually disjoint sets of random draws E_{t_1}^{t_2}P(X), E_{t_1+mη}^{t_2+mη}P(X), ..., E_{t_1+(i−1)mη}^{t_2+(i−1)mη}P(X), ..., and each of these sets is proper. Moreover, if the random draws in E_{t_1+(i−1)mη}^{t_2+(i−1)mη}P(X) yield the same outcome X or cause some other interaction to be established (with probability no less than (1 − ρ_X)^{i−1}ρ_X), then an interaction involving a member of X will be established by t + imη + (m − 1)η + ε. Therefore, the expected time starting from t until an interaction involving a member of X is established is less than

\[ \sum_{i \ge 1} i\,m\eta\,(1-\rho_X)^{i-1}\rho_X + (m-1)\eta + \varepsilon = \frac{m\eta}{\rho_X} + (m-1)\eta + \varepsilon. \]
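The closed form at the end of the proof is the mean of a geometric distribution scaled by the length of one attempt. The following sketch compares the closed form with a truncated version of the series numerically; the function names and the sample parameter values are hypothetical, and the symbols follow the simplified stable-system assumptions stated before Theorem 7.

```python
def expected_time_bound(m: int, eta: float, eps: float, rho_x: float) -> float:
    """Closed form: m*eta/rho_x + (m - 1)*eta + eps."""
    return m * eta / rho_x + (m - 1) * eta + eps


def truncated_series(m: int, eta: float, eps: float, rho_x: float,
                     terms: int = 10_000) -> float:
    """Partial sum of sum_i i*m*eta*(1 - rho_x)^(i-1)*rho_x, plus the tail term."""
    total = sum(i * m * eta * (1 - rho_x) ** (i - 1) * rho_x
                for i in range(1, terms + 1))
    return total + (m - 1) * eta + eps


# Two participants, each choosing X with probability 1/2, so rho_x = 1/4.
closed = expected_time_bound(m=2, eta=1.0, eps=0.1, rho_x=0.25)
series = truncated_series(m=2, eta=1.0, eps=0.1, rho_x=0.25)
```

The two values agree up to floating-point error, which reflects the identity Σ_{i≥1} i(1 − ρ)^{i−1}ρ = 1/ρ for 0 < ρ ≤ 1.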

Similar analysis can also be carried out if interactions have different sizes or non-monitoring windows have different lengths. In particular, when the length of p_j's non-monitoring windows may vary, another process p_i must update its E[j] in order to adjust its monitoring window for monitoring some interaction involving p_j. In the algorithm, p_i learns a new η_j (which measures the maximum length of p_j's previous non-monitoring windows) through an attempt to establish an interaction involving p_j. For p_i to have such an attempt it must choose an interaction involving p_j in some random draw. Let ρ_{i,j} denote the probability that, in one random draw by p_i, an interaction involving p_j is chosen. Then the expected number of attempts for p_i to finally attempt an interaction involving p_j so as to update p_i's E[j] is

\[ T_{i,j} = \sum_{k \ge 1} k\,(1-\rho_{i,j})^{k-1}\rho_{i,j} = \frac{1}{\rho_{i,j}}. \]

If each such attempt takes no more than s time (which consists of a non-monitoring window followed by a monitoring window), then an additional s/ρ_{i,j} time would be required for p_i to have the knowledge of p_j's new η_j. If p_j also has no knowledge of p_i's new η_i, then an additional max{T_{i,j}, T_{j,i}} · s time would be required for both p_i and p_j to have each other's new η.

To see how the time complexity is affected by (1) the number of potential interactions for which a process may be ready at a time, and (2) the size of an interaction, assume that a process may be ready for k potential interactions at a time, and each interaction involves m participants. So the probability for the processes in P(X) to choose X in a set of random draws, one by each process, is (1/k)^m. Assume further that each non-monitoring window has a length η − ε and each monitoring window has a length (m − 1)η + ε. From Theorem 7, the expected time for an enabled interaction to be disabled is dominated by mηk^m. Suppose that the time to execute a local action is negligible compared to the communication time for delivering a message. Then, η consists of four message transmissions (a message to send the token, an acknowledgement, a message to retrieve the token, and a message to return the token) if messages in lines 5, 6, 16, and 17 of TB can be sent in parallel. If the message transmission time is c, then the time complexity is dominated by

\[ 4cmk^{m}. \]

In the above, since m messages are sent in parallel in each interval c, the expected number of messages needed to establish an interaction per process is no greater than

\[ 4m^{2}k^{m}. \]
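To get a feel for how quickly these bounds grow with k and m, the two dominating terms can be tabulated directly. The helper names below are hypothetical, and the sketch assumes, as in the text, that every process picks uniformly among k candidate interactions of size m.

```python
def same_choice_probability(k: int, m: int) -> float:
    """Probability that all m participants pick X among k equally likely choices."""
    return (1.0 / k) ** m


def tb_time_bound(c: float, m: int, k: int) -> float:
    """Dominating term of the expected time: 4*c*m*k^m."""
    return 4 * c * m * k ** m


def tb_message_bound(m: int, k: int) -> int:
    """Dominating term of the expected messages per process: 4*m^2*k^m."""
    return 4 * m * m * k ** m


# Biparty interactions (m = 2) with two candidate interactions per process.
time_bound = tb_time_bound(c=1.0, m=2, k=2)
message_bound = tb_message_bound(m=2, k=2)
```

Because both terms contain k^m, keeping k and m small matters far more than the transmission time c.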

For comparison, the efficient deterministic algorithm by Ramesh [28] has a worst-case time complexity in the order of 3cnk and a message complexity 3mk. Note that, unlike TB (and other randomized algorithms [12, 30, 19]), the time complexity of deterministic algorithms typically depends on n, the total number of processes in the system. This is because they impose priority (e.g., process id's) to break the symmetry between processes so that a low-priority process must wait for a high-priority one if they attempt to establish conflicting interactions (two interactions conflict if they involve a common process) (footnote 13). The fact that randomized algorithms often have a time complexity independent of n is one of the reasons that Reif and Spirakis's randomized algorithm [30] was able to claim a real-time response.

From the above comparison, we can see that TB can outperform deterministic algorithms (where only WIF is required) only if time is a main concern and the two parameters k (the number of potential interactions for which a process may be ready at a time) and m (the number of participants in an interaction) are kept small relative to n, e.g., CSP-like biparty interactions. (For efficiency's sake, whether deterministic or randomized, it is generally known that the two parameters must be kept small in practical applications. A technique of synchrony loosening [10] is therefore proposed for reducing the size of an interaction.) Otherwise, TB has a niche simply because deterministic algorithms are unable to guarantee SIF.

4. A shared-memory solution

In this section we present an algorithm for the multiparty interaction scheduling problem where processes communicate by reading from and writing to shared variables. In particular, the algorithm uses only single-writer variables. A non-local variable Vj

can be read by the command read (Vj).

4.1. Informal description

Like Algorithm TB, when a process pi is ready for interaction, it randomly chooses

one interaction X , from the set of potential interactions it is ready to execute, and then attempts to establish X . However, instead of sending out tokens, pi expresses

its interest in X by writing hexamining; X i to its local variable state, which is to be read by other processes. In the algorithm, values of state is of the form hstatus; X i, where X denotes the interaction pi is attempting, and status records the status of the

attempt. Besides examining, status has another three possible values: waiting, success, and closed; their meaning should be clear shortly.

After setting its state to ⟨examining, X⟩, p_i begins to read the states of the other participants. If, for every p_j ∈ P(X), p_j's state is ⟨examining, X⟩ or ⟨waiting, X⟩, then the other processes in P(X) are also interested in X. This means that p_i has successfully observed the establishment of X. It then changes its state to ⟨success, X⟩, and waits for the other participants to observe the establishment of X. To do so, p_i keeps a binary variable flag[X] for each interaction X. Initially, all processes in P(X) have their flag[X]'s set to the same value, say 0. When a process p is to execute an instance of X, it complements its flag[X]. In the above case, p_i complements its flag[X] before it changes its state to ⟨success, X⟩. To ensure that every other p_j ∈ P(X) has also observed the establishment of X, p_i keeps reading p_j's flag[X] until it has the same value as p_i's. Then, p_i changes its state to ⟨closed, X⟩ and starts X.

13 It is well known that, even if only WIF is required, there is still no symmetric, decentralized, and

As we shall see, flag[X] has another important role in the algorithm: to prevent a process from “outrunning” other processes in executing instances of X. In other words, the algorithm guarantees that, if p_i is to execute an instance of X, then all other processes in P(X) must have finished the previous instance of X.
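The examining step and the flag handshake described above can be sketched as two small predicates. This is a minimal single-threaded Python sketch under stated assumptions, not the algorithm SM of Fig. 5: the function names, the tuple encoding of states, and the dictionary arguments are hypothetical, and in a real shared-memory system each state or flag would be read one variable at a time.

```python
from typing import Dict, Tuple

# Hypothetical encoding of a state <status, X> as a (status, interaction) pair.
State = Tuple[str, str]

def all_interested(x: str, states: Dict[str, State]) -> bool:
    """True if every other participant's state is <examining, X> or <waiting, X>."""
    return all(s in (("examining", x), ("waiting", x)) for s in states.values())

def handshake_done(my_flag: int, other_flags: Dict[str, int]) -> bool:
    """True once every other participant's flag[X] equals ours,
    i.e. they have all observed the establishment of X."""
    return all(f == my_flag for f in other_flags.values())

# p_i observes that p_j and p_k are both interested in X,
# so it complements flag[X] before moving to <success, X>.
states = {"pj": ("examining", "X"), "pk": ("waiting", "X")}
my_flag = 0
if all_interested("X", states):
    my_flag ^= 1

# p_i may change to <closed, X> and start X only after the
# others have complemented their flags as well.
print(handshake_done(my_flag, {"pj": 1, "pk": 0}))  # False
print(handshake_done(my_flag, {"pj": 1, "pk": 1}))  # True
```

Because a process complements flag[X] exactly once per instance of X, comparing flags also tells p_i whether a peer is still in the previous instance, which is the "outrunning" check mentioned above.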

When examining other processes' states, if not all of them are ⟨examining, X⟩ or ⟨waiting, X⟩, then p_i changes its state to ⟨waiting, X⟩. Like TB, p_i has to wait for a period of time and then re-inspect the other participants' states. The length of this period is determined as in TB. That is, it must be no less than the sum, over all p_j ∈ P(X) − {p_i}, of the maximum time (measured by the algorithm) that p_j has spent between two consecutive monitoring intervals.

If, after this period, some process p_j has changed its state to ⟨success, X⟩ and p_j.flag[X] ≠ p_i.flag[X], then p_i has learned of the establishment of X from p_j. (Throughout the paper we often use p_j.v to denote p_j's variable v.) So p_i also complements its flag[X] and then starts X. If, after this period, either (1) no process's state has changed to ⟨success, X⟩, or (2) some process is in state ⟨success, X⟩ but its flag[X] has the same value as p_i.flag[X] (which means that the process is still executing the previous instance of X), then p_i's attempt to establish X has failed. It must return to the beginning of the procedure to attempt another interaction.

4.2. The code

The algorithm executed by each process p_i is given in Fig. 5. We shall refer to the algorithm as SM (for Shared Memory). The variables local to p_i are as follows:

• ready: a boolean flag that is set to true when p_i is ready for interaction, and is set to false when p_i has executed some interaction. It is initialized to false.

• state[1..n]: array of ⟨status, X⟩, where X is an interaction, and status is examining, waiting, success, or closed. Each state[j] records the state of p_j observed by p_i, and is initialized to ⟨closed, ⊥⟩.

• flag[X_1..X_m]: array of binary values, where X_1, ..., X_m are the interactions of which p_i is a member. Each flag[X_j] is initialized to 0.

• : records the maximum of the durations from the time pi previously stopped

monitoring interaction to the time pi starts monitoring interaction. It is initialized

to 0.

• init−: a temporary variable used to measure . It is initialized to ∞.

• E[1::n]: E[j], initialized to 0, records the maximum value of pj’s read by pi.
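Using only the state and flag variables listed above (the timing variables are left out), the decision a waiting process makes when it re-inspects the other participants can be sketched as follows. This is an illustrative Python sketch, not the code of Fig. 5: the function name after_wait, the return values "start" and "retry", and the dictionary layout are assumptions. It also assumes that one participant in ⟨success, X⟩ with a differing flag is enough to start X, which is how the informal description in Section 4.1 reads.

```python
from typing import Dict, Tuple

# Hypothetical encoding of a state <status, X> as a (status, interaction) pair.
State = Tuple[str, str]

def after_wait(x: str, my_flag: int,
               others: Dict[str, Tuple[State, int]]) -> Tuple[str, int]:
    """Decide what a process in <waiting, X> does after its waiting period.

    `others` maps each other participant to the (state, flag[X]) pair just read.
    Returns the decision together with the (possibly complemented) local flag[X].
    """
    for state, flag in others.values():
        # Some p_j is in <success, X> and p_j.flag[X] != p_i.flag[X]:
        # X has been established, so complement flag[X] and start X.
        if state == ("success", x) and flag != my_flag:
            return "start", my_flag ^ 1
    # Either nobody is in <success, X>, or those who are still carry our flag
    # value (they are executing the previous instance of X): the attempt failed.
    return "retry", my_flag

print(after_wait("X", 0, {"pj": (("success", "X"), 1)}))  # ('start', 1)
print(after_wait("X", 0, {"pj": (("success", "X"), 0)}))  # ('retry', 0)
print(after_wait("X", 0, {"pj": (("waiting", "X"), 0)}))  # ('retry', 0)
```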

In the algorithm, variable is measured in a way similar to TB. That is, pi starts